Monday, December 16, 2013

A Better Way to Find Your New Favorite Subreddit, With Science

Reddit's front page of cat pictures and memes belies its diverse underbelly of subreddits — hundreds of thousands of link-sharing sites within the larger site that cater to interests ranging from movies to My Little Pony.

The sheer number of subreddits can be overwhelming, making it difficult for a casual browser to find and contribute to the ones that match their interests. Seeking a better way to navigate the massive link-sharing social network, computer science student Randal Olson (Michigan State) and sociologist Zachary Neal (Michigan State) teamed up to map similarities among subreddits.

Their research produced an interactive map called RedditViz that shows which subreddits have overlapping participants. You can see a screenshot of the interactive map below.

Image Credit: Randal Olson/Zachary Neal/RedditViz


To create the map, Olson and Neal collected the 1,000 most recent link submissions and comments from over 850,000 anonymized, active Reddit users. With this data, they could see which subreddits each individual user posted to.

If a user had 10 or more submissions to a particular subreddit, the researchers categorized them as an active poster for that subreddit. After analyzing the hundreds of thousands of users, the team checked to see if the number of shared active posters between two subreddits surpassed a statistically significant level (from the perspective of both subreddits) relative to the team's null model. If both subreddits passed this test, the researchers connected them on the map.
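For readers who think in code, the procedure above can be sketched roughly as follows. This is a minimal illustration, not the researchers' actual implementation: the 10-submission threshold and the "both subreddits must pass" rule come from the description above, but the null model is not spelled out here, so the sketch swaps in a disparity-filter-style test as a stand-in. The significance level `ALPHA` and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

ACTIVE_THRESHOLD = 10  # from the article: 10+ submissions marks an active poster
ALPHA = 0.05           # assumed significance level; not stated in the article


def active_posters(posts):
    """Map each subreddit to its set of active posters.

    posts: iterable of (user, subreddit) pairs, one per submission or comment.
    """
    counts = Counter(posts)
    members = defaultdict(set)
    for (user, sub), n in counts.items():
        if n >= ACTIVE_THRESHOLD:
            members[sub].add(user)
    return members


def shared_poster_weights(members):
    """Count shared active posters for every pair of subreddits."""
    weights = {}
    for a, b in combinations(sorted(members), 2):
        shared = len(members[a] & members[b])
        if shared:
            weights[(a, b)] = shared
    return weights


def p_value(weight, strength, degree):
    """Stand-in null model (disparity-filter style, an assumption).

    Probability that an edge receives at least this share of a node's
    total weight if that total were split uniformly at random across
    the node's `degree` edges.
    """
    if degree <= 1:
        return 1.0  # a lone edge cannot stand out; treated as not significant
    return (1 - weight / strength) ** (degree - 1)


def backbone(weights, alpha=ALPHA):
    """Keep an edge only if it is significant from BOTH subreddits' perspectives."""
    strength, degree = Counter(), Counter()
    for (a, b), w in weights.items():
        for node in (a, b):
            strength[node] += w
            degree[node] += 1
    return {
        (a, b)
        for (a, b), w in weights.items()
        if p_value(w, strength[a], degree[a]) < alpha
        and p_value(w, strength[b], degree[b]) < alpha
    }
```

Calling `backbone(shared_poster_weights(active_posters(posts)))` on a list of `(user, subreddit)` pairs returns the set of subreddit pairs that would be connected on the map under these assumptions. Testing from both ends matters: a small subreddit may share most of its posters with a giant one, but the link is only drawn if that overlap also stands out from the giant's side.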

While simply zooming in on the big interactive map proved fascinating, my favorite (and perhaps the most useful) map tool is the ability to search for a particular subreddit and see related ones. Looking up the physics subreddit, for instance, showed its connection to subreddits such as /r/astronomy, /r/math, and /r/space, as you might expect.

Image Credit: Randal Olson/Zachary Neal/RedditViz

Other connections weren't quite as obvious, however. Plugging in /r/personalfinance, for example, yields a number of related subreddits that may not immediately come to mind: /r/malefashionadvice, /r/keto (for ketogenic dieting), and /r/minimalism, to name a few. Try it out for yourself, and you might just find something unexpected.

I'm also curious how, if at all, the use of "throwaway" accounts may have altered the data. Many users create temporary accounts when seeking advice or revealing a personal story. Throwaway accounts pervade certain subreddits (e.g. advice subreddits such as /r/relationships) but not others, possibly skewing the data ever so slightly. That's assuming throwaways are used for 10 or more comments/submissions, of course.

Olson and Neal detailed their methodology in a paper published on the arXiv late last week. Check it out to satisfy your hunger for data visualization.
