How We Index Subreddits
Last updated June 2026
Transparency matters for a discovery tool. Here is exactly how RedditShuffle builds its index of 1.45M+ communities, how it classifies content, and what the filters actually do.
Frequently asked questions
How does RedditShuffle decide if a subreddit is SFW or NSFW?
Every community in the index is classified in two steps. First, we use the over-18 flag that Reddit itself reports for a community; when present, that flag is treated as ground truth. When it is missing, we fall back to a name-and-description heuristic that looks for explicit terms, with an override list to prevent false positives (for example, r/EarthPorn and r/sexeducation are not adult communities despite their names). The default mode is always SFW, so adult communities are excluded unless you explicitly turn on the NSFW toggle. As an extra safety layer, NSFW pages are also disallowed for search-engine crawlers in robots.txt.
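The classification order described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production classifier: the function name classify, the term list EXPLICIT_TERMS, and the override list SFW_OVERRIDES are all hypothetical, and the real lists are considerably longer.

```python
from typing import Optional

# Hypothetical term and override lists, for illustration only.
EXPLICIT_TERMS = {"porn", "nsfw", "sex", "xxx"}
SFW_OVERRIDES = {"earthporn", "sexeducation"}


def classify(name: str, description: str = "", over18: Optional[bool] = None) -> str:
    """Return "nsfw" or "sfw" for a single community."""
    # Step 1: Reddit's own over-18 flag wins whenever it is present.
    if over18 is not None:
        return "nsfw" if over18 else "sfw"
    # Step 2a: the override list catches known false positives by name.
    if name.lower() in SFW_OVERRIDES:
        return "sfw"
    # Step 2b: fall back to a substring search over name and description.
    text = f"{name} {description}".lower()
    return "nsfw" if any(term in text for term in EXPLICIT_TERMS) else "sfw"
```

For example, classify("EarthPorn") returns "sfw" because of the override, while classify("gardening", over18=True) returns "nsfw" because the reported flag takes priority over everything else. Plain substring matching is deliberately naive here; it is the kind of check that makes an override list necessary in the first place.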
What does the "Popular subreddits only" filter do?
By default the shuffle pulls from the entire index, including small and niche communities. The Popular-only toggle restricts results to communities above a subscriber threshold, so you land on established, active subreddits rather than tiny or dormant ones. In practice this popular subset is roughly the top 150,000 SFW communities by member count, and it is refreshed whenever the index is rebuilt. This is the main difference between RedditShuffle and Reddit's built-in random button, which selects uniformly from every public community and frequently drops you on dead subreddits.
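As a rough sketch of how such a toggle could work, the Python below ranks communities by subscriber count, keeps the top N, and picks uniformly from whichever pool is active. The function names popular_subset and shuffle_pick and the dictionary layout are assumptions made for this example; only the 150,000 figure comes from the description above.

```python
import random
from typing import Dict, List, Optional


def popular_subset(subscribers: Dict[str, int], top_n: int = 150_000) -> List[str]:
    """Names of the top_n communities by subscriber count, largest first."""
    ranked = sorted(subscribers, key=lambda name: subscribers[name], reverse=True)
    return ranked[:top_n]


def shuffle_pick(
    subscribers: Dict[str, int],
    popular_only: bool = False,
    top_n: int = 150_000,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick one community uniformly from the full index or the popular subset."""
    pool = popular_subset(subscribers, top_n) if popular_only else list(subscribers)
    if not pool:
        raise ValueError("the index is empty")
    return (rng or random).choice(pool)
```

Precomputing the popular subset at index-build time, rather than sorting on every request as this sketch does, would be the obvious choice for an index of this size.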
How often is the index updated?
The subreddit index is rebuilt periodically from a combination of public Reddit data sources and curated lists. Subscriber counts for the largest communities are refreshed from a third-party stats source, so the most prominent communities reflect current numbers. For live, up-to-the-minute data on a single community, every detail page can call Reddit's official API to pull the current subscriber count, description, and creation date on demand.
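An on-demand lookup of this kind might look like the Python sketch below. It assumes Reddit's public about.json endpoint and its usual response fields (display_name, subscribers, public_description, created_utc); the helper names, the User-Agent string, and the choice of the unauthenticated endpoint are illustrative assumptions, and a production integration may well use OAuth instead.

```python
import json
import urllib.request
from datetime import datetime, timezone


def about_url(name: str) -> str:
    """URL of the public 'about' document for a subreddit."""
    return f"https://www.reddit.com/r/{name}/about.json"


def parse_about(payload: dict) -> dict:
    """Extract the fields shown on a detail page from an 'about' response."""
    data = payload["data"]
    created = datetime.fromtimestamp(data["created_utc"], tz=timezone.utc)
    return {
        "name": data["display_name"],
        "subscribers": data.get("subscribers"),
        "description": data.get("public_description", ""),
        "created": created.date().isoformat(),
    }


def fetch_live(name: str, user_agent: str = "example-script/0.1") -> dict:
    """Fetch and parse live data for one community (performs a network call)."""
    request = urllib.request.Request(about_url(name), headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_about(json.load(response))
```

Keeping the parsing separate from the network call makes the field mapping easy to check against a saved response without hitting Reddit's rate limits.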
Where does the subreddit data come from?
The index is assembled from publicly available Reddit community metadata - subreddit names, public descriptions, subscriber counts, and over-18 flags - merged across several open datasets and de-duplicated. We do not scrape private communities, user data, or post content. The index covers more than 1.45 million SFW communities plus a separately gated NSFW set. Banned and deleted communities are filtered out using zero-subscriber and known-ban signals so the tool does not send you to dead links.
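The merge, de-duplication, and dead-community filtering described above could be approximated as follows. This is a sketch under assumptions: the record layout, the merge_datasets name, the case-insensitive key, and the rule of keeping whichever duplicate reports more subscribers are all hypothetical choices for illustration, not a description of the actual pipeline.

```python
from typing import AbstractSet, Dict, Iterable, List


def merge_datasets(
    datasets: Iterable[Iterable[dict]],
    banned: AbstractSet[str] = frozenset(),
) -> List[dict]:
    """Merge community records, de-duplicate by name, and drop dead entries."""
    merged: Dict[str, dict] = {}
    for dataset in datasets:
        for record in dataset:
            # Subreddit names are case-insensitive, so key on the lowercase form.
            key = record["name"].lower()
            current = merged.get(key)
            # Assumed policy: keep the duplicate with the larger subscriber count.
            if current is None or record.get("subscribers", 0) > current.get("subscribers", 0):
                merged[key] = dict(record)
    # Drop known-banned names and anything reporting zero subscribers.
    return [
        record
        for key, record in merged.items()
        if key not in banned and record.get("subscribers", 0) > 0
    ]
```

The names in the banned set are expected in lowercase, matching the merge key.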