64 datasets found
  1. Reddit usage reach in the United States 2023, by ethnicity

    • statista.com
    Updated Feb 17, 2025
    Cite
    Statista (2025). Reddit usage reach in the United States 2023, by ethnicity [Dataset]. https://www.statista.com/statistics/261770/share-of-us-internet-users-who-use-reddit-by-ethnicity/
    Dataset updated
    Feb 17, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 1, 2024 - Jun 10, 2024
    Area covered
    United States
    Description

    According to a survey of internet users conducted in the United States between February and June 2024, 14 percent of Black Americans reported having ever used Reddit. Asian Americans were more likely than both Black and white Americans to have used the social media and community forum, with 36 percent of respondents in that demographic reporting having done so.

  2. Reddit usage reach in the United States 2024, by age group

    • statista.com
    Updated Feb 17, 2025
    Cite
    Statista (2025). Reddit usage reach in the United States 2024, by age group [Dataset]. https://www.statista.com/statistics/261766/share-of-us-internet-users-who-use-reddit-by-age-group/
    Dataset updated
    Feb 17, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 1, 2024 - Jun 10, 2024
    Area covered
    United States
    Description

    According to a survey of adults in the United States in 2024, 46 percent of respondents who used Reddit were aged between 19 and 29 years. Reddit usage tends to be affected by users’ age, with older users reporting lower levels of engagement.

    Reddit engagement in numbers

    Reddit is one of the most popular websites in the forum category, allowing users to interact in multiple close-knit communities organized in sub-threads and divided by topic. In March 2024, Reddit.com registered an average of 2.2 billion monthly visits from desktop and mobile combined. Reddit users are mostly based in North America, with the United States accounting for the biggest share of traffic worldwide by far.

    The future of Reddit

    Reddit was created in 2005 and was redesigned for the first time in 2018 to make it more appealing to new users and to increase engagement from non-participating guests (jokingly called “lurkers”) who nonetheless enjoy the content. In February 2024, the company announced it was entering the public market by releasing its S-1 registration statement. In 2024, the company generated around 1.3 billion U.S. dollars in revenue worldwide. This translated into an average revenue per user (ARPU) of around 4.21 dollars in the last quarter of 2024.

  3. Distribution of Reddit.com traffic 2024, by country

    • statista.com
    Updated Nov 11, 2024
    Cite
    Statista (2024). Distribution of Reddit.com traffic 2024, by country [Dataset]. https://www.statista.com/statistics/325144/reddit-global-active-user-distribution/
    Dataset updated
    Nov 11, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In the six months ending March 2024, the United States accounted for 48.46 percent of traffic to the online forum Reddit.com. The United Kingdom ranked second, accounting for 7.16 percent of web visits to the social media platform.

    Reddit in the United States

    In August 2023, Reddit accounted for slightly over 1.6 percent of social media website traffic in the United States. Founded in 2005, Reddit is a discussion website that enables users to aggregate news by posting links and letting other users vote and comment on them. There are thousands of subforums, called subreddits, on a wide range of topics. One of the most popular subreddits is the AMA (“Ask Me Anything”), where celebrities, public figures, or people in unique positions post threads that allow other Reddit users to ask them anything. In 2022, Nicolas Cage's AMA post generated over 238.5 thousand upvotes, making it the most popular AMA of the year.

    Reddit users in the United States

    Reddit use in the United States is more prevalent among younger online audiences. A February 2021 survey found that 36 percent of internet users aged 18 to 29 years and 22 percent of users aged 30 to 49 years used Reddit. However, the reach of the social platform declines strongly with age. Also, whilst around 23 percent of male adults in the U.S. access Reddit, only 12 percent of women do the same.

  4. Reddit usage penetration in the UK 2020, by age group

    • statista.com
    Updated May 20, 2025
    Cite
    Statista (2025). Reddit usage penetration in the UK 2020, by age group [Dataset]. https://www.statista.com/statistics/1184024/reddit-user-demographics/
    Dataset updated
    May 20, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United Kingdom
    Description

    As of the third quarter of 2020, almost a quarter of UK internet users aged 26 to 35 years used news aggregator and social media platform Reddit. The platform was just as popular with younger internet users aged 15 to 25 years. Just one percent of internet users over the age of 56 said they used Reddit.

  5. Reddit: distribution of global audiences 2024, by gender

    • statista.com
    Updated Feb 13, 2025
    Cite
    Statista (2025). Reddit: distribution of global audiences 2024, by gender [Dataset]. https://www.statista.com/statistics/1255182/distribution-of-users-on-reddit-worldwide-gender/
    Dataset updated
    Feb 13, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    As of the third quarter of 2024, the majority of Reddit users were male, accounting for 59.8 percent of its audience base. Overall, women accounted for roughly 39.1 percent of the website users. Additionally, most of Reddit's desktop users were based in the United States.

  6. Reddit app user ratio in the U.S. 2021, by age group

    • statista.com
    Updated Feb 26, 2024
    Cite
    Statista (2024). Reddit app user ratio in the U.S. 2021, by age group [Dataset]. https://www.statista.com/statistics/1125159/reddit-us-app-users-age/
    Dataset updated
    Feb 26, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Mar 2021
    Area covered
    United States
    Description

    As of March 2021, users in their twenties and thirties accounted for almost two-thirds of active Reddit user accounts in the United States. Users aged 20 to 29 years accounted for 28.1 percent of the social news app's user base on the Android platform.

  7. Reddit users in the United States 2019-2028

    • statista.com
    • ai-chatbox.pro
    Updated Jun 13, 2024
    Cite
    Statista Research Department (2024). Reddit users in the United States 2019-2028 [Dataset]. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    United States
    Description

    The number of Reddit users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 10.3 million users (+5.21 percent). After the ninth consecutive year of growth, the Reddit user base is estimated to reach 208.12 million users, a new peak, in 2028. Notably, the number of Reddit users has increased continuously over the past years.

    User figures, shown here for the platform Reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. Reddit users encompass both users that are logged in and those that are not.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

    Find more key insights for the number of Reddit users in countries like Mexico and Canada.
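    The forecast figures quoted for the United States can be cross-checked with simple arithmetic. The sketch below assumes the 10.3 million increase is measured from the 2024 value to the 2028 value; the implied 2024 base is derived here and is not stated in the listing.

```python
# Figures as stated in the description (million users).
forecast_2028 = 208.12
increase = 10.3  # assumed to be the change from 2024 to 2028

# Derived values: the 2024 base is an inference, not a quoted figure.
implied_2024 = forecast_2028 - increase
growth_pct = increase / implied_2024 * 100

print(round(implied_2024, 2), round(growth_pct, 2))
```

    Under that assumption the derived growth rate rounds to the quoted +5.21 percent, so the stated numbers are internally consistent.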

  8. Reddit user worldwide 2024, by country

    • statista.com
    Updated Jul 10, 2025
    Cite
    Statista (2025). Reddit user worldwide 2024, by country [Dataset]. https://www.statista.com/forecasts/1174696/reddit-user-by-country
    Dataset updated
    Jul 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 1, 2024 - Dec 31, 2024
    Area covered
    Albania
    Description

    Comparing the *** selected regions by number of Reddit users, the United States leads the ranking (****** million users), followed by the United Kingdom with ***** million users. At the other end of the spectrum is Gabon with **** million users, a difference of ****** million users from the United States.

    User figures, shown here for the platform Reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. Reddit users encompass both users that are logged in and those that are not.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  9. SocialGrep Reddit Comment & Sentiment

    • opendatabay.com
    Updated Jul 5, 2025
    Cite
    Datasimple (2025). SocialGrep Reddit Comment & Sentiment [Dataset]. https://www.opendatabay.com/data/ai-ml/1ed11deb-f713-4db9-9e9a-55c0f9107164
    Dataset updated
    Jul 5, 2025
    Dataset authored and provided by
    Datasimple
    Area covered
    Data Science and Analytics
    Description

    This dataset provides an in-depth corpus of posts and comments from the Reddit board /r/datasets, covering its entire history up to 1st March 2022. Its primary purpose is to serve as a collection of datasets related to Reddit content, enabling analysts and data scientists to explore online community data. The data was acquired using SocialGrep. To safeguard user privacy, usernames have been excluded from this dataset, preventing targeted harassment and preserving anonymity. It includes details such as comment body text, sentiment analysis, and comment scores, offering a rich resource for various analytical tasks.

    Columns

    • type: Denotes the type of the data point.
    • id: A unique Base-36 identifier for each comment.
    • subreddit.id: A unique Base-36 identifier for the subreddit where the comment was posted.
    • subreddit.name: The human-readable name of the subreddit.
    • subreddit.nsfw: Indicates whether the comment's subreddit is Not Safe For Work (NSFW).
    • created_utc: The timestamp in Coordinated Universal Time (UTC) when the comment was created.
    • permalink: The permanent link to the comment on Reddit.
    • body: The main text content of the comment.
    • sentiment: The analysed sentiment score for the comment's body text.
    • score: The numerical score assigned to the comment.

    Distribution

    The dataset is structured as a single table containing all comments, typically provided as a CSV file. Each of the key columns (id, subreddit.id, created_utc, permalink, body, sentiment, and score) holds 54,848 values. All 54,848 values in the subreddit.nsfw column are 'false', meaning no NSFW subreddits are included. In the body column, 5% of comments are '[deleted]', 2% are '[removed]', and the remaining 93% consist of other content. Sentiment scores range from -1.00 to 1.00, and comment scores range from -65 to 195, each with varying frequencies across the range.
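    As a minimal sketch of how the body-column shares above might be checked, the snippet below parses a tiny in-memory CSV that uses the listed column names and computes the share of '[deleted]' and '[removed]' bodies. The sample rows, the placeholder URLs, and the helper name `body_shares` are hypothetical; the real file name and exact CSV dialect are not stated in the listing.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; only the column names come from the listing.
SAMPLE_CSV = """type,id,subreddit.id,subreddit.name,subreddit.nsfw,created_utc,permalink,body,sentiment,score
comment,abc1,xxxxx,datasets,false,1640995200,https://example.invalid/1,Great dataset,0.62,3
comment,abc2,xxxxx,datasets,false,1640995300,https://example.invalid/2,[deleted],,1
comment,abc3,xxxxx,datasets,false,1640995400,https://example.invalid/3,[removed],,1
comment,abc4,xxxxx,datasets,false,1640995500,https://example.invalid/4,Where is the source?,-0.10,2
"""


def body_shares(csv_text):
    """Return the fraction of rows whose body is '[deleted]', '[removed]', or other."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = Counter(
        row["body"] if row["body"] in ("[deleted]", "[removed]") else "other"
        for row in rows
    )
    total = len(rows)
    return {label: counts[label] / total for label in ("[deleted]", "[removed]", "other")}


shares = body_shares(SAMPLE_CSV)
print(shares)
```

    Filtering out the '[deleted]' and '[removed]' rows before any sentiment work is a common first step, since those rows carry no usable text.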

    Usage

    This dataset is ideally suited for data science and analytics projects. It can be used for:

    • Natural Language Processing (NLP) tasks, such as text analysis and sentiment classification.
    • Studying the dynamics of online communities and social networks.
    • Analyzing user sentiment towards various topics discussed on Reddit.
    • Exploring the factors influencing comment scores and engagement.
    • Developing models for content moderation or recommendation based on Reddit data.

    Coverage

    The dataset spans a significant time range, including all posts and comments from the inception of the /r/datasets board up to 1st March 2022. Its geographic scope is global, representing activity across Reddit's platform without specific regional limitations. The demographic scope primarily focuses on the users interacting within the /r/datasets community on Reddit. As mentioned, usernames are specifically excluded to ensure user anonymity.

    License

    CC-BY

    Who Can Use It

    This dataset is valuable for a wide range of users, including:

    • Data scientists and analysts looking for real-world social media data for their projects.
    • Researchers in fields such as computer science, social networks, and linguistics, for studying online behaviour and communication patterns.
    • Developers creating applications that involve text analysis or sentiment prediction.
    • Anyone interested in gaining insights into Reddit communities and their discussions.

    Dataset Name Suggestions

    • Reddit /r/datasets Comment Log
    • Analysed Reddit Community Posts
    • SocialGrep Reddit Comment & Sentiment
    • Reddit Data Science Discussions
    • Online Community Text Data

    Attributes

    Original Data Source: The Reddit Dataset Dataset

  10. Reddit users in Israel 2020-2028

    • ai-chatbox.pro
    • statista.com
    Updated Oct 31, 2024
    Cite
    Statista Research Department (2024). Reddit users in Israel 2020-2028 [Dataset]. https://www.ai-chatbox.pro/?_=%2Ftopics%2F9744%2Fsocial-media-in-israel%2F%23XgboD02vawLbpWJjSPEePEUG%2FVFd%2Bik%3D
    Dataset updated
    Oct 31, 2024
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    Israel
    Description

    The number of Reddit users in Israel was forecast to increase between 2024 and 2028 by a total of 0.01 million users (+0.76 percent). This overall increase is not continuous, notably not in 2027. The Reddit user base is estimated to amount to 1.32 million users in 2028. Notably, the number of Reddit users has increased continuously over the past years.

    User figures, shown here for the platform Reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. Reddit users encompass both users that are logged in and those that are not.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

    Find more key insights for the number of Reddit users in countries like Bahrain and Kuwait.

  11. Reddit usage reach in the United States 2021, by education

    • statista.com
    Updated Feb 26, 2024
    Cite
    Statista (2024). Reddit usage reach in the United States 2021, by education [Dataset]. https://www.statista.com/statistics/261776/share-of-us-internet-users-who-use-reddit-by-education-level/
    Dataset updated
    Feb 26, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 25, 2021 - Feb 8, 2021
    Area covered
    United States
    Description

    According to a February 2021 survey of internet users based in the United States, respondents who attended college were more likely to use Reddit than respondents with lower levels of education. Of respondents with a bachelor's or advanced degree, 26 percent reported using the social network, compared to only nine percent of respondents holding a high school diploma or less.

  12. Reddit Data Science Community Conversations

    • opendatabay.com
    Updated Jul 6, 2025
    Cite
    Datasimple (2025). Reddit Data Science Community Conversations [Dataset]. https://www.opendatabay.com/data/ai-ml/a27d0e5e-f087-4294-ba4d-f03598447dda
    Dataset updated
    Jul 6, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Social Media and Networking
    Description

    This dataset contains posts and comments extracted from the r/datascience subreddit, a highly active discussion forum on Reddit with over 600,000 contributors. It offers valuable insights into the conversations and trends within the data science community, providing raw material for various analytical endeavours. The content is directly generated by the subreddit's contributors, reflecting authentic community engagement.

    Columns

    • title: The textual title of a Reddit post.
    • score: The score or upvote count for a post or comment, indicating its popularity or agreement.
    • id: A unique identifier assigned to each post or comment.
    • url: The web address for the Reddit post or an associated external link.
    • comms_num: The total number of comments associated with a specific post.
    • created: The Unix timestamp indicating when the post or comment was created.
    • body: The main textual content of a Reddit post or comment.
    • timestamp: Another timestamp field, likely similar to 'created', marking the time of creation.

    Distribution

    The dataset is typically provided in CSV format.

    • Score Distribution: Scores vary significantly, ranging from -91 to 2952. A large proportion of entries, specifically 20,526, fall within the -91.00 to 61.15 score range. Another view indicates 20,762 entries are in the 0.00 to 31.75 score range. There are 21,095 unique score values.
    • Time Coverage Distribution: The data covers a period from December 9, 2021, to April 22, 2022. There are 20,573 unique timestamp values. Activity peaks in late March 2022, with up to 2,830 entries in a single week.

    Usage

    This dataset is ideal for:

    • Analysing discussion topics prevalent within the r/datascience subreddit.
    • Understanding the tone of conversations among data science professionals and enthusiasts.
    • Identifying the dominant sentiment expressed in posts and comments.
    • Exploring the lexical particularities unique to the data science community's discussions.
    • Tracking trends and shifts in popular topics and opinions over time.

    Coverage

    The dataset offers global coverage regarding the community discussions. It spans a distinct time range from December 9, 2021, to April 22, 2022. The content reflects the diverse perspectives of over 600,000 contributors to the r/datascience subreddit, providing a wide demographic scope of individuals interested in data science.

    License

    CC0

    Who Can Use It

    • Data scientists and machine learning engineers for natural language processing (NLP) tasks such as topic modeling, sentiment analysis, or text classification.
    • Social media analysts and researchers studying online community behaviour, trends, and user engagement patterns.
    • Linguists and computational linguists examining the specific language usage within professional online forums.
    • Academic researchers interested in the evolution of discussions within the data science field.

    Dataset Name Suggestions

    • Reddit Data Science Community Conversations
    • r/datascience Subreddit Activity Log
    • Data Science Forum Discussions Archive
    • Reddit Data Science Posts and Comments

    Attributes

    Original Data Source: Data Science on Reddit

  13. Ask Reddit Community Conversations

    • opendatabay.com
    Updated Jul 6, 2025
    Cite
    Datasimple (2025). Ask Reddit Community Conversations [Dataset]. https://www.opendatabay.com/data/ai-ml/3e1a278c-48c0-483a-90ee-31de07745fcf
    Dataset updated
    Jul 6, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Social Media and Networking
    Description

    This dataset features posts and comments sourced from r/AskReddit, one of Reddit's largest online communities. It serves as a rich repository of questions and answers covering a diverse array of random topics in the English language. The data is unfiltered, making it a valuable resource for sentiment analysis and identifying discussion themes within social media contexts.

    Columns

    • title: The title of the post (relevant for posts).
    • score: The score of the post, indicative of its impact and the number of comments it received (relevant for posts).
    • id: A unique identifier for each post or comment.
    • url: The URL of the post thread (relevant for posts).
    • comms_num: The total number of comments associated with a particular post (relevant for posts).
    • created: The date on which the post or comment was created.
    • body: The main text content of the post or comment.
    • timestamp: A numerical timestamp indicating the time of creation.

    Distribution

    The dataset is typically provided in CSV format. It contains both posts and comments from the r/AskReddit subreddit. While specific overall record counts are not stated, the data includes over 44,900 unique entries for 'score' and features a wide range of timestamps, from approximately 1.63 billion to 1.64 billion, indicating a significant volume of data collected over time. Data collection occurs daily.
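    The timestamp bounds quoted above read more naturally as calendar dates. The sketch below assumes the 'timestamp' column holds Unix time in seconds (UTC) and converts the two rounded bounds; the helper name `unix_to_utc` is hypothetical, and because the listing rounds the bounds, the resulting dates are only approximate.

```python
from datetime import datetime, timezone


def unix_to_utc(ts):
    """Convert a Unix timestamp in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


# Rounded bounds quoted in the listing: about 1.63 and 1.64 billion seconds.
start = unix_to_utc(1_630_000_000)
end = unix_to_utc(1_640_000_000)

print(start.date().isoformat(), end.date().isoformat())
```

    Both bounds fall in the second half of 2021, which is broadly in line with the September 2021 to January 2022 range mentioned in the coverage notes.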

    Usage

    This dataset is ideally suited for various applications, including:

    • Performing sentiment analysis on user-generated content.
    • Identifying trending discussion topics and popular themes.
    • Developing and testing Natural Language Processing (NLP) models.
    • Analysing patterns in online communities and social networks.

    Coverage

    The dataset's coverage is global in region and is collected daily. It encompasses content in the English language. Sample data indicates content from dates ranging from at least September 2021 to January 2022. The dataset captures contributions from a wide variety of users within the r/AskReddit community, without specific demographic filtering mentioned.

    License

    CC0

    Who Can Use It

    This dataset is beneficial for:

    • Data scientists and researchers focused on text analysis and NLP.
    • Social media analysts aiming to understand community dynamics and public sentiment.
    • AI and Machine Learning developers creating models for content classification or topic extraction.
    • Academics studying online communication patterns and user behaviour.

    Dataset Name Suggestions

    • Ask Reddit Community Conversations
    • Reddit Public Discourse Dataset
    • AskReddit NLP Data Collection
    • Global Forum Q&A Archive

    Attributes

    Original Data Source: Ask Reddit

  14. PANDORA Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Jan 31, 2023
    Cite
    Matej Gjurković; Mladen Karan; Iva Vukojević; Mihaela Bošnjak; Jan Šnajder (2023). PANDORA Dataset [Dataset]. https://paperswithcode.com/dataset/pandora
    Dataset updated
    Jan 31, 2023
    Authors
    Matej Gjurković; Mladen Karan; Iva Vukojević; Mihaela Bošnjak; Jan Šnajder
    Description

    PANDORA is the first large-scale dataset of Reddit comments labeled with three personality models (including the well-established Big 5 model) and demographics (age, gender, and location) for more than 10k users.

  15. Reddit Tales From The Job Dataset

    • opendatabay.com
    Updated Jul 6, 2025
    Cite
    Datasimple (2025). Reddit Tales From The Job Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/1b16d0e8-dc05-49ac-a57b-ee13ae309c1c
    Dataset updated
    Jul 6, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Social Media and Networking
    Description

    This dataset contains Reddit posts and comments collected from the r/talesfromthejob subreddit, an online community where individuals share stories and experiences related to their employment. The data, which has not been filtered, was compiled using praw (The Python Reddit API Wrapper). It provides a rich source of social media text for various analytical purposes, including sentiment analysis and identifying discussion topics related to workplace narratives. Both posts and comments are included, offering a dual perspective on user-generated content.

    Columns

    • title: The title of the Reddit post.
    • score: The score of the post, indicating its impact or popularity, often based on upvotes and the number of comments.
    • id: A unique identifier for each post or comment.
    • url: The URL linking directly to the Reddit post thread.
    • comms_num: The total number of comments associated with a given post.
    • created: The date on which the post or comment was created.
    • body: The main text content of the post or comment.
    • timestamp: A numerical timestamp indicating the time of creation.

    Distribution

    The dataset is typically provided in a CSV format. It includes both Reddit posts and their corresponding comments. While specific row or record counts are not stated, the data ranges approximately from March 2012 to January 2022. The dataset is intended for global use.

    Usage

    This dataset is ideally suited for:

    • Performing sentiment analysis on workplace experiences.
    • Identifying common discussion topics and themes related to jobs and employment.
    • Natural Language Processing (NLP) research and model training.
    • Studying online community behaviour and content patterns on social media platforms.

    Coverage

    The data's geographic scope is global, reflecting content from a worldwide user base on Reddit. The time range covered by the dataset extends from March 2012 to January 2022. There are no specific notes on data availability for particular demographic groups, as the content is derived from a public subreddit.

    License

    CC0

    Who Can Use It

    This dataset is valuable for:

    • Data scientists and analysts interested in social media content.
    • Researchers studying workplace culture, sentiment, or online communication.
    • AI and LLM developers looking for text data to train models on conversational or narrative content.
    • Organisations seeking insights into employee experiences or public sentiment regarding work environments.

    Dataset Name Suggestions

    • Reddit Tales From The Job Dataset
    • Workplace Stories Reddit Data
    • Job Experiences Text Collection
    • Tales From The Job Reddit Posts & Comments

    Attributes

    Original Data Source: Reddit Tales From The Job

  16. Cleaned Reddit Depression Data

    • opendatabay.com
    Updated Jul 2, 2025
    Cite
    Datasimple (2025). Cleaned Reddit Depression Data [Dataset]. https://www.opendatabay.com/data/ai-ml/26a7ea70-6f7a-44dc-aad2-3960a391254c
    Dataset updated
    Jul 2, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Mental Health & Wellness
    Description

    This dataset provides cleaned text content from Reddit posts, specifically curated for mental health classification. It is designed to facilitate the development of machine learning models that can identify and classify content related to depression. The raw data was initially collected through web scraping various Subreddits and has undergone processing using multiple Natural Language Processing (NLP) techniques to ensure cleanliness and usability. All content within the dataset is in the English language.

    Columns

    • clean_text: This column contains the processed and cleaned text from Reddit posts. It is the primary input feature for classification tasks.
    • is_depression: This column serves as the label for each post. It is a binary indicator, with '1' signifying that the post is classified as relating to depression and '0' indicating it is not. The dataset contains 3,900 instances labelled as 0 (non-depression) and 3,831 instances labelled as 1 (depression).

    Distribution

    The dataset typically comes in a tabular format, most commonly as a CSV file. It comprises 7,650 unique records or rows, each representing a single Reddit post with its corresponding cleaned text and depression label. While specific file size information is not provided, its structure is straightforward, consisting of two distinct columns.
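A minimal sketch of how the two-column layout described above could be read and its label balance checked, using only the Python standard library. The column names `clean_text` and `is_depression` come from the description; the helper name `count_labels` and the in-memory sample rows are invented for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file):
    """Count rows per is_depression label in a CSV with the two described columns."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["is_depression"]) for row in reader)


# Invented placeholder rows standing in for the real file (assumption, not dataset content).
sample = io.StringIO(
    "clean_text,is_depression\n"
    "placeholder text a,1\n"
    "placeholder text b,0\n"
    "placeholder text c,0\n"
)

counts = count_labels(sample)
print("label 0:", counts[0], "label 1:", counts[1])
```

Running the same count on the downloaded file (opened with `open(path, newline="", encoding="utf-8")`) is a quick way to compare the per-label totals with the stated number of records.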

    Usage

    This dataset is ideally suited for a variety of applications in the field of text classification and natural language processing. It can be effectively used to:

    • Train and evaluate machine learning models for detecting mental health-related content.
    • Develop tools for sentiment analysis or topic modelling within social media data.
    • Support research into online discussions about mental well-being and depression.
    • Build automated systems for content moderation or early intervention in digital mental health.

    Coverage

    The geographic scope of the dataset is global, as the source material from Reddit is not restricted to any particular region. The data is entirely in the English language. Specific demographic details of the original post creators are not included. Information regarding the precise time range of data collection is not available in the provided sources.

    License

    CC0

    Who Can Use It

    This dataset is valuable for a wide range of users, including:

    • Data scientists and machine learning engineers who are building and optimising text classification models.
    • Researchers in mental health, social sciences, and computational linguistics exploring online discourse.
    • Developers creating applications that leverage AI for mental health support or content analysis.
    • Academic institutions and students engaged in NLP projects or studies on social media data.

    Dataset Name Suggestions

    • Reddit Mental Health Posts
    • Depression Text Classifier
    • Cleaned Reddit Depression Data
    • Social Media Mental Health Classification
    • NLP Depression Dataset

    Attributes

    Original Data Source: [Depression: Reddit Dataset (Cleaned)]

  17. Reddit usage reach in the United States 2021, by annual household income

    • statista.com
    Updated Feb 26, 2024
    Cite
    Statista (2024). Reddit usage reach in the United States 2021, by annual household income [Dataset]. https://www.statista.com/statistics/261774/share-of-us-internet-users-who-use-reddit-by-annual-income/
    Dataset updated
    Feb 26, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 25, 2021 - Feb 8, 2021
    Area covered
    United States
    Description

    This statistic shows the share of adults in the United States who were using Reddit as of February 2021, sorted by income. During that period of time, ten percent of respondents earning 30,000 U.S. dollars or less used the social networking site.

  18. Reddit usage and demographics of cases and controls.

    • plos.figshare.com
    xls
    Updated Sep 8, 2023
    Cite
    William U. Meyerson; Rick H. Hoyle (2023). Reddit usage and demographics of cases and controls. [Dataset]. http://doi.org/10.1371/journal.pone.0291173.t001
    Available download formats: xls
    Dataset updated
    Sep 8, 2023
    Dataset provided by
    PLOS ONE
    Authors
    William U. Meyerson; Rick H. Hoyle
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Reddit usage and demographics of cases and controls.

  19. Reddit social dimensions

    • figshare.com
    bz2
    Updated Nov 27, 2022
    Cite
    Luca Maria Aiello; Sagar Joglekar; Daniele Quercia (2022). Reddit social dimensions [Dataset]. http://doi.org/10.6084/m9.figshare.19918231.v1
    Available download formats: bz2
    Dataset updated
    Nov 27, 2022
    Dataset provided by
    figshare
    Authors
    Luca Maria Aiello; Sagar Joglekar; Daniele Quercia
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Accompanying dataset for paper "Multidimensional Tie Strength and Economic Development"

    reddit_messages_dimensions:

    • author: reddit users who sent the message
    • time: timestamp of message
    • dest_author: recipient of message
    • author_state_code: estimated state of residence for the sender
    • dest_state_code: estimated state of residence for the receiver
    • {dimension}_binary_adaptive_{threshold}: binary score that indicates whether the message expresses the given dimension when filtering messages according to the specified threshold
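The `{dimension}_binary_adaptive_{threshold}` naming scheme above can be split back into its two parts with a regular expression. This is a sketch under stated assumptions: the helper name `parse_score_column` and the example column `exampledim_binary_adaptive_0.5` are hypothetical, since the listing does not give the actual dimension names or threshold values.

```python
import re

# Matches names of the form {dimension}_binary_adaptive_{threshold}.
SCORE_COLUMN = re.compile(r"^(?P<dimension>.+)_binary_adaptive_(?P<threshold>.+)$")


def parse_score_column(name):
    """Return (dimension, threshold) for a score column, or None for other columns."""
    match = SCORE_COLUMN.match(name)
    if match is None:
        return None
    return match.group("dimension"), match.group("threshold")


# Hypothetical header: the first five names come from the description,
# the last one is an invented example of the score-column pattern.
columns = [
    "author",
    "time",
    "dest_author",
    "author_state_code",
    "dest_state_code",
    "exampledim_binary_adaptive_0.5",
]

score_columns = {c: parse_score_column(c) for c in columns if parse_score_column(c)}
print(score_columns)
```

The threshold is kept as a string because the listing does not say how thresholds are written in the real column names.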

    regression_file:

    • state_code: the 2-letter code of the state
    • state: name of the state
    • lat_centroid: latitude of the state centroid
    • lon_centroid: longitude of the state centroid
    • population_2010: resident population in 2010
    • population_2015: resident population in 2015
    • population_2019: resident population in 2019
    • gdp_per_capita_2017: GDP per capita in year 2017
    • user_count: number of Reddit users in the dataset
    • {spatial|social}_diversity_{dimension}_{threshold}: measure of spatial|social diversity calculated for a given social dimension, using the specified threshold on the dimension scores
    • capital: capital of the state
    • diversity_{social|spatial}_all_minstrength_{n}: measure of spatial|social diversity calculated using all social links and a minimum threshold on n messages sent

  20. Reddit brand profile in the UK 2024

    • statista.com
    • ai-chatbox.pro
    Updated Nov 22, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2024). Reddit brand profile in the UK 2024 [Dataset]. https://www.statista.com/forecasts/1304385/reddit-social-media-brand-profile-in-the-united-kingdom
    Dataset updated
    Nov 22, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 2024
    Area covered
    United Kingdom
    Description

    How high is the brand awareness of Reddit in the UK?
    When it comes to social media users, brand awareness of Reddit is at 73 percent in the UK. The survey was conducted using the concept of aided brand recognition, showing respondents both the brand's logo and the written brand name.

    How popular is Reddit in the UK?
    In total, 16 percent of UK social media users say they like Reddit.

    What is the usage share of Reddit in the UK?
    All in all, 16 percent of social media users in the UK use Reddit.

    How loyal are the users of Reddit?
    Around 13 percent of social media users in the UK say they are likely to use Reddit again.

    What's the buzz around Reddit in the UK?
    In 2024, about 8 percent of UK social media users had heard about Reddit in the media, on social media, or in advertising over the past four weeks.

    If you want to compare brands, do deep-dives by survey items of your choice, filter by total online population or users of a certain brand, or drill down on your very own hand-tailored target groups, our Consumer Insights Brand KPI survey has you covered.

Reddit usage reach in the United States 2023, by ethnicity

5 scholarly articles cite this dataset (View in Google Scholar)
