Comparing the *** selected regions regarding the number of Reddit users, the United States is leading the ranking (****** million users) and is followed by the United Kingdom with ***** million users. At the other end of the spectrum is Gabon with **** million users, indicating a difference of ****** million users to the United States. User figures, shown here with regards to the platform reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. Reddit users encompass both users that are logged in and those that are not. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
The number of Reddit users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 10.3 million users (+5.21 percent). After the ninth consecutive year of growth, the Reddit user base is estimated to reach 208.12 million users in 2028, a new peak. Notably, the number of Reddit users has increased continuously over the past years. User figures, shown here with regards to the platform reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. Reddit users encompass both users that are logged in and those that are not. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Reddit users in countries like Mexico and Canada.
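The relationship between the three quoted figures (absolute increase, percentage increase, and 2028 level) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the 2024 baseline is not stated in the description but derived here from the other two numbers.

```python
def implied_baseline(final_value: float, absolute_increase: float) -> float:
    """Return the starting level implied by a final level and an absolute increase."""
    return final_value - absolute_increase


def percent_change(baseline: float, absolute_increase: float) -> float:
    """Return the increase expressed as a percentage of the baseline."""
    return 100.0 * absolute_increase / baseline


# Figures quoted in the description (million users).
forecast_2028 = 208.12
increase_2024_to_2028 = 10.3

# Derived values; the baseline is an inference, not a published number.
baseline_2024 = implied_baseline(forecast_2028, increase_2024_to_2028)
growth_pct = percent_change(baseline_2024, increase_2024_to_2028)

print(f"implied 2024 level: {baseline_2024:.2f} million users")
print(f"implied growth: +{growth_pct:.2f} percent")
```

The derived percentage agrees with the quoted +5.21 percent, which suggests the figures are internally consistent at the stated rounding.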
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Reddit is a social news, content rating and discussion website, and one of the most popular sites on the internet. Reddit has 52 million daily active users and approximately 430 million users who use it at least once a month. Reddit is organized into subreddits; this dataset uses the r/AskScience subreddit.
The dataset is extracted from the subreddit r/AskScience. The data was collected between 01-01-2016 and 20-05-2022. It contains 612,668 data points and 25 columns. The dataset contains information about the questions asked on the subreddit, including the description of the submission, the flair of the question, NSFW or SFW status, the year of the submission, and more (see the descriptions of individual columns below). The data was extracted using Python and Pushshift's API, with light cleaning done using NumPy and pandas.
The dataset contains the following columns and descriptions:
- author: Redditor name.
- author_fullname: Redditor full name.
- contest_mode: Contest mode (implements obscured scores and randomized sorting).
- created_utc: Time the submission was created, represented in Unix time.
- domain: Domain of the submission.
- edited: Whether or not the post has been edited.
- full_link: Link to the post on the subreddit.
- id: ID of the submission.
- is_self: Whether or not the submission is a self post (text-only).
- link_flair_css_class: CSS class used to identify the flair.
- link_flair_text: The link flair's text content.
- locked: Whether or not the submission has been locked.
- num_comments: The number of comments on the submission.
- over_18: Whether or not the submission has been marked as NSFW.
- permalink: A permalink for the submission.
- retrieved_on: Time the submission was ingested.
- score: The number of upvotes for the submission.
- description: Description of the submission.
- spoiler: Whether or not the submission has been marked as a spoiler.
- stickied: Whether or not the submission is stickied.
- thumbnail: Thumbnail of the submission.
- question: Question asked in the submission.
- url: The URL the submission links to, or the permalink if a self post.
- year: Year of the submission.
- banned: Whether or not the submission was banned by a moderator.
This dataset can be used for flair prediction, NSFW classification, and other text mining/NLP tasks. Exploratory data analysis can also be used to surface trends and patterns over the years.
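As a starting point for that kind of exploratory analysis, the sketch below converts `created_utc` from Unix time to a calendar year and tallies submissions per year and per flair. It uses only the standard library and a handful of made-up rows; the sample values and the helper name `submission_year` are assumptions for illustration, and only the column names come from the list above.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical rows that mimic three of the documented columns.
sample_rows = [
    {"created_utc": 1451606400, "link_flair_text": "Physics", "over_18": False},
    {"created_utc": 1483228800, "link_flair_text": "Biology", "over_18": False},
    {"created_utc": 1483315200, "link_flair_text": "Physics", "over_18": True},
]


def submission_year(created_utc: int) -> int:
    """Convert a Unix timestamp (seconds) to its UTC calendar year."""
    return datetime.fromtimestamp(created_utc, tz=timezone.utc).year


# Count submissions per year and per flair.
posts_per_year = Counter(submission_year(row["created_utc"]) for row in sample_rows)
posts_per_flair = Counter(row["link_flair_text"] for row in sample_rows)

# Share of submissions marked NSFW, a quick class-balance check before classification.
nsfw_share = sum(row["over_18"] for row in sample_rows) / len(sample_rows)

print(dict(posts_per_year))
print(dict(posts_per_flair))
print(f"NSFW share: {nsfw_share:.2f}")
```

With the full dataset, the same tallies would typically be computed with pandas, which the description says was used for cleaning.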
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file contains the posting preferences for over 850,000 active reddit users. This sample was taken in mid-2013. This data was used to generate the interactive visualization, "redditviz," and will be analyzed in detail in an upcoming research article. Please cite our paper "Navigating the massive world of reddit" if you use this data in your work. URL: http://arxiv.org/abs/1312.3387 The file is organized as follows: Each line is an entry for an anonymous user. Each user was randomly assigned a unique ID, which is what shows in the first entry of each line. Following the user ID, separated by commas, are the subreddits (i.e., interests) that the user regularly posts in. In order for a user to be considered "active" in that subreddit, they had to post or comment there at least 10 times in their last 1,000 posts and comments.
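The file layout described above (one user per line, the user ID first, then comma-separated subreddits) can be read with a short parser. This is a minimal sketch, not the authors' code: the sample lines, IDs, and function names are hypothetical, and it assumes that subreddit names contain no commas.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def parse_user_line(line: str) -> Tuple[str, List[str]]:
    """Split one line into (user_id, [subreddits]), ignoring empty fields."""
    fields = [field.strip() for field in line.strip().split(",")]
    user_id = fields[0]
    subreddits = [field for field in fields[1:] if field]
    return user_id, subreddits


def subreddit_popularity(lines: Iterable[str]) -> Counter:
    """Count how many users are active in each subreddit."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, subreddits = parse_user_line(line)
        counts.update(set(subreddits))  # count each user at most once per subreddit
    return counts


# Hypothetical sample in the described format.
sample_lines = [
    "0,askscience,python",
    "1,python",
    "2,gaming,askscience,python",
]

popularity = subreddit_popularity(sample_lines)
print(popularity.most_common(2))
```

For the real file, `sample_lines` would be replaced by an open file handle, which can be iterated line by line without loading the whole file into memory.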
https://creativecommons.org/publicdomain/zero/1.0/
By Reddit [source]
This dataset contains detailed information on posts, scores and comments from the Reddit subreddit ‘CryptoCurrency’ - a fascinating online community devoted to discussion and analysis of the latest developments in blockchain investments, digital currencies, and other associated topics. Dive into the data to see what ultimate insights cryptocurrency enthusiasts are offering each other - their post titles, scores (the net upvotes a post has received), comment counts, created dates and timestamps are all laid out here for easy exploration. By taking advantage of this unique snapshot into crypto discussions and trends you can gain a better understanding not only of what topics have been popular over time but also how they're being discussed across this passionate community. Are there particular trends or patterns that emerge? It's up to you to uncover them!
This dataset contains posts and comments from the subreddit ‘CryptoCurrency’, which is a widely-followed discussion board devoted to discussing cryptocurrencies, blockchain investments, and other related topics. The dataset contains a large number of posts from the subreddit and their associated scores, comment counts and creation timestamps. This dataset can be used in numerous ways for both research and practical business applications.
First, let's explore what columns are contained within this dataset: title, score, url, comms_num (number of comments), created (date and time the post was created), body (actual content of the post), and timestamp. With this information at hand you can begin answering key questions such as: What types of topics attract more attention? Which topics are not popular? Is there any correlation between higher scores (upvotes) and more comments?
To answer these questions, a number of tools can be applied to this data, including Natural Language Processing tools such as TF-IDF vectorizers or Latent Dirichlet Allocation to identify the themes that dominate these conversations. Additionally, unsupervised learning techniques such as k-means clustering or Principal Component Analysis could help uncover structure in the data. For example, to find out which words in titles are associated with higher scores, titles could be clustered by their vocabulary and the score distribution of each cluster compared, giving an overview of how related words influence scores before analyzing post content or other factors.
Furthermore, Reddit users actively engage with posts, so comment counts offer another view of popularity. For example, one may observe that posts about new coins tend to receive more comments than usual, indicating high user engagement at certain moments compared to regular periods; this could be useful when comparing individual coins.
Overall, the value of this data depends on the use case, whether pure research or applied analytics aimed at predicting prices, engagement, or user sentiment. Used properly, the cryptocurrency discussion found on Reddit can serve multiple purposes.
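Before reaching for vectorizers or clustering, the title-word question above can be approached with a much simpler baseline: the mean score of posts whose title contains a given word. The sketch below implements only that baseline (not TF-IDF, LDA, or any clustering method), using the standard library and made-up `(title, score)` pairs; the sample posts, the function name, and the `min_count` threshold are assumptions for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mean_score_by_word(posts: Iterable[Tuple[str, float]], min_count: int = 2) -> Dict[str, float]:
    """Return the mean post score for each title word seen in at least min_count posts."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for title, score in posts:
        # Use a set so a word repeated within one title is counted once per post.
        for word in set(re.findall(r"[a-z]+", title.lower())):
            totals[word] += score
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals if counts[word] >= min_count}


# Hypothetical (title, score) pairs standing in for the dataset's columns.
sample_posts = [
    ("Bitcoin hits new high", 120),
    ("Why bitcoin fees matter", 80),
    ("New wallet released", 10),
    ("Wallet security tips", 30),
]

word_scores = mean_score_by_word(sample_posts)
for word, mean_score in sorted(word_scores.items(), key=lambda item: -item[1]):
    print(f"{word}: {mean_score:.1f}")
```

The `min_count` filter drops words that appear in a single post, since their mean score would reflect one post rather than a pattern. Differences in mean score show association only, not that a word causes higher scores.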
- This dataset can be used to create a sentiment analysis of the comments and posts on CryptoCurrency topics and how these conversations have changed over time. This can help ascertain how different events within the crypto market have been received by investors, speculators, and other users on the subreddit.
- The dataset can also be utilized to identify trends in successful topics of conversation (in terms of post scores) and give insight into what types of topics are popular among Redditors in the CryptoCurrency space.
- Furthermore, this dataset could provide insight into user behavior on CryptoCurrency subreddits by enabling analysis of peak times for certain conversations, of post popularity, and of which users tend to post or comment more frequently than others.
If you use this dataset in your research, please credit the original authors.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The ego-nets of Eastern European users collected from the music streaming service Deezer in February 2020. Nodes are users and edges are mutual follower relationships. The related task is the prediction of gender for the ego node in the graph.
The social networks of developers who starred popular machine learning and web development repositories (with at least 10 stars) until August 2019. Nodes are users and links are follower relationships. The task is to decide whether a social network belongs to web or machine learning developers. We only included the largest component (with at least 10 users) of each graph.
Discussion and non-discussion based threads from Reddit which we collected in May 2018. Nodes are Reddit users who participate in a discussion and links are replies between them. The task is to predict whether a thread is discussion based or not (binary classification).
The ego-nets of Twitch users who participated in the partnership program in April 2018. Nodes are users and links are friendships. The binary classification task is to predict, using the ego-net, whether the ego user plays a single game or multiple games. Players who play a single game usually have a denser ego-net.
Stanford Network Analysis Platform (SNAP) is a general purpose, high performance system for analysis and manipulation of large networks. Graphs consist of nodes and directed/undirected/multiple edges between the graph nodes. Networks are graphs with data on nodes and/or edges of the network.
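The density remark can be made concrete. For a simple undirected graph with N nodes and E edges, density is 2E / (N(N-1)), the fraction of possible node pairs that are connected. The sketch below computes it from an edge list using only the standard library; the two toy ego-nets and the function name are hypothetical and are not drawn from the dataset.

```python
from typing import Hashable, Iterable, Tuple


def graph_density(edges: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Density of a simple undirected graph given as an edge list.

    Self-loops are ignored and duplicate or reversed edges are counted once.
    Nodes are taken from the edge list, so isolated nodes are not represented.
    """
    nodes = set()
    unique_edges = set()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        nodes.update((u, v))
        unique_edges.add(frozenset((u, v)))
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(unique_edges) / (n * (n - 1))


# Hypothetical ego-nets: "ego" plus three friends.
# Dense case: every pair of users is connected.
dense_ego_net = [("ego", "a"), ("ego", "b"), ("ego", "c"), ("a", "b"), ("a", "c"), ("b", "c")]
# Sparse case: friends are connected only through the ego (a star).
sparse_ego_net = [("ego", "a"), ("ego", "b"), ("ego", "c")]

print(graph_density(dense_ego_net))
print(graph_density(sparse_ego_net))
```

Density alone is a single hand-crafted feature; it is shown here only to illustrate the observation about single-game players, not as the classification method used with this dataset.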
The core SNAP library is written in C++ and optimized for maximum performance and compact graph representation. It easily scales to massive networks with hundreds of millions of nodes, and billions of edges. It efficiently manipulates large graphs, calculates structural properties, generates regular and random graphs, and supports attributes on nodes and edges. Besides scalability to large graphs, an additional strength of SNAP is that nodes, edges and attributes in a graph or a network can be changed dynamically during the computation.
SNAP was originally developed by Jure Leskovec in the course of his PhD studies. The first release was made available in November 2009. SNAP uses GLib, a general purpose STL (Standard Template Library)-like library developed at the Jozef Stefan Institute. SNAP and GLib are being actively developed and used in numerous academic and industrial projects.
https://creativecommons.org/publicdomain/zero/1.0/
Birds Aren't Real (r/BirdsArentReal) is the official subreddit for the "most woke among us". It is described as "a safe haven for believers to gather, support one another in these times of adversity, and share images and stories that propel the cause forward. The birds work for the bourgeoisie".
A bit of context here: a significant number of members of Generation Z actively propagate (as a joke or seriously) the myth that birds don't exist anymore because they were gradually replaced with drones by the government.
The movement has gained momentum recently. Here is a selection of articles documenting this strange phenomenon:
* Birds Aren’t Real, or Are They? Inside a Gen Z Conspiracy Theory
* ‘Birds Aren’t Real’: How A Parody Conspiracy Movement Fought ‘misinformation With Lunacy’
The data is not filtered.
Reddit posts and comments from the subreddit r/BirdsArentReal.
Script used for collection can be found here: Reddit extract content
Use the texts in this dataset to:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset for the article "Survey on the US National Security Agency (NSA): Opinions, Concern and Information Security Awareness", presented at the "Central European Conference on Information and Intelligent Systems (CECIIS)" in September 2015.
Dataset of a simple survey with 8 questions. The survey was constructed as an online survey and posted to the r/SampleSize community on the Reddit website (www.reddit.com). The survey was active 14-22 January 2014 and collected 444 answers.
Survey questions and possible answers were as follows (the order of response alternatives in the questionnaire was as presented):
1. How would you judge your level of proficiency in using information technology? Answers (5 levels): Not proficient at all, Slightly proficient, Somewhat proficient, Reasonably proficient, Very proficient.
2. Where do you live? Answers (4 categories): US, EU, BRIC countries (Brazil, Russia, India, China), Other.
3. How familiar are you with function of the NSA (United States National Security Agency)? Answers (5 levels): Not familiar at all, Slightly familiar, Somewhat familiar, Reasonably familiar, Very familiar.
4. What is your general opinion on the NSA? Answers (9 levels): Extremely negative, Decidedly negative, Somewhat negative, Slightly negative, Neutral, Slightly positive, Somewhat positive, Decidedly positive, Extremely positive.
5. How familiar are you with actions of Edward Snowden in relation to the NSA activities? Answers (5 levels): Not familiar at all, Slightly familiar, Somewhat familiar, Reasonably familiar, Very familiar.
6. Did the NSA revelations (information disseminated by Edward Snowden) change your opinion of the NSA? Answers (7 levels): Greatly diminished, Somewhat diminished, Slightly diminished, Did not influence, Slightly improved, Somewhat improved, Greatly improved.
7. Do you find NSA revelations concerning? Answers (5 levels): Not concerning at all, Slightly concerning, Somewhat concerning, Decidedly concerning, Very concerning.
8. Did the NSA revelations increase your information security awareness? Answers (5 levels): Not at all, Slightly increased, Somewhat increased, Decidedly increased, Greatly increased.
The number of Reddit users in Africa was forecast to increase continuously between 2024 and 2028 by a total of 4.7 million users (+66.67 percent). After the eighth consecutive year of growth, the Reddit user base is estimated to reach 11.78 million users in 2028, a new peak. Notably, the number of Reddit users has increased continuously over the past years. User figures, shown here with regards to the platform reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. Reddit users encompass both users that are logged in and those that are not. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Reddit users in regions like North America and Asia.
This statistic shows a ranking of the estimated number of Reddit users in 2020 in Africa, differentiated by country. The user numbers have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. Reddit users encompass both users that are logged in and those that are not. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in more than 150 countries and regions worldwide. All input data are sourced from international institutions, national statistical offices, and trade associations. All data have been processed to generate comparable datasets (see supplementary notes under details for more information).