100+ datasets found

Internet Use Characteristics of 27 Participants Who Self-Reported Problem...
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 1, 2023
Cite
Wen Li; Jennifer E. O’Brien; Susan M. Snyder; Matthew O. Howard (2023). Internet Use Characteristics of 27 Participants Who Self-Reported Problem Internet Use. [Dataset]. http://doi.org/10.1371/journal.pone.0117372.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0117372.t002
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Wen Li; Jennifer E. O’Brien; Susan M. Snyder; Matthew O. Howard
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A score of 5 or more on Young's Diagnostic Questionnaire (YDQ) indicates Internet addiction (IA). YDQ scores of 3 or 4 indicate potential IA. A score of 21 or more on the Compulsive Internet Use Scale (CIUS) indicates compulsive Internet use.
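The cut-offs in this description can be written as a small rule. The sketch below is only an illustration of those thresholds: the function names and the "below threshold" label are assumptions, not part of the dataset.

```python
def classify_ydq(score: int) -> str:
    """Map a YDQ score to the categories named in the description."""
    if score >= 5:
        return "Internet addiction"
    if score in (3, 4):
        return "potential IA"
    # The description does not name this category; the label is an assumption.
    return "below threshold"


def is_compulsive_cius(score: int) -> bool:
    """CIUS >= 21 indicates compulsive Internet use."""
    return score >= 21
```

For example, `classify_ydq(4)` falls in the "potential IA" band, while a CIUS score of 20 stays just under the compulsive-use cut-off.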
Number of internet and social media users worldwide 2025
statista.com
abripper.com
Updated Oct 16, 2025
Cite
Statista (2025). Number of internet and social media users worldwide 2025 [Dataset]. https://www.statista.com/statistics/617136/digital-population-worldwide/
Dataset updated
Oct 16, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
World
Description
As of October 2025, 6.04 billion individuals worldwide were internet users, which amounted to 73.2 percent of the global population. Of this total, 5.66 billion, or 68.7 percent of the world's population, were social media users.

Global internet usage
Connecting billions of people worldwide, the internet is a core pillar of the modern information society. Northern Europe ranked first among worldwide regions by the share of the population using the internet in 2025. In the Netherlands, Norway, and Saudi Arabia, 99 percent of the population used the internet as of February 2025. North Korea was at the opposite end of the spectrum, with virtually no internet usage penetration among the general population, ranking last worldwide. Eastern Asia was home to the largest number of online users worldwide, over 1.34 billion at the latest count. Southern Asia ranked second, with around 1.2 billion internet users. China, India, and the United States rank ahead of other countries worldwide by the number of internet users.

Worldwide internet user demographics
As of 2024, the share of female internet users worldwide was 65 percent, five percentage points lower than that of men. Gender disparity in internet usage was bigger in African countries, with around a 10-percent difference. Worldwide regions, like the Commonwealth of Independent States and Europe, showed a smaller usage gap between these two genders. As of 2024, global internet usage was higher among individuals between 15 and 24 years old across all regions, with young people in Europe showing the highest usage penetration, 98 percent. In comparison, the worldwide average for the age group of 15 to 24 years was 79 percent. The income level of the countries was also an essential factor for internet access, as 93 percent of the population of the countries with high income reportedly used the internet, as opposed to only 27 percent of the low-income markets.
Data_Sheet_1_Investigating links between Internet literacy, Internet use,...
frontiersin.figshare.com
datasetcatalog.nlm.nih.gov
docx
Updated Sep 7, 2023
Cite
Qiaolei Jiang; Zonghai Chen; Zizhong Zhang; Can Zuo (2023). Data_Sheet_1_Investigating links between Internet literacy, Internet use, and Internet addiction among Chinese youth and adolescents in the digital age.docx [Dataset]. http://doi.org/10.3389/fpsyt.2023.1233303.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fpsyt.2023.1233303.s001
Dataset updated
Sep 7, 2023
Dataset provided by
Frontiers
Authors
Qiaolei Jiang; Zonghai Chen; Zizhong Zhang; Can Zuo
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: In the current digital era, adolescents’ Internet use has increased exponentially, with the Internet playing a more and more important role in their education and entertainment. However, due to the ongoing cognitive, emotional, and social development processes, youth and adolescents are more vulnerable to Internet addiction. Attention has been paid to the increased use of the Internet during the COVID-19 pandemic and the influence of Internet literacy in prevention and intervention of Internet addiction.

Methods: The present study proposes a conceptual model to investigate the links between Internet literacy, Internet use of different purpose and duration, and Internet addiction among Chinese youth and adolescents. In this study, N = 2,276 adolescents studying in primary and secondary schools in East China were recruited, and they completed self-reports on sociodemographic characteristics, Internet literacy scale, Internet use, and Internet addiction scale.

Results: The results showed a significant relationship between Internet use and Internet addiction. To be specific, the duration of Internet use significantly and positively affected Internet addiction. With different dimensions of Internet literacy required, entertainment-oriented Internet use had a positive impact on Internet addiction, while education-oriented Internet use exerted negative effects on Internet addiction. As for Internet literacy, knowledge and skills for Internet (positively) and Internet self-management (negatively) significantly influenced the likelihood of Internet addiction.

Discussion: The findings suggest that Internet overuse increases the risk of Internet addiction in youth and adolescents, while entertainment-oriented rather than education-oriented Internet use is addictive. The role of Internet literacy is complicated, with critical Internet literacy preventing the development of Internet addiction among youth and adolescents, while functional Internet literacy increases the risk.
Global number of internet users 2005-2024
statista.com
Updated May 6, 2025
Cite
Statista (2025). Global number of internet users 2005-2024 [Dataset]. https://www.statista.com/statistics/273018/number-of-internet-users-worldwide/
Dataset updated
May 6, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
World
Description
As of 2024, the estimated number of internet users worldwide was 5.5 billion, up from 5.3 billion in the previous year. This figure represents 68 percent of the global population.

Internet access around the world
Easier access to computers, the modernization of countries worldwide, and increased utilization of smartphones have allowed people to use the internet more frequently and conveniently. However, internet penetration often pertains to the current state of development regarding communications networks. As of January 2023, there were approximately 1.05 billion total internet users in China and 692 million total internet users in the United States.

Online activities
Social networking is one of the most popular online activities worldwide, and Facebook is the most popular online network based on active usage. As of the fourth quarter of 2023, there were over 3.07 billion monthly active Facebook users, accounting for well more than half of the internet users worldwide. Connecting with family and friends, expressing opinions, entertainment, and online shopping are amongst the most popular reasons for internet usage.
Data from: Characteristics of Internet Addiction/Pathological Internet Use...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Feb 3, 2015
Cite
Howard, Matthew O.; O’Brien, Jennifer E.; Snyder, Susan M.; Li, Wen (2015). Characteristics of Internet Addiction/Pathological Internet Use in U.S. University Students: A Qualitative-Method Investigation [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001897105
Dataset updated
Feb 3, 2015
Authors
Howard, Matthew O.; O’Brien, Jennifer E.; Snyder, Susan M.; Li, Wen
Area covered
United States
Description
Studies have identified high rates and severe consequences of Internet Addiction/Pathological Internet Use (IA/PIU) in university students. However, most research concerning IA/PIU in U.S. university students has been conducted within a quantitative research paradigm, and frequently fails to contextualize the problem of IA/PIU. To address this gap, we conducted an exploratory qualitative study using the focus group approach and examined 27 U.S. university students who self-identified as intensive Internet users, spent more than 25 hours/week on the Internet for non-school or non-work-related activities and who reported Internet-associated health and/or psychosocial problems. Students completed two IA/PIU measures (Young’s Diagnostic Questionnaire and the Compulsive Internet Use Scale) and participated in focus groups exploring the natural history of their Internet use; preferred online activities; emotional, interpersonal, and situational triggers for intensive Internet use; and health and/or psychosocial consequences of their Internet overuse. Students’ self-reports of Internet overuse problems were consistent with results of standardized measures. Students first accessed the Internet at an average age of 9 (SD = 2.7), and first had a problem with Internet overuse at an average age of 16 (SD = 4.3). Sadness and depression, boredom, and stress were common triggers of intensive Internet use. Social media use was nearly universal and pervasive in participants’ lives. Sleep deprivation, academic under-achievement, failure to exercise and to engage in face-to-face social activities, negative affective states, and decreased ability to concentrate were frequently reported consequences of intensive Internet use/Internet overuse. IA/PIU may be an underappreciated problem among U.S. university students and warrants additional research.
Attitudes towards the internet in Australia 2025
statista.com
Updated Apr 11, 2025
Cite
Umair Bashir (2025). Attitudes towards the internet in Australia 2025 [Dataset]. https://www.statista.com/topics/1145/internet-usage-worldwide/
Dataset updated
Apr 11, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Umair Bashir
Description
When asked about "Attitudes towards the internet", most Australian respondents pick "It is important to me to have mobile internet access in any place" as an answer. 55 percent did so in our online survey in 2025. Looking to gain valuable insights about users of internet providers worldwide? Check out our reports on consumers who use internet providers. These reports give readers a thorough picture of these customers, including their identities, preferences, opinions, and methods of communication.
Average daily time spent on social media worldwide 2012-2025
statista.com
Cite
Statista, Average daily time spent on social media worldwide 2012-2025 [Dataset]. https://www.statista.com/statistics/433871/daily-social-media-usage-worldwide/
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
As of February 2025, the average daily social media usage of internet users worldwide amounted to 141 minutes per day, down from 143 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of 3 hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just 2 hours and 16 minutes.

Global social media usage
Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.

Global impact of social media
Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics, and heightened everyday distractions.
Same News - Different Sources
kaggle.com
zip
Updated Oct 28, 2022
Cite
The Devastator (2022). Same News - Different Sources [Dataset]. https://www.kaggle.com/datasets/thedevastator/same-news-different-sources
Available download formats: zip (262582 bytes)
Dataset updated
Oct 28, 2022
Authors
The Devastator
Description
Same News Different Sources

How different sources report on the same events

About this dataset

Do you ever feel like you're being inundated with news from all sides, and you can't keep up? Well, you're not alone. In today's age of social media and 24-hour news cycles, it can be difficult to know what's going on in the world. And with so many different news sources to choose from, it can be hard to know who to trust.

That's where this dataset comes in. It captures data related to individuals' Sentiment Analysis toward different news sources. The data was collected by administering a survey to individuals who use different news sources. The survey responses were then analyzed to obtain the sentiment score for each news source.

So if you're feeling overwhelmed by the news, don't worry – this dataset has you covered. With its insights on which news sources are trustworthy and which ones aren't, you'll be able to make informed decisions about what to read – and what to skip

How to use the dataset

The Twitter Sentiment Analysis dataset can be used to analyze the impact of social media on news consumption. This data can be used to study how individuals' sentiments towards different news sources vary based on the source they use. The dataset can also be used to study how different factors, such as the time of day or the topic of the news, affect an individual's sentiments

Research Ideas

Identify which news sources are most trusted by the public.

Understand what topics are most important to the public.

Understand how different news sources report on the same issue

Columns

File: news.csv

| Column name | Description |
|:-----------------------|:------------------------------------------------------|
| Title | The title of the news article. (String) |
| Date | The date the news article was published. (Date) |
| Time | The time the news article was published. (Time) |
| Score | The sentiment score of the news article. (Float) |
| Number of Comments | The number of comments on the news article. (Integer) |

File: news_api.csv

| Column name | Description |
|:--------------|:------------------------------------------------|
| Title | The title of the news article. (String) |
| Date | The date the news article was published. (Date) |
| Source | The news source the article is from. (String) |

File: politics.csv

| Column name | Description |
|:-----------------------|:------------------------------------------------------|
| Title | The title of the news article. (String) |
| Date | The date the news article was published. (Date) |
| Time | The time the news article was published. (Time) |
| Score | The sentiment score of the news article. (Float) |
| Number of Comments | The number of comments on the news article. (Integer) |

File: sports.csv

| Column name | Description |
|:-----------------------|:------------------------------------------------------|
| Title | The title of the news article. (String) |
| Date | The date the news article was published. (Date) |
| Time | The time the news article was published. (Time) |
| Score | The sentiment score of the news article. (Float) |
| Number of Comments | The number of comments on the news article. (Integer) |

File: television.csv

| Column name | Description |
|:-----------------------|:------------------------------------------------------|
| Title | The title of the news article. (String) |
| Date | The date the news article was published. (Date) |
| Time | The time the news article was published. (Time) |
| Score | The sentiment score of the news article. (Float) |
| Number of Comments | The number of comments on the news article. (Integer) |

File: trending.csv | Column name | Description ...
Replication Data for: Social internet use by people with ID
dataverse.nl
csv, pdf, xlsx
Updated Mar 24, 2025
Cite
Hannah Van Alem; Noud Frielink; Petri J. C. M. Embregts (2025). Replication Data for: Social internet use by people with ID [Dataset]. http://doi.org/10.34894/EFDBZW
Available download formats: xlsx(2460512), pdf(679873), csv(5826122), pdf(177390), xlsx(146171), xlsx(113455), pdf(173046)
Unique identifier
https://doi.org/10.34894/EFDBZW
Dataset updated
Mar 24, 2025
Dataset provided by
DataverseNL
Authors
Hannah Van Alem; Noud Frielink; Petri J. C. M. Embregts
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Data for overview of peer-reviewed articles up to November 2024 on the reasons for social internet usage by people with intellectual disabilities. RQ: Why do people with ID engage in social internet use?
Data from: Internet access - households and individuals
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Aug 7, 2020
Cite
Office for National Statistics (2020). Internet access - households and individuals [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/householdcharacteristics/homeinternetandsocialmediausage/datasets/internetaccesshouseholdsandindividualsreferencetables
Available download formats: xlsx
Dataset updated
Aug 7, 2020
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Annual data on internet usage in Great Britain, including frequency of internet use, internet activities and internet purchasing.
Age distribution of internet users worldwide 2024
statista.com
Cite
Statista, Age distribution of internet users worldwide 2024 [Dataset]. https://www.statista.com/statistics/272365/age-distribution-of-internet-users-worldwide/
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2024
Area covered
Worldwide
Description
As of February 2024, over a third of online users worldwide were aged between 25 and 34 years. Website visitors in this age bracket constituted the biggest group of online users worldwide. Also, 19 percent of global online users were aged 18 to 24 years. The global digital population aged 65 or older represented approximately 4.2 percent of all internet users worldwide.

Social media usage and Meta
Social media is a major driver of internet use, with a global penetration rate of 62.2 percent. On average, internet users spend 143 minutes per day on social media, highlighting its significant impact on daily online activities. The usage of social media is mostly dominated by Meta platforms, which own four of the largest social media platforms. Facebook leads the ranking with over three billion active users, followed by Instagram and WhatsApp.

Instagram's global popularity
Meta’s social video platform, Instagram, had long been one of the most engaging social media platforms worldwide, and it was projected to reach 1.44 billion monthly active users. Instagram was particularly favored by users aged 18 to 34, thanks to its ability to offer a variety of interactive content, from images to carousels. This diverse range of content types was a key factor in its popularity among its young user base.
Data Use in Academia Dataset
datacatalog.worldbank.org
csv, utf-8
Updated Nov 27, 2023
Cite
Semantic Scholar Open Research Corpus (S2ORC) (2023). Data Use in Academia Dataset [Dataset]. https://datacatalog.worldbank.org/search/dataset/0065200/data_use_in_academia_dataset
Available download formats: utf-8, csv
Dataset updated
Nov 27, 2023
Dataset provided by
Semantic Scholar Open Research Corpus (S2ORC)
Brian William Stacy
License
https://datacatalog.worldbank.org/public-licenses?fragment=cc
Description
This dataset contains metadata (title, abstract, date of publication, field, etc) for around 1 million academic articles. Each record contains additional information on the country of study and whether the article makes use of data. Machine learning tools were used to classify the country of study and data use.

Our data source of academic articles is the Semantic Scholar Open Research Corpus (S2ORC) (Lo et al. 2020). The corpus contains more than 130 million English language academic papers across multiple disciplines. The papers included in the Semantic Scholar corpus are gathered directly from publishers, from open archives such as arXiv or PubMed, and crawled from the internet.

We placed some restrictions on the articles to make them usable and relevant for our purposes. First, only articles with an abstract and parsed PDF or latex file are included in the analysis. The full text of the abstract is necessary to classify the country of study and whether the article uses data. The parsed PDF and latex file are important for extracting important information like the date of publication and field of study. This restriction eliminated a large number of articles in the original corpus. Around 30 million articles remain after keeping only articles with a parsable (i.e., suitable for digital processing) PDF, and around 26% of those 30 million are eliminated when removing articles without an abstract. Second, only articles from the year 2000 to 2020 were considered. This restriction eliminated an additional 9% of the remaining articles. Finally, articles from the following fields of study were excluded, as we aim to focus on fields that are likely to use data produced by countries’ national statistical system: Biology, Chemistry, Engineering, Physics, Materials Science, Environmental Science, Geology, History, Philosophy, Math, Computer Science, and Art. Fields that are included are: Economics, Political Science, Business, Sociology, Medicine, and Psychology. This third restriction eliminated around 34% of the remaining articles. From an initial corpus of 136 million articles, this resulted in a final corpus of around 10 million articles.

Due to the intensive computer resources required, a set of 1,037,748 articles were randomly selected from the 10 million articles in our restricted corpus as a convenience sample.

The empirical approach employed in this project utilizes text mining with Natural Language Processing (NLP). The goal of NLP is to extract structured information from raw, unstructured text. In this project, NLP is used to extract the country of study and whether the paper makes use of data. We will discuss each of these in turn.

To determine the country or countries of study in each academic article, two approaches are employed based on information found in the title, abstract, or topic fields. The first approach uses regular expression searches based on the presence of ISO3166 country names. A defined set of country names is compiled, and the presence of these names is checked in the relevant fields. This approach is transparent, widely used in social science research, and easily extended to other languages. However, there is a potential for exclusion errors if a country’s name is spelled non-standardly.
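A minimal sketch of the regular-expression approach described above, assuming a small hand-written list of country names (a real run would use the full ISO 3166 list). The names `COUNTRY_NAMES` and `find_countries` are hypothetical and do not come from the project's code.

```python
import re

# Hypothetical subset of ISO 3166 country names, for illustration only.
COUNTRY_NAMES = ["Kenya", "Brazil", "India", "Republic of Korea"]

# Longer names first so that the longest alternative is tried first.
_alternatives = "|".join(
    re.escape(name) for name in sorted(COUNTRY_NAMES, key=len, reverse=True)
)
_pattern = re.compile(r"\b(" + _alternatives + r")\b", flags=re.IGNORECASE)
_canonical = {name.lower(): name for name in COUNTRY_NAMES}


def find_countries(*fields):
    """Return the set of listed countries named in any of the text fields."""
    found = set()
    for text in fields:
        for match in _pattern.finditer(text or ""):
            found.add(_canonical[match.group(1).lower()])
    return found
```

Calling `find_countries(title, abstract)` on a title mentioning "Kenya" and an abstract mentioning "brazil" links the article to both countries, while a non-standard spelling such as "Brasil" is missed. That is the exclusion error the description warns about.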

The second approach is based on Named Entity Recognition (NER), which uses machine learning to identify objects from text, utilizing the spaCy Python library. The Named Entity Recognition algorithm splits text into named entities, and NER is used in this project to identify countries of study in the academic articles. SpaCy supports multiple languages and has been trained on multiple spellings of countries, overcoming some of the limitations of the regular expression approach. If a country is identified by either the regular expression search or NER, it is linked to the article. Note that one article can be linked to more than one country.

The second task is to classify whether the paper uses data. A supervised machine learning approach is employed, where 3500 publications were first randomly selected and manually labeled by human raters using the Mechanical Turk service (Paszke et al. 2019).[1] To make sure the human raters had a similar and appropriate definition of data in mind, they were given the following instructions before seeing their first paper:

Each of these documents is an academic article. The goal of this study is to measure whether a specific academic article is using data and from which country the data came.
There are two classification tasks in this exercise:
1. Identifying whether an academic article is using data from any country
2. Identifying from which country that data came.
For task 1, we are looking specifically at the use of data. Data is any information that has been collected, observed, generated or created to produce research findings. As an example, a study that reports findings or analysis using survey data uses data. Some clues that a study uses data include whether a survey or census is described, a statistical model is estimated, or a table of means or summary statistics is reported.
After an article is classified as using data, please note the type of data used. The options are population or business census, survey data, administrative data, geospatial data, private sector data, and other data. If no data is used, then mark "Not applicable". In cases where multiple data types are used, please click multiple options.[2]
For task 2, we are looking at the country or countries that are studied in the article. In some cases, no country may be applicable. For instance, if the research is theoretical and has no specific country application. In some cases, the research article may involve multiple countries. In these cases, select all countries that are discussed in the paper.
We expect between 10 and 35 percent of all articles to use data.

The median amount of time that a worker spent on an article, measured as the time between when the article was accepted to be classified by the worker and when the classification was submitted was 25.4 minutes. If human raters were exclusively used rather than machine learning tools, then the corpus of 1,037,748 articles examined in this study would take around 50 years of human work time to review at a cost of $3,113,244, which assumes a cost of $3 per article as was paid to MTurk workers.
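The arithmetic behind these figures can be checked directly. The sketch below assumes that the median of 25.4 minutes applies to every article and that the "50 years" counts continuous, round-the-clock time; both are assumptions made here to reproduce the stated numbers, not details confirmed by the description.

```python
ARTICLES = 1_037_748                # size of the sampled corpus
COST_PER_ARTICLE_USD = 3            # rate paid to MTurk workers
MEDIAN_MINUTES_PER_ARTICLE = 25.4   # median time per classification

# Total cost of hand-labelling every article.
total_cost_usd = ARTICLES * COST_PER_ARTICLE_USD

# Total labelling time, converted to years of continuous (24/7) work.
total_minutes = ARTICLES * MEDIAN_MINUTES_PER_ARTICLE
MINUTES_PER_YEAR = 60 * 24 * 365
years_continuous = total_minutes / MINUTES_PER_YEAR

print(f"cost: ${total_cost_usd:,}")
print(f"time: {years_continuous:.1f} years of continuous work")
```

Under these assumptions the cost matches the stated $3,113,244 and the time comes out at roughly 50 years.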

A model is next trained on the 3,500 labelled articles. We use a distilled version of the BERT (Bidirectional Encoder Representations from Transformers) model to encode raw text into a numeric format suitable for predictions (Devlin et al. 2018). BERT is pre-trained on a large corpus comprising the Toronto Book Corpus and Wikipedia. The distilled version (DistilBERT) is a compressed model that is 60% the size of BERT, retains 97% of its language understanding capabilities, and is 60% faster (Sanh, Debut, Chaumond, Wolf 2019). We use PyTorch to produce a model to classify articles based on the labeled data. Of the 3,500 articles that were hand coded by the MTurk workers, 900 are fed to the machine learning model. 900 articles were selected because of computational limitations in training the NLP model. A classification of “uses data” was assigned if the model predicted an article used data with at least 90% confidence.
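The 90% confidence rule can be illustrated without the model itself. The sketch below assumes the classifier emits two raw scores (logits) ordered as [does not use data, uses data] and that "confidence" means the softmax probability of the second class. Both are assumptions, since the description does not say how confidence was computed, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def uses_data(logits, threshold=0.90):
    """Label an article "uses data" only at or above the confidence threshold.

    Assumes logits are ordered [does not use data, uses data].
    """
    return softmax(logits)[1] >= threshold
```

With this rule an article the model rates at about 88% is not labelled as using data, which trades some recall for fewer false positives.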

The performance of the models classifying articles to countries and as using data or not can be compared to the classification by the human raters. We consider the human raters as giving us the ground truth. This may underestimate the model performance if the workers at times got the allocation wrong in a way that would not apply to the model. For instance, a human rater could mistake the Republic of Korea for the Democratic People’s Republic of Korea. If both humans and the model make the same kinds of errors, then the performance reported here will be overestimated.

The model was able to predict whether an article made use of data with 87% accuracy evaluated on the set of articles held out of the model training. The correlation between the number of articles written about each country using data estimated under the two approaches is given in the figure below. The number of articles represents an aggregate total of
Internet Penetration in Percentage
figshare.com
xlsx
Updated May 18, 2021
Cite
Matheus Lotto (2021). Internet Penetration in Percentage [Dataset]. http://doi.org/10.6084/m9.figshare.14614581.v2
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.14614581.v2
Dataset updated
May 18, 2021
Dataset provided by
Figshare (http://figshare.com/)
Authors
Matheus Lotto
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Raw Data of manuscript: "Social isolation intensified the interests in toothache-related digital information during the COVID-19 pandemic"
Internet usage frequency in Germany in 2019, by activity
statista.com
Updated Nov 27, 2025
Cite
Statista (2025). Internet usage frequency in Germany in 2019, by activity [Dataset]. https://www.statista.com/statistics/1188894/internet-usage-frequency-activity-germany/
Dataset updated
Nov 27, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Mar 26, 2019 - Apr 2, 2019
Area covered
Germany
Description
In 2019, ** percent of respondents used the internet almost daily. This survey depicts the frequency of online activities in Germany in 2019. Other popular daily activities included reading articles and posts online, as well as using social media.
Stimation results for different internet usage modes.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Aug 30, 2024
Cite
Huang, Huan; Li, Xiaodi; Qin, Dongxue; Ma, Zhifei; Zhang, Xiangmin (2024). Stimation results for different internet usage modes. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001288762
Explore at:
Dataset updated
Aug 30, 2024
Authors
Huang, Huan; Li, Xiaodi; Qin, Dongxue; Ma, Zhifei; Zhang, Xiangmin
Description
Estimation results for different internet usage modes.

Number of global social network users 2017-2028

statista.com
de.statista.com

Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

How many people use social media?
Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

Who uses social media?
Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as lesser developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

How much time do people spend on social media?
Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. On average, internet users in Latin America had the highest average time spent per day on social media.

What are the most popular social media platforms?
Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.

Data from: Diagnostic Criteria for Problematic Internet Use among U.S....
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Jan 18, 2016
Howard, Matthew O.; O’Brien, Jennifer E.; Li, Wen; Snyder, Susan M. (2016). Diagnostic Criteria for Problematic Internet Use among U.S. University Students: A Mixed-Methods Evaluation [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001533824
Dataset updated
Jan 18, 2016
Authors
Howard, Matthew O.; O’Brien, Jennifer E.; Li, Wen; Snyder, Susan M.
Description
Empirical studies have identified increasing rates of problematic Internet use worldwide and a host of related negative consequences. However, researchers disagree as to whether problematic Internet use is a subtype of behavioral addiction. Thus, there are not yet widely accepted and validated diagnostic criteria for problematic Internet use. To address this gap, we used mixed methods to examine the extent to which signs and symptoms of problematic Internet use mirror DSM-5 diagnostic criteria for substance use disorder, gambling disorder, and Internet gaming disorder. A total of 27 university students who self-identified as intensive Internet users and who reported Internet-use-associated health and/or psychosocial problems were recruited. Students completed two measures that assess problematic Internet use (Young’s Diagnostic Questionnaire and the Compulsive Internet Use Scale) and participated in focus groups exploring their experiences with problematic Internet use. Results of standardized measures and focus group discussions indicated substantial overlap between students’ experiences of problematic Internet use and the signs and symptoms reflected in the DSM-5 criteria for substance use disorder, gambling disorder, and Internet gaming disorder. These signs and symptoms included: a) using the Internet longer than intended, b) preoccupation with the Internet, c) withdrawal symptoms when unable to access the Internet, d) unsuccessful attempts to stop or reduce Internet use, e) craving, f) loss of interest in hobbies or activities other than the Internet, g) excessive Internet use despite the knowledge of related problems, h) use of the Internet to escape or relieve a negative mood, and i) lying about Internet use. Tolerance, withdrawal symptoms, and recurrent Internet use in hazardous situations were uniquely manifested in the context of problematic Internet use. Implications for research and practice are discussed.
Average treatment effect of internet usage on farmers’ adoption behavior.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Aug 30, 2024
Qin, Dongxue; Li, Xiaodi; Ma, Zhifei; Huang, Huan; Zhang, Xiangmin (2024). Average treatment effect of internet usage on farmers’ adoption behavior. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001288723
Dataset updated
Aug 30, 2024
Authors
Qin, Dongxue; Li, Xiaodi; Ma, Zhifei; Huang, Huan; Zhang, Xiangmin
Description
Average treatment effect of internet usage on farmers’ adoption behavior.
Digital divide among people with disabilities: Analysis of data from a...
plos.figshare.com
xlsx
Updated Jun 2, 2023
Mariusz Duplaga (2023). Digital divide among people with disabilities: Analysis of data from a nationwide study for determinants of Internet use and activities performed online [Dataset]. http://doi.org/10.1371/journal.pone.0179825
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0179825
Dataset updated
Jun 2, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Mariusz Duplaga
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: The Internet is both an opportunity as well as a challenge for people with disabilities. However, this segment of the population is usually indicated among social groups experiencing digital divide. The study is focused on the analysis of factors determining Internet usage and undertaking specific activities online among people with disabilities based on a nationwide study performed in 2013 in Poland.
Methods: Secondary analysis was performed on the data of persons who declared disability status in 2013 “Social Diagnosis” study. Multivariate logistic regression models were developed for the use of the Internet and performing three types of activities online.
Results: Among 3,556 respondents with disability 51.02% were females, 25.19% 65 years of age and over and 33.05% were Internet users. The predictors of Internet usage included the degree of disability, place of residence, level of education, marital status, occupational status, net income, use of health care service and the use of mobile phone. The odds ratio that a person with disability belonging to the oldest category will use the Internet was only 0.04 (95% CI 0.02–0.09), when compared to the youngest category. The odds that a person with disability from the highest category of education will use the Internet were 18 times higher than in the case of persons with only basic education (OR 18.17, 95% CI 11.70–28.21). Common predictors of online activities (accessing websites of public institutions, checking and sending emails, publishing own content on the Internet) included age category and net income.
Conclusions: People with disabilities in Poland are facing a significant digital divide. The factors determining the use of the Internet in this group are similar to those of the general population. On the other hand, people with disabilities who are active online, access diversified types of services including presentation of their own content online.

Data from: WikiReddit: Tracing Information and Attention Flows Between...

zenodo.org

bin

Updated May 4, 2025

Patrick Gildersleve; Anna Beers; Viviane Ito; Agustin Orozco; Francesca Tripodi (2025). WikiReddit: Tracing Information and Attention Flows Between Online Platforms [Dataset]. http://doi.org/10.5281/zenodo.14653265

Available download formats: bin

Unique identifier

https://doi.org/10.5281/zenodo.14653265

Dataset updated

May 4, 2025

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Patrick Gildersleve; Anna Beers; Viviane Ito; Agustin Orozco; Francesca Tripodi

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Time period covered

Jan 15, 2025

Description

Preprint

Gildersleve, P., Beers, A., Ito, V., Orozco, A., & Tripodi, F. (2025). WikiReddit: Tracing Information and Attention Flows Between Online Platforms. arXiv [Cs.CY]. https://doi.org/10.48550/arXiv.2502.04942

Accepted at the International AAAI Conference on Web and Social Media (ICWSM) 2025

Abstract

The World Wide Web is a complex interconnected digital ecosystem, where information and attention flow between platforms and communities throughout the globe. These interactions co-construct how we understand the world, reflecting and shaping public discourse. Unfortunately, researchers often struggle to understand how information circulates and evolves across the web because platform-specific data is often siloed and restricted by linguistic barriers. To address this gap, we present a comprehensive, multilingual dataset capturing all Wikipedia links shared in posts and comments on Reddit from 2020 to 2023, excluding those from private and NSFW subreddits. Each linked Wikipedia article is enriched with revision history, page view data, article ID, redirects, and Wikidata identifiers. Through a research agreement with Reddit, our dataset ensures user privacy while providing a query and ID mechanism that integrates with the Reddit and Wikipedia APIs. This enables extended analyses for researchers studying how information flows across platforms. For example, Reddit discussions use Wikipedia for deliberation and fact-checking which subsequently influences Wikipedia content, by driving traffic to articles or inspiring edits. By analyzing the relationship between information shared and discussed on these platforms, our dataset provides a foundation for examining the interplay between social media discourse and collaborative knowledge consumption and production.

Datasheet

Motivation

The motivations for this dataset stem from the challenges researchers face in studying the flow of information across the web. While the World Wide Web enables global communication and collaboration, data silos, linguistic barriers, and platform-specific restrictions hinder our ability to understand how information circulates, evolves, and impacts public discourse. Wikipedia and Reddit, as major hubs of knowledge sharing and discussion, offer an invaluable lens into these processes. However, without comprehensive data capturing their interactions, researchers are unable to fully examine how platforms co-construct knowledge. This dataset bridges this gap, providing the tools needed to study the interconnectedness of social media and collaborative knowledge systems.

Composition

WikiReddit is a comprehensive dataset capturing all Wikipedia mentions (including links) shared in posts and comments on Reddit from 2020 to 2023, excluding those from private and NSFW (not safe for work) subreddits. The SQL database comprises 336K total posts, 10.2M comments, 1.95M unique links, and 1.26M unique articles spanning 59 languages on Reddit and 276 Wikipedia language subdomains. Each linked Wikipedia article is enriched with its revision history and page view data within a ±10-day window of its posting, as well as article ID, redirects, and Wikidata identifiers. Supplementary anonymous metadata from Reddit posts and comments further contextualizes the links, offering a robust resource for analysing cross-platform information flows, collective attention dynamics, and the role of Wikipedia in online discourse.
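
The ±10-day window mentioned above can be illustrated with a short Python sketch. This is only an illustration: the function names, the sample timestamps, and the choice of inclusive endpoints are assumptions, since the datasheet does not say how the window boundaries are treated.

```python
from datetime import datetime, timedelta

# Half-width of the window described in the datasheet (10 days either side).
WINDOW = timedelta(days=10)


def window_bounds(posted_at: datetime) -> tuple[datetime, datetime]:
    """Return the (start, end) of the window centred on the posting time."""
    return posted_at - WINDOW, posted_at + WINDOW


def in_window(event_at: datetime, posted_at: datetime) -> bool:
    """True if an event (e.g. a revision) falls inside the window.

    Inclusive endpoints are an assumption, not something the source states.
    """
    start, end = window_bounds(posted_at)
    return start <= event_at <= end


# Made-up example: one Reddit post and three Wikipedia revision times.
posted = datetime(2021, 3, 15, 12, 0)
revisions = [datetime(2021, 3, 1), datetime(2021, 3, 20), datetime(2021, 4, 1)]
kept = [r for r in revisions if in_window(r, posted)]
```

Here only the revision dated 20 March falls inside the window around the 15 March post; the other two lie more than ten days away.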

Collection Process

Data was collected from the Reddit4Researchers and Wikipedia APIs. No personally identifiable information is published in the dataset. Data from Reddit is linked to Wikipedia via the hyperlinks and article titles appearing in Reddit posts.

Preprocessing/cleaning/labeling

Extensive processing with tools such as regex was applied to the Reddit post/comment text to extract the Wikipedia URLs. Redirects for Wikipedia URLs and article titles were found through the API and mapped to the collected data. Reddit IDs are hashed with SHA-256 for post/comment/user/subreddit anonymity.
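
A minimal Python sketch of the two steps described above (regex extraction of Wikipedia URLs and SHA-256 hashing of IDs) might look as follows. The pattern, the function names, and the sample text and ID are all hypothetical: the datasheet does not publish the authors' regular expressions, nor whether a salt or a particular encoding was used when hashing.

```python
import hashlib
import re

# Hypothetical pattern: http(s) links to any Wikipedia language subdomain,
# optionally on the mobile host ("xx.m.wikipedia.org"). Not the authors' regex.
WIKI_URL = re.compile(
    r"https?://([a-z\-]+)(?:\.m)?\.wikipedia\.org/wiki/([^\s)\]>]+)",
    re.IGNORECASE,
)


def extract_wikipedia_links(text: str) -> list[tuple[str, str]]:
    """Return (language_subdomain, article_title) pairs found in the text."""
    return [(lang.lower(), title) for lang, title in WIKI_URL.findall(text)]


def hash_id(raw_id: str) -> str:
    """SHA-256 hex digest of a UTF-8 encoded ID (unsalted; an assumption)."""
    return hashlib.sha256(raw_id.encode("utf-8")).hexdigest()


# Made-up comment text and a made-up Reddit-style ID.
comment = "See https://en.wikipedia.org/wiki/Reddit and (https://de.m.wikipedia.org/wiki/Berlin)."
links = extract_wikipedia_links(comment)
hashed = hash_id("t3_abc123")
```

Because SHA-256 is deterministic, the same raw ID always maps to the same digest, which is what lets hashed IDs still serve as join keys across tables.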

Uses

We foresee several applications of this dataset and preview four here. First, Reddit linking data can be used to understand how attention is driven from one platform to another. Second, Reddit linking data can shed light on how Wikipedia's archive of knowledge is used in the larger social web. Third, our dataset could provide insights into how external attention is topically distributed across Wikipedia. Our dataset can help extend that analysis into the disparities in what types of external communities Wikipedia is used in, and how it is used. Fourth, relatedly, a topic analysis of our dataset could reveal how Wikipedia usage on Reddit contributes to societal benefits and harms. Our dataset could help examine if homogeneity within the Reddit and Wikipedia audiences shapes topic patterns and assess whether these relationships mitigate or amplify problematic engagement online.

Distribution

The dataset is publicly shared with a Creative Commons Attribution 4.0 International license. The article describing this dataset should be cited: https://doi.org/10.48550/arXiv.2502.04942

Maintenance

Patrick Gildersleve will maintain this dataset, and add further years of content as and when available.

SQL Database Schema

Table: `posts`

Column Name	Type	Description
`subreddit_id`	TEXT	The unique identifier for the subreddit.
`crosspost_parent_id`	TEXT	The ID of the original Reddit post if this post is a crosspost.
`post_id`	TEXT	Unique identifier for the Reddit post.
`created_at`	TIMESTAMP	The timestamp when the post was created.
`updated_at`	TIMESTAMP	The timestamp when the post was last updated.
`language_code`	TEXT	The language code of the post.
`score`	INTEGER	The score (upvotes minus downvotes) of the post.
`upvote_ratio`	REAL	The ratio of upvotes to total votes.
`gildings`	INTEGER	Number of awards (gildings) received by the post.
`num_comments`	INTEGER	Number of comments on the post.
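
As a rough illustration of how the `posts` table could be queried, the sketch below recreates the listed columns in an in-memory SQLite database and aggregates by language. SQLite is an assumption (the datasheet says only "SQL database"), and every row is made up for the example.

```python
import sqlite3

# In-memory database; the real dataset's engine and file layout are not stated.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE posts (
        subreddit_id        TEXT,
        crosspost_parent_id TEXT,
        post_id             TEXT,
        created_at          TIMESTAMP,
        updated_at          TIMESTAMP,
        language_code       TEXT,
        score               INTEGER,
        upvote_ratio        REAL,
        gildings            INTEGER,
        num_comments        INTEGER
    )
    """
)

# Fabricated sample rows, for illustration only.
rows = [
    ("s1", None, "p1", "2021-03-01 10:00:00", "2021-03-01 10:00:00", "en", 120, 0.95, 1, 40),
    ("s1", "p1", "p2", "2021-03-02 09:30:00", "2021-03-02 09:30:00", "en", 30, 0.80, 0, 5),
    ("s2", None, "p3", "2022-07-15 18:45:00", "2022-07-16 08:00:00", "de", 15, 0.70, 0, 2),
]
conn.executemany("INSERT INTO posts VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Number of posts and mean score per language code.
per_language = conn.execute(
    """
    SELECT language_code, COUNT(*) AS n_posts, AVG(score) AS mean_score
    FROM posts
    GROUP BY language_code
    ORDER BY language_code
    """
).fetchall()
```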

Table: `comments`

Column Name	Type	Description
`subreddit_id`	TEXT	The unique identifier for the subreddit.
`post_id`	TEXT	The ID of the Reddit post the comment belongs to.
`parent_id`	TEXT	The ID of the parent comment (if a reply).
`comment_id`	TEXT	Unique identifier for the comment.
`created_at`	TIMESTAMP	The timestamp when the comment was created.
`last_modified_at`	TIMESTAMP	The timestamp when the comment was last modified.
`score`	INTEGER	The score (upvotes minus downvotes) of the comment.
`upvote_ratio`	REAL	The ratio of upvotes to total votes for the comment.
`gilded`	INTEGER	Number of awards (gildings) received by the comment.
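
The `parent_id` column makes it possible to separate top-level comments from replies. The sketch below shows one way to count both per post, again using an in-memory SQLite database with made-up rows. It assumes that a top-level comment has a NULL `parent_id`; the schema only says the column holds the parent comment's ID "if a reply", so the actual encoding may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE comments (
        subreddit_id     TEXT,
        post_id          TEXT,
        parent_id        TEXT,
        comment_id       TEXT,
        created_at       TIMESTAMP,
        last_modified_at TIMESTAMP,
        score            INTEGER,
        upvote_ratio     REAL,
        gilded           INTEGER
    )
    """
)

# Fabricated rows: c2 and c3 reply to c1; c1 and c4 are top-level (NULL parent, assumed).
rows = [
    ("s1", "p1", None, "c1", "2021-03-01 10:05:00", "2021-03-01 10:05:00", 10, 0.9, 0),
    ("s1", "p1", "c1", "c2", "2021-03-01 10:10:00", "2021-03-01 10:10:00", 4, 0.8, 0),
    ("s1", "p1", "c1", "c3", "2021-03-01 10:20:00", "2021-03-01 10:25:00", 2, 0.7, 0),
    ("s1", "p2", None, "c4", "2021-03-02 09:40:00", "2021-03-02 09:40:00", 1, 1.0, 0),
]
conn.executemany("INSERT INTO comments VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)

# In SQLite, "x IS NULL" evaluates to 0 or 1, so SUM() counts matching rows.
thread_shape = conn.execute(
    """
    SELECT post_id,
           SUM(parent_id IS NULL)     AS top_level,
           SUM(parent_id IS NOT NULL) AS replies
    FROM comments
    GROUP BY post_id
    ORDER BY post_id
    """
).fetchall()
```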

Table: `postlinks`

Column Name	Type	Description
`post_id`	TEXT	Unique identifier for the Reddit post.
`end_processed_valid`	INTEGER	Whether the extracted URL from the post resolves to a valid URL.
`end_processed_url`	TEXT	The extracted URL from the Reddit post.
`final_valid`	INTEGER	Whether the final URL from the post resolves to a valid URL after redirections.
`final_status`	INTEGER	HTTP status code of the final URL.
`final_url`	TEXT	The final URL after redirections.
`redirected`	INTEGER	Indicator of whether the posted URL was redirected (1) or not (0).
`in_title`	INTEGER	Indicator of whether the link appears in the post title (1) or post body (0).
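
The flag columns in `postlinks` lend themselves to simple quality checks. The following sketch, assuming SQLite and using fabricated rows and URLs, counts links by where they appear (title or body) together with how many resolved to a valid final URL and how many were redirected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE postlinks (
        post_id             TEXT,
        end_processed_valid INTEGER,
        end_processed_url   TEXT,
        final_valid         INTEGER,
        final_status        INTEGER,
        final_url           TEXT,
        redirected          INTEGER,
        in_title            INTEGER
    )
    """
)

# Fabricated rows: one title link, one redirected body link, one dead body link.
rows = [
    ("p1", 1, "https://en.wikipedia.org/wiki/Reddit", 1, 200,
     "https://en.wikipedia.org/wiki/Reddit", 0, 1),
    ("p2", 1, "https://en.wikipedia.org/wiki/WWW", 1, 200,
     "https://en.wikipedia.org/wiki/World_Wide_Web", 1, 0),
    ("p3", 1, "https://en.wikipedia.org/wiki/No_such_page_xyz", 0, 404,
     "https://en.wikipedia.org/wiki/No_such_page_xyz", 0, 0),
]
conn.executemany("INSERT INTO postlinks VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Per location (0 = body, 1 = title): total links, valid final URLs, redirects.
link_summary = conn.execute(
    """
    SELECT in_title,
           COUNT(*)         AS n_links,
           SUM(final_valid) AS n_valid,
           SUM(redirected)  AS n_redirected
    FROM postlinks
    GROUP BY in_title
    ORDER BY in_title
    """
).fetchall()
```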

Table: `commentlinks`

Column Name	Type	Description
`comment_id`	TEXT	Unique identifier for the Reddit comment.
`end_processed_valid`	INTEGER	Whether the extracted URL from the comment resolves to a valid URL.
`end_processed_url`	TEXT	The extracted URL from the comment.
`final_valid`	INTEGER	Whether the final URL from the comment resolves to a valid URL after redirections.
`final_status`	INTEGER	HTTP status code of the final
