The number of Facebook users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 12.6 million users (+5.04 percent). After the ninth consecutive year of growth, the Facebook user base is estimated to reach 262.8 million users, a new peak, in 2028. Notably, the number of Facebook users has increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
This statistical dataset contains estimates of the number of active online Facebook users living outside their country of origin within the European Union. The dataset includes information on Facebook users' age, gender, country of residence, and country of previous residence. The data are divided into the number of Monthly Active Users and Daily Active Users. The data were collected in standard CSV format via an advertising API platform using R code run in RStudio, and data collection was conducted twice a month from January to November 2021. The dataset was originally published in DiVA and moved to SND in 2024.
The number of Reddit users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 10.3 million users (+5.21 percent). After the ninth consecutive year of growth, the Reddit user base is estimated to reach 208.12 million users, a new peak, in 2028. Notably, the number of Reddit users has increased continuously over the past years. User figures, shown here for the platform Reddit, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. Reddit users encompass both users that are logged in and those that are not. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
https://academictorrents.com/nolicensespecified
Flixster is a social movie site allowing users to share movie ratings, discover new movies and meet others with similar movie taste.
Number of Nodes: 2523386
Number of Edges: 9197338
Missing Values? no
Source: N/A
Data Set Information: 2 files are included:
1. nodes.csv: the file of all the users. This file works as a dictionary of all the users in this data set and is useful for fast reference. It contains all the node ids used in the dataset.
2. edges.csv: the friendship network among the users. Friendships are represented as edges. For example, the line 1,2 means that the user with id "1" is a friend of the user with id "2".
Attribute Information: This dataset contains the friendship network crawled in December 2010 by Javier Parra (Javier.Parra@asu.edu). For easier understanding, all the contents are organized in CSV file form.
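The edge-list format described for edges.csv could be read into an adjacency map roughly as follows. This is a minimal sketch, assuming each line holds two comma-separated node ids with no header row and that friendships are undirected; the helper name `load_friendships` and the in-memory sample are hypothetical and not part of the dataset.

```python
import csv
import io
from collections import defaultdict


def load_friendships(lines):
    """Build an undirected adjacency map from 'id1,id2' edge lines."""
    friends = defaultdict(set)
    for row in csv.reader(lines):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        a, b = row[0].strip(), row[1].strip()
        friends[a].add(b)
        friends[b].add(a)  # assumption: friendship is symmetric
    return friends


# Tiny in-memory sample standing in for edges.csv (hypothetical data).
sample = io.StringIO("1,2\n1,3\n2,3\n")
graph = load_friendships(sample)
print(sorted(graph["1"]))
```

For the real file, an open file handle (for example `open("edges.csv", newline="")`) can be passed in place of the sample; with roughly 9.2 million edges, the adjacency map has to fit in memory.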
The number of Twitter users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 4.3 million users (+5.32 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 85.08 million users, a new peak, in 2028. Notably, the number of Twitter users has increased continuously over the past years. User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Purpose
For the purpose of informing tobacco intervention programs, this dataset was created and used to explore how online social networks of smokers differed from those of nonsmokers. The study was a secondary analysis of data collected as part of a randomized control trial conducted within Facebook. (See "Other References" in "Metadata" for parent study information.)

Basic description of 4 anonymized data files of study participants
fbr_friends: Anonymized Facebook friends networks, basic ego demographics, basic ego social media activity
fbr_family: Anonymized Facebook family networks, basic ego demographics, basic ego social media activity
fbr_photos: Anonymized Facebook photo networks, basic ego demographics, basic ego social media activity
fbr_groups: Anonymized Facebook group networks, basic ego demographics, basic ego social media activity

Each network comprises the ego, the ego's first degree connections, and the (second degree) connections between the ego's friends. Missing data and users who did not have friend, family, photo, or group networks were cleaned from the data beforehand.

Each data file contains the following columns of data, taken with participant knowledge and consent:
participant_id: Nonidentifying ids assigned to different study participants.
is_smoker: Binary value (0,1) that takes on the value 1 if participant was a smoker and 0 otherwise.
gender: One of three categories: male, female, or blank, which signified Other (different from missing data).
country: One of four categories: Canada (ca), US (us), Mexico (mx), or Other (xx).
likes_count: Numeric data indicating number of Facebook likes the participant had made up to the date the data was collected.
wall_count: Numeric data indicating number of Facebook wall posts the participant had made up to the date the data was collected.
t_count_page_views: Numeric data indicating number of pages participant had visited in the UbiQUITous app up to the date the data was collected.
yearsOld: Numeric data indicating age in years of the participant; right censored at 90 years for data anonymity.
vertices: Number of people in the participant's network.
edges: Number of connections between people in the network.
density: The portion of potential connections in a network that are actual connections; a network-level metric; calculated after removing ego and isolates.
mean_betweenness_centrality: An average of the relative importance of all individuals within their own network; a network-level metric; calculated after removing ego and isolates.
transitivity: The extent to which the relationship between two nodes in a network that are connected by an edge is transitive (calculated as the number of triads divided by all possible connections); a network-level metric; calculated after removing ego and isolates.
mean_closeness: Average of how closely associated members are to one another; a network-level metric; calculated after removing ego and isolates.
isolates2: Number of individuals with no connections other than to the ego; a network-level metric.
diameter3: Maximum degree of separation between any two individuals in the network; a network-level metric; calculated after removing ego and isolates.
clusters3: Number of subnetworks; a network-level metric; calculated after removing ego and isolates.
communities3: Number of groups, sorted to increase dense connections within the group and decrease sparse connections outside it (i.e., to maximize modularity); a network-level metric; calculated after removing ego and isolates.
modularity3: The strength of division of a network into communities (calculated as the fraction of ties between community members in excess of the expected number of ties within communities if ties were random); a network-level metric.
Detailed information on network metrics in the associated manuscript: "An exploration of the Facebook social networks of smokers and non-smokers" by Fu, L, Jacobs MA, Brookover J, Valente TW, Cobb NK, and Graham AL.
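To make the "calculated after removing ego and isolates" convention concrete, the density and isolate definitions can be sketched as below. This is an illustration under stated assumptions, not the study's own code: the function names and the toy ego network are hypothetical, ties are treated as undirected, and the software the authors actually used is not stated here.

```python
def split_ego_network(ego, edges):
    """Separate alter-alter ties from ego ties and find the isolates.

    Isolates are alters whose only connection is to the ego.
    """
    alters = {n for e in edges for n in e if n != ego}
    alter_edges = {frozenset(e) for e in edges if ego not in e and e[0] != e[1]}
    connected = {n for e in alter_edges for n in e}
    isolates = alters - connected
    return connected, alter_edges, isolates


def density(nodes, edges):
    """Share of possible undirected ties that are actually present."""
    n = len(nodes)
    possible = n * (n - 1) // 2
    return len(edges) / possible if possible else 0.0


# Hypothetical ego network: ego "e" with alters a, b, c, d.
toy_edges = [("e", "a"), ("e", "b"), ("e", "c"), ("e", "d"), ("a", "b"), ("b", "c")]
nodes, alter_edges, isolates = split_ego_network("e", toy_edges)
print(density(nodes, alter_edges), sorted(isolates))
```

In the toy network, alter d is tied only to the ego, so it counts as an isolate and is left out of the density calculation; the remaining three alters share two of three possible ties.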
README file
Data Set Title: “PERCEIVE - ‘ENGAGING THE PEOPLE’: IS SOCIAL MEDIA COVERAGE OF EU POLICY ASSOCIATED WITH PUBLIC SUPPORT FOR EUROPEAN INTEGRATION?”
Data Set Authors:
Vitaliano Barberio (Wirtschaftsuniversität Wien), ORCID http://orcid.org/0000-0002-2615-5006;
Luca Pareschi (Università di Roma Tor Vergata), ORCID http://orcid.org/0000-0002-4402-9329;
Data Set Contributors:
Ines Kuric (Wirtschaftsuniversität Wien);
Edoardo Mollona (Università di Bologna), ORCID http://orcid.org/0000-0001-9496-8618;
Markus Höllerer (Wirtschaftsuniversität Wien), ORCID http://orcid.org/0000-0003-2509-2696.
Data Set Contact Person:
Luca Pareschi (Università di Roma Tor Vergata), ORCID http://orcid.org/0000-0002-4402-9329;
luca.pareschi@uniroma2.it .
Data Set License: this data set is distributed under a Creative Commons Attribution (CC BY) 4.0 International license
Publication Year: 2021
Project Info: PERCEIVE (Perception and Evaluation of Regional and Cohesion Policies by Europeans and Identification with the Values of Europe), funded by European Union, Horizon 2020 Programme. Grant Agreement num. 693529; https://www.perceiveproject.eu/.
Data set Contents
The data set consists of:
1 README file
6 textual qualitative files saved in .txt format
“stoplist_file_[nation].txt”
12 textual quantitative files saved in .txt format
“[source]-keys.txt”: 6 files
“[source]-composition.txt”: 6 files
2 Excel quantitative files saved in .xlsx format
“SentimentFB.xlsx”
“topics_prevalence_and_clustering.xlsx”
Data set Documentation
Abstract
This data set contains the underlying data of the paper “‘ENGAGING THE PEOPLE’: IS SOCIAL MEDIA COVERAGE OF EU POLICY ASSOCIATED WITH PUBLIC SUPPORT FOR EUROPEAN INTEGRATION?”.
The data openly available within this dataset are a subset of the following two data sets, which contain all the relevant data of Work Package 3 and Work Package 5 of the PERCEIVE project:
Data set: “PERCEIVE: WP3: Effectiveness of communication strategies of EU projects” https://doi.org/10.5281/zenodo.3371133
Data set: “PERCEIVE: WP5: The multiplicity of shared meanings of EU and Cohesion Regional and Urban Policy at different discursive levels” https://doi.org/10.5281/zenodo.3371174
For the paper we collected Facebook posts that refer to EU CP policies. We do not have permission to share these data (as they are protected by copyright), but all the sources are described in Deliverable 5.2, which is public (see http://doi.org/10.6092/unibo/amsacta/5726 or http://doi.org/10.5281/zenodo.1318184). We analyzed the textual content of the data to construct a database of discursive topics in Task 5.4. The data set includes the results of topic modeling and of a sentiment analysis performed on the Facebook homepages of the Local Managing Authorities (LMA) of the PERCEIVE case study regions.
Content of the files:
1 sub-folder, named “A_Stopword”, which contains all the stopword lists used for performing Topic Modeling. These are 6 .txt files, one for each language: Austrian, Italian, Polish, Romanian, Spanish, Swedish (“stoplist_file_[nation].txt”).
1 sub-folder which contains the Topic Modeling results for the Facebook profiles of the Local Managing Authorities for Austria, Italy, Poland, Romania, Spain, and Sweden (sub-folder “B_Facebook”, 12 .txt files). For each case, a file “[source]-keys.txt” lists the 100 most important words for each topic, while a file “[source]-composition.txt” details the topic composition of each textual source. These files were obtained through Mallet software[1].
File “SentimentFB.xlsx” contains data regarding the sentiment analysis for contents on the Facebook homepages of Local Managing Authorities. The first column indicates the country, as well as row labels (see below). Columns 2-21 indicate the number id of the topics for each topic model (national level). The three rightmost columns of the file represent respectively a) the name of the lexicon used to detect sentiment orientation (i.e. “VADER”); b) the average sentiment score for positive, neutral and average words for each lexicon and each country; and c) the sentiment score across all topics in a country.
File “topics_prevalence_and_clustering.xlsx” contains data regarding the three clusters of topics analyzed in the paper. The first column represents the ID of each topic; the second column reports the cluster of each topic; the third and the fourth columns report the average prevalence of each topic (rows) in posts and comments, respectively. As these data refer to a regional case study, these columns refer to the first region for each country; the sixth and the seventh columns report the average prevalence of each topic (rows) in posts and comments for the second region analyzed (only for those countries where we analyzed two regions); the eighth and ninth columns report the average prevalence of topics and comments, respectively, for each country; and finally the tenth column reports the country to which the data in the previous two columns refer.
[1] McCallum, Andrew Kachites. "MALLET: A Machine Learning for Language Toolkit." http://mallet.cs.umass.edu. 2002.
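A “[source]-keys.txt” file could be read along the following lines. This is a sketch only: it assumes the usual MALLET topic-keys layout of one topic per line with three tab-separated fields (topic id, a numeric topic weight, and the space-separated top words), which should be checked against the actual files. The function name `parse_topic_keys` and the sample lines are hypothetical.

```python
def parse_topic_keys(lines):
    """Parse topic-keys lines into {topic_id: {"weight": ..., "words": [...]}}.

    Assumed layout per line: <topic id> TAB <weight> TAB <word word word ...>
    """
    topics = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore empty lines
        topic_id, weight, words = line.split("\t", 2)
        topics[int(topic_id)] = {"weight": float(weight), "words": words.split()}
    return topics


# Hypothetical sample standing in for one "[source]-keys.txt" file.
sample = [
    "0\t0.25\tfunds region project\n",
    "1\t0.25\tprogramme investment growth\n",
]
topics = parse_topic_keys(sample)
print(topics[1]["words"][:2])
```

The resulting dictionary is keyed by topic id, which makes it straightforward to line up with the topic ids used in the two .xlsx files.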
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The report provides a snapshot of the social media usage trends amongst online Canadian adults based on an online survey of 1500 participants. Canada continues to be one of the most connected countries in the world. An overwhelming majority of online Canadian adults (94%) have an account on at least one social media platform. However, the 2022 survey results show that the COVID-19 pandemic has ushered in some changes in how and where Canadians are spending their time on social media. Dominant platforms such as Facebook, messaging apps and YouTube are still on top but are losing ground to newer platforms such as TikTok and more niche platforms such as Reddit and Twitch.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The peer-reviewed publication for this dataset was presented at the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) and can be accessed here: https://arxiv.org/abs/2205.02596. Please cite it when using the dataset.
This dataset contains a heterogeneous set of True and False COVID claims and online sources of information for each claim.
The claims have been obtained from online fact-checking sources, existing datasets and research challenges. It combines different data sources with different foci, thus enabling a comprehensive approach that combines different media (Twitter, Facebook, general websites, academia), information domains (health, scholar, media), information types (news, claims) and applications (information retrieval, veracity evaluation).
The processing of the claims included an extensive de-duplication process eliminating repeated or very similar claims. The dataset is presented in a LARGE and a SMALL version, accounting for different degrees of similarity between the remaining claims (excluding respectively claims with a 90% and 99% probability of being similar, as obtained through the MonoT5 model). The similarity of claims was analysed using BM25 (Robertson et al., 1995; Crestani et al., 1998; Robertson and Zaragoza, 2009) with MonoT5 re-ranking (Nogueira et al., 2020), and BERTScore (Zhang et al., 2019).
The processing of the content also involved removing claims making only a direct reference to existing content in other media (audio, video, photos); automatically obtained content not representing claims; and entries with claims or fact-checking sources in languages other than English.
The claims were analysed to identify types of claims that may be of particular interest, either for inclusion or exclusion depending on the type of analysis. The following types were identified: (1) Multimodal; (2) Social media references; (3) Claims including questions; (4) Claims including numerical content; (5) Named entities, including: PERSON − People, including fictional; ORGANIZATION − Companies, agencies, institutions, etc.; GPE − Countries, cities, states; FACILITY − Buildings, highways, etc. These entities have been detected using a RoBERTa base English model (Liu et al., 2019) trained on the OntoNotes Release 5.0 dataset (Weischedel et al., 2013) using Spacy.
The original labels for the claims have been reviewed and homogenised from the different criteria used by each original fact-checker into the final True and False labels.
The data sources used are:
- The CoronaVirusFacts/DatosCoronaVirus Alliance Database. https://www.poynter.org/ifcn-covid-19-misinformation/
- CoAID dataset (Cui and Lee, 2020) https://github.com/cuilimeng/CoAID
- MM-COVID (Li et al., 2020) https://github.com/bigheiniu/MM-COVID
- CovidLies (Hossain et al., 2020) https://github.com/ucinlp/covid19-data
- TREC Health Misinformation track https://trec-health-misinfo.github.io/
- TREC COVID challenge (Voorhees et al., 2021; Roberts et al., 2020) https://ir.nist.gov/covidSubmit/data.html
The LARGE dataset contains 5,143 claims (1,810 False and 3,333 True), and the SMALL version 1,709 claims (477 False and 1,232 True).
The entries in the dataset contain the following information:
- Claim. Text of the claim.
- Claim label. The labels are: False, and True.
- Claim source. The sources include mostly fact-checking websites, health information websites, health clinics, public institutions sites, and peer-reviewed scientific journals.
- Original information source. Information about which general information source was used to obtain the claim.
- Claim type. The different types, previously explained, are: Multimodal, Social Media, Questions, Numerical, and Named Entities.
Funding. This work was supported by the UK Engineering and Physical Sciences Research Council (grant no. EP/V048597/1, EP/T017112/1). ML and YH are supported by Turing AI Fellowships funded by the UK Research and Innovation (grant no. EP/V030302/1, EP/V020579/1).
References
- Arana-Catania M., Kochkina E., Zubiaga A., Liakata M., Procter R., He Y. Natural Language Inference with Self-Attention for Veracity Assessment of Pandemic Claims. NAACL 2022. https://arxiv.org/abs/2205.02596
- Stephen E Robertson, Steve Walker, Susan Jones, Micheline M Hancock-Beaulieu, Mike Gatford, et al. 1995. Okapi at trec-3. Nist Special Publication Sp,109:109.
- Fabio Crestani, Mounia Lalmas, Cornelis J Van Rijsbergen, and Iain Campbell. 1998. “is this document relevant?. . . probably” a survey of probabilistic models in information retrieval. ACM Computing Surveys (CSUR), 30(4):528–552.
- Stephen Robertson and Hugo Zaragoza. 2009. The probabilistic relevance framework: BM25 and beyond. Now Publishers Inc.
- Rodrigo Nogueira, Zhiying Jiang, Ronak Pradeep, and Jimmy Lin. 2020. Document ranking with a pre-trained sequence-to-sequence model. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, pages 708–718.
- Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q Weinberger, and Yoav Artzi. 2019. Bertscore: Evaluating text generation with bert. In International Conference on Learning Representations.
- Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692.
- Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, et al. 2013. Ontonotes release 5.0 ldc2013t19. Linguistic Data Consortium, Philadelphia, PA, 23.
- Limeng Cui and Dongwon Lee. 2020. Coaid: Covid-19 healthcare misinformation dataset. arXiv preprint arXiv:2006.00885.
- Yichuan Li, Bohan Jiang, Kai Shu, and Huan Liu. 2020. Mm-covid: A multilingual and multimodal data repository for combating covid-19 disinformation.
- Tamanna Hossain, Robert L. Logan IV, Arjuna Ugarte, Yoshitomo Matsubara, Sean Young, and Sameer Singh. 2020. COVIDLies: Detecting COVID-19 misinformation on social media. In Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, Online. Association for Computational Linguistics.
- Ellen Voorhees, Tasmeer Alam, Steven Bedrick, Dina Demner-Fushman, William R Hersh, Kyle Lo, Kirk Roberts, Ian Soboroff, and Lucy Lu Wang. 2021. Trec-covid: constructing a pandemic information retrieval test collection. In ACM SIGIR Forum, volume 54, pages 1–12. ACM New York, NY, USA.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This feature layer is part of SDGs Today. Please see sdgstoday.org. To ensure that all women and girls can benefit from the digital revolution, tracking progress on gender inequalities in relation to internet and mobile access and use is more important than ever. Unfortunately, the data are significantly lacking in geographic coverage, comparability, and timeliness. The University of Oxford and Qatar Computing Research Institute (QCRI), with support from Data2X, are collaborating to measure digital gender gaps in real time. The Digital Gender Gaps project uses Facebook marketing data to generate a country-level dataset combining ‘online’ indicators of Facebook users by gender, age, and device type. These online indicators are used to predict internet and mobile use gender gaps by validating them against data on gender gaps in internet and mobile access from nationally representative surveys where available. The data show the internet gender gap (ratio of female-to-male internet use) and mobile gender gap (ratio of female-to-male mobile use) estimated using the Facebook Gender Gap Index (female-to-male ratio of Facebook users). To learn more about the project visit www.digitalgendergaps.org. Contact Ridhi Kashyap (ridhi.kashyap@nuffield.ox.ac.uk) or Ingmar Weber (iweber@hbku.edu.qa) for any questions about the data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
More than 200 million businesses use Facebook globally. The goal of Meta’s quarterly Small Business Surveys is to learn about the unique perspectives, challenges and opportunities of small and medium-sized businesses (SMBs).
The Future of Business (FoB) Survey is conducted biannually in partnership with the World Bank and the Organisation for Economic Cooperation and Development (OECD) across nearly 100 countries. The target population consists of SMEs that have an active Facebook Business Page and include both newer and longer-standing businesses, spanning across a variety of sectors. Meta also conducts the Global State of Small Business (GSoSB) Survey bi-annually in partnership with various academic partners across approximately 30 countries. Similarly to the FoB Survey, the target population is active Facebook Page Administrators, but also includes the general population of Facebook users.
Survey questions for all surveys cover a range of topics depending on the survey wave such as business characteristics, challenges, financials and strategy in addition to custom modules related to regulation, gender inequity, access to finance, digital technologies, reduction in revenues, business closures, international trade, inflation, reduction of employees and challenges/needs of the business.
Aggregated country level data for each survey wave is available to the public on HDX and controlled access microdata is available to Data for Good at Meta partners. Please visit https://dataforgood.facebook.com/dfg/tools/future-of-business-survey to apply for access to microdata or contact dataforgood@fb.com for any questions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘COVID-19: Holidays of countries’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/vbmokin/covid19-holidays-of-countries on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This research is devoted to the analysis of the impact of holidays on the statistics of confirmed coronavirus cases. Prophet uses the holidays library, which provides the holidays of countries and their regions. As of 30 June 2020, only 62 countries (some with regions) are available in the holidays library:
['AR', 'AT', 'AU', 'BD', 'BE', 'BG', 'BR', 'BY', 'CA', 'CH', 'CL', 'CN', 'CO', 'CZ', 'DE', 'DK', 'DO', 'EE', 'EG', 'ES', 'FI', 'FR', 'GB', 'GR', 'HN', 'HR', 'HU', 'ID', 'IE', 'IL', 'IN', 'IS', 'IT', 'JP', 'KE', 'KR', 'LT', 'LU', 'MX', 'MY', 'NG', 'NI', 'NL', 'NO', 'NZ', 'PE', 'PH', 'PK', 'PL', 'PT', 'PY', 'RS', 'RU', 'SE', 'SG', 'SI', 'SK', 'TH', 'TR', 'UA', 'US', 'ZA'] or ['Argentina', 'Australia', 'Austria', 'Bangladesh', 'Belarus', 'Belgium', 'Brazil', 'Bulgaria', 'Canada', 'Chile', 'China', 'Colombia', 'Croatia', 'Czechia', 'Denmark', 'Dominican Republic', 'Egypt', 'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Honduras', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Ireland', 'Israel', 'Italy', 'Japan', 'Kenya', 'Korea, Republic of', 'Lithuania', 'Luxembourg', 'Malaysia', 'Mexico', 'Netherlands', 'New Zealand', 'Nicaragua', 'Nigeria', 'Norway', 'Pakistan', 'Paraguay', 'Peru', 'Philippines', 'Poland', 'Portugal', 'Russian Federation', 'Serbia', 'Singapore', 'Slovakia', 'Slovenia', 'South Africa', 'Spain', 'Sweden', 'Switzerland', 'Thailand', 'Turkey', 'Ukraine', 'United Kingdom', 'United States']
I will note at once that the list of available countries in the description of the holidays library contains a lot of mistakes, which I have reported to the authors.
When I asked if this list would expand, the Prophet team made it clear that they were waiting for help from the community to expand the holidays library.
As of Jan 2021 (version 8.4.1), 67 countries (some with regions) are available in the holidays library: a number of data have been refined and countries ['BI', 'LV', 'MA', 'RO', 'VN' - two-letter country codes or alpha_2 of the country (ISO 3166)] added.
Unfortunately, the format of the holidays library is not very suitable for coronavirus problems, as it has a number of disadvantages. First, the names of the countries are given in one word, which makes it difficult to match many of them to their common names (ISO 3166). It is best that the dataset contains the common name and two-letter abbreviation in English according to ISO 3166 (see pycountry). Second, the dates are not adapted to the potential impact of the holidays on coronavirus statistics. It is known that after the moment of infection, the active manifestation of symptoms occurs with a delay of 4-10 days, that is, a person is likely to get into the statistics on the number of diseases only after 4-7 days. Therefore, it is advisable to use the following windows of impact:
```
Lower_window = [4, 7]
Upper_window = [7, 10]
```
However, Prophet requires
`Lower_window <= 0`
My [request](https://github.com/facebook/prophet/issues/1588#issue-661098613) to allow positive numbers in this parameter [was refused](https://github.com/facebook/prophet/issues/1588#issuecomment-661984730) by the Prophet team, who [advised](https://github.com/facebook/prophet/issues/1588#issuecomment-661984730) simply moving the dates themselves.
Therefore, it is advisable to shift the holiday dates by 7 days. If the researcher thinks that 7 days is too much and 4 days is enough, they can simply set the lower bound of the window to -3. By default, it makes sense to specify the parameters:
`Lower_window = -3`, `Upper_window = 3`
If necessary, these settings are easy to change.
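The shift-and-window approach described above can be sketched as follows. This is a minimal illustration, not the dataset author's code: the helper name `shifted_holidays`, the sample holiday, and the `prior_scale` value of 10.0 are assumptions. The row keys follow the column names Prophet accepts in a custom holidays table (`holiday`, `ds`, `lower_window`, `upper_window`, `prior_scale`).

```python
from datetime import date, timedelta

SHIFT_DAYS = 7  # holiday dates are moved 7 days forward, as suggested in the text


def shifted_holidays(holidays, lower_window=-3, upper_window=3, prior_scale=10.0):
    """Return holiday rows with dates shifted forward and symmetric windows.

    With a 7-day shift and a [-3, 3] window, the modelled effect spans
    days 4 to 10 after the original holiday date.
    """
    if lower_window > 0:
        raise ValueError("lower_window must be <= 0")
    rows = []
    for name, day in holidays:
        rows.append({
            "holiday": name,
            "ds": day + timedelta(days=SHIFT_DAYS),
            "lower_window": lower_window,
            "upper_window": upper_window,
            "prior_scale": prior_scale,  # assumed placeholder value
        })
    return rows


# Hypothetical input: one holiday as (name, original date).
sample = [("New Year's Day", date(2021, 1, 1))]
rows = shifted_holidays(sample)
print(rows[0]["ds"], rows[0]["lower_window"], rows[0]["upper_window"])
```

The list of dictionaries can be turned into a table with `pandas.DataFrame(rows)` before being handed to a Prophet model as its holidays table.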
### Content
This dataset:
1. Contains ISO codes, ISO names (common and official) (ISO 3166) of **70** countries (3 European countries **['Albania' - 'AL', 'Georgia' - 'GE', 'Moldova' - 'MD']** have been added).
2. Contains imported dates from the holidays library for 2020-01-20 to 2021-12-31 (all countries from the holidays library as of Jan 2021), and the same dates moved 7 days forward.
3. Contains holidays of countries that are not in the holidays library but are in the World Health Organization data and for which considerable coronavirus statistics have already been collected.
4. Parameters for Prophet model:
`lower_window, upper_window, prior_scale`
If you find errors, please write to the [Discussion](https://www.kaggle.com/vbmokin/covid19-holidays-of-countries/discussion).
It is planned to periodically update (and, if necessary, correct) this dataset.
### Acknowledgements
Thanks to the authors of the information resources
* [https://github.com/dr-prodigy/python-holidays](https://github.com/dr-prodigy/python-holidays)
* [https://en.wikipedia.org/wiki/List_of_holidays_by_country](https://en.wikipedia.org/wiki/List_of_holidays_by_country)
about the dates and names of holidays in different countries, which I used.
Thanks for the image to [iXimus](https://pixabay.com/ru/users/iXimus-2352783/?utm_source=link-attribution&utm_medium=referral&utm_campaign=image&utm_content=5062659) from [Pixabay](https://pixabay.com/ru/?utm_source=link-attribution&utm_medium=referral&utm_campaign=image&utm_content=5062659)
### Inspiration
The main task for which this dataset was created is to study the impact of holidays on the accuracy of predicting coronavirus diseases, identifying new patterns, and forming optimal solutions to counteract or minimize its spread.
Tasks that need to be solved to improve this dataset in order to increase the accuracy of modeling the impact of holidays on the number of coronavirus patients:
1) Expanding the list of countries
2) Clarification of holiday dates
3) Clarification of the parameters
`lower_window, upper_window, prior_scale`
which should be set individually for each country and each holiday.
Also, it is advisable to carry out similar work for individual regions within countries, but this will not be done in this dataset.
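One possible way to organize parameters that differ by country and holiday is a default set with per-holiday overrides, sketched below. All names and values here are assumptions for illustration: `DEFAULTS`, `OVERRIDES`, and `holiday_params` are hypothetical, the override entry is made up, and the `prior_scale` value is only a placeholder, not a value taken from the dataset.

```python
# Default parameters applied when no specific values are known.
# The window values follow the defaults suggested in this description;
# prior_scale = 10.0 is a placeholder assumption.
DEFAULTS = {"lower_window": -3, "upper_window": 3, "prior_scale": 10.0}

# Hypothetical overrides keyed by (ISO country code, holiday name).
OVERRIDES = {
    ("UA", "New Year's Day"): {"upper_window": 5},
}


def holiday_params(country_code, holiday_name):
    """Return the parameters for one holiday, falling back to DEFAULTS."""
    params = dict(DEFAULTS)  # copy so the defaults are never mutated
    params.update(OVERRIDES.get((country_code, holiday_name), {}))
    return params


print(holiday_params("UA", "New Year's Day"))
print(holiday_params("AL", "Independence Day"))
```

Keeping the overrides separate from the defaults means a refined value for one holiday can be added without touching the rest of the table.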
--- Original source retains full ownership of the source dataset ---
The global number of Facebook users was forecast to increase continuously between 2023 and 2027 by a total of 391 million users (+14.36 percent). After a fourth consecutive year of growth, the Facebook user base is estimated to reach 3.1 billion users, a new peak, in 2027. Notably, the number of Facebook users has increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
The number of Facebook users in India was forecast to increase continuously between 2024 and 2028 by a total of 59.2 million users (+8.7 percent). After a ninth consecutive year of growth, the Facebook user base is estimated to reach 739.66 million users, a new peak, in 2028. Notably, the number of Facebook users has increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Facebook users in countries like Nepal and Pakistan.
https://edg.epa.gov/EPA_Data_License.htm
Annual air trends report in the form of an interactive web application. The report features a suite of visualization tools that allow the user to:
* Learn about air pollution and how it can affect our health and environment.
* Compare key air emissions to gross domestic product, vehicle miles traveled, population, and energy consumption back to 1970.
* Take a closer look at how the number of days with unhealthy air has dropped since 2000 in 35 major US cities.
* Explore how air quality and emissions have changed through time and space for each of the common air pollutants.
* Check out air trends where you live.
Users will also be able to share this content across social media, with one-click access to Facebook, Twitter, Pinterest, and other major social media sites.
How much time do people spend on social media? As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just two hours and 16 minutes.

Global social media usage

Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.

Global impact of social media

Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics and heightened everyday distractions.
The number of Facebook users in Europe was forecast to increase continuously between 2024 and 2028 by a total of 15.5 million users (+3.91 percent). According to this forecast, in 2028, the Facebook user base will have increased for the sixth consecutive year to 412.26 million users. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Facebook users in regions like South America and North America.
The number of Facebook users in Central & Western Europe was forecast to decrease between 2024 and 2028 by a total of 29.8 million users. This overall decrease is not continuous; notably, no decrease is forecast for 2026 and 2027. The Facebook user base is estimated to amount to 192.47 million users in 2028. Notably, the number of Facebook users has increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Facebook users in countries and regions like Eastern Europe and Russia.
This statistic shows a ranking of the estimated number of Facebook users in 2020 in Africa, differentiated by country. The user numbers have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in more than 150 countries and regions worldwide. All input data are sourced from international institutions, national statistical offices, and trade associations. All data are processed to generate comparable datasets (see supplementary notes under details for more information).
The number of Facebook users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 12.6 million users (+5.04 percent). After a ninth consecutive year of growth, the Facebook user base is estimated to reach 262.8 million users, a new peak, in 2028. Notably, the number of Facebook users has increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).