48 datasets found

Facebook users worldwide 2017-2027
statista.com
de.statista.com
Cite
Stacy Jo Dixon, Facebook users worldwide 2017-2027 [Dataset]. https://www.statista.com/topics/1164/social-networks/
Dataset provided by
Statista (http://statista.com/)
Authors
Stacy Jo Dixon
Description
The global number of Facebook users was forecast to increase continuously between 2023 and 2027, by 391 million users in total (+14.36 percent). After the fourth consecutive year of growth, the Facebook user base is estimated to reach 3.1 billion users in 2027, a new peak. The number of Facebook users has also increased continuously in recent years. The user figures shown here for Facebook were estimated from company filings and press material, secondary research, app downloads, and traffic data. They refer to the average monthly active users over the period and count people with multiple accounts only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
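The figures quoted in this description can be cross-checked with a little arithmetic. The sketch below assumes the 14.36 percent is measured relative to the 2023 user base, which the description implies but does not state outright; the variable names are illustrative only.

```python
# Figures quoted in the description (users in millions).
increase_millions = 391.0
increase_percent = 14.36

# Assumption: the percentage is relative to the 2023 base,
# so increase = base * percent / 100.
base_2023 = increase_millions / (increase_percent / 100.0)
forecast_2027 = base_2023 + increase_millions

print(f"Implied 2023 base:  {base_2023 / 1000:.2f} billion")
print(f"Implied 2027 total: {forecast_2027 / 1000:.2f} billion")
```

Under that assumption the implied 2027 total rounds to the 3.1 billion users stated in the description.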

Countries with the most Facebook users 2024

statista.com
de.statista.com

Cite

Stacy Jo Dixon, Countries with the most Facebook users 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

Which country has the most Facebook users?

There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country, it would rank third worldwide by population. Apart from India, several other markets have more than 100 million Facebook users each: the United States, Indonesia, and Brazil, with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.

Facebook – the most used social media

Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media platform worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.

Facebook usage by device

As of July 2021, 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is available not only through mobile browsers: the company has also published several mobile apps through which users can access its products and services. As of the third quarter of 2021, the four core Meta products led the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.

Number of global social network users 2017-2028

statista.com
es.statista.com

Cite

Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

How many people use social media?

Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

Who uses social media?

Social networking is one of the most popular digital activities worldwide, and social networking penetration is increasing across all regions. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions in infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

How much time do people spend on social media?

Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America spent the most time per day on social media.

What are the most popular social media platforms?

Market leader Facebook was the first social network to surpass one billion registered accounts and currently has approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included the mobile messaging apps WhatsApp and Telegram Messenger, as well as the app version of Facebook.

Meta Stock Price Dataset 🔥🔍⭐
kaggle.com
Updated Jun 1, 2024
Cite
Mayank Anand (2024). Meta Stock Price Dataset 🔥🔍⭐ [Dataset]. https://www.kaggle.com/datasets/mayankanand2701/meta-stock-price-dataset/data
Dataset updated
Jun 1, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Mayank Anand
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Meta Platforms, Inc., formerly known as Facebook, was founded by Mark Zuckerberg along with his college roommates in 2004. Initially created as a social networking site for Harvard students, Facebook rapidly expanded to become a global social media giant. The company rebranded as Meta in 2021 to reflect its new focus on building the metaverse, a virtual-reality space where users can interact with a computer-generated environment and other users. Over the years, Meta has acquired several other social media platforms and technology companies, including Instagram, WhatsApp, and Oculus VR, significantly expanding its influence in the tech industry. Headquartered in Menlo Park, California, Meta continues to lead in social media innovation and virtual reality technology.

This dataset provides a comprehensive record of Meta's stock price changes over the last 12 years. It includes essential columns such as the date, opening price, highest price of the day, lowest price of the day, closing price, adjusted closing price, and trading volume.

This extensive data is invaluable for conducting historical analyses, forecasting future stock performance, and understanding long-term market trends related to Meta's stock.
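As a rough illustration of the kind of historical analysis this description mentions, the sketch below computes day-over-day percentage changes in the closing price using only the standard library. The column names (`Date`, `Close`, and so on) and the sample rows are made up for the example and are not taken from the dataset; check the actual CSV header before adapting it.

```python
import csv
import io

# Made-up sample rows for illustration only; the column names are an
# assumption and the prices are not real Meta stock data.
SAMPLE = """Date,Open,High,Low,Close,Adj Close,Volume
2024-01-02,100.0,105.0,99.0,104.0,104.0,1000
2024-01-03,104.0,106.0,101.0,102.0,102.0,1200
2024-01-04,102.0,110.0,102.0,108.0,108.0,1500
"""


def daily_returns(csv_text):
    """Return (date, percent change in close vs. the previous row) pairs."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    returns = []
    for prev, cur in zip(rows, rows[1:]):
        prev_close = float(prev["Close"])
        change = (float(cur["Close"]) - prev_close) / prev_close * 100.0
        returns.append((cur["Date"], change))
    return returns


for date, change in daily_returns(SAMPLE):
    print(f"{date}: {change:+.2f}%")
```

The function assumes rows are sorted by date in ascending order; a real file may need sorting first.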
📊 Meta Kaggle| Kaggle Users' Stats
kaggle.com
Updated Jun 26, 2025
Cite
BwandoWando (2025). 📊 Meta Kaggle| Kaggle Users' Stats [Dataset]. http://doi.org/10.34740/kaggle/dsv/10595847
Unique identifier
https://doi.org/10.34740/kaggle/dsv/10595847
Dataset updated
Jun 26, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
BwandoWando
License
https://creativecommons.org/publicdomain/zero/1.0/
Description

History

03Mar2025- When determining the last content shared, I now use the latest version of each Model, Dataset, and Notebook rather than the creation date of the very first version. I also added the reaction counts, which come from a new CSV added to the Meta Kaggle dataset. The discussion can be found here. I also added the versions created for Models, Notebooks, and Datasets to properly track users who are updating their datasets.

04Feb2025- Fixed the issue on ModelUpvotesGiven and ModelUpvotesReceived values being identical

Context

User aggregated stats and data using the Official Meta Kaggle dataset

Note

Expect some discrepancies between these counts and those shown on your profile: besides a lag of one to two days before a new dataset is published, some information, such as Kaggle staff upvotes and private competitions, is not included. For almost all members, however, the figures should reconcile.

Notebook updater

📊 (Scheduled) Meta Kaggle Users' Stats

Global social media subscriptions comparison 2023
statista.com
es.statista.com
Cite
Stacy Jo Dixon, Global social media subscriptions comparison 2023 [Dataset]. https://www.statista.com/topics/1164/social-networks/
Dataset provided by
Statista (http://statista.com/)
Authors
Stacy Jo Dixon
Description
Social media companies are starting to offer users the option to subscribe to their platforms in exchange for monthly fees. Until recently, social media has been predominantly free to use, with tech companies relying on advertising as their main revenue generator. However, advertising revenues have been dropping following the COVID-induced boom. As of July 2023, Meta Verified is the most costly of the subscription services, setting users back almost 15 U.S. dollars per month on iOS or Android. Twitter Blue costs between eight and 11 U.S. dollars per month and ensures users will receive the blue check mark, and have the ability to edit tweets and have NFT profile pictures. Snapchat+, drawing in four million users as of the second quarter of 2023, boasts a Story re-watch function, custom app icons, and a Snapchat+ badge.
Meta Kaggle Code
kaggle.com
zip
Updated Sep 18, 2025
Cite
Kaggle (2025). Meta Kaggle Code [Dataset]. https://www.kaggle.com/datasets/kaggle/meta-kaggle-code/code
Available download formats: zip (157532166589 bytes)
Dataset updated
Sep 18, 2025
Dataset authored and provided by
Kaggle (http://kaggle.com/)
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Explore our public notebook content!

Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.

Why we’re releasing this dataset

By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.

Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.

The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!

Sensitive data

While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.

Joining with Meta Kaggle

The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.

File organization

The files are organized into a two-level directory structure. Each top-level folder contains up to 1 million files; for example, folder 123 contains all versions from 123,000,000 to 123,999,999. Each subfolder contains up to 1 thousand files; for example, 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions.
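The folder layout described here can be sketched as a small helper that maps a KernelVersions id to its two-level folder. The function name `version_folder` is hypothetical, and the sketch assumes folder names are plain integers without zero-padding, which the description does not confirm.

```python
from pathlib import PurePosixPath


def version_folder(version_id: int) -> PurePosixPath:
    """Return the two-level folder expected to hold a kernel version file.

    Assumption: folder names are not zero-padded (e.g. "123/5", not "123/005").
    """
    top = version_id // 1_000_000        # millions bucket, e.g. 123
    sub = (version_id // 1_000) % 1_000  # thousands bucket within it, e.g. 456
    return PurePosixPath(str(top)) / str(sub)


print(version_folder(123_456_789))
```

Because the file names match the ids in the KernelVersions csv file, the same id can then be used to look up the matching row in Meta Kaggle.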

The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays

Questions / Comments

We love feedback! Let us know in the Discussion tab.

Happy Kaggling!
Crossroad Camera Dataset - Mobility Aid Users
repository.tugraz.at
zip
Updated May 13, 2025
Cite
Ludwig Mohr; Nadezda Kirillova; Horst Possegger; Horst Bischof (2025). Crossroad Camera Dataset - Mobility Aid Users [Dataset]. http://doi.org/10.3217/2gat1-pev27
Available download formats: zip
Unique identifier
https://doi.org/10.3217/2gat1-pev27
Dataset updated
May 13, 2025
Dataset provided by
Graz University of Technology
Authors
Ludwig Mohr; Nadezda Kirillova; Horst Possegger; Horst Bischof
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
License information was derived automatically
Time period covered
Oct 2022
Description
The most vulnerable group of traffic participants are pedestrians using mobility aids. While there has been significant progress in the robustness and reliability of camera based general pedestrian detection systems, pedestrians reliant on mobility aids are highly underrepresented in common datasets for object detection and classification.
To bridge this gap and enable research towards robust and reliable detection systems which may be employed in traffic monitoring, scheduling, and planning, we present this dataset of a pedestrian crossing scenario taken from an elevated traffic monitoring perspective together with ground truth annotations (Yolo format [1]). Classes present in the dataset are pedestrian (without mobility aids), as well as pedestrians using wheelchairs, rollators/wheeled walkers, crutches, and walking canes. The dataset comes with official training, validation, and test splits.
An in-depth description of the dataset can be found in [2]. If you make use of this dataset in your work, research or publication, please cite this work as:
@inproceedings{mohr2023mau,
author = {Mohr, Ludwig and Kirillova, Nadezda and Possegger, Horst and Bischof, Horst},
title = {{A Comprehensive Crossroad Camera Dataset of Mobility Aid Users}},
booktitle = {Proceedings of the 34th British Machine Vision Conference ({BMVC}2023)},
year = {2023}
}
Archive mobility.zip contains the full detection dataset in Yolo format with images, ground truth labels and meta data. Archive mobility_class_hierarchy.zip contains labels and meta files (Yolo format) for training with a class hierarchy using, e.g., the modified version of Yolo v5/v8 available under [3].
To use this dataset with Yolo, you will need to download and extract the zip archive and change the path entry in dataset.yaml to the directory where you extracted the archive to.
[1] https://github.com/ultralytics/ultralytics
[2] coming soon
[3] coming soon
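The path change described above can be scripted. The sketch below is a minimal, standard-library-only approach that rewrites a top-level `path:` line in the YAML text; it assumes dataset.yaml has such a line, and the helper name `set_dataset_path` and the `train`/`val` keys in the sample are illustrative rather than taken from the archive.

```python
import re


def set_dataset_path(yaml_text: str, new_path: str) -> str:
    """Replace the value of the top-level `path:` entry, leaving other lines as-is."""
    pattern = re.compile(r"^path:.*$", flags=re.MULTILINE)
    if not pattern.search(yaml_text):
        raise ValueError("no top-level 'path:' entry found")
    # A function replacement avoids backslash-escape surprises with Windows paths.
    return pattern.sub(lambda _match: f"path: {new_path}", yaml_text, count=1)


# Hypothetical file content for illustration; the real dataset.yaml may differ.
EXAMPLE = "path: /old/location\ntrain: images/train\nval: images/val\n"
print(set_dataset_path(EXAMPLE, "/data/mobility"))
```

In practice you would read dataset.yaml from the extracted archive, pass its text through the helper, and write the result back; a YAML library would be more robust if the file uses quoting or comments on the `path:` line.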
Who Tweets? Deriving the Demographic Characteristics of Age, Occupation and...
plos.figshare.com
datasetcatalog.nlm.nih.gov
txt
Updated Jun 3, 2023
Cite
Luke Sloan; Jeffrey Morgan; Pete Burnap; Matthew Williams (2023). Who Tweets? Deriving the Demographic Characteristics of Age, Occupation and Social Class from Twitter User Meta-Data [Dataset]. http://doi.org/10.1371/journal.pone.0115545
Available download formats: txt
Unique identifier
https://doi.org/10.1371/journal.pone.0115545
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS ONE
Authors
Luke Sloan; Jeffrey Morgan; Pete Burnap; Matthew Williams
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This paper specifies, designs and critically evaluates two tools for the automated identification of demographic data (age, occupation and social class) from the profile descriptions of Twitter users in the United Kingdom (UK). Meta-data routinely collected through the Collaborative Social Media Observatory (COSMOS: http://www.cosmosproject.net/) relating to UK Twitter users is matched with the occupational lookup tables between job and social class provided by the Office for National Statistics (ONS) using SOC2010. Using expert human validation, the validity and reliability of the automated matching process is critically assessed and a prospective class distribution of UK Twitter users is offered with 2011 Census baseline comparisons. The pattern matching rules for identifying age are explained and enacted following a discussion on how to minimise false positives. The age distribution of Twitter users, as identified using the tool, is presented alongside the age distribution of the UK population from the 2011 Census. The automated occupation detection tool reliably identifies certain occupational groups, such as professionals, for which job titles cannot be confused with hobbies or are used in common parlance within alternative contexts. An alternative explanation on the prevalence of hobbies is that the creative sector is overrepresented on Twitter compared to 2011 Census data. The age detection tool illustrates the youthfulness of Twitter users compared to the general UK population as of the 2011 Census according to proportions, but projections demonstrate that there is still potentially a large number of older platform users. It is possible to detect “signatures” of both occupation and age from Twitter meta-data with varying degrees of accuracy (particularly dependent on occupational groups) but further confirmatory work is needed.
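To make the idea of pattern matching for age concrete, here is a deliberately small sketch. The patterns and the 13 to 99 plausibility range are assumptions chosen for illustration; they are not the rules from the paper, and a real tool would need far more care to minimise false positives.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration; NOT the rules used in the paper.
AGE_PATTERNS = [
    # "I'm 23", "I am 23", "aged 45"
    re.compile(r"\b(?:i am|i['’]?m|aged?)\s+(\d{1,2})\b", re.IGNORECASE),
    # "23 years old", "23 yrs old"
    re.compile(r"\b(\d{1,2})\s*(?:years?|yrs?)\s+old\b", re.IGNORECASE),
]


def extract_age(description: str) -> Optional[int]:
    """Return the first plausible age stated in a profile description, or None."""
    for pattern in AGE_PATTERNS:
        match = pattern.search(description)
        if match:
            age = int(match.group(1))
            # Assumed plausibility range; rejects matches such as "I'm 5 minutes away".
            if 13 <= age <= 99:
                return age
    return None


for text in ["I'm 23, love football", "Teacher with 20 years of experience"]:
    print(repr(text), "->", extract_age(text))
```

Requiring the word "old" after "years" is one simple way to avoid reading "20 years of experience" as an age, which is the kind of false positive the description alludes to.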
Supporting data for "A Meta-Intervention: Quantifying the Impact of Social...
datahub.hku.hk
Updated May 23, 2025
Cite
Mingzhe Quan (2025). Supporting data for "A Meta-Intervention: Quantifying the Impact of Social Media Information on Adherence to Non-Pharmaceutical Interventions" [Dataset]. http://doi.org/10.25442/hku.29068061.v1
Unique identifier
https://doi.org/10.25442/hku.29068061.v1
Dataset updated
May 23, 2025
Dataset provided by
HKU Data Repository
Authors
Mingzhe Quan
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Description
This dataset supports a research project in the field of digital medicine, which aims to quantify the impact of disseminating scientific information on social media, as a form of "meta-intervention", on public adherence to Non-Pharmaceutical Interventions (NPIs) during health crises such as the COVID-19 pandemic. The research encompasses multiple sub-studies and pilot experiments, drawing data from various global and China-specific social media platforms.

The data included in this submission has been collected from several sources:
From Sina Weibo and Tencent WeChat, 189 online poll datasets were collected, involving a total of 1,391,706 participants. These participants are users of Sina Weibo or Tencent WeChat.
From Twitter, 187 tweets published by scientists (verified with a blue checkmark) related to COVID-19 were collected.
From Xiaohongshu and Bilibili, textual content from 143 user posts/videos concerning COVID-19, along with associated user comments and specific user responses to a question, were gathered.

It is important to note that while the broader research project also utilized a 3TB Reddit corpus hosted on Academic Torrents (academictorrents.com), this specific Reddit dataset is publicly available directly from Academic Torrents and is not included in this particular DataHub submission.

The submitted dataset comprises publicly available data, formatted as Excel files (.xlsx), and includes the following:

Filename: scientists' discourse (source from screenshot of tweets)
Description: This file contains screenshots of tweets published by scientists on Twitter concerning COVID-19 research, its current status, and related topics. It also includes a coded analysis of the textual content from these tweets. Specific details regarding the coding scheme can be found in the readme.txt file.

Filename: The links of online polls (Weibo & WeChat)
Description: This data file includes information from online polls conducted on Weibo and WeChat after December 7, 2022. These polls, often initiated by verified users (who may or may not be science popularizers), aimed to track the self-reported proportion of participants testing positive for COVID-19 (via PCR or rapid antigen test) or remaining negative, particularly during periods of rapid Omicron infection spread. The file contains links to the original polls, links to the social media accounts that published these polls, and relevant metadata about both the poll-creating accounts and the online polls themselves.

Filename: Online posts & comments (From Xiaohongshu & Bilibili)
Description: This file contains textual content from COVID-19 related posts and videos published by users on the Xiaohongshu and Bilibili platforms. It also includes user-generated comments reacting to these posts/videos, as well as user responses to a specific question posed within the context of the original content.

Key Features of this Dataset:
Data Type: Mixed, including textual data, screenshots of social media posts, web links to original sources, and coded metadata.
Source Platforms: Twitter (global), Weibo/WeChat (primarily China), Xiaohongshu (China), and Bilibili (video-sharing platform, primarily China).
Use Case: This dataset is intended for the analysis of public discourse, the dissemination of scientific information, and user engagement patterns across different cultural contexts and social media platforms, particularly in relation to public health information.
Data from: Using multiple imputation to estimate missing data in...
zenodo.org
data.niaid.nih.gov
csv, txt
Updated May 30, 2022
Cite
E. Hance Ellington; Guillaume Bastille-Rousseau; Cayla Austin; Kristen N. Landolt; Bruce A. Pond; Erin E. Rees; Nicholas Robar; Dennis L. Murray (2022). Data from: Using multiple imputation to estimate missing data in meta-regression [Dataset]. http://doi.org/10.5061/dryad.m2v4m
Available download formats: txt, csv
Unique identifier
https://doi.org/10.5061/dryad.m2v4m
Dataset updated
May 30, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
E. Hance Ellington; Guillaume Bastille-Rousseau; Cayla Austin; Kristen N. Landolt; Bruce A. Pond; Erin E. Rees; Nicholas Robar; Dennis L. Murray
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
1. There is a growing need for scientific synthesis in ecology and evolution. In many cases, meta-analytic techniques can be used to complement such synthesis. However, missing data is a serious problem for any synthetic efforts and can compromise the integrity of meta-analyses in these and other disciplines. Currently, the prevalence of missing data in meta-analytic datasets in ecology and the efficacy of different remedies for this problem have not been adequately quantified.

2. We generated meta-analytic datasets based on literature reviews of experimental and observational data and found that missing data were prevalent in meta-analytic ecological datasets. We then tested the performance of complete case removal (a widely used method when data are missing) and multiple imputation (an alternative method for data recovery) and assessed model bias, precision, and multi-model rankings under a variety of simulated conditions using published meta-regression datasets.

3. We found that complete case removal led to biased and imprecise coefficient estimates and yielded poorly specified models. In contrast, multiple imputation provided unbiased parameter estimates with only a small loss in precision. The performance of multiple imputation, however, was dependent on the type of data missing. It performed best when missing values were weighting variables, but performance was mixed when missing values were predictor variables. Multiple imputation performed poorly when imputing raw data which was then used to calculate effect size and the weighting variable.

4. We conclude that complete case removal should not be used in meta-regression, and that multiple imputation has the potential to be an indispensable tool for meta-regression in ecology and evolution. However, we recommend that users assess the performance of multiple imputation by simulating missing data on a subset of their data before implementing it to recover actual missing data.

Facebook: distribution of global audiences 2024, by age and gender

statista.com
de.statista.com

Cite

Stacy Jo Dixon, Facebook: distribution of global audiences 2024, by age and gender [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

As of April 2024, men between the ages of 25 and 34 years made up Facebook's largest audience, accounting for 18.4 percent of global users. Facebook's second largest audience base was men aged 18 to 24 years.

Facebook connects the world

Founded in 2004 and going public in 2012, Facebook is one of the biggest internet companies in the world, with influence that goes beyond social media. It is widely considered one of the Big Four tech companies, along with Google, Apple, and Amazon (together known by the acronym GAFA). Facebook is the most popular social network worldwide, and the company also owns three other billion-user properties: the mobile messaging apps WhatsApp and Facebook Messenger, and the photo-sharing app Instagram.

Facebook users

The vast majority of Facebook users connect to the social network via mobile devices. This is unsurprising, as Facebook has many users in mobile-first online markets. Currently, India ranks first in terms of Facebook audience size with 378 million users. The United States, Brazil, and Indonesia also each have more than 100 million Facebook users.

Dataset 1: The purchasing numbers.
plos.figshare.com
xls
Updated Apr 18, 2024
Cite
Shuang Zhou; Norlaile Salleh Hudin (2024). Dataset 1: The purchasing numbers. [Dataset]. http://doi.org/10.1371/journal.pone.0299087.t004
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0299087.t004
Dataset updated
Apr 18, 2024
Dataset provided by
PLOS ONE
Authors
Shuang Zhou; Norlaile Salleh Hudin
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
In recent years, the global e-commerce landscape has witnessed rapid growth, with sales reaching a new peak in the past year and expected to rise further in the coming years. Amid this e-commerce boom, accurately predicting user purchase behavior has become crucial for commercial success. We introduce a novel framework integrating three innovative approaches to enhance the prediction model’s effectiveness. First, we integrate an event-based timestamp encoding within a time-series attention model, effectively capturing the dynamic and temporal aspects of user behavior. This aspect is often neglected in traditional user purchase prediction methods, leading to suboptimal accuracy. Second, we incorporate Graph Neural Networks (GNNs) to analyze user behavior. By modeling users and their actions as nodes and edges within a graph structure, we capture complex relationships and patterns in user behavior more effectively than current models, offering a nuanced and comprehensive analysis. Lastly, our framework transcends traditional learning strategies by implementing advanced meta-learning techniques. This enables the model to autonomously adjust learning parameters, including the learning rate, in response to new and evolving data environments, thereby significantly enhancing its adaptability and learning efficiency. Through extensive experiments on diverse real-world e-commerce datasets, our model demonstrates superior performance, particularly in accuracy and adaptability in large-scale data scenarios. This study not only overcomes the existing challenges in analyzing e-commerce user behavior but also sets a foundation for future exploration in this dynamic field. We believe our contributions provide significant insights and tools for e-commerce platforms to better understand and cater to their users, ultimately driving sales and improving user experiences.
Eye2Sky dataset - All-sky images and meteorological measurements
data.niaid.nih.gov
Updated Feb 12, 2025
Cite
Schmidt, Thomas (2025). Eye2Sky dataset - All-sky images and meteorological measurements [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_12804612
Dataset updated
Feb 12, 2025
Dataset provided by
Hammer, Annette
Stührenberg, Jonas
Schmidt, Thomas
Vogt, Thomas
Wilbert, Stefan
Nouri, Bijan
Schroedter-Homscheidt, Marion
Lezaca, Jorge
Blum, Niklas
Heinemann, Detlev
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0), https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Description

The Eye2Sky public dataset comprises all-sky images from 29 All-Sky Imager (ASI) stations and meteorological measurements including solar irradiance from 11 stations in north-west Germany. A list of all stations with meta information is provided in the attached Eye2Sky_Station_List.xlsx

Meteorological measurements cover a full year from April 2022 to March 2023 with minutely averaged parameters. The measured parameters are the three solar radiation components: global horizontal, diffuse horizontal and direct normal irradiance. Global tilted irradiance is measured at a tilt angle of 30°, facing south. Measurement data has been quality controlled. Raw data is provided along with quality flags. For users who want to use the measurements directly, a "ready-to-use" data set with cleaned data is provided.

ASI images from all stations of the ASI network cover a 4-month period from April to July 2022 at a sampling interval of 30 seconds. For station OLUOL, the dataset covers a full year from April 2022 to March 2023 to allow season-dependent analyses at a central location inside the ASI network. The sampling interval for OLUOL is 15 seconds (since 2022-06-10). Images are provided as raw data (jpg-format), and calibration information and horizon masks are provided as meta data. Note: Due to the maximum upload size allowed by Zenodo, only an example data set (ASI_20220620.zip) is provided here. The full set of ASI images can be downloaded from https://eye2sky.de/data

Quality remarks

Invalid images (broken downloads, empty images, ...) have been removed from the dataset.

Moreover, images where faces of people are visible have been removed.

Any other disturbed images (e.g. due to birds, insects or just dirt) are kept in the database.

Python libraries for image handling are in preparation and will soon be published on GitHub.

Quality flags for measurement data

Measurement data was quality-checked using the procedure shown in the attached image QC_Flowchart.png. The manual quality control covers obvious measurement errors known to the station operator. The corresponding data is removed from the data set before additional tests are carried out in an automatic quality control process. The tests performed are described in a publication in preparation.

Data Format

All-sky images are provided as zipped archives in raw jpg-format (sampling 30 seconds).

Directory structure: One folder per day and station

Meta data: each station folder consists of

a YAML-file providing meta data for each configuration change. The filename contains the timestamp of the latest change in configuration (could be a change in camera alignment or positioning). The file contains information on geographical coordinates, internal and external orientation and necessary mask file.

image masks in binary png-files and mat-files.

an example ASI image with an image mask overlay and image orientated to geographic north according to calibration parameters

a keogram composite of all ASI images in the full data period.

Measurement data is provided in zipped station directories.

There are two files of timeseries data for each station and for the full time period.

data/STATION_ID.flagged.nc contains raw data including QC-flags

data/STATION_ID.cleaned.nc contains cleaned data

Meta data: For each station

a horizon file for the near horizon near_horizon.csv

and a log file of changes and quality check report is provided: QC_{STATION_ID}.pdf

Ceilometer data for Stations CDLRA (Oldenburg) and CLDRB (Westerstede) is provided in daily netcdf-Files.

Timeseries plots of daily measurements are attached for both stations.
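
The per-station file naming described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the extraction directory name `eye2sky_measurements` is hypothetical, the helper `station_files` is not part of any Eye2Sky tooling, and OLUOL is used purely as an example ID (the listing does not say whether it is one of the 11 measurement stations).

```python
from pathlib import Path


def station_files(root: str, station_id: str) -> dict[str, Path]:
    """Build the expected time-series paths for one station.

    Follows the naming given in the listing:
    data/STATION_ID.flagged.nc and data/STATION_ID.cleaned.nc.
    """
    data_dir = Path(root) / "data"
    return {
        "flagged": data_dir / f"{station_id}.flagged.nc",  # raw data with QC flags
        "cleaned": data_dir / f"{station_id}.cleaned.nc",  # cleaned, ready-to-use data
    }


# "eye2sky_measurements" is a hypothetical directory; "OLUOL" is an example ID.
paths = station_files("eye2sky_measurements", "OLUOL")
for kind, path in paths.items():
    status = "found" if path.exists() else "missing"
    print(f"{kind}: {path} ({status})")
```

The NetCDF files themselves would need a NetCDF-capable reader; the sketch only resolves and checks paths.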

VERSION UPDATES

Version 1.0:

Initial Upload

ACKNOWLEDGEMENTS

DLR Institute of Networked Energy Systems is responsible for the construction, operations, quality control and scientific evolution of the Eye2Sky network. DLR Institute of Solar Research was involved in designing the network and strongly supports Eye2Sky with software for data acquisition, calibration, processing and evaluation. We would like to thank all DLR staff who helped with the procurement of funding, installation support, data acquisition and data quality control. We also thank all persons, institutions and companies who contributed to the installation and who offered hosting a station.

LICENSE

All the All-Sky images are licensed under CC-BY-SA 3.0. All the measurement data from meteorological stations and ceilometers is licensed under the CDLA 1.0 sharing license: https://cdla.dev/sharing-1-0/.
Data from: KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle
zenodo.org
bin, bz2, pdf
Updated Jul 19, 2024
Luigi Quaranta; Fabio Calefato; Filippo Lanubile (2024). KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle [Dataset]. http://doi.org/10.5281/zenodo.4468523
Available download formats: bz2, pdf, bin
Unique identifier
https://doi.org/10.5281/zenodo.4468523
Dataset provided by
Zenodo, http://zenodo.org/
Authors
Luigi Quaranta; Fabio Calefato; Filippo Lanubile
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
KGTorrent is a dataset of Python Jupyter notebooks from the Kaggle platform.

The dataset is accompanied by a MySQL database containing metadata about the notebooks and the activity of Kaggle users on the platform. The information to build the MySQL database has been derived from Meta Kaggle, a publicly available dataset containing Kaggle metadata.

In this package, we share the complete KGTorrent dataset (consisting of the dataset itself plus its companion database), as well as the specific version of Meta Kaggle used to build the database.

More specifically, the package comprises the following three compressed archives:

KGT_dataset.tar.bz2, the dataset of Jupyter notebooks;

KGTorrent_dump_10-2020.sql.tar.bz2, the dump of the MySQL companion database;

MetaKaggle27Oct2020.tar.bz2, a copy of the Meta Kaggle version used to build the database.

Moreover, we include KGTorrent_logical_schema.pdf, the logical schema of the KGTorrent MySQL database.
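
As a minimal sketch of how one might peek into one of these archives without extracting it, the following uses Python's standard `tarfile` module. The helper name `list_members` is hypothetical, and the sketch assumes the archive has been downloaded to the working directory under the name given above; the internal layout of the archive is not described in this listing.

```python
import tarfile
from itertools import islice
from pathlib import Path


def list_members(archive_path: str, limit: int = 10) -> list[str]:
    """Return the names of the first `limit` members of a .tar.bz2 archive."""
    with tarfile.open(archive_path, mode="r:bz2") as archive:
        # Iterating the TarFile reads headers lazily, so large archives
        # are not fully scanned when only a few names are needed.
        return [member.name for member in islice(archive, limit)]


# Assumes the archive sits in the current directory; nothing happens otherwise.
archive = Path("KGT_dataset.tar.bz2")
if archive.exists():
    for name in list_members(str(archive)):
        print(name)
```

Because bz2 streams are decompressed sequentially, listing only the first few members avoids reading the whole file.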
AIT Alert Data Set
data.niaid.nih.gov
Updated Oct 14, 2024
Landauer, Max (2024). AIT Alert Data Set [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8263180
Dataset provided by
Wurzenberger, Markus
Skopik, Florian
Landauer, Max
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains the AIT Alert Data Set (AIT-ADS), a collection of synthetic alerts suitable for evaluation of alert aggregation, alert correlation, alert filtering, and attack graph generation approaches. The alerts were forensically generated from the AIT Log Data Set V2 (AIT-LDSv2) and originate from three intrusion detection systems, namely Suricata, Wazuh, and AMiner. The data sets comprise eight scenarios, each of which has been targeted by a multi-step attack with attack steps such as scans, web application exploits, password cracking, remote command execution, privilege escalation, etc. Each scenario and attack chain has certain variations so that attack manifestations and resulting alert sequences vary in each scenario; this means that the data set supports the development and evaluation of approaches that compute similarities of attack chains or merge them into meta-alerts. Since only a few benchmark alert data sets are publicly available, the AIT-ADS was developed to address common issues in the research domain of multi-step attack analysis; specifically, the alert data set contains many false positives caused by normal user behavior (e.g., user login attempts or software updates), heterogeneous alert formats (although all alerts are in JSON format, their fields are different for each IDS), repeated executions of attacks according to an attack plan, collection of alerts from diverse log sources (application logs and network traffic) and all components in the network (mail server, web server, DNS, firewall, file share, etc.), and labels for attack phases. For more information on how this alert data set was generated, check out our paper accompanying this data set [1] or our GitHub repository. More information on the original log data set, including a detailed description of scenarios and attacks, can be found in [2].

The alert data set contains two files for each of the eight scenarios, and a file for their labels:

_aminer.json contains alerts from AMiner IDS

_wazuh.json contains alerts from Wazuh IDS and Suricata IDS

labels.csv contains the start and end times of attack phases in each scenario

Besides false-positive alerts, the alerts in the AIT-ADS correspond to the following attacks:

Scans (nmap, WPScan, dirb)

Webshell upload (CVE-2020-24186)

Password cracking (John the Ripper)

Privilege escalation

Remote command execution

Data exfiltration (DNSteal) and stopped service

The total number of alerts in the data set is 2,655,821, of which 2,293,628 originate from Wazuh, 306,635 originate from Suricata, and 55,558 originate from AMiner. The numbers of alerts in each scenario are as follows. fox: 473,104; harrison: 593,948; russellmitchell: 45,544; santos: 130,779; shaw: 70,782; wardbeck: 91,257; wheeler: 616,161; wilson: 634,246.
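
The counts above can be cross-checked with a few lines of Python. This is just an arithmetic sanity check on the figures quoted in the description, not part of the data set's tooling; the variable names are illustrative.

```python
# Alert counts copied from the description above.
alerts_by_ids = {"Wazuh": 2_293_628, "Suricata": 306_635, "AMiner": 55_558}
alerts_by_scenario = {
    "fox": 473_104,
    "harrison": 593_948,
    "russellmitchell": 45_544,
    "santos": 130_779,
    "shaw": 70_782,
    "wardbeck": 91_257,
    "wheeler": 616_161,
    "wilson": 634_246,
}
REPORTED_TOTAL = 2_655_821

ids_total = sum(alerts_by_ids.values())
scenario_total = sum(alerts_by_scenario.values())

# Both breakdowns should add up to the reported total.
print(ids_total == REPORTED_TOTAL, scenario_total == REPORTED_TOTAL)

# Share of alerts per IDS, largest first.
for ids, count in sorted(alerts_by_ids.items(), key=lambda kv: -kv[1]):
    print(f"{ids}: {count / REPORTED_TOTAL:.1%}")
```

Both the per-IDS and the per-scenario breakdowns sum to the reported total, so the two views of the data set are consistent with each other.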

Acknowledgements: Partially funded by the European Defence Fund (EDF) projects AInception (101103385) and NEWSROOM (101121403), and the FFG project PRESENT (FO999899544). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. The European Union cannot be held responsible for them.

If you use the AIT-ADS, please cite the following publications:

[1] Landauer, M., Skopik, F., Wurzenberger, M. (2024): Introducing a New Alert Data Set for Multi-Step Attack Analysis. Proceedings of the 17th Cyber Security Experimentation and Test Workshop.

[2] Landauer, M., Skopik, F., Frank, M., Hotwagner, W., Wurzenberger, M., Rauber, A. (2023): Maintainable Log Datasets for Evaluation of Intrusion Detection Systems. IEEE Transactions on Dependable and Secure Computing, vol. 20, no. 4, pp. 3466-3482.
Standardized Hudup dataset based on Movielens 1m
dataverse.harvard.edu
data.mendeley.com
Updated Feb 16, 2021
Loc Nguyen (2021). Standardized Hudup dataset based on Movielens 1m [Dataset]. http://doi.org/10.7910/DVN/F1VQFJ
Unique identifier
https://doi.org/10.7910/DVN/F1VQFJ
Dataset provided by
Harvard Dataverse
Authors
Loc Nguyen
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The standardized Hudup dataset is derived from raw data and is composed of ten units: “hdp_config”, “hdp_account”, “hdp_attribute_map”, “hdp_nominal”, “hdp_user”, “hdp_item”, “hdp_rating”, “hdp_context_template”, “hdp_context”, and “hdp_sample”. Each unit has a particular function, which is described in the data description section. The Hudup dataset is metadata that models any raw data at an abstract level. The default raw data used as the source of the Hudup dataset here is Movielens 1M. The Hudup dataset can therefore be considered secondary data, whereas Movielens is primary data. The raw rating data Movielens (GroupLens, 1998) 1M has 1,000,209 ratings from 6,040 users on 3,900 movies (items) and is available at https://files.grouplens.org/datasets/movielens/ml-1m.zip.
Standardized Hudup dataset based on Film Trust data
dataverse.harvard.edu
data.mendeley.com
Updated Feb 16, 2021
Loc Nguyen (2021). Standardized Hudup dataset based on Film Trust data [Dataset]. http://doi.org/10.7910/DVN/GTGJQD
Unique identifier
https://doi.org/10.7910/DVN/GTGJQD
Dataset provided by
Harvard Dataverse
Authors
Loc Nguyen
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The standardized Hudup dataset is derived from raw data and is composed of ten units: “hdp_config”, “hdp_account”, “hdp_attribute_map”, “hdp_nominal”, “hdp_user”, “hdp_item”, “hdp_rating”, “hdp_context_template”, “hdp_context”, and “hdp_sample”. Each unit has a particular function, which is described in the data description section. The Hudup dataset is metadata that models any raw data at an abstract level. The raw data used as the source of the Hudup dataset here is the Film Trust data. The Hudup dataset can therefore be considered secondary data, whereas Film Trust is primary data. The raw rating data Film Trust has 35,497 ratings from 1,508 users on 2,071 films (items) and is available at https://guoguibing.github.io/librec/datasets/filmtrust.zip.
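To put the rating counts in perspective, the density of the user-item rating matrix (the fraction of possible user-item pairs that actually carry a rating) can be computed from the figures quoted in the two Hudup entries. This is a small illustrative sketch; the function name `rating_density` is hypothetical and not part of Hudup.

```python
def rating_density(ratings: int, users: int, items: int) -> float:
    """Fraction of all possible user-item pairs that have a rating."""
    return ratings / (users * items)


# (ratings, users, items) as quoted in the dataset descriptions.
datasets = {
    "Movielens 1M": (1_000_209, 6_040, 3_900),
    "Film Trust": (35_497, 1_508, 2_071),
}

for name, (ratings, users, items) in datasets.items():
    density = rating_density(ratings, users, items)
    print(f"{name}: {density:.2%} of user-item pairs are rated")
```

Both matrices are sparse: roughly 4% of the possible pairs are rated in Movielens 1M and roughly 1% in Film Trust.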
Coastal final ecosystem goods and services (FEGS) and habitats meta-analysis...
datasets.ai
catalog.data.gov
Updated Nov 12, 2020
U.S. Environmental Protection Agency (2020). Coastal final ecosystem goods and services (FEGS) and habitats meta-analysis data file [Dataset]. https://datasets.ai/datasets/coastal-final-ecosystem-goods-and-services-fegs-and-habitats-meta-analysis-data-file
Dataset authored and provided by
U.S. Environmental Protection Agency
Description
Coastal ecosystem goods and services (EGS) have steadily gained traction in the scientific literature over the last few decades, providing a wealth of information about underlying coastal habitat dependencies. This meta-analysis summarizes relationships between coastal habitats and final ecosystem goods and services (FEGS) users. Through a “weight of evidence” approach synthesizing information from published literature, we assessed habitat classes most relevant to coastal users. Approximately 2800 coastal EGS journal articles were identified by online search engines, of which 16% addressed linkages between specific coastal habitats and FEGS users, and were retained for subsequent analysis. Recreational (83%) and industrial (35%) users were most cited in literature, with experiential-users/hikers and commercial fishermen most prominent in each category, respectively. Recreational users were linked to the widest diversity of coastal habitat subclasses (i.e., 22 of 26), whereas mangroves and emergent wetlands were most relevant for property owners. We urge EGS studies to continue surveying local users and identifying habitat dependencies, as these steps are important precursors for developing appropriate coastal FEGS metrics and facilitating local valuation. In addition, understanding how habitats contribute to human well-being may assist communities in prioritizing restoration and evaluating development scenarios in the context of future ecosystem service delivery.

This dataset is associated with the following publication: Littles, C., C. Jackson, T. DeWitt, and M. Harwell. Linking People to Coastal Habitats: A meta-analysis of final ecosystem goods and services (FEGS) on the coast. Ocean & Coastal Management. Elsevier, Shannon, IRELAND, 165: 356-369, (2018).
Whatsapp 2022 Stats
kaggle.com
Updated Feb 21, 2022
ArnavR (2022). Whatsapp 2022 Stats [Dataset]. https://www.kaggle.com/arnavr10880/whatsapp-stats/discussion
Dataset provided by
Kaggle, http://kaggle.com/
Authors
ArnavR
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

WhatsApp Messenger, or simply WhatsApp, is an internationally available American freeware, owned by Meta Platforms (previously Facebook). This dataset provides the latest statistics of Whatsapp in our day-to-day lives.

Content

The dataset contains 7 files:

* age_group.csv: WhatsApp usage by age group (US)
* by_country.csv: WhatsApp users by country
* messages_sent_daily.csv: WhatsApp messages sent daily
* ratings.csv: WhatsApp Play Store & App Store ratings
* usage.csv: WhatsApp daily, weekly & monthly usage (US)
* user.csv: WhatsApp user growth over time
* user_growth: latest WhatsApp user growth percentage

Acknowledgements

This data has been scraped from Business Insider, Twitter, Facebook, Statista, Sensor Tower, backlinkto and some others.

Inspiration

This dataset can be analyzed to see:

* the effect of WhatsApp on the present-day world;
* how much time an average person spends on WhatsApp;
* the number of users on the platform;

and many other parameters that we can think of!
