The global number of Facebook users was forecast to increase continuously between 2023 and 2027 by a total of 391 million users (+14.36 percent). After the fourth consecutive year of growth, the Facebook user base is estimated to reach 3.1 billion users, a new peak, in 2027. Notably, the number of Facebook users has also increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
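The forecast figures can be cross-checked with simple arithmetic. The sketch below back-calculates the implied 2023 base from the stated absolute and relative increase; the variable names are illustrative, and the result is only as precise as the rounding in the published figures.

```python
# Figures quoted in the text (users in millions, growth in percent).
increase_millions = 391
increase_percent = 14.36

# increase = base * percent / 100, so the implied 2023 base is:
base_2023 = increase_millions / (increase_percent / 100)

# Implied 2027 level after adding the forecast increase.
peak_2027 = base_2023 + increase_millions

print(f"Implied 2023 base: {base_2023 / 1000:.2f} billion")
print(f"Implied 2027 peak: {peak_2027 / 1000:.2f} billion")
```

The implied 2027 value rounds to the 3.1 billion users stated in the text.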
Which country has the most Facebook users?
There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country then it would be ranked third in terms of largest population worldwide. Apart from India, there are several other markets with more than 100 million Facebook users each: The United States, Indonesia, and Brazil with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.
Facebook – the most used social media
Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media platform worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.
Facebook usage by device
As of July 2021, 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, almost 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is not only available through mobile browsers: the company has also published several mobile apps through which users can access its products and services. As of the third quarter of 2021, the four core Meta products led the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.
How many people use social media?
Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.
Who uses social media?
Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.
How much time do people spend on social media?
Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America spent the most time per day on social media.
What are the most popular social media platforms?
Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Meta Platforms, Inc., formerly known as Facebook, was founded by Mark Zuckerberg along with his college roommates in 2004. Initially created as a social networking site for Harvard students, Facebook rapidly expanded to become a global social media giant. The company rebranded as Meta in 2021 to reflect its new focus on building the metaverse, a virtual-reality space where users can interact with a computer-generated environment and other users. Over the years, Meta has acquired several other social media platforms and technology companies, including Instagram, WhatsApp, and Oculus VR, significantly expanding its influence in the tech industry. Headquartered in Menlo Park, California, Meta continues to lead in social media innovation and virtual reality technology.
This dataset provides a comprehensive record of Meta's stock price changes over the last 12 years. It includes essential columns such as the date, opening price, highest price of the day, lowest price of the day, closing price, adjusted closing price, and trading volume.
This extensive data is invaluable for conducting historical analyses, forecasting future stock performance, and understanding long-term market trends related to Meta's stock.
https://creativecommons.org/publicdomain/zero/1.0/
Model, Dataset, and Notebook, rather than the creation date of the very first version. I also added the reaction counts, which come from a new CSV added to the Meta Kaggle dataset. The discussion can be found here. I also added versions created for Model, Notebook, and Dataset to properly track users who are updating their datasets.
ModelUpvotesGiven and ModelUpvotesReceived values being identical
User aggregated stats and data using the Official Meta Kaggle dataset
Expect some discrepancies with the counts shown in your profile: besides the lag of one to two days before a new dataset is published, some information, such as Kaggle staff upvotes and private competitions, is not included. For almost all members, however, the figures should reconcile.
📊 (Scheduled) Meta Kaggle Users' Stats
Generated with Bing image generator
Social media companies are starting to offer users the option to subscribe to their platforms in exchange for monthly fees. Until recently, social media has been predominantly free to use, with tech companies relying on advertising as their main revenue generator. However, advertising revenues have been dropping following the COVID-induced boom. As of July 2023, Meta Verified is the most costly of the subscription services, setting users back almost 15 U.S. dollars per month on iOS or Android. Twitter Blue costs between eight and 11 U.S. dollars per month and ensures users will receive the blue check mark, and have the ability to edit tweets and have NFT profile pictures. Snapchat+, drawing in four million users as of the second quarter of 2023, boasts a Story re-watch function, custom app icons, and a Snapchat+ badge.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.
By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.
Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.
The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!
While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.
The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.
The files are organized into a two-level directory structure. Each top level folder contains up to 1 million files, e.g. - folder 123 contains all versions from 123,000,000 to 123,999,999. Each sub folder contains up to 1 thousand files, e.g. - 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions.
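The two-level layout described above can be sketched as a small helper that maps a KernelVersions id to its folder. The function name is hypothetical, and the sketch assumes folder names are plain integers without zero padding, as in the `123/456` example; adjust the formatting if the actual layout pads them.

```python
def folder_for_version(version_id: int) -> str:
    """Return the 'top/sub' folder for a KernelVersions id.

    Assumption: folder names are unpadded integers, as in '123/456'.
    """
    if version_id < 0:
        raise ValueError("version_id must be non-negative")
    top = version_id // 1_000_000        # one top-level folder per million ids
    sub = (version_id // 1_000) % 1_000  # one sub folder per thousand ids
    return f"{top}/{sub}"


print(folder_for_version(123_456_789))  # prints 123/456
```

Integer division keeps the mapping consistent with the ranges quoted in the text: every id from 123,456,000 to 123,456,999 lands in the same sub folder.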
The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays
We love feedback! Let us know in the Discussion tab.
Happy Kaggling!
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
The most vulnerable group of traffic participants are pedestrians using mobility aids. While there has been significant progress in the robustness and reliability of camera based general pedestrian detection systems, pedestrians reliant on mobility aids are highly underrepresented in common datasets for object detection and classification.
To bridge this gap and enable research towards robust and reliable detection systems which may be employed in traffic monitoring, scheduling, and planning, we present this dataset of a pedestrian crossing scenario taken from an elevated traffic monitoring perspective together with ground truth annotations (Yolo format [1]). Classes present in the dataset are pedestrian (without mobility aids), as well as pedestrians using wheelchairs, rollators/wheeled walkers, crutches, and walking canes. The dataset comes with official training, validation, and test splits.
An in-depth description of the dataset can be found in [2]. If you make use of this dataset in your work, research or publication, please cite this work as:
@inproceedings{mohr2023mau,
author = {Mohr, Ludwig and Kirillova, Nadezda and Possegger, Horst and Bischof, Horst},
title = {{A Comprehensive Crossroad Camera Dataset of Mobility Aid Users}},
booktitle = {Proceedings of the 34th British Machine Vision Conference ({BMVC}2023)},
year = {2023}
}
The archive mobility.zip contains the full detection dataset in Yolo format with images, ground truth labels and meta data. The archive mobility_class_hierarchy.zip contains labels and meta files (Yolo format) for training with a class hierarchy using, e.g., the modified version of Yolo v5/v8 available under [3].
To use this dataset with Yolo, you will need to download and extract the zip archive and change the path entry in dataset.yaml to the directory where you extracted the archive to.
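As a minimal sketch of the path change described above, the helper below rewrites the top-level `path:` entry in the text of a dataset.yaml file. The function name, the example file contents, and the target directory are assumptions for illustration; the actual dataset.yaml shipped with the archive may contain different keys.

```python
import re


def set_dataset_path(yaml_text: str, new_path: str) -> str:
    """Replace the value of the first top-level 'path:' entry in YAML text."""
    pattern = re.compile(r"^path:.*$", flags=re.MULTILINE)
    if not pattern.search(yaml_text):
        raise ValueError("no top-level 'path:' entry found")
    # A function replacement avoids backslash-escape issues with Windows paths.
    return pattern.sub(lambda _m: f"path: {new_path}", yaml_text, count=1)


# Hypothetical file contents, for illustration only.
example = "path: /old/location\ntrain: images/train\nval: images/val\n"
updated = set_dataset_path(example, "/data/mobility")
print(updated)
```

In practice the text would be read from and written back to the extracted dataset.yaml; editing the file by hand works equally well.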
[1] https://github.com/ultralytics/ultralytics
[2] coming soon
[3] coming soon
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This paper specifies, designs and critically evaluates two tools for the automated identification of demographic data (age, occupation and social class) from the profile descriptions of Twitter users in the United Kingdom (UK). Meta-data routinely collected through the Collaborative Social Media Observatory (COSMOS: http://www.cosmosproject.net/) relating to UK Twitter users is matched with the occupational lookup tables between job and social class provided by the Office for National Statistics (ONS) using SOC2010. Using expert human validation, the validity and reliability of the automated matching process is critically assessed and a prospective class distribution of UK Twitter users is offered with 2011 Census baseline comparisons. The pattern matching rules for identifying age are explained and enacted following a discussion on how to minimise false positives. The age distribution of Twitter users, as identified using the tool, is presented alongside the age distribution of the UK population from the 2011 Census. The automated occupation detection tool reliably identifies certain occupational groups, such as professionals, for which job titles cannot be confused with hobbies or are used in common parlance within alternative contexts. An alternative explanation on the prevalence of hobbies is that the creative sector is overrepresented on Twitter compared to 2011 Census data. The age detection tool illustrates the youthfulness of Twitter users compared to the general UK population as of the 2011 Census according to proportions, but projections demonstrate that there is still potentially a large number of older platform users. It is possible to detect “signatures” of both occupation and age from Twitter meta-data with varying degrees of accuracy (particularly dependent on occupational groups) but further confirmatory work is needed.
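To illustrate what pattern matching on profile descriptions can look like, here is a minimal sketch. The regular expressions, the function name, and the plausibility bounds are hypothetical and are not the rules used in the paper; they only show the general idea of extracting an age while rejecting some obvious false positives.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; not the paper's rules.
AGE_PATTERNS = [
    re.compile(r"\b(\d{1,2})\s*(?:years?\s*old|yrs?\s*old|y/o)\b", re.IGNORECASE),
    re.compile(r"\bage[d:]?\s*(\d{1,2})\b", re.IGNORECASE),
]


def extract_age(description: str, lo: int = 13, hi: int = 99) -> Optional[int]:
    """Return the first plausible age found in a profile description, else None."""
    for pattern in AGE_PATTERNS:
        match = pattern.search(description)
        if match:
            age = int(match.group(1))
            # Reject implausible values, e.g. the age of a user's child.
            if lo <= age <= hi:
                return age
    return None


for text in ["23 years old, coffee lover", "Tweeting since 1995", "Proud dad of a 3 year old"]:
    print(repr(text), "->", extract_age(text))
```

A simple range check like this cannot catch every false positive (a description mentioning a 25-year-old relative would still match), which is why expert human validation of the kind the abstract describes remains necessary.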
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
This dataset supports a research project in the field of digital medicine, which aims to quantify the impact of disseminating scientific information on social media, as a form of "meta-intervention", on public adherence to Non-Pharmaceutical Interventions (NPIs) during health crises such as the COVID-19 pandemic. The research encompasses multiple sub-studies and pilot experiments, drawing data from various global and China-specific social media platforms.
The data included in this submission has been collected from several sources:
From Sina Weibo and Tencent WeChat, 189 online poll datasets were collected, involving a total of 1,391,706 participants. These participants are users of Sina Weibo or Tencent WeChat.
From Twitter, 187 tweets published by scientists (verified with a blue checkmark) related to COVID-19 were collected.
From Xiaohongshu and Bilibili, textual content from 143 user posts/videos concerning COVID-19, along with associated user comments and specific user responses to a question, were gathered.
It is important to note that while the broader research project also utilized a 3TB Reddit corpus hosted on Academic Torrents (academictorrents.com), this specific Reddit dataset is publicly available directly from Academic Torrents and is not included in this particular DataHub submission.
The submitted dataset comprises publicly available data, formatted as Excel files (.xlsx), and includes the following:
Filename: scientists' discourse (source from screenshot of tweets)
Description: This file contains screenshots of tweets published by scientists on Twitter concerning COVID-19 research, its current status, and related topics. It also includes a coded analysis of the textual content from these tweets. Specific details regarding the coding scheme can be found in the readme.txt file.
Filename: The links of online polls (Weibo & WeChat)
Description: This data file includes information from online polls conducted on Weibo and WeChat after December 7, 2022. These polls, often initiated by verified users (who may or may not be science popularizers), aimed to track the self-reported proportion of participants testing positive for COVID-19 (via PCR or rapid antigen test) or remaining negative, particularly during periods of rapid Omicron infection spread. The file contains links to the original polls, links to the social media accounts that published these polls, and relevant metadata about both the poll-creating accounts and the online polls themselves.
Filename: Online posts & comments (From Xiaohongshu & Bilibili)
Description: This file contains textual content from COVID-19 related posts and videos published by users on the Xiaohongshu and Bilibili platforms. It also includes user-generated comments reacting to these posts/videos, as well as user responses to a specific question posed within the context of the original content.
Key Features of this Dataset:
Data Type: Mixed, including textual data, screenshots of social media posts, web links to original sources, and coded metadata.
Source Platforms: Twitter (global), Weibo/WeChat (primarily China), Xiaohongshu (China), and Bilibili (video-sharing platform, primarily China).
Use Case: This dataset is intended for the analysis of public discourse, the dissemination of scientific information, and user engagement patterns across different cultural contexts and social media platforms, particularly in relation to public health information.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
As of April 2024, men between the ages of 25 and 34 years made up Facebook's largest audience, accounting for 18.4 percent of global users. Facebook's second largest audience base was men aged 18 to 24 years.
Facebook connects the world
Founded in 2004 and going public in 2012, Facebook is one of the biggest internet companies in the world with influence that goes beyond social media. It is widely considered one of the Big Four tech companies, along with Google, Apple, and Amazon (all together known under the acronym GAFA). Facebook is the most popular social network worldwide and the company also owns three other billion-user properties: mobile messaging apps WhatsApp and Facebook Messenger, as well as photo-sharing app Instagram.
Facebook users
The vast majority of Facebook users connect to the social network via mobile devices. This is unsurprising, as Facebook has many users in mobile-first online markets. Currently, India ranks first in terms of Facebook audience size with 378 million users. The United States, Brazil, and Indonesia also all have more than 100 million Facebook users each.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
In recent years, the global e-commerce landscape has witnessed rapid growth, with sales reaching a new peak in the past year and expected to rise further in the coming years. Amid this e-commerce boom, accurately predicting user purchase behavior has become crucial for commercial success. We introduce a novel framework integrating three innovative approaches to enhance the prediction model’s effectiveness. First, we integrate an event-based timestamp encoding within a time-series attention model, effectively capturing the dynamic and temporal aspects of user behavior. This aspect is often neglected in traditional user purchase prediction methods, leading to suboptimal accuracy. Second, we incorporate Graph Neural Networks (GNNs) to analyze user behavior. By modeling users and their actions as nodes and edges within a graph structure, we capture complex relationships and patterns in user behavior more effectively than current models, offering a nuanced and comprehensive analysis. Lastly, our framework transcends traditional learning strategies by implementing advanced meta-learning techniques. This enables the model to autonomously adjust learning parameters, including the learning rate, in response to new and evolving data environments, thereby significantly enhancing its adaptability and learning efficiency. Through extensive experiments on diverse real-world e-commerce datasets, our model demonstrates superior performance, particularly in accuracy and adaptability in large-scale data scenarios. This study not only overcomes the existing challenges in analyzing e-commerce user behavior but also sets a foundation for future exploration in this dynamic field. We believe our contributions provide significant insights and tools for e-commerce platforms to better understand and cater to their users, ultimately driving sales and improving user experiences.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
Eye2Sky dataset - All-sky images and meteorological measurements
Description
The Eye2Sky public dataset comprises all-sky images from 29 All-Sky Imager (ASI) stations and meteorological measurements including solar irradiance from 11 stations in north-west Germany. A list of all stations with meta information is provided in the attached Eye2Sky_Station_List.xlsx.
Meteorological measurements cover a full year from April 2022 to March 2023 with minutely averaged parameters. The parameters measured are three solar radiation components: global horizontal, diffuse horizontal and direct normal irradiance. Global tilted irradiance is measured at a tilt angle of 30° facing south. Measurement data has been quality controlled. Raw data is provided along with quality flags. For users who want to use measurements directly, a "ready-to-use" data set with cleaned data is provided.
ASI images of the whole ASI network at all stations cover a 4-month period from April to July 2022 with a sampling rate of 30 seconds. For station OLUOL, the dataset covers a full year from April 2022 to March 2023 to allow season-dependent analyses at a central location inside the ASI network. The sampling rate for OLUOL is 15 seconds (since 2022-06-10). Images are provided as raw data (jpg format), and calibration information and horizon masks are provided as meta data. Note: Due to the maximum upload size allowed by Zenodo, only an example data set (ASI_20220620.zip) is provided here. The full set of ASI images can be downloaded from here: https://eye2sky.de/data
Quality remarks
Invalid images (broken downloads, empty images, ...) have been removed from the dataset.
Moreover, images where faces of people are visible have been removed.
Any other disturbed images (e.g. due to birds, insects or just dirt) are kept in the database.
Python libraries for image handling are in preparation and will soon be published on GitHub.
Quality flags for measurement data
Measurement data was quality checked using the procedure in attached image QC_Flowchart.png. The manual quality control refers to obvious measurement errors known to the station operator. Corresponding data is removed from the data set before additional tests are carried out on the data in an automatic quality control process. The tests performed are described in a publication in preparation.
Data Format
All-sky images are provided as zipped archives in raw jpg-format (sampling 30 seconds).
Directory structure: One folder per day and station
Meta data: each station folder consists of
a YAML-file providing meta data for each configuration change. The filename contains the timestamp of the latest change in configuration (could be a change in camera alignment or positioning). The file contains information on geographical coordinates, internal and external orientation and necessary mask file.
image masks in binary png-files and mat-files.
an example ASI image with an image mask overlay and image orientated to geographic north according to calibration parameters
a keogram composite of all ASI images in the full data period.
Measurement data is provided in zipped station directories.
There are two files of timeseries data for each station and for the full time period.
data/STATION_ID.flagged.nc contains raw data including QC-flags
data/STATION_ID.cleaned.nc contains cleaned data
Meta data: for each station, a horizon file for the near horizon (near_horizon.csv) and a log file of changes with a quality check report (QC_{STATION_ID}.pdf) are provided.
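The naming scheme for the measurement files can be sketched as a small helper that builds the expected file names for one station. The function name is hypothetical, the station id is used purely as an example, and the sketch assumes all paths are relative to the extracted station directory, which the description does not state explicitly.

```python
from pathlib import PurePosixPath


def measurement_files(station_id: str) -> dict:
    """Build the relative file names listed in the description for one station.

    Assumption: paths are relative to the extracted station directory.
    """
    return {
        "flagged": PurePosixPath("data") / f"{station_id}.flagged.nc",  # raw data with QC flags
        "cleaned": PurePosixPath("data") / f"{station_id}.cleaned.nc",  # cleaned data
        "near_horizon": PurePosixPath("near_horizon.csv"),
        "qc_report": PurePosixPath(f"QC_{station_id}.pdf"),
    }


for name, path in measurement_files("OLUOL").items():
    print(f"{name}: {path}")
```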
Ceilometer data for Stations CDLRA (Oldenburg) and CLDRB (Westerstede) is provided in daily netcdf-Files.
Timeseries plots of daily measurements are attached for both stations.
VERSION UPDATES
Version 1.0:
Initial Upload
ACKNOWLEDGEMENTS
DLR Institute of Networked Energy Systems is responsible for the construction, operations, quality control and scientific evolution of the Eye2Sky network. DLR Institute of Solar Research was involved in designing the network and strongly supports Eye2Sky with software for data acquisition, calibration, processing and evaluation. We would like to thank all DLR staff who helped with the procurement of funding, installation support, data acquisition and data quality control. We also thank all persons, institutions and companies who contributed to the installation and who offered hosting a station.
LICENSE
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
KGTorrent is a dataset of Python Jupyter notebooks from the Kaggle platform.
The dataset is accompanied by a MySQL database containing metadata about the notebooks and the activity of Kaggle users on the platform. The information to build the MySQL database has been derived from Meta Kaggle, a publicly available dataset containing Kaggle metadata.
In this package, we share the complete KGTorrent dataset (consisting of the dataset itself plus its companion database), as well as the specific version of Meta Kaggle used to build the database.
More specifically, the package comprises the following three compressed archives:
KGT_dataset.tar.bz2, the dataset of Jupyter notebooks;
KGTorrent_dump_10-2020.sql.tar.bz2, the dump of the MySQL companion database;
MetaKaggle27Oct2020.tar.bz2, a copy of the Meta Kaggle version used to build the database.
Moreover, we include KGTorrent_logical_schema.pdf, the logical schema of the KGTorrent MySQL database.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This repository contains the AIT Alert Data Set (AIT-ADS), a collection of synthetic alerts suitable for evaluation of alert aggregation, alert correlation, alert filtering, and attack graph generation approaches. The alerts were forensically generated from the AIT Log Data Set V2 (AIT-LDSv2) and originate from three intrusion detection systems, namely Suricata, Wazuh, and AMiner. The data sets comprise eight scenarios, each of which has been targeted by a multi-step attack with attack steps such as scans, web application exploits, password cracking, remote command execution, privilege escalation, etc. Each scenario and attack chain has certain variations so that attack manifestations and resulting alert sequences vary in each scenario; this means that the data set allows the development and evaluation of approaches that compute similarities of attack chains or merge them into meta-alerts. Since only a few benchmark alert data sets are publicly available, the AIT-ADS was developed to address common issues in the research domain of multi-step attack analysis; specifically, the alert data set contains many false positives caused by normal user behavior (e.g., user login attempts or software updates), heterogeneous alert formats (although all alerts are in JSON format, their fields are different for each IDS), repeated executions of attacks according to an attack plan, collection of alerts from diverse log sources (application logs and network traffic) and all components in the network (mail server, web server, DNS, firewall, file share, etc.), and labels for attack phases. For more information on how this alert data set was generated, check out our paper accompanying this data set [1] or our GitHub repository. More information on the original log data set, including a detailed description of scenarios and attacks, can be found in [2].
The alert data set contains two files for each of the eight scenarios, and a file for their labels:
_aminer.json contains alerts from AMiner IDS
_wazuh.json contains alerts from Wazuh IDS and Suricata IDS
labels.csv contains the start and end times of attack phases in each scenario
Besides false positive alerts, the alerts in the AIT-ADS correspond to the following attacks:
Scans (nmap, WPScan, dirb)
Webshell upload (CVE-2020-24186)
Password cracking (John the Ripper)
Privilege escalation
Remote command execution
Data exfiltration (DNSteal) and stopped service
The total number of alerts in the data set is 2,655,821, of which 2,293,628 originate from Wazuh, 306,635 originate from Suricata, and 55,558 originate from AMiner. The numbers of alerts in each scenario are as follows. fox: 473,104; harrison: 593,948; russellmitchell: 45,544; santos: 130,779; shaw: 70,782; wardbeck: 91,257; wheeler: 616,161; wilson: 634,246.
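The alert counts quoted above can be checked for internal consistency. The short sketch below confirms that both the per-IDS and the per-scenario counts sum to the stated total and prints each scenario's share; the variable names are illustrative.

```python
# Alert counts as quoted in the text.
per_ids = {"Wazuh": 2_293_628, "Suricata": 306_635, "AMiner": 55_558}
per_scenario = {
    "fox": 473_104,
    "harrison": 593_948,
    "russellmitchell": 45_544,
    "santos": 130_779,
    "shaw": 70_782,
    "wardbeck": 91_257,
    "wheeler": 616_161,
    "wilson": 634_246,
}
stated_total = 2_655_821

# Both breakdowns should add up to the stated total.
assert sum(per_ids.values()) == stated_total
assert sum(per_scenario.values()) == stated_total

# Share of all alerts per scenario, largest first.
for name, count in sorted(per_scenario.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:16s}{count:>9,d}{count / stated_total:8.1%}")
```

The shares make the imbalance visible: a few scenarios account for most alerts, which is worth keeping in mind when evaluating per-scenario results.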
Acknowledgements: Partially funded by the European Defence Fund (EDF) projects AInception (101103385) and NEWSROOM (101121403), and the FFG project PRESENT (FO999899544). Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. The European Union cannot be held responsible for them.
If you use the AIT-ADS, please cite the following publications:
[1] Landauer, M., Skopik, F., Wurzenberger, M. (2024): Introducing a New Alert Data Set for Multi-Step Attack Analysis. Proceedings of the 17th Cyber Security Experimentation and Test Workshop. [PDF]
[2] Landauer M., Skopik F., Frank M., Hotwagner W., Wurzenberger M., Rauber A. (2023): Maintainable Log Datasets for Evaluation of Intrusion Detection Systems. IEEE Transactions on Dependable and Secure Computing, vol. 20, no. 4, pp. 3466-3482. [PDF]
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
The standardized Hudup dataset receives information from raw data and is composed of ten units: “hdp_config”, “hdp_account”, “hdp_attribute_map”, “hdp_nominal”, “hdp_user”, “hdp_item”, “hdp_rating”, “hdp_context_template”, “hdp_context”, and “hdp_sample”. Each unit has particular functions, which are described in the data description section. The Hudup dataset is meta-data that models any raw data at an abstract level. The default raw data source of the Hudup dataset here is Movielens 1M. The Hudup dataset can therefore be considered secondary data, whereas Movielens is primary data. The raw rating data Movielens 1M (GroupLens, 1998) has 1,000,209 ratings from 6,040 users on 3,900 movies (items) and is available at https://files.grouplens.org/datasets/movielens/ml-1m.zip.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The standardized Hudup dataset receives its information from raw data and is composed of ten units: “hdp_config”, “hdp_account”, “hdp_attribute_map”, “hdp_nominal”, “hdp_user”, “hdp_item”, “hdp_rating”, “hdp_context_template”, “hdp_context”, and “hdp_sample”. Each unit has a particular function, which is described in the data description section. The Hudup dataset is metadata that models any raw data at an abstract level. The raw data used as the source of the Hudup dataset here is the Film Trust data. The Hudup dataset can therefore be considered secondary data, whereas Film Trust is primary data. The raw Film Trust rating data contains 35,497 ratings from 1,508 users on 2,071 films (items) and is available at https://guoguibing.github.io/librec/datasets/filmtrust.zip.
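To put the rating counts of the two raw sources into perspective, one can compute the density of each user-item rating matrix, i.e. the fraction of possible user-item pairs that actually carry a rating. The sketch below uses only the counts quoted in the Movielens 1M and Film Trust descriptions; the density measure itself is an illustrative addition and is not part of the Hudup dataset.

```python
# Counts as quoted in the two dataset descriptions.
datasets = {
    "Movielens 1M": {"ratings": 1_000_209, "users": 6_040, "items": 3_900},
    "Film Trust": {"ratings": 35_497, "users": 1_508, "items": 2_071},
}


def density(ratings: int, users: int, items: int) -> float:
    """Fraction of all user-item pairs that have a rating."""
    return ratings / (users * items)


densities = {name: density(**counts) for name, counts in datasets.items()}
for name, value in densities.items():
    print(f"{name}: {value:.2%} of user-item pairs have a rating")
```

Both matrices are sparse: by these figures, roughly 4% of the possible pairs are rated in Movielens 1M and roughly 1% in Film Trust.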
Coastal ecosystem goods and services (EGS) have steadily gained traction in the scientific literature over the last few decades, providing a wealth of information about underlying coastal habitat dependencies. This meta-analysis summarizes relationships between coastal habitats and final ecosystem goods and services (FEGS) users. Through a “weight of evidence” approach synthesizing information from published literature, we assessed the habitat classes most relevant to coastal users. Approximately 2800 coastal EGS journal articles were identified by online search engines, of which 16% addressed linkages between specific coastal habitats and FEGS users and were retained for subsequent analysis. Recreational (83%) and industrial (35%) users were most cited in the literature, with experiential-users/hikers and commercial fishermen most prominent in each category, respectively. Recreational users were linked to the widest diversity of coastal habitat subclasses (i.e., 22 of 26), whereas mangroves and emergent wetlands were most relevant for property owners. We urge EGS studies to continue surveying local users and identifying habitat dependencies, as these steps are important precursors for developing appropriate coastal FEGS metrics and facilitating local valuation. In addition, understanding how habitats contribute to human well-being may assist communities in prioritizing restoration and evaluating development scenarios in the context of future ecosystem service delivery.
This dataset is associated with the following publication: Littles, C., C. Jackson, T. DeWitt, and M. Harwell. Linking People to Coastal Habitats: A meta-analysis of final ecosystem goods and services (FEGS) on the coast. Ocean & Coastal Management. Elsevier, Shannon, IRELAND, 165: 356-369, (2018).
https://creativecommons.org/publicdomain/zero/1.0/
WhatsApp Messenger, or simply WhatsApp, is an internationally available American freeware application owned by Meta Platforms (previously Facebook). This dataset provides the latest statistics on WhatsApp in our day-to-day lives.
The dataset contains 7 files:
* age_group.csv: WhatsApp usage by age group (US)
* by_country.csv: WhatsApp users by country
* messages_sent_daily.csv: WhatsApp messages sent daily
* ratings.csv: WhatsApp Play Store & App Store ratings
* usage.csv: WhatsApp daily, weekly & monthly usage (US)
* user.csv: WhatsApp user growth over time
* user_growth: Latest WhatsApp user growth percentage
This data has been scraped from Business Insider, Twitter, Facebook, Statista, Sensor Tower, Backlinko, and some other sources.
This dataset can be analyzed to explore:
* the effect of WhatsApp on the present-day world;
* how much time an average person spends on WhatsApp;
* the number of users on the platform;
and many other parameters.
The global number of Facebook users was forecast to increase continuously between 2023 and 2027 by a total of 391 million users (+14.36 percent). After the fourth consecutive year of growth, the Facebook user base is estimated to reach 3.1 billion users, and therefore a new peak, in 2027. Notably, the number of Facebook users has increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by the same person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
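The two headline figures of the forecast, an absolute increase of 391 million users and a relative increase of 14.36 percent, together imply a 2023 baseline and a 2027 level. The following is a minimal sketch of that arithmetic; it assumes the percentage is measured relative to the 2023 value, which the text does not state explicitly, and the variable names are illustrative.

```python
# Figures quoted in the forecast description.
increase_millions = 391.0   # absolute increase, 2023 to 2027
relative_increase = 0.1436  # +14.36 percent (assumed relative to 2023)

# increase = base * relative_increase  =>  base = increase / relative_increase
base_2023 = increase_millions / relative_increase
level_2027 = base_2023 + increase_millions

print(f"implied 2023 user base: {base_2023 / 1000:.2f} billion")
print(f"implied 2027 user base: {level_2027 / 1000:.2f} billion")
```

Under this assumption the implied 2027 level rounds to the 3.1 billion users stated in the forecast.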