100+ datasets found

App User Dataset
kaggle.com
Updated Sep 7, 2022
Cite
Kalle Fischer (2022). App User Dataset [Dataset]. https://www.kaggle.com/datasets/kallefischer/app-user-dataset
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Sep 7, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Kalle Fischer
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
About Dataset

This dataset contains 6 columns and 10k rows about the demographics of the users of an app.

UID - User ID, unique identifier for every app user.
reg_date - Date that each user registered.
device - Operating system of the user.
Gender - Gender of the user.
Country - Country where the user downloaded the app.
Age - Age of the user.
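
A minimal sketch of how the six columns described above might be read in Python. The column names follow the description; the sample rows, the ISO date format, and the device and country values are invented for illustration and are not taken from the actual file.

```python
import csv
import io
from collections import Counter
from datetime import date

# Hypothetical sample in the column layout described above.
# The real file's header spelling and date format may differ.
SAMPLE = """UID,reg_date,device,Gender,Country,Age
1001,2021-03-14,iOS,F,DEU,27
1002,2021-03-15,Android,M,USA,34
1003,2021-04-02,Android,F,BRA,19
"""


def load_users(text):
    """Parse CSV text into a list of dicts with typed reg_date and Age."""
    users = []
    for row in csv.DictReader(io.StringIO(text)):
        row["reg_date"] = date.fromisoformat(row["reg_date"])
        row["Age"] = int(row["Age"])
        users.append(row)
    return users


users = load_users(SAMPLE)
device_counts = Counter(u["device"] for u in users)  # users per operating system
mean_age = sum(u["Age"] for u in users) / len(users)
print(device_counts, round(mean_age, 1))
```

To read the downloaded file instead, the same function could be given the file's text, assuming the header matches the names used here.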
Worldwide Mobile App User Behavior Dataset
dataverse.harvard.edu
doc, xlsx
Updated Sep 28, 2014
Cite
Harvard Dataverse (2014). Worldwide Mobile App User Behavior Dataset [Dataset]. http://doi.org/10.7910/DVN/27459
Explore at:
Available download formats: doc (56320), xlsx (7037534)
Unique identifier
https://doi.org/10.7910/DVN/27459
Dataset updated
Sep 28, 2014
Dataset provided by
Harvard Dataverse
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
2012
Area covered
Worldwide
Description
We surveyed 10,208 people from more than 15 countries on their mobile app usage behavior. The countries include USA, China, Japan, Germany, France, Brazil, UK, Italy, Russia, India, Canada, Spain, Australia, Mexico, and South Korea. We asked respondents about: (1) their mobile app usage behavior, including the app stores they use, what triggers them to look for apps, why they download apps, why they abandon apps, and the types of apps they download; (2) their demographics, including gender, age, marital status, nationality, country of residence, first language, ethnicity, education level, occupation, and household income; and (3) their personality, using the Big-Five personality traits. This dataset contains the results of the survey.
IOS application reviews dataset in English
crawlfeeds.com
csv, zip
Updated Jul 8, 2025
Cite
Crawl Feeds (2025). IOS application reviews dataset in English [Dataset]. https://crawlfeeds.com/datasets/ios-application-reviews-dataset-in-english
Explore at:
Available download formats: zip, csv
Dataset updated
Jul 8, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
This comprehensive iOS application reviews dataset contains thousands of authentic user reviews from the Apple App Store in English. The dataset provides valuable insights for app developers, marketers, and researchers studying mobile application performance and user sentiment.

Key Features:

Real user reviews from popular iOS apps

Star ratings from 1 to 5 stars

Review dates and timestamps

App store URLs and metadata

User demographics and location data

App version information

Review titles and detailed feedback

Applications: Perfect for sentiment analysis, app store optimization, mobile app development research, user experience studies, and competitive analysis. This dataset enables businesses to understand user preferences, identify app improvement opportunities, and develop better mobile applications.

Data Quality: All reviews are genuine user feedback collected from the official Apple App Store, ensuring authenticity and reliability for research and business intelligence purposes. The dataset covers various app categories including fitness, shopping, education, entertainment, and productivity applications.
App + Web Consumer Data | MFour's 1st Party - App + Web Usage Data | 2M...
datarade.ai
.csv
Updated Nov 14, 2023
Cite
mfour (2023). App + Web Consumer Data | MFour's 1st Party - App + Web Usage Data | 2M consumers, 3B+ events verified, US consumers | CCPA Compliant [Dataset]. https://datarade.ai/data-categories/app-data/datasets
Explore at:
Available download formats: .csv
Dataset updated
Nov 14, 2023
Dataset authored and provided by
mfour
Area covered
United States of America
Description
At MFour, our Behavioral Data stands out for its uniqueness and depth of insights. What makes our data genuinely exceptional is the combination of several key factors:

First-Party Opt-In Data: Our data is sourced directly from our opt-in panel of consumers who willingly participate in research and provide observed behaviors. This ensures the highest data quality and eliminates privacy concerns. CCPA compliant.

Unparalleled Data Coverage: With access to 3B+ events, we have an extensive pool of participants who allow us to observe their brick + mortar location visitation, app + web smartphone usage, or both. This large-scale coverage provides robust and reliable insights.

Our data is generally sourced through our Surveys On The Go (SOTG) mobile research app, where consumers are incentivized with cash rewards to participate in surveys and share their observed behaviors. This incentivized approach ensures a willing and engaged panel, leading to the highest-quality data.

The primary use cases and verticals of our Behavioral Data Product are diverse and varied. Some key applications include:

Data Acquisition and Modeling: Our data helps businesses acquire valuable insights into consumer behavior and enables modeling for various research objectives.

Shopper Data Analysis: By understanding purchase behavior and patterns, businesses can optimize their strategies, improve targeting, and enhance customer experiences.

Media Consumption Insights: Our data provides a deep understanding of viewer behavior and patterns across popular platforms like YouTube, Amazon Prime, Netflix, and Disney+, enabling effective media planning and content optimization.

App Performance Optimization: Analyzing app behavior allows businesses to monitor usage patterns, track key performance indicators (KPIs), and optimize app experiences to drive user engagement and retention.

Location-Based Targeting: With our detailed location data, businesses can map out consumer visits to physical venues and combine them with web and app behavior to create predictive ad targeting strategies.

Audience Creation for Ad Placement: Our data enables the creation of highly targeted audiences for ad campaigns, ensuring better reach and engagement with relevant consumer segments.

The Behavioral Data Product complements our comprehensive suite of data solutions in the broader context of our data offering. It provides granular and event-level insights into consumer behaviors, which can be combined with other data sets such as survey responses, demographics, or custom profiling questions to offer a holistic understanding of consumer preferences, motivations, and actions.

MFour's Behavioral Data empowers businesses with unparalleled consumer insights, allowing them to make data-driven decisions, uncover new opportunities, and stay ahead in today's dynamic market landscape.
Dataset used for "A Recommender System of Buggy App Checkers for App Store...
data.niaid.nih.gov
Updated Jun 28, 2021
Cite
Martin Monperrus (2021). Dataset used for "A Recommender System of Buggy App Checkers for App Store Moderators" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5034291
Dataset updated
Jun 28, 2021
Dataset provided by
Lionel Seinturier
Martin Monperrus
Maria Gomez
Romain Rouvoy
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the dataset used for the paper "A Recommender System of Buggy App Checkers for App Store Moderators", published at the International Conference on Mobile Software Engineering and Systems (MOBILESoft) in 2015.

Dataset Collection

We built a dataset that consists of a random sample of Android app metadata and user reviews available on the Google Play Store in January and March 2014. Since the Google Play Store is continuously evolving (adding, removing and/or updating apps), we updated the dataset twice. The dataset D1 contains the apps available in the Google Play Store in January 2014. Then, we created a new snapshot (D2) of the Google Play Store in March 2014.

The apps belong to the 27 different categories defined by Google (at the time of writing the paper), and the 4 predefined subcategories (free, paid, new_free, and new_paid). For each category-subcategory pair (e.g. tools-free, tools-paid, sports-new_free, etc.), we collected a maximum of 500 samples, resulting in a median number of 1,978 apps per category.
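
The sampling cap above implies simple upper bounds on the size of each snapshot. The short calculation below is only a sanity check using the figures from the description; it assumes, for the sake of the bound, that no app is counted under more than one subcategory.

```python
# Figures taken from the description above.
NUM_CATEGORIES = 27
SUBCATEGORIES = ["free", "paid", "new_free", "new_paid"]
MAX_SAMPLES_PER_PAIR = 500

# Upper bound per category: one capped sample per subcategory.
max_apps_per_category = len(SUBCATEGORIES) * MAX_SAMPLES_PER_PAIR

# Upper bound for a whole snapshot across all categories.
max_apps_total = NUM_CATEGORIES * max_apps_per_category

print(max_apps_per_category)  # 2000
print(max_apps_total)         # 54000
```

Both snapshot sizes reported in the description (38,781 and 46,644 apps) fall below the 54,000 bound, and a median close to 2,000 apps per category would mean that most categories nearly reached the cap.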

For each app, we retrieved the following metadata: name, package, creator, version code, version name, number of downloads, size, upload date, star rating, star counting, and the set of permission requests.

In addition, for each app, we collected up to the latest 500 reviews posted by users in the Google Play Store. For each review, we retrieved its metadata: title, description, device, and version of the app. None of these fields was mandatory, so several reviews lack some of these details. From all the reviews attached to an app, we only considered the reviews associated with the latest version of the app, i.e., we discarded unversioned and old-versioned reviews. This resulted in a corpus of 1,402,717 reviews (Jan. 2014).

Dataset Stats

Some stats about the datasets:

D1 (Jan. 2014) contains 38,781 apps requesting 7,826 different permissions, and 1,402,717 user reviews.

D2 (Mar. 2014) contains 46,644 apps and 9,319 different permission requests, and 1,361,319 user reviews.

Additional stats about the datasets are available here.

Dataset Description

To store the dataset, we created a graph database with Neo4j. This dataset therefore consists of a graph describing the apps as nodes and edges. We chose a graph database because the graph visualization helps to identify connections among data (e.g., clusters of apps sharing similar sets of permission requests).

In particular, our dataset graph contains six types of nodes:

- APP nodes containing metadata of each app,
- PERMISSION nodes describing permission types,
- CATEGORY nodes describing app categories,
- SUBCATEGORY nodes describing app subcategories,
- USER_REVIEW nodes storing user reviews,
- TOPIC nodes describing topics mined from user reviews (using LDA).

Furthermore, there are five types of relationships between APP nodes and each of the remaining nodes:

USES_PERMISSION relationships between APP and PERMISSION nodes

HAS_REVIEW between APP and USER_REVIEW nodes

HAS_TOPIC between USER_REVIEW and TOPIC nodes

BELONGS_TO_CATEGORY between APP and CATEGORY nodes

BELONGS_TO_SUBCATEGORY between APP and SUBCATEGORY nodes
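
To illustrate the kind of connection the graph is meant to expose (apps sharing similar sets of permission requests), here is a small in-memory sketch. It does not use Neo4j or the dataset itself: the app and permission names are invented, edges are plain tuples, and Jaccard similarity is an illustrative choice of measure rather than the authors' method.

```python
from collections import defaultdict

# Hypothetical toy edges in (source, relationship, target) form.
# Relationship names follow the description; all node names are invented.
edges = [
    ("app.alpha", "USES_PERMISSION", "INTERNET"),
    ("app.alpha", "USES_PERMISSION", "CAMERA"),
    ("app.beta", "USES_PERMISSION", "INTERNET"),
    ("app.beta", "USES_PERMISSION", "CAMERA"),
    ("app.gamma", "USES_PERMISSION", "READ_CONTACTS"),
    ("app.alpha", "BELONGS_TO_CATEGORY", "tools"),
]


def permissions_by_app(edge_list):
    """Collect the set of permissions requested by each app."""
    perms = defaultdict(set)
    for source, rel, target in edge_list:
        if rel == "USES_PERMISSION":
            perms[source].add(target)
    return perms


def jaccard(a, b):
    """Share of permissions two apps have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


perms = permissions_by_app(edges)
print(jaccard(perms["app.alpha"], perms["app.beta"]))   # 1.0
print(jaccard(perms["app.alpha"], perms["app.gamma"]))  # 0.0
```

In the actual databases the same question would be asked with a graph query over APP and PERMISSION nodes; the property names needed for that are not given in the description, so they are left out here.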

Dataset Files Info

Neo4j 2.0 Databases

googlePlayDB1-Jan2014_neo4j_2_0.rar

googlePlayDB2-Mar2014_neo4j_2_0.rar

We provide two Neo4j databases containing the two snapshots of the Google Play Store (January and March 2014). These are the original databases created for the paper. The databases were created with Neo4j 2.0, specifically with 'Neo4j 2.0.0-M06 Community Edition' (the latest version available at the time of implementing the paper in 2014).

Neo4j 3.5 Databases

googlePlayDB1-Jan2014_neo4j_3_5_28.rar

googlePlayDB2-Mar2014_neo4j_3_5_28.rar

Neo4j 2.0 is now deprecated and is not available for download in the official Neo4j Download Center. We have migrated the original databases (Neo4j 2.0) to Neo4j 3.5.28. The databases can be opened with 'Neo4j Community Edition 3.5.28', which can be downloaded from the official Neo4j Download page.

To open the databases with more recent versions of Neo4j, the databases must first be migrated to the corresponding version. Instructions for the migration process can be found in the Neo4j Migration Guide. The first time the Neo4j database is connected, it may request credentials. The username and password are: neo4j/neo4j
Mobile Application User Statistics
kaggle.com
Updated Dec 31, 2018
Cite
wolfgang (2018). Mobile Application User Statistics [Dataset]. https://www.kaggle.com/wolfgangb33r/usercount/code
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Dec 31, 2018
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
wolfgang
Description
Context

This data set contains some basic statistics about user count and user growth as well as crash count for a real mobile app. The dataset contains a basic timeseries of 1 hour resolution for a period of one week.

Content

The data set contains columns for total concurrent user count, new users acquired in that period of time, number of sessions and crash count.

Acknowledgements

This data set would not be available without the Real User Monitoring capabilities of Dynatrace and its flexibility to export and expose this data for scientific experiments.

Inspiration

The data set was intended to play around with seasonality, trend and prediction of timeseries.
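
As a rough illustration of the seasonality and prediction experiments mentioned above, the sketch below builds an hour-of-day profile and a seasonal naive forecast. The series is synthetic (a daily cycle plus a slow trend) and only mimics the shape of one week of hourly values; none of the numbers come from the dataset, and the function names are hypothetical.

```python
import math

HOURS_PER_DAY = 24
DAYS = 7

# Synthetic stand-in for an hourly user-count column: a daily cycle
# plus a slow upward trend. These values are not from the dataset.
series = [
    100 + 0.5 * t + 30 * math.sin(2 * math.pi * (t % HOURS_PER_DAY) / HOURS_PER_DAY)
    for t in range(HOURS_PER_DAY * DAYS)
]


def hourly_profile(values, period=HOURS_PER_DAY):
    """Average value for each position within the period (hour of day)."""
    buckets = [[] for _ in range(period)]
    for t, v in enumerate(values):
        buckets[t % period].append(v)
    return [sum(b) / len(b) for b in buckets]


def seasonal_naive_forecast(values, horizon, period=HOURS_PER_DAY):
    """Forecast each future hour with the value observed one period earlier."""
    history = list(values)
    for _ in range(horizon):
        history.append(history[-period])
    return history[len(values):]


profile = hourly_profile(series)
forecast = seasonal_naive_forecast(series, horizon=24)
print(len(series), len(profile), len(forecast))
```

A seasonal naive forecast is a common baseline: any model fitted to the real series would be expected to beat it before being considered useful.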
dataset for dating app use and TNSB.sav
figshare.com
bin
Updated Jan 16, 2024
Cite
Yao Yao (2024). dataset for dating app use and TNSB.sav [Dataset]. http://doi.org/10.6084/m9.figshare.25001390.v1
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.25001390.v1
Dataset updated
Jan 16, 2024
Dataset provided by
figshare
Authors
Yao Yao
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This research conducted an online survey to investigate the relationship between dating app use and hookup intention. It measured dating app use, perceived descriptive norms, injunctive norms, fear of negative evaluation, hookup intention, and demographic information including age, gender, sexual orientation, and relationship status.
Factori USA Consumer Graph Data | socio-demographic, location, interest and...
datarade.ai
.json, .csv
Updated Jul 23, 2022
+ more versions
Cite
Factori (2022). Factori USA Consumer Graph Data | socio-demographic, location, interest and intent data | E-Commere |Mobile Apps | Online Services [Dataset]. https://datarade.ai/data-products/factori-usa-consumer-graph-data-socio-demographic-location-factori
Explore at:
Available download formats: .json, .csv
Dataset updated
Jul 23, 2022
Dataset authored and provided by
Factori
Area covered
United States of America
Description
Our consumer data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.

Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.

Geography - City, State, ZIP, County, CBSA, Census Tract, etc.

Demographics - Gender, Age Group, Marital Status, Language etc.

Financial - Income Range, Credit Rating Range, Credit Type, Net worth Range, etc

Persona - Consumer type, Communication preferences, Family type, etc

Interests - Content, Brands, Shopping, Hobbies, Lifestyle etc.

Household - Number of Children, Number of Adults, IP Address, etc.

Behaviours - Brand Affinity, App Usage, Web Browsing etc.

Firmographics - Industry, Company, Occupation, Revenue, etc

Retail Purchase - Store, Category, Brand, SKU, Quantity, Price etc.

Auto - Car Make, Model, Type, Year, etc.

Housing - Home type, Home value, Renter/Owner, Year Built etc.

Consumer Graph Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings:

Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).

Consumer Graph Use Cases:

360-Degree Customer View: Get a comprehensive image of customers by means of internal and external data aggregation.

Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting using user data enrichment.

Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.

Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.

Using Factori Consumer Data graph you can solve use cases like:

Acquisition Marketing

Expand your reach to new users and customers using lookalike modeling with your first-party audiences to extend to other potential consumers with similar traits and attributes.

Lookalike Modeling

Build lookalike audience segments using your first-party audiences as a seed to extend your reach for running marketing campaigns to acquire new users or customers.

And also: CRM Data Enrichment, Consumer Data Enrichment, B2B Data Enrichment, B2C Data Enrichment, Customer Acquisition, Audience Segmentation, 360-Degree Customer View, Consumer Profiling, Consumer Behaviour Data.

Here's the schema of Consumer Data:

person_id, first_name, last_name, age, gender, linkedin_url, twitter_url, facebook_url, city, state, address, zip, zip4, country, delivery_point_bar_code, carrier_route, walk_seuqence_code, fips_state_code, fips_country_code, country_name, latitude, longtiude, address_type, metropolitan_statistical_area, core_based+statistical_area, census_tract, census_block_group, census_block, primary_address, pre_address, streer, post_address, address_suffix, address_secondline, address_abrev, census_median_home_value, home_market_value, property_build+year, property_with_ac, property_with_pool, property_with_water, property_with_sewer, general_home_value, property_fuel_type, year, month, household_id, Census_median_household_income, household_size, marital_status, length+of_residence, number_of_kids, pre_school_kids, single_parents, working_women_in_house_hold, homeowner, children, adults, generations, net_worth, education_level, occupation, education_history, credit_lines, credit_card_user, newly_issued_credit_card_user, credit_range_new, credit_cards, loan_to_value, mortgage_loan2_amount, mortgage_loan_type, mortgage_loan2_type, mortgage_lender_code, mortgage_loan2_render_code, mortgage_lender, mortgage_loan2_lender, mortgage_loan2_ratetype, mortgage_rate, mortgage_loan2_rate, donor, investor, interest, buyer, hobby, personal_email, work_email, devices, phone, employee_title, employee_department, employee_job_function, skills, recent_job_change, company_id, company_name, company_description, technologies_used, office_address, office_city, office_country, office_state, office_zip5, office_zip4, office_carrier_route, office_latitude, office_longitude, office_cbsa_code, office_census_block_group, office_census_tract, office_county_code, company_phone, company_credit_score, company_csa_code, company_dpbc, company_franchiseflag, company_facebookurl, company_linkedinurl, company_twitterurl, company_website, company_fortune_rank, company_government_type, company_headquarters_branch, company_home_business, company_industry, company_num_pcs_used, company_num_employees, company_firm_individual, company_msa, company_msa_name, company_naics_code, company_naics_description, company_naics_code2, company_naics_description2, company_sic_code2, company_sic_code2_desc...
MHS Dashboard Children and Youth Demographic Datasets
catalog.data.gov
data.chhs.ca.gov
+1more
Updated Jul 23, 2025
+ more versions
Cite
California Department of Health Care Services (2025). MHS Dashboard Children and Youth Demographic Datasets [Dataset]. https://catalog.data.gov/dataset/mhs-dashboard-children-and-youth-demographic-datasets-8c678
Explore at:
Dataset updated
Jul 23, 2025
Dataset provided by
California Department of Health Care Services (http://www.dhcs.ca.gov/)
Description
The following datasets are based on the children and youth (under age 21) beneficiary population and consist of aggregate Mental Health Service data derived from Medi-Cal claims, encounter, and eligibility systems. These datasets were developed in accordance with California Welfare and Institutions Code (WIC) § 14707.5 (added as part of Assembly Bill 470 on 10/7/17). Please contact BHData@dhcs.ca.gov for any questions or to request previous years’ versions of these datasets. Note: The Performance Dashboard AB 470 Report Application Excel tool development has been discontinued. Please see the Behavioral Health reporting data hub at https://behavioralhealth-data.dhcs.ca.gov/ for access to dashboards utilizing these datasets and other behavioral health data.
Basic Demographics Age and Gender - Seattle Neighborhoods
catalog.data.gov
data.seattle.gov
Updated Jan 31, 2025
+ more versions
Cite
City of Seattle ArcGIS Online (2025). Basic Demographics Age and Gender - Seattle Neighborhoods [Dataset]. https://catalog.data.gov/dataset/basic-demographics-age-and-gender-seattle-neighborhoods
Dataset updated
Jan 31, 2025
Dataset provided by
City of Seattle ArcGIS Online
Area covered
Seattle
Description
Table from the American Community Survey (ACS) 5-year series on age and gender related topics for City of Seattle Council Districts, Comprehensive Plan Growth Areas and Community Reporting Areas. Table includes B01001 Sex by Age, B01002 Median Age by Sex. Data is pulled from block group tables for the most recent ACS vintage and summarized to the neighborhoods based on block group assignment.

Table created for and used in the Neighborhood Profiles application.

Vintages: 2023
ACS Table(s): B01001, B01002
Data downloaded from: Census Bureau's Explore Census Data

The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates

This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.

Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estima
Spotify User and Artist Analytics Dataset 2025
spotmod.online
Updated Jul 17, 2025
Cite
Spotmod (2025). Spotify User and Artist Analytics Dataset 2025 [Dataset]. https://spotmod.online/spotify-stats/
Dataset updated
Jul 17, 2025
Dataset authored and provided by
Spotmod
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A dataset covering Spotify usage and artist performance in 2025, including metrics like monthly active users, premium subscriber counts, demographic breakdowns, and playlist analytics.
Dataset.
plos.figshare.com
figshare.com
xlsx
Updated Oct 25, 2023
Cite
Jennifer J. Lee; Mavra Ahmed; Rim Mouhaffel; Mary R. L’Abbé (2023). Dataset. [Dataset]. http://doi.org/10.1371/journal.pdig.0000360.s005
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pdig.0000360.s005
Dataset updated
Oct 25, 2023
Dataset provided by
PLOS Digital Health
Authors
Jennifer J. Lee; Mavra Ahmed; Rim Mouhaffel; Mary R. L’Abbé
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
There has been an increased emphasis on plant-based foods and diets. Although mobile technology has the potential to be a convenient and innovative tool to help consumers adhere to dietary guidelines, little is known about the content and quality of free, popular mobile health (mHealth) plant-based diet apps. The objective of the study was to assess the content and quality of free, popular mHealth apps supporting plant-based diets for Canadians. Free mHealth apps with high user ratings, a high number of user ratings, available on both Apple App and GooglePlay stores, and primarily marketed to help users follow plant-based diet were included. Using pre-defined search terms, Apple App and GooglePlay App stores were searched on December 22, 2020; the top 100 returns for each search term were screened for eligibility. Included apps were downloaded and assessed for quality by three dietitians/nutrition research assistants using the Mobile App Rating Scale (MARS) and the App Quality Evaluation (AQEL) scale. Of the 998 apps screened, 16 apps (mean user ratings±SEM: 4.6±0.1) met the eligibility criteria, comprising 10 recipe managers and meal planners, 2 food scanners, 2 community builders, 1 restaurant identifier, and 1 sustainability assessor. All included apps targeted the general population and focused on changing behaviors using education (15 apps), skills training (9 apps), and/or goal setting (4 apps). Although MARS (scale: 1–5) revealed overall adequate app quality scores (3.8±0.1), domain-specific assessments revealed high functionality (4.0±0.1) and aesthetic (4.0±0.2), but low credibility scores (2.4±0.1). The AQEL (scale: 0–10) revealed overall low score in support of knowledge acquisition (4.5±0.4) and adequate scores in other nutrition-focused domains (6.1–7.6). 
Despite a variety of free plant-based apps available with different focuses to help Canadians follow plant-based diets, our findings suggest a need for increased credibility and additional resources to complement the low support of knowledge acquisition among currently available plant-based apps. This research received no specific grant from any funding agency.
MHS Dashboard Adult Demographic Datasets
catalog.data.gov
data.chhs.ca.gov
+3more
Updated Jul 23, 2025
+ more versions
Cite
California Department of Health Care Services (2025). MHS Dashboard Adult Demographic Datasets [Dataset]. https://catalog.data.gov/dataset/mhs-dashboard-adult-demographic-datasets-4de54
Dataset updated
Jul 23, 2025
Dataset provided by
California Department of Health Care Services (http://www.dhcs.ca.gov/)
Description
The following datasets are based on the adult (age 21 and over) beneficiary population and consist of aggregate MHS data derived from Medi-Cal claims, encounter, and eligibility systems. These datasets were developed in accordance with California Welfare and Institutions Code (WIC) § 14707.5 (added as part of Assembly Bill 470 on 10/7/17). Please contact BHData@dhcs.ca.gov for any questions or to request previous years’ versions of these datasets. Note: The Performance Dashboard AB 470 Report Application Excel tool development has been discontinued. Please see the Behavioral Health reporting data hub at https://behavioralhealth-data.dhcs.ca.gov/ for access to dashboards utilizing these datasets and other behavioral health data.

Myket Android Application Install Dataset

zenodo.org

bin, csv

Updated Aug 23, 2023

+ more versions

Cite

Erfan Loghmani; MohammadAmin Fazli (2023). Myket Android Application Install Dataset [Dataset]. http://doi.org/10.48550/arxiv.2308.06862

Explore at:

Available download formats: bin, csv

Unique identifier

https://doi.org/10.48550/arxiv.2308.06862

Dataset updated

Aug 23, 2023

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Erfan Loghmani; MohammadAmin Fazli

License

MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically

Description

This dataset contains information on application install interactions of users in the Myket android application market. The dataset was created for the purpose of evaluating interaction prediction models, requiring user and item identifiers along with timestamps of the interactions. Hence, the dataset can be used for interaction prediction and building a recommendation system. Furthermore, the data forms a dynamic network of interactions, and we can also perform network representation learning on the nodes in the network, which are users and applications.

Data Creation

The dataset was initially generated by the Myket data team, and later cleaned and subsampled by Erfan Loghmani, a master's student at Sharif University of Technology at the time. The data team focused on a two-week period and randomly sampled 1/3 of the users with interactions during that period. They then selected install and update interactions for three months before and after the two-week period, resulting in interactions spanning about 6 months and two weeks.

We further subsampled and cleaned the data to focus on application download interactions. We identified the top 8000 most installed applications and selected interactions related to them. We retained users with more than 32 interactions, resulting in 280,391 users. From this group, we randomly selected 10,000 users, and the data was filtered to include only interactions for these users. The detailed procedure can be found here.
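
The three filtering steps above can be sketched as follows. This is one possible reading of the description, not the authors' code: the function name, the tuple layout, the fixed seed, and the choice to count user interactions after the app filter are all assumptions.

```python
import random
from collections import Counter


def subsample(interactions, top_apps=8000, min_interactions=32,
              num_users=10000, seed=0):
    """Filter a list of (user_id, app_name, timestamp) tuples.

    Hypothetical helper; defaults mirror the figures in the description.
    """
    # 1. Keep interactions with the most installed applications.
    app_counts = Counter(app for _, app, _ in interactions)
    kept_apps = {app for app, _ in app_counts.most_common(top_apps)}
    filtered = [row for row in interactions if row[1] in kept_apps]

    # 2. Retain users with more than `min_interactions` interactions.
    user_counts = Counter(user for user, _, _ in filtered)
    eligible = sorted(u for u, n in user_counts.items() if n > min_interactions)

    # 3. Randomly select users and keep only their interactions.
    rng = random.Random(seed)
    chosen = set(rng.sample(eligible, min(num_users, len(eligible))))
    return [row for row in filtered if row[0] in chosen]


# Toy data with small thresholds so each step has a visible effect.
toy = [(u, f"app{a}", t) for t, (u, a) in enumerate(
    [(1, 0), (1, 1), (1, 0), (2, 0), (2, 2), (3, 1)])]
result = subsample(toy, top_apps=2, min_interactions=1, num_users=1)
print(result)
```

In the toy run, `app2` falls outside the top two apps, users 2 and 3 are left with a single interaction each and are dropped, and only user 1 remains.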

Data Structure

The dataset has two main files.

myket.csv: This file contains the interaction information and follows the same format as the datasets used in the "JODIE: Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks" (ACM SIGKDD 2019) project. However, this data does not contain state labels and interaction features, resulting in associated columns being all zero.
app_info_sample.csv: This file comprises features associated with applications present in the sample. For each individual application, information such as the approximate number of installs, average rating, count of ratings, and category are included. These features provide insights into the applications present in the dataset.

Dataset Details

Total Instances: 694,121 install interaction instances
Instances Format: Triplets of user_id, app_name, timestamp
10,000 users and 7,988 android applications
Item features for 7,606 applications

For a detailed summary of the data's statistics, including information on users, applications, and interactions, please refer to the Python notebook available at summary-stats.ipynb. The notebook provides an overview of the dataset's characteristics and can be helpful for understanding the data's structure before using it for research or analysis.

Top 20 Most Installed Applications

Package Name	Count of Interactions
com.instagram.android	15292
ir.resaneh1.iptv	12143
com.tencent.ig	7919
com.ForgeGames.SpecialForcesGroup2	7797
ir.nomogame.ClutchGame	6193
com.dts.freefireth	6041
com.whatsapp	5876
com.supercell.clashofclans	5817
com.mojang.minecraftpe	5649
com.lenovo.anyshare.gps	5076
ir.medu.shad	4673
com.firsttouchgames.dls3	4641
com.activision.callofduty.shooter	4357
com.tencent.iglite	4126
com.aparat	3598
com.kiloo.subwaysurf	3135
com.supercell.clashroyale	2793
co.palang.QuizOfKings	2589
com.nazdika.app	2436
com.digikala	2413

Comparison with SNAP Datasets

The Myket dataset introduced in this repository exhibits distinct characteristics compared to the real-world datasets used by the JODIE project. The table below compares the key dataset characteristics:

Dataset	#Users	#Items	#Interactions	Average Interactions per User	Average Unique Items per User
Myket	10,000	7,988	694,121	69.4	54.6
LastFM	980	1,000	1,293,103	1,319.5	158.2
Reddit	10,000	984	672,447	67.2	7.9
Wikipedia	8,227	1,000	157,474	19.1	2.2
MOOC	7,047	97	411,749	58.4	25.3

The Myket dataset stands out by having a large number of both users and items, which makes it relevant for real-world, large-scale applications. Unlike the LastFM, Reddit, and Wikipedia datasets, where users interact repeatedly with the same items, the Myket dataset contains comparatively few repeated interactions: its users average 54.6 unique items across 69.4 interactions. This characteristic reflects the diversity of user behavior in the Android application market.
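The two per-user columns in the comparison table can be computed from interaction triplets along these lines. This is a minimal sketch; the function name `per_user_stats` and the tuple layout `(user_id, item_id, timestamp)` are assumptions for the example.

```python
from collections import defaultdict


def per_user_stats(interactions):
    """Return (average interactions per user, average unique items per user)."""
    counts = defaultdict(int)
    items_by_user = defaultdict(set)
    for user, item, _timestamp in interactions:
        counts[user] += 1              # every row counts as an interaction
        items_by_user[user].add(item)  # repeated items are counted once
    n_users = len(counts)
    avg_interactions = sum(counts.values()) / n_users
    avg_unique_items = sum(len(s) for s in items_by_user.values()) / n_users
    return avg_interactions, avg_unique_items


# Sanity check against the Myket row of the table: 694,121 interactions
# spread over 10,000 users.
myket_avg = round(694121 / 10000, 1)
```

A small gap between the two averages indicates few repeated interactions, as in Myket (69.4 versus 54.6); a large gap indicates heavy repetition, as in Reddit (67.2 versus 7.9).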

Citation

If you use this dataset in your research, please cite the following preprint:

@misc{loghmani2023effect,
   title={Effect of Choosing Loss Function when Using T-batching for Representation Learning on Dynamic Networks}, 
   author={Erfan Loghmani and MohammadAmin Fazli},
   year={2023},
   eprint={2308.06862},
   archivePrefix={arXiv},
   primaryClass={cs.LG}
}

Year, Month and Payment Application-wise UPI Apps Transaction Statistics
dataful.in
Updated Jul 22, 2025
Cite
Dataful (Factly) (2025). Year, Month and Payment Application-wise UPI Apps Transaction Statistics [Dataset]. https://dataful.in/datasets/413
Available download formats: application/x-parquet, xlsx, csv
Dataset updated
Jul 22, 2025
Dataset authored and provided by
Dataful (Factly)
License
https://dataful.in/terms-and-conditions
Area covered
India
Variables measured
UPI Transaction Volumes, UPI Transaction Values
Description
The dataset contains year-, month- and payment application-wise UPI app transaction statistics, such as Customer Initiated Transactions, B2C Transactions, B2B Transactions and On-us Transactions. Notes: 1) Unified Payments Interface (UPI) is an instant real-time payment system developed by the National Payments Corporation of India. The interface facilitates inter-bank peer-to-peer and person-to-merchant transactions. 2) From January 2021 onwards, "On-us Transactions" in UPI that are not processed and settled through the UPI Central System are shown under the "On-us Transactions" column. 3) Apps with a volume of less than 10,000 are included under "Other Apps". 4) App volume in the table is based on the Payer App logic, i.e. the financial transaction is attributed to the PSP in UPI on the payer's side. 5) BHIM volume is inclusive of *99# volume. 6) For WhatsApp, maximum registered user base of one hundred (100) million in UPI.
Top 10 social media by active users
kaggle.com
Updated Aug 15, 2024
Cite
Mahmoud Gamil (2024). Top 10 social media by active users [Dataset]. https://www.kaggle.com/datasets/mahmoudredagamail/number-of-monthly-active-users-worldwide
Dataset updated
Aug 15, 2024
Dataset provided by
Kaggle
Authors
Mahmoud Gamil
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Social media has become part of our day-to-day routine, keeping users across the world well connected through digital platforms. With each passing year, social media evolves rapidly and the number of social media users grows quickly. Reports also suggest that the number of social media users will reach 5.85 billion in 2027.

In 2024, 62.6% of the world’s population will access social media, which clearly indicates the dominance of social media platforms in today’s world. In this article, we will examine social media statistics for 2024, uncovering monthly active users, daily time spent by users, most downloaded social media apps, etc.
Data and code for: Generation and applications of simulated datasets to...
zenodo.org
data.niaid.nih.gov
bin, zip
Updated Mar 12, 2023
Cite
Matthew Silk; Olivier Gimenez (2023). Data and code for: Generation and applications of simulated datasets to integrate social network and demographic analyses [Dataset]. http://doi.org/10.5061/dryad.m0cfxpp7s
Available download formats: zip, bin
Unique identifier
https://doi.org/10.5061/dryad.m0cfxpp7s
Dataset updated
Mar 12, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Matthew Silk; Olivier Gimenez
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Social networks are tied to population dynamics; interactions are driven by population density and demographic structure, while social relationships can be key determinants of survival and reproductive success. However, difficulties integrating models used in demography and network analysis have limited research at this interface. We introduce the R package genNetDem for simulating integrated network-demographic datasets. It can be used to create longitudinal social networks and/or capture-recapture datasets with known properties. It incorporates the ability to generate populations and their social networks, generate grouping events using these networks, simulate social network effects on individual survival, and flexibly sample these longitudinal datasets of social associations. By generating co-capture data with known statistical relationships it provides functionality for methodological research. We demonstrate its use with case studies testing how imputation and sampling design influence the success of adding network traits to conventional Cormack-Jolly-Seber (CJS) models. We show that incorporating social network effects in CJS models generates qualitatively accurate results, but with downward-biased parameter estimates when network position influences survival. Biases are greater when fewer interactions are sampled or fewer individuals are observed in each interaction. While our results indicate the potential of incorporating social effects within demographic models, they show that imputing missing network measures alone is insufficient to accurately estimate social effects on survival, pointing to the importance of incorporating network imputation approaches. genNetDem provides a flexible tool to aid these methodological advancements and help researchers test other sampling considerations in social network studies.
Social media users in Saudi Arabia 2020-2029
statista.com
Updated Nov 4, 2024
Cite
Statista Research Department (2024). Social media users in Saudi Arabia 2020-2029 [Dataset]. https://www.statista.com/study/175878/mobile-apps-usage-in-saudi-arabia/
Dataset updated
Nov 4, 2024
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
Saudi Arabia
Description
The number of social media users in Saudi Arabia was forecast to increase continuously between 2024 and 2029 by a total of six million users (+28.05 percent). After the ninth consecutive year of growth, the social media user base is estimated to reach 27.42 million users, a new peak, in 2029. Notably, the number of social media users has been increasing continuously over the past years. The figures shown have been derived from survey data that has been processed to estimate missing demographics. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of social media users in countries like Israel and Kuwait.
Is Demography Destiny? Application of Machine Learning Techniques to...
plos.figshare.com
figshare.com
docx
Updated Jun 3, 2023
Cite
Wei Luo; Thin Nguyen; Melanie Nichols; Truyen Tran; Santu Rana; Sunil Gupta; Dinh Phung; Svetha Venkatesh; Steve Allender (2023). Is Demography Destiny? Application of Machine Learning Techniques to Accurately Predict Population Health Outcomes from a Minimal Demographic Dataset [Dataset]. http://doi.org/10.1371/journal.pone.0125602
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0125602
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS ONE
Authors
Wei Luo; Thin Nguyen; Melanie Nichols; Truyen Tran; Santu Rana; Sunil Gupta; Dinh Phung; Svetha Venkatesh; Steve Allender
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
For years, we have relied on population surveys to keep track of regional public health statistics, including the prevalence of non-communicable diseases. Because of the cost and limitations of such surveys, we often do not have the up-to-date data on health outcomes of a region. In this paper, we examined the feasibility of inferring regional health outcomes from socio-demographic data that are widely available and timely updated through national censuses and community surveys. Using data for 50 American states (excluding Washington DC) from 2007 to 2012, we constructed a machine-learning model to predict the prevalence of six non-communicable disease (NCD) outcomes (four NCDs and two major clinical risk factors), based on population socio-demographic characteristics from the American Community Survey. We found that regional prevalence estimates for non-communicable diseases can be reasonably predicted. The predictions were highly correlated with the observed data, in both the states included in the derivation model (median correlation 0.88) and those excluded from the development for use as a completely separated validation sample (median correlation 0.85), demonstrating that the model had sufficient external validity to make good predictions, based on demographics alone, for areas not included in the model development. This highlights both the utility of this sophisticated approach to model development, and the vital importance of simple socio-demographic characteristics as both indicators and determinants of chronic disease.
TagX Web Browsing clickstream Data - 300K Users North America, EU - GDPR -...
datarade.ai
.json, .csv, .xls
Updated Sep 16, 2024
Cite
TagX (2024). TagX Web Browsing clickstream Data - 300K Users North America, EU - GDPR - CCPA Compliant [Dataset]. https://datarade.ai/data-products/tagx-web-browsing-clickstream-data-300k-users-north-america-tagx
Available download formats: .json, .csv, .xls
Dataset updated
Sep 16, 2024
Dataset authored and provided by
TagX
Area covered
United States
Description
TagX Web Browsing Clickstream Data: Unveiling Digital Behavior Across North America and EU

Unique Insights into Online User Behavior

TagX Web Browsing clickstream Data offers an unparalleled window into the digital lives of 1 million users across North America and the European Union. This comprehensive dataset stands out in the market due to its breadth, depth, and stringent compliance with data protection regulations.

What Makes Our Data Unique?

Extensive Geographic Coverage: Spanning two major markets, our data provides a holistic view of web browsing patterns in developed economies.
Large User Base: With 300K active users, our dataset offers statistically significant insights across various demographics and user segments.
GDPR and CCPA Compliance: We prioritize user privacy and data protection, ensuring that our data collection and processing methods adhere to the strictest regulatory standards.
Real-time Updates: Our clickstream data is continuously refreshed, providing up-to-the-minute insights into evolving online trends and user behaviors.
Granular Data Points: We capture a wide array of metrics, including time spent on websites, click patterns, search queries, and user journey flows.

Data Sourcing: Ethical and Transparent

Our web browsing clickstream data is sourced through a network of partnered websites and applications. Users explicitly opt-in to data collection, ensuring transparency and consent. We employ advanced anonymization techniques to protect individual privacy while maintaining the integrity and value of the aggregated data. Key aspects of our data sourcing process include:

Voluntary user participation through clear opt-in mechanisms
Regular audits of data collection methods to ensure ongoing compliance
Collaboration with privacy experts to implement best practices in data anonymization
Continuous monitoring of regulatory landscapes to adapt our processes as needed

Primary Use Cases and Verticals

TagX Web Browsing clickstream Data serves a multitude of industries and use cases, including but not limited to:

Digital Marketing and Advertising:
Audience segmentation and targeting
Campaign performance optimization
Competitor analysis and benchmarking

E-commerce and Retail:
Customer journey mapping
Product recommendation enhancements
Cart abandonment analysis

Media and Entertainment:
Content consumption trends
Audience engagement metrics
Cross-platform user behavior analysis

Financial Services:
Risk assessment based on online behavior
Fraud detection through anomaly identification
Investment trend analysis

Technology and Software:
User experience optimization
Feature adoption tracking
Competitive intelligence

Market Research and Consulting:
Consumer behavior studies
Industry trend analysis
Digital transformation strategies

Integration with Broader Data Offering

TagX Web Browsing clickstream Data is a cornerstone of our comprehensive digital intelligence suite. It seamlessly integrates with our other data products to provide a 360-degree view of online user behavior:

Social Media Engagement Data: Combine clickstream insights with social media interactions for a holistic understanding of digital footprints.
Mobile App Usage Data: Cross-reference web browsing patterns with mobile app usage to map the complete digital journey.
Purchase Intent Signals: Enrich clickstream data with purchase intent indicators to power predictive analytics and targeted marketing efforts.
Demographic Overlays: Enhance web browsing data with demographic information for more precise audience segmentation and targeting.

By leveraging these complementary datasets, businesses can unlock deeper insights and drive more impactful strategies across their digital initiatives.

Data Quality and Scale

We pride ourselves on delivering high-quality, reliable data at scale:

Rigorous Data Cleaning: Advanced algorithms filter out bot traffic, VPNs, and other non-human interactions.
Regular Quality Checks: Our data science team conducts ongoing audits to ensure data accuracy and consistency.
Scalable Infrastructure: Our robust data processing pipeline can handle billions of daily events, ensuring comprehensive coverage.
Historical Data Availability: Access up to 24 months of historical data for trend analysis and longitudinal studies.
Customizable Data Feeds: Tailor the data delivery to your specific needs, from raw clickstream events to aggregated insights.

Empowering Data-Driven Decision Making

In today's digital-first world, understanding online user behavior is crucial for businesses across all sectors. TagX Web Browsing clickstream Data empowers organizations to make informed decisions, optimize their digital strategies, and stay ahead of the competition. Whether you're a marketer looking to refine your targeting, a product manager seeking to enhance user experience, or a researcher exploring digital trends, our cli...
