https://crawlfeeds.com/privacy_policy
Looking for a Google Play apps dataset to analyze mobile app trends? The Google Play Store Apps Dataset delivers ~10,000 app records from the Google Play Store, including key app metadata like app name, category, rating, installs, price, developer details, and more. This dataset is ideal for app market research, mobile analytics, app store optimization studies (ASO), data science projects, and trend analysis.
Collect structured data on apps across genres and niches, so you can build visualizations, train machine-learning models, analyze user engagement, or compare categories like games, productivity, health & fitness, and finance.
Rich App Metadata: Includes app_id, app_name, category, rating, review_count, price, installs, content_rating, genres, last_updated, current_version, android_version, developer_name, developer_email, ...

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
While many public datasets (on Kaggle and the like) provide Apple App Store data, few counterpart datasets are available for Google Play Store apps anywhere on the web. On digging deeper, I discovered that the iTunes App Store page uses a nicely indexed, appendix-like structure that allows for simple and easy web scraping. The Google Play Store, on the other hand, uses more sophisticated techniques (such as dynamic page loading) built on jQuery, which makes scraping more challenging.
- There are 13 features in the dataset. Each describes one detail of a Google Play application: name, category, rating, reviews, size, installs, type, price, content rating, genres, last updated, current version, and Android version.
- App: The application name.
- Category: The category the app belongs to.
- Rating: Overall user rating of the app.
- Reviews: Number of user reviews for the app.
- Size: The size of the app.
- Installs: Number of user installs for the app.
- Type: Either "Paid" or "Free".
- Price: The price of the app.
- Content Rating: The age group the app is targeted at - "Children" / "Mature 21+" / "Adult".
- Genres: Possibly multiple genres the app belongs to.
- Last Updated: The date the app was last updated.
- Current Ver: The current version of the app.
- Android Ver: The Android version required by the app.
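The 13 columns listed above can be checked programmatically before any analysis. The following is a minimal sketch, assuming each record is loaded as a dictionary keyed by the column names exactly as written in the list; the helper names `missing_columns` and `is_valid_type` are hypothetical, and the source does not say how individual values are encoded.

```python
# Column names as given in the feature list; how each value is
# encoded (string, number, date) is not stated in the source.
COLUMNS = [
    "App", "Category", "Rating", "Reviews", "Size", "Installs", "Type",
    "Price", "Content Rating", "Genres", "Last Updated", "Current Ver",
    "Android Ver",
]


def missing_columns(row: dict) -> list:
    """Return the listed columns that are absent from a record."""
    return [name for name in COLUMNS if name not in row]


def is_valid_type(row: dict) -> bool:
    """Check the Type column against its two documented values."""
    return row.get("Type") in {"Paid", "Free"}


# Hypothetical record with empty placeholder values.
record = dict.fromkeys(COLUMNS, "")
record["Type"] = "Free"
print(missing_columns(record), is_valid_type(record))
```

A check like this is mainly useful for catching renamed or dropped columns when a different copy of the dataset is used.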

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The iOS App Store launched in 2008 with 500 apps. Today, there are over four million apps available across iOS and Android platforms, extending to a wide range of sub-genres and niches. These apps...

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of the permissions that apps require at installation time and at run time. We collect apps from three different sources: Google Play, third-party app stores, and a malware dataset. The collection contains more than 500,000 Android apps, with features extracted at the time of installation and execution. One file contains the names of the features; the others contain the permissions and API calls extracted from each corresponding .apk file. Benign apps are collected from Google Play, HiAPK, AppChina, Android, Mumayi, Gfan, SlideME, and PandaApp. These .apk files were collected continuously over the last three years and contain 81 distinct malware families.

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Outside of China, Apple and Google control more than 95 percent of the app store market share through iOS and Android, respectively. Both mobile operating systems originally came with a few...

A large-scale dataset of dynamic profiles, based on function calls, of 35,974 benign and malicious Android apps from 10 historical years (2010 through 2019). Function calls are a commonly used means of modeling program behavior, and can support various code analysis approaches to assuring software correctness, reliability, and security. In particular, our dataset includes dynamic profiles of each app produced by exercising it for the same length of time (10 minutes) with randomly generated inputs on both an emulator and a real device. This enables app analyses that reason about app behaviors from an evolutionary perspective, while also exposing differences in app behavior across run-time hardware platforms. Since we have 20 yearly datasets associated with 35,974 unique Android apps across the 10 years, profiling these apps took 12,000 hours. Counting the cost of filtering out apps that were originally sampled but that we were unable to profile (for reasons such as broken APKs, incompatibility issues that prevented execution, or failure to instrument), we took over two years to produce all these traces. We hope to save future researchers the time of producing such a set of dynamic data for their empirical and technical work.
Thanks for your interest in our dataset. Collecting this dataset took tremendous computational and human effort. Thus, please observe the following restrictions in using our dataset:
- Do not redistribute this dataset without our consent.
- Do not make commercial use of this dataset.
- Get a faculty member, or someone in a permanent position, to agree and commit to these conditions.
- When publishing work that uses our dataset, please cite the following MSR 2021 data paper.
@inproceedings{AndroidCT,
  title = {AndroCT: Ten Years of App Call Traces in Android},
  author = {Wen Li and Xiaoqin Fu and Haipeng Cai},
  booktitle = {The 18th International Conference on Mining Software Repositories (MSR 2021), Data Showcase Track},
  year = {2021},
}
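The 12,000 profiling hours quoted in this entry can be sanity-checked with simple arithmetic. The sketch below assumes that every one of the 35,974 apps was exercised for 10 minutes on each of the two platforms (emulator and real device) and ignores setup time and failed runs; it is a rough consistency check under that assumption, not the authors' own accounting.

```python
APPS = 35_974          # unique apps, from the description
MINUTES_PER_RUN = 10   # profiling time per app on one platform
PLATFORMS = 2          # emulator and real device


def profiling_hours(apps: int, minutes_per_run: int, platforms: int) -> float:
    """Total profiling time in hours, ignoring setup and failed runs."""
    return apps * minutes_per_run * platforms / 60


total_hours = profiling_hours(APPS, MINUTES_PER_RUN, PLATFORMS)
print(round(total_hours))  # 11991, close to the 12,000 hours stated
```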

The dataset is proposed by DroidRL. Cite the DroidRL paper if you use the dataset. It contains 5000 benign samples from AndroZoo and 5560 malware samples from Drebin to train and test the model.
Static analysis is applied to extract permissions, intent actions, and opcodes as the original features from Android samples decompiled with APKtool and Androguard, for subsequent reinforcement-learning-based feature selection. For each kind of feature, an N-dimensional boolean vector is constructed, where "1" means the feature is required and "0" means it is not. A label is appended at the end to indicate whether the sample is a malicious Android application.
In total, 457 permissions and 126 intent actions that are typically considered highly relevant to the malicious behavior of Android applications are chosen in this paper to construct the original feature vectors. Permissions indicate what sensitive user data (e.g., contacts and SMS) an application needs to access, which is essential in Android malware detection. Intent actions are abstract objects containing information on the operation to be performed for an app component.
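The boolean encoding described in this entry can be sketched in a few lines. This is an illustration only, not the DroidRL code: the function name `to_feature_vector`, the three-permission vocabulary, and the convention that a label of 1 means malicious are all assumptions made for the example.

```python
from typing import Iterable, List, Sequence


def to_feature_vector(vocabulary: Sequence[str],
                      present: Iterable[str],
                      is_malicious: bool) -> List[int]:
    """Encode a sample as 0/1 flags over `vocabulary`, with the label last."""
    present_set = set(present)
    flags = [1 if name in present_set else 0 for name in vocabulary]
    # Assumed label convention: 1 = malicious, 0 = benign.
    return flags + [1 if is_malicious else 0]


# Hypothetical three-permission vocabulary; the dataset itself uses
# 457 permissions and 126 intent actions.
VOCAB = [
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
    "android.permission.SEND_SMS",
]

vector = to_feature_vector(
    VOCAB,
    ["android.permission.SEND_SMS", "android.permission.INTERNET"],
    is_malicious=True,
)
print(vector)  # [1, 0, 1, 1]
```

Keeping the vocabulary order fixed across all samples is what makes the vectors comparable column by column.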
After disassembling classes.dex to generate the smali files, Dalvik bytecode instructions (for example, invoke-direct) are obtained by scanning the method fields in the smali files with regular expressions. A dimensionality reduction approach is employed to address the high dimensionality of the feature vectors, which arises because the number of N-grams increases with the value of N.
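The opcode scan and N-gram counting described in this entry could look roughly like the following. It is a simplified sketch under stated assumptions, not the paper's implementation: it assumes each instruction inside a `.method` ... `.end method` block starts with its opcode mnemonic, and it does not handle annotation blocks or other smali constructs that a real parser would need to skip. The names `extract_opcodes` and `opcode_ngrams` are hypothetical.

```python
import re
from collections import Counter
from typing import List

# Assumption: an opcode mnemonic is lowercase and may contain digits,
# hyphens and slashes (e.g. "invoke-direct", "const/4").
OPCODE_RE = re.compile(r"[a-z][a-z0-9/\-]*")


def extract_opcodes(smali_text: str) -> List[str]:
    """Collect opcode mnemonics from the method bodies of one smali file."""
    opcodes = []
    in_method = False
    for line in smali_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(".method"):
            in_method = True
        elif stripped.startswith(".end method"):
            in_method = False
        elif in_method:
            # Directives (".locals"), labels (":cond_0") and comments ("#")
            # do not start with a lowercase letter, so they are skipped.
            match = OPCODE_RE.match(stripped)
            if match:
                opcodes.append(match.group(0))
    return opcodes


def opcode_ngrams(opcodes: List[str], n: int) -> Counter:
    """Count contiguous opcode sequences of length n."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


SAMPLE = """\
.method public constructor <init>()V
    .locals 0
    invoke-direct {p0}, Ljava/lang/Object;-><init>()V
    return-void
.end method
"""

ops = extract_opcodes(SAMPLE)
print(ops)                    # ['invoke-direct', 'return-void']
print(opcode_ngrams(ops, 2))  # one bigram: ('invoke-direct', 'return-void')
```

With a vocabulary of k distinct opcodes there are up to k**n possible N-grams, which is why the feature dimensionality grows so quickly with N.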

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Android is the most popular operating system in the world, with over three billion active users spanning over 190 countries. Created by Andy Rubin as the open-source alternative to iPhone and Palm...

Global app downloads have plateaued in recent years, especially when comparing the previous figures provided by data.ai and Sensor Tower. However, global downloads appear to have recovered in 2025, reaching nearly *** billion unique downloads.
Why the difference? Source methodology explains the gap
The discrepancy arises from considerable differences in the methodology the sources use to aggregate and generate the data. Sensor Tower reports only unique downloads per user account, excluding app updates, re-downloads, and installations on multiple devices by the same user. In contrast, data.ai includes these additional activities as well as downloads from third-party Android stores and a broader geographic scope, resulting in substantially higher total counts. As a result, Sensor Tower's numbers better reflect new user acquisition, while data.ai's encompass all market activity and total engagement.
Despite stagnating downloads, user spending is growing
While the number of downloads is leveling off, consumer spending on in-app purchases and related revenue has grown in 2025 to *** billion U.S. dollars, up from around *** billion U.S. dollars in 2023. While gaming remains the highest-grossing app category overall, other categories drove the growth. The entertainment, photo & video, productivity, and social networking categories each grew by at least *** billion U.S. dollars in revenue in 2025 compared to the previous year.
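The gap between the two counting rules described in this entry can be illustrated on a toy event log. This is a deliberately simplified sketch, not either vendor's actual methodology: the `DownloadEvent` fields, the event kinds, and the sample data are all hypothetical, and differences in store coverage and geography are left out.

```python
from typing import List, NamedTuple


class DownloadEvent(NamedTuple):
    account: str  # user account identifier (hypothetical field)
    app: str
    device: str
    kind: str     # "install", "update" or "reinstall" (hypothetical labels)


def count_all_activity(events: List[DownloadEvent]) -> int:
    """Count every event, including updates, re-downloads and extra devices."""
    return len(events)


def count_unique_downloads(events: List[DownloadEvent]) -> int:
    """Count each (account, app) pair once, whatever the device or event kind."""
    return len({(e.account, e.app) for e in events})


events = [
    DownloadEvent("alice", "chat", "phone", "install"),
    DownloadEvent("alice", "chat", "phone", "update"),
    DownloadEvent("alice", "chat", "tablet", "install"),
    DownloadEvent("bob", "chat", "phone", "install"),
]

print(count_all_activity(events))      # 4
print(count_unique_downloads(events))  # 2
```

Even on four events the all-activity count is double the unique count, which is the direction of the discrepancy the entry describes.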

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Google Play is the largest app store by number of apps and downloads, accounting for about half of all app downloads in the world. Launched in 2008 as the Android Market, it followed in the footsteps...

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset simulates anonymized mobile screen time and app usage data collected from Android/iOS users over a 3-month period (Jan–April 2024). It captures daily usage trends across various app categories including:
Productivity: Google Docs, Notion, Slack
Entertainment: YouTube, Netflix, TikTok
Social Media: Instagram, WhatsApp, Facebook
Utilities: Chrome, Gmail, Maps
For YouTube, additional engagement statistics such as views, likes, and comments are included to analyze video popularity and content consumption behavior.
The dataset enables exploration of:
Productivity vs. entertainment screen time patterns
Daily usage fluctuations
App-specific user engagement
Correlation between time spent and user interactions
YouTube content virality metrics
This is a great resource for:
EDA projects
Behavioral clustering
Dashboard development
Time series and anomaly detection
Building recommendation or focus-assistive apps

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains features extracted from hybrid apps that were, until recently, available for deployment on the Android platform. The data has been gathered from various sources, including existing similar datasets and the Google Play Store or its mirrors. The dataset is labelled to differentiate malicious from benign hybrid apps, so it can conveniently be used for supervised learning. It also has enough attributes to support unsupervised learning tasks. The dataset comprises 78,767 samples.

Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dive into Android Statistics with eye‑opening insights on usage, device trends, and AI impact for smarter understanding and smarter strategy.

A large-scale dataset of static and dynamic profiles, based on function calls, of 30,634 benign and malicious Android apps from eight historical years (2010 through 2017). Function calls are a commonly used means of modeling program behavior, and can support various code analysis approaches to assuring software correctness, reliability, and security. In particular, our dataset includes static and dynamic profiles of each app based on the same set of metrics that define the profile, enabling hybrid app analysis that reasons about app behaviors from the dynamic profiles with the corresponding static profiles as context. The static profiles are computed by a state-of-the-art static app analysis for Android, while the dynamic profiles are the result of running each sample app against automatically generated test inputs for ten minutes.

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The App Data Report offers a thorough analysis of the two key mobile operating systems, Android and iOS, providing detailed data on consumer spending, app downloads and app store statistics. The...

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The Apple App Store and Google Play Store are made up of millions of apps, but the vast majority get less than 1,000 downloads. For the select few that reach the top, millions of people download them...

http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
The ever-changing mobile landscape is a challenging space to navigate. The percentage of mobile over desktop is only increasing. Android holds about 53.2% of the smartphone market, while iOS holds 43%. To get more people to download your app, you need to make sure they can easily find it. Mobile app analytics is a great way to understand the existing strategy and to drive growth and retention of future users.
With millions of apps available today, this data set is key to identifying the top trending apps in the iOS App Store. It contains details of more than 7,000 Apple iOS mobile applications. The data was extracted from the iTunes Search API at the Apple Inc. website. R and Linux web scraping tools were used for this study.
The full interactive Shiny app is available at https://multiscal.shinyapps.io/appStore/
Data collection date (from API): July 2017
Dimensions of the data set: 7197 rows and 16 columns
"id": App ID
"track_name": App name
"size_bytes": Size (in bytes)
"currency": Currency type
"price": Price amount
"rating_count_tot": User rating count (for all versions)
"rating_count_ver": User rating count (for the current version)
"user_rating": Average user rating (for all versions)
"user_rating_ver": Average user rating (for the current version)
"ver": Latest version code
"cont_rating": Content rating
"prime_genre": Primary genre
"sup_devices.num": Number of supported devices
"ipadSc_urls.num": Number of screenshots shown for display
"lang.num": Number of supported languages
"vpp_lic": VPP device-based licensing enabled
Reference: R package, installable from GitHub with:
devtools::install_github("ramamet/applestoreR")
Copyright (c) 2018 Ramanathan Perumal

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A dataset containing 2375 samples of Android Process Memory String Dumps. The dataset is broadly composed of 2 classes: "Benign App" Memory Dumps and "Malicious App" Memory Dumps, respectively, split into 2 ZIP archives. The ZIP archives in total are approximately 17GB in size, however the unzipped contents are approximately 67GB.
This dataset is derived from a subset of the APK files originally made freely available for research through the AndroZoo project [1]. The AndroZoo project collected millions of Android applications and scanned them with the VirusTotal online malware scanning service, thereby classifying most of the apps as either malicious or benign at the time of scanning. The process memory dumps in this dataset were generated by running the subset of APK files from the AndroZoo dataset in an Android emulator, capturing the memory of each individual process, and then extracting only the strings from the process memory dump. Two applications were built for this purpose: Coriander, which runs apps on Android emulators, and AndroMemDumpBeta, which captures process memory. The source code for both applications is available on GitHub.
The individual samples are labelled with the SHA256 hash filename from the original AndroZoo labeling and the application package names extracted from within the specific APK manifest file. They also contain a time-stamp for when the memory dumping process took place for the specific file. The file extension used is ".dmp" to indicate that the files are memory dumps, however they only contain strings, and thus can be viewed in any simple text editor.
A subset of the first 10000 APK files from the original AndroZoo dataset is also included within this dataset. The metadata of these APK files is in the file "AndroZoo-First-10000", and the 2375 Android apps that are the main subjects of our dataset are drawn from this subset.
Our dataset is intended to be used in furthering our research related to Machine Learning-based Triage for Android Memory Forensics. It has been made openly available in order to foster opportunities for collaboration with other researchers, to enable validation of research results as well as to enhance the body of knowledge in related areas of research.
References:
[1] K. Allix, T. F. Bissyandé, J. Klein, and Y. Le Traon. AndroZoo: Collecting Millions of Android Apps for the Research Community. Mining Software Repositories (MSR), 2016.

As of August 2022, the language learning app HelloTalk and Google Classroom, Google's meeting point for schools, were the educational apps collecting the largest number of data points. ClassDojo and the popular language learning app Duolingo followed, collecting approximately ** different data points from global Android users.

Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
While many public datasets (on Kaggle and the like) provide Apple App Store data, there are not many counterpart datasets available for Google Play Store apps anywhere on the web. On digging deeper, I found that the iTunes App Store page uses a nicely indexed, appendix-like structure that allows for simple and easy web scraping. The Google Play Store, on the other hand, uses more sophisticated techniques (such as dynamic page loading) built on jQuery, which makes scraping more challenging.
Each app (row) has values for category, rating, size, and more.
This information is scraped from the Google Play Store. This app information would not be available without it.
The Play Store apps data has enormous potential to drive app-making businesses to success. Actionable insights can be drawn for developers to work on and capture the Android market!