Use the OpenWeb Ninja Google Play App Store Data API to access comprehensive data on the Google Play Store, including Android apps and games, reviews, top charts, search, and more. Our extensive dataset provides over 40 app store data points, enabling you to gain deep insights into the market.
The App Store Data dataset includes all key app details:
App Name, Description, Rating, Photos, Downloads, Version Information, App Size, Permissions, Developer and Contact Information, Consumer Review Data.
https://crawlfeeds.com/privacy_policy
This dataset offers a focused and invaluable window into user perceptions and experiences with applications listed on the Apple App Store. It is a vital resource for app developers, product managers, market analysts, and anyone seeking to understand the direct voice of the customer in the dynamic mobile app ecosystem.
Dataset Specifications:
Last crawled: not specified.
Richness of Detail (11 Comprehensive Fields):
Each record in this dataset provides a detailed breakdown of a single App Store review, enabling multi-dimensional analysis:
Review Content:
- review: The full text of the user's written feedback, crucial for Natural Language Processing (NLP) to extract themes, sentiment, and common keywords.
- title: The title given to the review by the user, often summarizing their main point.
- isEdited: A boolean flag indicating whether the review has been edited by the user since its initial submission. This can be important for tracking evolving sentiment or understanding user behavior.
Reviewer & Rating Information:
- username: The public username of the reviewer, allowing for analysis of engagement patterns from specific users (though not personally identifiable).
- rating: The star rating (typically 1-5) given by the user, providing a quantifiable measure of satisfaction.
App & Origin Context:
- app_name: The name of the application being reviewed.
- app_id: A unique identifier for the application within the App Store, enabling direct linking to app details or other datasets.
- country: The country of the App Store storefront where the review was left, allowing for geographic segmentation of feedback.
Metadata & Timestamps:
- _id: A unique identifier for the specific review record in the dataset.
- crawled_at: The timestamp indicating when this particular review record was collected by the data provider (Crawl Feeds).
- date: The original date the review was posted by the user on the App Store.
Expanded Use Cases & Analytical Applications:
This dataset is a goldmine for understanding what users truly think and feel about mobile applications. Here's how it can be leveraged:
Product Development & Improvement:
- Use review text to identify recurring technical issues, crashes, or bugs, allowing developers to prioritize fixes based on user impact.
- Use review text to inform future product roadmap decisions and develop features users actively desire.
- Track rating and sentiment after new app updates to assess the effectiveness of bug fixes or new features.
Market Research & Competitive Intelligence:
Marketing & App Store Optimization (ASO):
- Use the review and title fields to gauge overall user satisfaction, pinpoint specific positive and negative aspects, and track sentiment shifts over time.
- Monitor rating trends and identify critical reviews quickly to facilitate timely responses and proactive customer engagement.
Academic & Data Science Research:
- The review and title fields are excellent for training and testing NLP models for sentiment analysis, topic modeling, named entity recognition, and text summarization.
- Analyze rating distribution, isEdited status, and date to understand user engagement and feedback cycles.
- Compare country-specific reviews to understand regional differences in app perception, feature preferences, or cultural nuances in feedback.
This App Store Reviews dataset provides a direct, unfiltered conduit to understanding user needs and ultimately driving better app performance and greater user satisfaction. Its structured format and granular detail make it an indispensable asset for data-driven decision-making in the mobile app industry.
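As a rough illustration of how the review fields described above could be used, the following sketch groups records by storefront country and averages the star rating. The sample records and their values are invented for illustration; only the field names (country, rating, isEdited, and so on) come from the description, and the delivery format of the real dataset is not stated there.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records; values are invented for illustration only.
reviews = [
    {"_id": "r1", "app_id": "1001", "app_name": "Example App", "country": "US",
     "rating": 5, "title": "Great", "review": "Works well", "isEdited": False},
    {"_id": "r2", "app_id": "1001", "app_name": "Example App", "country": "US",
     "rating": 2, "title": "Crashes", "review": "Crashes on launch", "isEdited": True},
    {"_id": "r3", "app_id": "1001", "app_name": "Example App", "country": "DE",
     "rating": 4, "title": "Good", "review": "Does the job", "isEdited": False},
]


def average_rating_by_country(records):
    """Group records by storefront country and average the star rating."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["country"]].append(record["rating"])
    return {country: mean(ratings) for country, ratings in grouped.items()}


def edited_share(records):
    """Fraction of reviews whose isEdited flag is true."""
    return sum(1 for record in records if record["isEdited"]) / len(records)


print(average_rating_by_country(reviews))
print(edited_share(reviews))
```

The same grouping pattern could be applied to app_id or to a date bucket to follow rating trends over time.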
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
We built a crawler to collect data from the Google Play store, including each application's metadata and APK files. The manifest files were extracted from the APK files and then processed to extract the features. The data set is composed of 870,515 records/apps, and for each app we produced 48 features. The data set was used to build and test two bootstrap aggregating (bagging) ensembles of multiple XGBoost machine learning classifiers. The data were collected between April 2017 and November 2018. We then checked the status of these applications on three different occasions: December 2018, February 2019, and May-June 2019.
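To make the idea of bootstrap aggregating concrete, here is a minimal sketch of the general technique. It is not the authors' pipeline: it swaps the XGBoost classifiers for single-feature threshold stumps and uses a toy data set in place of the 48 features, so every function name and value below is an assumption made for illustration.

```python
import random


def fit_stump(samples):
    """Pick the threshold on one feature that misclassifies the fewest samples.

    The stump predicts class 1 when x > threshold, otherwise class 0.
    """
    values = sorted({x for x, _ in samples})
    candidates = [float("-inf"), float("inf")]
    candidates += [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum(1 for x, y in samples if (x > threshold) != (y == 1))

    return min(candidates, key=errors)


def fit_bagged(samples, n_models=25, seed=0):
    """Train one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(samples, k=len(samples)))
            for _ in range(n_models)]


def predict_bagged(thresholds, x):
    """Combine the base learners by majority vote."""
    votes = sum(1 for threshold in thresholds if x > threshold)
    return 1 if votes * 2 > len(thresholds) else 0


# Toy data: feature values 0..9 belong to class 0, 10..19 to class 1.
data = [(float(i), 0) for i in range(10)] + [(float(i), 1) for i in range(10, 20)]
ensemble = fit_bagged(data)
print(predict_bagged(ensemble, 2.0), predict_bagged(ensemble, 17.0))
```

Each base learner sees a different resample of the training data, and the vote averages out their individual quirks; a real setup would replace the stump with a stronger learner.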
Data-driven models help mobile app designers understand best practices and trends, and can be used to make predictions about design performance and support the creation of adaptive UIs. This paper presents Rico, the largest repository of mobile app designs to date, created to support five classes of data-driven applications: design search, UI layout generation, UI code generation, user interaction modeling, and user perception prediction. To create Rico, we built a system that combines crowdsourcing and automation to scalably mine design and interaction data from Android apps at runtime. The Rico dataset contains design data from more than 9.3k Android apps spanning 27 categories. It exposes visual, textual, structural, and interactive design properties of more than 66k unique UI screens. To demonstrate the kinds of applications that Rico enables, we present results from training an autoencoder for UI layout similarity, which supports query-by-example search over UIs.
Rico was built by mining Android apps at runtime via human-powered and programmatic exploration. Like its predecessor ERICA, Rico’s app mining infrastructure requires no access to — or modification of — an app’s source code. Apps are downloaded from the Google Play Store and served to crowd workers through a web interface. When crowd workers use an app, the system records a user interaction trace that captures the UIs visited and the interactions performed on them. Then, an automated agent replays the trace to warm up a new copy of the app and continues the exploration programmatically, leveraging a content-agnostic similarity heuristic to efficiently discover new UI states. By combining crowdsourcing and automation, Rico can achieve higher coverage over an app’s UI states than either crawling strategy alone. In total, 13 workers recruited on UpWork spent 2,450 hours using apps on the platform over five months, producing 10,811 user interaction traces. After collecting a user trace for an app, we ran the automated crawler on the app for one hour.
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN https://interactionmining.org/rico
The Rico dataset is large enough to support deep learning applications. We trained an autoencoder to learn an embedding for UI layouts, and used it to annotate each UI with a 64-dimensional vector representation encoding visual layout. This vector representation can be used to compute structurally — and often semantically — similar UIs, supporting example-based search over the dataset. To create training inputs for the autoencoder that embed layout information, we constructed a new image for each UI capturing the bounding box regions of all leaf elements in its view hierarchy, differentiating between text and non-text elements. Rico’s view hierarchies obviate the need for noisy image processing or OCR techniques to create these inputs.
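The step of collecting the bounding boxes of all leaf elements, split into text and non-text, can be sketched as a recursive walk over a view hierarchy. The node layout used here (a children list, bounds as [left, top, right, bottom], and a text key marking text elements) is an assumption for illustration and may not match Rico's actual file format.

```python
def leaf_boxes(node):
    """Return (bounds, is_text) for every leaf element in a view hierarchy."""
    children = node.get("children") or []
    if not children:
        # A leaf is treated as a text element when it carries non-empty text.
        return [(tuple(node["bounds"]), bool(node.get("text")))]
    boxes = []
    for child in children:
        boxes.extend(leaf_boxes(child))
    return boxes


# Hypothetical hierarchy with three leaves, two of them text elements.
hierarchy = {
    "bounds": [0, 0, 100, 200],
    "children": [
        {"bounds": [0, 0, 100, 40], "text": "Title"},
        {"bounds": [0, 40, 100, 200], "children": [
            {"bounds": [10, 50, 90, 120]},
            {"bounds": [10, 130, 90, 160], "text": "OK"},
        ]},
    ],
}

boxes = leaf_boxes(hierarchy)
for bounds, is_text in boxes:
    print(bounds, "text" if is_text else "non-text")
```

The resulting list could then be rasterized into a two-color image, one color per element type, to serve as an autoencoder input.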
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Apple App Store Key Statistics; Apps & Games in the Apple App Store; Apps in the Apple App Store; Games in the Apple App Store; Most Popular Apple App Store Categories; Paid vs Free Apps in Apple App...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A large-scale dataset of dynamic profiles, based on function calls, of 35,974 benign and malicious Android apps from 10 historical years (2010 through 2019). Function calls are a commonly used means of modeling program behaviors, which may contribute to various code analysis approaches for assuring software correctness, reliability, and security. In particular, our dataset includes dynamic profiles of each app resulting from the same length of time (10 minutes) of being exercised by randomly generated inputs on both an emulator and a real device, enabling app analyses that reason about app behaviors from an evolutionary perspective while revealing the differences in app behaviors across run-time hardware platforms. Since we have 20 yearly datasets associated with 35,974 unique Android apps across the 10 years, profiling these apps took 12,000 hours. Considering the cost of filtering out apps that were originally sampled but that we were unable to profile (for reasons such as broken APKs, incompatibility issues that prevented execution, or apps that could not be instrumented), we took over two years to produce all these traces. We hope to save future researchers' time in producing such a set of dynamic data to enable their empirical and technical work.
==================
Thanks for your interest in our dataset. Collecting this dataset took tremendous computational and human effort. Thus, please observe the following restrictions in using our dataset:
- Do not redistribute this dataset without our consent.
- Do not make commercial usage of this dataset.
- Get a faculty, or someone in a permanent position, to agree and commit to these conditions.
- When publishing your work that uses our dataset, please cite the following MSR 2021 data paper.
@inproceedings{AndroidCT,
title = {AndroCT: Ten Years of App Call Traces in Android},
author = {Wen Li and Xiaoqin Fu and Haipeng Cai},
booktitle = {The 18th International Conference on Mining Software Repositories (MSR 2021), Data Showcase Track},
year = {2021},
}
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
App Download Key Statistics; App and Game Downloads; iOS App and Game Downloads; Google Play App and Game Downloads; Game Downloads; iOS Game Downloads; Google Play Game Downloads; App Downloads; iOS App...
Do you know how much time you spend on an app? Do you know the total use time of a day or average use time of an app?
This data set records:
- how many times a person unlocks their phone,
- how much time they spend on each app each day,
- how much time they spend on their phone.
It lists the usage time of apps for each day.
Use the test data to find the total minutes for which the given app is used in a day. This gives clear statistics on app usage. The data set also shows the person's sleeping behavior and which app they spend the most time on. With this information, we can improve the person's productivity.
The dataset was collected from the app usage app.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Check In Qld is the Queensland Government's app used to support contact tracing and is available to download for use across a number of businesses to help keep Queenslanders COVID Safe.
For more information on the Check In Qld app please visit https://www.covid19.qld.gov.au/check-in-qld
Note: From 1am AEST Thursday 30 June 2022, checking in at locations in Queensland is no longer required. Data from the Qld Check in App will no longer be collected by the Queensland Government and therefore this dataset will no longer be updated.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for Macappstore Applications Metadata
Mac App Store Applications Metadata sourced by the public API.
Curated by: MacPaw Way Ltd.
Language(s) (NLP): Mostly EN, DE
License: MIT
Dataset Details
This data aims to cover our internal company research needs and start collecting and sharing the macOS app dataset since we have yet to find a suitable existing one. Full application metadata was sourced by the public iTunes search API for the US, Germany, and Ukraine… See the full description on the dataset page: https://huggingface.co/datasets/MacPaw/mac-app-store-apps-metadata.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Playstore Analysis’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/madhav000/playstore-analysis on 30 September 2021.
--- Dataset description provided by original source is as follows ---
The Google Play Store team had launched a new feature in which certain promising apps are boosted in visibility. The boost will manifest in multiple ways, including higher priority in recommendations sections (“Similar apps”, “You might also like”, “New and updated games”). These apps will also get a boost in search results visibility. This feature will help bring more attention to newer apps that have potential.
The problem is to identify the apps that are going to be good for Google to promote. App ratings, which are provided by the customers, are always a great indicator of the goodness of an app. The problem reduces to: predict which apps will have high ratings.
Dataset: Google Play Store data (“googleplaystore.csv”)
Fields in the data:
- App: Application name
- Category: Category to which the app belongs
- Rating: Overall user rating of the app
- Reviews: Number of user reviews for the app
- Size: Size of the app
- Installs: Number of user downloads/installs for the app
- Type: Paid or Free
- Price: Price of the app
- Content Rating: Age group the app is targeted at - Children / Mature 21+ / Adult
- Genres: An app can belong to multiple genres (apart from its main category). For example, a musical family game will belong to Music, Game, Family genres.
- Last Updated: Date when the app was last updated on Play Store
- Current Ver: Current version of the app available on Play Store
- Android Ver: Minimum required Android version
--- Original source retains full ownership of the source dataset ---
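Several of the fields listed above are likely to arrive as text, so a rating-prediction task would first need numeric values. The sketch below shows one possible cleaning step. It assumes Installs values look like "10,000+" and Size values look like "19M" or "512k"; these formats, the 1024 k-per-M conversion, and the 4.0 cut-off for a "high" rating are illustrative assumptions, not details given in the description.

```python
def parse_installs(value):
    """Convert an installs string such as '10,000+' to an integer."""
    return int(value.replace(",", "").replace("+", ""))


def parse_size_mb(value):
    """Convert a size string such as '19M' or '512k' to megabytes.

    Returns None for values that carry no number (assumed possible).
    """
    value = value.strip()
    if value.endswith("M"):
        return float(value[:-1])
    if value.endswith("k"):
        return float(value[:-1]) / 1024
    return None


def is_high_rating(rating, threshold=4.0):
    """Label an app as highly rated; the threshold is an arbitrary choice."""
    return rating >= threshold


print(parse_installs("10,000+"), parse_size_mb("19M"), is_high_rating(4.3))
```

The boolean label from is_high_rating could serve as the prediction target, with the parsed numeric fields as inputs.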
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Nowadays, mobile applications (a.k.a. apps) are used by over two billion users for every type of need, including social and emergency connectivity. Their pervasiveness in today's world has inspired the software testing research community to devise approaches that allow developers to better test their apps and improve the quality of the tests being developed. In spite of this research effort, we still notice a lack of empirical analyses aiming at assessing the actual quality of test cases manually developed by mobile developers: this perspective could provide evidence-based findings on the future research directions in the field as well as on the current status of testing in the wild. As such, we performed a large-scale empirical study targeting 1,780 open-source Android apps and aiming at assessing (1) the extent to which these apps are actually tested, (2) how well designed the available tests are, and (3) how effective they are. The key results of our study show that mobile developers still tend not to properly test their apps, possibly because of time-to-market requirements. Furthermore, we discovered that the test cases of the considered apps have a low (i) design quality, both in terms of test code metrics and test smells, and (ii) effectiveness when considering code coverage as well as assertion density.
TagX Web Browsing Clickstream Data: Unveiling Digital Behavior Across North America and EU
Unique Insights into Online User Behavior
TagX Web Browsing clickstream Data offers an unparalleled window into the digital lives of 1 million users across North America and the European Union. This comprehensive dataset stands out in the market due to its breadth, depth, and stringent compliance with data protection regulations.
What Makes Our Data Unique?
- Extensive Geographic Coverage: Spanning two major markets, our data provides a holistic view of web browsing patterns in developed economies.
- Large User Base: With 300K active users, our dataset offers statistically significant insights across various demographics and user segments.
- GDPR and CCPA Compliance: We prioritize user privacy and data protection, ensuring that our data collection and processing methods adhere to the strictest regulatory standards.
- Real-time Updates: Our clickstream data is continuously refreshed, providing up-to-the-minute insights into evolving online trends and user behaviors.
- Granular Data Points: We capture a wide array of metrics, including time spent on websites, click patterns, search queries, and user journey flows.
Data Sourcing: Ethical and Transparent
Our web browsing clickstream data is sourced through a network of partnered websites and applications. Users explicitly opt-in to data collection, ensuring transparency and consent. We employ advanced anonymization techniques to protect individual privacy while maintaining the integrity and value of the aggregated data. Key aspects of our data sourcing process include:
- Voluntary user participation through clear opt-in mechanisms
- Regular audits of data collection methods to ensure ongoing compliance
- Collaboration with privacy experts to implement best practices in data anonymization
- Continuous monitoring of regulatory landscapes to adapt our processes as needed
Primary Use Cases and Verticals
TagX Web Browsing clickstream Data serves a multitude of industries and use cases, including but not limited to:
Digital Marketing and Advertising:
- Audience segmentation and targeting
- Campaign performance optimization
- Competitor analysis and benchmarking
E-commerce and Retail:
- Customer journey mapping
- Product recommendation enhancements
- Cart abandonment analysis
Media and Entertainment:
- Content consumption trends
- Audience engagement metrics
- Cross-platform user behavior analysis
Financial Services:
- Risk assessment based on online behavior
- Fraud detection through anomaly identification
- Investment trend analysis
Technology and Software:
- User experience optimization
- Feature adoption tracking
- Competitive intelligence
Market Research and Consulting:
- Consumer behavior studies
- Industry trend analysis
- Digital transformation strategies
Integration with Broader Data Offering
TagX Web Browsing clickstream Data is a cornerstone of our comprehensive digital intelligence suite. It seamlessly integrates with our other data products to provide a 360-degree view of online user behavior:
- Social Media Engagement Data: Combine clickstream insights with social media interactions for a holistic understanding of digital footprints.
- Mobile App Usage Data: Cross-reference web browsing patterns with mobile app usage to map the complete digital journey.
- Purchase Intent Signals: Enrich clickstream data with purchase intent indicators to power predictive analytics and targeted marketing efforts.
- Demographic Overlays: Enhance web browsing data with demographic information for more precise audience segmentation and targeting.
By leveraging these complementary datasets, businesses can unlock deeper insights and drive more impactful strategies across their digital initiatives.
Data Quality and Scale
We pride ourselves on delivering high-quality, reliable data at scale:
- Rigorous Data Cleaning: Advanced algorithms filter out bot traffic, VPNs, and other non-human interactions.
- Regular Quality Checks: Our data science team conducts ongoing audits to ensure data accuracy and consistency.
- Scalable Infrastructure: Our robust data processing pipeline can handle billions of daily events, ensuring comprehensive coverage.
- Historical Data Availability: Access up to 24 months of historical data for trend analysis and longitudinal studies.
- Customizable Data Feeds: Tailor the data delivery to your specific needs, from raw clickstream events to aggregated insights.
Empowering Data-Driven Decision Making
In today's digital-first world, understanding online user behavior is crucial for businesses across all sectors. TagX Web Browsing clickstream Data empowers organizations to make informed decisions, optimize their digital strategies, and stay ahead of the competition. Whether you're a marketer looking to refine your targeting, a product manager seeking to enhance user experience, or a researcher exploring digital trends, our cli...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data were collected by a smartphone app (Brighter Time) to capture measures of cognitive performance and light exposure during everyday life. The app incorporated a psychomotor vigilance task (PVT), an N2-back, and a visual search task, with questionnaire-based assessments of sleep timing. The app also measured illuminance during task completion using the smartphone’s intrinsic light meter. Data were collected in a pilot feasibility study of Brighter Time based upon 91 week-long running Brighter Time on their own smartphones. Data: ambient light (log lx), KSS score (Karolinska Sleepiness Scale), median reaction times (ms), number of lapses in PVT (>500 ms), hit rate (%), false alarm rate (%), 90th percentile of reaction times (ms), 10th percentile of reaction times (ms), inverse efficiency score, d-prime for N-back task, search efficiency slope for visual search task.
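A few of the listed measures can be computed from raw trial data with the standard library. The sketch below uses the common textbook definitions: lapses as responses slower than 500 ms, d-prime as z(hit rate) minus z(false alarm rate), and inverse efficiency as mean reaction time divided by the proportion of correct responses. Whether the Brighter Time study computed them in exactly this way is an assumption, and the sample values are invented.

```python
from statistics import NormalDist, median


def pvt_summary(reaction_times_ms, lapse_threshold_ms=500):
    """Return the median reaction time and the number of lapses."""
    lapses = sum(1 for rt in reaction_times_ms if rt > lapse_threshold_ms)
    return median(reaction_times_ms), lapses


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index; rates are proportions strictly between 0 and 1."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


def inverse_efficiency(mean_rt_ms, proportion_correct):
    """Mean reaction time penalized by accuracy (lower is better)."""
    return mean_rt_ms / proportion_correct


# Invented reaction times in milliseconds.
print(pvt_summary([250, 300, 520, 280, 610]))
print(d_prime(0.9, 0.2))
print(inverse_efficiency(400, 0.8))
```

Because the dataset reports hit and false alarm rates as percentages, they would need to be divided by 100 before being passed to d_prime, and rates of exactly 0 or 1 would need an adjustment since the inverse normal CDF is undefined there.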
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The peer-reviewed paper of AWARE dataset is published in ASEW 2021, and can be accessed through: http://doi.org/10.1109/ASEW52652.2021.00049. Kindly cite this paper when using AWARE dataset.
Aspect-Based Sentiment Analysis (ABSA) aims to identify the opinion (sentiment) with respect to a specific aspect. Since there is a lack of smartphone app review datasets annotated to support the ABSA task, we present AWARE: ABSA Warehouse of Apps REviews.
AWARE contains app reviews from three different domains (Productivity, Social Networking, and Games), as each domain has its distinct functionalities and audience. Each sentence is annotated with three labels, as follows:
Aspect Term: a term that exists in the sentence and describes an aspect of the app that is expressed by the sentiment. A term value of “N/A” means that the term is not explicitly mentioned in the sentence.
Aspect Category: one of the pre-defined set of domain-specific categories that represent an aspect of the app (e.g., security, usability, etc.).
Sentiment: positive or negative.
Note: games domain does not contain aspect terms.
We provide a comprehensive dataset of 11323 sentences from the three domains, where each sentence is additionally annotated with a Boolean value indicating whether the sentence expresses a positive/negative opinion. In addition, we provide three separate datasets, one for each domain, containing only sentences that express opinions. The file named “AWARE_metadata.csv” contains a description of the dataset’s columns.
How can AWARE be used?
We designed AWARE such that it can be used to serve various tasks. The tasks can be, but are not limited to:
Sentiment Analysis.
Aspect Term Extraction.
Aspect Category Classification.
Aspect Sentiment Analysis.
Explicit/Implicit Aspect Term Classification.
Opinion/Not-Opinion Classification.
Furthermore, researchers can experiment with and investigate the effects of different domains on users' feedback.
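Two of the tasks above can be illustrated on a handful of annotated sentences. The records and key names below are hypothetical and may differ from the columns documented in AWARE_metadata.csv; the only rule taken from the description is that an aspect term of "N/A" marks an aspect that is not explicitly mentioned.

```python
from collections import Counter

# Hypothetical annotated sentences; keys and values are illustrative only.
sentences = [
    {"sentence": "The sync feature is fast.", "term": "sync feature",
     "category": "usability", "sentiment": "positive"},
    {"sentence": "I do not trust it with my data.", "term": "N/A",
     "category": "security", "sentiment": "negative"},
    {"sentence": "Login is confusing.", "term": "Login",
     "category": "usability", "sentiment": "negative"},
]


def is_explicit(record):
    """A term of 'N/A' means the aspect is implicit in the sentence."""
    return record["term"] != "N/A"


def sentiment_by_category(records):
    """Count sentences per (aspect category, sentiment) pair."""
    counts = Counter()
    for record in records:
        counts[(record["category"], record["sentiment"])] += 1
    return counts


print([is_explicit(s) for s in sentences])
print(sentiment_by_category(sentences))
```

Labels produced this way could serve as targets for the explicit/implicit aspect term classification task or as a quick per-category sentiment summary.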
This paper introduces GLARE, an Arabic Apps Reviews dataset collected from the Saudi Google Play Store. It consists of 76M reviews, 69M of which are Arabic reviews of 9,980 Android applications. We present the data collection methodology, along with a detailed Exploratory Data Analysis (EDA) and Feature Engineering on the gathered reviews. We also highlight possible use cases and benefits of the dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset of smartphone-based finger tapping test submitted to Scientific Data journal.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is one of the data sets used by IMGDroid, which includes 1000 commercial Android apps. Please note that the apps in this data set can only be used for research. The copyright is retained by the app vendors.