Data-driven models help mobile app designers understand best practices and trends, and can be used to make predictions about design performance and support the creation of adaptive UIs. This paper presents Rico, the largest repository of mobile app designs to date, created to support five classes of data-driven applications: design search, UI layout generation, UI code generation, user interaction modeling, and user perception prediction. To create Rico, we built a system that combines crowdsourcing and automation to scalably mine design and interaction data from Android apps at runtime. The Rico dataset contains design data from more than 9.3k Android apps spanning 27 categories. It exposes visual, textual, structural, and interactive design properties of more than 66k unique UI screens. To demonstrate the kinds of applications that Rico enables, we present results from training an autoencoder for UI layout similarity, which supports query-by-example search over UIs.
Rico was built by mining Android apps at runtime via human-powered and programmatic exploration. Like its predecessor ERICA, Rico’s app mining infrastructure requires no access to — or modification of — an app’s source code. Apps are downloaded from the Google Play Store and served to crowd workers through a web interface. When crowd workers use an app, the system records a user interaction trace that captures the UIs visited and the interactions performed on them. Then, an automated agent replays the trace to warm up a new copy of the app and continues the exploration programmatically, leveraging a content-agnostic similarity heuristic to efficiently discover new UI states. By combining crowdsourcing and automation, Rico can achieve higher coverage over an app’s UI states than either crawling strategy alone. In total, 13 workers recruited on UpWork spent 2,450 hours using apps on the platform over five months, producing 10,811 user interaction traces. After collecting a user trace for an app, we ran the automated crawler on the app for one hour.
UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN https://interactionmining.org/rico
The Rico dataset is large enough to support deep learning applications. We trained an autoencoder to learn an embedding for UI layouts, and used it to annotate each UI with a 64-dimensional vector representation encoding visual layout. This vector representation can be used to compute structurally — and often semantically — similar UIs, supporting example-based search over the dataset. To create training inputs for the autoencoder that embed layout information, we constructed a new image for each UI capturing the bounding box regions of all leaf elements in its view hierarchy, differentiating between text and non-text elements. Rico’s view hierarchies obviate the need for noisy image processing or OCR techniques to create these inputs.
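The two steps described above, building a layout image from leaf bounding boxes and searching by embedding similarity, can be sketched roughly as follows. This is an illustrative sketch and not the authors' implementation: the node fields `bounds`, `children`, and `text`, the grid size, the cell codes, and the use of Euclidean distance are all assumptions made for the example.

```python
import math
from typing import Dict, Iterator, List

# Hypothetical cell codes for the layout image (not taken from the paper).
BACKGROUND, TEXT, NON_TEXT = 0, 1, 2


def iter_leaves(node: Dict) -> Iterator[Dict]:
    """Yield every node in the view hierarchy that has no children."""
    children = node.get("children") or []
    if not children:
        yield node
        return
    for child in children:
        yield from iter_leaves(child)


def rasterize_layout(root: Dict, screen_w: int, screen_h: int,
                     grid_w: int = 18, grid_h: int = 32) -> List[List[int]]:
    """Paint leaf bounding boxes onto a coarse grid, text vs. non-text.

    Assumes each node carries "bounds" as [x1, y1, x2, y2] in screen pixels
    and that a non-empty "text" field marks a text element.
    """
    grid = [[BACKGROUND] * grid_w for _ in range(grid_h)]
    for leaf in iter_leaves(root):
        x1, y1, x2, y2 = leaf["bounds"]
        code = TEXT if leaf.get("text") else NON_TEXT
        # Scale pixel coordinates to grid cells (floor for start, ceiling for end).
        c1 = max(0, x1 * grid_w // screen_w)
        c2 = min(grid_w, -(-x2 * grid_w // screen_w))
        r1 = max(0, y1 * grid_h // screen_h)
        r2 = min(grid_h, -(-y2 * grid_h // screen_h))
        for r in range(r1, r2):
            for c in range(c1, c2):
                grid[r][c] = code
    return grid


def nearest(query: List[float], embeddings: Dict[str, List[float]],
            k: int = 3) -> List[str]:
    """Return the ids of the k embeddings closest to the query (Euclidean)."""
    ranked = sorted(embeddings, key=lambda uid: math.dist(query, embeddings[uid]))
    return ranked[:k]
```

In a setup like this, the grids would serve as autoencoder inputs, and `nearest` would be called on the resulting 64-dimensional vectors to retrieve similar UIs for a query screen.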
https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides a synthetic representation of user behavior on a fictional dating app. It contains 50,000 records with 19 features capturing demographic details, app usage patterns, swipe tendencies, and match outcomes. The data was generated programmatically to simulate realistic user interactions, making it ideal for exploratory data analysis (EDA), machine learning modeling (e.g., predicting match outcomes), or studying user behavior trends in online dating platforms.
Key features include gender, sexual orientation, location type, income bracket, education level, user interests, app usage time, swipe ratios, likes received, mutual matches, and match outcomes (e.g., "Mutual Match," "Ghosted," "Catfished"). The dataset is designed to be diverse and balanced, with categorical, numerical, and labeled variables for various analytical purposes.
This dataset can be used for:
- Exploratory Data Analysis (EDA): Investigate correlations between demographics, app usage, and match success.
- Machine Learning: Build models to predict match outcomes or user engagement levels.
- Social Studies: Analyze trends in dating app behavior across different demographics.
- Feature Engineering Practice: Experiment with transforming categorical and numerical data.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
We surveyed 10,208 people from more than 15 countries on their mobile app usage behavior. The countries include USA, China, Japan, Germany, France, Brazil, UK, Italy, Russia, India, Canada, Spain, Australia, Mexico, and South Korea. We asked respondents about:
(1) their mobile app usage, including the app stores they use, what triggers them to look for apps, why they download apps, why they abandon apps, and the types of apps they download;
(2) their demographics, including gender, age, marital status, nationality, country of residence, first language, ethnicity, education level, occupation, and household income;
(3) their personality, measured using the Big-Five personality traits.
This dataset contains the results of the survey.
https://crawlfeeds.com/privacy_policy
This comprehensive iOS application reviews dataset contains thousands of authentic user reviews from the Apple App Store in English. The dataset provides valuable insights for app developers, marketers, and researchers studying mobile application performance and user sentiment.
Key Features:
Applications: Perfect for sentiment analysis, app store optimization, mobile app development research, user experience studies, and competitive analysis. This dataset enables businesses to understand user preferences, identify app improvement opportunities, and develop better mobile applications.
Data Quality: All reviews are genuine user feedback collected from the official Apple App Store, ensuring authenticity and reliability for research and business intelligence purposes. The dataset covers various app categories including fitness, shopping, education, entertainment, and productivity applications.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
A dataset of 751,500 English app reviews of 12 online shopping apps, scraped from the internet using a Python script. The ShoppingAppReviews dataset covers the 12 most popular online shopping Android apps: Alibaba, Aliexpress, Amazon, Daraz, eBay, Flipkart, Lazada, Meesho, Myntra, Shein, Snapdeal, and Walmart. Each review entry includes metadata such as review score, thumbsupcount, review posting time, and reply content. The dataset is distributed as a zip file containing 12 JSON files and 12 CSV files for the 12 online shopping apps. It can be used to study customer feedback on the user experience of these financially important apps.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This public web application offers an engaging and interactive platform that showcases the rich tapestry of art and cultural assets throughout the City of Perth. Designed to connect residents, visitors, artists, and cultural enthusiasts, the app presents detailed spatial information and descriptions of various cultural elements, including public art installations, heritage plaques, and art gallery locations. It celebrates the city’s vibrant creative landscape and promotes cultural tourism and community engagement.
Resource of datasets used for the development of the National Water Model Mobile App.
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset contains fictional reviews from a hypothetical mobile application, generated for demo purposes in various projects. The reviews include detailed feedback from users across different countries and platforms, with additional attributes such as star ratings, like/dislike counts, and issue flags. The data was later used as an input for a large language model (LLM) to generate labeled outputs, which are included in a separate dataset named labeled_app_store_reviews. This labeled dataset can be used for machine learning tasks such as sentiment analysis, text classification, or even A/B testing simulations.
https://crawlfeeds.com/privacy_policy
iOS App Reviews Dataset
Unlock the potential of user feedback with our extensive iOS App Reviews Dataset. This dataset contains detailed reviews from a wide range of iOS applications, providing invaluable insights for developers, researchers, and marketers.
Key Features:
Last crawled at: 29 March 2021
Individual column percentage

| Column | Percentage |
| --- | --- |
| rating | 100% |
| review_date | 100% |
| app_name | 100% |
| tags | 37.62% |
| country | 100% |
| title | 100% |
| app_id | 100% |
| content | 99.99% |
| version | 86.33% |
| link | 100% |
| _id | 100% |
Countries covered: 102
tr, my, sa, mx, au, us, lb, fr, cz, om, gb, ar, br, se, pe, cl, ph, co, es, cr, no, it, de, pl, be, za, ru, tw, cn, ng, kr, ca, ua, jp, sv, vn, nl, in, do, ro, hu, ch, at, sg, th, id, ae, pa, dk, mo, gr, ec, hk, gt, pt, pk, nz, kw, bo, kz, lu, gh, ie, ve, eg, ke, il, qa, bg, hr, cy, fi, lt, dz, by, kh, lv, iq, lk, uz, uy, az, py, sk, mz, rs, mt, bh, ao, bb, ni, mg, ly, si, tn, ma, ee, mm, ge, ye, bm, af
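If the column percentages listed above are read as the share of records with a non-empty value in each column (an interpretation, since the listing does not define the figure), they could be reproduced with a small helper along these lines. The function name and the treatment of `None` and empty strings as missing are assumptions for illustration.

```python
from typing import Dict, List, Optional


def column_fill_percentages(rows: List[Dict[str, Optional[str]]],
                            columns: List[str]) -> Dict[str, float]:
    """Return, per column, the percentage of rows holding a non-empty value."""
    total = len(rows)
    result: Dict[str, float] = {}
    for col in columns:
        # A value counts as filled when the key exists and is neither None nor "".
        filled = sum(1 for row in rows if row.get(col) not in (None, ""))
        result[col] = round(100.0 * filled / total, 2) if total else 0.0
    return result
```

Running a check like this before analysis shows which fields, such as `tags` or `version`, are too sparse to rely on without handling missing values.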
Datasets used to generate figures and sample runs in the SENTINEL application in the journal article "SENTINEL: A Shiny App for Processing and Analysis of Fenceline Sensor Data". This dataset is associated with the following publication: MacDonald, M., W. Champion, and E. Thoma. SENTINEL: A Shiny App for Processing and Analysis of Fenceline Sensor Data. ENVIRONMENTAL MODELLING & SOFTWARE. Elsevier Science, New York, NY, 189: 0, (2025).
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
A global subset of the dataset from 2012 onwards to be used in the Land Use demo app.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Mobile App is a dataset for object detection tasks - it contains Fruit annotations for 300 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
The dataset in dataset1_AF2021Q3_N_S.csv contains SARS-CoV-2 wastewater data; dataset2_miR210.csv contains liquid biopsy data; and dataset3_NCTR_E2198FeCt_mmu_let_7d.csv contains a separate set of liquid biopsy data.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
APPS Control Arena Dataset
Unified dataset combining APPS problems with backdoors from both the AI Control paper and Control-Tax paper.
Dataset Description
This dataset is based on the codeparrot/apps dataset, enhanced with backdoor solutions from two sources:
- APPS Backdoors: From "AI Control: Improving Safety Despite Intentional Subversion" / TylordTheGreat/apps-backdoors-04-02-25
- Control-Tax Backdoors: From "Control Tax: The Price of Keeping AI in Check"…

See the full description on the dataset page: https://huggingface.co/datasets/RoganInglis/apps-control-arena.
This dataset was constructed for use in the paper Neural Interactive Proofs. It is based on the APPS benchmark for code generation (see also the corresponding Hugging Face dataset). It includes a number of coding problems with both buggy and non-buggy solutions (though note that, apparently, in AlphaCode the authors found that this dataset can generate many false positives during evaluation, where incorrect submissions are marked as correct due to lack of test coverage). Each datum… See the full description on the dataset page: https://huggingface.co/datasets/lrhammond/buggy-apps.
Dataset Card for Dataset Name
Dataset Summary
MobileRec is a large-scale app recommendation dataset. There are 19.3 million user-item interactions. This is a 5-core dataset. User-item interactions are sorted in ascending chronological order. There are 0.7 million users who have had at least five distinct interactions. There are 10,173 apps in total.
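For readers unfamiliar with the term, a k-core filter can be sketched as below. This follows the common definition in recommendation work, where every remaining user and item has at least k interactions; the dataset card only states the user-side condition explicitly, so the item-side check, the tuple layout, and the function name are assumptions for illustration rather than the dataset's actual preprocessing.

```python
from collections import Counter
from typing import List, Tuple

# Hypothetical record layout: (user_id, item_id, unix_timestamp).
Interaction = Tuple[str, str, int]


def k_core(interactions: List[Interaction], k: int = 5) -> List[Interaction]:
    """Iteratively drop users and items with fewer than k interactions.

    Removing a sparse user can push an item below k (and vice versa), so the
    filter repeats until nothing changes. The result is returned in ascending
    chronological order.
    """
    current = list(interactions)
    while True:
        user_counts = Counter(u for u, _, _ in current)
        item_counts = Counter(i for _, i, _ in current)
        kept = [
            (u, i, t) for u, i, t in current
            if user_counts[u] >= k and item_counts[i] >= k
        ]
        if len(kept) == len(current):
            return sorted(kept, key=lambda rec: rec[2])
        current = kept
```

Chronological ordering matters for sequential recommendation because each user's history is typically split by time, with the most recent interactions held out for evaluation.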
Supported Tasks and Leaderboards
Sequential Recommendation
Languages
English
How to use the… See the full description on the dataset page: https://huggingface.co/datasets/recmeapp/mobilerec.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Hand App is a dataset for classification tasks - it contains Handtracking annotations for 488 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This app is designed to assist residents in easily determining their scheduled bin collection days for both general waste and recycling. It provides a simple, user-friendly interface that ensures people stay informed about their local waste collection schedules, helping reduce missed pickups and improve overall community participation in recycling efforts.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
App Test 2 is a dataset for object detection tasks - it contains App annotations for 524 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The City of Perth Scheme Maps provide a comprehensive spatial representation of key planning and zoning frameworks that guide land use and property development within the City of Perth local government area. These map layers incorporate detailed information from multiple statutory schemes and redevelopment plans, enabling planners, developers, residents, and government officials to understand regulatory controls and future urban growth directions.