50 datasets found

RICO dataset
kaggle.com
Updated Dec 2, 2021
Cite
Onur Gunes (2021). RICO dataset [Dataset]. https://www.kaggle.com/datasets/onurgunes1993/rico-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 2, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Onur Gunes
Description
Context

Data-driven models help mobile app designers understand best practices and trends, and can be used to make predictions about design performance and support the creation of adaptive UIs. This paper presents Rico, the largest repository of mobile app designs to date, created to support five classes of data-driven applications: design search, UI layout generation, UI code generation, user interaction modeling, and user perception prediction. To create Rico, we built a system that combines crowdsourcing and automation to scalably mine design and interaction data from Android apps at runtime. The Rico dataset contains design data from more than 9.3k Android apps spanning 27 categories. It exposes visual, textual, structural, and interactive design properties of more than 66k unique UI screens. To demonstrate the kinds of applications that Rico enables, we present results from training an autoencoder for UI layout similarity, which supports query-by-example search over UIs.

Content

Rico was built by mining Android apps at runtime via human-powered and programmatic exploration. Like its predecessor ERICA, Rico’s app mining infrastructure requires no access to — or modification of — an app’s source code. Apps are downloaded from the Google Play Store and served to crowd workers through a web interface. When crowd workers use an app, the system records a user interaction trace that captures the UIs visited and the interactions performed on them. Then, an automated agent replays the trace to warm up a new copy of the app and continues the exploration programmatically, leveraging a content-agnostic similarity heuristic to efficiently discover new UI states. By combining crowdsourcing and automation, Rico can achieve higher coverage over an app’s UI states than either crawling strategy alone. In total, 13 workers recruited on UpWork spent 2,450 hours using apps on the platform over five months, producing 10,811 user interaction traces. After collecting a user trace for an app, we ran the automated crawler on the app for one hour.

Acknowledgements

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN https://interactionmining.org/rico

Inspiration

The Rico dataset is large enough to support deep learning applications. We trained an autoencoder to learn an embedding for UI layouts, and used it to annotate each UI with a 64-dimensional vector representation encoding visual layout. This vector representation can be used to compute structurally — and often semantically — similar UIs, supporting example-based search over the dataset. To create training inputs for the autoencoder that embed layout information, we constructed a new image for each UI capturing the bounding box regions of all leaf elements in its view hierarchy, differentiating between text and non-text elements. Rico’s view hierarchies obviate the need for noisy image processing or OCR techniques to create these inputs.
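The example-based search described above can be sketched as a nearest-neighbour lookup over the 64-dimensional layout vectors. This is a minimal sketch under stated assumptions: the source does not say which distance metric is used, so Euclidean distance is assumed here, and the function name `nearest_uis`, the UI identifiers, and the toy vectors are all hypothetical.

```python
import math

def nearest_uis(query, embeddings, k=3):
    """Return the ids of the k UIs whose layout vectors are closest to `query`.

    Assumption: closeness is Euclidean distance; the source does not name a metric.
    """
    ranked = sorted(embeddings, key=lambda ui_id: math.dist(query, embeddings[ui_id]))
    return ranked[:k]

# Hypothetical 64-dimensional layout vectors for three UIs.
embeddings = {
    "ui_a": [0.0] * 64,
    "ui_b": [0.1] * 64,
    "ui_c": [1.0] * 64,
}

# Query with a vector that lies closest to ui_a, then ui_b.
closest = nearest_uis([0.02] * 64, embeddings, k=2)
print(closest)
```

A linear scan like this is enough for illustration; at the scale of the full dataset an approximate nearest-neighbour index would likely be preferable.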
Google Play Store Apps / Games Data, Android Apps Data, Consumer Review...
datarade.ai
.json, .csv
Cite
OpenWeb Ninja, Google Play Store Apps / Games Data, Android Apps Data, Consumer Review Data, Top Charts | Real-Time API [Dataset]. https://datarade.ai/data-products/openweb-ninja-google-play-store-data-android-apps-games-openweb-ninja
Explore at:
Available download formats: .json, .csv
Dataset authored and provided by
OpenWeb Ninja
Area covered
Christmas Island, Finland, Macedonia (the former Yugoslav Republic of), Nicaragua, Mali, Bermuda, Azerbaijan, Guam, Korea (Republic of), Netherlands
Description
Use the OpenWeb Ninja Google Play App Store Data API to access comprehensive data on Google Play Store, including Android Apps / Games, reviews, top charts, search, and more. Our extensive dataset provides over 40 app store data points, enabling you to gain deep insights into the market.

The App Store Data dataset includes all key app details:

App Name, Description, Rating, Photos, Downloads, Version Information, App Size, Permissions, Developer and Contact Information, Consumer Review Data.
The manifest and store data of 870,515 Android mobile applications - Dataset...
b2find.eudat.eu
Updated Oct 23, 2023
+ more versions
Cite
(2023). The manifest and store data of 870,515 Android mobile applications - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/b25ee20e-5268-50ae-9914-4bc70bd4ff1c
Explore at:
Dataset updated
Oct 23, 2023
Description
We built a crawler to collect data from the Google Play store, including each application's metadata and APK files. The manifest files were extracted from the APK files and then processed to extract the features. The data set is composed of 870,515 records/apps, and for each app we produced 48 features. The data set was used to build and test two bootstrap-aggregating ensembles of multiple XGBoost machine learning classifiers. The data was collected between April 2017 and November 2018. We then checked the status of these applications on three different occasions: December 2018, February 2019, and May-June 2019.
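Bootstrap aggregating (bagging), as mentioned above, trains several models on resampled copies of the data and combines their votes. The following is a minimal sketch of that idea only: a toy one-feature threshold learner stands in for XGBoost, and the function names and the six-row data set are hypothetical, not taken from the described study.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Toy base learner (stand-in for XGBoost): threshold feature 0 midway between class means."""
    pos = [x[0] for x, y in rows if y == 1]
    neg = [x[0] for x, y in rows if y == 0]
    pos_mean, neg_mean = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (pos_mean + neg_mean) / 2
    pos_above = pos_mean >= neg_mean
    return lambda x: int((x[0] >= threshold) == pos_above)

def fit_bagged(rows, n_models=5, seed=0):
    """Train n_models learners, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(rows) for _ in rows]
        # Simplification for this toy learner: redraw samples that lost one class.
        while len({y for _, y in sample}) < 2:
            sample = [rng.choice(rows) for _ in rows]
        models.append(fit_stump(sample))
    return models

def predict(models, x):
    """Aggregate by majority vote over the individual models."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

# Hypothetical data: (features, label) pairs with one feature per row.
rows = [((0.1,), 0), ((0.2,), 0), ((0.3,), 0), ((0.8,), 1), ((0.9,), 1), ((1.0,), 1)]
models = fit_bagged(rows)
print(predict(models, (0.05,)), predict(models, (0.95,)))
```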
Data of cognitive task performance and illuminance collected by a smartphone...
figshare.manchester.ac.uk
txt
Updated Jul 1, 2022
Cite
Altug Didikoglu; Marina Gardesevic; Samuel JD Lawrence; Céline Vetter; Timothy M Brown; Annette E Allen; Robert J Lucas (2022). Data of cognitive task performance and illuminance collected by a smartphone app [Dataset]. http://doi.org/10.48420/20212265.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.48420/20212265.v1
Dataset updated
Jul 1, 2022
Dataset provided by
University of Manchester
Authors
Altug Didikoglu; Marina Gardesevic; Samuel JD Lawrence; Céline Vetter; Timothy M Brown; Annette E Allen; Robert J Lucas
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data was collected by a smartphone app (Brighter Time) to capture measures of cognitive performance and light exposure during everyday life. The app incorporated a psychomotor vigilance (PVT), an N2-back, and a visual search task with questionnaire-based assessments of sleep timing. The app also measured illuminance during task completion using the smartphone’s intrinsic light meter. Data was collected in a pilot feasibility study of Brighter Time based upon 91 week-long running Brighter Time on their own smartphones. Data: ambient light (log lx), KSS score (Karolinska Sleepiness Scale), median reaction times (ms), number of lapses in PVT (>500ms), hit rate (%), false alarm rate (%), 90th percentile of reaction times (ms), 10th percentile of reaction times (ms), inverse efficiency score, d-prime for N-back task, search efficiency slope for visual search task.
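Several of the listed measures can be derived from raw trial data with the Python standard library. The sketch below is an illustration under assumptions, not the authors' pipeline: the function names and sample reaction times are hypothetical, d-prime uses the standard signal-detection definition z(hit rate) - z(false alarm rate), and rates are passed as proportions rather than percentages.

```python
from statistics import NormalDist, median, quantiles

def pvt_summary(reaction_times_ms, lapse_threshold_ms=500):
    """Summarise PVT reaction times: median, lapses (>500 ms), 10th and 90th percentiles."""
    deciles = quantiles(reaction_times_ms, n=10)  # nine cut points between deciles
    return {
        "median_rt_ms": median(reaction_times_ms),
        "lapses": sum(rt > lapse_threshold_ms for rt in reaction_times_ms),
        "p10_rt_ms": deciles[0],
        "p90_rt_ms": deciles[-1],
    }

def d_prime(hit_rate, false_alarm_rate):
    """Standard signal-detection d-prime; rates must lie strictly between 0 and 1."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical reaction times in milliseconds for one session.
summary = pvt_summary([250, 300, 350, 520, 610])
print(summary["median_rt_ms"], summary["lapses"])
print(round(d_prime(0.9, 0.1), 2))
```

Hit or false alarm rates of exactly 0 or 1 make `inv_cdf` raise an error, so real analyses usually apply a correction before computing d-prime.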
TagX Web Browsing clickstream Data - 300K Users North America, EU - GDPR -...
datarade.ai
.json, .csv, .xls
Updated Sep 16, 2024
Cite
TagX (2024). TagX Web Browsing clickstream Data - 300K Users North America, EU - GDPR - CCPA Compliant [Dataset]. https://datarade.ai/data-products/tagx-web-browsing-clickstream-data-300k-users-north-america-tagx
Explore at:
Available download formats: .json, .csv, .xls
Dataset updated
Sep 16, 2024
Dataset authored and provided by
TagX
Area covered
United States
Description
TagX Web Browsing Clickstream Data: Unveiling Digital Behavior Across North America and EU

Unique Insights into Online User Behavior

TagX Web Browsing clickstream Data offers an unparalleled window into the digital lives of 1 million users across North America and the European Union. This comprehensive dataset stands out in the market due to its breadth, depth, and stringent compliance with data protection regulations.

What Makes Our Data Unique?

Extensive Geographic Coverage: Spanning two major markets, our data provides a holistic view of web browsing patterns in developed economies.
Large User Base: With 300K active users, our dataset offers statistically significant insights across various demographics and user segments.
GDPR and CCPA Compliance: We prioritize user privacy and data protection, ensuring that our data collection and processing methods adhere to the strictest regulatory standards.
Real-time Updates: Our clickstream data is continuously refreshed, providing up-to-the-minute insights into evolving online trends and user behaviors.
Granular Data Points: We capture a wide array of metrics, including time spent on websites, click patterns, search queries, and user journey flows.

Data Sourcing: Ethical and Transparent

Our web browsing clickstream data is sourced through a network of partnered websites and applications. Users explicitly opt-in to data collection, ensuring transparency and consent. We employ advanced anonymization techniques to protect individual privacy while maintaining the integrity and value of the aggregated data. Key aspects of our data sourcing process include:

Voluntary user participation through clear opt-in mechanisms
Regular audits of data collection methods to ensure ongoing compliance
Collaboration with privacy experts to implement best practices in data anonymization
Continuous monitoring of regulatory landscapes to adapt our processes as needed

Primary Use Cases and Verticals

TagX Web Browsing clickstream Data serves a multitude of industries and use cases, including but not limited to:

Digital Marketing and Advertising:

Audience segmentation and targeting
Campaign performance optimization
Competitor analysis and benchmarking

E-commerce and Retail:

Customer journey mapping
Product recommendation enhancements
Cart abandonment analysis

Media and Entertainment:

Content consumption trends
Audience engagement metrics
Cross-platform user behavior analysis

Financial Services:

Risk assessment based on online behavior
Fraud detection through anomaly identification
Investment trend analysis

Technology and Software:

User experience optimization
Feature adoption tracking
Competitive intelligence

Market Research and Consulting:

Consumer behavior studies
Industry trend analysis
Digital transformation strategies

Integration with Broader Data Offering

TagX Web Browsing clickstream Data is a cornerstone of our comprehensive digital intelligence suite. It seamlessly integrates with our other data products to provide a 360-degree view of online user behavior:

Social Media Engagement Data: Combine clickstream insights with social media interactions for a holistic understanding of digital footprints.
Mobile App Usage Data: Cross-reference web browsing patterns with mobile app usage to map the complete digital journey.
Purchase Intent Signals: Enrich clickstream data with purchase intent indicators to power predictive analytics and targeted marketing efforts.
Demographic Overlays: Enhance web browsing data with demographic information for more precise audience segmentation and targeting.

By leveraging these complementary datasets, businesses can unlock deeper insights and drive more impactful strategies across their digital initiatives.

Data Quality and Scale

We pride ourselves on delivering high-quality, reliable data at scale:

Rigorous Data Cleaning: Advanced algorithms filter out bot traffic, VPNs, and other non-human interactions.
Regular Quality Checks: Our data science team conducts ongoing audits to ensure data accuracy and consistency.
Scalable Infrastructure: Our robust data processing pipeline can handle billions of daily events, ensuring comprehensive coverage.
Historical Data Availability: Access up to 24 months of historical data for trend analysis and longitudinal studies.
Customizable Data Feeds: Tailor the data delivery to your specific needs, from raw clickstream events to aggregated insights.

Empowering Data-Driven Decision Making

In today's digital-first world, understanding online user behavior is crucial for businesses across all sectors. TagX Web Browsing clickstream Data empowers organizations to make informed decisions, optimize their digital strategies, and stay ahead of the competition. Whether you're a marketer looking to refine your targeting, a product manager seeking to enhance user experience, or a researcher exploring digital trends, our cli...
Data from: Testing of Mobile Applications in the Wild: A Large-Scale...
figshare.com
txt
Updated Mar 25, 2020
Cite
Fabiano Pecorelli (2020). Testing of Mobile Applications in the Wild: A Large-Scale Empirical Study on Android Apps [Dataset]. http://doi.org/10.6084/m9.figshare.9980672.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.9980672.v1
Dataset updated
Mar 25, 2020
Dataset provided by
figshare
Figshare (http://figshare.com/)
Authors
Fabiano Pecorelli
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Nowadays, mobile applications (a.k.a. apps) are used by over two billion users for every type of need, including social and emergency connectivity. Their pervasiveness in today's world has inspired the software testing research community to devise approaches that allow developers to better test their apps and improve the quality of the tests being developed. In spite of this research effort, we still notice a lack of empirical analyses aiming at assessing the actual quality of test cases manually developed by mobile developers: this perspective could provide evidence-based findings on the future research directions in the field as well as on the current status of testing in the wild. As such, we performed a large-scale empirical study targeting 1,780 open-source Android apps and aiming at assessing (1) the extent to which these apps are actually tested, (2) how well designed the available tests are, and (3) how effective they are. The key results of our study show that mobile developers still tend not to properly test their apps, possibly because of time-to-market requirements. Furthermore, we discovered that the test cases of the considered apps have a low (i) design quality, both in terms of test code metrics and test smells, and (ii) effectiveness when considering code coverage as well as assertion density.
Data from: EBSCO Academic Search Complete
catalog.data.gov
datasets.ai
+1 more
Updated Oct 14, 2022
Cite
EBSCO (2022). EBSCO Academic Search Complete [Dataset]. https://catalog.data.gov/dataset/ebsco-academic-search-complete
Explore at:
Dataset updated
Oct 14, 2022
Dataset provided by
EBSCO Information Services (http://www.ebsco.com/)
Description
Academic Search Complete is the world's most valuable and comprehensive scholarly, multi-disciplinary full-text database, with more than 8,500 full-text periodicals, including more than 7,300 peer-reviewed journals.
Data from: Dataset of smartphone-based finger tapping test
figshare.com
csv
Updated Sep 5, 2024
Cite
Givago Souza (2024). Dataset of smartphone-based finger tapping test [Dataset]. http://doi.org/10.6084/m9.figshare.26940823.v1
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.6084/m9.figshare.26940823.v1
Dataset updated
Sep 5, 2024
Dataset provided by
figshare
Authors
Givago Souza
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset of smartphone-based finger tapping test submitted to Scientific Data journal.
instant app embed test - edm3-jhz2 - Archive Repository
healthdata.gov
application/rdfxml +5
Updated Aug 18, 2022
Cite
(2022). instant app embed test - edm3-jhz2 - Archive Repository [Dataset]. https://healthdata.gov/dataset/instant-app-embed-test-edm3-jhz2-Archive-Repositor/wpg8-uxj3
Explore at:
Available download formats: csv, json, application/rdfxml, application/rssxml, xml, tsv
Dataset updated
Aug 18, 2022
Description
This dataset tracks the updates made on the dataset "instant app embed test" as a repository for previous versions of the data and metadata.
COVID-19 Test Sites
catalog.data.gov
Updated Mar 31, 2025
+ more versions
Cite
City of Philadelphia (2025). COVID-19 Test Sites [Dataset]. https://catalog.data.gov/dataset/covid-19-test-sites
Explore at:
Dataset updated
Mar 31, 2025
Dataset provided by
City of Philadelphia
Description
A dataset of COVID-19 testing sites. If looking for a test, please use the Testing Sites locator app. You will be asked for identification and will also be asked for health insurance information. Identification will be required to receive a test. If you don’t have health insurance, you may still be able to receive a test by paying out-of-pocket. Some sites may also:
- Limit testing to people who meet certain criteria.
- Require an appointment.
- Require a referral from your doctor.
Check a location’s specific details on the map. Then, call or visit the provider’s website before going for a test.
Check In Qld app - Customer Check-ins
data.qld.gov.au
csv
Updated Jul 1, 2022
Cite
Communities, Housing and Digital Economy (2022). Check In Qld app - Customer Check-ins [Dataset]. https://www.data.qld.gov.au/dataset/check-in-qld-app-customer-check-ins
Explore at:
Available download formats: csv (1.2 MiB)
Dataset updated
Jul 1, 2022
Dataset authored and provided by
Communities, Housing and Digital Economy
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Queensland
Description
Check In Qld is the Queensland Government's app used to support contact tracing and is available to download for use across a number of businesses to help keep Queenslanders COVID Safe.

For more information on the Check In Qld app please visit https://www.covid19.qld.gov.au/check-in-qld

Note: From 1am AEST Thursday 30 June 2022, checking in at locations in Queensland is no longer required. Data from the Qld Check in App will no longer be collected by the Queensland Government and therefore this dataset will no longer be updated.
PlantifyDr Dataset
kaggle.com
Updated Mar 6, 2021
Cite
Alex Lavaee (2021). PlantifyDr Dataset [Dataset]. https://www.kaggle.com/datasets/lavaman151/plantifydr-dataset/discussion?sort=undefined
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 6, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Alex Lavaee
Description
Context

This is the dataset that I used in my iOS and Android plant disease detection app, PlantifyDr. You can check out my full open-source project here: https://github.com/lavaman131/PlantifyDr

Content

The dataset contains over 125,000 jpg images of 10 different plant types: Apple, Bell pepper, Cherry, Citrus, Corn, Grape, Peach, Potato, Strawberry, and Tomato. The total number of plant diseases is 37. Augmentations have already been applied to the data, but feel free to add your own augmentations if you like.

Acknowledgements

Special thanks to: https://data.mendeley.com/datasets/tywbtsjrjv/1 https://www.kaggle.com/vipoooool/new-plant-diseases-dataset https://github.com/pratikkayal/PlantDoc-Dataset https://data.mendeley.com/datasets/3f83gxmv57/2

for the data.

Inspiration

The Food and Agriculture Organization of the United Nations (FAO) estimates that between 20 and 40 percent of global crop production is lost annually. Each year, plant diseases cost the global economy around $220 billion. I hoped to use deep learning to solve this problem and be able to better educate farmers and the public with the necessary knowledge to treat their plants.
Data Set on Accuracy of Symptom Checker Apps in 2020
data.niaid.nih.gov
explore.openaire.eu
Updated Feb 13, 2022
Cite
Schmidt, Konrad (2022). Data Set on Accuracy of Symptom Checker Apps in 2020 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6054092
Explore at:
Dataset updated
Feb 13, 2022
Dataset provided by
Schulz-Niethammer, Sven
Schmidt, Konrad
Balzer, Felix
Schmieding, Malte L
Feufel, Markus
Kopka, Marvin
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These two data sets present the accuracy of triage (disposition) and diagnostic advice of symptom checker apps sampled in 2020. The sample consists of 22 commonly used symptom checker apps, of which 14 also provide diagnostic advice. The apps were tested on 45 case vignettes, i.e. fictitious descriptions of patients. As not every app was able to appraise every vignette our study yielded a total of 796 unique triage evaluations and 520 unique diagnostic evaluations. The data sets are a supplement to the paper "Triage Accuracy of Symptom Checker Apps: A Five-year Follow-up Evaluation" (doi: 10.2196/31810).

The data was collected by Anna Dames as a partial requirement for her MSc degree in Human Factors in the Department of Psychology and Ergonomics (IPA) at Technische Universität Berlin.

The clinical vignettes were originally compiled and modified by Semigran et al. in 2015 (https://doi.org/10.1136/bmj.h3480), and further adapted by Hill et al. (2020) (doi: 10.5694/mja2.50600) and in the study to which these data sets are a supplement (doi: 10.2196/31810).
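The evaluation counts quoted in the description can be related to the largest number of evaluations the design allows. This is a small arithmetic sketch, assuming each app appraises each vignette at most once (which is how "unique" evaluations is read here); the variable names are illustrative only.

```python
n_apps_triage = 22       # apps giving triage advice
n_apps_diagnosis = 14    # apps that also give diagnostic advice
n_vignettes = 45         # case vignettes per app

triage_evaluations = 796
diagnostic_evaluations = 520

# Upper bounds if every app had appraised every vignette.
max_triage = n_apps_triage * n_vignettes          # 990
max_diagnosis = n_apps_diagnosis * n_vignettes    # 630

triage_coverage = triage_evaluations / max_triage
diagnosis_coverage = diagnostic_evaluations / max_diagnosis
print(f"{triage_coverage:.1%} triage, {diagnosis_coverage:.1%} diagnosis")
# prints "80.4% triage, 82.5% diagnosis"
```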
Facebook users worldwide 2017-2027
statista.com
de.statista.com
+1 more
Cite
Stacy Jo Dixon, Facebook users worldwide 2017-2027 [Dataset]. https://www.statista.com/topics/1164/social-networks/
Explore at:
Dataset provided by
Statista (http://statista.com/)
Authors
Stacy Jo Dixon
Description
The global number of Facebook users was forecast to increase continuously between 2023 and 2027 by a total of 391 million users (+14.36 percent). After the fourth consecutive year of growth, the Facebook user base is estimated to reach 3.1 billion users and therefore a new peak in 2027. Notably, the number of Facebook users increased continuously over the past years. User figures, shown here for the platform Facebook, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
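The quoted figures can be cross-checked with a short calculation. This is a sketch that assumes the 14.36 percent is measured relative to the 2023 user base, which the description implies but does not state outright; the variable names are illustrative.

```python
increase_millions = 391      # forecast growth between 2023 and 2027
increase_fraction = 0.1436   # +14.36 percent, assumed relative to the 2023 base

# Implied 2023 baseline and 2027 forecast, in millions of users.
baseline_millions = increase_millions / increase_fraction
forecast_millions = baseline_millions + increase_millions

print(round(baseline_millions), round(forecast_millions))
print(round(forecast_millions / 1000, 1), "billion")
```

Under that assumption the implied 2027 figure rounds to 3.1 billion, which agrees with the peak stated in the description.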
Data from: Mechanical Parts Dataset
universe.roboflow.com
zip
Updated Dec 12, 2022
Cite
Mazhar Cakir (2022). Mechanical Parts Dataset [Dataset]. https://universe.roboflow.com/mazhar-cakir/mechanical-parts/dataset/1
Explore at:
Available download formats: zip
Dataset updated
Dec 12, 2022
Dataset authored and provided by
Mazhar Cakir
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Variables measured
Bearing Bounding Boxes
Description
Here are a few use cases for this project:

Automated Industrial Quality Control: Companies involved in manufacturing mechanical parts could use this model to automate their quality control process. The model could identify each part in real-time and determine if the right component is being used, check for defects, or confirm assembly accuracy.

Tool Inventory Management: The "Mechanical Parts" model could be used in hardware stores or workshops for efficient tool inventory management. By scanning an area with a camera, the system could instantly itemize all available parts, categorizing them into nuts, bolts, gears, bearings, etc.

Mechanical Failure Diagnostics: This model could be used by mechanics to diagnose mechanical failures in equipment or engines. By identifying individual parts through images, it could help point out any damaged parts like a worn out gear or defective bearing.

Augmented Reality (AR) Applications: The model could be used in AR applications to help students or novice mechanics learn about machinery. As they scan different parts with their mobile device, the app could identify each component and provide educational information.

Recycling Center Sorting: The model could be applied in recycling centers to sort and categorize mechanical parts. This could help efficiently separate reusable components and streamline the recycling process.
Food dot com recipes dataset
crawlfeeds.com
csv, zip
Updated Jul 1, 2025
Cite
Crawl Feeds (2025). Food dot com recipes dataset [Dataset]. https://crawlfeeds.com/datasets/food-dot-com-recipes-dataset
Explore at:
Available download formats: zip, csv
Dataset updated
Jul 1, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Explore the culinary world with our extensive Food.com Recipes dataset. This dataset offers a rich collection of recipes sourced from Food.com, one of the largest and most trusted recipe platforms. Ideal for food enthusiasts, chefs, app developers, and data scientists, this dataset provides everything you need to create, analyze, and innovate in the kitchen.

The dataset includes detailed information such as recipe names, ingredients, step-by-step cooking instructions, preparation and cooking times, user ratings, and dietary preferences. Whether you're developing a new recipe app, conducting food-related research, or simply looking to explore new dishes, this dataset offers a wealth of information to help you achieve your goals.

Looking for additional data to power your food-related projects? Check out our Food & Beverage Data for access to a wide variety of datasets that can help you unlock new opportunities in the food and beverage industry.

With thousands of recipes covering a wide range of cuisines, meal types, and dietary requirements, this dataset is perfect for those looking to build recipe recommendation systems, nutritional analysis tools, or food blogs. Tap into the rich culinary diversity offered by Food.com and take your food-related projects to the next level.
Check In Qld app - Registered Locations
data.qld.gov.au
csv
Updated Jul 1, 2022
Cite
Communities, Housing and Digital Economy (2022). Check In Qld app - Registered Locations [Dataset]. https://www.data.qld.gov.au/dataset/check-in-qld-app-registered-locations
Explore at:
Available download formats: csv (731.5 KiB)
Dataset updated
Jul 1, 2022
Dataset authored and provided by
Communities, Housing and Digital Economy
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Queensland
Description
Check In Qld is the Queensland Government's app used to support contact tracing and is available to download for use across a number of businesses to help keep Queenslanders COVID Safe.

For more information on the Check In Qld app please visit https://www.covid19.qld.gov.au/check-in-qld

Note: From 1am AEST Thursday 30 June 2022, checking in at locations in Queensland is no longer required. Data from the Qld Check in App will no longer be collected by the Queensland Government and therefore this dataset will no longer be updated.
StreetSurfaceVis: a dataset of street-level imagery with annotations of road...
data.niaid.nih.gov
zenodo.org
Updated Jan 20, 2025
+ more versions
Cite
Hoffmann, Edith (2025). StreetSurfaceVis: a dataset of street-level imagery with annotations of road surface type and quality [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11449976
Explore at:
Dataset updated
Jan 20, 2025
Dataset provided by
Hoffmann, Edith
Kapp, Alexandra
Weigmann, Esther
Mihaljevic, Helena
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
StreetSurfaceVis

StreetSurfaceVis is an image dataset containing 9,122 street-level images from Germany with labels on road surface type and quality. The CSV file streetSurfaceVis_v1_0.csv contains all image metadata and four folders contain the image files. All images are available in four different sizes, based on the image width, in 256px, 1024px, 2048px and the original size. Folders containing the images are named according to the respective image size. Image files are named based on the mapillary_image_id.

You can find the corresponding publication here: StreetSurfaceVis: a dataset of crowdsourced street-level imagery with semi-automated annotations of road surface type and quality

Image metadata

Each CSV record contains information about one street-level image with the following attributes:

mapillary_image_id: ID provided by Mapillary (see information below on Mapillary)

user_id: Mapillary user ID of contributor

user_name: Mapillary user name of contributor

captured_at: timestamp, capture time of image

longitude, latitude: location the image was taken at

train: Suggestion to split train and test data. True for train data and False for test data. Test data contains data from 5 cities which are excluded in the training data.

surface_type: Surface type of the road in the focal area (the center of the lower image half) of the image. Possible values: asphalt, concrete, paving_stones, sett, unpaved

surface_quality: Surface quality of the road in the focal area of the image. Possible values: (1) excellent, (2) good, (3) intermediate, (4) bad, (5) very bad (see the attached Labeling Guide document for details)

Image source

Images are obtained from Mapillary, a crowd-sourcing platform for street-level imagery. More metadata about each image can be obtained via the Mapillary API. User-generated images are shared by Mapillary under the CC-BY-SA License.

For each image, the dataset contains the mapillary_image_id and user_name. You can access user information on the Mapillary website by https://www.mapillary.com/app/user/ and image information by https://www.mapillary.com/app/?focus=photo&pKey=

If you use the provided images, please adhere to the terms of use of Mapillary.

Instances per class

Total number of images: 9,122

excellent good intermediate bad very bad
asphalt 971 1697 821 246
concrete 314 350 250 58
paving stones 385 1063 519 70
sett 129 694 540
unpaved - 326 387 303

For modeling, we recommend using a train-test split where the test data includes geospatially distinct areas, thereby ensuring the model's ability to generalize to unseen regions is tested. We propose five cities varying in population size and from different regions in Germany for testing - images are tagged accordingly.

Number of test images (train-test split): 776

Inter-rater reliability

Three annotators labeled the dataset, such that each image was annotated by one person. Annotators were encouraged to consult each other for a second opinion when uncertain. 1,800 images were annotated by all three annotators, resulting in a Krippendorff's alpha of 0.96 for surface type and 0.74 for surface quality.

Recommended image preprocessing

Because the label refers to the focal road area in the bottom center of the street-level image, we recommend cropping each image to its lower half and its horizontal middle half before using it for classification tasks.

Example code for the recommended image preprocessing in Python:

from PIL import Image

img = Image.open(image_path)
width, height = img.size
# keep the lower half vertically and the middle half horizontally
img_cropped = img.crop((0.25 * width, 0.5 * height, 0.75 * width, height))

License

CC-BY-SA

Citation

If you use this dataset, please cite as:

Kapp, A., Hoffmann, E., Weigmann, E. et al. StreetSurfaceVis: a dataset of crowdsourced street-level imagery annotated by road surface type and quality. Sci Data 12, 92 (2025). https://doi.org/10.1038/s41597-024-04295-9

@article{kapp_streetsurfacevis_2025,
  title = {{StreetSurfaceVis}: a dataset of crowdsourced street-level imagery annotated by road surface type and quality},
  volume = {12},
  issn = {2052-4463},
  url = {https://doi.org/10.1038/s41597-024-04295-9},
  doi = {10.1038/s41597-024-04295-9},
  pages = {92},
  number = {1},
  journaltitle = {Scientific Data},
  shortjournal = {Scientific Data},
  author = {Kapp, Alexandra and Hoffmann, Edith and Weigmann, Esther and Mihaljević, Helena},
  date = {2025-01-16},
}

This is part of the SurfaceAI project at the University of Applied Sciences, HTW Berlin.

Prof. Dr. Helena Mihaljević, Alexandra Kapp, Edith Hoffmann, Esther Weigmann

Contact: surface-ai@htw-berlin.de

https://surfaceai.github.io/surfaceai/

Funding: SurfaceAI is an mFUND project funded by the Federal Ministry for Digital and Transport, Germany.
Monthly time series of spatially enhanced relative humidity for Europe at...
data.opendatascience.eu
data.mundialis.de
Updated Dec 16, 2023
Cite
(2023). Monthly time series of spatially enhanced relative humidity for Europe at 1000 m resolution (2000 - 2022) derived from ERA5-Land data [Dataset]. https://data.opendatascience.eu/geonetwork/srv/search?keyword=TBE
Description
Overview: ERA5-Land is a reanalysis dataset providing a consistent view of the evolution of land variables over several decades at an enhanced resolution compared to ERA5. ERA5-Land has been produced by replaying the land component of the ECMWF ERA5 climate reanalysis. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. Reanalysis produces data that goes several decades back in time, providing an accurate description of the climate of the past.

Processing steps: The original hourly ERA5-Land air temperature 2 m above ground and dewpoint temperature 2 m data has been spatially enhanced from 0.1 degree to 30 arc seconds (approx. 1000 m) spatial resolution by image fusion with CHELSA data (V1.2) (https://chelsa-climate.org/). For each day we used the corresponding monthly long-term average of CHELSA. The aim was to use the fine spatial detail of CHELSA and at the same time preserve the general regional pattern and fine temporal detail of ERA5-Land. The steps included aggregation and enhancement, specifically:

1. spatially aggregate CHELSA to the resolution of ERA5-Land
2. calculate difference of ERA5-Land - aggregated CHELSA
3. interpolate differences with a Gaussian filter to 30 arc seconds
4. add the interpolated differences to CHELSA

Subsequently, the temperature time series have been aggregated on a daily basis. From these, daily relative humidity has been calculated for the time period 01/2000 - 12/2023. Relative humidity (rh2m) has been calculated from air temperature 2 m above ground (Ta) and dewpoint temperature 2 m above ground (Td) using the formula for saturated water pressure from Wright (1997):

maximum water pressure = 611.21 * exp(17.502 * Ta / (240.97 + Ta))
actual water pressure = 611.21 * exp(17.502 * Td / (240.97 + Td))
relative humidity = actual water pressure / maximum water pressure

The resulting relative humidity has been aggregated to monthly averages. Resultant values have been converted to represent percent * 10, thus covering a theoretical range of [0, 1000]. The data have been reprojected to EU LAEA.

File naming scheme (YYYY = year; MM = month): ERA5_land_rh2m_avg_monthly_YYYY_MM.tif
Projection + EPSG code: EU LAEA (EPSG: 3035)
Spatial extent: north: 6874000 south: -485000 west: 869000 east: 8712000
Spatial resolution: 1000 m
Temporal resolution: Monthly
Pixel values: Percent * 10 (scaled to Integer; example: value 738 = 73.8 %)
Software used: GDAL 3.2.2 and GRASS GIS 8.0.0/8.3.2
Original ERA5-Land dataset license: https://apps.ecmwf.int/datasets/licences/copernicus/

CHELSA climatologies (V1.2):
Data used: Karger D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N.E., Linder, H.P., Kessler, M. (2018): Data from: Climatologies at high resolution for the earth's land surface areas. Dryad digital repository. http://dx.doi.org/doi:10.5061/dryad.kd1d4
Original peer-reviewed publication: Karger, D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N.E., Linder, P., Kessler, M. (2017): Climatologies at high resolution for the Earth land surface areas. Scientific Data. 4 170122. https://doi.org/10.1038/sdata.2017.122

Processed by: mundialis GmbH & Co. KG, Germany (https://www.mundialis.de/)

Reference: Wright, J.M. (1997): Federal meteorological handbook no. 3 (FCM-H3-1997). Office of Federal Coordinator for Meteorological Services and Supporting Research. Washington, DC

Data is also available in Latitude-Longitude/WGS84 (EPSG: 4326) projection: https://data.mundialis.de/geonetwork/srv/eng/catalog.search#/metadata/b9ce7dba-4130-428d-96f0-9089d8b9f4a5

Acknowledgements: This study was partially funded by EU grant 874850 MOOD. The contents of this publication are the sole responsibility of the authors and don't necessarily reflect the views of the European Commission.
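The relative humidity formula and the percent * 10 pixel encoding described above can be sketched for a single value as follows. This is an illustration rather than the processing code used by the data provider: it assumes that Ta and Td are given in degrees Celsius and that scaling to integer uses ordinary rounding, neither of which is stated in the description. The function names are hypothetical.

```python
import math


def vapor_pressure(temp_c):
    """Saturated water pressure after Wright (1997), as quoted in the description.

    Assumption: temp_c is in degrees Celsius.
    """
    return 611.21 * math.exp(17.502 * temp_c / (240.97 + temp_c))


def relative_humidity(ta_c, td_c):
    """Relative humidity as a fraction: actual / maximum water pressure."""
    return vapor_pressure(td_c) / vapor_pressure(ta_c)


def to_pixel_value(rh_fraction):
    """Encode a fraction as percent * 10 (assumed rounding to nearest integer)."""
    return int(round(rh_fraction * 1000))


def from_pixel_value(pixel):
    """Decode a stored pixel value back to percent, e.g. 738 -> 73.8."""
    return pixel / 10.0


rh = relative_humidity(20.0, 10.0)  # air temperature above dewpoint, so rh < 1
print(to_pixel_value(rh), from_pixel_value(738))
```

When air temperature and dewpoint are equal, the ratio is 1 and the encoded value is 1000, the upper end of the theoretical range.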
WS : Overflowing Garbage
data.edmonton.ca
application/rdfxml +5
Updated Sep 18, 2025
Cite
City of Edmonton (2025). WS : Overflowing Garbage [Dataset]. https://data.edmonton.ca/w/a4ha-fnwu/depj-dfck?cur=AX_Pv9UntPm
Available download formats: application/rdfxml, csv, application/rssxml, json, xml, tsv
Dataset authored and provided by
City of Edmonton
Description
Listing of all requests raised to 311, details of each call/transaction and update on the status.

This is anonymized data intended for reporting on trends; if you have a specific reference number for which you would like to check the status, please see the "check your status" search box at https://www.edmonton.ca/programs_services/311-city-services.aspx.

Due to the volume of data, this dataset contains 311 requests created within the current calendar year and the previous two calendar years. In January of a given year it will contain 2 years and 1 month of requests; in December of that year it will contain 3 years of requests. For historical data please search for "311 Requests (YEAR)" where "YEAR" is a year value, e.g. 2018. These will exist for years that no longer exist in this dataset, i.e. "current year minus three" will be the greatest year value that has its own single-year dataset.
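The retention rule above amounts to simple date arithmetic. The sketch below assumes that coverage starts on 1 January of "current year minus two" and that the current month counts as a full month; the function name is hypothetical.

```python
from datetime import date


def coverage(today):
    """Return (start_date, months_covered, latest_single_year_dataset)."""
    # The dataset holds the current calendar year and the previous two.
    start = date(today.year - 2, 1, 1)
    # Two full years plus the months of the current year so far.
    months_covered = 24 + today.month
    # "Current year minus three" is the newest year with its own dataset.
    latest_single_year_dataset = today.year - 3
    return start, months_covered, latest_single_year_dataset


print(coverage(date(2025, 1, 15)))
print(coverage(date(2025, 12, 1)))
```

For a date in January this gives 25 months (2 years and 1 month), and for December 36 months (3 years), matching the description.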


RICO dataset

A Mobile App Dataset for Building Data-Driven Design Applications


Acknowledgements

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN https://interactionmining.org/rico

Inspiration

The Rico dataset is large enough to support deep learning applications. We trained an autoencoder to learn an embedding for UI layouts, and used it to annotate each UI with a 64-dimensional vector representation encoding visual layout. This vector representation can be used to compute structurally — and often semantically — similar UIs, supporting example-based search over the dataset. To create training inputs for the autoencoder that embed layout information, we constructed a new image for each UI capturing the bounding box regions of all leaf elements in its view hierarchy, differentiating between text and non-text elements. Rico’s view hierarchies obviate the need for noisy image processing or OCR techniques to create these inputs.

