50 datasets found
  1. RICO dataset

    • kaggle.com
    Updated Dec 2, 2021
    Cite
    Onur Gunes (2021). RICO dataset [Dataset]. https://www.kaggle.com/datasets/onurgunes1993/rico-dataset
    Dataset updated
    Dec 2, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Onur Gunes
    Description

    Context

    Data-driven models help mobile app designers understand best practices and trends, and can be used to make predictions about design performance and support the creation of adaptive UIs. This paper presents Rico, the largest repository of mobile app designs to date, created to support five classes of data-driven applications: design search, UI layout generation, UI code generation, user interaction modeling, and user perception prediction. To create Rico, we built a system that combines crowdsourcing and automation to scalably mine design and interaction data from Android apps at runtime. The Rico dataset contains design data from more than 9.3k Android apps spanning 27 categories. It exposes visual, textual, structural, and interactive design properties of more than 66k unique UI screens. To demonstrate the kinds of applications that Rico enables, we present results from training an autoencoder for UI layout similarity, which supports query-by-example search over UIs.

    Content

    Rico was built by mining Android apps at runtime via human-powered and programmatic exploration. Like its predecessor ERICA, Rico’s app mining infrastructure requires no access to — or modification of — an app’s source code. Apps are downloaded from the Google Play Store and served to crowd workers through a web interface. When crowd workers use an app, the system records a user interaction trace that captures the UIs visited and the interactions performed on them. Then, an automated agent replays the trace to warm up a new copy of the app and continues the exploration programmatically, leveraging a content-agnostic similarity heuristic to efficiently discover new UI states. By combining crowdsourcing and automation, Rico can achieve higher coverage over an app’s UI states than either crawling strategy alone. In total, 13 workers recruited on UpWork spent 2,450 hours using apps on the platform over five months, producing 10,811 user interaction traces. After collecting a user trace for an app, we ran the automated crawler on the app for one hour.

    Acknowledgements

    UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN https://interactionmining.org/rico

    Inspiration

    The Rico dataset is large enough to support deep learning applications. We trained an autoencoder to learn an embedding for UI layouts, and used it to annotate each UI with a 64-dimensional vector representation encoding visual layout. This vector representation can be used to compute structurally — and often semantically — similar UIs, supporting example-based search over the dataset. To create training inputs for the autoencoder that embed layout information, we constructed a new image for each UI capturing the bounding box regions of all leaf elements in its view hierarchy, differentiating between text and non-text elements. Rico’s view hierarchies obviate the need for noisy image processing or OCR techniques to create these inputs.
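
    The layout-image construction described above can be sketched as follows. This is a minimal illustration, not the Rico authors' pipeline: the node format (a dict with `bounds`, optional `text`, and `children`), the function names, and the 18x32 output grid are all assumptions made for the example.

```python
from typing import Any, Dict, List, Tuple

Grid = List[List[int]]


def leaf_nodes(node: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect leaf elements of a (hypothetical) view-hierarchy dict."""
    children = node.get("children") or []
    if not children:
        return [node]
    leaves: List[Dict[str, Any]] = []
    for child in children:
        leaves.extend(leaf_nodes(child))
    return leaves


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def rasterize_layout(root: Dict[str, Any], screen_w: int, screen_h: int,
                     out_w: int = 18, out_h: int = 32) -> Tuple[Grid, Grid]:
    """Return (text_channel, non_text_channel) as out_h x out_w grids of 0/1.

    Assumes each node has "bounds" = [left, top, right, bottom] in screen
    pixels and that a non-empty "text" field marks a text element.
    """
    text = [[0] * out_w for _ in range(out_h)]
    non_text = [[0] * out_w for _ in range(out_h)]
    for leaf in leaf_nodes(root):
        left, top, right, bottom = leaf["bounds"]
        # Scale the bounding box to grid cells, clamped to the grid.
        x0 = max(0, left * out_w // screen_w)
        x1 = min(out_w, ceil_div(right * out_w, screen_w))
        y0 = max(0, top * out_h // screen_h)
        y1 = min(out_h, ceil_div(bottom * out_h, screen_h))
        channel = text if leaf.get("text") else non_text
        for y in range(y0, y1):
            for x in range(x0, x1):
                channel[y][x] = 1
    return text, non_text
```

    The two channels keep text and non-text regions separate, so a downstream model sees only layout structure and none of the pixel content.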

  2. Google Play Store Apps / Games Data, Android Apps Data, Consumer Review...

    • datarade.ai
    .json, .csv
    Cite
    OpenWeb Ninja, Google Play Store Apps / Games Data, Android Apps Data, Consumer Review Data, Top Charts | Real-Time API [Dataset]. https://datarade.ai/data-products/openweb-ninja-google-play-store-data-android-apps-games-openweb-ninja
    Available download formats: .json, .csv
    Dataset authored and provided by
    OpenWeb Ninja
    Area covered
    Christmas Island, Finland, Macedonia (the former Yugoslav Republic of), Nicaragua, Mali, Bermuda, Azerbaijan, Guam, Korea (Republic of), Netherlands
    Description

    Use the OpenWeb Ninja Google Play App Store Data API to access comprehensive data on Google Play Store, including Android Apps / Games, reviews, top charts, search, and more. Our extensive dataset provides over 40 app store data points, enabling you to gain deep insights into the market.

    The App Store Data dataset includes all key app details:

    App Name, Description, Rating, Photos, Downloads, Version Information, App Size, Permissions, Developer and Contact Information, Consumer Review Data.

  3. The manifest and store data of 870,515 Android mobile applications - Dataset...

    • b2find.eudat.eu
    Updated Oct 23, 2023
    + more versions
    Cite
    (2023). The manifest and store data of 870,515 Android mobile applications - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/b25ee20e-5268-50ae-9914-4bc70bd4ff1c
    Dataset updated
    Oct 23, 2023
    Description

    We built a crawler to collect data from the Google Play store, including each application's metadata and APK files. The manifest files were extracted from the APK files and then processed to extract the features. The data set is composed of 870,515 records/apps, and for each app we produced 48 features. The data set was used to build and test two bootstrap aggregating ensembles of multiple XGBoost machine learning classifiers. The data were collected between April 2017 and November 2018. We then checked the status of these applications on three different occasions: December 2018, February 2019, and May-June 2019.
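
    As a rough illustration of manifest-based feature extraction, the sketch below derives a handful of counts from a manifest. It assumes the manifest has already been decoded from the APK's binary XML into plain text XML. The feature names are hypothetical; the dataset's actual 48 features are not listed in this description.

```python
import xml.etree.ElementTree as ET

# Namespace Android uses for attributes such as android:name.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def manifest_features(manifest_xml: str) -> dict:
    """Extract a few illustrative (hypothetical) features from a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    permissions = [
        el.get(ANDROID_NS + "name") for el in root.iter("uses-permission")
    ]
    app = root.find("application")

    def count(tag: str) -> int:
        # Count direct children of <application> with the given tag.
        return len(app.findall(tag)) if app is not None else 0

    return {
        "package": root.get("package"),
        "n_permissions": len(permissions),
        "uses_internet": "android.permission.INTERNET" in permissions,
        "n_activities": count("activity"),
        "n_services": count("service"),
        "n_receivers": count("receiver"),
    }
```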

  4. Data of cognitive task performance and illuminance collected by a smartphone...

    • figshare.manchester.ac.uk
    txt
    Updated Jul 1, 2022
    Cite
    Altug Didikoglu; Marina Gardesevic; Samuel JD Lawrence; Céline Vetter; Timothy M Brown; Annette E Allen; Robert J Lucas (2022). Data of cognitive task performance and illuminance collected by a smartphone app [Dataset]. http://doi.org/10.48420/20212265.v1
    Available download formats: txt
    Dataset updated
    Jul 1, 2022
    Dataset provided by
    University of Manchester
    Authors
    Altug Didikoglu; Marina Gardesevic; Samuel JD Lawrence; Céline Vetter; Timothy M Brown; Annette E Allen; Robert J Lucas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data was collected by a smartphone app (Brighter Time) to capture measures of cognitive performance and light exposure during everyday life. The app incorporated a psychomotor vigilance task (PVT), an N2-back task, and a visual search task, together with questionnaire-based assessments of sleep timing. The app also measured illuminance during task completion using the smartphone’s intrinsic light meter. Data was collected in a pilot feasibility study of Brighter Time based upon 91 week-long running Brighter Time on their own smartphones. Data: ambient light (log lx), KSS score (Karolinska Sleepiness Scale), median reaction times (ms), number of lapses in PVT (>500 ms), hit rate (%), false alarm rate (%), 90th percentile of reaction times (ms), 10th percentile of reaction times (ms), inverse efficiency score, d-prime for N-back task, search efficiency slope for visual search task.
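
    Two of the listed measures can be illustrated with a short sketch. It uses the standard signal-detection definition d' = z(hit rate) - z(false alarm rate) and counts PVT lapses as reaction times above 500 ms. The function names, the clipping constant `eps`, and the use of rates as fractions rather than percentages are assumptions for this example, not details taken from the dataset.

```python
from statistics import NormalDist, median
from typing import Dict, Sequence


def pvt_summary(reaction_times_ms: Sequence[float],
                lapse_threshold_ms: float = 500.0) -> Dict[str, float]:
    """Median reaction time and number of lapses (RT above the threshold)."""
    return {
        "median_rt_ms": median(reaction_times_ms),
        "lapses": sum(rt > lapse_threshold_ms for rt in reaction_times_ms),
    }


def d_prime(hit_rate: float, false_alarm_rate: float, eps: float = 0.01) -> float:
    """Sensitivity index d' = z(H) - z(FA), with rates given as fractions.

    Rates are clipped to [eps, 1 - eps] so the inverse normal CDF stays
    finite; the clipping value is an arbitrary choice for this sketch.
    """
    z = NormalDist().inv_cdf
    h = min(max(hit_rate, eps), 1 - eps)
    fa = min(max(false_alarm_rate, eps), 1 - eps)
    return z(h) - z(fa)
```

    Because the dataset reports hit and false alarm rates in percent, they would need dividing by 100 before being passed to `d_prime`.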

  5. TagX Web Browsing clickstream Data - 300K Users North America, EU - GDPR -...

    • datarade.ai
    .json, .csv, .xls
    Updated Sep 16, 2024
    Cite
    TagX (2024). TagX Web Browsing clickstream Data - 300K Users North America, EU - GDPR - CCPA Compliant [Dataset]. https://datarade.ai/data-products/tagx-web-browsing-clickstream-data-300k-users-north-america-tagx
    Available download formats: .json, .csv, .xls
    Dataset updated
    Sep 16, 2024
    Dataset authored and provided by
    TagX
    Area covered
    United States
    Description

    TagX Web Browsing Clickstream Data: Unveiling Digital Behavior Across North America and EU

    Unique Insights into Online User Behavior

    TagX Web Browsing clickstream Data offers an unparalleled window into the digital lives of 1 million users across North America and the European Union. This comprehensive dataset stands out in the market due to its breadth, depth, and stringent compliance with data protection regulations.

    What Makes Our Data Unique?

    - Extensive Geographic Coverage: Spanning two major markets, our data provides a holistic view of web browsing patterns in developed economies.
    - Large User Base: With 300K active users, our dataset offers statistically significant insights across various demographics and user segments.
    - GDPR and CCPA Compliance: We prioritize user privacy and data protection, ensuring that our data collection and processing methods adhere to the strictest regulatory standards.
    - Real-time Updates: Our clickstream data is continuously refreshed, providing up-to-the-minute insights into evolving online trends and user behaviors.
    - Granular Data Points: We capture a wide array of metrics, including time spent on websites, click patterns, search queries, and user journey flows.

    Data Sourcing: Ethical and Transparent

    Our web browsing clickstream data is sourced through a network of partnered websites and applications. Users explicitly opt-in to data collection, ensuring transparency and consent. We employ advanced anonymization techniques to protect individual privacy while maintaining the integrity and value of the aggregated data. Key aspects of our data sourcing process include:

    - Voluntary user participation through clear opt-in mechanisms
    - Regular audits of data collection methods to ensure ongoing compliance
    - Collaboration with privacy experts to implement best practices in data anonymization
    - Continuous monitoring of regulatory landscapes to adapt our processes as needed

    Primary Use Cases and Verticals

    TagX Web Browsing clickstream Data serves a multitude of industries and use cases, including but not limited to:

    Digital Marketing and Advertising:

    - Audience segmentation and targeting
    - Campaign performance optimization
    - Competitor analysis and benchmarking

    E-commerce and Retail:

    - Customer journey mapping
    - Product recommendation enhancements
    - Cart abandonment analysis

    Media and Entertainment:

    - Content consumption trends
    - Audience engagement metrics
    - Cross-platform user behavior analysis

    Financial Services:

    - Risk assessment based on online behavior
    - Fraud detection through anomaly identification
    - Investment trend analysis

    Technology and Software:

    - User experience optimization
    - Feature adoption tracking
    - Competitive intelligence

    Market Research and Consulting:

    - Consumer behavior studies
    - Industry trend analysis
    - Digital transformation strategies

    Integration with Broader Data Offering

    TagX Web Browsing clickstream Data is a cornerstone of our comprehensive digital intelligence suite. It seamlessly integrates with our other data products to provide a 360-degree view of online user behavior:

    - Social Media Engagement Data: Combine clickstream insights with social media interactions for a holistic understanding of digital footprints.
    - Mobile App Usage Data: Cross-reference web browsing patterns with mobile app usage to map the complete digital journey.
    - Purchase Intent Signals: Enrich clickstream data with purchase intent indicators to power predictive analytics and targeted marketing efforts.
    - Demographic Overlays: Enhance web browsing data with demographic information for more precise audience segmentation and targeting.

    By leveraging these complementary datasets, businesses can unlock deeper insights and drive more impactful strategies across their digital initiatives.

    Data Quality and Scale

    We pride ourselves on delivering high-quality, reliable data at scale:

    - Rigorous Data Cleaning: Advanced algorithms filter out bot traffic, VPNs, and other non-human interactions.
    - Regular Quality Checks: Our data science team conducts ongoing audits to ensure data accuracy and consistency.
    - Scalable Infrastructure: Our robust data processing pipeline can handle billions of daily events, ensuring comprehensive coverage.
    - Historical Data Availability: Access up to 24 months of historical data for trend analysis and longitudinal studies.
    - Customizable Data Feeds: Tailor the data delivery to your specific needs, from raw clickstream events to aggregated insights.

    Empowering Data-Driven Decision Making

    In today's digital-first world, understanding online user behavior is crucial for businesses across all sectors. TagX Web Browsing clickstream Data empowers organizations to make informed decisions, optimize their digital strategies, and stay ahead of the competition. Whether you're a marketer looking to refine your targeting, a product manager seeking to enhance user experience, or a researcher exploring digital trends, our cli...

  6. Data from: Testing of Mobile Applications in the Wild: A Large-Scale...

    • figshare.com
    txt
    Updated Mar 25, 2020
    Cite
    Fabiano Pecorelli (2020). Testing of Mobile Applications in the Wild: A Large-Scale Empirical Study on Android Apps [Dataset]. http://doi.org/10.6084/m9.figshare.9980672.v1
    Available download formats: txt
    Dataset updated
    Mar 25, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Fabiano Pecorelli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Nowadays, mobile applications (a.k.a. apps) are used by over two billion users for every type of need, including social and emergency connectivity. Their pervasiveness in today's world has inspired the software testing research community to devise approaches that allow developers to better test their apps and improve the quality of the tests being developed. In spite of this research effort, we still notice a lack of empirical analyses assessing the actual quality of test cases manually developed by mobile developers: this perspective could provide evidence-based findings on future research directions in the field as well as on the current status of testing in the wild. As such, we performed a large-scale empirical study targeting 1,780 open-source Android apps, aiming to assess (1) the extent to which these apps are actually tested, (2) how well designed the available tests are, and (3) how effective they are. The key results of our study show that mobile developers still tend not to properly test their apps, possibly because of time-to-market requirements. Furthermore, we discovered that the test cases of the considered apps have low (i) design quality, both in terms of test code metrics and test smells, and (ii) effectiveness when considering code coverage as well as assertion density.

  7. Data from: EBSCO Academic Search Complete

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Oct 14, 2022
    Cite
    EBSCO (2022). EBSCO Academic Search Complete [Dataset]. https://catalog.data.gov/dataset/ebsco-academic-search-complete
    Dataset updated
    Oct 14, 2022
    Dataset provided by
    EBSCO Information Services (http://www.ebsco.com/)
    Description

    Academic Search Complete is the world's most valuable and comprehensive scholarly, multi-disciplinary full-text database, with more than 8,500 full-text periodicals, including more than 7,300 peer-reviewed journals.

  8. Data from: Dataset of smartphone-based finger tapping test

    • figshare.com
    csv
    Updated Sep 5, 2024
    Cite
    Givago Souza (2024). Dataset of smartphone-based finger tapping test [Dataset]. http://doi.org/10.6084/m9.figshare.26940823.v1
    Available download formats: csv
    Dataset updated
    Sep 5, 2024
    Dataset provided by
    figshare
    Authors
    Givago Souza
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset of smartphone-based finger tapping test submitted to Scientific Data journal.

  9. instant app embed test - edm3-jhz2 - Archive Repository

    • healthdata.gov
    application/rdfxml +5
    Updated Aug 18, 2022
    Cite
    (2022). instant app embed test - edm3-jhz2 - Archive Repository [Dataset]. https://healthdata.gov/dataset/instant-app-embed-test-edm3-jhz2-Archive-Repositor/wpg8-uxj3
    Available download formats: csv, json, application/rdfxml, application/rssxml, xml, tsv
    Dataset updated
    Aug 18, 2022
    Description

    This dataset tracks the updates made on the dataset "instant app embed test" as a repository for previous versions of the data and metadata.

  10. COVID-19 Test Sites

    • catalog.data.gov
    Updated Mar 31, 2025
    + more versions
    Cite
    City of Philadelphia (2025). COVID-19 Test Sites [Dataset]. https://catalog.data.gov/dataset/covid-19-test-sites
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    City of Philadelphia
    Description

    A dataset of COVID-19 testing sites. If looking for a test, please use the Testing Sites locator app. You will be asked for identification and for health insurance information. Identification will be required to receive a test. If you don’t have health insurance, you may still be able to receive a test by paying out-of-pocket. Some sites may also:

    - Limit testing to people who meet certain criteria.
    - Require an appointment.
    - Require a referral from your doctor.

    Check a location’s specific details on the map. Then, call or visit the provider’s website before going for a test.

  11. Check In Qld app - Customer Check-ins

    • data.qld.gov.au
    csv
    Updated Jul 1, 2022
    Cite
    Communities, Housing and Digital Economy (2022). Check In Qld app - Customer Check-ins [Dataset]. https://www.data.qld.gov.au/dataset/check-in-qld-app-customer-check-ins
    Available download formats: csv (1.2 MiB)
    Dataset updated
    Jul 1, 2022
    Dataset authored and provided by
    Communities, Housing and Digital Economy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Queensland
    Description

    Check In Qld is the Queensland Government's app used to support contact tracing and is available to download for use across a number of businesses to help keep Queenslanders COVID Safe.

    For more information on the Check In Qld app please visit https://www.covid19.qld.gov.au/check-in-qld

    Note: From 1am AEST Thursday 30 June 2022, checking in at locations in Queensland is no longer required. Data from the Qld Check in App will no longer be collected by the Queensland Government and therefore this dataset will no longer be updated.

  12. PlantifyDr Dataset

    • kaggle.com
    Updated Mar 6, 2021
    Cite
    Alex Lavaee (2021). PlantifyDr Dataset [Dataset]. https://www.kaggle.com/datasets/lavaman151/plantifydr-dataset/discussion?sort=undefined
    Dataset updated
    Mar 6, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Alex Lavaee
    Description

    Context

    This is the dataset that I used in my iOS and Android plant disease detection app, PlantifyDr. You can check out my full open-source project here: https://github.com/lavaman131/PlantifyDr

    Content

    The dataset contains over 125,000 jpg images of 10 different plant types: Apple, Bell pepper, Cherry, Citrus, Corn, Grape, Peach, Potato, Strawberry, and Tomato. The total number of plant diseases is 37. Augmentations have already been applied to the data, but feel free to add your own augmentations if you like.

    Acknowledgements

    Special thanks to: https://data.mendeley.com/datasets/tywbtsjrjv/1 https://www.kaggle.com/vipoooool/new-plant-diseases-dataset https://github.com/pratikkayal/PlantDoc-Dataset https://data.mendeley.com/datasets/3f83gxmv57/2

    for the data.

    Inspiration

    The Food and Agriculture Organization of the United Nations (FAO) estimates that between 20 and 40 percent of global crop production is lost annually. Each year, plant diseases cost the global economy around $220 billion. I hoped to use deep learning to address this problem and to better equip farmers and the public with the knowledge needed to treat their plants.

  13. Data Set on Accuracy of Symptom Checker Apps in 2020

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Feb 13, 2022
    Cite
    Schmidt, Konrad (2022). Data Set on Accuracy of Symptom Checker Apps in 2020 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6054092
    Dataset updated
    Feb 13, 2022
    Dataset provided by
    Schulz-Niethammer, Sven
    Schmidt, Konrad
    Balzer, Felix
    Schmieding, Malte L
    Feufel, Markus
    Kopka, Marvin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These two data sets present the accuracy of triage (disposition) and diagnostic advice of symptom checker apps sampled in 2020. The sample consists of 22 commonly used symptom checker apps, of which 14 also provide diagnostic advice. The apps were tested on 45 case vignettes, i.e. fictitious descriptions of patients. As not every app was able to appraise every vignette our study yielded a total of 796 unique triage evaluations and 520 unique diagnostic evaluations. The data sets are a supplement to the paper "Triage Accuracy of Symptom Checker Apps: A Five-year Follow-up Evaluation" (doi: 10.2196/31810).

    The data were collected by Anna Dames as a partial requirement for her MSc degree in Human Factors in the Department of Psychology and Ergonomics (IPA) at Technische Universität Berlin.

    The clinical vignettes were originally compiled and modified by Semigran et al. in 2015 (https://doi.org/10.1136/bmj.h3480), and further adapted by Hill et al. (2020) (doi: 10.5694/mja2.50600) and in the study to which these data sets are a supplement (doi: 10.2196/31810).

  14. Facebook users worldwide 2017-2027

    • statista.com
    • de.statista.com
    • +1more
    Cite
    Stacy Jo Dixon, Facebook users worldwide 2017-2027 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    The global number of Facebook users was forecast to increase continuously between 2023 and 2027 by a total of 391 million users (+14.36 percent). After the fourth consecutive year of growth, the Facebook user base is estimated to reach 3.1 billion users, a new peak, in 2027. Notably, the number of Facebook users has increased continuously over the past years. User figures for the Facebook platform have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
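
    The forecast figures can be cross-checked with simple arithmetic. The sketch below derives the implied 2023 base from the stated increase and growth rate; the function name is hypothetical and the derived base is an inference, not a figure given in the description.

```python
def implied_totals(increase_millions: float, growth_fraction: float):
    """Return (implied base, implied final total) in millions of users.

    If an increase of `increase_millions` corresponds to a relative growth
    of `growth_fraction`, the base is increase / growth.
    """
    base = increase_millions / growth_fraction
    return base, base + increase_millions


# Figures quoted above: +391 million users, +14.36 percent.
base_2023, total_2027 = implied_totals(391.0, 0.1436)
print(f"implied 2023 base: {base_2023 / 1000:.2f} billion")
print(f"implied 2027 total: {total_2027 / 1000:.2f} billion")
```

    The implied 2027 total rounds to 3.1 billion, which agrees with the estimate quoted in the description.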

  15. Data from: Mechanical Parts Dataset

    • universe.roboflow.com
    zip
    Updated Dec 12, 2022
    Cite
    Mazhar Cakir (2022). Mechanical Parts Dataset [Dataset]. https://universe.roboflow.com/mazhar-cakir/mechanical-parts/dataset/1
    Available download formats: zip
    Dataset updated
    Dec 12, 2022
    Dataset authored and provided by
    Mazhar Cakir
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Variables measured
    Bearing Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. Automated Industrial Quality Control: Companies involved in manufacturing mechanical parts could use this model to automate their quality control process. The model could identify each part in real-time and determine if the right component is being used, check for defects, or confirm assembly accuracy.

    2. Tool Inventory Management: The "Mechanical Parts" model could be used in hardware stores or workshops for efficient tool inventory management. By scanning an area with a camera, the system could instantly itemize all available parts, categorizing them into nuts, bolts, gears, bearings, etc.

    3. Mechanical Failure Diagnostics: This model could be used by mechanics to diagnose mechanical failures in equipment or engines. By identifying individual parts through images, it could help point out any damaged parts like a worn out gear or defective bearing.

    4. Augmented Reality (AR) Applications: The model could be used in AR applications to help students or novice mechanics learn about machinery. As they scan different parts with their mobile device, the app could identify each component and provide educational information.

    5. Recycling Center Sorting: The model could be applied in recycling centers to sort and categorize mechanical parts. This could help efficiently separate reusable components and streamline the recycling process.

  16. Food dot com recipes dataset

    • crawlfeeds.com
    csv, zip
    Updated Jul 1, 2025
    Cite
    Crawl Feeds (2025). Food dot com recipes dataset [Dataset]. https://crawlfeeds.com/datasets/food-dot-com-recipes-dataset
    Available download formats: zip, csv
    Dataset updated
    Jul 1, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Explore the culinary world with our extensive Food.com Recipes dataset. This dataset offers a rich collection of recipes sourced from Food.com, one of the largest and most trusted recipe platforms. Ideal for food enthusiasts, chefs, app developers, and data scientists, this dataset provides everything you need to create, analyze, and innovate in the kitchen.

    The dataset includes detailed information such as recipe names, ingredients, step-by-step cooking instructions, preparation and cooking times, user ratings, and dietary preferences. Whether you're developing a new recipe app, conducting food-related research, or simply looking to explore new dishes, this dataset offers a wealth of information to help you achieve your goals.

    Looking for additional data to power your food-related projects? Check out our Food & Beverage Data for access to a wide variety of datasets that can help you unlock new opportunities in the food and beverage industry.

    With thousands of recipes covering a wide range of cuisines, meal types, and dietary requirements, this dataset is perfect for those looking to build recipe recommendation systems, nutritional analysis tools, or food blogs. Tap into the rich culinary diversity offered by Food.com and take your food-related projects to the next level.

  17. Check In Qld app - Registered Locations

    • data.qld.gov.au
    csv
    Updated Jul 1, 2022
    Cite
    Communities, Housing and Digital Economy (2022). Check In Qld app - Registered Locations [Dataset]. https://www.data.qld.gov.au/dataset/check-in-qld-app-registered-locations
    Available download formats: csv (731.5 KiB)
    Dataset updated
    Jul 1, 2022
    Dataset authored and provided by
    Communities, Housing and Digital Economy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Queensland
    Description

    Check In Qld is the Queensland Government's app used to support contact tracing and is available to download for use across a number of businesses to help keep Queenslanders COVID Safe.

    For more information on the Check In Qld app please visit https://www.covid19.qld.gov.au/check-in-qld

    Note: From 1am AEST Thursday 30 June 2022, checking in at locations in Queensland is no longer required. Data from the Qld Check in App will no longer be collected by the Queensland Government and therefore this dataset will no longer be updated.

  18. StreetSurfaceVis: a dataset of street-level imagery with annotations of road...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 20, 2025
    + more versions
    Cite
    Hoffmann, Edith (2025). StreetSurfaceVis: a dataset of street-level imagery with annotations of road surface type and quality [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11449976
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    Hoffmann, Edith
    Kapp, Alexandra
    Weigmann, Esther
    Mihaljevic, Helena
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    StreetSurfaceVis

    StreetSurfaceVis is an image dataset containing 9,122 street-level images from Germany with labels on road surface type and quality. The CSV file streetSurfaceVis_v1_0.csv contains all image metadata, and four folders contain the image files. All images are available in four different sizes based on image width: 256px, 1024px, 2048px, and the original size. Folders containing the images are named according to the respective image size. Image files are named based on the mapillary_image_id.

    You can find the corresponding publication here: StreetSurfaceVis: a dataset of crowdsourced street-level imagery with semi-automated annotations of road surface type and quality

    Image metadata

    Each CSV record contains information about one street-level image with the following attributes:

    mapillary_image_id: ID provided by Mapillary (see information below on Mapillary)

    user_id: Mapillary user ID of contributor

    user_name: Mapillary user name of contributor

    captured_at: timestamp, capture time of image

    longitude, latitude: location the image was taken at

    train: Suggested train/test split. True for training data and False for test data. The test data comes from 5 cities that are excluded from the training data.

    surface_type: Surface type of the road in the focal area (the center of the lower image half) of the image. Possible values: asphalt, concrete, paving_stones, sett, unpaved

    surface_quality: Surface quality of the road in the focal area of the image. Possible values: (1) excellent, (2) good, (3) intermediate, (4) bad, (5) very bad (see the attached Labeling Guide document for details)

    Image source

    Images are obtained from Mapillary, a crowdsourcing platform for street-level imagery. More metadata about each image can be obtained via the Mapillary API. User-generated images are shared by Mapillary under the CC-BY-SA License.

    For each image, the dataset contains the mapillary_image_id and user_name. You can access user information on the Mapillary website by https://www.mapillary.com/app/user/ and image information by https://www.mapillary.com/app/?focus=photo&pKey=
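
    As a rough illustration of the two URL patterns above, the following Python sketch appends a contributor's user_name and an image's mapillary_image_id to the respective base URLs. That the identifiers are appended directly to the end of each URL is an assumption drawn from the wording above; the function names and the example values are hypothetical.

```python
from urllib.parse import quote

# Base URLs as quoted in the dataset description.
USER_BASE = "https://www.mapillary.com/app/user/"
IMAGE_BASE = "https://www.mapillary.com/app/?focus=photo&pKey="


def mapillary_user_url(user_name):
    """Build a profile URL (assumption: user_name is appended to the base URL)."""
    return USER_BASE + quote(str(user_name), safe="")


def mapillary_image_url(mapillary_image_id):
    """Build an image URL (assumption: the image ID is the pKey value)."""
    return IMAGE_BASE + quote(str(mapillary_image_id), safe="")


# Hypothetical example values, not taken from the dataset.
print(mapillary_user_url("some_user"))
print(mapillary_image_url(123456))
```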

    If you use the provided images, please adhere to the terms of use of Mapillary.

    Instances per class

    Total number of images: 9,122

    excellent good intermediate bad very bad

    asphalt: 971, 1697, 821, 246

    concrete: 314, 350, 250, 58

    paving stones: 385, 1063, 519, 70

    sett: 129, 694, 540

    unpaved: -, 326, 387, 303

    For modeling, we recommend using a train-test split where the test data includes geospatially distinct areas, thereby ensuring the model's ability to generalize to unseen regions is tested. We propose five cities varying in population size and from different regions in Germany for testing - images are tagged accordingly.

    Number of test images (train-test split): 776
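
    The suggested split can be applied by filtering on the train column. The sketch below uses Python's csv module on a few made-up rows; it assumes the flag is stored as the text "True" or "False", which the description implies but does not state, and the function name split_records is hypothetical.

```python
import csv
import io


def split_records(csv_file):
    """Split CSV records into train and test lists using the `train` column."""
    train_rows, test_rows = [], []
    for row in csv.DictReader(csv_file):
        # Assumption: the flag is serialized as the text "True" or "False".
        is_train = row["train"].strip().lower() == "true"
        (train_rows if is_train else test_rows).append(row)
    return train_rows, test_rows


# Made-up sample rows using a subset of the documented column names.
sample = io.StringIO(
    "mapillary_image_id,train,surface_type,surface_quality\n"
    "1001,True,asphalt,good\n"
    "1002,False,sett,bad\n"
    "1003,True,unpaved,intermediate\n"
)
train_rows, test_rows = split_records(sample)
print(len(train_rows), len(test_rows))
```

    With the real data, an open handle to streetSurfaceVis_v1_0.csv would take the place of the in-memory sample.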

    Inter-rater reliability

    Three annotators labeled the dataset, such that each image was annotated by one person. Annotators were encouraged to consult each other for a second opinion when uncertain. A subset of 1,800 images was annotated by all three annotators, resulting in a Krippendorff's alpha of 0.96 for surface type and 0.74 for surface quality.

    Recommended image preprocessing

    As the label refers to the focal road in the bottom center of the street-level image, we recommend cropping images to the lower half vertically and the middle half horizontally before using them for classification tasks.

    Example code for the recommended image preprocessing in Python:

    from PIL import Image

    img = Image.open(image_path)
    width, height = img.size
    # Keep the horizontally centered half of the lower image half
    img_cropped = img.crop((0.25 * width, 0.5 * height, 0.75 * width, height))

    License

    CC-BY-SA

    Citation

    If you use this dataset, please cite as:

    Kapp, A., Hoffmann, E., Weigmann, E. et al. StreetSurfaceVis: a dataset of crowdsourced street-level imagery annotated by road surface type and quality. Sci Data 12, 92 (2025). https://doi.org/10.1038/s41597-024-04295-9

    @article{kapp_streetsurfacevis_2025,
      title = {{StreetSurfaceVis}: a dataset of crowdsourced street-level imagery annotated by road surface type and quality},
      volume = {12},
      issn = {2052-4463},
      url = {https://doi.org/10.1038/s41597-024-04295-9},
      doi = {10.1038/s41597-024-04295-9},
      pages = {92},
      number = {1},
      journaltitle = {Scientific Data},
      shortjournal = {Scientific Data},
      author = {Kapp, Alexandra and Hoffmann, Edith and Weigmann, Esther and Mihaljević, Helena},
      date = {2025-01-16},
    }

    This is part of the SurfaceAI project at the University of Applied Sciences, HTW Berlin.

    • Prof. Dr. Helena Mihaljević
    • Alexandra Kapp
    • Edith Hoffmann
    • Esther Weigmann

    Contact: surface-ai@htw-berlin.de

    https://surfaceai.github.io/surfaceai/

    Funding: SurfaceAI is an mFund project funded by the Federal Ministry for Digital and Transport, Germany.

  19. o

    Monthly time series of spatially enhanced relative humidity for Europe at...

    • data.opendatascience.eu
    • data.mundialis.de
    • +1 more
    Updated Dec 16, 2023
    + more versions
    Cite
    (2023). Monthly time series of spatially enhanced relative humidity for Europe at 1000 m resolution (2000 - 2022) derived from ERA5-Land data [Dataset]. https://data.opendatascience.eu/geonetwork/srv/search?keyword=TBE
    Dataset updated
    Dec 16, 2023
    Description

    Overview: ERA5-Land is a reanalysis dataset providing a consistent view of the evolution of land variables over several decades at an enhanced resolution compared to ERA5. ERA5-Land has been produced by replaying the land component of the ECMWF ERA5 climate reanalysis. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. Reanalysis produces data that goes several decades back in time, providing an accurate description of the climate of the past.

    Processing steps: The original hourly ERA5-Land air temperature 2 m above ground and dewpoint temperature 2 m data has been spatially enhanced from 0.1 degree to 30 arc seconds (approx. 1000 m) spatial resolution by image fusion with CHELSA data (V1.2) (https://chelsa-climate.org/). For each day we used the corresponding monthly long-term average of CHELSA. The aim was to use the fine spatial detail of CHELSA and at the same time preserve the general regional pattern and fine temporal detail of ERA5-Land. The steps included aggregation and enhancement, specifically:

    1. spatially aggregate CHELSA to the resolution of ERA5-Land
    2. calculate difference of ERA5-Land - aggregated CHELSA
    3. interpolate differences with a Gaussian filter to 30 arc seconds
    4. add the interpolated differences to CHELSA

    Subsequently, the temperature time series have been aggregated on a daily basis. From these, daily relative humidity has been calculated for the time period 01/2000 - 12/2023.

    Relative humidity (rh2m) has been calculated from air temperature 2 m above ground (Ta) and dewpoint temperature 2 m above ground (Td) using the formula for saturated water pressure from Wright (1997):

    maximum water pressure = 611.21 * exp(17.502 * Ta / (240.97 + Ta))
    actual water pressure = 611.21 * exp(17.502 * Td / (240.97 + Td))
    relative humidity = actual water pressure / maximum water pressure

    The resulting relative humidity has been aggregated to monthly averages. Resultant values have been converted to represent percent * 10, thus covering a theoretical range of [0, 1000]. The data have been reprojected to EU LAEA.

    File naming scheme (YYYY = year; MM = month): ERA5_land_rh2m_avg_monthly_YYYY_MM.tif

    Projection + EPSG code: EU LAEA (EPSG: 3035)

    Spatial extent: north: 6874000 south: -485000 west: 869000 east: 8712000

    Spatial resolution: 1000 m

    Temporal resolution: Monthly

    Pixel values: Percent * 10 (scaled to Integer; example: value 738 = 73.8 %)

    Software used: GDAL 3.2.2 and GRASS GIS 8.0.0/8.3.2

    Original ERA5-Land dataset license: https://apps.ecmwf.int/datasets/licences/copernicus/

    CHELSA climatologies (V1.2):

    Data used: Karger D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N.E, Linder, H.P., Kessler, M. (2018): Data from: Climatologies at high resolution for the earth's land surface areas. Dryad digital repository. http://dx.doi.org/doi:10.5061/dryad.kd1d4

    Original peer-reviewed publication: Karger, D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N.E., Linder, P., Kessler, M. (2017): Climatologies at high resolution for the Earth land surface areas. Scientific Data. 4 170122. https://doi.org/10.1038/sdata.2017.122

    Processed by: mundialis GmbH & Co. KG, Germany (https://www.mundialis.de/)

    Reference: Wright, J.M. (1997): Federal meteorological handbook no. 3 (FCM-H3-1997). Office of Federal Coordinator for Meteorological Services and Supporting Research. Washington, DC

    Data is also available in Latitude-Longitude/WGS84 (EPSG: 4326) projection: https://data.mundialis.de/geonetwork/srv/eng/catalog.search#/metadata/b9ce7dba-4130-428d-96f0-9089d8b9f4a5

    Acknowledgements: This study was partially funded by EU grant 874850 MOOD. The contents of this publication are the sole responsibility of the authors and don't necessarily reflect the views of the European Commission.
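
    The relative humidity formula and the percent * 10 pixel encoding described above can be sketched in Python as follows. This is a minimal illustration for single values, not the GDAL/GRASS GIS processing chain used by the authors. It assumes Ta and Td are given in degrees Celsius and that scaling to integer uses rounding; neither is stated in the description, and all function names are hypothetical.

```python
import math


def saturation_water_pressure(temp_c):
    """Saturated water pressure for a temperature (assumed to be in deg C)."""
    return 611.21 * math.exp(17.502 * temp_c / (240.97 + temp_c))


def relative_humidity(ta_c, td_c):
    """Relative humidity as a fraction: actual / maximum water pressure."""
    maximum = saturation_water_pressure(ta_c)  # from air temperature Ta
    actual = saturation_water_pressure(td_c)   # from dewpoint temperature Td
    return actual / maximum


def encode_pixel(rh_fraction):
    """Convert a fraction to percent * 10 (assumption: rounded to integer)."""
    return round(rh_fraction * 1000)


def decode_pixel(value):
    """Convert a stored pixel value back to percent, e.g. 738 -> 73.8."""
    return value / 10.0


rh = relative_humidity(20.0, 15.0)
print(encode_pixel(rh), decode_pixel(738))
```

    When air temperature and dewpoint temperature are equal, the fraction is 1 and the encoded value reaches the upper end of the [0, 1000] range.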

  20. E

    WS : Overflowing Garbage

    • data.edmonton.ca
    application/rdfxml +5
    Updated Sep 18, 2025
    Cite
    City of Edmonton (2025). WS : Overflowing Garbage [Dataset]. https://data.edmonton.ca/w/a4ha-fnwu/depj-dfck?cur=AX_Pv9UntPm
    Available download formats: application/rdfxml, csv, application/rssxml, json, xml, tsv
    Dataset updated
    Sep 18, 2025
    Dataset authored and provided by
    City of Edmonton
    Description

    Listing of all requests raised to 311, details of each call/transaction and update on the status.

    This is anonymized data intended for reporting on trends; if you have a specific reference number for which you would like to check the status, please see the "check your status" search box at https://www.edmonton.ca/programs_services/311-city-services.aspx.

    Due to the volume of data, this dataset contains 311 requests created within the current calendar year and the previous two calendar years. In January of a given year it will contain 2 years and 1 month of requests; in December of that year it will contain 3 years of requests. For historical data please search for "311 Requests (YEAR)" where "YEAR" is a year value, e.g. 2018. These will exist for years that no longer exist in this dataset, i.e. "current year minus three" will be the greatest year value that has its own single-year dataset.
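
    The rolling window described above comes down to simple year arithmetic. The Python sketch below is only an illustration of that rule; the function names are hypothetical and nothing here queries the Edmonton portal.

```python
from datetime import date


def rolling_dataset_years(today):
    """Calendar years covered by the rolling dataset: current and previous two."""
    return [today.year - 2, today.year - 1, today.year]


def latest_single_year_dataset(today):
    """Most recent year with its own dataset: current year minus three."""
    return today.year - 3


def months_covered(today):
    """Months of requests in the rolling dataset: two full years plus this year so far."""
    return 24 + today.month


day = date(2025, 9, 18)
print(rolling_dataset_years(day), latest_single_year_dataset(day), months_covered(day))
```

    For a January date this gives 25 months (2 years and 1 month), and for a December date 36 months (3 years), matching the description.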

Cite
Onur Gunes (2021). RICO dataset [Dataset]. https://www.kaggle.com/datasets/onurgunes1993/rico-dataset
RICO dataset

A Mobile App Dataset for Building Data-Driven Design Applications

Dataset updated
Dec 2, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Onur Gunes
Description

Context

Data-driven models help mobile app designers understand best practices and trends, and can be used to make predictions about design performance and support the creation of adaptive UIs. This paper presents Rico, the largest repository of mobile app designs to date, created to support five classes of data-driven applications: design search, UI layout generation, UI code generation, user interaction modeling, and user perception prediction. To create Rico, we built a system that combines crowdsourcing and automation to scalably mine design and interaction data from Android apps at runtime. The Rico dataset contains design data from more than 9.3k Android apps spanning 27 categories. It exposes visual, textual, structural, and interactive design properties of more than 66k unique UI screens. To demonstrate the kinds of applications that Rico enables, we present results from training an autoencoder for UI layout similarity, which supports query-by-example search over UIs.

Content

Rico was built by mining Android apps at runtime via human-powered and programmatic exploration. Like its predecessor ERICA, Rico’s app mining infrastructure requires no access to — or modification of — an app’s source code. Apps are downloaded from the Google Play Store and served to crowd workers through a web interface. When crowd workers use an app, the system records a user interaction trace that captures the UIs visited and the interactions performed on them. Then, an automated agent replays the trace to warm up a new copy of the app and continues the exploration programmatically, leveraging a content-agnostic similarity heuristic to efficiently discover new UI states. By combining crowdsourcing and automation, Rico can achieve higher coverage over an app’s UI states than either crawling strategy alone. In total, 13 workers recruited on UpWork spent 2,450 hours using apps on the platform over five months, producing 10,811 user interaction traces. After collecting a user trace for an app, we ran the automated crawler on the app for one hour.

Acknowledgements

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN https://interactionmining.org/rico

Inspiration

The Rico dataset is large enough to support deep learning applications. We trained an autoencoder to learn an embedding for UI layouts, and used it to annotate each UI with a 64-dimensional vector representation encoding visual layout. This vector representation can be used to compute structurally — and often semantically — similar UIs, supporting example-based search over the dataset. To create training inputs for the autoencoder that embed layout information, we constructed a new image for each UI capturing the bounding box regions of all leaf elements in its view hierarchy, differentiating between text and non-text elements. Rico’s view hierarchies obviate the need for noisy image processing or OCR techniques to create these inputs.
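
The idea of turning a view hierarchy into a layout image can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: it assumes each node is a dictionary with a bounds list of [left, top, right, bottom], an optional children list, and an optional text field, and it writes into a small integer grid rather than a real image. The field names, the grid encoding, and the toy hierarchy are all assumptions made for this example.

```python
def leaf_elements(node):
    """Yield every node without children from a nested view hierarchy."""
    children = node.get("children") or []
    if not children:
        yield node
        return
    for child in children:
        yield from leaf_elements(child)


def layout_grid(root, width, height):
    """Rasterize leaf bounding boxes: 0 = background, 1 = non-text, 2 = text."""
    grid = [[0] * width for _ in range(height)]
    for leaf in leaf_elements(root):
        left, top, right, bottom = leaf["bounds"]
        value = 2 if leaf.get("text") else 1
        # Clip each box to the grid before filling it.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                grid[y][x] = value
    return grid


# Toy hierarchy: one text leaf and one non-text leaf under a root container.
toy = {
    "bounds": [0, 0, 8, 4],
    "children": [
        {"bounds": [0, 0, 4, 2], "text": "Title"},
        {"bounds": [4, 2, 8, 4], "children": []},
    ],
}
grid = layout_grid(toy, 8, 4)
for row in grid:
    print("".join(str(cell) for cell in row))
```

Only leaf nodes are drawn, so container elements such as the root do not fill the grid; text and non-text leaves get distinct values, mirroring the distinction described above.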
