The APPS dataset consists of problems collected from different open-access coding websites such as Codeforces, Kattis, and more. The APPS benchmark attempts to mirror how human programmers are evaluated by posing coding problems in unrestricted natural language and evaluating the correctness of solutions. The problems range in difficulty from introductory to collegiate competition level and measure coding ability as well as problem-solving.
The Automated Programming Progress Standard, abbreviated APPS, consists of 10,000 coding problems in total, with 131,836 test cases for checking solutions and 232,444 ground-truth solutions written by humans. Problems can be complicated, as the average length of a problem is 293.2 words. The data are split evenly into training and test sets, with 5,000 problems each. In the test set, every problem has multiple test cases, and the average number of test cases is 21.2. Each test case is specifically designed for the corresponding problem, enabling us to rigorously evaluate program functionality.
MIT License: https://opensource.org/licenses/MIT
APPS is a benchmark for Python code generation. It includes 10,000 problems, ranging from simple one-line solutions to substantial algorithmic challenges. For more details, please refer to the paper: https://arxiv.org/pdf/2105.09938.pdf.
MIT License: https://opensource.org/licenses/MIT
APPS Dataset
Dataset Description
APPS is a benchmark for code generation with 10,000 problems. It can be used to evaluate the ability of language models to generate code from natural language specifications. You can also find the APPS metric on the Hub at codeparrot/apps_metric.
Languages
The dataset contains questions in English and code solutions in Python.
Dataset Structure
from datasets import load_dataset
load_dataset("codeparrot/apps")
… See the full description on the dataset page: https://huggingface.co/datasets/AuroraH456/apps-small.
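As a minimal sketch of how the loading call above can be used, the snippet below pulls the test split and inspects one problem. The field names ("question", "input_output", "solutions") and their JSON encoding follow the codeparrot/apps dataset card and should be checked against the dataset page if they differ.

# Minimal sketch, assuming the split and field names from the codeparrot/apps card;
# loading options may vary with your version of the datasets library.
import json
from datasets import load_dataset

apps_test = load_dataset("codeparrot/apps", split="test")
example = apps_test[0]
print(example["question"][:300])                                    # natural-language problem statement
tests = json.loads(example["input_output"]) if example["input_output"] else {}
print(len(tests.get("inputs", [])), "test cases")                   # inputs/outputs used to check solutions
solutions = json.loads(example["solutions"]) if example["solutions"] else []
print(len(solutions), "ground-truth solutions")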
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
The dating app industry grew out of the dating websites that were prominent in the early 2010s, with Match, Plenty of Fish and Zoosk leading the way with similarly designed services for mobile. This...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
AndroidWorld is an environment for building and benchmarking autonomous computer control agents.
It runs on a live Android emulator and contains a highly reproducible benchmark of 116 hand-crafted tasks across 20 apps, which are dynamically instantiated with randomly-generated parameters to create millions of unique task variations.
In addition to the built-in tasks, AndroidWorld also supports the popular web benchmark MiniWoB++ from Liu et al.
Key features of AndroidWorld include:
📝 116 diverse tasks across 20 real-world apps
🎲 Dynamic task instantiation for millions of unique variations
🏆 Durable reward signals for reliable evaluation
🌐 Open environment with access to millions of Android apps and websites
💾 Lightweight footprint (2 GB memory, 8 GB disk)
🔧 Extensible design to easily add new tasks and benchmarks
🖥️ Integration with MiniWoB++ web-based tasks
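The dynamic task instantiation described above can be pictured with the purely illustrative Python sketch below; the class and parameter names are hypothetical and are not AndroidWorld's actual API.

# Illustrative only: hypothetical names, not the AndroidWorld interface.
import random
import string

class ContactTaskTemplate:
    """A parameterized task whose every instantiation gets fresh random parameters."""
    goal_template = "Add a contact named {name} with phone number {phone}."

    def instantiate(self, seed=None):
        rng = random.Random(seed)
        name = "".join(rng.choices(string.ascii_lowercase, k=6)).title()
        phone = "".join(rng.choices(string.digits, k=10))
        # The generated parameters double as the ground truth for a durable reward check.
        return {"goal": self.goal_template.format(name=name, phone=phone),
                "expected_contact": {"name": name, "phone": phone}}

template = ContactTaskTemplate()
print(template.instantiate(seed=0)["goal"])
print(template.instantiate(seed=1)["goal"])  # a different variation of the same task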
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
In recent times, one of the most impactful applications of the growing capabilities of Large Language Models (LLMs) has been their use in Retrieval-Augmented Generation (RAG) systems. RAG applications are inherently more robust against LLM hallucinations and provide source traceability, which holds critical importance in the scientific reading and writing process. However, validating such systems is essential due to the stringent systematic requirements of the scientific domain. Existing benchmark datasets are limited in the scope of research areas they cover, often focusing on the natural sciences, which restricts their applicability and validation across other scientific fields.
To address this gap, we present a closed-question answering (QA) dataset for benchmarking scientific RAG applications. This dataset spans 34 research topics across 10 distinct areas of study. It includes 108 manually curated question-answer pairs, each annotated with answer type, difficulty level, and a gold reference along with a link to the source paper. Further details on each of these attributes can be found in the accompanying README.md file.
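As a hedged sketch of how such a closed-QA set can drive a benchmark run, the snippet below scores a RAG system by exact match; the file name and column names are assumptions, so consult the dataset's README.md for the real schema.

# Hedged sketch: "qa_pairs.csv" and the column names "question"/"answer" are assumptions;
# the dataset's README.md documents the actual file layout and annotation attributes.
import csv

def exact_match(prediction: str, gold: str) -> bool:
    return prediction.strip().lower() == gold.strip().lower()

def evaluate(qa_path: str, rag_system) -> float:
    with open(qa_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    hits = sum(exact_match(rag_system(row["question"]), row["answer"]) for row in rows)
    return hits / len(rows)

# Usage (hypothetical): accuracy = evaluate("qa_pairs.csv", my_rag_pipeline)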
Please cite the following publication when using the dataset: TBD
The publication is available at: TBD
A preprint version of the publication is available at: TBD
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Object recognition predominantly still relies on many high-quality training examples per object category. In contrast, learning new objects from only a few examples could enable many impactful applications from robotics to user personalization. Most few-shot learning research, however, has been driven by benchmark datasets that lack the high variation that these applications will face when deployed in the real world. To close this gap, we present the ORBIT dataset, grounded in a real-world application of teachable object recognizers for people who are blind/low vision. We provide a full, unfiltered dataset of 4,733 videos of 588 objects recorded by 97 people who are blind/low-vision on their mobile phones, and a benchmark dataset of 3,822 videos of 486 objects collected by 77 collectors. The code for loading the dataset, computing all benchmark metrics, and running the baseline models is available at https://github.com/microsoft/ORBIT-Dataset.
This version comprises several zip files:
- train, validation, test: benchmark dataset, organised by collector, with raw videos split into static individual frames in jpg format at 30FPS
- other: data not in the benchmark set, organised by collector, with raw videos split into static individual frames in jpg format at 30FPS (please note that the train, validation, test, and other files make up the unfiltered dataset)
- *_224: as for the benchmark, but static individual frames are scaled down to 224 pixels
- *_unfiltered_videos: full unfiltered dataset, organised by collector, in mp4 format
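As an illustration of working with the per-collector frame folders, the sketch below counts frames per collector; the exact directory layout is an assumption, and the official loaders live in the ORBIT-Dataset repository.

# Hedged sketch: assumes extracted frames sit under <split>/<collector>/.../<frame>.jpg;
# see https://github.com/microsoft/ORBIT-Dataset for the official data-loading code.
from collections import defaultdict
from pathlib import Path

def frames_per_collector(split_dir: str) -> dict:
    counts = defaultdict(int)
    root = Path(split_dir)
    for frame in root.rglob("*.jpg"):
        collector = frame.relative_to(root).parts[0]   # first folder level = collector
        counts[collector] += 1
    return dict(counts)

# Usage (hypothetical path): print(frames_per_collector("orbit_benchmark/train"))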
Eliciting-Contexts/applications-benchmark-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
Key App Engagement Rate Statistics
Session Length by App Category
Monthly Sessions by App Category
Daily Active Users by App Category
Hours Spent by App Category
Hours Spent on Apps by Country
Most Popular...
The National Flood Hazard Layer (NFHL) data incorporates all Digital Flood Insurance Rate Map (DFIRM) databases published by FEMA, and any Letters of Map Revision (LOMRs) that have been issued against those databases since their publication date. The DFIRM Database is the digital, geospatial version of the flood hazard information shown on the published paper Flood Insurance Rate Maps (FIRMs). The primary risk classifications used are the 1-percent-annual-chance flood event, the 0.2-percent-annual-chance flood event, and areas of minimal flood risk. The NFHL data are derived from Flood Insurance Studies (FISs), previously published Flood Insurance Rate Maps (FIRMs), flood hazard analyses performed in support of the FISs and FIRMs, and new mapping data where available. The FISs and FIRMs are published by the Federal Emergency Management Agency (FEMA). The specifications for the horizontal control of DFIRM data are consistent with those required for mapping at a scale of 1:12,000. The NFHL data contain layers in the Standard DFIRM datasets except for S_Label_Pt and S_Label_Ld. The NFHL is available as State or US Territory data sets. Each State or Territory data set consists of all DFIRMs and corresponding LOMRs available on the publication date of the data set.
MIT License: https://opensource.org/licenses/MIT
The benchmark data contains name, type, material, coordinates, elevations, and vertical order. All benchmarks were conventionally leveled through in accordance with the procedures set up in the Brevard County Vertical Control Manual (October 2012). The elevations of the benchmarks are based on the North American Vertical Datum of 1988 (NAVD88). The horizontal coordinates are from a handheld GPS unit and are for reference purposes only.
This is a benchmark of data loss bugs for Android apps: a public collection of 110 data loss faults that we systematically collected to facilitate research and experimentation with these problems. The benchmark is available on GitLab and includes the faulty apps, the fixed apps (when available), the test cases to automatically reproduce the problems, and additional information that may help researchers in their tasks.
Brand performance data collected from AI search platforms for the query "mHealth app retention benchmarks".
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
To address the aforementioned limitations, this paper presents ParamScope, a static analysis tool for cryptographic API misuse detection. ParamScope first obtains high-quality Intermediate Representation (IR) and comprehensive coverage of cryptographic API calls through fine-grained static analysis. It then performs assignment-driven program slicing and lightweight IR simulation to reconstruct the complete propagation and assignment chain of parameter values. This approach enables effective analysis of value assignments that can only be determined at runtime, which are often missed by existing static analysis, while also addressing the coverage limitations inherent in dynamic approaches. We evaluated ParamScope by comparing it with leading static and dynamic tools, including CryptoGuard, CrySL, and RvSec, using four cryptographic misuse benchmarks and a dataset of 327 Google Play applications. The results show that ParamScope outperforms the other tools, achieving an accuracy of 96.22% and an F1-score of 96.85%. In real-world experiments, ParamScope identifies 27% more misuse cases than the best-performing tools, while maintaining a comparable analysis time.
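For reference, accuracy and F1 follow their standard confusion-matrix definitions; the counts in the snippet below are placeholders, not numbers reported for ParamScope.

# Standard metric definitions; the TP/FP/FN/TN counts are placeholders,
# not values taken from the ParamScope evaluation.
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1_score(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

tp, fp, fn, tn = 90, 5, 3, 2   # placeholder confusion-matrix counts
print(f"accuracy={accuracy(tp, tn, fp, fn):.4f}, F1={f1_score(tp, fp, fn):.4f}")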
This map displays National Geodetic Survey (NGS) classifications of geodetic control stations for the Pennsylvania area with PennDOT county and municipal boundaries.
NOAA Charting and Geodesy: https://www.noaa.gov/charting
NOAA Survey Map: https://noaa.maps.arcgis.com/apps/webappviewer/index.html?id=190385f9aadb4cf1b0dd8759893032db
PennDOT GIS Hub: GIS Hub (arcgis.com)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Application-level monitoring frameworks, such as Kieker, provide insight into the inner workings and the dynamic behavior of software systems. However, depending on the number of monitoring probes used, these frameworks may introduce significant runtime overhead. Consequently, planning the instrumentation of continuously operating software systems requires detailed knowledge of the performance impact of each monitoring probe.
In this paper, we present our benchmark engineering approach to quantify the monitoring overhead caused by each probe under controlled and repeatable conditions. Our developed MooBench benchmark provides a basis for performance evaluations and comparisons of application-level monitoring frameworks. To evaluate its capabilities, we employ our benchmark to conduct a performance comparison of all available Kieker releases from version 0.91 to the current release 1.8.
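MooBench and Kieker are Java tools; purely as an illustration of the idea of quantifying per-probe overhead under controlled, repeated executions, a minimal Python sketch might look as follows.

# Illustration only (MooBench itself is a Java benchmark): compare the mean cost of a
# monitored operation with and without a simple monitoring probe wrapped around it.
import time

def monitored_operation():
    return sum(i * i for i in range(1000))        # stand-in workload

def with_probe(fn):
    def wrapper():
        start = time.perf_counter_ns()
        result = fn()
        _record = (fn.__name__, time.perf_counter_ns() - start)  # stand-in for writing a monitoring record
        return result
    return wrapper

def mean_ns_per_call(fn, runs=100_000):
    start = time.perf_counter_ns()
    for _ in range(runs):
        fn()
    return (time.perf_counter_ns() - start) / runs

baseline = mean_ns_per_call(monitored_operation)
instrumented = mean_ns_per_call(with_probe(monitored_operation))
print(f"baseline {baseline:.0f} ns/call, instrumented {instrumented:.0f} ns/call, "
      f"overhead {instrumented - baseline:.0f} ns/call")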
This dataset supplements the paper and contains the raw experimental data as well as several generated diagrams for each experiment.
Neural networks are potentially valuable for many of the challenges associated with MRS data. The purpose of this manuscript is to describe the AGNOSTIC dataset, which contains 259,200 synthetic 1H MRS examples for training and testing neural networks. AGNOSTIC was created using 270 basis sets that were simulated across 18 field strengths and 15 echo times. The synthetic examples were produced to resemble in vivo brain data with combinations of metabolite, macromolecule, and residual water signals, and noise. To demonstrate the utility, we apply AGNOSTIC to train two Convolutional Neural Networks (CNNs) to address out-of-voxel (OOV) echoes. A Detection Network was trained to identify the point-wise presence of OOV echoes, providing proof of concept for real-time detection. A Prediction Network was trained to reconstruct OOV echoes, allowing subtraction during post-processing. Complex OOV signals were mixed into 85% of the synthetic examples to train the two separate CNNs for detection and prediction.
All of the parameters (i.e., amplitudes, relaxation decays, etc.) are included in each NumPy zipped archive file, and the archives can be opened using Python and NumPy.
AGNOSTIC: Adaptable Generalized Neural-Network Open-source Spectroscopy Training dataset of Individual Components
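Since each example ships as a NumPy zipped archive, a minimal loading sketch looks like the following; the file name is a placeholder and the stored keys should be inspected rather than assumed.

# Hedged sketch: "agnostic_example.npz" is a placeholder name; data.files lists whatever
# arrays and parameters a given AGNOSTIC archive actually contains.
import numpy as np

data = np.load("agnostic_example.npz", allow_pickle=True)
print(data.files)                                    # names of the stored arrays/parameters
for key in data.files:
    value = data[key]
    print(key, getattr(value, "shape", type(value)))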
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Context and Aim
Deep learning in Earth Observation requires large image archives with highly reliable labels for model training and testing. However, a preferable quality standard for forest applications in Europe has not yet been determined. The TreeSatAI consortium investigated numerous sources for annotated datasets as an alternative to manually labeled training datasets.
We found the federal forest inventory of Lower Saxony, Germany represents an unseen treasure of annotated samples for training data generation. The respective 20-cm Color-infrared (CIR) imagery, which is used for forestry management through visual interpretation, constitutes an excellent baseline for deep learning tasks such as image segmentation and classification.
Description
The data archive is highly suitable for benchmarking as it represents the real-world data situation of many German forest management services. On the one hand, it has a high number of samples which are supported by the high-resolution aerial imagery. On the other hand, this data archive presents challenges, including class label imbalances between the different forest stand types.
The TreeSatAI Benchmark Archive contains:
50,381 image triplets (aerial, Sentinel-1, Sentinel-2)
synchronized time steps and locations
all original spectral bands/polarizations from the sensors
20 species classes (single labels)
12 age classes (single labels)
15 genus classes (multi labels)
60 m and 200 m patches
fixed split for train (90%) and test (10%) data
additional single labels such as English species name, genus, forest stand type, foliage type, land cover
The geoTIFF and GeoJSON files are readable in any GIS software, such as QGIS. For further information, we refer to the PDF document in the archive and publications in the reference section.
Version history
v1.0.2 - Minor bug fix multi label JSON file
v1.0.1 - Minor bug fixes in multi label JSON file and description file
v1.0.0 - First release
Citation
Ahlswede, S., Schulz, C., Gava, C., Helber, P., Bischke, B., Förster, M., Arias, F., Hees, J., Demir, B., and Kleinschmit, B.: TreeSatAI Benchmark Archive: a multi-sensor, multi-label dataset for tree species classification in remote sensing, Earth Syst. Sci. Data, 15, 681–695, https://doi.org/10.5194/essd-15-681-2023, 2023.
GitHub
Full code examples and pre-trained models from the dataset article (Ahlswede et al. 2022) using the TreeSatAI Benchmark Archive are published on the GitLab and GitHub repositories of the Remote Sensing Image Analysis (RSiM) Group (https://git.tu-berlin.de/rsim/treesat_benchmark) and the Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) (https://github.com/DFKI/treesatai_benchmark). Code examples for the sampling strategy can be made available by Christian Schulz via email request.
Folder structure
We refer to the proposed folder structure in the PDF file.
Folder “aerial” contains the aerial imagery patches derived from summertime orthophotos of the years 2011 to 2020. Patches are available in 60 x 60 m (304 x 304 pixels). Band order is near-infrared, red, green, and blue. Spatial resolution is 20 cm.
Folder “s1” contains the Sentinel-1 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is VV, VH, and VV/VH ratio. Spatial resolution is 10 m.
Folder “s2” contains the Sentinel-2 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is B02, B03, B04, B08, B05, B06, B07, B8A, B11, B12, B01, and B09. Spatial resolution is 10 m.
The folder “labels” contains a JSON string which was used for multi-labeling of the training patches. Code example of an image sample with respective proportions of 94% for Abies and 6% for Larix is: "Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]
The two files “test_filesnames.lst” and “train_filenames.lst” define the filenames used for train (90%) and test (10%) split. We refer to this fixed split for better reproducibility and comparability.
The folder “geojson” contains GeoJSON files with all the samples chosen for training patch generation (point, 60 m bounding box, 200 m bounding box).
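Putting the folder notes above together, a small sketch for reading one patch, its multi-label entry, and the fixed split list might look as follows; rasterio is just one common GeoTIFF reader, and the file paths below are placeholders that should be adapted to the structure described in the PDF.

# Hedged sketch: paths are placeholders; band order and pixel sizes follow the folder
# notes above, and the 0.07 label threshold is an arbitrary example, not a prescribed value.
import json
import rasterio

with rasterio.open("s2/60m/Abies_alba_3_834_WEFL_NLF.tif") as src:   # placeholder path
    patch = src.read()                        # array of shape (bands, rows, cols)
print(patch.shape)                            # e.g., (12, 6, 6) for a 60 m Sentinel-2 patch

with open("labels/multi_labels.json") as f:   # placeholder name for the multi-label JSON
    labels = json.load(f)                     # {"<patch>.tif": [["Genus", fraction], ...]}

with open("train_filenames.lst") as f:
    train_files = [line.strip() for line in f if line.strip()]

sample = "Abies_alba_3_834_WEFL_NLF.tif"
present = [genus for genus, fraction in labels[sample] if fraction >= 0.07]
print(sample in train_files)                  # whether this patch is in the fixed train split
print(present)                                # e.g., ['Abies'] with the example proportions above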
CAUTION: As we could not upload the aerial patches as a single zip file on Zenodo, you need to download the 20 single species files (aerial_60m_…zip) separately. Then, unzip them into a folder named “aerial” with a subfolder named “60m”. This structure is recommended for better reproducibility and comparability to the experimental results of Ahlswede et al. (2022).
Join the archive
Model training, benchmarking, algorithm development… many applications are possible! Feel free to add samples from other regions in Europe or even worldwide. Additional remote sensing data from Lidar, UAVs or aerial imagery from different time steps are very welcome. This helps the research community in the development of better deep learning and machine learning models for forest applications. If you have questions or want to share code, results, or publications using the archive, feel free to contact the authors.
Project description
This work was part of the project TreeSatAI (Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees at Infrastructures, Nature Conservation Sites and Forests). Its overall aim is the development of AI methods for the monitoring of forests and woody features on a local, regional and global scale. Based on freely available geodata from different sources (e.g., remote sensing, administration maps, and social media), prototypes will be developed for the deep learning-based extraction and classification of tree- and tree stand features. These prototypes deal with real cases from the monitoring of managed forests, nature conservation and infrastructures. The development of the resulting services by three enterprises (liveEO, Vision Impulse and LUP Potsdam) will be supported by three research institutes (German Research Center for Artificial Intelligence, TUB Remote Sensing Image Analysis Group, TUB Geoinformation in Environmental Planning Lab).
Project publications
Ahlswede, S., Schulz, C., Gava, C., Helber, P., Bischke, B., Förster, M., Arias, F., Hees, J., Demir, B., and Kleinschmit, B.: TreeSatAI Benchmark Archive: a multi-sensor, multi-label dataset for tree species classification in remote sensing, Earth System Science Data, 15, 681–695, https://doi.org/10.5194/essd-15-681-2023, 2023.
Schulz, C., Förster, M., Vulova, S. V., Rocha, A. D., and Kleinschmit, B.: Spectral-temporal traits in Sentinel-1 C-band SAR and Sentinel-2 multispectral remote sensing time series for 61 tree species in Central Europe. Remote Sensing of Environment, 307, 114162, https://doi.org/10.1016/j.rse.2024.114162, 2024.
Conference contributions
Ahlswede, S., Madam, N.T., Schulz, C., Kleinschmit, B., and Demir, B.: Weakly Supervised Semantic Segmentation of Remote Sensing Images for Tree Species Classification Based on Explanation Methods, IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, https://doi.org/10.48550/arXiv.2201.07495, 2022.
Schulz, C., Förster, M., Vulova, S., Gränzig, T., and Kleinschmit, B.: Exploring the temporal fingerprints of mid-European forest types from Sentinel-1 RVI and Sentinel-2 NDVI time series, IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, https://doi.org/10.1109/IGARSS46834.2022.9884173, 2022.
Schulz, C., Förster, M., Vulova, S., and Kleinschmit, B.: The temporal fingerprints of common European forest types from SAR and optical remote sensing data, AGU Fall Meeting, New Orleans, USA, 2021.
Kleinschmit, B., Förster, M., Schulz, C., Arias, F., Demir, B., Ahlswede, S., Aksoy, A.K., Ha Minh, T., Hees, J., Gava, C., Helber, P., Bischke, B., Habelitz, P., Frick, A., Klinke, R., Gey, S., Seidel, D., Przywarra, S., Zondag, R., and Odermatt B.: Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees and Forests, Living Planet Symposium, Bonn, Germany, 2022.
Schulz, C., Förster, M., Vulova, S., Gränzig, T., and Kleinschmit, B.: Exploring the temporal fingerprints of sixteen mid-European forest types from Sentinel-1 and Sentinel-2 time series, ForestSAT, Berlin, Germany, 2022.
Stable benchmark dataset. 20 million ratings and 465,000 tag applications applied to 27,000 movies by 138,000 users.
This dataset has three files: ratings.csv, movies.csv, and tags.csv.
ratings.csv: movies have been rated by 138,493 users on a scale of 1 to 5. This file contains the columns 'userId', 'movieId', 'rating', and 'timestamp'.
tags.csv: this file contains the columns 'userId', 'movieId', and 'tag'.
I got this data from MovieLens for a mini project. This is the link to the original dataset
You have got a ton of data. You can use it to make fun decisions, like which is the best movie series of all time, or to create a completely new story out of the data that you have.
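As a small sketch of that kind of exploration, the snippet below ranks movies by mean rating using the columns described above; it assumes movies.csv has 'movieId' and 'title' columns, and the 1,000-rating cutoff is an arbitrary choice to filter out rarely rated titles.

# Sketch using the columns described above; assumes movies.csv provides 'movieId' and
# 'title', and the 1,000-rating cutoff is an arbitrary choice, not part of the dataset.
import pandas as pd

ratings = pd.read_csv("ratings.csv")
movies = pd.read_csv("movies.csv")

stats = ratings.groupby("movieId")["rating"].agg(["mean", "count"])
top10 = (stats[stats["count"] >= 1000]
         .sort_values("mean", ascending=False)
         .head(10)
         .join(movies.set_index("movieId")["title"]))
print(top10)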