100+ datasets found
  1. Active Evaluation Software for Selection of Ground Truth Labels

    • catalog.data.gov
    • s.cnmilf.com
    Updated Jul 29, 2022
    Cite
    National Institute of Standards and Technology (2022). Active Evaluation Software for Selection of Ground Truth Labels [Dataset]. https://catalog.data.gov/dataset/active-evaluation-software-for-selection-of-ground-truth-labels-d0581
    Explore at:
    Dataset updated
    Jul 29, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    This software repository contains Aegis (Active Evaluator Germane Interactive Selector), a Python package for evaluating a machine learning system's performance (according to a metric such as accuracy) by adaptively sampling trials to label from an unlabeled test set, minimizing the number of labels needed. It includes sample (public) data as well as a simulation script that tests different label-selection strategies on already-labelled test sets. The software is configured so that users can add their own data and system outputs to test evaluation.
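
    A minimal sketch of the idea behind such label-efficient evaluation (illustrative only, not the Aegis API): label randomly sampled trials until the normal-approximation confidence interval on accuracy is narrow enough.

    # Estimate a system's accuracy from incrementally labeled trials (illustrative sketch).
    import numpy as np

    rng = np.random.default_rng(0)
    is_correct = rng.random(10_000) < 0.83  # hidden per-trial correctness (simulated)

    target_half_width = 0.02
    labeled = []
    for idx in rng.permutation(is_correct.size):
        labeled.append(is_correct[idx])  # simulate asking an annotator for one label
        n = len(labeled)
        acc = float(np.mean(labeled))
        half_width = 1.96 * np.sqrt(acc * (1 - acc) / n)
        if n >= 30 and half_width < target_half_width:
            break

    print(f"estimated accuracy {acc:.3f} +/- {half_width:.3f} from {n} labels")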

  2. Manual Ground Truth Labels Image Dataset

    • figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Gianluca Pegoraro; George Zaki (2023). Manual Ground Truth Labels Image Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.12430085.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Gianluca Pegoraro; George Zaki
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains the 16-bit manually annotated ground truth labels for the nuclei that were used either in training (labelled "Original") or inference (labelled "Biological" or "Technical") for the MRCNN and FPN2-WS networks.

  3. ground truth labels

    • kaggle.com
    Updated Oct 27, 2024
    Cite
    Aditi Pandey (2024). ground truth labels [Dataset]. https://www.kaggle.com/datasets/aditipandey7205/ground-truth-labels/suggestions
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 27, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aditi Pandey
    Description

    Dataset

    This dataset was created by Aditi Pandey

    Contents

  4. Datasets and Supporting Files for DPDGPT: Updated Labels, Features, and Ground Truths

    • zenodo.org
    Updated Nov 5, 2024
    Cite
    Fengwei Lin; Fengwei Lin (2024). Datasets and Supporting Files for DPDGPT: Updated Labels, Features, and Ground Truths [Dataset]. http://doi.org/10.5281/zenodo.14018370
    Explore at:
    Dataset updated
    Nov 5, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Fengwei Lin; Fengwei Lin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    1. CONTEXTDP_Ground_truth_info.json: Ground truth for CONTEXTDP after updating some of the labels.
    2. UpdateLabels.xlsx: A list of images whose labels we updated, containing the image name, the original label, the updated label, and the reason for the update.
    3. RICO_Ground_truth_info.json: Ground truth for RICODP after updating some of the labels.
    4. DPfeatures.xlsx: The DP features we constructed, including visual and textual features.
    5. CONTEXTDP_Web.zip: The CONTEXTDP_Web dataset we used; contains 84 UI screenshots, 7 DP categories, and 143 DP instances.
    6. CONTEXTDP_Mobile.zip: The CONTEXTDP_Mobile dataset we used; contains 175 UI screenshots, 5 DP categories, and 274 DP instances.
    7. RicoDP.zip: The RICODP dataset we used; contains 1353 UI screenshots, 14 DP categories, and 1603 DP instances.
  5. Patient characteristics.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Feb 2, 2024
    Cite
    Siddharth Guha; Abdalla Ibrahim; Qian Wu; Pengfei Geng; Yen Chou; Hao Yang; Jingchen Ma; Lin Lu; Delin Wang; Lawrence H. Schwartz; Chuan-miao Xie; Binsheng Zhao (2024). Patient characteristics. [Dataset]. http://doi.org/10.1371/journal.pone.0294581.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Feb 2, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Siddharth Guha; Abdalla Ibrahim; Qian Wu; Pengfei Geng; Yen Chou; Hao Yang; Jingchen Ma; Lin Lu; Delin Wang; Lawrence H. Schwartz; Chuan-miao Xie; Binsheng Zhao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Contrast-enhanced computed tomography scans (CECT) are routinely used in the evaluation of different clinical scenarios, including the detection and characterization of hepatocellular carcinoma (HCC). Quantitative medical image analysis has been an exponentially growing scientific field. A number of studies reported on the effects of variations in the contrast enhancement phase on the reproducibility of quantitative imaging features extracted from CT scans. The identification and labeling of phase enhancement is a time-consuming task, with a current need for an accurate automated labeling algorithm to identify the enhancement phase of CT scans. In this study, we investigated the ability of machine learning algorithms to label the phases in a dataset of 59 HCC patients scanned with a dynamic contrast-enhanced CT protocol. The ground truth labels were provided by expert radiologists. Regions of interest were defined within the aorta, the portal vein, and the liver. Mean density values were extracted from those regions of interest and used for machine learning modeling. Models were evaluated using accuracy, the area under the curve (AUC), and the Matthews correlation coefficient (MCC). We tested the algorithms on an external dataset (76 patients). Our results indicate that several supervised learning algorithms (logistic regression, random forest, etc.) performed similarly, and our developed algorithms can accurately classify the phase of contrast enhancement.
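
    A hedged sketch of the described setup with synthetic numbers (not the study's code or data): mean densities in the aorta, portal vein, and liver as features, and the enhancement phase as the target.

    # Synthetic illustration: classify contrast phase from three mean-density features.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, matthews_corrcoef
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    phases = ["non-contrast", "arterial", "portal-venous"]  # hypothetical phase set
    centers = {  # hypothetical mean HU per (aorta, portal vein, liver)
        "non-contrast": (45, 40, 55),
        "arterial": (300, 90, 70),
        "portal-venous": (160, 170, 110),
    }
    X = np.vstack([rng.normal(centers[p], 20.0, size=(100, 3)) for p in phases])
    y = np.repeat(phases, 100)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
    pred = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict(X_te)
    print("accuracy:", accuracy_score(y_te, pred), "MCC:", matthews_corrcoef(y_te, pred))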

  6. Ground Truth for DCASE 2020 Challenge Task 2 Evaluation Dataset

    • data.niaid.nih.gov
    Updated May 24, 2022
    Cite
    Noboru Harada (2022). Ground Truth for DCASE 2020 Challenge Task 2 Evaluation Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3951616
    Explore at:
    Dataset updated
    May 24, 2022
    Dataset provided by
    Yuki Nikaido
    Kaori Suefusa
    Toshiki Nakamura
    Masahito Yasuda
    Harsh Purohit
    Takashi Endo
    Keisuke Imoto
    Yohei Kawaguchi
    Ryo Tanabe
    Noboru Harada
    Yuma Koizumi
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This data is the ground truth for the "evaluation dataset" for the DCASE 2020 Challenge Task 2 "Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring" [task description].

    In the task, three datasets have been released: "development dataset", "additional training dataset", and "evaluation dataset". The evaluation dataset was the last of the three released and includes around 400 samples for each Machine Type and Machine ID used in the evaluation dataset, none of which have any condition label (i.e., normal or anomaly). This ground truth data contains the condition labels.

    Data format

    The ground truth data is a CSV file like the following:

    fan
    id_01_00000000.wav,normal_id_01_00000098.wav,0
    id_01_00000001.wav,anomaly_id_01_00000064.wav,1
    ...
    id_05_00000456.wav,anomaly_id_05_00000033.wav,1
    id_05_00000457.wav,normal_id_05_00000049.wav,0
    pump
    id_01_00000000.wav,anomaly_id_01_00000049.wav,1
    id_01_00000001.wav,anomaly_id_01_00000039.wav,1
    ...
    id_05_00000346.wav,anomaly_id_05_00000052.wav,1
    id_05_00000347.wav,anomaly_id_05_00000080.wav,1
    slider
    id_01_00000000.wav,anomaly_id_01_00000035.wav,1
    id_01_00000001.wav,anomaly_id_01_00000176.wav,1
    ...

    "Fan", "pump", "slider", etc mean "Machine Type" names. The lines following a Machine Type correspond to pairs of a wave file in the Machine Type and a condition label. The first column shows the name of a wave file. The second column shows the original name of the wave file, but this can be ignored by users. The third column shows the condition label (i.e., 0: normal or 1: anomaly).

    How to use

    A system for calculating AUC and pAUC scores for the "evaluation dataset" is available on the Github repository [URL]. The ground truth data is used by this system. For more information, please see the Github repository.

    Conditions of use

    This dataset was created jointly by NTT Corporation and Hitachi, Ltd. and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

    Publication

    If you use this dataset, please cite all the following three papers:

    Yuma Koizumi, Shoichiro Saito, Noboru Harada, Hisashi Uematsu, and Keisuke Imoto, "ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection," in Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019. [pdf]

    Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” in Proc. 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2019. [pdf]

    Yuma Koizumi, Yohei Kawaguchi, Keisuke Imoto, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe, Harsh Purohit, Kaori Suefusa, Takashi Endo, Masahiro Yasuda, and Noboru Harada, "Description and Discussion on DCASE2020 Challenge Task2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring," in Proc. 5th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020. [pdf]

    Feedback

    If there is any problem, please contact us:

    Yuma Koizumi, koizumi.yuma@ieee.org

    Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com

    Keisuke Imoto, keisuke.imoto@ieee.org

  7. Image Annotation Datasets

    • zenodo.org
    bin
    Updated Oct 15, 2021
    Cite
    Alshehri Abeer; Alshehri Abeer (2021). Image Annotation Datasets [Dataset]. http://doi.org/10.5281/zenodo.5570889
    Explore at:
    Available download formats: bin
    Dataset updated
    Oct 15, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alshehri Abeer; Alshehri Abeer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This folder contains four image annotation datasets (ESPGame, IAPR-TC12, ImageCLEF 2011, ImageCLEF 2012). Each dataset has sub-folders of training images, testing images, ground truth, and labels.

    The labels folder holds the limited set of labels the dataset can assign to an image, while the ground truth folder holds the correct labeling for each image.

  8. Inria Aerial Image Labeling Dataset

    • academictorrents.com
    bittorrent
    Updated Apr 27, 2019
    Cite
    Emmanuel Maggiori and Yuliya Tarabalka and Guillaume Charpiat and Pierre Alliez (2019). Inria Aerial Image Labeling Dataset [Dataset]. https://academictorrents.com/details/cf445f6073540af0803ee345f46294f088e7bba5
    Explore at:
    Available download formats: bittorrent (20957265875)
    Dataset updated
    Apr 27, 2019
    Dataset authored and provided by
    Emmanuel Maggiori and Yuliya Tarabalka and Guillaume Charpiat and Pierre Alliez
    License

    No license specified (https://academictorrents.com/nolicensespecified)

    Description

    The Inria Aerial Image Labeling Dataset addresses a core topic in remote sensing: the automatic pixelwise labeling of aerial imagery. Dataset features:

    • Coverage of 810 km² (405 km² for training and 405 km² for testing)
    • Aerial orthorectified color imagery with a spatial resolution of 0.3 m
    • Ground truth data for two semantic classes: building and not building (publicly disclosed only for the training subset)

    The images cover dissimilar urban settlements, ranging from densely populated areas (e.g., San Francisco's financial district) to alpine towns (e.g., Lienz in Austrian Tyrol). Instead of splitting adjacent portions of the same images into the training and test subsets, different cities are included in each subset. For example, images over Chicago are included in the training set (and not in the test set), and images over San Francisco are included in the test set (and not in the training set). The ultimate goal of this dataset is to assess the generalization power of the techniques.

  9. Apple CT Data: Ground truth reconstructions - 5 of 6

    • zenodo.org
    zip
    Updated Mar 4, 2021
    Cite
    Sophia Bethany Coban; Sophia Bethany Coban; Vladyslav Andriiashen; Vladyslav Andriiashen; Poulami Somanya Ganguly; Poulami Somanya Ganguly (2021). Apple CT Data: Ground truth reconstructions - 5 of 6 [Dataset]. http://doi.org/10.5281/zenodo.4576202
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 4, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sophia Bethany Coban; Sophia Bethany Coban; Vladyslav Andriiashen; Vladyslav Andriiashen; Poulami Somanya Ganguly; Poulami Somanya Ganguly
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary
    This submission is a supplementary material to the article [Coban 2020b]. As part of the manuscript, we release three simulated parallel-beam tomographic datasets of 94 apples with internal defects, the ground truth reconstructions and two defect label files.

    Description
    This Zenodo upload contains the ground truth reconstructed slices for each apple. In total, there are 72192 reconstructed slices, which have been divided into 6 separate Zenodo submissions.

    The simulated parallel-beam datasets and defect label files are also available through this project, via a separate Zenodo upload: 10.5281/zenodo.4212301.

    Apparatus
    The dataset is acquired using the custom-built and highly flexible CT scanner, FleX-ray Laboratory, developed by TESCAN-XRE and located at CWI in Amsterdam. This apparatus consists of a cone-beam microfocus X-ray point source that projects polychromatic X-rays onto a 1944-by-1536 pixel, 14-bit flat detector panel. Full details can be found in [Coban 2020a].

    Ground Truth Generation

    We reconstructed the raw tomographic data, which was captured at a sample resolution of 54.2 µm over 360 degrees in circular and continuous motion in a cone-beam setup. A total of 1200 projections were collected, distributed evenly over the full circle. The raw tomographic data is available upon request.

    The ground truth reconstructed slices were generated based on Conjugate Gradient Least Squares (CGLS) reconstruction of each apple. The voxel grid in the reconstruction was 972px x 972px x 768px. The resolution in the ground truth reconstructions remained unchanged.

    All ground truth reconstructed slices are in .tif format. Each file is named "appleNo_sliceNo.tif".

    List of Contents
    The contents of the submission are given below.

    • ground_truths_5: This folder contains reconstructed slices of 16 apples

    Additional Links
    These datasets are produced by the Computational Imaging group at Centrum Wiskunde & Informatica (CI-CWI). For any relevant Python/MATLAB scripts for the FleX-ray datasets, we refer the reader to our group's GitHub page.

    Contact Details
    For more information or guidance in using these datasets, please get in touch with

    • s.b.coban [at] cwi.nl
    • vladyslav.andriiashen [at] cwi.nl
    • poulami.ganguly [at] cwi.nl

    Acknowledgments
    We acknowledge GREEFA for supplying the apples and for further discussions.

  10. Cultural ecosystem service labels for photos from Flickr and Twitter using artificial intelligence models

    • produccioncientifica.ugr.es
    • zenodo.org
    Updated 2025
    Cite
    Alcaraz-Segura, Domingo; del Águila, Ana; Elghouat, Akram; Guouman Ferreyra, Franco; Khaldi, Rohaifa; López Pacheco, Domingo Jesús; Martínez-López, Javier; Merino Ceballos, Manuel; Molina Cabrera, Daniel; Moreno Llorca, Ricardo Antonio; Navarro, Carlos Javier; Nieto Pacheco, Irati; Pistón, Nuria; Rodríguez Díaz, Francisco Javier; Ros-Candeira, Andrea; Sissoko Cusio, Aixa; Stetiukha Romanovna, Taisiia; Tabik, Siham; Zamora Rodriguez, Regino (2025). Cultural ecosystem service labels for photos from Flickr and Twitter using artificial intelligence models [Dataset]. https://produccioncientifica.ugr.es/documentos/688b602417bb6239d2d48ea5
    Explore at:
    Dataset updated
    2025
    Authors
    Alcaraz-Segura, Domingo; del Águila, Ana; Elghouat, Akram; Guouman Ferreyra, Franco; Khaldi, Rohaifa; López Pacheco, Domingo Jesús; Martínez-López, Javier; Merino Ceballos, Manuel; Molina Cabrera, Daniel; Moreno Llorca, Ricardo Antonio; Navarro, Carlos Javier; Nieto Pacheco, Irati; Pistón, Nuria; Rodríguez Díaz, Francisco Javier; Ros-Candeira, Andrea; Sissoko Cusio, Aixa; Stetiukha Romanovna, Taisiia; Tabik, Siham; Zamora Rodriguez, Regino
    Description

    Description:

    Dataset of photos downloaded from Flickr (241,582 photos) and Twitter-X (1,035,488 photos) labeled by different artificial intelligence models and validated by labels assigned by human experts.

    The entire dataset was labeled using different AI models. First, we applied a large language model (GPT-4.1 from OpenAI) and Llava 1.6 (on a subset of the data) to extract semantic labels from the image content, using prompts refined through prompt engineering.

    In parallel, we used the base version of DINO (a self-supervised vision transformer model), fine-tuned with a subset of human expert-labeled images from our own dataset, to generate inferences for the entire image collection.

    We also incorporated labels derived from expert vision models pre-trained on established datasets such as ImageNet, COCO, Places365, and Nature, which provided complementary classification information.
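
    A minimal sketch of how such prompt-based labeling might be scripted (illustrative, not the dataset's GPT_Label_local_files script; it assumes the official OpenAI Python client, and the prompt and photo URL are hypothetical):

    # Ask a vision-capable model to pick one Stoten category (Table 1 below) per photo.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    STOTEN = ["Cultural", "Fauna/Flora", "Gastronomy", "Nature & Landscape",
              "Not relevant", "Recreational", "Religious", "Rural tourism",
              "Sports", "Sun and beach", "Urban"]

    def label_photo(image_url):
        response = client.chat.completions.create(
            model="gpt-4.1",  # model named in the dataset description
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Classify this photo into exactly one of these categories "
                             f"and answer with the category only: {', '.join(STOTEN)}"},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }],
        )
        return response.choices[0].message.content.strip()

    print(label_photo("https://example.com/photo.jpg"))  # hypothetical URL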

    The labels used correspond to two category systems (Table 1):

    Table 1. Categories used in social media photo tagging. Stoten is based on the scientific framework proposed by Moreno-Llorca et al. (2020); Level 3 is a hierarchical tagging system developed by our team to provide greater thematic detail, especially suited to the identification of Cultural Ecosystem Services.

    Stoten: Cultural; Fauna/Flora; Gastronomy; Nature & Landscape; Not relevant; Recreational; Religious; Rural tourism; Sports; Sun and beach; Urban.

    Level3: Accommodation; Air activities; Animals; Breakwater; Bridge; Commerce facilities; Cities; Clouds; Dam; Dock; Fungus; Heritage and culture; Knowledge; Landscapes; Lighthouse; Not relevant; Other abiotic features; Plants; Roads; Shelter; Skies; Spiritual, symbolic and related connotations; Terrestrial activities; Towns and villages; Tracks and trails; Vegetation and habitats; Vehicle; Water activities; Wind farm; Winter activities.

    Table 2. Table of contents of the dataset

    • AI models
      • DINO
        • model (.pt and .pth): Model fine-tuned with a subset of expert-labeled images
      • Expert models
        • CES_label_tree (.csv): Equivalence table used to assign labels generated by expert models to our categories of interest (Stoten and Level3)
      • LLMs GPT and Llava prompts
        • GPT_Label_local_files (.py): Python script used for labeling photos using OpenAI models (in our case, the GPT 4.1 model)
        • Level3_GPT_LLava_7_prompts_used (.txt): Seven prompts used for photo tagging using GPT 4.1 and Llava 1.6
        • Stoten_GPT_LLava_7_prompts_used (.txt): Seven prompts used for photo tagging with Stoten using GPT 4.1 and Llava 1.6
        • Stoten_Level3_categories (.csv): Seven prompts used for photo tagging with Level 3 using GPT 4.1 and Llava 1.6
    • Flickr
      • AI based labels
        • DINO
          • Flickr_DINO_all (.csv): Inferences for all Flickr photos from the DINO model trained with the ground truth
        • Expert models
          • Flickr_expert_models_all (.csv): Labels generated by expert models for the entire database
        • GPT
          • Flickr_GPT_all (.csv): Database of Flickr photos tagged with CES using OpenAI's GPT-4.1 model
          • Flickr_GPT_7_prompts_8192 (.csv): Subset of the Flickr photo database with CES-related tags assigned by the GPT 4.1 model, where 7 prompts are tested for Stoten and Level 3
        • Llava 1.6
          • Flickr_Llava_1-6 (.csv): Subset of the Flickr photo database with CES-related tags assigned by the Llava 1.6 model, where 7 prompts are tested for Stoten and Level 3
      • Ground truth
        • Ground Truth labels
          • Flickr_Database_Labeled_1082 (.csv): Labels assigned by human experts, after rounds of review and consensus, for both Stoten and Level 3, for 1082 Flickr photos
          • Flickr_Database_Labeled_7110 (.csv): Labels assigned by human experts, after rounds of review and consensus, for both Stoten and Level 3, for 7110 Flickr photos
          • Flickr_Database_Labeled_8192 (.csv): Union of the two databases above
        • Ground Truth photos
          • 1082 (.jpg/.png): Photos labeled by human experts, selected to be representative of different parks, different levels of protection, and different CES
          • 7110 (.jpg/.png): Photos labeled by human experts, selected to be representative of different parks, different levels of protection, and different CES
        • Human labels
          • Flickr_DataBase_Labeled_1082_expert1_AS (.csv): Tags assigned by expert 1 for both Stoten and Level 3, for 1082 Flickr photos
          • Flickr_DataBase_Labeled_1082_expert2_FG (.csv): Tags assigned by expert 2 for both Stoten and Level 3, for 1082 Flickr photos
          • Flickr_DataBase_Labeled_7110_expert1_CN (.csv): Tags assigned by expert 1 for both Stoten and Level 3, for 7110 Flickr photos
    • Twitter
      • AI based labels
        • DINO
          • Twitter_DINO_all (.csv): Inferences for all Twitter photos from the DINO model trained with the ground truth
        • Expert models
          • Twitter_expert_models_all (.csv): Labels generated by expert models for the entire database
        • GPT
          • Twitter_GPT_all (.csv): Database of Twitter photos tagged with CES using OpenAI's GPT-4.1 model
          • Twitter_GPT_7_prompts_150 (.csv): Subset of the Twitter photo database with CES-related tags assigned by the GPT 4.1 model, where 7 prompts are tested for Stoten and Level 3
        • Llava 1.6
          • Twitter_Llava_1-6 (.csv): Subset of the Twitter photo database with CES-related tags assigned by the Llava 1.6 model, where 7 prompts are tested for Stoten and Level 3
      • Ground truth
        • Ground Truth labels
          • Twitter_Database_Labeled_150 (.csv): Labels assigned by human experts, after rounds of review and consensus, for both Stoten and Level 3, for 150 Twitter photos
          • Twitter_Database_Labeled_6804 (.csv): Labels assigned by human experts, after rounds of review and consensus, for both Stoten and Level 3, for 6804 Twitter photos
        • Ground Truth photos
          • 150 (.jpg/.png): Photos labeled by human experts, selected to be representative of different parks, different levels of protection, and different CES
          • 6804 (.jpg/.png): Photos labeled by human experts, selected to be representative of different parks, different levels of protection, and different CES
        • Human labels
          • Flickr_DataBase_Labeled_150_7experts (.csv): Tags assigned by 7 experts for both Stoten and Level 3, for 150 Twitter photos
          • Flickr_DataBase_Labeled_6804_expert1_FG (.csv): Tags assigned by expert 2 for both Stoten and Level 3, for 6804 Twitter photos

    References:

    Moreno-Llorca, R., Méndez, P. F., Ros-Candeira, A., Alcaraz-Segura, D., Santamaría, L., Ramos-Ridao, Á. F., ... & Vaz, A. S. (2020). Evaluating tourist profiles and nature-based experiences in Biosphere Reserves using Flickr: Matches and mismatches between online social surveys and photo content analysis. Science of the Total Environment, 737, 140067. https://doi.org/10.1016/j.scitotenv.2020.140067

  11. MATLAB source code for developing Ground Truth Dataset, Semantic Segmentation, and Evaluation for LJMU Lumbar Spine MRI Dataset

    • data.mendeley.com
    Updated Mar 26, 2019
    Cite
    Sud Sudirman (2019). MATLAB source code for developing Ground Truth Dataset, Semantic Segmentation, and Evaluation for LJMU Lumbar Spine MRI Dataset [Dataset]. http://doi.org/10.17632/8cp2cp7km8.1
    Explore at:
    Dataset updated
    Mar 26, 2019
    Authors
    Sud Sudirman
    License

    GNU General Public License v3.0 (http://www.gnu.org/licenses/gpl-3.0.en.html)

    Description

    This file contains the MATLAB source code for developing Ground Truth Dataset, Semantic Segmentation, and Evaluation for Lumbar Spine MRI Dataset. It has the file structure necessary for the execution of the code. Please download the MRI Dataset and the Ground Truth label Image dataset separately and unzip them inside the LJMU Lumbar Spine MRI Dataset and Software\99 Workspace\ folder.

  12. Dataset: An Open Combinatorial Diffraction Dataset Including Consensus Human...

    • catalog.data.gov
    • data.nist.gov
    Updated Jul 29, 2022
    Cite
    National Institute of Standards and Technology (2022). Dataset: An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models [Dataset]. https://catalog.data.gov/dataset/dataset-an-open-combinatorial-diffraction-dataset-including-consensus-human-and-machine-le-0de06
    Explore at:
    Dataset updated
    Jul 29, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    The open dataset, software, and other files accompanying the manuscript "An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models," submitted for publication to Integrated Materials and Manufacturing Innovations.

    Machine learning and autonomy are increasingly prevalent in materials science, but existing models are often trained or tuned using idealized data as absolute ground truths. In actual materials science, "ground truth" is often a matter of interpretation and is more readily determined by consensus. Here we present the data, software, and other files for a study using as-obtained diffraction data as a test case for evaluating the performance of machine learning models in the presence of differing expert opinions. We demonstrate that experts with similar backgrounds can disagree greatly even for something as intuitive as using diffraction to identify the start and end of a phase transformation. We then use a logarithmic likelihood method to evaluate the performance of machine learning models in relation to the consensus expert labels and their variance. We further illustrate this method's efficacy in ranking a number of state-of-the-art phase mapping algorithms. We propose a materials data challenge centered around the problem of evaluating models based on consensus with uncertainty. The data, labels, and code used in this study are all available online at data.gov, and the interested reader is encouraged to replicate and improve the existing models or to propose alternative methods for evaluating algorithmic performance.
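
    A hedged sketch of scoring a model against consensus labels with a log-likelihood (the paper's exact formulation may differ): treat the expert consensus for each sample as a Gaussian with the experts' mean and variance, and sum the log-densities of the model's predictions.

    # Score hypothetical model outputs under a Gaussian consensus of expert labels.
    import numpy as np
    from scipy.stats import norm

    expert_labels = np.array([  # hypothetical: rows = experts, columns = samples
        [10.0, 52.0, 31.0],
        [12.0, 50.0, 29.0],
        [ 9.0, 55.0, 33.0],
    ])
    mu = expert_labels.mean(axis=0)
    sigma = expert_labels.std(axis=0, ddof=1)

    model_pred = np.array([11.0, 60.0, 30.0])  # hypothetical model outputs
    score = norm.logpdf(model_pred, loc=mu, scale=sigma).sum()
    print(f"log-likelihood vs consensus: {score:.2f}")  # higher is better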

  13. getting-started-labeled-photos

    • huggingface.co
    Updated Jan 3, 2025
    Cite
    Voxel51 (2025). getting-started-labeled-photos [Dataset]. https://huggingface.co/datasets/Voxel51/getting-started-labeled-photos
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 3, 2025
    Dataset authored and provided by
    Voxel51
    Description

    Dataset Card for predicted_labels

    These photos are used in the FiftyOne getting started webinar. Each image has a prediction label that was generated by self-supervised classification with an OpenCLIP model (https://github.com/thesteve0/fiftyone-getting-started/blob/main/5_generating_labels.py). The predictions were then manually cleaned to produce the ground truth labels (https://github.com/thesteve0/fiftyone-getting-started/blob/main/6_clean_labels.md). They are 300 public domain photos… See the full description on the dataset page: https://huggingface.co/datasets/Voxel51/getting-started-labeled-photos.

  14. Ground Truth for DCASE 2021 Challenge Task 2 Evaluation Dataset

    • data.niaid.nih.gov
    Updated Aug 26, 2021
    Cite
    Takashi Endo (2021). Ground Truth for DCASE 2021 Challenge Task 2 Evaluation Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5257673
    Explore at:
    Dataset updated
    Aug 26, 2021
    Dataset provided by
    Harsh Purohit
    Takashi Endo
    Keisuke Imoto
    Kota Dohi
    Yohei Kawaguchi
    Ryo Tanabe
    Noboru Harada
    Daisuke Niizumi
    Yuma Koizumi
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This data is the ground truth for the "evaluation dataset" for the DCASE 2021 Challenge Task 2 "Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions".

    In the task, three datasets have been released: "development dataset", "additional training dataset", and "evaluation dataset". The evaluation dataset was the last of the three released and includes around 200 samples for each machine type, section index, and domain, none of which have a condition label (i.e., normal or anomaly). This ground truth dataset contains the condition labels.

    Data format

    The CSV file for each machine type, section index, and domain includes the ground truth data like the following:

    section_03_source_test_0000.wav,1
    section_03_source_test_0001.wav,1
    ...
    section_03_source_test_0198.wav,0
    section_03_source_test_0199.wav,1

    The first column shows the name of a wave file. The second column shows the condition label (i.e., 0: normal or 1: anomaly).

    How to use

    A script for calculating the AUC, pAUC, precision, recall, and F1 scores for the "evaluation dataset" is available on the Github repository [URL]. The ground truth data are used by this system. For more information, please see the Github repository.
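
    The same evaluation can be sketched directly with scikit-learn (a hedged illustration, not the official script; note that sklearn's max_fpr partial AUC is standardized and may differ slightly from the challenge's pAUC definition):

    # Compute AUC and a partial AUC for one section's ground-truth CSV,
    # given anomaly scores produced by your own system (paths/scores hypothetical).
    import csv
    from sklearn.metrics import roc_auc_score

    def section_auc(gt_csv_path, anomaly_scores):
        y_true, y_score = [], []
        with open(gt_csv_path) as f:
            for name, label in csv.reader(f):  # rows: section_03_source_test_0000.wav,1
                y_true.append(int(label))
                y_score.append(anomaly_scores[name])
        auc = roc_auc_score(y_true, y_score)
        pauc = roc_auc_score(y_true, y_score, max_fpr=0.1)  # low-FPR region
        return auc, pauc

    # usage (hypothetical file and score dict):
    # auc, pauc = section_auc("fan_section_03_gt.csv", my_scores)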

    Conditions of use

    This dataset was created jointly by Hitachi, Ltd. and NTT Corporation and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

    Publication

    If you use this dataset, please cite all the following three papers:

    Yohei Kawaguchi, Keisuke Imoto, Yuma Koizumi, Noboru Harada, Daisuke Niizumi, Kota Dohi, Ryo Tanabe, Harsh Purohit, and Takashi Endo, "Description and Discussion on DCASE 2021 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions," in arXiv e-prints: 2106.04492, 2021. [URL]

    Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, Shoichiro Saito, "ToyADMOS2: Another Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection under Domain Shift Conditions," in arXiv e-prints: 2106.02369, 2021. [URL]

    Ryo Tanabe, Harsh Purohit, Kota Dohi, Takashi Endo, Yuki Nikaido, Toshiki Nakamura, and Yohei Kawaguchi, "MIMII DUE: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection with Domain Shifts due to Changes in Operational and Environmental Conditions," in arXiv e-prints: 2105.02702, 2021. [URL]

    Feedback

    If there is any problem, please contact us:

    Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com

    Daisuke Niizumi, daisuke.niizumi.dt@hco.ntt.co.jp

    Keisuke Imoto, keisuke.imoto@ieee.org

  15. The 95% confidence intervals of accuracy and MCC of the supervised learning models for the main dataset

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Feb 2, 2024
    Cite
    Siddharth Guha; Abdalla Ibrahim; Qian Wu; Pengfei Geng; Yen Chou; Hao Yang; Jingchen Ma; Lin Lu; Delin Wang; Lawrence H. Schwartz; Chuan-miao Xie; Binsheng Zhao (2024). The 95% confidence intervals of accuracy and MCC of the supervised learning models for the main dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0294581.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Feb 2, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Siddharth Guha; Abdalla Ibrahim; Qian Wu; Pengfei Geng; Yen Chou; Hao Yang; Jingchen Ma; Lin Lu; Delin Wang; Lawrence H. Schwartz; Chuan-miao Xie; Binsheng Zhao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The 95% confidence intervals of accuracy and MCC of the supervised learning models for the main dataset.

  16. Dataset for paper "1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis"

    • zenodo.org
    zip, application/gzip, bin
    Updated Jun 23, 2022
    Cite
    Ang Jia; Ang Jia (2022). Dataset for paper "1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis" [Dataset]. http://doi.org/10.5281/zenodo.6675280
    Explore at:
    Available download formats: zip, application/gzip, bin
    Dataset updated
    Jun 23, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ang Jia; Ang Jia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the dataset constructed and used in the paper "1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis".

    We release it with its ground truth labels to facilitate further research in binary similarity analysis under function inlining. We also welcome others to find mislabeled entries in this dataset and fix them. We hope our dataset helps researchers conduct their studies and improve binary similarity analysis techniques.

    Binaries of dataset-I can also be accessed from https://github.com/SoftSec-KAIST/BinKit. It is named the "normal dataset".

    Any mislabeling can be reported at https://github.com/island255/TOSEM2022.

    If you use this dataset, please cite our paper, "1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis".

  17. FSDKaggle2018

    • explore.openaire.eu
    • opendatalab.com
    Updated Jan 29, 2019
    Cite
    Eduardo Fonseca; Xavier Favory; Jordi Pons; Frederic Font; Manoj Plakal; Daniel P. W. Ellis; Xavier Serra (2019). FSDKaggle2018 [Dataset]. http://doi.org/10.5281/zenodo.2552860
    Explore at:
    Dataset updated
    Jan 29, 2019
    Authors
    Eduardo Fonseca; Xavier Favory; Jordi Pons; Frederic Font; Manoj Plakal; Daniel P. W. Ellis; Xavier Serra
    Description

    FSDKaggle2018 is an audio dataset containing 11,073 audio files annotated with 41 labels of the AudioSet Ontology. FSDKaggle2018 has been used for the DCASE Challenge 2018 Task 2, which was run as a Kaggle competition titled Freesound General-Purpose Audio Tagging Challenge.

    Citation

    If you use the FSDKaggle2018 dataset or part of it, please cite our DCASE 2018 paper:

    Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Favory, Jordi Pons, Xavier Serra. "General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline". Proceedings of the DCASE 2018 Workshop (2018).

    You can also consider citing our ISMIR 2017 paper, which describes how we gathered the manual annotations included in FSDKaggle2018:

    Eduardo Fonseca, Jordi Pons, Xavier Favory, Frederic Font, Dmitry Bogdanov, Andres Ferraro, Sergio Oramas, Alastair Porter, and Xavier Serra, "Freesound Datasets: A Platform for the Creation of Open Audio Datasets", in Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017.

    Contact

    You are welcome to contact Eduardo Fonseca at eduardo.fonseca@upf.edu should you have any questions.

    About this dataset

    Freesound Dataset Kaggle 2018 (or FSDKaggle2018 for short) is an audio dataset containing 11,073 audio files annotated with 41 labels of the AudioSet Ontology [1]. FSDKaggle2018 has been used for Task 2 of the Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2018. Please visit the DCASE2018 Challenge Task 2 website for more information. This Task was hosted on the Kaggle platform as a competition titled Freesound General-Purpose Audio Tagging Challenge. It was organized by researchers from the Music Technology Group of Universitat Pompeu Fabra and from Google Research's Machine Perception Team.

    The goal of this competition was to build an audio tagging system that can categorize an audio clip as belonging to one of a set of 41 diverse categories drawn from the AudioSet Ontology.

    All audio samples in this dataset are gathered from Freesound [2] and are provided here as uncompressed PCM 16 bit, 44.1 kHz, mono audio files. Note that because Freesound content is collaboratively contributed, recording quality and techniques can vary widely. The ground truth data provided in this dataset has been obtained after a data labeling process which is described below in the Data labeling process section.

    FSDKaggle2018 clips are unequally distributed in the following 41 categories of the AudioSet Ontology: "Acoustic_guitar", "Applause", "Bark", "Bass_drum", "Burping_or_eructation", "Bus", "Cello", "Chime", "Clarinet", "Computer_keyboard", "Cough", "Cowbell", "Double_bass", "Drawer_open_or_close", "Electric_piano", "Fart", "Finger_snapping", "Fireworks", "Flute", "Glockenspiel", "Gong", "Gunshot_or_gunfire", "Harmonica", "Hi-hat", "Keys_jangling", "Knock", "Laughter", "Meow", "Microwave_oven", "Oboe", "Saxophone", "Scissors", "Shatter", "Snare_drum", "Squeak", "Tambourine", "Tearing", "Telephone", "Trumpet", "Violin_or_fiddle", "Writing".

    Some other relevant characteristics of FSDKaggle2018:

    • The dataset is split into a train set and a test set.
    • The train set is meant to be for system development and includes ~9.5k samples unequally distributed among 41 categories. The minimum number of audio samples per category in the train set is 94, and the maximum 300. The duration of the audio samples ranges from 300ms to 30s due to the diversity of the sound categories and the preferences of Freesound users when recording sounds. The total duration of the train set is roughly 18h.
    • Out of the ~9.5k samples from the train set, ~3.7k have manually-verified ground truth annotations and ~5.8k have non-verified annotations. The non-verified annotations of the train set have a quality estimate of at least 65-70% in each category. Check out the Data labeling process section below for more information about this aspect. Non-verified annotations in the train set are properly flagged in train.csv so that participants can opt to use this information during the development of their systems (see the sketch after this section).
    • The test set is composed of 1.6k samples with manually-verified annotations and with a category distribution similar to that of the train set. The total duration of the test set is roughly 2h.
    • All audio samples in this dataset have a single label (i.e., they are only annotated with one label). Check out the Data labeling process section below for more information about this aspect. A single label should be predicted for each file in the test set.

    Data labeling process

    The data labeling process started from a manual mapping between Freesound tags and AudioSet Ontology categories (or labels), which was carried out by researchers at the Music Technology Group, Universitat Pompeu Fabra, Barcelona. Using this mapping, a number of Freesound audio samples were automatically annotated with labels from the AudioSet Onto...
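
    Since non-verified annotations are flagged in train.csv, a participant can filter on that flag. A minimal pandas sketch, assuming the Kaggle release's column names (fname, label, manually_verified):

    # Keep only the manually-verified training annotations.
    # Assumes train.csv has columns fname, label, manually_verified (Kaggle release).
    import pandas as pd

    train = pd.read_csv("train.csv")
    verified = train[train["manually_verified"] == 1]
    print(verified["label"].value_counts())  # per-category counts, verified only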

  18. Labelled Weed Detection Images for Hot Peppers

    • cloud.csiss.gmu.edu
    • data.amerigeoss.org
    Updated Jun 13, 2023
    Cite
    Trinidad and Tobago (2023). Labelled Weed Detection Images for Hot Peppers [Dataset]. https://cloud.csiss.gmu.edu/uddi/dataset/weeddetection_labelled_hotpeppers
    Explore at:
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    Trinidad and Tobago
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data contains the corresponding labelled images of Capsicum annuum that are included in the "Unlabelled Weed Detection Images for Hot Peppers" data set on this site. This data set contains the labels 0, 1 and 2, which can be displayed by assigning a unique pixel value (recommended: 0, 60, 255) to each occurrence of a label. These images can be utilised as ground truth labels for machine learning and data exploration. The labels represent three categories, namely weed, crop and background. The labels were assigned by a team of trained individuals from Trinidad and Tobago using the Image Labeller app in MATLAB's Computer Vision Toolbox.
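
    A minimal display sketch (file names are hypothetical; the label-to-category order follows the description above):

    # Map label values {0, 1, 2} to the recommended display values {0, 60, 255}.
    import numpy as np
    from PIL import Image

    lut = np.array([0, 60, 255], dtype=np.uint8)         # one display value per label
    label_img = np.array(Image.open("label_0001.png"))   # hypothetical label image
    Image.fromarray(lut[label_img]).save("label_0001_display.png")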

  19. Apple CT Data: Simulated parallel-beam tomographic datasets

    • zenodo.org
    csv, txt, zip
    Updated Mar 13, 2021
    Cite
    Sophia Bethany Coban; Sophia Bethany Coban; Vladyslav Andriiashen; Vladyslav Andriiashen; Poulami Somanya Ganguly; Poulami Somanya Ganguly (2021). Apple CT Data: Simulated parallel-beam tomographic datasets [Dataset]. http://doi.org/10.5281/zenodo.4212301
    Explore at:
    Available download formats: zip, csv, txt
    Dataset updated
    Mar 13, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sophia Bethany Coban; Sophia Bethany Coban; Vladyslav Andriiashen; Vladyslav Andriiashen; Poulami Somanya Ganguly; Poulami Somanya Ganguly
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary

    This submission is a supplementary material to the article [Coban 2020b]. As part of the manuscript, we release three simulated parallel-beam tomographic datasets of 94 apples with internal defects, the ground truth reconstructions and two defect label files.

    Description

    This Zenodo upload contains the three simulated datasets, Datasets A-C, and the two defect label files. The three versions are a noiseless simulation (Dataset A); simulation with added Gaussian noise (Dataset B), and with scattering noise (Dataset C). The datasets are based on real 3D X-ray CT data and their subsequent volume reconstructions.

    The defect label files contain tables of pixel numbers assigned to each defect present in the apples. Each row in the table corresponds to a single apple, and each column is a defect across the 94 apples. The two defect label files are apple_defects_full.csv and apple_defects_partial.csv: the former contains defect information for a full apple, and the latter for a selection of slices in an apple.

    The ground truth reconstructions are also available through this project, via 6 separate Zenodo uploads.

    The datasets are prepared for development and testing of data-driven, learning-based image reconstruction, segmentation and post-processing methods.

    Simulation Setup

    Each projection in Datasets A-C contains 50 angles, taken over 180 degrees. The projections are evenly distributed over the half circle, with an angular increment of 3.6 degrees. The size of the projections is 50px x 1377px. In total we have 62792 projections for Datasets A and B, and 7520 for Dataset C. All projections are in .tif format.

    We also include proj_angs.txt containing a list of projection angles in radians. These projection angles are the same for all three datasets.
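
    A short loading sketch (library choices and file names are ours/hypothetical, not prescribed by the upload):

    # Load the projection angles, one simulated projection, and the defect tables.
    import numpy as np
    import pandas as pd
    import tifffile

    angles = np.loadtxt("proj_angs.txt")                     # 50 angles, in radians
    proj = tifffile.imread("Dataset_A/data_31101_0042.tif")  # hypothetical appleNo/sliceNo
    defects = pd.read_csv("apple_defects_full.csv")          # rows: apples, columns: defects
    print(angles.shape, proj.shape, defects.shape)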

    List of Contents

    The contents of the submission are given below.

    • Dataset_A:
      • Simulated parallel-beam projections, each named as "data_appleNo_sliceNo.tif".
    • Dataset_B:
      • Dataset A with added 5% Gaussian noise, each named "data_noisy_appleNo_sliceNo.tif".
    • Dataset_C:
      • Dataset A with added scattering for a selection of slices, each named "data_appleNo_sliceNo.tif".
    • proj_angs.txt: List of projection angles for all datasets.
    • apple_defects_full.csv: The label defect information (in pixel numbers) for each apple for all slices.
    • apple_defects_partial.csv: The label defect information (in pixel numbers) for each apple for a selection of slices.

    Additional Links

    These datasets are produced by the Computational Imaging group at Centrum Wiskunde & Informatica (CI-CWI). For any relevant Python/MATLAB scripts for the FleX-ray datasets, we refer the reader to our group's GitHub page.

    Contact Details

    For more information or guidance in using these datasets, please get in touch with

    • s.b.coban [at] cwi.nl
    • vladyslav.andriiashen [at] cwi.nl
    • poulami.ganguly [at] cwi.nl

    Acknowledgments

    We acknowledge GREEFA for supplying the apples and for further discussions.

  20. FSD50K

    • data.niaid.nih.gov
    • opendatalab.com
    Updated Apr 24, 2022
    Cite
    Eduardo Fonseca (2022). FSD50K [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4060431
    Explore at:
    Dataset updated
    Apr 24, 2022
    Dataset provided by
    Jordi Pons
    Xavier Favory
    Xavier Serra
    Frederic Font
    Eduardo Fonseca
    Description

    FSD50K is an open dataset of human-labeled sound events containing 51,197 Freesound clips unequally distributed in 200 classes drawn from the AudioSet Ontology. FSD50K has been created at the Music Technology Group of Universitat Pompeu Fabra.

    Citation

    If you use the FSD50K dataset, or part of it, please cite our TASLP paper (available from [arXiv] [TASLP]):

    @article{fonseca2022FSD50K,
      title={{FSD50K}: an open dataset of human-labeled sound events},
      author={Fonseca, Eduardo and Favory, Xavier and Pons, Jordi and Font, Frederic and Serra, Xavier},
      journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
      volume={30},
      pages={829--852},
      year={2022},
      publisher={IEEE}
    }

    Paper update: This paper has been published in TASLP at the beginning of 2022. The accepted camera-ready version includes a number of improvements with respect to the initial submission. The main updates include: estimation of the amount of label noise in FSD50K, SNR comparison between FSD50K and AudioSet, improved description of evaluation metrics including equations, clarification of experimental methodology and some results, some content moved to Appendix for readability. The TASLP-accepted camera-ready version is available from arXiv (in particular, it is v2 in arXiv, displayed by default).

    Data curators

    Eduardo Fonseca, Xavier Favory, Jordi Pons, Mercedes Collado, Ceren Can, Rachit Gupta, Javier Arredondo, Gary Avendano and Sara Fernandez

    Contact

    You are welcome to contact Eduardo Fonseca should you have any questions, at efonseca@google.com.

    ABOUT FSD50K

    Freesound Dataset 50k (or FSD50K for short) is an open dataset of human-labeled sound events containing 51,197 Freesound clips unequally distributed in 200 classes drawn from the AudioSet Ontology [1]. FSD50K has been created at the Music Technology Group of Universitat Pompeu Fabra.

    What follows is a brief summary of FSD50K's most important characteristics. Please have a look at our paper (especially Section 4) to extend the basic information provided here with relevant details for its usage, as well as discussion, limitations, applications and more.

    Basic characteristics:

    FSD50K contains 51,197 audio clips from Freesound, totalling 108.3 hours of multi-labeled audio

    The dataset encompasses 200 sound classes (144 leaf nodes and 56 intermediate nodes) hierarchically organized with a subset of the AudioSet Ontology.

    The audio content is composed mainly of sound events produced by physical sound sources and production mechanisms, including human sounds, sounds of things, animals, natural sounds, musical instruments and more. The vocabulary can be inspected in vocabulary.csv (see Files section below).

    The acoustic material has been manually labeled by humans following a data labeling process using the Freesound Annotator platform [2].

    Clips are of variable length from 0.3 to 30s, due to the diversity of the sound classes and the preferences of Freesound users when recording sounds.

    All clips are provided as uncompressed PCM 16 bit 44.1 kHz mono audio files.

    Ground truth labels are provided at the clip-level (i.e., weak labels).

    The dataset poses mainly a large-vocabulary multi-label sound event classification problem, but also allows development and evaluation of a variety of machine listening approaches (see Sec. 4D in our paper).

    In addition to audio clips and ground truth, additional metadata is made available (including raw annotations, sound predominance ratings, Freesound metadata, and more), allowing a variety of analyses and sound event research tasks (see Files section below).

    The audio clips are grouped into a development (dev) set and an evaluation (eval) set such that they do not have clips from the same Freesound uploader.

    Dev set:

    40,966 audio clips totalling 80.4 hours of audio

    Avg duration/clip: 7.1s

    114,271 smeared labels (i.e., labels propagated in the upwards direction to the root of the ontology)

    Labels are correct but could be occasionally incomplete

    A train/validation split is provided (Sec. 3H). If a different split is used, it should be specified for reproducibility and fair comparability of results (see Sec. 5C of our paper)

    Eval set:

    10,231 audio clips totalling 27.9 hours of audio

    Avg duration/clip: 9.8s

    38,596 smeared labels

    Eval set is labeled exhaustively (labels are correct and complete for the considered vocabulary)

    Note: All classes in FSD50K are represented in AudioSet, except Crash cymbal, Human group actions, Human voice, Respiratory sounds, and Domestic sounds, home sounds.

    LICENSE

    All audio clips in FSD50K are released under Creative Commons (CC) licenses. Each clip has its own license as defined by the clip uploader in Freesound, some of them requiring attribution to their original authors and some forbidding further commercial reuse. Specifically:

    The development set consists of 40,966 clips with the following licenses:

    CC0: 14,959
    CC-BY: 20,017
    CC-BY-NC: 4,616
    CC Sampling+: 1,374

    The evaluation set consists of 10,231 clips with the following licenses:

    CC0: 4,914
    CC-BY: 3,489
    CC-BY-NC: 1,425
    CC Sampling+: 403

    For attribution purposes and to facilitate attribution of these files to third parties, we include a mapping from the audio clips to their corresponding licenses. The licenses are specified in the files dev_clips_info_FSD50K.json and eval_clips_info_FSD50K.json.

    In addition, FSD50K as a whole is the result of a curation process and it has an additional license: FSD50K is released under CC-BY. This license is specified in the LICENSE-DATASET file downloaded with the FSD50K.doc zip file. We note that the choice of one license for the dataset as a whole is not straightforward as it comprises items with different licenses (such as audio clips, annotations, or data split). The choice of a global license in these cases may warrant further investigation (e.g., by someone with a background in copyright law).

    Usage of FSD50K for commercial purposes:

    If you'd like to use FSD50K for commercial purposes, please contact Eduardo Fonseca and Frederic Font at efonseca@google.com and frederic.font@upf.edu.

    Also, if you are interested in using FSD50K for machine learning competitions, please contact Eduardo Fonseca and Frederic Font at efonseca@google.com and frederic.font@upf.edu.

    FILES

    FSD50K can be downloaded as a series of zip files with the following directory structure:

    root
    │
    └───FSD50K.dev_audio/                    Audio clips in the dev set
    │
    └───FSD50K.eval_audio/                   Audio clips in the eval set
    │
    └───FSD50K.ground_truth/                 Files for FSD50K's ground truth
    │    └─── dev.csv                        Ground truth for the dev set
    │    └─── eval.csv                       Ground truth for the eval set
    │    └─── vocabulary.csv                 List of 200 sound classes in FSD50K
    │
    └───FSD50K.metadata/                     Files for additional metadata
    │    └─── class_info_FSD50K.json         Metadata about the sound classes
    │    └─── dev_clips_info_FSD50K.json     Metadata about the dev clips
    │    └─── eval_clips_info_FSD50K.json    Metadata about the eval clips
    │    └─── pp_pnp_ratings_FSD50K.json     PP/PNP ratings
    │    └─── collection/                    Files for the sound collection format
    │
    └───FSD50K.doc/
         └─── README.md                      The dataset description file that you are reading
         └─── LICENSE-DATASET                License of the FSD50K dataset as an entity

    Each row (i.e. audio clip) of dev.csv contains the following information:

    fname: the file name without the .wav extension, e.g., the fname 64760 corresponds to the file 64760.wav on disk. This number is the Freesound id. We always use Freesound ids as filenames.

    labels: the class labels (i.e., the ground truth). Note these class labels are smeared, i.e., the labels have been propagated in the upwards direction to the root of the ontology. More details about the label smearing process can be found in Appendix D of our paper.

    mids: the Freebase identifiers corresponding to the class labels, as defined in the AudioSet Ontology specification

    split: whether the clip belongs to train or val (see paper for details on the proposed split)

    Rows in eval.csv follow the same format, except that there is no split column.

    Note: We use a slightly different format than AudioSet for the naming of class labels in order to avoid potential problems with spaces, commas, etc. Example: we use Accelerating_and_revving_and_vroom instead of the original Accelerating, revving, vroom. You can go back to the original AudioSet naming using the information provided in vocabulary.csv (class label and mid for the 200 classes of FSD50K) and the AudioSet Ontology specification.
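
    A minimal pandas sketch for loading the ground truth (paths follow the directory layout above; it assumes the multi-label fields are comma-separated within their columns):

    # Load dev.csv, split smeared labels into lists, and select the train split.
    import pandas as pd

    dev = pd.read_csv("FSD50K.ground_truth/dev.csv")
    dev["labels"] = dev["labels"].str.split(",")
    dev["mids"] = dev["mids"].str.split(",")
    train = dev[dev["split"] == "train"]
    print(train[["fname", "labels"]].head())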

    Files with additional metadata (FSD50K.metadata/)

    To allow a variety of analysis and approaches with FSD50K, we provide the following metadata:

    class_info_FSD50K.json: python dictionary where each entry corresponds to one sound class and contains: FAQs utilized during the annotation of the class, examples (representative audio clips), and verification_examples (audio clips presented to raters during annotation as a quality control mechanism). Audio clips are described by the Freesound id. Note: It may be that some of these examples are not included in the FSD50K release.

    dev_clips_info_FSD50K.json: python dictionary where each entry corresponds to one dev clip and contains: title,
