This software repository contains Aegis (Active Evaluator Germane Interactive Selector), a Python package for evaluating a machine learning system's performance (according to a metric such as accuracy) by adaptively sampling trials to label from an unlabeled test set, minimizing the number of labels needed. The repository includes sample (public) data as well as a simulation script that tests different label-selection strategies on already-labelled test sets. The software is configured so that users can add their own data and system outputs to test evaluation.
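The snippet below is a minimal sketch of the kind of adaptive-sampling evaluation loop the package automates, not the Aegis API itself; the function names, the uncertainty-sampling strategy, and the toy data are illustrative assumptions.

import numpy as np

# Sketch (not the Aegis API): estimate a system's accuracy by iteratively
# choosing which unlabeled trials to send to an annotator. The selection
# strategy here is plain uncertainty sampling on the system's own
# confidence scores; the package's real strategies may differ.
rng = np.random.default_rng(0)

def simulate(confidences, correct, budget=100):
    """confidences: system confidence per trial; correct: hidden 0/1 labels
    (known here only because this is a simulation on a labelled test set)."""
    unlabeled = list(range(len(confidences)))
    outcomes = []
    estimate = float("nan")
    for _ in range(budget):
        # pick the trial the system is least confident about
        i = min(unlabeled, key=lambda j: confidences[j])
        unlabeled.remove(i)
        outcomes.append(correct[i])          # "ask the annotator"
        estimate = float(np.mean(outcomes))  # naive accuracy estimate
    return estimate

# toy data: 1,000 trials with confidences and hidden correctness
conf = rng.uniform(size=1000)
corr = (rng.uniform(size=1000) < conf).astype(int)
print("accuracy estimate after 100 labels:", simulate(conf, corr))

Note that the naive mean over an uncertainty-sampled subset is biased; a real active evaluator would correct for the sampling distribution (for example with importance weighting), which is exactly the kind of strategy the simulation script is meant to compare.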
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains the 16-bit manually annotated ground-truth labels for the nuclei that were used either in training (labelled as "Original") or in inference (labelled as "Biological" or "Technical") for the MRCNN and FPN2-WS networks.
This dataset was created by Aditi Pandey
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Contrast-enhanced computed tomography scans (CECT) are routinely used in the evaluation of different clinical scenarios, including the detection and characterization of hepatocellular carcinoma (HCC). Quantitative medical image analysis has been an exponentially growing scientific field. A number of studies reported on the effects of variations in the contrast enhancement phase on the reproducibility of quantitative imaging features extracted from CT scans. The identification and labeling of phase enhancement is a time-consuming task, with a current need for an accurate automated labeling algorithm to identify the enhancement phase of CT scans. In this study, we investigated the ability of machine learning algorithms to label the phases in a dataset of 59 HCC patients scanned with a dynamic contrast-enhanced CT protocol. The ground truth labels were provided by expert radiologists. Regions of interest were defined within the aorta, the portal vein, and the liver. Mean density values were extracted from those regions of interest and used for machine learning modeling. Models were evaluated using accuracy, the area under the curve (AUC), and the Matthews correlation coefficient (MCC). We tested the algorithms on an external dataset (76 patients). Our results indicate that several supervised learning algorithms (logistic regression, random forest, etc.) performed similarly, and our developed algorithms can accurately classify the phase of contrast enhancement.
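The following is a hedged sketch of the modeling setup described above: three mean-density features (aorta, portal vein, liver) predicting the enhancement phase. The synthetic features, the number of phases, and the train/test split are illustrative assumptions, not the study's actual data or protocol.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, matthews_corrcoef, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 59 * 4                                    # e.g. 59 patients, 4 phases each (assumption)
X = rng.normal(loc=[150, 120, 80], scale=30, size=(n, 3))   # mean HU: aorta, portal vein, liver
y = rng.integers(0, 4, size=n)                # phase label (e.g. 0-3)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for model in (LogisticRegression(max_iter=1000), RandomForestClassifier(random_state=0)):
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    proba = model.predict_proba(X_te)
    print(type(model).__name__,
          "acc=%.2f" % accuracy_score(y_te, pred),
          "auc=%.2f" % roc_auc_score(y_te, proba, multi_class="ovr"),
          "mcc=%.2f" % matthews_corrcoef(y_te, pred))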
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This data is the ground truth for the "evaluation dataset" for the DCASE 2020 Challenge Task 2 "Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring" [task description].
In the task, three datasets have been released: "development dataset", "additional training dataset", and "evaluation dataset". The evaluation dataset was the last of the three released and includes around 400 samples for each Machine Type and Machine ID used in the evaluation dataset, none of which have any condition label (i.e., normal or anomaly). This ground truth data contains the condition labels.
Data format
The ground truth data is a CSV file like the following:
fan
id_01_00000000.wav,normal_id_01_00000098.wav,0
id_01_00000001.wav,anomaly_id_01_00000064.wav,1
...
id_05_00000456.wav,anomaly_id_05_00000033.wav,1
id_05_00000457.wav,normal_id_05_00000049.wav,0
pump
id_01_00000000.wav,anomaly_id_01_00000049.wav,1
id_01_00000001.wav,anomaly_id_01_00000039.wav,1
...
id_05_00000346.wav,anomaly_id_05_00000052.wav,1
id_05_00000347.wav,anomaly_id_05_00000080.wav,1
slider
id_01_00000000.wav,anomaly_id_01_00000035.wav,1
id_01_00000001.wav,anomaly_id_01_00000176.wav,1
...
"fan", "pump", "slider", etc. are "Machine Type" names. The lines following a Machine Type correspond to pairs of a wave file of that Machine Type and a condition label. The first column shows the name of a wave file. The second column shows the original name of the wave file and can be ignored by users. The third column shows the condition label (i.e., 0: normal or 1: anomaly).
How to use
A system for calculating AUC and pAUC scores for the "evaluation dataset" is available on the Github repository [URL]. The ground truth data is used by this system. For more information, please see the Github repository.
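As a companion to the official system, here is a minimal sketch (not the official evaluation code) of how the ground-truth CSV described above could be parsed in Python. The file name is hypothetical, and the parsing assumes that single-column lines are Machine Type headers.

import csv

def load_ground_truth(path):
    """Parse the ground-truth CSV into {machine_type: {file_name: 0 or 1}}."""
    labels, machine_type = {}, None
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue
            if len(row) == 1:                       # e.g. "fan", "pump", "slider"
                machine_type = row[0].strip()
                labels[machine_type] = {}
            else:
                fname, _original_name, condition = row
                labels[machine_type][fname] = int(condition)   # 0: normal, 1: anomaly
    return labels

# gt = load_ground_truth("eval_data_list.csv")      # hypothetical file name
# print(sum(gt["fan"].values()), "anomalous fan clips")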
Conditions of use
This dataset was created jointly by NTT Corporation and Hitachi, Ltd. and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
Publication
If you use this dataset, please cite all the following three papers:
Yuma Koizumi, Shoichiro Saito, Noboru Harada, Hisashi Uematsu, and Keisuke Imoto, "ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection," in Proc. of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019. [pdf]
Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” in Proc. 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2019. [pdf]
Yuma Koizumi, Yohei Kawaguchi, Keisuke Imoto, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe, Harsh Purohit, Kaori Suefusa, Takashi Endo, Masahiro Yasuda, and Noboru Harada, "Description and Discussion on DCASE2020 Challenge Task2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring," in Proc. 5th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020. [pdf]
Feedback
If there is any problem, please contact us:
Yuma Koizumi, koizumi.yuma@ieee.org
Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com
Keisuke Imoto, keisuke.imoto@ieee.org
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This folder contains four image annotation datasets (ESPGame, IAPR-TC12, ImageCLEF 2011, ImageCLEF 2012). Each dataset has sub-folders of training images, testing images, ground truth, and labels.
The labels are the limited set of labels the dataset could assign to an image, while the ground truth is the correct labeling for each image.
No license specified: https://academictorrents.com/nolicensespecified
The Inria Aerial Image Labeling dataset addresses a core topic in remote sensing: the automatic pixelwise labeling of aerial imagery. Dataset features: coverage of 810 km² (405 km² for training and 405 km² for testing); aerial orthorectified color imagery with a spatial resolution of 0.3 m; ground truth data for two semantic classes, building and not building (publicly disclosed only for the training subset). The images cover dissimilar urban settlements, ranging from densely populated areas (e.g., San Francisco's financial district) to alpine towns (e.g., Lienz in Austrian Tyrol). Instead of splitting adjacent portions of the same images into the training and test subsets, different cities are included in each of the subsets. For example, images over Chicago are included in the training set (and not in the test set), and images over San Francisco are included in the test set (and not in the training set). The ultimate goal of this dataset is to assess the generalization power of the techniques.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary
This submission is a supplementary material to the article [Coban 2020b]. As part of the manuscript, we release three simulated parallel-beam tomographic datasets of 94 apples with internal defects, the ground truth reconstructions and two defect label files.
Description
This Zenodo upload contains the ground truth reconstructed slices for each apple. In total, there are 72192 reconstructed slices, which have been divided into 6 separate submissions:
The simulated parallel-beam datasets and defect label files are also available through this project, via a separate Zenodo upload: 10.5281/zenodo.4212301.
Apparatus
The dataset is acquired using the custom-built and highly flexible CT scanner, FleX-ray Laboratory, developed by TESCAN-XRE, located at CWI in Amsterdam. This apparatus consists of a cone-beam microfocus X-ray point source that projects polychromatic X-rays onto a 1944-by-1536 pixels, 14-bit, flat detector panel. Full details can be found in [Coban 2020a].
Ground Truth Generation
We reconstructed the raw tomographic data, which was captured at a sample resolution of 54.2 µm over 360 degrees in a circular, continuous motion in a cone-beam setup. A total of 1200 projections were collected, distributed evenly over the full circle. The raw tomographic data is available upon request.
The ground truth reconstructed slices were generated based on Conjugate Gradient Least Squares (CGLS) reconstruction of each apple. The voxel grid in the reconstruction was 972px x 972px x 768px. The resolution in the ground truth reconstructions remained unchanged.
All ground truth reconstructed slices are in .tif format. Each file is named "appleNo_sliceNo.tif".
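As a small illustration, the sketch below assembles one apple's ground-truth volume from the "appleNo_sliceNo.tif" slices described above. It uses the tifffile package; the directory name, the example apple number, and the exact zero-padding of the file names are assumptions.

from pathlib import Path
import numpy as np
import tifffile

def load_apple(slice_dir, apple_no):
    # collect all slices belonging to one apple and stack them into a volume
    files = sorted(Path(slice_dir).glob(f"{apple_no}_*.tif"))
    return np.stack([tifffile.imread(f) for f in files])

# vol = load_apple("ground_truth_slices", "31101")   # hypothetical folder and apple number
# print(vol.shape, vol.dtype)                        # e.g. (768, 972, 972) for a full apple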
List of Contents
The contents of the submission are given below.
Additional Links
These datasets are produced by the Computational Imaging group at Centrum Wiskunde & Informatica (CI-CWI). For any relevant Python/MATLAB scripts for the FleX-ray datasets, we refer the reader to our group's GitHub page.
Contact Details
For more information or guidance in using these datasets, please get in touch with
Acknowledgments
We acknowledge GREEFA for supplying the apples and further discussions.
Description:
Dataset of photos downloaded from Flickr (241,582 photos) and Twitter-X (1,035,488 photos), labeled by different artificial intelligence models and validated against labels assigned by human experts.
The entire dataset was labeled using different AI models. First, we applied a large language model (GPT-4.1 from OpenAI) and Llava 1.6 (on a subset of the data) to extract semantic labels from the image content, based on prompts refined through prompt engineering.
In parallel, we used the base version of DINO (a self-supervised vision transformer model), fine-tuned with a subset of human expert-labeled images from our own dataset, to generate inferences for the entire image collection.
We also incorporated labels derived from expert vision models pre-trained on established datasets such as ImageNet, COCO, Places365, and Nature, which provided complementary classification information.
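To make the LLM-labeling step concrete, here is a hedged sketch (not the released GPT_Label_local_files.py script) that sends one photo to an OpenAI vision-capable model with a single-category classification prompt. The prompt wording and the single-call structure are placeholders; the dataset's seven tuned prompts are provided separately in the prompt files listed below.

import base64
from openai import OpenAI

STOTEN = ["Cultural", "Fauna/Flora", "Gastronomy", "Nature & Landscape", "Not relevant",
          "Recreational", "Religious", "Rural tourism", "Sports", "Sun and beach", "Urban"]

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def label_photo(path, model="gpt-4.1"):
    # encode the image and ask the model for exactly one Stoten category
    b64 = base64.b64encode(open(path, "rb").read()).decode()
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Classify this photo into exactly one of these categories: " + ", ".join(STOTEN)},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content.strip()

# print(label_photo("example.jpg"))   # hypothetical file name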
The labels used correspond to two categories (Table 1):
Table 1. Categories used in social media photo tagging. Stoten is based on the scientific framework proposed by Moreno-Llorca et al. (2020); Level 3 is a hierarchical tagging system developed by our team to provide greater thematic detail, especially suited to the identification of Cultural Ecosystem Services (CES).
Stoten: Cultural; Fauna/Flora; Gastronomy; Nature & Landscape; Not relevant; Recreational; Religious; Rural tourism; Sports; Sun and beach; Urban
Level3: Accommodation; Air activities; Animals; Breakwater; Bridge; Commerce facilities; Cities; Clouds; Dam; Dock; Fungus; Heritage and culture; Knowledge; Landscapes; Lighthouse; Not relevant; Other abiotic features; Plants; Roads; Shelter; Skies; Spiritual, symbolic and related connotations; Terrestrial activities; Towns and villages; Tracks and trails; Vegetation and habitats; Vehicle; Water activities; Wind farm; Winter activities
Table 2. Contents of the dataset (folder / file, format, description)

AI models
- DINO
  - model (.pt and .pth): Model fine-tuned with a subset of expert-labeled images
- Expert models
  - CES_label_tree (.csv): Equivalence table used to assign labels generated by expert models to our categories of interest (Stoten and Level3)
- LLMs GPT and Llava prompts
  - GPT_Label_local_files (.py): Python script used for labeling photos using OpenAI models (in our case, the GPT-4.1 model)
  - Level3_GPT_LLava_7_prompts_used (.txt): Seven prompts used for photo tagging using GPT-4.1 and Llava 1.6
  - Stoten_GPT_LLava_7_prompts_used (.txt): Seven prompts used for photo tagging with Stoten using GPT-4.1 and Llava 1.6
  - Stoten_Level3_categories (.csv): Seven prompts used for photo tagging with Level 3 using GPT-4.1 and Llava 1.6

Flickr
- AI based labels
  - DINO
    - Flickr_DINO_all (.csv): Inferences for all Flickr photos from the DINO model trained with the ground truth
  - Expert models
    - Flickr_expert_models_all (.csv): Labels generated by expert models for the entire database
  - GPT
    - Flickr_GPT_all (.csv): Database of Flickr photos tagged with CES using OpenAI's GPT-4.1 model
    - Flickr_GPT_7_prompts_8192 (.csv): Subset of the Flickr photo database with CES-related tags assigned by the GPT-4.1 model, where 7 prompts are tested for Stoten and Level 3
  - Llava 1.6
    - Flickr_Llava_1-6 (.csv): Subset of the Flickr photo database with CES-related tags assigned by the Llava 1.6 model, where 7 prompts are tested for Stoten and Level 3
- Ground truth
  - Ground Truth labels
    - Flickr_Database_Labeled_1082 (.csv): Labels assigned by human experts, after rounds of review and consensus, for both Stoten and Level 3, for 1082 Flickr photos
    - Flickr_Database_Labeled_7110 (.csv): Labels assigned by human experts, after rounds of review and consensus, for both Stoten and Level 3, for 7110 Flickr photos
    - Flickr_Database_Labeled_8192 (.csv): Union of the two labeled databases above
  - Ground Truth photos
    - 1082 (.jpg/.png): Photos labeled by human experts; selected to be representative of different parks, with different levels of protection, and representative of different CES
    - 7110 (.jpg/.png): Photos labeled by human experts; selected to be representative of different parks, with different levels of protection, and representative of different CES
  - Human labels
    - Flickr_DataBase_Labeled_1082_expert1_AS (.csv): Tags assigned by expert 1 for both Stoten and Level 3, for 1082 Flickr photos
    - Flickr_DataBase_Labeled_1082_expert2_FG (.csv): Tags assigned by expert 2 for both Stoten and Level 3, for 1082 Flickr photos
    - Flickr_DataBase_Labeled_7110_expert1_CN (.csv): Tags assigned by expert 1 for both Stoten and Level 3, for 7110 Flickr photos

Twitter
- AI based labels
  - DINO
    - Twitter_DINO_all (.csv): Inferences for all Twitter photos from the DINO model trained with the ground truth
  - Expert models
    - Twitter_expert_models_all (.csv): Labels generated by expert models for the entire database
  - GPT
    - Twitter_GPT_all (.csv): Database of Twitter photos tagged with CES using OpenAI's GPT-4.1 model
    - Twitter_GPT_7_prompts_150 (.csv): Subset of the Twitter photo database with CES-related tags assigned by the GPT-4.1 model, where 7 prompts are tested for Stoten and Level 3
  - Llava 1.6
    - Twitter_Llava_1-6 (.csv): Subset of the Twitter photo database with CES-related tags assigned by the Llava 1.6 model, where 7 prompts are tested for Stoten and Level 3
- Ground truth
  - Ground Truth labels
    - Twitter_Database_Labeled_150 (.csv): Labels assigned by human experts, after rounds of review and consensus, for both Stoten and Level 3, for 150 Twitter photos
    - Twitter_Database_Labeled_6804 (.csv): Labels assigned by human experts, after rounds of review and consensus, for both Stoten and Level 3, for 6804 Twitter photos
  - Ground Truth photos
    - 150 (.jpg/.png): Photos labeled by human experts; selected to be representative of different parks, with different levels of protection, and representative of different CES
    - 6804 (.jpg/.png): Photos labeled by human experts; selected to be representative of different parks, with different levels of protection, and representative of different CES
  - Human labels
    - Flickr_DataBase_Labeled_150_7experts (.csv): Tags assigned by 7 experts for both Stoten and Level 3, for 150 Twitter photos
    - Flickr_DataBase_Labeled_6804_expert1_FG (.csv): Tags assigned by expert 2 for both Stoten and Level 3, for 6804 Twitter photos
References:
Moreno-Llorca, R., Méndez, P. F., Ros-Candeira, A., Alcaraz-Segura, D., Santamaría, L., Ramos-Ridao, Á. F., ... & Vaz, A. S. (2020). Evaluating tourist profiles and nature-based experiences in Biosphere Reserves using Flickr: Matches and mismatches between online social surveys and photo content analysis. Science of the Total Environment, 737, 140067. https://doi.org/10.1016/j.scitotenv.2020.140067
GNU General Public License v3.0: http://www.gnu.org/licenses/gpl-3.0.en.html
This file contains the MATLAB source code for developing Ground Truth Dataset, Semantic Segmentation, and Evaluation for Lumbar Spine MRI Dataset. It has the file structure necessary for the execution of the code. Please download the MRI Dataset and the Ground Truth label Image dataset separately and unzip them inside the LJMU Lumbar Spine MRI Dataset and Software\99 Workspace\ folder.
The open dataset, software, and other files accompanying the manuscript "An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models," submitted for publication to Integrated Materials and Manufacturing Innovations.

Machine learning and autonomy are increasingly prevalent in materials science, but existing models are often trained or tuned using idealized data as absolute ground truths. In actual materials science, "ground truth" is often a matter of interpretation and is more readily determined by consensus. Here we present the data, software, and other files for a study using as-obtained diffraction data as a test case for evaluating the performance of machine learning models in the presence of differing expert opinions. We demonstrate that experts with similar backgrounds can disagree greatly even for something as intuitive as using diffraction to identify the start and end of a phase transformation. We then use a logarithmic likelihood method to evaluate the performance of machine learning models in relation to the consensus expert labels and their variance. We further illustrate this method's efficacy in ranking a number of state-of-the-art phase mapping algorithms. We propose a materials data challenge centered around the problem of evaluating models based on consensus with uncertainty. The data, labels, and code used in this study are all available online at data.gov, and the interested reader is encouraged to replicate and improve the existing models or to propose alternative methods for evaluating algorithmic performance.
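For intuition, here is a hedged sketch of one possible log-likelihood evaluation against consensus labels with uncertainty; it is not necessarily the paper's exact formulation. Each sample has an expert-consensus mean and standard deviation, and a model's point prediction is scored by its Gaussian log-likelihood under that consensus distribution.

import numpy as np
from scipy.stats import norm

def consensus_log_likelihood(predictions, expert_labels):
    """predictions: (n,) model outputs; expert_labels: (n, n_experts) expert labels."""
    mu = expert_labels.mean(axis=1)
    sigma = expert_labels.std(axis=1, ddof=1) + 1e-6   # avoid zero variance
    return float(np.mean(norm.logpdf(predictions, loc=mu, scale=sigma)))

# Toy example: 3 samples, 4 experts each (illustrative numbers)
experts = np.array([[0.20, 0.25, 0.30, 0.22],
                    [0.50, 0.55, 0.48, 0.60],
                    [0.90, 0.85, 0.95, 0.88]])
model_a = np.array([0.24, 0.53, 0.90])
model_b = np.array([0.10, 0.70, 0.60])
print(consensus_log_likelihood(model_a, experts))   # higher = closer to the consensus
print(consensus_log_likelihood(model_b, experts))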
Dataset Card for predicted_labels
These photos are used in the FiftyOne getting started webinar. The images have a prediction label that was generated by self-supervised classification with an OpenCLIP model: https://github.com/thesteve0/fiftyone-getting-started/blob/main/5_generating_labels.py. They were then manually cleaned to produce the ground truth label: https://github.com/thesteve0/fiftyone-getting-started/blob/main/6_clean_labels.md. They are 300 public domain photos… See the full description on the dataset page: https://huggingface.co/datasets/Voxel51/getting-started-labeled-photos.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This data is the ground truth for the "evaluation dataset" for the DCASE 2021 Challenge Task 2 "Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions".
In the task, three datasets have been released: "development dataset", "additional training dataset", and "evaluation dataset". The evaluation dataset was the last of the three released and includes around 200 samples for each machine type, section index, and domain, none of which have a condition label (i.e., normal or anomaly). This ground truth dataset contains the condition labels.
Data format
The CSV file for each machine type, section index, and domain includes the ground truth data like the following:
section_03_source_test_0000.wav,1
section_03_source_test_0001.wav,1
...
section_03_source_test_0198.wav,0
section_03_source_test_0199.wav,1
The first column shows the name of a wave file. The second column shows the condition label (i.e., 0: normal or 1: anomaly).
How to use
A script for calculating the AUC, pAUC, precision, recall, and F1 scores for the "evaluation dataset" is available on the Github repository [URL]. The ground truth data are used by this system. For more information, please see the Github repository.
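The following is a hedged sketch (not the official DCASE scripts) of computing AUC and a pAUC-style partial AUC from one of the per-file ground-truth CSVs described above, given anomaly scores produced by a system. The file name, the random stand-in scores, and the low-FPR range of 0.1 are assumptions made for illustration.

import csv
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

def load_labels(path):
    """Read "file_name,condition" rows into {file_name: 0 or 1}."""
    with open(path, newline="") as f:
        return {fname: int(label) for fname, label in csv.reader(f)}

def partial_auc(y_true, scores, max_fpr=0.1):
    """Area under the ROC curve restricted to FPR <= max_fpr, normalized to [0, 1]."""
    fpr, tpr, _ = roc_curve(y_true, scores)
    keep = fpr <= max_fpr
    fpr_p = np.append(fpr[keep], max_fpr)
    tpr_p = np.append(tpr[keep], np.interp(max_fpr, fpr, tpr))
    return auc(fpr_p, tpr_p) / max_fpr

labels = load_labels("ground_truth_section_03_source_test.csv")   # hypothetical file name
scores = {fname: np.random.rand() for fname in labels}            # stand-in system scores
y = np.array([labels[f] for f in labels])
s = np.array([scores[f] for f in labels])
print("AUC :", roc_auc_score(y, s))
print("pAUC:", partial_auc(y, s))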
Conditions of use
This dataset was created jointly by Hitachi, Ltd. and NTT Corporation and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
Publication
If you use this dataset, please cite all the following three papers:
Yohei Kawaguchi, Keisuke Imoto, Yuma Koizumi, Noboru Harada, Daisuke Niizumi, Kota Dohi, Ryo Tanabe, Harsh Purohit, and Takashi Endo, "Description and Discussion on DCASE 2021 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions," in arXiv e-prints: 2106.04492, 2021. [URL]
Noboru Harada, Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Masahiro Yasuda, Shoichiro Saito, "ToyADMOS2: Another Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection under Domain Shift Conditions," in arXiv e-prints: 2106.02369, 2021. [URL]
Ryo Tanabe, Harsh Purohit, Kota Dohi, Takashi Endo, Yuki Nikaido, Toshiki Nakamura, and Yohei Kawaguchi, "MIMII DUE: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection with Domain Shifts due to Changes in Operational and Environmental Conditions," in arXiv e-prints: 2105.02702, 2021. [URL]
Feedback
If there is any problem, please contact us:
Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com
Daisuke Niizumi, daisuke.niizumi.dt@hco.ntt.co.jp
Keisuke Imoto, keisuke.imoto@ieee.org
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The 95% confidence intervals of accuracy and MCC of the supervised learning models for the main dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset constructed and used in the paper "1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis".
We release it with its ground truth labels to facilitate further research on binary similarity analysis under function inlining. We also welcome others to find mislabeled entries in this dataset and fix them. We hope our dataset can help researchers conduct their studies and improve binary similarity analysis techniques.
Binaries of dataset-I can also be accessed from https://github.com/SoftSec-KAIST/BinKit. It is named the "normal dataset".
Any mislabeling can be reported at https://github.com/island255/TOSEM2022.
If you use this dataset, please cite our paper, "1-to-1 or 1-to-n? Investigating the effect of function inlining on binary similarity analysis".
FSDKaggle2018 is an audio dataset containing 11,073 audio files annotated with 41 labels of the AudioSet Ontology. FSDKaggle2018 has been used for the DCASE Challenge 2018 Task 2, which was run as a Kaggle competition titled Freesound General-Purpose Audio Tagging Challenge.
Citation
If you use the FSDKaggle2018 dataset or part of it, please cite our DCASE 2018 paper:
Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Favory, Jordi Pons, Xavier Serra. "General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline". Proceedings of the DCASE 2018 Workshop (2018)
You can also consider citing our ISMIR 2017 paper, which describes how we gathered the manual annotations included in FSDKaggle2018:
Eduardo Fonseca, Jordi Pons, Xavier Favory, Frederic Font, Dmitry Bogdanov, Andres Ferraro, Sergio Oramas, Alastair Porter, and Xavier Serra, "Freesound Datasets: A Platform for the Creation of Open Audio Datasets", In Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017
Contact
You are welcome to contact Eduardo Fonseca should you have any questions at eduardo.fonseca@upf.edu.
About this dataset
Freesound Dataset Kaggle 2018 (or FSDKaggle2018 for short) is an audio dataset containing 11,073 audio files annotated with 41 labels of the AudioSet Ontology [1]. FSDKaggle2018 has been used for Task 2 of the Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2018. Please visit the DCASE2018 Challenge Task 2 website for more information. This Task was hosted on the Kaggle platform as a competition titled Freesound General-Purpose Audio Tagging Challenge. It was organized by researchers from the Music Technology Group of Universitat Pompeu Fabra and from Google Research's Machine Perception Team.
The goal of this competition was to build an audio tagging system that can categorize an audio clip as belonging to one of a set of 41 diverse categories drawn from the AudioSet Ontology.
All audio samples in this dataset are gathered from Freesound [2] and are provided here as uncompressed PCM 16-bit, 44.1 kHz, mono audio files. Note that because Freesound content is collaboratively contributed, recording quality and techniques can vary widely.
The ground truth data provided in this dataset has been obtained after a data labeling process which is described below in the Data labeling process section.
FSDKaggle2018 clips are unequally distributed in the following 41 categories of the AudioSet Ontology: "Acoustic_guitar", "Applause", "Bark", "Bass_drum", "Burping_or_eructation", "Bus", "Cello", "Chime", "Clarinet", "Computer_keyboard", "Cough", "Cowbell", "Double_bass", "Drawer_open_or_close", "Electric_piano", "Fart", "Finger_snapping", "Fireworks", "Flute", "Glockenspiel", "Gong", "Gunshot_or_gunfire", "Harmonica", "Hi-hat", "Keys_jangling", "Knock", "Laughter", "Meow", "Microwave_oven", "Oboe", "Saxophone", "Scissors", "Shatter", "Snare_drum", "Squeak", "Tambourine", "Tearing", "Telephone", "Trumpet", "Violin_or_fiddle", "Writing".
Some other relevant characteristics of FSDKaggle2018:
The dataset is split into a train set and a test set.
The train set is meant to be for system development and includes ~9.5k samples unequally distributed among 41 categories. The minimum number of audio samples per category in the train set is 94, and the maximum is 300. The duration of the audio samples ranges from 300 ms to 30 s due to the diversity of the sound categories and the preferences of Freesound users when recording sounds. The total duration of the train set is roughly 18 h.
Out of the ~9.5k samples from the train set, ~3.7k have manually-verified ground truth annotations and ~5.8k have non-verified annotations. The non-verified annotations of the train set have a quality estimate of at least 65-70% in each category. Check the Data labeling process section below for more information about this aspect. Non-verified annotations in the train set are properly flagged in train.csv so that participants can opt to use this information during the development of their systems.
The test set is composed of 1.6k samples with manually-verified annotations and with a category distribution similar to that of the train set. The total duration of the test set is roughly 2 h.
All audio samples in this dataset have a single label (i.e., they are only annotated with one label). Check the Data labeling process section below for more information about this aspect. A single label should be predicted for each file in the test set.
Data labeling process
The data labeling process started from a manual mapping between Freesound tags and AudioSet Ontology categories (or labels), which was carried out by researchers at the Music Technology Group, Universitat Pompeu Fabra, Barcelona. Using this mapping, a number of Freesound audio samples were automatically annotated with labels from the AudioSet Onto...
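As a small usage illustration, the sketch below loads train.csv and separates manually-verified from non-verified annotations. The column names ("fname", "label", "manually_verified") are assumed from the description above; check the downloaded file before relying on them.

import pandas as pd

train = pd.read_csv("train.csv")
verified = train[train["manually_verified"] == 1]
unverified = train[train["manually_verified"] == 0]

print(len(verified), "verified clips;", len(unverified), "non-verified clips")
print(train["label"].value_counts().head())   # per-category clip counts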
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data contains the corresponding labelled images of Capsicum annuum that are included in the "Unlabelled Weed Detection Images for Hot Peppers" data set on this site. This data set contains the labels 0, 1, and 2, which can be displayed by assigning a unique pixel value (e.g., recommended: 0, 60, 255) to each occurrence of the label. These images can be utilised as ground truth labels for machine learning and data exploration. The labels represent three categories, namely weed, crop, and background. The labels were assigned by a team of trained individuals from Trinidad and Tobago using the Image Labeller App in the Computer Vision Toolbox from MATLAB.
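The sketch below maps the label values {0, 1, 2} to the recommended display values {0, 60, 255} so the ground-truth masks become visible as greyscale images. Which number corresponds to weed, crop, or background is not specified above, and the file names are hypothetical.

import numpy as np
from PIL import Image

def display_mask(label_path, out_path, mapping={0: 0, 1: 60, 2: 255}):
    # replace each label value with its display grey level and save the result
    labels = np.array(Image.open(label_path))
    shown = np.zeros_like(labels, dtype=np.uint8)
    for value, grey in mapping.items():
        shown[labels == value] = grey
    Image.fromarray(shown).save(out_path)

# display_mask("label_0001.png", "label_0001_vis.png")   # hypothetical file names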
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary
This submission is a supplementary material to the article [Coban 2020b]. As part of the manuscript, we release three simulated parallel-beam tomographic datasets of 94 apples with internal defects, the ground truth reconstructions and two defect label files.
Description
This Zenodo upload contains the three simulated datasets, Datasets A-C, and the two defect label files. The three versions are a noiseless simulation (Dataset A); simulation with added Gaussian noise (Dataset B), and with scattering noise (Dataset C). The datasets are based on real 3D X-ray CT data and their subsequent volume reconstructions.
The defect label files contain tables of pixel numbers assigned to each defect present in the apples. Each row in the table corresponds to a single apple, and each column is a defect across the 94 apples. The two defect label files are apple_defects_full.csv and apple_defects_partial.csv: the former contains defect information for a full apple, and the latter for a selection of slices in an apple.
The ground truth reconstructions are also available through this project, via 6 separate Zenodo uploads:
The datasets are prepared for development and testing of data-driven, learning-based image reconstruction, segmentation and post-processing methods.
Simulation Setup
Each projection in Datasets A-C contains 50 angles, taken over 180 degrees. The projections are evenly distributed over the half circle, with an angular increment of 3.6 degrees. The size of the projections is 50px x 1377px. In total we have 62792 projections for Datasets A and B, and 7520 for Dataset C. All projections are in .tif format.
We also include proj_angs.txt containing a list of projection angles in radians. These projection angles are the same for all three datasets.
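The sketch below reads proj_angs.txt and stacks one dataset's .tif projections into a single array. The directory name, glob pattern, and file ordering are assumptions; adapt them to the actual submission structure.

from pathlib import Path
import numpy as np
import tifffile

angles = np.loadtxt("proj_angs.txt")               # projection angles in radians
files = sorted(Path("Dataset_A").glob("*.tif"))    # hypothetical directory name
projections = np.stack([tifffile.imread(f) for f in files])
print(angles.shape, projections.shape)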
List of Contents
The contents of the submission are given below.
Additional Links
These datasets are produced by the Computational Imaging group at Centrum Wiskunde & Informatica (CI-CWI). For any relevant Python/MATLAB scripts for the FleX-ray datasets, we refer the reader to our group's GitHub page.
Contact Details
For more information or guidance in using these datasets, please get in touch with
Acknowledgments
We acknowledge GREEFA for supplying the apples and further discussions.
FSD50K is an open dataset of human-labeled sound events containing 51,197 Freesound clips unequally distributed in 200 classes drawn from the AudioSet Ontology. FSD50K has been created at the Music Technology Group of Universitat Pompeu Fabra.
Citation
If you use the FSD50K dataset, or part of it, please cite our TASLP paper (available from [arXiv] [TASLP]):
@article{fonseca2022FSD50K,
  title={{FSD50K}: an open dataset of human-labeled sound events},
  author={Fonseca, Eduardo and Favory, Xavier and Pons, Jordi and Font, Frederic and Serra, Xavier},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  volume={30},
  pages={829--852},
  year={2022},
  publisher={IEEE}
}
Paper update: This paper has been published in TASLP at the beginning of 2022. The accepted camera-ready version includes a number of improvements with respect to the initial submission. The main updates include: estimation of the amount of label noise in FSD50K, SNR comparison between FSD50K and AudioSet, improved description of evaluation metrics including equations, clarification of experimental methodology and some results, some content moved to Appendix for readability. The TASLP-accepted camera-ready version is available from arXiv (in particular, it is v2 in arXiv, displayed by default).
Data curators
Eduardo Fonseca, Xavier Favory, Jordi Pons, Mercedes Collado, Ceren Can, Rachit Gupta, Javier Arredondo, Gary Avendano and Sara Fernandez
Contact
You are welcome to contact Eduardo Fonseca should you have any questions, at efonseca@google.com.
ABOUT FSD50K
Freesound Dataset 50k (or FSD50K for short) is an open dataset of human-labeled sound events containing 51,197 Freesound clips unequally distributed in 200 classes drawn from the AudioSet Ontology [1]. FSD50K has been created at the Music Technology Group of Universitat Pompeu Fabra.
What follows is a brief summary of FSD50K's most important characteristics. Please have a look at our paper (especially Section 4) to extend the basic information provided here with relevant details for its usage, as well as discussion, limitations, applications and more.
Basic characteristics:
FSD50K contains 51,197 audio clips from Freesound, totalling 108.3 hours of multi-labeled audio
The dataset encompasses 200 sound classes (144 leaf nodes and 56 intermediate nodes) hierarchically organized with a subset of the AudioSet Ontology.
The audio content is composed mainly of sound events produced by physical sound sources and production mechanisms, including human sounds, sounds of things, animals, natural sounds, musical instruments and more. The vocabulary can be inspected in vocabulary.csv (see Files section below).
The acoustic material has been manually labeled by humans following a data labeling process using the Freesound Annotator platform [2].
Clips are of variable length from 0.3 to 30s, due to the diversity of the sound classes and the preferences of Freesound users when recording sounds.
All clips are provided as uncompressed PCM 16 bit 44.1 kHz mono audio files.
Ground truth labels are provided at the clip-level (i.e., weak labels).
The dataset poses mainly a large-vocabulary multi-label sound event classification problem, but also allows development and evaluation of a variety of machine listening approaches (see Sec. 4D in our paper).
In addition to audio clips and ground truth, additional metadata is made available (including raw annotations, sound predominance ratings, Freesound metadata, and more), allowing a variety of analyses and sound event research tasks (see Files section below).
The audio clips are grouped into a development (dev) set and an evaluation (eval) set such that they do not have clips from the same Freesound uploader.
Dev set:
40,966 audio clips totalling 80.4 hours of audio
Avg duration/clip: 7.1s
114,271 smeared labels (i.e., labels propagated in the upwards direction to the root of the ontology)
Labels are correct but could be occasionally incomplete
A train/validation split is provided (Sec. 3H). If a different split is used, it should be specified for reproducibility and fair comparability of results (see Sec. 5C of our paper)
Eval set:
10,231 audio clips totalling 27.9 hours of audio
Avg duration/clip: 9.8s
38,596 smeared labels
Eval set is labeled exhaustively (labels are correct and complete for the considered vocabulary)
Note: All classes in FSD50K are represented in AudioSet, except Crash cymbal, Human group actions, Human voice, Respiratory sounds, and Domestic sounds, home sounds.
LICENSE
All audio clips in FSD50K are released under Creative Commons (CC) licenses. Each clip has its own license as defined by the clip uploader in Freesound, some of them requiring attribution to their original authors and some forbidding further commercial reuse. Specifically:
The development set consists of 40,966 clips with the following licenses:
CC0: 14,959
CC-BY: 20,017
CC-BY-NC: 4616
CC Sampling+: 1374
The evaluation set consists of 10,231 clips with the following licenses:
CC0: 4914
CC-BY: 3489
CC-BY-NC: 1425
CC Sampling+: 403
For attribution purposes and to facilitate attribution of these files to third parties, we include a mapping from the audio clips to their corresponding licenses. The licenses are specified in the files dev_clips_info_FSD50K.json and eval_clips_info_FSD50K.json.
In addition, FSD50K as a whole is the result of a curation process and it has an additional license: FSD50K is released under CC-BY. This license is specified in the LICENSE-DATASET file downloaded with the FSD50K.doc zip file. We note that the choice of one license for the dataset as a whole is not straightforward as it comprises items with different licenses (such as audio clips, annotations, or data split). The choice of a global license in these cases may warrant further investigation (e.g., by someone with a background in copyright law).
Usage of FSD50K for commercial purposes:
If you'd like to use FSD50K for commercial purposes, please contact Eduardo Fonseca and Frederic Font at efonseca@google.com and frederic.font@upf.edu.
Also, if you are interested in using FSD50K for machine learning competitions, please contact Eduardo Fonseca and Frederic Font at efonseca@google.com and frederic.font@upf.edu.
FILES
FSD50K can be downloaded as a series of zip files with the following directory structure:
root
│
└───FSD50K.dev_audio/ Audio clips in the dev set
│
└───FSD50K.eval_audio/ Audio clips in the eval set
│
└───FSD50K.ground_truth/ Files for FSD50K's ground truth
│ │
│ └─── dev.csv Ground truth for the dev set
│ │
│ └─── eval.csv Ground truth for the eval set
│ │
│ └─── vocabulary.csv List of 200 sound classes in FSD50K
│
└───FSD50K.metadata/ Files for additional metadata
│ │
│ └─── class_info_FSD50K.json Metadata about the sound classes
│ │
│ └─── dev_clips_info_FSD50K.json Metadata about the dev clips
│ │
│ └─── eval_clips_info_FSD50K.json Metadata about the eval clips
│ │
│ └─── pp_pnp_ratings_FSD50K.json PP/PNP ratings
│ │
│ └─── collection/ Files for the sound collection format
│
└───FSD50K.doc/
│
└───README.md The dataset description file that you are reading
│
└───LICENSE-DATASET License of the FSD50K dataset as an entity
Each row (i.e. audio clip) of dev.csv contains the following information:
fname: the file name without the .wav extension, e.g., the fname 64760 corresponds to the file 64760.wav on disk. This number is the Freesound id. We always use Freesound ids as filenames.
labels: the class labels (i.e., the ground truth). Note these class labels are smeared, i.e., the labels have been propagated in the upwards direction to the root of the ontology. More details about the label smearing process can be found in Appendix D of our paper.
mids: the Freebase identifiers corresponding to the class labels, as defined in the AudioSet Ontology specification
split: whether the clip belongs to train or val (see paper for details on the proposed split)
Rows in eval.csv follow the same format, except that there is no split column.
Note: We use a slightly different format than AudioSet for the naming of class labels in order to avoid potential problems with spaces, commas, etc. Example: we use Accelerating_and_revving_and_vroom instead of the original Accelerating, revving, vroom. You can go back to the original AudioSet naming using the information provided in vocabulary.csv (class label and mid for the 200 classes of FSD50K) and the AudioSet Ontology specification.
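As a usage illustration, the sketch below loads dev.csv, recovers the label list per clip, and applies the provided train/validation split. It assumes the labels field is a comma-separated string within each row, as the description of smeared labels above suggests; verify against the downloaded file.

import pandas as pd

dev = pd.read_csv("FSD50K.ground_truth/dev.csv")
dev["labels"] = dev["labels"].str.split(",")          # one list of class labels per clip
train = dev[dev["split"] == "train"]
val = dev[dev["split"] == "val"]

print(len(train), "train clips,", len(val), "val clips")
print(train[["fname", "labels"]].head())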
Files with additional metadata (FSD50K.metadata/)
To allow a variety of analysis and approaches with FSD50K, we provide the following metadata:
class_info_FSD50K.json: python dictionary where each entry corresponds to one sound class and contains: FAQs utilized during the annotation of the class, examples (representative audio clips), and verification_examples (audio clips presented to raters during annotation as a quality control mechanism). Audio clips are described by the Freesound id. Note: It may be that some of these examples are not included in the FSD50K release.
dev_clips_info_FSD50K.json: python dictionary where each entry corresponds to one dev clip and contains: title,