11 datasets found
  1. Data Labeling Tools Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 19, 2025
    Cite
    Data Insights Market (2025). Data Labeling Tools Report [Dataset]. https://www.datainsightsmarket.com/reports/data-labeling-tools-1368998
    Explore at:
    doc, pdf, ppt (available download formats)
    Dataset updated
    Jun 19, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Data Labeling Tools market is experiencing robust growth, driven by escalating demand for high-quality training data in artificial intelligence (AI) and machine learning (ML) applications. Expansion is fueled by the increasing adoption of AI across sectors such as automotive, healthcare, and finance, which requires vast amounts of accurately labeled data for model training and improvement. Advances in automation and semi-supervised learning are streamlining the labeling process, improving efficiency and reducing costs. A key trend is the shift towards more sophisticated labeling techniques, including 3D point cloud and video annotation, reflecting the growing complexity of AI applications. Competition is fierce, with established players such as Amazon Mechanical Turk and Google LLC coexisting with startups offering specialized labeling solutions. The market is segmented by type of data labeled (image, text, video, audio), annotation method (manual, automated), and industry vertical, reflecting the diverse needs of different AI projects. Challenges include data privacy concerns, ensuring data quality and consistency, and the need for skilled annotators, all of which weigh on growth and call for continuous innovation and strategic investment.

    Despite these challenges, the market shows strong potential for continued expansion. The forecast period (2025-2033) anticipates a significant increase in market value, fueled by ongoing technological advances, wider adoption of AI, and rising demand for high-quality data. Increased consolidation is expected as larger players acquire smaller companies to strengthen their market position and technological capabilities. More sophisticated and automated labeling tools will continue to improve efficiency and reduce costs, making them accessible to a broader range of users. Improving the accuracy and speed of data labeling is expected to remain the central factor shaping this market.

  2. 3D Microvascular Image Data and Labels for Machine Learning

    • rdr.ucl.ac.uk
    bin
    Updated Apr 30, 2024
    Cite
    Natalie Holroyd; Claire Walsh; Emmeline Brown; Emma Brown; Yuxin Zhang; Carles Bosch Pinol; Simon Walker-Samuel (2024). 3D Microvascular Image Data and Labels for Machine Learning [Dataset]. http://doi.org/10.5522/04/25715604.v1
    Explore at:
    bin (available download formats)
    Dataset updated
    Apr 30, 2024
    Dataset provided by
    University College London
    Authors
    Natalie Holroyd; Claire Walsh; Emmeline Brown; Emma Brown; Yuxin Zhang; Carles Bosch Pinol; Simon Walker-Samuel
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    These images and associated binary labels were collected from collaborators across multiple universities to serve as a diverse representation of biomedical images of vessel structures, for use in the training and validation of machine learning tools for vessel segmentation. The dataset contains images from a variety of imaging modalities, at different resolutions, using different sources of contrast and featuring different organs/pathologies. This data was used to train, test and validate a foundational model for 3D vessel segmentation, tUbeNet, which can be found on GitHub. The paper describing the training and validation of the model can be found here.

    Filenames are structured as follows:
    Data - [Modality]_[species Organ]_[resolution].tif
    Labels - [Modality]_[species Organ]_[resolution]_labels.tif
    Sub-volumes of larger dataset - [Modality]_[species Organ]_subvolume[dimensions in pixels].tif

    Manual labelling of blood vessels was carried out using Amira (2020.2, Thermo-Fisher, UK).

    Training data:
    opticalHREM_murineLiver_2.26x2.26x1.75um.tif: A high resolution episcopic microscopy (HREM) dataset, acquired in house by staining a healthy mouse liver with Eosin B and imaging with a standard HREM protocol. NB: 25% of this image volume was withheld from training, for use as test data.
    CT_murineTumour_20x20x20um.tif: X-ray microCT images of a microvascular cast, taken from a subcutaneous mouse model of colorectal cancer (acquired in house). NB: 25% of this image volume was withheld from training, for use as test data.
    RSOM_murineTumour_20x20um.tif: Raster-Scanning Optoacoustic Mesoscopy (RSOM) data from a subcutaneous tumour model (provided by Emma Brown, Bohndiek Group, University of Cambridge). The image data has undergone filtering to reduce the background (Brown et al., 2019).
    OCTA_humanRetina_24x24um.tif: Retinal angiography data obtained using Optical Coherence Tomography Angiography (OCT-A) (provided by Dr Ranjan Rajendram, Moorfields Eye Hospital).

    Test data:
    MRI_porcineLiver_0.9x0.9x5mm.tif: T1-weighted Balanced Turbo Field Echo Magnetic Resonance Imaging (MRI) data from a machine-perfused porcine liver, acquired in house.
    MFHREM_murineTumourLectin_2.76x2.76x2.61um.tif: A subcutaneous colorectal tumour mouse model imaged in house using multi-fluorescence HREM, with Dylight 647 conjugated lectin staining the vasculature (Walsh et al., 2021). The image data has been processed using an asymmetric deconvolution algorithm described by Walsh et al., 2020. NB: A sub-volume of 480x480x640 voxels was manually labelled (MFHREM_murineTumourLectin_subvolume480x480x640.tif).
    MFHREM_murineBrainLectin_0.85x0.85x0.86um.tif: An MF-HREM image of the cortex of a mouse brain, stained with Dylight-647 conjugated lectin, acquired in house (Walsh et al., 2021). The image data has been downsampled and processed using an asymmetric deconvolution algorithm described by Walsh et al., 2020. NB: A sub-volume of 1000x1000x99 voxels was manually labelled. This sub-volume is provided at full resolution and without preprocessing (MFHREM_murineBrainLectin_subvol_0.57x0.57x0.86um.tif).
    2Photon_murineOlfactoryBulbLectin_0.2x0.46x5.2um.tif: Two-photon data of mouse olfactory bulb blood vessels, labelled with sulforhodamine 101, kindly provided by Yuxin Zhang at the Sensory Circuits and Neurotechnology Lab, the Francis Crick Institute (Bosch et al., 2022). NB: A sub-volume of 500x500x79 voxels was manually labelled (2Photon_murineOlfactoryBulbLectin_subvolume500x500x79.tif).
    References:
    Bosch, C., Ackels, T., Pacureanu, A., Zhang, Y., Peddie, C. J., Berning, M., Rzepka, N., Zdora, M. C., Whiteley, I., Storm, M., Bonnin, A., Rau, C., Margrie, T., Collinson, L., & Schaefer, A. T. (2022). Functional and multiscale 3D structural investigation of brain tissue through correlative in vivo physiology, synchrotron microtomography and volume electron microscopy. Nature Communications, 13(1), 1–16. https://doi.org/10.1038/s41467-022-30199-6
    Brown, E., Brunker, J., & Bohndiek, S. E. (2019). Photoacoustic imaging as a tool to probe the tumour microenvironment. DMM Disease Models and Mechanisms, 12(7). https://doi.org/10.1242/DMM.039636
    Walsh, C., Holroyd, N. A., Finnerty, E., Ryan, S. G., Sweeney, P. W., Shipley, R. J., & Walker-Samuel, S. (2021). Multifluorescence High-Resolution Episcopic Microscopy for 3D Imaging of Adult Murine Organs. Advanced Photonics Research, 2(10), 2100110. https://doi.org/10.1002/ADPR.202100110
    Walsh, C., Holroyd, N., Shipley, R., & Walker-Samuel, S. (2020). Asymmetric Point Spread Function Estimation and Deconvolution for Serial-Sectioning Block-Face Imaging. Communications in Computer and Information Science, 1248 CCIS, 235–249. https://doi.org/10.1007/978-3-030-52791-4_19
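    As a quick sanity check, a paired image and label volume can be read into NumPy arrays with the tifffile package. This is a minimal loading sketch, not part of the tUbeNet code base; the file names follow the naming convention above but are otherwise illustrative.

    ```python
    import tifffile  # pip install tifffile

    # Paired grey-scale volume and binary vessel mask, named per the convention above (illustrative).
    image = tifffile.imread("CT_murineTumour_20x20x20um.tif")
    labels = tifffile.imread("CT_murineTumour_20x20x20um_labels.tif")

    assert image.shape == labels.shape, "image and label stacks should match voxel-for-voxel"
    print(image.shape, image.dtype, f"vessel voxel fraction: {(labels > 0).mean():.2%}")
    ```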

  3. Immersive viewing experience self-assessment.

    • figshare.com
    xls
    Updated Oct 17, 2024
    Cite
    Eric Allen Jensen; Kalina Borkiewicz; Jill P. Naiman; Stuart Levy; Jeff Carpenter (2024). Immersive viewing experience self-assessment. [Dataset]. http://doi.org/10.1371/journal.pone.0307733.t002
    Explore at:
    xls (available download formats)
    Dataset updated
    Oct 17, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Eric Allen Jensen; Kalina Borkiewicz; Jill P. Naiman; Stuart Levy; Jeff Carpenter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Visualizing research data can be an important science communication tool. In recent decades, 3D data visualization has emerged as a key tool for engaging public audiences. Such visualizations are often embedded in scientific documentaries screened on giant domes in planetariums or delivered through video streaming services such as Amazon Prime. 3D data visualization has been shown to be an effective way to communicate complex scientific concepts to the public, and cinematic-style 3D data visualization can convey information in a scientifically accurate and visually engaging way, making scientific information more understandable and interesting to millions of viewers. To support a wider shift in this professional field towards more evidence-based practice in 3D data visualization, we conducted a survey experiment comparing audience responses to two versions of 3D data visualizations from a scientific documentary film on the theme of ‘solar superstorms’ (n = 577). The study used a single-factor (two levels: labeled and unlabeled), between-subjects design. It reveals key strengths and weaknesses of communicating science using 3D data visualization, and it shows the limited power of strategically deployed informational labels to affect audience perceptions of the documentary film and its content. The major difference identified between experimental and control groups was that quality ratings of the documentary film clip were significantly higher for the ‘labeled’ version; other outcomes showed no statistically significant differences. The limited effects of informational labels suggest that other aspects, such as story structure, voiceover narration and audio-visual content, are more important determinants of outcomes. The study concludes with a discussion of how this new research evidence informs our understanding of ‘what works and why’ with cinematic-style 3D data visualizations for the public.

  4. AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 22, 2024
    Cite
    Deepa Krishnaswamy (2024). AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography imaging collections [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7473970
    Explore at:
    Dataset updated
    Jan 22, 2024
    Dataset provided by
    Dennis Bontempi
    Deepa Krishnaswamy
    Andrey Fedorov
    Hugo Aerts
    David Clunie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Public imaging datasets are critical for the development and evaluation of automated tools in cancer imaging. Unfortunately, many of the available datasets do not provide annotations of tumors or organs-at-risk, which are crucial for the assessment of these tools, because annotation of medical images is time-consuming and requires domain expertise. It has been demonstrated that artificial intelligence (AI) based annotation tools can achieve acceptable performance and thus can be used to automate the annotation of large datasets. As part of the effort to enrich the public data available within NCI Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov/) [1], we introduce this dataset, which consists of such AI-generated annotations for two publicly available medical imaging collections of Computed Tomography (CT) images of the chest. For detailed information concerning this dataset, please refer to our publication [2].

    We use publicly available pre-trained AI tools to enhance CT lung cancer collections that are unlabeled or partially labeled. The first tool is the nnU-Net deep learning framework [3] for volumetric segmentation of organs, where we use a pretrained model (Task D18 using the SegTHOR dataset) for labeling volumetric regions in the image corresponding to the heart, trachea, aorta and esophagus. These are the major organs-at-risk for radiation therapy for lung cancer. We further enhance these annotations by computing 3D shape radiomics features using the pyradiomics package [4]. The second tool is a pretrained model for per-slice automatic labeling of anatomic landmarks and imaged body part regions in axial CT volumes [5].
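    As an illustration of the shape-feature step described above (not the authors' exact configuration), pyradiomics can be restricted to its 3D shape feature class and run on an image/mask pair; the NRRD file names below are placeholders.

    ```python
    from radiomics import featureextractor  # pip install pyradiomics

    extractor = featureextractor.RadiomicsFeatureExtractor()
    extractor.disableAllFeatures()
    extractor.enableFeatureClassByName("shape")  # 3D shape descriptors only

    # Placeholder paths: a CT volume and one organ-at-risk mask in any SimpleITK-readable format.
    features = extractor.execute("ct_volume.nrrd", "heart_mask.nrrd")
    for name, value in features.items():
        if name.startswith("original_shape_"):
            print(name, value)
    ```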

    We focus on enhancing two publicly available collections, the Non-small Cell Lung Cancer Radiomics (NSCLC-Radiomics) collection [6,7] and the National Lung Screening Trial (NLST) collection [8,9]. The CT data for these collections are available both in The Cancer Imaging Archive (TCIA) [10] and in NCI Imaging Data Commons (IDC). Further, the NSCLC-Radiomics collection includes expert-generated manual annotations of several chest organs, allowing us to quantify the performance of the AI tools on that subset of the data.

    IDC relies on the DICOM standard to achieve FAIR [10] sharing of data and interoperability. Generated annotations are saved as DICOM Segmentation objects (volumetric segmentations of regions of interest) created using dcmqi [12], and as DICOM Structured Report (SR) objects (per-slice annotations of the body part imaged, anatomical landmarks and radiomics features) created using dcmqi and highdicom [13]. 3D shape radiomics features and corresponding DICOM SR objects are also provided for the manual segmentations available in the NSCLC-Radiomics collection.

    The dataset is available in IDC, and is accompanied by our publication here [2]. This pre-print details how the data were generated, and how the resulting DICOM objects can be interpreted and used in tools. Additionally, for further information about how to interact with and explore the dataset, please refer to our repository and accompanying Google Colaboratory notebook.

    The annotations are organized as follows. For NSCLC-Radiomics, three nnU-Net models were evaluated ('2d-tta', '3d_lowres-tta' and '3d_fullres-tta'). Within each folder, the PatientID and the StudyInstanceUID are subdirectories, and within this the DICOM Segmentation object and the DICOM SR for the 3D shape features are stored. A separate directory for the DICOM SR body part regression regions ('sr_regions') and landmarks ('sr_landmarks') are also provided with the same folder structure as above. Lastly, the DICOM SR for the existing manual annotations are provided in the 'sr_gt' directory. For NSCLC-Radiomics, each patient has a single StudyInstanceUID. The DICOM Segmentation and SR objects are named according to the SeriesInstanceUID of the original CT files.

    nsclc
      2d-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      3d_lowres-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      3d_fullres-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      sr_regions
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_regions_SR.dcm
      sr_landmarks
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_landmarks_SR.dcm
      sr_gt
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_features_SR.dcm

    For NLST, the '3d_fullres-tta' model was evaluated. The data is organized the same way as above: within each folder, the PatientID and the StudyInstanceUID are subdirectories. For the NLST collection, it is possible that some patients have more than one StudyInstanceUID subdirectory. Separate directories for the DICOM SR body part regions ('sr_regions') and landmarks ('sr_landmarks') are also provided. The DICOM Segmentation and SR objects are named according to the SeriesInstanceUID of the original CT files; a minimal traversal sketch follows the listing below.

    nlst
      3d_fullres-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      sr_regions
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_regions_SR.dcm
      sr_landmarks
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_landmarks_SR.dcm
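    The directory layouts above can be traversed with standard tools. The following is a minimal sketch (not from the dataset authors) that walks the NLST layout and lists the segments stored in each DICOM Segmentation object with pydicom; the local root path is an assumption about where the archive was unpacked. Reconstructing the voxel masks themselves can be done with the highdicom library mentioned above.

    ```python
    from pathlib import Path

    import pydicom  # pip install pydicom

    # <root>/<model>/<PatientID>/<StudyInstanceUID>/ReferencedSeriesInstanceUID_SEG.dcm
    root = Path("nlst") / "3d_fullres-tta"
    for seg_path in sorted(root.glob("*/*/*_SEG.dcm")):
        patient_id, study_uid = seg_path.parts[-3], seg_path.parts[-2]
        seg = pydicom.dcmread(seg_path, stop_before_pixels=True)
        segment_labels = [item.SegmentLabel for item in seg.SegmentSequence]
        print(patient_id, study_uid, segment_labels)  # e.g. heart, trachea, aorta, esophagus
    ```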

    The query used for NSCLC-Radiomics is here, and a list of corresponding SeriesInstanceUIDs (along with PatientIDs and StudyInstanceUIDs) is here. The query used for NLST is here, and a list of corresponding SeriesInstanceUIDs (along with PatientIDs and StudyInstanceUIDs) is here. The two csv files that describe the series analyzed, nsclc_series_analyzed.csv and nlst_series_analyzed.csv, are also available as uploads to this repository.

    Version updates:

    Version 2: For the regions SR and landmarks SR, changed to use a distinct TrackingUniqueIdentifier for each MeasurementGroup. Also instead of using TargetRegion, changed to use FindingSite. Additionally for the landmarks SR, the TopographicalModifier was made a child of FindingSite instead of a sibling.

    Version 3: Added the two csv files that describe which series were analyzed

    Version 4: Modified the landmarks SR because the TopographicalModifier for the Kidney landmark (bottom) did not describe the landmark correctly; the Kidney landmark is the "first slice where both kidneys can be seen well", so the TopographicalModifier was removed for that landmark. For the features SR, modified the units code for Flatness and Elongation, as we had incorrectly used mm instead of dimensionless (no) units.

  5. Assessments of video by audiences.

    • plos.figshare.com
    xls
    Updated Oct 17, 2024
    + more versions
    Cite
    Eric Allen Jensen; Kalina Borkiewicz; Jill P. Naiman; Stuart Levy; Jeff Carpenter (2024). Assessments of video by audiences. [Dataset]. http://doi.org/10.1371/journal.pone.0307733.t001
    Explore at:
    xls (available download formats)
    Dataset updated
    Oct 17, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Eric Allen Jensen; Kalina Borkiewicz; Jill P. Naiman; Stuart Levy; Jeff Carpenter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Visualizing research data can be an important science communication tool. In recent decades, 3D data visualization has emerged as a key tool for engaging public audiences. Such visualizations are often embedded in scientific documentaries screened on giant domes in planetariums or delivered through video streaming services such as Amazon Prime. 3D data visualization has been shown to be an effective way to communicate complex scientific concepts to the public, and cinematic-style 3D data visualization can convey information in a scientifically accurate and visually engaging way, making scientific information more understandable and interesting to millions of viewers. To support a wider shift in this professional field towards more evidence-based practice in 3D data visualization, we conducted a survey experiment comparing audience responses to two versions of 3D data visualizations from a scientific documentary film on the theme of ‘solar superstorms’ (n = 577). The study used a single-factor (two levels: labeled and unlabeled), between-subjects design. It reveals key strengths and weaknesses of communicating science using 3D data visualization, and it shows the limited power of strategically deployed informational labels to affect audience perceptions of the documentary film and its content. The major difference identified between experimental and control groups was that quality ratings of the documentary film clip were significantly higher for the ‘labeled’ version; other outcomes showed no statistically significant differences. The limited effects of informational labels suggest that other aspects, such as story structure, voiceover narration and audio-visual content, are more important determinants of outcomes. The study concludes with a discussion of how this new research evidence informs our understanding of ‘what works and why’ with cinematic-style 3D data visualizations for the public.

  6. LUMPI: The Leibniz University Multi-Perspective Intersection Dataset

    • service.tib.eu
    • data.uni-hannover.de
    Updated May 16, 2025
    Cite
    (2025). LUMPI: The Leibniz University Multi-Perspective Intersection Dataset [Dataset]. https://service.tib.eu/ldmservice/dataset/luh-lumpi
    Explore at:
    Dataset updated
    May 16, 2025
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Increasing improvements in sensor technology as well as machine learning methods allow efficient collection, processing and analysis of the dynamic environment, which can be used for the detection and tracking of traffic participants. Current datasets in this domain mostly present a single view, which prevents highly accurate pose estimation because of occlusions. Integrating different, simultaneously acquired data makes it possible to exploit and develop collaboration principles that increase the quality, reliability and integrity of the derived information. This work addresses the problem by providing a multi-view dataset that includes 2D image information (videos) and 3D point clouds with labels of the traffic participants in the scene. The dataset was recorded during different weather and light conditions on several days at a large junction in Hanover, Germany.

    Paper
    Dataset teaser video: https://youtu.be/elwFdCu5IFo
    Dataset download path: https://data.uni-hannover.de/vault/ikg/busch/LUMPI/
    Labeling process pipeline video: https://youtu.be/Ns6qsHsb06E
    Python SDK: https://github.com/St3ff3nBusch/LUMPI-SDK-Python
    Labeling Tool / C++ SDK: https://github.com/St3ff3nBusch/LUMPI-Labeling

  7. PC-Urban Outdoordataset for 3D Point Cloud semantic segmentation

    • researchdata.edu.au
    Updated 2021
    Cite
    Ajmal Mian; Micheal Wise; Naveed Akhtar; Muhammad Ibrahim; Computer Science and Software Engineering (2021). PC-Urban Outdoordataset for 3D Point Cloud semantic segmentation [Dataset]. http://doi.org/10.21227/FVQD-K603
    Explore at:
    Dataset updated
    2021
    Dataset provided by
    IEEE DataPort
    The University of Western Australia
    Authors
    Ajmal Mian; Micheal Wise; Naveed Akhtar; Muhammad Ibrahim; Computer Science and Software Engineering
    Description

    The proposed dataset, termed PC-Urban (Urban Point Cloud), was captured with an Ouster LiDAR sensor with 64 channels. The sensor was installed on an SUV driven through the downtown of Perth, Western Australia. The dataset comprises over 4.3 billion points captured across 66K sensor frames. The labelled data is organized as registered and raw point cloud frames, where the former consists of varying numbers of registered consecutive frames. We provide 25 class labels in the dataset covering 23 million points and 5K instances. Labelling is performed with PC-Annotate and can easily be extended by end-users employing the same tool.

    The data is organized into unlabelled and labelled 3D point clouds. The unlabelled data is provided in .PCAP format, the direct output format of the Ouster LiDAR sensor. Raw frames are extracted from the recorded .PCAP files as Ply and Excel files using the Ouster Studio software. Labelled 3D point cloud data consists of registered or raw point clouds. A labelled point cloud is a combination of Ply, Excel, Labels and Summary files. The Ply file contains X, Y, Z values along with color information. The Excel file contains X, Y, Z values, Intensity, Reflectivity, Ring, Noise, and Range for each point; these attributes can be useful for semantic segmentation using deep learning algorithms. The Label and Label Summary files have been explained in the previous section. One GB of raw data contains nearly 1,300 raw frames, whereas 66,425 frames are provided in the dataset, each comprising 65,536 points; hence, 4.3 billion points captured with the Ouster LiDAR sensor are provided. Annotations of 25 general outdoor classes are provided, which include car, building, bridge, tree, road, letterbox, traffic signal, light-pole, rubbish bin, cycles, motorcycle, truck, bus, bushes, road sign board, advertising board, road divider, road lane, pedestrians, side-path, wall, bus stop, water, zebra-crossing, and background. With the released data, a total of 143 scenes are annotated, including both raw and registered frames.
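    A labelled frame's Ply geometry and Excel attributes can be read with plyfile and pandas. This is a minimal loading sketch, not part of the official PC-Annotate tooling; the file names are placeholders, and the Ply vertex property names and Excel column headers are assumed to match the attributes listed above.

    ```python
    import numpy as np
    import pandas as pd          # pip install pandas openpyxl
    from plyfile import PlyData  # pip install plyfile

    ply = PlyData.read("frame_000001.ply")  # X, Y, Z plus colour information
    xyz = np.column_stack([ply["vertex"]["x"], ply["vertex"]["y"], ply["vertex"]["z"]])

    attrs = pd.read_excel("frame_000001.xlsx")  # Intensity, Reflectivity, Ring, Noise, Range, ...
    print(xyz.shape, list(attrs.columns))
    ```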

  8. The INI-30 Dataset : Event Camera for Eye Tracking

    • zenodo.org
    zip
    Updated Jan 21, 2025
    Cite
    Pietro Bonazzi; Pietro Bonazzi (2025). The INI-30 Dataset : Event Camera for Eye Tracking [Dataset]. http://doi.org/10.5281/zenodo.11203260
    Explore at:
    zip (available download formats)
    Dataset updated
    Jan 21, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pietro Bonazzi; Pietro Bonazzi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Ini-30 dataset was collected with two event cameras mounted on a glasses frame, with one DVXplorer sensor (640 × 480 pixels) attached to each side of the frame. Power was provided via a 2 meter cable connecting the cameras to a computer, which allowed sufficient freedom of movement. Differently from [2, 24], the participants were not instructed to follow a dot on a screen, but rather encouraged to look around to collect natural eye movements. As shown in Fig. 1, the event cameras were securely screwed onto a 3D-printed case attached to the side of the glasses frame. The data was annotated on frames of linearly decayed event accumulation, where each pixel's intensity is a function of the linear accumulation of its previous intensity. The position of the pupil in the DVS array was then labeled manually, using an assistive labeling tool. We discarded the first 20 ms of events to ensure the eye was visible and annotations met the level of image-based annotators. The number of labels per recording was intentionally variable, spanning from 475 to 1,848, with a time per label ranging from 20.0 to 235.77 milliseconds depending on the overall duration of the sample. This setup allows for unconstrained head movements, enables capturing event data from eye movements in an "in-the-wild" setting, and allows the generation of a representative, unique, diverse and challenging dataset.
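    The annotation frames described above are built from events with a linear temporal decay. The sketch below (not the authors' code) shows one way such a frame could be computed, assuming events arrive as NumPy arrays of integer pixel coordinates and microsecond timestamps from a 640 × 480 DVXplorer sensor.

    ```python
    import numpy as np

    def decayed_accumulation(xs, ys, ts, t_ref, window_us=20_000, shape=(480, 640)):
        """Accumulate events up to t_ref, weighting each event linearly by recency."""
        frame = np.zeros(shape, dtype=np.float32)
        recent = (ts <= t_ref) & (ts > t_ref - window_us)
        weights = 1.0 - (t_ref - ts[recent]) / window_us  # 1 at t_ref, 0 at the window edge
        np.add.at(frame, (ys[recent], xs[recent]), weights)
        return frame
    ```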

  9. Segmentation of Image Data from Complex Organotypic 3D Models of Cancer...

    • plos.figshare.com
    avi
    Updated May 30, 2023
    Cite
    Sean Robinson; Laurent Guyon; Jaakko Nevalainen; Mervi Toriseva; Malin Åkerfelt; Matthias Nees (2023). Segmentation of Image Data from Complex Organotypic 3D Models of Cancer Tissues with Markov Random Fields [Dataset]. http://doi.org/10.1371/journal.pone.0143798
    Explore at:
    avi (available download formats)
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Sean Robinson; Laurent Guyon; Jaakko Nevalainen; Mervi Toriseva; Malin Åkerfelt; Matthias Nees
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Organotypic, three-dimensional (3D) cell culture models of epithelial tumour types such as prostate cancer recapitulate key aspects of the architecture and histology of solid cancers. Morphometric analysis of multicellular 3D organoids is particularly important when additional components such as the extracellular matrix and tumour microenvironment are included in the model. The complexity of such models has so far limited their successful implementation. There is a great need for automatic, accurate and robust image segmentation tools to facilitate the analysis of such biologically relevant 3D cell culture models. We present a segmentation method based on Markov random fields (MRFs) and illustrate our method using 3D stack image data from an organotypic 3D model of prostate cancer cells co-cultured with cancer-associated fibroblasts (CAFs). The 3D segmentation output suggests that these cell types are in physical contact with each other within the model, which has important implications for tumour biology. Segmentation performance is quantified using ground truth labels and we show how each step of our method increases segmentation accuracy. We provide the ground truth labels along with the image data and code. Using independent image data we show that our segmentation method is also more generally applicable to other types of cellular microscopy and not only limited to fluorescence microscopy.
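    For readers unfamiliar with the technique named in the abstract, the toy sketch below illustrates the general idea behind MRF-based refinement of a segmentation (a binary Potts model updated ICM-style). It is not the authors' method; the unary costs and initial labelling are assumed to come from, for example, simple intensity thresholding.

    ```python
    import numpy as np

    def icm_binary(unary_bg, unary_fg, labels, beta=1.0, n_iter=5):
        """Refine a 0/1 labelling of a 3D stack with a Potts smoothness prior.

        unary_bg, unary_fg: per-voxel data costs (same shape as labels).
        Note: np.roll wraps at the volume borders, which is acceptable for a sketch.
        """
        labels = labels.astype(np.uint8).copy()
        for _ in range(n_iter):
            nb_fg = np.zeros(labels.shape, dtype=np.float32)
            for axis in range(3):
                for shift in (1, -1):
                    nb_fg += np.roll(labels, shift, axis=axis)  # count foreground 6-neighbours
            cost_bg = unary_bg + beta * nb_fg           # neighbours labelled 1 penalise label 0
            cost_fg = unary_fg + beta * (6.0 - nb_fg)   # neighbours labelled 0 penalise label 1
            labels = (cost_fg < cost_bg).astype(np.uint8)
        return labels
    ```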

  10. Endomapper Dataset

    • paperswithcode.com
    Updated Apr 28, 2022
    Cite
    Pablo Azagra; Carlos Sostres; Ángel Ferrandez; Luis Riazuelo; Clara Tomasini; Oscar León Barbed; Javier Morlana; David Recasens; Victor M. Batlle; Juan J. Gómez-Rodríguez; Richard Elvira; Julia López; Cristina Oriol; Javier Civera; Juan D. Tardós; Ana Cristina Murillo; Angel Lanas; José M. M. Montiel (2022). Endomapper Dataset [Dataset]. https://paperswithcode.com/dataset/endomapper
    Explore at:
    Dataset updated
    Apr 28, 2022
    Authors
    Pablo Azagra; Carlos Sostres; Ángel Ferrandez; Luis Riazuelo; Clara Tomasini; Oscar León Barbed; Javier Morlana; David Recasens; Victor M. Batlle; Juan J. Gómez-Rodríguez; Richard Elvira; Julia López; Cristina Oriol; Javier Civera; Juan D. Tardós; Ana Cristina Murillo; Angel Lanas; José M. M. Montiel
    Description

    The Endomapper dataset is the first collection of complete endoscopy sequences acquired during regular medical practice, including slow and careful screening explorations, making secondary use of medical data. Its original purpose is to facilitate the development and evaluation of VSLAM (Visual Simultaneous Localization and Mapping) methods on real endoscopy data. The first release of the dataset is composed of 50 sequences with a total of more than 13 hours of video. It is also the first endoscopic dataset that includes both the computed geometric and photometric endoscope calibration and the original calibration videos. Metadata and annotations associated with the dataset range from anatomical landmark and procedure-description labeling, to tool segmentation masks, COLMAP 3D reconstructions, simulated sequences with ground truth, and metadata related to special cases such as sequences from the same patient. This information will improve research in endoscopic VSLAM and support other existing and new research lines.

  11. Optical signature dataset for living macrophages and monocytes

    • explore.openaire.eu
    • search.dataone.org
    • +2 more
    Updated Sep 27, 2022
    Cite
    David Dannhauser; Domenico Rossi; Vincenza De Gregorio; Paolo Antonio Netti; Giuseppe Terrazzano; Filippo Causa (2022). Optical signature dataset for living macrophages and monocytes [Dataset]. http://doi.org/10.5061/dryad.1ns1rn8wh
    Explore at:
    Dataset updated
    Sep 27, 2022
    Authors
    David Dannhauser; Domenico Rossi; Vincenza De Gregorio; Paolo Antonio Netti; Giuseppe Terrazzano; Filippo Causa
    Description

    Pro-inflammatory (M1) and anti-inflammatory (M2) macrophage phenotypes play a fundamental role in the immune response. The interplay between, and consequently the classification of, these two functional subtypes is significant for many therapeutic applications. However, fast classification of macrophage phenotypes is challenging. For instance, image-based classification systems need cell staining and coloration, which is usually time-consuming and costly, since multiple cell surface markers, transcription factors and cytokine profiles are needed. A simple alternative would be to identify such cell types using single-cell, label-free and high-throughput light scattering pattern analysis combined with a straightforward machine-learning-based classification. Here, we compared different machine learning algorithms to classify distinct macrophage phenotypes based on their optical signature obtained from an ad-hoc developed wide-angle static light scattering apparatus. As the main result, we were able to identify unpolarized macrophages from M1- and M2-polarized phenotypes and distinguish them from naive monocytes with an average accuracy above 85%. We therefore suggest that optical single-cell signatures within a lab-on-a-chip approach, along with machine learning, could be used as a fast, affordable, non-invasive macrophage phenotyping tool to supersede resource-intensive cell labelling.

    Fluid forces align cells in 3D along the centreline of a microfluidic device, where a collimated laser beam interacts with passing individual cells. The light interaction reveals significantly different scattering patterns (optical signatures) for each macrophage phenotype as well as for monocytes, which a camera-based readout system records. The obtained data is processed (dataset) and classified with machine learning to obtain a label-free macrophage phenotype classification. The dataset contains pooled data from three probands; the last column indicates the cell type (0 - monocyte, 1 - M0 macrophages, 2 - M1 macrophages, 3 - M2 macrophages). In addition, we provide the dataset of the PCR analysis and bright-field observations of the investigated cells from one proband.
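    Given the table format described above (feature columns followed by the cell-type code in the last column), a baseline classifier can be trained with scikit-learn. This is an illustrative sketch only, not the authors' pipeline; the CSV file name is a placeholder for however the pooled table is exported locally.

    ```python
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("optical_signatures.csv")              # placeholder export of the pooled table
    X, y = df.iloc[:, :-1].values, df.iloc[:, -1].values    # features | cell type (0-3)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print("hold-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))
    ```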
