11 datasets found
  1. Data Labeling Tools Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 19, 2025
    Cite
    Data Insights Market (2025). Data Labeling Tools Report [Dataset]. https://www.datainsightsmarket.com/reports/data-labeling-tools-1368998
    Explore at:
    doc, pdf, ppt (available download formats)
    Dataset updated
    Jun 19, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Data Labeling Tools market is experiencing robust growth, driven by escalating demand for high-quality training data in artificial intelligence (AI) and machine learning (ML) applications. Expansion is fueled by the increasing adoption of AI across sectors such as automotive, healthcare, and finance, which requires vast amounts of accurately labeled data for model training and improvement. Advances in automation and semi-supervised learning are streamlining the labeling process, improving efficiency and reducing costs. A key trend is the shift towards more sophisticated labeling techniques, including 3D point cloud and video annotation, reflecting the growing complexity of AI applications. Competition is fierce, with established players such as Amazon Mechanical Turk and Google LLC coexisting with startups offering specialized labeling solutions. The market is segmented by type of data labeled (image, text, video, audio), annotation method (manual, automated), and industry vertical, reflecting the diverse needs of different AI projects. Challenges include data privacy concerns, ensuring data quality and consistency, and the need for skilled annotators, all of which weigh on growth and call for continuous innovation and strategic investment.

    Despite these challenges, the market shows strong potential for continued expansion. The forecast period (2025-2033) anticipates a significant increase in market value, fueled by ongoing technological advances, wider adoption of AI, and rising demand for high-quality data. Increased consolidation is expected as larger players acquire smaller companies to strengthen their market position and technological capabilities. More sophisticated and automated labeling tools will continue to improve efficiency and reduce costs, making them accessible to a broader range of users. Improving the accuracy and speed of data labeling is expected to remain the central factor shaping this market.

  2. 3D Microvascular Image Data and Labels for Machine Learning

    • rdr.ucl.ac.uk
    bin
    Updated Apr 30, 2024
    Cite
    Natalie Holroyd; Claire Walsh; Emmeline Brown; Emma Brown; Yuxin Zhang; Carles Bosch Pinol; Simon Walker-Samuel (2024). 3D Microvascular Image Data and Labels for Machine Learning [Dataset]. http://doi.org/10.5522/04/25715604.v1
    Explore at:
    bin (available download formats)
    Dataset updated
    Apr 30, 2024
    Dataset provided by
    University College London
    Authors
    Natalie Holroyd; Claire Walsh; Emmeline Brown; Emma Brown; Yuxin Zhang; Carles Bosch Pinol; Simon Walker-Samuel
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    These images and associated binary labels were collected from collaborators across multiple universities to serve as a diverse representation of biomedical images of vessel structures, for use in the training and validation of machine learning tools for vessel segmentation. The dataset contains images from a variety of imaging modalities, at different resolutions, using different sources of contrast and featuring different organs/pathologies. This data was used to train, test and validate a foundational model for 3D vessel segmentation, tUbeNet, which can be found on GitHub. The paper describing the training and validation of the model can be found here.

    Filenames are structured as follows:
    Data - [Modality]_[species Organ]_[resolution].tif
    Labels - [Modality]_[species Organ]_[resolution]_labels.tif
    Sub-volumes of larger dataset - [Modality]_[species Organ]_subvolume[dimensions in pixels].tif

    Manual labelling of blood vessels was carried out using Amira (2020.2, Thermo-Fisher, UK).

    Training data:
    opticalHREM_murineLiver_2.26x2.26x1.75um.tif: A high resolution episcopic microscopy (HREM) dataset, acquired in house by staining a healthy mouse liver with Eosin B and imaging with a standard HREM protocol. NB: 25% of this image volume was withheld from training, for use as test data.
    CT_murineTumour_20x20x20um.tif: X-ray microCT images of a microvascular cast, taken from a subcutaneous mouse model of colorectal cancer (acquired in house). NB: 25% of this image volume was withheld from training, for use as test data.
    RSOM_murineTumour_20x20um.tif: Raster-Scanning Optoacoustic Mesoscopy (RSOM) data from a subcutaneous tumour model (provided by Emma Brown, Bohndiek Group, University of Cambridge). The image data has undergone filtering to reduce the background (Brown et al., 2019).
    OCTA_humanRetina_24x24um.tif: Retinal angiography data obtained using Optical Coherence Tomography Angiography (OCT-A) (provided by Dr Ranjan Rajendram, Moorfields Eye Hospital).

    Test data:
    MRI_porcineLiver_0.9x0.9x5mm.tif: T1-weighted Balanced Turbo Field Echo Magnetic Resonance Imaging (MRI) data from a machine-perfused porcine liver, acquired in house.
    MFHREM_murineTumourLectin_2.76x2.76x2.61um.tif: A subcutaneous colorectal tumour mouse model imaged in house using multi-fluorescence HREM, with Dylight 647 conjugated lectin staining the vasculature (Walsh et al., 2021). The image data has been processed using an asymmetric deconvolution algorithm described by Walsh et al., 2020. NB: A sub-volume of 480x480x640 voxels was manually labelled (MFHREM_murineTumourLectin_subvolume480x480x640.tif).
    MFHREM_murineBrainLectin_0.85x0.85x0.86um.tif: An MF-HREM image of the cortex of a mouse brain, stained with Dylight-647 conjugated lectin, acquired in house (Walsh et al., 2021). The image data has been downsampled and processed using an asymmetric deconvolution algorithm described by Walsh et al., 2020. NB: A sub-volume of 1000x1000x99 voxels was manually labelled. This sub-volume is provided at full resolution and without preprocessing (MFHREM_murineBrainLectin_subvol_0.57x0.57x0.86um.tif).
    2Photon_murineOlfactoryBulbLectin_0.2x0.46x5.2um.tif: Two-photon data of mouse olfactory bulb blood vessels, labelled with sulforhodamine 101, kindly provided by Yuxin Zhang at the Sensory Circuits and Neurotechnology Lab, the Francis Crick Institute (Bosch et al., 2022). NB: A sub-volume of 500x500x79 voxels was manually labelled (2Photon_murineOlfactoryBulbLectin_subvolume500x500x79.tif).
    References:
    Bosch, C., Ackels, T., Pacureanu, A., Zhang, Y., Peddie, C. J., Berning, M., Rzepka, N., Zdora, M. C., Whiteley, I., Storm, M., Bonnin, A., Rau, C., Margrie, T., Collinson, L., & Schaefer, A. T. (2022). Functional and multiscale 3D structural investigation of brain tissue through correlative in vivo physiology, synchrotron microtomography and volume electron microscopy. Nature Communications, 13(1), 1–16. https://doi.org/10.1038/s41467-022-30199-6
    Brown, E., Brunker, J., & Bohndiek, S. E. (2019). Photoacoustic imaging as a tool to probe the tumour microenvironment. DMM Disease Models and Mechanisms, 12(7). https://doi.org/10.1242/DMM.039636
    Walsh, C., Holroyd, N. A., Finnerty, E., Ryan, S. G., Sweeney, P. W., Shipley, R. J., & Walker-Samuel, S. (2021). Multifluorescence High-Resolution Episcopic Microscopy for 3D Imaging of Adult Murine Organs. Advanced Photonics Research, 2(10), 2100110. https://doi.org/10.1002/ADPR.202100110
    Walsh, C., Holroyd, N., Shipley, R., & Walker-Samuel, S. (2020). Asymmetric Point Spread Function Estimation and Deconvolution for Serial-Sectioning Block-Face Imaging. Communications in Computer and Information Science, 1248 CCIS, 235–249. https://doi.org/10.1007/978-3-030-52791-4_19
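    As a quick sanity check, a paired image and label volume can be read into NumPy arrays with the tifffile package. This is a minimal loading sketch, not part of the tUbeNet code base; the file names follow the naming convention above but are otherwise illustrative.

    ```python
    import tifffile  # pip install tifffile

    # Paired grey-scale volume and binary vessel mask, named per the convention above (illustrative).
    image = tifffile.imread("CT_murineTumour_20x20x20um.tif")
    labels = tifffile.imread("CT_murineTumour_20x20x20um_labels.tif")

    assert image.shape == labels.shape, "image and label stacks should match voxel-for-voxel"
    print(image.shape, image.dtype, f"vessel voxel fraction: {(labels > 0).mean():.2%}")
    ```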

  3. Immersive viewing experience self-assessment.

    • figshare.com
    xls
    Updated Oct 17, 2024
    Cite
    Eric Allen Jensen; Kalina Borkiewicz; Jill P. Naiman; Stuart Levy; Jeff Carpenter (2024). Immersive viewing experience self-assessment. [Dataset]. http://doi.org/10.1371/journal.pone.0307733.t002
    Explore at:
    xls (available download formats)
    Dataset updated
    Oct 17, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Eric Allen Jensen; Kalina Borkiewicz; Jill P. Naiman; Stuart Levy; Jeff Carpenter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Visualizing research data can be an important science communication tool. In recent decades, 3D data visualization has emerged as a key tool for engaging public audiences. Such visualizations are often embedded in scientific documentaries screened on giant domes in planetariums or delivered through video streaming services such as Amazon Prime. 3D data visualization has been shown to be an effective way to communicate complex scientific concepts to the public, and cinematic-style 3D data visualization can convey information in a scientifically accurate and visually engaging way, making scientific information more understandable and interesting to millions of viewers. To support a wider shift in this professional field towards more evidence-based practice in 3D data visualization, we conducted a survey experiment comparing audience responses to two versions of 3D data visualizations from a scientific documentary film on the theme of ‘solar superstorms’ (n = 577). The study used a single-factor (two levels: labeled and unlabeled), between-subjects design. It reveals key strengths and weaknesses of communicating science using 3D data visualization, and it shows the limited power of strategically deployed informational labels to affect audience perceptions of the documentary film and its content. The major difference identified between experimental and control groups was that quality ratings of the documentary film clip were significantly higher for the ‘labeled’ version; other outcomes showed no statistically significant differences. The limited effects of informational labels suggest that other aspects, such as story structure, voiceover narration and audio-visual content, are more important determinants of outcomes. The study concludes with a discussion of how this new research evidence informs our understanding of ‘what works and why’ with cinematic-style 3D data visualizations for the public.

  4. AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 22, 2024
    Cite
    Deepa Krishnaswamy (2024). AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography imaging collections [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7473970
    Explore at:
    Dataset updated
    Jan 22, 2024
    Dataset provided by
    Dennis Bontempi
    Deepa Krishnaswamy
    Andrey Fedorov
    Hugo Aerts
    David Clunie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Public imaging datasets are critical for the development and evaluation of automated tools in cancer imaging. Unfortunately, many of the available datasets do not provide annotations of tumors or organs-at-risk, which are crucial for the assessment of these tools, because annotation of medical images is time-consuming and requires domain expertise. It has been demonstrated that artificial intelligence (AI) based annotation tools can achieve acceptable performance and thus can be used to automate the annotation of large datasets. As part of the effort to enrich the public data available within NCI Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov/) [1], we introduce this dataset, which consists of such AI-generated annotations for two publicly available medical imaging collections of Computed Tomography (CT) images of the chest. For detailed information concerning this dataset, please refer to our publication [2].

    We use publicly available pre-trained AI tools to enhance CT lung cancer collections that are unlabeled or partially labeled. The first tool is the nnU-Net deep learning framework [3] for volumetric segmentation of organs, where we use a pretrained model (Task D18 using the SegTHOR dataset) for labeling volumetric regions in the image corresponding to the heart, trachea, aorta and esophagus. These are the major organs-at-risk for radiation therapy for lung cancer. We further enhance these annotations by computing 3D shape radiomics features using the pyradiomics package [4]. The second tool is a pretrained model for per-slice automatic labeling of anatomic landmarks and imaged body part regions in axial CT volumes [5].
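    As an illustration of the shape-feature step described above (not the authors' exact configuration), pyradiomics can be restricted to its 3D shape feature class and run on an image/mask pair; the NRRD file names below are placeholders.

    ```python
    from radiomics import featureextractor  # pip install pyradiomics

    extractor = featureextractor.RadiomicsFeatureExtractor()
    extractor.disableAllFeatures()
    extractor.enableFeatureClassByName("shape")  # 3D shape descriptors only

    # Placeholder paths: a CT volume and one organ-at-risk mask in any SimpleITK-readable format.
    features = extractor.execute("ct_volume.nrrd", "heart_mask.nrrd")
    for name, value in features.items():
        if name.startswith("original_shape_"):
            print(name, value)
    ```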

    We focus on enhancing two publicly available collections, the Non-small Cell Lung Cancer Radiomics (NSCLC-Radiomics) collection [6,7] and the National Lung Screening Trial (NLST) collection [8,9]. The CT data for these collections are available both in The Cancer Imaging Archive (TCIA) [10] and in NCI Imaging Data Commons (IDC). Further, the NSCLC-Radiomics collection includes expert-generated manual annotations of several chest organs, allowing us to quantify the performance of the AI tools on that subset of the data.

    IDC relies on the DICOM standard to achieve FAIR [10] sharing of data and interoperability. Generated annotations are saved as DICOM Segmentation objects (volumetric segmentations of regions of interest) created using dcmqi [12], and as DICOM Structured Report (SR) objects (per-slice annotations of the body part imaged, anatomical landmarks and radiomics features) created using dcmqi and highdicom [13]. 3D shape radiomics features and corresponding DICOM SR objects are also provided for the manual segmentations available in the NSCLC-Radiomics collection.

    The dataset is available in IDC, and is accompanied by our publication here [2]. This pre-print details how the data were generated, and how the resulting DICOM objects can be interpreted and used in tools. Additionally, for further information about how to interact with and explore the dataset, please refer to our repository and accompanying Google Colaboratory notebook.

    The annotations are organized as follows. For NSCLC-Radiomics, three nnU-Net models were evaluated ('2d-tta', '3d_lowres-tta' and '3d_fullres-tta'). Within each folder, the PatientID and the StudyInstanceUID are subdirectories, and within this the DICOM Segmentation object and the DICOM SR for the 3D shape features are stored. A separate directory for the DICOM SR body part regression regions ('sr_regions') and landmarks ('sr_landmarks') are also provided with the same folder structure as above. Lastly, the DICOM SR for the existing manual annotations are provided in the 'sr_gt' directory. For NSCLC-Radiomics, each patient has a single StudyInstanceUID. The DICOM Segmentation and SR objects are named according to the SeriesInstanceUID of the original CT files.

    nsclc
      2d-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      3d_lowres-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      3d_fullres-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      sr_regions
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_regions_SR.dcm
      sr_landmarks
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_landmarks_SR.dcm
      sr_gt
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_features_SR.dcm

    For NLST, the '3d_fullres-tta' model was evaluated. The data is organized the same way as above: within each folder, the PatientID and the StudyInstanceUID are subdirectories. For the NLST collection, it is possible that some patients have more than one StudyInstanceUID subdirectory. Separate directories for the DICOM SR body part regions ('sr_regions') and landmarks ('sr_landmarks') are also provided. The DICOM Segmentation and SR objects are named according to the SeriesInstanceUID of the original CT files; a minimal traversal sketch follows the listing below.

    nlst
      3d_fullres-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      sr_regions
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_regions_SR.dcm
      sr_landmarks
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_landmarks_SR.dcm
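    The directory layouts above can be traversed with standard tools. The following is a minimal sketch (not from the dataset authors) that walks the NLST layout and lists the segments stored in each DICOM Segmentation object with pydicom; the local root path is an assumption about where the archive was unpacked. Reconstructing the voxel masks themselves can be done with the highdicom library mentioned above.

    ```python
    from pathlib import Path

    import pydicom  # pip install pydicom

    # <root>/<model>/<PatientID>/<StudyInstanceUID>/ReferencedSeriesInstanceUID_SEG.dcm
    root = Path("nlst") / "3d_fullres-tta"
    for seg_path in sorted(root.glob("*/*/*_SEG.dcm")):
        patient_id, study_uid = seg_path.parts[-3], seg_path.parts[-2]
        seg = pydicom.dcmread(seg_path, stop_before_pixels=True)
        segment_labels = [item.SegmentLabel for item in seg.SegmentSequence]
        print(patient_id, study_uid, segment_labels)  # e.g. heart, trachea, aorta, esophagus
    ```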

    The query used for NSCLC-Radiomics is here, and a list of corresponding SeriesInstanceUIDs (along with PatientIDs and StudyInstanceUIDs) is here. The query used for NLST is here, and a list of corresponding SeriesInstanceUIDs (along with PatientIDs and StudyInstanceUIDs) is here. The two csv files that describe the series analyzed, nsclc_series_analyzed.csv and nlst_series_analyzed.csv, are also available as uploads to this repository.

    Version updates:

    Version 2: For the regions SR and landmarks SR, changed to use a distinct TrackingUniqueIdentifier for each MeasurementGroup. Also instead of using TargetRegion, changed to use FindingSite. Additionally for the landmarks SR, the TopographicalModifier was made a child of FindingSite instead of a sibling.

    Version 3: Added the two csv files that describe which series were analyzed

    Version 4: Modified the landmarks SR because the TopographicalModifier for the Kidney landmark (bottom) did not describe the landmark correctly; the Kidney landmark is the "first slice where both kidneys can be seen well", so the TopographicalModifier was removed for that landmark. For the features SR, modified the units code for Flatness and Elongation, as we had incorrectly used mm instead of dimensionless (no) units.

  5. Assessments of video by audiences.

    • plos.figshare.com
    xls
    Updated Oct 17, 2024
    + more versions
    Cite
    Eric Allen Jensen; Kalina Borkiewicz; Jill P. Naiman; Stuart Levy; Jeff Carpenter (2024). Assessments of video by audiences. [Dataset]. http://doi.org/10.1371/journal.pone.0307733.t001
    Explore at:
    xls (available download formats)
    Dataset updated
    Oct 17, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Eric Allen Jensen; Kalina Borkiewicz; Jill P. Naiman; Stuart Levy; Jeff Carpenter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Visualizing research data can be an important science communication tool. In recent decades, 3D data visualization has emerged as a key tool for engaging public audiences. Such visualizations are often embedded in scientific documentaries screened on giant domes in planetariums or delivered through video streaming services such as Amazon Prime. 3D data visualization has been shown to be an effective way to communicate complex scientific concepts to the public, and cinematic-style 3D data visualization can convey information in a scientifically accurate and visually engaging way, making scientific information more understandable and interesting to millions of viewers. To support a wider shift in this professional field towards more evidence-based practice in 3D data visualization, we conducted a survey experiment comparing audience responses to two versions of 3D data visualizations from a scientific documentary film on the theme of ‘solar superstorms’ (n = 577). The study used a single-factor (two levels: labeled and unlabeled), between-subjects design. It reveals key strengths and weaknesses of communicating science using 3D data visualization, and it shows the limited power of strategically deployed informational labels to affect audience perceptions of the documentary film and its content. The major difference identified between experimental and control groups was that quality ratings of the documentary film clip were significantly higher for the ‘labeled’ version; other outcomes showed no statistically significant differences. The limited effects of informational labels suggest that other aspects, such as story structure, voiceover narration and audio-visual content, are more important determinants of outcomes. The study concludes with a discussion of how this new research evidence informs our understanding of ‘what works and why’ with cinematic-style 3D data visualizations for the public.

  6. LUMPI: The Leibniz University Multi-Perspective Intersection Dataset

    • service.tib.eu
    • data.uni-hannover.de
    Updated May 16, 2025
    Cite
    (2025). LUMPI: The Leibniz University Multi-Perspective Intersection Dataset [Dataset]. https://service.tib.eu/ldmservice/dataset/luh-lumpi
    Explore at:
    Dataset updated
    May 16, 2025
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    License information was derived automatically

    Description

    Increasing improvements in sensor technology as well as machine learning methods allow efficient collection, processing and analysis of the dynamic environment, which can be used for the detection and tracking of traffic participants. Current datasets in this domain mostly present a single view, which prevents highly accurate pose estimation because of occlusions. Integrating different, simultaneously acquired data makes it possible to exploit and develop collaboration principles that increase the quality, reliability and integrity of the derived information. This work addresses the problem by providing a multi-view dataset that includes 2D image information (videos) and 3D point clouds with labels of the traffic participants in the scene. The dataset was recorded during different weather and light conditions on several days at a large junction in Hanover, Germany.

    Paper
    Dataset teaser video: https://youtu.be/elwFdCu5IFo
    Dataset download path: https://data.uni-hannover.de/vault/ikg/busch/LUMPI/
    Labeling process pipeline video: https://youtu.be/Ns6qsHsb06E
    Python SDK: https://github.com/St3ff3nBusch/LUMPI-SDK-Python
    Labeling Tool / C++ SDK: https://github.com/St3ff3nBusch/LUMPI-Labeling

  7. PC-Urban Outdoordataset for 3D Point Cloud semantic segmentation

    • researchdata.edu.au
    Updated 2021
    Cite
    Ajmal Mian; Micheal Wise; Naveed Akhtar; Muhammad Ibrahim; Computer Science and Software Engineering (2021). PC-Urban Outdoordataset for 3D Point Cloud semantic segmentation [Dataset]. http://doi.org/10.21227/FVQD-K603
    Explore at:
    Dataset updated
    2021
    Dataset provided by
    IEEE DataPort
    The University of Western Australia
    Authors
    Ajmal Mian; Micheal Wise; Naveed Akhtar; Muhammad Ibrahim; Computer Science and Software Engineering
    Description

    The proposed dataset, termed PC-Urban (Urban Point Cloud), was captured with an Ouster LiDAR sensor with 64 channels. The sensor was installed on an SUV driven through the downtown of Perth, Western Australia. The dataset comprises over 4.3 billion points captured across 66K sensor frames. The labelled data is organized as registered and raw point cloud frames, where the former consists of varying numbers of registered consecutive frames. We provide 25 class labels in the dataset covering 23 million points and 5K instances. Labelling is performed with PC-Annotate and can easily be extended by end-users employing the same tool.

    The data is organized into unlabelled and labelled 3D point clouds. The unlabelled data is provided in .PCAP format, the direct output format of the Ouster LiDAR sensor. Raw frames are extracted from the recorded .PCAP files as Ply and Excel files using the Ouster Studio software. Labelled 3D point cloud data consists of registered or raw point clouds. A labelled point cloud is a combination of Ply, Excel, Labels and Summary files. The Ply file contains X, Y, Z values along with color information. The Excel file contains X, Y, Z values, Intensity, Reflectivity, Ring, Noise, and Range for each point; these attributes can be useful for semantic segmentation using deep learning algorithms. The Label and Label Summary files have been explained in the previous section. One GB of raw data contains nearly 1,300 raw frames, whereas 66,425 frames are provided in the dataset, each comprising 65,536 points; hence, 4.3 billion points captured with the Ouster LiDAR sensor are provided. Annotations of 25 general outdoor classes are provided, which include car, building, bridge, tree, road, letterbox, traffic signal, light-pole, rubbish bin, cycles, motorcycle, truck, bus, bushes, road sign board, advertising board, road divider, road lane, pedestrians, side-path, wall, bus stop, water, zebra-crossing, and background. With the released data, a total of 143 scenes are annotated, including both raw and registered frames.
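    A labelled frame's Ply geometry and Excel attributes can be read with plyfile and pandas. This is a minimal loading sketch, not part of the official PC-Annotate tooling; the file names are placeholders, and the Ply vertex property names and Excel column headers are assumed to match the attributes listed above.

    ```python
    import numpy as np
    import pandas as pd          # pip install pandas openpyxl
    from plyfile import PlyData  # pip install plyfile

    ply = PlyData.read("frame_000001.ply")  # X, Y, Z plus colour information
    xyz = np.column_stack([ply["vertex"]["x"], ply["vertex"]["y"], ply["vertex"]["z"]])

    attrs = pd.read_excel("frame_000001.xlsx")  # Intensity, Reflectivity, Ring, Noise, Range, ...
    print(xyz.shape, list(attrs.columns))
    ```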

  8. The INI-30 Dataset : Event Camera for Eye Tracking

    • zenodo.org
    zip
    Updated Jan 21, 2025
    Cite
    Pietro Bonazzi; Pietro Bonazzi (2025). The INI-30 Dataset : Event Camera for Eye Tracking [Dataset]. http://doi.org/10.5281/zenodo.11203260
    Explore at:
    zip (available download formats)
    Dataset updated
    Jan 21, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pietro Bonazzi; Pietro Bonazzi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Ini-30 dataset was collected with two event cameras mounted on a glasses frame, with one DVXplorer sensor (640 × 480 pixels) attached to each side of the frame. Power was provided via a 2 meter cable connecting the cameras to a computer, which allowed sufficient freedom of movement. Differently from [2, 24], the participants were not instructed to follow a dot on a screen, but rather encouraged to look around to collect natural eye movements. As shown in Fig. 1, the event cameras were securely screwed onto a 3D-printed case attached to the side of the glasses frame. The data was annotated on frames of linearly decayed event accumulation, where each pixel's intensity is a function of the linear accumulation of its previous intensity. The position of the pupil in the DVS array was then labeled manually, using an assistive labeling tool. We discarded the first 20 ms of events to ensure the eye was visible and annotations met the level of image-based annotators. The number of labels per recording was intentionally variable, spanning from 475 to 1,848, with a time per label ranging from 20.0 to 235.77 milliseconds depending on the overall duration of the sample. This setup allows for unconstrained head movements, enables capturing event data from eye movements in an "in-the-wild" setting, and allows the generation of a representative, unique, diverse and challenging dataset.
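    The annotation frames described above are built from events with a linear temporal decay. The sketch below (not the authors' code) shows one way such a frame could be computed, assuming events arrive as NumPy arrays of integer pixel coordinates and microsecond timestamps from a 640 × 480 DVXplorer sensor.

    ```python
    import numpy as np

    def decayed_accumulation(xs, ys, ts, t_ref, window_us=20_000, shape=(480, 640)):
        """Accumulate events up to t_ref, weighting each event linearly by recency."""
        frame = np.zeros(shape, dtype=np.float32)
        recent = (ts <= t_ref) & (ts > t_ref - window_us)
        weights = 1.0 - (t_ref - ts[recent]) / window_us  # 1 at t_ref, 0 at the window edge
        np.add.at(frame, (ys[recent], xs[recent]), weights)
        return frame
    ```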

  9. Segmentation of Image Data from Complex Organotypic 3D Models of Cancer...

    • plos.figshare.com
    avi
    Updated May 30, 2023
    Cite
    Sean Robinson; Laurent Guyon; Jaakko Nevalainen; Mervi Toriseva; Malin Åkerfelt; Matthias Nees (2023). Segmentation of Image Data from Complex Organotypic 3D Models of Cancer Tissues with Markov Random Fields [Dataset]. http://doi.org/10.1371/journal.pone.0143798
    Explore at:
    avi (available download formats)
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Sean Robinson; Laurent Guyon; Jaakko Nevalainen; Mervi Toriseva; Malin Åkerfelt; Matthias Nees
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Organotypic, three-dimensional (3D) cell culture models of epithelial tumour types such as prostate cancer recapitulate key aspects of the architecture and histology of solid cancers. Morphometric analysis of multicellular 3D organoids is particularly important when additional components such as the extracellular matrix and tumour microenvironment are included in the model. The complexity of such models has so far limited their successful implementation. There is a great need for automatic, accurate and robust image segmentation tools to facilitate the analysis of such biologically relevant 3D cell culture models. We present a segmentation method based on Markov random fields (MRFs) and illustrate our method using 3D stack image data from an organotypic 3D model of prostate cancer cells co-cultured with cancer-associated fibroblasts (CAFs). The 3D segmentation output suggests that these cell types are in physical contact with each other within the model, which has important implications for tumour biology. Segmentation performance is quantified using ground truth labels and we show how each step of our method increases segmentation accuracy. We provide the ground truth labels along with the image data and code. Using independent image data we show that our segmentation method is also more generally applicable to other types of cellular microscopy and not only limited to fluorescence microscopy.
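    For readers unfamiliar with the technique named in the abstract, the toy sketch below illustrates the general idea behind MRF-based refinement of a segmentation (a binary Potts model updated ICM-style). It is not the authors' method; the unary costs and initial labelling are assumed to come from, for example, simple intensity thresholding.

    ```python
    import numpy as np

    def icm_binary(unary_bg, unary_fg, labels, beta=1.0, n_iter=5):
        """Refine a 0/1 labelling of a 3D stack with a Potts smoothness prior.

        unary_bg, unary_fg: per-voxel data costs (same shape as labels).
        Note: np.roll wraps at the volume borders, which is acceptable for a sketch.
        """
        labels = labels.astype(np.uint8).copy()
        for _ in range(n_iter):
            nb_fg = np.zeros(labels.shape, dtype=np.float32)
            for axis in range(3):
                for shift in (1, -1):
                    nb_fg += np.roll(labels, shift, axis=axis)  # count foreground 6-neighbours
            cost_bg = unary_bg + beta * nb_fg           # neighbours labelled 1 penalise label 0
            cost_fg = unary_fg + beta * (6.0 - nb_fg)   # neighbours labelled 0 penalise label 1
            labels = (cost_fg < cost_bg).astype(np.uint8)
        return labels
    ```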

  10. Endomapper Dataset

    • paperswithcode.com
    Updated Apr 28, 2022
    Cite
    Pablo Azagra; Carlos Sostres; Ángel Ferrandez; Luis Riazuelo; Clara Tomasini; Oscar León Barbed; Javier Morlana; David Recasens; Victor M. Batlle; Juan J. Gómez-Rodríguez; Richard Elvira; Julia López; Cristina Oriol; Javier Civera; Juan D. Tardós; Ana Cristina Murillo; Angel Lanas; José M. M. Montiel (2022). Endomapper Dataset [Dataset]. https://paperswithcode.com/dataset/endomapper
    Explore at:
    Dataset updated
    Apr 28, 2022
    Authors
    Pablo Azagra; Carlos Sostres; Ángel Ferrandez; Luis Riazuelo; Clara Tomasini; Oscar León Barbed; Javier Morlana; David Recasens; Victor M. Batlle; Juan J. Gómez-Rodríguez; Richard Elvira; Julia López; Cristina Oriol; Javier Civera; Juan D. Tardós; Ana Cristina Murillo; Angel Lanas; José M. M. Montiel
    Description

    The Endomapper dataset is the first collection of complete endoscopy sequences acquired during regular medical practice, including slow and careful screening explorations, making secondary use of medical data. Its original purpose is to facilitate the development and evaluation of VSLAM (Visual Simultaneous Localization and Mapping) methods on real endoscopy data. The first release of the dataset is composed of 50 sequences with a total of more than 13 hours of video. It is also the first endoscopic dataset that includes both the computed geometric and photometric endoscope calibration and the original calibration videos. Metadata and annotations associated with the dataset range from anatomical landmark and procedure-description labeling, to tool segmentation masks, COLMAP 3D reconstructions, simulated sequences with ground truth, and metadata related to special cases such as sequences from the same patient. This information will improve research in endoscopic VSLAM and support other existing and new research lines.

  11. Optical signature dataset for living macrophages and monocytes

    • explore.openaire.eu
    • search.dataone.org
    • +2 more
    Updated Sep 27, 2022
    Cite
    David Dannhauser; Domenico Rossi; Vincenza De Gregorio; Paolo Antonio Netti; Giuseppe Terrazzano; Filippo Causa (2022). Optical signature dataset for living macrophages and monocytes [Dataset]. http://doi.org/10.5061/dryad.1ns1rn8wh
    Explore at:
    Dataset updated
    Sep 27, 2022
    Authors
    David Dannhauser; Domenico Rossi; Vincenza De Gregorio; Paolo Antonio Netti; Giuseppe Terrazzano; Filippo Causa
    Description

    Pro-inflammatory (M1) and anti-inflammatory (M2) macrophage phenotypes play a fundamental role in the immune response. The interplay between, and consequently the classification of, these two functional subtypes is significant for many therapeutic applications. However, fast classification of macrophage phenotypes is challenging. For instance, image-based classification systems need cell staining and coloration, which is usually time-consuming and costly, since multiple cell surface markers, transcription factors and cytokine profiles are needed. A simple alternative would be to identify such cell types using single-cell, label-free and high-throughput light scattering pattern analysis combined with a straightforward machine-learning-based classification. Here, we compared different machine learning algorithms to classify distinct macrophage phenotypes based on their optical signature obtained from an ad-hoc developed wide-angle static light scattering apparatus. As the main result, we were able to identify unpolarized macrophages from M1- and M2-polarized phenotypes and distinguish them from naive monocytes with an average accuracy above 85%. We therefore suggest that optical single-cell signatures within a lab-on-a-chip approach, along with machine learning, could be used as a fast, affordable, non-invasive macrophage phenotyping tool to supersede resource-intensive cell labelling.

    Fluid forces align cells in 3D along the centreline of a microfluidic device, where a collimated laser beam interacts with passing individual cells. The light interaction reveals significantly different scattering patterns (optical signatures) for each macrophage phenotype as well as for monocytes, which a camera-based readout system records. The obtained data is processed (dataset) and classified with machine learning to obtain a label-free macrophage phenotype classification. The dataset contains pooled data from three probands; the last column indicates the cell type (0 - monocyte, 1 - M0 macrophages, 2 - M1 macrophages, 3 - M2 macrophages). In addition, we provide the dataset of the PCR analysis and bright-field observations of the investigated cells from one proband.
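    Given the table format described above (feature columns followed by the cell-type code in the last column), a baseline classifier can be trained with scikit-learn. This is an illustrative sketch only, not the authors' pipeline; the CSV file name is a placeholder for however the pooled table is exported locally.

    ```python
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("optical_signatures.csv")              # placeholder export of the pooled table
    X, y = df.iloc[:, :-1].values, df.iloc[:, -1].values    # features | cell type (0-3)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print("hold-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))
    ```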
