100+ datasets found
  1. Data from: X-ray CT data with semantic annotations for the paper "A workflow...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Jun 5, 2025
    Cite
    Agricultural Research Service (2025). X-ray CT data with semantic annotations for the paper "A workflow for segmenting soil and plant X-ray CT images with deep learning in Google’s Colaboratory" [Dataset]. https://catalog.data.gov/dataset/x-ray-ct-data-with-semantic-annotations-for-the-paper-a-workflow-for-segmenting-soil-and-p-d195a
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    Leaves from genetically unique Juglans regia plants were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA. Soil samples were collected in fall of 2017 from the riparian oak forest located at the Russell Ranch Sustainable Agricultural Institute at the University of California, Davis. The soil was sieved through a 2 mm mesh and air dried before imaging. A single soil aggregate was scanned at 23 keV using the 10x objective lens with a pixel resolution of 650 nanometers on beamline 8.3.2 at the ALS. Additionally, a drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned using a 4x lens with a pixel resolution of 1.72 µm on beamline 8.3.2 at the ALS. Raw tomographic image data were reconstructed using TomoPy. Reconstructions were converted to 8-bit tif or png format using ImageJ or the PIL package in Python before further processing. Images were annotated using Intel's Computer Vision Annotation Tool (CVAT) and ImageJ. Both CVAT and ImageJ are free to use and open source.

    Leaf images were annotated following Théroux-Rancourt et al. (2020). Specifically, hand labeling was done directly in ImageJ by drawing around each tissue, with 5 images annotated per leaf. Care was taken to cover a range of anatomical variation to help improve the generalizability of the models to other leaves. All slices were labeled by Dr. Mina Momayyezi and Fiona Duong. To annotate the flower bud and soil aggregate, images were imported into CVAT. The exterior border of the bud (i.e., bud scales) and flower were annotated in CVAT and exported as masks. Similarly, the exterior of the soil aggregate and particulate organic matter identified by eye were annotated in CVAT and exported as masks. To annotate air spaces in both the bud and soil aggregate, images were imported into ImageJ. A Gaussian blur was applied to decrease noise, and the air space was then segmented using thresholding. After applying the threshold, the selected air space region was converted to a binary image, with white representing the air space and black representing everything else. This binary image was overlaid upon the original image, and the air space within the flower bud and aggregate was selected using the "free hand" tool. Air space outside of the region of interest for both image sets was eliminated. The quality of the air space annotation was then visually inspected for accuracy against the underlying original image; incomplete annotations were corrected using the brush or pencil tool to paint missing air space white and incorrectly identified air space black. Once the annotation was satisfactorily corrected, the binary image of the air space was saved. Finally, the annotations of the bud and flower, or aggregate and organic matter, were opened in ImageJ, and the associated air space mask was overlaid on top of them, forming a three-layer mask suitable for training the fully convolutional network. All labeling of the bud and soil aggregate images was done by Dr. Devin Rippner. These images and annotations are for training deep learning models to identify different constituents in leaves, almond buds, and soil aggregates.

    Limitations: For the walnut leaves, some tissues (stomata, etc.) are not labeled, and the images represent only a small portion of a full leaf. Similarly, the almond bud and the aggregate each represent just a single sample. The bud tissues are only divided into bud scales, flower, and air space; many other tissues remain unlabeled. For the soil aggregate, labels were assigned by eye with no supporting chemical information, so particulate organic matter identification may be incorrect.

    Resources in this dataset:

    Resource Title: Annotated X-ray CT images and masks of a Forest Soil Aggregate.
    File Name: forest_soil_images_masks_for_testing_training.zip
    Resource Description: This aggregate was collected from the riparian oak forest at the Russell Ranch Sustainable Agricultural Facility. The aggregate was scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 0,0,0; pore spaces have a value of 250,250,250; mineral solids have a value of 128,0,0; and particulate organic matter has a value of 000,128,000. These files were used for training a model to segment the forest soil aggregate and for testing the accuracy, precision, recall, and F1 score of the model.

    Resource Title: Annotated X-ray CT images and masks of an Almond bud (P. dulcis).
    File Name: Almond_bud_tube_D_P6_training_testing_images_and_masks.zip
    Resource Description: A drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned by X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 4x lens with a pixel resolution of 1.72 µm. For masks, the background has a value of 0,0,0; air spaces have a value of 255,255,255; bud scales have a value of 128,0,0; and flower tissues have a value of 000,128,000. These files were used for training a model to segment the almond bud and for testing the accuracy, precision, recall, and F1 score of the model.
    Resource Software Recommended: Fiji (ImageJ), URL: https://imagej.net/software/fiji/downloads

    Resource Title: Annotated X-ray CT images and masks of Walnut leaves (J. regia).
    File Name: 6_leaf_training_testing_images_and_masks_for_paper.zip
    Resource Description: Stems were collected from genetically unique J. regia accessions at the USDA-ARS-NCGR in Wolfskill Experimental Orchard, Winters, California, USA to use as scion, and were grafted by Sierra Gold Nursery onto a commonly used commercial rootstock, RX1 (J. microcarpa × J. regia). We used a common rootstock to eliminate any own-root effects and to simulate conditions for a commercial walnut orchard setting, where rootstocks are commonly used. The grafted saplings were repotted and transferred to the Armstrong lathe house facility at the University of California, Davis in June 2019, and kept under natural light and temperature. Leaves from each accession and treatment were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 170,170,170; epidermis 85,85,85; mesophyll 0,0,0; bundle sheath extension 152,152,152; vein 220,220,220; air 255,255,255.
    Resource Software Recommended: Fiji (ImageJ), URL: https://imagej.net/software/fiji/downloads
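    The RGB-coded masks described above can be converted to integer class maps for network training. A minimal sketch in Python, assuming the walnut-leaf color coding quoted above (the file name is a placeholder):

    import numpy as np
    from PIL import Image

    # Color-to-class mapping taken from the walnut-leaf masks described above.
    LEAF_CLASSES = {
        (170, 170, 170): 0,  # background
        (85, 85, 85): 1,     # epidermis
        (0, 0, 0): 2,        # mesophyll
        (152, 152, 152): 3,  # bundle sheath extension
        (220, 220, 220): 4,  # vein
        (255, 255, 255): 5,  # air
    }

    def mask_to_labels(path, color_map):
        """Map each RGB pixel to its class index; unmapped colors stay 0."""
        rgb = np.array(Image.open(path).convert("RGB"))
        labels = np.zeros(rgb.shape[:2], dtype=np.uint8)
        for color, idx in color_map.items():
            labels[np.all(rgb == color, axis=-1)] = idx
        return labels

    labels = mask_to_labels("leaf_mask_001.png", LEAF_CLASSES)  # placeholder path
    print(np.unique(labels))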

  2. Data from: ImageNet Dataset

    • paperswithcode.com
    Updated Apr 15, 2024
    Cite
    Jia Deng; Wei Dong; Richard Socher; Li-Jia Li; Kai Li; Fei-Fei Li (2024). ImageNet Dataset [Dataset]. https://paperswithcode.com/dataset/imagenet
    Dataset updated
    Apr 15, 2024
    Authors
    Jia Deng; Wei Dong; Richard Socher; Li-Jia Li; Kai Li; Fei-Fei Li
    Description

    The ImageNet dataset contains 14,197,122 annotated images organized according to the WordNet hierarchy. Since 2010 the dataset has been used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), a benchmark in image classification and object detection. The publicly released dataset contains a set of manually annotated training images. A set of test images is also released, with the manual annotations withheld. ILSVRC annotations fall into one of two categories: (1) image-level annotation of a binary label for the presence or absence of an object class in the image, e.g., “there are cars in this image” but “there are no tigers”; and (2) object-level annotation of a tight bounding box and class label around an object instance in the image, e.g., “there is a screwdriver centered at position (20,25) with width of 50 pixels and height of 30 pixels”. The ImageNet project does not own the copyright of the images; therefore, only thumbnails and URLs of images are provided.

    Total number of non-empty WordNet synsets: 21,841
    Total number of images: 14,197,122
    Number of images with bounding box annotations: 1,034,908
    Number of synsets with SIFT features: 1,000
    Number of images with SIFT features: 1.2 million
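    Object-level ILSVRC annotations are commonly distributed as PASCAL VOC-style XML. A minimal parsing sketch (element names follow the VOC layout; the path is a placeholder):

    import xml.etree.ElementTree as ET

    def read_boxes(xml_path):
        """Collect (synset label, xmin, ymin, xmax, ymax) tuples from one file."""
        root = ET.parse(xml_path).getroot()
        boxes = []
        for obj in root.iter("object"):
            name = obj.findtext("name")  # WordNet synset ID, e.g. "n02958343"
            bb = obj.find("bndbox")
            boxes.append((name,
                          int(bb.findtext("xmin")), int(bb.findtext("ymin")),
                          int(bb.findtext("xmax")), int(bb.findtext("ymax"))))
        return boxes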

  3. Data Annotation Tools Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Jun 30, 2025
    Cite
    Growth Market Reports (2025). Data Annotation Tools Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-annotation-tools-market-global-geographical-industry-analysis
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Annotation Tools Market Outlook



    According to our latest research, the global Data Annotation Tools market size reached USD 2.1 billion in 2024. The market is set to expand at a robust CAGR of 26.7% from 2025 to 2033, projecting a remarkable value of USD 18.1 billion by 2033. The primary growth driver for this market is the escalating adoption of artificial intelligence (AI) and machine learning (ML) across various industries, which necessitates high-quality labeled data for model training and validation.




    One of the most significant growth factors propelling the data annotation tools market is the exponential rise in AI-powered applications across sectors such as healthcare, automotive, retail, and BFSI. As organizations increasingly integrate AI and ML into their core operations, the demand for accurately annotated data has surged. Data annotation tools play a crucial role in transforming raw, unstructured data into structured, labeled datasets that can be efficiently used to train sophisticated algorithms. The proliferation of deep learning and natural language processing technologies further amplifies the need for comprehensive data labeling solutions. This trend is particularly evident in industries like healthcare, where annotated medical images are vital for diagnostic algorithms, and in automotive, where labeled sensor data supports the evolution of autonomous vehicles.




    Another prominent driver is the shift toward automation and digital transformation, which has accelerated the deployment of data annotation tools. Enterprises are increasingly adopting automated and semi-automated annotation platforms to enhance productivity, reduce manual errors, and streamline the data preparation process. The emergence of cloud-based annotation solutions has also contributed to market growth by enabling remote collaboration, scalability, and integration with advanced AI development pipelines. Furthermore, the growing complexity and variety of data types, including text, audio, image, and video, necessitate versatile annotation tools capable of handling multimodal datasets, thus broadening the market's scope and applications.




    The market is also benefiting from a surge in government and private investments aimed at fostering AI innovation and digital infrastructure. Several governments across North America, Europe, and Asia Pacific have launched initiatives and funding programs to support AI research and development, including the creation of high-quality, annotated datasets. These efforts are complemented by strategic partnerships between technology vendors, research institutions, and enterprises, which are collectively advancing the capabilities of data annotation tools. As regulatory standards for data privacy and security become more stringent, there is an increasing emphasis on secure, compliant annotation solutions, further driving innovation and market demand.




    From a regional perspective, North America currently dominates the data annotation tools market, driven by the presence of major technology companies, well-established AI research ecosystems, and significant investments in digital transformation. However, Asia Pacific is emerging as the fastest-growing region, fueled by rapid industrialization, expanding IT infrastructure, and a burgeoning startup ecosystem focused on AI and data science. Europe also holds a substantial market share, supported by robust regulatory frameworks and active participation in AI research. Latin America and the Middle East & Africa are gradually catching up, with increasing adoption in sectors such as retail, automotive, and government. The global landscape is characterized by dynamic regional trends, with each market contributing uniquely to the overall growth trajectory.





    Component Analysis



    The data annotation tools market is segmented by component into software and services, each playing a pivotal role in the market's overall ecosystem. Software solutions form the backbone of the market, providing the technical infrastructure for auto…

  4. Car Highway Dataset

    • universe.roboflow.com
    zip
    Updated Sep 13, 2023
    Cite
    Sallar (2023). Car Highway Dataset [Dataset]. https://universe.roboflow.com/sallar/car-highway/dataset/1
    Dataset updated
    Sep 13, 2023
    Dataset authored and provided by
    Sallar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Vehicles Bounding Boxes
    Description

    Car-Highway Data Annotation Project

    Introduction

    In this project, we aim to annotate car images captured on highways. The annotated data will be used to train machine learning models for various computer vision tasks, such as object detection and classification.

    Project Goals

    • Collect a diverse dataset of car images from highway scenes.
    • Annotate the dataset to identify and label cars within each image.
    • Organize and format the annotated data for machine learning model training.

    Tools and Technologies

    For this project, we will be using Roboflow, a powerful platform for data annotation and preprocessing. Roboflow simplifies the annotation process and provides tools for data augmentation and transformation.

    Annotation Process

    1. Upload the raw car images to the Roboflow platform.
    2. Use the annotation tools in Roboflow to draw bounding boxes around each car in the images.
    3. Label each bounding box with the corresponding class (e.g., car).
    4. Review and validate the annotations for accuracy.

    Data Augmentation

    Roboflow offers data augmentation capabilities, such as rotation, flipping, and resizing. These augmentations can help improve the model's robustness.

    Data Export

    Once the data is annotated and augmented, Roboflow allows us to export the dataset in various formats suitable for training machine learning models, such as YOLO, COCO, or TensorFlow Record.
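    For instance, YOLO-format exports store one text file per image, each line holding a class index and a normalized, center-based box. A minimal reader sketch following the standard YOLO convention (paths are placeholders):

    def load_yolo_labels(txt_path, img_w, img_h):
        """Convert normalized 'cls xc yc w h' lines to (cls, x1, y1, x2, y2) in pixels."""
        boxes = []
        with open(txt_path) as f:
            for line in f:
                if not line.strip():
                    continue
                cls, xc, yc, w, h = line.split()
                xc, yc, w, h = map(float, (xc, yc, w, h))
                boxes.append((int(cls),
                              (xc - w / 2) * img_w, (yc - h / 2) * img_h,
                              (xc + w / 2) * img_w, (yc + h / 2) * img_h))
        return boxes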

    Milestones

    1. Data Collection and Preprocessing
    2. Annotation of Car Images
    3. Data Augmentation
    4. Data Export
    5. Model Training

    Conclusion

    By completing this project, we will have a well-annotated dataset ready for training machine learning models. This dataset can be used for a wide range of applications in computer vision, including car detection and tracking on highways.

  5. Image Dataset of Accessibility Barriers

    • zenodo.org
    • explore.openaire.eu
    zip
    Updated Mar 25, 2022
    Cite
    Jakob Stolberg (2022). Image Dataset of Accessibility Barriers [Dataset]. http://doi.org/10.5281/zenodo.6382090
    Dataset updated
    Mar 25, 2022
    Dataset provided by
    Zenodo
    Authors
    Jakob Stolberg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Data
    The dataset consists of 5538 images of public spaces, annotated with steps, stairs, ramps, and grab bars for stairs and ramps. The dataset has 3564 annotations of steps, 1492 of stairs, 143 of ramps, and 922 of grab bars.

    Each step annotation is attributed with an estimate of the height of the step, falling into one of three categories: less than 3cm, 3cm to 7cm, or more than 7cm. Additionally, it is attributed with a 'type', with the possible values 'doorstep', 'curb', or 'other'.

    Stair annotations are attributed with the number of steps in the stair.

    Ramps are attributed with an estimate of their width, also falling into three categories: less than 50cm, 50cm to 100cm and more than 100cm.

    In order to preserve all additional attributes of the labels, the data is published in the CVAT XML format for images.
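    A minimal sketch of reading boxes and their attributes back from such a file, assuming CVAT's image-annotation XML layout; the label and attribute names ("step", "height") are assumptions based on the description here, and the filter mirrors the post-hoc selection discussed under Usage below:

    import xml.etree.ElementTree as ET

    def steps_above_3cm(cvat_xml):
        """Yield (image name, box coords) for step annotations estimated above 3 cm."""
        root = ET.parse(cvat_xml).getroot()
        for image in root.iter("image"):
            for box in image.iter("box"):
                if box.get("label") != "step":          # label name assumed
                    continue
                attrs = {a.get("name"): a.text for a in box.iter("attribute")}
                if attrs.get("height") in ("3cm to 7cm", "more than 7cm"):
                    yield image.get("name"), tuple(
                        float(box.get(k)) for k in ("xtl", "ytl", "xbr", "ybr"))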

    Annotating Process
    The labelling has been done using bounding boxes around the objects. This format is compatible with many popular object detection models, e.g. the YOLO object model. A bounding box is placed so it contains exactly the visible part of the respective object. This implies that only objects that are visible in the photo are annotated. In particular, a photo of a stair or step taken from above, where the object cannot be seen, has not been annotated, even when a human viewer could infer that there is a stair or a step from other features in the photo.

    Steps
    A step is annotated when there is a vertical increment that functions as a passage between two surface areas intended for human or vehicle traffic. This means that we have not included:

    • Increments that are too high to reasonably be considered a passage.
    • Increments that do not lead to a surface intended for human or vehicle traffic, e.g. a 'step' in front of a wall or a curb in front of a bush.

    In particular, the bounding box of a step object contains exactly the incremental part of the step, but does not extend into the top or bottom horizontal surface any more than necessary to entirely enclose the incremental part. This was chosen for consistency, as including parts of the horizontal surfaces would imply a non-trivial choice of how much to include, which we deemed would most likely lead to more inconsistent annotations.

    The height of the steps is estimated by the annotators and is therefore not guaranteed to be accurate.

    The type of a step typically falls into the category 'doorstep' or 'curb'. Steps that are in a doorway, entrance, or similar are attributed as doorsteps. We also include in this category steps that immediately lead to a doorway within a proximity of 1-2m. Steps between different types of pathways, e.g. between streets and sidewalks, are annotated as curbs. Any other type of step is annotated as 'other'. Many of the 'other' steps are, for example, steps to terraces.

    Stairs
    The stair label is used whenever two or more steps directly follow each other in a consistent pattern. All vertical increments are enclosed in the bounding box, as well as the intermediate surfaces of the steps. However, the top and bottom surfaces are not included more than necessary, for the same reason as for steps, as described in the previous section.

    The annotator counts the number of steps and attributes this count to the stair object label.

    Ramps
    Ramps have been annotated when a sloped passageway has been placed or built to connect two surface areas intended for human or vehicle traffic. This implies the same considerations as with steps. Likewise, only the sloped part of a ramp is annotated, not including the bottom or top surface area.

    For each ramp, the annotator makes an assessment of the width of the ramp in three categories: less than 50cm, 50cm to 100cm and more than 100cm. This parameter is visually hard to assess, and sometimes impossible due to the view of the ramp.

    Grab Bars
    Grab bars are annotated for hand rails and similar objects that are in direct connection to a stair or a ramp. While horizontal grab bars could also have been included, this was omitted due to the implied ambiguities with fences and similar objects. As the grab bar was originally intended as attribute information for stairs and ramps, we chose to keep this focus. The bounding box encloses the part of the grab bar that functions as a hand rail for the stair or ramp.

    Usage
    As is often the case when annotating data, much information depends on the subjective assessment of the annotator. As each data point in this dataset has been annotated by only one person, caution should be taken if the data is applied.

    Generally speaking, the mindset and usage guiding the annotations has been wheelchair accessibility. While we have strived to annotate at an object level, hopefully making the data more widely applicable than this, we state it explicitly as it may have swayed non-trivial annotation choices.

    The attribute data, such as step height or ramp width, are highly subjective estimations. We still provide this data to give a post-hoc method for adjusting which annotations to use. E.g., for some purposes one may be interested in detecting only steps that are indeed more than 3cm high. The attribute data make it possible to sort away the steps less than 3cm, so a machine learning algorithm can be trained on a more appropriate dataset for that use case. We stress, however, that one cannot expect to train accurate machine learning algorithms to infer the attribute data, as they are not accurate in the first place.

    We hope this dataset will be a useful building block in the endeavours for automating barrier detection and documentation.

  6. 25M+ Images | AI Training Data | Annotated imagery data for AI | Object &...

    • datarade.ai
    Cite
    Data Seeds, 25M+ Images | AI Training Data | Annotated imagery data for AI | Object & Scene Detection | Global Coverage [Dataset]. https://datarade.ai/data-products/15m-images-ai-training-data-annotated-imagery-data-for-a-data-seeds
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset authored and provided by
    Data Seeds
    Area covered
    Yemen, Saint Lucia, French Polynesia, Barbados, United Arab Emirates, Iceland, Liberia, Morocco, Virgin Islands (U.S.), Nepal
    Description

    This dataset features over 25,000,000 high-quality general-purpose images sourced from photographers worldwide. Designed to support a wide range of AI and machine learning applications, it offers a richly diverse and extensively annotated collection of everyday visual content.

    Key Features:

    1. Comprehensive Metadata: the dataset includes full EXIF data, detailing camera settings such as aperture, ISO, shutter speed, and focal length. Additionally, each image is pre-annotated with object and scene detection metadata, making it ideal for tasks like classification, detection, and segmentation. Popularity metrics, derived from engagement on our proprietary platform, are also included.

    2. Unique Sourcing Capabilities: the images are collected through a proprietary gamified platform for photographers. Competitions spanning various themes ensure a steady influx of diverse, high-quality submissions. Custom datasets can be sourced on-demand within 72 hours, allowing for specific requirements—such as themes, subjects, or scenarios—to be met efficiently.

    3. Global Diversity: photographs have been sourced from contributors in over 100 countries, covering a wide range of human experiences, cultures, environments, and activities. The dataset includes images of people, nature, objects, animals, urban and rural life, and more—captured across different times of day, seasons, and lighting conditions.

    4. High-Quality Imagery: the dataset includes images with resolutions ranging from standard to high-definition to meet the needs of various projects. Both professional and amateur photography styles are represented, offering a balance of realism and creativity across visual domains.

    5. Popularity Scores: each image is assigned a popularity score based on its performance in GuruShots competitions. This unique metric reflects how well the image resonates with a global audience, offering an additional layer of insight for AI models focused on aesthetics, engagement, or content curation.

    6. AI-Ready Design: this dataset is optimized for AI applications, making it ideal for training models in general image recognition, multi-label classification, content filtering, and scene understanding. It integrates easily with leading machine learning frameworks and pipelines.

    7. Licensing & Compliance: the dataset complies fully with data privacy regulations and offers transparent licensing for both commercial and academic use.

    Use Cases:
    1. Training AI models for general-purpose image classification and tagging.
    2. Enhancing content moderation and visual search systems.
    3. Building foundational datasets for large-scale vision-language models.
    4. Supporting research in computer vision, multimodal AI, and generative modeling.

    This dataset offers a comprehensive, diverse, and high-quality resource for training AI and ML models across a wide array of domains. Customizations are available to suit specific project needs. Contact us to learn more!

  7. Complete Sample Annotate Data Dataset

    • universe.roboflow.com
    zip
    Updated Sep 14, 2024
    Cite
    he he (2024). Complete Sample Annotate Data Dataset [Dataset]. https://universe.roboflow.com/he-he/complete-sample-annotate-data
    Dataset updated
    Sep 14, 2024
    Dataset authored and provided by
    he he
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Variables measured
    Metre Inch Ounce Pound Metremillimetre Meter Inch Yard Cup Millivolt Killovolt Volt Bounding Boxes
    Description

    Complete Sample Annotate Data

    ## Overview
    
    Complete Sample Annotate Data is a dataset for object detection tasks - it contains Metre Inch Ounce Pound Metremillimetre Meter Inch Yard Cup Millivolt Killovolt Volt annotations for 538 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [MIT license](https://creativecommons.org/licenses/MIT).
    
  8. St Bees acoustic sensor data annotations

    • researchdatafinder.qut.edu.au
    • researchdata.edu.au
    Updated Dec 6, 2010
    Cite
    Paul Roe (2010). St Bees acoustic sensor data annotations [Dataset]. https://researchdatafinder.qut.edu.au/individual/q82
    Dataset updated
    Dec 6, 2010
    Dataset provided by
    Queensland University of Technology (QUT)
    Authors
    Paul Roe
    Description

    This dataset is the tagged csv file resulting from a study investigating the vocalisations of koala populations on St Bees Island. Audio data can be retrieved by date and time period and by searching annotation tags which have been applied to the audio recordings (for example, it is possible to search for all audio samples tagged with Kookaburra). Researchers can download audio files and csv files containing information about the tags specified in the search. The 'tag' file includes: Tag Name, Start Time, End Time, Max Frequency (hz), Min Frequency (hz), Project Site, Sensor Name, Score, and a link to the specific audio sample associated with the individual tag.
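    A minimal sketch of filtering the tag csv, assuming the column names listed above (the file name is a placeholder):

    import pandas as pd

    tags = pd.read_csv("st_bees_tags.csv")   # placeholder file name
    # Column names as listed in the description above.
    koala = tags[tags["Tag Name"].str.contains("Koala", case=False, na=False)]
    print(koala[["Start Time", "End Time",
                 "Max Frequency (hz)", "Min Frequency (hz)", "Sensor Name"]])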

  9. EVICAN Dataset

    • paperswithcode.com
    Updated Apr 7, 2020
    Cite
    (2020). EVICAN Dataset [Dataset]. https://paperswithcode.com/dataset/evican
    Dataset updated
    Apr 7, 2020
    Description

    Deep learning use for quantitative image analysis is increasing exponentially. However, training accurate, widely deployable deep learning algorithms requires a plethora of annotated (ground truth) data. Image collections must contain not only thousands of images to provide sufficient example objects (i.e. cells), but also an adequate degree of image heterogeneity. We present a new dataset, EVICAN (Expert Visual Cell Annotation), comprising partially annotated grayscale images of 30 different cell lines from multiple microscopes, contrast mechanisms and magnifications that is readily usable as training data for computer vision applications. With 4600 images and ∼26,000 segmented cells, our collection offers an unparalleled heterogeneous training dataset for cell biology deep learning application development. The dataset is freely available (https://edmond.mpdl.mpg.de/imeji/collection/l45s16atmi6Aa4sI?q=).

  10. Image Tagging and Annotation Services Market Report | Global Forecast From...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Image Tagging and Annotation Services Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-image-tagging-and-annotation-services-market
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Image Tagging and Annotation Services Market Outlook



    The global image tagging and annotation services market size was valued at approximately USD 1.5 billion in 2023 and is projected to reach around USD 4.8 billion by 2032, growing at a compound annual growth rate (CAGR) of about 14%. This robust growth is driven by the exponential rise in demand for machine learning and artificial intelligence applications, which heavily rely on annotated datasets to train algorithms effectively. The surge in digital content creation and the increasing need for organized data for analytical purposes are also significant contributors to the market expansion.



    One of the primary growth factors for the image tagging and annotation services market is the increasing adoption of AI and machine learning technologies across various industries. These technologies require large volumes of accurately labeled data to function optimally, making image tagging and annotation services crucial. Specifically, sectors such as healthcare, automotive, and retail are investing in AI-driven solutions that necessitate high-quality annotated images to enhance machine learning models' efficiency. For example, in healthcare, annotated medical images are essential for developing tools that can aid in diagnostics and treatment decisions. Similarly, in the automotive industry, annotated images are pivotal for the development of autonomous vehicles.



    Another significant driver is the growing emphasis on improving customer experience through personalized solutions. Companies are leveraging image tagging and annotation services to better understand consumer behavior and preferences by analyzing visual content. In retail, for instance, businesses analyze customer-generated images to tailor marketing strategies and improve product offerings. Additionally, the integration of augmented reality (AR) and virtual reality (VR) in various applications has escalated the need for precise image tagging and annotation, as these technologies rely on accurately labeled datasets to deliver immersive experiences.



    Data Collection and Labeling are foundational components in the realm of image tagging and annotation services. The process of collecting and labeling data involves gathering vast amounts of raw data and meticulously annotating it to create structured datasets. These datasets are crucial for training machine learning models, enabling them to recognize patterns and make informed decisions. The accuracy of data labeling directly impacts the performance of AI systems, making it a critical step in the development of reliable AI applications. As industries increasingly rely on AI-driven solutions, the demand for high-quality data collection and labeling services continues to rise, underscoring their importance in the broader market landscape.



    The rising trend of digital transformation across industries has also significantly bolstered the demand for image tagging and annotation services. Organizations are increasingly investing in digital tools that can automate processes and enhance productivity. Image annotation plays a critical role in enabling technologies such as computer vision, which is instrumental in automating tasks ranging from quality control to inventory management. Moreover, the proliferation of smart devices and the Internet of Things (IoT) has led to an unprecedented amount of image data generation, further fueling the need for efficient image tagging and annotation services to make sense of the vast data deluge.



    From a regional perspective, North America is currently the largest market for image tagging and annotation services, attributed to the early adoption of advanced technologies and the presence of numerous tech giants investing in AI and machine learning. The region is expected to maintain its dominance due to ongoing technological advancements and the growing demand for AI solutions across various sectors. Meanwhile, the Asia Pacific region is anticipated to experience the fastest growth during the forecast period, driven by rapid industrialization, increasing internet penetration, and the rising adoption of AI technologies in countries like China, India, and Japan. The European market is also witnessing steady growth, supported by government initiatives promoting digital innovation and the use of AI-driven applications.



    Service Type Analysis



    The service type segment in the image tagging and annotation services market is bifurcated into manual annotation and automated annotation…

  11. Ground truth annotations for boiling bubble detection and measurement in...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 19, 2023
    Cite
    Xenophon Zabulis (2023). Ground truth annotations for boiling bubble detection and measurement in microgravity [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7553796
    Dataset updated
    Feb 19, 2023
    Dataset provided by
    Thodoris Karapantsios
    Margaritis Kostoglou
    Axel Sielaff
    Xenophon Zabulis
    Peter Stephan
    Sotiris Evgenidis
    Ourania Oikonomidou
    Polykarpos Karamaounas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a dataset of ground truth annotations for benchmark data provided in A. Sielaff, D. Mangini, O. Kabov, M. Raza, A. Garivalis, M. Zupančič, S. Dehaeck, S. Evgenidis, C. Jacobs, D. Van Hoof, O. Oikonomidou, X. Zabulis, P. Karamaounas, A. Bender, F. Ronshin, M. Schinnerl, J. Sebilleau, C. Colin, P. Di Marco, T. Karapantsios, I. Golobič, A. Rednikov, P. Colinet, P. Stephan, L. Tadrist, "The multiscale boiling investigation on-board the International Space Station: An overview," Applied Thermal Engineering 205 (2022) 117932. doi:10.1016/j.applthermaleng.2021.117932.

    The annotations regard the 15 image sequences provided in the benchmark data and denoted as D1-D15.

    The annotators were asked to localize the contact points and points on the bubble boundary so an adequate contour identification is provided, according to the judgement of the expert. The annotators were two multiphase dynamics experts (RO, SE) and one image processing expert (ICS). The annotators used custom-made software to pinpoint samples upon contour locations in the images carefully, using magnification, undo, and editing facilities. The experts annotated the contact points and multiple points on the contour of the bubble until they were satisfied with the result.

    The annotations were collected for the first bubble of each sequence. For each bubble, 20 frames were sampled in chronological order and in equidistant temporal steps and annotated. All experts annotated data sets D1-D15. The rest were annotated by ICS after learning annotation insights from the multiphase dynamics experts.

    The format of the dataset is as follows. A directory is dedicated to each bubble annotation. The directory name notes the number of the dataset and the annotator id. Each directory contains 20 text files and 20 corresponding images. Each text file contains a list of the 2D coordinates of one bubble annotation. The first coordinate marks the left contact point and the last coordinate marks the right contact point. These coordinates refer to a corresponding image contained in the same directory. Text files and image files are matched through their file names, which contain the frame number. The frame number refers to the image sequence. Images are in lossless PNG format.
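    A minimal loading sketch following the layout above (the whitespace-separated two-column text format is an assumption; file names are placeholders):

    import numpy as np
    from PIL import Image

    def load_annotation(txt_path, png_path):
        """Return image, contour points, and the left/right contact points."""
        points = np.loadtxt(txt_path)            # one 2D coordinate per line (assumed)
        image = np.array(Image.open(png_path))
        return image, points, points[0], points[-1]  # first = left, last = right contact point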

  12. Data from: BreCaHAD: A Dataset for Breast Cancer Histopathological...

    • figshare.com
    png
    Updated Jan 28, 2019
    Cite
    Alper Aksac; Douglas J. Demetrick; Tansel Özyer; Reda Alhajj (2019). BreCaHAD: A Dataset for Breast Cancer Histopathological Annotation and Diagnosis [Dataset]. http://doi.org/10.6084/m9.figshare.7379186.v3
    Dataset updated
    Jan 28, 2019
    Dataset provided by
    figshare
    Authors
    Alper Aksac; Douglas J. Demetrick; Tansel Özyer; Reda Alhajj
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset consists of 1 .xlsx file, 2 .png files, 1 .json file, and 1 .zip file:
    annotation_details.xlsx: The distribution of annotations across the six classes (mitosis, apoptosis, tumor nuclei, non-tumor nuclei, tubule, and non-tubule), presented as an Excel spreadsheet.
    original.png: The input image.
    annotated.png: An example from the dataset. In the annotated image, blue circles indicate tumor nuclei; pink circles show non-tumor nuclei such as blood cells, stroma nuclei, and lymphocytes; orange and green circles are mitosis and apoptosis, respectively; light blue circles are true lumen for tubules; and yellow circles represent white regions (non-lumen) such as fat, blood vessels, and broken tissue.
    data.json: The annotations for the BreCaHAD dataset, provided in JSON (JavaScript Object Notation) format. In the given example, the JSON file (ground truth) contains two mitosis annotations and only one tumor nuclei annotation. Here, x and y are the coordinates of the centroid of the annotated object, with values between 0 and 1.
    BreCaHAD.zip: An archive file containing the dataset. Three folders are included: images (original images), groundTruth (json files), and groundTruth_display (groundTruth applied to original images).
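    A minimal sketch of using the ground truth, assuming data.json holds per-class lists of normalized {x, y} centroids as described (the class key used here is illustrative; the actual schema ships in the archive):

    import json
    from PIL import Image

    img = Image.open("original.png")
    w, h = img.size
    with open("data.json") as f:
        gt = json.load(f)

    # Scale normalized centroids (values in [0, 1]) to pixel coordinates;
    # the class key "tumor_nuclei" is illustrative.
    centroids = [(p["x"] * w, p["y"] * h) for p in gt.get("tumor_nuclei", [])]
    print(len(centroids), "tumor nuclei centroids:", centroids[:3])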

  13. Data for the evaluation of the MAIA method for image annotation

    • zenodo.org
    • eprints.soton.ac.uk
    csv
    Updated Jan 24, 2020
    Cite
    Martin Zurowietz; Daniel Langenkämper; Brett Hosking; Henry A Ruhl; Tim W Nattkemper (2020). Data for the evaluation of the MAIA method for image annotation [Dataset]. http://doi.org/10.5281/zenodo.1453836
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Martin Zurowietz; Daniel Langenkämper; Brett Hosking; Henry A Ruhl; Tim W Nattkemper
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains all annotations and annotation candidates that were used for the evaluation of the MAIA method for image annotation. Each row in the CSVs represents one annotation candidate or final annotation. Annotation candidates have the label "OOI candidate" (label_id 9974). All other entries represent final reviewed annotations. Each CSV contains the information for one of the three image datasets that were used in the evaluation.
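    A minimal sketch of splitting one CSV into candidates and final annotations, assuming a label_id column as described (the file name is a placeholder):

    import pandas as pd

    ann = pd.read_csv("maia_dataset_1.csv")        # placeholder file name
    candidates = ann[ann["label_id"] == 9974]      # "OOI candidate" rows
    finals = ann[ann["label_id"] != 9974]          # reviewed final annotations
    print(len(candidates), "candidates,", len(finals), "final annotations")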

    Visual exploration of the data is possible in the BIIGLE 2.0 image annotation system at https://biigle.de/projects/139 using the login maia@example.com and the password MAIApaper.

  14. Annotated Imagery Data | AI Training Data | Face ID + 106 key points facial...

    • datarade.ai
    Updated Nov 25, 2022
    Cite
    Pixta AI (2022). Annotated Imagery Data | AI Training Data| Face ID + 106 key points facial landmark images | 30,000 Stock Images [Dataset]. https://datarade.ai/data-products/unique-face-ids-with-facial-landmark-106-key-points-pixta-ai
    Available download formats: .json, .xml, .csv, .txt
    Dataset updated
    Nov 25, 2022
    Dataset authored and provided by
    Pixta AI
    Area covered
    Vietnam, Belgium, Poland, New Zealand, Korea (Republic of), Spain, Canada, Portugal, Australia, Malaysia
    Description
    1. Overview: This dataset is a collection of 30,000+ images of Face ID + 106 key points facial landmark, ready to use for optimizing the accuracy of computer vision models. Images in the dataset are of people meeting the following requirements:
    • Age: above 20
    • Race: various
    • Angle: no more than 90 degrees
    All of the content is sourced from PIXTA's stock library of 100M+ Asian-featured images and videos.

    2. Annotated Imagery Data of Face ID + 106 key points facial landmark: This dataset contains 30,000+ images of Face ID + 106 key points facial landmark. The dataset has been annotated with face bounding boxes; attributes of race, gender, age, and skin tone; and 106 key points facial landmark. Each data set is supported by both an AI and a human review process to ensure labelling consistency and accuracy (see the sketch after this list).

    3. About PIXTA: PIXTASTOCK is the largest Asian-featured stock platform providing data, contents, tools and services since 2005. PIXTA has 15 years of experience integrating advanced AI technology in managing, curating, and processing over 100M visual materials and serving global leading brands for their creative and data demands.
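    A minimal sketch of overlaying one annotated record, assuming a JSON export with a face box and a list of 106 (x, y) landmarks; all field and file names here are hypothetical, as the actual schema ships with the dataset:

    import json
    from PIL import Image, ImageDraw

    record = json.load(open("face_0001.json"))      # hypothetical file and schema
    img = Image.open(record["image_path"])          # hypothetical field names
    draw = ImageDraw.Draw(img)
    draw.rectangle(record["face_bbox"], outline="red", width=2)
    for x, y in record["landmarks_106"]:            # 106 (x, y) keypoints
        draw.ellipse((x - 2, y - 2, x + 2, y + 2), fill="yellow")
    img.save("face_0001_annotated.png")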

  15. Spam Images for Malicious Annotation Set (SIMAS)

    • zenodo.org
    png, bin, application/gzip
    Updated May 23, 2025
    Cite
    Maria Vukić; Emanuel Lacić; Denis Helic (2025). Spam Images for Malicious Annotation Set (SIMAS) [Dataset]. http://doi.org/10.5281/zenodo.15423637
    Dataset updated
    May 23, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Maria Vukić; Emanuel Lacić; Denis Helic
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SIMAS Dataset

    This archive includes the SIMAS dataset for fine-tuning models for MMS (Multimedia Messaging Service) image moderation. SIMAS is a balanced collection of publicly available images, manually annotated in accordance with a specialized taxonomy designed for identifying visual spam in MMS messages.

    Taxonomy for MMS Visual Spam

    The following table presents the definitions of categories used for classifying MMS images.

    Table 1: Category definitions

    Alcohol*: Content related to alcoholic beverages, including advertisements and consumption.
    Drugs*: Content related to the use, sale, or trafficking of narcotics (e.g., cannabis, cocaine, ...).
    Firearms*: Content involving guns, pistols, knives, or military weapons.
    Gambling*: Content related to gambling (casinos, poker, roulette, lotteries).
    Sexual: Content involving nudity, sexual acts, or sexually suggestive material.
    Tobacco*: Content related to tobacco use and advertisements.
    Violence: Content showing violent acts, self-harm, or injury.
    Safe: All other content, including neutral depictions, products, or harmless cultural symbols.

    Note: Categories marked with an asterisk are regulated in some jurisdictions and may not be universally restricted.

    Dataset Collection and Annotation

    Data Sources

    The SIMAS dataset combines publicly available images from multiple sources, selected to reflect the categories defined in our content taxonomy. Each image was manually reviewed by three independent annotators, with final labels assigned when at least two annotators agreed.

    The largest portion of the dataset (30.4%) originates from LAION-400M, a large-scale image-text dataset. To identify relevant content, we first selected a list of ImageNet labels that semantically matched our taxonomy. These labels were generated using GPT-4o in a zero-shot setting, using separate prompts per category. This resulted in 194 candidate labels, of which 88.7% were retained after manual review. The structure of the prompts used in this process is shown in the file gpt4o_imagenet_prompting_scheme.png, which illustrates a shared base prompt template applied across all categories. The fields category_definition, file_examples, and exceptions are specified per category. Definitions align with the taxonomy, while the file_examples column includes sample labels retrieved from the ImageNet label list. The exceptions field contains category-specific filtering instructions; a dash indicates no exceptions were specified.

    Another 25.1% of images were sourced from Roboflow, using open datasets such as:

    The NudeNet dataset contributes 11.4% of the dataset. We sampled 1,000 images from the “porn” category to provide visual coverage of explicit sexual content.

    Another 11.0% of images were collected from Kaggle, including:

    An additional 9.9% of images were retrieved from Unsplash, using keyword-based search queries aligned with each category in our taxonomy.

    Images from UnsafeBench make up 8.0% of the dataset. Since its original binary labels did not match our taxonomy, all samples were manually reassigned to the most appropriate category.

    Finally, 4.2% of images were gathered from various publicly accessible websites. These were primarily used to improve category balance and model generalization, especially in safe classes.


    Table 2: Distribution of images per public source and category in SIMAS dataset

    Columns (sources): LAION, Roboflow, NudeNet, Kaggle, Unsplash, UnsafeBench, Other, Total.
    Column totals: LAION 1,522; Roboflow 1,253; NudeNet 571; Kaggle 551; Unsplash 493; UnsafeBench 402; Other 208; 5,000 images overall.
    Row totals per category (Unsafe / Safe): Alcohol 300 / 300; Drugs 250 / 250; Firearms 350 / 350; Gambling 300 / 300; Sexual 500 / 500; Tobacco 500 / 500; Violence 300 / 300.
    [Per-source cell counts were garbled in extraction and are omitted here.]

    Balancing

    To ensure semantic diversity and dataset balance, undersampling was performed on overrepresented categories using a CLIP-based embedding and k-means clustering strategy. This resulted in a final dataset containing 2,500 spam and 2,500 safe images, evenly distributed across all categories.
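    A minimal sketch of that undersampling idea, assuming image embeddings have already been computed with a CLIP model; the nearest-to-center selection rule is illustrative:

    import numpy as np
    from sklearn.cluster import KMeans

    def undersample(embeddings, paths, n_keep):
        """Keep the image nearest each of n_keep cluster centers for diversity."""
        km = KMeans(n_clusters=n_keep, n_init=10, random_state=0).fit(embeddings)
        kept = []
        for c in range(n_keep):
            members = np.where(km.labels_ == c)[0]
            dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
            kept.append(paths[members[np.argmin(dists)]])
        return kept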

    Table 3: Distribution of images per category in SIMAS

  16. Annotations for A Randomized Phase III Study Comparing...

    • cancerimagingarchive.net
    csv, dicom, n/a
    Updated Nov 9, 2024
    Cite
    The Cancer Imaging Archive (2024). Annotations for A Randomized Phase III Study Comparing Carboplatin/Paclitaxel or Carboplatin/Paclitaxel/Bevacizumab With or Without Concurrent Cetuximab in Patients With Advanced Non-small Cell Lung Cancer [Dataset]. http://doi.org/10.7937/R0R8-BN93
    Dataset updated
    Nov 9, 2024
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Nov 9, 2024
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    This dataset contains image annotations derived from the NCI Clinical Trial "A Randomized Phase III Study Comparing Carboplatin/Paclitaxel or Carboplatin/Paclitaxel/Bevacizumab With or Without Concurrent Cetuximab in Patients With Advanced Non-small Cell Lung Cancer”. This dataset was generated as part of an NCI project to augment TCIA datasets with annotations that will improve their value for cancer researchers and AI developers.

    Annotation Protocol

    For each patient, all scans were reviewed to identify and annotate the clinically relevant time points and sequences/series. Scans were initially annotated by an international team of radiologists holding MBBS degrees or higher, which were then reviewed by US-based board-certified radiologists to ensure accuracy. In a typical patient, not all available time points were annotated. Every exam from the first available time point was annotated. One additional time point was annotated for each patient. The clinical data in the NCTN Archive was utilized to help determine the first evidence of disease progression. The first time point demonstrating disease progression was annotated. If that document was not accurate and did not demonstrate disease progression, then later time points were reviewed to assess for disease progression and the first time point demonstrating disease progression was annotated. If there was no evidence of disease progression on any time point, then the last available time point was annotated. Again, every exam from each chosen time point was annotated. For example, if there was a CT and a PET/CT, the PET was annotated along with one CT. If there was an MRI, that was annotated as well. The following annotation rules were followed:
    1. PERCIST criteria was followed for PET imaging. Specifically, the lesions estimated to have the most elevated SUVmax were annotated.
    2. RECIST 1.1 was otherwise generally followed for any MR and CT imaging. A maximum of 5 lesions were annotated per patient scan (timepoint); no more than 2 per organ. The same 5 lesions were annotated at each time point. Lymph nodes were however annotated if > 1 cm in short axis. Other lesions were annotated if > 1 cm. If the primary lesion is < 1 cm, it was still annotated. If there was evidence of disease progression with new lesions then additional annotations were allowed to demonstrate that progression. A representative sample of the new lesions was annotated at the radiologist's discretion.
    3. Lesions were annotated in the axial plane. If no axial plane was available, lesions were annotated in the coronal plane.
    4. MRIs were annotated using the T1-weighted post contrast sequence, fat saturated if available. If not available, T2-weighted sequences were utilized.
    5. CTs were annotated using the axial post contrast series. If not available, the non contrast series was annotated.
    6. PET/CTs were annotated on the CT and attenuation corrected PET images. However, if the post contrast CT was performed the same day as the PET/CT, the non contrast CT portion of the PET/CT was annotated.
    7. Lesions were labeled separately.
    8. The volume of each annotated lesion was calculated and reported in cubic centimeters [cc] in a dataset metadata report.
    9. Seed points were automatically generated but reviewed by a radiologist.
    10. A “negative” annotation was created for any exam without findings.
    At each time point:
    1. A seed point (kernel) was created for each segmented structure. The seed points for each segmentation are provided in a separate DICOM RTSS file.
    2. SNOMED-CT “Anatomic Region Sequence” and “Segmented Property Category Code Sequence” and codes were inserted for all segmented structures.
    3. “Tracking ID” and “Tracking UID” tags were inserted for each segmented structure to enable longitudinal lesion tracking.
    4. Imaging time point codes were inserted to help identify each annotation in the context of the clinical trial assessment protocol.
      1. “Clinical Trial Time Point ID” was used to encode time point type using one of the following strings as applicable: “pre-dose” or “post-chemotherapy”
      2. Content Item in “Acquisition Context Sequence” was added containing "Time Point Type" using Concept Code Sequence (0040,A168) selected from:
        1. (255235001, SCT, “Pre-dose”)
        2. (262502001, SCT, "Post-chemotherapy")
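    A minimal pydicom sketch for inspecting the segmented structures in one of the RTSS files (the file name is a placeholder; the sequence and attribute names follow the standard RT Structure Set module):

    import pydicom

    # Placeholder file name; each annotation ships lesion segmentations and a
    # separate RTSS file holding the seed points.
    rtss = pydicom.dcmread("seed_points_rtss.dcm")
    for roi in rtss.StructureSetROISequence:
        print(roi.ROINumber, roi.ROIName)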

    Important supplementary information and sample code

    1. A spreadsheet containing key details about the annotations is available in the Data Access section below.
    2. A Jupyter notebook demonstrating how to use the NBIA Data Retriever Command-Line Interface application and the REST API to access these data can be found in the Additional Resources section below.

  17. Data Labeling And Annotation Tools Market Analysis, Size, and Forecast...

    • technavio.com
    Updated Jul 5, 2025
    Cite
    Technavio (2025). Data Labeling And Annotation Tools Market Analysis, Size, and Forecast 2025-2029: North America (US, Canada, and Mexico), Europe (France, Germany, Italy, Spain, and UK), APAC (China), South America (Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/data-labeling-and-annotation-tools-market-industry-analysis
    Explore at:
    Dataset updated
    Jul 5, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    Germany, Canada, United States, Global
    Description


    Data Labeling And Annotation Tools Market Size 2025-2029

    The data labeling and annotation tools market size is forecast to increase by USD 2.69 billion at a CAGR of 28% between 2024 and 2029.

    The market is experiencing significant growth, driven by the explosive expansion of generative AI applications. As AI models become increasingly complex, there is a pressing need for specialized platforms to manage and label the vast amounts of data required for training. This trend is further fueled by the emergence of generative AI, which demands unique data pipelines for effective training. However, this market's growth trajectory is not without challenges. Maintaining data quality and managing escalating complexity pose significant obstacles. ML models are being applied across various sectors, from fraud detection and sales forecasting to speech recognition and image recognition.
    Ensuring the accuracy and consistency of annotated data is crucial for AI model performance, necessitating robust quality control measures. Moreover, the growing complexity of AI systems requires advanced tools to handle intricate data structures and diverse data types. The market continues to evolve, driven by advancements in machine learning (ML), computer vision, and natural language processing. Companies seeking to capitalize on market opportunities must address these challenges effectively, investing in innovative solutions to streamline data labeling and annotation processes while maintaining high data quality.
    

    What will be the Size of the Data Labeling And Annotation Tools Market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.

    The market is experiencing significant activity and trends, with a focus on enhancing annotation efficiency, ensuring data privacy, and improving model performance. Annotation task delegation and remote workflows enable teams to collaborate effectively, while version control systems facilitate model deployment pipelines and error rate reduction. Label inter-annotator agreement and quality control checks are crucial for maintaining data consistency and accuracy. Data security and privacy remain paramount, with cloud computing and edge computing solutions offering secure alternatives. Data privacy concerns are addressed through secure data handling practices and access controls. Model retraining strategies and cost optimization techniques are essential for adapting to evolving datasets and budgets. Dataset bias mitigation and accuracy improvement methods are key to producing high-quality annotated data.

    Training data preparation involves data preprocessing steps and annotation guidelines creation, while human-in-the-loop systems allow for real-time feedback and model fine-tuning. Data validation techniques and team collaboration tools are essential for maintaining data integrity and reducing errors. Scalable annotation processes and annotation project management tools streamline workflows and ensure a consistent output. Model performance evaluation and annotation tool comparison are ongoing efforts to optimize processes and select the best tools for specific use cases. Data security measures and dataset bias mitigation strategies are essential for maintaining trust and reliability in annotated data.

    How is this Data Labeling And Annotation Tools Industry segmented?

    The data labeling and annotation tools industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Type
      Text
      Video
      Image
      Audio

    Technique
      Manual labeling
      Semi-supervised labeling
      Automatic labeling

    Deployment
      Cloud-based
      On-premises

    Geography
      North America
        US
        Canada
        Mexico
      Europe
        France
        Germany
        Italy
        Spain
        UK
      APAC
        China
      South America
        Brazil
      Rest of World (ROW)

    By Type Insights

    The Text segment is estimated to witness significant growth during the forecast period. The data labeling market is witnessing significant growth and advancements, primarily driven by the increasing adoption of generative artificial intelligence and large language models (LLMs). This segment encompasses various annotation techniques, including text annotation, which involves adding structured metadata to unstructured text. Text annotation is crucial for machine learning models to understand and learn from raw data. Core text annotation tasks range from fundamental natural language processing (NLP) techniques, such as Named Entity Recognition (NER), where entities like persons, organizations, and locations are identified and tagged, to complex requirements of modern AI.
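    For example, a minimal NER pass with the open source spaCy library (a sketch, assuming the small English model has been installed separately; the sentence is invented for illustration):

      import spacy  # pip install spacy && python -m spacy download en_core_web_sm

      nlp = spacy.load("en_core_web_sm")
      doc = nlp("Acme Corp hired Jane Doe in Berlin in March 2024.")

      # Each detected entity becomes a labeled annotation: text, character offsets, type.
      for ent in doc.ents:
          print(ent.text, ent.start_char, ent.end_char, ent.label_)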

    Moreover,

  18. Soil and Plant X-ray CT data with semantic annotations Dataset

    • paperswithcode.com
    Updated Mar 17, 2022
    Cite
    (2022). Soil and Plant X-ray CT data with semantic annotations Dataset [Dataset]. https://paperswithcode.com/dataset/soil-and-plant-x-ray-ct-data-with-semantic
    Explore at:
    Dataset updated
    Mar 17, 2022
    Description

    Leaves from genetically unique Juglans regia plants were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA.

    Soil samples were collected in Fall of 2017 from the riparian oak forest located at the Russell Ranch Sustainable Agricultural Institute at the University of California Davis. The soil was sieved through a 2 mm mesh and was air dried before imaging. A single soil aggregate was scanned at 23 keV using the 10x objective lens with a pixel resolution of 650 nanometers on beamline 8.3.2 at the ALS.

    Additionally, a drought stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned using a 4x lens with a pixel resolution of 1.72 µm on beamline 8.3.2 at the ALS.

    Raw tomographic image data was reconstructed using TomoPy. Reconstructions were converted to 8-bit tif or png format using ImageJ or the PIL package in Python before further processing. Images were annotated using Intel’s Computer Vision Annotation Tool (CVAT) and ImageJ; both CVAT and ImageJ are free to use and open source. Leaf images were annotated following Théroux-Rancourt et al. (2020). Specifically, hand labeling was done directly in ImageJ by drawing around each tissue, with 5 images annotated per leaf. Care was taken to cover a range of anatomical variation to help improve the generalizability of the models to other leaves. All slices were labeled by Dr. Mina Momayyezi and Fiona Duong. To annotate the flower bud and soil aggregate, images were imported into CVAT. The exterior border of the bud (i.e. bud scales) and flower were annotated in CVAT and exported as masks. Similarly, the exterior of the soil aggregate and particulate organic matter identified by eye were annotated in CVAT and exported as masks. To annotate air spaces in both the bud and soil aggregate, images were imported into ImageJ. A Gaussian blur was applied to the image to decrease noise, and the air space was then segmented using thresholding. After applying the threshold, the selected air space region was converted to a binary image, with white representing the air space and black representing everything else. This binary image was overlaid upon the original image and the air space within the flower bud and aggregate was selected using the “free hand” tool. Air space outside of the region of interest for both image sets was eliminated. The quality of the air space annotation was then visually inspected for accuracy against the underlying original image; incomplete annotations were corrected using the brush or pencil tool to paint missing air space white and incorrectly identified air space black. Once the annotation was satisfactorily corrected, the binary image of the air space was saved. Finally, the annotations of the bud and flower or aggregate and organic matter were opened in ImageJ and the associated air space mask was overlaid on top of them, forming a three-layer mask suitable for training the fully convolutional network. All labeling of the flower bud and soil aggregate images was done by Dr. Devin Rippner.

    These images and annotations are for training deep learning models to identify different constituents in leaves, almond buds, and soil aggregates.

    Limitations: For the walnut leaves, some tissues (stomata, etc.) are not labeled, and the annotated slices represent only a small portion of a full leaf. Similarly, both the almond bud and the aggregate represent just one single sample of each. The bud tissues are only divided into bud scales, flower, and air space; many other tissues remain unlabeled. For the soil aggregate, the annotated labels were done by eye with no actual chemical information, so particulate organic matter identification may be incorrect.

  19. Data Annotationplace Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Jun 29, 2025
    Cite
    Growth Market Reports (2025). Data Annotationplace Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-annotationplace-market
    Explore at:
    Available download formats: csv, pptx, pdf
    Dataset updated
    Jun 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Annotation Market Outlook



    According to our latest research, the global data annotation market size reached USD 2.15 billion in 2024, fueled by the rapid proliferation of artificial intelligence and machine learning applications across industries. The market is witnessing a robust growth trajectory, registering a CAGR of 26.3% during the forecast period from 2025 to 2033. By 2033, the data annotation market is projected to attain a valuation of USD 19.14 billion. This growth is primarily driven by the increasing demand for high-quality annotated datasets to train sophisticated AI models, the expansion of automation in various sectors, and the escalating adoption of advanced technologies in emerging economies.




    The primary growth factor propelling the data annotation market is the surging adoption of artificial intelligence and machine learning across diverse sectors such as healthcare, automotive, retail, and IT & telecommunications. Organizations are increasingly leveraging AI-driven solutions for predictive analytics, automation, and enhanced decision-making, all of which require meticulously labeled datasets for optimal performance. The proliferation of computer vision, natural language processing, and speech recognition technologies has further intensified the need for accurate data annotation, as these applications rely heavily on annotated images, videos, text, and audio to function effectively. As businesses strive for digital transformation and increased operational efficiency, the demand for comprehensive data annotation services and software continues to escalate, thereby driving market expansion.




    Another significant driver for the data annotation market is the growing complexity and diversity of data types being utilized in AI projects. Modern AI systems require vast amounts of annotated data spanning multiple formats, including text, images, videos, and audio. This complexity has led to the emergence of specialized data annotation tools and services capable of handling intricate annotation tasks, such as semantic segmentation, entity recognition, and sentiment analysis. Moreover, the integration of data annotation platforms with cloud-based solutions and workflow automation tools has streamlined the annotation process, enabling organizations to scale their AI initiatives efficiently. As a result, both large enterprises and small-to-medium businesses are increasingly investing in advanced annotation solutions to maintain a competitive edge in their respective industries.




    Furthermore, the rise of data-centric AI development methodologies has placed greater emphasis on the quality and diversity of training datasets, further fueling the demand for professional data annotation services. Companies are recognizing that the success of AI models is heavily dependent on the accuracy and representativeness of the annotated data used during training. This realization has spurred investments in annotation technologies that offer features such as quality control, real-time collaboration, and integration with machine learning pipelines. Additionally, the growing trend of outsourcing annotation tasks to specialized service providers in regions with cost-effective labor markets has contributed to the market's rapid growth. As AI continues to permeate new domains, the need for scalable, high-quality data annotation solutions is expected to remain a key growth driver for the foreseeable future.




    From a regional perspective, North America currently dominates the data annotation market, accounting for the largest share due to the presence of major technology companies, robust research and development activities, and early adoption of AI technologies. However, the Asia Pacific region is expected to exhibit the fastest growth over the forecast period, driven by increasing investments in AI infrastructure, the expansion of IT and telecommunication networks, and the availability of a large, skilled workforce for annotation tasks. Europe also represents a significant market, characterized by stringent data privacy regulations and growing demand for AI-driven automation in industries such as automotive and healthcare. As global enterprises continue to prioritize AI initiatives, the data annotation market is poised for substantial growth across all major regions.



  20. Expert and AI-generated annotations of the tissue types for the RMS-Mutation-Prediction microscopy images

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv
    Updated Dec 20, 2024
    Cite
    Christopher Bridge; Christopher Bridge; G. Thomas Brown; Hyun Jung; Curtis Lisle; Curtis Lisle; David Clunie; David Clunie; David Milewski; Yanling Liu; Jack Collins; Corinne M. Linardic; Douglas S. Hawkins; Rajkumar Venkatramani; Andrey Fedorov; Andrey Fedorov; Javed Khan; G. Thomas Brown; Hyun Jung; David Milewski; Yanling Liu; Jack Collins; Corinne M. Linardic; Douglas S. Hawkins; Rajkumar Venkatramani; Javed Khan (2024). Expert and AI-generated annotations of the tissue types for the RMS-Mutation-Prediction microscopy images [Dataset]. http://doi.org/10.5281/zenodo.14041167
    Explore at:
    Available download formats: bin, csv
    Dataset updated
    Dec 20, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christopher Bridge; Christopher Bridge; G. Thomas Brown; Hyun Jung; Curtis Lisle; Curtis Lisle; David Clunie; David Clunie; David Milewski; Yanling Liu; Jack Collins; Corinne M. Linardic; Douglas S. Hawkins; Rajkumar Venkatramani; Andrey Fedorov; Andrey Fedorov; Javed Khan; G. Thomas Brown; Hyun Jung; David Milewski; Yanling Liu; Jack Collins; Corinne M. Linardic; Douglas S. Hawkins; Rajkumar Venkatramani; Javed Khan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset corresponds to a collection of images and/or image-derived data available from the National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using the IDC Portal here: https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=RMS-Mutation-Prediction-Expert-Annotations. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.

    Collection description

    This dataset contains 2 components:

    1. Annotations of multiple regions of interest performed by an expert pathologist with eight years of experience for a subset of hematoxylin and eosin (H&E) stained images from the RMS-Mutation-Prediction image collection [1,2]. Annotations were generated manually, using the Aperio ImageScope tool, to delineate regions of alveolar rhabdomyosarcoma (ARMS), embryonal rhabdomyosarcoma (ERMS), stroma, and necrosis [3]. The resulting planar contour annotations were originally stored in ImageScope-specific XML format, and subsequently converted into Digital Imaging and Communications in Medicine (DICOM) Structured Report (SR) representation using the open source conversion tool [4].
    2. AI-generated annotations stored as probabilistic segmentations. WARNING: After the release of v20, it was discovered that a mistake had been made during data conversion that affected the newly-released segmentations accompanying the "RMS-Mutation-Prediction" collection. Segmentations released in v20 for this collection have the segment labels for alveolar rhabdomyosarcoma (ARMS) and embryonal rhabdomyosarcoma (ERMS) switched in the metadata relative to the correct labels. Thus segment 3 in the released files is labelled in the metadata (the SegmentSequence) as ARMS but should correctly be interpreted as ERMS, and conversely segment 4 in the released files is labelled as ERMS but should be correctly interpreted as ARMS. We apologize for the mistake and any confusion that it has caused, and will be releasing a corrected version of the files in the next release as soon as possible. (A local relabelling workaround is sketched after this list.)
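    Until the corrected files are released, the v20 label swap can be patched locally. A minimal sketch with pydicom (the filename is hypothetical; depending on downstream use, the property type codes in the Segment Sequence may need the same swap):

      import pydicom

      ds = pydicom.dcmread("rms_segmentation_v20.dcm")  # hypothetical filename
      for seg in ds.SegmentSequence:
          if seg.SegmentNumber == 3:
              seg.SegmentLabel = "ERMS"  # metadata said ARMS; pixels are ERMS
          elif seg.SegmentNumber == 4:
              seg.SegmentLabel = "ARMS"  # metadata said ERMS; pixels are ARMS
      ds.save_as("rms_segmentation_v20_relabelled.dcm")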

    Many pixels from the whole slide images annotated by this dataset are not contained inside any annotation contours and are considered to belong to the background class. Other pixels are contained inside only one annotation contour and are assigned to a single class. However, cases also exist in this dataset where annotation contours overlap. In these cases, the pixels contained in multiple contours could be assigned membership in multiple classes. One example is a necrotic tissue contour overlapping an internal subregion of an area designated by a larger ARMS or ERMS annotation. The ordering of annotations in this DICOM dataset preserves the order in the original XML generated using ImageScope. These annotations were converted, in sequence, into segmentation masks and used in the training of several machine learning models. Details on the training methods and model results are presented in [1]. In the case of overlapping contours, the order in which annotations are processed may affect the generated segmentation mask if prior contours are overwritten by later contours in the sequence. It is up to the application consuming this data to decide how to interpret tissue regions annotated with multiple classes. The annotations included in this dataset are available for visualization and exploration from the National Cancer Institute Imaging Data Commons (IDC) [5] (also see IDC Portal at https://imaging.datacommons.cancer.gov) as of data release v18. Direct link to open the collection in IDC Portal: https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=RMS-Mutation-Prediction-Expert-Annotations.
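    The order dependence described above can be reproduced in a toy rasterization, here with Pillow and hypothetical contours (illustrative only):

      from PIL import Image, ImageDraw

      # Hypothetical contours in annotation order: (class_label, [(x, y), ...]).
      contours = [
          (1, [(10, 10), (90, 10), (90, 90), (10, 90)]),  # e.g. an ERMS region
          (2, [(40, 40), (70, 40), (70, 70), (40, 70)]),  # necrotic subregion inside it
      ]

      mask = Image.new("L", (100, 100), 0)  # 0 = background class
      draw = ImageDraw.Draw(mask)
      for label, polygon in contours:
          draw.polygon(polygon, fill=label)  # later contours overwrite earlier ones

      # The overlapping pixel now carries only the later class; the overlap is lost.
      assert mask.getpixel((50, 50)) == 2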

    Files included

    A manifest file's name indicates the IDC data release in which a version of collection data was first introduced. For example, pan_cancer_nuclei_seg_dicom-collection_id-idc_v19-aws.s5cmd corresponds to the annotations for the images in the collection_id collection introduced in IDC data release v19. The DICOM binary segmentations were introduced in IDC v20. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

    For each of the collections, the following manifest files are provided:

    1. rms_mutation_prediction_expert_annotations-idc_v20-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets
    2. rms_mutation_prediction_expert_annotations-idc_v20-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets
    3. rms_mutation_prediction_expert_annotations-idc_v20-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

    Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while -gcs.s5cmd reference files in Google Cloud Storage. The actual files are identical and are mirrored between AWS and GCP.

    Download instructions

    Each of the manifests include instructions in the header on how to download the included files.

    To download the files using .s5cmd manifests:

    1. install idc-index package: pip install --upgrade idc-index
    2. download the files referenced by manifests included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd

    To download the files using .dcf manifest, see manifest header.
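    The same manifest download can also be scripted through the idc-index Python API. A sketch (method and argument names reflect recent idc-index releases and may change; check the package documentation):

      from idc_index import index

      client = index.IDCClient()
      client.download_from_manifest(
          manifestFile="rms_mutation_prediction_expert_annotations-idc_v20-aws.s5cmd",
          downloadDir="./idc_data",
      )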

    Acknowledgments

    Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

    If you use the files referenced in the attached manifests, we ask you to cite this dataset, as well as the publication describing the original dataset [2] and publication acknowledging IDC [5].

    References

    [1] D. Milewski et al., "Predicting molecular subtype and survival of rhabdomyosarcoma patients using deep learning of H&E images: A report from the Children's Oncology Group," Clin. Cancer Res., vol. 29, no. 2, pp. 364–378, Jan. 2023, doi: 10.1158/1078-0432.CCR-22-1663.

    [2] Clunie, D., Khan, J., Milewski, D., Jung, H., Bowen, J., Lisle, C., Brown, T., Liu, Y., Collins, J., Linardic, C. M., Hawkins, D. S., Venkatramani, R., Clifford, W., Pot, D., Wagner, U., Farahani, K., Kim, E., & Fedorov, A. (2023). DICOM converted whole slide hematoxylin and eosin images of rhabdomyosarcoma from Children's Oncology Group trials [Data set]. Zenodo. https://doi.org/10.5281/zenodo.8225132

    [3] Agaram NP. Evolving classification of rhabdomyosarcoma. Histopathology. 2022 Jan;80(1):98-108. doi: 10.1111/his.14449. PMID: 34958505; PMCID: PMC9425116. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9425116/

    [4] Chris Bridge. (2024). ImagingDataCommons/idc-sm-annotations-conversion: v1.0.0 (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.10632182

    [5] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W. L., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National cancer institute imaging data commons: Toward transparency, reproducibility, and scalability in imaging artificial intelligence. Radiographics 43, (2023).

Resources in this dataset (item 1 in this catalog, X-ray CT data with semantic annotations):

Resource Title: Annotated X-ray CT images and masks of a Forest Soil Aggregate. File Name: forest_soil_images_masks_for_testing_training.zip. Description: This aggregate was collected from the riparian oak forest at the Russell Ranch Sustainable Agricultural Facility. The aggregate was scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 0,0,0; pore spaces have a value of 250,250,250; mineral solids have a value of 128,0,0; and particulate organic matter has a value of 0,128,0. These files were used for training a model to segment the forest soil aggregate and for testing the accuracy, precision, recall, and F1 score of the model.

Resource Title: Annotated X-ray CT images and masks of an Almond bud (P. dulcis). File Name: Almond_bud_tube_D_P6_training_testing_images_and_masks.zip. Description: A drought stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis, was scanned by X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 4x lens with a pixel resolution of 1.72 µm. For masks, the background has a value of 0,0,0; air spaces have a value of 255,255,255; bud scales have a value of 128,0,0; and flower tissues have a value of 0,128,0. These files were used for training a model to segment the almond bud and for testing the accuracy, precision, recall, and F1 score of the model. Resource Software Recommended: Fiji (ImageJ), url: https://imagej.net/software/fiji/downloads

Resource Title: Annotated X-ray CT images and masks of Walnut leaves (J. regia). File Name: 6_leaf_training_testing_images_and_masks_for_paper.zip. Description: Stems were collected from genetically unique J. regia accessions at the USDA-ARS-NCGR in Wolfskill Experimental Orchard, Winters, California, USA to use as scion, and were grafted by Sierra Gold Nursery onto a commonly used commercial rootstock, RX1 (J. microcarpa × J. regia). A common rootstock was used to eliminate any own-root effects and to simulate conditions for a commercial walnut orchard setting, where rootstocks are commonly used. The grafted saplings were repotted and transferred to the Armstrong lathe house facility at the University of California, Davis in June 2019, and kept under natural light and temperature. Leaves from each accession and treatment were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA, using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 170,170,170; epidermis a value of 85,85,85; mesophyll a value of 0,0,0; bundle sheath extension a value of 152,152,152; vein a value of 220,220,220; and air a value of 255,255,255. Resource Software Recommended: Fiji (ImageJ), url: https://imagej.net/software/fiji/downloads
