100+ datasets found
  1. Image Annotation Services | Image Labeling for AI & ML | Computer Vision...

    • datarade.ai
    Updated Dec 29, 2023
    Cite
    Nexdata (2023). Image Annotation Services | Image Labeling for AI & ML |Computer Vision Data| Annotated Imagery Data [Dataset]. https://datarade.ai/data-products/nexdata-image-annotation-services-ai-assisted-labeling-nexdata
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txt
    Available download formats
    Dataset updated
    Dec 29, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    Uzbekistan, Montenegro, Korea (Republic of), United States of America, Qatar, Taiwan, Philippines, Ireland, Morocco, Jamaica
    Description
    1. Overview
    We provide various types of Annotated Imagery Data annotation services, including:
    • Bounding box
    • Polygon
    • Segmentation
    • Polyline
    • Key points
    • Image classification
    • Image description ...
    2. Our Capacity
    • Platform: Our platform supports human-machine interaction and semi-automatic labeling, increasing labeling efficiency by more than 30% per annotator. It has successfully been applied to nearly 5,000 projects.
    • Annotation Tools: Nexdata's platform integrates 30 sets of annotation templates, covering audio, image, video, point cloud and text.

    • Secure Implementation: An NDA is signed to guarantee secure implementation, and Annotated Imagery Data is destroyed upon delivery.

    • Quality: Multiple rounds of quality inspections ensure high-quality data output, certified with ISO 9001.

    3. About Nexdata
    Nexdata has global data processing centers and more than 20,000 professional annotators, supporting on-demand data annotation services such as speech, image, video, point cloud and Natural Language Processing (NLP) data. Please visit us at https://www.nexdata.ai/computerVisionTraining?source=Datarade
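The annotation geometries listed above (bounding boxes, polygons, key points) commonly serialize to records along these lines. This is a generic, COCO-style sketch for illustration only; the actual schema a vendor delivers is not specified in this listing.

```python
# A generic, COCO-style annotation record; field names are illustrative,
# not a vendor-specific schema.
annotation = {
    "image_id": 1,
    "category": "car",
    "bbox": [120, 80, 64, 48],  # x, y, width, height (pixels)
    "segmentation": [[120, 80, 184, 80, 184, 128, 120, 128]],  # polygon vertices
    "keypoints": [152, 104],  # e.g. object center
}

# Box area is often stored alongside the bbox for filtering small objects.
area = annotation["bbox"][2] * annotation["bbox"][3]
print(area)  # → 3072
```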
  2. Video Annotation Services | AI-assisted Labeling | Computer Vision Data |...

    • datarade.ai
    Updated Jan 27, 2024
    Cite
    Nexdata (2024). Video Annotation Services | AI-assisted Labeling | Computer Vision Data | Video Labeling for AI & ML | Annotated Imagery Data [Dataset]. https://datarade.ai/data-products/nexdata-video-annotation-services-ai-assisted-labeling-nexdata
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txt
    Available download formats
    Dataset updated
    Jan 27, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    United Arab Emirates, Paraguay, Portugal, Belarus, Korea (Republic of), Germany, Chile, Sri Lanka, United Kingdom, Montenegro
    Description
    1. Overview
    We provide various types of Annotated Imagery Data annotation services, including:
    • Video classification
    • Timestamps
    • Video tracking
    • Video detection ...
    2. Our Capacity
    • Platform: Our platform supports human-machine interaction and semi-automatic labeling, increasing labeling efficiency by more than 30% per annotator. It has successfully been applied to nearly 5,000 projects.
    • Annotation Tools: Nexdata's platform integrates 30 sets of annotation templates, covering audio, image, video, point cloud and text.

    • Secure Implementation: An NDA is signed to guarantee secure implementation, and Annotated Imagery Data is destroyed upon delivery.

    • Quality: Multiple rounds of quality inspections ensure high-quality data output, certified with ISO 9001.

    3. About Nexdata
    Nexdata has global data processing centers and more than 20,000 professional annotators, supporting on-demand data annotation services such as speech, image, video, point cloud and Natural Language Processing (NLP) data. Please visit us at https://www.nexdata.ai/datasets/computervision?source=Datarade
  3. Audio Annotation Services | AI-assisted Labeling | Speech Data | AI Training...

    • datarade.ai
    Updated Dec 29, 2023
    Cite
    Nexdata (2023). Audio Annotation Services | AI-assisted Labeling |Speech Data | AI Training Data | Natural Language Processing (NLP) Data [Dataset]. https://datarade.ai/data-products/nexdata-audio-annotation-services-ai-assisted-labeling-nexdata
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txt
    Available download formats
    Dataset updated
    Dec 29, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    Thailand, Bulgaria, Korea (Republic of), Spain, Cyprus, Lithuania, Australia, Ukraine, Austria, Belarus
    Description
    1. Overview
    We provide various types of Natural Language Processing (NLP) Data services, including:
    • Audio cleaning
    • Speech annotation
    • Speech transcription
    • Noise annotation
    • Phoneme segmentation
    • Prosodic annotation
    • Part-of-speech tagging ...
    2. Our Capacity
    • Platform: Our platform supports human-machine interaction and semi-automatic labeling, increasing labeling efficiency by more than 30% per annotator. It has successfully been applied to nearly 5,000 projects.
    • Annotation Tools: Nexdata's platform integrates 30 sets of annotation templates, covering audio, image, video, point cloud and text.

    • Secure Implementation: An NDA is signed to guarantee secure implementation, and data is destroyed upon delivery.

    • Quality: Multiple rounds of quality inspections ensure high-quality data output, certified with ISO 9001.

    3. About Nexdata
    Nexdata has global data processing centers and more than 20,000 professional annotators, supporting on-demand data annotation services such as speech, image, video, point cloud and Natural Language Processing (NLP) data. Please visit us at https://www.nexdata.ai/datasets/speechrecog?=Datarade
  4. TextTransfer: Datasets for Impact Detection

    • databank.illinois.edu
    Updated Mar 21, 2024
    Cite
    Maria Becker; Kanyao Han; Antonina Werthmann; Rezvaneh Rezapour; Haejin Lee; Jana Diesner; Andreas Witt (2024). TextTransfer: Datasets for Impact Detection [Dataset]. http://doi.org/10.13012/B2IDB-9934303_V1
    Explore at:
    Dataset updated
    Mar 21, 2024
    Authors
    Maria Becker; Kanyao Han; Antonina Werthmann; Rezvaneh Rezapour; Haejin Lee; Jana Diesner; Andreas Witt
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Dataset funded by
    German Federal Ministry of Education and Research
    Description

    Impact assessment is an evolving area of research that aims at measuring and predicting the potential effects of projects or programs. Measuring the impact of scientific research is a vibrant subdomain, closely intertwined with impact assessment. A recurring obstacle is the absence of an efficient framework that can facilitate the analysis of lengthy reports and text labeling. To address this issue, we propose a framework for automatically assessing the impact of scientific research projects by identifying pertinent sections in project reports that indicate the potential impacts. We leverage a mixed-method approach, combining manual annotations with supervised machine learning, to extract these passages from project reports. This is a repository for the datasets and code related to this project. Please read and cite the following paper if you would like to use the data: Becker M., Han K., Werthmann A., Rezapour R., Lee H., Diesner J., and Witt A. (2024). Detecting Impact Relevant Sections in Scientific Research. The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING).

    This folder contains the following files:

    • evaluation_20220927.ods: Annotated German passages (Artificial Intelligence, Linguistics, and Music) - training data
    • annotated_data.big_set.corrected.txt: Annotated German passages (Mobility) - training data
    • incl_translation_all.csv: Annotated English passages (Artificial Intelligence, Linguistics, and Music) - training data
    • incl_translation_mobility.csv: Annotated German passages (Mobility) - training data
    • ttparagraph_addmob.txt: German corpus (unannotated passages)
    • model_result_extraction.csv: Extracted impact-relevant passages from the German corpus, based on the model we trained
    • rf_model.joblib: The random forest model we trained to extract impact-relevant passages

    Data processing code can be found at: https://github.com/khan1792/texttransfer
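The combination described here (manual annotations feeding a random forest that flags impact-relevant passages) can be sketched roughly as below. The toy passages and labels are invented, and the feature choice (TF-IDF) is an assumption for illustration; the repository's actual pipeline lives at the GitHub link above.

```python
# A minimal sketch of a passage classifier in the spirit of the dataset's
# rf_model.joblib: TF-IDF features + a random forest. Texts and labels
# here are invented examples, not the project's data.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

passages = [
    "The project results were adopted by two industry partners.",
    "Section 3 lists the grant numbers and reporting dates.",
    "Findings informed a new national mobility guideline.",
    "The appendix contains administrative contact details.",
]
labels = [1, 0, 1, 0]  # 1 = impact-relevant, 0 = not

# bootstrap=False so each tree sees the full (tiny) training set.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(random_state=0, bootstrap=False),
)
model.fit(passages, labels)
print(model.score(passages, labels))  # → 1.0
```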

  5. Guidelines for Data Annotation

    • dataverse.tdl.org
    pdf
    Updated Sep 15, 2020
    Cite
    Kate Mesh; Kate Mesh (2020). Guidelines for Data Annotation [Dataset]. http://doi.org/10.18738/T8/FWOOJQ
    Explore at:
    pdf(167426), pdf(2472574)
    Available download formats
    Dataset updated
    Sep 15, 2020
    Dataset provided by
    Texas Data Repository
    Authors
    Kate Mesh; Kate Mesh
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Included here are a coding manual and supplementary examples of gesture forms (in still images and video recordings) that informed the coding of the first author (Kate Mesh) and four project reliability coders.

  6. Data Annotationplace Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 4, 2025
    Cite
    Growth Market Reports (2025). Data Annotationplace Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-annotationplace-market
    Explore at:
    csv, pptx, pdf
    Available download formats
    Dataset updated
    Aug 4, 2025
    Authors
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Annotation Market Outlook



    According to our latest research, the global data annotation market size reached USD 2.15 billion in 2024, fueled by the rapid proliferation of artificial intelligence and machine learning applications across industries. The market is witnessing a robust growth trajectory, registering a CAGR of 26.3% during the forecast period from 2025 to 2033. By 2033, the data annotation market is projected to attain a valuation of USD 19.14 billion. This growth is primarily driven by the increasing demand for high-quality annotated datasets to train sophisticated AI models, the expansion of automation in various sectors, and the escalating adoption of advanced technologies in emerging economies.




    The primary growth factor propelling the data annotation market is the surging adoption of artificial intelligence and machine learning across diverse sectors such as healthcare, automotive, retail, and IT & telecommunications. Organizations are increasingly leveraging AI-driven solutions for predictive analytics, automation, and enhanced decision-making, all of which require meticulously labeled datasets for optimal performance. The proliferation of computer vision, natural language processing, and speech recognition technologies has further intensified the need for accurate data annotation, as these applications rely heavily on annotated images, videos, text, and audio to function effectively. As businesses strive for digital transformation and increased operational efficiency, the demand for comprehensive data annotation services and software continues to escalate, thereby driving market expansion.




    Another significant driver for the data annotation market is the growing complexity and diversity of data types being utilized in AI projects. Modern AI systems require vast amounts of annotated data spanning multiple formats, including text, images, videos, and audio. This complexity has led to the emergence of specialized data annotation tools and services capable of handling intricate annotation tasks, such as semantic segmentation, entity recognition, and sentiment analysis. Moreover, the integration of data annotation platforms with cloud-based solutions and workflow automation tools has streamlined the annotation process, enabling organizations to scale their AI initiatives efficiently. As a result, both large enterprises and small-to-medium businesses are increasingly investing in advanced annotation solutions to maintain a competitive edge in their respective industries.




    Furthermore, the rise of data-centric AI development methodologies has placed greater emphasis on the quality and diversity of training datasets, further fueling the demand for professional data annotation services. Companies are recognizing that the success of AI models is heavily dependent on the accuracy and representativeness of the annotated data used during training. This realization has spurred investments in annotation technologies that offer features such as quality control, real-time collaboration, and integration with machine learning pipelines. Additionally, the growing trend of outsourcing annotation tasks to specialized service providers in regions with cost-effective labor markets has contributed to the market's rapid growth. As AI continues to permeate new domains, the need for scalable, high-quality data annotation solutions is expected to remain a key growth driver for the foreseeable future.




    From a regional perspective, North America currently dominates the data annotation market, accounting for the largest share due to the presence of major technology companies, robust research and development activities, and early adoption of AI technologies. However, the Asia Pacific region is expected to exhibit the fastest growth over the forecast period, driven by increasing investments in AI infrastructure, the expansion of IT and telecommunication networks, and the availability of a large, skilled workforce for annotation tasks. Europe also represents a significant market, characterized by stringent data privacy regulations and growing demand for AI-driven automation in industries such as automotive and healthcare. As global enterprises continue to prioritize AI initiatives, the data annotation market is poised for substantial growth across all major regions.



  7. Data Labeling And Annotation Tools Market Analysis, Size, and Forecast...

    • technavio.com
    pdf
    Updated Jul 4, 2025
    Cite
    Technavio (2025). Data Labeling And Annotation Tools Market Analysis, Size, and Forecast 2025-2029: North America (US, Canada, and Mexico), Europe (France, Germany, Italy, Spain, and UK), APAC (China), South America (Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/data-labeling-and-annotation-tools-market-industry-analysis
    Explore at:
    pdf
    Available download formats
    Dataset updated
    Jul 4, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2025 - 2029
    Area covered
    Germany, United States, Mexico, United Kingdom, Canada
    Description


    Data Labeling And Annotation Tools Market Size 2025-2029

    The data labeling and annotation tools market is projected to grow by USD 2.69 billion at a CAGR of 28% from 2024 to 2029. Explosive growth and the data demands of generative AI will drive the data labeling and annotation tools market.

    Major Market Trends & Insights

    North America dominated the market and is expected to account for 47% of growth during the forecast period.
    By Type - Text segment was valued at USD 193.50 billion in 2023
    By Technique - Manual labeling segment accounted for the largest market revenue share in 2023
    

    Market Size & Forecast

    Market Opportunities: USD 651.30 billion
    Market Future Opportunities: USD 2.69 billion
    CAGR : 28%
    North America: Largest market in 2023
    

    Market Summary

    The market is a dynamic and ever-evolving landscape that plays a crucial role in powering advanced technologies, particularly in the realm of artificial intelligence (AI). Core technologies, such as deep learning and machine learning, continue to fuel the demand for data labeling and annotation tools, enabling the explosive growth and data demands of generative AI. These tools facilitate the emergence of specialized platforms for generative AI data pipelines, ensuring the maintenance of data quality and managing escalating complexity. Applications of data labeling and annotation tools span various industries, including healthcare, finance, and retail, with the market expected to grow significantly in the coming years. According to recent studies, the market share for data labeling and annotation tools is projected to reach over 30% by 2026. Service types or product categories, such as manual annotation, automated annotation, and semi-automated annotation, cater to the diverse needs of businesses and organizations. Regulations, such as GDPR and HIPAA, pose challenges for the market, requiring stringent data security and privacy measures. Regional mentions, including North America, Europe, and Asia Pacific, exhibit varying growth patterns, with Asia Pacific expected to witness the fastest growth due to the increasing adoption of AI technologies. The market continues to unfold, offering numerous opportunities for innovation and growth.

    What will be the Size of the Data Labeling And Annotation Tools Market during the forecast period?


    How is the Data Labeling And Annotation Tools Market Segmented and what are the key trends of market segmentation?

    The data labeling and annotation tools industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023, for the following segments.

    Type: Text, Video, Image, Audio
    Technique: Manual labeling, Semi-supervised labeling, Automatic labeling
    Deployment: Cloud-based, On-premises
    Geography: North America (US, Canada, Mexico), Europe (France, Germany, Italy, Spain, UK), APAC (China), South America (Brazil), Rest of World (ROW)

    By Type Insights

    The text segment is estimated to witness significant growth during the forecast period.

    The market is witnessing significant growth, fueled by the increasing adoption of artificial intelligence (AI) and machine learning (ML) technologies. According to recent studies, the market for data labeling and annotation services is projected to expand by 25% in the upcoming year. This expansion is primarily driven by the burgeoning demand for high-quality, accurately labeled datasets to train advanced AI and ML models. Scalable annotation workflows are essential to meeting the demands of large-scale projects, enabling efficient labeling and review processes.

    Data labeling platforms offer various features, such as error detection mechanisms, active learning strategies, and polygon annotation software, to ensure annotation accuracy. These tools are integral to the development of image classification models and the comparison of annotation tools. Video annotation services are gaining popularity, as they cater to the unique challenges of video data. Data labeling pipelines and project management tools streamline the entire annotation process, from initial data preparation to final output. Keypoint annotation workflows and annotation speed optimization techniques further enhance the efficiency of annotation projects.

    Inter-annotator agreement is a critical metric in ensuring data labeling quality. The data labeling lifecycle encompasses various stages, including labeling, assessment, and validation, to maintain the highest level of accuracy. Semantic segmentation tools and label accuracy assessment methods contribute to the ongoing refinement of annotation techniques. Text annotation techniques, such as named entity recognition, sentiment analysis, and text classification, are essential for natural language processing. Consistency checks an...
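Inter-annotator agreement, mentioned above as a critical quality metric, is often quantified with Cohen's kappa, which corrects raw agreement for chance. A small self-contained illustration (the label sequences are invented):

```python
# Cohen's kappa: chance-corrected agreement between two annotators.
from collections import Counter

def cohens_kappa(a, b):
    assert len(a) == len(b)
    n = len(a)
    # Observed agreement: fraction of items both annotators labeled identically.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement under independence of the two label distributions.
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb.get(k, 0) for k in ca) / (n * n)
    return (observed - expected) / (1 - expected)

ann1 = ["car", "car", "truck", "car", "truck"]
ann2 = ["car", "truck", "truck", "car", "truck"]
print(round(cohens_kappa(ann1, ann2), 3))  # → 0.615
```

A kappa of 1 means perfect agreement, 0 means agreement no better than chance; annotation projects typically set a minimum kappa before accepting a labeling batch.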

  8. Data from: The Distributed Annotation System

    • catalog.data.gov
    • data.virginia.gov
    Updated Sep 6, 2025
    Cite
    National Institutes of Health (2025). The Distributed Annotation System [Dataset]. https://catalog.data.gov/dataset/the-distributed-annotation-system
    Explore at:
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background
    Currently, most genome annotation is curated by centralized groups with limited resources. Efforts to share annotations transparently among multiple groups have not yet been satisfactory.

    Results
    Here we introduce a concept called the Distributed Annotation System (DAS). DAS allows sequence annotations to be decentralized among multiple third-party annotators and integrated on an as-needed basis by client-side software. The communication between client and servers in DAS is defined by the DAS XML specification. Annotations are displayed in layers, one per server. Any client or server adhering to the DAS XML specification can participate in the system; we describe a simple prototype client and server example.

    Conclusions
    The DAS specification is being used experimentally by Ensembl, WormBase, and the Berkeley Drosophila Genome Project. Continued success will depend on the readiness of the research community to adopt DAS and provide annotations. All components are freely available from the project website.
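The client-side integration described above amounts to fetching per-server XML feature documents and merging them into layers. The sketch below parses a hand-written, simplified stand-in for a DAS feature response; the exact element names and attributes of real DASGFF documents are defined by the DAS XML specification, so treat this structure as illustrative.

```python
# Minimal client-side sketch: extract features from a DAS-style XML
# response. The XML is a simplified, hand-written stand-in, not a
# spec-exact DASGFF document.
import xml.etree.ElementTree as ET

response = """
<DASGFF>
  <GFF href="http://example.org/das/genome/features">
    <SEGMENT id="chr1" start="1" stop="1000">
      <FEATURE id="gene42" label="hypothetical gene">
        <START>100</START>
        <END>420</END>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>
"""

root = ET.fromstring(response)
features = [
    (f.get("id"), int(f.find("START").text), int(f.find("END").text))
    for f in root.iter("FEATURE")
]
print(features)  # → [('gene42', 100, 420)]
```

A real client would issue one such request per annotation server and render each server's features as its own display layer.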

  9. Car Highway Dataset

    • universe.roboflow.com
    zip
    Updated Sep 13, 2023
    Cite
    Sallar (2023). Car Highway Dataset [Dataset]. https://universe.roboflow.com/sallar/car-highway
    Explore at:
    zip
    Available download formats
    Dataset updated
    Sep 13, 2023
    Dataset authored and provided by
    Sallar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Vehicles Bounding Boxes
    Description

    Car-Highway Data Annotation Project

    Introduction

    In this project, we aim to annotate car images captured on highways. The annotated data will be used to train machine learning models for various computer vision tasks, such as object detection and classification.

    Project Goals

    • Collect a diverse dataset of car images from highway scenes.
    • Annotate the dataset to identify and label cars within each image.
    • Organize and format the annotated data for machine learning model training.

    Tools and Technologies

    For this project, we will be using Roboflow, a powerful platform for data annotation and preprocessing. Roboflow simplifies the annotation process and provides tools for data augmentation and transformation.

    Annotation Process

    1. Upload the raw car images to the Roboflow platform.
    2. Use the annotation tools in Roboflow to draw bounding boxes around each car in the images.
    3. Label each bounding box with the corresponding class (e.g., car).
    4. Review and validate the annotations for accuracy.

    Data Augmentation

    Roboflow offers data augmentation capabilities, such as rotation, flipping, and resizing. These augmentations can help improve the model's robustness.

    Data Export

    Once the data is annotated and augmented, Roboflow allows us to export the dataset in various formats suitable for training machine learning models, such as YOLO, COCO, or TensorFlow Record.
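Of the export formats named above, YOLO is the simplest: one text line per box, `class x_center y_center width height`, with coordinates normalized to [0, 1]. A small conversion sketch (the example box and image size are made up):

```python
# Convert a pixel-space box (x1, y1, x2, y2) to a YOLO label line:
# "class x_center y_center width height", all normalized to [0, 1].
def to_yolo(box, img_w, img_h, cls=0):
    x1, y1, x2, y2 = box
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# A 200x100 car box at (100, 300) in a 1920x1080 highway frame.
print(to_yolo((100, 300, 300, 400), 1920, 1080))
# → 0 0.104167 0.324074 0.104167 0.092593
```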

    Milestones

    1. Data Collection and Preprocessing
    2. Annotation of Car Images
    3. Data Augmentation
    4. Data Export
    5. Model Training

    Conclusion

    By completing this project, we will have a well-annotated dataset ready for training machine learning models. This dataset can be used for a wide range of applications in computer vision, including car detection and tracking on highways.

  10. Foundation Model Data Collection and Data Annotation | Large Language...

    • datarade.ai
    Updated Jan 25, 2024
    Cite
    Nexdata (2024). Foundation Model Data Collection and Data Annotation | Large Language Model(LLM) Data | SFT Data| Red Teaming Services [Dataset]. https://datarade.ai/data-products/nexdata-foundation-model-data-solutions-llm-sft-rhlf-nexdata
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txt
    Available download formats
    Dataset updated
    Jan 25, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    Ireland, Malta, Azerbaijan, Czech Republic, Taiwan, Russian Federation, Portugal, El Salvador, Kyrgyzstan, Spain
    Description
    1. Overview
    • Unsupervised Learning: For the training data required in unsupervised learning, Nexdata delivers data collection and cleaning services for both single-modal and cross-modal data. We provide Large Language Model (LLM) data cleaning and personnel support services based on the specific data types and characteristics of the client's domain.

    • SFT: Nexdata assists clients in generating high-quality supervised fine-tuning data for model optimization through prompt and output annotation.

    • Red teaming: Nexdata helps clients train and validate models by drafting various adversarial attacks, such as exploratory or potentially harmful questions. Our red team capabilities help clients identify problems in their models related to hallucinations, harmful content, false information, discrimination and language bias.

    • RLHF: Nexdata assists clients in manually ranking multiple outputs generated by the SFT-trained model according to the rules provided by the client, or provides multi-factor scoring. By training annotators to align with values and utilizing a multi-person fitting approach, the quality of feedback can be improved.

    2. Our Capacity
    • Global Resources: Global resources covering hundreds of languages worldwide

    • Compliance: All Large Language Model (LLM) data is collected with proper authorization

    • Quality: Multiple rounds of quality inspections ensure high-quality data output

    • Secure Implementation: An NDA is signed to guarantee secure implementation, and data is destroyed upon delivery.

    • Efficiency: Our platform supports human-machine interaction and semi-automatic labeling, increasing labeling efficiency by more than 30% per annotator. It has successfully been applied to nearly 5,000 projects.

    3. About Nexdata
    Nexdata is equipped with professional data collection devices, tools and environments, as well as experienced project managers in data collection and quality control, so we can meet Large Language Model (LLM) data collection requirements across scenarios and types. We have global data processing centers and more than 20,000 professional annotators, supporting on-demand LLM data annotation services such as speech, image, video, point cloud and Natural Language Processing (NLP) data. Please visit us at https://www.nexdata.ai/?source=Datarade
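The RLHF ranking step described above, where several annotators rank the same model outputs and their judgments are fitted together, can be sketched as a simple rank aggregation. The annotator rankings below are invented, and averaging ranks is just one common aggregation choice:

```python
# Aggregate preference rankings from several annotators over the same
# model outputs by averaging ranks (a Borda-style scheme). Data invented.
from statistics import mean

# Each annotator ranks outputs A, B, C from best (1) to worst (3).
rankings = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 2, "B": 1, "C": 3},
    {"A": 1, "B": 3, "C": 2},
]

# Lower average rank = more preferred across annotators.
avg_rank = {k: mean(r[k] for r in rankings) for k in rankings[0]}
best = min(avg_rank, key=avg_rank.get)
print(best)  # → A
```

In practice the aggregated preferences feed a reward model; averaging over multiple annotators smooths out individual bias, which is the point of the "multi-person fitting" approach.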

  11. Bakta Annotation Examples

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 10, 2021
    Cite
    Schwengers, Oliver (2021). Bakta Annotation Examples [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_4770026
    Explore at:
    Dataset updated
    Nov 10, 2021
    Dataset authored and provided by
    Schwengers, Oliver
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data repository provides exemplary bacterial genome annotations conducted with Bakta v1.1 comprising a broad taxonomical range of many pathogenic (all ESKAPE), commensal and environmental genomes from RefSeq.

    Bakta is a tool for the rapid and standardized local annotation of bacterial genomes and plasmids. It provides dbxref-rich and sORF-including annotations in machine-readable JSON and bioinformatics standard file formats for automatic downstream analysis: https://github.com/oschwengers/bakta
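The machine-readable JSON mentioned above is what makes downstream analysis scriptable. The sketch below tallies feature types from a hand-written stand-in document; the field names ("features", "type", "start", "stop", "product") are assumptions for illustration, not a verified Bakta schema.

```python
# Sketch of consuming an annotation JSON like the ones Bakta emits.
# The document below is hand-written; field names are illustrative.
import json

doc = json.loads("""
{
  "features": [
    {"type": "cds", "start": 1, "stop": 300, "product": "hypothetical protein"},
    {"type": "tRNA", "start": 400, "stop": 475, "product": "tRNA-Ala"}
  ]
}
""")

# Count annotated features by type, a typical first summary step.
counts = {}
for f in doc["features"]:
    counts[f["type"]] = counts.get(f["type"], 0) + 1
print(counts)  # → {'cds': 1, 'tRNA': 1}
```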

  12. Sample scRNA-seq Data for Cell Type Annotation Testing

    • mllmcelltype.com
    csv
    Updated Jan 8, 2025
    Cite
    mLLMCelltype Team (2025). Sample scRNA-seq Data for Cell Type Annotation Testing [Dataset]. https://www.mllmcelltype.com/
    Explore at:
    csv
    Available download formats
    Dataset updated
    Jan 8, 2025
    Dataset authored and provided by
    mLLMCelltype Team
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Example single-cell RNA sequencing dataset containing marker genes for testing and demonstrating automated cell type annotation using AI models.
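Marker-gene tables like this one are typically used by scoring each cluster's genes against each cell type's markers. The sketch below is hypothetical: the CSV columns ("cell_type", "markers") and gene lists are invented for illustration, not this dataset's actual layout.

```python
# Hypothetical sketch: annotate a cluster by overlap with a marker-gene
# table. CSV columns and gene lists are invented for illustration.
import csv
import io

marker_csv = io.StringIO(
    "cell_type,markers\n"
    "T cell,CD3D;CD3E;IL7R\n"
    "B cell,CD79A;MS4A1\n"
)

markers = {
    row["cell_type"]: set(row["markers"].split(";"))
    for row in csv.DictReader(marker_csv)
}

# Genes enriched in one cluster of the scRNA-seq data (invented).
cluster_genes = {"CD3D", "IL7R", "GAPDH"}
scores = {ct: len(genes & cluster_genes) for ct, genes in markers.items()}
print(max(scores, key=scores.get))  # → T cell
```

LLM-based annotators like the one this dataset supports replace the overlap score with a model prompt, but the input is the same marker-gene structure.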

  13. Image Tagging and Annotation Services Market Report | Global Forecast From...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Image Tagging and Annotation Services Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-image-tagging-and-annotation-services-market
    Explore at:
    pdf, pptx, csv
    Available download formats
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Image Tagging and Annotation Services Market Outlook



    The global image tagging and annotation services market size was valued at approximately USD 1.5 billion in 2023 and is projected to reach around USD 4.8 billion by 2032, growing at a compound annual growth rate (CAGR) of about 14%. This robust growth is driven by the exponential rise in demand for machine learning and artificial intelligence applications, which heavily rely on annotated datasets to train algorithms effectively. The surge in digital content creation and the increasing need for organized data for analytical purposes are also significant contributors to the market expansion.



    One of the primary growth factors for the image tagging and annotation services market is the increasing adoption of AI and machine learning technologies across various industries. These technologies require large volumes of accurately labeled data to function optimally, making image tagging and annotation services crucial. Specifically, sectors such as healthcare, automotive, and retail are investing in AI-driven solutions that necessitate high-quality annotated images to enhance machine learning models' efficiency. For example, in healthcare, annotated medical images are essential for developing tools that can aid in diagnostics and treatment decisions. Similarly, in the automotive industry, annotated images are pivotal for the development of autonomous vehicles.



    Another significant driver is the growing emphasis on improving customer experience through personalized solutions. Companies are leveraging image tagging and annotation services to better understand consumer behavior and preferences by analyzing visual content. In retail, for instance, businesses analyze customer-generated images to tailor marketing strategies and improve product offerings. Additionally, the integration of augmented reality (AR) and virtual reality (VR) in various applications has escalated the need for precise image tagging and annotation, as these technologies rely on accurately labeled datasets to deliver immersive experiences.



    Data Collection and Labeling are foundational components in the realm of image tagging and annotation services. The process of collecting and labeling data involves gathering vast amounts of raw data and meticulously annotating it to create structured datasets. These datasets are crucial for training machine learning models, enabling them to recognize patterns and make informed decisions. The accuracy of data labeling directly impacts the performance of AI systems, making it a critical step in the development of reliable AI applications. As industries increasingly rely on AI-driven solutions, the demand for high-quality data collection and labeling services continues to rise, underscoring their importance in the broader market landscape.



    The rising trend of digital transformation across industries has also significantly bolstered the demand for image tagging and annotation services. Organizations are increasingly investing in digital tools that can automate processes and enhance productivity. Image annotation plays a critical role in enabling technologies such as computer vision, which is instrumental in automating tasks ranging from quality control to inventory management. Moreover, the proliferation of smart devices and the Internet of Things (IoT) has led to an unprecedented amount of image data generation, further fueling the need for efficient image tagging and annotation services to make sense of the vast data deluge.



    From a regional perspective, North America is currently the largest market for image tagging and annotation services, attributed to the early adoption of advanced technologies and the presence of numerous tech giants investing in AI and machine learning. The region is expected to maintain its dominance due to ongoing technological advancements and the growing demand for AI solutions across various sectors. Meanwhile, the Asia Pacific region is anticipated to experience the fastest growth during the forecast period, driven by rapid industrialization, increasing internet penetration, and the rising adoption of AI technologies in countries like China, India, and Japan. The European market is also witnessing steady growth, supported by government initiatives promoting digital innovation and the use of AI-driven applications.



    Service Type Analysis



    The service type segment in the image tagging and annotation services market is bifurcated into manual annotation and automated annotation.

  14. The EduRABSA Dataset and the ASQE-DPT Annotation Tool

    • zenodo.org
    zip
    Updated Sep 1, 2025
    Cite
    Cathy Hua; Cathy Hua; Paul Denny; Paul Denny; Jörg Simon Wicker; Jörg Simon Wicker; Katerina Taškova; Katerina Taškova (2025). The EduRABSA Dataset and the ASQE-DPT Annotation Tool [Dataset]. http://doi.org/10.5281/zenodo.16935018
    Explore at:
zip
Available download formats
    Dataset updated
    Sep 1, 2025
    Dataset provided by
Zenodo (http://zenodo.org/)
    Authors
    Cathy Hua; Cathy Hua; Paul Denny; Paul Denny; Jörg Simon Wicker; Jörg Simon Wicker; Katerina Taškova; Katerina Taškova
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the full dataset and data annotation tool from our work: EduRABSA: An Education Review Dataset for Aspect-based Sentiment Analysis Tasks

    The full information and instructions are available at https://github.com/yhua219/edurabsa_dataset_and_annotation_tool

    The EduRABSA Dataset

    The EduRABSA dataset is licensed under a Creative Commons Attribution 4.0 International License.

    Education Review ABSA (EduRABSA) is a manually annotated student review text dataset for multiple Aspect-based Sentiment Analysis tasks, including:

    • Aspect-(opinion-category)-Sentiment Quadruplet Extraction (ASQE)
    • Aspect-(opinion)-Sentiment Triplet Extraction (ASTE)
    • Aspect Sentiment Classification (ASC; a.k.a. Aspect Polarity Classification; APC)
    • Aspect Category Detection (ACD) / Aspect-opinion Categorisation (AOC)
    • Aspect-Opinion Pair-Extraction (AOPE)
    • Aspect Extraction (AE)
    • Opinion Extraction (OE)
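The tasks above are nested: each simpler task is a projection of the full ASQE quadruplet. A minimal illustration of that relationship (the example sentence and field names are hypothetical, not the dataset's actual schema, which is documented in the GitHub repository):

```python
# Illustrative ABSA annotations for one review sentence (hypothetical schema).
review = "The lectures were engaging but the workload was brutal."

asqe_quadruplets = [  # ASQE: aspect + opinion category + opinion + sentiment
    {"aspect": "lectures", "category": "teaching",
     "opinion": "engaging", "sentiment": "positive"},
    {"aspect": "workload", "category": "course design",
     "opinion": "brutal", "sentiment": "negative"},
]

# The simpler tasks are projections of the quadruplets:
aste_triplets = [(q["aspect"], q["opinion"], q["sentiment"])
                 for q in asqe_quadruplets]          # ASTE
aspects = [q["aspect"] for q in asqe_quadruplets]    # AE
opinions = [q["opinion"] for q in asqe_quadruplets]  # OE
```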

    The dataset consists of 6,500 stratified samples of public tertiary student review texts in English, released in 2020-2023, on courses ("course review", N=3,000), teaching staff ("teacher review", N=3,000), and university ("university review", N=500).

    Dataset Information

    Please visit https://github.com/yhua219/edurabsa_dataset_and_annotation_tool

    Unannotated dataset Source

    Review Type | Dataset Name | Publish Year | Licence | Total Entries | Sampled (N=6,500)
    Course review | Course Reviews University of Waterloo [1] | October 2022 | CC0: Public Domain | 14,810 | 3,000
    Teacher review | Big Data Set from RateMyProfessor.com for Professors' Teaching Evaluation [2] | March 2020 | CC BY 4.0 | 19,145 | 3,000
    University review | University of Exeter Reviews [3] | June 2023 | CC0: Public Domain | 557 | 500

    [1]: Waterloo Course Reviews. Course Reviews University of Waterloo. October 2022.
    [2]: RateMyProfessor Dataset. Big Data Set from RateMyProfessor.com for Professors' Teaching Evaluation. March 2020.
    [3]: Exeter Reviews. University of Exeter Reviews. June 2023.

    The ASQE-DPT Annotation Tool

    The ASQE-DPT data annotation tool is licensed under the MIT License.

    ASQE-DPT is a manual ABSA annotation tool that we extended from the ABSA Dataset Prepare Tool (DPT) (source) to support more comprehensive and challenging ABSA tasks.

    ASQE-DPT is a no-code, no-installation, single HTML file that can be used locally and offline to protect data security/privacy.

    The .zip file contains the annotation tool and real unannotated and annotated data samples.

    For usage instructions, please visit https://github.com/yhua219/edurabsa_dataset_and_annotation_tool

  15. Image Annotation Service Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 5, 2024
    Cite
    Dataintelo (2024). Image Annotation Service Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/image-annotation-service-market
    Explore at:
pdf, pptx, csv
Available download formats
    Dataset updated
    Oct 5, 2024
    Dataset authored and provided by
    Dataintelo
    License

https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Image Annotation Service Market Outlook



    The global Image Annotation Service market size was valued at approximately USD 1.2 billion in 2023 and is expected to reach around USD 4.5 billion by 2032, reflecting a compound annual growth rate (CAGR) of 15.6% during the forecast period. The driving factors behind this growth include the increasing adoption of artificial intelligence (AI) and machine learning (ML) technologies across various industries, which necessitate large volumes of annotated data for accurate model training.



    One of the primary growth factors for the Image Annotation Service market is the accelerating development and deployment of AI and ML applications. These technologies depend heavily on high-quality annotated data to improve the accuracy of their predictive models. As businesses across sectors such as autonomous vehicles, healthcare, and retail increasingly integrate AI-driven solutions, the demand for precise image annotation services is anticipated to surge. For instance, autonomous vehicles rely extensively on annotated images to identify objects, pedestrians, and road conditions, thereby ensuring safety and operational efficiency.



    Another significant growth factor is the escalating use of image annotation services in healthcare. Medical imaging, which includes X-rays, MRIs, and CT scans, requires precise annotation to assist in the diagnosis and treatment of various conditions. The integration of AI in medical imaging allows for faster and more accurate analysis, leading to improved patient outcomes. This has led to a burgeoning demand for image annotation services within the healthcare sector, propelling market growth further.



    The rise of e-commerce and retail sectors is yet another critical growth driver. With the growing trend of online shopping, retailers are increasingly leveraging AI to enhance customer experience through personalized recommendations and visual search capabilities. Annotated images play a pivotal role in training AI models to recognize products, thereby optimizing inventory management and improving customer satisfaction. Consequently, the retail sector's investment in image annotation services is expected to rise significantly.



    Geographically, North America is anticipated to dominate the Image Annotation Service market owing to its well-established technology infrastructure and the presence of leading AI and ML companies. Additionally, the region's strong focus on research and development, coupled with substantial investments in AI technologies by both government and private sectors, is expected to bolster market growth. Europe and Asia Pacific are also expected to experience significant growth, driven by increasing AI adoption and the expansion of tech startups focused on AI solutions.



    Annotation Type Analysis



    The image annotation service market is segmented into several annotation types, including Bounding Box, Polygon, Semantic Segmentation, Keypoint, and Others. Each annotation type serves distinct purposes and is applied based on the specific requirements of the AI and ML models being developed. Bounding Box annotation, for example, is widely used in object detection applications. By drawing rectangles around objects of interest in an image, this method allows AI models to learn how to identify and locate various items within a scene. Bounding Box annotation is integral in applications like autonomous vehicles and retail, where object identification and localization are crucial.



    Polygon annotation provides a more granular approach compared to Bounding Box. It involves outlining objects with polygons, which offers precise annotation, especially for irregularly shaped objects. This type is particularly useful in applications where accurate boundary detection is essential, such as in medical imaging and agricultural monitoring. For instance, in agriculture, polygon annotation aids in identifying and quantifying crop health by precisely mapping the shape of plants and leaves.



    Semantic Segmentation is another critical annotation type. Unlike the Bounding Box and Polygon methods, Semantic Segmentation involves labeling each pixel in an image with a class, providing a detailed understanding of the entire scene. This type of annotation is highly valuable in applications requiring comprehensive scene analysis, such as autonomous driving and medical diagnostics. Through semantic segmentation, AI models can distinguish between different objects and understand their spatial relationships, which is vital for safe navigation in autonomous vehicles and accurate disease detection.
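The three geometry types discussed above differ in what they store per object; a minimal sketch using COCO-style conventions (all coordinates and class ids invented for illustration):

```python
import numpy as np

# Bounding box: axis-aligned rectangle, COCO order [x, y, width, height].
bbox = [120, 80, 60, 40]

# Polygon: (x, y) vertices tracing the outline; a tighter fit for
# irregularly shaped objects than a rectangle.
polygon = [(120, 80), (180, 85), (175, 120), (125, 115)]

# Semantic segmentation: one class id per pixel for the entire image.
seg_mask = np.zeros((240, 320), dtype=np.uint8)  # 0 = background
seg_mask[80:120, 120:180] = 1                    # 1 = object class

# A bounding box can always be derived from the finer geometries:
xs, ys = zip(*polygon)
derived_bbox = [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]
```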

  16. Data from: X-ray CT data with semantic annotations for the paper "A workflow...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Jun 5, 2025
    Cite
    Agricultural Research Service (2025). X-ray CT data with semantic annotations for the paper "A workflow for segmenting soil and plant X-ray CT images with deep learning in Google’s Colaboratory" [Dataset]. https://catalog.data.gov/dataset/x-ray-ct-data-with-semantic-annotations-for-the-paper-a-workflow-for-segmenting-soil-and-p-d195a
    Explore at:
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    Leaves from genetically unique Juglans regia plants were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA. Soil samples were collected in fall 2017 from the riparian oak forest located at the Russell Ranch Sustainable Agricultural Institute at the University of California, Davis. The soil was sieved through a 2 mm mesh and air dried before imaging. A single soil aggregate was scanned at 23 keV using the 10x objective lens with a pixel resolution of 650 nanometers on beamline 8.3.2 at the ALS. Additionally, a drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis was scanned using a 4x lens with a pixel resolution of 1.72 µm on beamline 8.3.2 at the ALS.

    Raw tomographic image data was reconstructed using TomoPy. Reconstructions were converted to 8-bit tif or png format using ImageJ or the PIL package in Python before further processing. Images were annotated using Intel's Computer Vision Annotation Tool (CVAT) and ImageJ; both are free to use and open source. Leaf images were annotated following Théroux-Rancourt et al. (2020): hand labeling was done directly in ImageJ by drawing around each tissue, with 5 images annotated per leaf. Care was taken to cover a range of anatomical variation to help improve the generalizability of the models to other leaves. All slices were labeled by Dr. Mina Momayyezi and Fiona Duong.

    To annotate the flower bud and soil aggregate, images were imported into CVAT. The exterior border of the bud (i.e., bud scales) and flower were annotated in CVAT and exported as masks. Similarly, the exterior of the soil aggregate and particulate organic matter identified by eye were annotated in CVAT and exported as masks. To annotate air spaces in both the bud and soil aggregate, images were imported into ImageJ.
    A Gaussian blur was applied to each image to decrease noise, and the air space was then segmented using thresholding. After applying the threshold, the selected air-space region was converted to a binary image, with white representing the air space and black representing everything else. This binary image was overlaid upon the original image, and the air space within the flower bud and aggregate was selected using the "free hand" tool; air space outside the region of interest was eliminated for both image sets. The quality of the air-space annotation was then visually inspected for accuracy against the underlying original image; incomplete annotations were corrected using the brush or pencil tool to paint missing air space white and incorrectly identified air space black. Once the annotation was satisfactorily corrected, the binary image of the air space was saved. Finally, the annotations of the bud and flower or aggregate and organic matter were opened in ImageJ and the associated air-space mask was overlaid on top of them, forming a three-layer mask suitable for training the fully convolutional network. All labeling of the soil aggregate images was done by Dr. Devin Rippner.

    These images and annotations are for training deep learning models to identify different constituents in leaves, almond buds, and soil aggregates.

    Limitations: For the walnut leaves, some tissues (stomata, etc.) are not labeled, and the labeled slices represent only a small portion of a full leaf. Similarly, both the almond bud and the aggregate represent just one sample each. The bud tissues are only divided into bud scales, flower, and air space; many other tissues remain unlabeled. The soil aggregate labels were done by eye with no chemical information, so particulate organic matter identification may be incorrect.

    Resources in this dataset:

    Resource Title: Annotated X-ray CT images and masks of a Forest Soil Aggregate.
    File Name: forest_soil_images_masks_for_testing_training.zip
    Resource Description: This aggregate was collected from the riparian oak forest at the Russell Ranch Sustainable Agricultural Facility. The aggregate was scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 0,0,0; pore spaces have a value of 250,250,250; mineral solids have a value of 128,0,0; and particulate organic matter has a value of 0,128,0. These files were used for training a model to segment the forest soil aggregate and for testing the accuracy, precision, recall, and F1 score of the model.

    Resource Title: Annotated X-ray CT images and masks of an Almond bud (P. dulcis).
    File Name: Almond_bud_tube_D_P6_training_testing_images_and_masks.zip
    Resource Description: A drought-stressed almond flower bud (Prunus dulcis) from a plant housed at the University of California, Davis was scanned by X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA using the 4x lens with a pixel resolution of 1.72 µm. For masks, the background has a value of 0,0,0; air spaces have a value of 255,255,255; bud scales have a value of 128,0,0; and flower tissues have a value of 0,128,0. These files were used for training a model to segment the almond bud and for testing the accuracy, precision, recall, and F1 score of the model.
    Resource Software Recommended: Fiji (ImageJ), url: https://imagej.net/software/fiji/downloads

    Resource Title: Annotated X-ray CT images and masks of Walnut leaves (J. regia).
    File Name: 6_leaf_training_testing_images_and_masks_for_paper.zip
    Resource Description: Stems were collected from genetically unique J. regia accessions at the USDA-ARS-NCGR in Wolfskill Experimental Orchard, Winters, California, USA to use as scion, and were grafted by Sierra Gold Nursery onto a commonly used commercial rootstock, RX1 (J. microcarpa × J. regia). We used a common rootstock to eliminate any own-root effects and to simulate conditions for a commercial walnut orchard setting, where rootstocks are commonly used. The grafted saplings were repotted and transferred to the Armstrong lathe house facility at the University of California, Davis in June 2019, and kept under natural light and temperature. Leaves from each accession and treatment were scanned using X-ray micro-computed tomography (microCT) on the X-ray μCT beamline (8.3.2) at the Advanced Light Source (ALS) at Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA, USA using the 10x objective lens with a pixel resolution of 650 nanometers. For masks, the background has a value of 170,170,170; epidermis 85,85,85; mesophyll 0,0,0; bundle sheath extension 152,152,152; vein 220,220,220; air 255,255,255.
    Resource Software Recommended: Fiji (ImageJ), url: https://imagej.net/software/fiji/downloads
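The blur-then-threshold air-space segmentation described above was performed in ImageJ; the same steps can be sketched in Python with numpy alone (toy image; sigma and threshold values are illustrative, not the ones used by the authors):

```python
import numpy as np

def gaussian_blur(image: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur built with a numpy-only 1-D kernel."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    # Convolve each row, then each column.
    blurred = np.apply_along_axis(np.convolve, 1, image.astype(float), kernel, mode="same")
    return np.apply_along_axis(np.convolve, 0, blurred, kernel, mode="same")

def segment_air_space(image: np.ndarray, sigma: float = 1.0,
                      threshold: float = 0.5) -> np.ndarray:
    """Mimic the described ImageJ workflow: Gaussian blur to suppress noise,
    then a global threshold. True/white = air space, False/black = rest."""
    return gaussian_blur(image, sigma) > threshold

# Toy slice: a bright "pore" inside a dark matrix.
slice_img = np.zeros((64, 64))
slice_img[20:40, 20:40] = 1.0
mask = segment_air_space(slice_img)
```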

  17. Data from: iRead4Skills Dataset 2: annotated corpora by level of complexity...

    • data.niaid.nih.gov
    • chef.afue.org
    Updated Jan 15, 2025
    Cite
    Monteiro, Ricardo (2025). iRead4Skills Dataset 2: annotated corpora by level of complexity for FR, PT and SP [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_12821881
    Explore at:
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    Reis, Maria Leonor
    François, Thomas
    Moutinho, Michell
    Rodríguez Rey, Sandra
    Correia, Susana
    Amaro, Raquel
    Mu, Keran
    Justine, Nagant de Deuxchaisnes
    Bernárdez Braña, André
    Pintard, Alice
    Barbosa, Sílvia
    Monteiro, Ricardo
    Garcia González, Marcos
    Blanco Escoda, Xavier
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0)https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    The Dataset 2: annotated corpora by level of complexity for FR, PT and SP is a collection of texts categorized by complexity level and annotated for complexity features, presented in Excel format (.xlsx). These corpora were compiled and annotated under the scope of the project iRead4Skills – Intelligent Reading Improvement System for Fundamental and Transversal Skills Development, funded by the European Commission (grant number: 1010094837). The project aims to enhance reading skills within the adult population by creating an intelligent system that assesses text complexity and recommends suitable reading materials to adults with low literacy skills, contributing to reducing skills gaps and facilitating access to information and culture (https://iread4skills.com).

    This dataset is the result of specifically devised classification and annotation tasks, in which selected texts were organized and distributed to trainers in Adult Learning (AL) and Vocational Education Training (VET) Centres, as well as to adult students in AL and VET centres. This task was conducted via the Qualtrics platform.

    The Dataset 2: annotated corpora by level of complexity for FR, PT and SP is derived from the iRead4Skills Dataset 1: corpora by level of complexity for FR, PT and SP (https://doi.org/10.5281/zenodo.10055909), which comprises written texts of various genres and complexity levels. From this collection, a sample of texts was selected for classification and annotation. This classification and annotation task aimed to provide additional data and test sets for the complexity analysis systems for the three languages of the project: French, Portuguese, and Spanish. The sample texts in each language corpus were selected taking into account the diversity of topics/domains, genres, and the reading preferences of the iRead4Skills project's target audience. The sample amounted to a total of 462 texts per language, divided by level of complexity as follows:

    · 140 Very Easy texts

    · 140 Easy texts

    · 140 Plain texts

    · 42 More Complex texts.

    Trainers and students were asked to classify the texts according to the complexity levels of the project, here informally defined as:

    · Very Easy (everyone can understand the text or most of the text).

    · Easy (a person with less than the 9th year of schooling can understand the text or most of the text)

    · Plain (a person with the 9th year of schooling can understand the text the first time he/she reads it)

    · More complex (a person with the 9th year of schooling cannot understand the text the first time he/she reads it).

    Annotators were also asked to mark the parts of the texts considered complex according to various types of features at word level and sentence level (e.g., word order, sentence composition). Full details of the students' and trainers' tasks, qualitative and quantitative descriptions of the data, and inter-annotator agreement are available at https://zenodo.org/records/14653180

    The results are presented here in Excel format. For each language and each group (trainers and students), there is a pair of files, the annotation file and the classification file, resulting in four files per language and twelve files in total.

    In all files, the data is organized as a matrix, with each row representing an ‘answer’ from a particular participant, and the columns containing various details about that specific input, as shown below:

    · Annotator's ID: The randomly generated ID code for each annotator, together with information on the dataset assigned to them.

    · Progress: Information on the completion of the task (for each text).

    · Duration (seconds): Time used in the completion of the task (for each text).

    · File Name: File internal identification, providing its iRead4Skills classification (N1 = Very Easy, N2 = Easy, N3 = Plain, N4 = More Complex).

    · Text: The content of the file, i.e. the text itself.

    · Annotated Level: Level assigned by the annotator (trainer).

    · Proficiency SubLevel (Likert scale, 1 to 5): SubLevel assigned by the annotator (trainer) for FR data.

    · Corresponding CEFR Level: CEFR level closest to the iRead4Skills level.

    · Additional Info: Observations made by the trainers/students.

    · Annotated Term: Word or set of words selected for annotation.

    · Term Label: Annotation assigned to the Annotated Term (difficult word, word order, etc.).

    · Term Index: Position of the annotated term in the text.

    · Annotator's Proficiency Level: Level of AL/VET of the student.

    · Text adequate for user: Validation of the text by the students.

    The content of the column “File Name” is color-coded, where a green shade alludes to a text with a lower level of complexity and a red one alludes to one with a higher level of complexity.

    The complete datasets are available under a Creative Commons CC BY-NC-ND 4.0 license.
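Since each row of the Excel files is one annotator answer, simple analyses reduce to column operations. A pandas sketch on hypothetical rows shaped like the matrix described above (values invented for illustration; the real files are read with `pd.read_excel`):

```python
import pandas as pd

# Hypothetical answers shaped like the described matrix: one row per answer.
answers = pd.DataFrame([
    {"Annotator's ID": "A01", "File Name": "N1_012", "Annotated Level": "Very Easy"},
    {"Annotator's ID": "A02", "File Name": "N1_012", "Annotated Level": "Easy"},
    {"Annotator's ID": "A01", "File Name": "N4_003", "Annotated Level": "More Complex"},
])

# The file-name prefix encodes the corpus-assigned level (N1..N4).
level_codes = {"N1": "Very Easy", "N2": "Easy", "N3": "Plain", "N4": "More Complex"}
answers["Assigned Level"] = answers["File Name"].str[:2].map(level_codes)

# Share of annotator answers that agree with the corpus assignment.
agreement = (answers["Annotated Level"] == answers["Assigned Level"]).mean()
print(f"agreement: {agreement:.2f}")  # -> agreement: 0.67
```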

  18. B4 Tatian Corpus of Deviating Examples 2.1 - Dataset - B2FIND

    • b2find.eudat.eu
    Updated May 3, 2023
    Cite
    (2023). B4 Tatian Corpus of Deviating Examples 2.1 - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/2a46057f-c923-562f-b75d-d5145aca9700
    Explore at:
    Dataset updated
    May 3, 2023
    Description

    The present corpus, the Tatian Corpus of Deviating Examples T-CODEX 2.1, provides morpho-syntactic and information structural annotation of parts of the Old High German translation attested in the MS St. Gallen Cod. 56, traditionally called the OHG Tatian, one of the largest prose texts from the classical OHG period. This corpus was designed and annotated by Project B4 of the Collaborative Research Center on Information Structure at Humboldt University Berlin. The present corpus compiles ca. 2,000 deviating examples found in the text portions of the scribes α, β, γ and ε. Each clause structure represents an extra file annotated with the annotation tool EXMARaLDA and searchable via ANNIS, a general-purpose tool for the publication, visualisation and querying of linguistic data collections, developed by Project D1 of the Collaborative Research Center on Information Structure at Potsdam University.

    CLARIN Metadata summary for B4 Tatian Corpus of Deviating Examples 2.1 (CMDI-based)

    Title: B4 Tatian Corpus of Deviating Examples 2.1
    Publication date: 2014-12-01
    Data owner: Prof. Dr. Svetlana Petrova
    Contributors: Svetlana Petrova (editor), Karin Donhauser (editor), Carolin Odebrecht (editor), Svetlana Petrova (annotator), Carolin Odebrecht (annotator), Michael Solf (annotator), Yen Chun Chen (annotator), Axel Kullick (annotator), Malte Battefeld (annotator), Sonja Linde (annotator), Anke Gehrlein (annotator)
    Project: Special Research Centre 632 Information structure, German Research Foundation
    Keywords: historical texts, religious texts, information structure
    Languages: Latin (lat), Old High German (goh)
    Size: 11,295 tokens
    Segmentation units: other
    Annotation types: aboutness (manual), tok (manual), LAT (manual), align (manual), pos (manual), cat (manual), clause-status (manual), gf (manual), syl_no (manual), givenness (manual), top-comm (manual), position (manual), topic-marker (manual), definiteness (manual), foc-bg (manual), foc-marker (manual), context (manual), comment (manual), bibl (manual), meta::writer (manual), meta::corpus-code (manual), meta::page (manual), X::abbreviation (manual), X::sex (manual)
    Temporal coverage: 830-01-01/830-12-31
    Spatial coverage: Fulda, DE
    Genre: religious text
    Modality: written

  19. Data for the evaluation of the MAIA method for image annotation

    • zenodo.org
    • eprints.soton.ac.uk
    csv
    Updated Jan 24, 2020
    Cite
    Martin Zurowietz; Martin Zurowietz; Daniel Langenkämper; Daniel Langenkämper; Brett Hosking; Brett Hosking; Henry A Ruhl; Tim W Nattkemper; Henry A Ruhl; Tim W Nattkemper (2020). Data for the evaluation of the MAIA method for image annotation [Dataset]. http://doi.org/10.5281/zenodo.1453836
    Explore at:
csv
Available download formats
    Dataset updated
    Jan 24, 2020
    Dataset provided by
Zenodo (http://zenodo.org/)
    Authors
    Martin Zurowietz; Martin Zurowietz; Daniel Langenkämper; Daniel Langenkämper; Brett Hosking; Brett Hosking; Henry A Ruhl; Tim W Nattkemper; Henry A Ruhl; Tim W Nattkemper
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains all annotations and annotation candidates that were used for the evaluation of the MAIA method for image annotation. Each row in the CSVs represents one annotation candidate or final annotation. Annotation candidates have the label "OOI candidate" (label_id 9974). All other entries represent final reviewed annotations. Each CSV contains the information for one of the three image datasets that were used in the evaluation.

    Visual exploration of the data is possible in the BIIGLE 2.0 image annotation system at https://biigle.de/projects/139 using the login maia@example.com and the password MAIApaper.
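Since candidates and final annotations share one CSV and differ only by label, separating them is a single filter on the label_id 9974 noted above. A pandas sketch on a hypothetical miniature of one such CSV (column names invented for illustration; the real files may use different headers):

```python
import pandas as pd

# Hypothetical miniature of one evaluation CSV: each row is an annotation
# candidate or a final reviewed annotation.
rows = pd.DataFrame([
    {"annotation_id": 1, "label_id": 9974, "label_name": "OOI candidate"},
    {"annotation_id": 2, "label_id": 9974, "label_name": "OOI candidate"},
    {"annotation_id": 3, "label_id": 412,  "label_name": "Sea cucumber"},
])

OOI_CANDIDATE = 9974  # label_id marking unreviewed candidates (from the description)
candidates = rows[rows["label_id"] == OOI_CANDIDATE]
final_annotations = rows[rows["label_id"] != OOI_CANDIDATE]
print(len(candidates), len(final_annotations))  # -> 2 1
```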

  20. Older Adult Annotator Demographic and Attitudinal Survey

    • dataverse.harvard.edu
    Updated Jul 10, 2020
    Cite
    Mark Diaz (2020). Older Adult Annotator Demographic and Attitudinal Survey [Dataset]. http://doi.org/10.7910/DVN/GXS7DI
    Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 10, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Mark Diaz
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Contains survey responses from the sample of older adult annotators. The data include demographic information, respondents' experience with age discrimination, and their awareness of age-related data in algorithmic systems. The data also include responses to an Age Anxiety survey developed by Lasher & Faulkender (https://doi.org/10.2190/1U69-9AU2-V6LH-9Y1L).
