100+ datasets found
  1. Annotated Database Bibliography (ADBB) of Datasets on Institutions and...

    • demo-b2find.dkrz.de
    Updated Sep 21, 2025
    Cite
    (2025). Annotated Database Bibliography (ADBB) of Datasets on Institutions and Conflict in Divided Societies - Dataset - B2FIND [Dataset]. http://demo-b2find.dkrz.de/dataset/cc430fe6-ed4b-5db6-98ed-0cf1242be4c9
    Dataset updated
    Sep 21, 2025
    Description

    The ADBB is a meta-dataset from Comparative Area Studies that collects and categorizes datasets in the study of institutions and conflict in divided societies at a global level (from 1945 - 2012). For detailed information see GIGA Working Paper No. 234.

  2. Video Annotation Services | AI-assisted Labeling | Computer Vision Data |...

    • datarade.ai
    Updated Jan 27, 2024
    Cite
    Nexdata (2024). Video Annotation Services | AI-assisted Labeling | Computer Vision Data | Video Data | Annotated Imagery Data [Dataset]. https://datarade.ai/data-products/nexdata-video-annotation-services-ai-assisted-labeling-nexdata
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Jan 27, 2024
    Dataset authored and provided by
    Nexdata
    Area covered
    Chile, Korea (Republic of), Belarus, Montenegro, United Kingdom, Paraguay, United Arab Emirates, Portugal, Sri Lanka, Germany
    Description
    1. Overview
    We provide various types of Video Data annotation services, including:
    • Video classification
    • Timestamps
    • Video tracking
    • Video detection ...

    2. Our Capacity
    • Platform: Our platform supports human-machine interaction and semi-automatic labeling, increasing labeling efficiency by more than 30% per annotator. It has been applied to nearly 5,000 projects.
    • Annotation Tools: Nexdata's platform integrates 30 sets of annotation templates, covering audio, image, video, point cloud and text.
    • Secure Implementation: An NDA is signed to guarantee secure implementation, and Annotated Imagery Data is destroyed upon delivery.
    • Quality: Multiple rounds of quality inspection ensure high-quality data output, certified with ISO9001.

    3. About Nexdata
    Nexdata has global data processing centers and more than 20,000 professional annotators, supporting on-demand data annotation services for speech, image, video, point cloud and Natural Language Processing (NLP) data. Please visit us at https://www.nexdata.ai/datasets/computervision?source=Datarade
  3. Data Annotation Platform Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Feb 8, 2025
    Cite
    Data Insights Market (2025). Data Annotation Platform Report [Dataset]. https://www.datainsightsmarket.com/reports/data-annotation-platform-1421124
    Available download formats: doc, ppt, pdf
    Dataset updated
    Feb 8, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The size of the Data Annotation Platform market was valued at USD XXX million in 2024 and is projected to reach USD XXX million by 2033, with an expected CAGR of XX% during the forecast period.

  4. Annotated Data, part 5

    • data.researchdatafinder.qut.edu.au
    Updated Oct 24, 2016
    Cite
    (2016). Annotated Data, part 5 [Dataset]. https://data.researchdatafinder.qut.edu.au/dataset/saivt-buildingmonitoring/resource/827b8eb9-ffc1-4d0a-bb6b-19c3d986ea6b
    Dataset updated
    Oct 24, 2016
    License

    http://researchdatafinder.qut.edu.au/display/n47576

    Description

    md5sum: 116aade568ccfeaefcdd07b5110b815a. QUT Research Data Repository dataset resource available for download.
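
A published checksum like the one above is typically used to confirm that a download arrived intact. The following is a minimal sketch of that check, assuming the resource has already been saved locally. The function names and the file path passed in are hypothetical, since the listing does not name the file.

```python
import hashlib

# Checksum quoted in the listing above.
EXPECTED_MD5 = "116aade568ccfeaefcdd07b5110b815a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_listing` with the path of the downloaded file returns `True` only when the digests agree. A mismatch usually points to a truncated or corrupted download.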

  5. AI Data Annotation Service Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 14, 2025
    Cite
    Data Insights Market (2025). AI Data Annotation Service Report [Dataset]. https://www.datainsightsmarket.com/reports/ai-data-annotation-service-528915
    Available download formats: pdf, ppt, doc
    Dataset updated
    May 14, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    CA
    Variables measured
    Market Size
    Description

    The booming AI data annotation market, projected to reach $10 billion by 2033, is driven by increasing demand for high-quality training data in sectors like healthcare, autonomous driving, and content moderation. Learn about market trends, key players, and growth projections in this comprehensive analysis.

  6. Data from: OpenApePose: a database of annotated ape photographs for pose...

    • nde-dev.biothings.io
    • datasetcatalog.nlm.nih.gov
    • +2more
    zip
    Updated Aug 8, 2023
    Cite
    Jan Zimmermann; Nisarg Desai; Praneet Bala; Rebecca Richardson; Jessica Raper; Benjamin Hayden (2023). OpenApePose: a database of annotated ape photographs for pose estimation [Dataset]. http://doi.org/10.5061/dryad.c59zw3rds
    Available download formats: zip
    Dataset updated
    Aug 8, 2023
    Dataset provided by
    University of Minnesota
    Emory University
    Authors
    Jan Zimmermann; Nisarg Desai; Praneet Bala; Rebecca Richardson; Jessica Raper; Benjamin Hayden
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Because of their close relationship with humans, non-human apes (chimpanzees, bonobos, gorillas, orangutans, and gibbons, including siamangs) are of great scientific interest. The goal of understanding their complex behavior would be greatly advanced by the ability to perform video-based pose tracking. Tracking, however, requires high-quality annotated datasets of ape photographs. Here we present OpenApePose, a new public dataset of 71,868 photographs, annotated with 16 body landmarks, of six ape species in naturalistic contexts. We show that a standard deep net (HRNet-W48) trained on ape photos can reliably track out-of-sample ape photos better than networks trained on monkeys (specifically, the OpenMonkeyPose dataset) and on humans (COCO) can. This trained network can track apes almost as well as the other networks can track their respective taxa, and models trained without one of the six ape species can track the held-out species better than the monkey and human models can. Ultimately, the results of our analyses highlight the importance of large specialized databases for animal tracking systems and confirm the utility of our new ape database.

  7. Data from: Annotated Protein Database Using Known Cleavage Sites for Rapid...

    • acs.figshare.com
    zip
    Updated May 31, 2023
    Cite
    Dylan J. Harney; Mark Larance (2023). Annotated Protein Database Using Known Cleavage Sites for Rapid Detection of Secreted Proteins [Dataset]. http://doi.org/10.1021/acs.jproteome.1c00806.s002
    Available download formats: zip
    Dataset updated
    May 31, 2023
    Dataset provided by
    ACS Publications
    Authors
    Dylan J. Harney; Mark Larance
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Liquid chromatography tandem mass spectrometry (LC–MS/MS) analysis of secreted proteins has contributed to our understanding of human disease and physiology but is limited by its need for accurate protein database annotation. Common assumptions used in proteomics of perfect protease specificity are inaccurate for secreted proteins, which are cleaved by numerous endogenous proteases. Here, we describe the generation of an optimized protein database that divides proteins into their individual biological chains and peptides to allow fast identification of semi-tryptic peptides from secreted proteins using fully tryptic searches. We applied this biologically annotated database to previously published human plasma proteome data sets containing either DIA or DDA data, using Spectronaut, DIA-NN, MaxDIA, and MaxQuant. Using our annotated database, we greatly reduced search times while achieving similar protein and peptide identifications compared to that obtained from standard approaches using semi-tryptic searches. Furthermore, our database enables the identification of biologically relevant semi-tryptic peptides using data analysis packages that are not capable of semi-tryptic searches. Together, these findings demonstrate that our annotated database is more capable than currently available databases for secreted protein analysis and is particularly useful for large-scale plasma proteome analysis.

  8. UniProt Chordata protein annotation program

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Jul 12, 2013
    Cite
    (2013). UniProt Chordata protein annotation program [Dataset]. http://identifiers.org/RRID:SCR_007071
    Dataset updated
    Jul 12, 2013
    Description

    Data set of manually annotated chordata-specific proteins as well as those that are widely conserved. The program keeps existing human entries up-to-date and broadens the manual annotation to other vertebrate species, especially model organisms, including great apes, cow, mouse, rat, chicken, zebrafish, as well as Xenopus laevis and Xenopus tropicalis. A draft of the complete human proteome is available in UniProtKB/Swiss-Prot and one of the current priorities of the Chordata protein annotation program is to improve the quality of human sequences provided. To this aim, they are updating sequences which show discrepancies with those predicted from the genome sequence. Dubious isoforms, sequences based on experimental artifacts and protein products derived from erroneous gene model predictions are also revisited. This work is in part done in collaboration with the Hinxton Sequence Forum (HSF), which allows active exchange between UniProt, HAVANA, Ensembl and HGNC groups, as well as with RefSeq database. UniProt is a member of the Consensus CDS project and they are in the process of reviewing their records to support convergence towards a standard set of protein annotation. They also continuously update human entries with functional annotation, including novel structural, post-translational modification, interaction and enzymatic activity data. In order to identify candidates for re-annotation, they use, among others, information extraction tools such as the STRING database. In addition, they regularly add new sequence variants and maintain disease information. Indeed, this annotation program includes the Variation Annotation Program, the goal of which is to annotate all known human genetic diseases and disease-linked protein variants, as well as neutral polymorphisms.

  9. Automotive Data Annotation Services Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 6, 2025
    Cite
    Growth Market Reports (2025). Automotive Data Annotation Services Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/automotive-data-annotation-services-market
    Available download formats: csv, pptx, pdf
    Dataset updated
    Oct 6, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Automotive Data Annotation Services Market Outlook



    According to our latest research, the global automotive data annotation services market size was valued at USD 1.54 billion in 2024, with a robust compound annual growth rate (CAGR) of 24.7% expected during the forecast period. By 2033, the market is projected to reach USD 13.9 billion, driven by the accelerating adoption of artificial intelligence (AI) and machine learning (ML) in the automotive sector. The primary growth factor is the increasing demand for high-quality annotated datasets to power advanced driver assistance systems (ADAS) and autonomous vehicle technologies, as automakers and technology providers race to bring safer, smarter vehicles to market.




    One of the most significant growth drivers for the automotive data annotation services market is the rapid evolution of autonomous vehicles and connected car technologies. As automotive manufacturers and technology providers intensify their efforts to develop fully autonomous vehicles, the need for accurately labeled and annotated data has become paramount. Sophisticated AI models require vast amounts of labeled image, video, and sensor data to learn how to interpret real-world scenarios and make split-second decisions. This necessity has fueled a surge in demand for professional data annotation services capable of delivering large-scale, high-quality datasets that power the next generation of automotive intelligence. The complexity and diversity of driving environments—ranging from urban streets to rural highways—further amplify the need for precise and contextually relevant data annotation.




    Another crucial factor propelling the automotive data annotation services market is the growing integration of advanced driver assistance systems (ADAS) and predictive maintenance technologies across both passenger and commercial vehicles. Modern vehicles are increasingly equipped with sensors, cameras, and LiDAR systems that generate enormous volumes of raw data. To extract actionable insights and enable real-time decision-making, this data must be meticulously annotated. Data annotation services are thus playing a pivotal role in enhancing vehicle safety, reducing accidents, and enabling features such as lane departure warnings, adaptive cruise control, and predictive diagnostics. The adoption of connected fleet management solutions by logistics and transportation companies further contributes to market growth, as these solutions rely on annotated data for route optimization, driver behavior analysis, and predictive maintenance.




    The market is also benefiting from the proliferation of partnerships between automotive OEMs, Tier 1 suppliers, and specialized technology providers. These collaborations are fostering innovation in data annotation methodologies, including the development of semi-automated and fully automated annotation tools powered by AI. As the volume and complexity of automotive data continue to grow, companies are increasingly seeking scalable, cost-effective annotation solutions that can maintain high accuracy and consistency. The emergence of cloud-based annotation platforms and the integration of quality assurance mechanisms are further enhancing the reliability and scalability of data annotation services, making them indispensable to the automotive industry's digital transformation.




    Regionally, the Asia Pacific region is emerging as a powerhouse in the automotive data annotation services market, driven by the rapid expansion of the automotive sector in countries like China, Japan, and South Korea. The presence of leading automotive manufacturers and technology innovators, coupled with supportive government initiatives for smart mobility and intelligent transportation systems, is creating a fertile environment for market growth. North America and Europe are also significant contributors, thanks to their early adoption of autonomous vehicle technologies and strong focus on automotive safety standards. Meanwhile, Latin America and the Middle East & Africa are gradually catching up, as global automotive players expand their operations and invest in local talent for data annotation projects.



  10. Healthcare Data Annotation Tools Market Size, Share, Growth and Industry...

    • imarcgroup.com
    pdf,excel,csv,ppt
    Updated Oct 10, 2023
    Cite
    IMARC Group (2023). Healthcare Data Annotation Tools Market Size, Share, Growth and Industry Report [Dataset]. https://www.imarcgroup.com/healthcare-data-annotation-tools-market
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Oct 10, 2023
    Dataset provided by
    Imarc Group
    Authors
    IMARC Group
    License

    https://www.imarcgroup.com/privacy-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    The global healthcare data annotation tools market size reached USD 204.6 Million in 2024. Looking forward, IMARC Group expects the market to reach USD 1,308.5 Million by 2033, exhibiting a growth rate (CAGR) of 22.9% during 2025-2033. The increasing adoption of artificial intelligence (AI) and machine learning (ML) in healthcare, the rise in generating vast amounts of data, significant advancement in medical imaging technologies, and the increasing demand for telemedicine are some of the major factors propelling the market.

    Report Attribute: Key Statistics
    Base Year: 2024
    Forecast Years: 2025-2033
    Historical Years: 2019-2024
    Market Size in 2024: USD 204.6 Million
    Market Forecast in 2033: USD 1,308.5 Million
    Market Growth Rate (2025-2033): 22.9%

    IMARC Group provides an analysis of the key trends in each segment of the global healthcare data annotation tools market report, along with forecasts at the global, regional, and country levels for 2025-2033. Our report has categorized the market based on type, technology, application, and end user.
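
The growth figures quoted for this report can be cross-checked with the standard compound annual growth rate formula. The sketch below assumes nine compounding periods between the 2024 base year and the 2033 forecast; the listing does not state the period count explicitly, and the function names are illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value after compounding `rate` for `years` periods."""
    return start_value * (1.0 + rate) ** years


# Figures from the listing: USD 204.6 million (2024), USD 1,308.5 million (2033).
# Assumption: nine annual periods separate the two values.
implied_rate = cagr(204.6, 1308.5, 2033 - 2024)
print(f"Implied CAGR: {implied_rate:.1%}")
```

Under that nine-period assumption the implied rate comes out close to the 22.9% stated in the description. The same helper can be applied to the other market reports on this page to see whether their quoted start value, end value and CAGR agree with one another.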

  11. Data from: Annotated reference transcriptome for female Culicoides...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +1more
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Annotated reference transcriptome for female Culicoides sonorensis biting midges [Dataset]. https://catalog.data.gov/dataset/annotated-reference-transcriptome-for-female-culicoides-sonorensis-biting-midges-fde74
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    Unigene sequences were annotated by BlastX alignment to the non-redundant protein database (National Center for Biotechnology Information/GenBank) and the Aedes aegypti and Culex quinquefasciatus gene annotations (Vectorbase). This was done with a 1e-05 expectation value. Top hits are shown including accession numbers and description, if available. Unigene number and corresponding GenBank accession numbers are provided for all C. sonorensis genes. Both tables are modified from supplementary information tables at http://dx.doi.org/10.1371/journal.pone.0098123.s003 and numbered accordingly.

    Resources in this dataset:
    • Resource Title: table s2 annotation. File Name: table s2 annotation.xlsx
    • Resource Title: table S3 GO terms. File Name: table S3 GO terms.xlsx
    • Resource Title: data dictionary Nayduch S2 S3. File Name: data dictionary Nayduch S2 S3_2.csv. Resource Description: Defines parameters for annotation and GO terms.

  12. Portuguese Sentiment Corpus for Twitter and

    • kaggle.com
    zip
    Updated Feb 18, 2023
    Cite
    The Devastator (2023). Portuguese Sentiment Corpus for Twitter and [Dataset]. https://www.kaggle.com/datasets/thedevastator/portuguese-sentiment-corpus-for-twitter-and-busc
    Available download formats: zip (934 bytes)
    Dataset updated
    Feb 18, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Portuguese Sentiment Corpus for Twitter and Buscapé Reviews

    Accurately Labeled Word-Level Annotations

    By [source]

    About this dataset

    This dataset consists of a comprehensive list of Portuguese words and the corresponding sentiment labels attached to them. By providing finer-grained annotation and labeling, this dataset allows for comparative sentiment analysis in Portuguese from Twitter and Buscapé reviews. With humans assigned to annotate this data, it provides an accurate measure of the sentiment of Portuguese words in multiple contexts. The labels range from positive to negative with numeric values, allowing for more nuanced categorization and comparison between different subcategories within reviews. Whether you’re mining social media conversations or utilizing customer feedback for analytics purposes, this labeled corpus provides an invaluable resource that can help inform your decision making process

    How to use the dataset

    This dataset, comprised of Twitter and Buscapé reviews from Portuguese-speaking areas, provides sentiment labels at the word level. This makes it easy to apply to natural language processing models for analysis. The corpus is composed of 3,457 tweets and 476 Buscapé reviews, with a total of 114 unique words in the lexicon along with associated human-annotated sentiment scores for each word.

    To use this resource for comparative sentiment analysis, you need an environment that can read CSV files containing both text and numerical data. In such a setting, users can apply machine learning algorithms to compare words or phrases within texts or across different datasets and gain an understanding of the opinions expressed towards various topics, insofar as they have been labeled in this corpus. The data has been annotated with three possible sentiment labels: negative (–1), neutral (0) or positive (+1).

    To work with this dataset effectively, here are some tips:

    • Familiarize yourself with the data, which contains a list of Portuguese words and their associated sentiment labels; reading through the full content list will help you understand how it works;
    • Create a visualization tool that lets you see the weight assigned to each word and also run comparative analyses, such as finding differences between the same nouns used in different sentences;
    • Analyze text holistically by taking contextual information into account;
    • Experiment with different methods that may increase accuracy when dealing with an unequal distribution of examples due to class imbalance.

    Applying these measures should help you obtain reliable results from this linguistically labeled database, which is generated from two distinct corpora (tweets and Buscapé reviews) that had not previously been bridged in this way. It can help you gain insights into people’s opinions on various products based on their textual expressions.
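
The word-level labels described above (–1, 0, +1) lend themselves to a simple lexicon lookup. The sketch below is illustrative only: the column names `word` and `sentiment` and the three sample rows are assumptions, because the listing does not show the actual columns of portuguese_lexicon.csv.

```python
import csv
import io

# Hypothetical sample in the assumed layout; these rows are NOT from the dataset.
SAMPLE_CSV = """word,sentiment
bom,1
ruim,-1
produto,0
"""


def load_lexicon(handle, word_col="word", label_col="sentiment"):
    """Build a {word: label} dict from a CSV file-like object."""
    lexicon = {}
    for row in csv.DictReader(handle):
        lexicon[row[word_col].strip().lower()] = int(row[label_col])
    return lexicon


def score_text(text, lexicon):
    """Sum the labels of known words; words not in the lexicon count as 0."""
    return sum(lexicon.get(token, 0) for token in text.lower().split())


lexicon = load_lexicon(io.StringIO(SAMPLE_CSV))
print(score_text("produto bom", lexicon))
```

To run this against the real file, an open file handle can be passed to `load_lexicon` in place of the in-memory sample, with `word_col` and `label_col` adjusted to whatever headers the file actually uses. Summing labels ignores context and negation, so it is a baseline rather than a full sentiment model.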

    Research Ideas

    • Comparing the sentiment of Twitter and Buscapé reviews to identify trends in customer opinions over time.
    • Understanding how the sentiment of customer reviews compares between different Portuguese languages and dialects.
    • Utilizing the labeled corpus for training machine learning models in natural language processing tasks such as sentiment analysis, text classification, and automated opinion summarization

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: portuguese_lexicon.csv

  13. Data Annotation Tools Market - By Annotation Approach (Automated Annotation...

    • zionmarketresearch.com
    pdf
    Updated Nov 23, 2025
    Cite
    Zion Market Research (2025). Data Annotation Tools Market - By Annotation Approach (Automated Annotation and Manual Annotation), By Data Type (Text, Audio, and Image/Video), By Application (Healthcare, Automotive, IT & Telecom, BFSI, Agriculture, and Retail), and By Region: Industry Perspective, Comprehensive Analysis, and Forecast, 2024 - 2032- [Dataset]. https://www.zionmarketresearch.com/report/data-annotation-tools-market
    Available download formats: pdf
    Dataset updated
    Nov 23, 2025
    Dataset authored and provided by
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    Global Data Annotation Tools Market size at US$ 102.38 Billion in 2023, set to reach US$ 908.57 Billion by 2032 at a CAGR of 24.4% from 2024 to 2032.

  14. Annotated Object Itineraries for Museum Collections Data

    • ordo.open.ac.uk
    • figshare.com
    xml
    Updated Oct 29, 2024
    Cite
    Sarah Middle; Elton Barker; Maria Aristeidou (2024). Annotated Object Itineraries for Museum Collections Data [Dataset]. http://doi.org/10.21954/ou.rd.27323799.v1
    Available download formats: xml
    Dataset updated
    Oct 29, 2024
    Dataset provided by
    The Open University
    Authors
    Sarah Middle; Elton Barker; Maria Aristeidou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was produced through a study funded by the Open Societal Challenges programme and using tools developed through the Pelagios Network (https://pelagios.org/) to annotate and visualise object itineraries in museum collections data - in this case, a sample of textual data about navigational instruments held at National Museums Scotland (NMS). The term ‘object itinerary’ describes the journey an object takes through space and time, including its interactions between the people, organisations and other objects it encounters.

    Data was annotated using the Recogito Studio platform (https://recogitostudio.org/) using a data model based on selected terms from the CIDOC CRM (https://www.cidoc-crm.org/) and Linked Art (https://linked.art/) ontologies. Recogito Studio's Geo-Tagger plugin was used to align places mentioned in the text to their equivalents in Wikidata (https://www.wikidata.org/), and the exported geo-tags were further processed to enable visualisation in Peripleo (the main repository for Peripleo is at https://github.com/britishlibrary/peripleo; the visualisation of this data can be found at https://sarahmiddle.github.io/Peripleo_PelagiosOSC/).

    The following files are included:
    • Object Itineraries Data Model (ObjectItinerariesDataModel_20241029.owl): OWL 2 ontology, developed using Protege (https://protege.stanford.edu/), which represents key classes and properties (entities and relationships) in the description of object itineraries.
    • Annotation Protocol (NMSDataAnnotationProtocol_20241028.pdf): document describing how the original data was annotated in accordance with the data model, which resulted in the CSV and GeoJSON export files.
    • Annotations (NMSDataAnnotations_20241028.csv): CSV export from Recogito Studio, containing all annotations, including the annotated text, its position in the main document, and associated tags.
    • Geo-Tags (NMSDataGeoTags_20241028.geojson): GeoJSON export from Recogito Studio, containing all geo-tags, including the co-ordinates of each annotated place and the identifiers of their equivalents in Wikidata.
    • Enhanced Geo-Tags (NMSDataGeoTags_TransformedEnhanced_20241028.json): JSON-LD file containing a transformed and enhanced version of the GeoJSON export, used to visualise the data in Peripleo and provide additional contextual information, including links to the relevant NMS catalogue records.
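
A GeoJSON export of geo-tags can be read with any JSON parser. The sketch below pulls point co-ordinates out of a FeatureCollection, relying only on the generic GeoJSON structure (`features`, `geometry`, `properties`). The sample document and its `title` property are hypothetical; the real property names in NMSDataGeoTags_20241028.geojson are not given in the description.

```python
import json

# Hypothetical two-feature sample; NOT taken from the dataset.
SAMPLE_GEOJSON = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "geometry": {"type": "Point", "coordinates": [-3.19, 55.95]},
   "properties": {"title": "Edinburgh"}},
  {"type": "Feature", "geometry": null,
   "properties": {"title": "unlocated"}}
]}
"""


def point_coordinates(geojson_text):
    """Return (longitude, latitude, properties) for every Point feature."""
    data = json.loads(geojson_text)
    points = []
    for feature in data.get("features", []):
        geometry = feature.get("geometry")
        # GeoJSON allows a null geometry, so skip features without a location.
        if geometry and geometry.get("type") == "Point":
            lon, lat = geometry["coordinates"][:2]
            points.append((lon, lat, feature.get("properties") or {}))
    return points


for lon, lat, props in point_coordinates(SAMPLE_GEOJSON):
    print(lon, lat, props)
```

GeoJSON stores positions as longitude first, then latitude, which is worth keeping in mind when passing the values to mapping libraries that expect the opposite order.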

  15. DAVID

    • dknet.org
    • neuinfo.org
    • +1more
    Updated Dec 1, 2024
    Cite
    (2024). DAVID [Dataset]. http://identifiers.org/RRID:SCR_001881
    Dataset updated
    Dec 1, 2024
    Description

    Bioinformatics resource system including web server and web service for functional annotation and enrichment analyses of gene lists. Consists of comprehensive knowledgebase and set of functional analysis tools. Includes gene centered database integrating heterogeneous gene annotation resources to facilitate high throughput gene functional analysis. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.

  16. Gene annotation of Blastobotrys mokoenaii, Blastobotrys illinoisensis, and...

    • researchdata.se
    • figshare.scilifelab.se
    • +1more
    Updated Mar 21, 2025
    Cite
    Jonas Ravn; Amanda Sörensen Ristinmaa; Scott Mazurkewich; Guilherme Borges Dias; Johan Larsbrink; Cecilia Geijer (2025). Gene annotation of Blastobotrys mokoenaii, Blastobotrys illinoisensis, and Blastobotrys malaysiensis [Dataset]. http://doi.org/10.17044/SCILIFELAB.28606814
    Dataset updated
    Mar 21, 2025
    Dataset provided by
    Chalmers University of Technology
    Authors
    Jonas Ravn; Amanda Sörensen Ristinmaa; Scott Mazurkewich; Guilherme Borges Dias; Johan Larsbrink; Cecilia Geijer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the gene annotation data for three species of Blastobotrys yeasts: B. mokoenaii, B. illinoisensis, and B. malaysiensis.

    The genome assemblies for B. mokoenaii (NRRL Y-27120) and B. malaysiensis (NRRL Y-6417) were publicly available on the National Center for Biotechnology Information (NCBI) under accessions GCA_003705765.3 and GCA_030558815.1, respectively.

    The genome assembly for B. illinoisensis (NRRL YB-1343) was generated by SciLifeLab's National Genomics Infrastructure (NGI) using PacBio long-read data and deposited in the European Nucleotide Archive (ENA) under accession GCA_965113335.1.

    File description

    - bmokoenaii_annotation.gff: This file contains the gene models predicted for B. mokoenaii (GCA_003705765.3).
    - billinoisensis_annotation.gff: This file contains the gene models predicted for B. illinoisensis (GCA_965113335.1).
    - bmalaysiensis_annotation.gff: This file contains the gene models predicted for B. malaysiensis (GCA_030558815.1).

    Gene annotation methods

    Repeat Masking

    Prior to annotation, a repeat library was built for each species using RepeatModeler2 v2.0.2 and the genomes were soft-masked using RepeatMasker v4.1.5.

    $ RepeatModeler -database ${DB} -engine ncbi -pa 16
    $ RepeatMasker -dir . -gff -u -no_is -xsmall -e ncbi -lib ${LIBRARY} -pa 16 genome.fasta

    Structural Annotation

    Structural annotation was performed on the soft-masked genomes using Braker3 v3.0.3, incorporating external evidence in the form of all fungal proteins from OrthoDB v11 (available at https://bioinf.uni-greifswald.de/bioinf/partitioned_odb11).

    $ braker.pl --genome="$genome" \
        --prot_seq=${protein} --workingdir=${PWD} \
        --gff3 --threads=16 --verbosity=3 \
        --nocleanup --species=${i}

    Functional Annotation

    The predicted genes were functionally annotated using the National Bioinformatics Infrastructure Sweden (NBIS) functional_annotation nextflow pipeline v2.0.0 (https://github.com/NBISweden/pipelines-nextflow). Briefly, this pipeline performs similarity searches between the annotated proteins and the UniProtKB/Swiss-Prot database (downloaded 2023-12) using the Basic Local Alignment Search Tool (BLAST). It then uses InterProScan to query the proteins against InterPro v59-91 databases, and merges the results using AGAT v1.2.0.

    tRNAs and rRNAs

    Transfer RNA (tRNA) and ribosomal RNA (rRNA) genes were annotated using tRNAscan-SE v2.0.12 and barrnap v0.9, respectively. Other ncRNAs, such as SRP RNA, RNase P RNA, and spliceosomal ncRNAs, have not been predicted.

    $ tRNAscan-SE -E --gff ${output}_trnas.gff --thread 16 ${genome}.fasta
    $ barrnap --kingdom euk --threads 6 ${genome}.fasta > ${output}_rrna.gff

    Annotation integration

    Finally, the functionally annotated protein-coding genes, tRNAs, and rRNAs were combined into a single GFF file using AGAT v1.2.0.

    $ agat_sp_complement_annotations.pl --ref ${protein_coding} --add ${trna} --add ${rrna} --out full_annotation.gff
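
As a quick sanity check on a merged annotation file such as the one produced above, one might tally the feature types it contains. The sketch below is not part of the dataset's pipeline; it only assumes the standard GFF3 layout of nine tab-separated columns with the feature type in column 3. The sample records, contig names, and IDs are invented for illustration.

```python
from collections import Counter
import io


def count_feature_types(lines):
    """Tally column 3 (feature type) of GFF3 records, skipping comments and blanks."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # directive, comment, or empty line
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a well-formed GFF3 record
        counts[fields[2]] += 1
    return counts


# Hypothetical sample records, invented for illustration only.
SAMPLE_GFF = (
    "##gff-version 3\n"
    "contig_1\tAUGUSTUS\tgene\t100\t900\t.\t+\t.\tID=g1\n"
    "contig_1\tAUGUSTUS\tmRNA\t100\t900\t.\t+\t.\tID=g1.t1;Parent=g1\n"
    "contig_1\ttRNAscan-SE\ttRNA\t1500\t1572\t.\t-\t.\tID=trna1\n"
    "contig_2\tbarrnap\trRNA\t10\t1800\t.\t+\t.\tID=rrna1\n"
)

feature_counts = count_feature_types(io.StringIO(SAMPLE_GFF))
print(dict(feature_counts))
```

The same function accepts an open file handle, so a real file could be passed with `open("full_annotation.gff")` in place of the in-memory sample.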

  17. E

    Data from: Parallel sense-annotated corpus ELEXIS-WSD 1.1

    • live.european-language-grid.eu
    binary format
    Updated May 21, 2023
    (2023). Parallel sense-annotated corpus ELEXIS-WSD 1.1 [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/22947
    Available download formats: binary format
    Dataset updated
    May 21, 2023
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    ELEXIS-WSD is a parallel sense-annotated corpus in which content words (nouns, adjectives, verbs, and adverbs) have been assigned senses. Version 1.1 contains sentences for 10 languages: Bulgarian, Danish, English, Spanish, Estonian, Hungarian, Italian, Dutch, Portuguese, and Slovene.

    The corpus was compiled by automatically extracting a set of sentences from WikiMatrix (Schwenk et al., 2019), a large open-access collection of parallel sentences derived from Wikipedia, using an automatic approach based on multilingual sentence embeddings. The sentences were manually validated according to specific formal, lexical and semantic criteria (e.g. by removing incorrect punctuation, morphological errors, notes in square brackets and etymological information typically provided in Wikipedia pages). To obtain satisfactory semantic coverage, sentences with fewer than 5 words and fewer than 2 polysemous words were filtered out. Subsequently, in order to obtain datasets in the other nine target languages, for each selected sentence in English, the corresponding WikiMatrix translation into each of the other languages was retrieved. If no translation was available, the English sentence was translated manually. The resulting corpus comprises 2,024 sentences for each language.

    The sentences were tokenized, lemmatized, and tagged with POS tags using UDPipe v2.6 (https://lindat.mff.cuni.cz/services/udpipe/). Senses were annotated using LexTag (https://elexis.babelscape.com/): each content word (noun, verb, adjective, and adverb) was assigned a sense from among the available senses from the sense inventory selected for the language (see below) or BabelNet. Sense inventories were also updated with new senses during annotation.

    List of sense inventories

    - BG: Dictionary of Bulgarian
    - DA: DanNet – The Danish WordNet
    - EN: Open English WordNet
    - ES: Spanish Wiktionary
    - ET: The EKI Combined Dictionary of Estonian
    - HU: The Explanatory Dictionary of the Hungarian Language
    - IT: PSC + Italian WordNet
    - NL: Open Dutch WordNet
    - PT: Portuguese Academy Dictionary (DACL)
    - SL: Digital Dictionary Database of Slovene

    The corpus is available in the CoNLL-U tab-separated format. In order, the columns contain the token ID, its form, its lemma, its UPOS-tag, five empty columns (reserved for e.g. dependency parsing, which is absent from this version), and the final MISC column containing the following: the token's whitespace information (whether the token is followed by a whitespace or not), the ID of the sense assigned to the token, and the index of the multiword expression (if the token is part of an annotated multiword expression).
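
The column layout described above can be read with a few lines of Python. This is a minimal sketch, not the corpus's official tooling: it assumes the usual CoNLL-U convention that MISC holds `Key=Value` pairs separated by `|`. The sample token line, the `Sense` key, and the sense ID value are hypothetical placeholders; the actual key names are documented in 00README.txt.

```python
def parse_misc(misc):
    """Split a CoNLL-U MISC field ('_' or 'Key=Value|Key=Value') into a dict."""
    if misc == "_":
        return {}
    pairs = {}
    for item in misc.split("|"):
        key, sep, value = item.partition("=")
        pairs[key] = value if sep else ""
    return pairs


def parse_conllu_line(line):
    """Return the columns this corpus fills in: ID, FORM, LEMMA, UPOS, and MISC."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 10:
        raise ValueError(f"expected 10 tab-separated columns, got {len(fields)}")
    return {
        "id": fields[0],
        "form": fields[1],
        "lemma": fields[2],
        "upos": fields[3],
        "misc": parse_misc(fields[9]),  # columns 5-9 are empty in this version
    }


# Hypothetical token line; the "Sense" key and its value are invented.
SAMPLE_LINE = "2\tbank\tbank\tNOUN\t_\t_\t_\t_\t_\tSpaceAfter=No|Sense=example-sense-id"

token = parse_conllu_line(SAMPLE_LINE)
print(token)
```

Comment lines (starting with `#`) and the blank lines that separate sentences would need to be skipped before calling `parse_conllu_line` on a real file.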

    Each language has a separate sense inventory containing all the senses (and their definitions) used for annotation in the corpus. Not all the senses from the sense inventory are necessarily included in the corpus annotations: for instance, all occurrences of the English noun "bank" in the corpus might be annotated with the sense of "financial institution", but the sense inventory also contains the sense "edge of a river" as well as all other possible senses to disambiguate between.

    For more information, please refer to 00README.txt.

    Differences to version 1.0:

    - Several minor errors were fixed (e.g. a typo in one of the Slovene sense IDs).
    - The corpus was converted to the true CoNLL-U format (as opposed to the CoNLL-U-like format used in v1.0).
    - An error was fixed that resulted in missing UPOS tags in version 1.0.
    - The sentences in all corpora now follow the same order (from 1 to 2024).

  18. f

    Data from: A crustacean annotated transcriptome (CAT) database

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    application/gzip
    Updated Sep 7, 2020
    Wenyan NONG; Zacary Y. H. Chai; Xiaosen Jiang; Jing Qin; Mak Kai Yan; Billy Kwok Chong Chow; Jian-Wen Qiu; Jerome Ho Lam Hui; Ka-Hou Chu (2020). A crustacean annotated transcriptome (CAT) database [Dataset]. http://doi.org/10.6084/m9.figshare.12924230.v3
    Available download formats: application/gzip
    Dataset updated
    Sep 7, 2020
    Dataset provided by
    figshare
    Authors
    Wenyan NONG; Zacary Y. H. Chai; Xiaosen Jiang; Jing Qin; Mak Kai Yan; Billy Kwok Chong Chow; Jian-Wen Qiu; Jerome Ho Lam Hui; Ka-Hou Chu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background

    Decapods are an order of crustaceans which includes shrimps, crabs, lobsters and crayfish. They occur worldwide and are of great scientific interest as well as being of ecological and economic importance in fisheries and aquaculture. However, our knowledge of their biology mainly comes from the group which is most closely related to crustaceans – insects. Here we produce a de novo transcriptome database, crustacean annotated transcriptome (CAT) database, spanning multiple tissues and the life stages of seven crustaceans.

    Description

    A total of 71 transcriptome assemblies from six decapod species and a stomatopod species, including the coral shrimp Stenopus hispidus, the cherry shrimp Neocaridina davidi, the redclaw crayfish Cherax quadricarinatus, the spiny lobster Panulirus ornatus, the red king crab Paralithodes camtschaticus, the coconut crab Birgus latro, and the zebra mantis shrimp Lysiosquillina maculata, were generated. Differential gene expression analyses within species were generated as a reference and included in a graphical user interface database at http://cat.sls.cuhk.edu.hk/. Users can carry out gene name searches and also access gene sequences based on a sequence query using the BLAST search function.

    Conclusions

    The data generated and deposited in this database offers a valuable resource for the further study of these crustaceans, as well as being of use in aquaculture development.

  19. Z

    Data from: ODDS: Real-Time Object Detection using Depth Sensors on Embedded...

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Mithun, Niluthpol Chowdhury; Munir, Sirajum; Guo, Karen; Shelton, Charles (2020). ODDS: Real-Time Object Detection using Depth Sensors on Embedded GPUs [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1163769
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    University of California, Riverside, CA
    Bosch Research and Technology Center, PA
    University of Minnesota, MN
    Authors
    Mithun, Niluthpol Chowdhury; Munir, Sirajum; Guo, Karen; Shelton, Charles
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ODDS Smart Building Depth Dataset

    Introduction:

    The goal of this dataset is to facilitate research focusing on recognizing objects in smart buildings using the depth sensor mounted at the ceiling. This dataset contains annotations of depth images for eight frequently seen object classes. The classes are: person, backpack, laptop, gun, phone, umbrella, cup, and box.

    Data Collection:

    We collected data in two settings. In the first, a Kinect was mounted on a 9.3-foot ceiling near a 6-foot-wide door. In the second, a tripod with a horizontal extender held the Kinect at a similar height, looking downwards. We asked about 20 volunteers to enter and exit several times each in different directions (3 times walking straight, 3 times walking towards the left side, 3 times walking towards the right side), holding objects in many different ways and poses underneath the Kinect. Each subject used his or her own backpack, purse, laptop, etc. As a result, the data covers variety within each object class: for laptops, MacBooks, HP laptops, and Lenovo laptops of different years and models; for backpacks, backpacks, side bags, and women's purses. We asked the subjects to walk while holding each object in many ways. Laptops, for example, were carried fully open, partially closed, and fully closed, held in front of or beside the body or underneath the elbow. Backpacks were carried on the back or at the side, at different levels from foot to shoulder. We wanted to collect data with real guns, but bringing real guns to the office is prohibited, so we obtained a few Nerf guns, which the subjects carried pointing to the front, side, up, and down while walking.

    Annotated Data Description:

    The annotated dataset follows the structure of the Pascal VOC devkit, so that data preparation is simple and the data can be used quickly with object detection libraries that are friendly to Pascal VOC style annotations (e.g. Faster-RCNN, YOLO, SSD). The annotated data consists of a set of images; each image has an annotation file giving a bounding box and object class label for each object in one of the eight classes present in the image. Multiple objects from multiple classes may be present in the same image. The dataset has 3 main directories:

    1) DepthImages: Contains all the images of training set and validation set.

    2) Annotations: Contains one xml file per image file (e.g., 1.xml for image file 1.png). The xml file includes the bounding box annotations for all objects in the corresponding image.

    3) ImagesSets: Contains two text files, training_samples.txt and testing_samples.txt. The training_samples.txt file lists the names of the images used in training and testing_samples.txt lists the names of the images used for testing. (We randomly chose an 80%/20% split.)
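
Reading one of the per-image annotation files might look like the sketch below. It assumes the files use the standard Pascal VOC element names (`object`, `name`, `bndbox`, `xmin`, `ymin`, `xmax`, `ymax`), which the description implies but does not spell out; the sample XML and its coordinates are invented for illustration.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Return (class_label, (xmin, ymin, xmax, ymax)) for every object in a VOC-style file."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, coords))
    return boxes


# Hypothetical annotation for image 1.png; all values are invented.
SAMPLE_XML = """<annotation>
  <filename>1.png</filename>
  <object>
    <name>person</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>260</xmax><ymax>310</ymax></bndbox>
  </object>
  <object>
    <name>laptop</name>
    <bndbox><xmin>150</xmin><ymin>140</ymin><xmax>210</xmax><ymax>190</ymax></bndbox>
  </object>
</annotation>"""

boxes = read_voc_boxes(SAMPLE_XML)
print(boxes)
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring(xml_text)`; the rest of the function stays the same.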

    UnAnnotated Data Description:

    The un-annotated data consists of several sets of depth images. No ground-truth annotation is available for these images yet. These un-annotated sets contain several challenging scenarios and no data has been collected from this office during annotated dataset construction. Hence, they provide a way to test the generalization performance of an algorithm.

    Citation:

    If you use ODDS Smart Building dataset in your work, please cite the following reference in any publications:

    @inproceedings{mithun2018odds,
      title={ODDS: Real-Time Object Detection using Depth Sensors on Embedded GPUs},
      author={Niluthpol Chowdhury Mithun and Sirajum Munir and Karen Guo and Charles Shelton},
      booktitle={ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN)},
      year={2018},
    }

  20. u

    Data from: Plant Expression Database

    • agdatacommons.nal.usda.gov
    • datasetcatalog.nlm.nih.gov
    bin
    Updated Feb 9, 2024
    Sudhansu S. Dash; John Van Hemert; Lu Hong; Roger P. Wise; Julie A. Dickerson (2024). Plant Expression Database [Dataset]. https://agdatacommons.nal.usda.gov/articles/dataset/Plant_Expression_Database/24661179
    Available download formats: bin
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    PLEXdb
    Authors
    Sudhansu S. Dash; John Van Hemert; Lu Hong; Roger P. Wise; Julie A. Dickerson
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    [NOTE: PLEXdb is no longer available online. Oct 2019.] PLEXdb (Plant Expression Database) is a unified gene expression resource for plants and plant pathogens. PLEXdb is a genotype to phenotype, hypothesis building information warehouse, leveraging highly parallel expression data with seamless portals to related genetic, physical, and pathway data. PLEXdb (http://www.plexdb.org), in partnership with community databases, supports comparisons of gene expression across multiple plant and pathogen species, promoting individuals and/or consortia to upload genome-scale data sets to contrast them to previously archived data. These analyses facilitate the interpretation of structure, function and regulation of genes in economically important plants. A list of Gene Atlas experiments highlights data sets that give responses across different developmental stages, conditions and tissues. Tools at PLEXdb allow users to perform complex analyses quickly and easily. The Model Genome Interrogator (MGI) tool supports mapping gene lists onto corresponding genes from model plant organisms, including rice and Arabidopsis. MGI predicts homologies, displays gene structures and supporting information for annotated genes and full-length cDNAs. The gene list-processing wizard guides users through PLEXdb functions for creating, analyzing, annotating and managing gene lists. Users can upload their own lists or create them from the output of PLEXdb tools, and then apply diverse higher level analyses, such as ANOVA and clustering. PLEXdb also provides methods for users to track how gene expression changes across many different experiments using the Gene OscilloScope. This tool can identify interesting expression patterns, such as up-regulation under diverse conditions or checking any gene’s suitability as a steady-state control.

    Resources in this dataset:

    Resource Title: Website Pointer for Plant Expression Database, Iowa State University.
    File Name: Web Page, url: https://www.bcb.iastate.edu/plant-expression-database
    [NOTE: PLEXdb is no longer available online. Oct 2019.] Project description for the Plant Expression Database (PLEXdb) and integrated tools.
