100+ datasets found
  1. Data from: Coleção de Nematoda do Museu Nacional - UFRJ

    • portal.obis.org
    • gbif.org
    • +1 more
    zip
    Updated Dec 11, 2023
    Cite
    Universidade Federal do Rio de Janeiro (2023). Coleção de Nematoda do Museu Nacional - UFRJ [Dataset]. https://portal.obis.org/dataset/13df6cd5-b38e-46b3-96c3-f86f8c7d378e
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 11, 2023
    Dataset provided by
    Federal University of Rio de Janeiro (https://ufrj.br/)
    Description

    Coleção de Nematoda do Museu Nacional - UFRJ

  2. Pfkit Dataset

    • universe.roboflow.com
    zip
    Updated Jul 3, 2025
    Cite
    ibm (2025). Pfkit Dataset [Dataset]. https://universe.roboflow.com/ibm-pdnwf/pfkit/model/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 3, 2025
    Dataset provided by
    IBM (http://ibm.com/)
    Authors
    ibm
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Ppe Fire Fire And Smoke Bounding Boxes
    Description

    Pfkit

    ## Overview
    
    Pfkit is a dataset for object detection tasks - it contains Ppe Fire Fire And Smoke annotations for 8,828 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  3. avatar

    • huggingface.co
    Updated Apr 22, 2024
    Cite
    IBM Research - University of Illinois Urbana Champaign Discovery Accelerator Institute (2024). avatar [Dataset]. https://huggingface.co/datasets/iidai/avatar
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 22, 2024
    Dataset provided by
    IBM (http://ibm.com/)
    Authors
    IBM Research - University of Illinois Urbana Champaign Discovery Accelerator Institute
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    iidai/avatar dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. otter_uniprot_bindingdb_chembl

    • huggingface.co
    Updated Oct 18, 2023
    + more versions
    Cite
    IBM Research (2023). otter_uniprot_bindingdb_chembl [Dataset]. https://huggingface.co/datasets/ibm-research/otter_uniprot_bindingdb_chembl
    Explore at:
    Dataset updated
    Oct 18, 2023
    Dataset provided by
    IBM Research
    IBM (http://ibm.com/)
    Authors
    IBM Research
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Otter UBC Dataset Card

    UBC is a dataset comprising entities (Proteins/Drugs) from Uniprot (U), BindingDB (B) and ChEMBL (C). It contains 6,207,654 triples.

    Dataset details

    Uniprot

    Uniprot comprises 573,227 proteins from SwissProt, which is the subset of manually curated entries within UniProt, including attributes with different modalities like the sequence (567,483 of them), full name, organism, protein family, description of its function, catalytics… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/otter_uniprot_bindingdb_chembl.

  5. Data from: Coleção de Brachiopoda do Museu Nacional - UFRJ

    • portal.obis.org
    • gbif.org
    zip
    Updated Dec 11, 2023
    Cite
    Universidade Federal do Rio de Janeiro (2023). Coleção de Brachiopoda do Museu Nacional - UFRJ [Dataset]. https://portal.obis.org/dataset/c67e6d09-ee1d-4da4-9b21-bfaf558d60f1
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 11, 2023
    Dataset provided by
    Federal University of Rio de Janeiro (https://ufrj.br/)
    Description

    Coleção de Brachiopoda do Museu Nacional - UFRJ

  6. DBPedia

    • processor1.francecentral.cloudapp.azure.com
    • ckan.govdata.de
    • +3 more
    Updated Dec 12, 2016
    + more versions
    Cite
    DBpedia.org (2016). DBPedia [Dataset]. http://processor1.francecentral.cloudapp.azure.com/pl/dataset/dbpedia
    Explore at:
    Available download formats: html (http://publications.europa.eu/resource/authority/file-type/html)
    Dataset updated
    Dec 12, 2016
    Dataset provided by
    DBpedia (http://dbpedia.org/)
    License

    http://dcat-ap.de/def/licenses/cc-by-sa

    Description

    DBpedia is a joint project of Leipzig University, Freie Universität Berlin, and OpenLink Software that extracts structured information from Wikipedia and makes it accessible to web applications as linked data. DBpedia also makes it possible to link this data with information from other web applications. The data sets are available under the GNU Free Documentation License and are linked to other free data collections.

  7. Modified bAbI dialog task

    • gitee.com
    • github.com
    Updated Dec 31, 2024
    Cite
    IBM (2024). Modified bAbI dialog task [Dataset]. https://gitee.com/mirrors_ibm/modified-bAbI-dialog-tasks?skip_mobile=true
    Explore at:
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    IBM (http://ibm.com/)
    Description

    The modified-bAbI dialog tasks dataset is an extension of original-bAbI-dialog-tasks, as described in the paper "Learning End-to-End Goal-Oriented Dialog with maximal User task success and minimal Human Agent use". We modify the original-bAbI dialog tasks by removing and replacing certain user behaviors from the training and validation data. The test set is left untouched. This simulates a scenario in which new user behaviors that were not seen during training arise at test (deployment) time, and hence allows us to test our proposed method. This also mimics real-world data collection via crowdsourcing, in the sense that certain user behavior is missing from the training data.

  8. Pfkit 2 Dataset

    • universe.roboflow.com
    zip
    Updated Jul 3, 2025
    Cite
    ibm (2025). Pfkit 2 Dataset [Dataset]. https://universe.roboflow.com/ibm-pdnwf/pfkit-2
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 3, 2025
    Dataset provided by
    IBM (http://ibm.com/)
    Authors
    ibm
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Ppe Kits Ppe Fire Fire And Smoke Bounding Boxes
    Description

    Pfkit 2

    ## Overview
    
    Pfkit 2 is a dataset for object detection tasks - it contains Ppe Kits Ppe Fire Fire And Smoke annotations for 9,893 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  9. Extended dialog bAbI tasks and CBT-OOV datasets

    • github.com
    Updated Jul 22, 2024
    Cite
    IBM (2024). Extended dialog bAbI tasks and CBT-OOV datasets [Dataset]. https://github.com/IBM/ne-table-datasets
    Explore at:
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    IBM (http://ibm.com/)
    Description

    Many Natural Language Processing (NLP) tasks depend on using Named Entities (NEs) that are contained in texts and in external knowledge sources. While this is easy for humans, present neural methods that rely on learned word embeddings may not perform well on these NLP tasks, especially in the presence of Out-Of-Vocabulary (OOV) or rare NEs. The datasets contain extended versions of dialog bAbI tasks 1, 2, and 4 and OOV versions of the CBT test set.

  10. Naturalistic Variation in Goal-Oriented Dialog datasets

    • github.com
    Updated Jul 22, 2024
    Cite
    IBM (2024). Naturalistic Variation in Goal-Oriented Dialog datasets [Dataset]. https://github.com/IBM/naturalistic-variation-goal-oriented-dialog-datasets
    Explore at:
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    IBM (http://ibm.com/)
    Description

    The datasets are new and more effective testbeds for bAbI dialog task 5 and Stanford Multi-Domain datasets, which incorporate naturalistic variation by the user. Existing benchmarks used to evaluate the performance of end-to-end neural dialog systems lack a key component: natural variation present in human conversations. Most datasets are constructed through crowdsourcing, where the crowd workers follow a fixed template of instructions while enacting the role of a user/agent. This results in straightforward, somewhat routine, and mostly trouble-free conversations, as crowd workers do not think to represent the full range of actions that occur naturally with real users. We observe a significant drop in performance (more than 60% in Ent. F1 on SMD and 85% in per-dialog accuracy on bAbI task) of recent state-of-the-art end-to-end neural methods such as BossNet and GLMP on both updated datasets, which incorporate naturalistic variation by the user.

  11. identity_group_abuse_robustness

    • huggingface.co
    Updated Aug 19, 2024
    Cite
    IBM Research (2024). identity_group_abuse_robustness [Dataset]. https://huggingface.co/datasets/ibm-research/identity_group_abuse_robustness
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 19, 2024
    Dataset provided by
    IBM Research
    IBM (http://ibm.com/)
    Authors
    IBM Research
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for "identity_group_abuse-robustness"

      Dataset Summary
    

    identity_group_abuse-robustness is an expanded version of the identity group abuse dataset (https://aclanthology.org/2022.naacl-main.410/) but with perturbations of the original input questions and passages. It is intended for use as a benchmark for evaluating model robustness on question-answering to these perturbations.

    Data Instances

    identity_group_abuse-robustness

    Size of… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/identity_group_abuse_robustness.

  12. Data from: Coleção de Sipuncula do Museu Nacional - UFRJ

    • obis.org
    • gbif.org
    zip
    Updated Dec 11, 2023
    Cite
    Universidade Federal do Rio de Janeiro (2023). Coleção de Sipuncula do Museu Nacional - UFRJ [Dataset]. https://obis.org/dataset/df41ae5f-7fc7-40c4-967b-e8c0668741c7
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 11, 2023
    Dataset provided by
    Federal University of Rio de Janeiro (https://ufrj.br/)
    Description

    Coleção de Sipuncula do Museu Nacional - UFRJ

  13. Twitter Conversations Dataset for Conversational Document Prediction (CDP) task

    • github.com
    Updated Jul 22, 2024
    + more versions
    Cite
    IBM (2024). Twitter Conversations Dataset for Conversational Document Prediction (CDP) task [Dataset]. https://github.com/IBM/twitter-customer-care-document-prediction
    Explore at:
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    IBM (http://ibm.com/)
    Description

    The dataset contains the Twitter Conversations for the task of Conversational Document Prediction (CDP). The dataset includes conversations that occurred between users and customer care agents in 25 organizations on the Twitter platform. Each conversation ends with a customer care agent providing a URL to a document to resolve the issue the user is facing. The task is to predict the document given a dialog context.

  14. Data from: Coleção de Ascidiacea do Museu Nacional - UFRJ

    • obis.org
    • gbif.org
    zip
    Updated Dec 11, 2023
    Cite
    Universidade Federal do Rio de Janeiro (2023). Coleção de Ascidiacea do Museu Nacional - UFRJ [Dataset]. https://obis.org/dataset/f1ac4ec4-6133-414a-bc4c-0f215def0cc5
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 11, 2023
    Dataset provided by
    Universidade Federal do Rio de Janeiro
    Description

    Coleção de Ascidiacea do Museu Nacional - UFRJ

  15. Pascal Xml To Yolo Txt Dataset

    • universe.roboflow.com
    zip
    Updated Nov 7, 2022
    Cite
    IBM PubLayNet (2022). Pascal Xml To Yolo Txt Dataset [Dataset]. https://universe.roboflow.com/ibm-publaynet/pascal-xml-to-yolo-txt
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 7, 2022
    Dataset provided by
    IBM (http://ibm.com/)
    Authors
    IBM PubLayNet
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Documents Bounding Boxes
    Description

    Pascal XML To YOLO TXT

    ## Overview
    
    Pascal XML To YOLO TXT is a dataset for object detection tasks - it contains Documents annotations for 8,143 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  16. Permuted bAbI dialog task

    • github.com
    • paperswithcode.com
    • +1 more
    Updated Jul 22, 2024
    Cite
    IBM (2024). Permuted bAbI dialog task [Dataset]. https://github.com/IBM/permuted-bAbI-dialog-tasks
    Explore at:
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    IBM (http://ibm.com/)
    Description

    The permuted-bAbI dialog tasks dataset is an extension of original-bAbI-dialog-tasks, as described in the paper "Learning End-to-End Goal-Oriented Dialog with Multiple Answers". We modify the original-bAbI dialog tasks by introducing multiple valid next utterances, which allows evaluation of end-to-end goal-oriented dialog systems in a more realistic setting.

  17. Road Accident Detection Dataset

    • universe.roboflow.com
    zip
    Updated May 30, 2025
    Cite
    IBM (2025). Road Accident Detection Dataset [Dataset]. https://universe.roboflow.com/ibm-oj5gs/road-accident-detection-eioit/model/1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 30, 2025
    Dataset authored and provided by
    IBM (http://ibm.com/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Accidental Non_accidnetal Bounding Boxes
    Description

    Road Accident Detection

    ## Overview
    
    Road Accident Detection is a dataset for object detection tasks - it contains Accidental Non_accidnetal annotations for 3,208 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  18. BookTest

    • opendatalab.com
    • paperswithcode.com
    zip
    Updated Mar 22, 2023
    Cite
    IBM Watson (2023). BookTest [Dataset]. https://opendatalab.com/OpenDataLab/BookTest
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 22, 2023
    Dataset provided by
    IBM (http://ibm.com/)
    Description

    BookTest is a new dataset similar to the popular Children’s Book Test (CBT), but more than 60 times larger.

  19. SynthTabNet

    • opendatalab.com
    zip
    Updated Mar 22, 2023
    Cite
    IBM Research (2023). SynthTabNet [Dataset]. https://opendatalab.com/OpenDataLab/SynthTabNet
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 22, 2023
    Dataset provided by
    IBM (http://ibm.com/)
    License

    https://cdla.dev/permissive-1-0/

    Description

    SynthTabNet is a dataset of 600k png images from synthetically generated table layouts with annotations in jsonl files.

  20. otter_dude

    • huggingface.co
    Updated Aug 16, 2023
    Cite
    IBM Research (2023). otter_dude [Dataset]. https://huggingface.co/datasets/ibm-research/otter_dude
    Explore at:
    Dataset updated
    Aug 16, 2023
    Dataset provided by
    IBM Research
    IBM (http://ibm.com/)
    Authors
    IBM Research
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Otter DUDe Dataset Card

    Otter DUDe includes 1,452,568 instances of drug-target interactions.

    Dataset details

    DUDe

    DUDe comprises a collection of 22,886 active compounds and their corresponding affinities towards 102 targets. For our study, we utilized a preprocessed version of the DUDe, which includes 1,452,568 instances of drug-target interactions. To prevent any data leakage, we eliminated the negative interactions and the overlapping triples with the TDC DTI… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/otter_dude.
