46 datasets found
  1. datacomp_1b

    • huggingface.co
    Updated Oct 15, 2014
    + more versions
    Cite
    ML Foundations (2014). datacomp_1b [Dataset]. https://huggingface.co/datasets/mlfoundations/datacomp_1b
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 15, 2014
    Dataset authored and provided by
    ML Foundations
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DataComp-1B

    This repository contains metadata files for DataComp-1B. For details on how to use the metadata, please visit our website and our GitHub repository. We distribute the image URL-text samples and metadata under a standard Creative Commons CC-BY-4.0 license. The individual images are under their own copyrights.

      Terms and Conditions
    

    We have terms of service that are similar to those adopted by HuggingFace (https://huggingface.co/terms-of-service), which covers… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations/datacomp_1b.

  2. Recap-DataComp-1B

    • huggingface.co
    Updated Jun 12, 2024
    Cite
    UCSC-VLAA (2024). Recap-DataComp-1B [Dataset]. https://huggingface.co/datasets/UCSC-VLAA/Recap-DataComp-1B
    Explore at:
    Dataset updated
    Jun 12, 2024
    Dataset authored and provided by
    UCSC-VLAA
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for Recap-DataComp-1B

    Recap-DataComp-1B is a large-scale image-text dataset that has been recaptioned using an advanced LLaVA-1.5-LLaMA3-8B model to enhance the alignment and detail of textual descriptions.

      Dataset Details

      Dataset Description
    

    Our paper aims to bridge this community effort, leveraging the powerful and open-sourced LLaMA-3, a GPT-4 level LLM. Our recaptioning pipeline is simple: first, we fine-tune a LLaMA-3-8B powered LLaVA-1.5… See the full description on the dataset page: https://huggingface.co/datasets/UCSC-VLAA/Recap-DataComp-1B.

  3. datacomp-hq

    • huggingface.co
    Updated Jan 15, 2017
    Cite
    LAION eV (2017). datacomp-hq [Dataset]. https://huggingface.co/datasets/laion/datacomp-hq
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 15, 2017
    Dataset provided by
    LAION (https://laion.ai/)
    Authors
    LAION eV
    Description

    laion/datacomp-hq dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. Datacomp.io Dataset

    • universe.roboflow.com
    zip
    Updated Dec 2, 2021
    Cite
    DQ1109 (2021). Datacomp.io Dataset [Dataset]. https://universe.roboflow.com/dq1109/datacomp.io
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 2, 2021
    Dataset authored and provided by
    DQ1109
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Mask Bounding Boxes
    Description

    Datacomp.io

    ## Overview
    
    Datacomp.io is a dataset for object detection tasks - it contains Mask annotations for 792 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  5. datacomp-small-clip

    • huggingface.co
    Updated Mar 6, 2024
    Cite
    Fondant (2024). datacomp-small-clip [Dataset]. https://huggingface.co/datasets/fondant-ai/datacomp-small-clip
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 6, 2024
    Dataset authored and provided by
    Fondant
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Production-ready data processing made easy and shareable

      Dataset Card for fondant-ai/datacomp-small-clip
    

    This is a dataset containing image URLs and their CLIP embeddings, based on the datacomp_small dataset and processed with Fondant.

      Dataset Details

      Dataset Description
    

    Large (image) datasets are often unwieldy to use due to their… See the full description on the dataset page: https://huggingface.co/datasets/fondant-ai/datacomp-small-clip.

  6. datacomp-small-filtered

    • huggingface.co
    Updated Jul 24, 2023
    + more versions
    Cite
    Niels Rogge (2023). datacomp-small-filtered [Dataset]. https://huggingface.co/datasets/nielsr/datacomp-small-filtered
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 24, 2023
    Authors
    Niels Rogge
    Description

    Dataset Card for "datacomp-small-filtered"

    This is the DataComp-small dataset with CLIP-large-patch14 image embeddings added, as well as:

    - captions filtered for English using a FastText model
    - captions filtered to have a complexity of at least 1

  7. DataCompDR-12M-bf16

    • huggingface.co
    Updated Aug 26, 2025
    + more versions
    Cite
    Apple (2025). DataCompDR-12M-bf16 [Dataset]. https://huggingface.co/datasets/apple/DataCompDR-12M-bf16
    Explore at:
    Dataset updated
    Aug 26, 2025
    Dataset authored and provided by
    Apple (http://apple.com/)
    License

    https://choosealicense.com/licenses/apple-amlr/

    Description

    Dataset Card for DataCompDR-12M-BFloat16

    This dataset contains synthetic captions, embeddings, and metadata for DataCompDR-12M. The metadata has been generated using pretrained image-text models on a 12M subset of DataComp-1B. For details on how to use the metadata, please visit our GitHub repository. The dataset with the original captions is now available at mlfoundations/DataComp-12M. The UIDs per shard match between mlfoundations/DataComp-12M and apple/DataCompDR-12M-bf16.… See the full description on the dataset page: https://huggingface.co/datasets/apple/DataCompDR-12M-bf16.

  8. HH NoInternetAccess DataComp

    • egisdata-dallasgis.hub.arcgis.com
    Updated Nov 2, 2020
    Cite
    City of Dallas GIS Services (2020). HH NoInternetAccess DataComp [Dataset]. https://egisdata-dallasgis.hub.arcgis.com/datasets/hh-nointernetaccess-datacomp
    Explore at:
    Dataset updated
    Nov 2, 2020
    Dataset authored and provided by
    City of Dallas GIS Services
    Description

    Dashboard to compare the 2020 HH_Low Internet Access data with 2014-18 values. Base Map: https://dallasgis.maps.arcgis.com/home/item.html?id=72fcea4726a14a2c82e48189807984e0

  9. datacomp-small-10-rows-with-image-feature

    • huggingface.co
    Updated Aug 16, 2023
    + more versions
    Cite
    Niels Rogge (2023). datacomp-small-10-rows-with-image-feature [Dataset]. https://huggingface.co/datasets/nielsr/datacomp-small-10-rows-with-image-feature
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 16, 2023
    Authors
    Niels Rogge
    Description

    Dataset Card for "datacomp-small-10-rows-with-image-feature"

    More Information needed

  10. Recap-DataComp-1B_split_4

    • huggingface.co
    Updated Mar 13, 2025
    + more versions
    Cite
    Khalil Hennara (2025). Recap-DataComp-1B_split_4 [Dataset]. https://huggingface.co/datasets/Hennara/Recap-DataComp-1B_split_4
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 13, 2025
    Authors
    Khalil Hennara
    Description

    Hennara/Recap-DataComp-1B_split_4 dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. Fpt Mask Detection Base Data Set Ver 3 Dataset

    • universe.roboflow.com
    zip
    Updated Nov 8, 2021
    Cite
    New Dataset FPT DataComp (2021). Fpt Mask Detection Base Data Set Ver 3 Dataset [Dataset]. https://universe.roboflow.com/new-dataset-fpt-datacomp/fpt-mask-detection-base-data-set-ver-3
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 8, 2021
    Dataset authored and provided by
    New Dataset FPT DataComp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    NoMask Mask IncorrectMask Bounding Boxes
    Description

    FPT Mask Detection Base Data Set Ver 3

    ## Overview
    
    FPT Mask Detection Base Data Set Ver 3 is a dataset for object detection tasks - it contains NoMask Mask IncorrectMask annotations for 976 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  12. MFC Data compe 20181014

    • kaggle.com
    zip
    Updated Oct 14, 2018
    Cite
    Keroro (2018). MFC Data compe 20181014 [Dataset]. https://www.kaggle.com/datasets/keroru/mfc-data-compe-20181014
    Explore at:
    Available download formats: zip (396601 bytes)
    Dataset updated
    Oct 14, 2018
    Authors
    Keroro
    Description

    Dataset

    This dataset was created by Keroro

    Contents

  13. Fpt Mask Detection Base Data Set Ver 2 Dataset

    • universe.roboflow.com
    zip
    Updated Nov 7, 2021
    Cite
    New Dataset FPT DataComp (2021). Fpt Mask Detection Base Data Set Ver 2 Dataset [Dataset]. https://universe.roboflow.com/new-dataset-fpt-datacomp/fpt-mask-detection-base-data-set-ver-2/dataset/13
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 7, 2021
    Dataset authored and provided by
    New Dataset FPT DataComp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    NoMask Mask IncorrectMask Bounding Boxes
    Description

    FPT Mask Detection Base Data Set Ver 2

    ## Overview
    
    FPT Mask Detection Base Data Set Ver 2 is a dataset for object detection tasks - it contains NoMask Mask IncorrectMask annotations for 976 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  14. datacomp_large

    • huggingface.co
    Updated Jun 21, 2023
    + more versions
    Cite
    ML Foundations (2023). datacomp_large [Dataset]. https://huggingface.co/datasets/mlfoundations/datacomp_large
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 21, 2023
    Dataset authored and provided by
    ML Foundations
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DataComp Large Pool

    This repository contains metadata files for the large pool of DataComp. For details on how to use the metadata, please visit our website and our GitHub repository. We distribute the image URL-text samples and metadata under a standard Creative Commons CC-BY-4.0 license. The individual images are under their own copyrights.

      Terms and Conditions
    

    We have terms of service that are similar to those adopted by HuggingFace… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations/datacomp_large.

  15. Fpt Mask Detection Augmentation + Incorrect Mask Dataset

    • universe.roboflow.com
    zip
    Updated Nov 5, 2021
    + more versions
    Cite
    New Dataset FPT DataComp (2021). Fpt Mask Detection Augmentation + Incorrect Mask Dataset [Dataset]. https://universe.roboflow.com/new-dataset-fpt-datacomp/fpt-mask-detection-augmentation---incorrect-mask
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 5, 2021
    Dataset authored and provided by
    New Dataset FPT DataComp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    NoMask Mask IncorrectMask Bounding Boxes
    Description

    FPT Mask Detection Augmentation + Incorrect Mask

    ## Overview
    
    FPT Mask Detection Augmentation + Incorrect Mask is a dataset for object detection tasks - it contains NoMask Mask IncorrectMask annotations for 2,439 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  16. Ecuador Census Data Comp Store Sales Time Series

    • kaggle.com
    zip
    Updated Oct 9, 2023
    Cite
    wtstephens (2023). Ecuador Census Data Comp Store Sales Time Series [Dataset]. https://www.kaggle.com/datasets/wtstephens/ecuador-census-data-comp-store-sales-time-series/suggestions
    Explore at:
    Available download formats: zip (773 bytes)
    Dataset updated
    Oct 9, 2023
    Authors
    wtstephens
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Ecuador
    Description

    Please be aware this is not thorough. It is a select amount of information specifically for the beginner competition Store Sales Time Series Forecasting. This dataset includes Ecuadorian 2010 census data for:
    - The breakdown of ethnic groups in percentages
    - Mean HHI
    - Population

  17. datacomp_xlarge

    • huggingface.co
    Cite
    ML Foundations, datacomp_xlarge [Dataset]. https://huggingface.co/datasets/mlfoundations/datacomp_xlarge
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset authored and provided by
    ML Foundations
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DataComp XLarge Pool

    This repository contains metadata files for the xlarge pool of DataComp. For details on how to use the metadata, please visit our website and our GitHub repository. We distribute the image URL-text samples and metadata under a standard Creative Commons CC-BY-4.0 license. The individual images are under their own copyrights.

      Terms and Conditions
    

    We have terms of service that are similar to those adopted by HuggingFace… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations/datacomp_xlarge.

  18. Fpt Mask Detection Base Dataset

    • universe.roboflow.com
    zip
    Updated Oct 31, 2021
    + more versions
    Cite
    FPT DataComp (2021). Fpt Mask Detection Base Dataset [Dataset]. https://universe.roboflow.com/fpt-datacomp/fpt-mask-detection-base-dataset
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 31, 2021
    Dataset authored and provided by
    FPT DataComp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    NoMask Mask IncorrectMask Bounding Boxes
    Description

    FPT Mask Detection Base Dataset

    ## Overview
    
    FPT Mask Detection Base Dataset is a dataset for object detection tasks - it contains NoMask Mask IncorrectMask annotations for 976 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  19. DataComp-12M-part-2-translated

    • huggingface.co
    + more versions
    Cite
    Alex Alex, DataComp-12M-part-2-translated [Dataset]. https://huggingface.co/datasets/Alexator26/DataComp-12M-part-2-translated
    Explore at:
    Authors
    Alex Alex
    Description

    Alexator26/DataComp-12M-part-2-translated dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. datacomp

    • huggingface.co
    Updated Aug 31, 2025
    + more versions
    Cite
    ADHITHYAN (2025). datacomp [Dataset]. https://huggingface.co/datasets/ADHIZ/datacomp
    Explore at:
    Dataset updated
    Aug 31, 2025
    Authors
    ADHITHYAN
    Description

    ADHIZ/datacomp dataset hosted on Hugging Face and contributed by the HF Datasets community
