3 datasets found
  1. datacomp-medium-12m

    • huggingface.co
    Updated Dec 30, 2016
    Cite
    Tim (2016). datacomp-medium-12m [Dataset]. https://huggingface.co/datasets/cornuHGF/datacomp-medium-12m
    Authors
    Tim
    Description

    cornuHGF/datacomp-medium-12m dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. datacomp_medium

    • huggingface.co
    Updated Jun 7, 2023
    Cite
    ML Foundations (2023). datacomp_medium [Dataset]. https://huggingface.co/datasets/mlfoundations/datacomp_medium
    Dataset authored and provided by
    ML Foundations
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DataComp Medium Pool

    This repository contains metadata files for the medium pool of DataComp. For details on how to use the metadata, please visit our website and our GitHub repository. We distribute the image URL-text samples and metadata under a standard Creative Commons CC-BY-4.0 license. The individual images are under their own copyrights.

    Terms and Conditions

    We have terms of service that are similar to those adopted by HuggingFace… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations/datacomp_medium.

  3. Open-Qwen2VL-Data

    • huggingface.co
    Cite
    Weizhi Wang, Open-Qwen2VL-Data [Dataset]. https://huggingface.co/datasets/weizhiwang/Open-Qwen2VL-Data
    Authors
    Weizhi Wang
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Introduction

    This repository contains the data for Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources.
    Project page: https://victorwz.github.io/Open-Qwen2VL
    Code: https://github.com/Victorwz/Open-Qwen2VL

    Dataset

    ccs_ebdataset: CC3M-CC12M-SBU filtered by CLIP; we directly download the webdataset based on the released curated subset of BLIP-1.
    datacomp_medium_dfn_webdataset: DataComp-Medium-128M filtered by DFN, we just… See the full description on the dataset page: https://huggingface.co/datasets/weizhiwang/Open-Qwen2VL-Data.

