1 dataset found
  1. LAION-5B Dataset

    • paperswithcode.com
    Updated Apr 3, 2022
    Christoph Schuhmann; Romain Beaumont; Richard Vencu; Cade Gordon; Ross Wightman; Mehdi Cherti; Theo Coombes; Aarush Katta; Clayton Mullis; Mitchell Wortsman; Patrick Schramowski; Srivatsa Kundurthy; Katherine Crowson; Ludwig Schmidt; Robert Kaczmarczyk; Jenia Jitsev (2022). LAION-5B Dataset [Dataset]. https://paperswithcode.com/dataset/laion-5b
    Authors
    Christoph Schuhmann; Romain Beaumont; Richard Vencu; Cade Gordon; Ross Wightman; Mehdi Cherti; Theo Coombes; Aarush Katta; Clayton Mullis; Mitchell Wortsman; Patrick Schramowski; Srivatsa Kundurthy; Katherine Crowson; Ludwig Schmidt; Robert Kaczmarczyk; Jenia Jitsev
    Description

    LAION-5B is a large-scale dataset for research purposes consisting of 5.85B CLIP-filtered image-text pairs. Of these, 2.3B have English text, 2.2B have text in 100+ other languages, and 1B have text that cannot be assigned to a specific language (e.g. names). Additionally, we provide several nearest-neighbor indices, an improved web interface for exploration and subset creation, and detection scores for watermarks and NSFW content.

