17 datasets found

laion-aesthetics-12m-umap
huggingface.co
Updated Apr 7, 2023
Cite
David McClure (2023). laion-aesthetics-12m-umap [Dataset]. https://huggingface.co/datasets/dclure/laion-aesthetics-12m-umap
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 7, 2023
Authors
David McClure
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
LAION-Aesthetics :: CLIP → UMAP

This dataset is a CLIP (text) → UMAP embedding of the LAION-Aesthetics dataset - specifically the improved_aesthetics_6plus version, which filters the full dataset to images with scores of > 6 under the "aesthetic" filtering model. Thanks LAION for this amazing corpus!

The dataset here includes coordinates for three separate UMAP fits using different values for the n_neighbors parameter (10, 30, and 60), which are broken out as separate columns with… See the full description on the dataset page: https://huggingface.co/datasets/dclure/laion-aesthetics-12m-umap.
LAION-Aesthetics V2 6.5+ Dataset
paperswithcode.com
Cite
LAION-Aesthetics V2 6.5+ Dataset [Dataset]. https://paperswithcode.com/dataset/laion-aesthetics-v2-6-5
Description
A subset of the LAION 5B samples with English captions, obtained using LAION-Aesthetics_Predictor V2: 625K image-text pairs with predicted aesthetics scores of 6.5 or higher, available at https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus
laion2B-en-aesthetic
huggingface.co
Updated May 27, 2025
+ more versions
Cite
LAION eV (2025). laion2B-en-aesthetic [Dataset]. http://doi.org/10.57967/hf/5792
Unique identifier
https://doi.org/10.57967/hf/5792
Dataset updated
May 27, 2025
Dataset provided by
LAION: https://laion.ai/
Authors
LAION eV
Description
laion/laion2B-en-aesthetic dataset hosted on Hugging Face and contributed by the HF Datasets community
laion-coco-aesthetic
huggingface.co
Updated Feb 15, 2019
Cite
Guangyi Liu (2019). laion-coco-aesthetic [Dataset]. https://huggingface.co/datasets/guangyil/laion-coco-aesthetic
Dataset updated
Feb 15, 2019
Authors
Guangyi Liu
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
LAION COCO with aesthetic score and watermark score

This dataset contains a 10% sample of the LAION-COCO dataset, filtered by text rules (removing URLs, special tokens, etc.) and image rules (image size > 384x384, aesthetic score > 4.75, and watermark probability < 0.5). The dataset contains 8,563,753 instances in total, each with its corresponding aesthetic score and watermark score. Note: the watermark score in the table means the probability of the existence of the… See the full description on the dataset page: https://huggingface.co/datasets/guangyil/laion-coco-aesthetic.
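The image rules quoted above can be read as a simple per-sample predicate. The sketch below is only an illustration of that reading, not the dataset author's code: the `Sample` class and its field names are hypothetical, and "image size > 384x384" is assumed to mean that both width and height exceed 384 pixels.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical record layout; the real dataset's column names may differ.
    width: int
    height: int
    aesthetic_score: float
    watermark_prob: float


def passes_image_rules(
    s: Sample,
    min_side: int = 384,
    min_aesthetic: float = 4.75,
    max_watermark: float = 0.5,
) -> bool:
    """Return True if the sample satisfies all three image rules.

    Assumption: the size rule requires BOTH sides to exceed `min_side`.
    """
    return (
        s.width > min_side
        and s.height > min_side
        and s.aesthetic_score > min_aesthetic
        and s.watermark_prob < max_watermark
    )


samples = [
    Sample(512, 512, 5.2, 0.1),   # passes every rule
    Sample(512, 300, 6.0, 0.1),   # too short on one side
    Sample(800, 600, 4.5, 0.1),   # aesthetic score too low
    Sample(800, 600, 5.5, 0.7),   # likely watermarked
]
kept = [s for s in samples if passes_image_rules(s)]
```

All thresholds are strict inequalities, matching the ">" and "<" signs in the description.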
LAION-Aesthetics 9
kaggle.com
Updated Aug 8, 2023
Cite
CookieMonsterYum (2023). LAION-Aesthetics 9 [Dataset]. https://www.kaggle.com/datasets/cookiemonsteryum/laion-aesthetics-9
Dataset updated
Aug 8, 2023
Dataset provided by
Kaggle
Authors
CookieMonsterYum
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by CookieMonsterYum

Released under CC0: Public Domain

Contents
Xiang Gao, Zhengbo Xu, Junhan Zhao, Jiaying Liu (2024). Dataset:...
service.tib.eu
Updated Dec 2, 2024
Cite
Xiang Gao, Zhengbo Xu, Junhan Zhao, Jiaying Liu (2024). LAION-Aesthetics 6.5+ [Dataset]. https://doi.org/10.57702/zvbnqhl9. https://service.tib.eu/ldmservice/dataset/laion-aesthetics-6-5-
Dataset updated
Dec 2, 2024
Description
The LAION-Aesthetics 6.5+ dataset contains 625K image-text pairs.
LAION-Aesthetic - Dataset - LDM
service.tib.eu
Updated Dec 2, 2024
Cite
(2024). LAION-Aesthetic - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/laion-aesthetic
Dataset updated
Dec 2, 2024
Description
The dataset used in the paper is LAION-Aesthetic, a large-scale image dataset.
Ko-LAION-Aesthetics-10M
huggingface.co
Cite
ETRI VILAB(Visual Intelligence Lab), Ko-LAION-Aesthetics-10M [Dataset]. https://huggingface.co/datasets/etri-vilab/Ko-LAION-Aesthetics-10M
Dataset provided by
Electronics and Telecommunications Research Institute (ETRI): https://www.etri.re.kr/
Authors
ETRI VILAB(Visual Intelligence Lab)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
LAION-Aesthetics 10M Dataset Card

Dataset details

Dataset type: LAION-Aesthetics is a subset of LAION-5B that has been estimated by a model trained on top of CLIP embeddings to be aesthetic. The intended usage of this dataset is image generation.

Paper or resources for more information: https://laion.ai/blog/laion-aesthetics/

Acknowledgements

This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the… See the full description on the dataset page: https://huggingface.co/datasets/etri-vilab/Ko-LAION-Aesthetics-10M.
Audio-aesthetics-score
huggingface.co
Updated May 27, 2025
Cite
LAION eV (2025). Audio-aesthetics-score [Dataset]. https://huggingface.co/datasets/laion/Audio-aesthetics-score
Dataset updated
May 27, 2025
Dataset provided by
LAION: https://laion.ai/
Authors
LAION eV
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
laion/Audio-aesthetics-score dataset hosted on Hugging Face and contributed by the HF Datasets community
spright_coco
huggingface.co
Updated Apr 2, 2024
+ more versions
Cite
SPRIGHT (2024). spright_coco [Dataset]. https://huggingface.co/datasets/SPRIGHT-T2I/spright_coco
Dataset updated
Apr 2, 2024
Dataset authored and provided by
SPRIGHT
License
https://choosealicense.com/licenses/other/
Description
Dataset Description

SPRIGHT (SPatially RIGHT) is the first spatially focused, large scale vision-language dataset. It was built by re-captioning ∼6 million images from 4 widely-used datasets:

CC12M
Segment Anything
COCO Validation
LAION Aesthetics

This repository contains the re-captioned data from COCO-Validation Set, while the data from CC12 and Segment Anything is present here. We do not release images from LAION, as the parent images are currently private.

Dataset… See the full description on the dataset page: https://huggingface.co/datasets/SPRIGHT-T2I/spright_coco.
18_obj_444
huggingface.co
Updated Apr 12, 2024
Cite
SPRIGHT (2024). 18_obj_444 [Dataset]. https://huggingface.co/datasets/SPRIGHT-T2I/18_obj_444
Dataset updated
Apr 12, 2024
Dataset authored and provided by
SPRIGHT
License
https://choosealicense.com/licenses/other/
Description
Dataset Description

This dataset contains the 444 images that we used for training our model - https://huggingface.co/SPRIGHT-T2I/spright-t2i-sd2. This contains the samples of this subset related to the Segment Anything images. We will release the LAION images when the parent images are made public again. Our training and validation sets are subsets of the SPRIGHT dataset and consist of 444 and 50 images respectively, randomly sampled in a 50:50 split between LAION-Aesthetics and… See the full description on the dataset page: https://huggingface.co/datasets/SPRIGHT-T2I/18_obj_444.
t2i-diversity-captions
huggingface.co
Updated Jun 23, 2025
Cite
Artificial Intelligence & Machine Learning Lab at TU Darmstadt (2025). t2i-diversity-captions [Dataset]. https://huggingface.co/datasets/AIML-TUDA/t2i-diversity-captions
Dataset updated
Jun 23, 2025
Dataset authored and provided by
Artificial Intelligence & Machine Learning Lab at TU Darmstadt
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
📄 Synthetic Captioned LAION Subset

🧾 Dataset Summary

This dataset consists of over 39 Million synthetic image captions generated for 1 Million curated images from LAION Aesthetics. Images have an aesthetics score >6, at minimum resolution of 512p, and have been screened for NSFW, CSAM and watermarks. We also removed exact duplicates.

📦 Data Structure

strID - Unique string identifier for the sample
intID - Unique integer identifier for the sample
imageURL -… See the full description on the dataset page: https://huggingface.co/datasets/AIML-TUDA/t2i-diversity-captions.
MMFR-Dataset
huggingface.co
Updated Jun 21, 2025
Cite
AnnaGao (2025). MMFR-Dataset [Dataset]. https://huggingface.co/datasets/AnnaGao/MMFR-Dataset
Dataset updated
Jun 21, 2025
Authors
AnnaGao
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
📦 Data Sources

The training set of MMFR-Dataset contains:

Fake images sourced from DiffusionDB, released under the CC0 1.0 Public Domain Dedication.
Real images drawn from LAION-Aesthetics, a subset of LAION-5B, licensed under the CC BY 4.0 License.

Licenses of evaluation sets are:

BigGAN: Provided by the GenImage dataset, licensed under CC BY-NC-SA 4.0.
GauGAN: Obtained from CNNDetection, released under the CC BY-NC-SA 4.0.
StyleGAN-XL: Collected from AntifakePrompts, under the… See the full description on the dataset page: https://huggingface.co/datasets/AnnaGao/MMFR-Dataset.
nobodies
huggingface.co
Updated Mar 12, 2023
Cite
Plat (2023). nobodies [Dataset]. https://huggingface.co/datasets/p1atdev/nobodies
Dataset updated
Mar 12, 2023
Authors
Plat
License
https://choosealicense.com/licenses/cc0-1.0/
Description
Nobodies

AI-generated human image dataset.

Contents

Face

vol1: 32 photos of women's faces. Generated with WD1.5 beta 2.

Portrait

vol1: 31 photos of women's portraits. Generated with WD1.5 beta 2 and the fashion LoCon.

vol2: 165 photos of women's portraits. Generated with WD1.5 beta 2 and the fashion LoCon. Classified with LAION Aesthetic v2.

75 hair bun photos
90 medium hair photos
LAION-art-EN-improved-captions
huggingface.co
Cite
Re:cast AI, LAION-art-EN-improved-captions [Dataset]. https://huggingface.co/datasets/recastai/LAION-art-EN-improved-captions
Dataset authored and provided by
Re:cast AI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Card for LAION-art-EN-improved-captions

Dataset Summary

This dataset has been created by Re:cast AI for improving the semantic relationship of image-caption pairs. generated_captions were created in a semi-supervised fashion using the Salesforce/blip2-flan-t5-xxl model.

Supported Tasks

Fine-tuning text-to-image generators (e.g. stable-diffusion), or a searchable prompt database (requires faiss-index).

Dataset Structure Data Fields… See the full description on the dataset page: https://huggingface.co/datasets/recastai/LAION-art-EN-improved-captions.
laion2b-23ish-1216px
huggingface.co
Updated Feb 12, 2025
Cite
Open Diffusion AI, laion2b-23ish-1216px [Dataset]. https://huggingface.co/datasets/opendiffusionai/laion2b-23ish-1216px
Dataset updated
Feb 12, 2025
Dataset authored and provided by
Open Diffusion AI
Description
Overview

This is a subset of https://huggingface.co/datasets/laion/laion2B-en-aesthetic, selected for aspect ratio, and with better captioning. Approximate image count is around 250k.

23ish, 1216px

I picked out the images with a portrait aspect ratio of 2:3 or a little wider (because images that are a little too wide can be safely cropped narrower). I also picked a minimum height of 1216 pixels, because that is what a 1024x1024 pixel count converted to 2:3 looks like.… See the full description on the dataset page: https://huggingface.co/datasets/opendiffusionai/laion2b-23ish-1216px.
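The 1216-pixel figure can be roughly reproduced with a little arithmetic. The sketch below spreads a 1024x1024 pixel budget over a 2:3 portrait shape and then rounds each side down to a multiple of 64. The rounding step and the helper name `portrait_dims` are assumptions made for illustration; the description does not say how the author arrived at the number.

```python
import math


def portrait_dims(pixel_count: int, ratio_w: int = 2, ratio_h: int = 3, multiple: int = 64):
    """Return ((exact_w, exact_h), (snapped_w, snapped_h)) for a pixel budget.

    The exact sides satisfy w * h == pixel_count and w / h == ratio_w / ratio_h.
    The snapped sides are rounded DOWN to a multiple of `multiple`
    (an assumed convention, not something the dataset card states).
    """
    # With w = ratio_w * k and h = ratio_h * k: ratio_w * ratio_h * k^2 = pixel_count.
    k = math.sqrt(pixel_count / (ratio_w * ratio_h))
    exact_w, exact_h = ratio_w * k, ratio_h * k
    snapped_w = int(exact_w // multiple) * multiple
    snapped_h = int(exact_h // multiple) * multiple
    return (exact_w, exact_h), (snapped_w, snapped_h)


exact, snapped = portrait_dims(1024 * 1024)
print(f"exact:   {exact[0]:.0f} x {exact[1]:.0f}")
print(f"snapped: {snapped[0]} x {snapped[1]}")
```

Under these assumptions the exact 2:3 shape is about 836 by 1254 pixels, and rounding down to multiples of 64 gives 832 by 1216, which is slightly wider than 2:3 and is consistent with the "or a little wider" selection rule.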
sd-4.4M
huggingface.co
Updated Aug 8, 2023
Cite
Javier Martín (2023). sd-4.4M [Dataset]. https://huggingface.co/datasets/jamarju/sd-4.4M
Dataset updated
Aug 8, 2023
Authors
Javier Martín
License
https://choosealicense.com/licenses/openrail/
Description
This is a dataset of 4.4M images generated with Stable Diffusion 2 for Kaggle's stable diffusion image to prompt competition. Prompts were extracted from public databases:

mp: Magic Prompt - 1M
db: DiffusionDB
op: Open Prompts
co: COCO
cc: Conceptual Captions
l0: LAION-2B-en-aesthetic

The following prompts were filtered out:

those with a token length of more than 77 CLIP tokens
those whose all-MiniLM-L6-v2 embedding has a cosine similarity >0.9 to any other prompt
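The second filter is a near-duplicate check on prompt embeddings. Here is a minimal sketch of one way such a check could work, using toy 2-D vectors in place of real all-MiniLM-L6-v2 embeddings. The greedy keep-first strategy and the function names are assumptions; the description does not say which prompt of a similar pair was dropped or how the comparison was implemented at scale.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors given as sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def drop_near_duplicates(embeddings, threshold=0.9):
    """Greedy filter: keep a vector only if its similarity to every
    already-kept vector is at most `threshold`. Returns kept indices.

    This is O(n^2) and only meant to illustrate the rule on small inputs.
    """
    kept = []
    for i, vec in enumerate(embeddings):
        if all(cosine(vec, embeddings[j]) <= threshold for j in kept):
            kept.append(i)
    return kept


# Toy stand-ins for prompt embeddings (hypothetical values).
toy = [
    [1.0, 0.0],
    [0.99, 0.1],  # nearly parallel to the first vector
    [0.0, 1.0],
]
kept_indices = drop_near_duplicates(toy)
```

For millions of prompts, a pairwise loop like this would be too slow, and an approximate nearest-neighbour index would be the more likely choice.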

Samples were clustered by their… See the full description on the dataset page: https://huggingface.co/datasets/jamarju/sd-4.4M.
6 scholarly articles cite the laion-aesthetics-12m-umap dataset.
