MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
LAION-Aesthetics :: CLIP → UMAP
This dataset is a CLIP (text) → UMAP embedding of the LAION-Aesthetics dataset - specifically the improved_aesthetics_6plus version, which filters the full dataset to images with scores of > 6 under the "aesthetic" filtering model. Thanks LAION for this amazing corpus!
The dataset here includes coordinates for three separate UMAP fits using different values for the n_neighbors parameter (10, 30, and 60), which are broken out as separate columns with… See the full description on the dataset page: https://huggingface.co/datasets/dclure/laion-aesthetics-12m-umap.
A subset of the LAION 5B samples with English captions, obtained using LAION-Aesthetics_Predictor V2. 625K image-text pairs with predicted aesthetics scores of 6.5 or higher are available at https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus
laion/laion2B-en-aesthetic dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
LAION COCO with aesthetic score and watermark score
This dataset contains a 10% sample of the LAION-COCO dataset, filtered by text rules (removing URLs, special tokens, etc.) and image rules (image size > 384x384, aesthetic score > 4.75, and watermark probability < 0.5). There are 8,563,753 data instances in total. The corresponding aesthetic score and watermark score are also included. Note: the watermark score in the table means the probability of the existence of the… See the full description on the dataset page: https://huggingface.co/datasets/guangyil/laion-coco-aesthetic.
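The image rules quoted above can be sketched as a simple predicate. This is a minimal illustration, not the dataset authors' code: the `Sample` class and its field names are hypothetical, and reading "image size > 384x384" as "both sides larger than 384 pixels" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical record layout; the real dataset's column names may differ.
    width: int
    height: int
    aesthetic_score: float
    watermark_prob: float


def passes_image_rules(s: Sample) -> bool:
    """Apply the three image rules from the description.

    Assumption: "image size > 384x384" means both sides exceed 384 px.
    """
    return (
        s.width > 384
        and s.height > 384
        and s.aesthetic_score > 4.75
        and s.watermark_prob < 0.5
    )


samples = [
    Sample(512, 768, 5.3, 0.10),   # passes every rule
    Sample(384, 768, 5.3, 0.10),   # too narrow
    Sample(512, 768, 4.75, 0.10),  # aesthetic score not strictly above 4.75
    Sample(512, 768, 5.3, 0.50),   # watermark probability not below 0.5
]
kept = [s for s in samples if passes_image_rules(s)]
```

All three thresholds are strict inequalities, matching the ">" and "<" signs in the description.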
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by CookieMonsterYum
Released under CC0: Public Domain
LAION-Aesthetics 6.5+ dataset contains 625K image-text pairs.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LAION-Aesthetics 10M Dataset Card
Dataset details
Dataset type: LAION-Aesthetics is a subset of LAION-5B that has been estimated to be aesthetic by a model trained on top of CLIP embeddings.
Intended usage: image generation.
Paper or resources for more information: https://laion.ai/blog/laion-aesthetics/
Acknowledgements: This work was supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grants funded by the… See the full description on the dataset page: https://huggingface.co/datasets/etri-vilab/Ko-LAION-Aesthetics-10M.
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
laion/Audio-aesthetics-score dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/other/
Dataset Description
SPRIGHT (SPatially RIGHT) is the first spatially focused, large scale vision-language dataset. It was built by re-captioning ∼6 million images from 4 widely-used datasets:
CC12M
Segment Anything
COCO Validation
LAION Aesthetics
This repository contains the re-captioned data from the COCO validation set, while the data from CC12M and Segment Anything is present here. We do not release images from LAION, as the parent images are currently private.
Dataset… See the full description on the dataset page: https://huggingface.co/datasets/SPRIGHT-T2I/spright_coco.
https://choosealicense.com/licenses/other/
Dataset Description
This dataset contains the 444 images that we used for training our model: https://huggingface.co/SPRIGHT-T2I/spright-t2i-sd2. It contains the samples of this subset related to the Segment Anything images. We will release the LAION images when the parent images are made public again. Our training and validation sets are subsets of the SPRIGHT dataset and consist of 444 and 50 images respectively, randomly sampled in a 50:50 split between LAION-Aesthetics and… See the full description on the dataset page: https://huggingface.co/datasets/SPRIGHT-T2I/18_obj_444.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
📄 Synthetic Captioned LAION Subset
🧾 Dataset Summary
This dataset consists of over 39 million synthetic image captions generated for 1 million curated images from LAION Aesthetics. Images have an aesthetics score > 6 and a minimum resolution of 512p, and have been screened for NSFW, CSAM and watermarks. We also removed exact duplicates.
📦 Data Structure
strID - Unique string identifier for the sample
intID - Unique integer identifier for the sample
imageURL -… See the full description on the dataset page: https://huggingface.co/datasets/AIML-TUDA/t2i-diversity-captions.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
📦 Data Sources
The training set of MMFR-Dataset contains:
Fake images sourced from DiffusionDB, released under the CC0 1.0 Public Domain Dedication.
Real images drawn from LAION-Aesthetics, a subset of LAION-5B, licensed under the CC BY 4.0 License.
Licenses of evaluation sets are:
BigGAN: Provided by the GenImage dataset, licensed under CC BY-NC-SA 4.0.
GauGAN: Obtained from CNNDetection, released under CC BY-NC-SA 4.0.
StyleGAN-XL: Collected from AntifakePrompts, under the… See the full description on the dataset page: https://huggingface.co/datasets/AnnaGao/MMFR-Dataset.
https://choosealicense.com/licenses/cc0-1.0/
Nobodies
AI-generated human image dataset.
Contents
Face
vol1: 32 photos of women's faces. Generated with WD1.5 beta 2.
Sample:
Portrait
vol1: 31 photos of women's portraits. Generated with WD1.5 beta 2 and the fashion LoCon.
Sample:
vol2: 165 photos of women's portraits. Generated with WD1.5 beta 2 and the fashion LoCon. Classified with LAION Aesthetic v2.
75 hair bun photos
90 medium hair photos
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for LAION-art-EN-improved-captions
Dataset Summary
This dataset has been created by Re:cast AI for improving the semantic relationship of image-caption pairs. generated_captions were created in a semi-supervised fashion using the Salesforce/blip2-flan-t5-xxl model.
Supported Tasks
Fine-tuning text-to-image generators (e.g. stable-diffusion), or a searchable prompt database (requires faiss-index).
Dataset Structure
Data Fields… See the full description on the dataset page: https://huggingface.co/datasets/recastai/LAION-art-EN-improved-captions.
Overview
This is a subset of https://huggingface.co/datasets/laion/laion2B-en-aesthetic, selected for aspect ratio, and with better captioning. The image count is around 250k.
23ish, 1216px
I picked out the images that have a portrait aspect ratio of 2:3, or a little wider (because images that are a little too wide can be safely cropped narrower). I also picked a minimum height of 1216 pixels, because that is what a 1024x1024 pixel count converted to 2:3 looks like.… See the full description on the dataset page: https://huggingface.co/datasets/opendiffusionai/laion2b-23ish-1216px.
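The arithmetic behind the 1216-pixel figure can be sketched as follows. This is an illustration under stated assumptions rather than the author's procedure: snapping down to multiples of 64 is an assumed convention that the description does not mention, and the `max_ratio` tolerance for "a little wider" is a hypothetical value.

```python
import math


def dims_for_pixel_budget(pixels: int, ratio_w: int, ratio_h: int,
                          multiple: int = 64) -> tuple[int, int]:
    """Return (width, height) for a ratio_w:ratio_h image with at most
    `pixels` pixels, snapped down to a multiple of `multiple`.

    Assumption: the snapping step is illustrative; the source does not state it.
    """
    exact_h = math.sqrt(pixels * ratio_h / ratio_w)
    exact_w = exact_h * ratio_w / ratio_h
    snap = lambda v: int(v // multiple) * multiple
    return snap(exact_w), snap(exact_h)


def is_candidate(width: int, height: int, min_height: int = 1216,
                 max_ratio: float = 0.72) -> bool:
    """True for portrait images at 2:3 or slightly wider, tall enough.

    `max_ratio` (width / height) is a hypothetical tolerance.
    """
    at_least_two_thirds = 3 * width >= 2 * height  # integer-safe 2:3 check
    return (height >= min_height
            and at_least_two_thirds
            and width / height <= max_ratio)


target = dims_for_pixel_budget(1024 * 1024, 2, 3)
print(target)
```

The exact 2:3 solution for a 1024x1024 pixel budget is roughly 836x1254; with the assumed snapping to multiples of 64 it becomes 832x1216.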
https://choosealicense.com/licenses/openrail/
This is a dataset of 4.4M images generated with Stable Diffusion 2 for Kaggle's stable diffusion image to prompt competition. Prompts were extracted from public databases:
mp: Magic Prompt - 1M
db: DiffusionDB
op: Open Prompts
co: COCO
cc: Conceptual Captions
l0: LAION-2B-en-aesthetic
The following prompts were filtered out:
those with token length >77 CLIP tokens
those whose all-MiniLM-L6-v2 embedding has a cosine similarity >0.9 to any other prompt
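The two filters above can be sketched in plain Python. This is a hedged illustration, not the dataset author's pipeline: token counts and embeddings are assumed to be precomputed (in practice they would come from a CLIP tokenizer and the all-MiniLM-L6-v2 model), the record layout is hypothetical, and keeping the first prompt of each near-duplicate group is one possible reading of "filtered out".

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def filter_prompts(records, max_tokens: int = 77, max_sim: float = 0.9):
    """Drop over-long prompts and near-duplicates.

    `records` is a list of (text, n_tokens, embedding) tuples; this layout
    is hypothetical. Greedy pass: a prompt is dropped when it is too similar
    to one that has already been kept (an assumed interpretation).
    """
    kept = []
    for text, n_tokens, emb in records:
        if n_tokens > max_tokens:
            continue  # longer than the 77-token limit
        if any(cosine(emb, kept_emb) > max_sim for _, kept_emb in kept):
            continue  # near-duplicate of an earlier prompt
        kept.append((text, emb))
    return [text for text, _ in kept]


# Toy 2-D embeddings stand in for real sentence embeddings.
records = [
    ("a cat", 3, [1.0, 0.0]),
    ("a cat!", 4, [0.99, 0.05]),
    ("a dog", 3, [0.0, 1.0]),
    ("a very long prompt", 80, [0.5, 0.5]),
]
result = filter_prompts(records)
```

The greedy pass compares each prompt with every kept prompt, so it is quadratic in the number of prompts; at the scale of millions of prompts an approximate nearest-neighbour index would likely be needed instead.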
Samples were clustered by their… See the full description on the dataset page: https://huggingface.co/datasets/jamarju/sd-4.4M.