Diffusion models (DMs) have revolutionized image generation, producing high-quality images with applications spanning various fields. However, their ability to create hyper-realistic images poses significant challenges in distinguishing between real and synthetic content, raising concerns about digital authenticity and potential misuse in creating deepfakes. This work introduces a robust detection framework that integrates image and text features extracted by the CLIP model with a Multilayer Perceptron (MLP) classifier.
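As a rough illustration of this kind of detector, the sketch below pairs CLIP image and text embeddings with a small MLP head. It assumes the openai/clip-vit-base-patch32 checkpoint via Hugging Face transformers; the hidden size, feature concatenation, and caption source are illustrative choices, not the paper's exact architecture.

```python
# Minimal sketch of a CLIP-feature + MLP detector (illustrative, not the
# paper's exact architecture). Requires torch, transformers, and Pillow.
import torch
import torch.nn as nn
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

class MLPDetector(nn.Module):
    """Binary real/synthetic classifier over concatenated CLIP features."""
    def __init__(self, dim=512 * 2, hidden=256):  # ViT-B/32 projects to 512
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # logit: > 0 means "synthetic"
        )
    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def extract_features(image: Image.Image, caption: str) -> torch.Tensor:
    """Concatenate CLIP image and text embeddings into one feature vector."""
    inputs = processor(text=[caption], images=image,
                       return_tensors="pt", padding=True)
    img_feat = clip.get_image_features(pixel_values=inputs["pixel_values"])
    txt_feat = clip.get_text_features(input_ids=inputs["input_ids"],
                                      attention_mask=inputs["attention_mask"])
    return torch.cat([img_feat, txt_feat], dim=-1)  # shape: (1, 1024)

detector = MLPDetector()
feats = extract_features(Image.new("RGB", (224, 224)), "a photo of a cat")
prob_fake = torch.sigmoid(detector(feats))  # untrained here; train on real/fake pairs
```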
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Dev K11212
Released under MIT
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset described in the paper "Synthbuster: Towards Detection of Diffusion Model Generated Images" (Quentin Bammey, 2023, Open Journal of Signal Processing)
This dataset contains synthetic, AI-generated images from 9 different models.
1000 images were generated per model. The images are loosely based on RAISE-1k images (Dang-Nguyen, Duc-Tien, et al. "RAISE: A raw images dataset for digital image forensics." Proceedings of the 6th ACM Multimedia Systems Conference. 2015.). For each image of the RAISE-1k dataset, a description was generated using the Midjourney /describe function and CLIP Interrogator (https://github.com/pharmapsychotic/clip-interrogator/). Each of these prompts was manually edited to produce results as photorealistic as possible and to remove the names of living persons and artists.
In addition, for the methods that require them, parameters were randomly selected within reasonable ranges.
The prompts and parameters used for each method can be found in the `prompts.csv` file.
This dataset can be used to evaluate AI-generated image detection methods. We recommend pairing the generated images with the real RAISE-1k images to evaluate whether a method can distinguish between the two. The RAISE-1k images are not included in this dataset; they can be downloaded separately at http://loki.disi.unitn.it/RAISE/download.html.
None of the images suffered degradations such as JPEG compression or resampling, which leaves room to add your own degradations to test robustness to various transformations in a controlled manner.
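One way to run such a controlled robustness test is to round-trip the images through JPEG at chosen quality factors before evaluating a detector. A minimal sketch using Pillow follows; the folder layout and quality levels are placeholders.

```python
# Sketch: apply controlled JPEG compression to Synthbuster images before
# running a detector. Paths and quality factors below are illustrative.
import io
from pathlib import Path
from PIL import Image

def jpeg_degrade(img: Image.Image, quality: int) -> Image.Image:
    """Round-trip an image through in-memory JPEG compression."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).copy()

out = Path("degraded")
out.mkdir(exist_ok=True)
for path in Path("synthbuster/midjourney-v5").glob("*.png"):  # hypothetical layout
    for q in (95, 80, 60):
        # Save back as PNG so the JPEG artifacts are preserved losslessly.
        jpeg_degrade(Image.open(path), q).save(out / f"q{q}_{path.stem}.png")
```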
A dataset for subject-driven generation, containing 30 subjects, including objects and live subjects/pets.
A dataset used for text-to-image diffusion models, covering Bluefire, Paintings, 3D, and Origami styles.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Stable Diffusion Images is a dataset for object detection tasks - it contains Planes Yyvq annotations for 350 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
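As a minimal download sketch, the snippet below uses the roboflow Python package; the API key, workspace, and project identifiers are placeholders you would replace with your own.

```python
# Sketch: pull the dataset into a local folder via the roboflow package.
# The API key, workspace, and project slugs below are placeholders.
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("stable-diffusion-images")
dataset = project.version(1).download("coco")  # COCO-format object detection export
print(dataset.location)  # local path to the downloaded images and annotations
```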
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains 500 synthetic images generated via prompt-based text-to-image diffusion modeling using Stable Diffusion XL. Each image belongs to one of five classes: cat, dog, horse, car, and tree.
Gurpreet, S. (2025). Synthetic Image Dataset of Five Object Classes Generated Using Stable Diffusion XL [Data set]. Zenodo. https://doi.org/10.5281/zenodo.16414387
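The dataset record does not spell out the exact generation settings, but a batch of class-conditioned SDXL images can be produced along these lines with diffusers; the prompt template and per-class count below are assumptions, not the dataset's recorded configuration.

```python
# Sketch: generate class-conditioned images with Stable Diffusion XL via
# diffusers. Prompt template and counts are illustrative assumptions.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

classes = ["cat", "dog", "horse", "car", "tree"]
for cls in classes:
    for i in range(100):  # 100 images x 5 classes = 500 images
        image = pipe(prompt=f"a photo of a {cls}").images[0]
        image.save(f"{cls}_{i:03d}.png")
```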
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This dataset consists of images generated by Stable Diffusion v1.4 from the diffusers library, with 100 images per class. The prompt "a photo of {class}, realistic, high quality" was used, with 50 sampling steps and a classifier guidance scale of 7.5. Each image is 512x512 pixels.
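Those settings map directly onto the diffusers API. A minimal reproduction sketch follows, with the {class} placeholder filled in by hand; the output filename is illustrative.

```python
# Reproduction sketch of the stated generation settings with diffusers.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a photo of cat, realistic, high quality",  # prompt template from the card
    num_inference_steps=50,  # 50 sampling steps, as stated
    guidance_scale=7.5,      # guidance scale of 7.5, as stated
    height=512, width=512,   # 512x512 output, as stated
).images[0]
image.save("cat_000.png")
```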
License: unknown (https://choosealicense.com/licenses/unknown/)
Stable Diffusion Dataset
This is a set of about 80,000 prompts filtered and extracted from "Lexica.art", an image search engine for Stable Diffusion. Extracting the data was a little difficult, since the search engine still does not have a public API that is not protected by Cloudflare. If you want to test the model with a demo, you can go to "spaces/Gustavosta/MagicPrompt-Stable-Diffusion". If you want to see the model, go to "Gustavosta/MagicPrompt-Stable-Diffusion".
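To try the companion model locally rather than through the demo Space, a minimal sketch with transformers follows; it assumes the model loads as a standard causal LM (its card describes a GPT-2 fine-tune), and the seed prompt is arbitrary.

```python
# Sketch: expand a short seed prompt with the MagicPrompt model
# (assumes it loads as a standard causal LM for text generation).
from transformers import pipeline

generator = pipeline("text-generation",
                     model="Gustavosta/MagicPrompt-Stable-Diffusion")
out = generator("a portrait of an astronaut",
                max_length=77, num_return_sequences=1)
print(out[0]["generated_text"])  # elaborated Stable Diffusion prompt
```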
The dubm/stable-diffusion-xl-base-1.0-images dataset, hosted on Hugging Face and contributed by the HF Datasets community.
Custom license: https://researchdata.ntu.edu.sg/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.21979/N9/VYPJ0O
While diffusion-based image restoration (IR) methods have achieved remarkable success, they are still limited by the low inference speed attributable to the necessity of executing hundreds or even thousands of sampling steps. Existing acceleration sampling techniques, though seeking to expedite the process, inevitably sacrifice performance to some extent, resulting in over-blurry restored outcomes. To address this issue, this study proposes a novel and efficient diffusion model for IR that significantly reduces the required number of diffusion steps. Our method avoids the need for post-acceleration during inference, thereby avoiding the associated performance deterioration. Specifically, our proposed method establishes a Markov chain that facilitates the transitions between the high-quality and low-quality images by shifting their residuals, substantially improving the transition efficiency. A carefully formulated noise schedule is devised to flexibly control the shifting speed and the noise strength during the diffusion process. Extensive experimental evaluations demonstrate that the proposed method achieves superior or comparable performance to current state-of-the-art methods on three classical IR tasks, namely image super-resolution, image inpainting, and blind face restoration, even with only four sampling steps.
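To make the residual-shifting idea concrete, the sketch below samples a forward transition that moves a high-quality image toward its low-quality counterpart by shifting their residual; the schedule values eta_t and the noise scale kappa are illustrative assumptions, not the paper's trained schedule.

```python
# Hedged sketch of a residual-shifting forward step in the spirit of the
# abstract: x_t drifts from the HQ image toward the LQ image along their
# residual, with added noise. eta_t schedule and kappa are assumptions.
import torch

def forward_marginal(x_hq, x_lq, eta_t, kappa=1.0):
    """Sample x_t ~ N(x_hq + eta_t * (x_lq - x_hq), kappa^2 * eta_t * I)."""
    residual = x_lq - x_hq
    mean = x_hq + eta_t * residual
    return mean + kappa * (eta_t ** 0.5) * torch.randn_like(x_hq)

x_hq = torch.rand(1, 3, 64, 64)  # high-quality image (toy tensor)
x_lq = torch.rand(1, 3, 64, 64)  # low-quality counterpart
for eta_t in (0.05, 0.5, 1.0):   # eta grows from ~0 (near HQ) to 1 (LQ + noise)
    x_t = forward_marginal(x_hq, x_lq, eta_t)
```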
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Fake, AI-generated human faces created with the Stable Diffusion 1.5, 2.1, and SDXL 1.0 checkpoints. The main objective was to generate photos that were as realistic as possible, without any specific style, focusing mainly on the face.
More details on the images and the process of creating them can be found in the readme file.
The data is not mine; it is taken from a GitHub repository by the user tobecwb. Repo link: https://github.com/tobecwb/stable-diffusion-face-dataset
Text-to-image diffusion models, e.g. Stable Diffusion (SD), have lately shown remarkable ability in high-quality content generation, and have become one of the representatives of the recent wave of transformative AI.
Text-to-image diffusion models with an ensemble of expert denoisers.
We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation. Our key discovery is that generic large language models (e.g. T5), pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis: increasing the size of the language model in Imagen boosts both sample fidelity and image-text alignment much more than increasing the size of the image diffusion model. Imagen achieves a new state-of-the-art FID score of 7.27 on the COCO dataset, without ever training on COCO, and human raters find Imagen samples to be on par with the COCO data itself in image-text alignment. To assess text-to-image models in greater depth, we introduce DrawBench, a comprehensive and challenging benchmark for text-to-image models. With DrawBench, we compare Imagen with recent methods including VQ-GAN+CLIP, Latent Diffusion Models, GLIDE and DALL-E 2, and find that human raters prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment.
Collection of 100,000+ CivitAI models including Stable Diffusion checkpoints, LoRA models, Textual Inversions, and VAE models for text-to-image generation.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Generative image models have revolutionized artificial intelligence by enabling the synthesis of high-quality, realistic images. These models utilize deep learning techniques to learn complex data distributions and generate novel images that closely resemble the training dataset. Recent advancements, particularly in diffusion models, have led to remarkable improvements in image fidelity, diversity, and controllability. In this work, we investigate the application of a conditional latent diffusion model in the healthcare domain. Specifically, we trained a latent diffusion model using unlabeled histopathology images. Initially, these images were embedded into a lower-dimensional latent space using a Vector Quantized Generative Adversarial Network (VQ-GAN). Subsequently, a diffusion process was applied within this latent space, and clustering was performed on the resulting latent features. The clustering results were then used as a conditioning mechanism for the diffusion model, enabling conditional image generation. Finally, we determined the optimal number of clusters using cluster validation metrics and assessed the quality of the synthetic images through quantitative methods. To enhance the interpretability of the synthetic image generation process, expert input was incorporated into the cluster assignments.
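The clustering-plus-validation step described above might look roughly like the following, with scikit-learn's k-means and silhouette score standing in for the paper's unspecified clustering and validation choices; the latent dimensionality and candidate range of k are placeholders.

```python
# Sketch of the conditioning step: cluster VQ-GAN latents with k-means,
# pick k by silhouette score, and use cluster ids as class labels for
# conditional diffusion training. Latents here are toy stand-in data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

latents = np.random.randn(1000, 256)  # stand-in for flattened VQ-GAN latents

best_k, best_score, best_labels = None, -1.0, None
for k in range(2, 11):  # candidate cluster counts (range is an assumption)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(latents)
    score = silhouette_score(latents, labels)
    if score > best_score:
        best_k, best_score, best_labels = k, score, labels

# best_labels now serve as the conditioning signal (one cluster id per image)
print(best_k, best_score)
```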
GPL-3.0: https://choosealicense.com/licenses/gpl-3.0/
The gryffindor-ISWS/stable-diffusion-2-1-without-images dataset, hosted on Hugging Face and contributed by the HF Datasets community.
Custom license: https://researchdata.ntu.edu.sg/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.21979/N9/SZJQME
This study presents a new image super-resolution (SR) technique based on diffusion inversion, aiming at harnessing the rich image priors encapsulated in large pre-trained diffusion models to improve SR performance. We design a Partial noise Prediction strategy to construct an intermediate state of the diffusion model, which serves as the starting sampling point. Central to our approach is a deep noise predictor to estimate the optimal noise maps for the forward diffusion process. Once trained, this noise predictor can be used to initialize the sampling process partially along the diffusion trajectory, generating the desirable high-resolution result. Compared to existing approaches, our method offers a flexible and efficient sampling mechanism that supports an arbitrary number of sampling steps, ranging from one to five. Even with a single sampling step, our method demonstrates superior or comparable performance to recent state-of-the-art approaches.
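A heavily hedged sketch of the partial-noise-prediction idea: rather than starting from pure noise, predict a noise map from the low-quality input and enter the diffusion trajectory at an intermediate step. All names, the toy network, and the schedule value below are illustrative stand-ins, not the paper's implementation.

```python
# Hedged sketch: construct an intermediate diffusion state x_tau from the
# upsampled LR image and a learned noise map, then run only the few
# remaining sampler steps instead of hundreds. Everything here is a toy
# stand-in for the paper's trained components.
import torch

def initial_state(x_lr_up, noise_predictor, alpha_bar_tau):
    """Blend the LR input with a predicted noise map at step tau."""
    eps_hat = noise_predictor(x_lr_up)  # deep noise predictor (stand-in)
    return (alpha_bar_tau ** 0.5) * x_lr_up + ((1 - alpha_bar_tau) ** 0.5) * eps_hat

noise_predictor = torch.nn.Conv2d(3, 3, 3, padding=1)  # toy stand-in network
x_lr_up = torch.rand(1, 3, 256, 256)  # LR input upsampled to the target size
x_tau = initial_state(x_lr_up, noise_predictor, alpha_bar_tau=0.25)
# From x_tau, a pre-trained diffusion sampler would take the remaining
# 1-5 steps to produce the high-resolution result.
```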