https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains anime images for 231 different anime, with approximately 380 images for each of them. Please note that you might need to clean the image directories a bit, since they might contain merchandise and live-action photos in addition to images of the anime itself.
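As a starting point for that cleanup, the sketch below tallies image files per directory so that unusually small or large folders can be reviewed first. It assumes one subdirectory per anime under a root folder, which the description does not confirm; the extension set and the helper names `count_images_per_anime` and `flag_unusual` are hypothetical. It does not detect merchandise or live-action photos, which still need a manual look.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the files you actually have.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}


def count_images_per_anime(root):
    """Return {subdirectory name: number of image files directly inside it}."""
    counts = {}
    for anime_dir in sorted(Path(root).iterdir()):
        if not anime_dir.is_dir():
            continue
        counts[anime_dir.name] = sum(
            1
            for f in anime_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


def flag_unusual(counts, expected=380, tolerance=0.5):
    """Return names whose count is outside expected * (1 +/- tolerance)."""
    low, high = expected * (1 - tolerance), expected * (1 + tolerance)
    return [name for name, n in counts.items() if n < low or n > high]
```

The default of 380 comes from the description above; the 50% tolerance is an arbitrary choice for illustration.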
If you'd like to take a look at the scripts used to make this dataset, you can find them on this GitHub repo.
Feel free to extend it, scrape your own images, etc. etc.
As a big anime fan, I found a lot of anime-related datasets on Kaggle. I was, however, disappointed to find no dataset containing anime-specific images for popular anime. Some other great datasets that I've been inspired by include:
- Top 250 Anime 2023
- Anime Recommendations Database
- Anime Recommendation Database 2020
- Anime Face Dataset
- Safebooru - Anime Image Metadata
🌸 200+ High-Quality Anime Images Dataset
Welcome to the Ultimate Anime Image Dataset – a carefully curated collection of 200+ stunning anime images for researchers, developers, and anime lovers. Perfect for Machine Learning, Computer Vision, Anime Art Generation, or Creative Projects. ✨
⚠️ Attribution is Mandatory
When using this dataset (personal or commercial), credit is required: "Anime Images Dataset by Banana_Leopard" Dataset by Dhiraj45
✨ Highlights
✅ 200+… See the full description on the dataset page: https://huggingface.co/datasets/Dhiraj45/Animes.
This is a dataset of richly tagged and labeled artwork depicting characters from Japanese anime. The data comes from two image boards, danbooru and moeimouto. This data can be used in a variety of interesting ways, from classification to generative modeling. Please note that while all of the images in this dataset have been tagged as SFW (non-explicit), the websites they come from do not ban explicit or pornographic images, and mislabeled images may still be in the dataset.
The first set of data comes from the imageboard Danbooru. The entire corpus of Danbooru images was scraped from the site with permission and was collected into a dataset. The zip files included here have the full metadata for these images as well as a subset of 300,000 of the images in normalized 512px x 512px form. Full information about this dataset is available here:
https://www.gwern.net/Danbooru2017
From the article:
Deep learning for computer vision relies on large annotated datasets. Classification/categorization has benefited from the creation of ImageNet, which classifies 1m photos into 1000 categories. But classification/categorization is a coarse description of an image which limits application of classifiers, and there is no comparably large dataset of images with many tags or labels which would allow learning and detecting much richer information about images. Such a dataset would ideally be >1m images with at least 10 descriptive tags each which can be publicly distributed to all interested researchers, hobbyists, and organizations. There are currently no such public datasets, as ImageNet, Birds, Flowers, and MS COCO fall short either on image or tag count or restricted distribution. I suggest that the image -boorus be used. The image boorus are longstanding web databases which host large numbers of images which can be tagged or labeled with an arbitrary number of textual descriptions; they were developed for and are most popular among fans of anime, who provide detailed annotations.
The best known booru, with a focus on quality, is Danbooru. We create & provide a torrent which contains ~1.9tb of 2.94m images with 77.5m tag instances (of 333k defined tags, ~26.3/image) covering Danbooru from 24 May 2005 through 31 December 2017 (final ID: #2,973,532), providing the image files & a JSON export of the metadata. We also provide a smaller torrent of SFW images downscaled to 512x512px JPG (241GB; 2,232,462 images) for convenience.
Our hope is that a Danbooru2017 dataset can be used for rich large-scale classification/tagging & learned embeddings, test out the transferability of existing computer vision techniques (primarily developed using photographs) to illustration/anime-style images, provide an archival backup for the Danbooru community, feed back metadata improvements & corrections, and serve as a testbed for advanced techniques such as conditional image generation or style transfer.
The second set of data included in this dataset is a little more manageable than the first: it consists of a number of cropped illustrated faces from the now-defunct site moeimouto. This dataset has been used in GAN work in the past. The data comes from:
http://www.nurs.or.jp/~nagadomi/animeface-character-dataset/
More information:
http://www.nurs.or.jp/~nagadomi/animeface-character-dataset/README.html
If you are interested in creating more face data (potentially from the Danbooru data), here is a helpful resource: https://github.com/nagadomi/lbpcascade_animeface
If you are looking for something a little easier to crack into, check out this other great anime image booru dataset: https://www.kaggle.com/alamson/safebooru
This dataset was created by theo Vincent
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Danbooru2023: A Large-Scale Crowdsourced and Tagged Anime Illustration Dataset
Danbooru2023 is a large-scale anime image dataset with over 5 million images contributed and annotated in detail by an enthusiast community. Image tags cover aspects like characters, scenes, copyrights, artists, etc with an average of 30 tags per image. Danbooru is a veteran anime image board with high-quality images and extensive tag metadata. The dataset can be used to train image classification… See the full description on the dataset page: https://huggingface.co/datasets/nyanko7/danbooru2023.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
I made an anime dataset of my own while learning about GANs. I built a simple GAN with it, and it works well for me. You can use this dataset yourself for similar projects. I downloaded every image I could find on the internet that was suitable for my GAN testing and cropped each one to 256px, so that I can downscale to 64px or 128px according to my requirements. These are all pictures of anime girls from anime that I have already watched.
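The crop-then-downscale step can be sketched as plain arithmetic. The description does not say how the images were cropped, so the centered square crop below is an assumption, and `center_crop_box` and `downscale_factor` are hypothetical helper names. The returned box uses the (left, top, right, bottom) order that Pillow's `Image.crop` accepts, after which `Image.resize` could bring the crop to 256px and then to a training size.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def downscale_factor(source_side=256, target_side=64):
    """Return the integer factor between the stored size and a training size."""
    if source_side % target_side != 0:
        raise ValueError("target size does not divide the source size evenly")
    return source_side // target_side


# A 640x480 source image keeps its full height and loses 80px on each side.
print(center_crop_box(640, 480))
# 256px images shrink by a factor of 4 for 64px training and 2 for 128px.
print(downscale_factor(256, 64), downscale_factor(256, 128))
```

Storing at 256px and downscaling later keeps one master copy that serves both training sizes.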
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Danbooru2019 Portraits is a dataset of n=302,652 (16GB) 512px anime faces cropped from solo SFW Danbooru2019 images in a relatively broad 'portrait' style encompassing necklines/ears/hats/etc rather than tightly focused on the face, upscaled to 512px as necessary, and low-quality images deleted by manual review using Discriminator ranking. It has been used for creating TWDNE.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
myanimelist.net is the most popular site where fans of Japanese animated films and series share their opinions on various productions. The portal works much like IMDb: users rate titles, and rankings of various kinds are built from those ratings. In this database you will find the distribution of votes on a scale from 1 to 10 for the 100 most popular (most often voted on) anime at a specific point in time.
The data is grouped into folders, with one folder per series. There are 100 folders in total, as we took the 100 most popular anime based on the current ranking on the site. Each folder holds the pictures from the "pics" tab of that anime title. The number of pictures differs between series: some have only one picture, others about a dozen. The data was obtained by web scraping in Python with the "BeautifulSoup", "requests", "re", "urllib", and "os" packages. For each movie or series, we located its pictures page and, by inspecting the page's HTML, found the direct links to the pictures. Then, in a loop, we downloaded each picture in turn and saved it to the appropriate folder under a name based on the link and the title of the series.
A notebook based on this data could compare the titles with one another based on their pictures, or build a model that predicts which anime a picture comes from. Unfortunately, there is little data in each category, so working with it can be quite a challenge.
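The link-extraction and naming steps described above can be sketched as follows. This is not the authors' script: it swaps BeautifulSoup for the standard-library `html.parser` so the example runs without extra packages, it skips the download itself, and the HTML snippet, the URLs, the series title, and the helper names are all made up for illustration.

```python
import re
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urlparse


class ImageLinkParser(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.links.append(src)


def extract_image_links(html):
    """Return the image URLs found in an HTML string, in document order."""
    parser = ImageLinkParser()
    parser.feed(html)
    return parser.links


def target_path(title, link):
    """Build '<folder>/<file name>' from the series title and the image link."""
    folder = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    filename = PurePosixPath(urlparse(link).path).name
    return f"{folder}/{filename}"


# Made-up HTML standing in for a "pics" page; the URLs are placeholders.
sample_html = """
<div><img src="https://example.com/images/anime/1/101.jpg">
<img src="https://example.com/images/anime/1/102.jpg"><img alt="no source"></div>
"""
for link in extract_image_links(sample_html):
    print(target_path("Example Series: Part 2", link))
```

Deriving the folder from the title and the file name from the link mirrors the naming rule in the description; the exact sanitising rule used here is an assumption.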
Photo by Dex Ezekiel on Unsplash
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The following dataset contains the names, images and links of anime characters.
The data contains 4 parts.
1: The anime_links.csv file contains the links to all anime character pages present on the source site. There are around 73000 links. (That's a huge family xd😅)
2: The anime_names.csv file contains the names of all anime characters, excluding the ones with non-ASCII characters in them (you can include them by editing anime_links.csv). There are around 71000 names. (Those 2000 characters have very complex names lol👀, sorry if you are a fan of one of them🙁)
3: The final_names.csv file contains the names of anime characters, again without names that have non-ASCII characters (e.g. Raoul_Mathias_Jean_Aimée).
4: The dataset folder contains all the images of anime characters (over 58000 images are present), with each file named after its character.
The data can be used in many ways. Some of them are:
0: If you are an anime fan, you can use the images in the dataset folder as desktop wallpaper😄.
1: anime_links.csv can be used for web scraping to extract other features of anime characters (a true anime lover will do this😄).
2: anime_names.csv can be used to generate new anime character names.
3: The dataset images can be used to generate new anime images.
4: My target is to combine 2 and 3 to create a new anime character with a new name. (Very exciting, right💥)
If you find any other use, kindly let me know; we can work on it together🙏
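The ASCII filter mentioned for the name files can be sketched with Python's built-in `str.isascii`. The helper name `split_by_ascii` and the two placeholder names are hypothetical; only `Raoul_Mathias_Jean_Aimée` comes from the description. How the author actually filtered the names is not stated.

```python
def split_by_ascii(names):
    """Split names into (ASCII-only names, names with any non-ASCII character)."""
    ascii_names = [n for n in names if n.isascii()]
    other_names = [n for n in names if not n.isascii()]
    return ascii_names, other_names


# Two placeholder names plus the non-ASCII example from the description.
names = ["Example_Name", "Raoul_Mathias_Jean_Aimée", "Another_Example"]
kept, dropped = split_by_ascii(names)
print(kept)     # names made of ASCII characters only
print(dropped)  # names containing characters such as "é"
```

Keeping the dropped names in a separate list makes it easy to add them back later instead of discarding them.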
All the data is extracted from Anime and Manga
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for image-preferences-results
Prompt: Anime-style concept art of a Mayan Quetzalcoatl biomutant, dystopian world, vibrant colors, 4K.
Image 1
Image 2
Prompt: 8-bit pixel art of a blue knight, green car, and glacier landscape in Norway, fantasy style, colorful and detailed.
Image 1… See the full description on the dataset page: https://huggingface.co/datasets/data-is-better-together/open-image-preferences-v1-results.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Benchmarking the robustness to distribution shifts traditionally relies on dataset collection which is typically laborious and expensive, in particular for datasets with a large number of classes like ImageNet. An exception to this procedure is ImageNet-C (Hendrycks & Dietterich, 2019), a dataset created by applying common real-world corruptions at different levels of intensity to the (clean) ImageNet images. Inspired by this work, we introduce ImageNet-Cartoon and ImageNet-Drawing, two datasets constructed by converting ImageNet images into cartoons and colored pencil drawings, using a GAN framework (Wang & Yu, 2020) and simple image processing (Lu et al., 2012), respectively.
This repository contains ImageNet-Cartoon and ImageNet-Drawing. Check out the official GitHub repo for the code to reproduce the datasets.
If you find this useful in your research, please consider citing:
@inproceedings{imagenetshift,
title={ImageNet-Cartoon and ImageNet-Drawing: two domain shift datasets for ImageNet},
author={Tiago Salvador and Adam M. Oberman},
booktitle={ICML Workshop on Shift happens: Crowdsourcing metrics and test datasets beyond ImageNet.},
year={2022}
}
https://creativecommons.org/publicdomain/zero/1.0/
Overview
This dataset contains:
- anime information for 13,379 animes
- user information for 1,123,284 myanimelist users
- 214,271 interactions between anime pairs (recommended and related animes)
- 5,048,994 interactions between user pairs (friendship)
- 223,812,614 interactions between users and animes
Schemas
anime.csv:
- anime_id: The id of the anime
- anime_url: The myanimelist url of the anime
- title: The name of the anime
- synopsis: Short description of the plot of the anime
- main_pic: Url to the cover picture of the anime
- type: Type of the anime (e.g. TV, Movie, OVA etc.)
- source_type: Type of the source of the anime (e.g. Manga, Light Novel etc.)
- num_episodes: Number of episodes in the anime
- status: The current status of the anime (Finished airing, Currently airing or Not yet aired)
- start_date: Start date of the anime
- end_date: End date of the anime
- season: Season the anime started airing in (e.g. animes that started in Jan 2020 have season Winter 2020)
- studios: List of studios that created the anime
- genres: List of the anime genres (Action, Shonen etc.)
- score: Average score of the anime on myanimelist
- score_count: Number of users that scored the anime
- score_rank: Rank of the anime based on its score on myanimelist
- popularity_rank: Rank of the anime based on its popularity on myanimelist
- members_count: Number of users that are members of the anime
- favorites_count: Number of users that have the anime as a favorite anime
- watching_count: Number of users watching the anime
- completed_count: Number of users that have completed the anime
- on_hold_count: Number of users that have the anime on hold
- dropped_count: Number of users that have dropped the anime
- plan_to_watch_count: Number of users that plan to watch the anime
- total_count: Total number of users that either completed, plan to watch, are watching, dropped or have the anime on hold
- score_10_count: Number of users that scored the anime a 10
- score_09_count: Number of users that scored the anime a 9
- score_08_count: Number of users that scored the anime an 8
- score_07_count: Number of users that scored the anime a 7
- score_06_count: Number of users that scored the anime a 6
- score_05_count: Number of users that scored the anime a 5
- score_04_count: Number of users that scored the anime a 4
- score_03_count: Number of users that scored the anime a 3
- score_02_count: Number of users that scored the anime a 2
- score_01_count: Number of users that scored the anime a 1
- clubs: List of MAL clubs the anime is part of
- pics: List of urls to pictures of the anime
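As an illustration of how the score_XX_count columns relate to an average, the sketch below computes a plain arithmetic mean from the ten vote counts. The sample row is made up, `mean_score_from_counts` is a hypothetical helper, and the result need not equal the `score` column, since the site may weight votes differently.

```python
def mean_score_from_counts(row):
    """Return the arithmetic mean score from score_01_count..score_10_count.

    Returns None when the row has no votes at all.
    """
    total_votes = 0
    weighted_sum = 0
    for score in range(1, 11):
        count = int(row[f"score_{score:02d}_count"])
        total_votes += count
        weighted_sum += score * count
    if total_votes == 0:
        return None
    return weighted_sum / total_votes


# Made-up row: ten votes of 10, ten votes of 8, nothing else.
sample_row = {f"score_{s:02d}_count": "0" for s in range(1, 11)}
sample_row["score_10_count"] = "10"
sample_row["score_08_count"] = "10"
print(mean_score_from_counts(sample_row))
```

The counts are read as strings and converted with `int` because that is how `csv.DictReader` would hand them over.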
user.csv:
- user_id: The id of the user
- user_url: The url of the user on myanimelist
- last_online_date: Datetime of the last time the user logged into myanimelist.net
- num_watching: Number of animes the user is watching
- num_completed: Number of animes the user completed
- num_on_hold: Number of animes the user has on hold
- num_dropped: Number of animes the user has dropped
- num_plan_to_watch: Number of animes the user plans to watch
- num_days: Number of days the user has spent watching anime
- mean_score: Mean score the user has given to animes
- clubs: List of MAL clubs the user is a member of
user_anime000000000000.csv to user_anime00000000069.csv
Files contain relationships between users and animes:
- user_id: The id of the user
- anime_id: The id of the anime
- favorite: 0 or 1 depending on whether anime_id is a favorite anime of user_id
- review_id: Id of the review if user_id reviewed anime_id
- review_date: Date the review was made
- review_num_useful: Number of users that found the review useful
- review_score: Overall score for the anime given in the review
- review_story_score: Story score for the anime given in the review
- review_animation_score: Animation score for the anime given in the review
- review_sound_score: Sound score for the anime given in the review
- review_character_score: Character score for the anime given in the review
- review_enjoyment_score: Enjoyment score for the anime given in the review
- score: Score the user has given to the anime (the user does not need to have written a review)
- status: The user's status for the anime ("completed", "watching", "plan_to_watch", "dropped" or "on_hold")
- progress: Number of episodes the user has watched
- last_interaction_date: Last datetime the user interacted with this anime
anime_anime.csv
File contains relationships between pairs of animes:
- animeA: The id of the first anime
- animeB: The id of the second anime
- recommendation: 0 or 1 depending on whether animeB is a recommendation of animeA
- recommendation_url: Url of the recommendation if animeB is a recommendation of animeA
- num_recommenders: Number of users that recommend animeB for animeA
- related: 0 or 1 depending on whether animeB is related to animeA
- relation_type: The type of relation between related animes (Sequel, Prequel etc.)
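A minimal sketch of reading the anime_anime.csv schema with the standard `csv` module follows. It assumes a comma-delimited file with a header row and 0/1 flags stored as text, none of which is confirmed by the description; the three sample rows and the helper `recommendations_for` are invented for illustration.

```python
import csv
import io

# Made-up rows that follow the column names listed in the schema.
SAMPLE = """animeA,animeB,recommendation,recommendation_url,num_recommenders,related,relation_type
1,2,1,https://example.com/rec/1-2,5,0,
1,3,0,,0,1,Sequel
2,3,1,https://example.com/rec/2-3,12,1,Prequel
"""


def recommendations_for(rows, anime_id):
    """Return (animeB, num_recommenders) pairs for anime_id, most recommended first."""
    pairs = [
        (int(r["animeB"]), int(r["num_recommenders"]))
        for r in rows
        if int(r["animeA"]) == anime_id and r["recommendation"] == "1"
    ]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(recommendations_for(rows, 1))
print(sum(1 for r in rows if r["related"] == "1"), "related pairs")
```

For the real file, `io.StringIO(SAMPLE)` would be replaced by an opened file handle.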
user_...
This dataset contains information about various anime series and movies, fetched via the Anime DB API. Each entry includes essential details like:
Title
Ranking
Genres
Number of episodes
Status (e.g., Completed, Ongoing)
Synopsis
This dataset contains detailed information about various anime series and movies, collected from the Anime DB API. It includes key metadata for each title, making it suitable for a wide range of analytical and machine learning projects.
Dataset Columns:
_id: Unique identifier for each anime entry
title: Main title of the anime
alternativeTitles: Other known titles (e.g., Japanese, English)
ranking: Popularity or performance ranking of the anime
genres: List of genres the anime falls under
episodes: Total number of episodes
hasEpisode: Boolean indicating if episode details are available
hasRanking: Boolean indicating if ranking data is present
image: URL to the anime’s official image
link: Direct link to more information about the anime
status: Current airing status (e.g., Completed, Ongoing)
type: Format of the anime (e.g., TV, Movie, OVA)
genre_requested: The genre used when fetching this entry via the API
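To show how the `genres` column might feed genre-based exploration, the sketch below counts genre occurrences across entries. It assumes each record is a dictionary whose `genres` value is a list of strings; the description does not say how the list is serialised, and the three records and the helper `genre_counts` are made up.

```python
from collections import Counter

# Made-up records using column names from the list above.
records = [
    {"title": "Example A", "genres": ["Action", "Fantasy"], "type": "TV"},
    {"title": "Example B", "genres": ["Action"], "type": "Movie"},
    {"title": "Example C", "genres": [], "type": "OVA"},
]


def genre_counts(entries):
    """Count how many entries list each genre."""
    counter = Counter()
    for entry in entries:
        counter.update(entry.get("genres") or [])
    return counter


print(genre_counts(records).most_common())
```

If the column turns out to be stored as a single delimited string, it would need to be split into a list before counting.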
This dataset is ideal for:
Building anime recommendation systems
Performing genre-based exploration
Analyzing popularity trends
Creating interactive dashboards
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Context: This dataset provides comprehensive details about anime, catering to enthusiasts, researchers, and analysts who are interested in understanding trends and patterns in anime content. It includes essential information such as anime names, ratings, genres, popularity, and user recommendations.
Source: The data has been curated from MyAnimeList.
Inspiration: Anime has become a global phenomenon, transcending cultural barriers and gaining widespread popularity. This dataset is inspired by the need to analyze:
Key Features:
- Name & English Name: Includes both the original and English-translated titles of anime.
- Image Source: Links to visual representations of each anime.
- Synopsis: A brief description or storyline for each anime.
- Rating & Ranked by Users: Quantitative ratings and the number of users contributing to those ratings.
- Popularity & Rank: Indicators of an anime's popularity in the community.
- Producers, Studios, and Genres: Information about production houses, studios, and genre classifications.
- Themes & Demographics: Target audience and recurring themes across anime.
- Release Time & Episodes: Information on when the anime aired and its duration.
- Anime Recommendations: Suggested similar anime based on user feedback and algorithms.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The PACS dataset is a collection of 10,000 images of 4 different domains: Photo, Art Painting, Cartoon, and Sketch. Each domain contains 2,500 images. The dataset is divided into a training set of 8,000 images and a test set of 2,000 images.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset offers a comprehensive overview of the top animes of 2024 and is useful for building recommendation systems, visualizing trends in anime popularity and score, predicting scores and popularity, and similar tasks.
The dataset contains 22 features:
All of the information in this dataset has been gathered by scraping the MyAnimeList website, and is available under the Creative Commons License.
Cover Photo by: Playground.ai
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains test and training image data for results presented in: Bredies K., Chenchene E., Hosseini A. A hybrid proximal generalized conditional gradient method and application to total variation parameter learning. 2022. https://arxiv.org/abs/2211.00997 Also, see the GitHub repository for the implementation. Training and test sets have been downloaded from Pixabay with permission.
https://choosealicense.com/licenses/unknown/
Dataset Card for PACS
PACS is an image dataset for domain generalization. It consists of four domains, namely Photo (1,670 images), Art Painting (2,048 images), Cartoon (2,344 images), and Sketch (3,929 images). Each domain contains seven categories (labels): Dog, Elephant, Giraffe, Guitar, Horse, House, and Person. The total number of samples is 9,991.
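The stated total can be checked against the per-domain counts with a few lines of arithmetic. The numbers below are copied from the description; the dictionary and variable names are only illustrative.

```python
# Per-domain image counts as stated in the dataset card.
domain_sizes = {
    "Photo": 1670,
    "Art Painting": 2048,
    "Cartoon": 2344,
    "Sketch": 3929,
}

total = sum(domain_sizes.values())
print(total)  # matches the stated total of 9991

# Share of each domain, useful for spotting the imbalance between domains.
shares = {name: round(100 * n / total, 1) for name, n in domain_sizes.items()}
largest = max(domain_sizes, key=domain_sizes.get)
print(largest, shares[largest])
```

The domains are clearly unbalanced, with Sketch contributing more than twice as many images as Photo.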
Dataset Details
PACS DG dataset is created by intersecting the classes found in Caltech256 (Photo), Sketchy (Photo, Sketch)… See the full description on the dataset page: https://huggingface.co/datasets/flwrlabs/pacs.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Takayoshi Makabe
Released under CC0: Public Domain
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was originally created for my pet project: Ficbot.
Anime character information is easily accessible and often includes high-quality illustrations in a consistent anime art style, making it a valuable resource for various AI-driven generation tasks—such as image → name prediction and name → bio generation.
The dataset is provided in anime_characters.csv, containing openly available character data from MyAnimeList.
✔ English Name (if available)
✔ Source Language Name (Japanese, Chinese, Korean, etc.)
✔ Character Bio (if available)
✔ Character Link (direct MyAnimeList page)
✔ Image Link (from MyAnimeList, if available)
✔ Unique Image Name (hashed using ImageHash, hash_size=12 with average hashing)
⚠ Note: Some characters may share the same name, such as historical figures (e.g., Date Masamune).
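The idea behind the average-hash image names can be sketched in pure Python: each pixel of a small grayscale thumbnail becomes one bit, set when the pixel is brighter than the thumbnail's mean. This is an illustration of the technique only, not the ImageHash library's code, and its output is not guaranteed to match the names in the dataset; it assumes the image has already been reduced to a 12x12 grayscale grid, and `average_hash_hex` is a hypothetical helper.

```python
def average_hash_hex(gray_grid):
    """Return a hex string with one bit per pixel: 1 if brighter than the mean.

    gray_grid is a list of rows of grayscale values (0-255), already
    downscaled to hash_size x hash_size.
    """
    pixels = [value for row in gray_grid for value in row]
    mean = sum(pixels) / len(pixels)
    bits = "".join("1" if value > mean else "0" for value in pixels)
    width = (len(bits) + 3) // 4  # number of hex digits needed
    return f"{int(bits, 2):0{width}x}"


# Made-up 12x12 thumbnail: bright top half, dark bottom half.
hash_size = 12
grid = [[255] * hash_size for _ in range(6)] + [[0] * hash_size for _ in range(6)]
digest = average_hash_hex(grid)
print(len(digest), digest)
```

With hash_size=12 the hash has 144 bits, so the hex string is 36 characters long; visually similar images tend to produce the same or nearby hashes.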
This dataset was collected using Selenium and the Jikan API via a public Python wrapper.
As of February 26, 2025, Jikan remains the only publicly available API that supports the Character endpoint for MyAnimeList. The official MyAnimeList API does not yet support character data, and thus, was not used in this dataset's creation.
This dataset is released under the CC0 license, meaning you are free to use it without restrictions. However, if you find it useful, I’d appreciate a citation or mention in your work.
Kirill Nikolaev
MAL Character Dataset. Kaggle, 2025.
DOI: 10.34740/kaggle/ds/1906340