License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The original Insects dataset was created by the National Museum of Natural History, Paris (https://www.mnhn.fr/fr). It has more than 290,000 images in different sizes and orientations. The dataset has hierarchical classes, listed from top to bottom as Order, Super-Family, Family, and Taxa. Each image shows an insect in its natural environment or habitat, i.e., either on a flower or near vegetation. The images were collected by researchers and hundreds of volunteers from the SPIPOLL Science project (https://www.spipoll.org/) and uploaded to a centralized server through the SPIPOLL website, the Android application, or the iOS application. The preprocessed Insects dataset is prepared from the original Insects dataset by cropping each image from either side to make it square, then resizing the cropped image to 128x128 using OpenCV with an anti-aliasing filter.
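The crop-then-resize step described above can be sketched roughly as follows. This is a minimal illustration, not the Meta-Album preprocessing code: the function names are hypothetical, the crop is assumed to be centered, and `cv2.INTER_AREA` is an assumed stand-in for the anti-aliasing filter mentioned in the description.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


def crop_and_resize(image, size=128):
    """Crop a NumPy image (H x W x C) to a centered square, then resize it."""
    import cv2  # imported lazily so the crop helper works without OpenCV

    height, width = image.shape[:2]
    left, top, right, bottom = center_crop_box(width, height)
    square = image[top:bottom, left:right]
    # INTER_AREA is an assumption; it is a common choice when downsampling
    # because it averages source pixels instead of skipping them.
    return cv2.resize(square, (size, size), interpolation=cv2.INTER_AREA)
```

For a 300x200 landscape image, `center_crop_box(300, 200)` trims 50 pixels from the left and right edges, leaving a 200x200 square that is then resized.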
Sample image: https://meta-album.github.io/assets/img/samples/INS.png
Meta Album ID: SM_AM.INS
Meta Album URL: https://meta-album.github.io/datasets/INS.html
Domain ID: SM_AM
Domain Name: Small Animals
Dataset ID: INS
Dataset Name: Insects
Short Description: Insects dataset from Science Project SPIPOLL
# Classes: 117
# Images: 170506
Keywords: insects, ecology
Data Format: images
Image size: 128x128
License (original data release): CC BY-NC 2.0
License URL (original data release): https://www.spipoll.org/mentions-legales
License (Meta-Album data release): CC BY-NC 2.0
License URL (Meta-Album data release): https://creativecommons.org/licenses/by-nc/2.0/
Source: SPIPOLL; National Museum of Natural History, Paris
Source URL: https://www.spipoll.org/
Original Author: Gregoire Lois, Colin Fontaine, Jean-Francois Julien
Original contact: contact@spipoll.org
Meta Album author: Ihsan Ullah
Created Date: 01 March 2022
Contact Name: Ihsan Ullah
Contact Email: meta-album@chalearn.org
Contact URL: https://meta-album.github.io/
@article{insects,
title={Data quality and participant engagement in citizen science: comparing two approaches for monitoring pollinators in France and South Korea},
author={Serret, Hortense and Deguines, Nicolas and Jang, Yikweon and Lois, Gregoire and Julliard, Romain},
journal={Citizen Science: Theory and Practice},
volume={4},
number={1},
pages={22},
year={2019}
}
@inproceedings{meta-album-2022,
title={Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification},
author={Ullah, Ihsan and Carrion, Dustin and Escalera, Sergio and Guyon, Isabelle M and Huisman, Mike and Mohr, Felix and van Rijn, Jan N and Sun, Haozhe and Vanschoren, Joaquin and Vu, Phan Anh},
booktitle={Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
url = {https://meta-album.github.io/},
year = {2022}
}
For more information on the Meta-Album dataset, please see the [NeurIPS 2022 paper]
For details on the dataset preprocessing, please see the [supplementary materials]
Supporting code can be found on our [GitHub repo]
Meta-Album on Papers with Code [Meta-Album]
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
I scraped this data from the Spotify API by searching for bands by genre, then cycling through those bands and saving the image from each album art URL. I put it together as my first lesson project for Fast.ai's Practical Deep Learning for Coders.
I have gone for high-level sub-genre classes (e.g. for these purposes I am not interested in whether Blind Guardian were influenced by Thrash Metal, or whether Fleshgod Apocalypse's use of orchestral elements might make them "Symphonic Death Metal"). There are also a lot of exclusions, either because a band was labelled incorrectly or because the first band that came up in Spotify's API query was clearly a totally different band.
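As a rough sketch of the "cycle through bands, keep the cover URL, skip the exclusions" idea, the helpers below work on plain dictionaries. Everything here is an assumption for illustration: the function names are hypothetical, the album dictionaries are only assumed to carry an `images` list with `url` and `width` keys (similar to how album objects from the Spotify Web API are shaped), and no network call is made. The actual scraper lives in the author's notebook.

```python
def largest_cover_url(album):
    """Pick the URL of the widest image in an album dict, or None."""
    images = album.get("images") or []
    if not images:
        return None
    return max(images, key=lambda img: img.get("width") or 0)["url"]


def collect_cover_urls(albums_by_band, excluded_bands=frozenset()):
    """Map each non-excluded band to the cover URLs of its albums."""
    covers = {}
    for band, albums in albums_by_band.items():
        if band in excluded_bands:
            continue  # skip bands flagged as mislabelled or wrong matches
        urls = [largest_cover_url(album) for album in albums]
        covers[band] = [url for url in urls if url is not None]
    return covers


# Made-up sample data standing in for API responses.
SAMPLE_ALBUMS = {
    "Band A": [
        {"images": [
            {"url": "https://example.com/a_small.jpg", "width": 64},
            {"url": "https://example.com/a_large.jpg", "width": 640},
        ]},
        {"images": []},  # album without artwork
    ],
    "Wrong Match": [
        {"images": [{"url": "https://example.com/w.jpg", "width": 300}]},
    ],
}
```

Calling `collect_cover_urls(SAMPLE_ALBUMS, excluded_bands={"Wrong Match"})` keeps only the large cover for "Band A" and drops the excluded band entirely.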
This is just working as a proof of concept for now; then I'll add in Thrash Metal and Death Metal. Feel free to fork or improve the web scraper I've got in the notebook here: https://github.com/fraserwat/metal-album-subgenre-classifier
Once I've done that I won't be maintaining this, as it's just a fun little project to get me used to basic NN stuff.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was gathered in order to train a GAN to be able to generate heavy metal artwork.
In each row you'll find basic artist info (artist name, country, status, main and second subgenres), an album name and album artwork URL. Note that since the artist info and the albums list were gathered from different sources, there are a lot of empty values in the artist info fields.
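Because many artist fields are empty, it can help to count the gaps per column before training on the data. The sketch below does this with the standard library on an in-memory sample; the column names and rows are hypothetical placeholders, not the dataset's actual header.

```python
import csv
import io
from collections import Counter


def count_empty_fields(csv_text):
    """Count empty or missing values per column in CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    empty = Counter()
    for row in reader:
        for column, value in row.items():
            if column is None:
                continue  # extra cells beyond the header are ignored
            if value is None or not value.strip():
                empty[column] += 1
    return dict(empty)


# Hypothetical sample; the real file's header may differ.
SAMPLE_CSV = (
    "artist,country,status,album,artwork_url\n"
    "Band A,Finland,Active,First,https://example.com/a.jpg\n"
    "Band B,,,Second,https://example.com/b.jpg\n"
)
```

On this sample, `count_empty_fields(SAMPLE_CSV)` reports one empty value each for `country` and `status`, which is the kind of gap the description warns about.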
To build this dataset, I used:
* https://pypi.org/project/metalparser/
* https://pypi.org/project/coverpy/
* https://github.com/jonchar/ma-scraper
The header photo is a sample taken during training of the GAN.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Nightwish is a symphonic power metal band from Finland and one of the biggest names in the European metal scene. Since 1996, they have released 9 studio albums and numerous singles, EPs, and live albums. The most recent album, Human. :II: Nature., just came out this April. Throughout their career, Nightwish has explored many different themes, ranging from love and sorrow to science and nature. It might be super interesting to look at how the lyrical themes of Nightwish have evolved over the past 24 years.
I scraped all the lyrics of Nightwish from the metal lyrics archive Dark Lyrics using GitHub user medakk's script.
The raw data scraped from Dark Lyrics is stored in a plain text file (nightwish_lyrics.txt). To increase data usability, I extracted the following information and stored it in a CSV file (nightwish_lyrics.csv):
You can use the cleaned CSV file or start from the plain text file and do your own text mining!
I was inspired by the Taylor Swift lyrics dataset and thought we could perform similar analyses on Nightwish. Some example questions to explore: