2 datasets found

Corpus Nummorum - Object Detection Coin Dataset
zenodo.org
zip
Updated Sep 13, 2024
Cite
Zenodo (2024). Corpus Nummorum - Object Detection Coin Dataset [Dataset]. http://doi.org/10.5281/zenodo.13748799
Unique identifier
https://doi.org/10.5281/zenodo.13748799
Dataset updated
Sep 13, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Description
This object detection dataset is a collection of ancient coin images from three sources: the Corpus Nummorum (CN) project, the Münzkabinett Berlin and the Bibliothèque nationale de France, Département des Monnaies, médailles et antiques. It covers Greek and Roman coins from ancient Thrace, Moesia Inferior, Troad and Mysia. For copyright reasons, it includes only a selection of the coins published on the CN portal.

This dataset contains 506 classes with about 179,000 coin images (approx. 29,000 unique coins). The classes come from four categories: persons, objects, animals and plants. The coin images were assigned to the classes using our NLP pipeline: Named Entity Recognition and Relation Extraction were performed on every coin's description (separately for obverse and reverse), and each coin image belonging to that description was then copied to the folder of every predicted class. A coin image can therefore appear in several classes. The file name contains both the coin ID and the coin type from the CN database, and the suffix obv or rev indicates whether the image shows the obverse or the reverse. A "sources" CSV file lists the source of every image. For copyright reasons, the image size is limited to 299×299 pixels, which should be sufficient for most ML approaches.
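
Because each image is copied into the folder of every class predicted for it, the folder layout can be read back as a multi-label index. The following is a minimal sketch, assuming one subfolder per class directly under the dataset root and a file stem that ends in obv or rev; the function names and the example file names are hypothetical, and the exact naming scheme should be checked against the downloaded archive.

```python
from collections import defaultdict
from pathlib import Path


def build_multilabel_index(root):
    """Map each image file name to the sorted list of class folders that contain it.

    Assumes one subfolder per class directly under `root` (an assumption about
    the archive layout). Files lying directly in `root` are skipped.
    """
    labels = defaultdict(set)
    for class_dir in Path(root).iterdir():
        if not class_dir.is_dir():
            continue
        for image_path in class_dir.iterdir():
            if image_path.is_file():
                labels[image_path.name].add(class_dir.name)
    return {name: sorted(classes) for name, classes in labels.items()}


def coin_side(file_name):
    """Return 'obv' or 'rev' if the file stem ends with that suffix, else None."""
    stem = Path(file_name).stem.lower()
    for side in ("obv", "rev"):
        if stem.endswith(side):
            return side
    return None
```

An image that appears in two class folders ends up with two labels in the index, which is the input a multi-label classifier would need.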

Because the individual entities occur with very different frequencies, the dataset is not balanced. In addition, a class can contain very different representations of the same entity, so some classes can be difficult to train. Unfortunately, we cannot provide any annotations for the dataset.
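
One common way to compensate for such an imbalance is to weight each class inversely to its frequency in the loss function. The description does not say how, or whether, the authors handle the imbalance, so the sketch below is only an illustration of that general heuristic; the function name and the example counts are hypothetical.

```python
def inverse_frequency_weights(class_counts):
    """Return a weight per class: total / (number of classes * class count).

    Rare classes get weights above 1, frequent classes below 1. Every count
    must be positive, which holds when the counts come from non-empty folders.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        name: total / (n_classes * count)
        for name, count in class_counts.items()
    }


# Hypothetical counts, not taken from the dataset.
weights = inverse_frequency_weights({"eagle": 90, "torch": 10})
```

With these example counts the rarer class receives the larger weight, so its errors contribute more to a weighted loss.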

During the summer semester 2024, we held the "Data Challenge" event at the Department of Computer Science at Goethe University. Our students could choose between the Object Detection dataset and a Natural Language dataset as their challenge. One team opted for the Object Detection challenge. We gave them this dataset with the task of trying out their own ideas. Here are their results:

Multilabel Classification as Backbone for Object Detection

Now we would like to invite you to try out your own ideas and models on our coin data.

If you have any questions or suggestions, please feel free to contact us.
Corpus Nummorum - Coin Image Dataset
zenodo.org
data.niaid.nih.gov
zip
Updated Nov 7, 2023
Cite
Corpus_Nummorum; Corpus_Nummorum (2023). Corpus Nummorum - Coin Image Dataset [Dataset]. http://doi.org/10.5281/zenodo.10033993
Unique identifier
https://doi.org/10.5281/zenodo.10033993
Dataset updated
Nov 7, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Corpus_Nummorum; Corpus_Nummorum
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Corpus Nummorum - Coin Image Dataset
This dataset is a collection of ancient coin images from three sources: the Corpus Nummorum (CN) project, the Münzkabinett Berlin and the Bibliothèque nationale de France, Département des Monnaies, médailles et antiques. It covers Greek and Roman coins from ancient Thrace, Moesia Inferior, Troad and Mysia. For copyright reasons, it includes only a selection of the coins published on the CN portal.
The dataset contains 115,160 images of about 29,000 unique coins. The images are split into three main folders, each of which organizes the coins differently using subfolders that hold the coin images. The "dataset_coins" folder contains the coin photos divided into obverse and reverse and arranged by coin type. In the "dataset_types" folder, the obverse and reverse images of each coin are concatenated and transformed to a square format with black bars at the top and bottom; the images are sorted by coin type. The last folder, "dataset_mints", also contains concatenated images, sorted by mint. A "sources" CSV file lists the source of every image. For copyright reasons, the image size is limited to 299×299 pixels, which should be sufficient for most ML approaches.
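The concatenation and padding step can be illustrated on tiny images stored as lists of pixel rows. This is a sketch only: it assumes the two sides are joined side by side (which is consistent with the bars being at the top and bottom) and that both have the same height; the function name is hypothetical, and the authors' actual preprocessing, including any resizing, is not described here.

```python
def concat_and_pad(obverse, reverse, fill=0):
    """Join two equal-height images side by side, then pad to a square.

    Images are lists of rows, each row a list of pixel values. Rows filled
    with `fill` (0 = black) are added at the top and bottom until the height
    equals the width of the joined image.
    """
    if not obverse or len(obverse) != len(reverse):
        raise ValueError("images must be non-empty and have the same height")
    joined = [left + right for left, right in zip(obverse, reverse)]
    height, width = len(joined), len(joined[0])
    if width < height:
        raise ValueError("joined image is taller than wide; bars would go on the sides")
    missing = width - height
    top = missing // 2
    bottom = missing - top
    top_rows = [[fill] * width for _ in range(top)]
    bottom_rows = [[fill] * width for _ in range(bottom)]
    return top_rows + joined + bottom_rows


# Two 2x2 "images" become one 4x4 image with a black row above and below.
square = concat_and_pad([[1, 1], [1, 1]], [[2, 2], [2, 2]])
```

For real photos the same arithmetic applies to the pixel dimensions: two equally sized square sides yield a joined image twice as wide as it is tall, so the bars together take up half of the final height.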
The main purpose of this dataset in the CN project is training machine-learning-based image recognition models. We use three Convolutional Neural Network architectures: VGG16, VGG19 and ResNet50. On this dataset, our best model (VGG16) achieves 79% Top-1 and 97% Top-5 accuracy for coin type recognition; mint recognition achieves 79% Top-1 and 94% Top-5 accuracy. A Colab notebook with two models (trained on the whole CN dataset) is available online.
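For readers unfamiliar with the metrics: Top-1 accuracy counts a prediction as correct only if the highest-scoring class is the true class, while Top-5 accuracy counts it as correct if the true class is among the five highest-scoring classes. A minimal sketch of the computation follows; the function name and the toy scores are hypothetical and unrelated to the reported results.

```python
def top_k_accuracy(scores, targets, k):
    """Fraction of samples whose true class index is among the k highest scores.

    `scores` is a list of per-class score lists, `targets` a list of true
    class indices, one per sample.
    """
    hits = 0
    for row, target in zip(scores, targets):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(targets)


# Toy example with three samples and three classes.
toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
toy_targets = [1, 1, 0]
top1 = top_k_accuracy(toy_scores, toy_targets, k=1)
top2 = top_k_accuracy(toy_scores, toy_targets, k=2)
```

Top-k accuracy can only stay equal or rise as k grows, which is why the Top-5 figures are higher than the Top-1 figures.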
During the summer semester 2023, we held the "Data Challenge" event at the Department of Computer Science at Goethe University. We gave our students this dataset with the task of achieving better results than ours. Here are their experiments:
Team 1: Voting and stacking of models
Team 2: Multimodal model
Team 3: Transformer models
Team 4: Dockerized TIMM Computer Vision Backend & FastAPI
Approach | Type Dataset | Mint Dataset
Ours | 79% | 79%
Team 1 | - | 86%
Team 2 | 86% | -
Team 3 | 88% | 58%
Team 4 | - | -

Now we would like to invite you to try out your own ideas and models on our coin data.
If you have any questions or suggestions, please feel free to contact us.