3 datasets found
  1. Corpus Nummorum - Coin Image Dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Nov 7, 2023
    Cite
    Corpus_Nummorum (2023). Corpus Nummorum - Coin Image Dataset [Dataset]. http://doi.org/10.5281/zenodo.10033993
    Available download formats: zip
    Dataset updated
    Nov 7, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Corpus_Nummorum
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Corpus Nummorum - Coin Image Dataset

    This dataset is a collection of ancient coin images from three different sources: the Corpus Nummorum (CN) project, the Münzkabinett Berlin and the Bibliothèque nationale de France, Département des Monnaies, médailles et antiques. It covers Greek and Roman coins from ancient Thrace, Moesia Inferior, Troad and Mysia. Because of copyright restrictions, it includes only a selection of the coins published on the CN portal.

    The dataset contains 115,160 images of about 29,000 unique coins. The images are split into three main folders, each of which assigns the coins differently. Each main folder is organized into subfolders that hold the coin images. The "dataset_coins" folder contains the coin photos divided into obverse and reverse and arranged by coin type. In the "dataset_types" folder, the obverse and reverse images of each coin are concatenated and transformed to a square format with black bars on the top and bottom; the images there are sorted by coin type. The last folder, "dataset_mints", contains the same kind of concatenated images sorted by mint. A "sources" CSV file holds the source of every image. Due to copyright restrictions, the image size is limited to 299*299 pixels. However, this should be sufficient for most ML approaches.
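As a first look at one of the main folders described above, a class-balance count can be useful before training. The following is a minimal sketch, assuming each direct subfolder of a main folder (for example "dataset_types") represents one class and that the images use common suffixes such as .jpg or .png. The function name, the suffix set and the exact nesting are assumptions for illustration, not something the dataset page specifies.

```python
from pathlib import Path

# Assumed image file types; adjust to whatever the archive actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Map each direct subfolder of `root` to the number of image files inside it.

    Files are counted recursively, so nested layouts (e.g. obverse/reverse
    subfolders) are included in the count of their top-level class folder.
    """
    counts = {}
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `count_images_per_class("dataset_types")` on an extracted copy would return a dictionary of class name to image count, which makes heavily under-represented classes easy to spot.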

    The main purpose of this dataset in the CN project is to train machine-learning-based image recognition models. We use three different Convolutional Neural Network architectures: VGG16, VGG19 and ResNet50. Our best model (VGG16) achieves a 79% Top-1 and a 97% Top-5 accuracy on this dataset for coin type recognition. Mint recognition achieves a 79% Top-1 and a 94% Top-5 accuracy. We have a Colab notebook with two models (trained on the whole CN dataset) online.
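For readers unfamiliar with the Top-1 and Top-5 figures quoted above, the metric can be sketched in a few lines. This is a generic illustration in plain Python, not the CN project's evaluation code; the function name and the input layout (one list of class scores per sample, integer class indices as labels) are assumptions.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-sample score lists (one score per class).
    labels: list of true class indices, same length as `scores`.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k classes with the highest scores for this sample.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)
```

With k=1 this is ordinary classification accuracy; with k=5 a prediction counts as correct whenever the true class appears anywhere in the model's five best guesses, which is why Top-5 values are never lower than Top-1 values.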

    During the summer semester 2023, we held the "Data Challenge" event at our Department of Computer Science at the Goethe-University. We gave our students this dataset and tasked them with achieving better results than ours. Here are their experiments:

    Team 1: Voting and stacking of models

    Team 2: Multimodal model

    Team 3: Transformer models

    Team 4: Dockerized TIMM Computer Vision Backend & FastAPI

    Approach | Type Dataset | Mint Dataset
    Ours     | 79%          | 79%
    Team 1   | -            | 86%
    Team 2   | 86%          | -
    Team 3   | 88%          | 58%
    Team 4   | -            | -
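The page gives no detail on how Team 1 combined its models. Purely as an illustration of what the "voting" half of "voting and stacking" can mean, here is a minimal hard-voting sketch: each model predicts one label per sample and the most frequent label wins. The function name and the tie-breaking rule (the earliest model's vote among the tied labels) are assumptions made for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by hard (majority) voting.

    predictions: list of label lists, one list per model, all the same length.
    Returns one voted label per sample.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all models must predict the same number of samples")
    voted = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        best = max(counts.values())
        # Tie-break: take the first vote, in model order, that has the top count.
        voted.append(next(v for v in sample_votes if counts[v] == best))
    return voted
```

Stacking differs in that a second model is trained on the base models' outputs instead of applying a fixed rule like this one.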

    Now we would like to invite you to try out your own ideas and models on our coin data.

    If you have any questions or suggestions, please feel free to contact us.

  2. Keras video classification example with a subset of UCF101 - Action...

    • data.niaid.nih.gov
    Updated May 11, 2023
    Cite
    Mikolaj Buchwald (2023). Keras video classification example with a subset of UCF101 - Action Recognition Data Set (top 10 videos) [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_7882860
    Dataset updated
    May 11, 2023
    Dataset authored and provided by
    Mikolaj Buchwald
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Classify video clips with natural scenes of actions performed by people visible in the videos.

    See the UCF101 Dataset web page: https://www.crcv.ucf.edu/data/UCF101.php#Results_on_UCF101

    This example dataset consists of the 10 most numerous video classes from the UCF101 dataset. For the top 5 version, see: https://doi.org/10.5281/zenodo.7924745 .

    Based on this code: https://keras.io/examples/vision/video_classification/ (the example needs to be updated, if it has not been already; see the issue: https://github.com/keras-team/keras-io/issues/1342).

    Testing if data can be downloaded from figshare with wget, see: https://github.com/mojaveazure/angsd-wrapper/issues/10

    For generating the subset, see this notebook: https://colab.research.google.com/github/sayakpaul/Action-Recognition-in-TensorFlow/blob/main/Data_Preparation_UCF101.ipynb -- however, it also needs to be adjusted, if it has not been already (once it is, I will post a link to the notebook here or elsewhere, e.g., in the corrected notebook with the Keras example).

    I would like to thank Sayak Paul for contacting me about his example in the Keras documentation being out of date.

    Cite this dataset as:

    Soomro, K., Zamir, A. R., & Shah, M. (2012). UCF101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402. https://doi.org/10.48550/arXiv.1212.0402

    To download the dataset via the command line, please use:

    wget -q https://zenodo.org/record/7882861/files/ucf101_top10.tar.gz -O ucf101_top10.tar.gz
    tar xf ucf101_top10.tar.gz
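For environments without wget or tar (some notebooks, for example), the same two steps can be sketched with the Python standard library. This is a rough equivalent rather than the author's method: the function names are hypothetical, and the URL in the usage note is simply the one from the command above.

```python
import tarfile
import urllib.request
from pathlib import Path


def download(url, target):
    """Download `url` to the local file `target` (rough equivalent of wget -O)."""
    target = Path(target)
    urllib.request.urlretrieve(url, str(target))
    return target


def extract(archive, dest="."):
    """Unpack a tar archive into `dest` (rough equivalent of tar xf)."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:  # compression is detected automatically
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as absolute paths.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return dest
```

Usage would be `extract(download("https://zenodo.org/record/7882861/files/ucf101_top10.tar.gz", "ucf101_top10.tar.gz"))`, which unpacks the archive into the current directory.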

  3. Data from: ReaLSAT, a global dataset of reservoir and lake surface area...

    • zenodo.org
    bin, html, zip
    Updated Feb 7, 2023
    Cite
    Ankush Khandelwal; Anuj Karpatne; Zhihao Wei; Rahul Ghosh; Hilary Dugan; Paul Hanson; Vipin Kumar (2023). ReaLSAT, a global dataset of reservoir and lake surface area variations [Dataset]. http://doi.org/10.5281/zenodo.6344848
    Available download formats: zip, bin, html
    Dataset updated
    Feb 7, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ankush Khandelwal; Anuj Karpatne; Zhihao Wei; Rahul Ghosh; Hilary Dugan; Paul Hanson; Vipin Kumar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Reservoir and Lake Surface Area Timeseries (ReaLSAT) dataset provides an unprecedented reconstruction of surface area variations of lakes and reservoirs at a global scale using Earth Observation (EO) data and novel machine learning techniques. It provides monthly surface area variations (1984 to 2015) for 683,734 water bodies below 50°N with sizes between 0.1 and 100 square kilometers.

    The dataset contains the following files:

    1) ReaLSAT.zip: A shapefile that contains the reference shape of waterbodies in the dataset.

    2) monthly_timeseries.zip: contains one CSV file for each water body. Each CSV file provides monthly surface area variation values. The CSV files are stored in subfolders, one per 10 degree by 10 degree cell. For example, the monthly_timeseries_60_-50 folder contains CSV files of lakes that lie between 60°E and 70°E longitude and between 50°S and 40°S latitude.

    3) monthly_shapes_

    4) ReaLSAT.html: a readme python notebook that provides information about reading and visualizing the dataset. The notebook also contains the code to download the data to reduce the overhead of downloading each file manually.

    5) evaluation_data.zip: contains the random subsets of the dataset used for evaluation. The zip file contains a README file that describes the evaluation data.

    6) generate_realsat_timeseries.ipynb: a Google Colab notebook that provides the code to generate timeseries and surface extent maps for any waterbody.
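The 10 degree by 10 degree cell layout described in item 2 suggests a simple way to find the subfolder for a given lake location. The sketch below assumes the folder name encodes the cell's western longitude edge followed by its southern latitude edge, which matches the single example given (monthly_timeseries_60_-50 for 60°E to 70°E, 50°S to 40°S). That reading, the function name and its parameters are assumptions; the ReaLSAT readme notebook remains the authoritative reference.

```python
import math


def cell_folder_name(lon, lat, cell_size=10, prefix="monthly_timeseries"):
    """Return the assumed subfolder name for a point given in decimal degrees.

    lon: longitude, east positive. lat: latitude, north positive.
    The cell is identified by its lower-left (south-west) corner.
    """
    lon0 = math.floor(lon / cell_size) * cell_size
    lat0 = math.floor(lat / cell_size) * cell_size
    return f"{prefix}_{lon0}_{lat0}"
```

Using `math.floor` rather than integer truncation matters for negative coordinates: a latitude of -45.7 must map to the cell starting at -50, not -40.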

    Please refer to the following papers to learn more about the processing pipeline used to create ReaLSAT dataset:

    [1] Khandelwal, Ankush, Rahul Ghosh, Zhihao Wei, Huangying Kuang, Hilary Dugan, Paul Hanson, Anuj Karpatne, and Vipin Kumar. "ReaLSAT: A new Reservoir and Lake Surface Area Timeseries Dataset created using machine learning and satellite imagery." (2020).

    [2] Khandelwal, Ankush. "ORBIT (Ordering Based Information Transfer): A Physics Guided Machine Learning Framework to Monitor the Dynamics of Water Bodies at a Global Scale." (2019).

    Version Updates

    Version 1.3: fixed a visualization-related bug in generate_realsat_timeseries.ipynb

    Version 1.2: added a Google Colab notebook that provides the code to generate timeseries and surface extent maps for any waterbody in the ReaLSAT database.

