The PlantVillage dataset consists of 54303 healthy and unhealthy leaf images divided into 38 categories by species and disease.
NOTE: The original dataset is not available from the original source (plantvillage.org), so we obtained the unaugmented dataset from a paper that used it and republished it. We also dropped images with the Background_without_leaves label, because these were not present in the original dataset.
Original paper URL: https://arxiv.org/abs/1511.08060
Dataset URL: https://data.mendeley.com/datasets/tywbtsjrjv/1
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('plant_village', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
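A minimal sketch of inspecting the dataset's metadata with the standard tfds API (only the dataset name, split, and the 38-category count are taken from this page; everything else is generic tfds usage):

import tensorflow_datasets as tfds

# Load the dataset together with its metadata (DatasetInfo).
ds, info = tfds.load('plant_village', split='train', with_info=True)

# The 38 species/disease categories are exposed as label names.
print(info.features['label'].num_classes)
print(info.features['label'].names[:5])

# Each example is a dict with an 'image' tensor and an integer 'label'.
for ex in ds.take(1):
  print(ex['image'].shape, int(ex['label']))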
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/plant_village-1.0.2.png
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Title: AI model for artwork verification using brushstroke structure, rhythm, and X-ray analysis. Developed from 10 years of research and refined through AI Natural Matching.
Overview
This dataset presents an AI model developed specifically for artwork authentication. The goal is to determine whether two paintings—one known and one unknown—may have been created by the same artist, based on brushstroke analysis and structural features.
This model demonstrates how machine learning can assist in verifying stylistic consistency using visual data, supported by structural scans such as X-ray imaging. The method was built from human-led comparisons and later refined into a formal system of AI-driven brushstroke analysis.
Method: The 18 Supreme Techniques
The model uses a set of 18 core visual techniques designed to analyze:
Stroke rhythm and torque
Brushstroke flow direction
Angular deviation and energy pattern
Layer buildup and underpainting patterns (via X-ray)
These techniques were developed in Google Colab with AI support, based on over 10 years of image-based manual study. The system is optimized for AI Natural Matching, a visual neural comparison approach that mimics human perception but operates at scale.
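For illustration only, a comparison in this general spirit can be sketched by embedding both paintings with a pretrained vision backbone and scoring cosine similarity between the embeddings; this is an assumption-laden stand-in, not the authors' 18-technique pipeline or the actual AI Natural Matching system, and the file names below are hypothetical:

import numpy as np
import tensorflow as tf

# Assumption: a generic ImageNet backbone stands in for the unpublished matching model.
backbone = tf.keras.applications.MobileNetV2(include_top=False, pooling='avg', weights='imagenet')

def embed(path):
    # Load a painting image and map it to a single feature vector.
    img = tf.keras.utils.load_img(path, target_size=(224, 224))
    x = tf.keras.utils.img_to_array(img)[np.newaxis, ...]
    x = tf.keras.applications.mobilenet_v2.preprocess_input(x)
    return backbone.predict(x, verbose=0)[0]

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical file names for the known reference work and the candidate work.
similarity = cosine(embed('reference_known_work.jpg'), embed('candidate_unknown_work.jpg'))
print('embedding cosine similarity:', similarity)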
Requirements:
High-resolution candidate image
Reference image (known work by artist)
X-ray scan of candidate image
AI Natural Matching only (SSIM is strictly forbidden)
This model must use AI Natural Matching. Do not use SSIM or any structural similarity index.
Dataset Use
This dataset is intended for researchers, AI developers, and art historians who wish to:
Validate or disprove potential authorship of a painting
Test the model using different artist references
Study visual fingerprinting and stylistic consistency
Scientific pigment data (XRF, FTIR, SEM) and aging process validation for The Tree Oil Painting are available in a separate dataset. Cross-checking with physical material data is strongly encouraged.
Licensing and Attribution
All data is licensed under CC BY 4.0 and freely available for academic, research, and AI development use.
Model and research developed by Haronthai Mongbunsri (Independent Researcher, 2015–2025). AI structure refined through collaboration with neural tools via Google Colab.
This dataset is part of an open effort to build transparent, reproducible systems for artwork verification.
This analysis is built upon scientific pigment data, X-ray, and FTIR results hosted on Hugging Face; we strongly recommend reviewing that core dataset to understand the chemical and material basis behind the visual AI analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
It all started during the #StayAtHome period of the 2020 pandemic: some neighbors were worried about trash around Montevideo's garbage containers.
The goal is to automatically tell clean containers from dirty ones in order to request maintenance.
Want to know more about the entire process? Check out this thread on how it began, and this other one on the version 6 update process.
Data is split into independent training and testing sets. However, each split contains several near-duplicate images (typically the same container from different perspectives or on different days). Image sizes vary widely.
There are four major sources:
* Images taken from Google Street View (600x600 pixels), automatically collected through its API.
* Images contributed by individual persons, most of which I took myself.
* Images taken from social networks (Twitter & Facebook) and news.
* Images contributed by pormibarrio.uy - 17-11-2020
Images were taken of green containers, the most popular type in Montevideo and also widely used in some other cities.
Current version (clean-dirty-garbage-containers-V6) is also available here or you can download it as follows:
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1mdfJoOrO6MeTc3eMEjIDkAKlwK9bUFg6' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1mdfJoOrO6MeTc3eMEjIDkAKlwK9bUFg6" -O clean-dirty-garbage-containers-V6.zip && rm -rf /tmp/cookies.txt
This is especially useful if you want to download it in Google Colab.
This repo contains the code used during the dataset's construction and documentation process, including baselines for the proposed tasks.
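A hedged sketch of what a transfer-learning baseline for the clean/dirty task might look like (not the repo's actual code; the directory layout with clean/ and dirty/ subfolders under train/ and test/ is an assumption):

import tensorflow as tf

# Assumption: after unzipping, images are arranged as
#   clean-dirty-garbage-containers-V6/train/{clean,dirty}/... and .../test/{clean,dirty}/...
train_ds = tf.keras.utils.image_dataset_from_directory(
    'clean-dirty-garbage-containers-V6/train', image_size=(224, 224), batch_size=32)
test_ds = tf.keras.utils.image_dataset_from_directory(
    'clean-dirty-garbage-containers-V6/test', image_size=(224, 224), batch_size=32)

# Frozen ImageNet backbone plus a small binary head (clean vs. dirty).
base = tf.keras.applications.MobileNetV2(include_top=False, pooling='avg', weights='imagenet')
base.trainable = False
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNetV2 expects inputs in [-1, 1]
    base,
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(train_ds, validation_data=test_ds, epochs=3)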
Since this is a hot topic in Montevideo, especially nowadays with elections next week, it caught some attention from the local press:
Thanks to every single person who gave me images of their containers. Special thanks to my friend Diego, whose idea of using Google Street View as a data source really helped grow the dataset. And finally to my wife, who supported me during this project and contributed a lot to this dataset.
If you use these data in a publication, presentation, or other research project or product, please use the following citation:
Laguna, Rodrigo. 2021. Clean dirty containers in Montevideo - Version 6.1. url: https://www.kaggle.com/rodrigolaguna/clean-dirty-containers-in-montevideo
@dataset{RLaguna-clean-dirty:2021,
author = {Rodrigo Laguna},
title = {Clean dirty containers in Montevideo},
year = {2021},
url = {https://www.kaggle.com/rodrigolaguna/clean-dirty-containers-in-montevideo},
version = {6.1}
}
I'm on Twitter, @ro_laguna_, or you can write to me at r.laguna.queirolo at outlook.com
12-09-2020: V3 - Include more training (+676) & testing (+64) samples:
21-12-2020: V4 - Include more training (+367) & testing (+794) samples, including ~400...