23 datasets found

Iris Species Dataset and Database
kaggle.com
Updated May 15, 2025
Cite
Ghanshyam Saini (2025). Iris Species Dataset and Database [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/iris-species-dataset-and-database
Dataset updated
May 15, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ghanshyam Saini
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Iris Flower Dataset

This is a classic and very widely used dataset in machine learning and statistics, often serving as a first dataset for classification problems. Introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems," it is a foundational resource for learning classification algorithms.

Overview:

The dataset contains measurements for 150 samples of iris flowers. Each sample belongs to one of three species of iris:

Iris setosa

Iris versicolor

Iris virginica

For each flower, four features were measured:

Sepal length (in cm)

Sepal width (in cm)

Petal length (in cm)

Petal width (in cm)

The goal is typically to build a model that can classify iris flowers into their correct species based on these four features.

File Structure:

The dataset is usually provided as a single CSV (Comma Separated Values) file, often named iris.csv or similar. This file typically contains the following columns:

sepal_length (cm): Numerical. The length of the sepal of the iris flower.

sepal_width (cm): Numerical. The width of the sepal of the iris flower.

petal_length (cm): Numerical. The length of the petal of the iris flower.

petal_width (cm): Numerical. The width of the petal of the iris flower.

species: Categorical. The species of the iris flower (either 'setosa', 'versicolor', or 'virginica'). This is the target variable for classification.

Content of the Data:

The dataset contains an equal number of samples (50) for each of the three iris species. The measurements of the sepal and petal dimensions vary between the species, allowing for their differentiation using machine learning models.

How to Use This Dataset:

Download the iris.csv file.

Load the data using libraries like Pandas in Python.

Explore the data through visualization and statistical analysis to understand the relationships between the features and the different species.

Build classification models (e.g., Logistic Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbors) using the sepal and petal measurements as features and the 'species' column as the target variable.

Evaluate the performance of your model using appropriate metrics (e.g., accuracy, precision, recall, F1-score).

The dataset is small and well-behaved, making it excellent for learning and experimenting with various classification techniques.
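
The steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the dataset author's workflow: the header names (sepal_length, sepal_width, petal_length, petal_width, species) are an assumed cleaned-up form of the columns listed under File Structure, the six inline rows stand in for the downloaded iris.csv, and a 1-nearest-neighbor rule is swapped in for the library classifiers named in the steps.

```python
import csv
import io
import math

# A handful of rows standing in for iris.csv (assumed header names).
SAMPLE_CSV = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
"""

FEATURES = ["sepal_length", "sepal_width", "petal_length", "petal_width"]


def load_rows(handle):
    """Parse CSV text into (feature list, species label) pairs."""
    rows = []
    for record in csv.DictReader(handle):
        features = [float(record[name]) for name in FEATURES]
        rows.append((features, record["species"]))
    return rows


def predict_1nn(train, features):
    """Return the label of the training row closest in Euclidean distance."""
    nearest = min(train, key=lambda row: math.dist(row[0], features))
    return nearest[1]


def accuracy(train, test):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(predict_1nn(train, feats) == label for feats, label in test)
    return hits / len(test)


rows = load_rows(io.StringIO(SAMPLE_CSV))
train, test = rows[0::2], rows[1::2]  # alternate rows: one per species in each half
print(accuracy(train, test))
```

To run this against the real file, replace io.StringIO(SAMPLE_CSV) with open("iris.csv", newline="") and adjust FEATURES to whatever header the downloaded file actually uses.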

Citation:

When using the Iris dataset, it is common to cite Ronald Fisher's original work:

Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188.

Data Contribution:

Thank you for providing this classic and fundamental dataset to the Kaggle community. The Iris dataset remains an invaluable resource for both beginners learning the basics of classification and experienced practitioners testing new algorithms. Its simplicity and clear class separation make it an ideal starting point for many data science projects.

If you find this dataset description helpful and the dataset itself useful for your learning or projects, please consider giving it an upvote after downloading. Your appreciation is valuable!
‘Iris Flower Dataset’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Iris Flower Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-iris-flower-dataset-bb8a/eb51f303/?iid=001-010&v=presentation
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Iris Flower Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/arshid/iris-flower-dataset on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper The use of multiple measurements in taxonomic problems. It is sometimes called Anderson's Iris data set because Edgar Anderson collected the data to quantify the morphologic variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris (Iris Setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters.

This dataset became a typical test case for many statistical classification techniques in machine learning, such as support vector machines.

Content

The dataset contains 150 records with 5 attributes: Petal Length, Petal Width, Sepal Length, Sepal Width, and Class (Species).

Acknowledgements

This dataset is free and is publicly available at the UCI Machine Learning Repository.

--- Original source retains full ownership of the source dataset ---
iris
huggingface.co
Updated Apr 3, 2025
Cite
Bernardo Ronquillo (2025). iris [Dataset]. https://huggingface.co/datasets/brjapon/iris
Dataset updated
Apr 3, 2025
Authors
Bernardo Ronquillo
License
https://choosealicense.com/licenses/cc0-1.0/
Description
Iris Species Dataset

The Iris dataset is a classic dataset in machine learning, originally published by Ronald Fisher. It contains 150 instances of iris flowers, each described by four features (sepal length, sepal width, petal length, and petal width), along with the corresponding species label (setosa, versicolor, or virginica). It is commonly used as an introductory dataset for classification tasks and for demonstrating basic data exploration and model training workflows.… See the full description on the dataset page: https://huggingface.co/datasets/brjapon/iris.
Metrics As Scores Dataset: The Iris Flower Data Set
data.niaid.nih.gov
explore.openaire.eu
Updated Jul 12, 2024
Cite
Sebastian Hönel (2024). Metrics As Scores Dataset: The Iris Flower Data Set [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7669645
Dataset updated
Jul 12, 2024
Dataset authored and provided by
Sebastian Hönel
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Iris flower data set or Fisher’s Iris data set is a multivariate data set used and made famous by the British statistician and biologist Ronald Fisher. The dataset was introduced in his 1936 paper "The Use of Multiple Measurements in Taxonomic Problems" (Fisher 1936) as an example of linear discriminant analysis.

This dataset has the following Features:

Petal.Length: Length of the petal

Petal.Width: Width of the petal

Sepal.Length: Length of the sepal

Sepal.Width: Width of the sepal

It has a total of 3 Groups: setosa, versicolor, and virginica.
iris-clase
huggingface.co
Updated Apr 5, 2025
Cite
Andrés Eduardo García Herrera (2025). iris-clase [Dataset]. https://huggingface.co/datasets/aegarciaherrera/iris-clase
Dataset updated
Apr 5, 2025
Authors
Andrés Eduardo García Herrera
License
https://choosealicense.com/licenses/cc0-1.0/
Description
Dataset Card for "iris"

Dataset Summary

The Iris dataset is one of the most classic datasets in machine learning, often used for classification and clustering tasks. It contains 150 samples of iris flowers, each described by four features: sepal length, sepal width, petal length, and petal width. The task is to classify the samples into one of three species: Iris setosa, Iris versicolor, or Iris virginica. This dataset is especially useful for:

Supervised learning… See the full description on the dataset page: https://huggingface.co/datasets/aegarciaherrera/iris-clase.
Data from: Iris flower classification
kaggle.com
Updated Jan 10, 2023
Cite
Ovoke Major (2023). Iris flower classification [Dataset]. https://www.kaggle.com/datasets/ovokemajor/iris-flower-classification
Dataset updated
Jan 10, 2023
Dataset provided by
Kaggle
Authors
Ovoke Major
Description
The Iris Dataset. This data set consists of the petal and sepal lengths and widths of 3 different types of irises (Setosa, Versicolour, and Virginica), stored in a 150x4 numpy.ndarray. The rows are the samples and the columns are: Sepal Length, Sepal Width, Petal Length and Petal Width.
iris
opendatalab.com
tensorflow.org
zip
Updated Sep 22, 2022
Cite
University of Tsukuba (2022). iris [Dataset]. https://opendatalab.com/OpenDataLab/iris
Available download formats: zip (4551 bytes)
Dataset updated
Sep 22, 2022
Dataset provided by
University of Tsukuba
Description
The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper The use of multiple measurements in taxonomic problems. It is sometimes called Anderson's Iris data set because Edgar Anderson collected the data to quantify the morphologic variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters. This dataset became a typical test case for many statistical classification techniques in machine learning, such as support vector machines.
Iris Species Dataset
cubig.ai
Updated May 29, 2025
Cite
CUBIG (2025). Iris Species Dataset [Dataset]. https://cubig.ai/store/products/387/iris-species-dataset
Dataset updated
May 29, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction
• The Iris Species Dataset is a classic multi-class classification dataset of 150 samples, 50 for each of three iris species (Setosa, Versicolor, Virginica). Each sample consists of four numerical features (sepal length, sepal width, petal length, and petal width) and a species label.

2) Data Utilization
(1) Characteristics of the Iris Species Dataset:
• The dataset has six columns in total, and each sample is labeled as one of three species, making it suitable for classification and basic statistical analysis.
(2) Uses of the Iris Species Dataset:
• Classification algorithm practice: the four features (sepal and petal length and width) can be used as inputs to practice machine learning classification models such as logistic regression, SVM, and decision trees.
• Data visualization and basic statistics: the distribution of each feature by species can be visualized with scatterplots, boxplots, and similar charts to explore differences between classes and correlations between features.
Edgar Anderson's Iris Data
explore.openaire.eu
zenodo.org
Updated Jul 22, 2018
Cite
Edgar Anderson (2018). Edgar Anderson's Iris Data [Dataset]. http://doi.org/10.5281/zenodo.10396807
Unique identifier
https://doi.org/10.5281/zenodo.10396807
Dataset updated
Jul 22, 2018
Authors
Edgar Anderson
Description
This famous (Fisher's or Anderson's) iris data set gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica. This dataset is mostly just for testing the Zenodo API.
iris_clase
huggingface.co
Updated Apr 4, 2025
Cite
Soriano (2025). iris_clase [Dataset]. https://huggingface.co/datasets/mariaasoriaano/iris_clase
Dataset updated
Apr 4, 2025
Authors
Soriano
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Card for Iris Dataset

Dataset Summary

The Iris dataset is a classic multivariate dataset introduced by Ronald Fisher in 1936. It contains 150 samples of iris flowers from three different species: Iris setosa, Iris versicolor, and Iris virginica. Each sample has four features: sepal length, sepal width, petal length, and petal width. This dataset is widely used for classification tasks, especially in machine learning tutorials and benchmarks.

Dataset… See the full description on the dataset page: https://huggingface.co/datasets/mariaasoriaano/iris_clase.
Kenneth D. Morton, Jr., Peter Torrione, Leslie Collins, Sam Keene (2024)....
service.tib.eu
Updated Dec 16, 2024
Cite
Kenneth D. Morton, Jr., Peter Torrione, Leslie Collins, Sam Keene (2024). Fisher's Iris dataset [Dataset]. https://doi.org/10.57702/c75q51m4 (https://service.tib.eu/ldmservice/dataset/fisher-s-iris-dataset)
Dataset updated
Dec 16, 2024
Description
Fisher's Iris dataset is a multivariate dataset introduced by Sir Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems". It contains 150 samples from three species of iris flowers (Iris setosa, Iris virginica, and Iris versicolor). Each sample is described by 4 features: the length and width of the sepal and petal.
Ronald Fisher (1936)-IRIS
kaggle.com
Updated Aug 25, 2021
Cite
Ravi Dutt Ramanujapu (2021). Ronald Fisher (1936)-IRIS [Dataset]. https://www.kaggle.com/raviduttramanujapu/ronald-fisher-1936iris/metadata
Dataset updated
Aug 25, 2021
Dataset provided by
Kaggle
Authors
Ravi Dutt Ramanujapu
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Title: Iris Plants Database

Sources: (a) Creator: R.A. Fisher (b) Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov) (c) Date: July, 1988

Past Usage:

Publications: too many to mention!!! Here are a few.

Fisher, R.A. "The use of multiple measurements in taxonomic problems" Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).

Duda, R.O., & Hart, P.E. (1973) Pattern Classification and Scene Analysis. (Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.

Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System Structure and Classification Rule for Recognition in Partially Exposed Environments". IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-2, No. 1, 67-71. -- Results: -- very low misclassification rates (0% for the setosa class)

Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule". IEEE Transactions on Information Theory, May 1972, 431-433. -- Results: -- very low misclassification rates again

See also: 1988 MLC Proceedings, 54-64. Cheeseman et al's AUTOCLASS II conceptual clustering system finds 3 classes in the data.

Relevant Information:

This is perhaps the best known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other.

Predicted attribute: class of iris plant.

This is an exceedingly simple domain.

This data differs from the data presented in Fisher's article.
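
A small sketch of what "linearly separable" means here. The description does not name the separable class; in the commonly distributed copy of the data it is generally taken to be Iris setosa, which a single cut on petal length isolates. The 2.5 cm threshold, the function name, and the six sample rows are assumptions chosen for illustration.

```python
def is_setosa(petal_length_cm, threshold_cm=2.5):
    """Single-feature linear rule: short petals are labelled setosa."""
    return petal_length_cm < threshold_cm


# (petal length in cm, class label): a few illustrative rows, not the full file.
samples = [
    (1.4, "Iris-setosa"),
    (1.4, "Iris-setosa"),
    (4.7, "Iris-versicolor"),
    (4.5, "Iris-versicolor"),
    (6.0, "Iris-virginica"),
    (5.1, "Iris-virginica"),
]

# Rows where the rule disagrees with the true "setosa or not" answer.
errors = [
    (length, label)
    for length, label in samples
    if is_setosa(length) != (label == "Iris-setosa")
]
print(len(errors))
```

No comparably simple rule is expected to split the other two classes without mistakes, which is what the note about them not being linearly separable refers to.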

Number of Instances: 150 (50 in each of three classes)

Number of Attributes: 4 numeric, predictive attributes and the class

Attribute Information:

sepal length in cm

sepal width in cm

petal length in cm

petal width in cm

class: -- Iris Setosa -- Iris Versicolour -- Iris Virginica

Missing Attribute Values: None

Summary Statistics:

               Min  Max  Mean  SD    Class Correlation
sepal length:  4.3  7.9  5.84  0.83   0.7826
sepal width:   2.0  4.4  3.05  0.43  -0.4194
petal length:  1.0  6.9  3.76  1.76   0.9490 (high!)
petal width:   0.1  2.5  1.20  0.76   0.9565 (high!)

Class Distribution: 33.3% for each of 3 classes.
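
The first four figures in each row of the summary statistics can be reproduced for any column with the standard library. This is a sketch under assumptions: the description does not say whether the standard deviation is the sample or the population form, so the sample form (statistics.stdev) is used here, and the input list is a short illustrative stand-in rather than the full 150-value column.

```python
import statistics


def summarize(values):
    """Return (min, max, mean, standard deviation) for one numeric column."""
    return (
        min(values),
        max(values),
        statistics.mean(values),
        statistics.stdev(values),  # sample SD (assumption; pstdev is the population form)
    )


# Illustrative values only, not the full petal-length column.
petal_length_cm = [1.4, 1.4, 4.7, 4.5, 6.0, 5.1]
low, high, mean, sd = summarize(petal_length_cm)
print(f"{low:.1f} {high:.1f} {mean:.2f} {sd:.2f}")
```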
‘Iris Dataset for EDA’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 13, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Iris Dataset for EDA’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-iris-dataset-for-eda-88d7/e4eea4c7/?iid=001-016&v=presentation
Dataset updated
Feb 13, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Iris Dataset for EDA’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mdjafrilalamshihab/iris-dataset-for-eda on 13 February 2022.

--- Dataset description provided by original source is as follows ---

Iris dataset for EDA. This dataset consists of petal length and width, sepal length and width, and the name of the species.

--- Original source retains full ownership of the source dataset ---
iris-partitions
huggingface.co
Cite
Khoa Nguyen, iris-partitions [Dataset]. https://huggingface.co/datasets/khoaguin/iris-partitions
Authors
Khoa Nguyen
Description
Partitioned IRIS Datasets

This repository contains a script (dataset.py) to download the Iris dataset and split it into multiple partitions. Each partition is further divided into a public "mock" dataset and a "private" dataset.

IRIS Dataset Overview

The Iris dataset is a classic dataset in machine learning, consisting of 150 samples of iris flowers. Each sample has four features (sepal length, sepal width, petal length, and petal width) and belongs to one of three… See the full description on the dataset page: https://huggingface.co/datasets/khoaguin/iris-partitions.
Data from: IrisDataset
kaggle.com
zip
Updated Apr 15, 2019
Cite
Sourav Bhattacharya (2019). IrisDataset [Dataset]. https://www.kaggle.com/souravbhattacharya10/irisdataset
Available download formats: zip (1039 bytes)
Dataset updated
Apr 15, 2019
Authors
Sourav Bhattacharya
Description
The simple Iris dataset for Multiclass Classification (required dataset for Hello World program in Machine Learning). It has data for three species (Iris-setosa, Iris-versicolor and Iris-virginica) of Iris flower. It contains 4 features as input (petal_length, petal-width, sepal_length and sepal_width) and label as output. It contains 150 rows, 50 rows for each species of Iris.
Data_Sheet_7_“R” U ready?: a case study using R to analyze changes in gene...
frontiersin.figshare.com
docx
Updated Mar 22, 2024
Cite
Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_7_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s007
Available download formats: docx
Unique identifier
https://doi.org/10.3389/feduc.2024.1379910.s007
Dataset updated
Mar 22, 2024
Dataset provided by
Frontiers
Authors
Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.
IRIS DATASET
kaggle.com
Updated Jan 14, 2023
Cite
vijayaadithyan V.G (2023). IRIS DATASET [Dataset]. https://www.kaggle.com/datasets/vijayaadithyanvg/iris-dataset/suggestions
Dataset updated
Jan 14, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
vijayaadithyan V.G
Description
The Iris Dataset contains four features (length and width of sepals and petals) for 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor).
Data from: Iris-Dataset
kaggle.com
Updated May 29, 2020
Cite
Davor Budimir (2020). Iris-Dataset [Dataset]. https://www.kaggle.com/datasets/davorbudimir/irisdataset/versions/2
Dataset updated
May 29, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Davor Budimir
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by Davor Budimir

Released under CC0: Public Domain

Contents
irsiUCI
kaggle.com
Updated Dec 17, 2018
Cite
Weipanpan (2018). irsiUCI [Dataset]. https://www.kaggle.com/jodiewpp/irsiuci/code
Dataset updated
Dec 17, 2018
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Weipanpan
Description
Here's a brief version of what you'll find in the data description file.

Source:

Creator: R.A. Fisher

Donor: Michael Marshall (MARSHALL%PLU '@' io.arc.nasa.gov)

Data Set Information:

This is perhaps the best known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other. Predicted attribute: class of iris plant. This is an exceedingly simple domain. This data differs from the data presented in Fisher's article (identified by Steve Chadwick, spchadwick '@' espeedaz.net). The 35th sample should be: 4.9,3.1,1.5,0.2,"Iris-setosa", where the error is in the fourth feature. The 38th sample should be: 4.9,3.6,1.4,0.1,"Iris-setosa", where the errors are in the second and third features.
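
The two corrections described above could be applied to an already-parsed copy of the data along these lines. This is a hedged sketch: the function name and the dictionary layout are hypothetical, the corrected values are copied from the paragraph above, and the placeholder rows stand in for the real 150-line file, which is not reproduced here.

```python
# 1-based sample number -> corrected row, as given in the data description.
CORRECTIONS = {
    35: ["4.9", "3.1", "1.5", "0.2", "Iris-setosa"],
    38: ["4.9", "3.6", "1.4", "0.1", "Iris-setosa"],
}


def apply_corrections(rows, corrections=CORRECTIONS):
    """Return a copy of rows with the listed samples replaced; the input is left untouched."""
    fixed = [list(row) for row in rows]
    for sample_number, corrected in corrections.items():
        fixed[sample_number - 1] = list(corrected)  # convert to a 0-based index
    return fixed


# Placeholder rows standing in for the 150 parsed lines of the data file.
rows = [["0", "0", "0", "0", "placeholder"] for _ in range(150)]
fixed = apply_corrections(rows)
print(fixed[34])
print(fixed[37])
```

Returning a patched copy keeps the downloaded rows available for comparison with the corrected ones.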

Attribute Information:

sepal length in cm

sepal width in cm

petal length in cm

petal width in cm

class: -- Iris Setosa -- Iris Versicolour -- Iris Virginica
Iris Species Classifier using Flux.jl
kaggle.com
Updated Jun 5, 2025
Cite
Sadia Mazhar26 (2025). Iris Species Classifier using Flux.jl [Dataset]. https://www.kaggle.com/datasets/sadiamazhar26/iris-species-classifier-using-flux-jl
Dataset updated
Jun 5, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sadia Mazhar26
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This project uses the Iris dataset from the RDatasets Julia package. It consists of 150 flower samples equally distributed across three species: Setosa, Versicolor, and Virginica. Each sample includes four numerical features: sepal length, sepal width, petal length, and petal width. The features are normalized for model input. The dataset is split into 80% training and 20% testing to evaluate a neural network model developed using Flux.jl for accurate species classification.
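
The normalization and 80/20 split mentioned above can be sketched as follows, in Python rather than Julia. This is not the project's Flux.jl code: the description does not say which normalization was used, so min-max scaling is an assumption, and the function names and the fixed seed are hypothetical.

```python
import random


def min_max_normalize(rows):
    """Scale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in indices for the 150 samples.
train, test = split_train_test(range(150))
print(len(train), len(test))
```

A plain shuffle like this does not guarantee that each species keeps the same share in both parts; a stratified split would be needed for that.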
