23 datasets found
  1. Iris Species Dataset and Database

    • kaggle.com
    Updated May 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ghanshyam Saini (2025). Iris Species Dataset and Database [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/iris-species-dataset-and-database
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 15, 2025
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Ghanshyam Saini
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Iris Flower Dataset

    This is a classic and very widely used dataset in machine learning and statistics, often serving as a first dataset for classification problems. Introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems," it is a foundational resource for learning classification algorithms.

    Overview:

    The dataset contains measurements for 150 samples of iris flowers. Each sample belongs to one of three species of iris:

    • Iris setosa
    • Iris versicolor
    • Iris virginica

    For each flower, four features were measured:

    • Sepal length (in cm)
    • Sepal width (in cm)
    • Petal length (in cm)
    • Petal width (in cm)

    The goal is typically to build a model that can classify iris flowers into their correct species based on these four features.

    File Structure:

    The dataset is usually provided as a single CSV (Comma Separated Values) file, often named iris.csv or similar. This file typically contains the following columns:

    1. sepal_length (cm): Numerical. The length of the sepal of the iris flower.
    2. sepal_width (cm): Numerical. The width of the sepal of the iris flower.
    3. petal_length (cm): Numerical. The length of the petal of the iris flower.
    4. petal_width (cm): Numerical. The width of the petal of the iris flower.
    5. species: Categorical. The species of the iris flower (either 'setosa', 'versicolor', or 'virginica'). This is the target variable for classification.

    Content of the Data:

    The dataset contains an equal number of samples (50) for each of the three iris species. The measurements of the sepal and petal dimensions vary between the species, allowing for their differentiation using machine learning models.

    How to Use This Dataset:

    1. Download the iris.csv file.
    2. Load the data using libraries like Pandas in Python.
    3. Explore the data through visualization and statistical analysis to understand the relationships between the features and the different species.
    4. Build classification models (e.g., Logistic Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbors) using the sepal and petal measurements as features and the 'species' column as the target variable.
    5. Evaluate the performance of your model using appropriate metrics (e.g., accuracy, precision, recall, F1-score).
    6. The dataset is small and well-behaved, making it excellent for learning and experimenting with various classification techniques.

    Citation:

    When using the Iris dataset, it is common to cite Ronald Fisher's original work:

    Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188.

    Data Contribution:

    Thank you for providing this classic and fundamental dataset to the Kaggle community. The Iris dataset remains an invaluable resource for both beginners learning the basics of classification and experienced practitioners testing new algorithms. Its simplicity and clear class separation make it an ideal starting point for many data science projects.

    If you find this dataset description helpful and the dataset itself useful for your learning or projects, please consider giving it an upvote after downloading. Your appreciation is valuable!

  2. A

    ‘Iris Flower Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Iris Flower Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-iris-flower-dataset-bb8a/eb51f303/?iid=001-010&v=presentation
    Explore at:
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Iris Flower Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/arshid/iris-flower-dataset on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper The use of multiple measurements in taxonomic problems. It is sometimes called Anderson's Iris data set because Edgar Anderson collected the data to quantify the morphologic variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris (Iris Setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters.

    This dataset became a typical test case for many statistical classification techniques in machine learning such as support vector machines

    Content

    The dataset contains a set of 150 records under 5 attributes - Petal Length, Petal Width, Sepal Length, Sepal width and Class(Species).

    Acknowledgements

    This dataset is free and is publicly available at the UCI Machine Learning Repository

    --- Original source retains full ownership of the source dataset ---

  3. h

    iris

    • huggingface.co
    Updated Apr 3, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bernardo Ronquillo (2025). iris [Dataset]. https://huggingface.co/datasets/brjapon/iris
    Explore at:
    Dataset updated
    Apr 3, 2025
    Authors
    Bernardo Ronquillo
    License

    https://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/

    Description

    Iris Species Dataset

    The Iris dataset is a classic dataset in machine learning, originally published by Ronald Fisher. It contains 150 instances of iris flowers, each described by four features (sepal length, sepal width, petal length, and petal width), along with the corresponding species label (setosa, versicolor, or virginica). It is commonly used as an introductory dataset for classification tasks and for demonstrating basic data exploration and model training workflows.… See the full description on the dataset page: https://huggingface.co/datasets/brjapon/iris.

  4. Z

    Metrics As Scores Dataset: The Iris Flower Data Set

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Jul 12, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sebastian Hönel (2024). Metrics As Scores Dataset: The Iris Flower Data Set [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7669645
    Explore at:
    Dataset updated
    Jul 12, 2024
    Dataset authored and provided by
    Sebastian Hönel
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Iris flower data set or Fisher’s Iris data set is a multivariate data set used and made famous by the British statistician and biologist Ronald Fisher. The dataset was introduced in his 1936 paper "The Use of Multiple Measurements in Taxonomic Problems" (Fisher 1936) as an example of linear discriminant analysis.

    This dataset has the following Features:

    Petal.Length: Length of the petal

    Petal.Width: Width of the petal

    Sepal.Length: Length of the sepal

    Sepal.Width: Width of the sepal

    It has a total of 3 Groups: setosa, versicolor, and virginica.

  5. h

    iris-clase

    • huggingface.co
    Updated Apr 5, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Andrés Eduardo García Herrera (2025). iris-clase [Dataset]. https://huggingface.co/datasets/aegarciaherrera/iris-clase
    Explore at:
    Dataset updated
    Apr 5, 2025
    Authors
    Andrés Eduardo García Herrera
    License

    https://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/

    Description

    Dataset Card for "iris"

      Dataset Summary
    

    The Iris dataset is one of the most classic datasets in machine learning, often used for classification and clustering tasks. It contains 150 samples of iris flowers, each described by four features: sepal length, sepal width, petal length, and petal width. The task is to classify the samples into one of three species: Iris setosa, Iris versicolor, or Iris virginica. This dataset is especially useful for:

    Supervised learning… See the full description on the dataset page: https://huggingface.co/datasets/aegarciaherrera/iris-clase.

  6. Data from: Iris flower classification

    • kaggle.com
    Updated Jan 10, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ovoke Major (2023). Iris flower classification [Dataset]. https://www.kaggle.com/datasets/ovokemajor/iris-flower-classification
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 10, 2023
    Dataset provided by
    Kaggle
    Authors
    Ovoke Major
    Description

    The Iris Dataset. ¶. This data sets consists of 3 different types of irises’ (Setosa, Versicolour, and Virginica) petal and sepal length, stored in a 150x4 numpy.ndarray. The rows being the samples and the columns being: Sepal Length, Sepal Width, Petal Length and Petal Width.

  7. O

    iris

    • opendatalab.com
    • tensorflow.org
    zip
    Updated Sep 22, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    University of Tsukuba (2022). iris [Dataset]. https://opendatalab.com/OpenDataLab/iris
    Explore at:
    zip(4551 bytes)Available download formats
    Dataset updated
    Sep 22, 2022
    Dataset provided by
    University of Tsukuba
    Description

    The Iris flower data set is a multivariate data set introduced by the British statistician and biologist Ronald Fisher in his 1936 paper The use of multiple measurements in taxonomic problems. It is sometimes called Anderson's Iris data set because Edgar Anderson collected the data to quantify the morphologic variation of Iris flowers of three related species. The data set consists of 50 samples from each of three species of Iris (Iris Setosa, Iris virginica, and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters.This dataset became a typical test case for many statistical classification techniques in machine learning such as support vector machines

  8. c

    Iris Species Dataset

    • cubig.ai
    Updated May 29, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CUBIG (2025). Iris Species Dataset [Dataset]. https://cubig.ai/store/products/387/iris-species-dataset
    Explore at:
    Dataset updated
    May 29, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-servicehttps://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction • The Iris Species Dataset is a classic multi-class classification data that collected a total of 150 samples, 50 for each of the three iris species (Setosa, Versicolor, Virginica), consisting of four numerical characteristics and species labels, including calyx length, width, petal length, and width.

    2) Data Utilization (1) The Iris Species Dataset has characteristics that: • This dataset consists of a total of six columns and is labeled as one of three types, making it suitable for class division and basic statistical analysis. (2) The Iris Species Dataset can be used to: • Classification Algorithm Practice: You can easily practice various machine learning classification models such as logistic regression, SVM, and decision tree by inputting four characteristics: calyx and petal length and width. • Visualize data and analyze basic statistics: Visualize the distribution of characteristics by variety into scatterplots, boxplots, etc. to explore differences between classes and correlations between characteristics.

  9. o

    Edgar Anderson's Iris Data

    • explore.openaire.eu
    • zenodo.org
    Updated Jul 22, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Edgar Anderson (2018). Edgar Anderson's Iris Data [Dataset]. http://doi.org/10.5281/zenodo.10396807
    Explore at:
    Dataset updated
    Jul 22, 2018
    Authors
    Edgar Anderson
    Description

    This famous (Fisher's or Anderson's) iris data set gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor, and virginica. This dataset is mostly just for testing the Zenodo API

  10. h

    iris_clase

    • huggingface.co
    Updated Apr 4, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Soriano (2025). iris_clase [Dataset]. https://huggingface.co/datasets/mariaasoriaano/iris_clase
    Explore at:
    Dataset updated
    Apr 4, 2025
    Authors
    Soriano
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for Iris Dataset

      Dataset Summary
    

    The Iris dataset is a classic multivariate dataset introduced by Ronald Fisher in 1936. It contains 150 samples of iris flowers from three different species: Iris setosa, Iris versicolor, and Iris virginica. Each sample has four features: sepal length, sepal width, petal length, and petal width. This dataset is widely used for classification tasks, especially in machine learning tutorials and benchmarks.

      Dataset… See the full description on the dataset page: https://huggingface.co/datasets/mariaasoriaano/iris_clase.
    
  11. t

    Kenneth D. Morton, Jr., Peter Torrione, Leslie Collins, Sam Keene (2024)....

    • service.tib.eu
    Updated Dec 16, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). Kenneth D. Morton, Jr., Peter Torrione, Leslie Collins, Sam Keene (2024). Dataset: Fisher's Iris dataset. https://doi.org/10.57702/c75q51m4 [Dataset]. https://service.tib.eu/ldmservice/dataset/fisher-s-iris-dataset
    Explore at:
    Dataset updated
    Dec 16, 2024
    Description

    Fisher's Iris dataset is a multivariate dataset introduced by Sir Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems". It contains 150 samples from three species of iris flowers (Iris setosa, Iris virginica, and Iris versicolor). Each sample is described by 4 features: the length and width of the sepal and petal.

  12. Ronald Fisher (1936)-IRIS

    • kaggle.com
    Updated Aug 25, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ravi Dutt Ramanujapu (2021). Ronald Fisher (1936)-IRIS [Dataset]. https://www.kaggle.com/raviduttramanujapu/ronald-fisher-1936iris/metadata
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 25, 2021
    Dataset provided by
    Kaggle
    Authors
    Ravi Dutt Ramanujapu
    License

    http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/

    Description
    1. Title: Iris Plants Database

    2. Sources: (a) Creator: R.A. Fisher (b) Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov) (c) Date: July, 1988

    3. Past Usage:

      • Publications: too many to mention!!! Here are a few.
      • Fisher,R.A. "The use of multiple measurements in taxonomic problems" Annual Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).
      • Duda,R.O., & Hart,P.E. (1973) Pattern Classification and Scene Analysis. (Q327.D83) John Wiley & Sons. ISBN 0-471-22361-1. See page 218.
      • Dasarathy, B.V. (1980) "Nosing Around the Neighborhood: A New System Structure and Classification Rule for Recognition in Partially Exposed Environments". IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-2, No. 1, 67-71. -- Results: -- very low misclassification rates (0% for the setosa class)
      • Gates, G.W. (1972) "The Reduced Nearest Neighbor Rule". IEEE Transactions on Information Theory, May 1972, 431-433. -- Results: -- very low misclassification rates again
      • See also: 1988 MLC Proceedings, 54-64. Cheeseman et al's AUTOCLASS II conceptual clustering system finds 3 classes in the data.
    4. Relevant Information: --- This is perhaps the best known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other. --- Predicted attribute: class of iris plant. --- This is an exceedingly simple domain. --- This data differs from the data presented in Fishers article

    5. Number of Instances: 150 (50 in each of three classes)

    6. Number of Attributes: 4 numeric, predictive attributes and the class

    7. Attribute Information:

      1. sepal length in cm
      2. sepal width in cm
      3. petal length in cm
      4. petal width in cm
      5. class: -- Iris Setosa -- Iris Versicolour -- Iris Virginica
    8. Missing Attribute Values: None

    Summary Statistics:

    sepal length: 4.3 7.9 5.84 0.83 0.7826
    sepal width: 2.0 4.4 3.05 0.43 -0.4194 petal length: 1.0 6.9 3.76 1.76 0.9490 (high!) petal width: 0.1 2.5 1.20 0.76 0.9565 (high!)

    1. Class Distribution: 33.3% for each of 3 classes.
  13. A

    ‘Iris Dataset for EDA’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 13, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Iris Dataset for EDA’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-iris-dataset-for-eda-88d7/e4eea4c7/?iid=001-016&v=presentation
    Explore at:
    Dataset updated
    Feb 13, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Iris Dataset for EDA’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mdjafrilalamshihab/iris-dataset-for-eda on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Iris dataset for EDA. This dataset consists petal length and width , sepal length and width and name of species.

    --- Original source retains full ownership of the source dataset ---

  14. h

    iris-partitions

    • huggingface.co
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Khoa Nguyen, iris-partitions [Dataset]. https://huggingface.co/datasets/khoaguin/iris-partitions
    Explore at:
    Authors
    Khoa Nguyen
    Description

    Partitioned IRIS Datasets

    This repository contains a script (dataset.py) to download the Iris dataset and split it into multiple partitions. Each partition is further divided into a public "mock" dataset and a "private" dataset.

      IRIS Dataset Overview
    

    The Iris dataset is a classic dataset in machine learning, consisting of 150 samples of iris flowers. Each sample has four features (sepal length, sepal width, petal length, and petal width) and belongs to one of three… See the full description on the dataset page: https://huggingface.co/datasets/khoaguin/iris-partitions.

  15. Data from: IrisDataset

    • kaggle.com
    zip
    Updated Apr 15, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sourav Bhattacharya (2019). IrisDataset [Dataset]. https://www.kaggle.com/souravbhattacharya10/irisdataset
    Explore at:
    zip(1039 bytes)Available download formats
    Dataset updated
    Apr 15, 2019
    Authors
    Sourav Bhattacharya
    Description

    The simple Iris dataset for Multiclass Classification (required dataset for Hello World program in Machine Learning). It has data for three species (Iris-setosa, Iris-versicolor and Iris-virginica) of Iris flower. It contains 4 features as input (petal_length, petal-width, sepal_length and sepal_width) and label as output. It contains 150 rows, 50 rows for each species of Iris.

  16. f

    Data_Sheet_7_“R” U ready?: a case study using R to analyze changes in gene...

    • frontiersin.figshare.com
    docx
    Updated Mar 22, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_7_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s007
    Explore at:
    docxAvailable download formats
    Dataset updated
    Mar 22, 2024
    Dataset provided by
    Frontiers
    Authors
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.

  17. IRIS DATASET

    • kaggle.com
    Updated Jan 14, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    vijayaadithyan V.G (2023). IRIS DATASET [Dataset]. https://www.kaggle.com/datasets/vijayaadithyanvg/iris-dataset/suggestions
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 14, 2023
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    vijayaadithyan V.G
    Description

    The Iris Dataset contains four features (length and width of sepals and petals) of 50 samples of three species of Iris (Iris setosa, Iris virginica and Iris versicolor).

  18. Data from: Iris-Dataset

    • kaggle.com
    Updated May 29, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Davor Budimir (2020). Iris-Dataset [Dataset]. https://www.kaggle.com/datasets/davorbudimir/irisdataset/versions/2
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 29, 2020
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Davor Budimir
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Davor Budimir

    Released under CC0: Public Domain

    Contents

  19. irsiUCI

    • kaggle.com
    Updated Dec 17, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Weipanpan (2018). irsiUCI [Dataset]. https://www.kaggle.com/jodiewpp/irsiuci/code
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 17, 2018
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Weipanpan
    Description

    Here's a brief version of what you'll find in the data description file.

    Source: Creator: R.A. Fisher Donor: Michael Marshall (MARSHALL%PLU '@' io.arc.nasa.gov)

    Data Set Information:

    This is perhaps the best known database to be found in the pattern recognition literature. Fisher's paper is a classic in the field and is referenced frequently to this day. (See Duda & Hart, for example.) The data set contains 3 classes of 50 instances each, where each class refers to a type of iris plant. One class is linearly separable from the other 2; the latter are NOT linearly separable from each other. Predicted attribute: class of iris plant. This is an exceedingly simple domain. This data differs from the data presented in Fishers article (identified by Steve Chadwick, spchadwick '@' espeedaz.net ). The 35th sample should be: 4.9,3.1,1.5,0.2,"Iris-setosa" where the error is in the fourth feature. The 38th sample: 4.9,3.6,1.4,0.1,"Iris-setosa" where the errors are in the second and third features.

    Attribute Information:

    1. sepal length in cm
    2. sepal width in cm
    3. petal length in cm
    4. petal width in cm
    5. class: -- Iris Setosa -- Iris Versicolour -- Iris Virginica
  20. Iris Species Classifier using Flux.jl

    • kaggle.com
    Updated Jun 5, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sadia Mazhar26 (2025). Iris Species Classifier using Flux.jl [Dataset]. https://www.kaggle.com/datasets/sadiamazhar26/iris-species-classifier-using-flux-jl
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Sadia Mazhar26
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This project uses the Iris dataset from the RDatasets Julia package. It consists of 150 flower samples equally distributed across three species: Setosa, Versicolor, and Virginica. Each sample includes four numerical features: sepal length, sepal width, petal length, and petal width. The features are normalized for model input. The dataset is split into 80% training and 20% testing to evaluate a neural network model developed using Flux.jl for accurate species classification.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Ghanshyam Saini (2025). Iris Species Dataset and Database [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/iris-species-dataset-and-database
Organization logo

Iris Species Dataset and Database

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 15, 2025
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Ghanshyam Saini
License

MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically

Description

Iris Flower Dataset

This is a classic and very widely used dataset in machine learning and statistics, often serving as a first dataset for classification problems. Introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems," it is a foundational resource for learning classification algorithms.

Overview:

The dataset contains measurements for 150 samples of iris flowers. Each sample belongs to one of three species of iris:

  • Iris setosa
  • Iris versicolor
  • Iris virginica

For each flower, four features were measured:

  • Sepal length (in cm)
  • Sepal width (in cm)
  • Petal length (in cm)
  • Petal width (in cm)

The goal is typically to build a model that can classify iris flowers into their correct species based on these four features.

File Structure:

The dataset is usually provided as a single CSV (Comma Separated Values) file, often named iris.csv or similar. This file typically contains the following columns:

  1. sepal_length (cm): Numerical. The length of the sepal of the iris flower.
  2. sepal_width (cm): Numerical. The width of the sepal of the iris flower.
  3. petal_length (cm): Numerical. The length of the petal of the iris flower.
  4. petal_width (cm): Numerical. The width of the petal of the iris flower.
  5. species: Categorical. The species of the iris flower (either 'setosa', 'versicolor', or 'virginica'). This is the target variable for classification.

Content of the Data:

The dataset contains an equal number of samples (50) for each of the three iris species. The measurements of the sepal and petal dimensions vary between the species, allowing for their differentiation using machine learning models.

How to Use This Dataset:

  1. Download the iris.csv file.
  2. Load the data using libraries like Pandas in Python.
  3. Explore the data through visualization and statistical analysis to understand the relationships between the features and the different species.
  4. Build classification models (e.g., Logistic Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbors) using the sepal and petal measurements as features and the 'species' column as the target variable.
  5. Evaluate the performance of your model using appropriate metrics (e.g., accuracy, precision, recall, F1-score).
  6. The dataset is small and well-behaved, making it excellent for learning and experimenting with various classification techniques.

Citation:

When using the Iris dataset, it is common to cite Ronald Fisher's original work:

Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188.

Data Contribution:

Thank you for providing this classic and fundamental dataset to the Kaggle community. The Iris dataset remains an invaluable resource for both beginners learning the basics of classification and experienced practitioners testing new algorithms. Its simplicity and clear class separation make it an ideal starting point for many data science projects.

If you find this dataset description helpful and the dataset itself useful for your learning or projects, please consider giving it an upvote after downloading. Your appreciation is valuable!

Search
Clear search
Close search
Google apps
Main menu