3 datasets found
  1. Iris Species Dataset and Database

    • kaggle.com
    Updated May 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ghanshyam Saini (2025). Iris Species Dataset and Database [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/iris-species-dataset-and-database
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 15, 2025
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Ghanshyam Saini
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Iris Flower Dataset

    This is a classic and very widely used dataset in machine learning and statistics, often serving as a first dataset for classification problems. Introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems," it is a foundational resource for learning classification algorithms.

    Overview:

    The dataset contains measurements for 150 samples of iris flowers. Each sample belongs to one of three species of iris:

    • Iris setosa
    • Iris versicolor
    • Iris virginica

    For each flower, four features were measured:

    • Sepal length (in cm)
    • Sepal width (in cm)
    • Petal length (in cm)
    • Petal width (in cm)

    The goal is typically to build a model that can classify iris flowers into their correct species based on these four features.

    File Structure:

    The dataset is usually provided as a single CSV (Comma Separated Values) file, often named iris.csv or similar. This file typically contains the following columns:

    1. sepal_length (cm): Numerical. The length of the sepal of the iris flower.
    2. sepal_width (cm): Numerical. The width of the sepal of the iris flower.
    3. petal_length (cm): Numerical. The length of the petal of the iris flower.
    4. petal_width (cm): Numerical. The width of the petal of the iris flower.
    5. species: Categorical. The species of the iris flower (either 'setosa', 'versicolor', or 'virginica'). This is the target variable for classification.

    Content of the Data:

    The dataset contains an equal number of samples (50) for each of the three iris species. The measurements of the sepal and petal dimensions vary between the species, allowing for their differentiation using machine learning models.

    How to Use This Dataset:

    1. Download the iris.csv file.
    2. Load the data using libraries like Pandas in Python.
    3. Explore the data through visualization and statistical analysis to understand the relationships between the features and the different species.
    4. Build classification models (e.g., Logistic Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbors) using the sepal and petal measurements as features and the 'species' column as the target variable.
    5. Evaluate the performance of your model using appropriate metrics (e.g., accuracy, precision, recall, F1-score).
    6. The dataset is small and well-behaved, making it excellent for learning and experimenting with various classification techniques.

    Citation:

    When using the Iris dataset, it is common to cite Ronald Fisher's original work:

    Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188.

    Data Contribution:

    Thank you for providing this classic and fundamental dataset to the Kaggle community. The Iris dataset remains an invaluable resource for both beginners learning the basics of classification and experienced practitioners testing new algorithms. Its simplicity and clear class separation make it an ideal starting point for many data science projects.

    If you find this dataset description helpful and the dataset itself useful for your learning or projects, please consider giving it an upvote after downloading. Your appreciation is valuable!

  2. Iris dataset

    • kaggle.com
    Updated Jul 20, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Himanshu Nakrani (2022). Iris dataset [Dataset]. https://www.kaggle.com/datasets/himanshunakrani/iris-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 20, 2022
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Himanshu Nakrani
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    It includes three iris species with 50 samples each as well as some properties of each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other.

    FIle name: iris.csv

  3. h

    iris

    • huggingface.co
    Updated Sep 23, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    scikit-learn (2022). iris [Dataset]. https://huggingface.co/datasets/scikit-learn/iris
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 23, 2022
    Dataset authored and provided by
    scikit-learn
    License

    https://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/

    Description

    Iris Species Dataset

    The Iris dataset was used in R.A. Fisher's classic 1936 paper, The Use of Multiple Measurements in Taxonomic Problems, and can also be found on the UCI Machine Learning Repository. It includes three iris species with 50 samples each as well as some properties about each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other. The dataset is taken from UCI Machine Learning Repository's… See the full description on the dataset page: https://huggingface.co/datasets/scikit-learn/iris.

  4. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Ghanshyam Saini (2025). Iris Species Dataset and Database [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/iris-species-dataset-and-database
Organization logo

Iris Species Dataset and Database

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 15, 2025
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Ghanshyam Saini
License

MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically

Description

Iris Flower Dataset

This is a classic and very widely used dataset in machine learning and statistics, often serving as a first dataset for classification problems. Introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems," it is a foundational resource for learning classification algorithms.

Overview:

The dataset contains measurements for 150 samples of iris flowers. Each sample belongs to one of three species of iris:

  • Iris setosa
  • Iris versicolor
  • Iris virginica

For each flower, four features were measured:

  • Sepal length (in cm)
  • Sepal width (in cm)
  • Petal length (in cm)
  • Petal width (in cm)

The goal is typically to build a model that can classify iris flowers into their correct species based on these four features.

File Structure:

The dataset is usually provided as a single CSV (Comma Separated Values) file, often named iris.csv or similar. This file typically contains the following columns:

  1. sepal_length (cm): Numerical. The length of the sepal of the iris flower.
  2. sepal_width (cm): Numerical. The width of the sepal of the iris flower.
  3. petal_length (cm): Numerical. The length of the petal of the iris flower.
  4. petal_width (cm): Numerical. The width of the petal of the iris flower.
  5. species: Categorical. The species of the iris flower (either 'setosa', 'versicolor', or 'virginica'). This is the target variable for classification.

Content of the Data:

The dataset contains an equal number of samples (50) for each of the three iris species. The measurements of the sepal and petal dimensions vary between the species, allowing for their differentiation using machine learning models.

How to Use This Dataset:

  1. Download the iris.csv file.
  2. Load the data using libraries like Pandas in Python.
  3. Explore the data through visualization and statistical analysis to understand the relationships between the features and the different species.
  4. Build classification models (e.g., Logistic Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbors) using the sepal and petal measurements as features and the 'species' column as the target variable.
  5. Evaluate the performance of your model using appropriate metrics (e.g., accuracy, precision, recall, F1-score).
  6. The dataset is small and well-behaved, making it excellent for learning and experimenting with various classification techniques.

Citation:

When using the Iris dataset, it is common to cite Ronald Fisher's original work:

Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188.

Data Contribution:

Thank you for providing this classic and fundamental dataset to the Kaggle community. The Iris dataset remains an invaluable resource for both beginners learning the basics of classification and experienced practitioners testing new algorithms. Its simplicity and clear class separation make it an ideal starting point for many data science projects.

If you find this dataset description helpful and the dataset itself useful for your learning or projects, please consider giving it an upvote after downloading. Your appreciation is valuable!

Search
Clear search
Close search
Google apps
Main menu