25 datasets found
  1. Clustering Iris Data Set

    • kaggle.com
    zip
    Updated Sep 2, 2023
    Cite
    Rifki Ilham (2023). Clustering Iris Data Set [Dataset]. https://www.kaggle.com/datasets/rifkiilham/clustering-iris-data-set/suggestions
    Explore at:
    Available download formats: zip (11024 bytes)
    Dataset updated
    Sep 2, 2023
    Authors
    Rifki Ilham
    Description

    The Iris flower data set, or Fisher's Iris data set, is a multivariate data set made famous by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems", where it serves as an example of linear discriminant analysis. Use this data set to cluster the iris flowers, for example with the k-means clustering algorithm.
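
    The k-means suggestion above can be sketched in plain Python as follows. This is a minimal illustration of Lloyd's algorithm, not the dataset author's method; the function name `kmeans` and the six toy points are assumptions made up for the example and are not taken from the dataset.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Made-up (petal length, petal width) pairs, for illustration only.
toy_points = [(1.4, 0.2), (1.5, 0.2), (1.3, 0.3), (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
labels, centroids = kmeans(toy_points, k=2)
print(labels)
```

    On data this well separated, the first three and the last three points should end up in different clusters; which cluster gets which label number depends on the random start.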

  2. Data from: Galaxy clustering

    • kaggle.com
    zip
    Updated Jan 3, 2023
    Cite
    The Devastator (2023). Galaxy clustering [Dataset]. https://www.kaggle.com/datasets/thedevastator/clustering-polygons-utilizing-iris-moon-and-circ
    Explore at:
    Available download formats: zip (6339 bytes)
    Dataset updated
    Jan 3, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Galaxy clustering

    Iris, Moon, and Circles datasets for Galaxy clustering tutorial

    About this dataset

    This dataset can be used to explore how well various clustering algorithms work. It combines numerical measurements (X, Y, Sepal.Length, and Petal.Length) with a categorical value (Species), so it is possible to investigate how different types of variables relate to clustering performance. Comparing results across the three files provided, moon.csv (x and y coordinates), iris.csv (sepal and petal lengths), and circles.csv, shows how different data distributions affect clustering techniques such as K-Means or hierarchical clustering.

    How to use the dataset

    This dataset can also serve as a starting point for exploring more complex clusters, for example by adding higher-dimensional variables such as color or texture from other datasets (none are included here), which may help cluster-analysis algorithms form more accurate groups. It can also support visualization projects that need generated clusters, such as plotting mapped data points or examining the relationship between two variables within a region of a chart.

    To use this dataset effectively, understand how your chosen algorithm works: some algorithms require parameters (such as the number of clusters) to be specified beforehand, while others determine them automatically, and a misconfigured algorithm can make the resulting interpretation invalid. Also become familiar with evaluation metrics such as the silhouette score and the Rand index. The silhouette score measures how well separated the clusters of a single clustering are, and the Rand index measures how closely two clusterings (for example, a result and a reference labeling) agree, so together they help you judge whether a clustering is good enough for your purpose.
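
    As a rough illustration of the Rand index mentioned above, the sketch below counts the pairs of points on which two labelings agree (both put the pair in the same cluster, or both split it). The function name `rand_index` and the example labelings are assumptions for this sketch; it is the plain, unadjusted Rand index.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same points")
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

# Same grouping with swapped label names: perfect agreement.
same = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# A grouping that cuts across the first one: low agreement.
crossed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same, crossed)
```

    Because only pair membership is compared, the score does not depend on which number each cluster happens to be labeled with.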

    Research Ideas

    • Using the sepal and petal measurements for flower recognition, either on their own or as part of a larger image recognition pipeline.
    • Grouping the data points in each dataset by their X-Y coordinates with clustering algorithms, for example to analyze galaxy locations or overall formation patterns of stars, planets, or galaxies.
    • Exploring how sepal and petal lengths relate to flower species through supervised learning tasks such as classification.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

    Columns

    File: moon.csv

    | Column name | Description |
    |:------------|:------------|
    | X | X coordinate of the data point. (Numeric) |
    | Y | Y coordinate of the data point. (Numeric) |

    File: iris.csv

    | Column name | Description |
    |:------------|:------------|
    | Sepal.Length | Length of the sepal of the flower. (Numeric) |
    | Petal.Length | Length of the petal of the flower. (Numeric) |
    | Species | Species of the flower. (Categorical) |

  3. Iris Dataset (JSON Version)

    • kaggle.com
    zip
    Updated Apr 6, 2018
    Cite
    Rachael Tatman (2018). Iris Dataset (JSON Version) [Dataset]. https://www.kaggle.com/rtatman/iris-dataset-json-version
    Explore at:
    Available download formats: zip (1345 bytes)
    Dataset updated
    Apr 6, 2018
    Authors
    Rachael Tatman
    Description

    Content

    This is a JSON version of the famous Iris dataset. It's provided as an introduction to the JSON data storage format using a familiar dataset.

    It has five keys: sepalLength, sepalWidth, petalLength, petalWidth and species.
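
    A minimal sketch of reading records with these five keys, using only the standard library. It assumes the file is a JSON array of objects, which the description does not confirm; the inline two-record sample stands in for the real file and is for illustration only.

```python
import json
from collections import Counter

# Hypothetical two-record sample; the real file's layout and values may differ.
sample = """[
  {"sepalLength": 5.1, "sepalWidth": 3.5, "petalLength": 1.4, "petalWidth": 0.2, "species": "setosa"},
  {"sepalLength": 7.0, "sepalWidth": 3.2, "petalLength": 4.7, "petalWidth": 1.4, "species": "versicolor"}
]"""

records = json.loads(sample)

# Count records per species and average one numeric field.
species_counts = Counter(r["species"] for r in records)
mean_petal_length = sum(r["petalLength"] for r in records) / len(records)
print(species_counts, mean_petal_length)
```

    For a file on disk, `json.load` on an open file handle would replace `json.loads` on the string.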

    Acknowledgements

    The citation for this dataset is:

    Fisher, R. A. (1936) The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7, Part II, 179–188.

    The data were collected by Anderson, Edgar (1935). The irises of the Gaspe Peninsula, Bulletin of the American Iris Society, 59, 2–5.

    Inspiration

    Use this dataset to practice reading JSON data into kernels and manipulating it.

  4. iris-clase

    • huggingface.co
    Updated Apr 5, 2025
    + more versions
    Cite
    Andrés Eduardo García Herrera (2025). iris-clase [Dataset]. https://huggingface.co/datasets/aegarciaherrera/iris-clase
    Explore at:
    Dataset updated
    Apr 5, 2025
    Authors
    Andrés Eduardo García Herrera
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Dataset Card for "iris"

    Dataset Summary

    The Iris dataset is one of the most classic datasets in machine learning, often used for classification and clustering tasks. It contains 150 samples of iris flowers, each described by four features: sepal length, sepal width, petal length, and petal width. The task is to classify the samples into one of three species: Iris setosa, Iris versicolor, or Iris virginica. This dataset is especially useful for:

    Supervised learning
    See the full description on the dataset page: https://huggingface.co/datasets/aegarciaherrera/iris-clase.

  5. Iris Dataset - Using Clustering

    • kaggle.com
    zip
    Updated Aug 22, 2021
    Cite
    Abhishek Agarwal (2021). Iris Dataset - Using Clustering [Dataset]. https://www.kaggle.com/abhi8901/iris-dataset-using-clustering
    Explore at:
    Available download formats: zip (1315 bytes)
    Dataset updated
    Aug 22, 2021
    Authors
    Abhishek Agarwal
    Description

    Dataset

    This dataset was created by Abhishek Agarwal

    Contents

  6. IRIS Flower Dataset

    • kaggle.com
    zip
    Updated May 27, 2024
    + more versions
    Cite
    Tayyaba DataAce (2024). IRIS Flower Dataset [Dataset]. https://www.kaggle.com/datasets/tayyaba477/iris-flower-dataset
    Explore at:
    Available download formats: zip (1010 bytes)
    Dataset updated
    May 27, 2024
    Authors
    Tayyaba DataAce
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Iris Dataset: A Classic Dataset for Machine Learning

    Overview

    The Iris dataset is one of the most famous datasets in the machine learning community. It contains 150 observations of iris flowers from three different species: Setosa, Versicolour, and Virginica. Each observation includes four features, which are measurements of the flowers' physical dimensions.

    Features

    • Sepal Length (cm): Length of the sepal.
    • Sepal Width (cm): Width of the sepal.
    • Petal Length (cm): Length of the petal.
    • Petal Width (cm): Width of the petal.
    • Species: The species of the iris flower (Setosa, Versicolour, Virginica).

    Why Use This Dataset?

    • Ideal for Beginners: Perfect for those new to data science and machine learning.
    • Widely Recognized: A standard dataset for benchmarking algorithms and models.
    • Balanced Classes: Each species has 50 observations, providing a balanced dataset for classification tasks.
    • Simple Yet Powerful: Despite its simplicity, the dataset offers great opportunities for learning and applying various machine learning techniques.

    Applications

    • Classification Algorithms: Test and compare different classification algorithms.
    • Data Visualization: Explore and visualize the data to gain insights into the patterns and relationships between features.
    • Feature Engineering: Experiment with creating new features and transforming existing ones to improve model performance.
    • Dimensionality Reduction: Apply techniques like PCA to reduce the number of features while retaining most of the variance.

    Example Project Ideas

    • Build a classifier to predict the species of iris flowers.
    • Perform exploratory data analysis (EDA) and visualize the dataset in 2D and 3D.
    • Create new features (e.g., sepal area, petal area) and evaluate their impact on model performance.
    • Apply clustering algorithms to see how well they separate the species.
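
    The "sepal area, petal area" idea above could be sketched as below. Treating area as length times width is a simplifying assumption (a bounding-rectangle proxy, not a botanical measurement), and the dictionary keys and the example row are hypothetical names and values chosen for this sketch, not the dataset's actual column headers or data.

```python
def add_area_features(row):
    """Return a copy of row with sepal_area and petal_area (length * width) added."""
    out = dict(row)
    out["sepal_area"] = row["sepal_length"] * row["sepal_width"]
    out["petal_area"] = row["petal_length"] * row["petal_width"]
    return out

# Made-up measurements in centimeters, for illustration only.
example = {"sepal_length": 5.0, "sepal_width": 3.0, "petal_length": 1.5, "petal_width": 0.2}
enriched = add_area_features(example)
print(enriched["sepal_area"], enriched["petal_area"])
```

    Whether such derived features help is an empirical question; comparing model performance with and without them is the evaluation the project idea describes.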

  7. Iris dataset local result table for A and B (RA, RB) using Davies Bouldin...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    WAQAR ISHAQ; ELIYA BUYUKKAYA; MUSHTAQ ALI; ZAKIR KHAN (2023). Iris dataset local result table for A and B (RA, RB) using Davies Bouldin index. [Dataset]. http://doi.org/10.1371/journal.pone.0244691.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    WAQAR ISHAQ; ELIYA BUYUKKAYA; MUSHTAQ ALI; ZAKIR KHAN
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Iris dataset local result table for A and B (RA, RB) using Davies Bouldin index.

  8. FastLloyd Clustering Datasets

    • zenodo.org
    xz
    Updated May 28, 2025
    Cite
    Abdulrahman Diaa; Thomas Humphries; Florian Kerschbaum (2025). FastLloyd Clustering Datasets [Dataset]. http://doi.org/10.5281/zenodo.15530593
    Explore at:
    Available download formats: xz
    Dataset updated
    May 28, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Abdulrahman Diaa; Thomas Humphries; Florian Kerschbaum
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This artifact bundles the five dataset archives used in our private federated clustering evaluation, corresponding to the real-world benchmarks, scaling experiments, ablation studies, and timing performance tests described in the paper. real_datasets.tar.xz includes ten established clustering benchmarks drawn from UCI and the Clustering basic benchmark (DOI: https://doi.org/10.1007/s10489-018-1238-7); scale_datasets.tar.xz contains the SynthNew family, generated with the R clusterGeneration package to assess scalability; ablate_datasets.tar.xz holds the AblateSynth sets, which vary cluster separation for ablation analysis and are also generated with clusterGeneration; g2_datasets.tar.xz packages the G2 sets (Gaussian clusters of size 2048 across dimensions 2–1024, with two clusters each), collected from the Clustering basic benchmark (DOI: https://doi.org/10.1007/s10489-018-1238-7); and timing_datasets.tar.xz includes the real s1 and lsun datasets alongside the TimeSynth files (balanced synthetic clusters for timing), following Mohassel et al.’s experimental framework.

    Contents

    1. real_datasets.tar.xz

    Contains ten real-world benchmark datasets, formatted as one sample per line with space-separated features:

    • iris.txt: 150 samples, 4 features, 3 classes; classic UCI Iris dataset for petal/sepal measurements.

    • lsun.txt: 400 samples, 2 features, 3 clusters; two-dimensional variant of the LSUN dataset for clustering experiments.

    • s1.txt: 5,000 samples, 2 features, 15 clusters; synthetic benchmark from Fränti’s S1 series.

    • house.txt: 1,837 samples, 3 features, 3 clusters; housing data transformed for clustering tasks.

    • adult.txt: 48,842 samples, 6 features, 3 clusters; UCI Census Income (“Adult”) dataset for income bracket prediction.

    • wine.txt: 178 samples, 13 features, 3 cultivars; UCI Wine dataset with chemical analysis features.

    • breast.txt: 569 samples, 9 features, 2 classes; Wisconsin Diagnostic Breast Cancer dataset.

    • yeast.txt: 1,484 samples, 8 features, 10 localization sites; yeast protein localization data.

    • mnist.txt: 10,000 samples, 784 features (28×28 pixels), 10 digit classes; MNIST handwritten digits.

    • birch2.txt: a random 25,000-sample subset of the 100,000 samples, 2 features, 100 clusters; synthetic BIRCH2 dataset for high-cluster-count evaluation.
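
    The "one sample per line with space-separated features" layout described above can be read with a few lines of standard-library Python. This is a sketch only: the function name `parse_samples` is an assumption, the two inline lines are made-up values, and whether the real files carry a label column is not stated in the description.

```python
def parse_samples(text):
    """Parse 'one sample per line, space-separated features' into lists of floats."""
    return [
        [float(value) for value in line.split()]
        for line in text.splitlines()
        if line.strip()  # skip blank lines
    ]

# Two made-up lines in the described layout, for illustration only.
demo = "5.1 3.5 1.4 0.2\n4.9 3.0 1.4 0.2\n"
samples = parse_samples(demo)
print(len(samples), len(samples[0]))
```

    For a real file, pass the result of reading it as text, for example `parse_samples(open(path).read())`.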

    2. scale_datasets.tar.xz

    Holds the SynthNew_{k}_{d}_{s}.txt files for scaling experiments, where:

    • $k \in \{2,4,8,16,32\}$ is the number of clusters,

    • $d \in \{2,4,8,16,32,64,128,256,512\}$ is the dimensionality,

    • $s \in \{1,2,3\}$ are different random seeds.

    These are generated with the R clusterGeneration package with cluster sizes following a $1:2:...:k$ ratio. We incorporate a random number (in $[0, 100]$) of randomly sampled outliers and set the cluster separation degrees randomly in $[0.16, 0.26]$, spanning partially overlapping to separated clusters.
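
    To make the naming scheme concrete, the sketch below enumerates the file names implied by the SynthNew_{k}_{d}_{s}.txt pattern and the three parameter sets listed above. It assumes every combination of k, d, and s is present in the archive, which the description does not state explicitly.

```python
from itertools import product

# Parameter grids as listed in the description.
ks = [2, 4, 8, 16, 32]                       # number of clusters
ds = [2, 4, 8, 16, 32, 64, 128, 256, 512]    # dimensionality
seeds = [1, 2, 3]                            # random seeds

# Assumption: the archive holds the full cross product of the three grids.
names = [f"SynthNew_{k}_{d}_{s}.txt" for k, d, s in product(ks, ds, seeds)]
print(len(names), names[0], names[-1])
```

    Under that assumption the grid yields 5 × 9 × 3 = 135 file names.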

    3. ablate_datasets.tar.xz

    Contains the AblateSynth_{k}_{d}_{sep}.txt files for ablation studies, with:

    • $k \in \{2,4,8,16\}$ clusters,

    • $d \in \{2,4,8,16\}$ dimensions,

    • $sep \in \{0.25, 0.5, 0.75\}$ controlling cluster separation degrees.

    Also generated via clusterGeneration.

    4. g2_datasets.tar.xz

    Packages the G2 synthetic sets (g2-{dim}-{var}.txt) from the clustering-data benchmarks:

    • $N=2048$ samples, $k=2$ Gaussian clusters,

    • Dimensions $d \in \{1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024\}$

    • Cluster overlap $var \in \{10, 20, 30, 40, 50, 60, 70, 80, 90, 100\}$

    5. timing_datasets.tar.xz

    Includes:

    • s1.txt, lsun.txt: two real datasets for baseline timing.

    • timesynth_{k}_{d}_{n}.txt: synthetic timing datasets with balanced cluster sizes $C_{avg} = N/k$, varying:

      • $k \in \{2,5\}$

      • $d \in \{2,5\}$

      • $N \in \{10000, 100000\}$

    Generated similarly to the scaling sets, following Mohassel et al.’s timing experiment protocol.

    Usage:

    Unpack any archive with tar -xJf
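
    For readers who prefer to unpack from Python rather than the shell, the standard-library `tarfile` module can read xz-compressed archives. The sketch below is an assumed equivalent of the `tar -xJf` usage above, not part of the artifact; the function name `unpack_xz` and the paths in the usage note are hypothetical.

```python
import tarfile

def unpack_xz(archive_path, dest_dir):
    """Extract a .tar.xz archive into dest_dir and return the member names."""
    # mode "r:xz" opens the archive for reading with xz (LZMA) decompression.
    with tarfile.open(archive_path, mode="r:xz") as tar:
        names = tar.getnames()
        # Only extract archives from sources you trust.
        tar.extractall(dest_dir)
    return names
```

    A call such as `unpack_xz("real_datasets.tar.xz", "data")` would extract that archive into a `data` directory, assuming the archive has been downloaded to the working directory.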

  9. Iris dataset

    • kaggle.com
    zip
    Updated Jul 20, 2022
    Cite
    Himanshu Nakrani (2022). Iris dataset [Dataset]. https://www.kaggle.com/datasets/himanshunakrani/iris-dataset/discussion
    Explore at:
    Available download formats: zip (1006 bytes)
    Dataset updated
    Jul 20, 2022
    Authors
    Himanshu Nakrani
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    It includes three iris species with 50 samples each as well as some properties of each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other.

    File name: iris.csv

  10. Iris dataset local result Table for A and B (RA, RB) using purity index.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    WAQAR ISHAQ; ELIYA BUYUKKAYA; MUSHTAQ ALI; ZAKIR KHAN (2023). Iris dataset local result Table for A and B (RA, RB) using purity index. [Dataset]. http://doi.org/10.1371/journal.pone.0244691.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    WAQAR ISHAQ; ELIYA BUYUKKAYA; MUSHTAQ ALI; ZAKIR KHAN
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Iris dataset local result Table for A and B (RA, RB) using purity index.

  11. Iris dataset

    • kaggle.com
    zip
    Updated Jan 16, 2024
    Cite
    Ehsan Zafari (2024). Iris dataset [Dataset]. https://www.kaggle.com/datasets/ehsanzafari/iris-dataset
    Explore at:
    Available download formats: zip (955 bytes)
    Dataset updated
    Jan 16, 2024
    Authors
    Ehsan Zafari
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    The Iris dataset is a classic dataset in the field of machine learning and statistics. It's often used for demonstrating various data analysis, machine learning, and statistical techniques. Here are some key details about it:

    Background
    • Origin: The dataset was introduced by the British statistician and biologist Ronald Fisher in his 1936 paper titled "The use of multiple measurements in taxonomic problems."
    • Purpose: Fisher developed the dataset as an example of linear discriminant analysis.

    Data Composition
    • Data Points: The dataset consists of 150 samples from three species of Iris flowers: Iris Setosa, Iris Versicolour, and Iris Virginica.
    • Features: There are four features measured in centimeters for each sample: Sepal Length, Sepal Width, Petal Length, and Petal Width.
    • Classes: The dataset contains three classes, corresponding to the three species of Iris. Each class has 50 samples.

    Usage
    • Classification: The Iris dataset is widely used for classification tasks, especially to illustrate the principles of supervised machine learning algorithms.
    • Testing Algorithms: It's often used to test out algorithms for linear regression, classification, and clustering due to its simplicity and small size.
    • Educational Purpose: Because of its clarity and simplicity, it's frequently used in teaching data science and machine learning.

    Characteristics
    • Simple and Clean: The dataset is straightforward, with minimal preprocessing required, making it ideal for beginners.
    • Well-Behaved Classes: The species are relatively well separated, though there's some overlap between Versicolor and Virginica.
    • Multivariate Data: It involves understanding the relationship between multiple variables (the four features).

    Applications
    • Benchmarking: The Iris dataset serves as a benchmark for evaluating the performance of different algorithms.
    • Visualization: It's great for practicing data visualization, especially for exploring techniques like scatter plots, box plots, and pair plots to understand feature relationships.

    Despite its simplicity, the Iris dataset remains one of the most famous datasets in the world of data science and machine learning. It serves as an excellent starting point for anyone new to the field and remains a baseline for testing algorithms and teaching concepts.

  12. Consistency of variables for the dataset Iris Plant.

    • plos.figshare.com
    xls
    Updated Jun 8, 2023
    Cite
    Anny K. G. Rodrigues; Raydonal Ospina; Marcelo R. P. Ferreira (2023). Consistency of variables for the dataset Iris Plant. [Dataset]. http://doi.org/10.1371/journal.pone.0259266.t006
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Anny K. G. Rodrigues; Raydonal Ospina; Marcelo R. P. Ferreira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Consistency of variables for the dataset Iris Plant.

  13. Performance of the VKFCM-K-LP clustering algorithm with the WDS, PDS and OCS...

    • plos.figshare.com
    xls
    Updated Jun 8, 2023
    Cite
    Anny K. G. Rodrigues; Raydonal Ospina; Marcelo R. P. Ferreira (2023). Performance of the VKFCM-K-LP clustering algorithm with the WDS, PDS and OCS strategies for the dataset Iris Plant. [Dataset]. http://doi.org/10.1371/journal.pone.0259266.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Anny K. G. Rodrigues; Raydonal Ospina; Marcelo R. P. Ferreira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance of the VKFCM-K-LP clustering algorithm with the WDS, PDS and OCS strategies for the dataset Iris Plant.

  14. Dataset for Feature Scaling [Standardization]

    • kaggle.com
    zip
    Updated Nov 30, 2024
    Cite
    Mit Gandhi (2024). Dataset for Feature Scaling [Standardization] [Dataset]. https://www.kaggle.com/datasets/mitgandhi10/dataset-for-feature-scaling-standardization
    Explore at:
    Available download formats: zip (951 bytes)
    Dataset updated
    Nov 30, 2024
    Authors
    Mit Gandhi
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset contains information about three species of Iris flowers: Setosa, Versicolour, and Virginica. It is a well-known dataset in the machine learning and statistics communities, often used for classification and clustering tasks. Each row represents a sample of an Iris flower, with measurements of its physical attributes and the corresponding target label.

    Dataset Features:
    • sepal length (cm): The length of the sepal in centimeters.
    • sepal width (cm): The width of the sepal in centimeters.
    • petal length (cm): The length of the petal in centimeters.
    • petal width (cm): The width of the petal in centimeters.
    • target: A numerical label (0, 1, or 2) indicating the flower species: 0 = Setosa, 1 = Versicolour, 2 = Virginica.

    Purpose: This dataset can be used for:
    • Supervised learning tasks, particularly classification.
    • Exploratory data analysis and visualization of flower attributes.
    • Understanding the application of machine learning algorithms like decision trees, KNN, and support vector machines.

    Source: This is a modified version of the classic Iris flower dataset, often used for beginner-level machine learning projects and demonstrations.

    Potential Use Cases:
    • Training machine learning models for flower classification.
    • Practicing data preprocessing, feature scaling, and visualization techniques.
    • Understanding the relationships between features through scatter plots and correlation analysis.
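
    Since this entry is aimed at feature scaling, here is a minimal sketch of standardization (z-scoring) for a single feature column, using only the standard library. The function name `standardize` and the three input values are assumptions for illustration; the sketch uses the population standard deviation, and a zero-variance column is mapped to zeros by choice.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Made-up values, for illustration only.
scaled = standardize([1.0, 2.0, 3.0])
print(scaled)
```

    Each feature column would be standardized separately, so that features measured on different scales contribute comparably to distance-based methods.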

  15. Consistency of variables in the VKFCM-K-LP clustering with the imputation of...

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    + more versions
    Cite
    Anny K. G. Rodrigues; Raydonal Ospina; Marcelo R. P. Ferreira (2023). Consistency of variables in the VKFCM-K-LP clustering with the imputation of missing values using mean values for the Iris Plant dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0259266.t013
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Anny K. G. Rodrigues; Raydonal Ospina; Marcelo R. P. Ferreira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Consistency of variables in the VKFCM-K-LP clustering with the imputation of missing values using mean values for the Iris Plant dataset.

  16. iris kmeans clustering

    • kaggle.com
    zip
    Updated Oct 8, 2020
    Cite
    ANSHAD RAHMAN (2020). iris kmeans clustering [Dataset]. https://www.kaggle.com/anshadrahmanp/iris-kmeans-clustering
    Explore at:
    Available download formats: zip (273441 bytes)
    Dataset updated
    Oct 8, 2020
    Authors
    ANSHAD RAHMAN
    Description

    Dataset

    This dataset was created by ANSHAD RAHMAN

    Contents

  17. DataSheet_1_Metabolic Profiling in Human Fibroblasts Enables Subtype...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Nov 23, 2020
    Cite
    Hannibal, Luciana; Klotz, Katharina; Wingert, Victoria; Bierschenk, Iris; Theimer, Jule; Nitschke, Roland; Grünert, Sarah C.; Spiekerkoetter, Ute (2020). DataSheet_1_Metabolic Profiling in Human Fibroblasts Enables Subtype Clustering in Glycogen Storage Disease.pdf [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000511613
    Explore at:
    Dataset updated
    Nov 23, 2020
    Authors
    Hannibal, Luciana; Klotz, Katharina; Wingert, Victoria; Bierschenk, Iris; Theimer, Jule; Nitschke, Roland; Grünert, Sarah C.; Spiekerkoetter, Ute
    Description

    Glycogen storage disease subtypes I and III (GSD I and GSD III) are monogenic inherited disorders of metabolism that disrupt glycogen metabolism. Unavailability of glucose in GSD I and induction of gluconeogenesis in GSD III modify energy sources and possibly, mitochondrial function. Abnormal mitochondrial structure and function were described in mice with GSD Ia, yet significantly less research is available in human cells and ketotic forms of the disease. We hypothesized that impaired glycogen storage results in distinct metabolic phenotypes in the extra- and intracellular compartments that may contribute to pathogenesis. Herein, we examined mitochondrial organization in live cells by spinning-disk confocal microscopy and profiled extra- and intracellular metabolites by targeted LC-MS/MS in cultured fibroblasts from healthy controls and from patients with GSD Ia, GSD Ib, and GSD III. Results from live imaging revealed that mitochondrial content and network morphology of GSD cells are comparable to that of healthy controls. Likewise, healthy controls and GSD cells exhibited comparable basal oxygen consumption rates. Targeted metabolomics followed by principal component analysis (PCA) and hierarchical clustering (HC) uncovered metabolically distinct poises of healthy controls and GSD subtypes. Assessment of individual metabolites recapitulated dysfunctional energy production (glycolysis, Krebs cycle, succinate), reduced creatinine export in GSD Ia and GSD III, and reduced antioxidant defense of the cysteine and glutathione systems. Our study serves as proof-of-concept that extra- and intracellular metabolite profiles distinguish glycogen storage disease subtypes from healthy controls. We posit that metabolite profiles provide hints to disease mechanisms as well as to nutritional and pharmacological elements that may optimize current treatment strategies.

  18. Iris Davies Bouldin measurement.

    • figshare.com
    xls
    Updated Jun 12, 2023
    Cite
    WAQAR ISHAQ; ELIYA BUYUKKAYA; MUSHTAQ ALI; ZAKIR KHAN (2023). Iris Davies Bouldin measurement. [Dataset]. http://doi.org/10.1371/journal.pone.0244691.t006
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 12, 2023
    Dataset provided by
    PLOS ONE
    Authors
    WAQAR ISHAQ; ELIYA BUYUKKAYA; MUSHTAQ ALI; ZAKIR KHAN
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Iris Davies Bouldin measurement.

  19. Topological descriptors obtained from the lupus murine spleen data set which...

    • figshare.com
    xlsx
    Updated Oct 15, 2025
    Cite
    Maria Torras-Pérez; Iris H. R. Yoon; Praveen Weeratunga; Ling-Pei Ho; Helen M. Byrne; Ulrike Tillmann; Heather A. Harrington (2025). Topological descriptors obtained from the lupus murine spleen data set which yield a correct 2- or 4-clustering into disease stages, sorted by density of the cell type, degree of homology and silhouette score of the 2-clustering. [Dataset]. http://doi.org/10.1371/journal.pcbi.1013460.s008
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Oct 15, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Maria Torras-Pérez; Iris H. R. Yoon; Praveen Weeratunga; Ling-Pei Ho; Helen M. Byrne; Ulrike Tillmann; Heather A. Harrington
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In the first column, we include the average number of cells in that type of point cloud across all samples. We also include the average number of landmarks, if the witness complex is computed for that cell type. (XLSX)

  20. Voronoi Tessellation Captures Very Early Clustering of Single Primary Cells...

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Iris Hödl; Josef Hödl; Anders Wörman; Gabriel Singer; Katharina Besemer; Tom J. Battin (2023). Voronoi Tessellation Captures Very Early Clustering of Single Primary Cells as Induced by Interactions in Nascent Biofilms [Dataset]. http://doi.org/10.1371/journal.pone.0026368
    Explore at:
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Iris Hödl; Josef Hödl; Anders Wörman; Gabriel Singer; Katharina Besemer; Tom J. Battin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Biofilms dominate microbial life in numerous aquatic ecosystems, and in engineered and medical systems, as well. The formation of biofilms is initiated by single primary cells colonizing surfaces from the bulk liquid. The next steps from primary cells towards the first cell clusters as the initial step of biofilm formation remain relatively poorly studied. Clonal growth and random migration of primary cells are traditionally considered as the dominant processes leading to organized microcolonies in laboratory grown monocultures. Using Voronoi tessellation, we show that the spatial distribution of primary cells colonizing initially sterile surfaces from natural streamwater community deviates from uniform randomness already during the very early colonisation. The deviation from uniform randomness increased with colonisation — despite the absence of cell reproduction — and was even more pronounced when the flow of water above biofilms was multidirectional and shear stress elevated. We propose a simple mechanistic model that captures interactions, such as cell-to-cell signalling or chemical surface conditioning, to simulate the observed distribution patterns. Model predictions match empirical observations reasonably well, highlighting the role of biotic interactions even already during very early biofilm formation despite few and distant cells. The transition from single primary cells to clustering accelerated by biotic interactions rather than by reproduction may be particularly advantageous in harsh environments — the rule rather than the exception outside the laboratory.

Cite
Rifki Ilham (2023). Clustering Iris Data Set [Dataset]. https://www.kaggle.com/datasets/rifkiilham/clustering-iris-data-set/suggestions

Clustering Iris Data Set

Explore at:
144 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (11024 bytes)
Dataset updated
Sep 2, 2023
Authors
Rifki Ilham
Description

The Iris flower data set, or Fisher's Iris data set, is a multivariate data set made famous by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems", where it serves as an example of linear discriminant analysis. Use this data set to cluster the iris flowers, for example with the k-means clustering algorithm.
