6 datasets found

Iris Species
kaggle.com
zip
Updated Sep 27, 2016
Cite
UCI Machine Learning (2016). Iris Species [Dataset]. https://www.kaggle.com/datasets/uciml/iris
Explore at:
Available download formats: zip (3687 bytes)
Dataset updated
Sep 27, 2016
Dataset authored and provided by
UCI Machine Learning
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The Iris dataset was used in R.A. Fisher's classic 1936 paper, The Use of Multiple Measurements in Taxonomic Problems, and can also be found on the UCI Machine Learning Repository.

It includes three iris species with 50 samples each as well as some properties about each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other.

The columns in this dataset are:

Id

SepalLengthCm

SepalWidthCm

PetalLengthCm

PetalWidthCm

Species
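
The column list above can be checked mechanically before any analysis. The following is a minimal sketch using only the Python standard library; the function name `species_counts` and the three sample rows are illustrative assumptions (the rows are placeholders, not real measurements), and the header is taken from the column list above.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "Id", "SepalLengthCm", "SepalWidthCm",
    "PetalLengthCm", "PetalWidthCm", "Species",
]


def species_counts(csv_file):
    """Verify the header against the documented columns, then count rows per species."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return Counter(row["Species"] for row in reader)


# Placeholder rows in the documented layout (made up for illustration only).
SAMPLE = io.StringIO(
    "Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species\n"
    "1,1.0,1.0,1.0,1.0,species-a\n"
    "2,2.0,2.0,2.0,2.0,species-b\n"
    "3,3.0,3.0,3.0,3.0,species-a\n"
)
counts = species_counts(SAMPLE)
print(dict(counts))
```

On the real download, the same function could be called on a file opened with `open(path, newline="")`; according to the description, each of the three species should then have a count of 50.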
Iris flower prediction using streamlit in python
kaggle.com
Updated Mar 23, 2023
Cite
sadaf koondhar (2023). Iris flower prediction using streamlit in python [Dataset]. https://www.kaggle.com/datasets/sadafkoondhar/iris-flower-prediction-using-streamlit-in-python
Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 23, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
sadaf koondhar
Description
Dataset

This dataset was created by sadaf koondhar

Contents
f
Iris Webpage
figshare.com
html
Updated Mar 9, 2020
Cite
Jesus Rogel-Salazar (2020). Iris Webpage [Dataset]. http://doi.org/10.6084/m9.figshare.7053392.v4
Explore at:
Available download formats: html
Unique identifier
https://doi.org/10.6084/m9.figshare.7053392.v4
Dataset updated
Mar 9, 2020
Dataset provided by
figshare
Authors
Jesus Rogel-Salazar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A simple web page containing Fisher's Iris Dataset.
All Seaborn Built-in Datasets 📊✨
kaggle.com
Updated Aug 27, 2024
Cite
Abdelrahman Mohamed (2024). All Seaborn Built-in Datasets 📊✨ [Dataset]. https://www.kaggle.com/datasets/abdoomoh/all-seaborn-built-in-datasets
Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 27, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Abdelrahman Mohamed
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This dataset includes all 22 built-in datasets from the Seaborn library, a widely used Python data visualization tool. Seaborn's built-in datasets are useful resources for practicing data analysis, visualization, and machine learning. They span a wide range of topics, from classic datasets like the Iris flower classification to real-world data such as Titanic survival records and diamond characteristics.

Included Datasets:

Anagrams: Analysis of word anagram patterns.

Anscombe: Anscombe's quartet demonstrating the importance of data visualization.

Attention: Data on attention span variations in different scenarios.

Brain Networks: Connectivity data within brain networks.

Car Crashes: US car crash statistics.

Diamonds: Data on diamond properties including price, cut, and clarity.

Dots: Randomly generated data for scatter plot visualization.

Dow Jones: Historical records of the Dow Jones Industrial Average.

Exercise: The relationship between exercise and health metrics.

Flights: Monthly passenger numbers on flights.

FMRI: Functional MRI data capturing brain activity.

Geyser: Eruption times of the Old Faithful geyser.

Glue: Strength of glue under different conditions.

Health Expenditure: Health expenditure statistics across countries.

Iris: Famous dataset for classifying Iris species.

MPG: Miles per gallon for various vehicles.

Penguins: Data on penguin species and their features.

Planets: Characteristics of discovered exoplanets.

Sea Ice: Measurements of sea ice extent.

Taxis: Taxi trips data in a city.

Tips: Tipping data collected from a restaurant.

Titanic: Survival data from the Titanic disaster.

This complete collection serves as an excellent starting point for anyone looking to improve their data science skills, offering a wide array of datasets suitable for both beginners and advanced users.
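
For readers who have Seaborn installed, the same datasets can usually be fetched by name rather than from the Kaggle files. The sketch below is a hedged illustration: `seaborn.load_dataset` and `seaborn.get_dataset_names` are real Seaborn functions, but the lower-case identifiers in `DATASET_NAMES` are an assumption about how the 22 titles above map onto Seaborn's names, and the helper `load_builtin` is hypothetical. Loading needs an internet connection on first use.

```python
# Assumed identifiers for the 22 datasets listed above; confirm them on your
# own installation with seaborn.get_dataset_names().
DATASET_NAMES = [
    "anagrams", "anscombe", "attention", "brain_networks", "car_crashes",
    "diamonds", "dots", "dowjones", "exercise", "flights", "fmri",
    "geyser", "glue", "healthexp", "iris", "mpg", "penguins", "planets",
    "seaice", "taxis", "tips", "titanic",
]


def load_builtin(name):
    """Return one built-in dataset as a pandas DataFrame (hypothetical helper)."""
    if name not in DATASET_NAMES:
        raise KeyError(f"unknown dataset name: {name!r}")
    # Imported lazily so that the name list can be inspected without seaborn.
    import seaborn as sns
    return sns.load_dataset(name)  # downloads from the online repository, then caches


print(len(DATASET_NAMES), "dataset names listed")
```

A call such as `load_builtin("iris")` would then return the Iris table, assuming Seaborn and network access are available.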
Visual ECoG dataset
openneuro.org
Updated Apr 1, 2025
Cite
Iris Groen; Kenichi Yuasa; Amber Brands; Giovanni Piantoni; Stephanie Montenegro; Adeen Flinker; Sasha Devore; Orrin Devinsky; Werner Doyle; Patricia Dugan; Daniel Friedman; Nick Ramsey; Natalia Petridou; Jonathan Winawer (2025). Visual ECoG dataset [Dataset]. http://doi.org/10.18112/openneuro.ds004194.v3.0.0
Explore at:
Unique identifier
https://doi.org/10.18112/openneuro.ds004194.v3.0.0
Dataset updated
Apr 1, 2025
Dataset provided by
OpenNeuro (https://openneuro.org/)
Authors
Iris Groen; Kenichi Yuasa; Amber Brands; Giovanni Piantoni; Stephanie Montenegro; Adeen Flinker; Sasha Devore; Orrin Devinsky; Werner Doyle; Patricia Dugan; Daniel Friedman; Nick Ramsey; Natalia Petridou; Jonathan Winawer
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Details related to access to the data

Contact person

Please contact Iris Groen (i.i.a.groen@uva.nl, https://orcid.org/0000-0002-5536-6128) for more information.

Please see the following papers for more details on the data collection and preprocessing:

Groen IIA, Piantoni G, Montenegro S, Flinker A, Devore S, Devinsky O, Doyle W, Dugan P, Friedman D, Ramsey N, Petridou N, Winawer JA (2022) Temporal dynamics of neural responses in human visual cortex. The Journal of Neuroscience 42(40):7562-7580 (https://doi.org/10.1523/JNEUROSCI.1812-21.2022)

Yuasa K, Groen IIA, Piantoni G, Montenegro S, Flinker A, Devore S, Devinsky O, Doyle W, Dugan P, Friedman D, Ramsey N, Petridou N, Winawer JA. Precise Spatial Tuning of Visually Driven Alpha Oscillations in Human Visual Cortex. eLife 12:RP90387 (https://doi.org/10.7554/eLife.90387.1)

Brands AM, Devore S, Devinsky O, Doyle W, Flinker A, Friedman D, Dugan P, Winawer JA, Groen IIA (2024). Temporal dynamics of short-term neural adaptation in human visual cortex. https://doi.org/10.1101/2023.09.13.557378

Practical information to access the data

Processed data and model fits reported in Groen et al. (2022) are available in derivatives/Groenetal2022TemporalDynamicsECoG as MATLAB .mat files. MATLAB code to load, process and plot these data (including 3D renderings of the participant's surface reconstructions and electrode positions) is available at https://github.com/WinawerLab/ECoG_utils and https://github.com/irisgroen/temporalECoG. These repositories depend on other MATLAB toolboxes (e.g., FieldTrip). See the instructions on GitHub for relevant links and guidelines.

Processed data and model fits reported in Yuasa et al. (2023) are available in the GitHub repositories described in the paper.

Processed data and model fits reported in Brands et al. (2024) are available in derivatives/Brandsetal2024TemporalAdaptationECoGCategories as Python .py files. Python code to process and analyze these data is available in the GitHub repositories described in the paper.

Overview

Project name

Visual ECoG dataset

Years that the project ran

Data were collected between 2017 and 2020. Exact recording dates have been scrubbed for anonymization purposes.

Brief overview of the tasks in the experiment

Participants sub-p01 to sub-p11 viewed grayscale visual pattern stimuli that were varied in temporal or spatial properties. Participants sub-p11 to sub-p14 additionally saw color images of different image classes (faces, bodies, buildings, objects, scenes, and scrambled) that were varied in temporal properties. See 'Independent variables' below for more details.

In all tasks, participants were instructed to fixate a cross or point in the center of the screen and monitor it for a color change, i.e. to perform a stimulus-orthogonal task (see the task-specific _events.json files, e.g., task-prf_events.json, for further details).

Description of the contents of the dataset

The data consist of cortical iEEG recordings in 14 epilepsy patients in response to visual stimulation. Patients were implanted with standard clinical surface (grid) and depth electrodes. Two patients were additionally implanted with a high-density research grid. In addition to the iEEG recordings, pre-implantation MRI T1 scans are provided for the purpose of localizing electrodes. Participants performed a varying number of tasks and runs.

Independent variables

The data are divided in 6 different sets of stimulus types or events:

prf: grayscale, oriented bar stimuli consisting of curved, band-pass filtered lines that were swept across the screen (up to ~16 degrees of visual angle) in a fixed order for the purpose of estimating spatial population receptive fields (pRFs).

spatialpattern: grayscale, centrally presented pattern stimuli (~16 degrees of visual angle in diameter) consisting of curved, band-pass filtered lines that were systematically varied in level of contrast and density, as well as various oriented grating stimuli.

temporalpattern: grayscale, centrally presented pattern stimuli (~16 degrees of visual angle in diameter) consisting of curved, band-pass filtered lines that were systematically varied in temporal duration and interval.

soc: combination of the spatialpattern and temporalpattern stimuli.

sixcatloctemporal: color images of six stimulus classes: faces, bodies (hands/feet only), buildings, objects, scenes and scrambled, systematically varied in temporal duration and interval, whereby interval stimuli consisted of direct repeats of the identical image.

sixcatlocisidiff/sixcatlocdiffisi: color images of six stimulus classes: faces, bodies (hands/feet only), buildings, objects, scenes and scrambled, systematically varied in temporal duration and interval, whereby the first interval stimulus was followed by images from either the same or a different category (but not the identical image).

Participant-, task- and run-specific stimuli are provided in the /stimuli folder as matlab .mat files.

Dependent variables

The main BIDS folder contains the raw voltage data, split up into individual task runs. The /derivatives/ECoGCAR folder contains a common-average-referenced version of the data. The /derivatives/ECoGBroadband folder contains time-varying broadband responses estimated by band-pass filtering the common-average-referenced voltage data and taking the average power envelope. The /derivatives/ECoGPreprocessed folder contains epoched trials used in Brands et al. (2024). The /derivatives/freesurfer folder contains surface reconstructions of each participant's T1, along with retinotopic atlas files. The /derivatives/Groen2022TemporalDynamicsECoG folder contains preprocessed data and model fits that can be used to reproduce the results reported in Groen et al. (2022). The /derivatives/Brands2024TemporalAdaptationECoG folder contains preprocessed data and model fits that can be used to reproduce the results reported in Brands et al. (2024).
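
To make the term "common-average-referenced" concrete, here is a minimal sketch of the textbook operation: at each time point, the mean across channels is subtracted from every channel. This is an illustration only and not the authors' pipeline, which may, for example, exclude bad channels or reference electrode groups separately. The function name and the toy voltages are assumptions.

```python
def common_average_reference(data):
    """Subtract the across-channel mean from every channel at each time point.

    `data` is a list of channels, each a list of voltage samples of equal length.
    """
    n_channels = len(data)
    n_samples = len(data[0])
    if any(len(channel) != n_samples for channel in data):
        raise ValueError("all channels must have the same number of samples")
    # Mean over channels, computed separately for each time point.
    means = [sum(channel[t] for channel in data) / n_channels
             for t in range(n_samples)]
    return [[channel[t] - means[t] for t in range(n_samples)]
            for channel in data]


# Toy voltages for three channels and three time points (made up).
raw = [[1.0, 2.0, 3.0],
       [3.0, 2.0, 1.0],
       [2.0, 5.0, 2.0]]
car = common_average_reference(raw)
print(car)
```

After re-referencing, the channel values at each time point sum to zero, which is a quick sanity check on any implementation.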

Quality assessment of the data

Data quality and the number of trials per subject vary considerably across patients, for various reasons.

First, for each recording session, attempts were made to optimize the environment for running visual experiments; e.g. room illumination was stabilized as much as possible by closing blinds when available, the visual display was calibrated (for most patients), and interference from medical staff or visitors was minimized. However, it was not possible to equate this with great precision across patients and sessions/runs.

Second, implantations were determined based on clinical needs and electrode locations therefore vary across participants. The strength and robustness of the neural responses varies greatly with the electrode location (e.g. early vs higher-level visual cortex), as well as with uncontrolled factors such as how well the electrode made contact with the cortex and whether it was primarily situated on grey matter (surface/grid electrodes) or could be located in white matter (some depth electrodes). Electrodes that were marked as containing epileptic activity by clinicians, or that did not have good signal based on visual inspection of the raw data, are marked as 'bad' in the channels.tsv files.

Third, patients varied greatly in their cognitive abilities and mental/medical state, which affected their ability to follow task instructions, e.g. to remain alert and maintain fixation. Some patients were able to perform repeated runs of multiple tasks across multiple sessions, while others only managed to do a few runs.

All patients included in this dataset have sufficiently good responses in some electrodes/tasks as judged by Groen et al. (2022) and Brands et al. (2024). However, when using this dataset to address further research questions, it is advisable to set stringent requirements on electrode and trial selection. See Groen et al. (2022) and the associated code repository for an example preprocessing pipeline that selected for robust visual responses to temporally- and contrast-varying stimuli.
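
As a first step toward the electrode selection recommended above, the 'bad' markings in the channels.tsv files can be used to drop channels. The sketch below assumes the standard BIDS layout, in which channels.tsv is tab-separated with a `name` column and a `status` column holding `good` or `bad`; the channel names in the sample are made up, and the function name is hypothetical.

```python
import csv
import io


def good_channels(tsv_file):
    """Return the names of channels that are not marked 'bad'.

    Rows without a status value are kept, on the assumption that only
    problematic electrodes were explicitly flagged.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [row["name"] for row in reader
            if (row.get("status") or "").strip().lower() != "bad"]


# Made-up rows in the assumed channels.tsv layout.
SAMPLE = io.StringIO(
    "name\ttype\tunits\tstatus\n"
    "ch01\tECOG\tuV\tgood\n"
    "ch02\tECOG\tuV\tbad\n"
    "ch03\tSEEG\tuV\tgood\n"
)
kept = good_channels(SAMPLE)
print(kept)
```

This only removes flagged electrodes; selecting for robust visual responses, as in the cited pipeline, would require further criteria on the signals themselves.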

Methods

Subjects

All participants were intractable epilepsy patients who were undergoing ECoG for the purpose of monitoring seizures. Participants were included if their implantation covered parts of visual cortex and if they consented to participate in research.

Apparatus

Data were collected in a clinical setting, i.e. at bedside in the patient's hospital room. Information about the iEEG recording apparatus is provided in the metadata for each patient. Information about the visual stimulation equipment and behavioral response recordings is provided in Groen et al. (2022), Yuasa et al. (2023) and Brands et al. (2024).

Experimental location

Data were collected at NYU University Langone Hospital (New York, USA) or at University Medical Center Utrecht (The Netherlands).

Missing data

Stimulus files are missing for a few runs of sub-02. These are marked as N/A in the associated event files.

Notes

Further participant-specific notes:

For sub-03 and sub-04, the spatial pattern and temporal pattern stimuli are combined in the soc task runs; for the remaining participants, these are split across the spatialpattern and temporalpattern task runs.

The pRF task from sub-04 has different prf parameters (bar duration and gap).

The first two runs of the pRF task from sub-05 are not of good quality (participant repeatedly broke fixation). In addition, the triggers in all pRF runs from sub-05 are not correct due to a stimulus coding problem and will need to be re-interpolated if one wishes to use these data.

Participants sub-10 and sub-11 have high density grids in addition to clinical grids.

Note that all stimuli and stimulus parameters can be found in the participant-specific stimulus *.mat files.
Earthquake Early Warning Dataset
figshare.com
txt
Updated Nov 20, 2019
Cite
Kevin Fauvel; Daniel Balouek-Thomert; Diego Melgar; Pedro Silva; Anthony Simonet; Gabriel Antoniu; Alexandru Costan; Véronique Masson; Manish Parashar; Ivan Rodero; Alexandre Termier (2019). Earthquake Early Warning Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.9758555.v3
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.9758555.v3
Dataset updated
Nov 20, 2019
Dataset provided by
Figshare (http://figshare.com/)
Authors
Kevin Fauvel; Daniel Balouek-Thomert; Diego Melgar; Pedro Silva; Anthony Simonet; Gabriel Antoniu; Alexandru Costan; Véronique Masson; Manish Parashar; Ivan Rodero; Alexandre Termier
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is composed of GPS stations (1 file) and seismometers (1 file) multivariate time series data associated with three types of events (normal activity / medium earthquakes / large earthquakes).

Files Format: plain text
Files Creation Date: 02/09/2019
Data Type: multivariate time series
Number of Dimensions: 3 (east-west, north-south and up-down)
Time Series Length: 60 (one data point per second)
Period: 2001-2018
Geographic Location: -62 ≤ latitude ≤ 73, -179 ≤ longitude ≤ 25

Data Collection:

- Large Earthquakes: GPS stations and seismometers data are obtained from the archive [1]. This archive includes 29 large earthquakes. In order to be able to adopt a homogeneous labeling method, the dataset is limited to the data available from the American Incorporated Research Institutions for Seismology - IRIS (14 of the 29 large earthquakes remain).

> GPS stations (14 events): High Rate Global Navigation Satellite System (HR-GNSS) displacement data (1-5Hz). Raw observations have been processed with a precise point positioning algorithm [2] to obtain displacement time series in geodetic coordinates. Undifferenced GNSS ambiguities were fixed to integers to improve accuracy, especially over the low frequency band of tens of seconds [3]. Then, coordinates have been rotated to a local east-west, north-south and up-down system.

> Seismometers (14 events): seismometers strong motion data (1-10Hz). Channel files specify the units, sample rates, and gains of each channel.

- Normal Activity / Medium Earthquakes:

> GPS stations (255 events: 255 normal activity): High Rate Global Navigation Satellite System (HR-GNSS) normal activity displacement data (1Hz). GPS data outside of large earthquake periods can be considered as normal activity (noise). Data is downloaded from [4], an archive maintained by the University of Oregon which stores a representative extract of GPS noise. It is an archive of real-time three component positions for 240 stations in the western U.S. from California to Alaska and spanning from October 2018 to the present day. The raw GPS data (observations of phase and range to visible satellites) are processed with an algorithm called FastLane [5] and converted to 1 Hz sampled positions. Normal activity MTS are randomly sampled from the archive to match the number of seismometers events and to keep a ratio above 30% between the number of large earthquakes MTS and normal activity in order not to encounter a class imbalance issue.

> Seismometers (255 events: 170 normal activity, 85 medium earthquakes): seismometers strong motion data (1-10Hz). Time series data are collected from the international Federation of Digital Seismograph Networks (FDSN) client available in the Python package ObsPy [6]. Channel information specifies the units, sample rates, and gains of each channel. The number of medium earthquakes is calculated by the ratio of medium over large earthquakes during the past 10 years in the region. A ratio above 30% is kept between the number of 60 seconds MTS corresponding to earthquakes (medium + large) and total (earthquakes + normal activity) number of MTS to prevent a class imbalance issue.

The number of GPS stations and seismometers for each event varies (tens to thousands).

Preprocessing:

- Conversion (seismometers): data are available as a digital signal, which is specific to each sensor. Therefore, each instrument's digital signal is converted to its physical signal (acceleration) to obtain comparable seismometers data.
- Aggregation (GPS stations and seismometers): data aggregation by second (mean).

Variables:

- event_id: unique ID of an event. Dataset is composed of 269 events.
- event_time: timestamp of the event occurrence
- event_magnitude: magnitude of the earthquake (Richter scale)
- event_latitude: latitude of the event recorded (degrees)
- event_longitude: longitude of the event recorded (degrees)
- event_depth: distance below Earth's surface where earthquake happened (km)
- mts_id: unique multivariate time series ID. Dataset is composed of 2,072 MTS from GPS stations and 13,265 MTS from seismometers.
- station: sensor name (GPS station or seismometer)
- station_latitude: sensor (GPS station or seismometer) latitude (degrees)
- station_longitude: sensor (GPS station or seismometer) longitude (degrees)
- timestamp: timestamp of the multivariate time series
- dimension_E: East-West component of the sensor (GPS station or seismometer) signal (cm/s/s)
- dimension_N: North-South component of the sensor (GPS station or seismometer) signal (cm/s/s)
- dimension_Z: Up-Down component of the sensor (GPS station or seismometer) signal (cm/s/s)
- label: label associated with the event. There are 3 labels: normal activity (GPS stations: 255 events, seismometers: 170 events) / medium earthquake (GPS stations: 0 event, seismometers: 85 events) / large earthquake (GPS stations: 14 events, seismometers: 14 events).

EEW relies on the detection of the primary wave (P-wave) before the secondary wave (damaging wave) arrives. P-waves follow a propagation model (IASP91 [7]). Therefore, each MTS is labeled based on the P-wave arrival time on each sensor (seismometers, GPS stations) calculated with the propagation model.

[1] Ruhl, C. J., Melgar, D., Chung, A. I., Grapenthin, R. and Allen, R. M. 2019. Quantifying the value of real‐time geodetic constraints for earthquake early warning using a global seismic and geodetic data set. Journal of Geophysical Research: Solid Earth 124:3819-3837.
[2] Geng, J., Bock, Y., Melgar, D., Crowell, B. W., and Haase, J. S. 2013. A new seismogeodetic approach applied to GPS and accelerometer observations of the 2012 Brawley seismic swarm: Implications for earthquake early warning. Geochemistry, Geophysics, Geosystems 14:2124-2142.
[3] Geng, J., Jiang, P., and Liu, J. 2017. Integrating GPS with GLONASS for high‐rate seismogeodesy. Geophysical Research Letters 44:3139-3146.
[4] http://tunguska.uoregon.edu/rtgnss/data/cwu/mseed/
[5] Melgar, D., Melbourne, T., Crowell, B., Geng, J., Szeliga, W., Scrivner, C., Santillan, M. and Goldberg, D. 2019. Real-Time High-Rate GNSS Displacements: Performance Demonstration During the 2019 Ridgecrest, CA Earthquakes (Version 1.0) [Data set]. Zenodo.
[6] https://docs.obspy.org/packages/obspy.clients.fdsn.html
[7] Kennett, B. L. N. 1991. IASPEI 1991 Seismological Tables. Terra Nova 3:122–122.
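
The per-label event counts quoted in the description can be cross-checked against the stated total of 269 events. The short sketch below does that arithmetic and also reports the share of earthquake events per sensor type. Note the assumption: the 30% threshold in the description is stated for the number of MTS, whereas the figures here are event counts, so the shares are only a rough, event-level view.

```python
# Event counts per label, copied from the "label" entry in the description.
EVENTS = {
    "GPS stations": {
        "normal activity": 255, "medium earthquake": 0, "large earthquake": 14,
    },
    "seismometers": {
        "normal activity": 170, "medium earthquake": 85, "large earthquake": 14,
    },
}


def summarize(counts):
    """Return (total events, share of events that are earthquakes)."""
    total = sum(counts.values())
    quakes = counts["medium earthquake"] + counts["large earthquake"]
    return total, quakes / total


summary = {sensor: summarize(counts) for sensor, counts in EVENTS.items()}
for sensor, (total, share) in summary.items():
    print(f"{sensor}: {total} events, earthquake share {share:.1%}")
```

Both sensor types add up to the same total, which matches the 269 events stated for event_id.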

Cite

UCI Machine Learning (2016). Iris Species [Dataset]. https://www.kaggle.com/datasets/uciml/iris

Iris Species

Classify iris plants into three species in this classic dataset

Explore at:

39 scholarly articles cite this dataset (View in Google Scholar)
