17 datasets found
  1. Classification Analysis Using Python

    Exploring the Iris Dataset and Building a Decision Tree Classifier

    • kaggle.com
    Updated Jul 3, 2023
    Cite
    Nibedita Sahu (2023). Classification Analysis Using Python [Dataset]. https://www.kaggle.com/datasets/nibeditasahu/classification-analysis-using-python
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 3, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nibedita Sahu
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    The Iris dataset is a classic and widely used dataset in machine learning for classification tasks. It consists of measurements of different iris flowers, including sepal length, sepal width, petal length, and petal width, along with their corresponding species. With a total of 150 samples, the dataset is balanced and serves as an excellent choice for understanding and implementing classification algorithms. This notebook explores the dataset, preprocesses the data, builds a decision tree classification model, and evaluates its performance, showcasing the effectiveness of decision trees in solving classification problems.
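
    The notebook itself is linked above; as a rough sketch of the workflow it describes (using scikit-learn's bundled copy of the Iris data rather than the Kaggle file, with an illustrative depth limit and split ratio):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # Load the 150-sample, 4-feature Iris data
    X, y = load_iris(return_X_y=True)

    # Hold out 30% of samples for evaluation, stratified by species
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42, stratify=y)

    # Fit a small decision tree and report test accuracy
    clf = DecisionTreeClassifier(max_depth=3, random_state=42)
    clf.fit(X_train, y_train)
    print(f"Test accuracy: {accuracy_score(y_test, clf.predict(X_test)):.3f}")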

  2. Explore data formats and ingestion methods

    • kaggle.com
    Updated Feb 12, 2021
    Cite
    Gabriel Preda (2021). Explore data formats and ingestion methods [Dataset]. https://www.kaggle.com/datasets/gpreda/iris-dataset/discussion?sort=undefined
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 12, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Gabriel Preda
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Why this Dataset

    This dataset brings to you Iris Dataset in several data formats (see more details in the next sections).

    You can use it to test the ingestion of data in all these formats using Python or R libraries. We also prepared a Python Jupyter Notebook and an R Markdown report that read all these formats.

    Iris Dataset

    Iris Dataset was created by R. A. Fisher and donated by Michael Marshall.

    Repository on UCI site: https://archive.ics.uci.edu/ml/datasets/iris

    Data Source: https://archive.ics.uci.edu/ml/machine-learning-databases/iris/

    The downloaded file is iris.data, formatted as a comma-delimited file.

    This small data collection was created to help you test your skills with ingesting various data formats.

    Content

    This file was processed to convert the data into the following formats:
    * csv - comma separated values format
    * tsv - tab separated values format
    * parquet - parquet format
    * feather - feather format
    * parquet.gzip - compressed parquet format
    * h5 - hdf5 format
    * pickle - Python binary object file - pickle format
    * xlsx - Excel format
    * npy - NumPy (Python library) binary format
    * npz - NumPy (Python library) binary compressed format
    * rds - Rds (R-specific data format) binary format
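
    As a sketch of the ingestion exercise this collection is built for (file names are assumptions; pandas needs pyarrow for the parquet/feather formats, and .rds is best read from R or via the pyreadr package):

    import numpy as np
    import pandas as pd

    # Text formats
    df_csv = pd.read_csv('Iris.csv')
    df_tsv = pd.read_csv('Iris.tsv', sep='\t')

    # Columnar binary formats (pyarrow backend)
    df_parquet = pd.read_parquet('Iris.parquet')
    df_gzip = pd.read_parquet('Iris.parquet.gzip')
    df_feather = pd.read_feather('Iris.feather')

    # Other binary formats
    df_h5 = pd.read_hdf('Iris.h5')        # works when the file stores one object
    df_pkl = pd.read_pickle('Iris.pickle')
    df_xlsx = pd.read_excel('Iris.xlsx')

    # NumPy formats
    arr = np.load('Iris.npy', allow_pickle=True)
    npz = np.load('Iris.npz', allow_pickle=True)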

    Acknowledgements

    I would like to acknowledge the work of the creator of the dataset - R. A. Fisher and of the donor - Michael Marshall.

    Inspiration

    Use these data formats to test your skills in ingesting data in various formats.

  3. Iris Species Dataset and Database

    • kaggle.com
    Updated May 15, 2025
    Cite
    Ghanshyam Saini (2025). Iris Species Dataset and Database [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/iris-species-dataset-and-database
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 15, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ghanshyam Saini
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Iris Flower Dataset

    This is a classic and very widely used dataset in machine learning and statistics, often serving as a first dataset for classification problems. Introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems," it is a foundational resource for learning classification algorithms.

    Overview:

    The dataset contains measurements for 150 samples of iris flowers. Each sample belongs to one of three species of iris:

    • Iris setosa
    • Iris versicolor
    • Iris virginica

    For each flower, four features were measured:

    • Sepal length (in cm)
    • Sepal width (in cm)
    • Petal length (in cm)
    • Petal width (in cm)

    The goal is typically to build a model that can classify iris flowers into their correct species based on these four features.

    File Structure:

    The dataset is usually provided as a single CSV (Comma Separated Values) file, often named iris.csv or similar. This file typically contains the following columns:

    1. sepal_length (cm): Numerical. The length of the sepal of the iris flower.
    2. sepal_width (cm): Numerical. The width of the sepal of the iris flower.
    3. petal_length (cm): Numerical. The length of the petal of the iris flower.
    4. petal_width (cm): Numerical. The width of the petal of the iris flower.
    5. species: Categorical. The species of the iris flower (either 'setosa', 'versicolor', or 'virginica'). This is the target variable for classification.

    Content of the Data:

    The dataset contains an equal number of samples (50) for each of the three iris species. The measurements of the sepal and petal dimensions vary between the species, allowing for their differentiation using machine learning models.

    How to Use This Dataset:

    1. Download the iris.csv file.
    2. Load the data using libraries like Pandas in Python.
    3. Explore the data through visualization and statistical analysis to understand the relationships between the features and the different species.
    4. Build classification models (e.g., Logistic Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbors) using the sepal and petal measurements as features and the 'species' column as the target variable.
    5. Evaluate the performance of your model using appropriate metrics (e.g., accuracy, precision, recall, F1-score).
    6. The dataset is small and well-behaved, making it excellent for learning and experimenting with various classification techniques.
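
    As a hedged sketch of steps 2-5 (column names follow the 'File Structure' section above; the model choice and split are illustrative):

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import classification_report

    # Step 2: load the data
    df = pd.read_csv('iris.csv')
    X = df[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
    y = df['species']

    # Step 4: fit one of the suggested classifiers (K-Nearest Neighbors here)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0, stratify=y)
    model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

    # Step 5: accuracy, precision, recall and F1 in one report
    print(classification_report(y_test, model.predict(X_test)))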

    Citation:

    When using the Iris dataset, it is common to cite Ronald Fisher's original work:

    Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188.

    Data Contribution:

    Thank you for providing this classic and fundamental dataset to the Kaggle community. The Iris dataset remains an invaluable resource for both beginners learning the basics of classification and experienced practitioners testing new algorithms. Its simplicity and clear class separation make it an ideal starting point for many data science projects.

    If you find this dataset description helpful and the dataset itself useful for your learning or projects, please consider giving it an upvote after downloading. Your appreciation is valuable!

  4. Iris Webpage

    • figshare.com
    html
    Updated Mar 9, 2020
    Cite
    Jesus Rogel-Salazar (2020). Iris Webpage [Dataset]. http://doi.org/10.6084/m9.figshare.7053392.v4
    Explore at:
    html (available download formats)
    Dataset updated
    Mar 9, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jesus Rogel-Salazar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A simple web page containing Fisher's Iris Dataset.
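
    Because the data is published as an HTML page, one way to ingest it is pandas' table scraper. A minimal sketch, assuming the page exposes the data as an HTML table (pandas.read_html needs lxml or html5lib installed; the DOI below is the landing page, so the direct page URL may differ):

    import pandas as pd

    # read_html returns one DataFrame per <table> element on the page
    tables = pd.read_html('https://doi.org/10.6084/m9.figshare.7053392.v4')
    iris = tables[0]
    print(iris.head())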

  5. iris_data

    • kaggle.com
    Updated Aug 1, 2020
    Cite
    Hamza Tanç (2020). iris_data [Dataset]. https://www.kaggle.com/hamzatanc/iris-data/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 1, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Hamza Tanç
    Description

    Dataset

    This dataset was created by Hamza Tanç

    Contents

  6. Iris Data Analysis and Machine Learning in Python

    • kaggle.com
    Updated Jul 10, 2018
    Cite
    Anjali Wani (2018). Iris Data Analysis and Machine Learning in Python [Dataset]. https://www.kaggle.com/datasets/anjwani96/iris-data-analysis-and-machine-learning-in-python/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 10, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Anjali Wani
    Description

    Dataset

    This dataset was created by Anjali Wani

    Contents

  7. The EarthScope DS Noise Toolkit

    • ds.iris.edu
    Updated Apr 23, 2025
    Cite
    Data Help (2025). The EarthScope DS Noise Toolkit [Dataset]. https://ds.iris.edu/ds/products/noise-toolkit/
    Explore at:
    Dataset updated
    Apr 23, 2025
    Authors
    Data Help
    Description

    The EarthScope DS Noise Toolkit is a collection of 3 open-source Python script bundles for:

    • Computing Power Spectral Densities (PSD) of waveform data
    • Performing microseism energy computations from PSDs
    • Performing frequency-dependent polarization analysis of seismograms

    The Noise Toolkit code is available from the EarthScope Noise Toolkit (NTK) GitHub repository: https://github.com/iris-edu/noise-toolkit
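
    The NTK bundles themselves are in the repository above; as a generic illustration of the first task (a PSD of waveform data), here is a sketch using ObsPy and SciPy rather than the NTK scripts (station and window are illustrative, and no instrument response is removed):

    from obspy import UTCDateTime
    from obspy.clients.fdsn import Client
    from scipy.signal import welch

    # Fetch one hour of broadband data from the EarthScope (IRIS) FDSN service
    client = Client('IRIS')
    t0 = UTCDateTime('2020-01-01T00:00:00')
    st = client.get_waveforms(network='IU', station='ANMO', location='00',
                              channel='BHZ', starttime=t0, endtime=t0 + 3600)
    tr = st[0]

    # Welch estimate of the power spectral density (raw counts**2/Hz)
    freqs, psd = welch(tr.data, fs=tr.stats.sampling_rate, nperseg=4096)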

  8. Data from: A successful short-term volcanic eruption forecasting using...

    • zenodo.org
    • produccioncientifica.ugr.es
    • +1more
    bin, txt
    Updated Jul 25, 2022
    Cite
    Rey-Devesa, Pablo (1,2); Benítez, Carmen (3); Prudencio, Janire (1,2); Gutiérrez, Ligdamis (1,2); Cortés, Guillermo (1,2); Títos, Manuel (3); Koulakov, Iván (4,5); Zuccarello, Luciano (6); Ibáñez, Jesús (1,2) (2022). A successful short-term volcanic eruption forecasting using seismic features: datasets and Software [Dataset]. http://doi.org/10.5281/zenodo.6821530
    Explore at:
    bin, txt (available download formats)
    Dataset updated
    Jul 25, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rey-Devesa, Pablo (1,2); Benítez, Carmen (3); Prudencio, Janire (1,2); Gutiérrez, Ligdamis (1,2); Cortés, Guillermo (1,2); Títos, Manuel (3); Koulakov, Iván (4,5); Zuccarello, Luciano (6); Ibáñez, Jesús (1,2)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Successful Short-Term Volcanic Eruption Forecasting Using Seismic Features, Supplementary Material

    by Rey-Devesa (1,2), Benítez (3), Prudencio (1,2), Gutiérrez (1,2), Cortés (1,2), Títos (3), Koulakov (4,5), Zuccarello (6) and Ibáñez (1,2).


    Institutions associated:

    (1) Department of Theoretical Physics and Cosmos. Science Faculty. Avd. Fuentenueva s/n. University of Granada. 18071. Granada. Spain.

    (2) Andalusian Institute of Geophysics. Campus de Cartuja. University of Granada. C/Profesor Clavera 12. 18071. Granada. Spain.

    (3) Department of Signal Theory, Telematics and Communication. University of Granada. Informatics and Telecommunication School. 18071. Granada. Spain.

    (4) Trofimuk Institute of Petroleum Geology and Geophysics SB RAS, Prospekt Koptyuga, 3, 630090 Novosibirsk, Russia

    (5) Institute of the Earth’s Crust SB RAS, Lermontova 128, Irkutsk, Russia

    (6) Istituto Nazionale di Geofisica e Vulcanologia, Sezione di Pisa (INGV-Pisa), via Cesare Battisti, 53, 56125, Pisa, Italy.


    Acknowledgment:

    This study was partially supported by the Spanish FEMALE project (PID2019-106260GB-I00).
    P. Rey-Devesa was funded by the Ministerio de Ciencia e Innovación del Gobierno de España (MCIN),
    Agencia Estatal de Investigación (AEI), Fondo Social Europeo (FSE),
    and Programa Estatal de Promoción del Talento y su Empleabilidad en I+D+I Ayudas para contratos predoctorales para la formación de doctores 2020 (PRE2020-092719).
    Ivan Koulakov was supported by the Russian Science Foundation (Grant No. 20-17-00075).
    Luciano Zuccarello was supported by the INGV Pianeta Dinamico 2021 Tema 8 SOME project (grant no. CUP D53J1900017001)
    funded by the Italian Ministry of University and Research
    “Fondo finalizzato al rilancio degli investimenti delle amministrazioni centrali dello Stato e allo sviluppo del Paese, legge 145/2018”.
    English language editing was performed by Tornillo Scientific, UK.


    Data availability statement:

    1.- Seismic data from Kilauea, Augustine, Bezymianny (2007), and Mount St. Helens are available from the IRIS data repository (http://ds.iris.edu/seismon/index.phtml).
    (An example of the Python code to access the data is described below.)
    2.- Seismic data from Bezymianny (2017-2018) are available from Ivan Koulakov (ivan.science@gmail.com) upon request.
    3.- Seismic data from Mt. Etna are available from INGV-Italy upon request (http://terremoti.ingv.it/en/help),
    also available from the Zenodo data repository (https://doi.org/10.5281/zenodo.6849621).

    Access code in Python to download the records of Kilauea, Augustine and Mount St. Helens volcanoes, from the IRIS data repository.

    To access the raw signals, please first install ObsPy and then execute the following commands in a Python console.

    Example:

    from obspy.core import UTCDateTime
    from obspy.clients.fdsn import Client

    # Connect to the IRIS FDSN data service
    client = Client('IRIS')

    # Two-day window in January 2006
    t1 = UTCDateTime('2006-01-10T00:00:00')
    t2 = UTCDateTime('2006-01-12T00:00:00')

    # Download waveforms from station AUH of the AV network
    raw_data = client.get_waveforms(network='AV',
                                    station='AUH',
                                    location='',
                                    channel='HHZ',
                                    starttime=t1,
                                    endtime=t2)

    To further download station information, execute:

    xml = client.get_stations(network='AV', station='AUH',
                              channel='HHZ', starttime=t1, endtime=t2,
                              level='response')

    To scale the data using the station's metadata:

    data = raw_data.remove_response(inventory=xml)

    To save, filter, trim and plot the data, execute:

    data.write('Augustine.mseed', format='MSEED')
    data.filter('bandpass', freqmin=1.0, freqmax=20)
    data.trim(t1 + 60, t2 - 60)
    data.plot()

    Contents:

    Six different Matlab codes. The principal code is called FeatureExtraction.
    The codes rsac.m and ReadMSEEDFast.m are for reading different data formats (not developed by the group).
    Seismic data from Mt. Etna is included for use as an example.

  9. Visual ECoG dataset

    • openneuro.org
    Updated Apr 1, 2025
    Cite
    Iris Groen; Kenichi Yuasa; Amber Brands; Giovanni Piantoni; Stephanie Montenegro; Adeen Flinker; Sasha Devore; Orrin Devinsky; Werner Doyle; Patricia Dugan; Daniel Friedman; Nick Ramsey; Natalia Petridou; Jonathan Winawer (2025). Visual ECoG dataset [Dataset]. http://doi.org/10.18112/openneuro.ds004194.v3.0.0
    Explore at:
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Iris Groen; Kenichi Yuasa; Amber Brands; Giovanni Piantoni; Stephanie Montenegro; Adeen Flinker; Sasha Devore; Orrin Devinsky; Werner Doyle; Patricia Dugan; Daniel Friedman; Nick Ramsey; Natalia Petridou; Jonathan Winawer
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Details related to access to the data

    • Contact person

    Please contact Iris Groen (i.i.a.groen@uva.nl, https://orcid.org/0000-0002-5536-6128) for more information.

    Please see the following papers for more details on the data collection and preprocessing:

    Groen IIA, Piantoni G, Montenegro S, Flinker A, Devore S, Devinsky O, Doyle W, Dugan P, Friedman D, Ramsey N, Petridou N, Winawer JA (2022) Temporal dynamics of neural responses in human visual cortex. The Journal of Neuroscience 42(40):7562-7580 (https://doi.org/10.1523/JNEUROSCI.1812-21.2022)

    Yuasa K, Groen IIA, Piantoni G, Montenegro S, Flinker A, Devore S, Devinsky O, Doyle W, Dugan P, Friedman D, Ramsey N, Petridou N, Winawer JA. Precise Spatial Tuning of Visually Driven Alpha Oscillations in Human Visual Cortex. eLife 12:RP90387. https://doi.org/10.7554/eLife.90387.1

    Brands AM, Devore S, Devinsky O, Doyle W, Flinker A, Friedman D, Dugan P, Winawer JA, Groen IIA (2024). Temporal dynamics of short-term neural adaptation in human visual cortex. https://doi.org/10.1101/2023.09.13.557378

    • Practical information to access the data

    Processed data and model fits reported in Groen et al., (2022) are available in derivatives/Groenetal2022TemporalDynamicsECoG as matlab .mat files. Matlab code to load, process and plot these data (including 3D renderings of the participant's surface reconstructions and electrode positions) is available in https://github.com/WinawerLab/ECoG_utils and https://github.com/irisgroen/temporalECoG. These repositories have dependencies on other Matlab toolboxes (e.g., FieldTrip). See instructions on Github for relevant links and guidelines.

    Processed data and model fits reported in Yuasa et al., (2023) are available in the Github repositories described in the paper.

    Processed data and model fits reported in Brands et al., (2024) are available in derivatives/Brandsetal2024TemporalAdaptationECoGCategories as python .py files. Python code to process and analyze these data is available in the Github repositories described in the paper.

    Overview

    • Project name

    Visual ECoG dataset

    • Years that the project ran

    Data were collected between 2017 and 2020. Exact recording dates have been scrubbed for anonymization purposes.

    • Brief overview of the tasks in the experiment

    Participants sub-p01 to sub-p11 viewed grayscale visual pattern stimuli that were varied in temporal or spatial properties. Participants sub-p11 to sub-p14 additionally saw color images of different image classes (faces, bodies, buildings, objects, scenes, and scrambled) that were varied in temporal properties. See 'Independent Variables' below for more details.

    In all tasks, participants were instructed to fixate a cross or point in the center of the screen and monitor it for a color change, i.e. to perform a stimulus-orthogonal task (see the task-specific _events.json files, e.g., task-prf_events.json, for further details).

    • Description of the contents of the dataset

    The data consist of cortical iEEG recordings in 14 epilepsy patients in response to visual stimulation. Patients were implanted with standard clinical surface (grid) and depth electrodes. Two patients were additionally implanted with a high-density research grid. In addition to the iEEG recordings, pre-implantation MRI T1 scans are provided for the purpose of localizing electrodes. Participants performed a varying number of tasks and runs.

    • Independent variables

    The data are divided in 6 different sets of stimulus types or events:

    1. prf: grayscale, oriented bar stimuli consisting of curved, band-pass filtered lines that were swept across the screen (up to ~16 degrees of visual angle) in a fixed order for the purpose of estimating spatial population receptive fields (pRFs).
    2. spatialpattern: grayscale, centrally presented pattern stimuli (~16 degrees of visual angle in diameter) consisting of curved, band-pass filtered lines that were systematically varied in level of contrast and density, as well as various oriented grating stimuli.
    3. temporalpattern: grayscale, centrally presented pattern stimuli (~16 degrees of visual angle in diameter) consisting of curved, band-pass filtered lines that were systematically varied in temporal duration and interval.
    4. soc: combination of the spatialpattern and temporalpattern stimuli.
    5. sixcatloctemporal: color images of six stimulus classes: faces, bodies (hands/feet only), buildings, objects, scenes and scrambled, systematically varied in temporal duration and interval, whereby interval stimuli consisted of direct repeats of the identical image.
    6. sixcatlocisidiff/sixcatlocdiffisi: color images of six stimulus classes: faces, bodies (hands/feet only), buildings, objects, scenes and scrambled, systematically varied in temporal duration and interval, whereby the first interval stimulus was followed by images from either the same or a different category (but not the identical image).

    Participant-, task- and run-specific stimuli are provided in the /stimuli folder as matlab .mat files.
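
    A hedged sketch for inspecting one of those stimulus files from Python rather than Matlab (the file name is hypothetical; scipy.io.loadmat reads .mat files up to v7.2, while v7.3 files must be opened with h5py):

    from scipy.io import loadmat

    # Load one participant/task-specific stimulus file
    mat = loadmat('stimuli/sub-p01_task-prf_stimuli.mat')

    # List the stored variables, skipping the __header__/__version__ entries
    print([key for key in mat if not key.startswith('__')])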

    • Dependent variables

    The main BIDS folder contains the raw voltage data, split up in individual task runs. The /derivatives/ECoGCAR folder contains common-average-referenced version of the data. The /derivatives/ECoGBroadband folder contains time-varying broadband responses estimated by band-pass filtering the common-average-referenced voltage data and taking the average power envelope. The /derivatives/ECoGPreprocessed folder contains epoched trials used in Brands et al., (2024). The /derivatives/freesurfer folder contains surface reconstructions of each participant's T1, along with retinotopic atlas files. The /derivatives/Groen2022TemporalDynamicsECoG contains preprocessed data and model fits that can be used to reproduce the results reported in Groen et al., (2022). The /derivatives/Brands2024TemporalAdaptationECoG contains preprocessed data and model fits that can be used to reproduce the results reported in Brands et al., (2024).

    • Quality assessment of the data

    Data quality and the number of trials per subject vary considerably across patients, for various reasons.

    First, for each recording session, attempts were made to optimize the environment for running visual experiments; e.g. room illumination was stabilized as much as possible by closing blinds when available, the visual display was calibrated (for most patients), and interference from medical staff or visitors was minimized. However, it was not possible to equate this with great precision across patients and sessions/runs.

    Second, implantations were determined based on clinical needs and electrode locations therefore vary across participants. The strength and robustness of the neural responses varies greatly with the electrode location (e.g. early vs higher-level visual cortex), as well as with uncontrolled factors such as how well the electrode made contact with the cortex and whether it was primarily situated on grey matter (surface/grid electrodes) or could be located in white matter (some depth electrodes). Electrodes that were marked as containing epileptic activity by clinicians, or that did not have good signal based on visual inspection of the raw data, are marked as 'bad' in the channels.tsv files.

    Third, patients varied greatly in their cognitive abilities and mental/medical state, which affected their ability to follow task instructions, e.g. to remain alert and maintain fixation. Some patients were able to perform repeated runs of multiple tasks across multiple sessions, while others only managed to do a few runs.

    All patients included in this dataset have sufficiently good responses in some electrodes/tasks as judged by Groen et al., (2022) and Brands et al., (2024). However, when using this dataset to address further research questions, it is advisable to set stringent requirements on electrode and trial selection. See Groen et al., (2022) and associated code repository for an example preprocessing pipeline that selected for robust visual responses to temporally- and contrast-varying stimuli.

    Methods

    • Subjects

    All participants were intractable epilepsy patients who were undergoing ECoG for the purpose of monitoring seizures. Participants were included if their implantation covered parts of visual cortex and if they consented to participate in research.

    • Apparatus

    Data were collected in a clinical setting, i.e. at bedside in the patient's hospital room. Information about the iEEG recording apparatus is provided in the metadata for each patient. Information about the visual stimulation equipment and behavioral response recordings is provided in Groen et al., (2022), Yuasa et al., (2023) and Brands et al., (2024).

    • Experimental location

    Data were collected at NYU University Langone Hospital (New York, USA) or at University Medical Center Utrecht (The Netherlands).

    • Missing data

    Stimulus files are missing for a few runs of sub-02. These are marked as N/A in the associated event files.

    Notes

    Further participant-specific notes:

    • For sub-03 and sub-04 the spatial pattern and temporal pattern stimuli are combined in the soc task runs, for the remaining participants these are split across the spatialpattern and temporalpattern task runs.

    • The pRF task from sub-04 has different prf parameters (bar duration and gap).

    • The first two runs of the pRF task from sub-05 are not of good quality (participant repeatedly broke fixation). In addition, the triggers in all pRF runs from sub-05 are not correct due to a stimulus coding problem and will need to be re-interpolated if one wishes to use these data.

    • Participants sub-10 and sub-11 have high density grids in addition to clinical grids.

    • Note that all stimuli and stimulus parameters can be found in the participant-specific stimulus *.mat files.

  10. Data from: Data and analysis script for channel measurement campaign at...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Oct 27, 2020
    Cite
    Oscar Bejarano; Kirk Webb; Rahman Doost-Mohammady (2020). Data and analysis script for channel measurement campaign at POWDER-RENEW using Iris SDRs [Dataset]. http://doi.org/10.5281/zenodo.4135078
    Explore at:
    zip (available download formats)
    Dataset updated
    Oct 27, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Oscar Bejarano; Kirk Webb; Rahman Doost-Mohammady
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains our raw datasets from channel measurements performed at the University of Utah campus. In addition, we have included a document that explains the setup and methodology used to collect this data, as well as a very brief discussion of results.
    File organization:
    * documentation/ - Contains a .docx with the description of the setup and evaluation.
    * data/ - HDF5 files containing both metadata and raw IQ samples for
    each location at which data was collected. Notice we collected data at 14
    different client locations. See map in the attached docx (skipped locations 12 and 16).
    We deployed 5 different receivers at 5 different rooftops. Due to resource constraints,
    one set of files contains data from 4 different locations whereas another set
    contains information from the single remaining location.

    We have developed a set of python scripts that allow us to parse and analyze the data.
    Although not included here, they can be found in our public repository: https://github.com/renew-wireless/RENEWLab
    You can find the top script here.
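
    Pending the RENEWLab scripts, a minimal sketch for a first look at one of the HDF5 files (the file name and internal layout are assumptions):

    import h5py

    # Walk the file hierarchy to discover the metadata and raw IQ sample datasets
    with h5py.File('data/measurement_location_01.hdf5', 'r') as f:
        f.visit(print)  # prints every group/dataset path in the file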

    For more information on the POWDER-RENEW project please visit the POWDER website.
    The RENEW part of the project focuses on the deployment of an open-source massive MIMO system.
    Please visit our website for more information.

  11. HadCM3 Model data used in the article 'Disentangling the causes of the 1816...

    • find.data.gov.scot
    • dtechtive.com
    txt, zip
    Updated Aug 13, 2019
    Cite
    University of Edinburgh. School of GeoSciences. Institute of Geography (2019). HadCM3 Model data used in the article 'Disentangling the causes of the 1816 European year without a summer' by Schurer, Andrew; Hegerl, Gabriele; Luterbacher, Juerg; Broennimann, Stefan; Cowan, Tim; Tett, Simon; Zanchettin, Davide; Timmreck, Claudia [Dataset]. http://doi.org/10.7488/ds/2601
    Explore at:
    txt (0.0166 MB), zip (10823.68 MB), zip (7977.984 MB) (available download formats)
    Dataset updated
    Aug 13, 2019
    Dataset provided by
    University of Edinburgh. School of GeoSciences. Institute of Geography
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: The European summer of 1816 has often been referred to as a 'year without a summer', due to anomalously cold conditions and unusual wetness, which led to widespread famines and agricultural failures. The cause has often been assumed to be the eruption of Mount Tambora in April 1815; however, this link has not, until now, been proven. Here we apply state-of-the-art event attribution methods to quantify the contributions of the eruption and of random weather variability to this extreme European summer climate anomaly. By selecting analogue summers that have similar sea-level-pressure patterns to that observed in 1816, from both observations and unperturbed climate model simulations, we show that the circulation state can reproduce the precipitation anomaly without external forcing, but can explain only about a quarter of the anomalously cold conditions. We find that in climate models, including the forcing by the Tambora eruption makes the European cold anomaly up to 100 times more likely, while the precipitation anomaly becomes 1.5-3 times as likely, attributing a large fraction of the observed anomalies to the volcanic forcing. Our study thus demonstrates how linking regional climate anomalies to large-scale circulation is necessary to quantitatively interpret and attribute post-eruption variability.

    The model data consist of 50 HadCM3 simulations with volcanic forcing for the period 01/12/1814 to 01/12/1816. The dataset is divided into atmosphere monthly mean and ocean monthly mean files, each containing 24 monthly values for each of the 50 simulations. The files are in the UK Met Office pp format. This can be read using the iris Python package: https://scitools.org.uk/iris/docs/latest/ Example_script.py is an example of a simple Python script which reads the model data.
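
    Example_script.py ships with the dataset; independently of it, reading pp files with the iris package follows a standard pattern (the file name and variable name here are assumptions):

    import iris

    # Load every field in a pp file into a CubeList
    cubes = iris.load('hadcm3_atmos_monthly_mean.pp')
    print(cubes)  # summary of names, shapes and coordinates

    # Or load one variable directly by name
    tas = iris.load_cube('hadcm3_atmos_monthly_mean.pp', 'air_temperature')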

  12. AL preprocessed data used in paper "Multi variables time series information...

    • zenodo.org
    bin
    Updated Feb 25, 2023
    Cite
    Denis Ullmann (2023). AL preprocessed data used in paper "Multi variables time series information bottleneck" [Dataset]. http://doi.org/10.5281/zenodo.7674274
    Explore at:
    bin (available download formats)
    Dataset updated
    Feb 25, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Denis Ullmann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Preprocessed AL data used in the paper "Multi variables time series information bottleneck", with the GitHub code.

    This dataset is created from a publicly available dataset of solar power data collected in Alabama by NREL.

    The npz file is a compressed NumPy archive and can be loaded using np.load with allow_pickle=True.
    The loaded data is then a Python dict, described below.

    Each sample 'data' is a np.ndarray with 2 dimensions: time (various lengths) and wavelength (length=137, representing 137 solar plants ordered as in NREL).

    Each sample is given a 'position', which is a list of length 4:
    position[1] is a string that gives the name of the event
    position[4] is a boolean vector that gives the time positions of the corresponding sample in the original sequence of public IRIS level2 data

    Data file info :
    Type: .npz
    Size: 34.48MB
    *** Key: 'data_TR_AL'
    ndarray data of length 161
    containing np.ndarray of shapes ['various', 137]

    *** Key: 'data_VAL_AL'
    ndarray data of length 11
    containing np.ndarray of shapes ['various', 137]

    *** Key: 'data_TE_AL'
    ndarray data of length 57
    containing np.ndarray of shapes ['various', 137]

    *** Key: 'data_TR'
    ndarray data of length 161
    containing np.ndarray of shapes ['various', 137]

    *** Key: 'data_VAL'
    ndarray data of length 11
    containing np.ndarray of shapes ['various', 137]

    *** Key: 'data_TE'
    ndarray data of length 57
    containing np.ndarray of shapes ['various', 137]

    *** Key: 'position_TR_AL'
    ndarray data of length 161
    containing ndarray data of length 4
    containing mix of types {'str', 'ndarray', 'int'}

    *** Key: 'position_VAL_AL'
    ndarray data of length 11
    containing ndarray data of length 4
    containing mix of types {'str', 'ndarray', 'int'}

    *** Key: 'position_TE_AL'
    ndarray data of length 57
    containing ndarray data of length 4
    containing mix of types {'str', 'ndarray', 'int'}

    *** Key: 'position_TR'
    ndarray data of length 161
    containing ndarray data of length 4
    containing mix of types {'str', 'ndarray', 'int'}

    *** Key: 'position_VAL'
    ndarray data of length 11
    containing ndarray data of length 4
    containing mix of types {'str', 'ndarray', 'int'}

    *** Key: 'position_TE'
    ndarray data of length 57
    containing ndarray data of length 4
    containing mix of types {'str', 'ndarray', 'int'}
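
    A minimal loading sketch matching the key listing above (the archive file name is an assumption):

    import numpy as np

    # allow_pickle is required because the stored arrays hold Python objects
    archive = np.load('AL_preprocessed.npz', allow_pickle=True)

    data_tr = archive['data_TR_AL']      # 161 samples, each a (time, 137) array
    pos_tr = archive['position_TR_AL']   # matching length-4 position records

    sample, position = data_tr[0], pos_tr[0]
    print(sample.shape, position)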

  13. Earthquake Early Warning Dataset

    • figshare.com
    txt
    Updated Nov 20, 2019
    Cite
    Kevin Fauvel; Daniel Balouek-Thomert; Diego Melgar; Pedro Silva; Anthony Simonet; Gabriel Antoniu; Alexandru Costan; Véronique Masson; Manish Parashar; Ivan Rodero; Alexandre Termier (2019). Earthquake Early Warning Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.9758555.v3
    Explore at:
    txt (available download formats)
    Dataset updated
    Nov 20, 2019
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Kevin Fauvel; Daniel Balouek-Thomert; Diego Melgar; Pedro Silva; Anthony Simonet; Gabriel Antoniu; Alexandru Costan; Véronique Masson; Manish Parashar; Ivan Rodero; Alexandre Termier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is composed of GPS stations (1 file) and seismometers (1 file) multivariate time series (MTS) data associated with three types of events (normal activity / medium earthquakes / large earthquakes).

    Files Format: plain text
    Files Creation Date: 02/09/2019
    Data Type: multivariate time series
    Number of Dimensions: 3 (east-west, north-south and up-down)
    Time Series Length: 60 (one data point per second)
    Period: 2001-2018
    Geographic Location: -62 ≤ latitude ≤ 73, -179 ≤ longitude ≤ 25

    Data Collection

    - Large Earthquakes: GPS stations and seismometers data are obtained from the archive [1]. This archive includes 29 large earthquakes. In order to adopt a homogeneous labeling method, the dataset is limited to the data available from the American Incorporated Research Institutions for Seismology - IRIS (14 large earthquakes remaining of the 29).
      > GPS stations (14 events): High Rate Global Navigation Satellite System (HR-GNSS) displacement data (1-5 Hz). Raw observations have been processed with a precise point positioning algorithm [2] to obtain displacement time series in geodetic coordinates. Undifferenced GNSS ambiguities were fixed to integers to improve accuracy, especially over the low frequency band of tens of seconds [3]. Then, coordinates have been rotated to a local east-west, north-south and up-down system.
      > Seismometers (14 events): seismometer strong motion data (1-10 Hz). Channel files specify the units, sample rates, and gains of each channel.
    - Normal Activity / Medium Earthquakes:
      > GPS stations (255 events: 255 normal activity): HR-GNSS normal activity displacement data (1 Hz). GPS data outside of large earthquake periods can be considered as normal activity (noise). Data is downloaded from [4], an archive maintained by the University of Oregon which stores a representative extract of GPS noise. It is an archive of real-time three-component positions for 240 stations in the western U.S. from California to Alaska, spanning from October 2018 to the present day. The raw GPS data (observations of phase and range to visible satellites) are processed with an algorithm called FastLane [5] and converted to 1 Hz sampled positions. Normal activity MTS are randomly sampled from the archive to match the number of seismometer events and to keep a ratio above 30% between the number of large earthquake MTS and normal activity, in order not to encounter a class imbalance issue.
      > Seismometers (255 events: 170 normal activity, 85 medium earthquakes): seismometer strong motion data (1-10 Hz). Time series data were collected with the International Federation of Digital Seismograph Networks (FDSN) client available in the Python package ObsPy [6]. Channel information specifies the units, sample rates, and gains of each channel. The number of medium earthquakes is calculated from the ratio of medium over large earthquakes during the past 10 years in the region. A ratio above 30% is kept between the number of 60-second MTS corresponding to earthquakes (medium + large) and the total (earthquakes + normal activity) number of MTS to prevent a class imbalance issue.

    The number of GPS stations and seismometers for each event varies (tens to thousands).

    Preprocessing:
    - Conversion (seismometers): data are available as a digital signal, which is specific to each sensor. Therefore, each instrument's digital signal is converted to its physical signal (acceleration) to obtain comparable seismometer data.
    - Aggregation (GPS stations and seismometers): data aggregation by second (mean).

    Variables:
    - event_id: unique ID of an event. The dataset is composed of 269 events.
    - event_time: timestamp of the event occurrence
    - event_magnitude: magnitude of the earthquake (Richter scale)
    - event_latitude: latitude of the event recorded (degrees)
    - event_longitude: longitude of the event recorded (degrees)
    - event_depth: distance below Earth's surface where the earthquake happened (km)
    - mts_id: unique multivariate time series ID. The dataset is composed of 2,072 MTS from GPS stations and 13,265 MTS from seismometers.
    - station: sensor name (GPS station or seismometer)
    - station_latitude: sensor latitude (degrees)
    - station_longitude: sensor longitude (degrees)
    - timestamp: timestamp of the multivariate time series
    - dimension_E: east-west component of the sensor signal (cm/s/s)
    - dimension_N: north-south component of the sensor signal (cm/s/s)
    - dimension_Z: up-down component of the sensor signal (cm/s/s)
    - label: label associated with the event. There are 3 labels: normal activity (GPS stations: 255 events, seismometers: 170 events) / medium earthquake (GPS stations: 0 events, seismometers: 85 events) / large earthquake (GPS stations: 14 events, seismometers: 14 events). EEW relies on the detection of the primary wave (P-wave) before the secondary wave (the damaging wave) arrives. P-waves follow a propagation model (IASP91 [7]). Therefore, each MTS is labeled based on the P-wave arrival time at each sensor (seismometers, GPS stations) calculated with the propagation model.

    References:
    [1] Ruhl, C. J., Melgar, D., Chung, A. I., Grapenthin, R. and Allen, R. M. 2019. Quantifying the value of real-time geodetic constraints for earthquake early warning using a global seismic and geodetic data set. Journal of Geophysical Research: Solid Earth 124:3819-3837.
    [2] Geng, J., Bock, Y., Melgar, D., Crowell, B. W., and Haase, J. S. 2013. A new seismogeodetic approach applied to GPS and accelerometer observations of the 2012 Brawley seismic swarm: Implications for earthquake early warning. Geochemistry, Geophysics, Geosystems 14:2124-2142.
    [3] Geng, J., Jiang, P., and Liu, J. 2017. Integrating GPS with GLONASS for high-rate seismogeodesy. Geophysical Research Letters 44:3139-3146.
    [4] http://tunguska.uoregon.edu/rtgnss/data/cwu/mseed/
    [5] Melgar, D., Melbourne, T., Crowell, B., Geng, J., Szeliga, W., Scrivner, C., Santillan, M. and Goldberg, D. 2019. Real-Time High-Rate GNSS Displacements: Performance Demonstration During the 2019 Ridgecrest, CA Earthquakes (Version 1.0) [Data set]. Zenodo.
    [6] https://docs.obspy.org/packages/obspy.clients.fdsn.html
    [7] Kennett, B. L. N. 1991. IASPEI 1991 Seismological Tables. Terra Nova 3:122.
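
    The labeling step depends on P-wave arrival times predicted with the IASP91 model [7]; as a generic illustration (not the authors' pipeline), ObsPy's TauP interface computes such predictions. The event and sensor coordinates below are illustrative:

    from obspy.taup import TauPyModel
    from obspy.geodetics import locations2degrees

    model = TauPyModel(model='iasp91')

    # Epicentral distance between an event and a sensor, in degrees
    dist = locations2degrees(35.7, -117.5, 36.2, -117.8)

    # Predicted P-wave travel time for a source 10 km deep
    arrivals = model.get_travel_times(source_depth_in_km=10.0,
                                      distance_in_degree=dist,
                                      phase_list=['P'])
    print(arrivals[0].time)  # seconds after the event origin time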

  14. Climate model output from a study of tropical cyclones over the Shanghai...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Jul 27, 2025
    Cite
    Erasmo Buonomo; Nicholas Savage; Grace Redmond; Simon Tucker (2025). Climate model output from a study of tropical cyclones over the Shanghai region under climate change based on a convection-permitting modelling [Dataset]. http://doi.org/10.5061/dryad.z8w9ghxgq
    Explore at:
    Dataset updated
    Jul 27, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Erasmo Buonomo; Nicholas Savage; Grace Redmond; Simon Tucker
    Time period covered
    Jan 1, 2023
    Description

    Changes in tropical cyclones due to greenhouse-gas forcing in the Shanghai area have been studied in a double-nesting regional model experiment using the Met Office convection-permitting model HadREM3-RA1T at 4 km resolution and the regional model HadREM3-GA7.05 at 12 km for the intermediate nest. Boundary conditions for the experiment have been constructed from HadGEM2-ES, a General Circulation Model (GCM) from the 5th Coupled Model Intercomparison Project (CMIP5), directly using high-frequency data for the atmosphere (6-hourly) and the ocean (daily), for the historical period (1981-2000) and under the Representative Concentration Pathway 8.5 (2080-2099). These choices identify one of the warmest climate scenarios available from CMIP5. Given the direct use of GCM data for the baseline, large scale conditions relevant for tropical cyclones have been analyzed, demonstrating a realistic representation of environmental conditions off the coast of eastern China. GCM large scale changes show a...

    Data has been generated by climate models and converted to NetCDF4 format (interfaces for commonly used languages available at https://www.unidata.ucar.edu/software/netcdf/), following the CF metadata convention (https://cfconventions.org/). The conversion has been done using the Iris package under Python (https://scitools-iris.readthedocs.io/en/latest/index.html) and the NCO utilities (https://nco.sourceforge.net/). The buffer zone has been removed from the output of limited area models.

    Use standard Unix tar to uncompress and expand the archives. NetCDF support for the most commonly used software environments (e.g., Fortran, Python, R) is available at https://www.unidata.ucar.edu/software/netcdf/.

    This readme file was generated on 2024-02-16 by Erasmo Buonomo

    GENERAL INFORMATION

    Title of Dataset: Climate model output from a study of tropical cyclones over the Shanghai region under climate change based on a convection-permitting modelling

    Author Information
    Name: Erasmo Buonomo
    ORCID:
    Institution: Hadley Centre - Met Office
    Address: Fitzroy Road, Exeter, EX1 3PB, UK
    Email:

    Alternate Contact Information
    Name: Nicholas Savage
    ORCID:
    Institution: Hadley Centre - Met Office
    Address: Fitzroy Road, Exeter, EX1 3PB, UK
    Email:

    Date of data collection: Simulations were run in different periods and collected in the current format by 2022-12-20

    Geographic location of data collection: HadGEM2-ES global domain at 1.875 x 1.125 degrees horizontal resolution; HadREM3-GA705 domain over China at 12 km horizontal resolution; HadREM3-RA1T domain over eastern China at 4 km horizontal resolution.

    Information about funding sources that supported the collection of the data: Newton Fund, Climate Science for Service P...

  15. VISION: UKESM1 hourly modelled ozone for comparison to observations

    • catalogue.ceda.ac.uk
    Updated Jan 18, 2025
    Cite
    Nathan Luke Abraham; Maria Russo (2025). VISION: UKESM1 hourly modelled ozone for comparison to observations [Dataset]. https://catalogue.ceda.ac.uk/uuid/300046500aeb4af080337ff86ae8e776
    Explore at:
    Dataset updated
    Jan 18, 2025
    Dataset provided by
    Centre for Environmental Data Analysis (http://www.ceda.ac.uk/)
    Authors
    Nathan Luke Abraham; Maria Russo
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Time period covered
    Jan 1, 1982 - May 31, 2022
    Area covered
    Earth
    Description

    Two UK Earth System Model (UKESM1) hindcasts have been performed in support of the Virtual Integration of Satellite and In-situ Observation Networks (VISION) project (NE/Z503393/1).

    Data is provided as raw model output in Met Office PP (32-bit) format that can be read by the Iris (https://scitools-iris.readthedocs.io/en/stable/) or cf-python (https://ncas-cms.github.io/cf-python/) libraries.

    This is global data at N96 L85 resolution (1.875 x 1.25 degrees, 85 model levels up to 85 km). Simulations were performed on the Monsoon2 High Performance Computer (HPC).

    The first dataset (Jan 1982 to May 2022) contains hourly ozone concentrations on the lowest model level (20m above the surface).

    The second dataset (Jan 2010 to Dec 2020) contains hourly ozone concentrations and hourly Heaviside function on 37 fixed pressure levels. Data is only provided for days in which ozone was measured by the FAAM aircraft (for comparison purposes).

    Ozone data is provided in mass mixing ratio (kg species/kg air).
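
    A hedged reading sketch with cf-python, one of the two libraries named above (the file name is an assumption; iris.load works analogously):

    import cf

    # Read the PP file into a FieldList, one Field per diagnostic
    fields = cf.read('ukesm1_hourly_ozone_surface.pp')
    print(fields)

    # Inspect the first field; ozone is stored as mass mixing ratio (kg/kg)
    o3 = fields[0]
    print(o3)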

  16. The Seismogenic Thickness of Venus

    • zenodo.org
    zip
    Updated Jun 25, 2025
    Cite
    Julia Maia; Ana-Catalina Plesa; Iris van Zelst; Richard Ghail; Anna Gülcher; Mark Panning; Sven Peter Näsholm; Barbara De Toffoli; Anna Horleston; Krystyna Smolinski; Sara Klaasen; Robert Herrick; Raphael Garcia (2025). The Seismogenic Thickness of Venus [Dataset]. http://doi.org/10.5281/zenodo.13133118
    Explore at:
    zip (available download formats)
    Dataset updated
    Jun 25, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Julia Maia; Ana-Catalina Plesa; Iris van Zelst; Richard Ghail; Anna Gülcher; Mark Panning; Sven Peter Näsholm; Barbara De Toffoli; Anna Horleston; Krystyna Smolinski; Sara Klaasen; Robert Herrick; Raphael Garcia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This archive contains Python routines and data files supporting the study "The Seismogenic Thickness of Venus" by Julia Maia, Ana-Catalina Plesa, Iris van Zelst, Richard Ghail, Anna J. P. Gülcher, Mark P. Panning, Sven Peter Näsholm, Barbara De Toffoli, Anna C. Horleston, Krystyna T. Smolinski, Sara Klaasen, Robert R. Herrick, Raphaël F. Garcia submitted to Journal of Geophysical Research: Planets.

  17. Data and scripts (2) for Storkey et al, "Resolution dependence of...

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated May 13, 2024
    Cite
    David Storkey (2024). Data and scripts (2) for Storkey et al, "Resolution dependence of interlinked Southern Ocean biases in global coupled HadGEM3 models", GMD (2024) [Dataset]. http://doi.org/10.5281/zenodo.11102967
    Explore at:
    application/gzip (available download formats)
    Dataset updated
    May 13, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    David Storkey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Southern Ocean
    Description

    ================================================================
    Data and scripts for producing plots from Storkey et al (2024):
    "Resolution dependence of interlinked Southern Ocean biases in
    global coupled HadGEM3 models"
    ================================================================

    The plots in the paper consist of 10-year mean fields from the third
    decade of the spin up and timeseries of scalar quantities for the first
    150 years of the spin up. The data to produce these plots are stored
    in the MEANS_YEARS_21-30 and TIMESERIES_DATA directories respectively.

    Note that due to the size limit on records on Zenodo, the 10-year mean
    output from the N216-ORCA12 integration has been stored as a separate
    record.

    Scripts to produce the plots are in SCRIPT, with section definitions
    in SECTIONS. Bespoke plotting scripts are included in SCRIPT. They use
    Python 3, including the Matplotlib, Iris and Cartopy packages. The
    plotting of the timeseries data used the Marine_Val VALSO-VALTRANS
    package, which is available here:

    https://github.com/JMMP-Group/MARINE_VAL/tree/main/VALSO-VALTRANS

    Much of the processing of the model output data was performed with the
    CDFTools package, which is available here:

    https://github.com/meom-group/CDFTOOLS

    and the NCO package:

    https://web.mit.edu/course/13/13.715/nco-2.8.1/doc/
