100+ datasets found
  1. Data from: EEG-Dataset

    • kaggle.com
    zip
    Updated Aug 3, 2025
    Cite
    Quân Nguyễn Bảo (2025). EEG-Dataset [Dataset]. https://www.kaggle.com/datasets/quands/eeg-dataset
    Available download formats: zip (3,155,571 bytes)
    Dataset updated
    Aug 3, 2025
    Authors
    Quân Nguyễn Bảo
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    **Overview**

    The Bonn EEG Dataset is a widely recognized dataset in the field of biomedical signal processing and machine learning, specifically designed for research in epilepsy detection and EEG signal analysis. It contains electroencephalogram (EEG) recordings from both healthy individuals and patients with epilepsy, making it suitable for tasks such as seizure detection and classification of brain activity states. The dataset is structured into five distinct subsets (labeled A, B, C, D, and E), each comprising 100 single-channel EEG segments, for a total of 500 segments. Each segment represents 23.6 seconds of EEG data sampled at 173.61 Hz, yielding 4,097 data points per segment, stored in ASCII format as text files.

    **Structure and Labels**

    • Set A: EEG recordings from healthy individuals with eyes open, capturing normal brain activity under visual stimulation.
    • Set B: EEG recordings from healthy individuals with eyes closed, reflecting brain activity in a resting state.
    • Set C: EEG recordings from epilepsy patients, collected from the epileptogenic zone during an interictal (seizure-free) period.
    • Set D: EEG recordings from epilepsy patients, collected from the hippocampal formation of the opposite brain hemisphere during an interictal period.
    • Set E: EEG recordings from epilepsy patients during an ictal (seizure) period, capturing brain activity during an epileptic seizure.

    Each subset contains 100 EEG segments, ensuring a balanced distribution across the five classes, which supports both binary (e.g., healthy vs. epileptic) and multi-class (e.g., A-E) classification tasks.

    **Key Characteristics**

    • Size: 500 EEG segments (100 segments per subset, across five subsets).
    • Data Type: Single-channel EEG signals, stored in text files (ASCII format).
    • Sampling Rate: 173.61 Hz, providing high temporal resolution.
    • Segment Length: 23.6 seconds per segment, equivalent to 4,097 data points.
    • Labels: Clearly defined for each subset (A: healthy, eyes open; B: healthy, eyes closed; C: interictal, epileptogenic zone; D: interictal, opposite hemisphere; E: ictal), enabling precise model evaluation.
    • Preprocessing: The data is not pre-filtered, but a low-pass filter with a 40 Hz cutoff is recommended to remove high-frequency noise and artifacts, as suggested in the original documentation.
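A minimal sketch of loading one segment and applying the recommended 40 Hz low-pass filter, assuming each text file stores one amplitude value per line (the usual Bonn layout); the filter order and the synthetic stand-in signal below are illustrative choices, not part of the dataset documentation.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 173.61    # sampling rate (Hz), from the dataset documentation
CUTOFF = 40.0  # recommended low-pass cutoff (Hz)

def load_segment(path):
    """Read one ASCII segment: one amplitude value per line."""
    return np.loadtxt(path)

def lowpass(signal, fs=FS, cutoff=CUTOFF, order=4):
    """Zero-phase Butterworth low-pass filter at the recommended cutoff."""
    b, a = butter(order, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, signal)

# Demonstration with a synthetic segment standing in for a real file:
t = np.arange(4097) / FS
segment = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 70 * t)
clean = lowpass(segment)  # the 70 Hz component is strongly attenuated
```

In practice `load_segment` would be pointed at one of the unpacked text files from the five ZIP archives.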

    **Applications**

    The Bonn EEG Dataset is ideal for machine learning and signal processing tasks, including:

    • Developing algorithms for epileptic seizure detection and prediction.
    • Exploring feature extraction techniques, such as wavelet transforms, for EEG signal analysis.
    • Classifying brain states (healthy vs. epileptic, interictal vs. ictal).
    • Supporting research in neuroscience and medical diagnostics, particularly for epilepsy monitoring and treatment.

    **Source**

    • The dataset is publicly available from the University of Bonn and can be downloaded from the following link: University of Bonn EEG Dataset
    • The dataset is provided as five ZIP files, each containing 100 text files corresponding to the EEG segments for subsets A, B, C, D, and E.

    **Citation**

    When using this dataset, researchers are required to cite the original publication: Andrzejak, R. G., Lehnertz, K., Mormann, F., Rieke, C., David, P., & Elger, C. E. (2001). Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Physical Review E, 64(6), 061907. DOI: 10.1103/PhysRevE.64.061907.

    **Additional Notes**

    1. The dataset is randomized, with no specific information provided about patients or electrode placements, ensuring simplicity and focus on signal characteristics.

    2. The data is not hosted on Kaggle or Hugging Face but is accessible directly from the University of Bonn’s repository or mirrored sources.

    3. Researchers may need to apply preprocessing steps, such as filtering or normalization, to optimize the data for machine learning tasks.

    4. The dataset’s balanced structure and clear labels make it an excellent choice for a one-week machine learning project, particularly for tasks involving traditional algorithms like SVM, Random Forest, or Logistic Regression.

    5. This dataset provides a robust foundation for learning signal processing, feature extraction, and machine learning techniques while addressing a real-world medical challenge in epilepsy detection.

  2. CHB-MIT Scalp EEG Database

    • physionet.org
    Updated Jun 9, 2010
    + more versions
    Cite
    John Guttag (2010). CHB-MIT Scalp EEG Database [Dataset]. http://doi.org/10.13026/C2K01R
    Dataset updated
    Jun 9, 2010
    Authors
    John Guttag
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Description

    This database, collected at the Children’s Hospital Boston, consists of EEG recordings from pediatric subjects with intractable seizures. Subjects were monitored for up to several days following withdrawal of anti-seizure medication in order to characterize their seizures and assess their candidacy for surgical intervention. The recordings are grouped into 23 cases and were collected from 22 subjects (5 males, ages 3–22; and 17 females, ages 1.5–19).

  3. EEG Signal Dataset

    • ieee-dataport.org
    Updated Jun 11, 2020
    + more versions
    Cite
    Rahul Kher (2020). EEG Signal Dataset [Dataset]. https://ieee-dataport.org/documents/eeg-signal-dataset
    Dataset updated
    Jun 11, 2020
    Authors
    Rahul Kher
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    PCA

  4. Emotions based EEG dataset

    • kaggle.com
    zip
    Updated Sep 30, 2023
    Cite
    Thejaswinishrinivas (2023). Emotions based EEG dataset [Dataset]. https://www.kaggle.com/datasets/thejaswinishrinivas/emotions-based-eeg-dataset
    Available download formats: zip (76,305,134 bytes)
    Dataset updated
    Sep 30, 2023
    Authors
    Thejaswinishrinivas
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    The dataset comprises 46 (22 commercial advertisements and 24 Kannada music clips) different subjects' EEG data, recorded using a 2-channel EEG device.

    The dataset folder contains two subfolders:

    1. Commercial advertisement
       1.1 Channel_1 (Ch_1) and Channel_2 (Ch_2): prefrontal cortex
    2. Kannada musical clips
       2.1 Channel_1 (Ch_1) and Channel_2 (Ch_2): left brain

    Excel file information: in each file, columns represent subjects and rows represent features per subject. There are 12 Excel files in total from the two channels (6 for commercial advertisements and 6 for Kannada music clips).
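Because the files store features in rows and subjects in columns, analyses that expect one row per subject need a transpose. A minimal sketch, assuming the layout described above (the file name passed to `load_features` is hypothetical, not an actual name from the dataset):

```python
import pandas as pd

def load_features(path):
    """Return a subjects x features table (the files store the transpose)."""
    df = pd.read_excel(path, header=None)  # rows = features, cols = subjects
    return df.T                            # rows = subjects, cols = features

# The same reshaping, demonstrated on a small in-memory stand-in
# (3 features x 2 subjects becomes 2 subjects x 3 features):
fake = pd.DataFrame([[1, 2], [3, 4], [5, 6]])
subjects = fake.T
```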

    Subjective self-rating scale

    Name
    Age
    Gender
    Have you ever had any health issues? (YES/NO)
    Have you watched this song/advertisement before? (YES/NO)
    Please let us know if this advertisement brings up any specific memories for you. (YES/NO)
    Please rate the following queries from 1 to 10:
    • How funny was the advertisement you watched?
    • How sad was the advertisement you watched?
    • How horrifying was the advertisement you watched?
    • How relaxing was the music you listened to?
    • How sad was the music you listened to?
    • How enjoyable was the music you listened to?
    Do you think what you just watched was entertaining enough?
    If you have any comments, please write them here.

    Here is the website address for each stimulus that we considered:

    ad1: https://www.youtube.com/watch?v=ZzG7duipQ7U&ab_channel=perfettiindia
    ad2: https://www.youtube.com/watch?v=SfAxUpeVhCg&ab_channel=bo0fhead
    ad3: https://www.youtube.com/watch?v=HqGsT6VM8Vg&ab_channel=kiddlestix
    song1: https://www.youtube.com/hashtag/kgfchapter2
    song2: https://www.youtube.com/watch?v=x43w4lLS9E0&ab_channel=AnandAudio
    song3: https://youtube.com/watch?v=Ysf4QRrcLGM&si=EnSIkaIECMiOmarE

    For a more comprehensive understanding of the dataset and its background, we kindly ask researchers to refer to our associated manuscript titled:

    Entertainment Based Database for Emotion Recognition from EEG Signals, a research article accepted at the 3rd International Conference on Applied Intelligence and Informatics (AII2023), held 29-31 October 2023 in Dubai, UAE, under the theme "Fostering reproducibility of research results". (When utilizing this dataset in your research, please consider citing this reference.)

  5. RAW EEG STRESS DATASET

    • kaggle.com
    zip
    Updated Dec 11, 2023
    Cite
    Ayush Tibrewal (2023). RAW EEG STRESS DATASET [Dataset]. https://www.kaggle.com/datasets/ayushtibrewal/raw-eeg-stress-dataset-sam40
    Available download formats: zip (366,418,728 bytes)
    Dataset updated
    Dec 11, 2023
    Authors
    Ayush Tibrewal
    Description

    SAM 40: Dataset of 40 subject EEG recordings to monitor the induced-stress while performing Stroop color-word test, arithmetic task, and mirror image recognition task

    This dataset presents a collection of electroencephalogram (EEG) data recorded from 40 subjects (female: 14, male: 26, mean age: 21.5 years). The dataset was recorded from the subjects while performing various tasks such as the Stroop color-word test, solving arithmetic questions, identification of symmetric mirror images, and a state of relaxation. The experiment was primarily conducted to monitor the short-term stress elicited in an individual while performing the aforementioned cognitive tasks. The individual tasks were carried out for 25 s and were repeated to record three trials. The EEG was recorded using a 32-channel Emotiv Epoc Flex gel kit. The EEG data were then segmented into non-overlapping epochs of 25 s depending on the various tasks performed by the subjects. The EEG data were further processed to remove baseline drifts by subtracting the average trend obtained using the Savitzky-Golay filter. Furthermore, artifacts were removed from the EEG data by applying wavelet thresholding. The dataset proposed in this paper can aid and support research activities in the field of brain-computer interfaces and can also be used in the identification of patterns in EEG data elicited due to stress.
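The Savitzky-Golay baseline-removal step described above can be sketched as follows; the window length, polynomial order, sampling rate and synthetic drift below are illustrative assumptions, not the authors' parameters.

```python
import numpy as np
from scipy.signal import savgol_filter

def remove_baseline(signal, window=501, polyorder=3):
    """Subtract a slow trend estimated with a Savitzky-Golay filter."""
    trend = savgol_filter(signal, window_length=window, polyorder=polyorder)
    return signal - trend

# Synthetic 25 s epoch at an assumed 128 Hz, with an artificial linear drift:
fs = 128
t = np.arange(25 * fs) / fs
drift = 0.5 * t                           # slow baseline drift
eeg = np.sin(2 * np.pi * 10 * t) + drift  # 10 Hz oscillation plus drift
detrended = remove_baseline(eeg)          # drift largely removed
```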

  6. Auditory evoked potential EEG-Biometric dataset

    • physionet.org
    Updated Dec 1, 2021
    Cite
    Nibras Abo Alzahab; Angelo Di Iorio; Luca Apollonio; Muaaz Alshalak; Alessandro Gravina; Luca Antognoli; Marco Baldi; Lorenzo Scalise; Bilal Alchalabi (2021). Auditory evoked potential EEG-Biometric dataset [Dataset]. http://doi.org/10.13026/ps31-fc50
    Dataset updated
    Dec 1, 2021
    Authors
    Nibras Abo Alzahab; Angelo Di Iorio; Luca Apollonio; Muaaz Alshalak; Alessandro Gravina; Luca Antognoli; Marco Baldi; Lorenzo Scalise; Bilal Alchalabi
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    This data set consists of over 240 two-minute EEG recordings obtained from 20 volunteers. Resting-state and auditory stimuli experiments are included in the data. The goal is to develop an EEG-based Biometric system.

    The data includes resting-state EEG signals in both cases: eyes open and eyes closed. The auditory stimuli part consists of six experiments: three with in-ear auditory stimuli and three with bone-conducting auditory stimuli. The three stimuli for each case are a native song, a non-native song, and neutral music.

  7. Harvard Electroencephalography Database

    • bdsp.io
    • registry.opendata.aws
    Updated Feb 10, 2025
    + more versions
    Cite
    Sahar Zafar; Tobias Loddenkemper; Jong Woo Lee; Andrew Cole; Daniel Goldenholz; Jurriaan Peters; Alice Lam; Edilberto Amorim; Catherine Chu; Sydney Cash; Valdery Moura Junior; Aditya Gupta; Manohar Ghanta; Marta Fernandes; Haoqi Sun; Jin Jing; M Brandon Westover (2025). Harvard Electroencephalography Database [Dataset]. http://doi.org/10.60508/k85b-fc87
    Dataset updated
    Feb 10, 2025
    Authors
    Sahar Zafar; Tobias Loddenkemper; Jong Woo Lee; Andrew Cole; Daniel Goldenholz; Jurriaan Peters; Alice Lam; Edilberto Amorim; Catherine Chu; Sydney Cash; Valdery Moura Junior; Aditya Gupta; Manohar Ghanta; Marta Fernandes; Haoqi Sun; Jin Jing; M Brandon Westover
    License

    https://github.com/bdsp-core/bdsp-license-and-dua

    Description

    The Harvard EEG Database will encompass data gathered from four hospitals affiliated with Harvard University: Massachusetts General Hospital (MGH), Brigham and Women's Hospital (BWH), Beth Israel Deaconess Medical Center (BIDMC), and Boston Children's Hospital (BCH). The EEG data includes three types:

    rEEG: "routine EEGs" recorded in the outpatient setting.
    EMU: recordings obtained in the inpatient setting, within the Epilepsy Monitoring Unit (EMU).
    ICU/LTM: recordings obtained from acutely and critically ill patients within the intensive care unit (ICU).
    
  8. An EEG dataset recorded during affective music listening

    • openneuro.org
    Updated Apr 23, 2020
    + more versions
    Cite
    Ian Daly; Nicoletta Nicolaou; Duncan Williams; Faustina Hwang; Alexis Kirke; Eduardo Miranda; Slawomir J. Nasuto (2020). An EEG dataset recorded during affective music listening [Dataset]. http://doi.org/10.18112/openneuro.ds002721.v1.0.1
    Dataset updated
    Apr 23, 2020
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Ian Daly; Nicoletta Nicolaou; Duncan Williams; Faustina Hwang; Alexis Kirke; Eduardo Miranda; Slawomir J. Nasuto
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    0. Sections

    1. Project
    2. Dataset
    3. Terms of Use
    4. Contents
    5. Method and Processing

    1. PROJECT

    Title: Brain-Computer Music Interface for Monitoring and Inducing Affective States (BCMI-MIdAS)

    Dates: 2012-2017

    Funding organisation: Engineering and Physical Sciences Research Council (EPSRC)

    Grant no.: EP/J003077/1 and EP/J002135/1.

    2. DATASET

    Title: EEG data investigating neural correlates of music-induced emotion.

    Description: This dataset accompanies the publication by Daly et al. (2018) and has been analysed in Daly et al. (2014; 2015a; 2015b) (please see Section 5 for full references). The purpose of the research activity in which the data were collected was to investigate the EEG neural correlates of music-induced emotion. For this purpose, 31 healthy adult participants listened to 40 music clips of 12 s duration each, targeting a range of emotional states. The music clips comprised excerpts from film scores spanning a range of styles and rated on induced emotion. The dataset contains unprocessed EEG data from all 31 participants (age range 18-66, 18 female) while listening to the music clips, together with the reported induced emotional responses. The paradigm involved 6 runs of EEG recordings. The first and last runs were resting-state runs, during which participants were instructed to sit still and rest for 300 s. The other 4 runs each contained 10 music listening trials.

    Publication Year: 2018

    Creator: Nicoletta Nicolaou, Ian Daly.

    Contributors: Isil Poyraz Bilgin, James Weaver, Asad Malik.

    Principal Investigator: Slawomir Nasuto (EP/J003077/1).

    Co-Investigator: Eduardo Miranda (EP/J002135/1).

    Organisation: University of Reading

    Rights-holders: University of Reading

    Source: The musical stimuli were taken from Eerola & Vuoskoski, “A comparison of the discrete and dimensional models of emotion in music”, Psychol. Music, 39:18-49, 2010 (doi: 10.1177/0305735610362821).

    3. TERMS OF USE

    Copyright University of Reading, 2018. This dataset is licensed by the rights-holder(s) under a Creative Commons Attribution 4.0 International Licence: https://creativecommons.org/licenses/by/4.0/.

    4. CONTENTS

    BIDS file listing: the dataset comprises data from 31 participants, named using the convention sub-s_number, where s_number is a random participant number from 1 to 31. For example, 'sub-08' contains data obtained from participant 8.

    The data is in BIDS format and contains EEG and associated metadata. The sampling rate is 1 kHz and the EEG corresponding to a music clip is 20 s long (the duration of the clips).

    Each data folder contains the following data (please note that the number of runs varies between participants):

    • EEG data in .tsv format
    • Event codes (JSON) and timings (.tsv)
    • EEG channel information

    5. METHOD and PROCESSING

    This information is available in the following publications:

    [1] Daly, I., Nicolaou, N., Williams, D., Hwang, F., Kirke, A., Miranda, E., Nasuto, S.J., "Neural and physiological data from participants listening to affective music", Scientific Data, 2018.
    [2] Daly, I., Malik, A., Hwang, F., Roesch, E., Weaver, J., Kirke, A., Williams, D., Miranda, E. R., Nasuto, S. J., "Neural correlates of emotional responses to music: an EEG study", Neuroscience Letters, 573:52-7, 2014; doi: 10.1016/j.neulet.2014.05.003.
    [3] Daly, I., Hallowell, J., Hwang, F., Kirke, A., Malik, A., Roesch, E., Weaver, J., Williams, D., Miranda, E., Nasuto, S.J., "Changes in music tempo entrain movement related brain activity", Proc. IEEE EMBC 2014, pp. 4595-8; doi: 10.1109/EMBC.2014.6944647.
    [4] Daly, I., Williams, D., Hallowell, J., Hwang, F., Kirke, A., Malik, A., Weaver, J., Miranda, E., Nasuto, S.J., "Music-induced emotions can be predicted from a combination of brain activity and acoustic features", Brain and Cognition, 101:1-11, 2015b; doi: 10.1016/j.bandc.2015.08.003.

    Please cite these references if you use this dataset in your study.

    Thank you for your interest in our work.

  9. Data from: A Resting-state EEG Dataset for Sleep Deprivation

    • openneuro.org
    Updated Apr 27, 2025
    + more versions
    Cite
    Chuqin Xiang; Xinrui Fan; Duo Bai; Ke Lv; Xu Lei (2025). A Resting-state EEG Dataset for Sleep Deprivation [Dataset]. http://doi.org/10.18112/openneuro.ds004902.v1.0.8
    Dataset updated
    Apr 27, 2025
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Chuqin Xiang; Xinrui Fan; Duo Bai; Ke Lv; Xu Lei
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    General information

    The dataset provides resting-state EEG data (eyes open, partially eyes closed) from 71 participants who underwent two experiments involving normal sleep (NS, session 1) and sleep deprivation (SD, session 2). The dataset also provides information on participants' sleepiness and mood states. (Please note that Session 1 (NS) and Session 2 (SD) do not reflect the temporal order; the order was counterbalanced across participants and is listed in the metadata.)

    Dataset

    Presentation

    Data collection was initiated in March 2019 and terminated in December 2020. A detailed description of the dataset is currently being prepared by Chuqin Xiang, Xinrui Fan, Duo Bai, Ke Lv and Xu Lei, and will be submitted to Scientific Data for publication.

    EEG acquisition

    • EEG system: Brain Products GmbH, Steingrabenstr., Germany (61 electrodes)
    • Sampling frequency: 500 Hz
    • Impedances were kept below 5 kΩ

    Contact

    If you have any questions or comments, please contact:
    Xu Lei: xlei@swu.edu.cn
    

    Article

    Xiang, C., Fan, X., Bai, D. et al. A resting-state EEG dataset for sleep deprivation. Sci Data 11, 427 (2024). https://doi.org/10.1038/s41597-024-03268-2

  10. EEG and audio dataset for auditory attention decoding

    • zenodo.org
    bin, zip
    Updated Jan 31, 2020
    Cite
    Søren A. Fuglsang; Søren A. Fuglsang; Daniel D.E. Wong; Daniel D.E. Wong; Jens Hjortkjær; Jens Hjortkjær (2020). EEG and audio dataset for auditory attention decoding [Dataset]. http://doi.org/10.5281/zenodo.1199011
    Available download formats: zip, bin
    Dataset updated
    Jan 31, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Søren A. Fuglsang; Søren A. Fuglsang; Daniel D.E. Wong; Daniel D.E. Wong; Jens Hjortkjær; Jens Hjortkjær
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    This dataset contains EEG recordings from 18 subjects listening to one of two competing speech audio streams. Continuous speech in trials of ~50 sec. was presented to normal hearing listeners in simulated rooms with different degrees of reverberation. Subjects were asked to attend one of two spatially separated speakers (one male, one female) and ignore the other. Repeated trials with presentation of a single talker were also recorded. The data were recorded in a double-walled soundproof booth at the Technical University of Denmark (DTU) using a 64-channel Biosemi system and digitized at a sampling rate of 512 Hz. Full details can be found in:

    • Søren A. Fuglsang, Torsten Dau & Jens Hjortkjær (2017): Noise-robust cortical tracking of attended speech in real-life environments. NeuroImage, 156, 435-444

    and

    • Daniel D.E. Wong, Søren A. Fuglsang, Jens Hjortkjær, Enea Ceolini, Malcolm Slaney & Alain de Cheveigné: A Comparison of Temporal Response Function Estimation Methods for Auditory Attention Decoding. Frontiers in Neuroscience, https://doi.org/10.3389/fnins.2018.00531

    The data is organized in the format of the publicly available COCOHA Matlab Toolbox. The preproc_script.m demonstrates how to import and align the EEG and audio data. The script also demonstrates some EEG preprocessing steps as used in the Wong et al. paper above. The AUDIO.zip contains wav-files with the speech audio used in the experiment. The EEG.zip contains MAT-files with the EEG/EOG data for each subject. The EEG/EOG data are found in data.eeg with the following channels:

    • channels 1-64: scalp EEG electrodes
    • channel 65: right mastoid electrode
    • channel 66: left mastoid electrode
    • channel 67: vertical EOG below right eye
    • channel 68: horizontal EOG right eye
    • channel 69: vertical EOG above right eye
    • channel 70: vertical EOG below left eye
    • channel 71: horizontal EOG left eye
    • channel 72: vertical EOG above left eye

    The expinfo table contains information about experimental conditions, including which speaker the listener was attending to in each trial. It contains the following fields:

    • attend_mf: attended speaker (1=male, 2=female)
    • attend_lr: spatial position of the attended speaker (1=left, 2=right)
    • acoustic_condition: type of acoustic room (1= anechoic, 2= mild reverberation, 3= high reverberation, see Fuglsang et al. for details)
    • n_speakers: number of speakers presented (1 or 2)
    • wavfile_male: name of presented audio wav-file for the male speaker
    • wavfile_female: name of presented audio wav-file for the female speaker (if any)
    • trigger: trigger event value for each trial also found in data.event.eeg.value

    DATA_preproc.zip contains the preprocessed EEG and audio data as output from preproc_script.m.

    The dataset was created within the COCOHA Project: Cognitive Control of a Hearing Aid

  11. The Phantom EEG Dataset

    • zenodo.org
    bin
    Updated Oct 14, 2024
    Cite
    Anonymous; Anonymous (2024). The Phantom EEG Dataset [Dataset]. http://doi.org/10.5281/zenodo.11238929
    Available download formats: bin
    Dataset updated
    Oct 14, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anonymous; Anonymous
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    New version on https://zenodo.org/records/13341214.

    When you use this dataset, please cite the paper below. More information about this dataset can also be found in that paper.

    Xu, X., Wang, B., Xiao, B., Niu, Y., Wang, Y., Wu, X., & Chen, J. (2024). Beware of Overestimated Decoding Performance Arising from Temporal Autocorrelations in Electroencephalogram Signals. arXiv preprint arXiv:2405.17024.

    1 Metadata

    Brief introduction

    The present work aims to demonstrate that temporal autocorrelation (TA) significantly impacts various BCI tasks even in conditions without neural activity. We used watermelons as phantom heads and found that overestimated decoding performance arises if continuous EEG data with the same class label are split into training and test sets. More details can be found in Motivation.

    As watermelons cannot perform any experimental tasks, we can reorganize the recordings into the format of various actual EEG datasets without needing to collect EEG data as previous work did (examples in Domain Studied).

    Measurement devices

    Manufacturer: NeuroScan SynAmps2 system (Compumedics Limited, Victoria, Australia)

    Configuration: 64-channel Ag/AgCl electrode cap with a 10/20 layout

    Species

    Watermelons. Ten watermelons served as phantom heads.

    Domain Studied

    Overestimated Decoding Performance in EEG decoding.

    The following BCI datasets for various BCI tasks have been reorganized using the Phantom EEG Dataset. The pitfall was found in four of the five tasks.

    - CVPR dataset [1] for image decoding task.

    - DEAP dataset [2] for emotion recognition task.

    - KUL dataset [3] for auditory spatial attention decoding task.

    - BCIIV2a dataset [4] for motor imagery task (the pitfalls were absent due to the use of rapid-design paradigm during EEG recording).

    - SIENA dataset [5] for epilepsy detection task.

    Tasks Completed

    Resting state, but the data can be reorganized for any BCI task.

    Dataset Name

    The Phantom EEG Dataset

    Dataset license

    Creative Commons Attribution 4.0 International

    Code

    The code to read the data files (.cnt) is provided under "Other". We could not add the file in this version because Zenodo requires that "you must create a new version to add, modify or delete files". We will add the file after organizing the datasets to comply with the FAIR principles in version v2.

    Data information

    The data will be published in the following formats in version v2:

    - CNT: the raw data.

    - BIDS: an extension to the brain imaging data structure for electroencephalography. BIDS primarily addresses the heterogeneity of data organization by following the FAIR principles [6].

    An additional electrode was placed on the lower part of the watermelon as the physiological reference, and the forehead served as the ground site. The inter-electrode impedances were maintained under 20 kOhm. Data were recorded at a sampling rate of 1000 Hz. EEG recordings for each watermelon lasted for more than 1 hour to ensure sufficient data for the decoding task.

    Each Subject (S*.cnt) contains the following information:

    EEG.data: EEG data (samples X channels)

    EEG.srate: Sampling frequency of the saved data

    EEG.chanlocs: channel numbers (1 to 68; 'EKG', 'EMG', 'VEO' and 'HEO' were not recorded)

    Citation and more information

    Citation will be updated after the review period is completed.

    We will provide more information about this dataset (e.g. the units of the captured data) once our work is accepted. This is because our work is currently under review, and we are not allowed to disclose more information according to the relevant requirements.

    All metadata will be provided as a backup on Github and will be available after the review period is completed.

    2 Motivation

    Researchers have reported high decoding accuracy (>95%) using non-invasive Electroencephalogram (EEG) signals for brain-computer interface (BCI) decoding tasks like image decoding, emotion recognition, auditory spatial attention detection, epilepsy detection, etc. Since these EEG data were usually collected with well-designed paradigms in labs, the reliability and robustness of the corresponding decoding methods were doubted by some researchers, and they proposed that such decoding accuracy was overestimated due to the inherent temporal autocorrelations (TA) of EEG signals [7]–[9].

    However, the coupling between the stimulus-driven neural responses and the EEG temporal autocorrelations makes it difficult to confirm whether this overestimation exists in truth. Some researchers also argue that the effect of TA in EEG data on decoding is negligible and that it becomes a significant problem only under specific experimental designs in which subjects do not have enough resting time [10], [11].

    Due to a lack of problem formulation, previous studies [7]–[9] only proposed that a block design should not be used, in order to avoid the pitfall. However, the impact of TA can be avoided this way only when the trial of EEG is not further segmented into several samples; otherwise, the overfitting, or pitfall, still occurs. In contrast, when a correct data-splitting strategy is used (e.g., separating training and test data in time), the pitfall can be avoided even when a block design is used.

    In our framework, we proposed the concept of "domain" to represent the EEG patterns resulting from TA and then used phantom EEG to remove stimulus-driven neural responses for verification. The results confirmed that the TA, always existing in the EEG data, added unique domain features to a continuous segment of EEG. The specific finding is that when the segment of EEG data with the same class label is split into multiple samples, the classifier will associate the sample's class label with the domain features, interfering with the learning of class-related features. This leads to an overestimation of decoding performance for test samples from the domains seen during training, and results in poor accuracy for test samples from unseen domains (as in real-world applications).
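The splitting issue described above can be sketched with synthetic block IDs standing in for "domains" (continuous same-label recording segments); the block counts and split sizes below are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_blocks, samples_per_block = 10, 20
# Domain (recording-block) ID for each of the 200 samples:
domain = np.repeat(np.arange(n_blocks), samples_per_block)

# Pitfall: a random sample-level split scatters each continuous block
# across both sides, so train and test share domain features.
idx = rng.permutation(domain.size)
train_r, test_r = idx[:150], idx[150:]
shared_random = set(domain[train_r]) & set(domain[test_r])

# Safer: split whole blocks, e.g. separate in time so the first 7 blocks
# go to training and the last 3 to testing; no domain appears on both sides.
train_t = np.where(domain < 7)[0]
test_t = np.where(domain >= 7)[0]
shared_time = set(domain[train_t]) & set(domain[test_t])
```

With the random split, `shared_random` is non-empty (train and test share recording blocks, inviting the overestimation described above), while `shared_time` is empty.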

    Importantly, our work suggests that the key to reducing the impact of EEG TA on BCI decoding is to decouple class-related features from domain features in the actual EEG dataset. Our proposed unified framework serves as a reminder to BCI researchers of the impact of TA on their specific BCI tasks and is intended to guide them in selecting the appropriate experimental design, splitting strategy and model construction.

    3 The rationale for using a watermelon as the phantom head

    We must point out that the "phantom EEG" does not actually contain any "EEG" but records only noise: a watermelon is not a brain and does not generate any neural electrical signals. Therefore, the recorded electrical noise, even when amplified using equipment typically used for EEG, does not constitute EEG data under the definition of EEG. This is why previous researchers called it "phantom EEG". Some researchers may therefore question the use of a watermelon to obtain the phantom EEG.

    However, using a phantom head allows researchers to evaluate the performance of neural-recording equipment and proposed algorithms without the confounds of neural activity variability, artifacts, and potential ethical issues. Phantom heads used in previous studies include digital models [12]–[14], real human skulls [15]–[17], artificial physical phantoms [18]–[24] and watermelons [25]–[40]. Due to their similar conductivity to human tissue, similar size and shape to the human head, and ease of acquisition, watermelons are widely used as "phantom heads".

    Most works used a watermelon as a phantom head and found that results obtained from the neural signals of human subjects could not be reproduced with the phantom head, thus demonstrating that those results were indeed driven by neural signals. For example, Mutanen et al. [35] proposed that “the fact that the phantom head stimulation did not evoke similar biphasic artifacts excludes the possibility that residual induced artifacts, with the current TMS-compatible EEG system, could explain these components”.

    Our work differs significantly from most previous works: it is the first to show that phantom EEG exhibits the effect of TA on BCI decoding even though only noise was recorded, indicating that TA is inherent in EEG data. The conclusion we hope to draw is that some current works may not truly rely on stimulus-driven neural responses when reporting overestimated decoding performance. Similar logic appears in a neuroscience review article [41], which notes that EEG recordings from a phantom head (a watermelon) remind us that background noise may appear as positive results without proper statistical precautions.

    References

    [1] C. Spampinato, S. Palazzo, I. Kavasidis, D. Giordano, N. Souly, and M. Shah, “Deep Learning Human Mind for Automated Visual Classification,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jul. 2017, pp. 4503–4511.

    [2] S. Koelstra et al., “DEAP: A Database for Emotion Analysis Using Physiological Signals,” IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 18–31, 2012.

    [3] N. Das, T. Francart, and A. Bertrand, “Auditory Attention Detection Dataset KULeuven.” Zenodo, Aug. 27, 2020.

    [4] M. Tangermann et al., “Review of the BCI Competition IV,” Front.

  12. things-eeg

    • huggingface.co
    Updated Mar 6, 2025
    Cite
    HaitaoWu (2025). things-eeg [Dataset]. https://huggingface.co/datasets/Haitao999/things-eeg
    Explore at:
    Dataset updated
    Mar 6, 2025
    Authors
    HaitaoWu
    Description

    THINGS-EEG

    This dataset is a processed version of THINGS-EEG, derived from the paper Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior (CVPR 2025). In this version, the EEG data is stored in float16 format, reducing the storage size by half. The original official dataset can be accessed from the OSF repository. Original official dataset:

    A large and rich EEG dataset for modeling human visual object recognition [THINGS-EEG]

      Citation… See the full description on the dataset page: https://huggingface.co/datasets/Haitao999/things-eeg.
    
  13. Raw EEG Data Files

    • dataverse.tdl.org
    txt
    Updated Jun 6, 2024
    Cite
    Logan Trujillo; Logan Trujillo (2024). Raw EEG Data Files [Dataset]. http://doi.org/10.18738/T8/9TTLK8
    Explore at:
    txt(77555456), txt(76490240), txt(77779712), txt(75144704), txt(90450176), txt(75873536), txt(28499456), txt(28387328), txt(28555520), txt(75312896), txt(75200768), txt(75256832), txt(73126400), txt(74976512), txt(75985664), txt(80302592), txt(28611584), txt(85516544), txt(79910144), txt(76658432), txt(77499392), txt(76041728), txt(74023424), txt(80078336), txt(28443392), txt(77611520), txt(79517696), txt(78452480), txt(75761408), txt(73574912), txt(81928448), txt(74696192), txt(79293440), txt(76770560)Available download formats
    Dataset updated
    Jun 6, 2024
    Dataset provided by
    Texas Data Repository
    Authors
    Logan Trujillo; Logan Trujillo
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This is the raw EEG data for the study. Data is in BioSemi Data Format (BDF). Files with only "II" in the file name were recorded during the reported 1-Exemplar categorization task; "RB-II" files were recorded during the reported 2-Exemplar categorization task. "Resting" files were recorded during the wakeful resting state.

  14. Ultra high-density EEG recording of interictal migraine and controls: sensory and rest

    • kilthub.cmu.edu
    txt
    Updated Jul 21, 2020
    Cite
    Alireza Chaman Zar; Sarah Haigh; Pulkit Grover; Marlene Behrmann (2020). Ultra high-density EEG recording of interictal migraine and controls: sensory and rest [Dataset]. http://doi.org/10.1184/R1/12636731
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jul 21, 2020
    Dataset provided by
    Carnegie Mellon University
    Authors
    Alireza Chaman Zar; Sarah Haigh; Pulkit Grover; Marlene Behrmann
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We used a high-density electroencephalography (HD-EEG) system, with 128 customized electrode locations, to record from 17 individuals with migraine (12 female) in the interictal period, and 18 age- and gender-matched healthy control subjects, during visual (vertical grating pattern) and auditory (modulated tone) stimulation that varied in temporal frequency (4 and 6 Hz), and during rest. This dataset includes the raw EEG data for the paper by Chamanzar, Haigh, Grover, and Behrmann (2020), "Abnormalities in cortical pattern of coherence in migraine detected using ultra high-density EEG". The link to our paper will be made available as soon as it is published online.

  15. Features-EEG dataset

    • researchdata.edu.au
    • openneuro.org
    Updated Jun 29, 2023
    Cite
    Grootswagers Tijl; Tijl Grootswagers (2023). Features-EEG dataset [Dataset]. http://doi.org/10.18112/OPENNEURO.DS004357.V1.0.0
    Explore at:
    Dataset updated
    Jun 29, 2023
    Dataset provided by
    OpenNeurohttps://openneuro.org/
    Western Sydney University
    Authors
    Grootswagers Tijl; Tijl Grootswagers
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    Experiment Details: Electroencephalography recordings from 16 subjects viewing fast streams of Gabor-like stimuli. Images were presented in rapid serial visual presentation streams at 6.67 Hz and 20 Hz rates. Participants performed an orthogonal fixation colour-change detection task.

    Experiment length: 1 hour. Raw and preprocessed data are available online through OpenNeuro: https://openneuro.org/datasets/ds004357. Supplementary material and analysis scripts are available on GitHub: https://github.com/Tijl/features-eeg

  16. EEG dataset for speech decoding

    • openneuro.org
    Updated Apr 8, 2025
    Cite
    João Pedro Carvalho Moreira; Vinícius Rezende Carvalho; Eduardo Mazoni Andrade Marçal Mendes; Ariah Fallah; Terrence J. Sejnowski; Claudia Lainscsek; Lindy Comstock (2025). EEG dataset for speech decoding [Dataset]. http://doi.org/10.18112/openneuro.ds006104.v1.0.1
    Explore at:
    Dataset updated
    Apr 8, 2025
    Dataset provided by
    OpenNeurohttps://openneuro.org/
    Authors
    João Pedro Carvalho Moreira; Vinícius Rezende Carvalho; Eduardo Mazoni Andrade Marçal Mendes; Ariah Fallah; Terrence J. Sejnowski; Claudia Lainscsek; Lindy Comstock
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    EEG dataset for speech decoding

    Dataset Overview

    This dataset contains EEG recordings from a phoneme discrimination task with TMS. The data were collected during two related studies in 2019 and 2021.

    Study 1 (2019, Session 01): - 8 participants (P01-P08) - Focus on CV and VC phoneme pairs - 2 blocks: CV pairs and VC pairs - TMS targeted to LipM1 (-56, -8, 46) and TongueM1 (-60, -10, 25)

    Study 2 (2021, Session 02): - 16 participants (S01-S16) - Expanded to include single phonemes and phoneme triplets - 4 blocks: single phonemes, CV pairs, real words, and pseudowords - Additional TMS targets included Broca's area (BA 44: -51, 7, 23) and verbal memory region (BA 6: -46, 1, 41)

    Task Description

    Participants listened to speech sounds and identified stimuli with a button-press response. The stimuli included:
    1. Single phonemes - consonants (/b/, /p/, /d/, /t/, /s/, /z/) and vowels (/i/, /E/, /A/, /u/, /oU/)
    2. Phoneme pairs - CV and VC combinations of the phonemes
    3. Phoneme triplets - real words and pseudowords constructed of CVC sequences

    TMS Methodology

    Detailed information about TMS parameters can be found in the sourcedata/tms_metadata/tms_parameters.json file. TMS was applied using a Magstim Super Rapid Plus1 stimulator with a figure-of-eight 40 mm coil. Stimulation was delivered at 110% of resting motor threshold as paired pulses with 50ms interpulse interval.

    Detailed information about the methodology and results can be found in the associated publication: Moreira et al. "An open-access EEG dataset for speech decoding: Exploring the role of articulation and coarticulation"

    Directory Structure

    The dataset follows the BIDS convention with the following structure: /sub-[subject]/ses-[session]/eeg/, where subject is P01-P08 for Study 1 and S01-S16 for Study 2, and session is 01 for Study 1 and 02 for Study 2.
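As a small sketch (the helper name below is hypothetical; the path pattern and subject/session ranges follow the description above), the expected subject directories can be enumerated as:

```python
def bids_eeg_dirs():
    # Study 1: subjects P01-P08, session 01; Study 2: subjects S01-S16, session 02
    dirs = []
    for prefix, count, ses in (("P", 8, "01"), ("S", 16, "02")):
        for i in range(1, count + 1):
            dirs.append(f"sub-{prefix}{i:02d}/ses-{ses}/eeg/")
    return dirs

paths = bids_eeg_dirs()  # 24 expected EEG directories in total
```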

    Contact Information

    For questions about this dataset, please contact Lindy Comstock at lbcomstock@ucla.edu

  17. EEG Motor Movement/Imagery Dataset

    • kaggle.com
    zip
    Updated Apr 5, 2022
    Cite
    6541816 (2022). EEG Motor Movement/Imagery Dataset [Dataset]. https://www.kaggle.com/datasets/brianleung2020/eeg-motor-movementimagery-dataset
    Explore at:
    zip(2009033556 bytes)Available download formats
    Dataset updated
    Apr 5, 2022
    Authors
    6541816
    Description

    Dataset

    This dataset was created by 6541816

    Contents

  18. EEG of Alzheimer's and Frontotemporal dementia

    • kaggle.com
    zip
    Updated Jan 28, 2024
    Cite
    yosf tag (2024). EEG of Alzheimer's and Frontotemporal dementia [Dataset]. https://www.kaggle.com/datasets/yosftag/open-nuro-dataset
    Explore at:
    zip(4479288286 bytes)Available download formats
    Dataset updated
    Jan 28, 2024
    Authors
    yosf tag
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains resting-state, eyes-closed EEG recordings from 88 subjects in total. Participants: 36 of them were diagnosed with Alzheimer's disease (AD group), 23 were diagnosed with Frontotemporal Dementia (FTD group) and 29 were healthy subjects (CN group). Cognitive and neuropsychological state was evaluated by the international Mini-Mental State Examination (MMSE). The MMSE score ranges from 0 to 30, with lower MMSE indicating more severe cognitive decline. The duration of the disease was measured in months and the median value was 25 with the IQR range (Q1-Q3) being 24 - 28.5 months. Concerning the AD group, no dementia-related comorbidities have been reported. The average MMSE for the AD group was 17.75 (sd=4.5), for the FTD group was 22.17 (sd=8.22) and for the CN group was 30. The mean age of the AD group was 66.4 (sd=7.9), for the FTD group was 63.6 (sd=8.2), and for the CN group was 67.9 (sd=5.4).

    Recordings: Recordings were acquired from the 2nd Department of Neurology of AHEPA General Hospital of Thessaloniki by an experienced team of neurologists. For recording, a Nihon Kohden EEG 2100 clinical device was used, with 19 scalp electrodes (Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, and O2) according to the 10-20 international system and 2 reference electrodes (A1 and A2) placed on the mastoids for impedance check, according to the manual of the device. Each recording was performed according to the clinical protocol with participants being in a sitting position having their eyes closed. Before the initialization of each recording, the skin impedance value was ensured to be below 5 kΩ. The sampling rate was 500 Hz with 10 uV/mm resolution. The recording montages were anterior-posterior bipolar and referential montage using Cz as the common reference. The referential montage was included in this dataset. The recordings were received under the range of the following parameters of the amplifier: sensitivity: 10 uV/mm, time constant: 0.3 s, and high-frequency filter at 70 Hz. Each recording lasted approximately 13.5 minutes for the AD group (min=5.1, max=21.3), 12 minutes for the FTD group (min=7.9, max=16.9) and 13.8 minutes for the CN group (min=12.5, max=16.5). In total, 485.5 minutes of AD, 276.5 minutes of FTD and 402 minutes of CN recordings were collected and are included in the dataset.

    Preprocessing: The EEG recordings were exported in .eeg format and were transformed to the BIDS-accepted .set format for inclusion in the dataset. Automatic annotations of the Nihon Kohden EEG device marking artifacts (muscle activity, blinking, swallowing) have not been included for language compatibility purposes (if this is an issue, please use the preprocessed dataset in the folder: derivatives). The unprocessed EEG recordings are included in folders named sub-0XX. Folders named sub-0XX in the subfolder derivatives contain the preprocessed and denoised EEG recordings. The preprocessing pipeline of the EEG signals is as follows. First, a Butterworth band-pass filter (0.5-45 Hz) was applied and the signals were re-referenced to A1-A2. Then, the Artifact Subspace Reconstruction (ASR) routine, an EEG artifact correction method included in the EEGLAB MATLAB software, was applied to the signals, removing bad data periods that exceeded the maximum acceptable standard deviation of 17 within a 0.5-second window, which is considered a conservative threshold. Next, Independent Component Analysis (ICA, RunICA algorithm) was performed, transforming the 19 EEG signals into 19 ICA components. ICA components classified as “eye artifacts” or “jaw artifacts” by the automatic classification routine “ICLabel” in the EEGLAB platform were automatically rejected. It should be noted that, even though the recording was performed in a resting-state, eyes-closed condition, eye-movement artifacts were still found in some EEG recordings.
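The first two steps of this pipeline (the 0.5-45 Hz Butterworth band-pass and the A1-A2 re-reference) can be sketched with SciPy; the ASR, ICA, and ICLabel stages are omitted here, and the toy sine signal below only stands in for real recordings:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 500  # sampling rate reported for this dataset (Hz)

def bandpass(data, lo=0.5, hi=45.0, fs=FS, order=4):
    # zero-phase Butterworth band-pass along the time axis
    sos = butter(order, [lo, hi], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, data, axis=-1)

def rereference(eeg, a1, a2):
    # re-reference each channel to the mean of the mastoid electrodes A1-A2
    return eeg - (a1 + a2) / 2.0

# toy check: a 10 Hz component should pass, a 60 Hz component be attenuated
t = np.arange(0, 2.0, 1.0 / FS)
sig = np.sin(2 * np.pi * 10 * t) + np.sin(2 * np.pi * 60 * t)
filtered = bandpass(sig)
```

The second-order-sections (`sos`) form is used because transfer-function coefficients can become numerically unstable for band-pass designs with a very low lower edge.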

    A complete analysis of this dataset can be found in the published Data Descriptor paper "A Dataset of Scalp EEG Recordings of Alzheimer’s Disease, Frontotemporal Dementia and Healthy Subjects from Routine EEG", https://doi.org/10.3390/data8060095. ***** I am not the original creator of this dataset; it was published on https://openneuro.org/datasets/ds004504/versions/1.0.6 and I just ported it here for ease of use. *****

  19. Electroencephalogram Database: Prediction of Epileptic Seizures

    • neuinfo.org
    • dknet.org
    • +2more
    Updated May 10, 2005
    Cite
    (2005). Electroencephalogram Database: Prediction of Epileptic Seizures [Dataset]. http://identifiers.org/RRID:SCR_008032
    Explore at:
    Dataset updated
    May 10, 2005
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE. Documented on April 29, 2025. Electroencephalogram (EEG) data recorded from invasive and scalp electrodes. The EEG database contains invasive EEG recordings of 21 patients suffering from medically intractable focal epilepsy. The data were recorded during invasive pre-surgical epilepsy monitoring at the Epilepsy Center of the University Hospital of Freiburg, Germany. In eleven patients, the epileptic focus was located in neocortical brain structures, in eight patients in the hippocampus, and in two patients in both. In order to obtain a high signal-to-noise ratio, fewer artifacts, and to record directly from focal areas, intracranial grid, strip, and depth electrodes were utilized. The EEG data were acquired using a Neurofile NT digital video EEG system with 128 channels, a 256 Hz sampling rate, and a 16-bit analogue-to-digital converter. Notch or band-pass filters have not been applied. For each of the patients, there are datasets called ictal and interictal: the former contains files with epileptic seizures and at least 50 min of pre-ictal data; the latter contains approximately 24 hours of EEG recordings without seizure activity. At least 24 h of continuous interictal recordings are available for 13 patients. For the remaining patients, interictal invasive EEG data consisting of less than 24 h were joined together, to end up with at least 24 h per patient. An interdisciplinary project between: * Epilepsy Center, University Hospital Freiburg * Bernstein Center for Computational Neuroscience (BCCN), Freiburg * Freiburg Center for Data Analysis and Modeling (FDM).

  20. EEG dataset

    • figshare.com
    bin
    Updated Dec 6, 2019
    Cite
    minho lee (2019). EEG dataset [Dataset]. http://doi.org/10.6084/m9.figshare.8091242.v1
    Explore at:
    binAvailable download formats
    Dataset updated
    Dec 6, 2019
    Dataset provided by
    Figsharehttp://figshare.com/
    Authors
    minho lee
    License

    GNU General Public Licensehttps://www.gnu.org/copyleft/gpl.html

    Description

    This dataset was collected for the study "Robust Detection of Event-Related Potentials in a User-Voluntary Short-Term Imagery Task."


Data from: EEG-Dataset

Read the descriptions!!!


Description


**Key Characteristics:**

  • Size: 500 EEG segments (100 segments per subset, across five subsets).
  • Data Type: Single-channel EEG signals, stored in text files (ASCII format).
  • Sampling Rate: 173.61 Hz, providing high temporal resolution.
  • Segment Length: 23.6 seconds per segment, equivalent to 4,097 data points.
  • Labels: Clearly defined for each subset (A: healthy, eyes open; B: healthy, eyes closed; C: interictal, epileptogenic zone; D: interictal, opposite hemisphere; E: ictal), enabling precise model evaluation.
  • Preprocessing: According to the original documentation, the recordings were band-pass filtered at acquisition (0.53-40 Hz, 12 dB/oct.), so little content above 40 Hz remains; additional filtering or artifact removal may still be applied as needed.
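A minimal loading-and-filtering sketch with NumPy/SciPy (hedged: the synthetic array below merely stands in for one real segment file, whose exact name depends on the subset you download):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 173.61  # Bonn sampling rate (Hz)

def load_segment(path):
    # each Bonn file is a single ASCII column of amplitude values
    return np.loadtxt(path)

def lowpass_40hz(x, fs=FS, order=4):
    # optional extra zero-phase low-pass at 40 Hz
    sos = butter(order, 40.0, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

# synthetic stand-in for one 23.6 s segment (4,097 samples)
segment = np.random.default_rng(1).standard_normal(4097)
smoothed = lowpass_40hz(segment)
```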

**Applications:**

The Bonn EEG Dataset is ideal for machine learning and signal processing tasks, including:

  • Developing algorithms for epileptic seizure detection and prediction.
  • Exploring feature extraction techniques, such as wavelet transforms, for EEG signal analysis.
  • Classifying brain states (healthy vs. epileptic, interictal vs. ictal).
  • Supporting research in neuroscience and medical diagnostics, particularly for epilepsy monitoring and treatment.
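As one concrete (illustrative, not canonical) example of wavelet-based feature extraction, a hand-rolled Haar decomposition reduces each segment to a handful of sub-band energies:

```python
import numpy as np

def haar_dwt_energies(x, levels=5):
    # direct 1-D Haar wavelet decomposition: at each level, record the
    # detail-band energy and recurse on the approximation band
    x = np.asarray(x, dtype=float)
    n = len(x) - len(x) % (2 ** levels)  # trim to a multiple of 2^levels
    x = x[:n]
    energies = []
    for _ in range(levels):
        avg = (x[0::2] + x[1::2]) / np.sqrt(2)
        det = (x[0::2] - x[1::2]) / np.sqrt(2)
        energies.append(float(np.sum(det ** 2)))
        x = avg
    energies.append(float(np.sum(x ** 2)))  # final approximation band
    return np.array(energies)

seg = np.random.default_rng(2).standard_normal(4097)
feats = haar_dwt_energies(seg)  # 6 sub-band energies for a 5-level transform
```

Because the Haar transform is orthonormal, the band energies sum to the energy of the (trimmed) signal, which makes them easy to sanity-check before feeding a classifier such as an SVM or Random Forest.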

**Source:**

  • The dataset is publicly available from the University of Bonn and can be downloaded from the following link: University of Bonn EEG Dataset
  • The dataset is provided as five ZIP files, each containing 100 text files corresponding to the EEG segments for subsets A, B, C, D, and E.

**Citation:**

When using this dataset, researchers are required to cite the original publication: Andrzejak, R. G., Lehnertz, K., Mormann, F., Rieke, C., David, P., & Elger, C. E. (2001). Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Physical Review E, 64(6), 061907. DOI: 10.1103/PhysRevE.64.061907.

**Additional Notes:**

  1. The dataset is randomized, with no specific information provided about patients or electrode placements, ensuring simplicity and focus on signal characteristics.

  2. The data is not hosted on Kaggle or Hugging Face but is accessible directly from the University of Bonn’s repository or mirrored sources.

  3. Researchers may need to apply preprocessing steps, such as filtering or normalization, to optimize the data for machine learning tasks.

  4. The dataset’s balanced structure and clear labels make it an excellent choice for a one-week machine learning project, particularly for tasks involving traditional algorithms like SVM, Random Forest, or Logistic Regression.

  5. This dataset provides a robust foundation for learning signal processing, feature extraction, and machine learning techniques while addressing a real-world medical challenge in epilepsy detection.
