100+ datasets found
  1. Data from: EEG-Dataset

    • kaggle.com
    zip
    Updated Aug 3, 2025
    Cite
    Quân Nguyễn Bảo (2025). EEG-Dataset [Dataset]. https://www.kaggle.com/datasets/quands/eeg-dataset
    Explore at:
zip (3,155,571 bytes)
Available download formats
    Dataset updated
    Aug 3, 2025
    Authors
    Quân Nguyễn Bảo
    License

Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

**Overview**

The Bonn EEG Dataset is a widely recognized dataset in the field of biomedical signal processing and machine learning, specifically designed for research in epilepsy detection and EEG signal analysis. It contains electroencephalogram (EEG) recordings from both healthy individuals and patients with epilepsy, making it suitable for tasks such as seizure detection and classification of brain activity states. The dataset is structured into five distinct subsets (labeled A, B, C, D, and E), each comprising 100 single-channel EEG segments, resulting in a total of 500 segments. Each segment represents 23.6 seconds of EEG data, sampled at a frequency of 173.61 Hz, yielding 4,097 data points per segment, stored in ASCII format as text files.

**Structure and Labels**

    • Set A: EEG recordings from healthy individuals with eyes open, capturing normal brain activity under visual stimulation.
    • Set B: EEG recordings from healthy individuals with eyes closed, reflecting brain activity in a resting state.
    • Set C: EEG recordings from epilepsy patients, collected from the epileptogenic zone during an interictal (seizure-free) period.
    • Set D: EEG recordings from epilepsy patients, collected from the hippocampal formation of the opposite brain hemisphere during an interictal period.
• Set E: EEG recordings from epilepsy patients during an ictal (seizure) period, capturing brain activity during an epileptic seizure.

Each subset contains 100 EEG segments, ensuring a balanced distribution across the five classes, which supports both binary (e.g., healthy vs. epileptic) and multi-class (e.g., A-E classification) tasks.

**Key Characteristics**

    • Size: 500 EEG segments (100 segments per subset, across five subsets).
    • Data Type: Single-channel EEG signals, stored in text files (ASCII format).
    • Sampling Rate: 173.61 Hz, providing high temporal resolution.
• Segment Length: 23.6 seconds per segment, equivalent to 4,097 data points.
    • Labels: Clearly defined for each subset (A: healthy, eyes open; B: healthy, eyes closed; C: interictal, epileptogenic zone; D: interictal, opposite hemisphere; E: ictal), enabling precise model evaluation.
• Preprocessing: The data is not pre-filtered, but a low-pass filter with a 40 Hz cutoff is recommended to remove high-frequency noise and artifacts, as suggested in the original documentation; a minimal loading-and-filtering sketch follows this list.
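The sketch below loads one segment and applies the recommended 40 Hz low-pass filter. The file name Z001.txt (a hypothetical set A segment), the use of SciPy, and the 4th-order Butterworth design are assumptions, not part of the official documentation.

    import numpy as np
    from scipy.signal import butter, filtfilt

    FS = 173.61  # sampling rate in Hz, per the dataset documentation

    # Each segment is a plain ASCII text file with one sample value per line.
    segment = np.loadtxt("Z001.txt")  # shape: (4097,)

    # 4th-order Butterworth low-pass at 40 Hz, applied zero-phase.
    b, a = butter(4, 40.0 / (FS / 2.0), btype="low")
    filtered = filtfilt(b, a, segment)
    print(segment.shape, filtered.shape)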

**Applications**

The Bonn EEG Dataset is ideal for machine learning and signal processing tasks, including:

• Developing algorithms for epileptic seizure detection and prediction.
• Exploring feature extraction techniques, such as wavelet transforms, for EEG signal analysis (a small feature-extraction sketch follows this list).
• Classifying brain states (healthy vs. epileptic, interictal vs. ictal).
• Supporting research in neuroscience and medical diagnostics, particularly for epilepsy monitoring and treatment.
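As an illustration of the wavelet route above, here is a minimal sketch using the PyWavelets package; the 'db4' wavelet, 5 decomposition levels, and band-energy features are illustrative assumptions, not something prescribed by the dataset.

    import numpy as np
    import pywt  # PyWavelets

    def wavelet_energy_features(segment, wavelet="db4", level=5):
        # Multi-level DWT; wavedec returns [cA5, cD5, cD4, cD3, cD2, cD1] for level=5.
        coeffs = pywt.wavedec(segment, wavelet, level=level)
        # One energy value per sub-band as a simple feature vector.
        return np.array([np.sum(c ** 2) for c in coeffs])

    rng = np.random.default_rng(0)
    demo = rng.standard_normal(4097)  # stand-in for a real Bonn segment
    print(wavelet_energy_features(demo).shape)  # (6,)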

**Source**

    • The dataset is publicly available from the University of Bonn and can be downloaded from the following link: University of Bonn EEG Dataset
    • The dataset is provided as five ZIP files, each containing 100 text files corresponding to the EEG segments for subsets A, B, C, D, and E.

**Citation**

    When using this dataset, researchers are required to cite the original publication: Andrzejak, R. G., Lehnertz, K., Mormann, F., Rieke, C., David, P., & Elger, C. E. (2001). Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Physical Review E, 64(6), 061907. DOI: 10.1103/PhysRevE.64.061907.

**Additional Notes**

    1. The dataset is randomized, with no specific information provided about patients or electrode placements, ensuring simplicity and focus on signal characteristics.

    2. The data is not hosted on Kaggle or Hugging Face but is accessible directly from the University of Bonn’s repository or mirrored sources.

    3. Researchers may need to apply preprocessing steps, such as filtering or normalization, to optimize the data for machine learning tasks.

    4. The dataset’s balanced structure and clear labels make it an excellent choice for a one-week machine learning project, particularly for tasks involving traditional algorithms like SVM, Random Forest, or Logistic Regression.

    5. This dataset provides a robust foundation for learning signal processing, feature extraction, and machine learning techniques while addressing a real-world medical challenge in epilepsy detection.

  2. CHB-MIT Scalp EEG Database

    • physionet.org
    Updated Jun 9, 2010
    + more versions
    Cite
    John Guttag (2010). CHB-MIT Scalp EEG Database [Dataset]. http://doi.org/10.13026/C2K01R
    Explore at:
    Dataset updated
    Jun 9, 2010
    Authors
    John Guttag
    License

Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    This database, collected at the Children’s Hospital Boston, consists of EEG recordings from pediatric subjects with intractable seizures. Subjects were monitored for up to several days following withdrawal of anti-seizure medication in order to characterize their seizures and assess their candidacy for surgical intervention. The recordings are grouped into 23 cases and were collected from 22 subjects (5 males, ages 3–22; and 17 females, ages 1.5–19).

  3. EEG Signal Dataset

    • ieee-dataport.org
    Updated Jun 11, 2020
    Cite
    Rahul Kher (2020). EEG Signal Dataset [Dataset]. https://ieee-dataport.org/documents/eeg-signal-dataset
    Explore at:
    Dataset updated
    Jun 11, 2020
    Authors
    Rahul Kher
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PCA

  4. EEG Brainwave Dataset: Feeling Emotions

    • kaggle.com
    zip
    Updated Dec 19, 2018
    + more versions
    Cite
    Jordan J. Bird (2018). EEG Brainwave Dataset: Feeling Emotions [Dataset]. https://www.kaggle.com/datasets/birdy654/eeg-brainwave-dataset-feeling-emotions
    Explore at:
zip (12,498,935 bytes)
Available download formats
    Dataset updated
    Dec 19, 2018
    Authors
    Jordan J. Bird
    Description

    Can you use brainwave data to discern whether someone is feeling good?

Please cite the following if you are using this data:

    https://www.researchgate.net/publication/329403546_Mental_Emotional_Sentiment_Classification_with_an_EEG-based_Brain-machine_Interface

    https://www.researchgate.net/publication/335173767_A_Deep_Evolutionary_Approach_to_Bioinspired_Classifier_Optimisation_for_Brain-Machine_Interaction

This is a dataset of EEG brainwave data that has been processed with our original strategy of statistical extraction (papers cited below).

The data was collected from two people (1 male, 1 female) for 3 minutes per state: positive, neutral, and negative. We used a Muse EEG headband, which recorded from the TP9, AF7, AF8 and TP10 EEG placements via dry electrodes. Six minutes of resting neutral data were also recorded. The stimuli used to evoke the emotions are listed below:

1. Marley and Me - Negative (Twentieth Century Fox): death scene
2. Up - Negative (Walt Disney Pictures): opening death scene
3. My Girl - Negative (Imagine Entertainment): funeral scene
4. La La Land - Positive (Summit Entertainment): opening musical number
5. Slow Life - Positive (BioQuest Studios): nature timelapse
6. Funny Dogs - Positive (MashupZone): funny dog clips

Our method of statistical extraction resampled the data, since waves must be described mathematically in a temporal fashion.

    If you would like to use the data in research projects, please cite the following:

J. J. Bird, L. J. Manso, E. P. Ribeiro, A. Ekart, and D. R. Faria, "A study on mental state classification using EEG-based brain-machine interface," in 9th International Conference on Intelligent Systems, IEEE, 2018.

J. J. Bird, A. Ekart, C. D. Buckingham, and D. R. Faria, "Mental emotional sentiment classification with an EEG-based brain-machine interface," in The International Conference on Digital Image and Signal Processing (DISP'19), Springer, 2019.

    This research was part supported by the EIT Health GRaCE-AGE grant number 18429 awarded to C.D. Buckingham.

  5. Seizure Epilepcy CHB MIT EEG dataset pediatric

    • kaggle.com
    zip
    Updated Jul 1, 2023
    Cite
    Abhishek Parikh (2023). Seizure Epilepcy CHB MIT EEG dataset pediatric [Dataset]. https://www.kaggle.com/datasets/abhishekinnvonix/seizure-epilepcy-chb-mit-eeg-dataset-pediatric
    Explore at:
zip (25,296,815,967 bytes)
Available download formats
    Dataset updated
    Jul 1, 2023
    Authors
    Abhishek Parikh
    License

Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    Recordings, grouped into 23 cases, were collected from 22 subjects (5 males, ages 3–22; and 17 females, ages 1.5–19). (Case chb21 was obtained 1.5 years after case chb01, from the same female subject.) The file SUBJECT-INFO contains the gender and age of each subject. (Case chb24 was added to this collection in December 2010, and is not currently included in SUBJECT-INFO.)

    Each case (chb01, chb02, etc.) contains between 9 and 42 continuous .edf files from a single subject. Hardware limitations resulted in gaps between consecutively-numbered .edf files, during which the signals were not recorded; in most cases, the gaps are 10 seconds or less, but occasionally there are much longer gaps. In order to protect the privacy of the subjects, all protected health information (PHI) in the original .edf files has been replaced with surrogate information in the files provided here. Dates in the original .edf files have been replaced by surrogate dates, but the time relationships between the individual files belonging to each case have been preserved. In most cases, the .edf files contain exactly one hour of digitized EEG signals, although those belonging to case chb10 are two hours long, and those belonging to cases chb04, chb06, chb07, chb09, and chb23 are four hours long; occasionally, files in which seizures are recorded are shorter.

    All signals were sampled at 256 samples per second with 16-bit resolution. Most files contain 23 EEG signals (24 or 26 in a few cases). The International 10-20 system of EEG electrode positions and nomenclature was used for these recordings. In a few records, other signals are also recorded, such as an ECG signal in the last 36 files belonging to case chb04 and a vagal nerve stimulus (VNS) signal in the last 18 files belonging to case chb09. In some cases, up to 5 “dummy” signals (named "-") were interspersed among the EEG signals to obtain an easy-to-read display format; these dummy signals can be ignored.

    The file RECORDS contains a list of all 664 .edf files included in this collection, and the file RECORDS-WITH-SEIZURES lists the 129 of those files that contain one or more seizures. In all, these records include 198 seizures (182 in the original set of 23 cases); the beginning ([) and end (]) of each seizure is annotated in the .seizure annotation files that accompany each of the files listed in RECORDS-WITH-SEIZURES. In addition, the files named chbnn-summary.txt contain information about the montage used for each recording, and the elapsed time in seconds from the beginning of each .edf file to the beginning and end of each seizure contained in it.
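As a hedged illustration, the sketch below opens one record with the MNE-Python library; chb01_03.edf is one of the files listed in RECORDS, but using MNE (rather than the WFDB tools) is an assumption, not part of the database documentation.

    import mne

    raw = mne.io.read_raw_edf("chb01_03.edf", preload=True)
    print(raw.info["sfreq"])  # 256.0 samples per second, per the description

    # Drop any "dummy" signals named "-" (see above); real channels keep
    # their 10-20 names such as FP1-F7.
    dummies = [ch for ch in raw.ch_names if set(ch) <= {"-", ".", " "}]
    raw.drop_channels(dummies)
    print(len(raw.ch_names))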

  6. EEG Datasets for Naturalistic Listening to "Alice in Wonderland" (Version 1)...

    • deepblue.lib.umich.edu
    Updated Nov 20, 2018
    + more versions
    Cite
    Brennan, Jonathan R. (2018). EEG Datasets for Naturalistic Listening to "Alice in Wonderland" (Version 1) [Dataset]. http://doi.org/10.7302/Z29C6VNH
    Explore at:
    Dataset updated
    Nov 20, 2018
    Dataset provided by
    Deep Blue Data
    Authors
    Brennan, Jonathan R.
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

These files contain the raw data and processing parameters to go with the paper "Hierarchical structure guides rapid linguistic predictions during naturalistic listening" by Jonathan R. Brennan and John T. Hale. They include the stimulus (wav files), the raw data (MATLAB format for the Fieldtrip toolbox), the data processing parameters (MATLAB), and the variables used to align the stimuli with the EEG data and for the statistical analyses reported in the paper.

  7. General-Disorders-EEG-Dataset-v1

    • huggingface.co
    Updated Nov 25, 2025
    + more versions
    Cite
    Neurazum (2025). General-Disorders-EEG-Dataset-v1 [Dataset]. http://doi.org/10.57967/hf/3321
    Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 25, 2025
    Dataset authored and provided by
    Neurazum
    License

Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    Synthetic EEG data generated by the ‘bai’ model based on real data.

Features/Columns:

• No: number
• Sex: gender
• Age: age of participants
• EEG Date: the date of the EEG
• Education: education level
• IQ: IQ level of participants
• Main Disorder: general class definition of the disorder
• Specific Disorder: specific class definition of the disorder

    Total Features/Columns: 1140

Content:

Obsessive Compulsive Disorder, Bipolar Disorder, Schizophrenia… See the full description on the dataset page: https://huggingface.co/datasets/Neurazum/General-Disorders-EEG-Dataset-v1.
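A minimal sketch of pulling the dataset from the Hugging Face Hub with the datasets library; the repository id comes from the link above, while the split name "train" is an assumption.

    from datasets import load_dataset

    ds = load_dataset("Neurazum/General-Disorders-EEG-Dataset-v1", split="train")
    print(len(ds.column_names))  # 1140 total features/columns, per the card
    print(ds.column_names[:8])   # the descriptive columns listed above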

  8. Auditory evoked potential EEG-Biometric dataset

    • physionet.org
    Updated Dec 1, 2021
    Cite
    Nibras Abo Alzahab; Angelo Di Iorio; Luca Apollonio; Muaaz Alshalak; Alessandro Gravina; Luca Antognoli; Marco Baldi; Lorenzo Scalise; Bilal Alchalabi (2021). Auditory evoked potential EEG-Biometric dataset [Dataset]. http://doi.org/10.13026/ps31-fc50
    Explore at:
    Dataset updated
    Dec 1, 2021
    Authors
    Nibras Abo Alzahab; Angelo Di Iorio; Luca Apollonio; Muaaz Alshalak; Alessandro Gravina; Luca Antognoli; Marco Baldi; Lorenzo Scalise; Bilal Alchalabi
    License

https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    This data set consists of over 240 two-minute EEG recordings obtained from 20 volunteers. Resting-state and auditory stimuli experiments are included in the data. The goal is to develop an EEG-based Biometric system.

The data includes resting-state EEG signals in two conditions: eyes open and eyes closed. The auditory stimuli part consists of six experiments: three with in-ear auditory stimuli and three with bone-conducting auditory stimuli. The three stimuli for each case are a native song, a non-native song, and neutral music.

  9. Data from: A multi-subject and multi-session EEG dataset for modelling human...

    • openneuro.org
    Updated Jun 7, 2025
    Cite
    Shuning Xue; Bu Jin; Jie Jiang; Longteng Guo; Jin Zhou; Changyong Wang; Jing Liu (2025). A multi-subject and multi-session EEG dataset for modelling human visual object recognition [Dataset]. http://doi.org/10.18112/openneuro.ds005589.v1.0.3
    Explore at:
    Dataset updated
    Jun 7, 2025
    Dataset provided by
OpenNeuro: https://openneuro.org/
    Authors
    Shuning Xue; Bu Jin; Jie Jiang; Longteng Guo; Jin Zhou; Changyong Wang; Jing Liu
    License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Overview

This multi-subject and multi-session EEG dataset for modelling human visual object recognition (MSS) contains:

1. 122-channel EEG data collected from 32 participants during natural visual stimulation;
2. 100 sessions in total, 1.5 hours each;
3. 4 RSVP runs and 4 low-speed presentation runs per session;
4. between 1 and 5 sessions per participant, on different days around one week apart.

    More details about the dataset are described as follows.

    Participants

32 participants were recruited from college students in Beijing (4 female, 28 male; age range 21-33 years), and 100 sessions were conducted in total. Participants were paid and gave written informed consent. The study was conducted with the approval of the ethics committee of the Institute of Automation of the Chinese Academy of Sciences, approval number IA21-2410-020201.

    Experimental Procedures

    1. RSVP experiment: During the RSVP experiment, the participants were shown images at a rate of 5 Hz, and each run consisted of 2,000 trials. There were 20 image categories, with 100 images in each category, making up the 2,000 stimuli. The 100 images in each category were further divided into five image sequences, resulting in 100 image sequences per run. Each sequence was composed of 20 images from the same class, and the 100 sequences were presented in a pseudo-random order.

    After every 50 sequences, there was a break for the participants to rest. Each rapid serial sequence lasted approximately 7.5 seconds, starting with a 750ms blank screen with a white fixation cross, followed by 20 or 21 images presented at 5 Hz with a 50% duty cycle. The sequence ended with another 750ms blank screen.

    After the rapid serial sequence, there was a 2-second interval during which participants were instructed to blink and then report whether a special image appeared in the sequence using a keyboard. During each run, 20 sequences were randomly inserted with additional special images at random positions. The special images are logos for brain-computer interfaces.

2. Low-speed experiment: During the low-speed experiment, each run consisted of 100 trials, with 1 second per image for a slower paradigm. The 100 stimuli were presented in a pseudo-random order and included 20 image categories, each containing 5 images. A break was given to the participants after every 20 images for them to rest.

    Each image was displayed for 1 second and was followed by 11 choice boxes (1 correct class box, 9 random class boxes, and 1 reject box). Participants were required to select the correct class of the displayed image using a mouse to increase their engagement. After the selection, a white fixation cross was displayed for 1 second in the centre of the screen to remind participants to pay attention to the upcoming task.

    Stimuli

    The stimuli are from two image databases, ImageNet and PASCAL. The final set consists of 10,000 images, with 500 images for each class.

    Annotations

In the derivatives/annotations folder, there is additional information about MSS:

    1. Videos of two paradigms.
    2. Dataset_info: Main features of MSS.
    3. Experiment_schedule: Schedule of each session.
    4. Stimuli_source: Source categories of ImageNet and PASCAL.
    5. Subject_info: Age and sex of participants.
    6. Task_event: The meaning of eventID.

    Preprocessing

    The EEG signals were pre-processed using the MNE package, version 1.3.1, with Python 3.9.16. The data was sampled at a rate of 1,000 Hz with a bandpass filter applied between 0.1 and 100 Hz. A notch filter was used to remove 50 Hz power frequency. Epochs were created for each trial ranging from 0 to 500 ms relative to stimulus onset. No further preprocessing or artefact correction methods were applied in technical validation. However, researchers may want to consider widely used preprocessing steps such as baseline correction or eye movement correction. After the preprocessing, each session resulted in two matrices: RSVP EEG data matrix of shape (8,000 image conditions × 122 EEG channels × 125 EEG time points) and low-speed EEG data matrix of shape (400 image conditions × 122 EEG channels × 125 EEG time points).
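A sketch, not the authors' exact pipeline: the filtering and epoching described above can be reproduced with MNE roughly as follows. The raw file name is hypothetical (the dataset is BIDS-formatted on OpenNeuro), and the event-extraction details are assumptions.

    import mne

    raw = mne.io.read_raw("sub-01_ses-01_task-rsvp_eeg.vhdr", preload=True)  # hypothetical name
    raw.filter(l_freq=0.1, h_freq=100.0)  # band-pass 0.1-100 Hz
    raw.notch_filter(freqs=50.0)          # remove 50 Hz power-line noise

    # Epochs from 0 to 500 ms relative to stimulus onset, no baseline
    # correction, matching the technical-validation description above.
    events, event_id = mne.events_from_annotations(raw)
    epochs = mne.Epochs(raw, events, tmin=0.0, tmax=0.5, baseline=None, preload=True)
    print(epochs.get_data().shape)  # (n_trials, n_channels, n_times)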

  10. Harvard Electroencephalography Database

    • bdsp.io
    • registry.opendata.aws
    Updated Feb 10, 2025
    + more versions
    Cite
    Sahar Zafar; Tobias Loddenkemper; Jong Woo Lee; Andrew Cole; Daniel Goldenholz; Jurriaan Peters; Alice Lam; Edilberto Amorim; Catherine Chu; Sydney Cash; Valdery Moura Junior; Aditya Gupta; Manohar Ghanta; Marta Fernandes; Haoqi Sun; Jin Jing; M Brandon Westover (2025). Harvard Electroencephalography Database [Dataset]. http://doi.org/10.60508/k85b-fc87
    Explore at:
    Dataset updated
    Feb 10, 2025
    Authors
    Sahar Zafar; Tobias Loddenkemper; Jong Woo Lee; Andrew Cole; Daniel Goldenholz; Jurriaan Peters; Alice Lam; Edilberto Amorim; Catherine Chu; Sydney Cash; Valdery Moura Junior; Aditya Gupta; Manohar Ghanta; Marta Fernandes; Haoqi Sun; Jin Jing; M Brandon Westover
    License

https://github.com/bdsp-core/bdsp-license-and-dua

    Description

    The Harvard EEG Database will encompass data gathered from four hospitals affiliated with Harvard University: Massachusetts General Hospital (MGH), Brigham and Women's Hospital (BWH), Beth Israel Deaconess Medical Center (BIDMC), and Boston Children's Hospital (BCH). The EEG data includes three types:

    rEEG: "routine EEGs" recorded in the outpatient setting.
    EMU: recordings obtained in the inpatient setting, within the Epilepsy Monitoring Unit (EMU).
    ICU/LTM: recordings obtained from acutely and critically ill patients within the intensive care unit (ICU).
    
  11. Data from: A Resting-state EEG Dataset for Sleep Deprivation

    • openneuro.org
    Updated Apr 27, 2025
    + more versions
    Cite
    Chuqin Xiang; Xinrui Fan; Duo Bai; Ke Lv; Xu Lei (2025). A Resting-state EEG Dataset for Sleep Deprivation [Dataset]. http://doi.org/10.18112/openneuro.ds004902.v1.0.8
    Explore at:
    Dataset updated
    Apr 27, 2025
    Dataset provided by
OpenNeuro: https://openneuro.org/
    Authors
    Chuqin Xiang; Xinrui Fan; Duo Bai; Ke Lv; Xu Lei
    License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    General information

The dataset provides resting-state EEG data (eyes open, partially eyes closed) from 71 participants who underwent two experiments involving normal sleep (NS, session 1) and sleep deprivation (SD, session 2). The dataset also provides information on participants' sleepiness and mood states. (Please note that session 1 (NS) and session 2 (SD) do not reflect the temporal order; the order was counterbalanced across participants and is listed in the metadata.)

    Dataset

    Presentation

The data collection was initiated in March 2019 and terminated in December 2020. A detailed description of the dataset is currently being prepared by Chuqin Xiang, Xinrui Fan, Duo Bai, Ke Lv and Xu Lei, and will be submitted to Scientific Data for publication.

    EEG acquisition

• EEG system: Brain Products GmbH (Steingrabenstr., Germany), 61 electrodes
• Sampling frequency: 500 Hz
• Impedances were kept below 5 kΩ

    Contact

If you have any questions or comments, please contact:
Xu Lei: xlei@swu.edu.cn
    

    Article

    Xiang, C., Fan, X., Bai, D. et al. A resting-state EEG dataset for sleep deprivation. Sci Data 11, 427 (2024). https://doi.org/10.1038/s41597-024-03268-2

  12. The Phantom EEG Dataset

    • zenodo.org
    bin, tar
    Updated Oct 14, 2024
    Cite
    Anonymous; Anonymous (2024). The Phantom EEG Dataset [Dataset]. http://doi.org/10.5281/zenodo.13341214
    Explore at:
bin, tar
Available download formats
    Dataset updated
    Oct 14, 2024
    Dataset provided by
Zenodo: http://zenodo.org/
    Authors
    Anonymous; Anonymous
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

When you use this dataset, please cite the paper below. More information about this dataset can also be found in that paper.

    Xu, X., Wang, B., Xiao, B., Niu, Y., Wang, Y., Wu, X., & Chen, J. (2024). Beware of Overestimated Decoding Performance Arising from Temporal Autocorrelations in Electroencephalogram Signals. arXiv preprint arXiv:2405.17024.

    1 Metadata

    Brief introduction

The present work aims to demonstrate that temporal autocorrelations (TA) significantly impact various BCI tasks even in conditions without neural activity. We used watermelons as phantom heads and found that the pitfall of overestimated decoding performance arises if continuous EEG data with the same class label are split into training and test sets. More details can be found in Motivation.

As watermelons cannot perform any experimental tasks, we can reorganize the data into the format of various actual EEG datasets without the need to collect EEG data as previous work did (examples in Domain Studied).

    Measurement devices

    Manufacturers: NeuroScan SynAmps2 system (Compumedics Limited, Victoria, Australia)

    Configuration: 64-channel Ag/AgCl electrode cap with a 10/20 layout

    Species

    Watermelons. Ten watermelons served as phantom heads.

    Domain Studied

    Overestimated Decoding Performance in EEG decoding.

The following BCI datasets from various BCI tasks have been reorganized using the Phantom EEG Dataset. The pitfall was found in four of the five tasks.

    - CVPR dataset [1] for image decoding task.

    - DEAP dataset [2] for emotion recognition task.

    - KUL dataset [3] for auditory spatial attention decoding task.

    - BCIIV2a dataset [4] for motor imagery task (the pitfalls were absent due to the use of rapid-design paradigm during EEG recording).

    - SIENA dataset [5] for epilepsy detection task.

    Tasks Completed

Resting state, although you could reorganize it to any BCI task.

    Dataset Name

    The Phantom EEG Dataset

    Dataset license

    Creative Commons Attribution 4.0 International

    Code

You can find the code to read the data files (.cnt or .set) in the “code” folder.

To run the code, install the mne and numpy packages, e.g. via pip:

    pip install mne==1.3.1

    pip install numpy

Then you can use “BID2WMCVPR.py” to convert the BIDS-formatted data to the WM-CVPR dataset, or “CNTK2WMCVPR.py” to convert the CNT data to the WM-CVPR dataset.
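Independently of those conversion scripts, here is a hedged sketch of simply reading one subject's data in either provided format; the file names are placeholders for the S*.cnt and sub-S* files described below.

    import mne

    raw_cnt = mne.io.read_raw_cnt("S1.cnt", preload=True)
    raw_set = mne.io.read_raw_eeglab("sub-S1_task-RestingState_eeg.set", preload=True)
    print(raw_cnt.info["sfreq"], raw_set.info["sfreq"])  # 1000 Hz, per Recordings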

The code to reorganize datasets other than CVPR [1] will be released on GitHub after review.

    Data information

    - CNT: the raw data.

    Each Subject (S*.cnt) contains the following information:

    EEG.data: EEG data (samples X channels)

    EEG.srate: Sampling frequency of the saved data

    EEG.chanlocs : channel numbers (1 to 68, ‘EKG’ ‘EMG’ 'VEO' 'HEO' were not recorded)

    - BIDS: an extension to the brain imaging data structure for electroencephalography. BIDS primarily addresses the heterogeneity of data organization by following the FAIR principles [6].

    Each Subject (sub-S*/eeg/) contains the following information:

    sub-S*_task-RestingState_channels.tsv: channel numbers (1 to 68, ‘EKG’ ‘EMG’ 'VEO' 'HEO' were not recorded)

    sub-S*_task-RestingState_eeg.json: Some information about the dataset.

    sub-S*_task-RestingState_eeg.set: EEG data (samples X channels)

sub-S*_task-RestingState_events.tsv: the events during recording. We organized events using block-design and rapid-event-design; however, this does not need to be considered in any subsequent data reorganization, as watermelons cannot follow experimental instructions.

    - code: more information on Code.

    - readme.md: the information about the dataset.

    Recordings

    An additional electrode was placed on the lower part of the watermelon as the physiological reference, and the forehead served as the ground site. The inter-electrode impedances were maintained under 20 kOhm. Data were recorded at a sampling rate of 1000 Hz. EEG recordings for each watermelon lasted for more than 1 hour to ensure sufficient data for the decoding task.

    Citation and more information

    Citation will be updated after the review period is completed.

    We will provide more information about this dataset (e.g. the units of the captured data) once our work is accepted. This is because our work is currently under review, and we are not allowed to disclose more information according to the relevant requirements.

    All metadata will be provided as a backup on Github and will be available after the review period is completed.

    2 Motivation

    Researchers have reported high decoding accuracy (>95%) using non-invasive Electroencephalogram (EEG) signals for brain-computer interface (BCI) decoding tasks like image decoding, emotion recognition, auditory spatial attention detection, epilepsy detection, etc. Since these EEG data were usually collected with well-designed paradigms in labs, the reliability and robustness of the corresponding decoding methods were doubted by some researchers, and they proposed that such decoding accuracy was overestimated due to the inherent temporal autocorrelations (TA) of EEG signals [7]–[9].

    However, the coupling between the stimulus-driven neural responses and the EEG temporal autocorrelations makes it difficult to confirm whether this overestimation exists in truth. Some researchers also argue that the effect of TA in EEG data on decoding is negligible and that it becomes a significant problem only under specific experimental designs in which subjects do not have enough resting time [10], [11].

Due to a lack of problem formulation, previous studies [7]–[9] only proposed that block-design should not be used, in order to avoid the pitfall. However, the impact of TA can be avoided only when each EEG trial is not further segmented into several samples; otherwise the overfitting, or pitfall, still occurs. In contrast, when a correct data-splitting strategy is used (e.g. separating training and test data in time), the pitfall can be avoided even when block-design is used.
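The splitting point can be made concrete with a toy numpy sketch (illustrative only, not the authors' code): when one continuous recording with a single class label is cut into samples, the training/test boundary should be drawn in time, not at random.

    import numpy as np

    n_samples, n_train = 1000, 800
    idx = np.arange(n_samples)  # sample order = time order

    # Pitfall: a random split interleaves training and test samples in time,
    # so shared slow "domain" features leak across the split.
    rng = np.random.default_rng(0)
    perm = rng.permutation(idx)
    train_bad, test_bad = perm[:n_train], perm[n_train:]

    # Safer: a temporal split keeps all test samples after all training samples.
    train_ok, test_ok = idx[:n_train], idx[n_train:]
    print(train_ok[-1], test_ok[0])  # 799 800 -> disjoint in time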

    In our framework, we proposed the concept of "domain" to represent the EEG patterns resulting from TA and then used phantom EEG to remove stimulus-driven neural responses for verification. The results confirmed that the TA, always existing in the EEG data, added unique domain features to a continuous segment of EEG. The specific finding is that when the segment of EEG data with the same class label is split into multiple samples, the classifier will associate the sample's class label with the domain features, interfering with the learning of class-related features. This leads to an overestimation of decoding performance for test samples from the domains seen during training, and results in poor accuracy for test samples from unseen domains (as in real-world applications).

    Importantly, our work suggests that the key to reducing the impact of EEG TA on BCI decoding is to decouple class-related features from domain features in the actual EEG dataset. Our proposed unified framework serves as a reminder to BCI researchers of the impact of TA on their specific BCI tasks and is intended to guide them in selecting the appropriate experimental design, splitting strategy and model construction.

    3 The rationality for using watermelon as the phantom head

We must point out that the "phantom EEG" indeed does not contain any "EEG" but records only noise: a watermelon is not a brain and does not generate any electrical signals. Therefore, the recorded electrical noises, even when amplified using equipment typically used for EEG, do not constitute EEG data under the definition of EEG. This is why previous researchers called it "phantom EEG". Some researchers may therefore think that it is questionable to use watermelon to get the phantom EEG.

    However, the usage of the phantom head allows researchers to evaluate the performance of neural-recording equipment and proposed algorithms without the effects of neural activity variability, artifacts, and potential ethical issues. Phantom heads used in previous studies include digital models [12]–[14], real human skulls [15]–[17], artificial physical phantoms [18]–[24] and watermelons [25]–[40]. Due to their similar conductivity to human tissue, similar size and shape to the human head, and ease of acquisition, watermelons are widely used as "phantom heads".

Most works used watermelon as a phantom head and found that the results obtained from the neural signals of human subjects could not be reproduced with the phantom head, thus proving that the achieved results were indeed caused by neural signals. For example, Mutanen et al. [35] proposed that “the fact that the phantom head stimulation did not evoke similar biphasic artifacts excludes the possibility that residual induced artifacts, with the current TMS-compatible EEG system, could explain these components”.

Our work differs significantly from most previous works. It is the first to find that phantom EEG exhibits the effect of TA on BCI decoding even when only noise is recorded, indicating that TA is inherent in EEG data. The conclusion we hope to draw is that some current works may not truly use stimulus-driven neural responses.

  13. Emotions based EEG dataset

    • kaggle.com
    zip
    Updated Sep 30, 2023
    Cite
    Thejaswinishrinivas (2023). Emotions based EEG dataset [Dataset]. https://www.kaggle.com/datasets/thejaswinishrinivas/emotions-based-eeg-dataset
    Explore at:
zip (76,305,134 bytes)
Available download formats
    Dataset updated
    Sep 30, 2023
    Authors
    Thejaswinishrinivas
    License

https://creativecommons.org/publicdomain/zero/1.0/

    Description

The dataset comprises EEG data from 46 different subjects (22 for commercial advertisements and 24 for Kannada music clips), recorded using a 2-channel EEG device.

The dataset folder contains two subfolders:
1. Commercial advertisement: Channel_1 (Ch_1) and Channel_2 (Ch_2): prefrontal cortex
2. Kannada musical clips: Channel_1 (Ch_1) and Channel_2 (Ch_2): left brain

Excel file information: in each file, columns represent subjects and rows represent features per subject. There are 12 Excel files in total from the two channels (6 for commercial advertisements and 6 for Kannada music clips).

Subjective self-rating scale:

• Name
• Age
• Gender
• Have you ever had any health issues? YES / NO
• Have you watched this song/advertisement before? YES / NO
• Please let us know if this advertisement brings up any specific memories for you. YES / NO
• Please rate the following queries from 1 to 10:
  - How funny was the advertisement you watched?
  - How sad was the advertisement you watched?
  - How horrifying was the advertisement you watched?
  - How relaxing was the music you listened to?
  - How sad was the music you listened to?
  - How enjoyable was the music you listened to?
• Do you think what you just watched was entertaining enough?
• If you have any comments, please write them here.

Here are the website addresses for each stimulus that we considered:

• ad1: https://www.youtube.com/watch?v=ZzG7duipQ7U&ab_channel=perfettiindia
• ad2: https://www.youtube.com/watch?v=SfAxUpeVhCg&ab_channel=bo0fhead
• ad3: https://www.youtube.com/watch?v=HqGsT6VM8Vg&ab_channel=kiddlestix
• song1: https://www.youtube.com/hashtag/kgfchapter2
• song2: https://www.youtube.com/watch?v=x43w4lLS9E0&ab_channel=AnandAudio
• song3: https://youtube.com/watch?v=Ysf4QRrcLGM&si=EnSIkaIECMiOmarE

For a more comprehensive understanding of the dataset and its background, we kindly ask researchers to refer to our associated manuscript:

"Entertainment Based Database for Emotion Recognition from EEG Signals", accepted at the 3rd International Conference on Applied Intelligence and Informatics (AII2023), 29-31 October 2023, Dubai, UAE. (When utilizing this dataset in your research, please consider citing this reference.)

  14. Preprocessed CHB-MIT Scalp EEG Database

    • ieee-dataport.org
    Updated Jan 24, 2023
    Cite
    Mrs Deepa .B (2023). Preprocessed CHB-MIT Scalp EEG Database [Dataset]. https://ieee-dataport.org/open-access/preprocessed-chb-mit-scalp-eeg-database
    Explore at:
    Dataset updated
    Jan 24, 2023
    Authors
    Mrs Deepa .B
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

‘Univ. of Bonn’ and ‘CHB-MIT Scalp EEG Database’ are publicly available datasets that are among the most sought after by researchers. The Bonn dataset is very small compared to CHB-MIT, yet researchers often prefer Bonn because it comes in a simple '.txt' format. The dataset published here is a preprocessed form of CHB-MIT, available in '.csv' format.

  15. EEG and audio dataset for auditory attention decoding

    • zenodo.org
    bin, zip
    Updated Jan 31, 2020
    Cite
Søren A. Fuglsang; Daniel D.E. Wong; Jens Hjortkjær (2020). EEG and audio dataset for auditory attention decoding [Dataset]. http://doi.org/10.5281/zenodo.1199011
    Explore at:
zip, bin
Available download formats
    Dataset updated
    Jan 31, 2020
    Dataset provided by
Zenodo: http://zenodo.org/
    Authors
Søren A. Fuglsang; Daniel D.E. Wong; Jens Hjortkjær
    License

Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This dataset contains EEG recordings from 18 subjects listening to one of two competing speech audio streams. Continuous speech in trials of ~50 sec. was presented to normal hearing listeners in simulated rooms with different degrees of reverberation. Subjects were asked to attend one of two spatially separated speakers (one male, one female) and ignore the other. Repeated trials with presentation of a single talker were also recorded. The data were recorded in a double-walled soundproof booth at the Technical University of Denmark (DTU) using a 64-channel Biosemi system and digitized at a sampling rate of 512 Hz. Full details can be found in:

    • Søren A. Fuglsang, Torsten Dau & Jens Hjortkjær (2017): Noise-robust cortical tracking of attended speech in real-life environments. NeuroImage, 156, 435-444

    and

    • Daniel D.E. Wong, Søren A. Fuglsang, Jens Hjortkjær, Enea Ceolini, Malcolm Slaney & Alain de Cheveigné: A Comparison of Temporal Response Function Estimation Methods for Auditory Attention Decoding. Frontiers in Neuroscience, https://doi.org/10.3389/fnins.2018.00531

The data is organized in the format of the publicly available COCOHA Matlab Toolbox. The preproc_script.m demonstrates how to import and align the EEG and audio data; it also demonstrates some EEG preprocessing steps as used in the Wong et al. paper above. AUDIO.zip contains wav files with the speech audio used in the experiment. EEG.zip contains MAT-files with the EEG/EOG data for each subject. The EEG/EOG data are found in data.eeg with the following channels (a small channel-indexing sketch follows the lists below):

    • channels 1-64: scalp EEG electrodes
    • channel 65: right mastoid electrode
    • channel 66: left mastoid electrode
    • channel 67: vertical EOG below right eye
    • channel 68: horizontal EOG right eye
    • channel 69: vertical EOG above right eye
    • channel 70: vertical EOG below left eye
    • channel 71: horizontal EOG left eye
    • channel 72: vertical EOG above left eye

The expinfo table contains information about the experimental conditions, including which speaker the listener was attending to in different trials. It contains the following fields:

    • attend_mf: attended speaker (1=male, 2=female)
    • attend_lr: spatial position of the attended speaker (1=left, 2=right)
    • acoustic_condition: type of acoustic room (1= anechoic, 2= mild reverberation, 3= high reverberation, see Fuglsang et al. for details)
    • n_speakers: number of speakers presented (1 or 2)
    • wavfile_male: name of presented audio wav-file for the male speaker
    • wavfile_female: name of presented audio wav-file for the female speaker (if any)
    • trigger: trigger event value for each trial also found in data.event.eeg.value

    DATA_preproc.zip contains the preprocessed EEG and audio data as output from preproc_script.m.
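A small channel-indexing sketch for the layout above, assuming one trial's EEG/EOG has already been loaded into a (samples × 72) numpy array; loading the COCOHA MAT-files themselves is left to preproc_script.m.

    import numpy as np

    eeg_all = np.zeros((512 * 50, 72))  # stand-in: ~50 s at 512 Hz, 72 channels

    scalp = eeg_all[:, 0:64]      # channels 1-64: scalp EEG electrodes
    mastoids = eeg_all[:, 64:66]  # channels 65-66: right / left mastoid
    eog = eeg_all[:, 66:72]       # channels 67-72: EOG electrodes
    print(scalp.shape, mastoids.shape, eog.shape)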

    The dataset was created within the COCOHA Project: Cognitive Control of a Hearing Aid

  16. EEG dataset for the analysis of age-related changes in motor-related...

    • figshare.com
    png
    Updated Nov 19, 2020
    Cite
    Nikita Frolov; Elena Pitsik; Vadim V. Grubov; Anton R. Kiselev; Vladimir Maksimenko; Alexander E. Hramov (2020). EEG dataset for the analysis of age-related changes in motor-related cortical activity during a series of fine motor tasks performance [Dataset]. http://doi.org/10.6084/m9.figshare.12301181.v2
    Explore at:
png
Available download formats
    Dataset updated
    Nov 19, 2020
    Dataset provided by
Figshare: http://figshare.com/
    Authors
    Nikita Frolov; Elena Pitsik; Vadim V. Grubov; Anton R. Kiselev; Vladimir Maksimenko; Alexander E. Hramov
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

EEG signals were acquired from 20 healthy right-handed subjects performing a series of fine motor tasks cued by audio commands. The participants were divided equally into two distinct age groups: (i) 10 elderly adults (EA group, aged 55-72, 6 females) and (ii) 10 young adults (YA group, aged 19-33, 3 females).

The active phase of the experimental session included sequential execution of 60 fine motor tasks: squeezing a hand into a fist after the first audio command and holding it until the second audio command (30 repetitions per hand) (see Fig. 1). The duration of the audio command determined the type of motor action to be executed: 0.25 s for a left hand (LH) movement and 0.75 s for a right hand (RH) movement. The time interval between the two audio signals was selected randomly in the range 4-5 s for each trial. The sequence of motor tasks was randomized, and the pause between tasks was also chosen randomly in the range 6-8 s, to exclude possible training or motor-preparation effects caused by sequential execution of the same tasks.

Acquired EEG signals were then processed via preprocessing tools implemented in the MNE Python package. Specifically, raw EEG signals were filtered by a 5th-order Butterworth filter in the range 1-100 Hz and by a 50 Hz notch filter. Further, Independent Component Analysis (ICA) was applied to remove ocular and cardiac artifacts. Artifact-free EEG recordings were then segmented into 60 epochs according to the experimental protocol. Each epoch was 14 s long, including 3 s of baseline and 11 s of motor-related brain activity, and time-locked to the first audio command indicating the start of motor execution. After visual inspection, epochs that still contained artifacts were rejected. Finally, 15 epochs per movement type were stored for each subject.

Individual epochs for each subject are stored in the attached MNE .fif files. The prefix EA or YA in the file name identifies the age group the subject belongs to; the postfix LH or RH indicates the type of motor task.
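A minimal sketch of reading one subject's epochs with MNE; the exact file name is an assumption that combines the EA/YA prefix and LH/RH postfix described above with MNE's usual -epo.fif suffix.

    import mne

    epochs = mne.read_epochs("EA_subject01_LH-epo.fif")  # hypothetical name
    data = epochs.get_data()  # (n_epochs, n_channels, n_times)
    print(data.shape)         # 15 epochs of 14 s each, per the description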

  17. Longitudinal ALS EEG Dataset for Motor Imagery Studies

    • rdr.ucl.ac.uk
    bin
    Updated Jan 24, 2025
    Cite
    Rishan Patel; Dai Jiang; Barney Bryson; Tom Carlson; Andreas Demosthenous; Andrew Geronimo (2025). Longitudinal ALS EEG Dataset for Motor Imagery Studies [Dataset]. http://doi.org/10.5522/04/28156016.v1
    Explore at:
bin
Available download formats
    Dataset updated
    Jan 24, 2025
    Dataset provided by
    University College London
    Authors
    Rishan Patel; Dai Jiang; Barney Bryson; Tom Carlson; Andreas Demosthenous; Andrew Geronimo
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

This dataset comprises EEG recordings from eight ALS patients aged between 45.5 and 74 years. Patients exhibited revised ALS Functional Rating Scale (ALSFRS-R) scores ranging from 0 to 46, with time since symptom onset (TSSO) varying between 12 and 113 months. Notably, no disease progression was reported during the study period, ensuring stability in clinical conditions. The participants were recruited from the Penn State Hershey Medical Center ALS Clinic and had confirmed ALS diagnoses without significant dementia. This rigorous selection criterion ensured the validity and reliability of the dataset for motor imagery analysis in an ALS population.

The EEG data were collected using 19 electrodes placed according to the international 10-20 system (FP1, FP2, F7, F3, FZ, F4, F8, T7, C3, CZ, C4, T8, P7, P3, PZ, P4, P8, O1, O2), with signals referenced to linked earlobes and a ground electrode at FPz. Additionally, three electrooculogram (EOG) electrodes were employed to facilitate artifact removal, maintaining impedance levels below 10 kΩ throughout data acquisition. The data were amplified using two g.USBamp systems (g.tec GmbH) and recorded via the BCI2000 software suite, with supplementary preprocessing in MATLAB. All experimental procedures adhered strictly to Penn State University's IRB protocol PRAMSO40647EP, ensuring ethical compliance.

Each participant underwent four brain-computer interface (BCI) sessions conducted over a period of 1 to 2 months. Each session consisted of four runs, with 10 trials per class (left hand, right hand, and rest), for a total of 40 trials per session. The sessions began with a calibration run to initialize the system, followed by feedback runs during which participants controlled a cursor's movement through motor imagery, specifically imagined grasping movements. The study design, focused on motor imagery (MI), generated a total of 160 trials per participant over two months.

This dataset holds significance in studying the longitudinal dynamics of motor imagery decoding in ALS patients. To ensure reproducibility of our findings and to promote advancements in the field, we have received explicit permission from Prof. Geronimo of Penn State University to distribute this dataset in processed format for research purposes. The original publication of this collection can be found below.

How to use this dataset: the data are structured in MATLAB as a collection of subject-specific structs, where each subject is represented as a single struct with three fields:

• L: trials corresponding to Left Motor Imagery.
• R: trials corresponding to Right Motor Imagery.
• Re: trials corresponding to the Rest state.

Each field contains an array of trials, where each trial is a matrix with rows as timestamps and columns as channels.

Primary collection: Geronimo A, Simmons Z, Schiff SJ. Performance predictors of brain-computer interfaces in patients with amyotrophic lateral sclerosis. Journal of Neural Engineering 2016; 13. 10.1088/1741-2560/13/2/026002.

All code for publications using this data is publicly available at:
https://github.com/rishannp/Auto-Adaptive-FBCSP
https://github.com/rishannp/Motor-Imagery---Graph-Attention-Network
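A hedged sketch of loading one of the per-subject structs in Python with SciPy; the file and variable names are assumptions, while the L / R / Re fields follow the description above.

    from scipy.io import loadmat

    mat = loadmat("subject1.mat", squeeze_me=True, struct_as_record=False)
    subj = mat["subject1"]  # hypothetical variable name inside the file

    left_trials = subj.L          # Left Motor Imagery trials
    print(left_trials[0].shape)   # one trial: (timestamps, channels)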

  18. Data from: EmoKey Moments Muse EEG Dataset (EKM-ED): A Comprehensive...

    • data.niaid.nih.gov
    • produccioncientifica.ugr.es
    • +3more
    Updated Nov 10, 2023
    Cite
    Francisco M. Garcia-Moreno; Marta Badenes-Sastre (2023). EmoKey Moments Muse EEG Dataset (EKM-ED): A Comprehensive Collection of Muse S EEG Data and Key Emotional Moments [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8431450
    Explore at:
    Dataset updated
    Nov 10, 2023
    Dataset provided by
    Universidad de Granada
    Authors
    Francisco M. Garcia-Moreno; Marta Badenes-Sastre
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    EmoKey Moments Muse EEG Dataset (EKM-ED): A Comprehensive Collection of Muse S EEG Data and Key Emotional Moments

    Dataset Description:

    The EmoKey Moments EEG Dataset (EKM-ED) is an intricately curated dataset amassed from 47 participants, detailing EEG responses as they engage with emotion-eliciting video clips. Covering a spectrum of emotions, this dataset holds immense value for those diving deep into human cognitive responses, psychological research, and emotion-based analyses.

    Dataset Highlights:

    Precise Timestamps: Capturing the exact millisecond of EEG data acquisition, ensuring unparalleled granularity.

    Brainwave Metrics: Illuminating the variety of cognitive states through the prism of Delta, Theta, Alpha, Beta, and Gamma waves.

    Motion Data: Encompassing the device's movement in three dimensions for enhanced contextuality.

    Auxiliary Indicators: Key elements like the device's positioning, battery metrics, and user-specific actions are meticulously logged.

    Consent and Ethics: The dataset respects and upholds privacy and ethical standards. Every participant provided informed consent. This endeavor has received the green light from the Ethics Committee at the University of Granada, documented under the reference: 2100/CEIH/2021.

    A pivotal component of this dataset is its focus on "key moments" within the selected video clips, honing in on periods anticipated to evoke heightened emotional responses.

    Curated Video Clips within Dataset:

Film | Emotion | Duration (seconds)
The Lover | Baseline | 43
American History X | Anger | 106
Cry Freedom | Sadness | 166
Alive | Happiness | 310
Scream | Fear | 395

    The cornerstone of EKM-ED is its innovative emphasis on these key moments, bringing to light the correlation between distinct cinematic events and specific EEG responses.

    Key Emotional Moments in Dataset:

Film | Emotion | Key moment timestamps (seconds)
American History X | Anger | 36, 57, 68
Cry Freedom | Sadness | 112, 132, 154
Alive | Happiness | 227, 270, 289
Scream | Fear | 23, 42, 79, 226, 279, 299, 334

Citation: Gilman, T. L., et al. (2017). A film set for the elicitation of emotion in research. Behavior Research Methods, 49(6).

    With its unparalleled depth and focus, the EmoKey Moments EEG Dataset aims to advance research in fields such as neuroscience, psychology, and affective computing, providing a comprehensive platform for understanding and analyzing human emotions through EEG data.

    ——————————————————————————————————— FOLDER STRUCTURE DESCRIPTION ———————————————————————————————————

• questionnaires: all the response questionnaires (Spanish), raw and preprocessed, including SAM
  - preprocessed: Ficha_Evaluacion_Participante_SAM_Refactored.csv: the SAM responses for every film clip

• key_moments: the key moment timestamps for every emotion's clip

• muse_wearable_data: XXXX
  - raw
    - 1: subject ID = 1
      - muse: EEG data of the Muse device
        - ANGER_XXX.csv: log data of the anger elicitation
        - FEAR_XXX.csv: log data of the fear elicitation
        - HAPPINESS_XXX.csv: log data of the happiness elicitation
        - SADNESS_XXX.csv: log data of the sadness elicitation
      - order: film elicitation play order, e.g. HAPPINESS,SADNESS,ANGER,FEAR
  - preprocessed
    - unclean-signals: without removing EEG artifacts, noise, etc.
      - muse: EEG data of the Muse device
        - 0.0078125: data downsampled to 128 Hz from the recorded 256 Hz
    - clean-signals: EEG artifacts, noise, etc. removed
      - muse: EEG data of the Muse device
        - 0.0078125: data downsampled to 128 Hz from the recorded 256 Hz

A small sketch of windowing the preprocessed signals around the key moments follows.
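This sketch cuts windows around the key-moment timestamps listed in the table above, assuming a (samples × channels) array at the 128 Hz preprocessed rate noted in the folder tree; the +/-2 s window length is an assumption.

    import numpy as np

    FS = 128                      # Hz, the downsampled rate noted above
    key_moments_s = [36, 57, 68]  # "American History X" (Anger), from the table

    eeg = np.zeros((FS * 120, 4))  # stand-in for a 2-minute, 4-channel recording

    # +/-2 s window around each key moment.
    windows = [eeg[(t - 2) * FS:(t + 2) * FS, :] for t in key_moments_s]
    print([w.shape for w in windows])  # three (512, 4) windows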

    The ethical consent for this dataset was provided by La Comisión de Ética en Investigación de la Universidad de Granada, as documented in the approval titled: 'DETECCIÓN AUTOMÁTICA DE LAS EMOCIONES BÁSICAS Y SU INFLUENCIA EN LA TOMA DE DECISIONES MEDIANTE WEARABLES Y MACHINE LEARNING' registered under 2100/CEIH/2021.

  19. Features-EEG dataset

    • researchdata.edu.au
    • openneuro.org
    Updated Jun 29, 2023
    Cite
Tijl Grootswagers (2023). Features-EEG dataset [Dataset]. http://doi.org/10.18112/OPENNEURO.DS004357.V1.0.0
    Explore at:
    Dataset updated
    Jun 29, 2023
    Dataset provided by
OpenNeuro: https://openneuro.org/
    Western Sydney University
    Authors
Tijl Grootswagers
    License

ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

Experiment Details

Electroencephalography recordings from 16 subjects viewing fast streams of Gabor-like stimuli. Images were presented in rapid serial visual presentation streams at 6.67 Hz and 20 Hz rates. Participants performed an orthogonal fixation colour change detection task.

    Experiment length: 1 hour. Raw and preprocessed data are available through OpenNeuro: https://openneuro.org/datasets/ds004357. Supplementary material and analysis scripts are available on GitHub: https://github.com/Tijl/features-eeg

  20. Electroencephalogram Database: Prediction of Epileptic Seizures

    • neuinfo.org
    • dknet.org
    • +2more
    Updated May 10, 2005
    (2005). Electroencephalogram Database: Prediction of Epileptic Seizures [Dataset]. http://identifiers.org/RRID:SCR_008032
    Explore at:
    Dataset updated
    May 10, 2005
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE. Documented on April 29, 2025.

    Electroencephalogram (EEG) data recorded from invasive and scalp electrodes. The database contains invasive EEG recordings of 21 patients suffering from medically intractable focal epilepsy. The data were recorded during invasive pre-surgical epilepsy monitoring at the Epilepsy Center of the University Hospital of Freiburg, Germany. In eleven patients the epileptic focus was located in neocortical brain structures, in eight in the hippocampus, and in two in both. To obtain a high signal-to-noise ratio, fewer artifacts, and recordings taken directly from the focal areas, intracranial grid, strip, and depth electrodes were utilized. The EEG data were acquired using a Neurofile NT digital video EEG system with 128 channels, a 256 Hz sampling rate, and a 16-bit analogue-to-digital converter. No notch or band-pass filters were applied at acquisition.

    For each patient there are two sets of recordings: 'ictal', containing files with epileptic seizures and at least 50 min of pre-ictal data, and 'interictal', containing approximately 24 hours of EEG recordings without seizure activity. At least 24 h of continuous interictal recordings are available for 13 patients; for the remaining patients, shorter interictal recordings were joined together to provide at least 24 h per patient.

    An interdisciplinary project between:
    • Epilepsy Center, University Hospital Freiburg
    • Bernstein Center for Computational Neuroscience (BCCN), Freiburg
    • Freiburg Center for Data Analysis and Modeling (FDM)
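
    Because the recordings ship unfiltered, a common first step is to remove power-line interference and restrict the analysis band before any seizure-prediction work. Below is a minimal sketch with SciPy; the 50 Hz notch (European mains) and the 0.5-40 Hz band-pass are conventional choices, not requirements of the dataset.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, iirnotch

    FS = 256.0  # sampling rate of the Freiburg recordings (Hz)

    def preprocess(channel: np.ndarray) -> np.ndarray:
        """Illustrative cleanup for one EEG channel (1-D array).

        The dataset is distributed without any filtering; the notch and
        band-pass settings below are common conventions, not dataset specs.
        """
        # Narrow notch at 50 Hz to suppress European mains interference.
        b_n, a_n = iirnotch(w0=50.0, Q=30.0, fs=FS)
        channel = filtfilt(b_n, a_n, channel)

        # 4th-order Butterworth band-pass keeping the 0.5-40 Hz band.
        b_bp, a_bp = butter(N=4, Wn=[0.5, 40.0], btype="bandpass", fs=FS)
        return filtfilt(b_bp, a_bp, channel)
    ```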

Additional material for the featured Bonn EEG Dataset (the Kaggle EEG-Dataset record that opens this page):

**Distribution

  • The dataset is provided as five ZIP files, each containing 100 text files corresponding to the EEG segments for subsets A, B, C, D, and E.

**Citation

When using this dataset, researchers are required to cite the original publication: Andrzejak, R. G., Lehnertz, K., Mormann, F., Rieke, C., David, P., & Elger, C. E. (2001). Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Physical Review E, 64(6), 061907. DOI: 10.1103/PhysRevE.64.061907.

**Additional Notes

  1. The dataset is randomized, with no specific information provided about patients or electrode placements, ensuring simplicity and focus on signal characteristics.

  2. The data is not hosted on Kaggle or Hugging Face but is accessible directly from the University of Bonn’s repository or mirrored sources.

  3. Researchers may need to apply preprocessing steps, such as filtering or normalization, to optimize the data for machine learning tasks (see the sketch after this list).

  4. The dataset’s balanced structure and clear labels make it an excellent choice for a one-week machine learning project, particularly for tasks involving traditional algorithms like SVM, Random Forest, or Logistic Regression.

  5. This dataset provides a robust foundation for learning signal processing, feature extraction, and machine learning techniques while addressing a real-world medical challenge in epilepsy detection.
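
A minimal sketch of such a preprocessing pass, applying the 40 Hz low-pass recommended in the documentation plus optional normalization. It assumes each ASCII segment file holds one sample value per line; the path in the usage line is illustrative, not the dataset's actual file naming.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 173.61  # Bonn sampling rate (Hz)

def load_segment(path: str) -> np.ndarray:
    """Load one ASCII EEG segment (one sample value per line)."""
    return np.loadtxt(path)

def lowpass_40hz(x: np.ndarray) -> np.ndarray:
    """Apply the 40 Hz low-pass suggested in the original documentation."""
    b, a = butter(N=4, Wn=40.0, btype="lowpass", fs=FS)
    return filtfilt(b, a, x)

def zscore(x: np.ndarray) -> np.ndarray:
    """Optional per-segment normalization before feature extraction."""
    return (x - x.mean()) / x.std()

# segment = zscore(lowpass_40hz(load_segment("setA/segment_001.txt")))  # illustrative path
```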
