License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
**Overview**
The Bonn EEG Dataset is a widely used benchmark in biomedical signal processing and machine learning, designed for research in epilepsy detection and EEG signal analysis. It contains electroencephalogram (EEG) recordings from both healthy individuals and patients with epilepsy, making it suitable for tasks such as seizure detection and classification of brain activity states. The dataset is structured into five subsets (labeled A, B, C, D, and E), each comprising 100 single-channel EEG segments, for a total of 500 segments. Each segment represents 23.6 seconds of EEG data sampled at 173.61 Hz, yielding 4,097 data points per segment, stored as ASCII text files.
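Because each segment is a plain-text file with one sample value per line, it can be loaded with numpy alone. The sketch below is illustrative; the filename is a placeholder, and only the sampling rate comes from the description above.

```python
import numpy as np

FS = 173.61  # sampling rate in Hz, from the dataset description

def load_bonn_segment(path):
    """Load one single-channel Bonn segment from its ASCII text file."""
    data = np.loadtxt(path)          # one sample value per line
    duration = data.shape[0] / FS    # ~23.6 s for a full segment
    return data, duration
```

For example, `load_bonn_segment("Z001.txt")` (a hypothetical file from set A) would return the signal array together with its duration in seconds.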
**Structure and Labels**
**Key Characteristics**
**Applications**
The Bonn EEG Dataset is ideal for machine learning and signal processing tasks, including:
- Developing algorithms for epileptic seizure detection and prediction.
- Exploring feature extraction techniques, such as wavelet transforms, for EEG signal analysis.
- Classifying brain states (healthy vs. epileptic, interictal vs. ictal).
- Supporting research in neuroscience and medical diagnostics, particularly for epilepsy monitoring and treatment.
**Source**
**Citation**
When using this dataset, researchers are required to cite the original publication: Andrzejak, R. G., Lehnertz, K., Mormann, F., Rieke, C., David, P., & Elger, C. E. (2001). Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Physical Review E, 64(6), 061907. DOI: 10.1103/PhysRevE.64.061907.
**Additional Notes**
The dataset is randomized, with no specific information provided about patients or electrode placements, ensuring simplicity and focus on signal characteristics.
The data is not hosted on Kaggle or Hugging Face but is accessible directly from the University of Bonn’s repository or mirrored sources.
Researchers may need to apply preprocessing steps, such as filtering or normalization, to optimize the data for machine learning tasks.
The dataset’s balanced structure and clear labels make it an excellent choice for a one-week machine learning project, particularly for tasks involving traditional algorithms like SVM, Random Forest, or Logistic Regression.
This dataset provides a robust foundation for learning signal processing, feature extraction, and machine learning techniques while addressing a real-world medical challenge in epilepsy detection.
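As a sketch of the feature-extraction step mentioned above, the band power in the classic EEG bands can be computed from a segment's spectrum and then fed to any classifier (SVM, Random Forest, etc.). This numpy-only example assumes the 173.61 Hz sampling rate; the band boundaries are a common convention, not part of the dataset.

```python
import numpy as np

FS = 173.61  # Bonn sampling rate in Hz
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_powers(segment):
    """Return mean spectral power in each EEG band for one segment."""
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / FS)
    psd = np.abs(np.fft.rfft(segment)) ** 2 / len(segment)
    return np.array([psd[(freqs >= lo) & (freqs < hi)].mean()
                     for lo, hi in BANDS.values()])
```

The resulting 5-element vector per segment gives a compact feature matrix of shape (500, 5) for the whole dataset, which is small enough for any of the traditional algorithms named above.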
License: Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
This database, collected at the Children’s Hospital Boston, consists of EEG recordings from pediatric subjects with intractable seizures. Subjects were monitored for up to several days following withdrawal of anti-seizure medication in order to characterize their seizures and assess their candidacy for surgical intervention. The recordings are grouped into 23 cases and were collected from 22 subjects (5 males, ages 3–22; and 17 females, ages 1.5–19).
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
PCA
This is a dataset of EEG brainwave data that has been processed with our original strategy of statistical extraction (paper below).
The data was collected from two people (1 male, 1 female) for 3 minutes per state: positive, neutral, and negative. We used a Muse EEG headband, which recorded from the TP9, AF7, AF8, and TP10 EEG placements via dry electrodes. Six minutes of resting neutral data were also recorded; the stimuli used to evoke the emotions are listed below.
1. Marley and Me - Negative (Twentieth Century Fox) - Death scene
2. Up - Negative (Walt Disney Pictures) - Opening death scene
3. My Girl - Negative (Imagine Entertainment) - Funeral scene
4. La La Land - Positive (Summit Entertainment) - Opening musical number
5. Slow Life - Positive (BioQuest Studios) - Nature timelapse
6. Funny Dogs - Positive (MashupZone) - Funny dog clips
Our method of statistical extraction resampled the data since waves must be mathematically described in a temporal fashion.
If you would like to use the data in research projects, please cite the following:
J. J. Bird, L. J. Manso, E. P. Ribeiro, A. Ekart, and D. R. Faria, "A study on mental state classification using EEG-based brain-machine interface," in 9th International Conference on Intelligent Systems, IEEE, 2018.
J. J. Bird, A. Ekart, C. D. Buckingham, and D. R. Faria, "Mental emotional sentiment classification with an EEG-based brain-machine interface," in The International Conference on Digital Image and Signal Processing (DISP'19), Springer, 2019.
This research was part supported by the EIT Health GRaCE-AGE grant number 18429 awarded to C.D. Buckingham.
License: Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
Recordings, grouped into 23 cases, were collected from 22 subjects (5 males, ages 3–22; and 17 females, ages 1.5–19). (Case chb21 was obtained 1.5 years after case chb01, from the same female subject.) The file SUBJECT-INFO contains the gender and age of each subject. (Case chb24 was added to this collection in December 2010, and is not currently included in SUBJECT-INFO.)
Each case (chb01, chb02, etc.) contains between 9 and 42 continuous .edf files from a single subject. Hardware limitations resulted in gaps between consecutively-numbered .edf files, during which the signals were not recorded; in most cases, the gaps are 10 seconds or less, but occasionally there are much longer gaps. In order to protect the privacy of the subjects, all protected health information (PHI) in the original .edf files has been replaced with surrogate information in the files provided here. Dates in the original .edf files have been replaced by surrogate dates, but the time relationships between the individual files belonging to each case have been preserved. In most cases, the .edf files contain exactly one hour of digitized EEG signals, although those belonging to case chb10 are two hours long, and those belonging to cases chb04, chb06, chb07, chb09, and chb23 are four hours long; occasionally, files in which seizures are recorded are shorter.
All signals were sampled at 256 samples per second with 16-bit resolution. Most files contain 23 EEG signals (24 or 26 in a few cases). The International 10-20 system of EEG electrode positions and nomenclature was used for these recordings. In a few records, other signals are also recorded, such as an ECG signal in the last 36 files belonging to case chb04 and a vagal nerve stimulus (VNS) signal in the last 18 files belonging to case chb09. In some cases, up to 5 “dummy” signals (named "-") were interspersed among the EEG signals to obtain an easy-to-read display format; these dummy signals can be ignored.
The file RECORDS contains a list of all 664 .edf files included in this collection, and the file RECORDS-WITH-SEIZURES lists the 129 of those files that contain one or more seizures. In all, these records include 198 seizures (182 in the original set of 23 cases); the beginning ([) and end (]) of each seizure is annotated in the .seizure annotation files that accompany each of the files listed in RECORDS-WITH-SEIZURES. In addition, the files named chbnn-summary.txt contain information about the montage used for each recording, and the elapsed time in seconds from the beginning of each .edf file to the beginning and end of each seizure contained in it.
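The seizure timing in the chbnn-summary.txt files is recorded in lines of the form `Seizure Start Time: 2996 seconds` / `Seizure End Time: 3036 seconds`. The sketch below parses such entries; it assumes that line layout and would need adjusting for any summary files that deviate from it.

```python
import re

def parse_seizures(summary_text):
    """Extract (start, end) seizure times in seconds from a summary file body.

    Assumes the 'Seizure Start Time: N seconds' / 'Seizure End Time: N seconds'
    line format described above.
    """
    starts = [int(m) for m in
              re.findall(r"Seizure Start Time:\s*(\d+)\s*seconds", summary_text)]
    ends = [int(m) for m in
            re.findall(r"Seizure End Time:\s*(\d+)\s*seconds", summary_text)]
    return list(zip(starts, ends))
```

Pairing the parsed times with the 256 Hz sampling rate converts them directly to sample indices within the corresponding .edf file.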
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
These files contain the raw data and processing parameters accompanying the paper "Hierarchical structure guides rapid linguistic predictions during naturalistic listening" by Jonathan R. Brennan and John T. Hale. The files include the stimulus (wav files), raw data (MATLAB format for the FieldTrip toolbox), data processing parameters (MATLAB), and variables used to align the stimuli with the EEG data and for the statistical analyses reported in the paper.
License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
Dataset
Synthetic EEG data generated by the ‘bai’ model based on real data.
Features/Columns:
- No: "Number"
- Sex: "Gender"
- Age: "Age of participants"
- EEG Date: "The date of the EEG"
- Education: "Education level"
- IQ: "IQ level of participants"
- Main Disorder: "General class definition of the disorder"
- Specific Disorder: "Specific class definition of the disorder"
Total Features/Columns: 1140
Content:
Obsessive Compulsive Disorder Bipolar Disorder Schizophrenia… See the full description on the dataset page: https://huggingface.co/datasets/Neurazum/General-Disorders-EEG-Dataset-v1.
License: https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
This data set consists of over 240 two-minute EEG recordings obtained from 20 volunteers. Resting-state and auditory stimuli experiments are included in the data. The goal is to develop an EEG-based Biometric system.
The data includes resting-state EEG signals in two conditions: eyes open and eyes closed. The auditory stimuli part consists of six experiments: three with in-ear auditory stimuli and three with bone-conducting auditory stimuli. The three stimuli for each case are a native song, a non-native song, and neutral music.
License: CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
This multi-subject and multi-session EEG dataset for modelling human visual object recognition (MSS) contains:
More details about the dataset are described as follows.
32 participants (4 female, 28 male; aged 21-33 years) were recruited from college students in Beijing, and 100 sessions were conducted in total. Participants were paid and gave written informed consent. The study was conducted with the approval of the ethics committee of the Institute of Automation at the Chinese Academy of Sciences (approval number IA21-2410-020201).
After every 50 sequences, there was a break for the participants to rest. Each rapid serial sequence lasted approximately 7.5 seconds, starting with a 750ms blank screen with a white fixation cross, followed by 20 or 21 images presented at 5 Hz with a 50% duty cycle. The sequence ended with another 750ms blank screen.
After the rapid serial sequence, there was a 2-second interval during which participants were instructed to blink and then report whether a special image appeared in the sequence using a keyboard. During each run, 20 sequences were randomly inserted with additional special images at random positions. The special images are logos for brain-computer interfaces.
Each image was displayed for 1 second and was followed by 11 choice boxes (1 correct class box, 9 random class boxes, and 1 reject box). Participants were required to select the correct class of the displayed image using a mouse to increase their engagement. After the selection, a white fixation cross was displayed for 1 second in the centre of the screen to remind participants to pay attention to the upcoming task.
The stimuli are from two image databases, ImageNet and PASCAL. The final set consists of 10,000 images, with 500 images for each class.
In the derivatives/annotations folder, there is additional information about MSS:
The EEG signals were pre-processed using the MNE package, version 1.3.1, with Python 3.9.16. The data was sampled at a rate of 1,000 Hz with a bandpass filter applied between 0.1 and 100 Hz. A notch filter was used to remove 50 Hz power frequency. Epochs were created for each trial ranging from 0 to 500 ms relative to stimulus onset. No further preprocessing or artefact correction methods were applied in technical validation. However, researchers may want to consider widely used preprocessing steps such as baseline correction or eye movement correction. After the preprocessing, each session resulted in two matrices: RSVP EEG data matrix of shape (8,000 image conditions × 122 EEG channels × 125 EEG time points) and low-speed EEG data matrix of shape (400 image conditions × 122 EEG channels × 125 EEG time points).
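The epoching step described above (cutting a fixed window per stimulus onset, yielding matrices of trials × channels × time points) can be sketched with plain numpy. The shapes and onset indices below are illustrative only; the actual pipeline used MNE.

```python
import numpy as np

def epoch(continuous, onsets, n_samples):
    """Cut fixed-length epochs from continuous data.

    continuous: (n_channels, n_times) array.
    onsets: stimulus-onset sample indices.
    Returns an array of shape (n_trials, n_channels, n_samples).
    """
    return np.stack([continuous[:, o:o + n_samples] for o in onsets])
```

Applied per session, this is how a continuous recording turns into matrices like the (8,000 × 122 × 125) RSVP array described above.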
License: https://github.com/bdsp-core/bdsp-license-and-dua
The Harvard EEG Database will encompass data gathered from four hospitals affiliated with Harvard University: Massachusetts General Hospital (MGH), Brigham and Women's Hospital (BWH), Beth Israel Deaconess Medical Center (BIDMC), and Boston Children's Hospital (BCH). The EEG data includes three types:
rEEG: "routine EEGs" recorded in the outpatient setting.
EMU: recordings obtained in the inpatient setting, within the Epilepsy Monitoring Unit (EMU).
ICU/LTM: recordings obtained from acutely and critically ill patients within the intensive care unit (ICU).
License: CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
The dataset provides resting-state EEG data (eyes open, partially eyes closed) from 71 participants who underwent two experiments involving normal sleep (NS, session 1) and sleep deprivation (SD, session 2). The dataset also provides information on participants' sleepiness and mood states. (Please note that Session 1 (NS) and Session 2 (SD) do not reflect the temporal order; the order was counterbalanced across participants and is listed in the metadata.)
Data collection began in March 2019 and ended in December 2020. A detailed description of the dataset is being prepared by Chuqin Xiang, Xinrui Fan, Duo Bai, Ke Lv, and Xu Lei, and will be submitted to Scientific Data for publication.
* If you have any questions or comments, please contact:
* Xu Lei: xlei@swu.edu.cn
Xiang, C., Fan, X., Bai, D. et al. A resting-state EEG dataset for sleep deprivation. Sci Data 11, 427 (2024). https://doi.org/10.1038/s41597-024-03268-2
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
When you use this dataset, please cite the paper below. More information about the dataset can also be found in this paper.
Xu, X., Wang, B., Xiao, B., Niu, Y., Wang, Y., Wu, X., & Chen, J. (2024). Beware of Overestimated Decoding Performance Arising from Temporal Autocorrelations in Electroencephalogram Signals. arXiv preprint arXiv:2405.17024.
The present work aims to demonstrate that temporal autocorrelation (TA) significantly impacts various BCI tasks even in conditions without neural activity. We used watermelons as phantom heads and found that decoding performance is overestimated when continuous EEG data with the same class label are split into training and test sets. More details can be found in Motivation.
Since watermelons cannot perform any experimental tasks, the recordings can be reorganized into the format of various real EEG datasets without collecting new EEG data, as previous work did (examples in Domain Studied).
Manufacturers: NeuroScan SynAmps2 system (Compumedics Limited, Victoria, Australia)
Configuration: 64-channel Ag/AgCl electrode cap with a 10/20 layout
Watermelons. Ten watermelons served as phantom heads.
Overestimated Decoding Performance in EEG decoding.
The following BCI datasets from various BCI tasks have been reorganized using the Phantom EEG Dataset. The pitfall was found in four of the five tasks.
- CVPR dataset [1] for image decoding task.
- DEAP dataset [2] for emotion recognition task.
- KUL dataset [3] for auditory spatial attention decoding task.
- BCIIV2a dataset [4] for motor imagery task (the pitfalls were absent due to the use of rapid-design paradigm during EEG recording).
- SIENA dataset [5] for epilepsy detection task.
Resting state, but the data can be reorganized into any BCI task.
The Phantom EEG Dataset
Creative Commons Attribution 4.0 International
You can find the code to read the data files (.cnt or .set) in the "code" folder.
To run the code, install the mne and numpy packages via pip:
pip install mne==1.3.1
pip install numpy
Then, use "BID2WMCVPR.py" to convert the BIDS dataset to the WM-CVPR dataset, or "CNTK2WMCVPR.py" to convert the CNT dataset to the WM-CVPR dataset.
The code to reorganize datasets other than CVPR [1] will be released on GitHub after the review period.
- CNT: the raw data.
Each Subject (S*.cnt) contains the following information:
EEG.data: EEG data (samples X channels)
EEG.srate: Sampling frequency of the saved data
EEG.chanlocs : channel numbers (1 to 68, ‘EKG’ ‘EMG’ 'VEO' 'HEO' were not recorded)
- BIDS: an extension to the brain imaging data structure for electroencephalography. BIDS primarily addresses the heterogeneity of data organization by following the FAIR principles [6].
Each Subject (sub-S*/eeg/) contains the following information:
sub-S*_task-RestingState_channels.tsv: channel numbers (1 to 68, ‘EKG’ ‘EMG’ 'VEO' 'HEO' were not recorded)
sub-S*_task-RestingState_eeg.json: Some information about the dataset.
sub-S*_task-RestingState_eeg.set: EEG data (samples X channels)
sub-S*_task-RestingState_events.tsv: the event during recording. We organized events using block-design and rapid-event-design. However, it is important to note that this does not need to be considered in any subsequent data reorganization, as watermelons cannot follow any experimental instructions.
- code: more information on Code.
- readme.md: the information about the dataset.
An additional electrode was placed on the lower part of the watermelon as the physiological reference, and the forehead served as the ground site. The inter-electrode impedances were maintained under 20 kOhm. Data were recorded at a sampling rate of 1000 Hz. EEG recordings for each watermelon lasted for more than 1 hour to ensure sufficient data for the decoding task.
Citation will be updated after the review period is completed.
We will provide more information about this dataset (e.g. the units of the captured data) once our work is accepted. This is because our work is currently under review, and we are not allowed to disclose more information according to the relevant requirements.
All metadata will be provided as a backup on Github and will be available after the review period is completed.
Researchers have reported high decoding accuracy (>95%) using non-invasive Electroencephalogram (EEG) signals for brain-computer interface (BCI) decoding tasks like image decoding, emotion recognition, auditory spatial attention detection, epilepsy detection, etc. Since these EEG data were usually collected with well-designed paradigms in labs, the reliability and robustness of the corresponding decoding methods were doubted by some researchers, and they proposed that such decoding accuracy was overestimated due to the inherent temporal autocorrelations (TA) of EEG signals [7]–[9].
However, the coupling between the stimulus-driven neural responses and the EEG temporal autocorrelations makes it difficult to confirm whether this overestimation exists in truth. Some researchers also argue that the effect of TA in EEG data on decoding is negligible and that it becomes a significant problem only under specific experimental designs in which subjects do not have enough resting time [10], [11].
Due to a lack of problem formulation, previous studies [7]–[9] proposed only that block-design should be avoided to escape the pitfall. However, the impact of TA is avoided only when each EEG trial is not further segmented into several samples; otherwise, the overfitting or pitfall still occurs. Conversely, with a correct data-splitting strategy (e.g., separating training and test data in time), the pitfall can be avoided even when a block design is used.
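The splitting point above can be illustrated with a minimal sketch, not the authors' code: a temporal split keeps training and test samples in disjoint stretches of the recording, whereas a random split mixes samples from the same continuous (autocorrelated) segment into both sets.

```python
import numpy as np

def temporal_split(n_samples, test_frac=0.2):
    """Hold out the final contiguous block of samples as the test set."""
    split = int(n_samples * (1 - test_frac))
    return np.arange(split), np.arange(split, n_samples)

def random_split(n_samples, test_frac=0.2, seed=0):
    """Shuffle sample indices, so neighbouring (autocorrelated) samples
    can land on both sides of the split -- the setup that invites the pitfall."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    split = int(n_samples * (1 - test_frac))
    return idx[:split], idx[split:]
```

With the temporal split, test samples come from time ranges (domains) never seen during training, which is the condition that exposes overestimated accuracy.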
In our framework, we proposed the concept of "domain" to represent the EEG patterns resulting from TA and then used phantom EEG to remove stimulus-driven neural responses for verification. The results confirmed that the TA, always existing in the EEG data, added unique domain features to a continuous segment of EEG. The specific finding is that when the segment of EEG data with the same class label is split into multiple samples, the classifier will associate the sample's class label with the domain features, interfering with the learning of class-related features. This leads to an overestimation of decoding performance for test samples from the domains seen during training, and results in poor accuracy for test samples from unseen domains (as in real-world applications).
Importantly, our work suggests that the key to reducing the impact of EEG TA on BCI decoding is to decouple class-related features from domain features in the actual EEG dataset. Our proposed unified framework serves as a reminder to BCI researchers of the impact of TA on their specific BCI tasks and is intended to guide them in selecting the appropriate experimental design, splitting strategy and model construction.
We must point out that the "phantom EEG" does not actually contain any "EEG" but records only noise; a watermelon is not a brain and does not generate any electrical signals. Therefore, the recorded electrical noise, even when amplified using equipment typically used for EEG, does not constitute EEG data under the definition of EEG. This is why previous researchers called it "phantom EEG". Some researchers may therefore question the use of watermelons to obtain phantom EEG.
However, the usage of the phantom head allows researchers to evaluate the performance of neural-recording equipment and proposed algorithms without the effects of neural activity variability, artifacts, and potential ethical issues. Phantom heads used in previous studies include digital models [12]–[14], real human skulls [15]–[17], artificial physical phantoms [18]–[24] and watermelons [25]–[40]. Due to their similar conductivity to human tissue, similar size and shape to the human head, and ease of acquisition, watermelons are widely used as "phantom heads".
Most works used a watermelon as a phantom head and found that results obtained from human neural signals could not be reproduced with the phantom head, thus proving that those results were indeed caused by neural signals. For example, Mutanen et al. [35] proposed that "the fact that the phantom head stimulation did not evoke similar biphasic artifacts excludes the possibility that residual induced artifacts, with the current TMS-compatible EEG system, could explain these components".
Our work differs significantly from most previous works: it is the first to find that phantom EEG exhibits the effect of TA on BCI decoding even when only noise was recorded, indicating that TA is inherent in EEG data. The conclusion we hope to draw is that some current works may not truly use stimulus-driven neural
License: CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
The dataset comprises EEG data from 46 subjects (22 for commercial advertisements and 24 for Kannada music clips), recorded using a 2-channel EEG device.
The dataset folder contains two sub-folders:
1. Commercial advertisement: Channel_1 (Ch_1) and Channel_2 (Ch_2): prefrontal cortex
2. Kannada musical clips: Channel_1 (Ch_1) and Channel_2 (Ch_2): left brain
Excel file information: in each file, columns represent subjects and rows represent features per subject. There are twelve Excel files in total from the two channels (6 for commercial advertisements and 6 for Kannada musical clips).
Subjective self-rating scale
Name
age
Gender
Have you ever had any health issues? YES NO
Have you watched this song/advertisement before? YES NO
Please let us know if this advertisement brings up any specific memories for you. YES NO
Please rate the following queries from 1 to 10.
How funny was the advertisement you watched?
How sad was the advertisement you watched?
How scary was the advertisement you watched?
How relaxing was the music you listened to?
How sad was the music you listened to?
How enjoyable was the music you listened to?
Do you think what you just watched was entertaining enough?
If you have any comments, please write them here.
Here is the website address for each stimulus that we considered:
ad1: https://www.youtube.com/watch?v=ZzG7duipQ7U&ab_channel=perfettiindia
ad2: https://www.youtube.com/watch?v=SfAxUpeVhCg&ab_channel=bo0fhead
ad3: https://www.youtube.com/watch?v=HqGsT6VM8Vg&ab_channel=kiddlestix
song 1: https://www.youtube.com/hashtag/kgfchapter2
song 2: https://www.youtube.com/watch?v=x43w4lLS9E0&ab_channel=AnandAudio
song 3: https://youtube.com/watch?v=Ysf4QRrcLGM&si=EnSIkaIECMiOmarE
For a more comprehensive understanding of the dataset and its background, we kindly ask researchers to refer to our associated manuscript titled:
"Entertainment Based Database for Emotion Recognition from EEG Signals," a research article accepted at the 3rd International Conference on Applied Intelligence and Informatics (AII2023), "Fostering Reproducibility of Research Results," held 29-31 October 2023 in Dubai, UAE. (When utilizing this dataset in your research, please consider citing this reference.)
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
'Univ. of Bonn' and 'CHB-MIT Scalp EEG Database' are publicly available datasets that are among the most sought after by researchers. The Bonn dataset is very small compared to CHB-MIT, yet researchers often prefer Bonn because it comes in a simple '.txt' format. The dataset published here is a preprocessed form of CHB-MIT, available in '.csv' format.
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
This dataset contains EEG recordings from 18 subjects listening to one of two competing speech audio streams. Continuous speech in trials of ~50 sec. was presented to normal hearing listeners in simulated rooms with different degrees of reverberation. Subjects were asked to attend one of two spatially separated speakers (one male, one female) and ignore the other. Repeated trials with presentation of a single talker were also recorded. The data were recorded in a double-walled soundproof booth at the Technical University of Denmark (DTU) using a 64-channel Biosemi system and digitized at a sampling rate of 512 Hz. Full details can be found in:
and
The data is organized in the format of the publicly available COCOHA Matlab Toolbox. The preproc_script.m demonstrates how to import and align the EEG and audio data. The script also demonstrates some EEG preprocessing steps as used in the Wong et al. paper above. The AUDIO.zip contains wav files with the speech audio used in the experiment. The EEG.zip contains MAT-files with the EEG/EOG data for each subject. The EEG/EOG data are found in data.eeg with the following channels:
The expinfo table contains information about experimental conditions, including which speaker the listener was attending to in each trial. The expinfo table contains the following information:
DATA_preproc.zip contains the preprocessed EEG and audio data as output from preproc_script.m.
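A common first step in this kind of auditory-attention decoding is extracting the speech envelope from the audio. The sketch below (full-wave rectification plus a moving-average low-pass, numpy only) is an illustration of the idea, not the toolbox's own method; the window length is an assumption.

```python
import numpy as np

def envelope(audio, fs, win_s=0.025):
    """Crude speech envelope: full-wave rectify, then smooth with a
    moving-average window of win_s seconds (an assumed default)."""
    win = max(1, int(fs * win_s))
    kernel = np.ones(win) / win
    return np.convolve(np.abs(audio), kernel, mode="same")
```

The resulting envelope, resampled to the EEG rate, is what regression-based attention decoders typically correlate against the EEG response.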
The dataset was created within the COCOHA Project: Cognitive Control of a Hearing Aid
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
EEG signals were acquired from 20 healthy right-handed subjects performing a series of fine motor tasks cued by audio commands. The participants were divided equally into two distinct age groups: (i) 10 elderly adults (EA group, aged 55-72, 6 females); (ii) 10 young adults (YA group, aged 19-33, 3 females).

The active phase of the experimental session included sequential execution of 60 fine motor tasks: squeezing a hand into a fist after the first audio command and holding it until the second audio command (30 repetitions per hand) (see Fig. 1). The duration of the audio command determined the type of motor action to be executed: 0.25 s for left-hand (LH) movement and 0.75 s for right-hand (RH) movement. The time interval between the two audio signals was selected randomly in the range 4-5 s for each trial. The sequence of motor tasks was randomized, and the pause between tasks was also chosen randomly in the range 6-8 s to exclude possible training or motor-preparation effects caused by sequential execution of the same tasks.

Acquired EEG signals were then processed with preprocessing tools implemented in the MNE Python package. Specifically, raw EEG signals were filtered by a 5th-order Butterworth filter in the range 1-100 Hz and by a 50 Hz notch filter. Then, Independent Component Analysis (ICA) was applied to remove ocular and cardiac artifacts. Artifact-free EEG recordings were segmented into 60 epochs according to the experimental protocol. Each epoch was 14 s long, including 3 s of baseline and 11 s of motor-related brain activity, and time-locked to the first audio command indicating the start of motor execution. After visual inspection, epochs that still contained artifacts were rejected. Finally, 15 epochs per movement type were stored for each subject.

Individual epochs for each subject are stored in the attached MNE .fif files. The prefix EA or YA in the file name identifies the age group to which the subject belongs.
The postfix LH or RH in the file name indicates the type of motor task.
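Since each 14 s epoch includes a 3 s baseline, a common step before further analysis is to subtract each channel's mean baseline value per trial. This is an illustrative numpy sketch (epochs shaped trials × channels × time), not part of the published pipeline.

```python
import numpy as np

def baseline_correct(epochs, fs, baseline_s=3.0):
    """Subtract the per-trial, per-channel mean of the baseline interval.

    epochs: array of shape (n_trials, n_channels, n_times),
    with the first baseline_s seconds being pre-stimulus baseline.
    """
    n_base = int(fs * baseline_s)
    base_mean = epochs[:, :, :n_base].mean(axis=2, keepdims=True)
    return epochs - base_mean
```

MNE's Epochs objects apply the equivalent correction internally when a baseline interval is specified.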
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
This dataset comprises EEG recordings from eight ALS patients aged between 45.5 and 74 years. Patients exhibited revised ALS Functional Rating Scale (ALSFRS-R) scores ranging from 0 to 46, with time since symptom onset (TSSO) varying between 12 and 113 months. Notably, no disease progression was reported during the study period, ensuring stability in clinical conditions. The participants were recruited from the Penn State Hershey Medical Center ALS Clinic and had confirmed ALS diagnoses without significant dementia. This rigorous selection criterion ensured the validity and reliability of the dataset for motor imagery analysis in an ALS population.The EEG data were collected using 19 electrodes placed according to the international 10-20 system (FP1, FP2, F7, F3, FZ, F4, F8, T7, C3, CZ, C4, T8, P7, P3, PZ, P4, P8, O1, O2), with signals referenced to linked earlobes and a ground electrode at FPz. Additionally, three electrooculogram (EOG) electrodes were employed to facilitate artifact removal, maintaining impedance levels below 10 kΩ throughout data acquisition. The data were amplified using two g.USBamp systems (g.tec GmbH) and recorded via the BCI2000 software suite, with supplementary preprocessing in MATLAB. All experimental procedures adhered strictly to Penn State University’s IRB protocol PRAMSO40647EP, ensuring ethical compliance.Each participant underwent four brain-computer interface (BCI) sessions conducted over a period of 1 to 2 months. Each session consisted of four runs, with 10 trials per class (left hand, right hand, and rest) for a total of 40 trials per session. The sessions began with a calibration run to initialize the system, followed by feedback runs during which participants controlled a cursor's movement through motor imagery, specifically imagined grasping movements. 
The study design, focused on motor imagery (MI), generated a total of 160 trials per participant over the two months. This dataset is significant for studying the longitudinal dynamics of motor imagery decoding in ALS patients. To ensure reproducibility of our findings and to promote advancements in the field, we have received explicit permission from Prof. Geronimo of Penn State University to distribute this dataset in processed format for research purposes. The original publication of this collection can be found below.

How to use this dataset: The data are structured in MATLAB as a collection of subject-specific structs, where each subject is represented as a single struct. Each struct contains three fields:
- L: trials corresponding to left motor imagery.
- R: trials corresponding to right motor imagery.
- Re: trials corresponding to the rest state.

Each field contains an array of trials, where each trial is a matrix with rows as timestamps and columns as channels.

Primary Collection: Geronimo A, Simmons Z, Schiff SJ. Performance predictors of brain-computer interfaces in patients with amyotrophic lateral sclerosis. Journal of Neural Engineering 2016;13. DOI: 10.1088/1741-2560/13/2/026002.

All code for publications using this data is publicly available at:
https://github.com/rishannp/Auto-Adaptive-FBCSP
https://github.com/rishannp/Motor-Imagery---Graph-Attention-Network
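The struct layout described above (fields L, R, Re, each holding trials shaped timestamps × channels) can be illustrated with a small sketch. This uses a synthetic stand-in for the loaded data; in practice one would load the `.mat` file with `scipy.io.loadmat`, and the trial length shown here is an assumption:

```python
# Sketch of the access pattern for one subject's struct. The field names
# L, R, Re come from the dataset description; the trial length (1000
# timestamps) and trial counts are illustrative assumptions.
import numpy as np

n_channels = 19  # 10-20 montage described above
subject = {
    "L":  [np.zeros((1000, n_channels)) for _ in range(10)],  # left MI
    "R":  [np.zeros((1000, n_channels)) for _ in range(10)],  # right MI
    "Re": [np.zeros((1000, n_channels)) for _ in range(10)],  # rest
}

# Each trial is a (timestamps x channels) matrix.
first_left_trial = subject["L"][0]
print(first_left_trial.shape)  # (1000, 19)
```

With a real file, `scipy.io.loadmat` returns nested arrays whose exact indexing depends on the `squeeze_me`/`struct_as_record` options, so inspect the loaded object before assuming this shape.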
Attribution 4.0 International (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
EmoKey Moments Muse EEG Dataset (EKM-ED): A Comprehensive Collection of Muse S EEG Data and Key Emotional Moments
Dataset Description:
The EmoKey Moments EEG Dataset (EKM-ED) is a curated collection of EEG recordings from 47 participants as they engaged with emotion-eliciting video clips. Covering a spectrum of emotions, the dataset is valuable for research into human cognitive responses, psychology, and emotion-based analyses.
Dataset Highlights:
Precise Timestamps: Capturing the exact millisecond of EEG data acquisition, ensuring unparalleled granularity.
Brainwave Metrics: Illuminating the variety of cognitive states through the prism of Delta, Theta, Alpha, Beta, and Gamma waves.
Motion Data: Encompassing the device's movement in three dimensions for enhanced contextuality.
Auxiliary Indicators: Key elements like the device's positioning, battery metrics, and user-specific actions are meticulously logged.
Consent and Ethics: The dataset respects and upholds privacy and ethical standards. Every participant provided informed consent. This endeavor has received the green light from the Ethics Committee at the University of Granada, documented under the reference: 2100/CEIH/2021.
A pivotal component of this dataset is its focus on "key moments" within the selected video clips, honing in on periods anticipated to evoke heightened emotional responses.
Curated Video Clips within Dataset:

| Film | Emotion | Duration (seconds) |
| --- | --- | --- |
| The Lover | Baseline | 43 |
| American History X | Anger | 106 |
| Cry Freedom | Sadness | 166 |
| Alive | Happiness | 310 |
| Scream | Fear | 395 |
The cornerstone of EKM-ED is its innovative emphasis on these key moments, bringing to light the correlation between distinct cinematic events and specific EEG responses.
Key Emotional Moments in Dataset:

| Film | Emotion | Key moment timestamps (seconds) |
| --- | --- | --- |
| American History X | Anger | 36, 57, 68 |
| Cry Freedom | Sadness | 112, 132, 154 |
| Alive | Happiness | 227, 270, 289 |
| Scream | Fear | 23, 42, 79, 226, 279, 299, 334 |
Citation: Gilman, T. L., et al. (2017). A film set for the elicitation of emotion in research. Behavior Research Methods, 49(6).
With its unparalleled depth and focus, the EmoKey Moments EEG Dataset aims to advance research in fields such as neuroscience, psychology, and affective computing, providing a comprehensive platform for understanding and analyzing human emotions through EEG data.
——————————————————————————————————— FOLDER STRUCTURE DESCRIPTION ———————————————————————————————————
questionnaires: all the response questionnaires (Spanish), raw and preprocessed, including SAM
└── preprocessed
    └── Ficha_Evaluacion_Participante_SAM_Refactored.csv: the SAM responses for every film clip

key_moments: the key moment timestamps for every emotion’s clip

muse_wearable_data: XXXX
├── raw
│   └── 1 (subject ID = 1)
│       ├── muse: EEG data of the Muse device
│       │   ├── ANGER_XXX.csv: log data of the anger elicitation
│       │   ├── FEAR_XXX.csv: log data of the fear elicitation
│       │   ├── HAPPINESS_XXX.csv: log data of the happiness elicitation
│       │   └── SADNESS_XXX.csv: log data of the sadness elicitation
│       └── order: film elicitation play order (e.g., HAPPINESS, SADNESS, ANGER, FEAR)
└── preprocessed
    ├── unclean-signals: without removing EEG artifacts, noise, etc.
    │   └── muse: EEG data of the Muse device
    │       └── 0.0078125: data downsampled to 128 Hz from the recorded 256 Hz
    └── clean-signals: EEG artifacts, noise, etc. removed
        └── muse: EEG data of the Muse device
            └── 0.0078125: data downsampled to 128 Hz from the recorded 256 Hz
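The 256 Hz → 128 Hz downsampling noted for the preprocessed folders can be sketched as a naive decimation by 2. This is a sketch only; the dataset's actual pipeline (including any anti-alias filtering, which proper decimation requires) is not specified here:

```python
# Sketch of downsampling a 256 Hz signal to 128 Hz by keeping every
# 2nd sample. Real pipelines should low-pass filter first to avoid
# aliasing (e.g., scipy.signal.decimate); this is illustrative only.
import numpy as np

fs_in, fs_out = 256, 128
factor = fs_in // fs_out  # 2

t = np.arange(fs_in)                          # one second at 256 Hz
signal = np.sin(2 * np.pi * 5 * t / fs_in)    # 5 Hz test tone

downsampled = signal[::factor]
print(len(downsampled))  # 128
```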
The ethical consent for this dataset was provided by La Comisión de Ética en Investigación de la Universidad de Granada, as documented in the approval titled 'DETECCIÓN AUTOMÁTICA DE LAS EMOCIONES BÁSICAS Y SU INFLUENCIA EN LA TOMA DE DECISIONES MEDIANTE WEARABLES Y MACHINE LEARNING' ('Automatic detection of basic emotions and their influence on decision-making using wearables and machine learning'), registered under 2100/CEIH/2021.
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Experiment Details: Electroencephalography recordings from 16 subjects viewing fast streams of Gabor-like stimuli. Images were presented in rapid serial visual presentation (RSVP) streams at 6.67 Hz and 20 Hz rates. Participants performed an orthogonal fixation colour-change detection task.
Experiment length: 1 hour. Raw and preprocessed data are available online through OpenNeuro: https://openneuro.org/datasets/ds004357. Supplementary material and analysis scripts are available on GitHub: https://github.com/Tijl/features-eeg
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on April 29, 2025.

Electroencephalogram (EEG) data recorded from invasive and scalp electrodes. The EEG database contains invasive EEG recordings of 21 patients suffering from medically intractable focal epilepsy. The data were recorded during invasive pre-surgical epilepsy monitoring at the Epilepsy Center of the University Hospital of Freiburg, Germany. In eleven patients the epileptic focus was located in neocortical brain structures, in eight patients in the hippocampus, and in two patients in both. To obtain a high signal-to-noise ratio, fewer artifacts, and recordings directly from focal areas, intracranial grid, strip, and depth electrodes were utilized. The EEG data were acquired using a Neurofile NT digital video EEG system with 128 channels, a 256 Hz sampling rate, and a 16-bit analogue-to-digital converter. No notch or band-pass filters were applied.

For each patient there are datasets called ictal and interictal: the former contains files with epileptic seizures and at least 50 min of pre-ictal data; the latter contains approximately 24 hours of EEG recordings without seizure activity. At least 24 h of continuous interictal recordings are available for 13 patients. For the remaining patients, interictal invasive EEG recordings of less than 24 h were joined together to reach at least 24 h per patient.

An interdisciplinary project between: Epilepsy Center, University Hospital Freiburg; Bernstein Center for Computational Neuroscience (BCCN), Freiburg; Freiburg Center for Data Analysis and Modeling (FDM).
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
**Overview:**
The Bonn EEG Dataset is a widely recognized dataset in the field of biomedical signal processing and machine learning, specifically designed for research in epilepsy detection and EEG signal analysis. It contains electroencephalogram (EEG) recordings from both healthy individuals and patients with epilepsy, making it suitable for tasks such as seizure detection and classification of brain activity states. The dataset is structured into five distinct subsets (labeled A, B, C, D, and E), each comprising 100 single-channel EEG segments, for a total of 500 segments. Each segment represents 23.6 seconds of EEG data sampled at 173.61 Hz, yielding 4,097 data points per segment, stored in ASCII format as text files.
**Structure and Labels:**
- Set A: surface EEG from healthy volunteers, eyes open.
- Set B: surface EEG from healthy volunteers, eyes closed.
- Set C: intracranial interictal recordings from the hippocampal formation opposite the epileptogenic zone.
- Set D: intracranial interictal recordings from within the epileptogenic zone.
- Set E: intracranial recordings of seizure (ictal) activity.
**Key Characteristics:**
- 5 subsets (A-E) of 100 single-channel segments each (500 segments in total).
- Segment length: 23.6 s, sampled at 173.61 Hz (4,097 samples per segment).
- Format: plain ASCII text files, one segment per file.
**Applications:**
The Bonn EEG Dataset is ideal for machine learning and signal processing tasks, including:
- Developing algorithms for epileptic seizure detection and prediction.
- Exploring feature extraction techniques, such as wavelet transforms, for EEG signal analysis.
- Classifying brain states (healthy vs. epileptic, interictal vs. ictal).
- Supporting research in neuroscience and medical diagnostics, particularly for epilepsy monitoring and treatment.
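As a sketch of the feature-extraction idea mentioned above, one simple alternative to wavelet features is spectral band power computed from the FFT. This uses a synthetic stand-in for a Bonn segment; the band edges are conventional assumptions, not something the dataset prescribes:

```python
# Sketch: spectral band-power features for one Bonn-style segment.
# The segment here is synthetic noise; band edges are common EEG
# conventions, assumed for illustration.
import numpy as np

fs = 173.61  # Bonn sampling rate stated above

rng = np.random.default_rng(1)
segment = rng.normal(size=4097)  # stand-in for one EEG segment

freqs = np.fft.rfftfreq(len(segment), d=1 / fs)
power = np.abs(np.fft.rfft(segment)) ** 2

bands = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}
features = {name: power[(freqs >= lo) & (freqs < hi)].sum()
            for name, (lo, hi) in bands.items()}
print(sorted(features))  # ['alpha', 'beta', 'delta', 'gamma', 'theta']
```

Feature vectors like this can feed directly into the classical classifiers (SVM, Random Forest, Logistic Regression) noted later in this description.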
**Source:**
The dataset was created at the Department of Epileptology, University of Bonn, and is distributed through the university's repository.
**Citation:**
When using this dataset, researchers are required to cite the original publication: Andrzejak, R. G., Lehnertz, K., Mormann, F., Rieke, C., David, P., & Elger, C. E. (2001). Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Physical Review E, 64(6), 061907. DOI: 10.1103/PhysRevE.64.061907.
**Additional Notes:**
The dataset is randomized, with no specific information provided about patients or electrode placements, ensuring simplicity and focus on signal characteristics.
The data is not hosted on Kaggle or Hugging Face but is accessible directly from the University of Bonn’s repository or mirrored sources.
Researchers may need to apply preprocessing steps, such as filtering or normalization, to optimize the data for machine learning tasks.
The dataset’s balanced structure and clear labels make it an excellent choice for a one-week machine learning project, particularly for tasks involving traditional algorithms like SVM, Random Forest, or Logistic Regression.
This dataset provides a robust foundation for learning signal processing, feature extraction, and machine learning techniques while addressing a real-world medical challenge in epilepsy detection.
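The preprocessing mentioned in the notes above can be sketched minimally. This assumes a Bonn-style segment loaded from its ASCII text file (here replaced by synthetic data; the file name would be dataset-specific) and applies z-score normalization, a common first step before feature extraction:

```python
# Minimal preprocessing sketch for a Bonn-style segment. In practice
# the segment would come from np.loadtxt on one of the ASCII files;
# a synthetic signal stands in here. Filtering (e.g., band-pass) is
# another typical step, omitted for brevity.
import numpy as np

fs = 173.61  # sampling rate stated in the dataset description

rng = np.random.default_rng(0)
segment = rng.normal(size=4097)  # stand-in for one 23.6 s segment

# Z-score normalization: zero mean, unit variance.
normalized = (segment - segment.mean()) / segment.std()
# After normalization: mean ~ 0, standard deviation ~ 1.
```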