13 datasets found
  1. MERGE Dataset (INCOMPLETE. SEE V1.1)

    • zenodo.org
    Updated Feb 7, 2025
    + more versions
    Cite
    Pedro Lima Louro; Hugo Redinho; Ricardo Santos; Ricardo Malheiro; Renato Panda; Rui Pedro Paiva (2025). MERGE Dataset (INCOMPLETE. SEE V1.1) [Dataset]. http://doi.org/10.5281/zenodo.13904708
    Dataset updated
    Feb 7, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pedro Lima Louro; Hugo Redinho; Ricardo Santos; Ricardo Malheiro; Renato Panda; Rui Pedro Paiva
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The MERGE dataset is a collection of audio, lyrics, and bimodal datasets for research on Music Emotion Recognition. A complete version is provided for each modality. The audio datasets provide 30-second excerpts for each sample, while full lyrics are provided in the relevant datasets. The number of available samples in each dataset is as follows:

    • MERGE Audio Complete: 3554
    • MERGE Audio Balanced: 3232
    • MERGE Lyrics Complete: 2568
    • MERGE Lyrics Balanced: 2400
    • MERGE Bimodal Complete: 2216
    • MERGE Bimodal Balanced: 2000

    Additional Contents

    Each dataset contains the following additional files:

    • av_values: File containing the arousal and valence values for each sample sorted by their identifier;
    • tvt_dataframes: Train, validate, and test splits for each dataset. Both a 70-15-15 and a 40-30-30 split are provided.
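
    As an illustration of the 70-15-15 split mentioned above, the following is a minimal sketch of a per-class (stratified) split. It is not the procedure used to build the provided tvt_dataframes: the function name, the seed, and the 2000 hypothetical labels (500 per quadrant) are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split sample indices into train/validate/test, class by class.

    `labels` is a list of class labels, one per sample. The function and
    its defaults are hypothetical; they only illustrate a 70-15-15 split.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, validate, test = [], [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        validate += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test set
    return train, validate, test


# Hypothetical labels: 2000 samples, evenly spread over four quadrants.
labels = ["Q1", "Q2", "Q3", "Q4"] * 500
train, validate, test = stratified_split(labels)
```

    Passing fractions=(0.40, 0.30, 0.30) gives the second split ratio in the same way.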

    Metadata

    A metadata spreadsheet is provided for each dataset with the following information for each sample, if available:

    • Song (Audio and Lyrics datasets) - Song identifiers. Identifiers starting with MT were extracted from the AllMusic platform, while those starting with A or L were collected from private collections;
    • Quadrant - Label corresponding to one of the four quadrants from Russell's Circumplex Model;
    • AllMusic Id - For samples starting with A or L, the matching AllMusic identifier is also provided. This was used to complement the available information for the samples originally obtained from the platform;
    • Artist - First performing artist or band;
    • Title - Song title;
    • Relevance - AllMusic metric representing the relevance of the song in relation to the query used;
    • Duration - Song length in seconds;
    • Moods - User-generated mood tags extracted from the AllMusic platform and available in Warriner's affective dictionary;
    • MoodsAll - User-generated mood tags extracted from the AllMusic platform;
    • Genres - User-generated genre tags extracted from the AllMusic platform;
    • Themes - User-generated theme tags extracted from the AllMusic platform;
    • Styles - User-generated style tags extracted from the AllMusic platform;
    • AppearancesTrackIDs - All AllMusic identifiers related to a sample;
    • Sample - Availability of the sample in the AllMusic platform;
    • SampleURL - URL to the 30-second excerpt in AllMusic;
    • ActualYear - Year of song release
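
    The relation between arousal/valence values and the Quadrant label can be sketched as below. This is only an illustration: the value range (0 to 1), the midpoint of 0.5, and the quadrant numbering (Q1 = positive valence and high arousal, proceeding counter-clockwise) are assumptions, not details confirmed by the dataset description.

```python
def russell_quadrant(arousal, valence, midpoint=0.5):
    """Map an (arousal, valence) pair to a quadrant label.

    Assumes both values share one scale whose neutral point is `midpoint`
    (hypothetical; check the av_values file for the actual range).
    """
    high_arousal = arousal >= midpoint
    positive_valence = valence >= midpoint
    if positive_valence and high_arousal:
        return "Q1"  # e.g. happy, excited
    if high_arousal:
        return "Q2"  # negative valence, high arousal (e.g. tense)
    if not positive_valence:
        return "Q3"  # negative valence, low arousal (e.g. sad)
    return "Q4"      # positive valence, low arousal (e.g. calm)


example = russell_quadrant(arousal=0.8, valence=0.9)
```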

    Acknowledgements

    This work is funded by FCT - Foundation for Science and Technology, I.P., within the scope of the projects: MERGE - DOI: 10.54499/PTDC/CCI-COM/3171/2021 financed with national funds (PIDDAC) via the Portuguese State Budget; and project CISUC - UID/CEC/00326/2020 with funds from the European Social Fund, through the Regional Operational Program Centro 2020.

    Renato Panda was supported by Ci2 - FCT UIDP/05567/2020.

  2. GeBiD Dataset

    • paperswithcode.com
    Updated Aug 1, 2022
    Cite
    Gabriela Sejnova; Michal Vavrecka; Karla Stepanova; Tadahiro Taniguchi (2022). GeBiD Dataset [Dataset]. https://paperswithcode.com/dataset/gebid
    Dataset updated
    Aug 1, 2022
    Authors
    Gabriela Sejnova; Michal Vavrecka; Karla Stepanova; Tadahiro Taniguchi
    Description

    We provide a custom synthetic bimodal dataset, called GeBiD, designed specifically for the comparison of the joint- and cross-generative capabilities of Multimodal Variational Autoencoders. It comprises RGB images of geometric primitives and textual descriptions. The dataset offers 5 levels of difficulty (based on the number of attributes) to find the minimal functioning scenario for each model. Moreover, its rigid structure enables automatic qualitative evaluation of the generated samples.

  3. Supplementary data B: PSD & true density measurements of Indonesian...

    • figshare.com
    xlsx
    Updated May 31, 2020
    Cite
    Fajar Fathiawan Pambudi (2020). Supplementary data B: PSD & true density measurements of Indonesian middle-rank coal [Dataset]. http://doi.org/10.6084/m9.figshare.12399788.v2
    Available download formats: xlsx
    Dataset updated
    May 31, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Fajar Fathiawan Pambudi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Indonesia
    Description

    This dataset contains particle size distribution (PSD) and true density measurements of Indonesian middle-rank coal samples. For the PSD measurement, the samples were separated into three categories: PSD Small, PSD Large, and a Bimodal combination of Small and Large particles. Particle size was measured using an LS13320 laser diffraction particle size analyzer. The size distribution was constructed on the basis of the equivalent spherical volume of the particles. The resulting size distribution was characterized by its mean, median, and mean/median ratio. True density was measured using a 50 mL Silberbrand pycnometer. The measurement liquid consisted of Teepol and water in a 1:100 ratio. The PSD and true density measurements were performed by the Mineral and Coal Technology Research and Development Center in Bandung, West Java, Indonesia. The xlsx file contains spreadsheets for PSD Small, PSD Large, Bimodal 70%L, and the true density calculation.

  4. Data from: Detecting and removing sample contamination in phylogenomic data:...

    • search.dataone.org
    • datadryad.org
    Updated Apr 25, 2025
    Cite
    Christopher Owen (2025). Detecting and removing sample contamination in phylogenomic data: An example and its implications for Cicadidae phylogeny (Insecta: Hemiptera) [Dataset]. http://doi.org/10.5061/dryad.tht76hdz1
    Dataset updated
    Apr 25, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Christopher Owen
    Time period covered
    Jan 1, 2021
    Description

    Contamination of a genetic sample with DNA from one or more non-target species is a continuing concern of molecular phylogenetic studies, both Sanger sequencing studies and Next-Generation Sequencing (NGS) studies. We developed an automated pipeline for identifying and excluding likely cross-contaminated loci based on detection of bimodal distributions of patristic distances across gene trees. When the contamination occurs between samples within a dataset, comparisons between a contaminated sample and its contaminant taxon will yield bimodal distributions with one peak close to zero patristic distance. Here we present an automated pipeline for identifying and excluding likely cross-contaminated loci based on detection of these bimodal distributions of patristic distances between taxa across gene trees. This new method does not rely on a priori knowledge of taxon relatedness nor does it determine the process(es) that caused the contamination. Exclusion of putatively contaminated loci fro...

  5. Supplementary data A: Compacted density, bimodal optimal composition,...

    • figshare.com
    xlsx
    Updated May 31, 2020
    Cite
    Fajar Fathiawan Pambudi (2020). Supplementary data A: Compacted density, bimodal optimal composition, fractional packing density, and Cooper-Eaton modelling of briquette made with Indonesian middle-rank coal [Dataset]. http://doi.org/10.6084/m9.figshare.12389471.v2
    Available download formats: xlsx
    Dataset updated
    May 31, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Fajar Fathiawan Pambudi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains compaction data for briquettes made from binary mixtures of Indonesian middle-rank coal with different mass fractions of small and large particles. The density of each briquette was measured using a caliper while it was being pressed inside the die during uniaxial briquetting. Based on each density, the compaction characteristics of each mixture were analyzed using the Cooper-Eaton curve-fitting model. Curve fitting was performed using the Solver add-in in Microsoft Excel 2016. The bimodal optimal composition, fractional packing density, and coordination number of the bulk mixture at 0.002 MPa were calculated using formulas from R. M. German's research on powder metallurgy. The xlsx file contains spreadsheets of pressed density, bimodal mixture optimal composition, fractional density and average coordination number at 0.002 MPa, examples of curve-fitting operations, compaction characteristics from the Cooper-Eaton model, and the regression quality (R2).
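
    For readers unfamiliar with the Cooper-Eaton model, the sketch below evaluates its commonly cited two-term form, in which the fractional volume compaction is a1*exp(-k1/P) + a2*exp(-k2/P). The parameter values are hypothetical and are not taken from the dataset; the spreadsheet's exact parameterization may differ.

```python
import math


def cooper_eaton(pressure, a1, k1, a2, k2):
    """Fractional volume compaction V* at a given applied pressure.

    Two-term Cooper-Eaton form: the first term is usually read as particle
    rearrangement, the second as deformation or fragmentation.
    `pressure`, `k1` and `k2` must share the same (positive) pressure unit.
    """
    return a1 * math.exp(-k1 / pressure) + a2 * math.exp(-k2 / pressure)


# Hypothetical parameters, for illustration only.
params = dict(a1=0.6, k1=1.0, a2=0.4, k2=50.0)
low = cooper_eaton(1.0, **params)
high = cooper_eaton(100.0, **params)
```

    With these parameters the compaction rises with pressure and approaches a1 + a2 as the pressure becomes very large.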

  6. TraVLR: Now You See It, Now You Don't! A Bimodal Dataset for Evaluating...

    • zenodo.org
    zip
    Updated Mar 5, 2023
    Cite
    Keng Ji Chow; Samson Tan; Kan Min Yen (2023). TraVLR: Now You See It, Now You Don't! A Bimodal Dataset for Evaluating Visio-Linguistic Reasoning [Dataset]. http://doi.org/10.5281/zenodo.6563623
    Available download formats: zip
    Dataset updated
    Mar 5, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Keng Ji Chow; Samson Tan; Kan Min Yen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TraVLR dataset for evaluation of visio-linguistic reasoning and cross-modal transfer. Four reasoning tasks are supported: spatiality, cardinality, numerical comparison and quantifiers. For some of the datasets, we provide two versions which differ only in their training sets: one with 8k training examples, and another with 32k training examples. The validation sets, in-distribution and out-of-distribution test sets are identical.

  7. A multi-modal human neuroimaging dataset for data integration: simultaneous...

    • openneuro.org
    Updated Dec 4, 2019
    + more versions
    Cite
    Giulia Lioi; Claire Cury; Lorraine Perronnet; Marsel Mano; Elise Bannier; Anatole Lecuyer; Christian Barillot (2019). A multi-modal human neuroimaging dataset for data integration: simultaneous EEG and MRI acquisition during a motor imagery neurofeedback task: XP2 [Dataset]. http://doi.org/10.18112/openneuro.ds002338.v1.0.1
    Dataset updated
    Dec 4, 2019
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Giulia Lioi; Claire Cury; Lorraine Perronnet; Marsel Mano; Elise Bannier; Anatole Lecuyer; Christian Barillot
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    ORIGINAL PAPERS

    Mano, Marsel, Anatole Lécuyer, Elise Bannier, Lorraine Perronnet, Saman Noorzadeh, and Christian Barillot. 2017. “How to Build a Hybrid Neurofeedback Platform Combining EEG and FMRI.” Frontiers in Neuroscience 11 (140). https://doi.org/10.3389/fnins.2017.00140

    Perronnet, Lorraine, Anatole Lécuyer, Marsel Mano, Maureen Clerc, Fabien Lotte, and Christian Barillot. 2018. “Learning 2-in-1 : Towards Integrated EEG-FMRI-Neurofeedback.” BioRxiv, no. 397729. https://doi.org/10.1101/397729.

    OVERVIEW

    This dataset, XP2, can be pooled with the dataset XP1 (DOI: 10.18112/openneuro.ds002336.v1.0.0). The data acquisition methods are described in Perronnet et al. (2017, Frontiers in Human Neuroscience). Simultaneous 64-channel EEG and fMRI were acquired during right-hand motor imagery and neurofeedback (NF) in this study (as well as in XP1). The study involved 16 subjects randomly assigned to two groups: the first group performed bimodal EEG-fMRI NF with a bi-dimensional feedback metaphor, while the second group performed the same task with a mono-dimensional feedback.

    EXPERIMENTAL PARADIGM

    The experimental protocol consisted of 5 EEG-fMRI runs with a 20 s block design alternating rest and task (1 block = 20 s rest + 20 s task). Task description:

    • _task-MIpre: motor imagery run without NF. 8 blocks.
    • _task-1dNF or _task-2dNF: bimodal neurofeedback, with either a mono-dimensional neurofeedback display (mean of EEG NF and fMRI NF scores) or a bi-dimensional display (one modality per dimension). The list of subjects with 1d or 2d is given below. Each subject had 3 runs. 8 blocks per run.
    • _task-MIpost: motor imagery run without NF. 8 blocks.

    Subjects with mono-dimensional feedback display (1D): xp201, xp202, xp203, xp206, xp211, xp218, xp219, xp220, xp222

    Subjects with bi-dimensional feedback display (2D): xp204, xp205, xp207, xp213, xp216, xp217, xp221
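
    The block timing described in the experimental paradigm (8 blocks per run, each 20 s rest followed by 20 s task) can be sketched as follows. The function name and the tuple layout are hypothetical; the labels "Rest" and "Task-NF" follow the trial types listed for the event files.

```python
def block_schedule(n_blocks=8, rest_s=20, task_s=20, task_label="Task-NF"):
    """Return (onset_s, duration_s, trial_type) tuples for one run.

    Onsets are relative to the start of the protocol (first rest block).
    """
    events = []
    onset = 0
    for _ in range(n_blocks):
        events.append((onset, rest_s, "Rest"))
        onset += rest_s
        events.append((onset, task_s, task_label))
        onset += task_s
    return events


schedule = block_schedule()
# End of the last block = total protocol length in seconds.
total_s = schedule[-1][0] + schedule[-1][1]
```

    With the default values the run lasts 320 s, which matches the run length quoted for the fMRI data.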

    EEG DATA

    EEG data were recorded using a 64-channel MR-compatible solution from Brain Products (Brain Products GmbH, Gilching, Germany).

    RAW EEG DATA

    EEG was sampled at 5 kHz with FCz as the reference electrode and AFz as the ground electrode, and a resolution of 0.5 µV. Following the BIDS directory structure, the raw EEG data for each task can be found for each subject in

    XP2/sub-xp2*/eeg

    in Brain Vision Recorder format (File Version 1.0). Each raw EEG recording includes three files: the data file (*.eeg), the header file (*.vhdr) and the marker file (*.vmrk). The header file contains information about acquisition parameters and amplifier setup. For each electrode, the impedance at the beginning of the recording is also specified. For all subjects, channel 32 is the ECG channel. The 63 other channels are EEG channels.

    The marker file contains the list of markers assigned to the EEG recordings and their properties (marker type, marker ID and position in data points). Three types of markers are relevant for the EEG processing:

    • R128 (Response): the fMRI volume marker, used to correct for the gradient artifact
    • S 99 (Stimulus): the protocol marker indicating the start of the Rest block
    • S 2 (Stimulus): the protocol marker indicating the start of the Task (Motor Execution, Motor Imagery or Neurofeedback)

    Warning: in a few EEG recordings, the first S 99 marker might be missing, but it can easily be “added” 20 s before the first S 2.
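
    The warning above can be handled with a few lines of code. The sketch below assumes the usual BrainVision marker-line layout (Mk<n>=<type>,<description>,<position>,...) and the raw sampling rate of 5 kHz. The function names and the sample text are hypothetical, and real files should be checked before relying on this.

```python
def parse_markers(vmrk_text):
    """Return (type, description, position) tuples from 'Mk<n>=...' lines."""
    markers = []
    for line in vmrk_text.splitlines():
        if not line.startswith("Mk") or "=" not in line:
            continue  # skip section headers, comments and blank lines
        fields = line.split("=", 1)[1].split(",")
        if len(fields) < 3 or not fields[2].strip().isdigit():
            continue
        markers.append((fields[0], fields[1], int(fields[2])))
    return markers


def _code(description):
    # "S 99", "S  2" and "S99" all normalise to the same code.
    return "".join(description.split())


def add_missing_first_rest(markers, sfreq_hz=5000, rest_s=20):
    """Insert an S 99 marker 20 s before the first S 2 if none precedes it."""
    task_positions = [pos for _, desc, pos in markers if _code(desc) == "S2"]
    if not task_positions:
        return list(markers)
    first_task = min(task_positions)
    if any(_code(d) == "S99" and p < first_task for _, d, p in markers):
        return list(markers)
    rest_marker = ("Stimulus", "S 99", first_task - rest_s * sfreq_hz)
    return sorted(list(markers) + [rest_marker], key=lambda m: m[2])


# Hypothetical marker text in which the first rest marker is missing.
sample = """[Marker Infos]
Mk1=New Segment,,1,1,0
Mk2=Response,R128,5000,1,0
Mk3=Stimulus,S  2,110000,1,0
Mk4=Stimulus,S 99,210000,1,0
"""
fixed = add_missing_first_rest(parse_markers(sample))
```

    For the preprocessed data, which are downsampled to 200 Hz, sfreq_hz would need to be set to 200.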

    PREPROCESSED EEG DATA

    Following the BIDS directory structure, the processed EEG data for each task can be found for each subject in

    XP2/derivatives/sub-xp2*/eeg_pp/*eeg_pp.*

    in the Brain Analyzer format. Each processed EEG recording includes three files: the data file (*.dat), the header file (*.vhdr) and the marker file (*.vmrk), containing information similar to that described for the raw data. In the header file of the preprocessed data, the channel locations are also specified. In the marker file, the locations in data points of the identified heart pulses (R marker) are specified as well.

    EEG data were pre-processed using BrainVision Analyzer II software, with the following steps:

    • Automatic gradient artifact correction using the artifact template subtraction method (sliding average calculation with 21 intervals for the sliding average and all channels enabled for correction).
    • Downsampling with factor 25 (200 Hz).
    • Low-pass FIR filter with a cut-off frequency of 50 Hz.
    • Ballistocardiogram (pulse) artifact correction using a semiautomatic procedure (pulse template searched between 40 s and 240 s in the ECG channel with the following parameters: Coherence Trigger = 0.5, Minimal Amplitude = 0.5, Maximal Amplitude = 1.3). A Pulse Artifact marker R was associated with each identified pulse.
    • Segmentation relative to the first block marker (S 99) for the whole length of the training protocol (last S 2 + 20 s).

    EEG-NF SCORES

    Neurofeedback scores can be found in the .mat structures in

    XP2/derivatives/sub-xp2*/NF_eeg/d_sub*NFeeg_scores.mat

    The NF_eeg structures are composed of the following subfields:

    • ID: subject ID, for example sub-xp201
    • lapC3_ERD: a 1x1280 vector of neurofeedback scores (4 scores per second, for the whole session)
    • eeg: a 64x80200 matrix containing the pre-processed EEG signals (steps described above), filtered between 8 and 30 Hz
    • lapC3_bandpower_8Hz_30Hz: a 1x1280 vector. Band power of the filtered signal with a Laplacian centred on C3, used to estimate lapC3_ERD
    • lapC3_filter: a 1x64 vector. Laplacian filter centred above the C3 channel

    BOLD fMRI DATA

    All DICOM files were converted to NIfTI-1 and then into BIDS format (version 2.1.4) using the software dcm2niix (version v1.0.20190720 GVV7.4.0).

    fMRI acquisitions were performed using echo-planar imaging (EPI) and covered the superior half of the brain, with the following parameters:

    • 3T Siemens Verio
    • EPI sequence
    • TR = 1 s
    • TE = 23 ms
    • Resolution: 2x2x4 mm
    • Number of slices: 16
    • No slice gap

    As specified by the onsets in the relevant task event files (XP2\ *events.tsv), the scanner began the EPI pulse sequence two seconds prior to the start of the protocol (first rest block), so the first two TRs should be discarded.

    The useful TRs for the runs are therefore

    • task-MIpre and task-MIpost: 320 s (2 to 302)
    • task-1dNF and task-2dNF: 320 s (2 to 302)

    In task events files for the different tasks, each column represents:

    • 'onset': onset time (sec) of an event
    • 'duration': duration (sec) of the event
    • 'trial_type': trial (block) type: rest or task (Rest, Task-MI, Task-NF)
    • 'stim_file': image presented in a stimulus block. During Rest or Motor Imagery (Task-MI), instructions were presented to the subject. During Neurofeedback blocks (Task-NF), the image presented was a ball moving in a square for the bidimensional NF (task-2dNF) or a ball moving along a gauge for the unidimensional NF (task-1dNF), which the subject could control by self-regulating their EEG and fMRI brain activity.
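
    The columns listed above can be read with the Python standard library. The snippet is a minimal sketch: the sample rows are hypothetical (they are not copied from the dataset), and only the four column names come from the description.

```python
import csv
import io


def read_events(tsv_text):
    """Parse the text of an events.tsv file into a list of dictionaries."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    events = []
    for row in reader:
        events.append({
            "onset": float(row["onset"]),        # seconds
            "duration": float(row["duration"]),  # seconds
            "trial_type": row["trial_type"],     # Rest, Task-MI or Task-NF
            "stim_file": row.get("stim_file", ""),
        })
    return events


# Hypothetical two-row file: one rest block followed by one NF block.
sample_tsv = (
    "onset\tduration\ttrial_type\tstim_file\n"
    "2\t20\tRest\tn/a\n"
    "22\t20\tTask-NF\tn/a\n"
)
events = read_events(sample_tsv)
task_seconds = sum(e["duration"] for e in events
                   if e["trial_type"].startswith("Task"))
```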

    Following the BIDS directory structure, the functional data and related metadata can be found for each subject in the following directory

    XP2/sub-xp2*/func

    BOLD-NF SCORES

    For each subject and NF session, a MATLAB structure with BOLD-NF features can be found in

    XP2/derivatives/sub-xp2*/NF_bold/

    To compute the BOLD-NF scores, fMRI data were preprocessed using AutoMRI, software based on SPM8, with the following steps: slice-time correction, spatial realignment and coregistration with the anatomical scan, spatial smoothing with an 8 mm Gaussian kernel, and normalization to the Montreal Neurological Institute template. For each session, a first-level general linear model analysis was then performed. The resulting activation maps (voxel-wise family-wise error corrected at p < 0.05) were used to define two ROIs (9x9x3 voxels) around the maximum of activation in the ipsilesional primary motor area (M1) and the supplementary motor area (SMA), respectively.

    The BOLD-NF scores were calculated as the difference between the percentage signal change in the two ROIs (SMA and M1) and that in a large deep background region (slice 3 out of 16) whose activity is not correlated with the NF task. A smoothed version of the NF scores over the preceding three volumes was also computed.

    The NF_bold structure is organized as follows:

    NF_bold → .m1 → .nf
                  → .smoothnf
                  → .roimean (averaged BOLD signal in the ROI)
                  → .bgmean (averaged BOLD signal in the background slice)
                  → .method
    NFscores.fmri → .sma → .nf
                         → .smoothnf
                         → .roimean (averaged BOLD signal in the ROI)
                         → .bgmean (averaged BOLD signal in the background slice)
                         → .method

    Where the subfield method contains information about the ROI size (.roisize), the background mask (.bgmask) and ROI mask (.roimask).

    More details about signal

  8. A multi-modal human neuroimaging dataset for data integration: simultaneous...

    • openneuro.org
    Updated Sep 24, 2020
    + more versions
    Cite
    Giulia Lioi; Claire Cury; Lorraine Perronnet; Marsel Mano; Elise Bannier; Anatole Lecuyer; Christian Barillot (2020). A multi-modal human neuroimaging dataset for data integration: simultaneous EEG and fMRI acquisition during a motor imagery neurofeedback task: XP2 [Dataset]. http://doi.org/10.18112/openneuro.ds002338.v2.0.1
    Dataset updated
    Sep 24, 2020
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Giulia Lioi; Claire Cury; Lorraine Perronnet; Marsel Mano; Elise Bannier; Anatole Lecuyer; Christian Barillot
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    ORIGINAL PAPERS

    Lioi, G., Cury, C., Perronnet, L., Mano, M., Bannier, E., Lécuyer, A., & Barillot, C. (2019). Simultaneous MRI-EEG during a motor imagery neurofeedback task: an open access brain imaging dataset for multi-modal data integration. Accepted for publication in Scientific Data. https://doi.org/10.1101/862375

    Mano, Marsel, Anatole Lécuyer, Elise Bannier, Lorraine Perronnet, Saman Noorzadeh, and Christian Barillot. 2017. “How to Build a Hybrid Neurofeedback Platform Combining EEG and FMRI.” Frontiers in Neuroscience 11 (140). https://doi.org/10.3389/fnins.2017.00140

    Lorraine Perronnet, Anatole Lecuyer, Marsel Mano, Mathis Fleury, Giulia Lioi, Claire Cury, Maureen Clerc, Fabien Lotte, and Christian Barillot. 2018. “Learning 2-in-1 : Towards Integrated EEG-FMRI-Neurofeedback.” BioRxiv, no. 397729. https://doi.org/10.1101/397729.

    OVERVIEW

    This dataset, XP2, can be pooled with the dataset XP1, available here: https://openneuro.org/datasets/ds002336. The data acquisition methods are described in Perronnet et al. (2017, Frontiers in Human Neuroscience). Simultaneous 64-channel EEG and fMRI were acquired during right-hand motor imagery and neurofeedback (NF) in this study (as well as in XP1). The study involved 16 subjects randomly assigned to two groups: the first group performed bimodal EEG-fMRI NF with a bi-dimensional feedback metaphor, while the second group performed the same task with a mono-dimensional feedback.

    EXPERIMENTAL PARADIGM

    The experimental protocol consisted of 5 EEG-fMRI runs with a 20 s block design alternating rest and task (1 block = 20 s rest + 20 s task). Task description:

    • _task-MIpre: motor imagery run without NF. 8 blocks.
    • _task-1dNF or _task-2dNF: bimodal neurofeedback, with either a mono-dimensional neurofeedback display (mean of EEG NF and fMRI NF scores) or a bi-dimensional display (one modality per dimension). The list of subjects with 1d or 2d is given below. Each subject had 3 runs. 8 blocks per run.
    • _task-MIpost: motor imagery run without NF. 8 blocks.

    Subjects with mono-dimensional feedback display (1D): xp201, xp202, xp203, xp206, xp211, xp218, xp219, xp220, xp222

    Subjects with bi-dimensional feedback display (2D): xp204, xp205, xp207, xp210, xp213, xp216, xp217, xp221

    EEG DATA

    EEG data were recorded using a 64-channel MR-compatible solution from Brain Products (Brain Products GmbH, Gilching, Germany).

    RAW EEG DATA

    EEG was sampled at 5 kHz with FCz as the reference electrode and AFz as the ground electrode, and a resolution of 0.5 µV. Following the BIDS directory structure, the raw EEG data for each task can be found for each subject in

    XP2/sub-xp2*/eeg

    in Brain Vision Recorder format (File Version 1.0). Each raw EEG recording includes three files: the data file (*.eeg), the header file (*.vhdr) and the marker file (*.vmrk). The header file contains information about acquisition parameters and amplifier setup. For each electrode, the impedance at the beginning of the recording is also specified. For all subjects, channel 32 is the ECG channel. The 63 other channels are EEG channels.

    The marker file contains the list of markers assigned to the EEG recordings and their properties (marker type, marker ID and position in data points). Three types of markers are relevant for the EEG processing:

    • R128 (Response): the fMRI volume marker, used to correct for the gradient artifact
    • S 99 (Stimulus): the protocol marker indicating the start of the Rest block
    • S 2 (Stimulus): the protocol marker indicating the start of the Task (Motor Execution, Motor Imagery or Neurofeedback)

    Warning: in a few EEG recordings, the first S 99 marker might be missing, but it can easily be “added” 20 s before the first S 2.

    PREPROCESSED EEG DATA

    Following the BIDS directory structure, the processed EEG data for each task can be found for each subject in

    XP2/derivatives/sub-xp2*/eeg_pp/*eeg_pp.*

    in the Brain Analyzer format. Each processed EEG recording includes three files: the data file (*.dat), the header file (*.vhdr) and the marker file (*.vmrk), containing information similar to that described for the raw data. In the header file of the preprocessed data, the channel locations are also specified. In the marker file, the locations in data points of the identified heart pulses (R marker) are specified as well.

    EEG data were pre-processed using BrainVision Analyzer II software, with the following steps:

    • Automatic gradient artifact correction using the artifact template subtraction method (sliding average calculation with 21 intervals for the sliding average and all channels enabled for correction).
    • Downsampling with factor 25 (200 Hz).
    • Low-pass FIR filter with a cut-off frequency of 50 Hz.
    • Ballistocardiogram (pulse) artifact correction using a semiautomatic procedure (pulse template searched between 40 s and 240 s in the ECG channel with the following parameters: Coherence Trigger = 0.5, Minimal Amplitude = 0.5, Maximal Amplitude = 1.3). A Pulse Artifact marker R was associated with each identified pulse.
    • Segmentation relative to the first block marker (S 99) for the whole length of the training protocol (last S 2 + 20 s).

    EEG-NF SCORES

    Neurofeedback scores can be found in the .mat structures in

    XP2/derivatives/sub-xp2*/NF_eeg/d_sub*NFeeg_scores.mat

    The NF_eeg structures are composed of the following subfields:

    • ID: subject ID, for example sub-xp201
    • lapC3_ERD: a 1x1280 vector of neurofeedback scores (4 scores per second, for the whole session)
    • eeg: a 64x80200 matrix containing the pre-processed EEG signals (steps described above), filtered between 8 and 30 Hz
    • lapC3_bandpower_8Hz_30Hz: a 1x1280 vector. Band power of the filtered signal with a Laplacian centred on C3, used to estimate lapC3_ERD
    • lapC3_filter: a 1x64 vector. Laplacian filter centred above the C3 channel

    BOLD fMRI DATA

    All DICOM files were converted to NIfTI-1 and then into BIDS format (version 2.1.4) using the software dcm2niix (version v1.0.20190720 GVV7.4.0).

    fMRI acquisitions were performed using echo-planar imaging (EPI) and covered the superior half of the brain, with the following parameters:

    • 3T Siemens Verio
    • EPI sequence
    • TR = 1 s
    • TE = 23 ms
    • Resolution: 2x2x4 mm
    • Number of slices: 16
    • No slice gap

    As specified by the onsets in the relevant task event files (XP2\ *events.tsv), the scanner began the EPI pulse sequence two seconds prior to the start of the protocol (first rest block), so the first two TRs should be discarded.

    The useful TRs for the runs are therefore

    • task-MIpre and task-MIpost: 320 s (2 to 302)
    • task-1dNF and task-2dNF: 320 s (2 to 302)

    In task events files for the different tasks, each column represents:

    • 'onset': onset time (sec) of an event
    • 'duration': duration (sec) of the event
    • 'trial_type': trial (block) type: rest or task (Rest, Task-MI, Task-NF)
    • 'stim_file': image presented in a stimulus block. During Rest or Motor Imagery (Task-MI), instructions were presented to the subject. During Neurofeedback blocks (Task-NF), the image presented was a ball moving in a square for the bidimensional NF (task-2dNF) or a ball moving along a gauge for the unidimensional NF (task-1dNF), which the subject could control by self-regulating their EEG and fMRI brain activity.

    Following the BIDS directory structure, the functional data and related metadata can be found for each subject in the following directory

    XP2/sub-xp2*/func

    BOLD-NF SCORES

    For each subject and NF session, a MATLAB structure with BOLD-NF features can be found in

    XP2/derivatives/sub-xp2*/NF_bold/

    To compute the BOLD-NF scores, fMRI data were preprocessed using AutoMRI, software based on SPM8, with the following steps: slice-time correction, spatial realignment and coregistration with the anatomical scan, spatial smoothing with an 8 mm Gaussian kernel, and normalization to the Montreal Neurological Institute template. For each session, a first-level general linear model analysis was then performed. The resulting activation maps (voxel-wise family-wise error corrected at p < 0.05) were used to define two ROIs (9x9x3 voxels) around the maximum of activation in the ipsilesional primary motor area (M1) and the supplementary motor area (SMA), respectively.

    The BOLD-NF scores were calculated as the difference between the percentage signal change in the two ROIs (SMA and M1) and that in a large deep background region (slice 3 out of 16) whose activity is not correlated with the NF task. A smoothed version of the NF scores over the preceding three volumes was also computed.

    The NF_bold structure is organized as follows:

    NF_bold → .m1 → .nf
                  → .smoothnf
                  → .roimean (averaged BOLD signal in the ROI)
                  → .bgmean (averaged BOLD signal in the background slice)
                  → .method
    NFscores.fmri → .sma → .nf

  9. CdSpritesplus

    • huggingface.co
    Updated May 31, 2025
    Gabriela Sejnova (2025). CdSpritesplus [Dataset]. https://huggingface.co/datasets/gabinsane/CdSpritesplus
    Dataset updated
    May 31, 2025
    Authors
    Gabriela Sejnova
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    CdSprites+ is a synthetic bimodal dataset, designed specifically for comparison of the joint- and cross-generative capabilities of multimodal VAEs. This dataset extends the dSprites dataset with natural language captions and additional features and offers 5 levels of difficulty (based on the number of attributes) to find the minimal functioning scenario for each model. Moreover, its rigid structure enables automatic qualitative evaluation of the generated samples.

  10. Cheilanthane ratios in Mesoproterozoic to Early Paleozoic rocks - Vdataset -...

    • service.tib.eu
    Updated Nov 30, 2024
    (2024). Cheilanthane ratios in Mesoproterozoic to Early Paleozoic rocks - Vdataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/png-doi-10-1594-pangaea-932811
    Dataset updated
    Nov 30, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set contains values of the cheilanthane (tricyclic terpenoid) ratios C22/C21 and C24/C23 from Mesoproterozoic to Silurian rocks of global distribution. Lipid extractions were performed following standard protocols. The yield was contamination-controlled via exterior vs. interior comparison of individual peak concentrations. Values were obtained via integration of multiple-reaction-monitoring (MRM) measurements. In comparing our cheilanthane ratios to those reported for younger rocks and oils, we noticed consistently lower C24/C23 values in our samples, suggesting a bimodal character of cheilanthane distribution in time. We tentatively attribute this to the rise of a source of oxidatively decarboxylated cheilanthatriol derived from ferns. Fossil cheilanthanes likely represent a composite mixture of various biological sources, whose secular patterns may record more than the Paleozoic rise of terrestrial plants presented here.

  11. Characteristics of 450 enrolled T. cruzi seroreactive blood donors, Chaco...

    • plos.figshare.com
    • figshare.com
    xls
    Updated May 30, 2025
    Mirta C. Remesar; Ester C. Sabino; Lewis F. Buss; Claudio D. Merlo; Mónica G. López; Sebastián L. Humeres; Héctor A. Pavón; Clara Di Germanio; Sonia Bakkour Coco; Léa C. Oliveira-da Silva; Marcelo Martins Pinto Filho; Antonio L. Ribeiro; Michael P. Busch; Ana E. del Pozo (2025). Characteristics of 450 enrolled T. cruzi seroreactive blood donors, Chaco region, 2018-2019. [Dataset]. http://doi.org/10.1371/journal.pntd.0012724.t001
    Available download formats: xls
    Dataset updated
    May 30, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mirta C. Remesar; Ester C. Sabino; Lewis F. Buss; Claudio D. Merlo; Mónica G. López; Sebastián L. Humeres; Héctor A. Pavón; Clara Di Germanio; Sonia Bakkour Coco; Léa C. Oliveira-da Silva; Marcelo Martins Pinto Filho; Antonio L. Ribeiro; Michael P. Busch; Ana E. del Pozo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Characteristics of 450 enrolled T. cruzi seroreactive blood donors, Chaco region, 2018-2019.

  12. Electrocardiogram abnormalities distribution in groups of donors with four...

    • figshare.com
    • plos.figshare.com
    xls
    Updated May 30, 2025
    Mirta C. Remesar; Ester C. Sabino; Lewis F. Buss; Claudio D. Merlo; Mónica G. López; Sebastián L. Humeres; Héctor A. Pavón; Clara Di Germanio; Sonia Bakkour Coco; Léa C. Oliveira-da Silva; Marcelo Martins Pinto Filho; Antonio L. Ribeiro; Michael P. Busch; Ana E. del Pozo (2025). Electrocardiogram abnormalities distribution in groups of donors with four tests concordant serology reactivity, as high and low antibody levels. [Dataset]. http://doi.org/10.1371/journal.pntd.0012724.t003
    Available download formats: xls
    Dataset updated
    May 30, 2025
    Dataset provided by
    PLOS Neglected Tropical Diseases
    Authors
    Mirta C. Remesar; Ester C. Sabino; Lewis F. Buss; Claudio D. Merlo; Mónica G. López; Sebastián L. Humeres; Héctor A. Pavón; Clara Di Germanio; Sonia Bakkour Coco; Léa C. Oliveira-da Silva; Marcelo Martins Pinto Filho; Antonio L. Ribeiro; Michael P. Busch; Ana E. del Pozo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Electrocardiogram abnormalities distribution in groups of donors with four tests concordant serology reactivity, as high and low antibody levels.

  13. T. cruzi PCR samples distribution according to classification of high and...

    • plos.figshare.com
    • figshare.com
    xls
    Updated May 30, 2025
    Mirta C. Remesar; Ester C. Sabino; Lewis F. Buss; Claudio D. Merlo; Mónica G. López; Sebastián L. Humeres; Héctor A. Pavón; Clara Di Germanio; Sonia Bakkour Coco; Léa C. Oliveira-da Silva; Marcelo Martins Pinto Filho; Antonio L. Ribeiro; Michael P. Busch; Ana E. del Pozo (2025). T. cruzi PCR samples distribution according to classification of high and low antibody levels for each serology test. [Dataset]. http://doi.org/10.1371/journal.pntd.0012724.t002
    Available download formats: xls
    Dataset updated
    May 30, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Mirta C. Remesar; Ester C. Sabino; Lewis F. Buss; Claudio D. Merlo; Mónica G. López; Sebastián L. Humeres; Héctor A. Pavón; Clara Di Germanio; Sonia Bakkour Coco; Léa C. Oliveira-da Silva; Marcelo Martins Pinto Filho; Antonio L. Ribeiro; Michael P. Busch; Ana E. del Pozo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    T. cruzi PCR samples distribution according to classification of high and low antibody levels for each serology test.


Pedro Lima Louro; Hugo Redinho; Ricardo Santos; Ricardo Malheiro; Renato Panda; Rui Pedro Paiva (2025). MERGE Dataset (INCOMPLETE. SEE V1.1) [Dataset]. http://doi.org/10.5281/zenodo.13904708
MERGE Dataset (INCOMPLETE. SEE V1.1)

Dataset updated
Feb 7, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Pedro Lima Louro; Hugo Redinho; Ricardo Santos; Ricardo Malheiro; Renato Panda; Rui Pedro Paiva
License

Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically

Description

The MERGE dataset is a collection of audio, lyrics, and bimodal datasets for conducting research on Music Emotion Recognition. A complete version is provided for each modality. The audio datasets provide 30-second excerpts for each sample, while full lyrics are provided in the relevant datasets. The number of samples available in each dataset is as follows:

  • MERGE Audio Complete: 3554
  • MERGE Audio Balanced: 3232
  • MERGE Lyrics Complete: 2568
  • MERGE Lyrics Balanced: 2400
  • MERGE Bimodal Complete: 2216
  • MERGE Bimodal Balanced: 2000

Additional Contents

Each dataset contains the following additional files:

  • av_values: File containing the arousal and valence values for each sample sorted by their identifier;
  • tvt_dataframes: Train, validate, and test splits for each dataset. Both a 70-15-15 and a 40-30-30 split are provided.
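
To give a sense of what the two split ratios mean in sample counts, here is a minimal Python sketch. It does not reproduce the splits shipped in tvt_dataframes, which should be used for comparable results; the function names, the seeded shuffle, and the choice to give the rounding remainder to the test set are assumptions made for illustration.

```python
import random


def split_sizes(n_samples, percentages):
    """Sizes of train/validate/test partitions for integer percentages."""
    train_pct, validate_pct, test_pct = percentages
    if train_pct + validate_pct + test_pct != 100:
        raise ValueError("percentages must sum to 100")
    n_train = n_samples * train_pct // 100
    n_validate = n_samples * validate_pct // 100
    # Assumption: any rounding remainder goes to the test partition.
    n_test = n_samples - n_train - n_validate
    return n_train, n_validate, n_test


def split_ids(sample_ids, percentages, seed=0):
    """Shuffle identifiers reproducibly and cut them into three partitions."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train, n_validate, _ = split_sizes(len(ids), percentages)
    train = ids[:n_train]
    validate = ids[n_train:n_train + n_validate]
    test = ids[n_train + n_validate:]
    return train, validate, test
```

Under these assumptions, `split_sizes(3554, (70, 15, 15))` gives 2487, 533, and 534 samples for MERGE Audio Complete, and `split_sizes(2000, (40, 30, 30))` gives 800, 600, and 600 for MERGE Bimodal Balanced.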

Metadata

A metadata spreadsheet is provided for each dataset with the following information for each sample, if available:

  • Song (Audio and Lyrics datasets) - Song identifiers. Identifiers starting with MT were extracted from the AllMusic platform, while those starting with A or L were collected from private collections;
  • Quadrant - Label corresponding to one of the four quadrants from Russell's Circumplex Model;
  • AllMusic Id - For samples starting with A or L, the matching AllMusic identifier is also provided. This was used to complement the available information for the samples originally obtained from the platform;
  • Artist - First performing artist or band;
  • Title - Song title;
  • Relevance - AllMusic metric representing the relevance of the song in relation to the query used;
  • Duration - Song length in seconds;
  • Moods - User-generated mood tags extracted from the AllMusic platform and available in Warriner's affective dictionary;
  • MoodsAll - User-generated mood tags extracted from the AllMusic platform;
  • Genres - User-generated genre tags extracted from the AllMusic platform;
  • Themes - User-generated theme tags extracted from the AllMusic platform;
  • Styles - User-generated style tags extracted from the AllMusic platform;
  • AppearancesTrackIDs - All AllMusic identifiers related to a sample;
  • Sample - Availability of the sample in the AllMusic platform;
  • SampleURL - URL to the 30-second excerpt in AllMusic;
  • ActualYear - Year of song release.
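
The Quadrant label relates to the arousal and valence values in av_values through Russell's Circumplex Model. The Python sketch below shows one common way to derive a quadrant from a pair of values. The label strings, the counterclockwise numbering starting from positive valence and high arousal, the midpoint of 0.0, and the handling of values that fall exactly on an axis are all assumptions; the dataset's own scale and labelling should be checked before relying on it.

```python
def russell_quadrant(valence, arousal, midpoint=0.0):
    """Map a (valence, arousal) pair to a quadrant label.

    Assumed convention (not confirmed by the dataset description):
      Q1: positive valence, high arousal
      Q2: negative valence, high arousal
      Q3: negative valence, low arousal
      Q4: positive valence, low arousal
    Values equal to the midpoint are treated as positive / high.
    """
    positive_valence = valence >= midpoint
    high_arousal = arousal >= midpoint
    if positive_valence and high_arousal:
        return "Q1"
    if high_arousal:
        return "Q2"
    if not positive_valence:
        return "Q3"
    return "Q4"
```

If the values are stored on a 0 to 1 scale instead, passing `midpoint=0.5` keeps the same logic.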

Acknowledgements

This work is funded by FCT - Foundation for Science and Technology, I.P., within the scope of the projects: MERGE - DOI: 10.54499/PTDC/CCI-COM/3171/2021 financed with national funds (PIDDAC) via the Portuguese State Budget; and project CISUC - UID/CEC/00326/2020 with funds from the European Social Fund, through the Regional Operational Program Centro 2020.

Renato Panda was supported by Ci2 - FCT UIDP/05567/2020.
