Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data release contains lake and reservoir water surface temperature summary statistics calculated from Landsat 8 Analysis Ready Data (ARD) images available within the Conterminous United States (CONUS) from 2013-2023. All zip files within this data release contain nested directories using .parquet files to store the data. The file example_script_for_using_parquet.R contains example code for using the R arrow package (Richardson and others, 2024) to open and query the nested .parquet files.

Limitations of this dataset include:
- All biases inherent to the Landsat Surface Temperature product are retained in this dataset, which can produce unrealistically high or low estimates of water temperature. This is observed to happen, for example, in cases with partial cloud coverage over a waterbody.
- Some waterbodies are split between multiple Landsat Analysis Ready Data tiles or orbit footprints. In these cases, multiple waterbody-wide statistics may be reported, one for each data tile. The deepest point values are extracted and reported for the tile covering the deepest point. A total of 947 waterbodies are split between multiple tiles (see the multiple_tiles = "yes" column of site_id_tile_hv_crosswalk.csv).
- Temperature data were not extracted from satellite images with more than 90% cloud cover.
- Temperature data represent skin temperature at the water surface and may differ from temperature observations from below the water surface.

Potential methods for addressing these limitations (a Python sketch applying these filters appears after the file descriptions below):
- Identifying and removing unrealistic temperature estimates:
  - Calculate the total percentage of cloud pixels over a given waterbody as percent_cloud_pixels = wb_dswe9_pixels / (wb_dswe9_pixels + wb_dswe1_pixels), and filter percent_cloud_pixels by a desired percentage of cloud coverage.
  - Remove lakes with a limited number of water pixel values available (wb_dswe1_pixels < 10).
  - Filter to waterbodies where the deepest point is identified as water (dp_dswe = 1).
- Handling waterbodies split between multiple tiles:
  - These waterbodies can be identified using the site_id_tile_hv_crosswalk.csv file (column multiple_tiles = "yes"). A user could combine sections of the same waterbody by spatially weighting the values using the number of water pixels available within each section (wb_dswe1_pixels). This should be done with caution, as some sections of the waterbody may have data available on different dates.

The files in this data release are organized as follows:
- "year_byscene=XXXX.zip" includes temperature summary statistics for individual waterbodies and the deepest points (the furthest point from land within a waterbody) within each waterbody by scene_date (when the satellite passed over). Individual waterbodies are identified by the National Hydrography Dataset (NHD) permanent_identifier included within the site_id column. Some of the .parquet files within the byscene datasets may include only one dummy row of data (identified by tile_hv="000-000"). This happens when no tabular data were extracted from the raster images because of clouds obscuring the image, a tile that covers mostly ocean with a very small amount of land, or other possible reasons.
An example file path for this dataset follows: year_byscene=2023/tile_hv=002-001/part-0.parquet
- "year=XXXX.zip" includes the summary statistics for individual waterbodies and the deepest points within each waterbody by year (dataset=annual), month (year=0, dataset=monthly), and year-month (dataset=yrmon). The year_byscene=XXXX data are used as input for generating these summary tables, which aggregate temperature data by year, month, and year-month. Aggregated data are not available for the following tiles: 001-004, 001-010, 002-012, 028-013, and 029-012, because these tiles primarily cover ocean with limited land, and no output data were generated. An example file path for this dataset follows: year=2023/dataset=lakes_annual/tile_hv=002-001/part-0.parquet
- "example_script_for_using_parquet.R" includes code to download zip files directly from ScienceBase, identify HUC04 basins within a desired Landsat ARD grid tile, download NHDPlus High Resolution data for visualization, compile .parquet files in nested directories using the R arrow package, and create example static and interactive maps.
- "nhd_HUC04s_ingrid.csv" is a crosswalk file that identifies the HUC04 watersheds within each Landsat ARD tile grid.
- "site_id_tile_hv_crosswalk.csv" is a crosswalk file that identifies the site_id (nhdhr{permanent_identifier}) within each Landsat ARD tile grid. This file also includes a column (multiple_tiles) to identify site_ids that fall within multiple Landsat ARD tile grids.
- "lst_grid.png" is a map of the Landsat grid tiles labelled by the horizontal-vertical ID.
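For users working in Python rather than R, a minimal sketch of the same kind of query is shown below, using pyarrow to open the hive-partitioned directories and applying the screening filters recommended above. The base directory and the 10% cloud threshold are assumptions; column names are taken from the descriptions in this release.

```python
import pyarrow.dataset as ds

# Point at a directory containing the unzipped year_byscene=XXXX folders
# (path is an assumption; adjust to your extraction location).
lst = ds.dataset(".", format="parquet", partitioning="hive")

# Column names below come from the dataset description.
df = lst.to_table(
    columns=["site_id", "wb_dswe1_pixels", "wb_dswe9_pixels", "dp_dswe"]
).to_pandas()

# Screening steps suggested in the limitations section:
df["percent_cloud_pixels"] = df["wb_dswe9_pixels"] / (
    df["wb_dswe9_pixels"] + df["wb_dswe1_pixels"]
)
clean = df[
    (df["percent_cloud_pixels"] <= 0.10)  # choose a cloud threshold (10% assumed)
    & (df["wb_dswe1_pixels"] >= 10)       # drop lakes with few water pixels
    & (df["dp_dswe"] == 1)                # deepest point classified as water
]
```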
https://doi.org/10.5061/dryad.brv15dvh0
On each trial, participants heard a stimulus and clicked a box on the computer screen to indicate whether they heard "SET" or "SAT." Responses of "SET" are coded as 0 and responses of "SAT" are coded as 1. The continuum steps, from 1-7, for duration and spectral quality cues of the stimulus on each trial are named "DurationStep" and "SpectralStep," respectively. Group (young or older adult) and listening condition (quiet or noise) information are provided for each row of the dataset.
https://www.bco-dmo.org/dataset/660543/license
Water column data from CTD casts along the East Siberian Arctic Shelf on R/V Oden during 2011 (ESAS Water Column Methane project). Acquisition methods are described in the following publication: Orcutt, B. et al. 2005.
Core sectioning, porewater collection and analysis
At each sampling site, sediment sub-samples were collected for porewater analyses and, at selected depths, for microbial rate assays (anaerobic oxidation of methane (AOM) and methanogenesis (MOG) from bicarbonate and acetate). Sediment was expelled from the core liner using a hydraulic extruder under anoxic conditions. The depth intervals for extrusion varied. At each depth interval, a sub-sample was collected into a cut-off syringe for dissolved methane concentration quantification. Another 5 mL sub-sample was collected into a pre-weighed and pre-combusted glass vial for determination of porosity (determined by the change in weight after drying at 80 degrees Celsius to a constant weight). The remaining material was used for porewater extraction. Sample fixation and analyses for dissolved constituents followed the methods of Joye et al. (2010).
Microbial Activity Measurements
To determine AOM and MOG rates, 8 to 12 sub-samples (5 cm3) were collected from a core by manual insertion of a glass tube. For AOM, 100 uL of dissolved 14CH4 tracer (about 2,000,000 DPM as gas) was injected into each core. Samples were incubated for 36 to 48 hours at in situ temperature. Following incubation, samples were transferred to 20 mL glass vials containing 2 mL of 2M NaOH (which served to arrest biological activity and fix 14CO2 as 14C-HCO3-). Each vial was sealed with a teflon-lined screw cap, vortexed to mix the sample and base, and immediately frozen. Time-zero samples were fixed immediately after radiotracer injection. The specific activity of the tracer substrate (14CH4) was determined by injecting 50 uL directly into scintillation cocktail (Scintiverse BD) followed by liquid scintillation counting. The accumulation of 14C product (14CO2) was determined by acid digestion following the method of Joye et al. (2010). The AOM rate was calculated using equation 1:
AOM Rate = [CH4] x alphaCH4 / t x (a-14CO2 / a-14CH4)    (Eq. 1)
Here, the AOM rate is expressed as nmol CH4 oxidized per cm3 sediment per day (nmol cm-3 d-1), [CH4] is the methane concentration (uM), alphaCH4 is the isotope fractionation factor for AOM (1.06; Alperin and Reeburgh, 1988), t is the incubation time (d), a-14CO2 is the activity of the product pool, and a-14CH4 is the activity of the substrate pool. If methane concentration was not available, the turnover time of the 14CH4 tracer is presented instead.
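As a worked illustration of Eq. 1 (not part of the data release; variable names are hypothetical), the calculation reduces to a one-line formula. Note that uM equals nmol cm-3, so the result is already in nmol cm-3 d-1:

```python
ALPHA_CH4 = 1.06  # isotope fractionation factor for AOM (Alperin and Reeburgh, 1988)

def aom_rate(ch4_um, t_days, a_14co2, a_14ch4):
    """AOM rate (nmol CH4 cm^-3 d^-1) per Eq. 1.

    ch4_um  -- methane concentration [CH4] in uM (= nmol cm^-3)
    t_days  -- incubation time t in days
    a_14co2 -- activity of the 14CO2 product pool (DPM)
    a_14ch4 -- activity of the 14CH4 substrate pool (DPM)
    """
    return ch4_um * ALPHA_CH4 / t_days * (a_14co2 / a_14ch4)

# e.g. 50 uM CH4, 2-day incubation, 2,000 DPM product, 2,000,000 DPM substrate:
print(aom_rate(50.0, 2.0, 2_000.0, 2_000_000.0))  # 0.0265 nmol cm^-3 d^-1
```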
Rates of bicarbonate-based methanogenesis and acetoclastic methanogenesis were determined by incubating samples in gas-tight, closed-tube vessels without headspace, to prevent the loss of gaseous 14CH4 product during sample manipulation. These sample tubes were sealed using custom-designed plungers (black Hungate stoppers with the lip removed, containing a plastic "tail" run through the stopper) inserted at the base of the tube; the sediment was then pushed via the plunger to the top of the tube until a small amount protruded through the tube opening. A butyl rubber septum was then eased into the tube opening to displace sediment in contact with the atmosphere and close the tube, which was then sealed with an open-top screw cap. The rubber materials used in these assays were boiled in 1N NaOH for 1 hour, followed by several rinses in boiling milliQ water, to leach potentially toxic substances.
A volume of radiotracer solution (100 uL of 14C-HCO3- tracer (~1 x 10^7 dpm in slightly alkaline milliQ water) or 1,2-14C-CH3COO- tracer (~5 x 10^7 dpm in slightly alkaline milliQ water)) was injected into each sample. Samples were incubated as described above, and then 2 mL of 2N NaOH was injected through the top stopper into each sample to terminate biological activity (time-zero samples were fixed prior to tracer injection). Samples were mixed to evenly distribute NaOH through the sample. Production of 14CH4 was quantified by stripping methane from the tubes with an air carrier, converting the 14CH4 to 14CO2 in a combustion furnace, and subsequently trapping the 14CO2 in NaOH as carbonate (Cragg et al., 1990; Crill and Martens, 1986). Activity of 14CO2 was then measured by liquid scintillation counting.
Bi-MOG and Ac-MOG rates were calculated using equations 2 and 3, respectively:
Bi-MOG Rate = [HCO3-] x alphaHCO3 / t x (a-14CH4 / a-H14CO3-)    (Eq. 2)
Ac-MOG Rate = [CH3COO-] x alphaCH3COO- / t x (a-14CH4 / a-14CH314COO-)    (Eq. 3)
Both rates are expressed as nmol of HCO3- or CH3COO- reduced cm-3 d-1, respectively; alphaHCO3 and alphaCH3COO- are the isotope fractionation factors for MOG (assumed to be 1.06), [HCO3-] and [CH3COO-] are the porewater bicarbonate (mM) and acetate (uM) concentrations, respectively, t is the incubation time (d), a-14CH4 is the activity of the product pool, and a-H14CO3- and a-14CH314COO- are the activities of the substrate pools. If samples for substrate concentration determination were not available, the substrate turnover constant is presented instead of the rate.
For water column methane oxidation rate assays, triplicate 20 mL samples of live water (plus one 20 mL sample killed with ethanol (750 uL of pure EtOH) before tracer addition) were transferred from the CTD into serum vials. Samples were amended with 2 x 10^6 DPM of 3H-labeled methane tracer and incubated for 24 to 72 hours (linearity of activity was tested and confirmed). After incubation, samples were fixed with ethanol, as above, and a sub-sample was collected to determine total sample activity (3H-methane + 3H-water). Next, the sample was purged with nitrogen to remove the 3H-methane tracer, and a sub-sample was amended with scintillation fluid and counted on a shipboard scintillation counter to determine the activity of tracer in the product of 3H-methane oxidation, 3H-water. The methane oxidation rate was calculated as:
MOX Rate = [CH4] x alphaCH4 / t x (a-3H-H2O / a-3H-CH4)    (Eq. 4)

Here, [CH4] is the methane concentration (nM), t is the incubation time (d), a-3H-H2O is the activity of the product pool, and a-3H-CH4 is the activity of the substrate pool.

Dataset metadata: Water Column Data, S. Joye and V. Samarkin, PIs; Version 4, October 2016. DOI: 10.1575/1912/bco-dmo.660543.1. Funding: NSF Division of Polar Programs (NSF PLR) award PLR-1023444, program manager Henrietta N. Edmonds. Geographic coverage: 65.0835 to 77.3829 degrees north, 125.0406 to 178.9479 degrees east; sampling depths 10.0 to 651.0 m. Instrument: CTD profiler, used to collect water column samples. Personnel: Samantha B. Joye (Principal Investigator and contact, University of Georgia), Vladimir Samarkin (Co-Principal Investigator, University of Georgia), Hannah Ake (BCO-DMO Data Manager, Woods Hole Oceanographic Institution). Project: ESAS Water Column Methane.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT
The issue of diagnosing psychotic diseases, including schizophrenia and bipolar disorder, and in particular the objectification of symptom severity assessment, is still a problem requiring the attention of researchers. Two measures that can be helpful in patient diagnosis are heart rate variability, calculated based on the electrocardiographic signal, and accelerometer mobility data. The following dataset contains data from 30 psychiatric ward patients with schizophrenia or bipolar disorder and 30 healthy persons. The duration of the measurements for individuals was usually between 1.5 and 2 hours. R-R intervals necessary for heart rate variability calculation were collected simultaneously with accelerometer data using a wearable Polar H10 device. The Positive and Negative Syndrome Scale (PANSS) test was performed for each patient participating in the experiment, and its results are attached to the dataset. Furthermore, the code for loading and preprocessing data, as well as for statistical analysis, is included in the corresponding GitHub repository.
BACKGROUND
Heart rate variability (HRV), calculated from electrocardiographic (ECG) recordings of R-R intervals stemming from the heart's electrical activity, may be used as a biomarker of mental illnesses, including schizophrenia and bipolar disorder (BD) [Benjamin et al]. Variations in R-R interval values correspond to changes in the heart's autonomic regulation [Berntson et al, Stogios et al]. Moreover, the HRV measure reflects the activity of the sympathetic and parasympathetic parts of the autonomic nervous system (ANS) [Task Force of the European Society of Cardiology the North American Society of Pacing Electrophysiology, Matusik et al]. Patients with psychotic mental disorders show a tendency for a change in the centrally regulated ANS balance in the direction of less dynamic changes in ANS activity in response to different environmental conditions [Stogios et al]. Larger sympathetic activity relative to parasympathetic activity leads to lower HRV, while higher parasympathetic activity translates to higher HRV. This loss of dynamic response may be an indicator of mental illness. Additional benefits may come from measuring the daily activity of patients using accelerometry, which may be used to register periods of physical activity and inactivity or withdrawal, for further correlation with HRV values recorded at the same time.
EXPERIMENTS
In our experiment, the participants were 30 psychiatric ward patients with schizophrenia or BD and 30 healthy people. All measurements were performed using a Polar H10 wearable device. The sensor collects ECG recordings and accelerometer data and, additionally, performs detection of R wave peaks. Participants had to wear the sensor for a given time, usually between 1.5 and 2 hours; the shortest recording was 70 minutes. After the first few minutes of the measurement, participants could perform any activity. They were encouraged to undertake physical activity and, more specifically, to take a walk. Because the patients were in the medical ward, they were instructed to take a walk in the corridors at the beginning of the experiment and to repeat the walk 30 minutes and 1 hour after the first walk. The subsequent walks were to be slightly longer (about 3, 5 and 7 minutes, respectively). We did not remind participants of these instructions or supervise them during the experiment, in either the treatment or the control group. Seven persons from the control group did not receive these instructions, and their measurements correspond to freely chosen activities with rest periods, but at least three of them performed physical activities during this time. Nevertheless, at the start of the experiment, all participants were requested to rest in a sitting position for 5 minutes. Moreover, for each patient, disease severity was assessed using the PANSS test, and its scores are attached to the dataset.
The data from the sensors were collected using the Polar Sensor Logger application [Happonen]. The extracted measurements were then preprocessed and analyzed using code prepared by the authors of the experiment, which is publicly available in the GitHub repository [Książek et al].
First, we performed manual artifact detection to remove abnormal heartbeats caused by non-sinus beats and technical issues with the device (e.g. temporary disconnections and inappropriate electrode readings). We also performed anomaly detection using the Daubechies wavelet transform. Nevertheless, the dataset includes the raw data, while the full code necessary to reproduce our anomaly detection approach is available in the repository. Optionally, cubic spline data interpolation can also be performed. After that step, rolling windows of a particular size, with particular time intervals between them, are created. Then, a statistical analysis is prepared, e.g. mean HRV calculation using the RMSSD (Root Mean Square of Successive Differences) approach, measuring the relationship between mean HRV and PANSS scores, mobility coefficient calculation based on accelerometer data, and verification of dependencies between HRV and mobility scores.
DATA DESCRIPTION
The structure of the dataset is as follows. One folder, called HRV_anonymized_data, contains values of R-R intervals together with timestamps for each experiment participant. The data were anonymized, i.e. the day of the measurement was removed to prevent person identification. Files concerning patients are named treatment_X.csv, where X is the number of the person, while files related to the healthy controls are named control_Y.csv, where Y is the identification number of the person. Furthermore, for visualization purposes, an image of the raw R-R intervals for each participant is provided, named raw_RR_{control,treatment}_N.png, where N is the number of the person from the control/treatment group. The collected data are raw, i.e. from before the anomaly removal. The code enabling reproduction of the anomaly detection stage and removal of suspicious heartbeats is publicly available in the repository [Książek et al]. The structure of the files collecting R-R intervals is as follows:
Phone timestamp     RR-interval [ms]
12:43:26.538000     651
12:43:27.189000     632
12:43:27.821000     618
12:43:28.439000     621
12:43:29.060000     661
...                 ...
The first column contains the timestamp at which the interval between two consecutive R peaks was registered. The corresponding R-R interval is given in the second column of the file and is expressed in milliseconds.
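As a quick orientation (this is not the authors' pipeline, which lives in main.py in the repository [Książek et al]), loading one such file and computing the RMSSD could look like the sketch below; the column header matches the sample above, and the comma separator is an assumption to adjust if needed.

```python
import numpy as np
import pandas as pd

def rmssd(rr_ms):
    """Root Mean Square of Successive Differences, in ms."""
    diffs = np.diff(np.asarray(rr_ms, dtype=float))
    return np.sqrt(np.mean(diffs ** 2))

# Header as shown above; adjust sep= if the files are not comma-separated.
rr = pd.read_csv("HRV_anonymized_data/treatment_1.csv")["RR-interval [ms]"]
print(rmssd(rr))

# For the five sample intervals above (651, 632, 618, 621, 661), RMSSD ~= 23.3 ms.
```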
The second folder, called accelerometer_anonymized_data, contains values of accelerometer data collected at the same time as the R-R intervals. The naming convention is similar to that of the R-R interval data: treatment_X.csv and control_X.csv represent data from persons in the treatment and control groups, respectively, where X is the identification number of the selected participant. The numbers are exactly the same as for the R-R intervals. The structure of the files with accelerometer recordings is as follows:
Phone timestamp     X [mg]   Y [mg]   Z [mg]
13:00:17.196000     -961     -23      182
13:00:17.205000     -965     -21      181
13:00:17.215000     -966     -22      187
13:00:17.225000     -967     -26      193
13:00:17.235000     -965     -27      191
...                 ...      ...      ...
The first column contains a timestamp, while the next three columns contain the currently registered acceleration along the three axes, X, Y and Z, in milli-g units.
We also attached a file with the PANSS test scores (PANSS.csv) for all patients participating in the measurement. The structure of this file is as follows:
no_of_person   PANSS_P   PANSS_N   PANSS_G   PANSS_total
1              8         13        22        43
2              11        7         18        36
3              14        30        44        88
4              18        13        27        58
...            ...       ...       ...       ...
The first column contains the identification number of the patient, the three following columns contain the PANSS scores for positive, negative and general symptoms, respectively, and the last column contains the total PANSS score.
USAGE NOTES
All the files necessary to run the HRV and/or accelerometer data analysis are available in the GitHub repository [Książek et al]. HRV data loading, preprocessing (i.e. anomaly detection and removal), and the calculation of mean HRV values in terms of the RMSSD are performed in the main.py file. Pearson's correlation coefficients between HRV values and PANSS scores, as well as the statistical tests (Levene's and Mann-Whitney U tests) comparing the treatment and control groups, are also computed there. By default, a sensitivity analysis is performed, i.e. the full pipeline is run for different settings of the window size for which the HRV is calculated and for various time intervals between consecutive windows. Heatmaps of correlation coefficients and corresponding p-values can be prepared by running the utils_advanced_plots.py file after performing the sensitivity analysis. Furthermore, a detailed analysis for one selected set of hyperparameters may be prepared (by setting sensitivity_analysis = False), i.e. for 15-minute window sizes, 1-minute time intervals between consecutive windows, and no data interpolation. Also, patients taking quetiapine may be excluded from further calculations by setting exclude_quetiapine = True, because this medicine can have a strong impact on HRV [Hattori et al].
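For readers who want the shape of those statistics without digging into main.py, the sketch below reproduces the named tests with scipy on placeholder arrays (the values and distributions are stand-ins, not data from the dataset):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
hrv_treatment = rng.normal(25, 8, 30)    # placeholder per-participant mean RMSSD (ms)
hrv_control = rng.normal(40, 10, 30)     # placeholder control-group values
panss_total = rng.integers(30, 100, 30)  # placeholder PANSS total scores

r, p = stats.pearsonr(hrv_treatment, panss_total)             # HRV vs. PANSS
lev_stat, lev_p = stats.levene(hrv_treatment, hrv_control)    # variance equality
u_stat, u_p = stats.mannwhitneyu(hrv_treatment, hrv_control)  # group comparison
print(r, p, lev_p, u_p)
```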
The accelerometer data processing may be performed using the utils_accelerometer.py file. In this case, accelerometer recordings are downsampled to ensure the same timestamps as for the R-R intervals and, for each participant, the mobility coefficient is calculated. Then, a correlation between the mobility coefficients and HRV values is computed.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Original dataset

The original year-2019 dataset was downloaded from the World Bank Databank using the following approach on July 23, 2022:
Database: "World Development Indicators"
Country: 266 (all available)
Series: "CO2 emissions (kt)", "GDP (current US$)", "GNI, Atlas method (current US$)", and "Population, total"
Time: 1960, 1970, 1980, 1990, 2000, 2010, 2017, 2018, 2019, 2020, 2021
Layout: Custom -> Time: Column, Country: Row, Series: Column
Download options: Excel
Preprocessing
With LibreOffice:
- remove non-country entries (rows after the line for Zimbabwe);
- shorten column names for easier processing: Country Name -> Country, Country Code -> Code, "XXXX ... GNI ..." -> GNI_1990, etc. (note '_', not '-', for R).
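A scriptable equivalent of this manual cleanup, sketched in pandas under assumptions (the downloaded file name WDI_data.xlsx is hypothetical, and the exact layout of the exported column labels may differ):

```python
import re
import pandas as pd

df = pd.read_excel("WDI_data.xlsx")  # hypothetical file name

# Drop non-country aggregate rows: everything after the last country, Zimbabwe.
last = df.index[df["Country Name"] == "Zimbabwe"][-1]
df = df.loc[:last]

# Shorten column names, e.g. "1990 ... GNI ..." -> "GNI_1990"
# ('_' rather than '-' keeps the names valid in R).
def shorten(col):
    m = re.match(r"(\d{4}).*?(CO2|GDP|GNI|Population)", str(col))
    return f"{m.group(2)}_{m.group(1)}" if m else col

df = df.rename(columns={"Country Name": "Country", "Country Code": "Code"})
df.columns = [shorten(c) for c in df.columns]
```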
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Contact Information
If you would like further information about PeakAffectDS, to purchase a commercial license, or if you experience any issues downloading files, please contact us at peakaffectds@gmail.com.
Description
PeakAffectDS contains 663 files (total size: 1.84 GB), consisting of 612 physiology files and 51 perceptual rating files. The dataset contains recordings from 51 untrained research participants (39 female, 12 male), who had their body physiology recorded while watching movie clips validated to induce strong emotional reactions. Emotional conditions included calm, happy, sad, angry, fearful, and disgust, along with a neutral baseline condition. Four physiology channels were recorded with a Biopac MP36 system: two facial muscles with fEMG (zygomaticus major, corrugator supercilii) using Ag/AgCl electrodes, heart activity with ECG using a single-lead (Lead II) configuration, and respiration with a wearable strain-gauge belt. While viewing movie clips, participants indicated in real time when they experienced a "peak" emotional event: chills, tears, or the startle reflex. After each clip, participants further rated their felt emotional state using a forced-choice categorical response measure, along with their felt arousal and valence. All data are provided in plaintext (.csv) format.
PeakAffectDS was created in the Affective Data Science Lab.
Physiology files
Each participant has 12 .csv physiology files, consisting of 6 emotional conditions and 6 neutral baseline conditions. All physiology channels were recorded at 2000 Hz. A 50 Hz notch filter was then applied to the fEMG and ECG channels to remove mains hum. Each .csv file contains 6 columns, in order from left to right:
Perceptual files
There are 51 perceptual rating files, one for each participant. Each .csv file contains 4 columns, in order from left to right:
File naming convention
Each of the 612 physiology files has a unique filename. The filename consists of a 3-part numerical identifier (e.g., 09-02-03.csv). The first identifier refers to the participant's ID (09), while the remaining two identifiers refer to the stimulus presented for that recording (02-03.mp4); these identifiers define the stimulus characteristics:
Filename example: 09-02-03.csv
Filename example: 09-01-05.csv
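A minimal parsing sketch for this convention (the helper name is hypothetical):

```python
from pathlib import Path

def parse_name(fname):
    """Split a physiology filename into participant ID and stimulus file."""
    pid, a, b = Path(fname).stem.split("-")
    return {"participant": pid, "stimulus": f"{a}-{b}.mp4"}

print(parse_name("09-02-03.csv"))
# {'participant': '09', 'stimulus': '02-03.mp4'}
```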
Methods
A 1-way mixed design was used, with a within-subjects factor Emotion (6 levels: Calm, Happy, Sad, Angry, Fearful, Disgust) and a between-subjects factor Stimulus Set (3 levels). Trials were blocked by Affect Condition (Baseline, Emotional), with each participant presented 6 blocked trials: Baseline (neutral), then Emotional (Calm, ..., Disgust). This design reduced potential contamination from preceding emotional trials by ensuring that participants' physiology began close to a resting baseline for emotional conditions.
Emotion was presented in pseudorandom order using a carryover-balanced generalised Youden design, generated by the crossdes package in R. Eighteen emotional movie clips were used as stimuli, with three instances for each emotion category (6x3). Clips were then grouped into one of three Stimulus Sets, with participants assigned to a given Set using block randomisation. For example, participants assigned to Stimulus Set 1 (PID: 1, 4, 7, ...) all saw the same movie clips, but these clips differed from those in Sets 2 and 3. Six neutral baseline movie clips were used as stimuli, with all participants viewing the same neutral clips, their order also generated with a Youden design.
Stimulus duration varied, with clips lasting several minutes. Lengthy clips without repetition were used to help ensure that participants became engaged and experienced genuine, strong emotional responses. Participants were instructed to indicate immediately, using the keyboard, when they experienced a "peak" emotional event: chills, tears, or startle. Participants were permitted to indicate multiple events in a single trial, and identified the type of the events at the trial feedback stage, along with ratings of emotional category, arousal, and valence. The concept of peak physiological events was explained at the beginning of the experiment, but the three states were not described as being associated with any particular emotion or valence.
License information
PeakAffectDS is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, CC BY-NC-SA 4.0.
Citing PeakAffectDS
Greene, N., Livingstone, S. R., & Szymanski, L. (2022). PeakAffectDS [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6403363