Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This archive contains E3SM Land Model simulation results associated with the Journal of Advances in Modeling Earth Systems (JAMES) article titled "More Realistic Intermediate Depth Dry Firn Densification in the Energy Exascale Earth System Model (E3SM)," by Adam M. Schneider, Charles S. Zender, and Stephen F. Price. Also included in the archive are Python scripts used to analyze associated data and an offline, statistical firn model.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This submission contains several shapefiles used for a deterministic PFA, as well as a heat composite risk segment with union overlay, and training sites used for weights of evidence. More detailed metadata can be found in the specific file.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This submission includes maps of the spatial distribution of basaltic and felsic rocks in the Oregon Cascades. It also includes a final Play Fairway Analysis (PFA) model, with the heat and permeability composite risk segments (CRS) supplied separately. Metadata for each raster dataset can be found within the zip files, in the TIF images.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains 344 different digitized and tagged Tafel slope datasets from the CO2 reduction literature. We re-analyze this data with a Bayesian data analysis procedure that estimates a Tafel slope and yields distributional uncertainty information about its value. We are releasing this dataset along with our study to facilitate re-analyzing and refitting our data using different models and approaches.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
No description was included in this Dataset collected from the OSF
This submission includes a Na/K geothermometer probability greater than 200 deg C map, as well as two play fairway analysis (PFA) models. The probability map acts as a composite risk segment for the PFA models. The PFA models differ in their application of magnetotelluric conductors as composite risk segments. These PFA models map out the geothermal potential in the region of SE Great Basin, Utah.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Clinical characteristics of the datasets included in the analysis.
The data project includes a large-scale longitudinal analysis (2015-2020) of online hate speech on Twitter (N=847,978). To build the tweet database, we collected tweets using Twitter’s Application Programming Interface (API) (v2 full-archive search endpoint, using the Academic Research product track), which provides access to the historical archive of messages since Twitter was created in 2006. To download the tweets, we first defined the search filter by keyword and geographic zones using the Python programming language and the NLTK, TensorFlow, Keras and NumPy libraries. We established generic words directly related to the topic, taking into account linguistic agreement in Spanish (i.e., gender and number inflections) but without considering adjectives, for instance: migrant, migrants, immigrant, immigrants, refugee (both in masculine and feminine forms in Spanish), refugees (both in masculine and feminine forms in Spanish), asylum seeker, asylum seekers (the keywords are available as supplementary materials).
For the process of hate speech detection in tweets, we used as a basis a tool created and validated by Vrysis et al. (2021). For this research, the tool has been retrained with:
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
STUDY PURPOSE: The purpose of the study was to analyze recovery after spinal cord injury through various behavioral outcome tests. The study compared effect sizes sourced from the literature (literature-extracted data (LED)) to those computed from the corresponding publicly available raw data (individual animal data (IAD)). Random effect models and regression analyses were applied to evaluate predictors of neuro-conversion in LED versus IAD. Subgroup analyses were performed on animal sex, animal type, animal species, injury severity, injury segment and sample sizes. Publications with common injury models (contusive injuries) and standardized endpoints (open field assessments) were included in the meta-analyses. Studies that recorded open field assessments at 0-3 and 28-56 days post operation were included. This dataset includes the individual animal data (IAD) (part 2) that was collected for the study. The code to replicate our study can be found on GitHub (https://github.com/ucsf-ferguson-lab/climber_meta_analysis2024.git). This dataset corresponds with another dataset in ODC-SCI (10.34945/F5DG6D), which contains the literature-extracted data (LED) taken directly from the 7 published articles.
DATA COLLECTED: The individual animal data in this dataset was collected from raw data publicly available in ODC-SCI. This dataset includes merged data from 7 different publicly available datasets. Each study from the published articles includes contusion injuries with various severities and different locations, which are indicated in this dataset. Different mouse and rat species of both sexes are included in the dataset. Outcome scores at different days post operation from BBB, BMS, Grooming and Forelimb Open Field tests are also included. The behavioral outcome scores over days post operation were used to calculate effect sizes.
DATA USAGE NOTES:
https://www.icpsr.umich.edu/web/ICPSR/studies/4020/terms
The goal of the Arrestee Drug Abuse Monitoring (ADAM) Program is to determine the extent and correlates of illicit drug use in the population of booked arrestees in local areas. Data were collected in 2003 up to four separate times (quarterly) during the year in 39 metropolitan areas in the United States. The ADAM program adopted a new instrument in 2000 in adult booking facilities for male (Part 1) and female (Part 2) arrestees. The ADAM program in 2003 also continued the use of probability-based sampling for male arrestees in adult facilities, which was initiated in 2000. Therefore, the male adult sample includes weights, generated through post-sampling stratification of the data. For the adult male and female files, variables fell into one of eight categories: (1) demographic data on each arrestee, (2) ADAM facesheet (records-based) data, (3) data on disposition of the case, including accession to a verbal consent script, (4) calendar of admissions to substance abuse and mental health treatment programs, (5) data on alcohol and drug use, abuse, and dependence, (6) drug acquisition data covering the five most commonly used illicit drugs, (7) urine test results, and (8) for males, weights.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Result of 10-Fold cross-validation on augmented dataset.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
iEEG and EEG data from 5 centers are organized in our study, with a total of 100 subjects. We publish the datasets of 4 centers here due to data sharing issues.
Acquisitions include ECoG and SEEG. Each run specifies a different snapshot of EEG data from that specific subject's session. For seizure sessions, this means that each run is an EEG snapshot around a different seizure event.
For additional clinical metadata about each subject, refer to the clinical Excel table in the publication.
NIH, JHH, UMMC, and UMF agreed to share. Cleveland Clinic did not, so its data require an additional DUA.
All data, except for Cleveland Clinic's, were approved by their centers to be de-identified and shared. No data in this dataset contain PHI or other identifiers associated with patients. To access Cleveland Clinic data, please forward all requests to Amber Sours, SOURSA@ccf.org:
Amber Sours, MPH Research Supervisor | Epilepsy Center Cleveland Clinic | 9500 Euclid Ave. S3-399 | Cleveland, OH 44195 (216) 444-8638
You will need to sign a data use agreement (DUA).
For each subject, there was a raw EDF file, which was converted into the BrainVision format with mne_bids.
Each subject with SEEG implantation also has an Excel table, called electrode_layout.xlsx, which outlines where the clinicians marked each electrode anatomically. Note that there is no rigorous atlas applied, so the main points of interest are: WM, GM, VENTRICLE, CSF, and OUT, which represent white matter, gray matter, ventricle, cerebrospinal fluid and outside the brain. Channels labeled WM, VENTRICLE, CSF and OUT were removed from further analysis. These were labeled in the corresponding BIDS channels.tsv sidecar file as status=bad.
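As a rough illustration of how the status=bad labels can be used, the following sketch keeps only channels that are not marked bad in a channels.tsv sidecar. It assumes the standard BIDS column names `name` and `status`; the function name `good_channels` and the example channel names are hypothetical and not part of the dataset.

```python
import csv
import io


def good_channels(tsv_text):
    """Return names of channels whose status column is not 'bad'.

    Assumes a BIDS-style channels.tsv with 'name' and 'status' columns.
    Channels without a status value are treated as good (an assumption).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        status = (row.get("status") or "good").strip().lower()
        if status != "bad":
            kept.append(row["name"])
    return kept


# Hypothetical sidecar content; real files contain more columns and channels.
example = (
    "name\ttype\tstatus\n"
    "LA1\tSEEG\tgood\n"
    "LA2\tSEEG\tbad\n"
    "LA3\tSEEG\tgood\n"
)
print(good_channels(example))
```

For a real subject, the text of the sidecar file can be read with `open(path).read()` and passed to the same function.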
The dataset uploaded to openneuro.org does not contain the sourcedata since there was an extra anonymization step that occurred when fully converting to BIDS.
Derivatives include:
* fragility analysis
* frequency analysis
* graph metrics analysis
* figures
These can be computed by following the paper: Neural Fragility as an EEG Marker of the Seizure Onset Zone
Within each EDF file, there are event markers annotated by clinicians, which may inform you of specific clinical events occurring in time, or of when they saw seizure onset and offset (clinical and electrographic).
During a seizure event, event markers may specifically follow this time course:
* eeg onset, or clinical onset - the onset of a seizure that is either marked electrographically, or by clinical behavior. Note that the clinical onset may not always be present, since some seizures manifest without clinical behavioral changes.
* Marker/Mark On - these are usually annotations within some cases, where a health practitioner injects a chemical marker for use in ICTAL SPECT imaging after a seizure occurs. This is commonly done to see which portions of the brain are active metabolically.
* Marker/Mark Off - This is when the ICTAL SPECT stops imaging.
* eeg offset, or clinical offset - this is the offset of the seizure, as determined either electrographically, or by clinical symptoms.
Other events included may be beneficial for you to understand the time course of each seizure. Note that ICTAL SPECT occurs in all Cleveland Clinic data. Note that seizure markers are not consistent in their description naming, so one might encode some specific regular-expression rules to consistently capture seizure onset/offset markers across all datasets. In the case of UMMC data, all onset and offset markers were provided by the clinicians on an Excel sheet instead of via the EDF file, so we added the annotations manually to each EDF file.
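One possible shape for such regular-expression rules is sketched below. The patterns and the `classify_marker` helper are hypothetical: they cover only the onset/offset wording described above and should be extended after inspecting the actual annotation strings in each center's files.

```python
import re

# Hypothetical rules: an optional separator (space, underscore, punctuation)
# may sit between the prefix and the word "onset" or "offset".
_PREFIX = r"\b(?:eeg|clin(?:ical)?|sz|seizure)[\W_]*"
_RULES = [
    ("onset", re.compile(_PREFIX + r"onset\b", re.IGNORECASE)),
    ("offset", re.compile(_PREFIX + r"offset\b", re.IGNORECASE)),
]


def classify_marker(description):
    """Map a free-text annotation to 'onset', 'offset', or None."""
    for label, pattern in _RULES:
        if pattern.search(description):
            return label
    return None


for text in ["EEG onset", "clinical offset", "Sz_onset", "Mark On"]:
    print(text, "->", classify_marker(text))
```

Markers such as Mark On and Mark Off are deliberately left unclassified here, since they describe ICTAL SPECT imaging and not the seizure boundaries.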
For various datasets, there are seizures present within the dataset. Generally there is only one seizure per EDF file. When seizures are present, they are marked electrographically (and clinically if present) via standard approaches in the epilepsy clinical workflow.
Clinical onsets are just manifestations of the seizures with clinical syndromes. Sometimes the marker may not be present.
What is actually important in the evaluation of datasets is the clinical annotations of their localization hypotheses of the seizure onset zone.
These generally include:
* early onset: the earliest onset electrodes participating in the seizure that clinicians saw
* early/late spread (optional): the electrodes that showed epileptic spread activity after seizure onset. Not all seizures have spread contacts annotated.
For patients with a post-surgical MRI available, the segmentation process outlined above tells us which electrodes were within the surgically removed brain region.
Otherwise, clinicians give us their best estimate of which electrodes were resected/ablated based on their surgical notes.
For surgical patients whose postoperative medical records did not explicitly indicate specific resected or ablated contacts, manual visual inspection was performed to determine the approximate contacts that were located in later resected/ablated tissue. Postoperative T1 MRI scans were compared against post-SEEG implantation CT scans or CURRY coregistrations of preoperative MRI/post-SEEG CT scans. Contacts of interest in and around the area of the reported resection were selected individually and the corresponding slice was navigated to on the CT scan or CURRY coregistration. After identifying landmarks of that slice (e.g. skull shape, skull features, shape of prominent brain structures like the ventricles, central sulcus, superior temporal gyrus, etc.), the location of a given contact in relation to these landmarks, and the location of the slice along the axial plane, the corresponding slice in the postoperative MRI scan was navigated to. The resected tissue within the slice was then visually inspected and compared against the distinct landmarks identified in the CT scans. If brain tissue was not present in the corresponding location of the contact, then the contact was marked as resected/ablated. This process was repeated for each contact of interest.
Adam Li, Chester Huynh, Zachary Fitzgerald, Iahn Cajigas, Damian Brusko, Jonathan Jagid, Angel Claudio, Andres Kanner, Jennifer Hopp, Stephanie Chen, Jennifer Haagensen, Emily Johnson, William Anderson, Nathan Crone, Sara Inati, Kareem Zaghloul, Juan Bulacio, Jorge Gonzalez-Martinez, Sridevi V. Sarma. Neural Fragility as an EEG Marker of the Seizure Onset Zone. bioRxiv 862797; doi: https://doi.org/10.1101/862797
Appelhoff, S., Sanderson, M., Brooks, T., Vliet, M., Quentin, R., Holdgraf, C., Chaumon, M., Mikulan, E., Tavabi, K., Höchenberger, R., Welke, D., Brunner, C., Rockhill, A., Larson, E., Gramfort, A. and Jas, M. (2019). MNE-BIDS: Organizing electrophysiological data into the BIDS format and facilitating their analysis. Journal of Open Source Software 4: (1896). https://doi.org/10.21105/joss.01896
Holdgraf, C., Appelhoff, S., Bickel, S., Bouchard, K., D'Ambrosio, S., David, O., … Hermes, D. (2019). iEEG-BIDS, extending the Brain Imaging Data Structure specification to human intracranial electrophysiology. Scientific Data, 6, 102. https://doi.org/10.1038/s41597-019-0105-7
Pernet, C. R., Appelhoff, S., Gorgolewski, K. J., Flandin, G., Phillips, C., Delorme, A., Oostenveld, R. (2019). EEG-BIDS, an extension to the brain imaging data structure for electroencephalography. Scientific Data, 6, 103. https://doi.org/10.1038/s41597-019-0104-8
summary data from simulation model
These data are the average percent of real loci and the average number of spurious loci detected given the various parameter combinations tested by the model.
Flanagan_and_Jones_molecol_data.docx.xlsx
multiple populations comparison
This data contains Fst values for loci in replicate populations with the same QTLs, which were: chrom 0: 702, 978; chrom 1: 516, 341; chrom 2: 878, 76; chrom 3: 46, 153.
12May_same-qtls_Fsts.txt
populations
Header file for C++ simulation program
life_cycle
C++ program file for simulation model
chi_square
Header file containing functions to calculate chi-square p-values for C++ simulation model.
rand_nums
C++ header file containing functions to calculate and use random numbers for simulation model program. File is available in GitHub at https://github.com/spflanagan/gwsca_simulation_model/blob/master/simulation_model/simulation_model/rand_nums.h
Simulation_Data
All summary data from the simulation model (the average number of real and spur...
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Input scripts, datasets and outputs used for the GSA methods. Please read the README and workflow files within the ZIP file.
This data package is associated with the publication “Meta-metabolome ecology reveals that geochemistry and microbial functional potential are linked to organic matter development across seven rivers” submitted to Science of the Total Environment. This data package includes the data necessary to replicate the analyses presented within the manuscript to investigate dissolved organic matter (DOM) development across broad spatial distances and within divergent biomes. Specifically, we included the Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS) data, geochemistry data, annotated metagenomic data, and results from ecological null modeling analyses in this data package. Additionally, we included the scripts necessary to generate the figures from the manuscript. Complete metagenomic data associated with this data package can be found at the National Center for Biotechnology Information (NCBI) under Bioproject PRJNA946291.
This dataset consists of (1) four folders; (2) a file-level metadata (flmd) file; (3) a data dictionary (dd) file; (4) a factor sheet describing samples; and (5) a readme. The FTICR Data folder contains (1) the processed FTICR-MS data; (2) a transformation-weighted characteristics dendrogram generated from the FTICR-MS data; and (3) the script used to generate all FTICR-MS related figures. The Geochemical Data folder contains (1) the single geochemistry data file and (2) the R script responsible for generating associated figures. The Metagenomic Data folder contains (1) annotation information across different levels; (2) carbohydrate active enzyme (CAZyme) information from the dbCAN database (Yin et al., 2012); (3) phylogenetic tree data (FASTAs, alignments, and tree file); and (4) the scripts necessary to analyze all of these data and generate figures.
The Null Modeling Data folder contains (1) data generated during null modeling for each river and all rivers combined and (2) the R scripts necessary to process the data. All files are .csv, .pdf, .tsv, .tre, .faa, .afa, .tree, or .R.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides a list of surnames that are reliably Irish and that can be used for identifying textual references to Irish individuals in the London area and surrounding countryside within striking distance of the capital. This classification of the Irish necessarily includes the Irish-born and their descendants. The dataset has been validated for use on records up to the middle of the nineteenth century, and should only be used in cases in which a few mis-classifications of individuals would not undermine the results of the work, such as large-scale analyses. These data were created through an analysis of the 1841 Census of England and Wales, and validated against the Middlesex Criminal Registers (National Archives HO 26) and the Vagrant Lives Dataset (Crymble, Adam et al. (2014). Vagrant Lives: 14,789 Vagrants Processed by Middlesex County, 1777-1786. Zenodo. 10.5281/zenodo.13103). The sample was derived from the records of the Hundred of Ossulstone, which included much of rural and urban Middlesex, excluding the City of London and Westminster. The analysis was based upon a study of 278,949 adult males. Full details of the methodology for how this dataset was created can be found in the following article, and anyone intending to use this dataset for scholarly research is strongly encouraged to read it so that they understand the strengths and limits of this resource:
Adam Crymble, 'A Comparative Approach to Identifying the Irish in Long Eighteenth Century London', _Historical Methods: A Journal of Quantitative and Interdisciplinary History_, vol. 48, no. 3 (2015): 141-152.
The data here provided includes all 283 names listed in Appendix I of the above paper, but also an additional 209 spelling variations of those root surnames, for a total of 492 names.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Pseudo-code of the generalised dynamic programming forward and reversed passes. and indicate 'th row and 'th column of matrix.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the data for the LREC 2020 paper "Dataset for Temporal Analysis of English-French Cognates". If you use this resource, please cite the paper:
@inproceedings{frossard2020dataset,
title = "Dataset for Temporal Analysis of {E}nglish-{F}rench Cognates",
author = {Frossard, Esteban and Coustaty, Micka\"el and Doucet, Antoine and Jatowt, Adam and Hengchen, Simon},
booktitle = "Proceedings of the Eighth International Conference on Language Resources and Evaluation ({LREC}'20)",
year = "2020",
location = "Marseille"
}
This work has been supported by the European Union Horizon 2020 research and innovation programme under grants 825153 (Embeddia) and 770299 (NewsEye).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A DEM of the Tularosa Basin was divided into twelve zones, and a ZR ratio was calculated for each zone. This submission has a TIFF image of the zoning designations, along with a table of the respective ZR ratio calculations in the metadata.
The primary results are in the table below, and high ZR ratio values indicate relatively high strain rates.
Zone    ZR ratio
1       1.2852479
2       1.17442846
3       0.89700274
4       0.74546427
5       0.99841793
6       0.86434253
7       0.83016287
8       1.91696538
9       1.13691977
10      1.68062953
11      1.23044486
12      1.13160887
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
R scripts for data preparation and analysis