58 datasets found
  1. Data from: What explains rare and conspicuous colours in a snail? A test of...

    • data-staging.niaid.nih.gov
    • dataone.org
    • +3 more
    zip
    Updated Jul 29, 2016
    Cite
    Kerstin Johannesson; Roger K. Butlin (2016). What explains rare and conspicuous colours in a snail? A test of time-series data against models of drift, migration or selection [Dataset]. http://doi.org/10.5061/dryad.427p0
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 29, 2016
    Dataset provided by
    University of Gothenburg
    Authors
    Kerstin Johannesson; Roger K. Butlin
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    North Sea
    Description

    It is intriguing that conspicuous colour morphs of a prey species may be maintained at low frequencies alongside cryptic morphs. Negative frequency-dependent selection by predators using search images (‘apostatic selection’) is often suggested without rejecting alternative explanations. Using a maximum likelihood approach we fitted predictions from models of genetic drift, migration, constant selection, heterozygote advantage or negative frequency-dependent selection to time-series data of colour frequencies in isolated populations of a marine snail (Littorina saxatilis), re-established with perturbed colour morph frequencies and followed for >20 generations. Snails of conspicuous colours (white, red, banded) are naturally rare in the study area (usually <10%) but frequencies were manipulated to levels of ~50% (one colour per population) in 8 populations at the start of the experiment in 1992. In 2013, frequencies had declined to ~15–45%. Drift alone could not explain these changes. Migration could not be rejected in any population, but required rates much higher than those recorded. Directional selection was rejected in three populations in favour of balancing selection. Heterozygote advantage and negative frequency-dependent selection could not be distinguished statistically, although overall the results favoured the latter. Populations varied idiosyncratically as mild or variable colour selection (3–11%) interacted with demographic stochasticity, and the overall conclusion was that multiple mechanisms may contribute to maintaining the polymorphisms.
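    To make the "drift alone" null model concrete, here is a minimal sketch of pure genetic drift as binomial resampling of a colour-morph frequency each generation. It is an illustration only, not the authors' maximum likelihood procedure: the function name, the population size of 200, and the treatment of the morph as a simple resampled frequency are all assumptions made for the example.

```python
import random


def wright_fisher_drift(p0, pop_size, generations, seed=0):
    """Return the morph-frequency trajectory under pure drift.

    No selection and no migration: each generation is a binomial
    resampling of the previous generation's frequency.
    pop_size and seed are hypothetical illustration values.
    """
    rng = random.Random(seed)
    p = p0
    freqs = [p]
    for _ in range(generations):
        # Each individual carries the conspicuous morph with probability p.
        count = sum(1 for _ in range(pop_size) if rng.random() < p)
        p = count / pop_size
        freqs.append(p)
    return freqs


# Start at the manipulated ~50% frequency and follow 20 generations.
trajectory = wright_fisher_drift(p0=0.5, pop_size=200, generations=20)
print(trajectory[-1])
```

    Running many replicates with different seeds gives a distribution of final frequencies under drift alone, which is the kind of expectation an observed decline can be compared against.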

  2. Data from: Privacy Preserving Outlier Detection through Random Nonlinear...

    • catalog.data.gov
    • data.amerigeoss.org
    Updated Apr 10, 2025
    + more versions
    Cite
    Dashlink (2025). Privacy Preserving Outlier Detection through Random Nonlinear Data Distortion [Dataset]. https://catalog.data.gov/dataset/privacy-preserving-outlier-detection-through-random-nonlinear-data-distortion
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    Consider a scenario in which the data owner has some private/sensitive data and wants a data miner to access it for studying important patterns without revealing the sensitive information. Privacy preserving data mining aims to solve this problem by randomly transforming the data prior to its release to data miners. Previous work only considered the case of linear data perturbations — additive, multiplicative or a combination of both for studying the usefulness of the perturbed output. In this paper, we discuss nonlinear data distortion using potentially nonlinear random data transformation and show how it can be useful for privacy preserving anomaly detection from sensitive datasets. We develop bounds on the expected accuracy of the nonlinear distortion and also quantify privacy by using standard definitions. The highlight of this approach is to allow a user to control the amount of privacy by varying the degree of nonlinearity. We show how our general transformation can be used for anomaly detection in practice for two specific problem instances: a linear model and a popular nonlinear model using the sigmoid function. We also analyze the proposed nonlinear transformation in full generality and then show that for specific cases it is distance preserving. A main contribution of this paper is the discussion between the invertibility of a transformation and privacy preservation and the application of these techniques to outlier detection. Experiments conducted on real-life datasets demonstrate the effectiveness of the approach.

  3. Standing Balance Experiment with Long Duration Random Pulses Perturbation

    • data.niaid.nih.gov
    • oppositeofnorth.com
    • +2 more
    Updated Jul 22, 2024
    Cite
    Wang, Huawei; van den Bogert, Antonie (2024). Standing Balance Experiment with Long Duration Random Pulses Perturbation [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3631957
    Explore at:
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Cleveland State University
    Authors
    Wang, Huawei; van den Bogert, Antonie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Standing balance experiments and the resulting measured data sets are fundamental for identifying postural feedback controllers. Because generalized feedback controllers can only be identified from long-duration balance data (under random external perturbations), a standing balance experiment was conducted and long-duration motion data were recorded. The data set includes perturbation reaction data from eight subjects. Each subject performed four experimental trials: two quiet standing trials and two perturbed trials. Each trial lasted five minutes. In total, 80 minutes of quiet standing data and 80 minutes of perturbed standing data are included. The recorded information includes three-dimensional trajectories of thirty-two markers (27 on the subjects' trunk and legs and 5 on the treadmill frame), six-dimensional ground reaction forces, and nine electromyography (EMG) signals (on the subjects' right leg). In addition, joint angles and torques were calculated using a human body model and inverse dynamics. Basic statistical analysis of the data is also included.

    Measured raw data for each subject in each experimental trial includes three files:

    Mocapxxxx.txt: contains motion capture marker data, ground reaction force, and 76 analog channels. Data was recorded at 100 Hz sampling rate.

    Mocapxxxx_Motion Analysis_analog.txt: contains data from 76 analog channels recorded at a high sampling rate (1000 Hz). The analog data consist of the analog signal from the force sensor on the treadmill, the EMG signals from the Delsys EMG sensors, and the three-axis acceleration signals from the Delsys EMG sensors.

    Recordxxxx.txt: contains the sway motion data of treadmill and the three-axis acceleration data of two Xsens MTi-10 series sensors.

    Measured raw data also includes two files of the unloaded trial, which is used for the inertia compensation.

    Mocap0000.txt: contains motion capture marker data (5 markers on the treadmill frame) and ground reaction forces.

    Record0000.txt: contains the treadmill sway motion data and the acceleration data (three-axis) of two Xsens MTi-10 series sensors.

    Processed data of each subject in each experimental trial contains four files:

    Mocapxxxx.txt: contains the gap filled motion capture marker data and the inertia compensated ground reaction force data.

    Motionxxxx.txt: contains the calculated trajectories of three joints' (hip, knee, and ankle) angles, angular velocities, moments, and joint contact forces.

    Data_infoxxxx.txt: contains the quality of the recorded raw marker data (percentage and longest duration of missing marker data), and the percentage of removed inertia artifacts in the ground reaction forces.

    MotionAnalysis.fig: shows the mean and standard deviation of three joints' trajectories in four experimental trials.

    There are two more plots in the processed data folder, which show the joint motion/moment and the raw/compensated ground reaction forces of one example experimental trial (subject 07, trial 03).

    The processed data was generated using the code in the 'Processing_Code' folder. The code was written in Matlab and the main function is "Data_Processing_Main.m".

    More details of the standing balance experiment can be found in the document 'Standing_Balance_Experiment_with_Long_Duration_Random_Pulses_Perturbation.pdf'.

  4. Data from: A simple method to describe the COVID-19 trajectory and dynamics...

    • datadryad.org
    • zenodo.org
    zip
    Updated Sep 7, 2021
    Cite
    Adam Ćmiel; Bogdan Ćmiel (2021). A simple method to describe the COVID-19 trajectory and dynamics in any country based on Johnson cumulative density function fitting [Dataset]. http://doi.org/10.5061/dryad.f4qrfj6w9
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 7, 2021
    Dataset provided by
    Dryad
    Authors
    Adam Ćmiel; Bogdan Ćmiel
    Time period covered
    Dec 30, 2021
    Description

    covid-data.csv contains the data used in fitting the Johnson CDF to the cumulative epidemic curves in each of 80 countries, along with data on population, population density and GDP per capita for each country.

    simulation-20k, simulation-50k and simulation-100k each contain generated samples (cumulative epidemic curves) and the Johnson CDFs fitted to them, which were used in the sensitivity analysis of the Johnson CDF to data perturbation. The total number of infections (sample size) is 20 000, 50 000 and 100 000, respectively.
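    As a rough illustration of what fitting a Johnson CDF to a cumulative epidemic curve involves, the sketch below evaluates a Johnson SU cumulative distribution function and scales it by a total case count. The description does not say which Johnson family or parameterisation the authors used, so the SU form, the parameter names (gamma, delta, xi, lam) and the example values are assumptions.

```python
import math


def johnson_su_cdf(x, gamma, delta, xi, lam):
    """Johnson SU CDF: Phi(gamma + delta * asinh((x - xi) / lam))."""
    z = gamma + delta * math.asinh((x - xi) / lam)
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cumulative_cases(day, total, gamma, delta, xi, lam):
    """Modelled cumulative infections on a given day (hypothetical helper)."""
    return total * johnson_su_cdf(day, gamma, delta, xi, lam)


# Hypothetical parameters: 20 000 total infections, curve centred on day 60.
curve = [cumulative_cases(d, 20000, 0.0, 1.5, 60.0, 20.0) for d in range(0, 121, 10)]
print(curve)
```

    In a real fit the four shape parameters would be estimated from the observed cumulative counts, for example by least squares; that step is omitted here.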

  5. Gridded Weather Generator Perturbations of Historical Detrended and...

    • data.ca.gov
    • data.cnra.ca.gov
    • +1 more
    csv, jpeg, netcdf +2
    Updated May 14, 2025
    Cite
    California Department of Water Resources (2025). Gridded Weather Generator Perturbations of Historical Detrended and Stochastically Generated Temperature and Precipitation for the State of CA and HUC8s [Dataset]. https://data.ca.gov/dataset/gridded-weather-generator-perturbations-of-historical-detrended-and-stochastically-generated-te
    Explore at:
    Available download formats: txt, xlsx, jpeg, csv, netcdf
    Dataset updated
    May 14, 2025
    Dataset authored and provided by
    California Department of Water Resources (http://www.water.ca.gov/)
    Description

    The Weather Generator Gridded Data consists of two products:

    [1] statistically perturbed gridded 100-year historic daily weather data including precipitation [in mm], and detrended maximum and minimum temperature in degrees Celsius, and

    [2] stochastically generated and statistically perturbed gridded 1000-year daily weather data including precipitation [in mm], maximum temperature [in degrees Celsius], and minimum temperature in degrees Celsius.

    The base climate of this dataset is a combination of historically observed gridded data including Livneh Unsplit 1915-2018 (Pierce et al. 2021), Livneh 1915-2015 (Livneh et al. 2013) and PRISM 2016-2018 (PRISM Climate Group, 2014). Daily precipitation is from Livneh Unsplit 1915-2018; daily temperature is from Livneh 2013 spanning 1915-2015 and was extended to 2018 with daily 4 km PRISM data rescaled to the Livneh grid resolution (1/16 deg). The Livneh temperature was bias corrected by month to the corresponding monthly PRISM climate over the same period. Baseline temperature was then detrended by month over the entire time series based on the average monthly temperature from 1991-2020. Statistical perturbations and stochastic generation of the time series were performed by the Weather Generator (Najibi et al. 2024a and Najibi et al. 2024b).

    The repository consists of 30 climate perturbation scenarios that range from -25% to +25% change in mean precipitation and from 0 to +5 degrees Celsius change in mean temperature. Changes in thermodynamics represent scaling of precipitation during extreme events by a scaling factor per degree Celsius increase in mean temperature; the scenarios primarily use 7%/degree-Celsius, with 14%/degree-Celsius used for sensitivity perturbations. Further insight into thermodynamic scaling can be found in the full report linked below or in Najibi et al. 2024a and Najibi et al. 2024b.
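    The thermodynamic scaling rates above can be turned into a multiplier for extreme-event precipitation. The sketch below is only an illustration: the description does not state whether the Weather Generator compounds the rate per degree or applies it linearly, so both variants are shown, and the function name and arguments are hypothetical.

```python
def extreme_precip_scale(delta_t_c, rate_per_deg=0.07, compound=True):
    """Multiplier applied to extreme-event precipitation.

    delta_t_c    -- change in mean temperature in degrees Celsius
    rate_per_deg -- fractional scaling per degree (0.07 or 0.14 in the text)
    compound     -- assumption: compound per degree if True, linear if False
    """
    if compound:
        return (1.0 + rate_per_deg) ** delta_t_c
    return 1.0 + rate_per_deg * delta_t_c


# Multipliers for 0 to +5 degrees Celsius at the primary 7%/degree rate.
for dt in range(0, 6):
    print(dt, round(extreme_precip_scale(dt), 3))
```

    The two variants agree at 0 and 1 degree and diverge as the temperature change grows, so the choice matters most for the warmest scenarios.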

    The data presented here was created by the Weather Generator, which was developed by Dr. Scott Steinschneider and Dr. Nasser Najibi (Cornell University). If a separate weather generator product is desired apart from this gridded climate dataset, the weather generator code can be adapted to suit the specific needs of the user. The weather generator code and supporting information can be found here: https://github.com/nassernajibi/WGEN-v2.0/tree/main. The full report for the model and performance can be found here: https://water.ca.gov/-/media/DWR-Website/Web-Pages/Programs/All-Programs/Climate-Change-Program/Resources-for-Water-Managers/Files/WGENCalifornia_Final_Report_final_20230808.pdf

  6. Privacy Preservation through Random Nonlinear Distortion - Dataset - NASA...

    • data.nasa.gov
    Updated Mar 31, 2025
    + more versions
    Cite
    nasa.gov (2025). Privacy Preservation through Random Nonlinear Distortion - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/privacy-preservation-through-random-nonlinear-distortion
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Consider a scenario in which the data owner has some private or sensitive data and wants a data miner to access them for studying important patterns without revealing the sensitive information. Privacy-preserving data mining aims to solve this problem by randomly transforming the data prior to their release to the data miners. Previous works only considered the case of linear data perturbations - additive, multiplicative, or a combination of both - for studying the usefulness of the perturbed output. In this paper, we discuss nonlinear data distortion using potentially nonlinear random data transformation and show how it can be useful for privacy-preserving anomaly detection from sensitive data sets. We develop bounds on the expected accuracy of the nonlinear distortion and also quantify privacy by using standard definitions. The highlight of this approach is to allow a user to control the amount of privacy by varying the degree of nonlinearity. We show how our general transformation can be used for anomaly detection in practice for two specific problem instances: a linear model and a popular nonlinear model using the sigmoid function. We also analyze the proposed nonlinear transformation in full generality and then show that, for specific cases, it is distance preserving. A main contribution of this paper is the discussion between the invertibility of a transformation and privacy preservation and the application of these techniques to outlier detection. The experiments conducted on real-life data sets demonstrate the effectiveness of the approach.

  7. Data from: Exploration of the Two-Electron Excitation Space with Data-Driven...

    • acs.figshare.com
    txt
    Updated Feb 29, 2024
    Cite
    P. D. Varuna S. Pathirage; Justin T. Phillips; Konstantinos D. Vogiatzis (2024). Exploration of the Two-Electron Excitation Space with Data-Driven Coupled Cluster [Dataset]. http://doi.org/10.1021/acs.jpca.3c06600.s002
    Explore at:
    Available download formats: txt
    Dataset updated
    Feb 29, 2024
    Dataset provided by
    ACS Publications
    Authors
    P. D. Varuna S. Pathirage; Justin T. Phillips; Konstantinos D. Vogiatzis
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Computational cost limits the applicability of post-Hartree–Fock methods such as coupled-cluster on larger molecular systems. The data-driven coupled-cluster (DDCC) method applies machine learning to predict the coupled-cluster two-electron amplitudes (t2) using data from second-order perturbation theory (MP2). One major limitation of the DDCC models is the size of training sets that increases exponentially with the system size. Effective sampling of the amplitude space can resolve this issue. Five different amplitude selection techniques that reduce the amount of data used for training were evaluated, an approach that also prevents model overfitting and increases the portability of data-driven coupled-cluster singles and doubles to more complex molecules or larger basis sets. In combination with a localized orbital formalism to predict the CCSD t2 amplitudes, we have achieved a 10-fold error reduction for energy calculations.

  8. Data from: Temperature correction for cylindrical cavity perturbation...

    • research-data.cardiff.ac.uk
    zip
    Updated Sep 18, 2024
    Cite
    Jerome Cuenca; Daniel Slocombe; Adrian Porch (2024). Temperature correction for cylindrical cavity perturbation measurements [Dataset]. http://doi.org/10.17035/d.2017.0030964432
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 18, 2024
    Dataset provided by
    Cardiff University
    Authors
    Jerome Cuenca; Daniel Slocombe; Adrian Porch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset demonstrates a method for temperature correcting microwave cavity perturbation measurements by monitoring two modes: one which is perturbed by the sample and one which is not (referred to as a nodal mode). The nodal modes used (TM310 and TE311 for an axial sample in a cylindrical cavity) are subject only to sample-independent influences. To demonstrate this technique, the bulk permittivity of a PTFE rod has been measured under varying temperature conditions. The results show that without correction, the measured temperature-dependent dielectric constant has large variations associated with the stepped and linear temperature ramping procedures. The corrected response mitigates systematic errors in the real part. However, the correction of the imaginary part requires careful consideration of the mode coupling strength. This work demonstrates the importance of temperature correction in dynamic cavity perturbation experiments. Research results based upon these data are published at http://doi.org/10.1109/TMTT.2017.2652462 and http://dx.doi.org/10.1109/TMTT.2017.2751550

  9. Privacy Preservation through Random Nonlinear Distortion

    • catalog.data.gov
    • s.cnmilf.com
    Updated Apr 9, 2025
    + more versions
    Cite
    Dashlink (2025). Privacy Preservation through Random Nonlinear Distortion [Dataset]. https://catalog.data.gov/dataset/privacy-preservation-through-random-nonlinear-distortion
    Explore at:
    Dataset updated
    Apr 9, 2025
    Dataset provided by
    Dashlink
    Description

    Consider a scenario in which the data owner has some private or sensitive data and wants a data miner to access them for studying important patterns without revealing the sensitive information. Privacy-preserving data mining aims to solve this problem by randomly transforming the data prior to their release to the data miners. Previous works only considered the case of linear data perturbations - additive, multiplicative, or a combination of both - for studying the usefulness of the perturbed output. In this paper, we discuss nonlinear data distortion using potentially nonlinear random data transformation and show how it can be useful for privacy-preserving anomaly detection from sensitive data sets. We develop bounds on the expected accuracy of the nonlinear distortion and also quantify privacy by using standard definitions. The highlight of this approach is to allow a user to control the amount of privacy by varying the degree of nonlinearity. We show how our general transformation can be used for anomaly detection in practice for two specific problem instances: a linear model and a popular nonlinear model using the sigmoid function. We also analyze the proposed nonlinear transformation in full generality and then show that, for specific cases, it is distance preserving. A main contribution of this paper is the discussion between the invertibility of a transformation and privacy preservation and the application of these techniques to outlier detection. The experiments conducted on real-life data sets demonstrate the effectiveness of the approach.

  10. Research data supporting "Complex analysis of divergent perturbation theory...

    • ora.ox.ac.uk
    Updated Jan 1, 2022
    Cite
    Sun, Y; Burton, H G A (2022). Research data supporting "Complex analysis of divergent perturbation theory at finite temperature" [Dataset]. http://doi.org/10.5287/bodleian:zr4eaB46w
    Explore at:
    Available download formats: (3150), (601), (6090), (4594920), (7018), (1818095), (2521), (3446191), (3446193), (120121), (34126655), (1818106)
    Dataset updated
    Jan 1, 2022
    Dataset provided by
    University of Oxford
    Authors
    Sun, Y; Burton, H G A
    License

    https://ora.ox.ac.uk/terms_of_use

    Description

    All data are generated using the Mathematica notebook "SupportingNotebook.nb" within the repository, which also provides relevant metadata. This notebook allows all data to be numerically reproduced.

    Numerical data are exported in human-readable form as plain text files (.txt) that are labelled according to the relevant figure in the paper.

  11. Microwave cavity perturbation of nitrogen doped nano-crystalline diamond...

    • research-data.cardiff.ac.uk
    zip
    Updated Sep 18, 2024
    Cite
    Jerome Cuenca; Oliver Williams; Adrian Porch (2024). Microwave cavity perturbation of nitrogen doped nano-crystalline diamond films - data [Dataset]. http://doi.org/10.17035/d.2018.0065733586
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 18, 2024
    Dataset provided by
    Cardiff University
    Authors
    Jerome Cuenca; Oliver Williams; Adrian Porch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the calculated conductivity of nitrogen-incorporated nano-crystalline diamond films, obtained using the microwave cavity perturbation method. It also includes data from Raman spectroscopy, X-ray photoelectron spectroscopy and electron energy loss spectroscopy to demonstrate variations in sp2 carbon, which are related to the conductivity of the films. Research results based upon these data are published at https://doi.org/10.1016/j.carbon.2018.12.025

  12. Data from: Evidence for a time-invariant phase variable in human ankle...

    • datadryad.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated Jan 28, 2015
    Cite
    Robert D. Gregg; Elliott J. Rouse; Levi J. Hargrove; Jonathon W. Sensinger (2015). Evidence for a time-invariant phase variable in human ankle control [Dataset]. http://doi.org/10.5061/dryad.rm505
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 28, 2015
    Dataset provided by
    Dryad
    Authors
    Robert D. Gregg; Elliott J. Rouse; Levi J. Hargrove; Jonathon W. Sensinger
    Time period covered
    Jan 27, 2014
    Description

    Data from: Evidence for a Time-Invariant Phase Variable in Human Ankle Control

    Readme file for data used in study: "Evidence for a Time-Invariant Phase Variable in Human Ankle Control", R D Gregg, E J Rouse, L J Hargrove and J W Sensinger, PLOS ONE

    For questions regarding the data, please contact the authors: Robert Gregg (rgregg@utdallas.edu) or Elliott Rouse (erouse@media.mit.edu)

    data.zip

  13. Data from: Asymptotic Behavior of Adversarial Training Estimator under...

    • datasetcatalog.nlm.nih.gov
    • tandf.figshare.com
    Updated Jun 6, 2025
    Cite
    Xie, Yiling; Huo, Xiaoming (2025). Asymptotic Behavior of Adversarial Training Estimator under ℓ∞-Perturbation [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002031630
    Explore at:
    Dataset updated
    Jun 6, 2025
    Authors
    Xie, Yiling; Huo, Xiaoming
    Description

    Adversarial training has been proposed to protect machine learning models against adversarial attacks. This article focuses on adversarial training under l∞-perturbation, which has recently attracted much research attention. The asymptotic behavior of the adversarial training estimator is investigated in the generalized linear model. The results imply that the asymptotic distribution of the adversarial training estimator under l∞-perturbation could put a positive probability mass at 0 when the true parameter is 0, providing a theoretical guarantee of the associated sparsity-recovery ability. Alternatively, a two-step procedure is proposed—adaptive adversarial training, which could further improve the performance of adversarial training under l∞-perturbation. Specifically, the proposed procedure could achieve asymptotic variable-selection consistency and unbiasedness. Numerical experiments are conducted to show the sparsity-recovery ability of adversarial training under l∞-perturbation and to compare the empirical performance between classic adversarial training and adaptive adversarial training. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
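    The link between ℓ∞-perturbation and sparsity can be seen in the simplest case. The sketch below assumes plain linear regression with squared loss (the article works in the more general setting of generalized linear models), where the worst-case loss over an ℓ∞ ball of radius eps has the closed form (|y - x·β| + eps·‖β‖₁)², so the ℓ1 norm of β enters the objective. The function names and the numeric values are hypothetical; the brute-force check enumerates the corners of the perturbation box, where the maximum of a convex function is attained.

```python
from itertools import product


def worst_case_sq_loss_closed_form(x, y, beta, eps):
    """(|y - x.beta| + eps * ||beta||_1) ** 2 for squared loss."""
    residual = y - sum(xi * bi for xi, bi in zip(x, beta))
    l1_norm = sum(abs(b) for b in beta)
    return (abs(residual) + eps * l1_norm) ** 2


def worst_case_sq_loss_bruteforce(x, y, beta, eps):
    """Maximise the squared loss over the corners of the l-infinity ball."""
    best = 0.0
    for signs in product((-1.0, 1.0), repeat=len(x)):
        pred = sum((xi + s * eps) * bi for xi, s, bi in zip(x, signs, beta))
        best = max(best, (y - pred) ** 2)
    return best


x, y, beta, eps = [1.0, -2.0, 0.5], 0.3, [0.4, 0.0, -1.2], 0.1
print(worst_case_sq_loss_closed_form(x, y, beta, eps))
print(worst_case_sq_loss_bruteforce(x, y, beta, eps))
```

    Both functions return the same value up to floating-point rounding, which is a quick way to check the closed form on small examples.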

  14. Numerically Perturbed Structural Connectomes from 100 individuals in the NKI...

    • portal.conp.ca
    • data.niaid.nih.gov
    Updated Feb 9, 2021
    Cite
    Kiar, Gregory (2021). Numerically Perturbed Structural Connectomes from 100 individuals in the NKI Rockland Dataset [Dataset]. https://portal.conp.ca/dataset?id=projects/Numerically_Perturbed_Structural_Connectomes_from_100_individuals_in_the_NKI_Rockland_Dataset
    Explore at:
    Dataset updated
    Feb 9, 2021
    Dataset authored and provided by
    Kiar, Gregory
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the derived connectomes, discriminability scores, and classification performance for structural connectomes estimated from a subset of the Nathan Kline Institute Rockland Sample dataset, and is associated with an upcoming manuscript entitled: Numerical Instabilities in Analytical Pipelines Compromise the Reliability of Network Neuroscience. The associated code for this project is publicly available at: https://github.com/gkpapers/2020ImpactOfInstability. For any questions, please contact Gregory Kiar (gkiar07@gmail.com) or Tristan Glatard (tristan.glatard@concordia.ca).

    Below is a table of contents describing the contents of this dataset, which is followed by an excerpt from the manuscript pertaining to the contained data.

    • impactofinstability_connect_dset25x2x2x20_inputs.h5 : Connectomes derived from 25 subjects, 2 sessions, 2 subsamples, and 20 MCA simulations with input perturbations.
    • impactofinstability_connect_dset25x2x2x20_pipeline.h5 : Connectomes derived from 25 subjects, 2 sessions, 2 subsamples, and 20 MCA simulations with pipeline perturbations.
    • impactofinstability_discrim_dset25x2x2x20_both.csv : Discriminability scores for each grouping of the 25x2x2x20 dataset.
    • impactofinstability_connect+feature_dset100x1x1x20_both.h5 : Connectomes and features derived from 100 subjects, 1 session, 1 subsample, and 20 MCA simulations with both perturbation types.
    • impactofinstability_classif_dset100x1x1x20_both.h5 : Classification performance results for the BMI classification task on the 100x1x1x20 dataset.

    Dataset
    The Nathan Kline Institute Rockland Sample (NKI-RS) dataset [1] contains high-fidelity imaging and phenotypic data from over 1,000 individuals spread across the lifespan. A subset of this dataset was chosen for each experiment to both match sample sizes presented in the original analyses and to minimize the computational burden of performing MCA. The selected subset comprises 100 individuals ranging in age from 6 – 79 with a mean of 36.8 (original: 6 – 81, mean 37.8), 60% female (original: 60%), with 52% having a BMI over 25 (original: 54%).

    Each selected individual had at least a single session of both structural T1-weighted (MPRAGE) and diffusion-weighted (DWI) MR imaging data. DWI data was acquired with 137 diffusion directions; more information regarding the acquisition of this dataset can be found in the NKI-RS data release [1].

    In addition to the 100 sessions mentioned above, 25 individuals had a second session to be used in a test-retest analysis. Two additional copies of the data for these individuals were generated, including only the odd or even diffusion directions (64 + 9 B0 volumes = 73 in either case). This allows an extra level of stability evaluation to be performed between the levels of MCA and session-level variation.

    In total, the dataset is composed of 100 diffusion-downsampled sessions of data originating from 50 acquisitions and 25 individuals for in depth stability analysis, and an additional 100 sessions of full-resolution data from 100 individuals for subsequent analyses.
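    The counts quoted above can be checked with a few lines of arithmetic. The sketch assumes that the 137 acquired volumes consist of 128 diffusion-weighted directions plus 9 B0 volumes, which is inferred from "64 + 9 B0 volumes = 73" rather than stated outright; the function and variable names are hypothetical.

```python
def split_even_odd(n_directions):
    """Split direction indices into even-indexed and odd-indexed subsets."""
    indices = list(range(n_directions))
    return indices[0::2], indices[1::2]


N_B0 = 9
N_WEIGHTED = 137 - N_B0  # assumption: 137 volumes include the 9 B0 volumes

even_idx, odd_idx = split_even_odd(N_WEIGHTED)
volumes_per_subsample = len(even_idx) + N_B0

n_acquisitions = 25 * 2                       # 25 individuals x 2 sessions
n_downsampled_sessions = n_acquisitions * 2   # x 2 subsamples (odd / even)

print(len(even_idx), len(odd_idx), volumes_per_subsample)
print(n_acquisitions, n_downsampled_sessions)
```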

    Processing
    The dataset was preprocessed using a standard FSL [2] workflow consisting of eddy-current correction and alignment. The MNI152 atlas was aligned to each session of data, and the resulting transformation was applied to the DKT parcellation [3]. Downsampling the diffusion data took place after preprocessing was performed on full-resolution sessions, ensuring that an additional confound was not introduced in this process when comparing between downsampled sessions. The preprocessing described here was performed once without MCA, and thus is not being evaluated.

    Structural connectomes were generated from preprocessed data using two canonical pipelines from Dipy [4]: deterministic and probabilistic. In the deterministic pipeline, a constant solid angle model was used to estimate tensors at each voxel and streamlines were then generated using the EuDX algorithm [5]. In the probabilistic pipeline, a constrained spherical deconvolution model was fit at each voxel and streamlines were generated by iteratively sampling the resulting fiber orientation distributions. In both cases tracking occurred with 8 seeds per 3D voxel and edges were added to the graph based on the location of terminal nodes with weight determined by fiber count.

    Perturbations
    All connectomes were generated with one reference execution where no perturbation was introduced in the processing. For all other executions, all floating point operations were instrumented with Monte Carlo Arithmetic (MCA) [6] through Verificarlo [7]. MCA simulates the distribution of errors implicit to all instrumented floating point operations (flop).

    MCA can be introduced in two places for each flop: before or after evaluation. Performing MCA on the inputs of an operation limits its precision, while performing MCA on the output of an operation highlights round-off errors that may be introduced. The former is referred to as Precision Bounding (PB) and the latter is called Random Rounding (RR).
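    The difference between the two modes can be illustrated with a toy perturbation function. This sketch assumes the commonly stated MCA form, in which a value x is replaced by x + 2^(e - t) * xi, where e is the binary exponent of x, t is the virtual precision, and xi is drawn uniformly from (-1/2, 1/2). The names `inexact`, `rr_add`, and `pb_add` are illustrative, and this is not how Verificarlo is implemented.

```python
import math
import random


def inexact(x, t=53, rng=random):
    """Perturb x with uniform noise at virtual precision t (in bits)."""
    if x == 0.0 or not math.isfinite(x):
        return x
    e = math.frexp(x)[1]          # x = m * 2**e with 0.5 <= |m| < 1
    xi = rng.uniform(-0.5, 0.5)
    return x + math.ldexp(xi, e - t)  # x + xi * 2**(e - t)


def rr_add(a, b, t=53, rng=random):
    """Random Rounding: perturb the output of the operation."""
    return inexact(a + b, t, rng)


def pb_add(a, b, t=53, rng=random):
    """Precision Bounding: perturb the inputs of the operation."""
    return inexact(a, t, rng) + inexact(b, t, rng)


rng = random.Random(0)
# t=24 is used here so the noise is visible in float64 arithmetic.
samples = [rr_add(0.1, 0.2, t=24, rng=rng) for _ in range(10)]
```

    Repeating the same operation yields a spread of results around 0.3, and it is this spread that is studied when a whole pipeline is executed many times.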

    Using MCA, the execution of a pipeline may be performed many times to produce a distribution of results. Studying the distribution of these results can then lead to insights on the stability of the instrumented tools or functions. To this end, a complete software stack was instrumented with MCA and is made available on GitHub through https://github.com/gkiar/fuzzy.

    Both the RR and PB variants of MCA were used independently for all experiments. As was presented in [8], both the degree of instrumentation (i.e. number of affected libraries) and the perturbation mode have an effect on the distribution of observed results. For this work, the RR-MCA was applied across the bulk of the relevant libraries and is referred to as Pipeline Perturbation. In this case the bulk of numerical operations were affected by MCA.

    Conversely, the case in which PB-MCA was applied across the operations in a small subset of libraries is here referred to as Input Perturbation. In this case, the inputs to operations within the instrumented libraries (namely, Python and Cython) were perturbed, resulting in less frequent, data-centric perturbations. Alongside the stated theoretical differences, Input Perturbation is considerably less computationally expensive than Pipeline Perturbation.

    All perturbations targeted the least-significant bit for all data (t=24 and t=53 in float32 and float64, respectively [7]). Simulations were performed between 10 and 20 times for each pipeline execution, depending on the experiment. A detailed motivation for the number of simulations can be found in [9].

    Evaluation
    The magnitude and importance of instabilities in pipelines can be considered at a number of analytical levels, namely: the induced variability of derivatives directly, the resulting downstream impact on summary statistics or features, or the ultimate change in analyses or findings. We explore the nature and severity of instabilities through each of these lenses. Unless otherwise stated, all p-values were computed using Wilcoxon signed-rank tests.

    Direct Evaluation of the Graphs
    The differences between simulated graphs were measured directly through both a direct variance quantification and a comparison to other sources of variance such as individual- and session-level differences.

    Quantification of Variability – Graphs, in the form of adjacency matrices, were compared to one another using three metrics: normalized percent deviation, Pearson correlation, and edgewise significant digits. The normalized percent deviation measure, defined in [8], scales the norm of the difference between a simulated graph and the reference execution (that without intentional perturbation) with respect to the norm of the reference graph. The purpose of this comparison is to provide insight on the scale of differences in observed graphs relative to the original signal intensity. A Pearson correlation coefficient was computed in complement to normalized percent deviation to identify the consistency of structure and not just intensity between observed graphs. Finally, the estimated number of significant digits for each edge in the graph was computed. The upper bound on significant digits is 15.7 for 64-bit floating point data.
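    The three metrics can be sketched in plain Python as follows. The exact definitions are given in [8] and are not reproduced on this page, so the choices here are assumptions: the Frobenius norm for the deviation (returned as a fraction; multiply by 100 for a percentage), and the estimate -log10(|sigma / mu|) for significant digits, capped at the 15.7 upper bound quoted above.

```python
import math
from statistics import mean, stdev


def flatten(graph):
    return [w for row in graph for w in row]


def normalized_deviation(sim, ref):
    """Norm of (sim - ref) relative to the norm of ref."""
    s, r = flatten(sim), flatten(ref)
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(s, r)))
    return diff / math.sqrt(sum(b * b for b in r))


def pearson(g1, g2):
    """Pearson correlation between the edge weights of two graphs."""
    x, y = flatten(g1), flatten(g2)
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)


def significant_digits(samples, max_digits=15.7):
    """Estimated significant digits of one edge across simulations."""
    mu, sigma = mean(samples), stdev(samples)
    if sigma == 0:
        return max_digits  # no observed variation
    if mu == 0:
        return 0.0
    return max(0.0, min(max_digits, -math.log10(abs(sigma / mu))))


ref = [[0, 2], [2, 0]]
sim = [[0, 4], [4, 0]]
print(normalized_deviation(sim, ref), pearson(sim, ref))  # 1.0 1.0
```

    In the usage example the simulated graph has the same structure as the reference but twice the intensity, so the correlation is perfect while the deviation is 100%. This is why the two measures are reported together.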

    The percent deviation, correlation, and number of significant digits were each calculated within a single session of data, thereby removing any subject- and session-effects and providing a direct measure of the tool-introduced variability across perturbations. A distribution was formed by aggregating these individual results.

    Class-based Variability Evaluation – To gain a concrete understanding of the significance of observed variations we explore the separability of our results with respect to understood sources of variability, such as subject-, session-, and pipeline-level effects. This can be probed through Discriminability [10], a technique similar to ICC which relies on the mean of a ranked distribution of distances between observations belonging to a defined set of classes.

    Discriminability can then be interpreted as the probability that an observation belonging to a given class will be more similar to other observations within that class than observations of a different class. It is a measure of reproducibility, and is discussed in detail in [10].
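    A simplified sketch of this interpretation is given below. It is not the reference implementation from [10]: the function name, the Euclidean distance default, and the handling of ties (counted as half) are assumptions for illustration.

```python
import math


def discriminability(observations, labels, dist=math.dist):
    """Average fraction of between-class distances that exceed each
    within-class distance, taken over all within-class pairs."""
    n = len(observations)
    scores = []
    for i in range(n):
        between = [dist(observations[i], observations[k])
                   for k in range(n) if labels[k] != labels[i]]
        if not between:
            continue
        for j in range(n):
            if i == j or labels[i] != labels[j]:
                continue
            d_within = dist(observations[i], observations[j])
            wins = sum(1.0 if d > d_within else 0.5 if d == d_within else 0.0
                       for d in between)
            scores.append(wins / len(between))
    if not scores:
        raise ValueError("need at least two classes and one repeated class")
    return sum(scores) / len(scores)


# Two well-separated hypothetical classes give a score of 1.0.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(discriminability(points, ["a", "a", "b", "b"]))  # 1.0
```

    The class labels can be subjects, sessions, or pipelines, which is what allows the same statistic to be reused across the sources of variability listed above.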

    This definition allows for the exploration of deviations across arbitrarily defined classes which in practice can be any of those listed above. We combine this statistic with permutation testing to test hypotheses on whether

  15. h

    Data from: Massive Transcriptional Perturbation in Subgroups of Diffuse...

    • health-atlas.de
    Updated Jul 13, 2020
    Cite
    Maciej Rosolowski (2020). Massive Transcriptional Perturbation in Subgroups of Diffuse Large B-cell Lymphomas [Dataset]. https://www.health-atlas.de/data_files/20?version=1
    Dataset updated
    Jul 13, 2020
    Authors
    Maciej Rosolowski
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Based on the assumption that molecular mechanisms involved in cancerogenesis are characterized by groups of coordinately expressed genes, we developed and validated a novel method for analyzing transcriptional data called Correlated Gene Set Analysis (CGSA). Using 50 extracted gene sets we identified three different profiles of tumors in a cohort of 364 Diffuse large B-cell (DLBCL) and related mature aggressive B-cell lymphomas other than Burkitt lymphoma. The first profile had a high level of expression of genes related to proliferation, whereas the second profile exhibited a stromal and immune response phenotype. These two profiles were characterized by a large-scale gene activation affecting genes which were recently shown to be epigenetically regulated, and which were enriched in oxidative phosphorylation, energy metabolism and nucleoside biosynthesis. The third and novel profile showed only low global gene activation similar to that found in normal B cells but not cell lines. Our study indicates novel levels of complexity of DLBCL with low or high large-scale gene activation related to metabolism and biosynthesis and, within the group of highly activated DLBCLs, differential behavior leading to either a proliferative or a stromal and immune response phenotype.

  16. ARCTAS P-3B Aircraft Merge Data - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Apr 1, 2025
    + more versions
    Cite
    nasa.gov (2025). ARCTAS P-3B Aircraft Merge Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/arctas-p-3b-aircraft-merge-data-159e2
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    ARCTAS_Merge_P3B-Aircraft_Data contains pre-generated merge data files for the P-3B aircraft during the Arctic Research of the Composition of the Troposphere from Aircraft & Satellites (ARCTAS) mission. Data collection for this product is complete.

    The Arctic is a critical region in understanding climate change. The responses of the Arctic to environmental perturbations such as warming, pollution, and emissions from forest fires in boreal Eurasia and North America include key processes such as the melting of ice sheets and permafrost, a decrease in snow albedo, and the deposition of halogen radical chemistry from sea salt aerosols to ice. ARCTAS was a field campaign that explored environmental processes related to the high degree of climate sensitivity in the Arctic. ARCTAS was part of NASA’s contribution to the International Global Atmospheric Chemistry (IGAC) Polar Study using Aircraft, Remote Sensing, Surface Measurements, and Models of Climate, Chemistry, Aerosols, and Transport (POLARCAT) Experiment for the International Polar Year 2007-2008.

    ARCTAS had four primary objectives. The first was to understand long-range transport of pollution to the Arctic. Pollution brought to the Arctic from northern mid-latitude continents has environmental consequences, such as modifying regional and global climate and affecting the ozone budget. Prior to ARCTAS, these pathways remained largely uncertain. The second objective was to understand the atmospheric composition and climate implications of boreal forest fires; the smoke emissions from which act as an atmospheric perturbation to the Arctic by impacting the radiation budget and cloud processes and contributing to the production of tropospheric ozone. The third objective was to understand aerosol radiative forcing from climate perturbations, as the Arctic is an important place for understanding radiative forcing due to the rapid pace of climate change in the region and its unique radiative environment. The fourth objective of ARCTAS was to understand chemical processes with a focus on ozone, aerosols, mercury, and halogens. Additionally, ARCTAS sought to develop capabilities for incorporating data from aircraft and satellites related to pollution and related environmental perturbations in the Arctic into earth science models, expanding the potential for those models to predict future environmental change.

    ARCTAS consisted of two three-week aircraft deployments conducted in April and July 2008. The spring deployment sought to explore arctic haze, stratosphere-troposphere exchange, and sunrise photochemistry. April was chosen for the deployment phase due to historically being the peak in the seasonal accumulation of pollution from northern mid-latitude continents in the Arctic. The summer deployment sought to understand boreal forest fires at their most active seasonal phase in addition to stratosphere-troposphere exchange and summertime photochemistry.

    During ARCTAS, three NASA aircraft, the DC-8, P-3B, and BE-200, conducted measurements and were equipped with suites of in-situ and remote sensing instrumentation. Airborne data was used in conjunction with satellite observations from AURA, AQUA, CloudSat, PARASOL, CALIPSO, and MISR.

    The ASDC houses ARCTAS aircraft data, along with data related to MISR, a satellite instrument aboard the Terra satellite which provides measurements that provide information about the Earth’s environment and climate.

  17. S

    Data from: Bragg microcavities created by the collision of half-cycle...

    • scidb.cn
    Updated May 15, 2025
    + more versions
    Cite
    ROSTISLAV ARKHIPOV; OLGA DIACHKOVA; MIKHAIL ARKHIPOV; IVAN KISLYAKOV; JUN WANG; NIKOLAY ROSANOV (2025). Bragg microcavities created by the collision of half-cycle Gaussian and rectangular attosecond light pulses in a time-dependent resonant medium [Dataset]. http://doi.org/10.57760/sciencedb.25003
    Dataset updated
    May 15, 2025
    Dataset provided by
    Science Data Bank
    Authors
    ROSTISLAV ARKHIPOV; OLGA DIACHKOVA; MIKHAIL ARKHIPOV; IVAN KISLYAKOV; JUN WANG; NIKOLAY ROSANOV
    Description

    This paper examines the behavior of dynamic microcavities (DM) in multi-level resonant media when attosecond unipolar pulses of different shapes (Gaussian and rectangular) collide, acting as 2π-like self-induced transparency pulses. In the case of realistic quasi-unipolar pulses, the influence of the opposite polarity trailing edge is investigated, and conditions under which the trailing edge’s effect can be ignored are found analytically and then confirmed by a direct numerical calculation. We demonstrate that the collision of sub-cycle pulses in a medium results in a rapid change of the refractive index both in time and space. Thus, the resonant medium under the action of a sequence of pulses considered here is an example of a time-dependent medium. The findings demonstrate the feasibility of creating time-dependent media in an atomic medium with high polarization relaxation times T2, and subsequently controlling them with half-cycle pulses at an extremely short timescale of half of an electromagnetic field oscillation.

  18. ARCTAS P-3B Aircraft In-situ Trace Gas Data - Dataset - NASA Open Data...

    • data.nasa.gov
    Updated Apr 1, 2025
    + more versions
    Cite
    nasa.gov (2025). ARCTAS P-3B Aircraft In-situ Trace Gas Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/arctas-p-3b-aircraft-in-situ-trace-gas-data-6d79a
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    ARCTAS_TraceGas_AircraftInSitu_P3B_Data is the in-situ trace gas data for the P-3B aircraft collected during the Arctic Research of the Composition of the Troposphere from Aircraft & Satellites sub-orbital campaign. This product features data from the Carbon monOxide by Attenuated Laser Transmission (COBALT) instrument. Data collection for this product is complete.

    The Arctic is a critical region in understanding climate change. The responses of the Arctic to environmental perturbations such as warming, pollution, and emissions from forest fires in boreal Eurasia and North America include key processes such as the melting of ice sheets and permafrost, a decrease in snow albedo, and the deposition of halogen radical chemistry from sea salt aerosols to ice. Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS) was a field campaign that explored environmental processes related to the high degree of climate sensitivity in the Arctic. ARCTAS was part of NASA’s contribution to the International Global Atmospheric Chemistry (IGAC) Polar Study using Aircraft, Remote Sensing, Surface Measurements, and Models of Climate, Chemistry, Aerosols, and Transport (POLARCAT) Experiment for the International Polar Year 2007-2008.

    ARCTAS had four primary objectives. The first was to understand long-range transport of pollution to the Arctic. Pollution brought to the Arctic from northern mid-latitude continents has environmental consequences, such as modifying regional and global climate and affecting the ozone budget. Prior to ARCTAS, these pathways remained largely uncertain. The second objective was to understand the atmospheric composition and climate implications of boreal forest fires; the smoke emissions from which act as an atmospheric perturbation to the Arctic by impacting the radiation budget and cloud processes and contributing to the production of tropospheric ozone. The third objective was to understand aerosol radiative forcing from climate perturbations, as the Arctic is an important place for understanding radiative forcing due to the rapid pace of climate change in the region and its unique radiative environment. The fourth objective of ARCTAS was to understand chemical processes with a focus on ozone, aerosols, mercury, and halogens. Additionally, ARCTAS sought to develop capabilities for incorporating data from aircraft and satellites related to pollution and related environmental perturbations in the Arctic into earth science models, expanding the potential for those models to predict future environmental change.

    ARCTAS consisted of two three-week aircraft deployments conducted in April and July 2008. The spring deployment sought to explore arctic haze, stratosphere-troposphere exchange, and sunrise photochemistry. April was chosen for the deployment phase due to historically being the peak in the seasonal accumulation of pollution from northern mid-latitude continents in the Arctic. The summer deployment sought to understand boreal forest fires at their most active seasonal phase in addition to stratosphere-troposphere exchange and summertime photochemistry.

    During ARCTAS, three NASA aircraft, the DC-8, P-3B, and BE-200, conducted measurements and were equipped with suites of in-situ and remote sensing instrumentation. Airborne data was used in conjunction with satellite observations from AURA, AQUA, CloudSat, PARASOL, CALIPSO, and MISR.

    The ASDC houses ARCTAS aircraft data, along with data related to MISR, a satellite instrument aboard the Terra satellite which provides measurements that provide information about the Earth’s environment and climate.

  19. d

    Data from: A global optimization paradigm based on change of measures

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated May 14, 2015
    Cite
    Saikat Sarkar; Debasish Roy; Ram Mohan Vasu (2015). A global optimization paradigm based on change of measures [Dataset]. http://doi.org/10.5061/dryad.6rj0n
    Available download formats: zip
    Dataset updated
    May 14, 2015
    Dataset provided by
    Dryad
    Authors
    Saikat Sarkar; Debasish Roy; Ram Mohan Vasu
    Time period covered
    May 12, 2015
    Description

    exampleexperiment: This is the main file for solving global optimization problems. In this specific case pseudo-code 2 has been used for demonstrating the performance of the algorithm. All the input values (i.e. dimension of problem, minimum and maximum function evaluations) are to be given here. To run this code, in addition to the functions uploaded here, please use the two functions benchmarks.m and fgeneric.m from https://www.lri.fr/~hansen/cmaes_inmatlab.html, which is an open source platform.
    MY_OPTIMIZER: The specific optimization algorithm is written here. For solving the global optimization problem with a different algorithm, only this function needs to be modified.
    new_KS_EnKF_gain_matrix: This function creates the gain matrix, defined in the main manuscript, which is required in pseudo-code 2 for giving the evolving particle proper directionality.
    phase_K: The phase data (attached) for the illustrative example in Section 5.2.

  20. d

    Data for: Detecting artificially impaired balance in human locomotion:...

    • search.dataone.org
    • datadryad.org
    Updated May 23, 2025
    Cite
    Jiaen Wu; Michael Raitor; Guan Rong Tan; Kristan Staudenmayer; Scott Delp; C. Karen Liu; Steven Collins (2025). Data for: Detecting artificially impaired balance in human locomotion: metrics, perturbation effects and detection thresholds [Dataset]. http://doi.org/10.5061/dryad.cnp5hqch3
    Dataset updated
    May 23, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Jiaen Wu; Michael Raitor; Guan Rong Tan; Kristan Staudenmayer; Scott Delp; C. Karen Liu; Steven Collins
    Description

    Measuring balance is important for detecting impairments and developing interventions to prevent falls, but there is no consensus on which method is most effective. Many balance metrics derived from steady-state walking data have been proposed, such as step width variability, step time variability, foot placement predictability, maximum Lyapunov exponent, and margin of stability. Recently, perturbation-based metrics such as center of mass displacement have also been explored. Perturbations typically involve unexpected disturbances applied to the subject. In this study, we collected walking data from 10 healthy subjects, both while they walked normally and while their balance was impaired with ankle braces, eye-blocking masks, and pneumatic jets on their legs. In some walking trials, we also applied mechanical perturbations to their pelvis. We provide a comprehensive biomechanics dataset as supplementary material. We compared the ability of various metrics to detect impaired balance using steady-state walk...

    Steady-state and perturbed walking dataset

    Dataset DOI: 10.5061/dryad.cnp5hqch3

    Description of the data and file structure

    1) <"Scripts" folder> section: Describes the folder titled "Scripts", which includes information on how we processed our data.

    2) <"Subject X" folders...> section: Describes all the data in the folders that can be found in the "Subject X" folders.

    3)

    "Scripts" folder

    Scripts and aggregate data used for time-synchronizing sensors (e.g., between Vicon, EMG, etc.) and sensor data processing.

    Read the README in the "Scripts" folder for a more detailed guide on how we processed our data using these scripts.

    visualizeProcessedData.m demonstrates how to load important signals in the dataset using MATLAB.

    Note: All code using the data in this dataset to generate results in ...

    We confirm that we obtained explicit written consent from all participants to publish their de-identified data in the public domain as part of the study’s approved IRB protocol. To ensure privacy, all identifying information was removed, and each participant was assigned a random numerical code. The dataset includes only biomechanical and sensor data with no names, dates of birth, or other personally identifiable information.

