These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures in each week by subtracting the median exposure for that week and dividing by the interquartile range (IQR), as in the actual application to the true NC birth records data. The dataset we provide includes weekly average pregnancy exposures that have already been standardized in this way; the medians and IQRs are not given. This further protects the identifiability of the spatial locations used in the analysis.

This dataset is not publicly accessible because EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

It can be accessed through the following means. File format: R workspace file; “Simulated_Dataset.RData”.

Metadata (including data dictionary)
• y: Vector of binary responses (1: adverse outcome, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

Code Abstract
We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

Description
“CWVS_LMC.txt”: This code is delivered to the user as a .txt file containing R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the code in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.
“Results_Summary.txt”: This code is also delivered as a .txt file containing R statistical software code. Once the “CWVS_LMC.txt” code has been applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

Optional Information
Required R packages:
• For running “CWVS_LMC.txt”:
• msm: sampling from the truncated normal distribution
• mnormt: sampling from the multivariate normal distribution
• BayesLogit: sampling from the Polya-Gamma distribution
• For running “Results_Summary.txt”:
• plotrix: plotting the posterior means and credible intervals

Instructions for Use
Reproducibility (Mandatory)
What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the simulated datasets generated under setting E4 of the presented simulation study.
How to use the information:
• Load the “Simulated_Dataset.RData” workspace
• Run the code contained in “CWVS_LMC.txt”
• Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”

Format: Below is the replication procedure for the attached dataset, covering the portion of the analyses that uses a simulated dataset.

Data
The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining the confidentiality of any actual pregnant women.

Availability
Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics and requires an appropriate data use agreement.

Description
Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures in each week by subtracting the median exposure for that week and dividing by the interquartile range (IQR), as in the actual application to the true NC birth records data. The dataset we provide includes weekly average pregnancy exposures that have already been standardized in this way; the medians and IQRs are not given. This further protects the identifiability of the spatial locations used in the analysis.

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, Oxford, UK, 1-30, (2019).
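For orientation, the data dictionary in this record implies objects with particular shapes; the following is a minimal Python/numpy sketch of a synthetic stand-in (the sizes n, m, and p are arbitrary placeholders, and the real objects are R objects inside the .RData workspace, not these arrays):

import numpy as np

n, m, p = 1000, 40, 5                    # illustrative sizes only
y = np.random.binomial(1, 0.1, size=n)   # binary responses (1: adverse outcome, 0: control)
x = np.random.normal(size=(n, p))        # covariate design matrix, one row per individual
z = np.random.normal(size=(n, m))        # standardized weekly pollution exposures
alpha_true = np.zeros(m)                 # "true" critical window locations/magnitudes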
License: GPL 3.0, https://www.gnu.org/licenses/gpl-3.0.html
The validation of a simulation model is a crucial task in model development. It involves comparing simulation data to observation data and identifying suitable model parameters. SLIVISU is a Visual Analytics framework that enables geoscientists to perform these tasks for observation data that are sparse and uncertain. Primarily, SLIVISU was designed to evaluate sea level indicators, which are geological or archaeological samples supporting the reconstruction of former sea level over the last ten thousand years and are compiled in a PostgreSQL database system. At the same time, the software aims to support the validation of numerical sea-level reconstructions against these data by means of visual analytics.
License: CC0 1.0, https://spdx.org/licenses/CC0-1.0
Excel spreadsheet with data and simulations used to prepare figures for publication; see the Metadata sheet for conditions.

Data
• Fresh (not dry) rosette leaf biomass, measured in samples of 5 plants each on multiple days, as mean and SD; simulation outputs from FMv2 for Col wild-type plants, lsf1, and two simulations for prr7prr9 where the mutation affects only starch degradation or both starch degradation and malate/fumarate store mobilisation.
• Starch levels in carbon units (not C6), measured on days 27-28, mean and SD; simulations as above.
• Malate and fumarate levels in carbon units (not C4), measured on days 27-28, mean and SD; simulations as above.
• Many simulation outputs from FMv2 runs in the conditions above, from the Matlab output file.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This repository contains the main data of the paper "Optimal Rejection-Free Path Sampling," and the source code for generating/appending the independent RFPS-AIMMD and AIMMD runs.
Due to size constraints, the data has been split into separate repositories. The following repositories contain the trajectory files generated by the runs:
• all the WQ runs: 10.5281/zenodo.14830317
• chignolin, fps0: 10.5281/zenodo.14826023
• chignolin, fps1: 10.5281/zenodo.14830200
• chignolin, fps2: 10.5281/zenodo.14830224
• chignolin, tps0: 10.5281/zenodo.14830251
• chignolin, tps1: 10.5281/zenodo.14830270
• chignolin, tps2: 10.5281/zenodo.14830280
The trajectory files are not required for running the main analysis, as all the necessary information for machine learning and path reweighting is contained in the "PathEnsemble" object files stored in this repository. However, these trajectories are essential for projecting the path ensemble estimate onto an arbitrary set of collective variables.
To reconstruct the full dataset, please merge all the data folders you find in the supplemental repositories.
Data structure and content
analysis (code for analyzing the data and generating the figures of the paper)
|- figures.ipynb (Jupyter notebook for the analysis)
|- figures (the figures created by the Jupyter notebook)
   |- ...
data (all the AIMMD and reference runs, plus general info about the simulated systems)
|- chignolin
   |- *.py (code for generating/appending AIMMD runs on a Workstation or HPC cluster via Slurm; see the "src" folder below)
   |- run.gro (full system positions in the native conformation)
   |- mol.pdb (only the peptide positions in the native conformation)
   |- topol.top (the system's topology for the GROMACS MD engine)
   |- charmmm22star.ff (force field parameter files)
   |- run.mdp (GROMACS MD parameters when appending a simulation)
   |- randomvelocities.mdp (GROMACS MD parameters when initializing a simulation with random velocities)
   |- signature.npy, r0.npy (parameters defining the fraction of native contacts used in the folded/unfolded states definition; used by the params.py function "states_function")
   |- dmax.npy, dmin.npy (parameters defining the feature representation of the AIMMD NN model; used by the params.py function "descriptors_function")
   |- equilibrium (reference long equilibrium trajectory files; only the peptide positions are saved!)
      |- run0.xtc, ..., run3.xtc
   |- validation
      |- validation.xtc (the validation SPs all together in an XTC file)
      |- validation.npy (for each SP, collects the cumulative shooting results after 10 two-way shooting simulations)
   |- fps0 (the first AIMMD-RFPS independent run)
      |- equilibriumA (the free simulations around A, already processed into PathEnsemble files)
         |- traj000001.h5
         |- traj000001.tpr (for running the simulation; in that case, please retrieve all the trajectory files from the right supplemental repository first)
         |- traj000001.cpt (for appending the simulation; in that case, please retrieve all the trajectory files from the right supplemental repository first)
         |- traj000002.h5 (in case of re-initialization)
         |- ...
      |- equilibriumB (the free simulations around B, ...)
         |- ...
      |- shots0
         |- chain.h5 (the path sampling chain)
         |- pool.h5 (the selection pool, containing the frames from which shooting points are currently selected)
      |- params.py (file containing the states and descriptors definitions, the NN fit function, and the AIMMD run hyperparameters; it can be modified to allow for RFPS-AIMMD or original-algorithm AIMMD runs)
      |- initial.trr (the initial transition for path sampling)
      |- manager.log (reports info about the run)
      |- network.h5 (NN weights of the model at different path sampling steps)
   |- fps1, fps2 (the other RFPS-AIMMD runs)
   |- tps0 (the first AIMMD-TPS, or "standard" AIMMD, run)
      |- ...
      |- shots0
         |- ...
         |- chain_weights.npy (weights of the trials in TPS; only the trials with non-zero weight have been accepted)
   |- tps1, tps2 (the other AIMMD runs, with TPS for the shooting simulations)
|- wq (Wolfe-Quapp 2D system)
   |- *.py (code for generating/appending AIMMD runs on a Workstation or HPC cluster via Slurm)
   |- run.gro (dummy gro file produced for compatibility reasons)
   |- integrator.py (custom MD engine)
   |- equilibrium (reference long simulation)
      |- transition000001.xtc (extracted from the reference long simulation)
      |- transition000002.xtc
      |- ...
      |- transitions.h5 (PathEnsemble file with all the transitions)
   |- reference
      |- grid_X.npy, grid_Y.npy (X, Y grid for 2D plots)
      |- grid_V.npy (PES projected on the grid)
      |- grid_committor_relaxation.npy (true committor on the grid, solved with the relaxation method on the backward Kolmogorov equation; the code for doing this is in utils.py)
      |- grid_boltzmann_distribution.npy (Boltzmann distribution on the grid)
      |- pe.h5 (equilibrium distribution processed as a PathEnsemble file)
      |- tpe.h5 (TPE distribution processed as a PathEnsemble file)
      |- ...
   |- uniform_tps (reference TPS run with uniform SP selection)
      |- chain.h5 (PathEnsemble file containing all the accepted paths with their correct weights)
   |- fps0, ..., fps9 (the independent AIMMD-RFPS runs)
      |- ...
   |- tps0, ..., tps9 (the independent AIMMD-TPS, or "standard" AIMMD, runs)
src (code for generating/appending AIMMD runs on a Workstation or HPC cluster via Slurm)
|- generate.py (on a Workstation: initializes the processes; on an HPC cluster: creates the sh file for submitting a job)
|- slurm_options.py (to customize and use in case of running on HPC)
|- manager.py (controls SP selection; reweights the paths)
|- shooter.py (performs path sampling simulations)
|- equilibrium.py (performs free simulations)
|- pathensemble.py (code of the PathEnsemble class)
|- utils.py (auxiliary functions for data production and analysis)
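The .h5 files above appear to be HDF5 containers; a minimal Python sketch for inspecting one follows (assuming the h5py package; the internal layout is defined by pathensemble.py and is not documented here, so this only lists what is stored):

import h5py

# Print every group and dataset stored in a path sampling chain file.
with h5py.File("data/chignolin/fps0/shots0/chain.h5", "r") as f:
    f.visititems(lambda name, obj: print(name, obj))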
Running/appending AIMMD runs
Create a "run directory" folder (same depth as "fps0")
Copy "initial.trr" and "params.py" from another AIMMD run folder. It is possible to change "params.py" to customize the run.
(On a Workstation) call:
python generate.py <nsteps> <n> <nA> <nB>
where nsteps is the final number of path sampling steps for the run, n the number of independent path sampling chains, nA the number of independent free simulators around A, and nB that of free simulators around B.
(On an HPC cluster) call:
python generate.py -s slurm_options.py
sbatch ._job.sh
Merge the supplemental repository with the trajectory files into this one.
Just call again (on a Workstation)
python generate.py
or (on an HPC cluster)
sbatch ._job.sh
after updating the "nsteps" parameter.
Reproducing the analysis
Run the analysis/figures.ipynb notebook. Some groups of cells have to be run multiple times after changing the parameters in the preamble.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Dataset to produce the results of the publication: "Confronting Large-Eddy Simulations with Stereo Camera Data by means of reconstructed hemispheric Cloud Size Distributions". This dataset supports the findings presented in the publication and includes comprehensive resources for replicating its analysis and visualization.
The dataset encompasses:
• Dutch Atmospheric Large-Eddy Simulation (DALES) Data
  • Configuration files
  • Selected simulation output data
• Image Data
  • Rendered stereo camera images from the DALES output
  • Actual stereo camera images
  • Cloud masks generated from these images
• Camera-Based Reconstructions
  • Reconstructed cloud fields from the rendered camera images
  • Reconstructed cloud fields from the actual camera images
• Derived Cloud Metrics
  • Cloud base areas, cloud base heights, and cloud cover from the camera-based reconstructions
• Observational Data
  • Radiosonde, ceilometer, and Cloudnet measurements
  • Cloud cover from radiation measurements
  • Mixed layer height from the Doppler lidar
• Reproduction Scripts
  • Scripts to reproduce the analysis and figures
This research was supported by the U.S. Department of Energy's Atmospheric System Research, an Office of Science Biological and Environmental Research program, under grant DE-SC0022126 and by the German Research Foundation (DFG) under project number 430226822 (https://gepris.dfg.de/gepris/projekt/430226822). The Gauss Centre for Supercomputing e.V. (https://www.gauss-centre.eu/) is acknowledged for providing computing time on the Gauss Centre for Supercomputing (GCS) supercomputer JUWELS at the Jülich Supercomputing Centre (JSC) under the projects RCONGM and VIRTUALLAB. JOYCE data were provided by the Institute for Geophysics and Meteorology of the University of Cologne. JOYCE is a collaborative research platform between University of Cologne and Forschungszentrum Jülich within the European research infrastructure ACTRIS. We acknowledge ACTRIS and the Finnish Meteorological Institute for providing Cloudnet data which is available for download from https://cloudnet.fmi.fi. We acknowledge ECMWF for providing IFS model data.
License: https://www.nist.gov/open/license
The Internet of Things (IoT) is comprised of networks of physical, computational, and human components that coordinate to fulfill time-sensitive functions in a shared operating environment. Development and testing of IoT systems often utilizes modeling and simulation, whether to analyze potential performance gains of new technologies or develop robust digital twins to support future operations and maintenance. However, the complexity and scale of IoT means that individual simulators are often inadequate to simulate the real-world dynamics of such systems, and simulators must be combined with other software or hardware. The National Institute of Standards and Technology (NIST) has developed a software module that extends the ns-3 network simulator with a new capability to communicate with external software and hardware at runtime. This software facilitates the development of co-simulations where ns-3 models can synchronize and exchange data with external processes to develop higher-fidelity simulations. The software is open-source and available on the NIST GitHub.
This dataset comprises monthly mean data from a global, transient simulation with the Whole Atmosphere Community Climate Model eXtension (WACCM-X) from 2015 to 2070. WACCM-X is a global atmosphere model covering altitudes from the surface up to ~500 km, i.e., including the troposphere, stratosphere, mesosphere and thermosphere. WACCM-X version 2.0 (Liu et al., 2018) was used, part of the Community Earth System Model (CESM) release 2.1.0 (http://www.cesm.ucar.edu/models/cesm2) made available by the National Center for Atmospheric Research. The model was run in free-running mode with a horizontal resolution of 1.9 degrees latitude by 2.5 degrees longitude (giving 96 latitude points and 144 longitude points) and 126 vertical levels. Further description of the model and simulation setup is provided by Cnossen (2022) and references therein.

A large number of variables are included on standard monthly mean output files on the model grid, while selected variables are also offered interpolated to a constant height grid or vertically integrated in height (details below). Zonal mean and global mean output files are included as well. The data are provided in NetCDF format, and file names have the following structure: f.e210.FXHIST.f19_f19.h1a.cam.h0.[YYYY]-[MM][DFT].nc, where [YYYY] gives the year with 4 digits, [MM] gives the month (2 digits), and [DFT] specifies the data file type. The following data file types are included:
1) Monthly mean output on the full grid for the full set of variables; [DFT] = ''
2) Zonal mean monthly mean output for the full set of variables; [DFT] = _zm
3) Global mean monthly mean output for the full set of variables; [DFT] = _gm
4) Height-interpolated/-integrated output on the full grid for selected variables; [DFT] = _ht

A cos(latitude) weighting was used when calculating the global means. Data were interpolated to a set of constant heights (61 levels in total) using the Z3GM variable (for variables output on midpoints, with 'lev' as the vertical coordinate) or the Z3GMI variable (for variables output on interfaces, with 'ilev' as the vertical coordinate) stored on the original output files (type 1 above). Interpolation was done separately for each longitude, latitude and time. Mass density (DEN [g/cm3]) was calculated from the M_dens, N2_vmr, O2, and O variables on the original data files before interpolation to constant height levels.

The Joule heating power QJ [W/m3] was calculated using Q_J = sigma_P * B^2 * ((u_i - u_n)^2 + (v_i - v_n)^2 + (w_i - w_n)^2), with sigma_P = Pedersen conductivity [S/m], B = geomagnetic field strength [T], u_i, v_i, and w_i = zonal, meridional, and vertical ion velocities [m/s], and u_n, v_n, and w_n = neutral wind velocities [m/s]. QJ was integrated vertically in height (using a 2.5 km height grid spacing rather than the 61 levels on output file type 4) to give the JHH variable on the type 4 data files. The QJOULE variable also given is the Joule heating rate [K/s] at each of the 61 height levels.

All data are provided as monthly mean files with one time record per file, giving 672 files for each data file type for the period 2015-2070 (56 years).

References:
Cnossen, I. (2022), A realistic projection of climate change in the upper atmosphere into the 21st century, in preparation.
Liu, H.-L., C.G. Bardeen, B.T. Foster, et al. (2018), Development and validation of the Whole Atmosphere Community Climate Model with thermosphere and ionosphere extension (WACCM-X 2.0), Journal of Advances in Modeling Earth Systems, 10(2), 381-402, doi:10.1002/2017ms001232.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Additional simulation data for "Computational and experimental assessment of key interdomain residues controlling the fold-switch of RfaH"
Content:
'AA-SBM': Contains a total of 4 folders, which require SMOG2 and GROMACS v4.5.4 with added Gaussian contact potentials, available from the SMOG server (https://smog-server.org).
'ColabFold': Contains a total of 2 folders with results from protein structure predictions using ColabFold v1.5.5 (https://colabfold.com).
License: Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
This dataset comprises monthly mean data from a global, transient simulation with the Whole Atmosphere Community Climate Model eXtension (WACCM-X) from 1950 to 2015. WACCM-X is a global atmosphere model covering altitudes from the surface up to ~500 km, i.e. including the troposphere, stratosphere, mesosphere and thermosphere.
WACCM-X version 2.0 (Liu et al., 2018) was used, part of the Community Earth System Model (CESM) release 2.1.0 made available by the US National Center for Atmospheric Research. The model was run in free-running mode with a horizontal resolution of 1.9° latitude by 2.5° longitude (giving 96 latitude points and 144 longitude points) and 126 vertical levels. Further description of the model and simulation setup is provided by Cnossen (2020) and references therein. A large number of variables are included on standard monthly mean output files on the model grid, while selected variables are also offered interpolated to a constant height grid or vertically integrated in height (details below). Zonal mean and global mean output files are included as well.
The following data file types are included:
1) Monthly mean output on the full grid for the full set of variables; [DFT] = ''
2) Zonal mean monthly mean output for the full set of variables; [DFT] = _zm
3) Global mean monthly mean output for the full set of variables; [DFT] = _gm
4) Height-interpolated/-integrated output on the full grid for selected variables; [DFT] = _ht
A cos(latitude) weighting was used when calculating the global means.
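For concreteness, a cos(latitude)-weighted global mean can be computed as in the following Python/numpy sketch (illustrative only; the grid and variable names are placeholders, not those on the data files):

import numpy as np

lat = np.linspace(-90.0, 90.0, 96)        # latitude grid [degrees], illustrative
zm = np.random.rand(96)                   # zonal mean of some variable (placeholder values)
w = np.cos(np.deg2rad(lat))               # cos(latitude) area weights
global_mean = np.sum(w * zm) / np.sum(w)  # weighted global mean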
Data were interpolated to a set of constant heights (61 levels in total) using the Z3GM variable (for variables output on midpoints, with "lev" as the vertical coordinate) or the Z3GMI variable (for variables output on interfaces, with "ilev" as the vertical coordinate) stored on the original output files (type 1 above). Interpolation was done separately for each longitude, latitude and time.
Mass density (DEN [g/cm3]) was calculated from the M_dens, N2_vmr, O2, and O variables on the original data files before interpolation to constant height levels.
The Joule heating power QJ [W/m3] was calculated using Q_J = sigma_P * B^2 * ((u_i - u_n)^2 + (v_i - v_n)^2 + (w_i - w_n)^2), with sigma_P = Pedersen conductivity [S/m], B = geomagnetic field strength [T], u_i, v_i, and w_i = zonal, meridional, and vertical ion velocities [m/s], and u_n, v_n, and w_n = neutral wind velocities [m/s]. QJ was integrated vertically in height (using a 2.5 km height grid spacing rather than the 61 levels on output file type 4) to give the JHH variable on the type 4 data files. The QJOULE variable also given is the Joule heating rate [K/s] at each of the 61 height levels.
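Transcribed literally, the calculation might look like the following Python/numpy sketch (all input arrays are placeholders; for brevity a single height grid with trapezoidal quadrature is used, whereas the description above uses a separate 2.5 km grid and does not state the quadrature rule):

import numpy as np

z = np.arange(0.0, 152500.0, 2500.0)                   # height grid [m], 2.5 km spacing (illustrative)
nz = z.size
sigma_P = np.full(nz, 1e-5)                            # Pedersen conductivity [S/m] (placeholder)
B = 5e-5                                               # geomagnetic field strength [T] (placeholder)
ui, vi, wi = np.zeros(nz), np.zeros(nz), np.zeros(nz)  # ion velocities [m/s] (placeholders)
un, vn, wn = np.zeros(nz), np.zeros(nz), np.zeros(nz)  # neutral winds [m/s] (placeholders)

QJ = sigma_P * B**2 * ((ui - un)**2 + (vi - vn)**2 + (wi - wn)**2)  # Joule heating power [W/m3]
JHH = np.trapz(QJ, z)                                  # height-integrated Joule heating [W/m2]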
All data are provided as monthly mean files with one time record per file, giving 792 files for each data file type for the period 1950-2015 (66 years).
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
The scene modeling and simulation data is included in Data.zip.
License: Attribution 3.0 (CC BY 3.0), https://creativecommons.org/licenses/by/3.0/
This dataset is associated with the paper Knoop et al. (2019) titled "A generic gust definition and detection method based on wavelet-analysis" published in "Advances in Science and Research (ASR)" within the Special Issue: 18th EMS Annual Meeting: European Conference for Applied Meteorology and Climatology 2018. It contains the data and analysis software required to recreate all figures in the publication.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Many practical problems in fluid dynamics demand an empirical approach, where statistics estimated from data inform understanding and modelling. In this context data-driven probabilistic modelling offers an elegant alternative to ad hoc estimation procedures. Probabilistic models are useful as emulators, but also offer an attractive means of estimating particular statistics of interest. In this paradigm one can rely on probabilistic scoring rules for model comparison and validation. Stochastic neural networks provide a particularly rich class of probabilistic models, which, when paired with modern optimisation algorithms and GPUs, can be remarkably efficient. We demonstrate this approach by learning the single-particle transition density of ocean surface drifters from observations using a mixture density network. This provides a comprehensive description of drifter dynamics, from which we derive maps of various single-particle statistics. Our model also offers a means of simulating drifter trajectories as a discrete-time Markov process. A drifter release simulation using our model shows the emergence of concentrated clusters in the subtropical gyres, in agreement with previous studies on the formation of garbage patches. The dataset is intended to accompany the code repository archived at doi.org/10.5281/zenodo.7737161. They are both related to the upcoming paper Brolly, M.T. (in submission), 'Inferring ocean transport statistics with probabilistic neural networks'.
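As a schematic of the discrete-time Markov process simulation idea, the Python sketch below draws each displacement from a fixed Gaussian mixture (all parameters are placeholders; in the paper the mixture is produced by a mixture density network conditioned on the current position, which is not reproduced here):

import numpy as np

rng = np.random.default_rng(0)

# Placeholder mixture parameters; a mixture density network would
# output these as a function of the current position.
weights = np.array([0.7, 0.3])
means = [np.array([0.05, 0.0]), np.array([-0.02, 0.03])]
covs = [0.01 * np.eye(2), 0.05 * np.eye(2)]

def step(x):
    # One Markov transition: pick a mixture component, then draw the displacement.
    k = rng.choice(len(weights), p=weights)
    return x + rng.multivariate_normal(means[k], covs[k])

x = np.zeros(2)   # initial drifter position (illustrative coordinates)
traj = [x]
for _ in range(100):
    x = step(x)
    traj.append(x)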
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
Predictive screening of metal–organic framework (MOF) materials for their gas uptake properties has previously been limited by its reliance on data from a range of simulated sources, meaning the final predictions depend on the performance of those original models. In this work, experimental gas uptake data has been used to create a Gradient Boosted Tree model for the prediction of H2, CH4, and CO2 uptake over a range of temperatures and pressures in MOF materials. The descriptors used in this database were obtained from the literature, with no computational modeling needed. The model fitting was repeated 10 times, showing an average R2 of 0.86 and a mean absolute error (MAE) of ±2.88 wt % across the runs. This model provides gas uptake predictions for a range of gases, temperatures, and pressures as a one-stop solution, with the underlying data drawn from previous experimental observations in the literature rather than from simulations, which may differ from real-world results. The objective of this work is to create a machine learning model for the inference of gas uptake in MOFs; the model is built on experimental rather than simulated data so that practitioners can apply it directly, and the focus is on the application of algorithms rather than on their detailed assessment.
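A minimal scikit-learn sketch of this kind of workflow follows (a generic illustration with synthetic placeholder data; the descriptors, hyperparameters, and data splits of the actual study are not reproduced here):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 8))                   # literature-derived descriptors (placeholders)
y = 20 * X[:, 0] + rng.normal(0, 1, 500)   # gas uptake in wt % (synthetic target)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
print("R2:", r2_score(y_te, pred), "MAE [wt %]:", mean_absolute_error(y_te, pred))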
License: CC0 1.0, https://api.github.com/licenses/cc0-1.0
The supplementary materials present simulation results, including the biases and mean squared errors of the estimators, the type I error rate and power curves of the rank score test, and the estimated mean lengths and empirical coverage probabilities of the confidence intervals in various cases. The R code is included.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Entity name, meaning and corresponding production unit.
In bold: the best performance among all the methods.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Mean (sd) of the empirical link probabilities for the simulation data.
The Fluvial Egg Drift Simulator (FluEgg) estimates bighead, silver, and grass carp egg and larval drift in rivers using species-specific egg developmental data combined with user-supplied hydraulic inputs (Garcia and others, 2013; Domanski, 2020). This data release contains results from 240 FluEgg 4.1.0 simulations of bighead carp eggs in the Illinois River under steady flow conditions. The data release also contains the hydraulic inputs used in the FluEgg simulations and a KML file of the centerline that represents the model domain.

FluEgg simulations were run for all combinations of four spawning locations, six water temperatures, and ten steady flow conditions (the full factorial is enumerated in the sketch below). Each simulation included 5,000 bighead carp eggs, which develop and eventually hatch into larvae. The simulations end when the larvae reach the gas bladder inflation stage. The four spawning locations were just downstream of the lock and dam structures at Marseilles, Starved Rock, Peoria, and LaGrange. For each of these spawning locations, the eggs were assumed to have been spawned at the water surface and at the midpoint of the channel. The six water temperatures were 18, 20, 22, 24, 26, and 28 degrees Celsius. The ten steady flow conditions ranged from half the annual mean flow to the 500-year peak flow and are discussed in more detail below. Note that in the streamwise coordinate system used by FluEgg, the streamwise coordinate of the Mississippi River confluence is 396,639 meters. Any drift distances greater than this value should be excluded from any further analysis of these data.

The hydraulic inputs for the FluEgg simulations were generated using a one-dimensional steady Hydrologic Engineering Center-River Analysis System (HEC-RAS) 5.0.7 model for the Illinois River between Marseilles Lock and Dam and the Mississippi River confluence near Grafton, Illinois (HEC-RAS, 2019). The HEC-RAS model was developed by combining four individual HEC-RAS models obtained from the U.S. Army Corps of Engineers Rock Island District (U.S. Army Corps of Engineers Rock Island District, 2004). The model was run for the following ten flow profiles: half the annual mean flow, annual mean flow, annual mean flood, 2-year peak flow, 5-year peak flow, 10-year peak flow, 25-year peak flow, 50-year peak flow, 100-year peak flow, and 500-year peak flow. The flow rates for each of the profiles were obtained for the following U.S. Geological Survey (USGS) streamgaging stations from USGS StreamStats: 5543500 Illinois River at Marseilles, Illinois; 5558300 Illinois River at Henry, Illinois; 5560000 Illinois River at Peoria, Illinois; 5568500 Illinois River at Kingston Mines, Illinois; 5570500 Illinois River near Havana, Illinois; 5585500 Illinois River at Meredosia, Illinois; and 5586100 Illinois River at Valley City, Illinois (Soong and others, 2004; Granato and others, 2017).

References:
Garcia, T., Jackson, P.R., Murphy, E.A., Valocchi, A.J., and Garcia, M.H., 2013, Development of a Fluvial Egg Drift Simulator to evaluate the transport and dispersion of Asian carp eggs in rivers: Ecological Modelling, v. 263, p. 211-222, https://doi.org/10.1016/j.ecolmodel.2013.05.005.
Granato, G.E., Ries, K.G., III, and Steeves, P.A., 2017, Compilation of streamflow statistics calculated from daily mean streamflow data collected during water years 1901-2015 for selected U.S. Geological Survey streamgages: U.S. Geological Survey Open-File Report 2017-1108, 17 p., https://doi.org/10.3133/ofr20171108.
Domanski, M.M., and Berutti, M.C., 2020, FluEgg: U.S. Geological Survey software release, https://doi.org/10.5066/P93UCQR2.
Hydrologic Engineering Center-River Analysis System (HEC-RAS), 2019, accessed August 20, 2020, at http://www.hec.usace.army.mil/software/hec-ras/.
Soong, D.T., Ishii, A.L., Sharpe, J.B., and Avery, C.F., 2004, Estimating flood-peak discharge magnitudes and frequencies for rural streams in Illinois: U.S. Geological Survey Scientific Investigations Report 2004-5103, 147 p., https://doi.org/10.3133/sir20045103.
U.S. Army Corps of Engineers Rock Island District, 2004, Upper Mississippi River System Flow Frequency Study, Hydrology and Hydraulics, Appendix C, Illinois River, accessed August 20, 2020, at https://www.mvr.usace.army.mil/Portals/48/docs/FRM/UpperMissFlowFreq/App.%20C%20Rock%20Island%20Dist.%20Illinois%20River%20Hydrology_Hydraulics.pdf.
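The 240 simulations are simply the full factorial of the conditions listed above, as this short Python sketch enumerates (flow labels abbreviated from the text):

from itertools import product

locations = ["Marseilles", "Starved Rock", "Peoria", "LaGrange"]
temperatures_c = [18, 20, 22, 24, 26, 28]
flows = ["0.5x annual mean", "annual mean", "annual mean flood", "2-year peak",
         "5-year peak", "10-year peak", "25-year peak", "50-year peak",
         "100-year peak", "500-year peak"]

runs = list(product(locations, temperatures_c, flows))
print(len(runs))  # 4 * 6 * 10 = 240 simulations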
License: custom license, https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.57745/FMZ9HP
Although it is a widespread phenomenon in nature, turbulence in fluids (gases, liquids) is still very poorly understood. One area of research involves analyzing data from academic flow simulations. To make progress, the scientific community needs a large amount of reliable data in various configurations. Turbulent flows near solid flat or curved walls are very interesting examples. The database is composed of the 3D raw data (velocity, pressure, time derivative of velocity) and statistics (mean, Reynolds stresses, length scales) of a direct numerical simulation of a moderate-adverse-pressure-gradient (decelerating) turbulent boundary layer on a flat plate at Reynolds numbers up to Reθ = 8000.
This model archive contains the data and software application necessary to simulate two-dimensional hydraulic parameters along a 1.6 kilometer study reach of the Sacramento River near Glenn, California. The iRIC modeling system and the NAYS2DH solver were used to simulate three river flows (90, 191, and 255 cubic meters per second) and provide spatially distributed depths, velocities, and water-surface elevations along the study reach. The archive is split into child items to help distinguish the individual components of the archive and make downloading of large files more manageable. The first child item in the archive is the hydraulic model software application. The second child item includes the topographic data used to construct the model grid as well as field measurements of water-surface elevation and depth-averaged velocity used to calibrate the hydraulic roughness parameter. The third child item provides output from the NAYS2DH model using various Manning's n roughness values; a comparison of the root mean square errors between the model simulation and the field measurements is included for each roughness value. The fourth child item includes model output for the three river flows that were simulated to support the manuscript.
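The roughness calibration comparison reduces to a root mean square error between simulated and measured values; a generic Python sketch follows (function and variable names are illustrative and not part of the archive):

import numpy as np

def rmse(simulated, observed):
    # Root mean square error between model output and field measurements.
    simulated, observed = np.asarray(simulated), np.asarray(observed)
    return np.sqrt(np.mean((simulated - observed) ** 2))

# e.g., compare water-surface elevations [m] for one Manning's n value
print(rmse([10.2, 10.5, 11.1], [10.3, 10.4, 11.0]))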