Background: Modern drug discovery is concerned with identification and validation of novel protein targets from among the 30,000 genes or more postulated to be present in the human genome. While protein-protein interactions may be central to many disease indications, it has been difficult to identify new chemical entities capable of regulating these interactions as either agonists or antagonists. Results: In this paper, we show that peptide complements (or surrogates) derived from highly diverse random phage display libraries can be used for the identification of the expected natural biological partners for protein and non-protein targets. Our examples include surrogates isolated against both an extracellular secreted protein (TNFβ) and intracellular disease related mRNAs. In each case, surrogates binding to these targets were obtained and found to contain partner information embedded in their amino acid sequences. Furthermore, this information was able to identify the correct biological partners from large human genome databases by rapid and integrated computer based searches. Conclusions: Modified versions of these surrogates should provide agents capable of modifying the activity of these targets and enable one to study their involvement in specific biological processes as a means of target validation for downstream drug discovery.
This dataset was created by Tilii
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/QCKJYL
This dataset contains replication files for "The Surrogate Index: Combining Short-Term Proxies to Estimate Long-Term Treatment Effects More Rapidly and Precisely" by Susan Athey, Raj Chetty, Guido Imbens, and Hyunseung Kang. For more information, see https://opportunityinsights.org/paper/the-surrogate-index/. A summary of the related publication follows. The impacts of many policies, such as efforts to increase upward income mobility or improve health outcomes, are only observed with long delays. For example, it can take decades to see the effects of early childhood interventions on lifetime earnings. This problem has greatly limited researchers’ and policymakers’ ability to test and improve policies and arises frequently in our own work at Opportunity Insights on the determinants of economic opportunity. In this study, we develop a new method of estimating the long-term impacts of policies more rapidly and precisely using short-term proxies. We predict long-term outcomes (e.g., lifetime earnings) using short-term outcomes (e.g., earnings in early adulthood or test scores). We then show that the causal effects of policies on this predictive index (which we term a “surrogate index”, following terminology in the statistics literature) can help us learn about their long-term impacts more quickly under certain assumptions that are described in the full paper. We apply our method to analyze the long-term impacts of a job training experiment in California. Using short-term employment rates as surrogates, we show that one could have estimated the program’s impact on mean employment rates over a 9 year horizon within 1.5 years, with a 35% reduction in standard errors. The success of the surrogate index in this job training application suggests that our method could be applied to predict the long-term impacts of other programs as well. 
Going forward, we hope to build a public library of early indicators (surrogate indices) for social science by harnessing historical experiments along with the large-scale datasets we have built. If you would like to contribute to this effort by reporting a surrogate index that predicts long-term impacts estimated in an experiment, as in the GAIN program, please contact us.
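The two-step idea described above (predict the long-term outcome from short-term outcomes, then measure the treatment effect on that prediction) can be illustrated with a minimal sketch. It is not the authors' replication code: it assumes a single short-term proxy, a plain linear fit, and made-up toy numbers, and all function and variable names are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with one short-term proxy."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def surrogate_index_effect(hist_short, hist_long, exp_short, exp_treated):
    """Estimate a long-term effect from short-term outcomes only.

    Step 1: learn long-term outcome ~ short-term outcome on historical data.
    Step 2: predict the index for each experimental unit.
    Step 3: compare mean predicted index, treated vs. control.
    """
    a, b = fit_line(hist_short, hist_long)
    index = [a + b * s for s in exp_short]
    treated = [v for v, t in zip(index, exp_treated) if t]
    control = [v for v, t in zip(index, exp_treated) if not t]
    return mean(treated) - mean(control)

# Toy data (hypothetical): the long-term outcome is 1 + 2 * short-term outcome.
hist_short = [0.0, 1.0, 2.0, 3.0]
hist_long = [1.0, 3.0, 5.0, 7.0]
exp_short = [1.0, 2.0, 2.0, 3.0]
exp_treated = [False, False, True, True]

effect = surrogate_index_effect(hist_short, hist_long, exp_short, exp_treated)
print(effect)
```

Whether such an estimate can be read as the true long-term effect depends on the assumptions discussed in the full paper; the sketch only shows the mechanics.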
Background: There is a longstanding concern about the accuracy of surrogate consent in representing the health care and research preferences of those who lose their ability to decide for themselves. We sought informed, deliberative views of the older general public (≥50 years old) regarding their willingness to participate in dementia research and to grant leeway to future surrogates to choose an option contrary to their stated wishes. Methodology/Principal Findings: 503 persons aged 50+ recruited by random digit dialing were randomly assigned to one of three groups: deliberation, education, or control. The deliberation group attended an all-day education/peer deliberation session; the education group received written information only. Participants were surveyed at baseline, after the deliberation session (or equivalent time), and one month after the session, regarding their willingness to participate in dementia research and to give leeway to surrogates, for studies of varying risk-benefit profiles (a lumbar puncture study, a drug randomized controlled trial, a vaccine randomized controlled trial, and an early phase gene transfer trial). At baseline, 48% (gene transfer scenario) to 92% (drug RCT) were willing to participate in future dementia research. A majority of respondents (57–71% depending on scenario) were willing to give leeway to future surrogate decision-makers. Democratic deliberation increased willingness to participate in all scenarios, to grant leeway in 3 of 4 scenarios (lumbar puncture, vaccine, and gene transfer), and to enroll loved ones in research in all scenarios. On average, respondents were more willing to volunteer themselves for research than to enroll their loved ones. Conclusions/Significance: Most people were willing to grant leeway to their surrogates, and this willingness was either sustained or increased after democratic deliberation, suggesting that the attitude toward leeway is a reliable opinion. 
Eliciting a person’s current preferences about future research participation should also involve eliciting his or her leeway preferences.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Smith-Patten_Patten_data: Presence/absence data for each taxon used in the study.
Smith-Patten_Patten_calculator_matrices: Calculator used for pairwise Jaccard’s dissimilarity indices; resulting matrices for all taxa.
Smith-Patten_Patten_BARRIER_output: Raw BARRIER 2.2 output for all taxa.
Smith-Patten_Patten_C_Program: C program to create bootstrap matrices.
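For readers unfamiliar with the index, a pairwise Jaccard dissimilarity matrix can be computed from presence/absence data roughly as follows. This is a generic sketch, not the calculator shipped with the dataset; the row layout (one row per region, one column per taxon) and the toy values are assumptions.

```python
def jaccard_dissimilarity(a, b):
    """1 - |A and B| / |A or B| for two presence/absence vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two empty rows are treated as identical (an assumption of this sketch).
    return 0.0 if either == 0 else 1.0 - shared / either

def pairwise_matrix(rows):
    """Symmetric matrix of Jaccard dissimilarities between all rows."""
    return [[jaccard_dissimilarity(r, s) for s in rows] for r in rows]

# Hypothetical presence/absence rows: one per region, one column per taxon.
regions = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
]
matrix = pairwise_matrix(regions)
for row in matrix:
    print([round(v, 3) for v in row])
```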
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A great challenge for waste-to-energy power plants is their uncertain and variable feedstock, which can prevent the plants from running as efficiently as possible, reducing energy output and weakening control of emissions. One way to describe the feedstock is to use surrogates: the hundreds or thousands of different species in a feedstock are modelled using a few surrogate species, which makes modelling the feedstock tractable. The surrogates also provide an estimate of the HHV and the fraction of biomass, oil-based waste and inorganics. This thesis formulated surrogates for waste classes typically incinerated, using a linear least-squares solution between available surrogate species and experimental values. Most of the species used were from two existing models in the literature, but three new species were created to improve the representation of some waste classes containing fossil-originated wastes, rubber and PET. These were made by creating reactions based on experimental data from the literature and then testing these reactions under pyrolysis conditions in a stochastic reactor model. The surrogates for the waste classes were formulated by first dividing the waste into components and then finding the surrogate formulation for each component. Surrogates were found for 41 components, which were used to create the surrogate formulations for 30 waste classes. Most of the surrogates modelled the elemental composition accurately compared to experimental values. A statistical overview of the experimental and model data for the waste classes was also created. This overview is relevant for stakeholders in waste management and for other research, such as life-cycle analysis.
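The least-squares step can be pictured with a small sketch: given the elemental composition of each surrogate species and a measured composition of a waste component, solve for the blend of species that best matches the measurement. The species compositions, the target values, and the unconstrained normal-equations solver below are illustrative assumptions, not the thesis's actual species or implementation.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a square system A x = b."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def least_squares(A, b):
    """Minimise ||A x - b||^2 via the normal equations (A^T A) x = A^T b."""
    m, n = len(A), len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return solve(AtA, Atb)

# Rows: mass fractions of C, H, O. Columns: two hypothetical surrogate species.
species = [
    [0.5, 0.8],  # carbon
    [0.1, 0.2],  # hydrogen
    [0.4, 0.0],  # oxygen
]
measured = [0.575, 0.125, 0.300]  # hypothetical experimental composition

fractions = least_squares(species, measured)
print([round(f, 3) for f in fractions])
```

A practical formulation would likely also require the fractions to be non-negative and to sum to one; this sketch leaves those constraints out.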
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data is associated with the paper "Surrogate-guided Optimization in Quantum Networks".
In this work we introduce an efficient optimization workflow using machine-learning models that outperforms traditional techniques, addressing the challenges of complex, computationally demanding simulations in quantum networking. Please find guidelines and more context in the README.md file.
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
In Canada, it is illegal to purchase sperm or eggs from a donor (or person acting on behalf of a donor) or pay a female person to be a surrogate. However, donors and surrogates may be reimbursed for out-of-pocket expenditures incurred because of their donation or surrogacy that are provided for in the regulations.
data_packet: This is a zipped folder that contains the following spreadsheets used for analyses in this study: 1) site by species matrices for lichens (1 CSV file) and for vascular plants (1 CSV file). 2) Location data for study sites (1 XLSX file). 3) Georeferenced specimen data used to generate species lists for each site (1 XLSX file).
Surrogate Variable [42] loadings by technology.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: Obesity, especially abdominal obesity, is more common in patients with heart failure (HF), but body mass index (BMI) cannot accurately describe fat distribution. Several surrogate adiposity markers are available to reflect fat distribution and quantity. The objective of this study was to explore which adiposity marker is most highly correlated with HF prevalence, all-cause mortality and patients’ long-term survival. Methods: The National Health and Nutrition Examination Survey (NHANES) database provided all the data for this study. Logistic regression analyses were adopted to compare the association of each surrogate adiposity marker with the prevalence of HF. Cox proportional hazards models and restricted cubic spline (RCS) analysis were employed to assess the association between surrogate adiposity markers and all-cause mortality in HF patients. The ability of surrogate adiposity markers to predict long-term survival in HF patients was assessed using time-dependent receiver operating characteristic (ROC) curves. Results: 46,257 participants (1,366 HF patients) were included in this retrospective study. The area under the receiver operating characteristic curve (AUC) for the prevalence of HF assessed by weight-adjusted-waist index (WWI) was 0.70 (95% CI: 0.69-0.72). During a median follow-up of 70 months, 700 deaths were recorded among the 1,366 HF patients. The hazard ratio (HR) for all-cause mortality in HF patients was 1.33 (95% CI: 1.06-1.66) in the quartile 4 group of a body shape index (ABSI) and 1.43 (95% CI: 1.13-1.82) in the WWI quartile 4 group, compared with the lowest quartile group. The AUC for predicting 5-year survival of HF patients using the ABSI was 0.647 (95% CI: 0.61-0.68). Conclusions: WWI is strongly correlated with the prevalence of HF. HF patients with higher WWI and ABSI tend to have higher all-cause mortality. ABSI can predict patients’ long-term survival. We recommend the use of WWI and ABSI for assessing obesity in HF patients.
Table shows whole-brain-mean intensity differences between AP and LR corrected datasets (units are arbitrary signal units). Errors are the standard deviation of the means over the ten subjects. Metrics show statistically significant differences between all methods at the p<0.001 level.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The column ‘’ lists the results as described in the main text. In addition, robustness tests were conducted, with results listed in separate columns. ‘ = 200 k USD’ uses the same set of countries and a threshold of 200,000 USD below which trade flows are ignored. In ‘all products with positive exports’ all products are included which have positive world exports in each year of the analysis. The column ‘1989–2000’ decreases the number of years included in the analysis. Results excluding the FSU and CEE are listed in ‘Excl. FSU’. The maximal lag is then increased to . The last column reports results using the UN ComTrade dataset, as described in Text S1.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data was collected as part of the Map the Moment initiative, a volunteer project to document the artwork and changes to the streetscape following the killing of George Floyd and the demonstrations that followed. This data was collected by Lisa Conte and processed by Joe Graham-Felsen. They used a Canon 5D Mark 3 to scan this data and capture the various murals that appeared throughout the city.
Surrogate outcomes are frequently used in cardiovascular disease research. A concern is that changes in surrogate markers may not reflect changes in disease outcomes. Two recent clinical trials (Heart and Estrogen/Progestin Replacement Study [HERS], and the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial [ALLHAT]) underscore this problem since their results contradicted what was expected based on the surrogate outcomes. The current regulatory policy to allow new therapies to be introduced onto the market based solely on surrogate outcomes may need to be reviewed.
This data release comprises data tables of input variables for seawaveQ and surrogate models used to predict concentrations of select pesticides at six U.S. Geological Survey National Water Quality Network (NWQN) river sites (Fanno Creek at Durham, Oregon; White River at Hazleton, Indiana; Kansas River at DeSoto, Kansas; Little Arkansas River near Sedgwick, Kansas; Missouri River at Hermann, Missouri; Red River of the North at Grand Forks, North Dakota). Each data table includes discrete concentrations of one select pesticide (Atrazine, Azoxystrobin, Bentazon, Bromacil, Imidacloprid, Simazine, or Triclopyr) at one of the NWQN sites; daily mean streamflow; 30-day and 1-day flow anomalies; daily median values of pH and turbidity; daily mean values of dissolved oxygen, specific conductance, and water temperature; and 30-day and 1-day anomalies for pH, turbidity, dissolved oxygen, specific conductance, and water temperature. Two pesticides were modeled at each site with three types of regression models. Also included is a zip file with outputs from the seawaveQ model summary. The processes for retrieving and preparing data for regression models followed those outlined in the SEAWAVE-Q R package documentation (Ryberg and Vecchia, 2013; Ryberg and York, 2020). The R package waterData (Ryberg and Vecchia, 2012) was used to import daily mean values for discharge and either daily mean or daily median values for continuous water-quality constituents directly into R, depending on what data were available at each site. Pesticide concentration, streamflow, and surrogate data (continuously measured field parameters) were imported from and are available online from the USGS National Water Information System database (USGS, 2020). 
The waterData package was used to screen for missing daily mean discharge values (no missing values were found for the sites) and to calculate short-term (1 day) and mid-term (30 day) anomalies for flow and short-term anomalies (1 day) for each water-quality variable. A mid-term streamflow anomaly, for instance, is the deviation of concurrent daily streamflow from average conditions for the previous 30 days (Vecchia and others, 2008). Anomalies were calculated as additional potential model variables. Pesticide concentrations for select constituents from each site were pulled into R using the dataRetrieval package (De Cicco and others, 2018). For three of the six sites (Kansas River at DeSoto, Kansas; Missouri River at Hermann, Missouri; and White River at Hazleton, Indiana), pesticide data were pulled for WY 2013–17, whereas for the other three sites (Fanno Creek at Durham, Oregon; Little Arkansas River near Sedgwick, Kansas; and Red River of the North at Grand Forks, North Dakota), pesticide data were pulled for WY 2013–18. Discrete pesticide data were matched with daily mean discharge and daily mean or median water-quality constituents and the associated calculated short-term (1-day) and mid-term (30-day) anomalies from the date of sampling. Pesticide concentrations were estimated using the SEAWAVE-Q (with surrogates) model using 19 combinations of surrogate variables (table 2 in the associated SIR, "Comparison of Surrogate Models to Estimate Pesticide Concentrations at Six U.S. Geological Survey National Water Quality Network Sites During Water Years 2013–18.") at each of 12 site-pesticide combinations (table 3 in the associated SIR). Three measures of model performance (the generalized coefficient of determination (R2), Akaike’s Information Criterion (AIC), and scale) were included in the output and used to select best-fit models (Table 4 of the associated SIR). 
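As a rough illustration of the mid-term anomaly definition given above (deviation of the current daily value from average conditions over the previous 30 days), the sketch below works on a plain list of daily flows. It is not the waterData implementation: the log10 transform, the choice to exclude the current day from the window, and the function name are all assumptions made for the example.

```python
from math import log10

def midterm_anomaly(daily_flows, window=30):
    """Log-flow on each day minus the mean log-flow of the preceding `window` days.

    Days without a full preceding window get None.
    """
    logs = [log10(q) for q in daily_flows]
    anomalies = []
    for i, value in enumerate(logs):
        if i < window:
            anomalies.append(None)
        else:
            baseline = sum(logs[i - window:i]) / window
            anomalies.append(value - baseline)
    return anomalies

# Toy series: 30 days of steady flow followed by a tenfold jump.
flows = [10.0] * 30 + [100.0]
anoms = midterm_anomaly(flows)
print(anoms[-1])
```

A positive value on the last day indicates flow well above recent conditions, which is the kind of signal the anomaly variables are meant to carry into the regression models.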
The three to four best-fit SEAWAVE-Q (with surrogates) models with sample sizes at least five times the number of variables were selected for each site-pesticide combination based on generalized R2 values—the higher, the better. If generalized R2 values were the same, the model with the lower AIC value was used. The standard surrogate regression and base SEAWAVE-Q models were then applied using the same samples that were used for each of the best-fit SEAWAVE-Q (with surrogates) models so that direct comparisons could be made for each site-pesticide-surrogate instance. The input data used to estimate daily pesticide concentrations for each of the best fit models have been included in this data release. An example of one output file for each model type is included in a .zip file named "output_examples.zip". Each of the output files shows the three measures of model performance. (1) The output file for the standard regression model named "HAZ8_Atrazine_Standard_Regression_Output.txt" includes: Pseudo R-square (Allison) of 0.631, Model AIC of 174.0232, and a Scale of 0.961. (2) The output file for the base SEAWAVE-Q model named "HAZ8_Atrazine_Base_Seawave-Q_Output.txt" includes: Generalized r-squared of 0.82, AIC (Akaike's An Information Criterion) of 36.38, and a Scale of 0.288. (3) The output file for the SEAWAVE-Q w/Surrogates model named "HAZ8_Atrazine_Seawave-Q_w_Surrogates_Output.txt" includes: Generalized r-squared of 0.85, AIC (Akaike's An Information Criterion) of 33.76, and a Scale of 0.268. These values match those for Site ID = HAZ, Pesticide = Atrazine, and Surrogate variable group 8 for each model type in Table 4 of the associated SIR.
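The selection rule described above (keep models whose sample size is at least five times the number of variables, rank by generalized R2, break ties with the lower AIC) can be written compactly. The sketch below uses hypothetical candidate names and values, not figures from the data release or the SIR.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str          # hypothetical label for a surrogate-variable combination
    r2: float          # generalized R2
    aic: float         # Akaike information criterion
    n_samples: int
    n_variables: int

def select_best(candidates, k=3):
    """Filter by the 5x sample-size rule, then sort by R2 (high) and AIC (low)."""
    eligible = [c for c in candidates if c.n_samples >= 5 * c.n_variables]
    ranked = sorted(eligible, key=lambda c: (-c.r2, c.aic))
    return ranked[:k]

candidates = [
    Candidate("group_A", r2=0.85, aic=33.8, n_samples=60, n_variables=8),
    Candidate("group_B", r2=0.85, aic=30.1, n_samples=60, n_variables=8),
    Candidate("group_C", r2=0.90, aic=29.0, n_samples=30, n_variables=8),  # too few samples
    Candidate("group_D", r2=0.80, aic=36.4, n_samples=60, n_variables=6),
]

best = select_best(candidates)
print([c.name for c in best])
```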
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains all publicly available surrogate data for gravitational waveforms produced within the point-particle black hole perturbation theory framework and calibrated to numerical relativity simulations performed with the Spectral Einstein Code (SpEC).
Several surrogate models are currently available in this catalog:
BHPTNRSur2dq1e3, for aligned-spin black hole binary systems with mass ratios varying from 3 to 1000 and spins in the range −0.8 ≤ χ1 ≤ 0.8 on the larger black hole. This surrogate model is trained on waveform data generated by point-particle black hole perturbation theory (ppBHPT) with calibration to numerical relativity (NR) data. The waveforms include all spin-weighted spherical harmonic modes up to ℓ=4 except the (4,1) and m=0 modes. Model details can be found in Rink et al. 2024. This data file is used to evaluate the surrogate model with either stand-alone Python code hosted by the Black Hole Perturbation Toolkit (Jupyter notebook tutorial) or the GWSurrogate Python package, which can be found on PyPI or conda-forge.
BHPTNRSur1dq1e4, an updated version of the EMRISur1dq1e4 model described below. The updated version includes better calibration to NR, a smoother transition to plunge model, and more harmonic modes. Model details can be found in Islam et al. 2022. This data file is used to evaluate the surrogate model with either stand-alone Python code hosted by the Black Hole Perturbation Toolkit (Jupyter notebook tutorial) or the GWSurrogate Python package, which can be found on PyPI or conda-forge.
EMRISur1dq1e4, for non-spinning black hole binary systems with mass-ratios varying from 3 to 10000. This surrogate model is trained on waveform data generated by point-particle black hole perturbation theory (ppBHPT), with the total mass rescaling parameter tuned to NR simulations. Available modes are [(2,2), (2,1), (3,3), (3,2), (3,1), (4,4), (4,3), (4,2), (5,5), (5,4), (5,3)]. The m<0 modes are deduced from the m>0 modes. Model details can be found in Rifat et al. 2019. This data file is used to evaluate the surrogate model with either stand-alone Python code hosted by the Black Hole Perturbation Toolkit (Jupyter notebook tutorial) or the GWSurrogate Python package (Jupyter notebook tutorial), which can be found on PyPI.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The goal of this work is to generate large, statistically representative datasets to train machine learning models for disruption prediction, given data from only a few existing discharges. Such a comprehensive training database is important to achieve satisfying and reliable prediction results in artificial neural network classifiers. Here, we aim for a robust augmentation of the training database for multivariate time series data using Student-t process regression. We apply Student-t process regression in a state space formulation via Bayesian filtering to tackle challenges imposed by outliers and noise in the training data set and to reduce the computational complexity. Thus, the method can also be used if the time resolution is high. We use an uncorrelated model for each dimension and impose correlations afterwards via coloring transformations. We demonstrate the efficacy of our approach on plasma diagnostics data of three different disruption classes from the DIII-D tokamak. To evaluate whether the distribution of the generated data is similar to the training data, we additionally perform statistical analyses using methods from time series analysis, descriptive statistics, and classic machine learning clustering algorithms.
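The step of imposing correlations afterwards via a coloring transformation can be illustrated with a common variant: multiply uncorrelated samples by the Cholesky factor of the target covariance. Whether the authors use a Cholesky factor or another matrix square root is not stated here, so this is an assumption, and the covariance values below are made up.

```python
from math import sqrt

def cholesky(S):
    """Lower-triangular L with L L^T = S, for a symmetric positive-definite S."""
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = sqrt(S[i][i] - s)
            else:
                L[i][j] = (S[i][j] - s) / L[j][j]
    return L

def color(white_samples, target_cov):
    """Map uncorrelated unit-variance samples x to y = L x, so that cov(y) = target_cov."""
    L = cholesky(target_cov)
    n = len(target_cov)
    return [[sum(L[i][k] * x[k] for k in range(n)) for i in range(n)]
            for x in white_samples]

# Hypothetical target covariance between two signal channels.
target = [[1.0, 0.8],
          [0.8, 1.0]]
L = cholesky(target)
colored = color([[1.0, 0.0], [0.0, 1.0]], target)
print(L)
print(colored)
```

In practice the white samples would be the independently generated per-dimension series, and the target covariance would be estimated from the training discharges.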
The data contained in the files are continuous bed load flux. The data are summed across the cross section and averaged over an hour. Data were collected with a calibrated surrogate sediment measurement system including 72 impact plates, mounted adjacent to each other, spanning the cross section of the Elwha River at river kilometer 5 (river mile 3.1). The following Science and Technology project numbers apply to the Elwha bed load data: 0115, 1709, 6209, 4542, 9562, and 6499.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract:
Parameter identification for marine ecosystem models is important for the assessment and validation of marine ecosystem models against observational data. Surrogate-based optimization (SBO) is a computationally efficient method to optimize complex models. SBO replaces the computationally expensive (high-fidelity) model by a surrogate constructed from a less accurate but computationally cheaper (low-fidelity) model in combination with an appropriate correction approach, which improves the accuracy of the low-fidelity model. To construct a computationally cheap low-fidelity model, we tested three different approaches to compute an approximation of the annually periodic solution (i.e., a steady annual cycle) of a marine ecosystem model: firstly, a reduced number of spin-up iterations (several decades instead of millennia); secondly, an artificial neural network (ANN) approximating the steady annual cycle; and, finally, a combination of both approaches. Except for the low-fidelity model using only the ANN, the SBO yielded a solution close to the target and reduced the computational effort significantly. If an ANN that appropriately approximates a marine ecosystem model is available, the SBO using this ANN as the low-fidelity model presents a promising and computationally efficient method for the validation.
Content:
SQLite database including the data of the different optimization runs
Structure and weights of the used artificial neural network
Tracer concentrations obtained from the high-fidelity model for the different optimization runs