Median values, interquartile range (IQR), and number of outliers.
We include a description of the data sets in the metadata as well as sample code and results from a simulated data set.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

It can be accessed through the following means: The R code is available online here: https://github.com/warrenjl/SpGPCW.

Format:

Abstract
The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability
Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics and requires an appropriate data use agreement.

Description
Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

File format: R workspace file.

Metadata (including data dictionary)
• y: Vector of binary responses (1: preterm birth, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate).

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
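The weekly standardization described above (subtract that week's median, divide by that week's IQR) can be sketched as follows. This is only an illustration in Python using the standard library, not the authors' R code: the function name `standardize_by_week`, the made-up exposure values, and the choice of the "inclusive" quartile method are all assumptions, since the description does not say how the quartiles were computed.

```python
from statistics import median, quantiles


def standardize_by_week(exposures):
    """Standardize each column (week) of a row-per-individual exposure table.

    Every value has its week's median subtracted and is then divided by
    that week's interquartile range (IQR).
    """
    n_weeks = len(exposures[0])
    standardized = [[0.0] * n_weeks for _ in exposures]
    for week in range(n_weeks):
        column = [row[week] for row in exposures]
        # Assumption: quartiles via the "inclusive" method; other
        # conventions give slightly different IQRs on small samples.
        q1, _, q3 = quantiles(column, n=4, method="inclusive")
        iqr = q3 - q1
        if iqr == 0:
            raise ValueError(f"week {week} has zero IQR; cannot standardize")
        centre = median(column)
        for i, value in enumerate(column):
            standardized[i][week] = (value - centre) / iqr
    return standardized


# Made-up exposures for illustration: 5 individuals (rows) by 2 weeks (columns).
raw = [
    [8.0, 10.0],
    [10.0, 14.0],
    [12.0, 18.0],
    [14.0, 22.0],
    [16.0, 26.0],
]
z = standardize_by_week(raw)
```

Because each week is centred and scaled separately, the two columns end up on the same scale even though the raw week-2 values are larger and more spread out. Without the medians and IQRs, the raw exposures cannot be recovered from the standardized values.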
Summary statistics (mean, standard deviation, median, interquartile range, number of subjects) for “ln_adducts” in cases, controls, and total population.
These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

It can be accessed through the following means:

File format: R workspace file; “Simulated_Dataset.RData”.

Metadata (including data dictionary)
• y: Vector of binary responses (1: adverse outcome, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

Code

Abstract
We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

Description
“CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.
“Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code has been applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

Optional Information (complete as necessary)
Required R packages:
• For running “CWVS_LMC.txt”:
  • msm: Sampling from the truncated normal distribution
  • mnormt: Sampling from the multivariate normal distribution
  • BayesLogit: Sampling from the Polya-Gamma distribution
• For running “Results_Summary.txt”:
  • plotrix: Plotting the posterior means and credible intervals

Instructions for Use

Reproducibility (Mandatory)
What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.
How to use the information:
• Load the “Simulated_Dataset.RData” workspace
• Run the code contained in “CWVS_LMC.txt”
• Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

Data
The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability
Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics and requires an appropriate data use agreement.

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
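The data dictionary for this dataset suggests how its pieces should fit together, and a reader preparing their own inputs might check that before running a model. The sketch below is a hypothetical Python helper, not part of the distributed R code. It assumes (the description does not state this explicitly) that z has one row per individual and one column per exposure period, and that alpha_true has one entry per exposure period.

```python
def check_inputs(y, x, z, n, m, p, alpha_true):
    """Return a list of problems; an empty list means the inputs agree.

    Assumed layout (not confirmed by the dataset description):
    y has length n, x is n-by-p, z is n-by-m, alpha_true has length m.
    """
    problems = []
    if len(y) != n:
        problems.append(f"y has {len(y)} entries, expected n={n}")
    if any(value not in (0, 1) for value in y):
        problems.append("y must contain only 0 (control) or 1 (case)")
    if len(x) != n or any(len(row) != p for row in x):
        problems.append(f"x must be {n} rows by {p} columns")
    if len(z) != n or any(len(row) != m for row in z):
        problems.append(f"z must be {n} rows by {m} columns")
    if len(alpha_true) != m:
        problems.append(f"alpha_true has {len(alpha_true)} entries, expected m={m}")
    return problems


# Tiny made-up example: 3 individuals, 2 exposure weeks, 2 covariates.
y = [1, 0, 0]
x = [[1.0, 0.3], [1.0, -1.2], [1.0, 0.8]]
z = [[0.5, -0.5], [0.0, 1.0], [-1.0, 0.0]]
alpha_true = [0.0, 0.4]

ok = check_inputs(y, x, z, n=3, m=2, p=2, alpha_true=alpha_true)
bad = check_inputs(y, x, z, n=3, m=3, p=2, alpha_true=alpha_true)
```

Here `ok` is empty, while `bad` reports that both z and alpha_true disagree with m=3.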
Number of samples (n), median age, and sex distribution of non-infected controls and SARS-CoV-2 infected patients (IQR, interquartile range).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The number and percentage as well as mean, median and interquartile range of the total payment received by an editor of each specialty journal.
A Pearson's chi-square test and a Mann-Whitney U test, both with post hoc Bonferroni correction, were used for statistical analysis. Adjusted significance value p<0.016 (*). EO-PE: early-onset preeclampsia; LO-PE: late-onset preeclampsia; IQR: interquartile range; BMI: body mass index.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The objective was to identify horn fly-susceptible and horn fly-resistant animals in a Sindhi herd by two different methods. The number of horn flies on 25 adult cows from a Sindhi herd was counted every 14 days. As it was an open herd, the trial period was divided into three stages based on cow composition, with the same cows maintained within each period: 2011-2012 (36 biweekly observations); 2012-2013 (26 biweekly observations); and 2013-2014 (22 biweekly observations). Only ten cows were present in the herd throughout the entire period from 2011-2014 (84 biweekly observations). The variables evaluated were the number of horn flies on the cows, the sampling date, and a binary variable for rainy or dry season. Descriptive statistics were calculated, including the median, the interquartile range, and the minimum and maximum number of horn flies, for each observation day.

For the present analysis, fly-susceptible cows were identified as those for which the infestation of flies appeared in the upper quartile for more than 50% of the weeks and in the lower quartile for less than 20% of the weeks. In contrast, fly-resistant cows were defined as those for which the fly counts appeared in the lower quartile for more than 50% of the weeks and in the upper quartile for less than 20% of the weeks.

To identify resistant and susceptible cows for the best linear unbiased predictions (BLUPs) analysis, three repeated-measures linear mixed models (one for each period) were constructed with cow as a random-effect intercept. The response variable was the log10-transformed count of horn flies per cow, and the explanatory variables were the observation date and season. As the trial took place in a semiarid region with two well-established seasons, the season was evaluated monthly as a binary outcome: rainy if rainfall was 50 mm or more, and dry if rainfall was less than 50 mm.

The standardized residuals and the BLUPs of the random effects were obtained and assessed for normality, heteroscedasticity, and outlying observations. Each cow’s BLUPs were plotted against the average quantile rank values, which were determined as the difference between the number of weeks in the high-risk quartile group and the number of weeks in the low-risk quartile group, divided by the total number of weeks in each of the observation periods. A linear model was fitted for the BLUP values against the average rank values, and the correlation between the two methods was tested using Spearman’s correlation coefficient. The animal effect values (BLUPs) were evaluated by percentiles, with 0 representing the lowest counts (or more resistant cows) and 10 representing the highest counts (or more susceptible cows). These BLUPs represented only the effect of cow and not the effect of day, season, or other unmeasured confounders.
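The quartile-based rule and the average quantile rank described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's code: quartile cutoffs are computed across cows on each observation day using the "inclusive" method, a count is "in the upper quartile" only if it is strictly above the third quartile (and likewise strictly below the first quartile for the lower one), and the fly counts are made up. The function name `classify_cows` is hypothetical.

```python
from statistics import quantiles


def classify_cows(counts_by_cow):
    """Label each cow from its fly counts on a shared set of observation days.

    Returns {cow: {"label": ..., "avg_rank": ...}} where avg_rank is
    (days in upper quartile - days in lower quartile) / total days.
    """
    cows = list(counts_by_cow)
    n_days = len(counts_by_cow[cows[0]])
    upper = {cow: 0 for cow in cows}
    lower = {cow: 0 for cow in cows}

    for day in range(n_days):
        day_counts = [counts_by_cow[cow][day] for cow in cows]
        # Assumption: per-day cutoffs across cows, "inclusive" quartiles.
        q1, _, q3 = quantiles(day_counts, n=4, method="inclusive")
        for cow in cows:
            count = counts_by_cow[cow][day]
            if count > q3:
                upper[cow] += 1
            elif count < q1:
                lower[cow] += 1

    results = {}
    for cow in cows:
        frac_upper = upper[cow] / n_days
        frac_lower = lower[cow] / n_days
        if frac_upper > 0.5 and frac_lower < 0.2:
            label = "susceptible"
        elif frac_lower > 0.5 and frac_upper < 0.2:
            label = "resistant"
        else:
            label = "intermediate"
        results[cow] = {
            "label": label,
            "avg_rank": (upper[cow] - lower[cow]) / n_days,
        }
    return results


# Made-up counts: five cows observed on the same four days.
counts = {
    "A": [50, 60, 55, 70],
    "B": [30, 35, 20, 40],
    "C": [20, 25, 30, 22],
    "D": [10, 12, 15, 11],
    "E": [2, 3, 1, 4],
}
result = classify_cows(counts)
```

In this toy herd, cow A is above the third quartile on every day and cow E is below the first quartile on every day, so they are labeled susceptible and resistant respectively, with average ranks of 1.0 and -1.0. The remaining cows fall in neither group.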
Number of samples (n), severity group, and median day post onset (DPO) of symptoms for patients who contracted SARS-CoV-2, belonging to three cohorts (IQR, interquartile range).
Background: Coronavirus disease 2019 (COVID-19) was first identified in Wuhan, China, in December 2019 and quickly spread throughout China and the rest of the world. Many mathematical models have been developed to understand and predict the infectiousness of COVID-19. We aim to summarize these models to inform efforts to manage the current outbreak.

Methods: We searched PubMed, Web of Science, EMBASE, bioRxiv, medRxiv, arXiv, Preprints, and National Knowledge Infrastructure (Chinese database) for relevant studies published between 1 December 2019 and 21 February 2020. References were screened for additional publications. Crucial indicators were extracted and analysed. We also built a mathematical model for the evolution of the epidemic in Wuhan that synthesised extracted indicators.

Results: Fifty-two articles involving 75 mathematical or statistical models were included in our systematic review. The overall median basic reproduction number (R0) was 3.77 [interquartile range (IQR) 2.78–5.13], which dropped to a controlled reproduction number (Rc) of 1.88 (IQR 1.41–2.24) after city lockdown. The median incubation and infectious periods were 5.90 (IQR 4.78–6.25) and 9.94 (IQR 3.93–13.50) days, respectively. The median case-fatality rate (CFR) was 2.9% (IQR 2.3–5.4%). Our mathematical model showed that, in Wuhan, the peak time of infection is likely to be March 2020 with a median size of 98,333 infected cases (range 55,225–188,284). The earliest elimination of ongoing transmission is likely to be achieved around 7 May 2020.

Conclusions: Our analysis found a sustained Rc and prolonged incubation/infectious periods, suggesting COVID-19 is highly infectious. Although interventions in China have been effective in controlling secondary transmission, sustained global efforts are needed to contain an emerging pandemic. Alternative interventions can be explored using modelling studies to better inform policymaking as the outbreak continues.
*IQR indicates interquartile range.
Background and purpose: Sex differences in cerebral microbleeds (CMBs) are not well-known. We aimed to assess the impact of sex on the progression of CMBs.

Methods: The CHALLENGE (Comparison Study of Cilostazol and Aspirin on Changes in Volume of Cerebral Small Vessel Disease White Matter Changes) database was analyzed. Out of 256 subjects, 189 participants with a follow-up brain scan were included in the analysis. The linear mixed-effect model was used to compare the 2-year changes in the number of CMBs between men and women.

Results: A total of 65 men and 124 women were analyzed. There were no significant differences in the prevalence (70.8 vs. 71.8%; P = 1.000) and the median [interquartile range (IQR)] number of total CMBs [1 (0–7) vs. 2 (0–7); P = 0.810] at baseline between men and women. The median (IQR) increase over 2 years in the number of CMBs was significantly higher in women than in men [1 (0–2) vs. 0 (0–1), P = 0.026]. The multivariate linear mixed-effects model showed that women had a significantly greater increase in the number of total, deep, and lobar CMBs compared to men after adjusting for age and the baseline number of CMBs [estimated log-transformed mean of difference between women and men: 0.040 (P = 0.028) for total CMBs, 0.037 (P = 0.047) for deep CMBs, and 0.047 (P = 0.009) for lobar CMBs].

Conclusion: The progression of CMB over 2 years was significantly greater in women than in men.
Introduction: The aim of this study was to determine patterns of physical activity in pet dogs using real-world data at a population scale aided by the use of accelerometers and electronic health records (EHRs).

Methods: A directed acyclic graph (DAG) was created to capture background knowledge and causal assumptions related to dog activity, and this was used to identify relevant data sources, which included activity data from commercially available accelerometers, and health and patient metadata from the EHRs. Linear mixed models (LMM) were fitted to the number of active minutes following log-transformation with the fixed effects tested based on the variables of interest and the adjustment sets indicated by the DAG.

Results: Activity was recorded on 8,726,606 days for 28,562 dogs with 136,876 associated EHRs, with the median number of activity records per dog being 162 [interquartile range (IQR) 60–390]. The average recorded activity per day of 51 min was much lower than previous estimates of physical activity, and there was wide variation in activity levels from less than 10 to over 600 min per day. Physical activity decreased with age, an effect that was dependent on breed size, whereby there was a greater decline in activity for age as breed size increased. Activity increased with breed size and owner age independently. Activity also varied independently with sex, location, climate, season and day of the week: males were more active than females, and dogs were more active in rural areas, in hot dry or marine climates, in spring, and on weekends.

Conclusion: Accelerometer-derived activity data gathered from pet dogs living in North America were used to determine associations with both dog and environmental characteristics. Knowledge of these associations could be used to inform daily exercise and caloric requirements for dogs, and how they should be adapted according to individual circumstances.
Summaries of trap data (number of traps deployed and flies/trap/day) plus estimated dry season habitat suitability (median and interquartile range extracted from the BRT model) by distance to nearest Tiny Target.
IQR: interquartile range; N1: number of subjects; TST: Tuberculin Skin Test; QFT: QuantiFERON-TB Gold In-Tube; MTC-score: Mycobacterium tuberculosis contact score; n: number of subjects with intestinal helminths; N2: number of subjects studied for helminths.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
P-values are for the test of difference in quantile-normalised characteristic between diagnostic groups, adjusted for age, gender, and relatedness. n = number; interquartile range is lower quartile (LQ) and upper quartile (UQ). BMI = body mass index; SBP = systolic blood pressure; DBP = diastolic blood pressure; FBG = fasting blood glucose; TG = triglyceride; HDL-c = high density lipoprotein cholesterol; Scr = serum creatinine; eGFR = estimated glomerular filtration rate; UACR = urinary albumin-to-creatinine ratio.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: To date, there is a lack of sufficient evidence on the type of clusters in which severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is most likely to spread. Notably, the differences between cluster-level and population-level outbreaks in epidemiological characteristics and transmissibility remain unclear. Identifying the characteristics of these two levels, including epidemiology and transmission dynamics, allows us to develop better surveillance and control strategies following the current removal of suppression measures in China.

Methods: We described the epidemiological characteristics of SARS-CoV-2 and calculated its transmissibility by taking a Chinese city as an example. We used descriptive analysis to characterize epidemiological features of the coronavirus disease 2019 (COVID-19) incidence database from 1 January 2020 to 2 March 2020 in Chaoyang District, Beijing City, China. The susceptible-exposed-infected-asymptomatic-recovered (SEIAR) model was fitted to the dataset, and the effective reproduction number (Reff) was calculated as the transmissibility of a single population. Also, the basic reproduction number (R0) was calculated by definition for three types of clusters (household, factory, and community) as the transmissibility of subgroups.

Results: The epidemic curve in Chaoyang District was divided into three stages. We included nine clusters (subgroups), which comprised seven household-level clusters, one factory-level cluster, and one community-level cluster, with sizes ranging from 2 to 17 cases. For the nine clusters, the median incubation period was 17.0 days [interquartile range (IQR): 8.4–24.0 days (d)], and the average interval between date of onset (report date) and diagnosis date was 1.9 d (IQR: 1.7 to 6.4 d). At the population level, the transmissibility of the virus was high in the early stage of the epidemic (Reff = 4.81). The transmissibility was higher in factory-level clusters (R0 = 16) than in community-level clusters (R0 = 3) and household-level clusters (R0 = 1).

Conclusions: In Chaoyang District, the epidemiological features of SARS-CoV-2 showed a multi-stage pattern. Many clusters were reported to occur indoors, mostly from households and factories, and few from the community. The risk of transmission varies by setting, with indoor settings being more severe than outdoor settings. Reported household clusters were the predominant type, but the population size of the different types of clusters limited transmission. The transmissibility of SARS-CoV-2 was different between a single population and its subgroups, with cluster-level transmissibility higher than population-level transmissibility.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Standardized test descriptive statistics: number of participants, mean, standard deviation, median and interquartile range.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
IQR: interquartile range; n/N: number of studies with the condition/total number of studies; *Switzerland and sub-Saharan area. **Calculated from 17 studies enrolling HIV-uninfected individuals. †Data available for 34 studies; ‡Only HIV-infected individuals.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: The Millennium Development Goals ensured a significant reduction in childhood mortality. However, this reduction simultaneously raised concerns about the long-term outcomes of survivors of early childhood insults. This systematic review focuses on the long-term neurocognitive and mental health outcomes of neonatal insults (NNI) survivors who are six years or older.

Methods: Two independent reviewers conducted a comprehensive search for empirical literature by combining index and free terms from the inception of the databases until 10 October 2019. We also searched for additional relevant literature from grey literature and using reference tracking. Studies were included if they: were empirical studies conducted in humans; followed participants at six years of age or older; had an explicit diagnosis of NNI; and explicitly defined the outcome and impairment. Medians and interquartile ranges (IQR) of the proportions of survivors of the different NNI with any impairment were calculated. A random-effects model was used to explore the estimates accounted for by each impairment domain.

Results: Fifty-two studies with 94,978 participants who survived NNI were included in this systematic review. The overall prevalence of impairment in the survivors of NNI was 10.0% (95% CI 9.8–10.2). The highest prevalence of impairment was accounted for by congenital rubella (38.8%: 95% CI 18.8–60.9), congenital cytomegalovirus (23.6%: 95% CI 9.5–41.5), and hypoxic-ischemic encephalopathy (23.3%: 95% CI 14.7–33.1), while neonatal jaundice had the lowest proportion (8.6%: 95% CI 2.7–17.3). The most affected domain was the neurodevelopmental domain (16.6%: 95% CI 13.6–19.8). The frequency of impairment was highest for neurodevelopmental impairment [22.0% (IQR = 9.2–24.8)] and lowest for school problems [0.0% (IQR = 0.0–0.00)] in any of the conditions.

Conclusion: The long-term impact of NNI is also experienced in survivors of NNI who are 6 years or older, with impairments mostly experienced in the neurodevelopmental domain. However, there are limited studies on long-term outcomes of NNI in sub-Saharan Africa despite the high burden of NNI in the region.

Trial registration: Registration number: CRD42018082119.