Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The correlation coefficient is a commonly used criterion for measuring the strength of a linear relationship between two quantitative variables. For a bivariate normal distribution, numerous procedures have been proposed for testing a precise null hypothesis about the correlation coefficient, whereas the construction of flexible procedures for testing a set of (multiple) precise and/or interval hypotheses has received less attention. This paper fills the gap by proposing an objective Bayesian testing procedure based on divergence-based priors. The proposed Bayes factors can be used to test any combination of precise and interval hypotheses and also allow a researcher to quantify the evidence in the data in favor of the null or any other hypothesis under consideration. An extensive simulation study compares the performance of the proposed Bayesian methods with that of existing methods in the literature. Finally, a real-data example is provided for illustrative purposes.
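The paper's divergence-based priors are not reproduced here; as an illustration of the Bayes-factor logic for a precise null H0: ρ = 0, the following is a Savage–Dickey sketch under a simple Uniform(−1, 1) prior, with the bivariate normal likelihood evaluated on standardized data:

```python
import numpy as np

def bf01_savage_dickey(x, y, grid=np.linspace(-0.99, 0.99, 397)):
    """Savage-Dickey Bayes factor BF01 for H0: rho = 0 under a
    Uniform(-1, 1) prior on rho (an illustrative prior, not the
    divergence-based priors of the paper). Data are standardized,
    so the bivariate normal likelihood uses unit variances."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    n = len(x)
    sxx, syy, sxy = np.sum(x * x), np.sum(y * y), np.sum(x * y)
    # Bivariate normal log-likelihood with unit variances, up to a constant
    ll = (-0.5 * n * np.log(1 - grid**2)
          - (sxx - 2 * grid * sxy + syy) / (2 * (1 - grid**2)))
    post = np.exp(ll - ll.max())                 # unnormalized posterior (flat prior)
    post /= post.sum() * (grid[1] - grid[0])     # normalize on the grid
    prior_at_0 = 0.5                             # Uniform(-1, 1) density at rho = 0
    post_at_0 = np.interp(0.0, grid, post)
    return post_at_0 / prior_at_0                # BF01 = posterior / prior at rho = 0

rng = np.random.default_rng(1)
z = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=200)
bf01 = bf01_savage_dickey(z[:, 0], z[:, 1])      # far below 1: evidence against H0
```

With strongly correlated data the posterior mass moves away from zero, so BF01 becomes very small; with data simulated under the null it would tend to exceed 1, quantifying evidence *for* H0, which is the asymmetry the abstract highlights.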
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Notes: Three population structures are considered. The contributions of the causal site to both traits range from 0.0025 to 0.01. Powers are estimated from 1,000 replicates. See the notes in Table 1 for sample sizes. Abbreviations: T12, the proposed test for bivariate analysis; T1, the proposed test for the first trait only; T2, the proposed test for the second trait only.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Background: Clean water is an essential part of a healthy human life and wellbeing. In recent years, rapid population growth, high illiteracy rates, a lack of sustainable development, and climate change have posed a global challenge in developing countries. Discontinuity of the drinking water supply forces households either to use unsafe water storage materials or to draw water from unsafe sources. The present study aimed to identify the determinants of water source types, use, water quality, and the sanitation perception of physical parameters among urban households in North-West Ethiopia.
Methods: A community-based cross-sectional study was conducted among households from February to March 2019. A pretested, structured, interviewer-administered questionnaire was used to collect the data. Households were selected randomly, proportional to the number of households in each kebele. MS Excel and R version 3.6.2 were used to enter and analyze the data, respectively. Descriptive statistics using frequencies and percentages were used to summarize the sample data with respect to the predictor variables. Both bivariate and multivariate logistic regressions were used to assess the associations between the independent and response variables.
Results: Four hundred eighteen (418) households participated. Of these, 78.95% used improved and 21.05% used unimproved drinking water sources. Households' drinking water sources were significantly associated with the age of the participant (χ2 = 20.392, df = 3), educational status (χ2 = 19.358, df = 4), source of income (χ2 = 21.777, df = 3), monthly income (χ2 = 13.322, df = 3), availability of additional facilities (χ2 = 98.144, df = 7), cleanliness status (χ2 = 42.979, df = 4), scarcity of water (χ2 = 5.1388, df = 1), and family size (χ2 = 9.934, df = 2). The logistic regression analysis also indicated that these factors significantly determine the water source types used by the households. Factors such as availability of a toilet facility, household member type, and sex of the head of the household were not significantly associated with drinking water sources.
Conclusion: The use of drinking water from improved sources was determined by various demographic, socio-economic, sanitation, and hygiene-related factors. Therefore, the local, regional, and national governments and other supporting organizations should improve the accessibility and adequacy of drinking water from improved sources in the area.
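The chi-square associations reported above come from standard contingency-table tests. A minimal sketch with `scipy.stats.chi2_contingency` follows; the counts are purely illustrative (the study's raw data are not included here), chosen only so that the table has df = 3 like several of the reported tests:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x4 table: water source (improved / unimproved) by an
# ordinal factor with four levels. Counts are illustrative, not study data.
table = np.array([[120, 90, 70, 50],
                  [10, 20, 25, 33]])
chi2, p, df, expected = chi2_contingency(table)
# df = (rows - 1) * (cols - 1) = 3, matching the df of such cross-tabulations
```

A small p-value here would indicate an association between the row and column factors, which is exactly how the study screens candidate determinants before the logistic regression step.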
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
045 sample points that were used to construct the CCD metamodel; and (3) the Monte Carlo simulation sample points that were used for validation.
ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
We have conducted bivariate and multivariate statistical analysis of data measuring the integrated luminosity, shape, and potential depth of the Einstein sample of early-type galaxies (presented by Fabbiano et al. 1992, ApJS, 80, 531). We find significant correlations between the X-ray properties and the axial ratios (a/b) of our sample, such that the roundest systems tend to have the highest L_X and L_X/L_B. The most radio-loud objects are also the roundest. We confirm the assertion of Bender et al. (1989, A&A, 217, 35) that galaxies with high L_X are boxy (have negative a_4). Both a/b and a_4 are correlated with L_B, but not with IRAS 12 um and 100 um luminosities. There are strong correlations between L_X, Mg_2 and sigma_v in the sense that those systems with the deepest potential wells have the highest L_X and Mg_2. Thus the depth of the potential well appears to govern both the ability to retain an ISM at the present epoch and to retain the enriched ejecta of early star formation bursts. Both L_X/L_B and L_6 (the 6 cm radio luminosity) show threshold effects with sigma_v, exhibiting sharp increases at log(sigma_v) ~ 2.2. Finally, there is clearly an interrelationship between the various stellar and structural parameters: The scatter in the bivariate relationships between the shape parameters (a/b and a_4) and the depth parameter (sigma_v) is a function of abundance in the sense that, for a given a_4 or a/b, the systems with the highest sigma_v also have the highest Mg_2. Furthermore, for a constant sigma_v, disky galaxies tend to have higher Mg_2 than boxy ones. Alternatively, for a given abundance, boxy ellipticals tend to be more massive than disky ellipticals. One possibility is that early-type galaxies of a given mass, originating from mergers (boxy ellipticals), have lower abundances than "primordial" (disky) early-type galaxies. 
Another is that disky inner isophotes are due not to primordial dissipational collapse, but to either the self-gravitating inner disks of captured spirals or the dissipational collapse of new disk structures from the premerger ISM. The high measured nuclear Mg_2 values would thus be due to enrichment from secondary bursts of star formation triggered by the merging event.
This paper investigates estimation of the association parameter of the Morgenstern type bivariate distribution using a modified maximum likelihood method in cases where the regular maximum likelihood method fails to achieve estimation. Simple random sampling, concomitants of order statistics, and bivariate ranked set sampling methods are used and compared. The efficiency and bias of the resulting estimators are compared for two specific examples: the Morgenstern type bivariate uniform and exponential distributions.
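For concreteness, a simple-random-sampling sketch of the Morgenstern (FGM) bivariate uniform model: its density is f(x, y) = 1 + α(1 − 2x)(1 − 2y) on the unit square with |α| ≤ 1, so α can be estimated by direct likelihood maximization. This is the ordinary MLE for illustration only, not the modified estimators or ranked-set designs studied in the paper:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def sample_fgm_uniform(alpha, n, rng):
    """Draw from the Morgenstern (FGM) bivariate uniform distribution
    via conditional inversion: solve F(y | x) = y * (1 + a*(1 - y)) = v."""
    x = rng.uniform(size=n)
    v = rng.uniform(size=n)
    a = alpha * (1 - 2 * x)
    y = np.where(np.abs(a) < 1e-12, v,
                 ((1 + a) - np.sqrt((1 + a) ** 2 - 4 * a * v)) / (2 * a))
    return x, y

def fgm_mle(x, y):
    """Naive MLE of the association parameter on (-1, 1), maximizing
    sum(log(1 + alpha * (1 - 2x)(1 - 2y))). Illustrative only."""
    t = (1 - 2 * x) * (1 - 2 * y)
    nll = lambda alpha: -np.sum(np.log1p(alpha * t))
    res = minimize_scalar(nll, bounds=(-0.999, 0.999), method="bounded")
    return res.x

rng = np.random.default_rng(7)
x, y = sample_fgm_uniform(0.5, 5000, rng)
alpha_hat = fgm_mle(x, y)   # close to the true value 0.5 for large n
```

For uniform marginals the correlation is ρ = α/3, which is why the FGM family only models weak dependence; the paper's interest is in designs and likelihood modifications that improve estimation of this bounded parameter.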
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This article compares distribution functions among pairs of locations in their domains, in contrast to the typical approach of univariate comparison across individual locations. This bivariate approach is studied in the presence of sampling bias, which has been gaining attention in COVID-19 studies that over-represent more symptomatic people. In cases with either known or unknown sampling bias, we introduce Anderson–Darling-type tests based on both the univariate and bivariate formulation. A simulation study shows the superior performance of the bivariate approach over the univariate one. We illustrate the proposed methods using real data on the distribution of the number of symptoms suggestive of COVID-19.
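The proposed bivariate Anderson–Darling-type tests are not part of standard libraries; as a univariate baseline, scipy's k-sample Anderson–Darling test compares the distributions observed at two locations (simulated data below; a small significance level indicates different distributions):

```python
import numpy as np
from scipy.stats import anderson_ksamp

# Simulated scores at two locations with genuinely different distributions.
rng = np.random.default_rng(0)
loc_a = rng.normal(0.0, 1.0, size=200)   # location A
loc_b = rng.normal(2.0, 1.0, size=200)   # location B, shifted distribution
res = anderson_ksamp([loc_a, loc_b])
# scipy floors res.significance_level at 0.001 for very strong evidence
```

The article's bivariate formulation compares distribution functions for *pairs* of locations jointly and additionally accounts for sampling bias, which this univariate baseline does not.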
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset for: Leipold, B. & Loepthien, T. (2021). Attentive and emotional listening to music: The role of positive and negative affect. Jahrbuch Musikpsychologie, 30. https://doi.org/10.5964/jbdgm.78 In a cross-sectional study, associations of global affect with two ways of listening to music – attentive–analytical listening (AL) and emotional listening (EL) – were examined. More specifically, the degrees to which AL and EL are differentially correlated with positive and negative affect were examined. In Study 1, a sample of 1,291 individuals responded to questionnaires on listening to music, positive affect (PA), and negative affect (NA). We used the PANAS, which measures PA and NA as high-arousal dimensions. AL was positively correlated with PA, EL with NA. Moderation analyses showed stronger associations between PA and AL when NA was low. Study 2 (499 participants) differentiated between three facets of affect and focused, in addition to PA and NA, on the role of relaxation. Similar to the findings of Study 1, AL was correlated with PA, and EL with NA and PA. Moderation analyses indicated that the degree to which PA is associated with an individual's tendency to listen to music attentively depends on their degree of relaxation. In addition, the correlation between pleasant activation and EL was stronger for individuals who were more relaxed; for individuals who were less relaxed, the correlation between unpleasant activation and EL was stronger. In sum, the results demonstrate not only simple bivariate correlations but also that the expected associations vary depending on the different affective states. We argue that the results reflect a dual function of listening to music, which includes emotional regulation and information processing. (Dataset: Study 1)
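Moderation analyses of the kind described amount to testing an interaction term in a regression, e.g. EL ~ PA + relaxation + PA×relaxation. A minimal sketch with simulated data (the variable names and coefficients are illustrative, not the study's):

```python
import numpy as np

# Simulated moderation: the effect of PA on EL depends on relaxation.
rng = np.random.default_rng(3)
n = 500
pa = rng.normal(size=n)          # positive affect (predictor)
relax = rng.normal(size=n)       # relaxation (moderator)
el = 0.3 * pa + 0.2 * relax + 0.4 * pa * relax + rng.normal(size=n)

# OLS with an interaction column; beta[3] is the moderation effect.
X = np.column_stack([np.ones(n), pa, relax, pa * relax])
beta, *_ = np.linalg.lstsq(X, el, rcond=None)
```

A significant interaction coefficient means the PA–EL slope changes with relaxation, which is precisely the pattern the moderation analyses in Studies 1 and 2 report.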
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An example of combining ANOVA terms for bivariate principal component data to create the ANODIS F-statistic, where N is the total number of samples drawn and K the number of assemblages compared.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The paper considers some of the issues emerging from the discrete wavelet analysis of popular bivariate spectral quantities such as the coherence and phase spectra and the frequency-dependent time delay. The approach utilised here is based on the maximal overlap discrete Hilbert wavelet transform (MODHWT). Firstly, via a broad set of simulation experiments, we examine the small and large sample properties of two wavelet estimators of the scale-dependent time delay. The estimators are the wavelet cross-correlator and the wavelet phase angle-based estimator. Our results provide some practical guidelines for the empirical examination of short- and medium-term lead-lag relations for octave frequency bands. Further, we point out a deficiency in the implementation of the MODHWT and suggest using a modified implementation scheme, which was proposed earlier in the context of the dual-tree complex wavelet transform. In addition, we show how MODHWT-based wavelet quantities can serve to approximate the Fourier bivariate spectra and discuss issues connected with building confidence intervals for them. The discrete wavelet analysis of coherence and phase angle is illustrated with a scale-dependent examination of business cycle synchronisation between 11 euro zone countries. The study is supplemented by a wavelet analysis of the variance and covariance of the euro zone business cycles. The empirical examination underlines the good localisation properties and high computational efficiency of the wavelet transformations applied and provides new arguments in favour of the endogeneity hypothesis of the optimum currency area criteria as well as the wavelet evidence on dating the Great Moderation in the euro zone.
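The MODHWT-based estimators themselves are not available in common libraries, but the Fourier bivariate quantities they approximate — coherence, the phase spectrum, and the implied frequency-dependent time delay τ(f) = −phase/(2πf) — can be estimated with scipy; a sketch on a pair of synthetic signals with a known lag:

```python
import numpy as np
from scipy.signal import coherence, csd

fs = 100.0                                   # sampling rate (Hz)
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(2)
x = np.sin(2 * np.pi * 5 * t) + 0.1 * rng.normal(size=t.size)
y = np.roll(x, 10)                           # y lags x by 10 samples (0.1 s)

f, cxy = coherence(x, y, fs=fs, nperseg=256)     # magnitude-squared coherence
f2, pxy = csd(x, y, fs=fs, nperseg=256)          # cross-spectral density
phase = np.angle(pxy)                            # phase spectrum
# time delay at each frequency: -phase / (2*pi*f); the sign depends on the
# cross-spectrum convention, so check it against a known-lag pair like this one
```

For a pure delayed copy the coherence is close to 1 across frequencies and the phase is (to within wrapping) linear in f, which is the Fourier counterpart of the scale-dependent lead-lag relations the paper studies per octave band.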
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A bivariate random sample of motor insurance claims that include material damage and bodily injury. The sample was provided by an insurer in Spain and corresponds to claims that occurred during 2014. The sample size is n = 1751, representing 10% of the total analyzed claims.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Average ultimate tensile strength (UTS) and tensile strain for the 45° and 90° specimens.
Pooling individual samples prior to DNA extraction can mitigate the cost of DNA extraction and genotyping; however, these methods need to accurately generate equal representation of individuals within pools. This data set was generated to determine the accuracy of pool construction based on white blood cell counts compared to two common DNA quantification methods. Fifty individual bovine blood samples were collected and then pooled, with all individuals represented in each pool. Pools were constructed with the target of equal representation of each individual animal based on number of white blood cells, spectrophotometric readings, spectrofluorometric readings, and whole blood volume, with 9 pools per method and a total of 36 pools. Pools and the individual samples that comprised the pools were genotyped using a commercially available genotyping array. ASReml was used to estimate variance components for individual animal contributions to pools. The correlation between animal contributions to two pools was estimated using bivariate analysis with starting values set to the result of a univariate analysis. The dataset includes: (1) pooling allele frequencies (PAF) for all pools and individual animals, computed from normalized intensities for red (X) and green (Y); PAF = X/(X+Y); (2) genotypes, i.e., number of copies of the B (green) allele (0, 1, 2); (3) definitions for each sample.
Resources in this dataset:
Resource Title: Pooling Allele Frequencies (PAF) for all pools and individual animals. File Name: pafAnimal.csv.gz. Resource Description: Pooling allele frequencies (PAF) for all pools and individual animals, computed from normalized intensities for red (X) and green (Y); PAF = X / (X + Y).
Resource Title: Genotypes for individuals within pools. File Name: g.csv.gz. Resource Description: Genotypes (number of copies of the B (green) allele: 0, 1, 2) for individual bovine animals within pools.
Resource Title: Sample Definitions. File Name: XY Data Key.xlsx. Resource Description: Definitions for each sample (both pools and individual animals).
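The PAF definition quoted above is straightforward to apply to the normalized intensity columns; a minimal sketch with illustrative intensity values:

```python
import numpy as np

# Pooling allele frequency from normalized red (X) and green (Y) intensities,
# as defined in the dataset description: PAF = X / (X + Y).
X = np.array([0.10, 0.45, 0.90, 0.52])   # illustrative red intensities
Y = np.array([0.92, 0.55, 0.08, 0.48])   # illustrative green intensities
paf = X / (X + Y)
# each PAF lies in [0, 1]; for an individual animal, values near 0, 0.5, and 1
# correspond to 0, 1, and 2 copies of the B allele, while pool PAFs estimate
# the allele frequency of the pooled animals
```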
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset and Octave/MATLAB codes/scripts for data analysis. Background: Methods for p-value correction are criticized for either increasing Type II error or improperly reducing Type I error. This problem is worse when dealing with thousands or even hundreds of paired comparisons between waves or images which are performed point-to-point. This text considers patterns in probability vectors resulting from multiple point-to-point comparisons between two event-related potential (ERP) waves (mass univariate analysis) to correct p-values, where clusters of significant p-values may indicate true H0 rejection. New method: We used ERP data from normal subjects and subjects with attention deficit hyperactivity disorder (ADHD) under a cued forced two-choice test to study attention. The decimal logarithm of the p-vector (p') was convolved with a Gaussian window whose length was set as the shortest lag above which the autocorrelation of each ERP wave may be assumed to have vanished. To verify the reliability of the present correction method, we performed Monte Carlo (MC) simulations to (1) evaluate confidence intervals of rejected and non-rejected areas of our data, (2) evaluate differences between corrected and uncorrected p-vectors or simulated ones in terms of the distribution of significant p-values, and (3) empirically verify the rate of Type I error (comparing 10,000 pairs of mixed samples with control and ADHD subjects). Results: The present method reduced the range of p'-values that did not show covariance with neighbors (Type I and also Type II errors). The differences between the simulated or raw p-vector and the corrected p-vectors were, respectively, minimal and maximal when the window length for the p-vector convolution was set by the autocorrelation. Comparison with existing methods: Our method was less conservative, while FDR methods rejected essentially all significant p-values for the Pz and O2 channels.
The MC simulations, the gold-standard method for error correction, presented a 2.78±4.83% difference (all 20 channels) from the corrected p-vector, while the difference between the raw and corrected p-vectors was 5.96±5.00% (p = 0.0003). Conclusion: As a cluster-based correction, the present new method, which adopts adaptive parameters to set the correction, seems to be biologically and statistically suitable for correcting p-values in mass univariate analysis of ERP waves.
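The core smoothing step — convolving the decimal log of the p-vector with a normalized Gaussian window — can be sketched as follows. The window length is fixed here for illustration; the method described sets it adaptively from the shortest lag at which the ERP autocorrelation vanishes:

```python
import numpy as np

def smooth_log_p(p, window_len):
    """Convolve log10(p) with a normalized Gaussian window and back-transform.
    Isolated small p-values are pulled up while clusters of small p-values
    survive, which is the cluster-based logic of the described correction."""
    logp = np.log10(p)
    half = window_len // 2
    t = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (t / (window_len / 6)) ** 2)
    kernel /= kernel.sum()                       # weights sum to 1
    smoothed = np.convolve(logp, kernel, mode="same")
    return 10.0 ** smoothed                      # corrected p-values

rng = np.random.default_rng(5)
p = rng.uniform(0.001, 1.0, size=300)            # illustrative p-vector
p_corr = smooth_log_p(p, window_len=15)
```

Because each corrected value is a weighted geometric mean of its neighborhood, only runs of neighboring significant p-values (clusters) remain significant after the back-transform.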
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We analyse the responses of users who searched for child sexual abuse material (CSAM) on Tor web search engines.
We analyse responses from all participants who answered our 'Help us to help you' survey from 5 May 2021 to 28 February 2023 (N = 11,470) and compare the tendencies and habits of people who searched for CSAM.
The 'Help us to help you' survey consists of 32 questions, takes about 15 to 20 minutes to complete, and participants receive no compensation.
We ask CSAM users about their thoughts, feelings, and actions related to their use of CSAM so that we can build a cognitive behavioural therapy-based anonymous rehabilitation programme for CSAM users. For this study, we analysed responses to 12 survey questions. All 12 questions are single-answer questions, i.e., the respondent is asked to pick one option from a predetermined list of answer options.
We may be targeting a specific population, because the demographics of Tor users are probably not representative of all internet users. The participants in the sample are Tor users who (i) conducted a search for CSAM and (ii) opted to complete the survey; thus, they constitute a convenience sample.
We analysed the data with both univariate and bivariate methods. The analyses in the main part of the text mainly describe the population seeking CSAM material on the Tor network, and these results provide a point of comparison to our other results. The bivariate analyses, on the other hand, deepen the picture of the factors associated with help-seeking for CSAM use. In these analyses, the outcome variable is based on help-seeking, whereas we selected the independent variables to measure both the intensity of CSAM use and the effects of CSAM use on the users themselves.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Bivariate statistics on the complete-case sample (n = 8,113): Parental interest categories according to confounding and intermediate variables in (A) men and in (B) women.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Transparency in data visualization is an essential ingredient for scientific communication. The traditional approach of visualizing continuous quantitative data solely in the form of summary statistics (i.e., measures of central tendency and dispersion) has repeatedly been criticized for not revealing the underlying raw data distribution. Remarkably, however, systematic and easy-to-use solutions for raw data visualization using the most commonly reported statistical software package for data analysis, IBM SPSS Statistics, are missing. Here, a comprehensive collection of more than 100 SPSS syntax files and an SPSS dataset template is presented and made freely available that allow the creation of transparent graphs for one-sample designs, for one- and two-factorial between-subject designs, for selected one- and two-factorial within-subject designs as well as for selected two-factorial mixed designs and, with some creativity, even beyond (e.g., three-factorial mixed-designs). Depending on graph type (e.g., pure dot plot, box plot, and line plot), raw data can be displayed along with standard measures of central tendency (arithmetic mean and median) and dispersion (95% CI and SD). The free-to-use syntax can also be modified to match with individual needs. A variety of example applications of syntax are illustrated in a tutorial-like fashion along with fictitious datasets accompanying this contribution. The syntax collection is hoped to provide researchers, students, teachers, and others working with SPSS a valuable tool to move towards more transparency in data visualization.
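The SPSS syntax itself is not reproduced here; as a language-neutral illustration, the summary overlays the syntax draws alongside raw data — arithmetic mean, median, SD, and the 95% CI — can be computed as follows, with simulated values standing in for real measurements:

```python
import numpy as np
from scipy import stats

# One sample from a one-sample design; values are simulated for illustration.
rng = np.random.default_rng(4)
raw = rng.normal(loc=50, scale=10, size=30)

mean, median = raw.mean(), np.median(raw)
sd = raw.std(ddof=1)                          # sample standard deviation
sem = sd / np.sqrt(raw.size)
ci_half = stats.t.ppf(0.975, df=raw.size - 1) * sem
ci95 = (mean - ci_half, mean + ci_half)       # 95% CI around the mean
```

In a transparent graph, the individual `raw` values would be plotted as dots with these summary measures overlaid, rather than showing the summary statistics alone.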
This data package was produced by researchers working on the Shortgrass Steppe Long Term Ecological Research (SGS-LTER) Project, administered at Colorado State University. Long-term datasets and background information (proposals, reports, photographs, etc.) on the SGS-LTER project are contained in a comprehensive project collection within the Digital Collections of Colorado (http://digitool.library.colostate.edu/R/?func=collections&collection_id=3429). The data table and associated metadata document, which is generated in Ecological Metadata Language, may be available through other repositories serving the ecological research community and represent components of the larger SGS-LTER project collection. CPER Paleopedology Study – Particle and Grain Size – Grain size data from 39 pedons were compared with modal fluvial (7) and eolian (3) samples in order to characterize the origin of CPER parent materials and distinguish the origin of CPER geomorphic features. The seven fluvial sites were located along Owl and Eastman Creeks. The three eolian sites were located on the nearest undisputed dune fields, approximately 5 km north of Roggen, CO (Muhs, 1985). For statistical analysis, the sand and coarse silt fractions were shaken in a nest of half-phi (φ) interval sieves ranging from −1.0 φ (10 mesh) to 4.5 φ (325 mesh) for 3 minutes. Phi intervals (φ = −log2 of the grain diameter in mm) were utilized to normalize the particle size data for use in conventional statistics (Krumbein, 1934). The silt and clay fractions were separated by sedimentation using the pipette method. Statistical methods adopted from Folk and Ward (1957) were applied to the −1.0 φ to 7.0 φ fractions using the Sedimentary Petrology Computer Program SEDPET (Warner, 1970) to determine mean grain size (Mz), sorting (Iz), skewness (Skz), and kurtosis (Kz). These parameters were then subjected to univariate and bivariate analysis.
The clay fraction was not included in the statistical computations to avoid excessive fine-skewing of the sample. Additional information and referenced materials can be found at: http://hdl.handle.net/10217/85625. Resources in this dataset: Resource Title: Website Pointer to html file. File Name: Web Page, url: https://portal.edirepository.org/nis/mapbrowse?scope=knb-lter-sgs&identifier=168 (webpage with information and links to data files for download).
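The Folk and Ward (1957) graphic measures computed by SEDPET can be sketched from phi percentiles. The formulas below are the standard graphic measures from that paper; SEDPET's exact implementation and symbols (Iz, Skz, Kz) may differ in detail:

```python
import numpy as np

def folk_ward(phi):
    """Folk and Ward (1957) graphic grain-size measures from a sample of
    phi values: mean (Mz), sorting, skewness, and kurtosis."""
    p = {q: np.percentile(phi, q) for q in (5, 16, 25, 50, 75, 84, 95)}
    mz = (p[16] + p[50] + p[84]) / 3
    sorting = (p[84] - p[16]) / 4 + (p[95] - p[5]) / 6.6
    skew = ((p[16] + p[84] - 2 * p[50]) / (2 * (p[84] - p[16]))
            + (p[5] + p[95] - 2 * p[50]) / (2 * (p[95] - p[5])))
    kurt = (p[95] - p[5]) / (2.44 * (p[75] - p[25]))
    return mz, sorting, skew, kurt

rng = np.random.default_rng(6)
phi_sample = rng.normal(2.0, 0.5, size=5000)   # illustrative phi values
mz, so, sk, ku = folk_ward(phi_sample)
# a symmetric (normal) phi distribution gives skewness near 0 and
# graphic kurtosis near 1 (mesokurtic)
```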
https://spdx.org/licenses/etalab-2.0.html
This repository contains example scripts for estimating genetic parameters using the BLUPF90 software suite. The scripts handle up to four traits simultaneously (from the 15 available in the dataset data.txt found at https://doi.org/10.57745/4MI9JN). script1.sh runs renumf90 using the parameter file renum_ex1.par. This file processes the traits LW, AFW, CW, and BMW. The model includes the effects of animal, sex, and slaughter date. Optional instructions allow blupf90+ to compute variance ratios and their standard errors. script2.sh follows a similar structure but analyzes the traits LW, BW14r, and BW26; in this case, the fixed effects used in the model are different. script3.sh runs a bivariate analysis using a categorical trait (LCAT) that describes the liver; hence, it calls gibbsf90+ instead of blupf90+. Pay attention to the missing-value code, which must be 0. gibbs_samples.R is an R program that reads the output from gibbsf90+. One must provide the number of estimated components (here NCOMP = 6), and the program computes the variance ratios and their posterior distributions.
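The post-processing step gibbs_samples.R performs — turning posterior samples of variance components into variance ratios and their posterior distributions — can be sketched as follows, with simulated samples standing in for real gibbsf90+ output:

```python
import numpy as np

# Simulated Gibbs samples of two variance components; in practice these would
# be read from the gibbsf90+ output files for the fitted model.
rng = np.random.default_rng(8)
var_a = rng.gamma(shape=50, scale=0.02, size=1000)    # additive genetic variance
var_e = rng.gamma(shape=150, scale=0.02, size=1000)   # residual variance

# Compute the variance ratio (heritability-like) per sample, so the ratio
# itself has a posterior distribution.
h2 = var_a / (var_a + var_e)
post_mean = h2.mean()
ci = np.percentile(h2, [2.5, 97.5])   # equal-tailed 95% credible interval
```

Computing the ratio sample-by-sample, rather than from the posterior means of the components, is what yields a valid posterior distribution (and credible interval) for the ratio.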
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
*Weighted percentages by country are not shown since the data were not adjusted for country size.