100+ datasets found
  1. Data for Example I.

    • plos.figshare.com
    txt
    Updated Jul 3, 2024
    Cite
    Jularat Chumnaul; Mohammad Sepehrifar (2024). Data for Example I. [Dataset]. http://doi.org/10.1371/journal.pone.0297930.s002
    Available download formats: txt
    Dataset updated
    Jul 3, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jularat Chumnaul; Mohammad Sepehrifar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data analysis can be accurate and reliable only if the underlying assumptions of the used statistical method are validated. Any violations of these assumptions can change the outcomes and conclusions of the analysis. In this study, we developed Smart Data Analysis V2 (SDA-V2), an interactive and user-friendly web application, to assist users with limited statistical knowledge in data analysis, and it can be freely accessed at https://jularatchumnaul.shinyapps.io/SDA-V2/. SDA-V2 automatically explores and visualizes data, examines the underlying assumptions associated with the parametric test, and selects an appropriate statistical method for the given data. Furthermore, SDA-V2 can assess the quality of research instruments and determine the minimum sample size required for a meaningful study. However, while SDA-V2 is a valuable tool for simplifying statistical analysis, it does not replace the need for a fundamental understanding of statistical principles. Researchers are encouraged to combine their expertise with the software’s capabilities to achieve the most accurate and credible results.

  2. Data from: R code of statistical analysis from Cold kiss still hot: Limited...

    • datasetcatalog.nlm.nih.gov
    • rs.figshare.com
    Updated Jul 26, 2024
    Cite
    Kodama, Tomonori; Sakamoto, Shinsuke; Mori, Akira (2024). R code of statistical analysis from Cold kiss still hot: Limited temperature effects on envenomation performance in predatory strikes of a Japanese pitviper (Gloydius blomhoffii) [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001370474
    Dataset updated
    Jul 26, 2024
    Authors
    Kodama, Tomonori; Sakamoto, Shinsuke; Mori, Akira
    Description

    R code of statistical analysis in this study

  3. Comparison of features in SDA-V2 and well-known statistical analysis...

    • plos.figshare.com
    xls
    Updated Jul 3, 2024
    Cite
    Jularat Chumnaul; Mohammad Sepehrifar (2024). Comparison of features in SDA-V2 and well-known statistical analysis software packages (Minitab and SPSS). [Dataset]. http://doi.org/10.1371/journal.pone.0297930.t002
    Available download formats: xls
    Dataset updated
    Jul 3, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jularat Chumnaul; Mohammad Sepehrifar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of features in SDA-V2 and well-known statistical analysis software packages (Minitab and SPSS).

  4. Replication data for: Statistical Analysis of Strategic Interaction with...

    • dataverse.harvard.edu
    Updated Jan 21, 2015
    Cite
    Harvard Dataverse (2015). Replication data for: Statistical Analysis of Strategic Interaction with Unobserved Player Actions: Introducing a Strategic Probit with Partial Observability [Dataset]. http://doi.org/10.7910/DVN/28662
    Available download formats: text/plain; charset=us-ascii(33736), text/plain; charset=us-ascii(1054), text/x-stata-syntax; charset=us-ascii(9958), tsv(2299204), text/plain; charset=us-ascii(46439), text/x-stata-syntax; charset=us-ascii(23659)
    Dataset updated
    Jan 21, 2015
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The strategic nature of political interactions has long captured the attention of political scientists. A traditional statistical approach to modeling strategic interactions involves multi-stage estimation (e.g., Signorino 2003), which improves parameter estimates associated with one stage by using the information from other stages. The application of such multi-stage approaches, however, imposes rather strict demands on data availability: data on the dependent variable must be available for each strategic actor at each stage of the interaction. Limited or no data makes such approaches difficult or impossible to implement. Political science data, however, especially in the fields of international relations and comparative politics, are not always structured in a manner that is conducive to these approaches. For example, we observe and have plentiful data on the onset of civil wars, but not the preceding stages, in which opposition groups decide to rebel or governments decide to repress them. In this paper, I derive an estimator that probabilistically estimates unobserved actor choices related to earlier stages of strategic interactions. I demonstrate the advantages of the estimator over traditional and split-population binary estimators both using Monte Carlo simulations and a substantive example of the strategic rebel–government interaction associated with civil wars.

  5. Data from: Precipitation manipulation experiments may be confounded by water...

    • catalog.data.gov
    • datasets.ai
    • +2 more
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Data from: Precipitation manipulation experiments may be confounded by water source [Dataset]. https://catalog.data.gov/dataset/data-from-precipitation-manipulation-experiments-may-be-confounded-by-water-source-7d7bc
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    This is digital research data corresponding to the manuscript: Reinhart, K.O., Vermeire, L.T. Precipitation Manipulation Experiments May Be Confounded by Water Source. J Soil Sci Plant Nutr (2023). https://doi.org/10.1007/s42729-023-01298-0

    Files for a 3x2x2 factorial field experiment and water quality data used to create Table 1. Data for the experiment were used for the statistical analysis and generation of summary statistics for Figure 2.

    Purpose: This study aims to investigate the consequences of performing precipitation manipulation experiments with mineralized water in place of rainwater (i.e. demineralized water). Limited attention has been paid to the effects of water mineralization on plant and soil properties, even when the experiments are in a rainfed context.

    Methods: We conducted a 6-yr experiment with a gradient in spring rainfall (70, 100, and 130% of ambient). We tested effects of rainfall treatments on plant biomass and six soil properties and interpreted the confounding effects of dissolved solids in irrigation water.

    Results: Rainfall treatments affected all response variables. Sulfate was the most common dissolved solid in irrigation water and was 41 times more abundant in irrigated (i.e. 130% of ambient) than other plots. Soils of irrigated plots also had elevated iron (16.5 µg × 10 cm-2 × 60-d vs 8.9) and pH (7.0 vs 6.8). The rainfall gradient also had a nonlinear (hump-shaped) effect on plant available phosphorus (P). Plant and microbial biomasses are often limited by and positively associated with available P, suggesting the predicted positive linear relationship between plant biomass and P was confounded by additions of mineralized water. In other words, the unexpected nonlinear relationship was likely driven by components of mineralized irrigation water (i.e. calcium, iron) and/or shifts in soil pH that immobilized P.

    Conclusions: Our results suggest robust precipitation manipulation experiments should either capture rainwater when possible (or use demineralized water) or consider the confounding effects of mineralized water on plant and soil properties.

    Resources in this dataset:

    • Resource Title: Readme file- Data dictionary. File Name: README.txt. Resource Description: File contains data dictionary to accompany data files for a research study.
    • Resource Title: 3x2x2 factorial dataset.csv. File Name: 3x2x2 factorial dataset.csv. Resource Description: Dataset is for a 3x2x2 factorial field experiment (factors: rainfall variability, mowing seasons, mowing intensity) conducted in northern mixed-grass prairie vegetation in eastern Montana, USA. Data include activity of 5 plant available nutrients, soil pH, and plant biomass metrics. Data from 2018.
    • Resource Title: water quality dataset.csv. File Name: water quality dataset.csv. Resource Description: Water properties (pH and common dissolved solids) of samples from Yellowstone River collected near Miles City, Montana. Data extracted from Rinella MJ, Muscha JM, Reinhart KO, Petersen MK (2021) Water quality for livestock in northern Great Plains rangelands. Rangeland Ecol. Manage. 75: 29-34.

  6. Comparative overview of the statistical packages available in moreThanANOVA...

    • plos.figshare.com
    xls
    Updated Jul 3, 2024
    Cite
    Jularat Chumnaul; Mohammad Sepehrifar (2024). Comparative overview of the statistical packages available in moreThanANOVA and SDA-V2. [Dataset]. http://doi.org/10.1371/journal.pone.0297930.t001
    Available download formats: xls
    Dataset updated
    Jul 3, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jularat Chumnaul; Mohammad Sepehrifar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparative overview of the statistical packages available in moreThanANOVA and SDA-V2.

  7. CSV file containing data used for statistical analysis comparing baseline...

    • datasetcatalog.nlm.nih.gov
    Updated Sep 15, 2023
    Cite
    Gilchrist, H. Grant; Geldart, Erica A.; Semeniuk, Christina A. D.; Love, Oliver P.; Harris, Christopher M.; Barnas, Andrew F. (2023). CSV file containing data used for statistical analysis comparing baseline heart rate (beats/10s) to heart rate in the presence of a polar bear in the manuscript “A colonial-nesting seabird shows limited heart rate responses to natural variation in threats of polar bears”. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000958631
    Dataset updated
    Sep 15, 2023
    Authors
    Gilchrist, H. Grant; Geldart, Erica A.; Semeniuk, Christina A. D.; Love, Oliver P.; Harris, Christopher M.; Barnas, Andrew F.
    Description

    All data used to compare common eider baseline heart rate (beats/10s) to heart rate in the presence of a polar bear on East Bay Island, Nunavut, Canada.

  8. Efficient statistical significance approximation for local association...

    • search.datacite.org
    Updated 2012
    Cite
    Li Charlie Xia (2012). Efficient statistical significance approximation for local association analysis of high-throughput time series data [Dataset]. http://doi.org/10.25549/usctheses-c3-87579
    Dataset updated
    2012
    Dataset provided by
    DataCite (https://www.datacite.org/)
    University of Southern California Digital Library (USC.DL)
    Authors
    Li Charlie Xia
    Description

    Local association analysis, such as local similarity analysis and local shape analysis, of biological time series data helps elucidate the varying dynamics of biological systems. However, their applications to large scale high-throughput data are limited by slow permutation procedures for statistical significance evaluation. We developed a theoretical approach to approximate the statistical significance of local similarity and local shape analysis based on the approximate tail distribution of the maximum partial sum of independent identically distributed (i.i.d) and Markovian random variables. Simulations show that the derived formula approximates the tail distribution reasonably well (starting at time points > 10 with no delay and > 20 with delay) and provides p-values comparable to those from permutations. The new approach enables efficient calculation of statistical significance for pairwise local association analysis, making possible all-to-all association studies otherwise prohibitive. As a demonstration, local association analysis of human microbiome time series shows that core OTUs are highly synergetic and some of the associations are body-site specific across samples. The new approach is implemented in our eLSA package, which now provides pipelines for faster local similarity and shape analysis of time series data. The tool is freely available from eLSA's website: http://meta.usc.edu/softs/lsa.

  9. Collision between biological process and statistical analysis revealed by...

    • data.niaid.nih.gov
    • datasetcatalog.nlm.nih.gov
    • +2 more
    zip
    Updated Sep 8, 2020
    Cite
    David Westneat; Yimen Araya-Ajoy; Hassen Allegue; Barbara Class; Niels Dingemanse; Ned Dochtermann; Laszlo Garamszegi; Julien Martin; Shinichi Nakagawa; Denis Reale; Holger Schielzeth (2020). Collision between biological process and statistical analysis revealed by mean-centering [Dataset]. http://doi.org/10.5061/dryad.sj3tx9632
    Available download formats: zip
    Dataset updated
    Sep 8, 2020
    Dataset provided by
    University of Kentucky
    North Dakota State University
    Ludwig-Maximilians-Universität München
    University of Ottawa
    Bielefeld University
    Norwegian University of Science and Technology
    UNSW Sydney
    University of the Sunshine Coast
    Hungarian Academy of Sciences
    Université du Québec à Montréal
    Authors
    David Westneat; Yimen Araya-Ajoy; Hassen Allegue; Barbara Class; Niels Dingemanse; Ned Dochtermann; Laszlo Garamszegi; Julien Martin; Shinichi Nakagawa; Denis Reale; Holger Schielzeth
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Animal ecologists often collect hierarchically-structured data and analyze these with linear mixed-effects models. Specific complications arise when the effect sizes of covariates vary on multiple levels (e.g., within vs among subjects). Mean-centering of covariates within subjects offers a useful approach in such situations, but is not without problems. A statistical model represents a hypothesis about the underlying biological process. Mean-centering within clusters assumes that the lower level responses (e.g. within subjects) depend on the deviation from the subject mean (relative) rather than on absolute values of the covariate. This may or may not be biologically realistic. We show that mismatch between the nature of the generating (i.e., biological) process and the form of the statistical analysis produces major conceptual and operational challenges for empiricists. We explored the consequences of mismatches by simulating data with three response-generating processes differing in the source of correlation between a covariate and the response. These data were then analyzed by three different analysis equations. We asked how robustly different analysis equations estimate key parameters of interest and under which circumstances biases arise. Mismatches between generating and analytical equations created several intractable problems for estimating key parameters. The most widely misestimated parameter was the among-subject variance in response. We found that no single analysis equation was robust in estimating all parameters generated by all equations. Importantly, even when response-generating and analysis equations matched mathematically, bias in some parameters arose when sampling across the range of the covariate was limited. Our results have general implications for how we collect and analyze data. They also remind us more generally that conclusions from statistical analysis of data are conditional on a hypothesis, sometimes implicit, for the process(es) that generated the attributes we measure. We discuss strategies for real data analysis in face of uncertainty about the underlying biological process.

    Methods: All data were generated through simulations, so included with this submission are a Read Me file containing general descriptions of data files, a code file that contains R code for the simulations and analysis data files (which will generate new datasets with the same parameters) and the analyzed results in the data files archived here. These data files form the basis for all results presented in the published paper. The code file (in R markdown) has more detailed descriptions of each file of analyzed results.
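
    The within-subject mean-centering described in this entry can be illustrated with a minimal sketch. It is not the authors' R code: the function name `center_within_subjects` and the toy records are hypothetical, and the sketch only shows how a covariate is split into an among-subject component (the subject mean) and a within-subject component (the deviation from that mean).

```python
from collections import defaultdict
from statistics import mean


def center_within_subjects(records):
    """Split a covariate into subject mean and within-subject deviation.

    records: list of (subject_id, x) tuples (hypothetical layout).
    Returns a list of (subject_id, subject_mean, deviation) tuples
    in the same order as the input.
    """
    by_subject = defaultdict(list)
    for subject, x in records:
        by_subject[subject].append(x)
    subject_means = {s: mean(xs) for s, xs in by_subject.items()}
    return [(s, subject_means[s], x - subject_means[s]) for s, x in records]


# Toy data, invented for illustration only.
records = [("a", 1.0), ("a", 3.0), ("b", 10.0), ("b", 14.0)]
centered = center_within_subjects(records)
```

    In a mixed-effects model the subject mean and the deviation would typically enter as two separate predictors; whether that decomposition is biologically appropriate is exactly the question the entry raises.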

  10. Data from: Motion analysis of non-model organisms using a hierarchical...

    • zenodo.org
    Updated May 30, 2022
    Cite
    Mielke Falk; Schunke Vivian; Jan Wölfer; John A. Nyakatura (2022). Data from: Motion analysis of non-model organisms using a hierarchical model: influence of setup enclosure dimensions on gait parameters of Swinhoe's striped squirrels as a test case [Dataset]. http://doi.org/10.5061/dryad.10rn5
    Dataset updated
    May 30, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mielke Falk; Schunke Vivian; Jan Wölfer; John A. Nyakatura
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    In in-vivo motion analyses, data from a limited number of subjects and trials is used as proxy for locomotion properties of entire populations, yet the inherent hierarchy of the individual and population level is usually not accounted for. Despite the increasing availability of hierarchical model frameworks for statistical analyses, they have not been applied extensively to comparative motion analysis. As a case study for the use of hierarchical models, we analyzed locomotor parameters of four Swinhoe's striped squirrels. The small-bodied arboreal mammals exhibit brief bouts of rapid asymmetric gaits. Spatio-temporal parameters on runways with experimentally varied dimensions of the setup enclosure were compared to test for its potentially confounding effects. We applied principal component analysis to evaluate changes to the overall locomotor pattern. A common, non-hierarchical, pooled statistical analysis of the data revealed significant differences in some of the parameters depending on enclosure dimensions. In contrast, we used a hierarchical Bayesian generalized linear model (GLM) that considers subject specific differences and population effects to compare the effect of enclosure dimensions on the measured parameters and the principal components. None of the population effects were confirmed by the hierarchical GLM. The confounding effect of a single subject that deviates in its locomotor behavior is potentially bigger than the influence of the experimental variation in enclosure dimensions. Our findings justify the common practice of researchers to intuitively select an enclosure with dimensions assumed as "non-constraining". Hierarchical models can easily be designed to cope with limited sample size and bias introduced by deviating behavior of individuals. 
    When limited data is available (a typical restriction of in-vivo motion analyses of non-model organisms), density distributions of the Bayesian GLM used here remain reliable and the hierarchical structure of the model optimally exploits all available information. We provide code to be adjusted to other research questions.

  11. The Surface Water Chemistry (SWatCh) database

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 26, 2022
    Cite
    Rotteveel, Lobke (2022). The Surface Water Chemistry (SWatCh) database [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4559695
    Dataset updated
    Apr 26, 2022
    Dataset provided by
    Rotteveel, Lobke
    Heubach, Franz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the dataset presented in the following manuscript: The Surface Water Chemistry (SWatCh) database: A standardized global database of water chemistry to facilitate large-sample hydrological research, which is currently under review at Earth System Science Data.

    Openly accessible global scale surface water chemistry datasets are urgently needed to detect widespread trends and problems, to help identify their possible solutions, and determine critical spatial data gaps where more monitoring is required. Existing datasets are limited in availability, sample size/sampling frequency, and geographic scope. These limitations inhibit the answering of emerging transboundary water chemistry questions, for example, the detection and understanding of delayed recovery from freshwater acidification. Here, we begin to address these limitations by compiling the global surface water chemistry (SWatCh) database. We collect, clean, standardize, and aggregate open access data provided by six national and international agencies to compile a database containing information on sites, methods, and samples, and a GIS shapefile of site locations. We remove poor quality data (for example, values flagged as “suspect” or “rejected”), standardize variable naming conventions and units, and perform other data cleaning steps required for statistical analysis. The database contains water chemistry data for streams, rivers, canals, ponds, lakes, and reservoirs across seven continents, 24 variables, 33,722 sites, and over 5 million samples collected between 1960 and 2022. Similar to prior research, we identify critical spatial data gaps on the African and Asian continents, highlighting the need for more data collection and sharing initiatives in these areas, especially considering freshwater ecosystems in these environs are predicted to be among the most heavily impacted by climate change. We identify the main challenges associated with compiling global databases – limited data availability, dissimilar sample collection and analysis methodology, and reporting ambiguity – and provide recommended solutions. 
    By addressing these challenges and consolidating data from various sources into one standardized, openly available, high quality, and trans-boundary database, SWatCh allows users to conduct powerful and robust statistical analyses of global surface water chemistry.

  12. Data for: Integrating open education practices with data analysis of open...

    • data.niaid.nih.gov
    • search.dataone.org
    zip
    Updated Jul 26, 2024
    Cite
    Marja Bakermans (2024). Data for: Integrating open education practices with data analysis of open science in an undergraduate course [Dataset]. http://doi.org/10.5061/dryad.37pvmcvst
    Available download formats: zip
    Dataset updated
    Jul 26, 2024
    Dataset provided by
    Worcester Polytechnic Institute
    Authors
    Marja Bakermans
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The open science movement produces vast quantities of openly published data connected to journal articles, creating an enormous resource for educators to engage students in current topics and analyses. However, educators face challenges using these materials to meet course objectives. I present a case study using open science (published articles and their corresponding datasets) and open educational practices in a capstone course. While engaging in current topics of conservation, students trace connections in the research process, learn statistical analyses, and recreate analyses using the programming language R. I assessed the presence of best practices in open articles and datasets, examined student selection in the open grading policy, surveyed students on their perceived learning gains, and conducted a thematic analysis on student reflections. First, articles and datasets met just over half of the assessed fairness practices, but this increased with the publication date. There was a marginal difference in how assessment categories were weighted by students, with reflections highlighting appreciation for student agency. In course content, students reported the greatest learning gains in describing variables, while collaborative activities (e.g., interacting with peers and instructor) were the most effective support. The most effective tasks to facilitate these learning gains included coding exercises and team-led assignments. Autocoding of student reflections identified 16 themes, and positive sentiments were written nearly 4x more often than negative sentiments. Students positively reflected on their growth in statistical analyses, and negative sentiments focused on how limited prior experience with statistics and coding made them feel nervous. As a group, we encountered several challenges and opportunities in using open science materials. 
    I present key recommendations, based on student experiences, for scientists to consider when publishing open data to provide additional educational benefits to the open science community.

    Methods

    Article and dataset fairness: To assess the utility of open articles and their datasets as an educational tool in an undergraduate academic setting, I measured the congruence of each pair to a set of best practices and guiding principles. I assessed ten guiding principles and best practices (Table 1), where each category was scored ‘1’ or ‘0’ based on whether it met that criterion, with a total possible score of ten.

    Open grading policies: Students were allowed to specify the percentage weight for each assessment category in the course, including 1) six coding exercises (Exercises), 2) one lead exercise (Lead Exercise), 3) fourteen annotation assignments of readings (Annotations), 4) one final project (Final Project), 5) five discussion board posts and a statement of learning reflection (Discussion), and 6) attendance and participation (Participation). I examined if assessment categories (independent variable) were weighted (dependent variable) differently by students using an analysis of variance (ANOVA) and examined pairwise differences with Tukey HSD.

    Assessment of perceived learning gains: I used a student assessment of learning gains (SALG) survey to measure students’ perceptions of learning gains related to course objectives (Seymour et al. 2000). This Likert-scale survey provided five response categories ranging from ‘no gains’ to ‘great gains’ in learning and the option of open responses in each category. A summary report that converted Likert responses to numbers and calculated descriptive statistics was produced from the SALG instrument website.

    Student reflections: In student reflections, I examined the frequency of the 100 most frequent words, with stop words excluded and a minimum length of four (letters), both “with synonyms” and “with generalizations”. Due to this paper's explorative nature, I used autocoding to identify students' broad themes and sentiments in their reflections. Autocoding examines the sentiment of each word and scores it as positive, neutral, mixed, or negative. In this process, I compared how students felt about each theme, focusing on positive (i.e., satisfaction) and negative (i.e., dissatisfaction) sentiments. The relationship of how sentiment was coded to themes was visualized in a treemap, where the size of a block is relative to the number of references for that code. All reflection processing and analyses were performed in NVivo 14 (Windows). All data were collected with institutional IRB approval (IRB-24–0314). All statistical analyses were performed in R (ver. 4.3.1; R Core Team 2023).
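
    The one-way ANOVA mentioned in this entry compares between-category variation with within-category variation. The sketch below computes only the F statistic, in plain Python rather than the R used in the study; the function name `one_way_anova_f` and the weight values are hypothetical, and the Tukey HSD follow-up is not shown.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups.

    groups: list of lists of numeric observations, one list per category.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of categories
    n = len(all_values)   # total number of observations
    # Variation of category means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own category mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented percentage weights for three assessment categories.
groups = [[10, 15, 20], [15, 20, 25], [30, 35, 40]]
f_stat = one_way_anova_f(groups)
```

    A p-value would come from the F distribution with (k - 1, n - k) degrees of freedom, which needs a statistics library and is left out of this sketch.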

  13. Descriptive statistics.

    • plos.figshare.com
    xls
    Updated Oct 31, 2023
    Cite
    Mrinal Saha; Aparna Deb; Imtiaz Sultan; Sujat Paul; Jishan Ahmed; Goutam Saha (2023). Descriptive statistics. [Dataset]. http://doi.org/10.1371/journal.pgph.0002475.t003
    Available download formats: xls
    Dataset updated
    Oct 31, 2023
    Dataset provided by
    PLOS Global Public Health
    Authors
    Mrinal Saha; Aparna Deb; Imtiaz Sultan; Sujat Paul; Jishan Ahmed; Goutam Saha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Vitamin D insufficiency appears to be prevalent in SLE patients. Multiple factors potentially contribute to lower vitamin D levels, including limited sun exposure, the use of sunscreen, darker skin complexion, aging, obesity, specific medical conditions, and certain medications. The study aims to assess the risk factors associated with low vitamin D levels in SLE patients in the southern part of Bangladesh, a region noted for a high prevalence of SLE. The research additionally investigates the possible correlation between vitamin D and the SLEDAI score, seeking to understand the potential benefits of vitamin D in enhancing disease outcomes for SLE patients. The study incorporates a dataset consisting of 50 patients from the southern part of Bangladesh and evaluates their clinical and demographic data. An initial exploratory data analysis is conducted to gain insights into the data, which includes calculating means and standard deviations, performing correlation analysis, and generating heat maps. Relevant inferential statistical tests, such as the Student’s t-test, are also employed. In the machine learning part of the analysis, this study utilizes supervised learning algorithms, specifically Linear Regression (LR) and Random Forest (RF). To optimize the hyperparameters of the RF model and mitigate the risk of overfitting given the small dataset, a 3-Fold cross-validation strategy is implemented. The study also calculates bootstrapped confidence intervals to provide robust uncertainty estimates and further validate the approach. A comprehensive feature importance analysis is carried out using RF feature importance, permutation-based feature importance, and SHAP values. The LR model yields an RMSE of 4.83 (CI: 2.70, 6.76) and MAE of 3.86 (CI: 2.06, 5.86), whereas the RF model achieves better results, with an RMSE of 2.98 (CI: 2.16, 3.76) and MAE of 2.68 (CI: 1.83,3.52). 
Both models identify Hb, CRP, ESR, and age as significant contributors to vitamin D level predictions. Despite the lack of a significant association between SLEDAI and vitamin D in the statistical analysis, the machine learning models suggest a potential nonlinear dependency of vitamin D on SLEDAI. These findings highlight the importance of these factors in managing vitamin D levels in SLE patients. The study concludes that there is a high prevalence of vitamin D insufficiency in SLE patients. Although a direct linear correlation between the SLEDAI score and vitamin D levels is not observed, machine learning models suggest the possibility of a nonlinear relationship. Furthermore, factors such as Hb, CRP, ESR, and age are identified as more significant in predicting vitamin D levels. Thus, the study suggests that monitoring these factors may be advantageous in managing vitamin D levels in SLE patients. Given the immunological nature of SLE, the potential role of vitamin D in SLE disease activity could be substantial. Therefore, it underscores the need for further large-scale studies to corroborate this hypothesis.
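    The description reports RMSE and MAE with bootstrapped confidence intervals but does not spell out the resampling procedure. The following is a minimal sketch of one common approach, a percentile bootstrap over prediction errors, and is not necessarily the authors' method. The residual values, the function names, and the defaults (`n_resamples`, `alpha`, `seed`) are all assumptions made for illustration.

```python
import math
import random


def rmse(errors):
    """Root mean squared error of a list of residuals."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error of a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)


def bootstrap_ci(errors, metric, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `metric` (assumed procedure)."""
    rng = random.Random(seed)
    n = len(errors)
    # Resample the residuals with replacement and recompute the metric each time.
    stats = sorted(
        metric([errors[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Hypothetical residuals (predicted minus observed); NOT data from the study.
errors = [1.5, -2.0, 3.2, -0.7, 4.1, -3.3, 0.4, 2.6, -1.9, 0.9]
low, high = bootstrap_ci(errors, rmse)
print(f"RMSE = {rmse(errors):.2f} (CI: {low:.2f}, {high:.2f})")
```

    With only a handful of residuals the interval is wide, which is consistent with the caution the description expresses about the small dataset.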

  14. N

    Little Falls Town, New York Population Breakdown by Gender Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). Little Falls Town, New York Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b24003fb-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Little Falls
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Little Falls town by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Little Falls town across both sexes and to determine which sex constitutes the majority.

    Key observations

    The population is majority male: 57.25% of the total population is male. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: This column shows the population of each gender in Little Falls town.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of Little Falls town total population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Little Falls town Population by Race & Ethnicity.

  15. d

    Little brown bat occurrence model rangewide predictions for 2010 until 2019

    • catalog.data.gov
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Little brown bat occurrence model rangewide predictions for 2010 until 2019 [Dataset]. https://catalog.data.gov/dataset/little-brown-bat-occurrence-model-rangewide-predictions-for-2010-until-2019
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    U.S. Geological Survey
    Description

    False positive occupancy analysis predictions with model uncertainty based on summertime data provided to support the three bat species status assessment (SSA) for Myotis lucifugus (MYLU), Myotis septentrionalis (MYSE), and Perimyotis subflavus (PESU). The objectives outlined by the Fish and Wildlife Service’s SSA team were to estimate summertime distributions across the entire species range. Statistical analysis included five types of response data requested from the North American Bat Monitoring Program database (NABat): automatically identified stationary acoustic calls, manually vetted stationary acoustic calls, automatically identified mobile acoustic calls, manually vetted mobile acoustic calls, and capture records. Because the statistical analysis was for summertime distribution modeling, only data collected between June 1 and Sept 1 from 2010 through 2019 were included.

  16. N

    Little Falls, NY Population Breakdown by Gender and Age Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). Little Falls, NY Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e1ed4d50-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Little Falls, New York
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Little Falls by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Little Falls. The dataset can be utilized to understand the population distribution of Little Falls by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Little Falls. Additionally, it can be used to see how the male-to-female ratio changes from the youngest to the oldest age group in Little Falls.

    Key observations

    Largest age group (population): Male # 20-24 years (220) | Female # 20-24 years (249). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the Little Falls population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): This column shows the male population of Little Falls.
    • Population (Female): This column shows the female population of Little Falls.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Little Falls for each age group.
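
    The Gender Ratio column can be reproduced from the two population columns. A minimal sketch, using the 20 to 24 age group counts quoted under Key observations (220 males, 249 females); the function name `gender_ratio` is hypothetical and not part of the dataset.

```python
def gender_ratio(males: int, females: int) -> float:
    """Return the number of males per 100 females."""
    if females <= 0:
        # The ratio is undefined when an age group has no female population.
        raise ValueError("female population must be positive")
    return 100.0 * males / females


# 20 to 24 years: 220 males and 249 females (from Key observations above).
print(round(gender_ratio(220, 249), 2))  # prints 88.35
```

    A value below 100 means females outnumber males in that age group.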

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Little Falls Population by Gender.

  17. N

    Comprehensive Income by Age Group Dataset: Longitudinal Analysis of Little...

    • neilsberg.com
    Updated Aug 7, 2024
    Cite
    Neilsberg Research (2024). Comprehensive Income by Age Group Dataset: Longitudinal Analysis of Little Falls Town, New York Household Incomes Across 4 Age Groups and 16 Income Brackets. Annual Editions Collection // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/2edd0644-aeee-11ee-aaca-3860777c1fe6/
    Explore at:
    Dataset updated
    Aug 7, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Little Falls
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates Little Falls town household income by age. It can be used to understand the age-based distribution of household income in Little Falls town.

    Content

    The dataset includes the following datasets, when applicable:

    • Little Falls Town, New York annual median income by age groups dataset (in 2022 inflation-adjusted dollars)
    • Age-wise distribution of Little Falls Town, New York household incomes: Comparative analysis across 16 income brackets

    Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Interested in deeper insights and visual analysis?

    Explore our comprehensive data analysis and visual representations for a deeper understanding of Little Falls town income distribution by age.

  18. d

    Map 10: ArcGIS layer showing contours of the 25 percentile of water levels...

    • catalog.data.gov
    • data.usgs.gov
    • +3more
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Map 10: ArcGIS layer showing contours of the 25 percentile of water levels from all months during the 2000-2009 water years (feet) [Dataset]. https://catalog.data.gov/dataset/map-10-arcgis-layer-showing-contours-of-the-25-percentile-of-water-levels-from-all-months-
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey http://www.usgs.gov/
    Description

    Statistical analyses and maps representing mean, high, and low water-level conditions in the surface water and groundwater of Miami-Dade County were made by the U.S. Geological Survey, in cooperation with the Miami-Dade County Department of Regulatory and Economic Resources, to help inform decisions necessary for urban planning and development. Sixteen maps were created that show contours of (1) the mean of daily water levels at each site during October and May for the 2000-2009 water years; (2) the 25th, 50th, and 75th percentiles of the daily water levels at each site during October and May and for all months during 2000-2009; and (3) the differences between mean October and May water levels, as well as the differences in the percentiles of water levels for all months, between 1990-1999 and 2000-2009. The 80th, 90th, and 96th percentiles of the annual maximums of daily groundwater levels during 1974-2009 (a 35-year period) were computed to provide an indication of unusually high groundwater-level conditions. These maps and statistics provide a generalized understanding of the variations of water levels in the aquifer, rather than a survey of concurrent water levels. Water-level measurements from 473 sites in Miami-Dade County and surrounding counties were analyzed to generate statistical analyses. The monitored water levels included surface-water levels in canals and wetland areas and groundwater levels in the Biscayne aquifer. Maps were created by importing site coordinates, summary water-level statistics, and completeness of record statistics into a geographic information system, and by interpolating between water levels at monitoring sites in the canals and water levels along the coastline. Raster surfaces were created from these data by using the triangular irregular network interpolation method. The raster surfaces were contoured by using geographic information system software. 
These contours were imprecise in some areas because the software could not fully evaluate the hydrology given available information; therefore, contours were manually modified where necessary. The ability to evaluate differences in water levels between 1990-1999 and 2000-2009 is limited in some areas because most of the monitoring sites did not have 80 percent complete records for one or both of these periods. The quality of the analyses was limited by (1) deficiencies in spatial coverage; (2) the combination of pre- and post-construction water levels in areas where canals, levees, retention basins, detention basins, or water-control structures were installed or removed; (3) an inability to address the potential effects of the vertical hydraulic head gradient on water levels in wells of different depths; and (4) an inability to correct for the differences between daily water-level statistics. Contours are dashed in areas where the locations of contours have been approximated because of the uncertainty caused by these limitations. Although the ability of the maps to depict differences in water levels between 1990-1999 and 2000-2009 was limited by missing data, results indicate that near the coast water levels were generally higher in May during 2000-2009 than during 1990-1999; and that inland water levels were generally lower during 2000-2009 than during 1990-1999. Generally, the 25th, 50th, and 75th percentiles of water levels from all months were also higher near the coast and lower inland during 2000–2009 than during 1990-1999. Mean October water levels during 2000-2009 were generally higher than during 1990-1999 in much of western Miami-Dade County, but were lower in a large part of eastern Miami-Dade County.

  19. d

    Retail Investor Sentiment Analytics-South Korea (RISA-Korea) | Social Media...

    • datarade.ai
    .json, .csv
    Updated Jan 2, 2024
    Cite
    Datago Technology Limited (2024). Retail Investor Sentiment Analytics-South Korea (RISA-Korea) | Social Media |4200+ KRX securities | Alternative Data | Daily Update [Dataset]. https://datarade.ai/data-products/retail-investor-sentiment-analytics-south-korea-risa-korea-datago-technology-limited
    Explore at:
    Available download formats: .json, .csv
    Dataset updated
    Jan 2, 2024
    Dataset authored and provided by
    Datago Technology Limited
    Area covered
    South Korea
    Description

    RISA-Korea provides insight into Korean retail investor sentiment and interest in 4200 KRX securities (with 3100+ KRX stocks) by analyzing 60 million posts and 80+ million replies posted since 2017 on Naver, the most popular web portal in South Korea.

    By analyzing the discussions on Naver's stock forum, RISA-Korea provides valuable information about the sentiments, opinions, and trends expressed by retail investors regarding various securities. The inclusion of a wide range of securities in the analysis ensures that RISA-Korea captures a holistic understanding of retail investor sentiment across the market, and the dataset serves as a valuable resource for studying retail investor behavior, identifying market trends, and assessing the impact of retail investors on specific securities.

    In particular, in addition to the statistical analysis of each security, this dataset provides record-level post analysis, such as sentiment, related stocks, and hotness for each post. This allows users to group posts according to their needs, such as identifying popular posts or excluding machine posts, to gain in-depth insights.

    • Coverage: 4200+ KRX securities (3100+ stocks, 800+ ETFs, and 300+ ETNs)
    • History: From 2017-06-07
    • Update Frequency: Daily

  20. N

    Comprehensive Income by Age Group Dataset: Longitudinal Analysis of Little...

    • neilsberg.com
    Updated Aug 7, 2024
    Cite
    Neilsberg Research (2024). Comprehensive Income by Age Group Dataset: Longitudinal Analysis of Little Valley Town, New York Household Incomes Across 4 Age Groups and 16 Income Brackets. Annual Editions Collection // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/2edd10e2-aeee-11ee-aaca-3860777c1fe6/
    Explore at:
    Dataset updated
    Aug 7, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Little Valley, New York
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates Little Valley town household income by age. It can be used to understand the age-based distribution of household income in Little Valley town.

    Content

    The dataset includes the following datasets, when applicable:

    • Little Valley Town, New York annual median income by age groups dataset (in 2022 inflation-adjusted dollars)
    • Age-wise distribution of Little Valley Town, New York household incomes: Comparative analysis across 16 income brackets

    Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Interested in deeper insights and visual analysis?

    Explore our comprehensive data analysis and visual representations for a deeper understanding of Little Valley town income distribution by age.
