100+ datasets found
  1. Normal and Skewed Example Data

    • figshare.com
    txt
    Updated Dec 21, 2021
    Cite
    Jesus Rogel-Salazar (2021). Normal and Skewed Example Data [Dataset]. http://doi.org/10.6084/m9.figshare.17306285.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Dec 21, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jesus Rogel-Salazar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Example data for normally distributed and skewed datasets.

  2. Normal Distribution Data

    • kaggle.com
    zip
    Updated Sep 5, 2020
    Cite
    TinaSoni (2020). Normal Distribution Data [Dataset]. https://www.kaggle.com/tinasoni/normal-distribution-data
    Explore at:
    Available download formats: zip (1080 bytes)
    Dataset updated
    Sep 5, 2020
    Authors
    TinaSoni
    Description

    Dataset

    This dataset was created by TinaSoni

    Released under Data files © Original Authors


  3. Distribution of waiting times and displacements: A comparison of over 30...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Laura Alessandretti; Piotr Sapiezynski; Sune Lehmann; Andrea Baronchelli (2023). Distribution of waiting times and displacements: A comparison of over 30 datasets on human mobility. [Dataset]. http://doi.org/10.1371/journal.pone.0171686.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Laura Alessandretti; Piotr Sapiezynski; Sune Lehmann; Andrea Baronchelli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For each dataset, the table reports: the reference to the journal article or book where the study was published; the type of data (LBSN stands for Location Based Social Networks, CDR for Call Detail Record); the number of individuals (or vehicles in the case of car/taxi data) involved in the data collection; the duration of the data collection (M → months, Y → years, D → days, W → weeks); the minimum and maximum length of spatial displacements; the shape of the probability distribution of displacements with the corresponding parameters; the temporal sampling; and the shape of the distribution of waiting times with the corresponding parameters. Power-law (T) indicates a truncated power law. The table can also be found at http://lauraalessandretti.weebly.com/plosmobilityreview.html.

  4. Normal, IL Age Group Population Dataset: A Complete Breakdown of Normal Age...

    • neilsberg.com
    csv, json
    Updated Jul 24, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Normal, IL Age Group Population Dataset: A Complete Breakdown of Normal Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/aaa9fb54-4983-11ef-ae5d-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Normal, Illinois
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset are derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each age group. Ages between 0 and 85 were divided into buckets of roughly 5 years each; ages over 85 were aggregated into a single group. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population distribution of Normal across 18 age groups. It lists the population in each age group along with that population as a percentage of the total population of Normal. The dataset can be used to understand the population distribution of Normal by age. For example, using this dataset, we can identify the largest age group in Normal.

    Key observations

    The largest age group in Normal, IL was the 20 to 24 years group, with a population of 12,017 (22.71%), according to the ACS 2018-2022 5-Year Estimates. The smallest age group in Normal, IL was the 75 to 79 years group, with a population of 755 (1.43%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: This column displays the age group in consideration
    • Population: This column shows the population of the specific age group in Normal.
    • % of Total Population: This column displays the population of each age group as a proportion of Normal's total population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Normal Population by Age.

  5. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

    Metadata (including data dictionary)

    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code

    Abstract: We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

    Description:

    • “CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.
    • “Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

    Optional Information (complete as necessary)

    Required R packages:

    • For running “CWVS_LMC.txt”:
        • msm: Sampling from the truncated normal distribution
        • mnormt: Sampling from the multivariate normal distribution
        • BayesLogit: Sampling from the Polya-Gamma distribution
    • For running “Results_Summary.txt”:
        • plotrix: Plotting the posterior means and credible intervals

    Instructions for Use

    Reproducibility (Mandatory)

    What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

    How to use the information:

    • Load the “Simulated_Dataset.RData” workspace
    • Run the code contained in “CWVS_LMC.txt”
    • Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

    Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

    Data

    The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability

    Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

    Description

    Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
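
The weekly standardization described in this entry (subtract the week's median exposure, then divide by the week's IQR) can be sketched in a few lines of Python. This is only an illustration, not the authors' R code: the function name `standardize_by_median_iqr`, the sample exposure values, and the quartile convention (`method="inclusive"`) are all assumptions, since the description does not say how the quartiles were computed.

```python
import statistics


def standardize_by_median_iqr(values):
    """Subtract the median and divide by the interquartile range (IQR)."""
    median = statistics.median(values)
    # Quartile convention is an assumption; the source does not specify one.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; cannot standardize")
    return [(v - median) / iqr for v in values]


# Hypothetical exposures for a single week (illustrative numbers only).
week_exposures = [8.0, 10.0, 12.0, 14.0, 16.0]
standardized = standardize_by_median_iqr(week_exposures)
print(standardized)
```

Because the medians and IQRs are withheld from the released data, a transformation like this cannot be inverted by a reader, which is the property the description relies on.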

  6. MATLAB code and output files for integral, mean and covariance of the...

    • researchdatafinder.qut.edu.au
    Updated Jul 25, 2022
    + more versions
    Cite
    Dr Matthew Adams (2022). MATLAB code and output files for integral, mean and covariance of the simplex-truncated multivariate normal distribution [Dataset]. https://researchdatafinder.qut.edu.au/display/n20044
    Explore at:
    Dataset updated
    Jul 25, 2022
    Dataset provided by
    Queensland University of Technology (QUT)
    Authors
    Dr Matthew Adams
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Compositional data, which is data consisting of fractions or probabilities, is common in many fields including ecology, economics, physical science and political science. If these data would otherwise be normally distributed, their spread can be conveniently represented by a multivariate normal distribution truncated to the non-negative space under a unit simplex. Here this distribution is called the simplex-truncated multivariate normal distribution. For calculations on truncated distributions, it is often useful to obtain rapid estimates of their integral, mean and covariance; these quantities characterising the truncated distribution will generally possess different values to the corresponding non-truncated distribution.

    In the paper Adams, Matthew (2022) Integral, mean and covariance of the simplex-truncated multivariate normal distribution. PLoS One, 17(7), Article number: e0272014. https://eprints.qut.edu.au/233964/, three different approaches that can estimate the integral, mean and covariance of any simplex-truncated multivariate normal distribution are described and compared. These three approaches are (1) naive rejection sampling, (2) a method described by Gessner et al. that unifies subset simulation and the Holmes-Diaconis-Ross algorithm with an analytical version of elliptical slice sampling, and (3) a semi-analytical method that expresses the integral, mean and covariance in terms of integrals of hyperrectangularly-truncated multivariate normal distributions, the latter of which are readily computed in modern mathematical and statistical packages. Strong agreement is demonstrated between all three approaches, but the most computationally efficient approach depends strongly both on implementation details and the dimension of the simplex-truncated multivariate normal distribution.

    This dataset consists of all code and results for the associated article.
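
As a rough illustration of the first approach named in this entry, naive rejection sampling, the sketch below draws from an untruncated normal and keeps only the draws that are non-negative and sum to at most one. It is written in Python rather than the dataset's MATLAB, and it assumes independent components (a diagonal covariance) to stay short; the function names and the example parameters are hypothetical and are not taken from the article.

```python
import random


def in_simplex_region(x):
    """True if x is non-negative and lies under the unit simplex (sum <= 1)."""
    return all(xi >= 0.0 for xi in x) and sum(x) <= 1.0


def rejection_estimates(mu, sigma, n_draws, seed=0):
    """Estimate the integral and mean of a simplex-truncated normal.

    Assumption: components are independent, so sigma holds one standard
    deviation per dimension (a diagonal covariance matrix).
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        x = [rng.gauss(m, s) for m, s in zip(mu, sigma)]
        if in_simplex_region(x):
            accepted.append(x)
    if not accepted:
        raise ValueError("no draws accepted; increase n_draws")
    # Fraction accepted estimates the probability mass inside the region.
    integral = len(accepted) / n_draws
    mean = [sum(x[i] for x in accepted) / len(accepted) for i in range(len(mu))]
    return integral, mean


integral, mean = rejection_estimates(mu=[0.3, 0.3], sigma=[0.2, 0.2], n_draws=20000)
print(integral, mean)
```

The acceptance rate drops quickly as the dimension grows or the mean moves away from the simplex, which is consistent with the entry's remark that the most efficient approach depends on the dimension.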

  7. Data from: Truncated normal distribution

    • resodate.org
    • service.tib.eu
    Updated Dec 16, 2024
    Cite
    Masatoshi Uehara; Takeru Matsuda; Fumiyasu Komaki (2024). Truncated normal distribution [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQvdHJ1bmNhdGVkLW5vcm1hbC1kaXN0cmlidXRpb24=
    Explore at:
    Dataset updated
    Dec 16, 2024
    Dataset provided by
    Leibniz Data Manager
    Authors
    Masatoshi Uehara; Takeru Matsuda; Fumiyasu Komaki
    Description

    The dataset used in the paper is a truncated normal distribution with an unknown precision matrix.

  8. Table_1_Application of robust regression in translational neuroscience...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    docx
    Updated Jan 24, 2024
    Cite
    Michael Malek-Ahmadi; Stephen D. Ginsberg; Melissa J. Alldred; Scott E. Counts; Milos D. Ikonomovic; Eric E. Abrahamson; Sylvia E. Perez; Elliott J. Mufson (2024). Table_1_Application of robust regression in translational neuroscience studies with non-Gaussian outcome data.DOCX [Dataset]. http://doi.org/10.3389/fnagi.2023.1299451.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Jan 24, 2024
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Michael Malek-Ahmadi; Stephen D. Ginsberg; Melissa J. Alldred; Scott E. Counts; Milos D. Ikonomovic; Eric E. Abrahamson; Sylvia E. Perez; Elliott J. Mufson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Linear regression is one of the most used statistical techniques in neuroscience, including the study of the neuropathology of Alzheimer’s disease (AD) dementia. However, the practical utility of this approach is often limited because dependent variables are often highly skewed and fail to meet the assumption of normality. Applying linear regression analyses to highly skewed datasets can generate imprecise results, which lead to erroneous estimates derived from statistical models. Furthermore, the presence of outliers can introduce unwanted bias, which affects estimates derived from linear regression models. Although a variety of data transformations can be utilized to mitigate these problems, these approaches are also associated with various caveats. By contrast, a robust regression approach does not impose distributional assumptions on the data, allowing results to be interpreted in a manner similar to those derived using a linear regression analysis. Here, we demonstrate the utility of applying robust regression to the analysis of data derived from studies of human brain neurodegeneration where the error distribution of a dependent variable does not meet the assumption of normality. We show that the application of a robust regression approach to two independent published human clinical neuropathologic data sets provides reliable estimates of associations. We also demonstrate that results from a linear regression analysis can be biased if the dependent variable is significantly skewed, further indicating robust regression as a suitable alternate approach.

  9. DEMANDE Dataset

    • zenodo.org
    • researchdiscovery.drexel.edu
    zip
    Updated Apr 13, 2023
    Cite
    Joseph A. Gallego-Mejia; Fabio A Gonzalez (2023). DEMANDE Dataset [Dataset]. http://doi.org/10.5281/zenodo.7822851
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Joseph A. Gallego-Mejia; Fabio A Gonzalez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the features and probabilities of ten different functions. Each dataset is saved using numpy arrays.

    • The data set Arc corresponds to a two-dimensional random sample drawn from a random vector $$X=(X_1,X_2)$$ with probability density function given by $$f(x_1,x_2)=\mathcal{N}(x_2|0,4)\mathcal{N}(x_1|0.25x_2^2,1)$$ where $$\mathcal{N}(u|\mu,\sigma^2)$$ denotes the density function of a normal distribution with mean $$\mu$$ and variance $$\sigma^2$$. Papamakarios (2017) used this data set to evaluate neural density estimation methods.
    • The data set Potential 1 corresponds to a two-dimensional random sample drawn from a random vector $$X=(X_1,X_2)$$ with probability density function given by $$f(x_1,x_2)=\frac{1}{2}\left(\frac{||x||-2}{0.4}\right)^2 - \ln{\left(\exp\left\{-\frac{1}{2}\left[\frac{x_1-2}{0.6}\right]^2\right\}+\exp\left\{-\frac{1}{2}\left[\frac{x_1+2}{0.6}\right]^2\right\}\right)}$$ with a normalizing constant of approximately 6.52 calculated by Monte Carlo integration.
    • The data set Potential 2 corresponds to a two-dimensional random sample drawn from a random vector $$X=(X_1,X_2)$$ with probability density function given by $$f(x_1,x_2)=\frac{1}{2}\left[\frac{x_2-w_1(x)}{0.4}\right]^2$$ where $$w_1(x)=\sin{\left(\frac{2\pi x_1}{4}\right)}$$ with a normalizing constant of approximately 8 calculated by Monte Carlo integration.
    • The data set Potential 3 corresponds to a two-dimensional random sample drawn from a random vector $$X=(X_1,X_2)$$ with probability density function given by $$f(x_1,x_2)= - \ln{\left(\exp\left\{-\frac{1}{2}\left[\frac{x_2-w_1(x)}{0.35}\right]^2\right\}+\exp\left\{-\frac{1}{2}\left[\frac{x_2-w_1(x)+w_2(x)}{0.35}\right]^2\right\}\right)}$$ where $$w_1(x)=\sin{\left(\frac{2\pi x_1}{4}\right)}$$ and $$w_2(x)=3 \exp \left\{-\frac{1}{2}\left[\frac{x_1-1}{0.6}\right]^2\right\}$$ with a normalizing constant of approximately 13.9 calculated by Monte Carlo integration.
    • The data set Potential 4 corresponds to a two-dimensional random sample drawn from a random vector $$X=(X_1,X_2)$$ with probability density function given by $$f(x_1,x_2)= - \ln{\left(\exp\left\{-\frac{1}{2}\left[\frac{x_2-w_1(x)}{0.4}\right]^2\right\}+\exp\left\{-\frac{1}{2}\left[\frac{x_2-w_1(x)+w_3(x)}{0.35}\right]^2\right\}\right)}$$ where $$w_1(x)=\sin{\left(\frac{2\pi x_1}{4}\right)}$$, $$w_3(x)=3 \sigma \left(\left[\frac{x_1-1}{0.3}\right]^2\right)$$, and $$\sigma(x)= \frac{1}{1+\exp(x)}$$ with a normalizing constant of approximately 13.9 calculated by Monte Carlo integration.
    • The data set 2D mixture corresponds to a two-dimensional random sample drawn from the random vector $$X=(X_1, X_2)$$ with a probability density function given by $$f(x) = \frac{1}{2}\mathcal{N}(x|\mu_1,\Sigma_1) + \frac{1}{2}\mathcal{N}(x|\mu_2,\Sigma_2)$$ with means and covariance matrices $$\mu_1 = [1, -1]^T$$, $$\mu_2 = [-2, 2]^T$$, $$\Sigma_1=\left[\begin{array}{cc} 1 & 0 \\ 0 & 2 \end{array}\right]$$, and $$\Sigma_2=\left[\begin{array}{cc} 2 & 0 \\ 0 & 1 \end{array}\right]$$.
    • The data set 10D-mixture corresponds to a 10-dimensional random sample drawn from the random vector $$X=(X_1,\cdots,X_{10})$$ with a mixture of four diagonal normal probability density functions $$\mathcal{N}(X_i|\mu_i, \sigma_i)$$, where each $$\mu_i$$ is drawn uniformly in the interval $$[-0.5,0.5]$$, and each $$\sigma_i$$ is drawn uniformly in the interval $$[-0.01, 0.5]$$. Each diagonal normal probability density has the same probability, $$1/4$$, of being drawn.
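
The Arc density factorizes as a marginal for $$x_2$$ times a conditional for $$x_1$$ given $$x_2$$, so a sample can be drawn in two steps. The following is a minimal sketch under that reading of the formula, using only the Python standard library rather than the numpy arrays shipped with the dataset; the function name `sample_arc`, the sample size, and the seed are illustrative assumptions.

```python
import random


def sample_arc(n, seed=0):
    """Draw n points (x1, x2) from the Arc density by ancestral sampling."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # x2 ~ N(0, 4): a variance of 4 means a standard deviation of 2.
        x2 = rng.gauss(0.0, 2.0)
        # x1 | x2 ~ N(0.25 * x2^2, 1): the mean bends with x2.
        x1 = rng.gauss(0.25 * x2 ** 2, 1.0)
        points.append((x1, x2))
    return points


points = sample_arc(20000)
mean_x1 = sum(p[0] for p in points) / len(points)
mean_x2 = sum(p[1] for p in points) / len(points)
print(mean_x1, mean_x2)
```

As a sanity check, the expected value of $$x_1$$ is $$0.25 \cdot E[x_2^2] = 0.25 \cdot 4 = 1$$, so the sample mean of the first coordinate should land near 1 and that of the second near 0.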

  10. Binning of measured data with estimations from Poisson distribution and...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated May 31, 2022
    Cite
    Yu, Kwan Ngok; Watabe, Hiroshi; Kwan, Sum; Beni, Mehrdad Shahmohammadi; Islam, M. Rafiqul (2022). Binning of measured data with estimations from Poisson distribution and normal distribution approximation. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000300332
    Explore at:
    Dataset updated
    May 31, 2022
    Authors
    Yu, Kwan Ngok; Watabe, Hiroshi; Kwan, Sum; Beni, Mehrdad Shahmohammadi; Islam, M. Rafiqul
    Description

    Binning of measured data with estimations from Poisson distribution and normal distribution approximation.

  11. CP estimate and FPR on data in normal distribution of size n1 = n2 = 25 with...

    • figshare.com
    • plos.figshare.com
    xls
    Updated May 31, 2023
    + more versions
    Cite
    Yao Wang; Guang Sun; Zhaohua Ji; Chong Xing; Yanchun Liang (2023). CP estimate and FPR on data in normal distribution of size n1 = n2 = 25 with different μ and k. [Dataset]. http://doi.org/10.1371/journal.pone.0029860.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Yao Wang; Guang Sun; Zhaohua Ji; Chong Xing; Yanchun Liang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CP estimate and FPR on data in normal distribution of size n1 = n2 = 25 with different μ and k.

  12. Income Bracket Analysis by Age Group Dataset: Age-Wise Distribution of...

    • neilsberg.com
    csv, json
    Updated Feb 25, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Income Bracket Analysis by Age Group Dataset: Age-Wise Distribution of Normal, IL Household Incomes Across 16 Income Brackets // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/f3616ba2-f353-11ef-8577-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Normal, Illinois
    Variables measured
    Number of households with income $200,000 or more, Number of households with income less than $10,000, Number of households with income between $15,000 - $19,999, Number of households with income between $20,000 - $24,999, Number of households with income between $25,000 - $29,999, Number of households with income between $30,000 - $34,999, Number of households with income between $35,000 - $39,999, Number of households with income between $40,000 - $44,999, Number of households with income between $45,000 - $49,999, Number of households with income between $50,000 - $59,999, and 6 more
    Measurement technique
    The data presented in this dataset are derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across 16 income brackets (mentioned above) following an initial analysis and categorization. Using this dataset, you can find the total number of households within a specific income bracket, along with the number of households in that income bracket for each of the 4 age cohorts (Under 25 years, 25-44 years, 45-64 years and 65 years and over). For additional information about these estimates, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents the household distribution across 16 income brackets among four distinct age groups in Normal: Under 25 years, 25-44 years, 45-64 years, and over 65 years. The dataset highlights the variation in household income, offering valuable insights into economic trends and disparities within different age categories, aiding in data analysis and decision-making.

    Key observations

    • A closer look at the distribution of households among age brackets reveals that there are 4,441 (22.46%) households where the householder is under 25 years old, 6,026 (30.47%) households with a householder aged between 25 and 44 years, 5,308 (26.84%) households with a householder aged between 45 and 64 years, and 4,000 (20.23%) households where the householder is over 65 years old.
    • The age group of 45 to 64 years exhibits the highest median household income, while the largest number of households falls within the 25 to 44 years bracket. This distribution hints at economic disparities within the town of Normal, showcasing varying income levels among different age demographics.
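
The percentages quoted in the first observation follow directly from the four household counts. A short check in Python, using only the counts given above (the dictionary layout and variable names are illustrative):

```python
# Household counts by age of householder, as quoted in the key observations.
households = {
    "Under 25 years": 4441,
    "25 to 44 years": 6026,
    "45 to 64 years": 5308,
    "65 years and over": 4000,
}

total = sum(households.values())

# Share of each age cohort, as a percentage rounded to two decimals.
shares = {age: round(100 * count / total, 2) for age, count in households.items()}

for age, share in shares.items():
    print(f"{age}: {share:.2f}%")
```

The counts sum to 19,775 households, and the computed shares match the 22.46%, 30.47%, 26.84% and 20.23% stated in the listing.
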
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Income brackets:

    • Less than $10,000
    • $10,000 to $14,999
    • $15,000 to $19,999
    • $20,000 to $24,999
    • $25,000 to $29,999
    • $30,000 to $34,999
    • $35,000 to $39,999
    • $40,000 to $44,999
    • $45,000 to $49,999
    • $50,000 to $59,999
    • $60,000 to $74,999
    • $75,000 to $99,999
    • $100,000 to $124,999
    • $125,000 to $149,999
    • $150,000 to $199,999
    • $200,000 or more

    Variables / Data Columns

    • Household Income: This column showcases 16 income brackets ranging from Under $10,000 to $200,000+ (as mentioned above).
    • Under 25 years: The count of households led by a head of household under 25 years old with income within a specified income bracket.
    • 25 to 44 years: The count of households led by a head of household 25 to 44 years old with income within a specified income bracket.
    • 45 to 64 years: The count of households led by a head of household 45 to 64 years old with income within a specified income bracket.
    • 65 years and over: The count of households led by a head of household 65 years old and over with income within a specified income bracket.

    Good to know

    Margin of Error

    Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Normal median household income by age; refer to that dataset for the full data.

  13. Data from: Prior choice and data requirements of Bayesian multivariate mixed...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Aug 13, 2024
    Cite
    Cody Deane (2024). Prior choice and data requirements of Bayesian multivariate mixed effects models fit to tag-recovery data: The need for power analyses [Dataset]. http://doi.org/10.5061/dryad.hmgqnk9h6
    Dataset updated
    Aug 13, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Cody Deane
    Time period covered
    Feb 17, 2023
    Description
    1. Recent empirical studies have quantified correlation between survival and recovery by estimating these parameters as correlated random effects with hierarchical Bayesian multivariate models fit to tag-recovery data. In these applications, increasingly negative correlation between survival and recovery has been interpreted as evidence for increasingly additive harvest mortality. The power of these hierarchical models to detect non-zero correlations has rarely been evaluated, and these few studies have not focused on tag-recovery data, which is a common data type.
    2. We assessed the power of multivariate hierarchical models to detect negative correlation between annual survival and recovery. Using three priors for multivariate normal distributions, we fit hierarchical effects models to a mallard (Anas platyrhynchos) tag-recovery dataset and to simulated data with sample sizes corresponding to different levels of monitoring intensity. We also demonstrate more robust summary statistics for t...

    Instructions and metadata, including the required R packages, are provided at the top of each R script. Software specifications are included in the README file included in the data files. This Dryad repository is for two manuscripts:

    Deane, C. E., L. G. Carlson, C. J. Cunningham, P. Doak, K. Kielland, and G. A. Breed. 2023. Prior choice and data requirements of Bayesian multivariate hierarchical models fit to tag-recovery data: The need for power analyses. Ecology and Evolution 13:e9847. https://doi.org/10.1002/ece3.9847 Note: R scripts specific to the power analysis in this manuscript begin with "P"

    Deane, C. E., L. G. Carlson, C. J. Cunningham, P. Doak, K. Kielland, and G. A. Breed. In prep. Accurately estimating correlations between demographic parameters: a response to Riecke et al. (in press). Ecology and Evolution. Note: New R scripts specific to our response begin with the letter "R"

    Data for the article "Prior choice and data requirements of Bayesian multivariate hierarchical models fit to tag-recovery data: the need for power analyses"

    Reference information

    • File name: README_Dataset-PriorsDataRequirementsBayesianHierarchicalModels.md
    • Authors: C.E. Deane (cdeane2@alaska.edu)
    • Other contributors: L.G. Carlson, C.J. Cunningham, P. Doak, K. Kielland, G.A. Breed.
    • Date of Issue: 2023-02-15
    • Suggested Citations:
      • Dataset citation: Deane, C.E., L.G. Carlson, C.J. Cunningham, P. Doak, K. Kielland, and G.A. Breed. 2023. Data for "Prior choice and data requirements of Bayesian multivariate hierarchical models fit to tag-recovery data: the need for power analyses", Dryad, Dataset, https://doi.org/10.5061/dryad.hmgqnk9h6
      • Software citation: Deane, C.E., L.G. Carlson, C.J. Cunningham, P. Doak, K. Kielland, and G.A. Breed. 2023. Software for "...
  14. Chapter 7: The normal distribution

    • qubeshub.org
    Updated Dec 23, 2020
    Cite
    Raisa Hernández-Pacheco; Alexis Diaz (2020). Chapter 7: The normal distribution [Dataset]. http://doi.org/10.25334/V6P0-A283
    Dataset updated
    Dec 23, 2020
    Dataset provided by
    QUBES
    Authors
    Raisa Hernández-Pacheco; Alexis Diaz
    Description

    Biostatistics Using R: A Laboratory Manual was created with the goals of providing biological content to lab sessions by using authentic research data and introducing R programming language. Chapter 7 introduces the normal distribution.

  15. Parameter estimates of mixed generalized Gaussian distribution for modelling...

    • research-data.cardiff.ac.uk
    • datasetcatalog.nlm.nih.gov
    zip
    Updated Sep 18, 2024
    Cite
    Zoe Salinger; Alla Sikorskii; Michael J. Boivin; Nenad Šuvak; Maria Veretennikova; Nikolai N. Leonenko (2024). Parameter estimates of mixed generalized Gaussian distribution for modelling the increments of electroencephalogram data [Dataset]. http://doi.org/10.17035/d.2023.0277307170
    Available download formats: zip
    Dataset updated
    Sep 18, 2024
    Dataset provided by
    Cardiff University
    Authors
    Zoe Salinger; Alla Sikorskii; Michael J. Boivin; Nenad Šuvak; Maria Veretennikova; Nikolai N. Leonenko
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An electroencephalogram (EEG) is used to monitor a child's brain during coma by recording data on the electrical neural activity of the brain. Signals are captured by multiple electrodes, called channels, located over the scalp. Statistical analyses of EEG data include classification and prediction using arrays of EEG features, but few models for the underlying stochastic processes have been proposed. For this purpose, a new strictly stationary strong mixing diffusion model with marginal multimodal (three-peak) distribution (MixGGDiff) and exponentially decaying autocorrelation function was proposed for modeling increments of EEG data. The increments were treated as discrete-time observations, and a diffusion process whose stationary distribution is a mixture of three non-central generalized Gaussian distributions (MixGGD) was constructed.

    The probability density function of a MixGGD consists of three components and is described using a total of 12 parameters: $\mu_k$, the location parameter of each component; $s_k$, the shape parameter of each component; $\sigma_k^2$, a parameter related to the scale of each component; and $w_k$, the weight of each component, where $k \in \{1, 2, 3\}$ is the index of the component. The parameters of this distribution were estimated using the expectation-maximization algorithm, where the added shape parameter is estimated using the higher-order statistics approach based on an analytical relationship between the shape parameter and kurtosis.

    To illustrate an application of the MixGGDiff to real data, an analysis was performed on EEG data collected in Uganda between 2008 and 2015 from 78 children aged 18 months to 12 years who were in coma due to cerebral malaria. EEGs were recorded using the International 10–20 system with a sampling rate of 500 Hz and an average record duration of 30 min. The EEG signal for every child was the result of a recording from 19 channels. MixGGD was fitted to each channel of every child's recording separately, hence for each channel a total of 12 parameter estimates were obtained.

    The data are presented in matrix form (dimension 79 × 228) in a .csv file. The first of the 79 rows is a header row containing the names of the variables, and each of the subsequent 78 rows holds the parameter estimates for one instance (i.e. one child, without identifiers that could be related back to a specific child). There are a total of 228 columns (19 channels times 12 parameter estimates), where each column represents one parameter estimate of one component of MixGGD, in channel order: columns 1 to 12 refer to parameter estimates on the first channel, columns 13 to 24 to parameter estimates on the second channel, and so on. Each variable name starts with "chi", where "ch" is an abbreviation of "channel" and i is the order of the channel in the EEG recording. The remaining characters in a variable name give the parameter estimate name of the MixGGD component; for example, "ch3sigmasq1" refers to the parameter estimate of $\sigma^2$ of the first component of MixGGD obtained from EEG increments on the third channel. Parameter estimates contained in the .csv file are all real numbers between -671.11 and 259326.96.

    Research results based upon these data are published at https://doi.org/10.1007/s00477-023-02524-y
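The column layout described above (19 channels times 12 parameter estimates, grouped channel by channel) can be sketched in Python as follows. This is only an illustration under that stated layout: the helper name `channel_columns` is hypothetical and is not part of the dataset or its documentation.

```python
N_CHANNELS = 19          # EEG channels per recording (from the description)
PARAMS_PER_CHANNEL = 12  # MixGGD parameter estimates per channel (from the description)


def channel_columns(channel: int) -> range:
    """Return the 1-based column numbers holding the estimates for `channel`.

    Hypothetical helper: assumes columns are grouped channel by channel,
    as the description states (columns 1-12 for channel 1, 13-24 for channel 2).
    """
    if not 1 <= channel <= N_CHANNELS:
        raise ValueError(f"channel must be in 1..{N_CHANNELS}")
    first = (channel - 1) * PARAMS_PER_CHANNEL + 1
    return range(first, first + PARAMS_PER_CHANNEL)


# Columns for the second channel, and the total column count.
print(list(channel_columns(2)))
print(N_CHANNELS * PARAMS_PER_CHANNEL)
```

The same arithmetic, shifted by one, would give 0-based positions if the file were loaded into a structure indexed from zero.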

  16. Shapiro tests for normality of phenotypic trait data. Fresh weight (FW) was...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Apr 29, 2025
    Cite
    Chutimanukul, Panita; Sueachuen, Suchalee; Chiangklang, Tanawut; Korinsak, Siripar; Darwell, Clive Terence; Mosaleeyanon, Kriengkrai; Wanichananan, Praderm; Janta, Supattana (2025). Shapiro tests for normality of phenotypic trait data. Fresh weight (FW) was log transformed attaining normality while dry weight (DW) was transformed using cube roots but did not attain a normal distribution. p < 0.05 indicates a violation of normally distributed data. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002066130
    Dataset updated
    Apr 29, 2025
    Authors
    Chutimanukul, Panita; Sueachuen, Suchalee; Chiangklang, Tanawut; Korinsak, Siripar; Darwell, Clive Terence; Mosaleeyanon, Kriengkrai; Wanichananan, Praderm; Janta, Supattana
    Description

    Shapiro tests for normality of phenotypic trait data. Fresh weight (FW) was log transformed attaining normality while dry weight (DW) was transformed using cube roots but did not attain a normal distribution. p < 0.05 indicates a violation of normally distributed data.

  17. normal_distribution_dataset

    • huggingface.co
    Updated Dec 1, 2025
    Cite
    Hiroki Kobayashi (2025). normal_distribution_dataset [Dataset]. https://huggingface.co/datasets/koba-jon/normal_distribution_dataset
    Dataset updated
    Dec 1, 2025
    Authors
    Hiroki Kobayashi
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Normal Distribution Dataset

    1-dimensional shape dataset generated by random numbers of normal distribution

    1. Usage

    (1) Original Dataset

    Get

    $ git clone https://huggingface.co/datasets/koba-jon/normal_distribution_dataset
    $ cd normal_distribution_dataset/NormalDistribution
    $ ls -l

    Hierarchy

    train : training data (100,000 pieces) of 300 dimensions

    train
      |--0
          |--00000.dat
          |--00001.dat
          |  ...
          |--09999.dat
      |--1
      |  ...
      |--9…

    See the full description on the dataset page: https://huggingface.co/datasets/koba-jon/normal_distribution_dataset.

  18. (Gamma-ray Spectroscopy) Distribution Dataset v1

    • kaggle.com
    zip
    Updated Jul 13, 2023
    Cite
    Özgün Büyüktanır (2023). (Gamma-ray Spectroscopy) Distribution Dataset v1 [Dataset]. https://www.kaggle.com/datasets/zgnbyktanr/gamma-ray-spectroscopy-gaussdis-with-noise-1
    Available download formats: zip (193213 bytes)
    Dataset updated
    Jul 13, 2023
    Authors
    Özgün Büyüktanır
    License

    CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This data set is similar to gamma-ray spectroscopy data and is designed for machine-learning data analysis. This dataset is generated by computer.

    Scientific Information about Dataset

    In gamma-ray spectroscopy, data is generated by capturing the number of emissions within a specific channel range of the radiation emitted by the sample. In scientific data, the sample produces photopeaks that exhibit a Gaussian distribution when statistically examined. Each photopeak is described here as a Gaussian (normal) distribution with three parameters.

    Figure (Gaussian distribution): https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F7989877%2F3854c6aa9a72ee0d2558ee878194a7be%2FGauss_dis%20-%20Kopya.png?generation=1689283862004724&alt=media

    • x0 : Center (mean) of the Gaussian distribution
    • σ (sigma): Width of the Gaussian distribution
    • N : Number of occurrences of the event

    for more information: https://en.wikipedia.org/wiki/Normal_distribution

    In Gamma Ray Spectroscopy

    • x0 = Photopeak location
    • σ (sigma) ∝ Detector resolution
    • N ∝ Activity of the sample

    Co-60 Gamma-ray Spectroscopy Example: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F7989877%2F8fad8994bf11dca48657dc3d3e21f628%2Fco60-repc.png?generation=1689324671029928&alt=media

    Dataset Content

    cha_ : Number of radiations captured by each channel, covering channels 0 to 2000 in intervals of width 10

    • cha_5 : Number of radiations captured by the channel between 0-10
    • cha_15: Number of radiations captured by the channel between 10-20
    • cha_25: Number of radiations captured by the channel between 20-30
    • ...
    • cha_n: Number of radiations captured by the channel between (n-5)-(n+5)
    • x0 : Center (mean) of the Gaussian distribution, i.e. the photopeak location
    • sigma : Width of the Gaussian distribution
    • N: Number of emissions in the Gaussian distribution range
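As a rough illustration of the column layout and the three peak parameters listed above, the Python sketch below builds the `cha_` column names and evaluates a noise-free Gaussian photopeak on the same bins. The function names are hypothetical, and the normalization (N taken as the total area under the peak, each bin approximated by the density at its center times the bin width) is an assumption; the description does not give the dataset's actual generating formula or noise model.

```python
import math

BIN_WIDTH = 10      # channel width stated in the description
MAX_CHANNEL = 2000  # upper end of the channel range stated in the description


def channel_names():
    """Column names cha_5, cha_15, ..., one per bin center."""
    return [f"cha_{c}" for c in range(BIN_WIDTH // 2, MAX_CHANNEL, BIN_WIDTH)]


def gaussian_counts(x0, sigma, n):
    """Expected counts per bin for a Gaussian photopeak centered at x0.

    Assumption: n is the total area under the peak, and each bin's count is
    the Gaussian density at the bin center times the bin width.
    """
    norm = n * BIN_WIDTH / (sigma * math.sqrt(2 * math.pi))
    centers = range(BIN_WIDTH // 2, MAX_CHANNEL, BIN_WIDTH)
    return [norm * math.exp(-((c - x0) ** 2) / (2 * sigma ** 2)) for c in centers]


names = channel_names()
counts = gaussian_counts(x0=1005, sigma=50, n=10_000)
peak = names[counts.index(max(counts))]
print(len(names), names[0], names[-1], peak)
```

With these illustrative values the largest count falls in the bin whose center equals x0, which matches the reading of x0 as the photopeak location and sigma as the peak width.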
  19. Normal, IL Population Breakdown by Gender and Age Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 19, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Normal, IL Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/8e35ef39-c989-11ee-9145-3860777c1fe6/
    Available download formats: json, csv
    Dataset updated
    Feb 19, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Normal, Illinois
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Normal by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Normal. The dataset can be used to understand the population distribution of Normal by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Normal. It can also be used to see how the male-to-female ratio changes across age groups, from the youngest to the most senior.

    Key observations

    Largest age group (population): Male # 20-24 years (5,421) | Female # 20-24 years (6,596). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender :

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the Normal population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): The male population in Normal is shown in this column.
    • Population (Female): The female population in Normal is shown in this column.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Normal for each age group.
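The Gender Ratio column is defined above as the number of males per 100 females. A minimal Python sketch of that calculation, using the 20-24 age group counts quoted under "Key observations", might look like this (the function name `gender_ratio` is hypothetical and not part of the dataset):

```python
def gender_ratio(males: int, females: int) -> float:
    """Males per 100 females, the definition given for the Gender Ratio column."""
    if females <= 0:
        raise ValueError("female population must be positive")
    return 100 * males / females


# Counts for the 20-24 age group quoted under "Key observations".
print(round(gender_ratio(5421, 6596), 1))  # 82.2
```

A value below 100 means females outnumber males in that age group.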

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Normal Population by Gender; refer to that dataset for the full data.

  20. The effect of changing from a normal to a skew-normal distribution on the...

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 4, 2023
    Cite
    Joseph B. Sempa; Theresa M. Rossouw; Emmanuel Lesaffre; Martin Nieuwoudt (2023). The effect of changing from a normal to a skew-normal distribution on the random-effects on the regression coefficients, with 95% credible intervals, in the asymptote model. [Dataset]. http://doi.org/10.1371/journal.pone.0224723.t005
    Available download formats: xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Joseph B. Sempa; Theresa M. Rossouw; Emmanuel Lesaffre; Martin Nieuwoudt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The effect of changing from a normal to a skew-normal distribution on the random-effects on the regression coefficients, with 95% credible intervals, in the asymptote model.
