100+ datasets found
  1. Results and analysis using the Lean Six-Sigma define, measure, analyze,...

    • researchdata.up.ac.za
    docx
    Updated Mar 12, 2024
    Cite
    Modiehi Mophethe (2024). Results and analysis using the Lean Six-Sigma define, measure, analyze, improve, and control (DMAIC) Framework [Dataset]. http://doi.org/10.25403/UPresearchdata.25370374.v1
    Explore at:
    docx
    Dataset updated
    Mar 12, 2024
    Dataset provided by
    University of Pretoria
    Authors
    Modiehi Mophethe
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This section presents a discussion of the research data. The data were received as secondary data; however, they were originally collected using time study techniques. Data validation is a crucial step in the data analysis process to ensure that the data are accurate, complete, and reliable. Descriptive statistics were used to validate the data: the computed mean, mode, standard deviation, variance, and range summarise the data distribution and assist in identifying outliers or unusual patterns. The dataset reports the measures of central tendency, namely the mean, median, and mode. The mean is the average value of each factor presented in the tables; it is the balance point of the dataset and describes its typical value and behaviour. The median is the middle value of the dataset for each factor, the point that divides the dataset into two halves, with half of the values below it and half above; it is especially informative for skewed distributions. The mode is the most common value in the dataset and was used to describe the most typical observation. Together these values describe the central value around which the data are distributed. Because the mean, median, and mode are neither similar nor close to one another, they indicate a skewed distribution.
    The dataset also presents the results and their discussion. This part focuses on customising the DMAIC (Define, Measure, Analyse, Improve, Control) framework to address the specific concerns outlined in the problem statement. To gain a comprehensive understanding of the current process, value stream mapping was employed, enhanced by measuring the factors that contribute to inefficiencies. These factors were then analysed and ranked by impact using factor analysis.
To mitigate the impact of the most influential factor on project inefficiencies, a solution is proposed using the EOQ (Economic Order Quantity) model. The implementation of the 'CiteOps' software facilitates improved scheduling, monitoring, and task delegation in the construction project through digitalisation. Furthermore, project progress and efficiency are monitored remotely and in real time. In summary, the DMAIC framework was tailored to suit the requirements of the specific project, incorporating techniques from inventory management, project management, and statistics to effectively minimise inefficiencies within the construction project.
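    The validation statistics and the EOQ calculation described above can be sketched as follows. The cycle-time observations and the EOQ inputs (annual demand D, ordering cost S, holding cost H) are illustrative stand-ins, not values from the dataset:

```python
import math
import statistics as st

# Hypothetical cycle-time observations (minutes); stand-ins for the
# time-study data discussed above.
times = [12.5, 14.0, 12.5, 18.0, 13.5, 12.5, 25.0, 14.5]

mean = st.mean(times)
median = st.median(times)
mode = st.mode(times)
stdev = st.stdev(times)        # sample standard deviation
variance = st.variance(times)  # sample variance
value_range = max(times) - min(times)

# A mean well above the median and mode hints at a right-skewed
# distribution, as the dataset's discussion notes.
skewed = mean > median

# EOQ model from the improvement step: EOQ = sqrt(2*D*S/H), with
# illustrative annual demand D, ordering cost S, and holding cost H.
D, S, H = 1200, 50.0, 2.5
eoq = math.sqrt(2 * D * S / H)
```

Because mean > median > mode here, the toy sample is right-skewed, mirroring the skew the dataset's own central-tendency measures reveal.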

  2. Median Household Income Variation by Family Size in South Range, MI:...

    • neilsberg.com
    csv, json
    Updated Jan 11, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Median Household Income Variation by Family Size in South Range, MI: Comparative analysis across 7 household sizes [Dataset]. https://www.neilsberg.com/research/datasets/1b74898b-73fd-11ee-949f-3860777c1fe6/
    Explore at:
    json, csv
    Dataset updated
    Jan 11, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Michigan, South Range
    Variables measured
    Household size, Median Household Income
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across 7 household sizes (mentioned above) following an initial analysis and categorization. Using this dataset, you can find out how household income varies with the size of the family unit. For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents median household incomes for various household sizes in South Range, MI, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.

    Key observations

    • Of the 7 household sizes (1 person to 7-or-more person households) reported by the census bureau, South Range did not include 4, 5, 6, or 7-person households. Across the different household sizes in South Range the mean income is $51,844, and the standard deviation is $18,238. The coefficient of variation (CV) is 35.18%. This high CV indicates high relative variability, suggesting that the incomes vary significantly across different sizes of households.
    • In the most recent year, 2021, the smallest household size for which the bureau reported a median household income was 1-person households, at $31,226. The income rises to $65,869 for 3-person households, the largest household size for which the bureau reported a median household income.
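    The coefficient of variation quoted in the key observations can be reproduced directly from the published summary statistics; this sketch simply recomputes CV = standard deviation / mean from the figures above:

```python
# Recomputing the coefficient of variation from the summary statistics
# in the key observations (mean $51,844, standard deviation $18,238).
mean_income = 51_844
stdev_income = 18_238
cv = stdev_income / mean_income * 100   # ≈ 35.18%
```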

    Chart: South Range, MI median household income, by household size (in 2022 inflation-adjusted dollars) (image: https://i.neilsberg.com/ch/south-range-mi-median-household-income-by-household-size.jpeg)

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Household Sizes:

    • 1-person households
    • 2-person households
    • 3-person households
    • 4-person households
    • 5-person households
    • 6-person households
    • 7-or-more-person households

    Variables / Data Columns

    • Household Size: This column showcases 7 household sizes ranging from 1-person households to 7-or-more-person households (As mentioned above).
    • Median Household Income: Median household income, in 2022 inflation-adjusted dollars for the specific household size.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for South Range median household income, which it can be referred back to.

  3. Mean-variance data collections for multiperiod portfolio optimization...

    • repod.icm.edu.pl
    docx, txt, zip
    Updated Mar 28, 2025
    Cite
    Juszczuk, Przemysław; Kaliszewski, Ignacy; Miroforidis, Janusz; Podkopaev, Dmitry (2025). Mean-variance data collections for multiperiod portfolio optimization problems [Dataset]. http://doi.org/10.18150/6CV7RS
    Explore at:
    zip(30669516), zip(17297512), zip(47887514), zip(122815738), zip(93996163), zip(68964388), zip(154465266), txt(2847), zip(7777092), docx(23361)
    Dataset updated
    Mar 28, 2025
    Dataset provided by
    RepOD
    Authors
    Juszczuk, Przemysław; Kaliszewski, Ignacy; Miroforidis, Janusz; Podkopaev, Dmitry
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Dataset funded by
    National Science Centre (Poland)
    Description

    Mean-variance data collections for portfolio optimization problems based on time series of daily stock prices from the New York Stock Exchange (NYSE). These data collections can be used for investment portfolio optimization research. The NYSE includes over 2000 stocks; these data include randomly selected sets of size 200, 300, 400, 500, 600, 700, 800, and 900. Each set includes the successive yearly data from 2014, 2015, 2016, 2017, 2018, and 2019, grouped in a single folder. Each folder includes the data saved in a text file following the format used by J. E. Beasley in the OR Library (http://people.brunel.ac.uk/~mastjjb/jeb/orlib/portinfo.html). The example structure of files in the 200 set is presented below:
    • 2014: JKMP2_200_2014_1, JKMP2_200_2014_2, JKMP2_200_2014_3, ..., JKMP2_200_2014_12
    • 2015: JKMP2_200_2015_1, JKMP2_200_2015_2, JKMP2_200_2015_3, ..., JKMP2_200_2015_12
    • 2016: JKMP2_200_2016_1, JKMP2_200_2016_2, JKMP2_200_2016_3, ..., JKMP2_200_2016_12
    • 2017: JKMP2_200_2017_1, JKMP2_200_2017_2, JKMP2_200_2017_3, ..., JKMP2_200_2017_12
    • 2018: JKMP2_200_2018_1, JKMP2_200_2018_2, JKMP2_200_2018_3, ..., JKMP2_200_2018_12
    • 2019: JKMP2_200_2019_1, JKMP2_200_2019_2, JKMP2_200_2019_3, ..., JKMP2_200_2019_12
    A detailed description of the data collections can be found in the README file of the dataset. For each set of stocks, we estimated the correlation matrix and the vector of mean returns based on the corresponding time series.
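    The last step, estimating the vector of mean returns and the correlation matrix from price time series, can be sketched as follows. The three-stock price series is randomly generated for illustration; the actual collections use 200-900 NYSE stocks per set:

```python
import math
import random

random.seed(0)

# Toy daily price series for 3 stocks over 60 trading days (illustrative).
n_days, n_stocks = 60, 3
prices = [[100.0] * n_stocks]
for _ in range(n_days - 1):
    prices.append([p * (1 + random.gauss(0.0005, 0.01)) for p in prices[-1]])

# Daily simple returns from successive prices.
returns = [[prices[t][k] / prices[t - 1][k] - 1 for k in range(n_stocks)]
           for t in range(1, n_days)]

# Vector of mean returns and correlation matrix, as estimated per set.
n = len(returns)
means = [sum(r[k] for r in returns) / n for k in range(n_stocks)]

def cov(i: int, j: int) -> float:
    """Sample covariance between the return series of stocks i and j."""
    return sum((r[i] - means[i]) * (r[j] - means[j]) for r in returns) / (n - 1)

corr = [[cov(i, j) / math.sqrt(cov(i, i) * cov(j, j)) for j in range(n_stocks)]
        for i in range(n_stocks)]
```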

  4. Replication Data for: Predicting measurement error variance in social...

    • b2find.eudat.eu
    Updated Sep 14, 2017
    + more versions
    Cite
    (2017). Replication Data for: Predicting measurement error variance in social surveys - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/73455c11-0c52-5be5-a1ae-deab8266e7c6
    Explore at:
    Dataset updated
    Sep 14, 2017
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Social science commonly studies relationships among variables by employing survey questions. Answers to these questions will contain some degree of measurement error, distorting the relationships of interest. Such distortions can be removed by standard statistical methods, when these are provided knowledge of a question’s measurement error variance. However, acquiring this information routinely necessitates additional experimentation, which is infeasible in practice. We use three decades’ worth of survey experiments combined with machine learning methods to show that survey measurement error variance can be predicted from the way a question was asked. By predicting experimentally obtained estimates of survey measurement error variance from question characteristics, we enable researchers to obtain estimates of the extent of measurement error in a survey question without requiring additional data collection. Our results suggest only some commonly accepted best practices in survey design have a noticeable impact on study quality, and that predicting measurement error variance is a useful approach to removing this impact in future social surveys. This repository accompanies the full paper, and allows users to reproduce all results.
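    A minimal sketch of why a question's measurement error variance is the needed input: for standardized variables, reliability is one minus the error variance, and the classical disattenuation formula removes the distortion once that variance is known. All numbers below are hypothetical:

```python
import math

# Hypothetical predicted error variances for two standardized survey items.
err_var_x, err_var_y = 0.20, 0.30
rel_x, rel_y = 1 - err_var_x, 1 - err_var_y   # reliabilities

# Measurement error attenuates an (assumed) true correlation of 0.50...
true_corr = 0.50
observed_corr = true_corr * math.sqrt(rel_x * rel_y)

# ...and knowing the error variances lets us disattenuate the estimate.
corrected = observed_corr / math.sqrt(rel_x * rel_y)
```

This is the sense in which predicted error variances substitute for additional experimentation: they supply the reliabilities the correction needs.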

  5. Dataset of books series that contain Linkages between excess currency and...

    • workwithdata.com
    Updated Nov 25, 2024
    Cite
    Work With Data (2024). Dataset of books series that contain Linkages between excess currency and stock market returns : Granger causality in mean and variance [Dataset]. https://www.workwithdata.com/datasets/book-series?f=1&fcol0=j0-book&fop0=%3D&fval0=Linkages+between+excess+currency+and+stock+market+returns+:+Granger+causality+in+mean+and+variance&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 25, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book series. It has 1 row and is filtered where the books is Linkages between excess currency and stock market returns : Granger causality in mean and variance. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  6. _Attention what is it like [Dataset]

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Mar 7, 2021
    Cite
    Dinis Pereira, Vitor Manuel (2021). _Attention what is it like [Dataset] [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_780412
    Explore at:
    Dataset updated
    Mar 7, 2021
    Dataset authored and provided by
    Dinis Pereira, Vitor Manuel
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R Core Team. (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing.

    Supplement to Occipital and left temporal instantaneous amplitude and frequency oscillations correlated with access and phenomenal consciousness (https://philpapers.org/rec/PEROAL-2).

    Occipital and left temporal instantaneous amplitude and frequency oscillations correlated with access and phenomenal consciousness move from the features of the ERP characterized in Occipital and Left Temporal EEG Correlates of Phenomenal Consciousness (Pereira, 2015, https://doi.org/10.1016/b978-0-12-802508-6.00018-1, https://philpapers.org/rec/PEROAL) towards the instantaneous amplitude and frequency of event-related changes correlated with a contrast in access and in phenomenology.

    Occipital and left temporal instantaneous amplitude and frequency oscillations correlated with access and phenomenal consciousness proceeds as follows.

    In the first section, empirical mode decomposition (EMD) with post-processing (Xie, G., Guo, Y., Tong, S., and Ma, L., 2014. Calculate excess mortality during heatwaves using Hilbert-Huang transform algorithm. BMC Medical Research Methodology, 14, 35), Ensemble Empirical Mode Decomposition (postEEMD), and the Hilbert-Huang Transform (HHT) are applied.

    In the second section, the variance inflation factor (VIF) is calculated.

    In the third section, partial least squares regression (PLSR) is used to find the minimal root mean squared error of prediction (RMSEP).

    In the last section, partial least squares regression (PLSR) is used with the significance multivariate correlation (sMC) statistic.
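    The variance inflation factor computed in the second section has the standard form VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the remaining predictors. A minimal pure-Python sketch with two illustrative, nearly collinear predictors:

```python
# Two illustrative predictors; x2 is nearly collinear with x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]

n = len(x1)
mean1, mean2 = sum(x1) / n, sum(x2) / n
sxx = sum((a - mean1) ** 2 for a in x1)
sxy = sum((a - mean1) * (b - mean2) for a, b in zip(x1, x2))
syy = sum((b - mean2) ** 2 for b in x2)

# R^2 of x2 regressed on x1 (simple linear regression).
r_squared = sxy ** 2 / (sxx * syy)

# A large VIF flags multicollinearity among predictors.
vif = 1 / (1 - r_squared)
```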

  7. Variance Analysis Dataset - Yields N-level Environment

    • repository.soilwise-he.eu
    Cite
    Variance Analysis Dataset - Yields N-level Environment [Dataset]. https://repository.soilwise-he.eu/cat/collections/metadata:main/items/4fcaa48d-aef8-4f80-aa54-5dd992ad4333
    Explore at:
    Description

    This table (Variance Analysis Dataset - Yields N-level Environment) is part of a larger file dataset that contains processed data and information used in the meta-analysis “Yield development of German winter wheat between 1958 and 2015” of the project “Data-Meta Analysis to assess the productivity development of cultivated plants”, funded by the DFG. This table contains the final data used for the variance analysis in this project, derived from the entire dataset, which comprises the following data:
    • Winter wheat (Triticum aestivum) yields and nitrogen application amounts from nitrogen fertilization experiments of variable duration (1-6 years), carried out at 43 locations across Germany between 1958 and 2015 and found in 34 different sources in the literature.
    • The derived maximum yields (Ymax) and optimal nitrogen amounts (Nopt) from the nitrogen experiments, function coefficients, and statistics.
    • Geographical information (latitude, longitude, altitude) and other site-specific information for the experimental sites (soil type, soil yield potential, mean annual temperature, mean annual precipitation, mean annual climatic water balance, soil climate region, cultivation region).
    • Processed phenological and climatic data for each experimental site.
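    The variance analysis such a table feeds is, in outline, a one-way partitioning of yield variance into between-N-level and within-level components. The sketch below uses made-up yields, not the project data:

```python
# Illustrative yields (dt/ha) grouped by nitrogen level.
groups = {
    "N0":   [45.0, 48.0, 44.0],
    "N100": [62.0, 65.0, 63.0],
    "N200": [70.0, 73.0, 71.0],
}

all_vals = [v for vals in groups.values() for v in vals]
grand_mean = sum(all_vals) / len(all_vals)

# Between-group and within-group sums of squares.
ss_between = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
                 for v in groups.values())
ss_within = sum((x - sum(v) / len(v)) ** 2
                for v in groups.values() for x in v)

# F statistic: between-group mean square over within-group mean square.
df_between = len(groups) - 1
df_within = len(all_vals) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)
```

A large F statistic indicates that the N level explains far more yield variation than the within-level noise, which is the question this table is built to answer.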

  8. Dataset for modeling spatial and temporal variation in natural background...

    • catalog.data.gov
    Updated Nov 12, 2020
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Dataset for modeling spatial and temporal variation in natural background specific conductivity [Dataset]. https://catalog.data.gov/dataset/dataset-for-modeling-spatial-and-temporal-variation-in-natural-background-specific-conduct
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    This file contains the data set used to develop a random forest model to predict background specific conductivity for stream segments in the contiguous United States. This Excel-readable file contains 56 columns of parameters evaluated during development. The data dictionary provides the definitions of the abbreviations and the measurement units. Each row is a unique sample, described as R**, which encodes the NHD Hydrologic Unit, an underscore, up to a 7-digit COMID, an underscore, and the sequential sample month. To develop models that make stream-specific predictions across the contiguous United States, we used the StreamCat data set and process (Hill et al. 2016; https://github.com/USEPA/StreamCat). The StreamCat data set is based on a network of stream segments from NHD+ (McKay et al. 2012). These stream segments drain an average area of 3.1 km2 and thus define the spatial grain size of this data set. The data set consists of minimally disturbed sites representing the natural variation in environmental conditions that occur in the contiguous 48 United States. More than 2.4 million SC observations were obtained from STORET (USEPA 2016b), state natural resource agencies, the U.S. Geological Survey (USGS) National Water Information System (NWIS) (USGS 2016), and data used in Olson and Hawkins (2012) (Table S1). Data include observations made between 1 January 2001 and 31 December 2015, and are thus coincident with Moderate Resolution Imaging Spectroradiometer (MODIS) satellite data (https://modis.gsfc.nasa.gov/data/). Each observation was related to the nearest stream segment in the NHD+. Data were limited to one observation per stream segment per month; SC observations with ambiguous locations and repeat measurements along a stream segment in the same month were discarded. Using estimates of anthropogenic stress derived from the StreamCat database (Hill et al. 2016), segments were selected with minimal amounts of human activity (Stoddard et al. 2006) using criteria developed for each Level II Ecoregion (Omernik and Griffith 2014). Segments were considered potentially minimally stressed where watersheds had 0-0.5% impervious surface, 0-5% urban, 0-10% agriculture, and population densities from 0.8-30 people/km2 (Table S3). Watersheds whose observations had large residuals in initial models were identified and inspected for evidence of other human activities not represented in StreamCat (e.g., mining, logging, grazing, or oil/gas extraction). Observations were removed from watersheds that were disturbed, tidally influenced, or had unusual geologic conditions such as hot springs. About 5% of SC observations in each National Rivers and Streams Assessment (NRSA) region were then randomly selected as independent validation data; the remaining observations became the large training data set for model calibration. This dataset is associated with the following publication: Olson, J., and S. Cormier. Modeling spatial and temporal variation in natural background specific conductivity. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 53(8): 4316-4325, (2019).
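    The validation split described above (about 5% of observations per NRSA region held out at random) can be sketched as follows. The region labels and observation counts are illustrative, not taken from the dataset:

```python
import random

random.seed(42)

# Illustrative observations tagged with a (hypothetical) NRSA region label.
regions = ("West", "Midwest", "East")
observations = [(f"obs{i}", region) for region in regions for i in range(200)]

# Hold out ~5% of observations per region as independent validation data;
# the rest become the training set for model calibration.
validation, training = [], []
for region in regions:
    in_region = [o for o in observations if o[1] == region]
    held_out = set(random.sample(range(len(in_region)), k=len(in_region) // 20))
    for idx, obs in enumerate(in_region):
        (validation if idx in held_out else training).append(obs)
```

Stratifying the hold-out by region keeps the validation set representative of every NRSA region rather than letting one data-rich region dominate it.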

  9. Data from: Variation in trends of consumption based carbon accounts

    • data.europa.eu
    unknown
    Updated Jan 24, 2020
    Cite
    The citation is currently not available for this dataset.
    Explore at:
    unknown(103992)
    Dataset updated
    Jan 24, 2020
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this work we present results of all the major global models and normalise the model results by looking at changes over time relative to a common base-year value. We analyse the variability across the models, both before and after normalisation, to give insight into variance at national and regional level. A dataset of harmonised results (based on means) and measures of dispersion is presented, providing a baseline dataset for CBCA validation and analysis. The dataset is intended as a go-to dataset for country and regional results of consumption- and production-based accounts. The normalised mean for each country/region is the principal result that can be used to assess the magnitude and trend in the emission accounts. An additional key element of the dataset is the set of measures of robustness and spread of the results across the source models; these metrics indicate how much trust should be placed in the individual country/region results. Code at https://doi.org/10.5281/zenodo.3181930
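    The normalisation described above can be sketched as follows: each model's series is divided by its own value in a common base year, then a harmonised mean and a simple spread measure are taken across models. The model names, years, and values below are illustrative:

```python
# Illustrative emissions series (arbitrary units) for two source models.
models = {
    "modelA": {2000: 500.0, 2005: 520.0, 2010: 560.0},
    "modelB": {2000: 480.0, 2005: 505.0, 2010: 540.0},
}
base_year = 2000

# Normalise each model's series by its own base-year value.
normalised = {
    name: {yr: val / series[base_year] for yr, val in series.items()}
    for name, series in models.items()
}

# Harmonised mean and a simple spread (range) across models per year.
years = sorted(next(iter(models.values())))
harmonised_mean = {
    yr: sum(normalised[m][yr] for m in models) / len(models) for yr in years
}
spread = {
    yr: max(normalised[m][yr] for m in models)
        - min(normalised[m][yr] for m in models)
    for yr in years
}
```

Normalising to a common base year removes level differences between models, so the spread in later years reflects disagreement about trends rather than absolute magnitudes.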

  10. Dual Simulated SEND and EDX dataset

    • zenodo.org
    zip
    Updated Apr 24, 2024
    Cite
    Andy Bridger; Andy Bridger (2024). Dual Simulated SEND and EDX dataset [Dataset]. http://doi.org/10.5281/zenodo.11059004
    Explore at:
    zip
    Dataset updated
    Apr 24, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andy Bridger; Andy Bridger
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A simulated SEND + EDX dataset along with the code used to produce it.

    • code
      • SEND_ground_Truth_Segment_Model-AB.ipynb (notebook outlining the code for end-to-end data production; some of the actual SEND data production is done through the gen_data.py and add_noise.py files, since system memory requirements make a cluster job more convenient)
      • gen_data.py (python file for creating an intermediate simulated SEND dataset)
      • add_noise.py (python file that takes the intermediate SEND dataset and samples it to produce pseudo-experimental data)
    • phase_maps
      • Contains pairs of jpg/npy files that show/quantify the proportion of each phase at each pixel location, as constructed in the atomic model
    • data
      • SEND.hspy (the simulated SEND dataset as per the atomic model)
      • EDS.hspy (the simulated EDS dataset)
      • EDS-varied-dose.zip (EDS simulations at different electron doses)
      • atomic_model.xyz (ASE atomic model for the simulated data)
      • labelled_voxels.npy (the phase labels for the 3d array of volumetric-pixels used to produce the atomic model)

    Added in a newer version: the VAE processing of the SEND data has been included

    • data
      • RadialData
        • data_radial.hspy (A radial transformation of the simulated SEND dataset used for VAE testing)
        • data_radial_training_data.hspy (The data_radial.hspy dataset but with pattern populations reweighted to better represent high variance regions)
        • navigation_axis_variance.npy (The mean variance within the 2D diffraction signal at each pixel probe position)
        • signal_axis_variance.npy (The variance of each signal pixel in the 2D diffraction pattern, averaged for each pixel probe position)
        • RadialModel
          • best_model.hdf5 (The trained weights for the VAE)
        • PCA_comps_mse
          • N (The number of PCA components used to estimate the centroids of the clustering)
            • latspacedata.npy (The coordinates of the simulated SEND data in the 2d latent space)
            • mapdata.npy (The assigned cluster label to each of the patterns in the simulated SEND data)
            • Regions
              • i.jpg (the region and radial pattern of each of the clusters)
        • ML_clusters_mse
          • encode_data.npy (The coordinates of the simulated SEND data in the 2d latent space)
          • enc_mask.npy (The encoded data transformed into a density based fixed image)
          • ml_cluster_map.npy (The ML predictions of the centroid locations)
          • ML_clusters
            • N (This folder is then the same as the PCA equivalent)

  11. Matlab script Stress2Grid - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Jun 29, 2007
    Cite
    (2007). Matlab script Stress2Grid - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/214253a7-91ea-5351-8f46-4c162cb8ac2e
    Explore at:
    Dataset updated
    Jun 29, 2007
    Description

    The distribution of data records for the maximum horizontal stress orientation SHmax in the Earth's crust is sparse and very uneven. In order to analyse the stress pattern and its wavelength, or to predict the mean SHmax orientation on a regular grid, statistical interpolation as conducted e.g. by Coblentz and Richardson (1995), Müller et al. (2003), Heidbach and Höhne (2008), Heidbach et al. (2010) or Reiter et al. (2014) is necessary. Based on their work we wrote the Matlab® script Stress2Grid, which provides several features to analyse the mean SHmax pattern. The script facilitates and speeds up this analysis and extends the functionality compared to the aforementioned publications. The script is complemented by a number of example and input files, as described in the WSM Technical Report (Ziegler and Heidbach, 2017, http://doi.org/10.2312/wsm.2017.002).
    The script provides two different concepts to calculate the mean SHmax orientation on a regular grid. The first uses a fixed search radius around the grid point and computes the mean SHmax orientation if sufficient data records are within the search radius; the larger the search radius, the larger the filtered wavelength of the stress pattern. The second approach uses variable search radii and determines the search radius for which the variance of the mean SHmax orientation is below a given threshold. This approach delivers mean SHmax orientations with a user-defined degree of reliability; it resolves local stress perturbations, and no mean orientation is computed in areas with conflicting information that result in a large variance. Furthermore, the script can also estimate the deviation between the plate motion direction and the mean SHmax orientation.
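    A sketch of the underlying statistics: SHmax azimuths are axial data (θ and θ + 180° are equivalent), so a mean orientation is conventionally computed on doubled angles, and a circular variance on the doubled angles serves as a reliability measure. This is a generic illustration, not the Stress2Grid code itself:

```python
import math

# Illustrative SHmax azimuths (degrees) within one search radius,
# clustered around a roughly N-S orientation.
azimuths = [10.0, 175.0, 5.0, 170.0]

# Axial data: double the angles, average the unit vectors, halve the result.
sin2 = sum(math.sin(math.radians(2 * a)) for a in azimuths)
cos2 = sum(math.cos(math.radians(2 * a)) for a in azimuths)
mean_shmax = math.degrees(math.atan2(sin2, cos2)) / 2 % 180

# Circular variance on the doubled angles: small when orientations agree,
# large where conflicting information would make the mean unreliable.
n = len(azimuths)
resultant_length = math.hypot(sin2, cos2) / n
circ_variance = 1 - resultant_length
```

A naive arithmetic mean of these azimuths would give about 90°, exactly perpendicular to the true trend; the doubled-angle treatment avoids that wrap-around error.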

  12. Dataset for: Optimal Transport, Mean Partition, and Uncertainty Assessment...

    • wiley.figshare.com
    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Jia Li; Beomseok Seo; Lin Lin (2023). Dataset for: Optimal Transport, Mean Partition, and Uncertainty Assessment in Cluster Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.8038925
    Explore at:
    txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Wiley
    Authors
    Jia Li; Beomseok Seo; Lin Lin
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    In scientific data analysis, clusters identified computationally often substantiate existing hypotheses or motivate new ones. Yet the combinatorial nature of the clustering result, which is a partition rather than a set of parameters or a function, blurs notions of mean and variance. This intrinsic difficulty hinders the development of methods to improve clustering by aggregation or to assess the uncertainty of clusters generated. We overcome that barrier by aligning clusters via optimal transport. Equipped with this technique, we propose a new algorithm to enhance clustering by any baseline method using bootstrap samples. Cluster alignment enables us to quantify variation in the clustering result at the levels of both overall partitions and individual clusters. Set relationships between clusters such as one-to-one match, split, and merge can be revealed. A covering point set for each cluster, a concept akin to the confidence interval, is proposed. The tools we have developed here will help address the crucial question of whether any cluster is an intrinsic or spurious pattern. Experimental results on both simulated and real datasets are provided.
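    As a simplified stand-in for the paper's optimal-transport alignment, the sketch below matches cluster labels between two partitions of the same points by maximising total overlap over label permutations (feasible only for a handful of clusters); the partitions are toy data:

```python
from itertools import permutations

# Two partitions of the same six points; here part_b is part_a with
# its cluster labels permuted, so a perfect alignment exists.
part_a = [0, 0, 1, 1, 2, 2]
part_b = [1, 1, 2, 2, 0, 0]

labels = sorted(set(part_a))

# Overlap count between every pair of labels (a-label i, b-label j).
overlap = {(i, j): sum(1 for a, b in zip(part_a, part_b) if a == i and b == j)
           for i in labels for j in labels}

# Brute-force assignment: find the label permutation maximising overlap.
best = max(permutations(labels),
           key=lambda p: sum(overlap[i, p[i]] for i in labels))

# Relabel part_b so its clusters line up with part_a's.
aligned_b = [best.index(b) for b in part_b]
```

Once partitions from bootstrap samples are aligned like this, per-cluster variation (splits, merges, membership churn) becomes directly measurable.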

  13. Fig_SupportingData

    • figshare.com
    xlsx
    Updated Jan 13, 2025
    Cite
    Jing Fu (2025). Fig_SupportingData [Dataset]. http://doi.org/10.6084/m9.figshare.28194263.v2
    Explore at:
    xlsx
    Dataset updated
    Jan 13, 2025
    Dataset provided by
    figshare
    Authors
    Jing Fu
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The provided dataset contains results from Monte Carlo simulations related to variance swaps. The data is organized into multiple sheets, each focusing on different parameters and scenarios.

    Figure 1:
    • Monte Carlo simulations: results for both discretely-sampled and continuously-sampled variance swaps. Values are reported for different sample sizes (N=12 to N=322), showing how the estimated variance swap values converge as the number of samples increases.
    • Sample 1 and Sample 2: two different sets of simulation results, each showing the impact of varying sample sizes on the variance swap values.

    Figure 2:
    • κθ (kappa theta): the impact of different values of κθ on the variance swap values.
    • θ̃ (theta tilde): the effect of varying θ̃ on the variance swap values.
    • σθ (sigma theta): the influence of σθ on the variance swap values.
    • θ₀ (theta zero): the impact of different initial volatility levels (θ₀) on the variance swap values.

    Sheet 3:
    • λ (lambda): the effect of varying λ on the variance swap values.
    • η (eta): the influence of η on the variance swap values.
    • ν (nu): the impact of ν on the variance swap values.
    • δ (delta): the effect of varying δ on the variance swap values.

    Overall, the dataset provides a comprehensive analysis of how different parameters and sampling methods affect the valuation of variance swaps, offering insights into the sensitivity and convergence behavior of these financial instruments under various conditions.
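    The convergence experiment summarised in Figure 1 can be illustrated with a toy Monte Carlo under constant volatility (geometric Brownian motion), where the annualised realised variance of a discretely-sampled swap approaches σ² as the number of sampling dates grows. This is a generic sketch, not the dataset's stochastic-volatility model (which varies κθ, θ̃, σθ, θ₀, etc.):

```python
import math
import random

random.seed(1)
sigma, T = 0.2, 1.0   # constant volatility and maturity (illustrative)

def realised_variance(n_samples: int, n_paths: int = 2000) -> float:
    """Mean annualised realised variance over simulated GBM paths,
    sampled discretely at n_samples dates."""
    dt = T / n_samples
    total = 0.0
    for _ in range(n_paths):
        rv = 0.0
        for _ in range(n_samples):
            # Log-return over one sampling interval under GBM.
            log_ret = (-0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * random.gauss(0, 1)
            rv += log_ret**2
        total += rv / T
    return total / n_paths

estimate = realised_variance(64)
```

With 64 sampling dates the estimate already sits close to σ² = 0.04, mirroring the convergence pattern the Figure 1 sheets tabulate for N = 12 to N = 322.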

  14. Median Household Income Variation by Family Size in Will County, IL:...

    • neilsberg.com
    csv, json
    Updated Jan 11, 2024
    Neilsberg Research (2024). Median Household Income Variation by Family Size in Will County, IL: Comparative analysis across 7 household sizes [Dataset]. https://www.neilsberg.com/research/datasets/1b9b14a5-73fd-11ee-949f-3860777c1fe6/
    Explore at:
    csv, jsonAvailable download formats
    Dataset updated
    Jan 11, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Will County, Illinois
    Variables measured
    Household size, Median Household Income
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across 7 household sizes (mentioned above) following an initial analysis and categorization. Using this dataset, you can find out how household income varies with the size of the family unit. For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents median household incomes for various household sizes in Will County, IL, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.

    Key observations

    • Of the 7 household sizes (1-person to 7-or-more-person households) reported by the Census Bureau, all were found in Will County. Across the different household sizes in Will County, the mean income is $104,449 and the standard deviation is $29,150. The coefficient of variation (CV) is 27.91%. This high CV indicates high relative variability, suggesting that incomes vary significantly across different sizes of households.
    • In the most recent year, 2021, the smallest household size for which the bureau reported a median household income was 1-person households, with an income of $44,120. Income then increased with household size up to $105,920 for 7-person households, the largest household size for which the bureau reported a median household income.
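    The coefficient of variation quoted above is simply the standard deviation expressed as a percentage of the mean, which can be verified directly from the reported figures:

```python
mean_income = 104_449   # reported mean across the 7 household sizes, USD
std_income = 29_150     # reported standard deviation, USD

# coefficient of variation: std / mean, expressed as a percentage
cv = std_income / mean_income * 100
print(round(cv, 2))  # -> 27.91
```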

    [Chart: Will County, IL median household income, by household size (in 2022 inflation-adjusted dollars)]

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Household Sizes:

    • 1-person households
    • 2-person households
    • 3-person households
    • 4-person households
    • 5-person households
    • 6-person households
    • 7-or-more-person households

    Variables / Data Columns

    • Household Size: This column showcases 7 household sizes ranging from 1-person households to 7-or-more-person households (As mentioned above).
    • Median Household Income: Median household income, in 2022 inflation-adjusted dollars for the specific household size.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com about the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Will County median household income. You can refer to the same here

  15. Large scale variation in the rate of germ-line de novo mutation, base...

    • figshare.com
    • data.niaid.nih.gov
    • +2more
    docx
    Updated May 31, 2023
    Thomas C. A. Smith; Peter F. Arndt; Adam Eyre-Walker (2023). Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans [Dataset]. http://doi.org/10.1371/journal.pgen.1007254
    Explore at:
    docxAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS Genetics
    Authors
    Thomas C. A. Smith; Peter F. Arndt; Adam Eyre-Walker
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    It has long been suspected, based on the divergence between humans and other species, that the rate of mutation varies across the human genome at a large scale. However, it is now possible to directly investigate this question using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We investigate a number of questions pertaining to the distribution of mutations using more than 130,000 DNMs from three large datasets. We demonstrate that the amount and pattern of variation differs between datasets at the 1MB and 100KB scales, probably as a consequence of differences in sequencing technology and processing. In particular, datasets show different patterns of correlation to genomic variables such as replication time. Nevertheless, there are many commonalities between datasets, which likely represent true patterns. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scales that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome. We demonstrate that variation in the mutation rate does not generate large-scale variation in GC-content, and hence that mutation bias does not maintain the isochore structure of the human genome. We find that genomic features explain less than 40% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species is correlated to the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence were due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered.

  16. SEV-LTER Mean - Variance Experiment Quadrat Plant Species Cover and Height

    • portal.edirepository.org
    csv
    Updated Mar 5, 2024
    Jennifer Rudgers; Lauren Baur; Scott Collins; Marcy Litvak; Seth Newsome; William Pockman; Tom Miller; Yiqi Luo (2024). SEV-LTER Mean - Variance Experiment Quadrat Plant Species Cover and Height [Dataset]. http://doi.org/10.6073/pasta/1bcd36f543c4d3a8f96ba824dbf5eaeb
    Explore at:
    csv(3890213 byte)Available download formats
    Dataset updated
    Mar 5, 2024
    Dataset provided by
    EDI
    Authors
    Jennifer Rudgers; Lauren Baur; Scott Collins; Marcy Litvak; Seth Newsome; William Pockman; Tom Miller; Yiqi Luo
    Time period covered
    May 23, 2018 - Sep 25, 2023
    Area covered
    Variables measured
    obs, plot, quad, site, year, block, count, cover, height, quadID, and 10 more
    Description

    We designed novel field experimental infrastructure to resolve the relative importance of changes in the climate mean and variance in regulating the structure and function of dryland populations, communities, and ecosystem processes. The Mean - Variance Climate Experiment (MVE) adds three novel elements to prior designs that have manipulated interannual variance in climate in the field (Gherardi & Sala, 2013) by (i) determining interactive effects of mean and variance with a factorial design that crosses reduced mean with increased variance, (ii) studying multiple dryland biomes to compare their susceptibility to transition under interactive climate drivers, and (iii) adding stochasticity to our treatments to permit the antecedent effects that occur under natural climate variability. This new infrastructure enables direct experimental tests of the hypothesis that interactions between the mean and variance of precipitation will have larger ecological impacts than either the mean or variance in precipitation alone.

       This dataset includes plant species cover and height data measured in 1 m x 1 m quadrats at all Mean - Variance experiment sites. Quadrat locations span five important ecosystems of the American southwest: blue grama-dominated Plains grassland (est. fall 2019), black grama-dominated Chihuahuan Desert grassland (est. fall 2020), creosotebush dominated Chihuahuan Desert shrubland (est. fall 2021), juniper savanna (est. fall 2022) and pinon-juniper woodland (est. fall 2023). Data on plant cover and height for each plant species are collected per individual plant or patch (for clonal plants) within 1 m x 1 m quadrats. These data inform population dynamics of foundational and rare plant species. The cover and height of individual plants or patches are sampled twice yearly (spring and fall) in permanent 1m x 1m plots within each site or experiment. This data package includes plant cover and height only -- for species biomass estimates per quad see package knb-lter-sev.350.
    
  17. Show Low, AZ Median Household Income Trends (2010-2021, in 2022...

    • neilsberg.com
    csv, json
    Updated Jan 11, 2024
    Neilsberg Research (2024). Show Low, AZ Median Household Income Trends (2010-2021, in 2022 inflation-adjusted dollars) [Dataset]. https://www.neilsberg.com/research/datasets/91dc1de4-73f0-11ee-949f-3860777c1fe6/
    Explore at:
    csv, jsonAvailable download formats
    Dataset updated
    Jan 11, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Show Low, Arizona
    Variables measured
    Median Household Income, Median Household Income Year on Year Change, Median Household Income Year on Year Percent Change
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It presents the median household income from the years 2010 to 2021 following an initial analysis and categorization of the census data. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset illustrates the median household income in Show Low, spanning the years from 2010 to 2021, with all figures adjusted to 2022 inflation-adjusted dollars. Based on the latest 2017-2021 5-Year Estimates from the American Community Survey, it displays how income varied over the last decade. The dataset can be utilized to gain insights into median household income trends and explore income variations.

    Key observations:

    From 2010 to 2021, the median household income for Show Low increased by $2,280 (4.10%), as per the American Community Survey estimates. In comparison, median household income for the United States increased by $4,559 (6.51%) between 2010 and 2021.

    Analyzing the trend in median household income between 2010 and 2021, spanning 11 annual cycles, we observed that median household income, when adjusted for 2022 inflation using the Consumer Price Index retroactive series (R-CPI-U-RS), grew year over year for 6 of those years and declined for 5.
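    The year-on-year changes quoted above follow the usual formula: the change divided by the starting value. A small sketch (the base income used here is a hypothetical illustration, not a figure from this dataset):

```python
def yoy_change(previous, current):
    """Return (absolute change, percent change) between two consecutive years."""
    delta = current - previous
    return delta, delta / previous * 100

# hypothetical incomes for illustration only
delta, pct = yoy_change(55_000, 57_280)
print(delta, round(pct, 2))
```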

    [Chart: Show Low, AZ median household income trend (2010-2021, in 2022 inflation-adjusted dollars)]

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. All incomes have been adjusted for inflation and are presented in 2022 inflation-adjusted dollars.

    Years for which data is available:

    • 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021

    Variables / Data Columns

    • Year: This column presents the data year from 2010 to 2021
    • Median Household Income: Median household income, in 2022 inflation-adjusted dollars for the specific year
    • YOY Change($): Change in median household income between the current and the previous year, in 2022 inflation-adjusted dollars
    • YOY Change(%): Percent change in median household income between current and the previous year

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com about the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Show Low median household income. You can refer to the same here

  18. GulfFlow: A gridded surface current product for the Gulf of Mexico from...

    • zenodo.org
    Updated Jul 5, 2023
    Jonathan M. Lilly; Jonathan M. Lilly; Paula Pérez-Brunius; Paula Pérez-Brunius (2023). GulfFlow: A gridded surface current product for the Gulf of Mexico from consolidated drifter measurements [Dataset]. http://doi.org/10.5281/zenodo.3978794
    Explore at:
    Dataset updated
    Jul 5, 2023
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Jonathan M. Lilly; Jonathan M. Lilly; Paula Pérez-Brunius; Paula Pérez-Brunius
    Area covered
    Gulf of Mexico (Gulf of America)
    Description

    This dataset comprises the mean and variance of the surface velocity field of the Gulf of Mexico, obtained from a large set of historical surface drifter data (3761 trajectories spanning 27 years and more than a dozen data sources) that were uniformly processed, quality controlled, and assimilated into a spatially and temporally gridded dataset. A gridded product, called GulfFlow, is created by averaging all available data from the GulfDrifters dataset within quarter-degree spatial bins, and within overlapping monthlong temporal bins having a semimonthly spacing. The dataset runs from August 16, 1992 to August 1, 2019, for a total of 648 overlapping time slices. Odd-numbered slices correspond to calendar months, while even-numbered slices run from halfway through one month to halfway through the following month. A higher spatial resolution version, GulfFlow-1/12 degree, is created in the identical way but using 1/12-degree bins instead of quarter-degree bins. In addition to the average velocities within each 3D bin, the count of sources contributing to each bin is also distributed, as is the subgridscale velocity variance discussed in the next section. The count variable is a four-dimensional array of integers, the fourth dimension of which has length 30. This variable gives the number of hourly observations from each source dataset contributing to each three-dimensional bin. Values 1–15 are the count of velocities from drifters from each of the 15 experiments that have not been flagged as having lost their drogues, while values 16–30 are for observations from drifters that have been flagged as having lost their drogue. Values above 15 are only populated for the GDP, HARGOS, LASER and some of the DWDE drifters, as a drogue presence flag is not always available. It is useful at this stage to introduce notation for different types of averages.
    For convenience we represent the velocity as a vector, u = [u v]T, where the superscript "T" denotes the transpose. Let an overbar, \(\overline {\bf u}\), denote an average over a spatial bin and over all times, while angled brackets, \(<{\bf u}>\), denote an average over a spatial bin and a particular temporal bin. Thus, \(<{\bf u}>\) is a function of time while \(\overline {\bf u}\) is not. We refer to \(<{\bf u}>\) as the local average, \(\overline {\bf u}\) as the global average, and \(\overline {<{\bf u}>}\) as the double average. Given the inhomogeneity of the drifter data, it turns out the global average is biased towards intensive but short-duration programs; hence the double average results in a much better representation of the true mean velocity field. The dataset includes the double average \(\overline {<{\bf u}>}\), the local covariance defined as

    \({\bf ε} = <({\bf u} − <{\bf u}>)({\bf u} − <{\bf u}>)^T>\)

    and \(\epsilon^2\), which is the trace of \(\overline{\bf ε}\):

    \(\epsilon^2 = \mathrm{tr}\{\overline{\bf ε}\}\)

    The data is distributed in two separate netCDF files, one for each grid resolution.
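    The distinction between the global and double averages can be made concrete with a toy example. This is an illustration of the averaging notation only, not the GulfFlow processing code: one spatial bin, two temporal bins, where a short intensive program floods the second bin with observations of a stronger flow.

```python
import numpy as np

# "Month 1": a long-term program contributes 10 obs of the 0.10 m/s background flow.
month1 = np.full(10, 0.10)
# "Month 2": same 10 background obs, plus 1000 obs at 0.50 m/s from an
# intensive but short-duration program.
month2 = np.concatenate([np.full(10, 0.10), np.full(1000, 0.50)])

global_avg = np.concatenate([month1, month2]).mean()  # pooled over all obs
local_avgs = [month1.mean(), month2.mean()]           # one <u> per temporal bin
double_avg = float(np.mean(local_avgs))               # average of local averages

print(global_avg)  # dominated by the intensive program
print(double_avg)  # weights each temporal bin equally, so far less biased
```

    The global average is pulled almost entirely toward the intensive program, while the double average weights each temporal bin equally, which is why the dataset favours it as the representation of the true mean velocity field.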

  19. Assessment of protocols and variance for specific leaf area (SLA) in 10...

    • researchdata.edu.au
    • adelaide.figshare.com
    Updated Mar 18, 2021
    Rhys Morgan; Irene Martin Fores; Greg Guerin; Emrys Leitch (2021). Assessment of protocols and variance for specific leaf area (SLA) in 10 Eucalyptus species to inform functional trait sampling strategies for TERN Ausplots [Dataset]. http://doi.org/10.25909/14197298
    Explore at:
    Dataset updated
    Mar 18, 2021
    Dataset provided by
    The University of Adelaide
    Authors
    Rhys Morgan; Irene Martin Fores; Greg Guerin; Emrys Leitch
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    Introduction

    Functional trait-based approaches to ecological questions are becoming more common. The ongoing development of large functional trait databases has further enabled these kinds of studies.

    TERN is Australia's national land ecosystem observatory. Through monitoring more than 750 plots across major biomes and bioclimatic gradients, TERN Ecosystem Surveillance has collected over 40,000 voucher specimens with herbarium level taxonomic determinations. This collection represents an opportunity to generate high quality functional trait data for a large number of Australian flora.

    This pilot study aimed to test the feasibility of using the TERN collection to measure a simple morphological trait. Specific leaf area (SLA) is the one-sided area of a fresh leaf divided by its dry mass. We restricted our study to the Eucalyptus genus as Eucalyptus species are common in TERN monitoring plots and are often the dominant tree species. The results of the study should inform the future sampling strategy for SLA.


    Method

    The first component of the study was the measurement of leaves from voucher specimens. We located 30 Eucalyptus vouchers exclusively from South Australian plots (figure 1) and took 5 leaves from each specimen. The leaves were mounted onto sheets of paper and scanned with a flatbed scanner. Leaf area was measured from the scans using ImageJ software. The mounted leaves were placed into a plant press and dried in an oven for 24 hours at 70°C. Each leaf was individually weighed using a 0.1 mg microbalance.

    The second component was the collection and measurement of fresh leaf samples. We collected 5 leaves from 5 individuals for 3 species growing at Waite Conservation Reserve in Adelaide, South Australia. The leaves were mounted, scanned and measured as before. They were then dried in an oven for 72 hours at 70°C and weighed with the same microbalance. The dried leaves were scanned and measured again so that leaf area shrinkage between fresh and dry leaves could be estimated.

    Shrinkage percentage was obtained using the formula:
    ((fresh area - dry area) / fresh area) * 100

    SLA was obtained for each leaf using the formula:
    dry area / dry mass
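    The two formulas above can be written as small helper functions. The units in the example are hypothetical, as the description does not state which units were used for area and mass:

```python
def shrinkage_pct(fresh_area, dry_area):
    """Percent reduction in leaf area from fresh to dry:
    ((fresh area - dry area) / fresh area) * 100."""
    return (fresh_area - dry_area) / fresh_area * 100

def sla(dry_area, dry_mass):
    """Specific leaf area as computed in this study: dry area / dry mass."""
    return dry_area / dry_mass

# e.g. a hypothetical 100 mm^2 fresh leaf drying to 89.73 mm^2 shrank by
# 10.27%, matching the mean shrinkage reported in the results
print(round(shrinkage_pct(100.0, 89.73), 2))  # -> 10.27
```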

    We ran a Type II ANOVA on our SLA data using the car v3.0-10 package in R v4.0.3.


    Results

    The pilot dataset contained 225 leaves from 45 individuals encompassing 10 species.

    The mean shrinkage in leaf area was 10.27% with a standard deviation of 1.75%. This estimate came from 75 leaves (25 each from E. microcarpa, E. leucoxylon and E. camaldulensis subsp. camaldulensis).

    The ANOVA output (figure 2) revealed that variation between individuals of the same species contributed the most to the overall SLA variation (sumsq = 111.6, R^2 = 0.499). A substantial portion of the overall variation was also attributed to variation between species (sumsq = 88.5, R^2 = 0.396). The residual variation (sumsq = 23.3, R^2 = 0.104) was attributed to the variation between leaves from the same individual. The boxplots of 'SLA by individual' and 'SLA by species' (figure 3) support these results.
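    The R^2 values reported above are each term's sum of squares taken as a share of the total sum of squares, which can be checked directly from the reported figures (small differences from the reported values are rounding):

```python
# sums of squares reported in the ANOVA output
sumsq = {
    "between individuals": 111.6,
    "between species": 88.5,
    "residual (within individual)": 23.3,
}
total = sum(sumsq.values())  # total sum of squares
shares = {k: v / total for k, v in sumsq.items()}  # each term's R^2
for name, r2 in shares.items():
    print(name, round(r2, 3))
```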


    Recommendations

    The shrinkage results show that shrinkage due to water loss is consistent and predictable for Eucalyptus leaves. This means that leaf traits can be reliably measured from herbarium style collections and the data derived from such measurements can be compared and integrated with data from fresh leaves.

    By analysing the variance in the SLA data we have shown that the variation between individuals of the same species is significant and deserves further attention. However, the variation between species is also significant and should be captured in future studies. As such, we recommend that any subsequent attempt to construct a larger dataset of SLA measurements from the TERN voucher collection should focus on Eucalyptus species which are well represented. This will ensure that both intraspecific and interspecific variation is captured. Currently there are 27 Eucalyptus species with 10 or more vouchers, 47 species with 5 or more vouchers and 130 species with 1 or more voucher.

    The variation between individual leaves was found to be a small part of the overall variation. This means that in future it should not be necessary to measure 5 leaves from each individual. Measuring 1 leaf from an individual will likely give a reliable estimate of the individual's mean SLA.

    Certain changes to the TERN survey methodology could help to facilitate the accumulation of trait data. It is important that plant material taken for vouchers is the youngest fully mature material available and that it is from the outermost part of the canopy (i.e. sun leaves). This will help to ensure consistency and accuracy of trait measurements. When fruit/seeds are present they should be placed into a bag and kept with the voucher specimen. Taking ample plant material from each individual will ensure that destructive trait analysis does not affect the quality of the voucher specimen. Where a species is abundant or dominant it will be beneficial to take vouchers from multiple individuals to further investigate intraspecific and within-site trait variation.

    This study has served to highlight the potential for a trait database to be produced from the TERN voucher collection. It is evident that, for at least the Eucalyptus genus, there is valuable trait information contained in specimen vouchers which, if measured, could enable further research into important ecological questions (e.g. how does SLA vary intraspecifically and does it contribute to a species' environmental tolerance?).


    References

    Garnier, E, Shipley, B, Roumet, C & Laurent, G 2001, 'A standardized protocol for the determination of specific leaf area and leaf dry matter content', Functional Ecology, vol. 15, pp. 688-695.

    Munroe, S, Guerin, GR, Saleeba, T, Martin-Fores, M, Blanco-Martin, B, Sparrow, B & Tokmakoff, A 2020. 'ausplotsR: An R package for rapid extraction and analysis of vegetation and soil data collected by Australia’s Terrestrial Ecosystem Research Network'. EcoEvoRxiv, DOI 10.32942/osf.io/25phx

    Nock, CA, Vogt, RJ & Beatrix, BE 2016, 'Functional Traits', eLS.

    Pérez-Harguindeguy, N, Díaz, S, Garnier, E, Lavorel, S, Poorter, H, Jaureguiberry, P, Bret-Harte, MS, Cornwell, WK, Craine, JM, Gurvich, DE, Urcelay, C, Veneklaas, EJ, Reich, PB, Poorter, L, Wright, IJ, Ray, P, Enrico, L, Pausas, JG, de Vos, AC, Buchmann, N, Funes, G, Quétier, F, Hodgson, JG, Thompson, K, Morgan, HD, ter Steege, H, van der Heijden, MGA, Sack, L, Blonder, B, Poschlod, P, Vaieretti, MV, Conti, G, Staver, AC, Aquino, S & Cornelissen, JHC 2016, 'New handbook for standardised measurement of plant functional traits worldwide', Australian Journal of Botany, vol. 64, pp. 715-716.

    Perez, TM, Rodriguez, J & Heberling, JM 2020, 'Herbarium-based measurements reliably estimate three functional traits', American Journal of Botany, vol. 107, no. 10, pp. 1457-1464.

    Queenborough, S 2017, 'Collections-based Studies of Plant Functional Traits', Scientia Danica, Series B, no. 6, pp. 223-236.

    Sparrow, B, Foulkes, J, Wardle, G. Leitch, E, Caddy-Retalic, S, van Leeuwen, S, Tokmakoff, A, Thurgate, N, Guerin, G & Lowe, A 2020. 'A Vegetation and soil survey method for surveillance monitoring of rangeland environments'. Frontiers in Ecology and Evolution, vol. 8, pp. 157.

    Tavsanoglu, C & Pausas, JG 2018, 'A functional trait database for Mediterranean Basin plants', Scientific Data, vol. 5, pp. 1-18.

    Torrez, V, Jorgensen, PM & Zanne, AE 2012, 'Specific leaf area: a predictive model using dried samples', Australian Journal of Botany, vol. 61, pp. 350-357.
  20. Digital surfaces and site data of well-screen top and bottom altitudes...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Jul 6, 2024
    U.S. Geological Survey (2024). Digital surfaces and site data of well-screen top and bottom altitudes defining the irrigation production zone of the Mississippi River Valley alluvial aquifer within the Mississippi Alluvial Plain project region [Dataset]. https://catalog.data.gov/dataset/digital-surfaces-and-site-data-of-well-screen-top-and-bottom-altitudes-defining-the-irriga
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Surveyhttp://www.usgs.gov/
    Area covered
    Mississippi River Alluvial Plain, Mississippi River
    Description

    Site data contained in the ScrIntrvls_AllSrcRefs_AllWellsRev.csv dataset define the top and bottom altitudes of well screens in 64,763 irrigation wells completed in the Mississippi River Valley alluvial aquifer (MRVA) that constitute a production zone in the Mississippi Alluvial Plain (MAP) extending across the midwestern and southern United States from Illinois to Louisiana. Each well entry contains an Enumerated Domain Value of the Attribute Label SrcRefNo to identify the state environmental agency that contributed to the database, and enumerated values are associated with specific state agencies by using the Enumerated Domain Value Definition. Screen-top and -bottom altitudes and land surface are referenced (corrected) to the National Elevation Dataset (NED) 10-meter digital elevation model (DEM; https://nationalmap.gov/elevation.html). The dataset identifies 50,103 screen-bottom altitudes and 50,457 screen-top altitudes that were used in subsequent geostatistical estimation after spatial analytics filtered out duplicate-coordinate wells, geographic and stratigraphic outliers, and incongruities of screen-top and -bottom altitudes compared with DEM land-surface altitude and the published digital surface of the bottom altitude of the MRVA (https://doi.org/10.3133/sim3426). Well entries are indexed in the dataset to identify use in four regional geostatistical models that collectively encompass the MAP extent and provide gridded estimates of screen-top and-bottom altitudes and estimation uncertainty (estimation variance) associated with each gridded altitude estimate. Digital surfaces of screen-top and -bottom estimates and estimation variances are represented as raster datasets that were converted to netCDF format and conform with the previously published National Hydrologic Grid (https://doi.org/10.5066/F7P84B24) at one-kilometer resolution. 
This dataset contains high-quality map images for gridded estimates of MRVA screen-top and -bottom altitude and for corresponding gridded estimation variances saved in the Tagged Image File (.tif) format.

Modiehi Mophethe (2024). Results and analysis using the Lean Six-Sigma define, measure, analyze, improve, and control (DMAIC) Framework [Dataset]. http://doi.org/10.25403/UPresearchdata.25370374.v1

Results and analysis using the Lean Six-Sigma define, measure, analyze, improve, and control (DMAIC) Framework

Explore at:
docxAvailable download formats
Dataset updated
Mar 12, 2024
Dataset provided by
University of Pretoria
Authors
Modiehi Mophethe
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This section presents a discussion of the research data. The data was received as secondary data; however, it was originally collected using time study techniques. Data validation is a crucial step in the data analysis process to ensure that the data is accurate, complete, and reliable. Descriptive statistics were used to validate the data. The mean, mode, standard deviation, variance, and range provide a summary of the data distribution and assist in identifying outliers or unusual patterns. The dataset presents the measures of central tendency: the mean, median, and mode. The mean signifies the average value of each of the factors presented in the tables; it is the balance point of the dataset and describes its typical value and behaviour. The median is the middle value of the dataset for each factor: half of the values lie below it and half lie above it, which is important for skewed distributions. The mode is the most common value in the dataset and was used to describe the most typical observation. Together these values describe the central value around which the data is distributed. Because the mean, median, and mode are neither similar nor close to one another, they indicate a skewed distribution. The dataset also presents the results and a discussion of the results. This section focuses on the customisation of the DMAIC (Define, Measure, Analyse, Improve, Control) framework to address the specific concerns outlined in the problem statement. To gain a comprehensive understanding of the current process, value stream mapping was employed, further supported by measuring the factors that contribute to inefficiencies. These factors are then analysed and ranked based on their impact, using factor analysis.
To mitigate the impact of the most influential factor on project inefficiencies, a solution is proposed using the EOQ (Economic Order Quantity) model. The implementation of the 'CiteOps' software facilitates improved scheduling, monitoring, and task delegation in the construction project through digitalisation. Furthermore, project progress and efficiency are monitored remotely and in real time. In summary, the DMAIC framework was tailored to suit the requirements of the specific project, incorporating techniques from inventory management, project management, and statistics to effectively minimise inefficiencies within the construction project.
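The EOQ model mentioned above balances ordering cost against holding cost; the classic formula is Q* = sqrt(2DS/H), where D is annual demand, S the cost per order, and H the annual holding cost per unit. A minimal sketch with hypothetical inputs, since the study's actual demand and cost figures are not given in this description:

```python
import math

def eoq(annual_demand, order_cost, holding_cost_per_unit):
    """Economic Order Quantity: the order size that minimizes the sum of
    annual ordering cost and annual holding cost."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost_per_unit)

# hypothetical inputs for illustration: 1200 units/year demand,
# $50 per order, $3 per unit per year holding cost
print(eoq(1200, 50.0, 3.0))  # -> 200.0
```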
