Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Excel file performs a statistical test of whether two ROC curves differ from each other, based on the area under the curve (AUC). To enter the correct AUC value for the comparison, you will need the coefficient from the table presented in the following article: Hanley JA, McNeil BJ (1983) A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148:839-843.
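The comparison described here can be sketched in Python. This is a minimal sketch, not the spreadsheet's actual formulas. It assumes the test is the z statistic from the cited article, with the correlation coefficient r read from that article's table. It also assumes the standard errors come from the approximation in Hanley and McNeil's earlier (1982) paper. The function names and example numbers are hypothetical.

```python
import math


def auc_standard_error(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (assumed Hanley-McNeil 1982 form)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def compare_correlated_aucs(auc1, auc2, se1, se2, r):
    """Return (z, two-sided p) for two AUCs measured on the same cases.

    r is the correlation between the two AUC estimates, looked up in the
    table of the cited article.
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)
    z = (auc1 - auc2) / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical example: 50 positive and 50 negative cases, r = 0.5.
se_a = auc_standard_error(0.90, 50, 50)
se_b = auc_standard_error(0.80, 50, 50)
z, p = compare_correlated_aucs(0.90, 0.80, se_a, se_b, 0.5)
print(f"z = {z:.3f}, p = {p:.4f}")
```

A larger r (more strongly correlated AUC estimates) shrinks the standard error of the difference, which is why the coefficient from the table matters for paired designs.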
The Excel file contains the model input-output data sets that were used to evaluate the two-layer soil moisture and flux dynamics model. The model is original and was developed by Dr. Hantush by integrating the well-known Richards equation over the root layer and the lower vadose zone. The input-output data are used for: 1) verification of the numerical scheme by comparison against the HYDRUS model as a benchmark; 2) model validation by comparison against real site data; and 3) estimation of model predictive uncertainty and sources of modeling errors. This dataset is associated with the following publication: He, J., M.M. Hantush, L. Kalin, and S. Isik. Two-Layer numerical model of soil moisture dynamics: Model assessment and Bayesian uncertainty estimation. JOURNAL OF HYDROLOGY. Elsevier Science Ltd, New York, NY, USA, 613 part A: 128327, (2022).
The Florida Flood Hub for Applied Research and Innovation and the U.S. Geological Survey have developed projected future change factors for precipitation depth-duration-frequency (DDF) curves at 242 National Oceanic and Atmospheric Administration (NOAA) Atlas 14 stations in Florida. The change factors were computed as the ratio of projected future to historical extreme-precipitation depths fitted to extreme-precipitation data from downscaled climate datasets using a constrained maximum likelihood (CML) approach as described in https://doi.org/10.3133/sir20225093. The change factors correspond to the period 2020-59 (centered in 2040) or to the period 2050-89 (centered in the year 2070) as compared to the 1966-2005 historical period. A Microsoft Excel workbook is provided that tabulates best models for each downscaled climate dataset and for all downscaled climate datasets considered together. Best models were identified based on how well the models capture the climatology and interannual variability of four climate extreme indices using the Model Climatology Index (MCI) and the Model Variability Index (MVI) of Srivastava and others (2020). The four indices consist of annual maxima consecutive precipitation for durations of 1, 3, 5, and 7 days compared against the same indices computed based on the PRISM and SFWMD gridded precipitation datasets for five climate regions: climate region 1 in Northwest Florida, 2 in North Florida, 3 in North Central Florida, 4 in South Central Florida, and climate region 5 in South Florida. The PRISM dataset is based on the Parameter-elevation Relationships on Independent Slopes Model interpolation method of Daly and others (2008). The South Florida Water Management District’s (SFWMD) precipitation super-grid is a gridded precipitation dataset developed by modelers at the agency for use in hydrologic modeling (SFWMD, 2005).
This dataset is considered by the SFWMD to be the best available gridded rainfall dataset for south Florida and was used in addition to PRISM to identify best models in the South Central and South Florida climate regions. Best models were selected based on MCI and MVI evaluated within each individual downscaled dataset. In addition, best models were selected by comparison across datasets; this selection is referred to as "ALL DATASETS" hereafter. Due to the small sample size, all models in the dataset produced using the Weather Research and Forecasting Model (JupiterWRF) were considered best models.
In this project, I analysed the employees of an organization located in two distinct countries using Excel. This project covers:
1) How to approach a data analysis project
2) How to systematically clean data
3) Doing EDA with Excel formulas & tables
4) How to use Power Query to combine two datasets
5) Statistical Analysis of data
6) Using formulas like COUNTIFS, SUMIFS, XLOOKUP
7) Making an information finder with your data
8) Male vs. Female Analysis with Pivot tables
9) Calculating Bonuses based on business rules
10) Visual analytics of data with 4 topics
11) Analysing the salary spread (Histograms & Box plots)
12) Relationship between Salary & Rating
13) Staff growth over time - trend analysis
14) Regional Scorecard to compare NZ with India
It also uses various Excel features, such as:
1) Using Tables
2) Working with Power Query
3) Formulas
4) Pivot Tables
5) Conditional formatting
6) Charts
7) Data Validation
8) Keyboard Shortcuts & tricks
9) Dashboard Design
Age-depth models for Pb-210 datasets. The St Croix Watershed Research Station, of the Science Museum of Minnesota, kindly made available 210Pb datasets that have been measured in their lab over the past decades. The datasets come mostly from North American lakes. These datasets were used to produce chronologies using both the 'classical' CRS (Constant Rate of Supply) approach and a recently developed Bayesian alternative called 'Plum', so that the two approaches could be compared. The 210Pb data will also be deposited in the neotomadb.org database. The dataset consists of 3 files:
1. Rcode_Pb210.R: R code to process the data files, produce age-depth models and compare them.
2. StCroix_agemodel_output.zip: Output of age-model runs of the St Croix datasets.
3. StCroix_xlxs_files.zip: Excel files of the St Croix Pb-210 datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Excel township population over the last 20-plus years. It lists the population for each year, along with the year-on-year change in population, both in absolute and in percentage terms. The dataset can be used to understand the population change of Excel township across the last two decades. For example, using this dataset, we can identify whether the population is declining or increasing and, if there is a change, when the population peaked or whether it is still growing and has not reached its peak. We can also compare the trend with the overall trend of the United States population over the same period.
Key observations
In 2023, the population of Excel township was 300, a 0.99% year-on-year decrease from 2022. Previously, in 2022, the population of Excel township was 303, a decline of 0.98% compared to a population of 306 in 2021. Over the last 20-plus years, between 2000 and 2023, the population of Excel township increased by 17. In this period, the peak population was 308 in the year 2020. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).
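The year-on-year figures quoted for 2022 and 2023 can be reproduced with a short calculation. The following is a minimal Python sketch, assuming the percentage change is computed relative to the previous year's population. The function name is hypothetical, and only the three population values are taken from the description.

```python
def year_over_year_change(population_by_year):
    """Return (year, absolute change, percent change) for each consecutive year pair."""
    years = sorted(population_by_year)
    rows = []
    for prev, curr in zip(years, years[1:]):
        delta = population_by_year[curr] - population_by_year[prev]
        # Percent change is measured against the previous year's population.
        pct = 100.0 * delta / population_by_year[prev]
        rows.append((curr, delta, round(pct, 2)))
    return rows


# Population values quoted in the key observations.
population = {2021: 306, 2022: 303, 2023: 300}
changes = year_over_year_change(population)
for year, delta, pct in changes:
    print(year, delta, f"{pct}%")
```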
When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).
Data Coverage:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Excel township Population by Year.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This zip file contains:
- 3 .zip files = projects to be imported into SmartPLS 3
DLOQ-A model with 7 dimensions
DLOQ-A model with second-order latent variable
ECSI model (Tenenhaus et al., 2005) to exemplify direct, indirect and total effects, as well as importance-performance map and moderation with continuous variables.
ECSI Model (Sanches, 2013) to exemplify MGA (multi-group analysis)
Note:
- DLOQ-A = new dataset (ours)
- ECSI-Tenenhaus et al. [model for mediation and moderation] = available at: http://www.smartpls.com > Resources > SmartPLS Project Examples
- ECSI-Sanches [dataset for MGA] = available in the software R > library(plspm) > data(satisfaction)
https://borealisdata.ca/api/datasets/:persistentId/versions/2.1/customlicense?persistentId=doi:10.5683/SP3/SZHJFY
This CD-ROM product is an authoritative reference source of 15 key financial ratios by industry groupings compiled from the North American Industry Classification System (NAICS 2007). It is based on up-to-date, reliable and comprehensive data on Canadian businesses, derived from Statistics Canada databases of financial statements for three reference years. The CD-ROM enables users to compare their enterprise's performance to that of their industry and to address issues such as profitability, efficiency and business risk. Financial Performance Indicators can also be used for inter-industry comparisons. Volume 1 covers large enterprises in both the financial and non-financial sectors, at the national level, with annual operating revenue of $25 million or more. Volume 2 covers medium-sized enterprises in the non-financial sector, at the national level, with annual operating revenue of $5 million to less than $25 million. Volume 3 covers small enterprises in the non-financial sector, at the national, provincial, territorial, Atlantic region and Prairie region levels, with annual operating revenue of $30,000 to less than $5 million. Note: FPICB has been discontinued as of 2/23/2015. Statistics Canada continues to provide information on Canadian businesses through alternative data sources. Information on specific financial ratios will continue to be available through the annual Financial and Taxation Statistics for Enterprises program: CANSIM table 180-0003; the Quarterly Survey of Financial Statements: CANSIM tables 187-0001 and 187-0002; and the Small Business Profiles, which present financial data for small businesses in Canada, available on Industry Canada's website: Financial Performance Data.
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
All raw data, processed data, and statistical outputs on which the study’s conclusions are based. Three types of data are present: questionnaire, performance, and physiological. Processed performance and questionnaire data, as well as the associated statistical outputs, are in .sav and .spv format (SPSS). Physiological data and associated statistical outputs can be opened using Excel.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The open repository consists of two folders: Dataset and Picture. The Dataset folder contains the file “AWS Dataset Pangandaraan.xlsx”. There are 10 columns, with the first three columns as time attributes and the other six as atmospheric data. Each parameter has 8085 data points, and at the bottom of each parameter's column we added summary indices: the minimum, maximum, and average values.
For further use, the user can choose one or more parameters for calculation or analysis. For example, wind data (speed and direction) can be used to calculate waves using the hindcast method. Furthermore, the user can filter the data using the filter feature in Excel to extract the exact time range for analyzing various phenomena considered to be correlated with atmospheric data around Pangandaran, Indonesia.
The second folder, named “Picture,” contains three figures: the monthly distribution of the datasets, the temporal data, and a wind rose.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Hypothesis: Reliability can be adopted to quantitatively measure the sustainability of megaprojects.
Presentation: This dataset shows two scenario-based examples that establish an initial reliability assessment of megaproject sustainability. The data are based on the author’s assumptions about the differences between scenarios A and B. There are two sheets in this Microsoft Excel file: a comparison between the two scenarios using a Fault Tree Analysis model, and a correlation analysis between reliability and unavailability.
Notable findings: This exploratory experiment found that reliability can be used to quantitatively measure megaproject sustainability, and that there is a negative correlation between reliability and unavailability among 11 related events associated with sustainability goals in the life cycle of a megaproject.
Interpretation: Results from data analysis using the two sheets can inform decision making on megaproject sustainability. For example, the reliability of achieving sustainability goals can be enhanced by decreasing the unavailability or the failure at individual work stages in megaproject delivery.
Implication: This dataset file can be used to perform reliability analysis in other experiments to assess megaproject sustainability.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Excel population over the last 20-plus years. It lists the population for each year, along with the year-on-year change in population, both in absolute and in percentage terms. The dataset can be used to understand the population change of Excel across the last two decades. For example, using this dataset, we can identify whether the population is declining or increasing and, if there is a change, when the population peaked or whether it is still growing and has not reached its peak. We can also compare the trend with the overall trend of the United States population over the same period.
Key observations
In 2022, the population of Excel was 539, a 1.46% year-on-year decrease from 2021. Previously, in 2021, the population of Excel was 547, a decline of 1.08% compared to a population of 553 in 2020. Over the last 20-plus years, between 2000 and 2022, the population of Excel decreased by 36. In this period, the peak population was 713 in the year 2010. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).
When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).
Data Coverage:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Excel Population by Year.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This database supports a study of performance inconsistency in biomass higher heating value (HHV) models based on ultimate analysis. The research null hypothesis is that the rank of a biomass HHV model is consistent. Fifteen biomass models are trained and tested on four datasets. If the ranks of these 15 models do not vary between datasets, their performance is consistent.
The database includes the datasets and source code used to analyze the performance consistency of the biomass HHV models. The datasets are stored in tabular form in an Excel workbook. The source code implements the biomass HHV machine learning models using MATLAB object-oriented programming (OOP). These machine learning models consist of eight regressions, four supervised learning models, and three neural networks.
An Excel workbook, "BiomassDataSetUltimate.xlsx," collects the research datasets in six worksheets. The first worksheet, "Ultimate," contains 908 HHV data points from 20 literature sources. The worksheet column names indicate the elements of the ultimate analysis on a % dry basis. The HHV column refers to the higher heating value in MJ/kg. The next worksheet, "Full Residuals," stores the model testing residuals from the 20-fold cross-validations. The article (Kijkarncharoensin & Innet, 2021) verifies the performance consistency through these residuals. The other worksheets present the datasets used in the literature to train and test model performance.
A file named "SourceCodeUltimate.rar" collects the MATLAB machine learning models implemented in the article. The folders in this file follow the class structure of the machine learning models. These classes extend the features of MATLAB's original Statistics and Machine Learning Toolbox to support, e.g., k-fold cross-validation. The MATLAB script named "runStudyUltimate.m" is the article's main program for analyzing the performance consistency of the biomass HHV models through the ultimate analysis. The script loads the datasets from the Excel workbook and automatically fits the biomass models through the OOP classes.
The first section of the MATLAB script generates the most accurate model by optimizing the model's hyperparameters. The first run takes a few hours to train the machine learning models through a trial-and-error process. The trained models can be saved in a MATLAB .mat file and loaded back into the MATLAB workspace. The remaining sections of the script, separated by section breaks, perform the residual analysis to inspect the performance consistency. In addition, a 3D scatter plot of the biomass data and box plots of the prediction residuals are displayed. Finally, the interpretation of these results is discussed in the authors' article.
Reference: Kijkarncharoensin, A., & Innet, S. (2022). Performance inconsistency of the Biomass Higher Heating Value (HHV) Models derived from Ultimate Analysis [Manuscript in preparation]. University of the Thai Chamber of Commerce.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Poseidon 2.0 is a user-oriented, simple and fast Excel tool that aims to compare different wastewater treatment techniques based on their pollutant removal efficiencies, their costs and additional assessment criteria. Poseidon can be applied in pre-feasibility studies to assess possible water reuse options, and can show decision makers and other stakeholders that implementable solutions are available to comply with local requirements. This upload consists of:
Poseidon 2.0 Excel File that can be used with Microsoft Excel - XLSM
Handbook presenting main features of the decision support tool - PDF
Externally hosted supplementary file 1, Oertlé, Emmanuel. (2018, December 5). Poseidon - Decision Support Tool for Water Reuse (Microsoft Excel) and Handbook (Version 1.1.1). Zenodo. http://doi.org/10.5281/zenodo.3341573
Externally hosted supplementary file 2, Oertlé, Emmanuel. (2018). Wastewater Treatment Unit Processes Datasets: Pollutant removal efficiencies, evaluation criteria and cost estimations (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1247434
Externally hosted supplementary file 3, Oertlé, Emmanuel. (2018). Treatment Trains for Water Reclamation (Dataset) (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1972627
Externally hosted supplementary file 4, Oertlé, Emmanuel. (2018). Water Quality Classes - Recommended Water Quality Based on Guideline and Typical Wastewater Qualities (Version 1.0.2) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3341570
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the household distribution across 16 income brackets among four distinct age groups in Excel: Under 25 years, 25-44 years, 45-64 years, and over 65 years. The dataset highlights the variation in household income, offering valuable insights into economic trends and disparities within different age categories, aiding in data analysis and decision-making.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Income brackets:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Excel median household income by age.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the normal distribution parameters used to create simulation instances for the experimental study of the paper "Dynamic Multi-Period Vehicle Routing with Touting". These distributions are used to generate random demand on each planning day. The dataset also includes 3 months of historical collection data for a waste collection company, which are used to compare the heuristic solutions with those obtained by an exact solver. There are two sets of customers, belonging to two drivers, who cover different geographical areas.
The information for the two drivers is stored in separate Excel files, and each file has 2 sheets, as explained below:
In the "Orders" sheet, the historical collection information is presented. The amount of demand is given in Column A, while the day the demand is generated is provided in Column C. Column B has an index value for each customer, indicating its location. The depot has the index "1".
In the "Customer Information" sheet, the list of customers is presented in Column A. They are named starting from C1. Columns B and C present the mean and variance values of the customers' demand distributions. Column D provides the index value for each customer, indicating its location. Finally, Column E presents the tank capacity values for the customers.
In the "Distance-Time Matrices" sheet, the distance (in Column C) and travel time (in Column D) values for each pair of nodes are presented. Distances and travel times are given in kilometres and minutes, respectively. The "from" and "to" entries are associated with the index values of the customers. For example, to find the distance and travel time from C1 to C2 for Driver 1, one needs to look at the values from index 167 to index 168, which correspond to a distance of 28.7 kilometres and a travel time of 21.45 minutes.
Drivers 1 and 2 have 142 and 125 unique customers, respectively. However, the historical data include a total of 273 and 260 customers for Driver 1 and Driver 2, respectively, which means that some customers have multiple orders within that 3-month period.
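Generating one planning day's random demand from the per-customer mean and variance could look like the sketch below. This is a minimal Python illustration, not the procedure used in the paper: the customer values are hypothetical placeholders, and clipping negative draws to zero is an assumption made here so that demand is never negative.

```python
import math
import random


def sample_daily_demand(customers, rng):
    """Draw one day's demand per customer from a normal distribution.

    customers maps a customer name to (mean, variance), mirroring Columns B
    and C of the "Customer Information" sheet.
    """
    demand = {}
    for name, (mean, variance) in customers.items():
        # random.gauss expects a standard deviation, so take the square root.
        draw = rng.gauss(mean, math.sqrt(variance))
        # Assumption: negative draws are clipped to zero.
        demand[name] = max(0.0, draw)
    return demand


# Hypothetical parameters; real values would be read from the Excel files.
customers = {"C1": (120.0, 400.0), "C2": (80.0, 225.0)}
rng = random.Random(42)  # fixed seed so a simulation instance is repeatable
day_demand = sample_daily_demand(customers, rng)
print(day_demand)
```

Passing an explicit `random.Random` instance keeps each simulation instance reproducible, which helps when comparing heuristic and exact solutions on the same demand realization.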
This data was randomly generated in Excel to practice forecasting and visualization.
The two branches use thousands of generated product records with nearly 200 different employees. Product ID numbers are randomly generated for each file.
This project was for my own practice.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The datasets provided here were collected from an extensive literature review based on 92 articles; see the title and DOI of the linked peer-reviewed article below. These publications were reviewed in detail to compile a summary of the current state-of-the-art understanding of the time-dependent tensile behaviour of soda-lime-silica glass. Two main test methods to characterise glass have been identified: (1) the static fatigue test, with constant applied stress, and (2) the dynamic fatigue test, with constant stress rate. Hence, the following two files (Excel spreadsheets) can be accessed:
(1) SLSG_static_fatigue_datasets.xlsx
(2) SLSG_dynamic_fatigue_datasets.xlsx
Each file is composed of individual sheets divided by author and dataset. The individual sheets contain the reviewed experimental data, post-processed for comparison purposes, and the associated references.
Data presented in the publications only as plots were extracted using the freeware WebPlotDigitizer (ver. 4.3).
This dataset in an Excel file presents our database on world trade from 1800 to 1938. We have collected or estimated series of imports and exports, at current and constant (1913) prices and at current and at constant (1913) borders, for 149 polities. After a short review of the available series, we describe the methods for the construction of the database. We then deal with the criteria for the inclusion of polities, the representativeness of our series, the main types of sources, the procedures of deflation and, when necessary, of adjustments to 1913 borders. We discuss the details of the estimation of our polity series in Appendix B. Following Feinstein and Thomas (2001), we assess the reliability of our polity estimates. In the last two sections we present our trade series at current and 1913 borders and compare them with other available series. This dataset is related to the working paper "World trade, 1800-1938 : a new data-set" by Giovanni Federico and Antonio Tena Junguito, available at: http://hdl.handle.net/10016/22222.
This Excel file contains all the raw data of Legionella pneumophila determined by both culture and Legiolert methods. Datasheets were prepared separately for each figure and table.
This dataset is associated with the following publication: Boczek, L., M. Tang, C. Formal, D. Lytle, and H. Ryu. Comparison of two culture methods for the enumeration of Legionella pneumophila from potable water samples. JOURNAL OF WATER AND HEALTH. IWA Publishing, London, UK, 19(3): 468-477, (2021).