Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This article describes a free, open-source collection of templates for the popular Excel (2013 and later versions) spreadsheet program. These templates are spreadsheet files that allow easy and intuitive learning and the implementation of practical examples concerning descriptive statistics, random variables, confidence intervals, and hypothesis testing. Although they are designed to be used with Excel, they can also be employed with other free spreadsheet programs (after changing some particular formulas). Moreover, we exploit some possibilities of the ActiveX controls of the Excel Developer Menu to produce interactive Gaussian density charts. Finally, it is important to note that they can often be embedded in a web page, so it is not necessary to employ Excel software for their use. These templates have been designed as a useful tool to teach basic statistics and to carry out data analysis even when the students are not familiar with Excel. Additionally, they can be used as a complement to other analytical software packages. They aim to assist students in learning statistics within an intuitive working environment. Supplementary materials with the Excel templates are available online.
Additionally, all P-values used to determine statistical significance have also been included in this file. (XLSX)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To create the dataset, the top 10 countries by COVID-19 incidence worldwide as of October 22, 2020 (on the eve of the second wave of the pandemic) were considered, and those represented in the Global 500 ranking for 2020 were selected: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 ranking for 2020 and 2019 were selected separately. Arithmetic averages and the change (increase) were calculated for indicators such as the profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators across all countries of the sample were then found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020 on the eve of the second wave of the pandemic. The data is collected in a general Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. It is flexible: it can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Because the data in the dataset are stored as formulas rather than ready-made numbers, adding and/or changing the values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that provide data visualization. The dataset contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020.
The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship, substituting various predicted morbidity and mortality rates in risk assessment tables and obtaining automatically calculated consequences (changes) on the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified in the process and following the results of the second wave of the pandemic to check the reliability of pre-made forecasts and conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of a pandemic and COVID-19 crisis for international entrepreneurship.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Example of how I use MS Excel's VLOOKUP() function to filter my data.
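The workbook itself is not reproduced here, so the following is only a minimal sketch of what an exact-match VLOOKUP does, written in Python for illustration. The function name `vlookup` and the sample rows are hypothetical and are not taken from the dataset.

```python
def vlookup(lookup_value, table, col_index):
    """Rough analogue of Excel's VLOOKUP(lookup_value, table, col_index, FALSE).

    Scans the first column of `table` top to bottom and returns the value
    from the 1-based column `col_index` of the first matching row.
    """
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]  # col_index is 1-based, as in Excel
    return None  # Excel would display #N/A in this case


# Hypothetical sample rows: (code, name, price)
table = [
    ("A101", "Widget", 4.50),
    ("B202", "Gadget", 12.00),
]

price = vlookup("B202", table, 3)
```

Like VLOOKUP in exact-match mode, the sketch returns only the first matching row, so it suits key-based lookups rather than multi-row filtering.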
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
We are enclosing the database used in our research titled "Concentration and Geospatial Modelling of Health Development Offices' Accessibility for the Total and Elderly Populations in Hungary", along with our statistical calculations. For the sake of reproducibility, further information can be found in the files Short_Description_of_Data_Analysis.pdf and Statistical_formulas.pdf.
The sharing of data is part of our aim to strengthen the base of our scientific research. As of March 7, 2024, the detailed submission and analysis of our research findings to a scientific journal has not yet been completed.
The dataset was expanded on 23rd September 2024 to include SPSS statistical analysis data, a heatmap, and buffer zone analysis around the Health Development Offices (HDOs) created in QGIS software.
Short Description of Data Analysis and Attached Files (datasets):
Our research utilised data from 2022, serving as the basis for statistical standardisation. The 2022 Hungarian census provided an objective basis for our analysis, with age group data available at the county level from the Hungarian Central Statistical Office (KSH) website. The 2022 demographic data provided an accurate picture compared to the data available from the 2023 microcensus. The calculation used is based on our standardisation of the 2022 data. For xlsx files, we used MS Excel 2019 (version: 1808, build: 10406.20006) with the SOLVER add-in.
Hungarian Central Statistical Office served as the data source for population by age group, county, and regions: https://www.ksh.hu/stadat_files/nep/hu/nep0035.html, (accessed 04 Jan. 2024.) with data recorded in MS Excel in the Data_of_demography.xlsx file.
In 2022, 108 Health Development Offices (HDOs) were operational, and it's noteworthy that no developments have occurred in this area since 2022. The availability of these offices and the demographic data from the Central Statistical Office in Hungary are considered public interest data, freely usable for research purposes without requiring permission.
The contact details for the Health Development Offices were sourced from the following page (Hungarian National Population Centre (NNK)): https://www.nnk.gov.hu/index.php/efi (n=107). The Semmelweis University Health Development Centre was not listed by NNK, hence it was separately recorded as the 108th HDO. More information about the office can be found here: https://semmelweis.hu/egeszsegfejlesztes/en/ (n=1). (accessed 05 Dec. 2023.)
Geocoordinates were determined using Google Maps (N=108): https://www.google.com/maps. (accessed 02 Jan. 2024.) Recording of geocoordinates (latitude and longitude according to WGS 84 standard), address data (postal code, town name, street, and house number), and the name of each HDO was carried out in the: Geo_coordinates_and_names_of_Hungarian_Health_Development_Offices.csv file.
The foundational software for geospatial modelling and display (QGIS 3.34), an open-source software, can be downloaded from: https://qgis.org/en/site/forusers/download.html. (accessed 04 Jan. 2024.)
The HDOs_GeoCoordinates.gpkg QGIS project file contains Hungary's administrative map and the recorded addresses of the HDOs, imported from the Geo_coordinates_and_names_of_Hungarian_Health_Development_Offices.csv file.
The OpenStreetMap tileset is directly accessible from www.openstreetmap.org in QGIS. (accessed 04 Jan. 2024.)
The Hungarian county administrative boundaries were downloaded from the following website: https://data2.openstreetmap.hu/hatarok/index.php?admin=6 (accessed 04 Jan. 2024.)
HDO_Buffers.gpkg is a QGIS project file that includes the administrative map of Hungary, the county boundaries, as well as the HDO offices and their corresponding buffer zones with a radius of 7.5 km.
Heatmap.gpkg is a QGIS project file that includes the administrative map of Hungary, the county boundaries, as well as the HDO offices and their corresponding heatmap (Kernel Density Estimation).
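The 7.5 km buffer zones were created in QGIS; as a rough cross-check outside QGIS, one could test whether a location falls inside an office's buffer using the great-circle (haversine) distance between WGS 84 coordinates. The sketch below is an illustration only: the function names and the sample coordinates are hypothetical, and a spherical distance will differ slightly from a buffer computed in a projected coordinate system.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_buffer(point, office, radius_km=7.5):
    """True if `point` (lat, lon) lies within `radius_km` of `office` (lat, lon)."""
    return haversine_km(point[0], point[1], office[0], office[1]) <= radius_km


# Hypothetical coordinates, not taken from the CSV file
office = (47.4979, 19.0402)
nearby = (47.5479, 19.0402)   # 0.05 degrees of latitude further north
distant = (47.5979, 19.0402)  # 0.10 degrees of latitude further north
```

With the real data, `office` would be replaced by the latitude and longitude columns recorded in the CSV file.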
A brief description of the statistical formulas applied is included in the Statistical_formulas.pdf.
Recording of our base data for statistical concentration and diversification measurement was done using MS Excel 2019 (version: 1808, build: 10406.20006) in .xlsx format.
Using the SPSS 29.0.1.0 program, we performed the following statistical calculations with the databases Data_HDOs_population_without_outliers.sav and Data_HDOs_population.sav:
For easier readability, the files have been provided in both SPV and PDF formats.
The translation of these supplementary files into English was completed on 23rd Sept. 2024.
If you have any further questions regarding the dataset, please contact the corresponding author: domjan.peter@phd.semmelweis.hu
This dataset is a cleaned and preprocessed version of the original Netflix Movies and TV Shows dataset available on Kaggle. All cleaning was done using Microsoft Excel, with no programming involved.
🎯 What’s Included:
- Cleaned Excel file (standardized columns, proper date format, removed duplicates/missing values)
- A separate "formulas_used.txt" file listing all Excel formulas used during cleaning (e.g., TRIM, CLEAN, DATE, SUBSTITUTE, TEXTJOIN)
- Columns like 'date_added' have been properly formatted into DMY structure
- Multi-valued columns like 'listed_in' are split for better analysis
- Null values replaced with “Unknown” for clarity
- Duration field broken into numeric + unit components
🔍 Dataset Purpose: Ideal for beginners and analysts who want to:
- Practice data cleaning in Excel
- Explore Netflix content trends
- Analyze content by type, country, genre, or date added
📁 Original Dataset Credit: The base version was originally published by Shivam Bansal on Kaggle: https://www.kaggle.com/shivamb/netflix-shows
📌 Bonus: You can find a step-by-step cleaning guide and the same dataset on GitHub as well — along with screenshots and formulas documentation.
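The cleaning described above was done entirely with Excel formulas. For readers who want to reproduce two of the steps programmatically, here is a minimal Python sketch of the duration split and the multi-valued column split. The function names are hypothetical, and the sample values assume the "90 min" / "2 Seasons" style of the original Kaggle file.

```python
def split_duration(duration):
    """Split a duration string into (number, unit).

    Missing or blank values give (None, "Unknown"), mirroring the
    null-replacement rule described in the dataset notes.
    """
    if not duration or not duration.strip():
        return None, "Unknown"
    number, _, unit = duration.strip().partition(" ")
    return int(number), unit


def split_genres(listed_in):
    """Split a comma-separated 'listed_in' value into a list of trimmed genres."""
    if not listed_in or not listed_in.strip():
        return ["Unknown"]
    return [genre.strip() for genre in listed_in.split(",")]


movie = split_duration("90 min")
show = split_duration("2 Seasons")
genres = split_genres("Dramas,  International Movies ")
```

The trimming inside `split_genres` plays the same role as TRIM in the Excel workflow.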
In this project, I analysed the employees of an organization located in two distinct countries using Excel. This project covers:
1) How to approach a data analysis project
2) How to systematically clean data
3) Doing EDA with Excel formulas & tables
4) How to use Power Query to combine two datasets
5) Statistical Analysis of data
6) Using formulas like COUNTIFS, SUMIFS, XLOOKUP
7) Making an information finder with your data
8) Male vs. Female Analysis with Pivot tables
9) Calculating Bonuses based on business rules
10) Visual analytics of data with 4 topics
11) Analysing the salary spread (Histograms & Box plots)
12) Relationship between Salary & Rating
13) Staff growth over time - trend analysis
14) Regional Scorecard to compare NZ with India
Including various Excel features such as:
1) Using Tables
2) Working with Power Query
3) Formulas
4) Pivot Tables
5) Conditional formatting
6) Charts
7) Data Validation
8) Keyboard Shortcuts & tricks
9) Dashboard Design
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The hectares of habitat protected and the number of adults and children fed in one year were calculated for each of the six crop types for Canada and the United States. The calculations were based on the 50th centile of the cumulative frequency distributions of change in crop yield due to pesticide treatment for each crop type. An editable interactive table was created using Microsoft Excel that allows individuals to determine how pesticide treatment in their selected jurisdiction (province in Canada or state in the United States) and crop translates into habitat saved, calories produced, and mouths fed. This table allows the user to choose the country (Canada or United States), whether to include the organic agriculture correction factor, their state or province of interest, the crop, and whether a young child, adolescent child, adult woman, or adult man is being fed. The table then calculates the hectares of habitat saved, the added number of calories produced (kcal), the number of individuals fed in one day, and the number of individuals fed in one year. Due to the variability in yield results between crops and studies, the Excel user form allows individuals to set whichever yield increase they anticipate observing or to use the 50th centile of yield increase from the cumulative frequency distribution for each crop.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
The U.S. Geological Survey (USGS), in cooperation with Connecticut Department of Transportation, completed a study to improve flood-frequency estimates in Connecticut. This companion data release is a Microsoft Excel workbook for: (1) computing flood discharges for the 50- to 0.2-percent annual exceedance probabilities from peak-flow regression equations, and (2) computing additional prediction intervals, not available through the USGS StreamStats web application. The current StreamStats application (version 4) only computes the 90-percent prediction interval for stream sites in Connecticut. The Excel workbook can be used to compute the 70-, 80-, 90-, 95-, and 99-percent prediction intervals. The prediction interval provides upper and lower limits of the estimated flood discharge with a certain probability, or level of confidence in the accuracy of the estimate. The standard error of prediction for the Connecticut peak-flow regression equations ranged from 26.3 to 45.0 percent ( ...
APLE is a Microsoft Excel spreadsheet model that runs on an annual time-step and estimates field-scale, sediment-bound and dissolved P loss (kg ha−1) in surface runoff for agricultural fields. APLE is intended to quantify P loss through process-based equations. It has been tested for its ability to reliably predict P loss in runoff for systems with machine-applied manure and for soil P cycling using data from a wide variety of agricultural fields and regions.
Resources in this dataset:
Resource Title: Annual P Loss Estimator (APLE). File Name: APLE 2.5.2.xlsx. Resource Description: APLE is a fairly simple, user-friendly Microsoft Excel spreadsheet model that runs on an annual time-step and estimates field-scale, sediment-bound and dissolved P loss (kg ha−1) in surface runoff for agricultural fields. To download the spreadsheet, fill out the form at https://www.ars.usda.gov/research/software/download/?softwareid=304
Resource Title: Annual Phosphorus Loss Estimator User’s Manual Version 2.4. File Name: APLEUsersManual24.pdf
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This work uses projected population data from 2022 World Population Prospects published by UN DESA.
An Excel file containing the following on the seasons 1998 to 2021:
- Personal stats of drivers (championship finishes, wins/season, total wins, podiums, points, fastest laps and pole positions)
- Championship stats (drivers and teams, with colours, and their championship positions at the end of each season)
- Table with the wins per circuit per year (also with colours) and the wins per team per year
This dataset was mainly made for fun and for nice-looking visualization, so open it in Excel first to see the colours as well. If you want to use it for more complex purposes, I would recommend doing some data preparation first.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file provides the input data, assumptions and calculations used to compute the statistics on gross and net annual forest increment and natural losses for Europe, harmonized for definitions and reference period. The description of the content of this file is provided in the "ReadMe" spreadsheet within the file.
The link for the Excel project to download can be found on GitHub here.
It includes the raw data, Pivot Tables, and an interactive dashboard with Pivot Charts and Slicers. The project also includes business questions and the formulas I used to answer. The image below is included for ease.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2F61e460b5f6a1fa73cfaaa33aa8107bd5%2FBusinessQuestions.png?generation=1686190703261971&alt=media
The link for the Tableau adjusted dashboard can be found here.
A screenshot of the interactive Excel dashboard is also included below for ease.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fe581f1fce8afc732f7823904da9e4cce%2FScooter%20Dashboard%20Image.png?generation=1686190815608343&alt=media
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Modified equations of state (EoS) of forsterite, wadsleyite, ringwoodite, akimotoite, bridgmanite and post-perovskite based on the Helmholtz free energy are described using Microsoft Excel spreadsheets. The equations of state were set up by joint analysis of reference experimental data and can be used to calculate thermodynamic and thermoelastic parameters and P–V–T properties of the Mg-silicates. We used a Visual Basic for Applications module in Microsoft Excel and present a simultaneous calculation of the full set of thermodynamic and thermoelastic functions using only T–P and T–V data as input parameters. Phase transitions in the MgSiO3–MgO system play an important role in the interpretation of the seismic boundaries of the Earth’s upper mantle and in the D″ layer. Therefore, the proposed EoSes of silicates in the MgSiO3–MgO system have clear geophysical implications. The developed software will be of interest to specialists engaged in studying mantle mineralogy and the Earth’s interior.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Datasets underlying the analysis of the paper "Memecry: Tracing the Repetition-with-Variation of Formulas on 4chan/pol/". This upload includes the following:
- seedwords.csv: A .csv file with terms we used as a seed list to filter for 4chan/pol/-posts containing vernacular.
- seedword-network_x.gdf/gephi: .gdf and .gephi network files for NPMI-weighted co-word networks of /pol/-posts. We only included posts that contained one of the aforementioned seed list words.
- twoflow-data_x.xlsx: .xlsx files with data on triplets common to 4chan/pol/. We identified these three-word sequences through the above network files. For example: "gr8 b8 m8", "orange man bad", "lurk moar newfag". The Excel data on these triplets includes:
  - The absolute amount of /pol/-posts per year mentioning the triplets (within a window of five words).
  - The average NPMI scores between the three triplet words per year.
  - The top co-words per year having an average NPMI higher than 0.18 with two of the three triplet words.
- triplets.csv: A .csv file with the extracted triplets, including their common appearance as memetic phrases and a short explanation.
This data was used for "two-flow graphs" available at oilab.eu/formulas/. See the paper for full explanations on the data.
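The paper's own computation is not included in this description, so the following is only a sketch of the standard normalized pointwise mutual information (NPMI) definition, which the co-word networks are said to be weighted by. The function name `npmi` and the example counts are hypothetical. NPMI ranges from -1 (the two words never co-occur) through 0 (independence) to 1 (the two words always co-occur).

```python
import math


def npmi(count_xy, count_x, count_y, total):
    """Normalized pointwise mutual information of two words.

    count_xy: number of posts containing both words
    count_x, count_y: number of posts containing each word
    total: total number of posts
    """
    if count_xy == 0:
        return -1.0  # limiting value when the words never co-occur
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    if p_xy == 1.0:
        return 1.0  # avoids dividing by -log(1) = 0
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


# Hypothetical counts, for illustration only
independent = npmi(count_xy=25, count_x=50, count_y=50, total=100)
always_together = npmi(count_xy=10, count_x=10, count_y=10, total=100)
```

Under this definition, the 0.18 threshold mentioned above would keep only co-words that appear together with the triplet words noticeably more often than chance.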
This is part 2 of INDILACT, part 1 is published separately.
The objective of this study is to investigate how a customized voluntary waiting period before first insemination in primiparous dairy cows would affect milk production, fertility and health during their first calving interval.
The data was registered between January 2019 and October 2022.
The following data is archived:
- Metadata (publicly available)
- Raw data (.txt files) from the Swedish national herd recording scheme (SNDRS), operated by Växa Sverige: access restricted due to agreements with the principal owners of the data, Växa Sverige and the farms. Code lists are available in INDILACT part 1.
- Aggregated data (Excel files): access restricted due to agreements with the principal owners of the data, Växa Sverige and the farms
- R scripts with statistical calculations (openly available)
Metadata (3 files):
- Metadata gentypning: The only new file type compared to INDILACT Part 1; a description of how this data category has been handled. The other file types have been handled in the same way as in INDILACT Part 1.
- Metadata - del 2: General summary of the initial data handling for aggregation of files of the same type (dates etc.) to create the Excel files used in the R scripts.
- DisCodes: Division of the diagnoses into categories.
Raw data:
- 59 .txt files containing data retrieved from SNDRS on 8 separate occasions.
- Data from 18 Swedish farms from Jan 2019 to Oct 2022.
Aggregated data:
- 29 Excel files. The text files have been transformed to Excel format and all data from the same file type is aggregated into one file.
- Data collected from the farms by email and phone contact, about individual cows enrolled in the trial, from Oct 2020 to Oct 2022.
- One merged script derived from the initial data handling in R, where relevant variables were calculated and aggregated to be used for statistical calculations.
R scripts with data handling and statistical calculations:
- "Data analysis part 2 - final": Data handling to create the file used in the statistical calculations.
- "Part 2 - Binomial models - Fertility": Statistical calculations of variables using binomial models.
- "Part 2 - glmmTMB models - Fertility": Statistical calculations of variables using glmmTMB models.
- "Part 2 - linear models - Fertility": Statistical calculations of fertility variables using linear models.
- "Part 2 - linear models": Statistical calculations of milk variables using linear models.
Running the R scripts requires access to the restricted files. The files should be unpacked in a subdirectory "data" relative to the working directory for the scripts. See also the file "sessionInfo.txt" for information on R packages used.
This project includes a series of Excel files demonstrating key Excel functionalities, including:
You can download the original Excel file with all formatting here: https://www.kaggle.com/datasets/carinacruz/excel-project
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file provides the input data, assumptions and calculations used to compute the statistics on forest area, biomass, FAWS and BAWS for Europe, harmonized for definitions and reference year. The description of the content of this file is provided in the "ReadMe" spreadsheet within the file.
The world is presently burdened with the COVID-19 pandemic. As of 7 February 2022, there have been 394,381,395 confirmed cases of COVID-19, including 5,735,179 deaths, reported to WHO. We used the Health Belief Model as a conceptual framework, which has largely been tested empirically to predict preventive health behaviour, focusing on the relationship between health behaviour and COVID-19 prevention. This data contains the questionnaire that we used in this research. The questionnaire relates to measuring Community Perception and COVID-19 Preventive using the Health Belief Model in Indonesia: Structural Equation Model Analysis, and the data can be opened with Microsoft Excel.
This is a cross-sectional study using Google Forms with structured survey questionnaires. The answers used a five-point Likert scale: 1 (strongly disagree), 2 (disagree), 3 (neutral), 4 (agree), 5 (strongly agree). The questionnaire was developed using various references from reputable journals and a validated questionnaire from the Ministry of Health of Indonesia. Data were collected through WhatsApp, Line, and Telegram in summer 2021. To expand the coverage throughout Indonesia, social media influencers were also asked to distribute the Google Form through Twitter and Instagram. The respondents who participated in this study were Indonesian citizens between the ages of 15 and 64. It took 20-25 minutes to complete the form. Respondents signed the informed consent prior to completing it. The data were analysed using Covariance-Based Structural Equation Model (SEM) analysis, performed with Lisrel version 8.8. Descriptive statistics are presented as numbe...