57 datasets found

Data from: Excel Templates: A Helpful Tool for Teaching Statistics
tandf.figshare.com
zip
Updated May 30, 2023
Cite
Alejandro Quintela-del-Río; Mario Francisco-Fernández (2023). Excel Templates: A Helpful Tool for Teaching Statistics [Dataset]. http://doi.org/10.6084/m9.figshare.3408052.v2
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.3408052.v2
Dataset updated
May 30, 2023
Dataset provided by
Taylor & Francis
Authors
Alejandro Quintela-del-Río; Mario Francisco-Fernández
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This article describes a free, open-source collection of templates for the popular Excel (2013, and later versions) spreadsheet program. These templates are spreadsheet files that allow easy and intuitive learning and the implementation of practical examples concerning descriptive statistics, random variables, confidence intervals, and hypothesis testing. Although they are designed to be used with Excel, they can also be employed with other free spreadsheet programs (after changing some particular formulas). Moreover, we exploit some possibilities of the ActiveX controls of the Excel Developer Menu to produce interactive Gaussian density charts. Finally, it is important to note that they can often be embedded in a web page, so it is not necessary to employ Excel software for their use. These templates have been designed as a useful tool to teach basic statistics and to carry out data analysis even when the students are not familiar with Excel. Additionally, they can be used as a complement to other analytical software packages. They aim to assist students in learning statistics within an intuitive working environment. Supplementary materials with the Excel templates are available online.
Data from: Supplemental data
figshare.com
xlsx
Updated Mar 15, 2024
Cite
T Miyakoshi; Yoichi M. Ito (2024). Supplemental data [Dataset]. http://doi.org/10.6084/m9.figshare.24596058.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.24596058.v1
Dataset updated
Mar 15, 2024
Dataset provided by
figshare
Authors
T Miyakoshi; Yoichi M. Ito
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset for the article "The current utilization status of wearable devices in clinical research".

Analyses were performed using JMP Pro 16.10 and Microsoft Excel for Mac version 16 (Microsoft).

The file extension "jrp" denotes a file of the statistical analysis software JMP, which contains both the analysis code and the data set. In case JMP is not available, the data set is also provided as a "csv" file and the JMP script (the analysis code) in "rtf" format. The "xlsx" file is a Microsoft Excel file that contains the data set and the data plotted or tabulated using Microsoft Excel functions.

Supplementary Figure 1. NCT number duplication frequency

Includes the Excel file used to create the figure (Supplementary Figure 1).

・Sfig1_NCT number duplication frequency.xlsx

Supplementary Figures 2-5. Simple and annual time series aggregation

Includes the Excel file, JMP report files, csv datasets of the JMP report files and JMP scripts used to create the figures (Supplementary Figures 2-5).

・Sfig2-5 Annual time series aggregation.xlsx
・Sfig2 Study Type.jrp
・Sfig4device type.jrp
・Sfig3 Interventions Type.jrp
・Sfig5Conditions type.jrp
・Sfig2, 3 ,5_database.csv
・Sfig2_JMP script_Study type.rtf
・Sfig3_JMP script Interventions type.rtf
・Sfig5_JMP script Conditions type.rtf
・Sfig4_dataset.csv
・Sfig4_JMP script_device type.rtf

Supplementary Figures 6-11. Mosaic diagram of intervention by condition
Supplementary Tables 4-9. Analysis of contingency table for intervention by condition

JMP report files used to create the figures (Supplementary Figures 6-11) and tables (Supplementary Tables 4-9), including the csv dataset of the JMP report files and JMP scripts.

・Sfig6-11 Stable4-9 Intervention devicetype_conditions.jrp
・Sfig6-11_Stable4-9_dataset.csv
・Sfig6-11_Stable4-9_JMP script.rtf

Supplementary Figure 12. Distribution of enrollment

Includes the Excel file, JMP report file, csv dataset of the JMP report file and JMP scripts used to create the figure (Supplementary Figure 12).

・Sfig12_Distribution of enrollment.jrp
・Sfig12_Distribution of enrollment.csv
・Sfig12_JMP script.rtf
Superstore Sales Analysis
kaggle.com
Updated Oct 21, 2023
Cite
Ali Reda Elblgihy (2023). Superstore Sales Analysis [Dataset]. https://www.kaggle.com/datasets/aliredaelblgihy/superstore-sales-analysis/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Oct 21, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ali Reda Elblgihy
Description
Analyzing sales data is essential for any business looking to make informed decisions and optimize its operations. In this project, we will utilize Microsoft Excel and Power Query to conduct a comprehensive analysis of Superstore sales data. Our primary objectives will be to establish meaningful connections between various data sheets, ensure data quality, and calculate critical metrics such as the Cost of Goods Sold (COGS) and discount values. Below are the key steps and elements of this analysis:

1- Data Import and Transformation:

Gather and import relevant sales data from various sources into Excel.

Utilize Power Query to clean, transform, and structure the data for analysis.

Merge and link different data sheets to create a cohesive dataset, ensuring that all data fields are connected logically.

2- Data Quality Assessment:

Perform data quality checks to identify and address issues like missing values, duplicates, outliers, and data inconsistencies.

Standardize data formats and ensure that all data is in a consistent, usable state.

3- Calculating COGS:

Determine the Cost of Goods Sold (COGS) for each product sold by considering factors like purchase price, shipping costs, and any additional expenses.

Apply appropriate formulas and calculations to determine COGS accurately.

4- Discount Analysis:

Analyze the discount values offered on products to understand their impact on sales and profitability.

Calculate the average discount percentage, identify trends, and visualize the data using charts or graphs.

5- Sales Metrics:

Calculate and analyze various sales metrics, such as total revenue, profit margins, and sales growth.

Utilize Excel functions to compute these metrics and create visuals for better insights.

6- Visualization:

Create visualizations, such as charts, graphs, and pivot tables, to present the data in an understandable and actionable format.

Visual representations can help identify trends, outliers, and patterns in the data.

7- Report Generation:

Compile the findings and insights into a well-structured report or dashboard, making it easy for stakeholders to understand and make informed decisions.

Throughout this analysis, the goal is to provide a clear and comprehensive understanding of the Superstore's sales performance. By using Excel and Power Query, we can efficiently manage and analyze the data, ensuring that the insights gained contribute to the store's growth and success.
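The steps above are described for Excel and Power Query; the same logic can be sketched in a few lines of Python for readers who want to check the arithmetic. This is only an illustration under stated assumptions: the field names (`order_id`, `sales`, `discount`, `purchase_price`, `shipping_cost`, `other_costs`), the sample rows, and the exact COGS and margin formulas are hypothetical and are not taken from the Superstore workbook.

```python
from statistics import mean

# Hypothetical order rows: every field name and number here is illustrative,
# not taken from the Superstore workbook.
orders = [
    {"order_id": "A1", "sales": 200.0, "discount": 0.10,
     "purchase_price": 120.0, "shipping_cost": 10.0, "other_costs": 5.0},
    {"order_id": "A2", "sales": 150.0, "discount": 0.00,
     "purchase_price": 90.0, "shipping_cost": 8.0, "other_costs": 2.0},
    # Repeated order ID (step 2: duplicates).
    {"order_id": "A2", "sales": 150.0, "discount": 0.00,
     "purchase_price": 90.0, "shipping_cost": 8.0, "other_costs": 2.0},
    # Missing sales figure (step 2: missing values).
    {"order_id": "A3", "sales": None, "discount": 0.20,
     "purchase_price": 50.0, "shipping_cost": 4.0, "other_costs": 1.0},
]


def clean(rows):
    """Keep the first complete row per order ID; skip rows with a missing value."""
    seen = set()
    kept = []
    for row in rows:
        if row["order_id"] in seen:
            continue
        if any(value is None for value in row.values()):
            continue
        seen.add(row["order_id"])
        kept.append(row)
    return kept


def cogs(row):
    """Assumed COGS: purchase price plus shipping plus other expenses."""
    return row["purchase_price"] + row["shipping_cost"] + row["other_costs"]


def profit_margin(row):
    """Assumed margin: (sales - COGS) / sales, with sales treated as revenue."""
    return (row["sales"] - cogs(row)) / row["sales"]


cleaned = clean(orders)
total_revenue = sum(row["sales"] for row in cleaned)
total_cogs = sum(cogs(row) for row in cleaned)
average_discount = mean(row["discount"] for row in cleaned)

print(len(cleaned), total_revenue, total_cogs, average_discount)
```

In a real workbook the cleaning rules would depend on which columns identify an order and on how discounts are recorded, so the helper functions are a starting point rather than a specification.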
Dataset of development of business during the COVID-19 crisis
data.mendeley.com
narcis.nl
Updated Nov 9, 2020
Cite
Tatiana N. Litvinova (2020). Dataset of development of business during the COVID-19 crisis [Dataset]. http://doi.org/10.17632/9vvrd34f8t.1
Unique identifier
https://doi.org/10.17632/9vvrd34f8t.1
Dataset updated
Nov 9, 2020
Authors
Tatiana N. Litvinova
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
To create the dataset, the top 10 countries by COVID-19 incidence worldwide as of October 22, 2020 (on the eve of the second wave of the pandemic) were taken, and those represented in the Global 500 ranking for 2020 were selected: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, up to 10 of the largest transnational corporations in the Global 500 rankings for 2020 and 2019 were selected, separately for each year. Arithmetic averages and the change (increase) were calculated for indicators such as the profitability of enterprises, their ranking position (competitiveness), asset value and number of employees. The arithmetic mean values of these indicators across all countries in the sample were then found; they characterize the situation in international entrepreneurship as a whole during the COVID-19 crisis in 2020, on the eve of the second wave of the pandemic. The data is collected in a single Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics. It is flexible and can be supplemented with data from other countries and with newer statistics on the COVID-19 pandemic. Because the data in the dataset are stored as formulas rather than ready-made numbers, adding or changing values in the original table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the graphs. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data, but also charts that provide data visualization. It contains not only actual, but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020.
The forecasts are presented as a normal distribution of predicted values together with the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship: various predicted morbidity and mortality rates can be substituted into the risk assessment tables to obtain automatically calculated consequences (changes) for the characteristics of international entrepreneurship. It is also possible to substitute the actual values observed during and after the second wave of the pandemic to check the reliability of the earlier forecasts and to conduct a plan-fact analysis. The dataset contains not only the numerical values of the initial and predicted values of the studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks that the pandemic and the COVID-19 crisis pose for international entrepreneurship.
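The two calculations the description relies on, a year-over-year change in an averaged indicator and a probability read off a normal distribution, can be sketched as follows. All numbers are hypothetical placeholders, and the use of Python's `statistics.NormalDist` is an assumption made for illustration; the dataset itself implements these steps as Excel formulas.

```python
from statistics import NormalDist, mean

# Hypothetical indicator values for three corporations in one country.
profit_2019 = [10.0, 12.0, 8.0]
profit_2020 = [9.0, 12.0, 6.0]

# Arithmetic means per year and the percentage change between them.
mean_2019 = mean(profit_2019)
mean_2020 = mean(profit_2020)
change_pct = (mean_2020 - mean_2019) / mean_2019 * 100

# Hypothetical forecast: predicted daily cases modelled as a normal
# distribution with an assumed mean and standard deviation.
forecast = NormalDist(mu=1000.0, sigma=100.0)

# Probability that the realised value is at or below 1100 cases,
# i.e. one standard deviation above the forecast mean.
p_at_most_1100 = forecast.cdf(1100.0)

print(mean_2019, mean_2020, change_pct, round(p_at_most_1100, 4))
```

Substituting a different threshold into `forecast.cdf` mirrors the scenario analysis described above, where alternative predicted values are entered and the consequences are recalculated.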
Sample Data for Excel - Dataset - Datopian CKAN instance
demo.dev.datopian.com
Updated May 13, 2025
Cite
(2025). Sample Data for Excel - Dataset - Datopian CKAN instance [Dataset]. https://demo.dev.datopian.com/dataset/city-xx12--sample-data-for-excel
Dataset updated
May 13, 2025
Description
This dataset contains various sample data files for practicing Excel functions and features, including data related to sales orders, athletes, food nutrients, insurance policies, and workplace safety.
Sorting/selecting data in Excel with VLOOKUP()
figshare.com
xlsx
Updated Jan 18, 2016
Cite
Anneke Batenburg (2016). Sorting/selecting data in Excel with VLOOKUP() [Dataset]. http://doi.org/10.6084/m9.figshare.964802.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.964802.v1
Dataset updated
Jan 18, 2016
Dataset provided by
Figshare (http://figshare.com/)
Authors
Anneke Batenburg
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Example of how I use MS Excel's VLOOKUP() function to filter my data.
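For readers without the workbook at hand, the idea of filtering with an exact-match VLOOKUP() can be sketched in Python. The `vlookup` helper, the table layout and the IDs below are hypothetical; the sketch mirrors only the exact-match behaviour (the case where Excel's fourth argument is FALSE) and returns `None` where Excel would show #N/A.

```python
def vlookup(lookup_value, table, col_index):
    """Return the value in column col_index (1-based) of the first row whose
    first column equals lookup_value, or None when no row matches."""
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None


# Hypothetical master table: (participant ID, age, group).
master = [
    ("P01", 34, "control"),
    ("P02", 29, "treatment"),
    ("P03", 41, "control"),
]

# IDs to select from the master table; "P99" is deliberately absent.
wanted_ids = ["P03", "P01", "P99"]

# One lookup per wanted ID, as a column of VLOOKUP formulas would do.
selected = [(pid, vlookup(pid, master, 3)) for pid in wanted_ids]

print(selected)
```

Rows whose lookup returns `None` can then be filtered out, which corresponds to hiding the #N/A rows in the spreadsheet.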
R script that creates a wrapper function to automate the generation of...
catalog.data.gov
Updated Jul 20, 2024
+ more versions
Cite
U.S. Geological Survey (2024). R script that creates a wrapper function to automate the generation of boxplots of change factors for all Florida HUC-8 basins (basin_boxplot.R) [Dataset]. https://catalog.data.gov/dataset/r-script-that-creates-a-wrapper-function-to-automate-the-generation-of-boxplots-of-change--f7fc2
Dataset updated
Jul 20, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Description
The Florida Flood Hub for Applied Research and Innovation and the U.S. Geological Survey have developed projected future change factors for precipitation depth-duration-frequency (DDF) curves at 242 National Oceanic and Atmospheric Administration (NOAA) Atlas 14 stations in Florida. The change factors were computed as the ratio of projected future to historical extreme-precipitation depths fitted to extreme-precipitation data from downscaled climate datasets using a constrained maximum likelihood (CML) approach as described in https://doi.org/10.3133/sir20225093. The change factors correspond to the periods 2020-59 (centered in the year 2040) and 2050-89 (centered in the year 2070) as compared to the 1966-2005 historical period. An R script (basin_boxplot.R) is provided as an example on how to create a wrapper function that will automate the generation of boxplots of change factors for all Florida HUC-8 basins. The wrapper script sources the file create_boxplot.R and calls the function create_boxplot() one Florida basin at a time to create a figure with boxplots of change factors for all durations (1, 3, and 7 days) and return periods (5, 10, 25, 50, 100, 200, and 500 years) evaluated as part of this project. An example is also provided in the code that shows how to generate a figure showing boxplots of change factors for a single duration and return period. A Microsoft Word file documenting code usage is also provided within this data release (Documentation_R_script_create_boxplot.docx). As described in the documentation, the R script relies on some of the Microsoft Excel spreadsheets published as part of this data release. The script uses HUC-8 basins defined in the "Florida Hydrologic Unit Code (HUC) Basins (areas)" from the Florida Department of Environmental Protection (FDEP; https://geodata.dep.state.fl.us/datasets/FDEP::florida-hydrologic-unit-code-huc-basins-areas/explore) and their names are listed in the file basins_list.txt provided with the script. 
County names are listed in the file counties_list.txt provided with the script. NOAA Atlas 14 stations located in each Florida basin or county are defined in the Microsoft Excel spreadsheet Datasets_station_information.xlsx which is part of this data release. Instructions are provided in code documentation (see highlighted text on page 7 of Documentation_R_script_create_boxplot.docx) so that users can modify the script to generate boxplots for basins different from the FDEP "Florida Hydrologic Unit Code (HUC) Basins (areas)."
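The change-factor definition (projected future depth divided by historical depth) and the "one basin at a time" wrapper pattern can be illustrated with a short Python sketch. This is not the released R code: the function names, basin names and depth values are hypothetical, and the summary statistics stand in for the boxplots that `create_boxplot()` draws.

```python
from statistics import median

# Hypothetical extreme-precipitation depths (inches) at the stations
# located in each basin; names and values are placeholders.
depths = {
    "Basin A": {"historical": [5.0, 6.0, 8.0], "future": [6.0, 6.6, 10.0]},
    "Basin B": {"historical": [4.0, 5.0], "future": [4.4, 6.0]},
}


def change_factors(future, historical):
    """Change factor per station: projected future depth / historical depth."""
    return [f / h for f, h in zip(future, historical)]


def summarize_basin(name, basin):
    """Summary statistics of the change factors for one basin."""
    factors = change_factors(basin["future"], basin["historical"])
    return {
        "basin": name,
        "min": min(factors),
        "median": median(factors),
        "max": max(factors),
    }


def summarize_all_basins(all_depths):
    """Wrapper: call summarize_basin once per basin and collect the results."""
    return [summarize_basin(name, basin) for name, basin in all_depths.items()]


summaries = summarize_all_basins(depths)
for summary in summaries:
    print(summary)
```

A change factor above 1 indicates that the projected depth exceeds the historical depth for that station.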
Becoming Excel Experts
search.dataone.org
Updated Dec 28, 2023
Cite
Julie Marcoux (2023). Becoming Excel Experts [Dataset]. http://doi.org/10.5683/SP3/XB6FPP
Unique identifier
https://doi.org/10.5683/SP3/XB6FPP
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Julie Marcoux
Description
In Julie Marcoux's Excel Workshop, she demonstrates some great tricks that make it easier to work with DLI data, including creating macros, useful Excel functions, and tools.
2011 skills for life survey: small area estimation data
gov.uk
Updated Dec 12, 2012
+ more versions
Cite
Department for Business, Innovation & Skills (2012). 2011 skills for life survey: small area estimation data [Dataset]. https://www.gov.uk/government/statistical-data-sets/2011-skills-for-life-survey-small-area-estimation-data
Dataset updated
Dec 12, 2012
Dataset provided by
GOV.UK (http://gov.uk/)
Authors
Department for Business, Innovation & Skills
Description
Small area estimation modelling methods have been applied to the 2011 Skills for Life survey data in order to generate local level area estimates of the number and proportion of adults (aged 16-64 years old) in England living in households with defined skill levels in:

literacy

numeracy

information and communication technology (ICT); including emailing, word processing, spreadsheet use and a multiple-choice assessment of ICT awareness

The number and proportion of adults in households who do not speak English as a first language are also included.

Two sets of small area estimates are provided for 7 geographies: middle layer super output areas (MSOAs), standard table wards, 2005 statistical wards, 2011 council wards, 2011 parliamentary constituencies, local authorities, and local enterprise partnership areas.

Regional estimates have also been provided; however, unlike the other geographies, these estimates are based on direct survey estimates and not modelled estimates.

The files are available as both Excel and csv files – the user guide explains the estimates and modelling approach in more detail.

How to use the small area estimation files, an example

To find the estimate for the proportion of adults with entry level 1 or below literacy in the Manchester Central parliamentary constituency, you need to:

select the link to the ‘parliamentary-constituencies-2009-all’ Excel file in the table above

select the ‘literacy proportions’ page of the Excel spreadsheet

use the ‘find’ function to locate ‘Manchester Central’

note the proportion listed for Entry Level and below

It is estimated that 8.1% of adults aged 16-64 in Manchester Central have entry level or below literacy. The Credible Intervals for this estimate are 7.0 and 9.3% at the 95 per cent level. This means that while the estimate is 8.1%, there is a 95% likelihood that the actual value lies between 7.0 and 9.3%.
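The lookup steps above can also be done programmatically on the csv version of a file. The sketch below is a minimal illustration: the column names and the first data row are hypothetical placeholders (the real files may be laid out differently), and only the Manchester Central figures (8.1%, 7.0%, 9.3%) come from the example above.

```python
import csv
import io

# Hypothetical excerpt of a 'literacy proportions' sheet saved as csv.
# Column names and the first row are placeholders; only the
# Manchester Central figures are taken from the worked example.
SAMPLE = """constituency,entry_level_or_below_pct,ci_lower_pct,ci_upper_pct
Example Constituency A,5.0,4.0,6.1
Manchester Central,8.1,7.0,9.3
"""


def find_estimate(csv_text, constituency):
    """Return (estimate, lower, upper) for the named constituency, or None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        if row["constituency"] == constituency:
            return (
                float(row["entry_level_or_below_pct"]),
                float(row["ci_lower_pct"]),
                float(row["ci_upper_pct"]),
            )
    return None


estimate = find_estimate(SAMPLE, "Manchester Central")
print(estimate)
```

For a downloaded file, `io.StringIO(csv_text)` would be replaced by an opened file handle, and the column names adjusted to match the actual header row.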

Middle layer super output areas: 2001 all skill level estimates
https://assets.publishing.service.gov.uk/media/5a79d91240f0b670a8025dd8/middle-layer-super-output-areas-2001-all_1_.xlsx

MS Excel Spreadsheet, 14.5 MB

This file may not be suitable for users of assistive technology. If you use assistive technology (such as a screen reader) and need a version of this document in a more accessible format, please email enquiries@beis.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.
Raw data outputs 1-18
bridges.monash.edu
researchdata.edu.au
xlsx
Updated May 30, 2023
Cite
Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie (2023). Raw data outputs 1-18 [Dataset]. http://doi.org/10.26180/21259491.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.26180/21259491.v1
Dataset updated
May 30, 2023
Dataset provided by
Monash University
Authors
Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Raw data outputs 1-18

CSC and GTC are abbreviations of cancer stem cell and general tumor cell, respectively.

Raw data output 1. Differentially expressed genes in AML CSCs compared with GTCs as well as in TCGA AML cancer samples compared with normal ones. This data was generated based on the results of AML microarray and TCGA data analysis.

Raw data output 2. Commonly and uniquely differentially expressed genes in AML CSC/GTC microarray and TCGA bulk RNA-seq datasets. This data was generated based on the results of AML microarray and TCGA data analysis.

Raw data output 3. Common differentially expressed genes between training and test set samples of the microarray dataset. This data was generated based on the results of AML microarray data analysis.

Raw data output 4. Detailed information on the samples of the breast cancer microarray dataset (GSE52327) used in this study.

Raw data output 5. Differentially expressed genes in breast CSCs compared with GTCs as well as in TCGA BRCA cancer samples compared with normal ones.

Raw data output 6. Commonly and uniquely differentially expressed genes in breast cancer CSC/GTC microarray and TCGA BRCA bulk RNA-seq datasets. This data was generated based on the results of breast cancer microarray and TCGA BRCA data analysis.

Raw data output 7. Differential and common co-expression and protein-protein interaction of genes between CSC and GTC samples. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis.

Raw data output 8. Differentially expressed genes between AML dormant and active CSCs. This data was generated based on the results of AML scRNA-seq data analysis.

Raw data output 9. Uniquely expressed genes in dormant or active AML CSCs. This data was generated based on the results of AML scRNA-seq data analysis.

Raw data output 10. Intersections between the targeting transcription factors of AML key CSC genes and differentially expressed genes between AML CSCs vs GTCs and between dormant and active AML CSCs or the uniquely expressed genes in either class of CSCs.

Raw data output 11. Targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.

Raw data output 12. CSC-specific targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.

Raw data output 13. The protein-protein interactions between AML key CSC genes with themselves and their targeting transcription factors. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis.

Raw data output 14. The previously confirmed associations of genes having the highest targeting desirableness and CSC-specific targeting desirableness scores with AML or other cancers’ (stem) cells as well as hematopoietic stem cells. These data were generated based on a PubMed database-based literature mining.

Raw data output 15. Drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.

Raw data output 16. CSC-specific drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.

Raw data output 17. Candidate drugs for experimental validation. These drugs were selected based on their respective (CSC-specific) drug scores.

Raw data output 18. Detailed information on the samples of the AML microarray dataset GSE30375 used in this study.
Getting Started with Excel
explore.openaire.eu
Updated Jul 1, 2021
Cite
Dr Jianzhou Zhao (2021). Getting Started with Excel [Dataset]. http://doi.org/10.5281/zenodo.6423544
Unique identifier
https://doi.org/10.5281/zenodo.6423544
Dataset updated
Jul 1, 2021
Authors
Dr Jianzhou Zhao
Description
About this webinar

We rarely receive the research data in an appropriate form. Often data is messy. Sometimes it is incomplete. And sometimes there’s too much of it. Frequently, it has errors. This webinar targets beginners and presents a quick demonstration of using the most widespread data wrangling tool, Microsoft Excel, to sort, filter, copy, protect, transform, aggregate, summarise, and visualise research data.

Webinar Topics

Introduction to Microsoft Excel user interface

Interpret data using sorting, filtering, and conditional formatting

Summarise data using functions

Analyse data using pivot tables

Manipulate and visualise data

Handy tips to speed up your work

Licence

Copyright © 2021 Intersect Australia Ltd. All rights reserved.
Microsoft excel database containing all the simulated (10 sets) and...
plos.figshare.com
xlsx
Updated Jun 3, 2023
Cite
Hamed Ahmadi (2023). Microsoft excel database containing all the simulated (10 sets) and experimental data used in this study. [Dataset]. http://doi.org/10.1371/journal.pone.0187292.s001
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0187292.s001
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS ONE
Authors
Hamed Ahmadi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Excel sheets in order: The sheet entitled “Hens Original Data” contains the results of an experiment conducted to study the response of laying hens, during the initial phase of egg production, to different intakes of dietary threonine. The sheet entitled “Simulated data & fitting values” contains the 10 simulated data sets, which were generated using a standard random number generation procedure. The predicted values obtained by the new three-parameter and conventional four-parameter logistic models also appear in this sheet. (XLSX)
Finsheet - Stock Price in Excel and Google Sheet
search.dataone.org
dataverse.harvard.edu
Updated Nov 8, 2023
Cite
Do, Tuan (2023). Finsheet - Stock Price in Excel and Google Sheet [Dataset]. http://doi.org/10.7910/DVN/ZD9XVF
Unique identifier
https://doi.org/10.7910/DVN/ZD9XVF
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Do, Tuan
Description
This dataset contains a valuation template that researchers can use to retrieve real-time stock prices in Excel and Google Sheets. The dataset is provided by Finsheet, the leading financial data provider for spreadsheet users. To get more financial data, visit the website and explore their functions. For instance, if a researcher would like to get the last 30 years of income statements for Meta Platforms Inc, the syntax would be

=FS_EquityFullFinancials("FB", "ic", "FY", 30)

In addition, this syntax will return the latest stock price for Caterpillar Inc right in your spreadsheet:

=FS_Latest("CAT")

If you need assistance with any of the functions, feel free to reach out to their customer support team. To get started, install their Excel and Google Sheets add-on.
Instagram Reach Analysis - Excel Project
kaggle.com
Updated Jun 14, 2025
Cite
Raghad Al-marshadi (2025). Instagram Reach Analysis - Excel Project [Dataset]. https://www.kaggle.com/datasets/raghadalmarshadi/instagram-reach-analysis-excel-project/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jun 14, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Raghad Al-marshadi
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
📊 Instagram Reach Analysis

An exploratory data analysis project using Excel to understand what influences Instagram post reach and engagement.

📁 Project Description

This project uses an Instagram dataset imported from Kaggle to explore how different factors like hashtags, saves, shares, and caption length influence impressions and engagement.

🛠️ Tools Used

Microsoft Excel

Pivot Tables

TRIM, WRAP, and other Excel formulas

🧹 Data Cleaning

Removed unnecessary spaces using TRIM

Removed 17 duplicate rows → 103 unique rows remained

Standardized formatting: freeze top row, wrap text, center align
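The two cleaning steps, trimming stray spaces and removing duplicate rows, can be reproduced outside Excel with a short Python sketch. The rows below are synthetic stand-ins built to match the reported counts (103 unique rows plus 17 repeats); `excel_trim` is a hypothetical helper that imitates Excel's TRIM by stripping leading and trailing spaces and collapsing runs of inner spaces to one.

```python
def excel_trim(text):
    """Imitate Excel TRIM: drop leading/trailing spaces and collapse
    runs of spaces between words to a single space."""
    return " ".join(part for part in text.split(" ") if part)


def drop_duplicates(rows):
    """Keep the first occurrence of each row, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Synthetic stand-in for the sheet: 103 distinct rows plus 17 repeats,
# matching the counts reported above. The captions are placeholders.
distinct_rows = [(f"caption {i}", i) for i in range(103)]
raw_rows = distinct_rows + distinct_rows[:17]

# Add stray spaces so that trimming has something to do.
raw_rows = [("  " + caption.replace(" ", "   ") + " ", n) for caption, n in raw_rows]

trimmed = [(excel_trim(caption), n) for caption, n in raw_rows]
deduplicated = drop_duplicates(trimmed)

print(len(raw_rows), len(deduplicated))
```

Trimming before deduplicating matters: two rows that differ only in stray spaces would otherwise be counted as distinct.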

🔍 Key Analysis Highlights

1. Impressions by Source

Highest reach: Home > Hashtags > Explore > Other

Some totals exceed 100% due to overlapping

2. Engagement Insights

Saves strongly correlate with higher impressions

Caption length is inversely related to likes

Shares have weak correlation with impressions

3. Hashtag Patterns

Most used: #Thecleverprogrammer, #Amankharwal, #Python

Repeating hashtags does not guarantee higher reach
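Statements such as "strongly correlate" and "inversely related" are usually backed by a correlation coefficient (in Excel, the CORREL function). A minimal Python sketch of the Pearson coefficient is shown here; the four data points per column are hypothetical and chosen only to show a strong positive and a perfectly negative relationship, not to reproduce the project's figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical post metrics, for illustration only.
saves = [10, 20, 30, 40]
impressions = [1000, 2100, 2900, 4000]
caption_length = [50, 100, 150, 200]
likes = [400, 300, 200, 100]

r_saves_impressions = pearson(saves, impressions)
r_caption_likes = pearson(caption_length, likes)

print(round(r_saves_impressions, 3), round(r_caption_likes, 3))
```

Values near +1 or -1 indicate a strong linear relationship, while values near 0 correspond to the "weak correlation" reported for shares.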

✅ Conclusion

Shorter captions and higher save counts contribute more to reach than repeated hashtags. Profile visits are often linked to new followers.

👩‍💻 Author

Raghad's LinkedIn

🧠 Inspiration

Inspired by content from TheCleverProgrammer, Aman Kharwal, and Kaggle datasets.

💬 Feedback

Feel free to open an issue or share suggestions!
Hive Annotation Job Results - Cleaned and Audited
kaggle.com
Updated Apr 28, 2021
Cite
Brendan Kelley (2021). Hive Annotation Job Results - Cleaned and Audited [Dataset]. https://www.kaggle.com/brendankelley/hive-annotation-job-results-cleaned-and-audited/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 28, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Brendan Kelley
Description
Context

This notebook serves to showcase my problem-solving ability, knowledge of the data analysis process, proficiency with Excel and its various tools and functions, as well as my strategic mindset and statistical prowess. This project consists of an auditing prompt provided by Hive Data, a raw Excel data set, a cleaned and audited version of the raw Excel data set, and my description of my thought process and knowledge used during completion of the project. The prompt can be found below:

Hive Data Audit Prompt

The raw data that accompanies the prompt can be found below:

Hive Annotation Job Results - Raw Data

^ These are the tools I was given to complete my task. The rest of the work is entirely my own.

To summarize broadly, my task was to audit the dataset and summarize my process and results. Specifically, I was to create a method for identifying which "jobs" - explained in the prompt above - needed to be rerun based on a set of "background facts," or criteria. The description of my extensive thought process and results can be found below in the Content section.

Content

Brendan Kelley April 23, 2021

Hive Data Audit Prompt Results

This paper explains the auditing process of the “Hive Annotation Job Results” data. It includes the preparation, analysis, visualization, and summary of the data. It is accompanied by the results of the audit in the Excel file “Hive Annotation Job Results – Audited”.

Observation

The “Hive Annotation Job Results” data comes in the form of a single Excel sheet. It contains 7 columns and 5,001 rows, including column headers. The data includes “file”, “object id”, and the pseudonyms for five questions that each client was instructed to answer about their respective table: “tabular”, “semantic”, “definition list”, “header row”, and “header column”. The “file” column includes non-unique (that is, there are multiple instances of the same value in the column) numbers separated by a dash. The “object id” column includes non-unique numbers ranging from 5 to 487539. The columns containing the answers to the five questions include Boolean values (TRUE or FALSE), which depend upon the yes/no worker judgement.

Use of the COUNTIF() function reveals that there are no values other than TRUE or FALSE in any of the five question columns. The VLOOKUP() function reveals that the data does not include any missing values in any of the cells.
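
The same kind of check can be sketched outside Excel. The following is a minimal Python illustration, not the author's method (the audit itself was done with COUNTIF() and VLOOKUP()); the function name `audit_boolean_column` and the sample values are hypothetical.

```python
def audit_boolean_column(values):
    """Count cells that are missing or hold something other than TRUE/FALSE."""
    allowed = {"TRUE", "FALSE"}
    missing = 0
    invalid = 0
    for value in values:
        # Treat None and empty/whitespace-only cells as missing.
        if value is None or str(value).strip() == "":
            missing += 1
        # Anything else must read as TRUE or FALSE (case-insensitive).
        elif str(value).strip().upper() not in allowed:
            invalid += 1
    return {"missing": missing, "invalid": invalid}


# Hypothetical sample column, for illustration only.
sample = ["TRUE", "FALSE", True, "", None, "maybe"]
report = audit_boolean_column(sample)
```

A column is clean in the sense described above when both counts are zero.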

Assumptions

Based on the clean state of the data and the guidelines of the Hive Data Audit Prompt, the assumption is that duplicate values in the “file” column are acceptable and should not be removed. Similarly, duplicated values in the “object id” column are acceptable and should not be removed. The data is therefore clean and is ready for analysis/auditing.

Preparation

The purpose of the audit is to analyze the accuracy of the yes/no worker judgement of each question according to the guidelines of the background facts. The background facts are as follows:

• A table that is a definition list should automatically be tabular and also semantic
• Semantic tables should automatically be tabular
• If a table is NOT tabular, then it is definitely not semantic nor a definition list
• A tabular table that has a header row OR header column should definitely be semantic

These background facts serve as instructions for how the answers to the five questions should interact with one another. These facts can be re-written to establish criteria for each question:

For tabular column:
- If the table is a definition list, it is also tabular
- If the table is semantic, it is also tabular

For semantic column:
- If the table is a definition list, it is also semantic
- If the table is not tabular, it is not semantic
- If the table is tabular and has either a header row or a header column...
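
The four background facts listed earlier can be expressed as a small consistency check. This is a minimal Python sketch, assuming each row's five answers are available as booleans; the `Judgement` class and `violates_background_facts` function are hypothetical names, and the original audit was carried out in Excel rather than in code.

```python
from dataclasses import dataclass


@dataclass
class Judgement:
    """One row of worker answers (hypothetical container for illustration)."""
    tabular: bool
    semantic: bool
    definition_list: bool
    header_row: bool
    header_column: bool


def violates_background_facts(j: Judgement) -> bool:
    """Return True if the answers contradict any of the four background facts."""
    # A definition list must be both tabular and semantic.
    if j.definition_list and not (j.tabular and j.semantic):
        return True
    # A semantic table must be tabular.
    if j.semantic and not j.tabular:
        return True
    # A non-tabular table cannot be semantic or a definition list.
    if not j.tabular and (j.semantic or j.definition_list):
        return True
    # A tabular table with a header row or header column must be semantic.
    if j.tabular and (j.header_row or j.header_column) and not j.semantic:
        return True
    return False


# Hypothetical rows: the first is consistent, the second is not.
ok_row = Judgement(True, True, False, True, False)
bad_row = Judgement(False, True, False, False, False)
flags = [violates_background_facts(ok_row), violates_background_facts(bad_row)]
```

Under this reading, a job whose row returns True would be a candidate for rerunning.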
Additional file 1: of Simulation study of activities of daily living...
figshare.com
application/cdfv2
Updated Jun 3, 2023
Cite
Tsair-Wei Chien; Weir-Sen Lin (2023). Additional file 1: of Simulation study of activities of daily living functions using online computerized adaptive testing [Dataset]. http://doi.org/10.6084/m9.figshare.c.3644072_D2.v1
Explore at:
Available download formats: application/cdfv2
Unique identifier
https://doi.org/10.6084/m9.figshare.c.3644072_D2.v1
Dataset updated
Jun 3, 2023
Dataset provided by
figshare
Authors
Tsair-Wei Chien; Weir-Sen Lin
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The algorithm for determining Cutpoints and simulating data using MS Excel. (XLS 2362 kb)
Relaxed Naïve Bayes Data
search.dataone.org
Updated Nov 8, 2023
Cite
Relaxed Naïve Bayes Team (2023). Relaxed Naïve Bayes Data [Dataset]. http://doi.org/10.7910/DVN/7KNKLL
Unique identifier
https://doi.org/10.7910/DVN/7KNKLL
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Relaxed Naïve Bayes Team
Description
NaiveBayes_R.xlsx: This Excel file includes information as to how probabilities of observed features are calculated given recidivism (P(x_ij│R)) in the training data. Each cell is embedded with an Excel function to render appropriate figures.
- P(Xi|R): This tab contains probabilities of feature attributes among recidivated offenders.
- NIJ_Recoded: This tab contains re-coded NIJ recidivism challenge data following our coding schema described in Table 1.
- Recidivated_Train: This tab contains re-coded features of recidivated offenders.
- Tabs from [Gender] through [Condition_Other]: Each tab contains probabilities of feature attributes given recidivism. We use these conditional probabilities to replace the raw values of each feature in P(Xi|R) tab.

NaiveBayes_NR.xlsx: This Excel file includes information as to how probabilities of observed features are calculated given non-recidivism (P(x_ij│N)) in the training data. Each cell is embedded with an Excel function to render appropriate figures.
- P(Xi|N): This tab contains probabilities of feature attributes among non-recidivated offenders.
- NIJ_Recoded: This tab contains re-coded NIJ recidivism challenge data following our coding schema described in Table 1.
- NonRecidivated_Train: This tab contains re-coded features of non-recidivated offenders.
- Tabs from [Gender] through [Condition_Other]: Each tab contains probabilities of feature attributes given non-recidivism. We use these conditional probabilities to replace the raw values of each feature in P(Xi|N) tab.

Training_LnTransformed.xlsx: Figures in each cell are log-transformed ratios of probabilities in NaiveBayes_R.xlsx (P(Xi|R)) to the probabilities in NaiveBayes_NR.xlsx (P(Xi|N)).

TestData.xlsx: This Excel file includes the following tabs based on the test data: P(Xi|R), P(Xi|N), NIJ_Recoded, and Test_LnTransformed (log-transformed P(Xi|R)/ P(Xi|N)).

Training_LnTransformed.dta: We transform Training_LnTransformed.xlsx to Stata data set. We use Stat/Transfer 13 software package to transfer the file format.

StataLog.smcl: This file includes the results of the logistic regression analysis. Both estimated intercept and coefficient estimates in this Stata log correspond to the raw weights and standardized weights in Figure 1.

Brier Score_Re-Check.xlsx: This Excel file recalculates Brier scores of Relaxed Naïve Bayes Classifier in Table 3, showing evidence that results displayed in Table 3 are correct.

*****Full List*****
- NaiveBayes_R.xlsx
- NaiveBayes_NR.xlsx
- Training_LnTransformed.xlsx
- TestData.xlsx
- Training_LnTransformed.dta
- StataLog.smcl
- Brier Score_Re-Check.xlsx
- Data for Weka (Training Set): Bayes_2022_NoID
- Data for Weka (Test Set): BayesTest_2022_NoID
- Weka output for machine learning models (Conventional naïve Bayes, AdaBoost, Multilayer Perceptron, Logistic Regression, and Random Forest)
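
The log-transformed ratio described for Training_LnTransformed.xlsx can be illustrated with a short Python sketch. The function name `log_ratio` and the probability values below are hypothetical and chosen only for illustration; the dataset itself computes these figures with Excel functions, and this sketch assumes the natural logarithm, as the "Ln" in the file name suggests.

```python
import math


def log_ratio(p_given_r: float, p_given_n: float) -> float:
    """Natural log of P(Xi|R) / P(Xi|N) for one feature value."""
    if p_given_r <= 0 or p_given_n <= 0:
        raise ValueError("both probabilities must be positive")
    return math.log(p_given_r / p_given_n)


# Hypothetical conditional probabilities (P(Xi|R), P(Xi|N)) for two features.
example = {
    "Gender": (0.60, 0.40),
    "Condition_Other": (0.25, 0.50),
}

# A positive value means the attribute is more common among recidivated
# offenders; a negative value means it is more common among non-recidivated ones.
transformed = {name: log_ratio(p_r, p_n) for name, (p_r, p_n) in example.items()}
```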
ckanext-excelforms
catalog.civicdataecosystem.org
Updated Jun 4, 2025
Cite
(2025). ckanext-excelforms [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-excelforms
Dataset updated
Jun 4, 2025
Description
The excelforms extension for CKAN provides a mechanism for users to input data into Table Designer tables using Excel-based forms, enhancing data entry efficiency. This extension focuses on streamlining the process of adding data rows to tables within CKAN's Table Designer. A key component of the functionality is the ability to import multiple rows in a single operation, which significantly reduces the overhead associated with entering multiple data points.

Key Features:
- Excel-Based Forms: Users can enter data using familiar Excel spreadsheets, leveraging their existing skills and software.
- Table Designer Integration: Designed to work seamlessly with CKAN's Table Designer, extending its functionality to include Excel-based data entry.
- Multiple Row Import: Supports importing multiple rows of data at once, improving data entry efficiency, especially when dealing with large datasets.
- Data Mapping: Simplifies the process of aligning Excel column headers to their corresponding data fields in tables.
- Improved Data Entry Speed: Provides an alternative to manual data entry, resulting in faster population and easier updates.

Technical Integration: The excelforms extension integrates with CKAN by introducing new functionalities and workflows around the Table Designer plugin. The installation instructions specify that this plugin must be added before the tabledesigner plugin.

Benefits & Impact: By enabling Excel-based data entry, the excelforms extension improves the user experience for those familiar with spreadsheet software. The ability to import multiple rows simultaneously significantly reduces the time and effort required to populate tables, particularly when dealing with large amounts of data. The impact is better data accessibility through streamlined data population workflows.
Data from: Statistical Software Benchmarks
icpsr.umich.edu
Updated Oct 31, 2001
Cite
Statistical Software Benchmarks [Dataset]. https://www.icpsr.umich.edu/web/ICPSR/studies/1243
Unique identifier
https://doi.org/10.3886/ICPSR01243.v1
Dataset updated
Oct 31, 2001
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
Authors
Altman, Micah; McDonald, Michael P.
License
https://www.icpsr.umich.edu/web/ICPSR/studies/1243/terms
Description
This study provides tools to test the reliability of selected statistical software: Excel, Gauss, Stata, and SST. Functions covered include non-linear optimization algorithms, distributions, and pseudo-random number generators.
Grandpa Golf
kaggle.com
Updated Sep 12, 2023
Cite
FletcherKennamer (2023). Grandpa Golf [Dataset]. https://www.kaggle.com/datasets/fletcherkennamer/grandpa-golf/code
Dataset updated
Sep 12, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
FletcherKennamer
Description
My Grandpa asked if the programs I was using could calculate his Golf League’s handicaps, so I decided to play around with SQL and Google Sheets to see if I could functionally recreate what they were doing.

The goal is to calculate a player’s handicap, which is the average of the last six months of their scores minus 29. The average is calculated based on how many games they have actually played in the last six months, and the number of scores averaged correlates to total games. For example, Clem played over 20 games so his handicap will be calculated with the maximum possible scores accounted for, that being 8. Schomo only played six games, so the lowest 4 will be used for their average. Handicap is always calculated with the lowest available scores.

This league uses Excel, so upon receiving the data I converted it into a CSV and uploaded it into BigQuery.

First thing I did was change column names to best represent what they were and simplify things in the code. It is much easier to remember ‘someone_scores’ than ‘int64_field_number’. It also seemed to confuse SQL less, as int64 can mean something independently. (ALTER TABLE grandpa-golf.grandpas_golf_35.should only need the one RENAME COLUMN int64_field_4 TO schomo_scores;)

To Find the average of Clem’s scores: SELECT AVG(clem_scores) FROM grandpa-golf.grandpas_golf_35.should only need the one LIMIT 8; RESULT: 43.1

Remembering that handicap is the average minus 29, the final computation looks like: SELECT AVG(clem_scores) - 29 FROM grandpa-golf.grandpas_golf_35.should only need the one LIMIT 8; RESULT: 14.1

Find the average of Schomo’s scores: SELECT AVG(schomo_scores) - 29 FROM grandpa-golf.grandpas_golf_35.should only need the one LIMIT 6; RESULT: 10.5

This data was already automated to calculate a handicap in the league’s Excel spreadsheet, so I asked for more data to see if I could recreate those functions.

Grandpa provided the past three years of league data. The names were all replaced with generic “Golfer 001, Golfer 002, etc”. I had planned on converting this Excel sheet into a CSV and manipulating it in SQL like with the smaller sample, but this did not work.

Immediately, there were problems. I had initially tried to just convert the file into a CSV and drop it into SQL, but there were functions that did not transfer properly from what was functionally the PDF I had been emailed. So instead of working with SQL, I decided to pull this into Google Sheets and recreate the functions for this spreadsheet. We only need the most recent 6 months of scores to calculate our handicap, so once I made a working copy I deleted the data from before this time period. Once that was cleaned up, I started working on a function that would pull the working average from these values, which is still determined by how many total values there were. This correlates as follows: for 20 or more scores average the lowest 8, for 15 to 19 scores average the lowest 6, for 6 to 14 scores average the lowest 4, and for fewer than 6 scores average the lowest 2. We also need to ensure that an average value of 0 returns a value of 0 so our handicap calculator works. My formula ended up being:

=IF(COUNT(E2:AT2)>=20, AVERAGE(SMALL(E2:AT2, ROW(INDIRECT("1:"&8)))), IF(COUNT(E2:AT2)>=15, AVERAGE(SMALL(E2:AT2, ROW(INDIRECT("1:"&6)))), IF(COUNT(E2:AT2)>=6, AVERAGE(SMALL(E2:AT2, ROW(INDIRECT("1:"&4)))), IF(COUNT(E2:AT2)>=1, AVERAGE(SMALL(E2:AT2, ROW(INDIRECT("1:"&2)))), IF(COUNT(E2:AT2)=0, 0, "")))))

The handicap is just this value minus 29, so for the handicap column the script is relatively simple: =IF(D2=0,0,IF(D2>47,18,D2-29)) This caps the handicap at 18 when the average is above 47, and pulls the basic average from the right place. It also sets the handicap to zero if there are no scores present.
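
The two spreadsheet formulas above can be mirrored in a short Python sketch. This is an illustration under stated assumptions, not the league's actual implementation: the function names `working_average` and `handicap` and the sample scores are hypothetical, and when a golfer has a single score the sketch averages whatever is available, which may differ from how the spreadsheet behaves in that case.

```python
def working_average(scores):
    """Average the lowest N scores, where N depends on how many were played."""
    count = len(scores)
    if count == 0:
        return 0  # no scores: return 0 so the handicap step also yields 0
    if count >= 20:
        keep = 8
    elif count >= 15:
        keep = 6
    elif count >= 6:
        keep = 4
    else:
        keep = 2
    lowest = sorted(scores)[:keep]  # assumption: uses fewer if fewer exist
    return sum(lowest) / len(lowest)


def handicap(average):
    """Mirror =IF(D2=0,0,IF(D2>47,18,D2-29))."""
    if average == 0:
        return 0
    if average > 47:
        return 18
    return average - 29


# Hypothetical golfer with six scores: the lowest four are averaged.
six_scores = [50, 40, 45, 42, 41, 60]
avg = working_average(six_scores)
hcp = handicap(avg)
```

For the six scores shown, the lowest four are 40, 41, 42, and 45, so the working average is 42 and the handicap is 13.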

Now that we have our spreadsheet back in working order with our new scripts, we are functionally done. We have recreated what my Grandpa’s league uses to generate handicaps.


