The Excel project can be downloaded from GitHub here.
It includes the raw data, Pivot Tables, and an interactive dashboard with Pivot Charts and Slicers. The project also includes business questions and the formulas I used to answer them. The image below is included for ease of reference.
Image: Business Questions (https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2F61e460b5f6a1fa73cfaaa33aa8107bd5%2FBusinessQuestions.png?generation=1686190703261971&alt=media)
The Tableau-adjusted dashboard can be found here.
A screenshot of the interactive Excel dashboard is also included below for ease of reference.
Image: Scooter Dashboard (https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fe581f1fce8afc732f7823904da9e4cce%2FScooter%20Dashboard%20Image.png?generation=1686190815608343&alt=media)
License: Open Database License (ODbL) v1.0, https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
In this project, I conducted a comprehensive analysis of retail and warehouse sales data to derive actionable insights. The primary objective was to understand sales trends, evaluate performance across channels, and identify key contributors to overall business success.
To achieve this, I transformed raw data into interactive Excel dashboards that highlight sales performance and channel contributions, providing a clear and concise representation of business metrics.
Key Highlights of the Project:
- Created two dashboards: Sales Dashboard and Contribution Dashboard.
- Answered critical business questions, such as monthly trends, channel performance, and top contributors.
- Presented actionable insights with professional visuals, making it easy for stakeholders to make data-driven decisions.
This dashboard was created from data published by Olist Store (a Brazilian e-commerce public dataset). The raw data contains information about 100,000 orders placed in many regions of Brazil from 2016 to 2018.
The raw datasets were imported into Excel using the “Get Data” option (Power Query) and cleaned. An additional table with the names of Brazilian states was imported from the corresponding Wikipedia page.
A data table of payment information was then created from the imported statistics using nested formulas. Pivot charts were used to build the Olist Store Payment Dashboard, which lets you review the data through a connected timeline and slicers.
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Sample data for exercises in Further Adventures in Data Cleaning.
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
This dataset illustrates sales data from a company and its three product lines: boats, cars, and planes. It contains historical sales data. This is fictional data, created and used for data exploration and profit margin analysis.
The Excel project can be downloaded from this GitHub repository. It includes the raw data, statistical analysis, Pivot Tables, and a dashboard with Pivot Charts for interaction.
Below is a screenshot of the charts for ease of reference.
Image: Weekly Revenue by Product Line (https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F10624788%2Fc945ef4223f1b0b6c2dfe7ade798e34e%2FWeekly%20Revenue%20by%20Product%20Line.png?generation=1722385095875351&alt=media)
Image: Revenue and Profit by Quarter (https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F10624788%2Fd3be2fd1f741b0899e79b9c50c7e29a0%2FRevenue%20and%20Profit%20by%20Quarter.png?generation=1722385108310009&alt=media)
This dataset illustrates customer data from bike sales. It contains information such as Income, Occupation, Age, Commute, Gender, Children, and more. This is fictional data, created and used for data exploration and cleaning.
The Excel project can be downloaded from GitHub here. It includes the raw data, the cleaned data, Pivot Tables, and an interactive dashboard with Pivot Charts and Slicers, which allow the dashboard to be filtered by Marital Status, Region, and Education.
Below is a screenshot of the dashboard for ease of reference.
Image: Bike Buyers Dashboard (https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fcbc9db6fe00f3201c64e4fdb668ce9d1%2FBikeBuyers%20Dashboard%20Image.png?generation=1686186378985936&alt=media)
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a compiled list of all the women who took part in the data collection process.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw Dataset for University Instructors' Technostress and Online Teaching Self-Efficacy Project
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created during the research carried out for the PhD of Negin Afsharzadeh and the subsequent manuscript arising from this research. The main purpose of this dataset is to create a record of the raw data that was used in the analyses in the manuscript.
This dataset includes:
In this study, we aimed to optimize approaches to improve the biotechnological production of important metabolites in G. glabra. The study is made up of four experiments that correspond to particular figures/tables in the manuscript and data, as described below.
We tested approaches for the cultivation of G. glabra, specifically the breaking of seed dormancy, to ensure timely and efficient seed germination. To do this, we tested the effect of different pretreatments, sterilization treatments and growth media on the germination success of G. glabra.
This experiment corresponds to:
We aimed to optimize the induction of hairy roots in G. glabra. Four strains of R. rhizogenes were tested to identify the most effective strain for inducing hairy root formation, and we tested different tissue explants (cotyledons/hypocotyls) and methods of R. rhizogenes infection (injection, or soaking for different durations) in these tissues.
This experiment corresponds to:
Eight distinct hairy root lines were established and the growth rate of these lines was measured over 40 days.
This experiment corresponds to:
We aimed to test different qualities of light on hairy root cultures in order to induce higher growth and possible enhanced metabolite production. A line with a high growth rate from experiment 3, line S, was selected for growth under different light treatments: red light, blue light, and a combination of blue and red light. To assess the overall impact of these treatments, the growth of line S, as well as the increase in antioxidant capacity and total phenolic content, were tracked over this induction period.
This experiment corresponds to:
To work with the .R file and the R datasets, it is necessary to use R: A Language and Environment for Statistical Computing and the R package DHARMa. The versions used for the analyses are R version 4.4.1 and DHARMa version 0.4.6.
The references for these are:
R Core Team, R: A Language and Environment for Statistical Computing 2024. https://www.R-project.org/
Hartig F, DHARMa: Residual Diagnostics for Hierarchical (Multi-Level/Mixed) Regression Models 2022. https://CRAN.R-project.org/package=DHARMa
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the raw experimental data and supplementary materials for the "Asymmetry Effects in Virtual Reality Rod and Frame Test" study. The materials included are:
• Raw Experimental Data: older.csv and young.csv
• Mathematica Notebooks: a collection of Mathematica notebooks used for data analysis and visualization. These notebooks provide scripts for processing the experimental data, performing statistical analyses, and generating the figures used in the project.
• Unity Package: a Unity package featuring a sample scene related to the project. The scene was built using Unity’s Universal Rendering Pipeline (URP). To utilize this package, ensure that URP is enabled in your Unity project. Instructions for enabling URP can be found in the Unity URP Documentation.
Requirements:
• For Data Files: software capable of opening CSV files (e.g., Microsoft Excel, Google Sheets, or any programming language that can read CSV formats); see the sketch after this list.
• For Mathematica Notebooks: Wolfram Mathematica software to run and modify the notebooks.
• For Unity Package: Unity Editor version compatible with URP (2019.3 or later recommended). URP must be installed and enabled in your Unity project.
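As a minimal sketch of the data-file requirement, the two CSV files could be loaded with pandas; the column names are not documented here, so only generic, column-agnostic operations are shown.

```python
# Minimal sketch: load and summarize the two raw data files (older.csv, young.csv).
# Assumes pandas is installed; column names are not documented here, so only
# generic, column-agnostic operations are shown.
import pandas as pd

older = pd.read_csv("older.csv")
young = pd.read_csv("young.csv")

for name, df in [("older", older), ("young", young)]:
    print(f"{name}: {df.shape[0]} rows x {df.shape[1]} columns")
    print(df.describe(include="all"))  # quick per-column overview for each age group
```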
Usage Notes:
• The dataset facilitates comparative studies between different age groups based on the collected variables.
• Users can modify the Mathematica notebooks to perform additional analyses.
• The Unity scene serves as a reference to the project setup and can be expanded or integrated into larger projects.
Citation: Please cite this dataset when using it in your research or publications.
The heat pump monitoring datasets are a key output of the Electrification of Heat Demonstration (EoH) project, a government-funded heat pump trial assessing the feasibility of heat pumps across the UK’s diverse housing stock. These datasets are provided in both cleansed and raw form and allow analysis of the initial performance of the heat pumps installed in the trial. The datasets yield insights such as the heat pump seasonal performance factor (a measure of the heat pump's efficiency), heat pump performance during the coldest day of the year, and half-hourly performance to inform peak demand analysis.
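For orientation, the seasonal performance factor is the ratio of heat delivered to electricity consumed over the monitoring period. A hedged sketch of that calculation is below; the file and column names are placeholders, not the dataset's documented schema.

```python
# Sketch: seasonal performance factor (SPF) = total heat output / total electrical input.
# File and column names are placeholders, not the EoH dataset's documented schema.
import pandas as pd

df = pd.read_csv("example_heat_pump_monitoring.csv")

spf = df["heat_output_kwh"].sum() / df["electricity_input_kwh"].sum()
print(f"Seasonal performance factor: {spf:.2f}")
```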
For the second edition (December 2024), the data were updated to include cleaned performance data collected between November 2020 and September 2023. The only documentation currently available with the study is the Excel data dictionary. Reports and other contextual information can be found on the Energy Systems Catapult website.
The EoH project was funded by the Department for Business, Energy and Industrial Strategy. Since 2023, it has been covered by the new Department for Energy Security and Net Zero.
Data availability
This study comprises the open-access cleansed data from the EoH project and a summary dataset, available in four zipped files (see the 'Access Data' tab). Users must download all four zip files to obtain the full set of cleansed data and accompanying documentation.
When unzipped, the full cleansed data comprises 742 CSV files. Most of the individual CSV files are too large to open in Excel. Users should ensure they have sufficient computing facilities to analyse the data.
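Because most of the individual CSV files are too large to open in Excel, one workable approach (a sketch, not part of the published documentation; the file name is a placeholder) is to stream each file in chunks with pandas rather than loading it whole.

```python
# Sketch: stream a large cleansed CSV in chunks instead of loading it all at once.
# The file name is a placeholder; adjust the chunk size to the available memory.
import pandas as pd

row_count = 0
for chunk in pd.read_csv("example_cleansed_file.csv", chunksize=500_000):
    row_count += len(chunk)
    # Aggregate or filter each chunk here instead of holding everything in memory.

print(f"Total rows: {row_count}")
```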
The UKDS also holds an accompanying study, SN 9049 Electrification of Heat Demonstration Project: Heat Pump Performance Raw Data, 2020-2023, which is available only to registered UKDS users. This contains the raw data from the EoH project. Since the data are very large, only the summary dataset is available to download; an order must be placed for FTP delivery of the remaining raw data. Other studies in the set include SN 9209, which comprises 30-minute interval heat pump performance data, and SN 9210, which includes daily heat pump performance data.
The Python code used to cleanse the raw data and then perform the analysis is accessible via the Energy Systems Catapult GitHub repository: https://github.com/ES-Catapult/electrification_of_heat
This dataset was created by Pinky Verma.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset that accompanies the study: "On the Distribution of "Simple Stupid Bugs" in Unit Test Files: An Exploratory Study." This study has been accepted for publication at the 2021 Mining Software Repositories Conference.
Following is the abstract of the study:
A key aspect of ensuring the quality of a software system is the practice of unit testing. Through unit tests, developers verify the correctness of production source code, thereby verifying the system's intended behavior under test. However, unit test code is subject to issues, ranging from bugs in the code to poor test case design (i.e., test smells). In this study, we compare and contrast the occurrences of a type of single-statement-bug-fix known as "simple stupid bugs" (SStuBs) in test and non-test (i.e., production) files in popular open-source Java Maven projects. Our results show that SStuBs occur more frequently in non-test files than in test files, with most fix-related code associated with assertion statements in test files. Further, most test files exhibiting SStuBs also exhibit test smells. We envision our findings enabling tool vendors to better support developers in improving the maintenance of test suites.
Following are the contents of the dataset:
Dataset.sqlite -- A SQLite database containing the raw dataset used in this project
CompleteTableEntries.xlsx -- An Excel spreadsheet containing the complete listings for the tables in the paper
Key contents of Dataset.sqlite
Table Name ---- Table Description
"sstubs" ---- The set of SStuBs in popular Maven repositories
"testsmells" ---- Mined test smells from test files
"topJavaMavenProjects" ---- Repository details for the Maven projects
"topJavaMaven_Commit" ---- Commit details for the Maven projects
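As a small usage sketch, the database can be inspected with Python's built-in sqlite3 module, using the table names listed above; column names are not documented here, so the queries only count rows.

```python
# Sketch: inspect Dataset.sqlite using the table names listed above.
# Column names are not documented here, so only row counts are reported.
import sqlite3

with sqlite3.connect("Dataset.sqlite") as conn:
    for table in ["sstubs", "testsmells", "topJavaMavenProjects", "topJavaMaven_Commit"]:
        (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        print(f"{table}: {count} rows")
```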
This dataset is a label-free quantitation of proteins in milk and dry secretions from the end of lactation through day 21 of the dry period, using liquid chromatography with tandem mass spectrometry (LC-MS/MS). The data supplied in this article support the accompanying publication entitled “Characterization of bovine mammary gland dry secretions and their proteome from the end of lactation through day 21 of the dry period”. The Thermo mass spectrometry raw files and MaxQuant files have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository under dataset number PXD017837.
Resources in this dataset:
Resource Title: Characterization of Bovine Dry Secretions and their Proteome from the End of Lactation Through Day 21 of the Dry Period - ProteomeXchange Consortium via the PRIDE partner repository, Project PXD017837.
File Name: Web Page, url: https://www.ebi.ac.uk/pride/archive/projects/PXD017837
Thermo raw file naming for the PRIDE raw files and supplemental Excel files: the three technical replicates are denoted by a letter (A, B, or C). The number following is the cow identification number for the 11 cows used. The final two-digit number after the underscore is the day sampled, where _01 = day 1, _03 = day 3, _10 = day 10 and _21 = day 21 of the dry period. For example, A1313_01 is technical replicate A for cow 1313 collected on day 1, and B1313_03 is technical replicate B for cow 1313 collected on day 3. Details of the sample and data processing protocols are provided.
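The sample-code convention above is regular enough to parse programmatically. The small Python sketch below is illustrative only; the helper is not part of the published files.

```python
# Sketch: parse a Thermo raw-file sample code such as "A1313_01" into
# replicate letter, cow ID, and dry-period day, per the convention above.
import re

SAMPLE_CODE = re.compile(r"^(?P<replicate>[ABC])(?P<cow>\d+)_(?P<day>\d{2})$")

def parse_sample_code(code: str) -> dict:
    match = SAMPLE_CODE.match(code)
    if match is None:
        raise ValueError(f"Unrecognized sample code: {code!r}")
    return {
        "replicate": match["replicate"],
        "cow": int(match["cow"]),
        "day": int(match["day"]),
    }

print(parse_sample_code("A1313_01"))  # {'replicate': 'A', 'cow': 1313, 'day': 1}
print(parse_sample_code("B1313_03"))  # {'replicate': 'B', 'cow': 1313, 'day': 3}
```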
License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The dataset includes the interview guidelines for both the pre-study and respondent-validation phases of the study, along with the interview transcripts. The questionnaire and the raw dataset from the quantitative phase are also included in the uploaded files.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset that accompanies the study: "Using Grammar Patterns to Interpret Test Method Name Evolution." This study has been accepted for publication at the 29th IEEE/ACM International Conference on Program Comprehension.
Following is the abstract of the study:
It is good practice to name test methods such that they are comprehensible to developers; they must be written in such a way that their purpose and functionality are clear to those who will maintain them. Unfortunately, there is little automated support for writing or maintaining the names of test methods. This can lead to inconsistent and low-quality test names and increase the maintenance cost of supporting these methods. Due to this risk, it is essential to help developers in maintaining their test method names over time. In this paper, we use grammar patterns, and how they relate to test method behavior, to understand test naming practices. This data will be used to support an automated tool for maintaining test names.
Following are the contents of the dataset:
ICPC2021-Public.sqlite -- A SQLite database containing the raw dataset used in this project
ICPC2021-Public.xlsx -- Excel spreadsheet containing the complete listings for the tables in the paper
Contents of ICPC2021-Public.sqlite
Table Name ---- Table Description
"gitCommit" ---- The commit log for all projects
"refactoring" ---- Mined refactoring operations from RefactoringMiner
"refactoring_renamedMethod" ---- Mined Rename Method refactoring operations
"detected_testfiles" ---- Detected unit test files
"detected_testfiles_refactored" ---- Refactored unit test files
"detected_testfiles_refactored_renamemethod" ---- Renamed Methods in refactored unit test files
"annotation_grammar" ---- The data that was provided to the annotators
"annotation_grammar_results" ---- The finalized results of the annotation
"annotation_grammar_results_prefix2" ---- The first two part-of-speech tags of finalized annotation
"annotation_grammar_results_prefix3" ---- The first three part-of-speech tags of finalized annotation
"annotation_grammar_results_prefix4" ---- The first four part-of-speech tags of finalized annotation
"annotation_grammar_results_prefix5" ---- The first five part-of-speech tags of finalized annotation
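A brief, hedged sketch of loading one of these tables for analysis with pandas and the built-in sqlite3 module (the file and table names come from the listing above; column names are not documented here):

```python
# Sketch: list the tables in the SQLite database and load one into a DataFrame.
# File and table names come from the listing above; columns are not documented here.
import sqlite3
import pandas as pd

with sqlite3.connect("ICPC2021-Public.sqlite") as conn:
    # List every table actually present in the database.
    tables = pd.read_sql("SELECT name FROM sqlite_master WHERE type='table'", conn)
    print(tables)

    renamed = pd.read_sql('SELECT * FROM "refactoring_renamedMethod"', conn)
    print(renamed.shape)
```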
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This zip file contains 3 .zip files = projects to be imported into SmartPLS 3:
- DLOQ-A model with 7 dimensions
- DLOQ-A model with second-order latent variable
- ECSI model (Tenenhaus et al., 2005), to exemplify direct, indirect and total effects, as well as the importance-performance map and moderation with continuous variables
- ECSI model (Sanches, 2013), to exemplify MGA (multi-group analysis)
Note:
- DLOQ-A = new dataset (ours)
- ECSI-Tenenhaus et al. [model for mediation and moderation] = available at: http://www.smartpls.com > Resources > SmartPLS Project Examples
- ECSI-Sanches [dataset for MGA] = available in the software R > library(plspm) > data(satisfaction)
_Dataset for project outputs_
doi: 10.5281/zenodo.17220467
last updated: 2025-09-28
_Contact_
Marek Holba
holba@asio.cz
+ 420 606 610 347
ASIO TECH, spol. s r.o
Kšírova 552/45, 619 00 Brno, Czech Republic
_Licence/Data availability_
Research data described in this ReadMe file and in this metadata deposit are confidential, as they reveal trade secrets and involve the intellectual property (IP) rights of the parties. Sample data can be provided upon request by the contact person listed above. With respect to IP protection and the protection of commercial interests, non-disclosure agreements will be required.
------------------------------------------------------------------------------------------------
_About the dataset_
The pilot plant at Kladno-Vrapice served to verify laboratory hypotheses at pilot scale. A pilot plant with two reactors was built, and one of the reactors was also filled with biomass carriers to compare efficiency. The reactors were regularly sampled by the University of Chemistry and Technology, Prague (UCT Prague) for evaluation; those data are stored and managed by UCT Prague. Operational data, however, were monitored and saved every five minutes to evaluate the actual state of the reactors. Concentrations of dissolved oxygen, pH and temperature were monitored, and the operation (on/off) of all adjacent devices was recorded: pumps, blowers and dosing pumps.
_Methods of data collection_
Data were saved on a memory card located at the control panel. All data files (24-hour operation files) were retrieved at one-week intervals through a VPN connection and stored on a hard disk at ASIO TECH.
_Methods of data processing_
The datasets contain raw data in *.csv format. These data were re-processed into an Excel file to allow data handling and visual representation. No further processing was conducted.
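One hedged way to reproduce that CSV-to-Excel step is sketched below; the folder layout follows the file name structure described in the next section, the Excel writer assumes openpyxl is installed, and the delimiter of the raw files may differ from the comma assumed here.

```python
# Sketch: gather the daily raw CSVs and write them into one Excel workbook,
# one sheet per day. Assumes openpyxl is installed and comma-separated input;
# adjust the delimiter if the raw files use a different separator.
import pandas as pd
from pathlib import Path

with pd.ExcelWriter("kladno_operational_data.xlsx") as writer:
    for csv_file in sorted(Path("Data Kladno").rglob("*.csv")):
        daily = pd.read_csv(csv_file)
        daily.to_excel(writer, sheet_name=csv_file.stem, index=False)
```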
------------------------------------------------------------------------------------------------
_File name structure_
* Data Kladno/monthYYYY/DDMMYYYY.csv
Data Kladno – name of the main file
month – name of the data from measured month (e.g. březen, duben)
YYYY – year of operation (e.g. 2024 or 2025)
DD – day (e.g. 05, 31); MM – month (e.g. 08, 12); YYYY – year (e.g. 2024 or 2025)
* Example: Data Kladno/březen2025/31032025.csv
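Given this structure, the path of a daily file can be reconstructed from a date. The sketch below is illustrative only; it maps just the two month names given as examples and would need the remaining Czech month names to cover a full year.

```python
# Sketch: build the path of a daily operational CSV from a date, following the
# "Data Kladno/monthYYYY/DDMMYYYY.csv" convention described above.
from datetime import date

CZECH_MONTHS = {3: "březen", 4: "duben"}  # extend with the remaining months as needed

def kladno_csv_path(day: date) -> str:
    month_name = CZECH_MONTHS[day.month]
    return f"Data Kladno/{month_name}{day.year}/{day.strftime('%d%m%Y')}.csv"

print(kladno_csv_path(date(2025, 3, 31)))  # Data Kladno/březen2025/31032025.csv
```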
_File formats_
* operational data – original CSV + converted XLSM
* Text documents – PDF
_Date formats_
* YYYY-MM-DD
* HH-MM-SS 24hr format
_Units and abbreviations_
* All dissolved oxygen concentrations are in mg/L
* All temperature measurements are in °C
------------------------------------------------------------------------------------------------
_List of files_
* Operational
Continuous Temperature Data. Raw Hobo data files collected at Willapa NWR (Omeara Creek) and compiled at a regular frequency into an annual (cleaned) dataset (Excel). Omeara Creek had five locations monitored during the pilot project, and these data are included in this dataset. Each location could have multiple loggers deployed (planned redundancy in case of logger failure). See the CURRENT-TL-Logger Log for details at https://ecos.fws.gov/ServCat/Reference/Profile/132469. Monitored locations:
OMEDSA02W: 46.4021548, -123.9490759
OMEDSA01W: 46.4021189, -123.9487788
OMETSC01W: 46.4019155, -123.9474488
OMEUSK02W: 46.4002542, -123.9445530
OMEUSK03A: 46.4002542, -123.9445530
This data is associated with the Nevada Play Fairway project and includes Excel files containing raw 2-meter temperature data and corrections. GIS shapefiles and layer files containing location and attribute information for the data are included. Well data includes both deep and shallow TG holes, with GIS shapefiles and layer files.