Analyzing sales data is essential for any business looking to make informed decisions and optimize its operations. In this project, we will utilize Microsoft Excel and Power Query to conduct a comprehensive analysis of Superstore sales data. Our primary objectives will be to establish meaningful connections between various data sheets, ensure data quality, and calculate critical metrics such as the Cost of Goods Sold (COGS) and discount values. Below are the key steps and elements of this analysis:
1- Data Import and Transformation:
2- Data Quality Assessment:
3- Calculating COGS:
4- Discount Analysis:
5- Sales Metrics:
6- Visualization:
7- Report Generation:
Throughout this analysis, the goal is to provide a clear and comprehensive understanding of the Superstore's sales performance. By using Excel and Power Query, we can efficiently manage and analyze the data, ensuring that the insights gained contribute to the store's growth and success.
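For illustration, here is a minimal pandas sketch of steps 3-5 (COGS, discount value, and summary metrics). The workbook layout is not shown above, so the file name, sheet name, and column names (Quantity, UnitCost, Sales, Discount, Category) are assumptions; in the actual project these steps would be performed in Power Query.

```python
import pandas as pd

# File and sheet names are assumptions; adjust to the actual workbook.
orders = pd.read_excel("superstore.xlsx", sheet_name="Orders")

# Step 3 - COGS: units sold times unit cost. A UnitCost column is assumed;
# in practice it might come from merging a separate product-cost sheet.
orders["COGS"] = orders["Quantity"] * orders["UnitCost"]

# Step 4 - discount value in currency terms, assuming Discount is stored
# as a fraction (e.g. 0.2 for a 20% discount off the list price).
orders["DiscountValue"] = orders["Sales"] / (1 - orders["Discount"]) - orders["Sales"]

# Step 5 - headline sales metrics, grouped by an assumed Category column.
orders["Profit"] = orders["Sales"] - orders["COGS"]
print(orders.groupby("Category")[["Sales", "COGS", "DiscountValue", "Profit"]].sum())
```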
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Contains monthly data from the Assuring Transformation dataset. Data is available in Excel or CSV format. Note: This data was republished on 20th December 2021 due to an issue with historic length of stay data being incorrectly calculated in table 2.7 (now removed). This has been replaced with table 3.4. Figures for November 2021 were unaffected.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The folder contains two subfolders: one with EyeLink 1000 Plus eye-tracking data, and the other with Tobii Nano Pro data. Each of these folders includes a file named "Gaze_position_raw", an Excel file containing all the raw data collected from participants. In this file, different trials are stored in separate sheets, and each sheet contains columns for the displayed target location and the corresponding gaze location.

A separate folder called "Processed_data_EyeLink/Tobii" contains the results after performing a linear transformation. In this folder, each trial is saved as a separate Excel file. These files include the target location, the original gaze location, and the corrected gaze location after the corresponding linear transformation.
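The dataset description does not specify how the linear transformation was fitted, but a common approach is an affine least-squares fit from raw gaze coordinates to target coordinates. A minimal numpy sketch under that assumption (the coordinates below are invented for illustration):

```python
import numpy as np

def fit_affine(gaze, target):
    """Fit an affine transform A (2x2) and offset b (2,) such that
    gaze @ A.T + b approximates target, via ordinary least squares."""
    # Augment gaze positions with a constant column for the offset term.
    X = np.hstack([gaze, np.ones((gaze.shape[0], 1))])   # (n, 3)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)    # (3, 2)
    A, b = coef[:2].T, coef[2]
    return A, b

def apply_affine(gaze, A, b):
    """Apply the fitted correction to raw gaze positions."""
    return gaze @ A.T + b

# Toy usage: n trials x 2 screen coordinates (made-up numbers).
gaze = np.array([[100.0, 200.0], [300.0, 220.0], [500.0, 640.0], [120.0, 610.0]])
target = np.array([[110.0, 190.0], [310.0, 215.0], [505.0, 650.0], [125.0, 600.0]])
A, b = fit_affine(gaze, target)
print(np.round(apply_affine(gaze, A, b), 1))  # corrected gaze locations
```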
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project focuses on data mapping, integration, and analysis to support the development and enhancement of six UNCDF operational applications: OrgTraveler, Comms Central, Internal Support Hub, Partnership 360, SmartHR, and TimeTrack. These apps streamline workflows for travel claims, internal support, partnership management, and time tracking within UNCDF.

Key Features and Tools:
- Data Mapping for Salesforce CRM Migration: Structured and mapped data flows to ensure compatibility and seamless migration to Salesforce CRM.
- Python for Data Cleaning and Transformation: Utilized pandas, numpy, and APIs to clean, preprocess, and transform raw datasets into standardized formats.
- Power BI Dashboards: Designed interactive dashboards to visualize workflows and monitor performance metrics for decision-making.
- Collaboration Across Platforms: Integrated Google Colab for code collaboration and Microsoft Excel for data validation and analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This folder contains the following files and datasets:

Flow Cytometry Data
- Individual FCS files: raw data files obtained following segmentation
- Analysis file (pre-transformation): data analysis file before transformation, compatible with FCS Express
- Analysis file (post-transformation): data analysis file after transformation, compatible with FCS Express
- DNS format files: processed files analyzed following data transformation

Statistical Analysis and Figures
- Manuscript figures: all figures from the manuscript in GraphPad Prism format, accessible with Numbers, including statistical test results

Data Extraction and Spatial Analysis
- Cluster percentages: Excel file containing individual cluster percentages extracted from the analysis file
- Spatial neighborhood data: Excel file with all data used as the starting point for spatial neighborhood map generation
- Spatial interaction maps: ZIP archive containing heatmaps showing spatial interactions between individual clusters

Please see the collection for related records: https://doi.org/10.25405/data.ncl.c.7890872
The excelimport extension enhances CKAN's data ingestion capabilities by enabling users to create and update datasets directly from Excel files. This streamlines the process of importing structured data, eliminating the need for manual data entry or complex transformations. By directly supporting Excel files, the extension offers a user-friendly method for populating CKAN with dataset information.

Key Features:
- Excel-Based Dataset Creation: Allows users to create new datasets by uploading Excel files, parsing the data within the file to automatically populate dataset fields and resources.
- Excel-Based Dataset Updates: Permits the updating of existing datasets in CKAN by uploading modified Excel files, ensuring data integrity and reducing manual manipulation.
- Metadata Mapping: Provides the ability to map Excel columns to CKAN dataset fields, allowing flexible customization of the import process to align with specific data models.

Benefits & Impact: The excelimport extension lets users quickly ingest data into CKAN from Excel files. By simplifying the data ingestion workflow, it lowers the barrier to entry for populating the catalog, improves data accessibility, and promotes data sharing within organizations.
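The extension's internal API is not documented here, but the overall idea (read an Excel file, map columns to dataset fields, create the dataset) can be sketched with the generic ckanapi client and pandas. The mapping, file name, CKAN URL, and organization name below are all illustrative assumptions, not the extension's actual code path:

```python
import pandas as pd
from ckanapi import RemoteCKAN  # pip install ckanapi

# Hypothetical mapping from Excel column headers to CKAN dataset fields.
COLUMN_TO_FIELD = {"Title": "title", "Name": "name", "Description": "notes"}

rows = pd.read_excel("datasets.xlsx")  # file name is an assumption

ckan = RemoteCKAN("https://demo.ckan.org", apikey="your-api-key")
for _, row in rows.iterrows():
    payload = {field: row[col] for col, field in COLUMN_TO_FIELD.items()}
    payload["owner_org"] = "my-organization"  # assumed organization
    # package_create is the standard CKAN action for creating a dataset;
    # package_update would be used for the update path described above.
    ckan.action.package_create(**payload)
```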
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 1. Example of translation from VCF into GDM format for genomic region data: This .xlsx (MS Excel) spreadsheet exemplifies the transformation of the original 1KGP mutations, expressed in VCF format, into GDM genomic regions. As a demonstrative example, some variants on chromosome X have been selected from the source data (in VCF format) and listed in the first table at the top of the file. The values of columns #CHROM, POS, REF and ALT appear as in the source. From the column INFO we removed the details that are unnecessary for the transformation. The column FORMAT contains only the value "GT", meaning that the following columns contain only the genotype of the samples (this and other conventions are defined in the VCF specification document and in the header section of each VCF file). In multiallelic variants (examples e, f.1 and f.2), the genotype indicates with a number which of the alternative alleles in ALT is present in the corresponding samples (e.g., the number 2 means that the second variant is present); otherwise, it only assumes the values 0 (mutation absent) or 1 (mutation present). Additionally, the genotype indicates whether one or both chromosome copies contain the mutation and which one, i.e., the left one or the right one; the mutated alleles are normally separated by a pipe ("|"), if not otherwise specified in the header section. We do not know which chromosome copy is maternal or paternal, but as the 1KGP mutations are "phased", we know that the "left chromosome" is the same in every mutation located in the same chromosome of the same donor. As this example has only one column after FORMAT, the mutations described are relative to only one sample, called "HG123456". This sample does not actually exist in the source, but serves the purpose of demonstrating several mutation types that are found in the original data. The table reports six variants in VCF format, with the last one repeated twice to show how different values of genotype lead to a different translation (indeed, examples f.1 and f.2 differ only in the last column). Below in the same file, the same variants appear converted into GDM format. The transformation outputs the chr, left, right, strand, AL1, AL2, ref, alt, mut_type and length columns. The value of strand is positive in every mutation, as clarified by the 1KGP Consortium after the release of the data collections. The values of AL1 and AL2 express on which chromatid the mutation occurs and depend on the value of the original genotype (column HG123456). The values of the other columns, namely chr, left, right, ref, alt, mut_type and length, are obtained from the variant's original values after splitting multi-allelic variants, transforming the original position into 0-based coordinates, and removing repeated nucleotide bases from the original REF and ALT columns. In 0-based coordinates, a nucleotide base occupies the space between the coordinates x and x + 1. So, SNPs (examples a and f.2) are encoded as the replacement of ref at the position between left and right with alt. Insertions (examples c and f.1) are described as the addition of the sequence of bases in alt at the position indicated by left and right, i.e., in between two nucleotide bases. Deletions (example b) are represented as the substitution of ref between positions left and right with an empty value (alt is indeed empty in this case).

Finally, structural variants (examples d and e) such as copy number variations and large deletions have an empty ref because, according to the VCF specification document, the original column REF reports a nucleotide (called the padding base) that is located before the scope of the variant on the genome and is unnecessary in a 0-based representation. In this file, we reported only the columns relevant for understanding how the transformation method handles the mutation coordinates and the reference and alternative alleles. In addition to the ones reported in the second table, the transformation adds some more columns, named after the attributes in the original INFO column, to capture a selection of the attributes present in the original file.
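As a rough illustration of the rules just described, the following Python sketch converts a single phased VCF record for one sample into GDM-style regions: it splits multi-allelic variants, derives AL1/AL2 from the genotype, moves to 0-based coordinates, and trims the shared padding base. It is not the project's actual pipeline; the mut_type labels and the length definition are assumptions, and symbolic structural-variant ALT values are omitted for brevity.

```python
def vcf_to_gdm(chrom, pos, ref, alt_field, gt):
    """Translate one VCF record (1-based, padded) for one sample into
    GDM regions (0-based). gt is a phased genotype, e.g. '0|1' or '1|2'."""
    records = []
    left_gt, right_gt = gt.split("|")
    for i, alt in enumerate(alt_field.split(","), start=1):
        al1 = 1 if left_gt == str(i) else 0   # mutation on "left" chromatid?
        al2 = 1 if right_gt == str(i) else 0  # mutation on "right" chromatid?
        if not (al1 or al2):
            continue  # this alternative allele is absent in the sample
        r, a, left = ref, alt, pos - 1  # 1-based -> 0-based
        # Trim the common leading base(s), including the VCF padding base.
        while r and a and r[0] == a[0]:
            r, a = r[1:], a[1:]
            left += 1
        right = left + len(r)  # SNPs/deletions span the removed ref bases
        # mut_type labels here are illustrative, not the project's vocabulary.
        mut_type = "SNP" if len(r) == len(a) == 1 else ("INS" if not r else "DEL")
        records.append(dict(chr=chrom, left=left, right=right, strand="+",
                            AL1=al1, AL2=al2, ref=r, alt=a,
                            mut_type=mut_type, length=len(a) - len(r)))
    return records

# Example: a phased SNP at 1-based position 1000, sample genotype 0|1.
print(vcf_to_gdm("chrX", 1000, "A", "G", "0|1"))
# -> [{'chr': 'chrX', 'left': 999, 'right': 1000, ..., 'ref': 'A', 'alt': 'G', ...}]
```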
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Excel workbook containing all the raw data: the raw data from Google Forms, the data after cleaning, the data after cleaning and coding, and the data after outliers were removed. It also includes the coding system used.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
It contains all the necessary data to replicate the statistical and regression results presented in this paper. (XLSX)
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The 2014-15 Budget is officially available at budget.gov.au as the authoritative source of Budget Papers (BPs) and Portfolio Budget Statement (PBS) documents. This dataset is a collection of data sources from the 2014-15 Budget, including:
Data from the 2014-15 Budget are provided to assist those who wish to analyse, visualise and programmatically access the 2014-15 Budget. This is the first time this has been done, as described in our announcement blog post. We intend to move further down the digital-by-default route to make the 2015-16 Budget more accessible and reusable in data form. We welcome your feedback and comments below. Data users should refer to footnotes and memoranda in the original files as these are not usually captured in machine-readable CSVs.
This dataset was prepared by the Department of Finance and the Department of the Treasury.
The PBS Excel files published should include the following financial tables with headings and footnotes. Only the line item data (table 2.2) is available in CSV at this stage as we thought this would be the most useful PBS data to extract. Much of the other data is also available in the Budget Papers 1 and 4 in aggregate form:
Please note, the total expenses reported in the CSV file ‘2014-15 PBS line items dataset’ were prepared from individual agency programme expense tables. Totalling these figures does not produce the total expense figure in ‘Table 1: Estimates of general government expenses’ (Statement 6, Budget Paper 1). The differences relate to:
Intra-agency charging for services, which is eliminated for the reporting of general government financial statements;
Agency expenses that involve revaluation of assets and liabilities are reported as other economic flows in general government financial statements; and
Additional agencies’ expenses are included in general government sector expenses (e.g. Australian Strategic Policy Institute Limited and other entities) noting that only agencies that are directly government funded are required to prepare a PBS.
At this stage, the following Portfolios have contributed their PBS Excel files and are included in the line item CSV: 1.1 Agriculture Portfolio; 1.2 Attorney-General’s Portfolio; 1.3 Communications Portfolio; 1.4A Defence Portfolio; 1.4B Defence Portfolio (Department of Veterans’ Affairs); 1.5 Education Portfolio; 1.6 Employment Portfolio; 1.7 Environment Portfolio; 1.8 Finance Portfolio; 1.9 Foreign Affairs and Trade Portfolio; 1.10 Health Portfolio; 1.11 Immigration and Border Protection Portfolio; 1.12 Industry Portfolio; 1.13 Infrastructure and Regional Development Portfolio; 1.14 Prime Minister and Cabinet Portfolio; 1.15A Social Services Portfolio; 1.15B Social Services Portfolio (Department of Human Services); 1.16 Treasury Portfolio; 1.17A Department of the House of Representatives; 1.17B Department of the Senate; 1.17C Department of Parliamentary Services; and 1.17D Department of the Parliamentary Budget Office.
The original PBS Excel files and published documents include sub-totals and totals by agency and appropriation type; these are not included in the line item CSV as they can be calculated programmatically. Where modifications are identified, they will be updated as required. If a corrigendum to an agency's PBS is issued after budget night, tables will be updated as necessary.
Below is the structure of the line item CSV. The data transformation is expected to be complete by midday 14 May, so we have put up the incomplete CSV, which will be updated as additional PBSs are transformed into data form. Please keep refreshing for now.
Portfolio, Department/Agency, Outcome, Program, Expense type, Appropriation type, Description, 2012-13, 2013-14, 2014-15, 2015-16, 2016-17, Source document, Source table, URL
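Given that header, a short pandas sketch for working with the line item CSV, e.g. totalling the 2014-15 estimates by Portfolio (the local file name is an assumption, and remember the caveats above: agency totals recomputed this way will not match Budget Paper 1):

```python
import pandas as pd

# Local file name assumed; the published CSV is linked from this dataset.
items = pd.read_csv("2014-15-pbs-line-items.csv")

# Figures may be published as text with thousands separators, so coerce
# them to numbers first (non-numeric cells become NaN).
items["2014-15"] = pd.to_numeric(
    items["2014-15"].astype(str).str.replace(",", "", regex=False),
    errors="coerce")

# Sub-totals are deliberately excluded from the CSV, so recompute them here.
by_portfolio = (items.groupby("Portfolio")["2014-15"]
                     .sum()
                     .sort_values(ascending=False))
print(by_portfolio.head(10))
```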
We have made a number of data tables from Budget Papers 1 and 4 available in their original format as Excel or XML files. We have transformed a number of these into machine readable format (as prioritised by several users of budget data) which will be published here as they are ready. Below is the list of the tables published and whether we’ve translated them into CSV form this year:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 2. Example of transformed metadata: In this .xlsx (MS Excel) file, we list all the output metadata categories generated for each sample from the transformation of the 1KGP input datasets. The output metadata include information collected from all the four 1KGP metadata files considered. Some categories are not reported in the source metadata files—they are identified by the label manually_curated_...—and were added by the developed pipeline to store technical details (e.g., download date, the md5 hash of the source file, file size, etc.) and information derived from the knowledge of the source, such as the species, the processing pipeline used in the source and the health status. For every information category, the table reports a possible value. The third column (cardinality > 1) tells whether the same key can appear multiple times in the output GDM metadata file. This is used to represent multi-valued metadata categories; for example, in a GDM metadata file, the key manually_curated_chromosome appears once for every chromosome mutated by the variants of the sample.
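As background for the cardinality column: a GDM metadata file is a flat list of key-value pairs in which a key with cardinality > 1 simply appears on multiple lines. A minimal Python sketch of reading such a file, assuming a tab-separated key/value layout:

```python
from collections import defaultdict

def read_gdm_metadata(path):
    """Read a GDM metadata file with one key<TAB>value pair per line.
    Keys with cardinality > 1 accumulate a list of values."""
    meta = defaultdict(list)
    with open(path) as fh:
        for line in fh:
            key, value = line.rstrip("\n").split("\t", 1)
            meta[key].append(value)
    return dict(meta)

# e.g. meta["manually_curated_chromosome"] might be ["chr1", "chr2", "chrX"]:
# one entry per chromosome mutated by the sample's variants.
```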
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This document compiles raw data used in the aerosol to vesicle transformation study carried out by Serge Nader et al. For detailed information and context, refer to the main article and its supplementary material published in ACS Earth and Space Chemistry.
The Excel file contains data relevant to each figure in the main article and supporting information. The additional compressed file contains raw Transmission Electron Microscopy (TEM) photographs.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The rapid advancement of additive manufacturing (AM) requires researchers to keep pace by continually improving AM processes. Improving manufacturing processes involves evaluating the process outputs and their conformity to the required specifications. Process capability indices, calculated using critical quality characteristics (QCs), have long been used in this evaluation due to their proven effectiveness. AM processes typically involve multiple correlated critical QCs, so a multivariate process capability index (MPCI) is needed; a univariate capability index may lead to misleading results. In this regard, this study proposes a general methodological framework for evaluating AM processes using an MPCI. The proposed framework starts by identifying the AM process and product design; Fused Deposition Modeling (FDM) is chosen for this investigation. Then, the specification limits associated with the critical QCs are established. To ensure that the MPCI assumptions are met, the critical QC data are examined for normality, stability, and correlation. The MPCI is then estimated by simulating a large sample from the properties of the collected QC data and determining the percent of nonconforming (PNC). The FDM process and its capable tolerance limits are then assessed using the proposed MPCI. Finally, the study presents a sensitivity analysis of the FDM process and suggestions for improvement based on the analysis of assignable causes of variation. The results revealed that the process mean is shifted for all QCs, and that the largest variation is associated with the part diameter data. Moreover, the process data are not normally distributed, and the proposed transformation algorithm performs well in reducing data skewness. The performance of the FDM process under different designations of specification limits was also estimated; the results showed that the FDM process is incapable under all of the considered designs except very coarse specifications.
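The simulation step described above can be sketched in a few lines of Python: assume a multivariate normal for the correlated QCs, draw a large sample, count parts with any QC outside its specification limits to estimate the PNC, and convert the PNC into an equivalent capability index. The moments, limits, and the Φ⁻¹-based conversion below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# Assumed moments of three correlated QCs (e.g. length, width, diameter).
mean = np.array([10.0, 5.0, 2.5])
cov = np.array([[0.010, 0.004, 0.002],
                [0.004, 0.008, 0.001],
                [0.002, 0.001, 0.006]])

# Assumed two-sided specification limits per QC.
lsl = np.array([9.7, 4.7, 2.3])
usl = np.array([10.3, 5.3, 2.7])

# Simulate a large sample; a part is nonconforming if ANY of its QCs
# falls outside its specification limits.
sample = rng.multivariate_normal(mean, cov, size=1_000_000)
conforming = np.all((sample >= lsl) & (sample <= usl), axis=1)
pnc = 1.0 - conforming.mean()

# One common PNC-based index: the univariate capability value that would
# yield the same nonconforming rate, MPCI = (1/3) * Phi^-1(1 - PNC/2).
mpci = norm.ppf(1.0 - pnc / 2.0) / 3.0
print(f"PNC = {pnc:.4%}, MPCI ~ {mpci:.3f}")
```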
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The 2015-16 Budget is officially available at budget.gov.au as the authoritative source of Budget Papers and Portfolio Budget Statement (PBS) documents. This dataset is a collection of data sources from the 2015-16 Budget, including:
Data from the 2015-16 Budget are provided to assist those who wish to analyse, visualise and programmatically access the 2015-16 Budget.
Data users should refer to footnotes and memoranda in the original files as these are not usually captured in machine readable CSVs.
We welcome your feedback and comments below.
This dataset was prepared by the Department of Finance and the Department of the Treasury.
The PBS Excel files published should include the following financial tables with headings and footnotes. Only the line item data (table 2.2) is available in CSV at this stage. Much of the other data is also available in the Budget Papers 1 and 4 in aggregate form:
Please note, the total expenses reported in the CSV file ‘2015-16 PBS line items dataset’ were prepared from individual entity programme expense tables. Totalling these figures does not produce the total expense figure in ‘Table 1: Estimates of General Government Expenses’ (Statement 6, Budget Paper 1).
Differences relate to:
The original PBS Excel files and published documents include sub-totals and totals by entity and appropriation type which are not included in the line item CSV. These can be calculated programmatically. Where modifications are identified they will be updated as required.
If a corrigendum to an entity's PBS is issued after budget night, tables will be updated as necessary.
The structure of the line item CSV is:
The data transformation is expected to be complete by midday 13 May. We may put up an incomplete CSV which will continue to be updated as additional PBSs are transformed into data form.
The following Portfolios are included in the line item CSV:
We have made a number of data tables from the Budget Papers available in Excel and CSV formats.
Below is the list of the tables published and whether we’ve translated them into CSV form this year:
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Excel workbook with data collected and collated from a speech-in-noise test, and associated statistical analyses.
Background music is cited as one of the four main problems that listeners complain about with regard to foreground speech audibility in broadcast audio. Yet broadcasters, such as the BBC, provide only limited guidance to content producers regarding background music: for example, turn the background music down by 3 dB when co-present with foreground speech, and avoid lyrics and heavily percussive beats.
This quantitative, subjective listening experiment investigated whether or not background music arrangement and tempo have any effect on foreground speech intelligibility, such that additional broadcasting guidelines can be written if there are any genuine effects.
Full details of the listening experiment, results and analyses are reported in the PhD thesis by P. Demonte (2022).
KEY
5 x background music pieces (created with Apple Loops in GarageBand):
- M1: legato string quartet
- M2: solo cello; single note in a bowed, staccato style
- M3: cello + lightly percussive instrumentation
- M4: cello + heavily percussive instrumentation
- M5: the speech-shaped noise control condition (see below)
3 x tempi:
- T1: 60 beats per minute (bpm)
- T2: 100 bpm
- T3: 140 bpm
Control condition: M5_T0 - speech-shaped noise, a purely energetic masker of speech, for comparison against the music arrangement effects.
This speech-in-noise test used the R-SPIN speech corpus, which contains end-of-sentence target words in two semantic levels:
2 x spoken-sentence semantic levels:
- HP: high predictability, e.g. "His plan meant taking a big RISK."
- LP: low predictability, e.g. "He wants to talk about the RISK."
PID = (anonymised) participant ID #
Spreadsheet pages
total_CWS: Split into several tables, including:
Music_Tempo_Pred: Word recognition percentages by participant and combination of the independent variables (music arrangement, tempo, and semantic level of sentence predictability), excluding the control conditions M5_T0_HP and M5_T0_LP. Statistical analyses were conducted using IBM SPSS, including:
- checks of the criteria for using 3-way repeated measures ANOVA;
Since not all of the criteria for use of 3-way RMANOVA are fulfilled, and the outcomes of the non-parametric testing were not useful, attempts were also made to transform the data (square-root, squared, and arcsine transformations) and statistically re-analyse them. See spreadsheet pages:
- SQRT_Transformed_MTP
- ^2_Transformed_MTP
- Arcsine_Transformed_MTP
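For reference, the three transformations can be reproduced with numpy; the arcsine (angular) transform assumes scores expressed as proportions in [0, 1], so percentages are divided by 100 first (the values below are illustrative, not the study's data):

```python
import numpy as np

# Word-recognition scores in percent (made-up example values).
scores_pct = np.array([62.5, 80.0, 95.0, 47.5, 100.0])
p = scores_pct / 100.0  # convert to proportions in [0, 1]

sqrt_t = np.sqrt(p)                # square-root transformation
squared_t = p ** 2                 # squared transformation
arcsine_t = np.arcsin(np.sqrt(p))  # arcsine (angular) transformation, radians

print(np.round(arcsine_t, 3))
```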
The spreadsheet pages thereafter group the data in different ways for 2-way and 1-way RMANOVA statistical analyses:
- Tempo-Pred: summation across all background music pieces;
- Music: summation across all tempi and semantic levels;
- Tempo: summation across all music pieces and semantic levels;
- SentencePredictability: summation across all music pieces and tempi
The final page in this Excel workbook, 'Deleted_Test', contains data that were collected in an initial version of the listening experiment but not used towards the thesis. A quality check revealed that although all participants had completed the same total number of trials, there had been an imbalance in the number of trials per combination of independent variables. The problem was rectified in order to then conduct the listening experiment correctly. These 'Deleted_Test' data have nevertheless been retained on this page of the Excel workbook so that a researcher with more in-depth knowledge of other statistical methods may one day be able to analyse them for comparison.