CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data organization for the figures in the document: Figure 3A LineOutWithSun_SSAzi_135to225_green_Correct_ROI5_INFO.xls Figure 3b LineOutWithSun_SSAzi_m45to45_green_Correct_ROI5_INFO.xls Figure 4 fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Correct_ROI5_INFO.xls fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Sim_Correct_ROI5_INFO.xls Figure 5a LineOut_Camera_Elevation_SqAzi_m180to0_green_Sim_Correct_ROI5_INFO.xls LineOut_Camera_Elevation_SqAzi_m180to0_green_Correct_ROI5_INFO.xls Figure 5b LineOut_Camera_Elevation_SqAzi_0to180_green_Correct_ROI5_INFO.xls LineOut_Camera_Elevation_SqAzi_0to180_green_Sim_Correct_ROI5_INFO.xls Figure 6a LineOutColor_SqAzi_m180to0_CP_20to50_Correct_ROI5_INFO.xls Figure 6b LineOutROI_SqAzi_m180to0_CP_20to50_green_Correct_INFO.xls Figure 7 fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Correct_ROI5_INFO.xls LineOut_MeshAoPDif_Camera_Elevation_SqAzi_0to180_green_Correct_ROI5_INFO.xls LineOut_MeshAoPDif_Camera_Elevation_SqAzi_m180to0_green_Correct_ROI5_INFO.xls
Excel spreadsheets by species (4 letter code is abbreviation for genus and species used in study, year 2010 or 2011 is year data collected, SH indicates data for Science Hub, date is date of file preparation). The data in a file are described in a read me file which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read me description of the column in the date set for chemical analysis. In this file one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk , D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee , and M. Plocher. Plant reproduction is altered by simulated herbicide drift toconstructed plant communities. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY. Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA, 36(10): 2799-2813, (2017).
https://cdla.io/sharing-1-0/https://cdla.io/sharing-1-0/
This dataset was created by MICADEE
This dataset was created by Jyoti Patil
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Akshita khare
Released under Apache 2.0
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Sample data for exercises in Further Adventures in Data Cleaning.
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Pravender Chand
Released under CC0: Public Domain
This dataset was created by MICADEE
This dataset was created by Ng Choon Yan
It completely data clean excel file to attain accurate data analysis with proper visualization
ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Dataset Overview: This dataset pertains to the examination results of students who participated in a series of academic assessments at a fictitious educational institution named "University of Exampleville." The assessments were administered across various courses and academic levels, with a focus on evaluating students' performance in general management and domain-specific topics.
Columns: The dataset comprises 12 columns, each representing specific attributes and performance indicators of the students. These columns encompass information such as the students' names (which have been anonymized), their respective universities, academic program names (including BBA and MBA), specializations, the semester of the assessment, the type of examination domain (general management or domain-specific), general management scores (out of 50), domain-specific scores (out of 50), total scores (out of 100), student ranks, and percentiles.
Data Collection: The examination data was collected during a standardized assessment process conducted by the University of Exampleville. The exams were designed to assess students' knowledge and skills in general management and their chosen domain-specific subjects. It involved students from both BBA and MBA programs who were in their final year of study.
Data Format: The dataset is available in a structured format, typically as a CSV file. Each row represents a unique student's performance in the examination, while columns contain specific information about their results and academic details.
Data Usage: This dataset is valuable for analyzing and gaining insights into the academic performance of students pursuing BBA and MBA degrees. It can be used for various purposes, including statistical analysis, performance trend identification, program assessment, and comparison of scores across domains and specializations. Furthermore, it can be employed in predictive modeling or decision-making related to curriculum development and student support.
Data Quality: The dataset has undergone preprocessing and anonymization to protect the privacy of individual students. Nevertheless, it is essential to use the data responsibly and in compliance with relevant data protection regulations when conducting any analysis or research.
Data Format: The exam data is typically provided in a structured format, commonly as a CSV (Comma-Separated Values) file. Each row in the dataset represents a unique student's examination performance, and each column contains specific attributes and scores related to the examination. The CSV format allows for easy import and analysis using various data analysis tools and programming languages like Python, R, or spreadsheet software like Microsoft Excel.
Here's a column-wise description of the dataset:
Name OF THE STUDENT: The full name of the student who took the exam. (Anonymized)
UNIVERSITY: The university where the student is enrolled.
PROGRAM NAME: The name of the academic program in which the student is enrolled (BBA or MBA).
Specialization: If applicable, the specific area of specialization or major that the student has chosen within their program.
Semester: The semester or academic term in which the student took the exam.
Domain: Indicates whether the exam was divided into two parts: general management and domain-specific.
GENERAL MANAGEMENT SCORE (OUT of 50): The score obtained by the student in the general management part of the exam, out of a maximum possible score of 50.
Domain-Specific Score (Out of 50): The score obtained by the student in the domain-specific part of the exam, also out of a maximum possible score of 50.
TOTAL SCORE (OUT of 100): The total score obtained by adding the scores from the general management and domain-specific parts, out of a maximum possible score of 100.
This dataset was created by Ajayvardhan
In this dataset, you will find three Excel files (store data, customer data, sale data).
This dataset is suitable for practicing EDA and finding relationships within tables and features.
I obtained this dataset from the YouTube channel "https://www.youtube.com/@AlexTheAnalyst">Alex The Analyst
Market basket analysis with Apriori algorithm
The retailer wants to target customers with suggestions on itemset that a customer is most likely to purchase .I was given dataset contains data of a retailer; the transaction data provides data around all the transactions that have happened over a period of time. Retailer will use result to grove in his industry and provide for customer suggestions on itemset, we be able increase customer engagement and improve customer experience and identify customer behavior. I will solve this problem with use Association Rules type of unsupervised learning technique that checks for the dependency of one data item on another data item.
Association Rule is most used when you are planning to build association in different objects in a set. It works when you are planning to find frequent patterns in a transaction database. It can tell you what items do customers frequently buy together and it allows retailer to identify relationships between the items.
Assume there are 100 customers, 10 of them bought Computer Mouth, 9 bought Mat for Mouse and 8 bought both of them. - bought Computer Mouth => bought Mat for Mouse - support = P(Mouth & Mat) = 8/100 = 0.08 - confidence = support/P(Mat for Mouse) = 0.08/0.09 = 0.89 - lift = confidence/P(Computer Mouth) = 0.89/0.10 = 8.9 This just simple example. In practice, a rule needs the support of several hundred transactions, before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
Number of Attributes: 7
https://user-images.githubusercontent.com/91852182/145270162-fc53e5a3-4ad1-4d06-b0e0-228aabcf6b70.png">
First, we need to load required libraries. Shortly I describe all libraries.
https://user-images.githubusercontent.com/91852182/145270210-49c8e1aa-9753-431b-a8d5-99601bc76cb5.png">
Next, we need to upload Assignment-1_Data. xlsx to R to read the dataset.Now we can see our data in R.
https://user-images.githubusercontent.com/91852182/145270229-514f0983-3bbb-4cd3-be64-980e92656a02.png">
https://user-images.githubusercontent.com/91852182/145270251-6f6f6472-8817-435c-a995-9bc4bfef10d1.png">
After we will clear our data frame, will remove missing values.
https://user-images.githubusercontent.com/91852182/145270286-05854e1a-2b6c-490e-ab30-9e99e731eacb.png">
To apply Association Rule mining, we need to convert dataframe into transaction data to make all items that are bought together in one invoice will be in ...
Perform this analysis on 2 excel files ‘Unstructured Data English.xlsx’ and ‘Unstructured Data Japanese.xls
Task: 1. Generate code for data cleansing in Python 2. Write a SQL Query/python to generate top 10 frequently occurring meaningful phrases in the data set 3. Using above cleansed data create a term document matrix for the given unstructured data. Use Python 3.7. The output should be in excel format explained as below,
S.no. (Unique identifier of a comment)
Token 1 2 3 4 so on…….. Unique identifier of each comment
Fire 0 0 3 0 ………. Token
Bad 2 0 1 8 ………. Number of times a token appears in a comment
Hate 5 2 2 0 ……….
Shine 1 1 2 3 ……….
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Vrinda Store: Interactive Ms Excel dashboardVrinda Store: Interactive Ms Excel dashboard Feb 2024 - Mar 2024Feb 2024 - Mar 2024 The owner of Vrinda store wants to create an annual sales report for 2022. So that their employees can understand their customers and grow more sales further. Questions asked by Owner of Vrinda store are as follows:- 1) Compare the sales and orders using single chart. 2) Which month got the highest sales and orders? 3) Who purchased more - women per men in 2022? 4) What are different order status in 2022?
And some other questions related to business. The owner of Vrinda store wanted a visual story of their data. Which can depict all the real time progress and sales insight of the store. This project is a Ms Excel dashboard which presents an interactive visual story to help the Owner and employees in increasing their sales. Task performed : Data cleaning, Data processing, Data analysis, Data visualization, Report. Tool used : Ms Excel The owner of Vrinda store wants to create an annual sales report for 2022. So that their employees can understand their customers and grow more sales further. Questions asked by Owner of Vrinda store are as follows:- 1) Compare the sales and orders using single chart. 2) Which month got the highest sales and orders? 3) Who purchased more - women per men in 2022? 4) What are different order status in 2022? And some other questions related to business. The owner of Vrinda store wanted a visual story of their data. Which can depict all the real time progress and sales insight of the store. This project is a Ms Excel dashboard which presents an interactive visual story to help the Owner and employees in increasing their sales. Task performed : Data cleaning, Data processing, Data analysis, Data visualization, Report. Tool used : Ms Excel Skills: Data Analysis · Data Analytics · ms excel · Pivot Tables
This dataset was created by Srushti Dongre
It contains the following files:
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This Excel-based dashboard visualizes Amazon sales performance using KPIs such as Total Sales, Profit, and Quantity Sold. Built with Pivot Tables, Slicers, and Charts, the dashboard includes regional breakdowns, product performance, and trend analysis. Suitable for beginners learning Sales Operations and Excel Dashboarding.
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
A company has a fleet of devices transmitting daily sensor readings. They would like to create a predictive maintenance solution to proactively identify when maintenance should be performed. This approach promises cost savings over routine or time based preventive maintenance, because tasks are performed only when warranted.
The task is to build a predictive model using machine learning to predict the probability of a device failure. When building this model, be sure to minimize false positives and false negatives. The column you are trying to Predict is called failure with binary value 0 for non-failure and 1 for failure.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Data organization for the figures in the document: Figure 3A LineOutWithSun_SSAzi_135to225_green_Correct_ROI5_INFO.xls Figure 3b LineOutWithSun_SSAzi_m45to45_green_Correct_ROI5_INFO.xls Figure 4 fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Correct_ROI5_INFO.xls fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Sim_Correct_ROI5_INFO.xls Figure 5a LineOut_Camera_Elevation_SqAzi_m180to0_green_Sim_Correct_ROI5_INFO.xls LineOut_Camera_Elevation_SqAzi_m180to0_green_Correct_ROI5_INFO.xls Figure 5b LineOut_Camera_Elevation_SqAzi_0to180_green_Correct_ROI5_INFO.xls LineOut_Camera_Elevation_SqAzi_0to180_green_Sim_Correct_ROI5_INFO.xls Figure 6a LineOutColor_SqAzi_m180to0_CP_20to50_Correct_ROI5_INFO.xls Figure 6b LineOutROI_SqAzi_m180to0_CP_20to50_green_Correct_INFO.xls Figure 7 fulllinear_inDic_SqAzi_m180to0_CP_20to50_green_Correct_ROI5_INFO.xls LineOut_MeshAoPDif_Camera_Elevation_SqAzi_0to180_green_Correct_ROI5_INFO.xls LineOut_MeshAoPDif_Camera_Elevation_SqAzi_m180to0_green_Correct_ROI5_INFO.xls