100+ datasets found

Ecommerce Dataset for Data Analysis
kaggle.com
zip
Updated Sep 19, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Shrishti Manja (2024). Ecommerce Dataset for Data Analysis [Dataset]. https://www.kaggle.com/datasets/shrishtimanja/ecommerce-dataset-for-data-analysis/code
Explore at:
zip(2028853 bytes)Available download formats
Dataset updated
Sep 19, 2024
Authors
Shrishti Manja
Description
This dataset contains 55,000 entries of synthetic customer transactions, generated using Python's Faker library. The goal behind creating this dataset was to provide a resource for learners like myself to explore, analyze, and apply various data analysis techniques in a context that closely mimics real-world data.

About the Dataset: - CID (Customer ID): A unique identifier for each customer. - TID (Transaction ID): A unique identifier for each transaction. - Gender: The gender of the customer, categorized as Male or Female. - Age Group: Age group of the customer, divided into several ranges. - Purchase Date: The timestamp of when the transaction took place. - Product Category: The category of the product purchased, such as Electronics, Apparel, etc. - Discount Availed: Indicates whether the customer availed any discount (Yes/No). - Discount Name: Name of the discount applied (e.g., FESTIVE50). - Discount Amount (INR): The amount of discount availed by the customer. - Gross Amount: The total amount before applying any discount. - Net Amount: The final amount after applying the discount. - Purchase Method: The payment method used (e.g., Credit Card, Debit Card, etc.). - Location: The city where the purchase took place.

Use Cases: 1. Exploratory Data Analysis (EDA): This dataset is ideal for conducting EDA, allowing users to practice techniques such as summary statistics, visualizations, and identifying patterns within the data. 2. Data Preprocessing and Cleaning: Learners can work on handling missing data, encoding categorical variables, and normalizing numerical values to prepare the dataset for analysis. 3. Data Visualization: Use tools like Python’s Matplotlib, Seaborn, or Power BI to visualize purchasing trends, customer demographics, or the impact of discounts on purchase amounts. 4. Machine Learning Applications: After applying feature engineering, this dataset is suitable for supervised learning models, such as predicting whether a customer will avail a discount or forecasting purchase amounts based on the input features.

This dataset provides an excellent sandbox for honing skills in data analysis, machine learning, and visualization in a structured but flexible manner.

This is not a real dataset. This dataset was generated using Python's Faker library for the sole purpose of learning
Z
Data Analysis for the Systematic Literature Review of DL4SE
data.niaid.nih.gov
data-staging.niaid.nih.gov
Updated Jul 19, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Cody Watson; Nathan Cooper; David Nader; Kevin Moran; Denys Poshyvanyk (2024). Data Analysis for the Systematic Literature Review of DL4SE [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4768586
Explore at:
Dataset updated
Jul 19, 2024
Dataset provided by
College of William and Mary
Washington and Lee University
Authors
Cody Watson; Nathan Cooper; David Nader; Kevin Moran; Denys Poshyvanyk
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data Analysis is the process that supports decision-making and informs arguments in empirical studies. Descriptive statistics, Exploratory Data Analysis (EDA), and Confirmatory Data Analysis (CDA) are the approaches that compose Data Analysis (Xia & Gong; 2014). An Exploratory Data Analysis (EDA) comprises a set of statistical and data mining procedures to describe data. We ran EDA to provide statistical facts and inform conclusions. The mined facts allow attaining arguments that would influence the Systematic Literature Review of DL4SE.

The Systematic Literature Review of DL4SE requires formal statistical modeling to refine the answers for the proposed research questions and formulate new hypotheses to be addressed in the future. Hence, we introduce DL4SE-DA, a set of statistical processes and data mining pipelines that uncover hidden relationships among Deep Learning reported literature in Software Engineering. Such hidden relationships are collected and analyzed to illustrate the state-of-the-art of DL techniques employed in the software engineering context.

Our DL4SE-DA is a simplified version of the classical Knowledge Discovery in Databases, or KDD (Fayyad, et al; 1996). The KDD process extracts knowledge from a DL4SE structured database. This structured database was the product of multiple iterations of data gathering and collection from the inspected literature. The KDD involves five stages:

Selection. This stage was led by the taxonomy process explained in section xx of the paper. After collecting all the papers and creating the taxonomies, we organize the data into 35 features or attributes that you find in the repository. In fact, we manually engineered features from the DL4SE papers. Some of the features are venue, year published, type of paper, metrics, data-scale, type of tuning, learning algorithm, SE data, and so on.

Preprocessing. The preprocessing applied was transforming the features into the correct type (nominal), removing outliers (papers that do not belong to the DL4SE), and re-inspecting the papers to extract missing information produced by the normalization process. For instance, we normalize the feature “metrics” into “MRR”, “ROC or AUC”, “BLEU Score”, “Accuracy”, “Precision”, “Recall”, “F1 Measure”, and “Other Metrics”. “Other Metrics” refers to unconventional metrics found during the extraction. Similarly, the same normalization was applied to other features like “SE Data” and “Reproducibility Types”. This separation into more detailed classes contributes to a better understanding and classification of the paper by the data mining tasks or methods.

Transformation. In this stage, we omitted to use any data transformation method except for the clustering analysis. We performed a Principal Component Analysis to reduce 35 features into 2 components for visualization purposes. Furthermore, PCA also allowed us to identify the number of clusters that exhibit the maximum reduction in variance. In other words, it helped us to identify the number of clusters to be used when tuning the explainable models.

Data Mining. In this stage, we used three distinct data mining tasks: Correlation Analysis, Association Rule Learning, and Clustering. We decided that the goal of the KDD process should be oriented to uncover hidden relationships on the extracted features (Correlations and Association Rules) and to categorize the DL4SE papers for a better segmentation of the state-of-the-art (Clustering). A clear explanation is provided in the subsection “Data Mining Tasks for the SLR od DL4SE”. 5.Interpretation/Evaluation. We used the Knowledge Discover to automatically find patterns in our papers that resemble “actionable knowledge”. This actionable knowledge was generated by conducting a reasoning process on the data mining outcomes. This reasoning process produces an argument support analysis (see this link).

We used RapidMiner as our software tool to conduct the data analysis. The procedures and pipelines were published in our repository.

Overview of the most meaningful Association Rules. Rectangles are both Premises and Conclusions. An arrow connecting a Premise with a Conclusion implies that given some premise, the conclusion is associated. E.g., Given that an author used Supervised Learning, we can conclude that their approach is irreproducible with a certain Support and Confidence.

Support = Number of occurrences this statement is true divided by the amount of statements Confidence = The support of the statement divided by the number of occurrences of the premise
Exploratory data analysis of a clinical study group: Development of a...
plos.figshare.com
txt
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Bogumil M. Konopka; Felicja Lwow; Magdalena Owczarz; Łukasz Łaczmański (2023). Exploratory data analysis of a clinical study group: Development of a procedure for exploring multidimensional data [Dataset]. http://doi.org/10.1371/journal.pone.0201950
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0201950
Dataset updated
May 31, 2023
Dataset provided by
PLOShttp://plos.org/
Authors
Bogumil M. Konopka; Felicja Lwow; Magdalena Owczarz; Łukasz Łaczmański
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Thorough knowledge of the structure of analyzed data allows to form detailed scientific hypotheses and research questions. The structure of data can be revealed with methods for exploratory data analysis. Due to multitude of available methods, selecting those which will work together well and facilitate data interpretation is not an easy task. In this work we present a well fitted set of tools for a complete exploratory analysis of a clinical dataset and perform a case study analysis on a set of 515 patients. The proposed procedure comprises several steps: 1) robust data normalization, 2) outlier detection with Mahalanobis (MD) and robust Mahalanobis distances (rMD), 3) hierarchical clustering with Ward’s algorithm, 4) Principal Component Analysis with biplot vectors. The analyzed set comprised elderly patients that participated in the PolSenior project. Each patient was characterized by over 40 biochemical and socio-geographical attributes. Introductory analysis showed that the case-study dataset comprises two clusters separated along the axis of sex hormone attributes. Further analysis was carried out separately for male and female patients. The most optimal partitioning in the male set resulted in five subgroups. Two of them were related to diseased patients: 1) diabetes and 2) hypogonadism patients. Analysis of the female set suggested that it was more homogeneous than the male dataset. No evidence of pathological patient subgroups was found. In the study we showed that outlier detection with MD and rMD allows not only to identify outliers, but can also assess the heterogeneity of a dataset. The case study proved that our procedure is well suited for identification and visualization of biologically meaningful patient subgroups.
Youtube cookery channels viewers comments in Hinglish
zenodo.org
csv
Updated Jan 24, 2020
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Abhishek Kaushik; Abhishek Kaushik; Gagandeep Kaur; Gagandeep Kaur (2020). Youtube cookery channels viewers comments in Hinglish [Dataset]. http://doi.org/10.5281/zenodo.2841848
Explore at:
csvAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.2841848
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Abhishek Kaushik; Abhishek Kaushik; Gagandeep Kaur; Gagandeep Kaur
License
Open Data Commons Attribution License (ODC-By) v1.0https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Area covered
YouTube
Description
The data was collected from the famous cookery Youtube channels in India. The major focus was to collect the viewers' comments in Hinglish languages. The datasets are taken from top 2 Indian cooking channel named Nisha Madhulika channel and Kabita’s Kitchen channel.

Both the datasets comments are divided into seven categories:-

Label 1- Gratitude

Label 2- About the recipe

Label 3- About the video

Label 4- Praising

Label 5- Hybrid

Label 6- Undefined

Label 7- Suggestions and queries

All the labelling has been done manually.

Nisha Madhulika dataset:

Dataset characteristics: Multivariate

Number of instances: 4900

Area: Cooking

Attribute characteristics: Real

Number of attributes: 3

Date donated: March, 2019

Associate tasks: Classification

Missing values: Null

Kabita Kitchen dataset:

Dataset characteristics: Multivariate

Number of instances: 4900

Area: Cooking

Attribute characteristics: Real

Number of attributes: 3

Date donated: March, 2019

Associate tasks: Classification

Missing values: Null

There are two separate datasets file of each channel named as preprocessing and main file .

The files with preprocessing names are generated after doing the preprocessing and exploratory data analysis on both the datasets. This file includes:

Id

Comment text

Labels

Count of stop-words

Uppercase words

Hashtags

Word count

Char count

Average words

Numeric

The main file includes:

Id

comment text

Labels

Please cite the paper

https://www.mdpi.com/2504-2289/3/3/37

MDPI and ACS Style

Kaur, G.; Kaushik, A.; Sharma, S. Cooking Is Creating Emotion: A Study on Hinglish Sentiments of Youtube Cookery Channels Using Semi-Supervised Approach. Big Data Cogn. Comput. 2019, 3, 37.
Electronic Store Sales Data
kaggle.com
zip
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Saumay Dhaundiyal (2023). Electronic Store Sales Data [Dataset]. https://www.kaggle.com/saumaydhaundiyal/electronic-store-sales-data
Explore at:
zip(4996940 bytes)Available download formats
Dataset updated
Jun 1, 2023
Authors
Saumay Dhaundiyal
Description
***The dataset contains sales data for an electronic store for 12 month in 12 different csv files.* **

Your are expected to pre-process, clean data and perform EDA on it.

Questions the Business Owner would like answered.

Question 1: What was the best month for sales? Question2: Which city sold the most product? Question3: What time should we display advertisements to maximize likelihood of customer's buying product? Question4: What products are most often sold together? Question5: What product sold the most?
Orange dataset table
figshare.com
xlsx
Updated Mar 4, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Rui Simões (2022). Orange dataset table [Dataset]. http://doi.org/10.6084/m9.figshare.19146410.v1
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.19146410.v1
Dataset updated
Mar 4, 2022
Dataset provided by
figshare
Figsharehttp://figshare.com/
Authors
Rui Simões
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control, 6.25, 12.5, 25 and 50 µM for 6-OHDA and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced, it does not contain any missing values and data was standardized across features. The small number of samples prevented a full and strong statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.

Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) with instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using the Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments were performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure and area under the ROC curve (AUC) metrics.
f
Data from: The Often-Overlooked Power of Summary Statistics in Exploratory...
acs.figshare.com
xlsx
Updated Jun 8, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Tahereh G. Avval; Behnam Moeini; Victoria Carver; Neal Fairley; Emily F. Smith; Jonas Baltrusaitis; Vincent Fernandez; Bonnie. J. Tyler; Neal Gallagher; Matthew R. Linford (2023). The Often-Overlooked Power of Summary Statistics in Exploratory Data Analysis: Comparison of Pattern Recognition Entropy (PRE) to Other Summary Statistics and Introduction of Divided Spectrum-PRE (DS-PRE) [Dataset]. http://doi.org/10.1021/acs.jcim.1c00244.s002
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.1021/acs.jcim.1c00244.s002
Dataset updated
Jun 8, 2023
Dataset provided by
ACS Publications
Authors
Tahereh G. Avval; Behnam Moeini; Victoria Carver; Neal Fairley; Emily F. Smith; Jonas Baltrusaitis; Vincent Fernandez; Bonnie. J. Tyler; Neal Gallagher; Matthew R. Linford
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Unsupervised exploratory data analysis (EDA) is often the first step in understanding complex data sets. While summary statistics are among the most efficient and convenient tools for exploring and describing sets of data, they are often overlooked in EDA. In this paper, we show multiple case studies that compare the performance, including clustering, of a series of summary statistics in EDA. The summary statistics considered here are pattern recognition entropy (PRE), the mean, standard deviation (STD), 1-norm, range, sum of squares (SSQ), and X4, which are compared with principal component analysis (PCA), multivariate curve resolution (MCR), and/or cluster analysis. PRE and the other summary statistics are direct methods for analyzing datathey are not factor-based approaches. To quantify the performance of summary statistics, we use the concept of the “critical pair,” which is employed in chromatography. The data analyzed here come from different analytical methods. Hyperspectral images, including one of a biological material, are also analyzed. In general, PRE outperforms the other summary statistics, especially in image analysis, although a suite of summary statistics is useful in exploring complex data sets. While PRE results were generally comparable to those from PCA and MCR, PRE is easier to apply. For example, there is no need to determine the number of factors that describe a data set. Finally, we introduce the concept of divided spectrum-PRE (DS-PRE) as a new EDA method. DS-PRE increases the discrimination power of PRE. We also show that DS-PRE can be used to provide the inputs for the k-nearest neighbor (kNN) algorithm. We recommend PRE and DS-PRE as rapid new tools for unsupervised EDA.
f
Detailed characterization of the dataset.
figshare.com
xls
Updated Sep 26, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Rodrigo Gutiérrez Benítez; Alejandra Segura Navarrete; Christian Vidal-Castro; Claudia Martínez-Araneda (2024). Detailed characterization of the dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0310707.t006
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0310707.t006
Dataset updated
Sep 26, 2024
Dataset provided by
PLOS ONE
Authors
Rodrigo Gutiérrez Benítez; Alejandra Segura Navarrete; Christian Vidal-Castro; Claudia Martínez-Araneda
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Over the last ten years, social media has become a crucial data source for businesses and researchers, providing a space where people can express their opinions and emotions. To analyze this data and classify emotions and their polarity in texts, natural language processing (NLP) techniques such as emotion analysis (EA) and sentiment analysis (SA) are employed. However, the effectiveness of these tasks using machine learning (ML) and deep learning (DL) methods depends on large labeled datasets, which are scarce in languages like Spanish. To address this challenge, researchers use data augmentation (DA) techniques to artificially expand small datasets. This study aims to investigate whether DA techniques can improve classification results using ML and DL algorithms for sentiment and emotion analysis of Spanish texts. Various text manipulation techniques were applied, including transformations, paraphrasing (back-translation), and text generation using generative adversarial networks, to small datasets such as song lyrics, social media comments, headlines from national newspapers in Chile, and survey responses from higher education students. The findings show that the Convolutional Neural Network (CNN) classifier achieved the most significant improvement, with an 18% increase using the Generative Adversarial Networks for Sentiment Text (SentiGan) on the Aggressiveness (Seriousness) dataset. Additionally, the same classifier model showed an 11% improvement using the Easy Data Augmentation (EDA) on the Gender-Based Violence dataset. The performance of the Bidirectional Encoder Representations from Transformers (BETO) also improved by 10% on the back-translation augmented version of the October 18 dataset, and by 4% on the EDA augmented version of the Teaching survey dataset. These results suggest that data augmentation techniques enhance performance by transforming text and adapting it to the specific characteristics of the dataset. Through experimentation with various augmentation techniques, this research provides valuable insights into the analysis of subjectivity in Spanish texts and offers guidance for selecting algorithms and techniques based on dataset features.
G
EDA with AI Market Research Report 2033
growthmarketreports.com
csv, pdf, pptx
Updated Aug 29, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Growth Market Reports (2025). EDA with AI Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/eda-with-ai-market
Explore at:
pdf, pptx, csvAvailable download formats
Dataset updated
Aug 29, 2025
Dataset authored and provided by
Growth Market Reports
Time period covered
2024 - 2032
Area covered
Global
Description
EDA with AI Market Outlook

According to our latest research, the global EDA with AI market size reached USD 7.9 billion in 2024, reflecting robust demand for advanced automation in electronic design automation (EDA) powered by artificial intelligence. The sector is experiencing a strong compound annual growth rate (CAGR) of 18.2% from 2025 to 2033. By the end of 2033, the market is forecasted to reach USD 37.2 billion, driven by the increasing complexity of semiconductor devices, rapid growth in AI-enabled chip design, and the need for faster, more efficient design cycles. These advancements are further supported by the proliferation of IoT devices and the expansion of high-performance computing, which are contributing significantly to the marketÂ’s expansion as per our latest research.

One of the primary growth factors for the EDA with AI market is the escalating complexity of semiconductor designs, which demands more sophisticated solutions for verification, simulation, and optimization. Traditional EDA tools are struggling to keep pace with the miniaturization of nodes and the integration of multi-billion transistor chips. AI-powered EDA solutions are revolutionizing the industry by automating complex tasks such as floorplanning, routing, and verification, significantly reducing time-to-market and design errors. These AI-driven tools are also enabling predictive analytics and intelligent optimization, allowing design teams to anticipate bottlenecks and improve overall productivity. As chipmakers race to develop next-generation processors for applications like autonomous vehicles, 5G, and quantum computing, the adoption of AI-enhanced EDA tools is accelerating across the globe.

Another critical growth driver is the increasing adoption of AI and machine learning across various industries, which is fueling demand for specialized hardware and custom chipsets. This trend is particularly evident in sectors such as automotive, healthcare, and consumer electronics, where smart devices and advanced driver-assistance systems (ADAS) require highly reliable and efficient silicon. The integration of AI into EDA workflows is not only improving design accuracy but also facilitating the development of application-specific integrated circuits (ASICs) and system-on-chip (SoC) solutions. Furthermore, the shift towards cloud-based EDA platforms is democratizing access to advanced design tools, enabling startups and small enterprises to compete alongside established industry players. As a result, the ecosystem for EDA with AI is becoming more vibrant and inclusive, spurring innovation at an unprecedented pace.

The third major growth factor lies in the convergence of EDA with AI and emerging technologies such as the Internet of Things (IoT), edge computing, and 5G communications. The proliferation of connected devices is driving the need for power-efficient, high-performance chips capable of real-time data processing. AI-driven EDA solutions are uniquely positioned to address these requirements by optimizing designs for power, performance, and area (PPA) metrics. Additionally, the use of AI in verification and simulation is reducing the incidence of costly design respins, thereby lowering overall development costs. Strategic collaborations between EDA vendors, semiconductor foundries, and cloud service providers are further enhancing the capabilities of AI-powered design tools, paving the way for the next wave of semiconductor innovation.

EDA Software plays a crucial role in the burgeoning EDA with AI market, as it forms the backbone of the design automation process. These software solutions are essential for managing the increasing complexity of chip designs, offering tools that automate routine tasks, enhance simulation accuracy, and enable predictive analytics. As the demand for custom and complex chips grows, the reliance on advanced EDA software will only intensify. The software's ability to incorporate machine learning algorithms that learn from historical design data, optimize layouts, and minimize errors is pivotal in maintaining competitive advantage in the fast-evolving semiconductor industry. As such, EDA Software is not just a tool but a strategic asset that drives innovation and efficiency in electronic design.

From a regional perspective, Asia Pacific continues to dominate the EDA with AI market, accounting f
Understanding Placement Factors
kaggle.com
zip
Updated Sep 25, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hetavi Ganatra (2024). Understanding Placement Factors [Dataset]. https://www.kaggle.com/datasets/hetavig/understanding-placement-factors/discussion
Explore at:
zip(1632 bytes)Available download formats
Dataset updated
Sep 25, 2024
Authors
Hetavi Ganatra
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
As a part of our Data Analytics Assignment, we did a survey on factors that affect placements of under-graduate students. The responses once filled were processed and used for performing EDA which helped us to Analyze the data and answer some crucial questions that concern the students currently pursuing their UG. Below are the Links you can access the codes for the Pre-processing and EDA from: Preprocessing: https://www.kaggle.com/code/hetavig/pre-processing EDA-1: https://www.kaggle.com/code/hetavig/exploratory-data-analysis-1 EDA-2: https://www.kaggle.com/code/hetavig/exploratory-data-analysis-2 EDA-3: https://www.kaggle.com/code/hetavig/exploratory-data-analysis-3 EDA-4: https://www.kaggle.com/code/hetavig/exploratory-data-analysis-4 EDA-5: https://www.kaggle.com/code/hetavig/exploratory-data-analysis-5
SEM regression for H1-5.
plos.figshare.com
xls
Updated Nov 4, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Daan Kolkman; Gwendolyn K. Lee; Arjen van Witteloostuijn (2024). SEM regression for H1-5. [Dataset]. http://doi.org/10.1371/journal.pone.0309318.t004
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0309318.t004
Dataset updated
Nov 4, 2024
Dataset provided by
PLOShttp://plos.org/
Authors
Daan Kolkman; Gwendolyn K. Lee; Arjen van Witteloostuijn
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Recent calls to take up data science either revolve around the superior predictive performance associated with machine learning or the potential of data science techniques for exploratory data analysis. Many believe that these strengths come at the cost of explanatory insights, which form the basis for theorization. In this paper, we show that this trade-off is false. When used as a part of a full research process, including inductive, deductive and abductive steps, machine learning can offer explanatory insights and provide a solid basis for theorization. We present a systematic five-step theory-building and theory-testing cycle that consists of: 1. Element identification (reduction); 2. Exploratory analysis (induction); 3. Hypothesis development (retroduction); 4. Hypothesis testing (deduction); and 5. Theorization (abduction). We demonstrate the usefulness of this approach, which we refer to as co-duction, in a vignette where we study firm growth with real-world observational data.

Global EDA in Industrial Electronics Market Research Report: By Application...

wiseguyreports.com

Updated Sep 15, 2025

+ more versions

Facebook

Twitter

Click to copy link

Link copied

Cite

(2025). Global EDA in Industrial Electronics Market Research Report: By Application (Automation, Control Systems, Signal Processing, Data Acquisition), By End Use (Manufacturing, Energy, Transportation, Telecommunications), By Product Type (Embedded Systems, Industrial Process Control, Industrial Communication Equipment), By Technology (Analog Electronics, Digital Electronics, Power Electronics) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/eda-in-industrial-electronic-market

Explore at:

Dataset updated

Sep 15, 2025

License

https://www.wiseguyreports.com/pages/privacy-policyhttps://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Sep 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	2.26(USD Billion)
MARKET SIZE 2025	2.45(USD Billion)
MARKET SIZE 2035	5.5(USD Billion)
SEGMENTS COVERED	Application, End Use, Product Type, Technology, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	Technological advancements, Growing automation demand, Rising electronic complexity, Increased adoption of IoT, Emergence of smart manufacturing
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Microchip Technology, Analog Devices, Synopsys, Cadence Design Systems, Texas Instruments, Infineon Technologies, Keysight Technologies, ANSYS, NXP Semiconductors, STMicroelectronics, Altium, Maxim Integrated, Rohm Semiconductor, Siemens, Broadcom, Mentor Graphics
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	AI integration for design optimization, IoT expansion in manufacturing systems, Rise in smart factory implementations, Increasing demand for energy-efficient solutions, Growth in autonomous industrial applications
COMPOUND ANNUAL GROWTH RATE (CAGR)	8.5% (2025 - 2035)

W
Wafer Fabrication EDA Tools Report
datainsightsmarket.com
doc, pdf, ppt
Updated Jul 1, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Data Insights Market (2025). Wafer Fabrication EDA Tools Report [Dataset]. https://www.datainsightsmarket.com/reports/wafer-fabrication-eda-tools-504653
Explore at:
ppt, doc, pdfAvailable download formats
Dataset updated
Jul 1, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policyhttps://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The Wafer Fabrication EDA Tools market, valued at $1667 million in 2025, is projected to experience robust growth, driven by the increasing complexity of semiconductor designs and the rising demand for advanced process nodes. The market's Compound Annual Growth Rate (CAGR) of 6.4% from 2025 to 2033 reflects a consistent need for sophisticated Electronic Design Automation (EDA) tools to optimize wafer fabrication processes. Key drivers include the miniaturization of semiconductor devices, the proliferation of 5G and AI technologies fueling demand for high-performance chips, and the growing adoption of advanced packaging techniques. Leading players like Synopsys, Cadence, and Siemens EDA are at the forefront of innovation, continuously improving the accuracy, speed, and efficiency of their EDA tools to meet the evolving needs of the semiconductor industry. The market is also witnessing trends such as the integration of AI and machine learning into EDA workflows, enhancing design automation and optimization. While the market faces some restraints, such as high costs associated with advanced EDA tools and the complexities of software integration, the overall growth trajectory remains positive due to the continued technological advancements and increasing demand for high-performance computing. This growth is further fueled by strong regional demands, particularly in North America and Asia, where significant investments in semiconductor manufacturing facilities are occurring. The competitive landscape is characterized by both established industry giants and emerging players, leading to continuous innovation and improved tool capabilities. Despite the challenges of maintaining high accuracy in complex simulations and keeping up with the rapid pace of technological advancement, the wafer fabrication EDA tools market's expansion is likely to continue as the semiconductor industry progresses towards smaller, faster, and more energy-efficient chips. The market's segmentation (while not detailed in the provided data) is likely to reflect different EDA tool categories, such as physical verification, layout design, and process simulation, each exhibiting distinct growth rates.
i
Data from: A novel spatial prediction method integrating Exploratory Spatial...
ieee-dataport.org
Updated Mar 19, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Bingbo Gao (2025). A novel spatial prediction method integrating Exploratory Spatial Data Analysis into Random Forest for large scale daily air temperature mapping [Dataset]. https://ieee-dataport.org/documents/novel-spatial-prediction-method-integrating-exploratory-spatial-data-analysis-random
Explore at:
Dataset updated
Mar 19, 2025
Authors
Bingbo Gao
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
environmental management
m
Data from: Wrist-worn sensor validation for heart rate variability and...
data.mendeley.com
data.niaid.nih.gov
Updated Jun 21, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Simone Costantini (2023). Wrist-worn sensor validation for heart rate variability and electrodermal activity detection in a stressful driving environment [Dataset]. http://doi.org/10.17632/npnv4tsbg7.1
Explore at:
Unique identifier
https://doi.org/10.17632/npnv4tsbg7.1
Dataset updated
Jun 21, 2023
Authors
Simone Costantini
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The current dataset contributes to assess the accuracy of the Empatica 4 (E4) wristband for the detection of heart rate variability (HRV) and electrodermal activity (EDA) metrics in stress-inducing conditions and growing-risk driving scenarios. Heart Rate Variability (HRV) and ElectroDermal Activity (EDA) signals were recorded over six experimental conditions (i.e., Baseline, Video Clip, Scream, No Risk Driving, Low-Risk Driving, and High-Risk Driving) and by means of two measurement systems: the E4 device and a gold standard system. The raw quality of the physiological signals was enhanced by means of robust semi-automatic reconstruction algorithms. Heart Rate Variability time-domain parameters showed high accuracy in motion-free experimental conditions, while Heart Rate Variability frequency-domain parameters reported sufficient accuracy in almost every experimental condition.

Folder 01 contains both HRV and EDA parameters for every experimental condition, according to the Gold Standard measurement system and the Empatica 4 device, in two separate Excel files.

Folder 02 contains supplementary material on the assessment of the signals quality.

Folder 03 contains the Bland-Altman plot for each HRV and EDA parameter and for each condition (1 .png file per each parameter), and an excel file that resumes the Bland-Altman analyses numerical outcomes.
g
Electrodermal Activity (EDA) of Bi-cultural Visitors In Virtual Park...
gimi9.com
Updated Dec 14, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2022). Electrodermal Activity (EDA) of Bi-cultural Visitors In Virtual Park Settings | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_5f6276bf-6e08-4432-bf3b-9c77672976ba-envidat/
Explore at:
Dataset updated
Dec 14, 2022
Description
This repository contains data on EDA measurements of visitors with different cultural backgrounds in virtual urban park settings. The parks are a Persian garden (Shiraz, Iran) and a historical park in Zurich, Switzerland. The cultural background of the visitors is Persian and Central European. The repository contains raw data from EDA, processed time series and statistical procedures.
n
HadISD: Global sub-daily, surface meteorological station data, 1931-2023,...
data-search.nerc.ac.uk
Updated Jul 24, 2021
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2021). HadISD: Global sub-daily, surface meteorological station data, 1931-2023, v3.4.0.2023f [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/search?keyword=dewpoint
Explore at:
Dataset updated
Jul 24, 2021
Description
This is version v3.4.0.2023f of Met Office Hadley Centre's Integrated Surface Database, HadISD. These data are global sub-daily surface meteorological data. This update (v3.4.0.2023f) to HadISD corrects a long-standing bug which was discovered in autumn 2023 whereby the neighbour checks (and associated [un]flagging for some other tests) were not being implemented. For more details see the posts on the HadISD blog: https://hadisd.blogspot.com/2023/10/bug-in-buddy-checks.html & https://hadisd.blogspot.com/2024/01/hadisd-v3402023f-future-look.html The quality controlled variables in this dataset are: temperature, dewpoint temperature, sea-level pressure, wind speed and direction, cloud data (total, low, mid and high level). Past significant weather and precipitation data are also included, but have not been quality controlled, so their quality and completeness cannot be guaranteed. Quality control flags and data values which have been removed during the quality control process are provided in the qc_flags and flagged_values fields, and ancillary data files show the station listing with a station listing with IDs, names and location information. The data are provided as one NetCDF file per station. Files in the station_data folder station data files have the format "station_code"_HadISD_HadOBS_19310101-20240101_v3.4.1.2023f.nc. The station codes can be found under the docs tab. The station codes file has five columns as follows: 1) station code, 2) station name 3) station latitude 4) station longitude 5) station height. To keep informed about updates, news and announcements follow the HadOBS team on twitter @metofficeHadOBS. For more detailed information e.g bug fixes, routine updates and other exploratory analysis, see the HadISD blog: http://hadisd.blogspot.co.uk/ References: When using the dataset in a paper you must cite the following papers (see Docs for link to the publications) and this dataset (using the "citable as" reference) : Dunn, R. J. H., (2019), HadISD version 3: monthly updates, Hadley Centre Technical Note. Dunn, R. J. H., Willett, K. M., Parker, D. E., and Mitchell, L.: Expanding HadISD: quality-controlled, sub-daily station data from 1931, Geosci. Instrum. Method. Data Syst., 5, 473-491, doi:10.5194/gi-5-473-2016, 2016. Dunn, R. J. H., et al. (2012), HadISD: A Quality Controlled global synoptic report database for selected variables at long-term stations from 1973-2011, Clim. Past, 8, 1649-1679, 2012, doi:10.5194/cp-8-1649-2012 Smith, A., N. Lott, and R. Vose, 2011: The Integrated Surface Database: Recent Developments and Partnerships. Bulletin of the American Meteorological Society, 92, 704–708, doi:10.1175/2011BAMS3015.1 For a homogeneity assessment of HadISD please see this following reference Dunn, R. J. H., K. M. Willett, C. P. Morice, and D. E. Parker. "Pairwise homogeneity assessment of HadISD." Climate of the Past 10, no. 4 (2014): 1501-1522. doi:10.5194/cp-10-1501-2014, 2014.

Data from: Supplementary Material for "Sonification for Exploratory Data...

pub.uni-bielefeld.de
search.datacite.org

Updated Feb 5, 2019

Facebook

Twitter

Click to copy link

Link copied

Cite

Thomas Hermann (2019). Supplementary Material for "Sonification for Exploratory Data Analysis" [Dataset]. https://pub.uni-bielefeld.de/record/2920448

Explore at:

Dataset updated

Feb 5, 2019

Authors

Thomas Hermann

License

Open Database License (ODbL) v1.0https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically

Description

Sonification for Exploratory Data Analysis

Chapter 8: Sonification Models

In Chapter 8 of the thesis, 6 sonification models are presented to give some examples for the framework of Model-Based Sonification, developed in Chapter 7. Sonification models determine the rendering of the sonification and possible interactions. The "model in mind" helps the user to interprete the sound with respect to the data.

8.1 Data Sonograms

Data Sonograms use spherical expanding shock waves to excite linear oscillators which are represented by point masses in model space.

Table 8.2, page 87: Sound examples for Data Sonograms

File:	Iris dataset: started in plot "https://pub.uni-bielefeld.de/download/2920448/2920454">(a) at S0 (b) at S1 (c) at S2 10d noisy circle dataset: started in plot (c) at "https://pub.uni-bielefeld.de/download/2920448/2920451">S0 (mean) (d) at S1 (edge) 10d Gaussian: plot (d) started at S0 3 clusters: Example 1 3 clusters: invisible columns used as output variables: "https://pub.uni-bielefeld.de/download/2920448/2920450">Example 2
Description:	Data Sonogram Sound examples for synthetic datasets and the Iris dataset
Duration:	about 5 s

8.2 Particle Trajectory Sonification Model

This sonification model explores features of a data distribution by computing the trajectories of test particles which are injected into model space and move according to Newton's laws of motion in a potential given by the dataset.

Sound example: page 93, PTSM-Ex-1 Audification of 1 particle in the potential of phi(x).
Sound example: page 93, PTSM-Ex-2 Audification of a sequence of 15 particles in the potential of a dataset with 2 clusters.
Sound example: page 94, PTSM-Ex-3 Audification of 25 particles simultaneous in a potential of a dataset with 2 clusters.
Sound example: page 94, PTSM-Ex-4 Audification of 25 particles simultaneous in a potential of a dataset with 1 cluster.
Sound example: page 95, PTSM-Ex-5 sigma-step sequence for a mixture of three Gaussian clusters
Sound example: page 95, PTSM-Ex-6 sigma-step sequence for a Gaussian cluster
Sound example: page 96, PTSM-Iris-1 Sonification for the Iris Dataset with 20 particles per step.
Sound example: page 96, PTSM-Iris-2 Sonification for the Iris Dataset with 3 particles per step.
Sound example: page 96, PTSM-Tetra-1 Sonification for a 4d tetrahedron clusters dataset.

8.3 Markov chain Monte Carlo Sonification

The McMC Sonification Model defines a exploratory process in the domain of a given density p such that the acoustic representation summarizes features of p, particularly concerning the modes of p by sound.

Sound Example: page 105, MCMC-Ex-1 McMC Sonification, stabilization of amplitudes.
Sound Example: page 106, MCMC-Ex-2 Trajectory Audification for 100 McMC steps in 3 cluster dataset
McMC Sonification for Cluster Analysis, dataset with three clusters, page 107
- Stream 1 MCMC-Ex-3.1
- Stream 2 MCMC-Ex-3.2
- Stream 3 MCMC-Ex-3.3
- Mix MCMC-Ex-3.4
McMC Sonification for Cluster

f
Exploratory data analysis.
plos.figshare.com
datasetcatalog.nlm.nih.gov
+1more
xls
Updated Jun 5, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Oscar Ngesa; Henry Mwambi; Thomas Achia (2023). Exploratory data analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0103299.t001
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.1371/journal.pone.0103299.t001
Dataset updated
Jun 5, 2023
Dataset provided by
PLOS ONE
Authors
Oscar Ngesa; Henry Mwambi; Thomas Achia
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Exploratory data analysis.
n
HadISD: Global sub-daily, surface meteorological station data, 1931-2017,...
data-search.nerc.ac.uk
catalogue.ceda.ac.uk
Updated Jul 24, 2021
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2021). HadISD: Global sub-daily, surface meteorological station data, 1931-2017, v2.0.2.2017f [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/search?keyword=dewpoint
Explore at:
Dataset updated
Jul 24, 2021
Description
This is version 2.0.2.2017f of Met Office Hadley Centre's Integrated Surface Database, HadISD. These data are global sub-daily surface meteorological data that extends HadISD v2.0.1.2016p to include 2017 and so spans 1931-2017, it replaces the preliminary version (v2.0.2.2017p) as the ISD data for 2017 are now finalised. The quality controlled variables in this dataset are: temperature, dewpoint temperature, sea-level pressure, wind speed and direction, cloud data (total, low, mid and high level). Past significant weather and precipitation data are also included, but have not been quality controlled, so their quality and completeness cannot be guaranteed. Quality control flags and data values which have been removed during the quality control process are provided in the qc_flags and flagged_values fields, and ancillary data files show the station listing with a station listing with IDs, names and location information. The data are provided as one NetCDF file per station. Files in the station_data folder station data files have the format "station_code"_HadISD_HadOBS_19310101-20171231_v2-0-2-2017f.nc. The station codes can be found under the docs tab or on the archive beside the station_data folder. The station codes file has five columns as follows: 1) station code, 2) station name 3) station latitude 4) station longitude 5) station height. To keep informed about updates, news and announcements follow the HadOBS team on twitter @metofficeHadOBS. For more detailed information e.g bug fixes, routine updates and other exploratory analysis, see the HadISD blog: http://hadisd.blogspot.co.uk/ For a more detailed description of precipitation see: http://hadisd.blogspot.co.uk/2018/03/precipitation-in-hadisd.html References: When using the dataset in a paper you must cite the following papers (see Docs for link to the publications) and this dataset (using the "citable as" reference) : Dunn, R. J. H., Willett, K. M., Parker, D. E., and Mitchell, L.: Expanding HadISD: quality-controlled, sub-daily station data from 1931, Geosci. Instrum. Method. Data Syst., 5, 473-491, doi:10.5194/gi-5-473-2016, 2016. Dunn, R. J. H., et al. (2012), HadISD: A Quality Controlled global synoptic report database for selected variables at long-term stations from 1973-2011, Clim. Past, 8, 1649-1679, 2012, doi:10.5194/cp-8-1649-2012 Smith, A., N. Lott, and R. Vose, 2011: The Integrated Surface Database: Recent Developments and Partnerships. Bulletin of the American Meteorological Society, 92, 704–708, doi:10.1175/2011BAMS3015.1 For a homogeneity assessment of HadISD please see this following reference Dunn, R. J. H., K. M. Willett, C. P. Morice, and D. E. Parker. "Pairwise homogeneity assessment of HadISD." Climate of the Past 10, no. 4 (2014): 1501-1522. doi:10.5194/cp-10-1501-2014, 2014.

Facebook

Twitter

Click to copy link

Link copied

Cite

Shrishti Manja (2024). Ecommerce Dataset for Data Analysis [Dataset]. https://www.kaggle.com/datasets/shrishtimanja/ecommerce-dataset-for-data-analysis/code

Ecommerce Dataset for Data Analysis

Exploratory Data Analysis, Data Visualisation and Machine Learning

Explore at:

zip(2028853 bytes)Available download formats

Dataset updated

Sep 19, 2024

Authors

Shrishti Manja

Description

This dataset contains 55,000 entries of synthetic customer transactions, generated using Python's Faker library. The goal behind creating this dataset was to provide a resource for learners like myself to explore, analyze, and apply various data analysis techniques in a context that closely mimics real-world data.

About the Dataset: - CID (Customer ID): A unique identifier for each customer. - TID (Transaction ID): A unique identifier for each transaction. - Gender: The gender of the customer, categorized as Male or Female. - Age Group: Age group of the customer, divided into several ranges. - Purchase Date: The timestamp of when the transaction took place. - Product Category: The category of the product purchased, such as Electronics, Apparel, etc. - Discount Availed: Indicates whether the customer availed any discount (Yes/No). - Discount Name: Name of the discount applied (e.g., FESTIVE50). - Discount Amount (INR): The amount of discount availed by the customer. - Gross Amount: The total amount before applying any discount. - Net Amount: The final amount after applying the discount. - Purchase Method: The payment method used (e.g., Credit Card, Debit Card, etc.). - Location: The city where the purchase took place.

Use Cases: 1. Exploratory Data Analysis (EDA): This dataset is ideal for conducting EDA, allowing users to practice techniques such as summary statistics, visualizations, and identifying patterns within the data. 2. Data Preprocessing and Cleaning: Learners can work on handling missing data, encoding categorical variables, and normalizing numerical values to prepare the dataset for analysis. 3. Data Visualization: Use tools like Python’s Matplotlib, Seaborn, or Power BI to visualize purchasing trends, customer demographics, or the impact of discounts on purchase amounts. 4. Machine Learning Applications: After applying feature engineering, this dataset is suitable for supervised learning models, such as predicting whether a customer will avail a discount or forecasting purchase amounts based on the input features.

This dataset provides an excellent sandbox for honing skills in data analysis, machine learning, and visualization in a structured but flexible manner.

This is not a real dataset. This dataset was generated using Python's Faker library for the sole purpose of learning

Clear search

Close search

Google apps

Main menu

Ecommerce Dataset for Data Analysis

Data Analysis for the Systematic Literature Review of DL4SE

Exploratory data analysis of a clinical study group: Development of a...

Youtube cookery channels viewers comments in Hinglish

Electronic Store Sales Data

Questions the Business Owner would like answered.

Orange dataset table

Data from: The Often-Overlooked Power of Summary Statistics in Exploratory...

Detailed characterization of the dataset.

EDA with AI Market Research Report 2033

EDA with AI Market Outlook

Understanding Placement Factors

SEM regression for H1-5.

Global EDA in Industrial Electronics Market Research Report: By Application...

Wafer Fabrication EDA Tools Report

Data from: A novel spatial prediction method integrating Exploratory Spatial...

Data from: Wrist-worn sensor validation for heart rate variability and...

Electrodermal Activity (EDA) of Bi-cultural Visitors In Virtual Park...

HadISD: Global sub-daily, surface meteorological station data, 1931-2023,...

Data from: Supplementary Material for "Sonification for Exploratory Data...

Sonification for Exploratory Data Analysis

Chapter 8: Sonification Models

8.1 Data Sonograms

8.2 Particle Trajectory Sonification Model

8.3 Markov chain Monte Carlo Sonification

Exploratory data analysis.

HadISD: Global sub-daily, surface meteorological station data, 1931-2017,...

Ecommerce Dataset for Data Analysis

Exploratory Data Analysis, Data Visualisation and Machine Learning