100+ datasets found
  1. Ecommerce Dataset for Data Analysis

    • kaggle.com
    zip
    Updated Sep 19, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shrishti Manja (2024). Ecommerce Dataset for Data Analysis [Dataset]. https://www.kaggle.com/datasets/shrishtimanja/ecommerce-dataset-for-data-analysis/code
    Explore at:
    zip(2028853 bytes)Available download formats
    Dataset updated
    Sep 19, 2024
    Authors
    Shrishti Manja
    Description

    This dataset contains 55,000 entries of synthetic customer transactions, generated using Python's Faker library. The goal behind creating this dataset was to provide a resource for learners like myself to explore, analyze, and apply various data analysis techniques in a context that closely mimics real-world data.

    About the Dataset: - CID (Customer ID): A unique identifier for each customer. - TID (Transaction ID): A unique identifier for each transaction. - Gender: The gender of the customer, categorized as Male or Female. - Age Group: Age group of the customer, divided into several ranges. - Purchase Date: The timestamp of when the transaction took place. - Product Category: The category of the product purchased, such as Electronics, Apparel, etc. - Discount Availed: Indicates whether the customer availed any discount (Yes/No). - Discount Name: Name of the discount applied (e.g., FESTIVE50). - Discount Amount (INR): The amount of discount availed by the customer. - Gross Amount: The total amount before applying any discount. - Net Amount: The final amount after applying the discount. - Purchase Method: The payment method used (e.g., Credit Card, Debit Card, etc.). - Location: The city where the purchase took place.

    Use Cases: 1. Exploratory Data Analysis (EDA): This dataset is ideal for conducting EDA, allowing users to practice techniques such as summary statistics, visualizations, and identifying patterns within the data. 2. Data Preprocessing and Cleaning: Learners can work on handling missing data, encoding categorical variables, and normalizing numerical values to prepare the dataset for analysis. 3. Data Visualization: Use tools like Python’s Matplotlib, Seaborn, or Power BI to visualize purchasing trends, customer demographics, or the impact of discounts on purchase amounts. 4. Machine Learning Applications: After applying feature engineering, this dataset is suitable for supervised learning models, such as predicting whether a customer will avail a discount or forecasting purchase amounts based on the input features.

    This dataset provides an excellent sandbox for honing skills in data analysis, machine learning, and visualization in a structured but flexible manner.

    This is not a real dataset. This dataset was generated using Python's Faker library for the sole purpose of learning

  2. E

    Exploratory Data Analysis (EDA) Tools Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 2, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Market Report Analytics (2025). Exploratory Data Analysis (EDA) Tools Report [Dataset]. https://www.marketreportanalytics.com/reports/exploratory-data-analysis-eda-tools-54369
    Explore at:
    doc, ppt, pdfAvailable download formats
    Dataset updated
    Apr 2, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policyhttps://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Discover the booming Exploratory Data Analysis (EDA) tools market! Our in-depth analysis reveals key trends, growth drivers, and top players shaping this $3 billion industry, projected for 15% CAGR through 2033. Learn about market segmentation, regional insights, and future opportunities.

  3. f

    DataSheet1_Exploratory data analysis (EDA) machine learning approaches for...

    • frontiersin.figshare.com
    docx
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Victoria Da Poian; Bethany Theiling; Lily Clough; Brett McKinney; Jonathan Major; Jingyi Chen; Sarah Hörst (2023). DataSheet1_Exploratory data analysis (EDA) machine learning approaches for ocean world analog mass spectrometry.docx [Dataset]. http://doi.org/10.3389/fspas.2023.1134141.s001
    Explore at:
    docxAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers
    Authors
    Victoria Da Poian; Bethany Theiling; Lily Clough; Brett McKinney; Jonathan Major; Jingyi Chen; Sarah Hörst
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    World
    Description

    Many upcoming and proposed missions to ocean worlds such as Europa, Enceladus, and Titan aim to evaluate their habitability and the existence of potential life on these moons. These missions will suffer from communication challenges and technology limitations. We review and investigate the applicability of data science and unsupervised machine learning (ML) techniques on isotope ratio mass spectrometry data (IRMS) from volatile laboratory analogs of Europa and Enceladus seawaters as a case study for development of new strategies for icy ocean world missions. Our driving science goal is to determine whether the mass spectra of volatile gases could contain information about the composition of the seawater and potential biosignatures. We implement data science and ML techniques to investigate what inherent information the spectra contain and determine whether a data science pipeline could be designed to quickly analyze data from future ocean worlds missions. In this study, we focus on the exploratory data analysis (EDA) step in the analytics pipeline. This is a crucial unsupervised learning step that allows us to understand the data in depth before subsequent steps such as predictive/supervised learning. EDA identifies and characterizes recurring patterns, significant correlation structure, and helps determine which variables are redundant and which contribute to significant variation in the lower dimensional space. In addition, EDA helps to identify irregularities such as outliers that might be due to poor data quality. We compared dimensionality reduction methods Uniform Manifold Approximation and Projection (UMAP) and Principal Component Analysis (PCA) for transforming our data from a high-dimensional space to a lower dimension, and we compared clustering algorithms for identifying data-driven groups (“clusters”) in the ocean worlds analog IRMS data and mapping these clusters to experimental conditions such as seawater composition and CO2 concentration. Such data analysis and characterization efforts are the first steps toward the longer-term science autonomy goal where similar automated ML tools could be used onboard a spacecraft to prioritize data transmissions for bandwidth-limited outer Solar System missions.

  4. Exploratory Data Analysis on Automobile Dataset

    • kaggle.com
    zip
    Updated Sep 12, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Monis Ahmad (2022). Exploratory Data Analysis on Automobile Dataset [Dataset]. https://www.kaggle.com/datasets/monisahmad/automobile
    Explore at:
    zip(4915 bytes)Available download formats
    Dataset updated
    Sep 12, 2022
    Authors
    Monis Ahmad
    Description

    Dataset

    This dataset was created by Monis Ahmad

    Contents

  5. E

    Exploratory Data Analysis (EDA) Tools Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 2, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Market Report Analytics (2025). Exploratory Data Analysis (EDA) Tools Report [Dataset]. https://www.marketreportanalytics.com/reports/exploratory-data-analysis-eda-tools-54257
    Explore at:
    ppt, doc, pdfAvailable download formats
    Dataset updated
    Apr 2, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policyhttps://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Exploratory Data Analysis (EDA) tools market is experiencing robust growth, driven by the increasing need for businesses to derive actionable insights from their ever-expanding datasets. The market, currently estimated at $15 billion in 2025, is projected to witness a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching an estimated $45 billion by 2033. This growth is fueled by several factors, including the rising adoption of big data analytics, the proliferation of cloud-based solutions offering enhanced accessibility and scalability, and the growing demand for data-driven decision-making across diverse industries like finance, healthcare, and retail. The market is segmented by application (large enterprises and SMEs) and type (graphical and non-graphical tools), with graphical tools currently holding a larger market share due to their user-friendly interfaces and ability to effectively communicate complex data patterns. Large enterprises are currently the dominant segment, but the SME segment is anticipated to experience faster growth due to increasing affordability and accessibility of EDA solutions. Geographic expansion is another key driver, with North America currently holding the largest market share due to early adoption and a strong technological ecosystem. However, regions like Asia-Pacific are exhibiting high growth potential, fueled by rapid digitalization and a burgeoning data science talent pool. Despite these opportunities, the market faces certain restraints, including the complexity of some EDA tools requiring specialized skills and the challenge of integrating EDA tools with existing business intelligence platforms. Nonetheless, the overall market outlook for EDA tools remains highly positive, driven by ongoing technological advancements and the increasing importance of data analytics across all sectors. The competition among established players like IBM Cognos Analytics and Altair RapidMiner, and emerging innovative companies like Polymer Search and KNIME, further fuels market dynamism and innovation.

  6. E

    Exploratory Data Analysis (EDA) Tools Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Feb 12, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Archive Market Research (2025). Exploratory Data Analysis (EDA) Tools Report [Dataset]. https://www.archivemarketresearch.com/reports/exploratory-data-analysis-eda-tools-21680
    Explore at:
    doc, pdf, pptAvailable download formats
    Dataset updated
    Feb 12, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policyhttps://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Exploratory Data Analysis (EDA) Tools market is anticipated to experience significant growth in the coming years, driven by the increasing adoption of data-driven decision-making and the growing need for efficient data exploration and analysis. The market size is valued at USD XX million in 2025 and is projected to reach USD XX million by 2033, registering a CAGR of XX% during the forecast period. The increasing complexity and volume of data generated by businesses and organizations have necessitated the use of advanced data analysis tools to derive meaningful insights and make informed decisions. Key trends driving the market include the rising adoption of AI and machine learning technologies, the growing need for self-service data analytics, and the increasing emphasis on data visualization and storytelling. Non-graphical EDA tools are gaining traction due to their ability to handle large and complex datasets. Graphical EDA tools are preferred for their intuitive and interactive user interfaces that simplify data exploration. Large enterprises are major consumers of EDA tools as they have large volumes of data to analyze. SMEs are also increasingly adopting EDA tools as they realize the importance of data-driven insights for business growth. The North American region holds a significant market share due to the presence of established technology companies and a high adoption rate of data analytics solutions. The Asia Pacific region is expected to witness substantial growth due to the rising number of businesses and organizations in emerging economies.

  7. Exploratory Data Analysis | EDA - use case

    • kaggle.com
    zip
    Updated Feb 11, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mohinur Abdurahimova (2022). Exploratory Data Analysis | EDA - use case [Dataset]. https://www.kaggle.com/datasets/mohinurabdurahimova/exploratory-data-analysis-eda-use-case
    Explore at:
    zip(49796 bytes)Available download formats
    Dataset updated
    Feb 11, 2022
    Authors
    Mohinur Abdurahimova
    Description

    Dataset

    This dataset was created by Mohinur Abdurahimova

    Released under Data files © Original Authors

    Contents

  8. Exploratory data analysis of a clinical study group: Development of a...

    • plos.figshare.com
    txt
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bogumil M. Konopka; Felicja Lwow; Magdalena Owczarz; Łukasz Łaczmański (2023). Exploratory data analysis of a clinical study group: Development of a procedure for exploring multidimensional data [Dataset]. http://doi.org/10.1371/journal.pone.0201950
    Explore at:
    txtAvailable download formats
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Bogumil M. Konopka; Felicja Lwow; Magdalena Owczarz; Łukasz Łaczmański
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Thorough knowledge of the structure of analyzed data allows to form detailed scientific hypotheses and research questions. The structure of data can be revealed with methods for exploratory data analysis. Due to multitude of available methods, selecting those which will work together well and facilitate data interpretation is not an easy task. In this work we present a well fitted set of tools for a complete exploratory analysis of a clinical dataset and perform a case study analysis on a set of 515 patients. The proposed procedure comprises several steps: 1) robust data normalization, 2) outlier detection with Mahalanobis (MD) and robust Mahalanobis distances (rMD), 3) hierarchical clustering with Ward’s algorithm, 4) Principal Component Analysis with biplot vectors. The analyzed set comprised elderly patients that participated in the PolSenior project. Each patient was characterized by over 40 biochemical and socio-geographical attributes. Introductory analysis showed that the case-study dataset comprises two clusters separated along the axis of sex hormone attributes. Further analysis was carried out separately for male and female patients. The most optimal partitioning in the male set resulted in five subgroups. Two of them were related to diseased patients: 1) diabetes and 2) hypogonadism patients. Analysis of the female set suggested that it was more homogeneous than the male dataset. No evidence of pathological patient subgroups was found. In the study we showed that outlier detection with MD and rMD allows not only to identify outliers, but can also assess the heterogeneity of a dataset. The case study proved that our procedure is well suited for identification and visualization of biologically meaningful patient subgroups.

  9. E

    Exploratory Data Analysis (EDA) Tools Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Nov 7, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Data Insights Market (2025). Exploratory Data Analysis (EDA) Tools Report [Dataset]. https://www.datainsightsmarket.com/reports/exploratory-data-analysis-eda-tools-532159
    Explore at:
    pdf, ppt, docAvailable download formats
    Dataset updated
    Nov 7, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policyhttps://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Explore the booming Exploratory Data Analysis (EDA) Tools market, projected to reach $10.5 billion by 2025 with a 12.5% CAGR. Discover key drivers, trends, and market share for large enterprises, SMEs, graphical & non-graphical tools across North America, Europe, APAC, and more.

  10. EDA with Pandas

    • kaggle.com
    zip
    Updated Feb 15, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Amir Raja (2023). EDA with Pandas [Dataset]. https://www.kaggle.com/datasets/amirraja/eda-with-pandas
    Explore at:
    zip(231014 bytes)Available download formats
    Dataset updated
    Feb 15, 2023
    Authors
    Amir Raja
    Description

    Dataset

    This dataset was created by Amir Raja

    Contents

  11. Z

    Data Analysis for the Systematic Literature Review of DL4SE

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Jul 19, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Cody Watson; Nathan Cooper; David Nader; Kevin Moran; Denys Poshyvanyk (2024). Data Analysis for the Systematic Literature Review of DL4SE [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4768586
    Explore at:
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    College of William and Mary
    Washington and Lee University
    Authors
    Cody Watson; Nathan Cooper; David Nader; Kevin Moran; Denys Poshyvanyk
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Analysis is the process that supports decision-making and informs arguments in empirical studies. Descriptive statistics, Exploratory Data Analysis (EDA), and Confirmatory Data Analysis (CDA) are the approaches that compose Data Analysis (Xia & Gong; 2014). An Exploratory Data Analysis (EDA) comprises a set of statistical and data mining procedures to describe data. We ran EDA to provide statistical facts and inform conclusions. The mined facts allow attaining arguments that would influence the Systematic Literature Review of DL4SE.

    The Systematic Literature Review of DL4SE requires formal statistical modeling to refine the answers for the proposed research questions and formulate new hypotheses to be addressed in the future. Hence, we introduce DL4SE-DA, a set of statistical processes and data mining pipelines that uncover hidden relationships among Deep Learning reported literature in Software Engineering. Such hidden relationships are collected and analyzed to illustrate the state-of-the-art of DL techniques employed in the software engineering context.

    Our DL4SE-DA is a simplified version of the classical Knowledge Discovery in Databases, or KDD (Fayyad, et al; 1996). The KDD process extracts knowledge from a DL4SE structured database. This structured database was the product of multiple iterations of data gathering and collection from the inspected literature. The KDD involves five stages:

    Selection. This stage was led by the taxonomy process explained in section xx of the paper. After collecting all the papers and creating the taxonomies, we organize the data into 35 features or attributes that you find in the repository. In fact, we manually engineered features from the DL4SE papers. Some of the features are venue, year published, type of paper, metrics, data-scale, type of tuning, learning algorithm, SE data, and so on.

    Preprocessing. The preprocessing applied was transforming the features into the correct type (nominal), removing outliers (papers that do not belong to the DL4SE), and re-inspecting the papers to extract missing information produced by the normalization process. For instance, we normalize the feature “metrics” into “MRR”, “ROC or AUC”, “BLEU Score”, “Accuracy”, “Precision”, “Recall”, “F1 Measure”, and “Other Metrics”. “Other Metrics” refers to unconventional metrics found during the extraction. Similarly, the same normalization was applied to other features like “SE Data” and “Reproducibility Types”. This separation into more detailed classes contributes to a better understanding and classification of the paper by the data mining tasks or methods.

    Transformation. In this stage, we omitted to use any data transformation method except for the clustering analysis. We performed a Principal Component Analysis to reduce 35 features into 2 components for visualization purposes. Furthermore, PCA also allowed us to identify the number of clusters that exhibit the maximum reduction in variance. In other words, it helped us to identify the number of clusters to be used when tuning the explainable models.

    Data Mining. In this stage, we used three distinct data mining tasks: Correlation Analysis, Association Rule Learning, and Clustering. We decided that the goal of the KDD process should be oriented to uncover hidden relationships on the extracted features (Correlations and Association Rules) and to categorize the DL4SE papers for a better segmentation of the state-of-the-art (Clustering). A clear explanation is provided in the subsection “Data Mining Tasks for the SLR od DL4SE”. 5.Interpretation/Evaluation. We used the Knowledge Discover to automatically find patterns in our papers that resemble “actionable knowledge”. This actionable knowledge was generated by conducting a reasoning process on the data mining outcomes. This reasoning process produces an argument support analysis (see this link).

    We used RapidMiner as our software tool to conduct the data analysis. The procedures and pipelines were published in our repository.

    Overview of the most meaningful Association Rules. Rectangles are both Premises and Conclusions. An arrow connecting a Premise with a Conclusion implies that given some premise, the conclusion is associated. E.g., Given that an author used Supervised Learning, we can conclude that their approach is irreproducible with a certain Support and Confidence.

    Support = Number of occurrences this statement is true divided by the amount of statements Confidence = The support of the statement divided by the number of occurrences of the premise

  12. B

    Exploratory Data Analysis of Airbnb Data

    • borealisdata.ca
    • dataone.org
    Updated Dec 19, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Imad Ahmad; Ibtassam Rasheed; Yip Chi Man (2022). Exploratory Data Analysis of Airbnb Data [Dataset]. http://doi.org/10.5683/SP3/F2OCZF
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 19, 2022
    Dataset provided by
    Borealis
    Authors
    Imad Ahmad; Ibtassam Rasheed; Yip Chi Man
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Airbnb® is an American company operating an online marketplace for lodging, primarily for vacation rentals. The purpose of this study is to perform an exploratory data analysis of the two datasets containing Airbnb® listings and across 10 major cities. We aim to use various data visualizations to gain valuable insight on the effects of pricing, covid, and more!

  13. f

    Exploratory data analysis.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1more
    xls
    Updated Jun 5, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Oscar Ngesa; Henry Mwambi; Thomas Achia (2023). Exploratory data analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0103299.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Oscar Ngesa; Henry Mwambi; Thomas Achia
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Exploratory data analysis.

  14. f

    Data from: FactExplorer: Fact Embedding-Based Exploratory Data Analysis for...

    • tandf.figshare.com
    pdf
    Updated Sep 23, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Qi Jiang; Guodao Sun; Yue Dong; Lvhan Pan; Baofeng Chang; Li Jiang; Haoran Liang; Ronghua Liang (2025). FactExplorer: Fact Embedding-Based Exploratory Data Analysis for Tabular Data [Dataset]. http://doi.org/10.6084/m9.figshare.28399639.v1
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Sep 23, 2025
    Dataset provided by
    Taylor & Francis
    Authors
    Qi Jiang; Guodao Sun; Yue Dong; Lvhan Pan; Baofeng Chang; Li Jiang; Haoran Liang; Ronghua Liang
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Despite exploratory data analysis (EDA) is a powerful approach for uncovering insights from unfamiliar datasets, existing EDA tools face challenges in assisting users to assess the progress of exploration and synthesize coherent insights from isolated findings. To address these challenges, we present FactExplorer, a novel fact-based EDA system that shifts the analysis focus from raw data to data facts. FactExplorer employs a hybrid logical-visual representation, providing users with a comprehensive overview of all potential facts at the outset of their exploration. Moreover, FactExplorer introduces fact-mining techniques, including topic-based drill-down and transition path search capabilities. These features facilitate in-depth analysis of facts and enhance the understanding of interconnections between specific facts. Finally, we present a usage scenario and conduct a user study to assess the effectiveness of FactExplorer. The results indicate that FactExplorer facilitates the understanding of isolated findings and enables users to steer a thorough and effective EDA.

  15. S

    Global Exploratory Data Analysis (EDA) Tools Market Revenue Forecasts...

    • statsndata.org
    excel, pdf
    Updated Oct 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stats N Data (2025). Global Exploratory Data Analysis (EDA) Tools Market Revenue Forecasts 2025-2032 [Dataset]. https://www.statsndata.org/report/exploratory-data-analysis-eda-tools-market-313301
    Explore at:
    excel, pdfAvailable download formats
    Dataset updated
    Oct 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-orderhttps://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    Exploratory Data Analysis (EDA) Tools play a pivotal role in the modern data-driven landscape, transforming raw data into actionable insights. As businesses increasingly recognize the value of data in informing decisions, the market for EDA tools has witnessed substantial growth, driven by the rapid expansion of dat

  16. Titanic- exploratory data analysis

    • kaggle.com
    zip
    Updated Jul 19, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Karthik (2025). Titanic- exploratory data analysis [Dataset]. https://www.kaggle.com/datasets/pandureddy123/titanic-exploratory-data-analysis
    Explore at:
    zip(962897 bytes)Available download formats
    Dataset updated
    Jul 19, 2025
    Authors
    Karthik
    Description

    One more step towards Machine learning! This is a titatic dataset with exploratory data analysis html file. I used pandas-profiling for fast analysis.

  17. H

    Physical Properties of Lakes: Exploratory Data Analysis

    • hydroshare.org
    • search.dataone.org
    zip
    Updated Jan 29, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Gabriela Garcia; Kateri Salk (2021). Physical Properties of Lakes: Exploratory Data Analysis [Dataset]. https://www.hydroshare.org/resource/42052357655f4ad39f8ec7d0bef351c7
    Explore at:
    zip(2.0 MB)Available download formats
    Dataset updated
    Jan 29, 2021
    Dataset provided by
    HydroShare
    Authors
    Gabriela Garcia; Kateri Salk
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 27, 1984 - Aug 17, 2016
    Area covered
    Description

    Exploratory Data Analysis for the Physical Properties of Lakes

    This lesson was adapted from educational material written by Dr. Kateri Salk for her Fall 2019 Hydrologic Data Analysis course at Duke University. This is the first part of a two-part exercise focusing on the physical properties of lakes.

    Introduction

    Lakes are dynamic, nonuniform bodies of water in which the physical, biological, and chemical properties interact. Lakes also contain the majority of Earth's fresh water supply. This lesson introduces exploratory data analysis using R statistical software in the context of the physical properties of lakes.

    Learning Objectives

    After successfully completing this exercise, you will be able to:

    1. Apply exploratory data analytics skills to applied questions about physical properties of lakes
    2. Communicate findings with peers through oral, visual, and written modes
  18. Advanced exploratory data analysis (EDA)

    • kaggle.com
    zip
    Updated Nov 18, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Mustafa Ghzi (2023). Advanced exploratory data analysis (EDA) [Dataset]. https://www.kaggle.com/datasets/mustafaghzi/advanced-exploratory-data-analysis-eda
    Explore at:
    zip(4818027 bytes)Available download formats
    Dataset updated
    Nov 18, 2023
    Authors
    Mustafa Ghzi
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Mustafa Ghzi

    Released under CC BY-NC-SA 4.0

    Contents

  19. Data from: Penguins Go Parallel: A Grammar of Graphics Framework for...

    • tandf.figshare.com
    txt
    Updated Jun 2, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Susan VanderPlas; Yawei Ge; Antony Unwin; Heike Hofmann (2023). Penguins Go Parallel: A Grammar of Graphics Framework for Generalized Parallel Coordinate Plots [Dataset]. http://doi.org/10.6084/m9.figshare.22467369.v2
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Taylor & Francishttps://taylorandfrancis.com/
    Authors
    Susan VanderPlas; Yawei Ge; Antony Unwin; Heike Hofmann
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Parallel Coordinate Plots (PCP) are a valuable tool for exploratory data analysis of high-dimensional numerical data. The use of PCPs is limited when working with categorical variables or a mix of categorical and continuous variables. In this article, we propose Generalized Parallel Coordinate Plots (GPCP) to extend the ability of PCPs from just numeric variables to dealing seamlessly with a mix of categorical and numeric variables in a single plot. In this process we find that existing solutions for categorical values only, such as hammock plots or parsets become edge cases in the new framework. By focusing on individual observations rather than a marginal frequency we gain additional flexibility. The resulting approach is implemented in the R package ggpcp. Supplementary materials for this article are available online.

  20. Exploratory Data Analysis: Crop Classification & Forecasting - EDA

    • ai.tracebloc.io
    json
    Updated Nov 17, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    tracebloc (2025). Exploratory Data Analysis: Crop Classification & Forecasting - EDA [Dataset]. https://ai.tracebloc.io/explore/ai-satellite-crop-classification-and-yield-forecasting?tab=exploratory-data-analysis
    Explore at:
    jsonAvailable download formats
    Dataset updated
    Nov 17, 2025
    Dataset provided by
    Tracebloc GmbH
    Authors
    tracebloc
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Missing Values
    Measurement technique
    Statistical and exploratory data analysis
    Description

    Understand the satellite dataset before training. Explore distributions, preprocessing steps, and key insights to evaluate 3rd-party models effectively.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Shrishti Manja (2024). Ecommerce Dataset for Data Analysis [Dataset]. https://www.kaggle.com/datasets/shrishtimanja/ecommerce-dataset-for-data-analysis/code
Organization logo

Ecommerce Dataset for Data Analysis

Exploratory Data Analysis, Data Visualisation and Machine Learning

Explore at:
zip(2028853 bytes)Available download formats
Dataset updated
Sep 19, 2024
Authors
Shrishti Manja
Description

This dataset contains 55,000 entries of synthetic customer transactions, generated using Python's Faker library. The goal behind creating this dataset was to provide a resource for learners like myself to explore, analyze, and apply various data analysis techniques in a context that closely mimics real-world data.

About the Dataset: - CID (Customer ID): A unique identifier for each customer. - TID (Transaction ID): A unique identifier for each transaction. - Gender: The gender of the customer, categorized as Male or Female. - Age Group: Age group of the customer, divided into several ranges. - Purchase Date: The timestamp of when the transaction took place. - Product Category: The category of the product purchased, such as Electronics, Apparel, etc. - Discount Availed: Indicates whether the customer availed any discount (Yes/No). - Discount Name: Name of the discount applied (e.g., FESTIVE50). - Discount Amount (INR): The amount of discount availed by the customer. - Gross Amount: The total amount before applying any discount. - Net Amount: The final amount after applying the discount. - Purchase Method: The payment method used (e.g., Credit Card, Debit Card, etc.). - Location: The city where the purchase took place.

Use Cases: 1. Exploratory Data Analysis (EDA): This dataset is ideal for conducting EDA, allowing users to practice techniques such as summary statistics, visualizations, and identifying patterns within the data. 2. Data Preprocessing and Cleaning: Learners can work on handling missing data, encoding categorical variables, and normalizing numerical values to prepare the dataset for analysis. 3. Data Visualization: Use tools like Python’s Matplotlib, Seaborn, or Power BI to visualize purchasing trends, customer demographics, or the impact of discounts on purchase amounts. 4. Machine Learning Applications: After applying feature engineering, this dataset is suitable for supervised learning models, such as predicting whether a customer will avail a discount or forecasting purchase amounts based on the input features.

This dataset provides an excellent sandbox for honing skills in data analysis, machine learning, and visualization in a structured but flexible manner.

This is not a real dataset. This dataset was generated using Python's Faker library for the sole purpose of learning

Search
Clear search
Close search
Google apps
Main menu