100+ datasets found
  1. Customer Sale Dataset for Data Visualization

    • kaggle.com
    Updated Jun 6, 2025
    Cite
    Atul (2025). Customer Sale Dataset for Data Visualization [Dataset]. https://www.kaggle.com/datasets/atulkgoyl/customer-sale-dataset-for-visualization
    Dataset updated
    Jun 6, 2025
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Atul
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This synthetic dataset is designed specifically for practicing data visualization and exploratory data analysis (EDA) using popular Python libraries like Seaborn, Matplotlib, and Pandas.

    Unlike most public datasets, this one includes a diverse mix of column types:

    📅 Date columns (for time series and trend plots)
    🔢 Numerical columns (for histograms, boxplots, scatter plots)
    🏷️ Categorical columns (for bar charts, group analysis)

    Whether you are a beginner learning how to visualize data or an intermediate user testing new charting techniques, this dataset offers a versatile playground.

    Feel free to:

    Create EDA notebooks
    Practice plotting techniques
    Experiment with filtering, grouping, and aggregations

    🛠️ No missing values, no data cleaning needed: just download and start exploring!

    Hope you find this helpful. Looking forward to hearing from you all.

  2. Iris Flower Visualization using Python

    • kaggle.com
    zip
    Updated Oct 24, 2023
    Cite
    Harsh Kashyap (2023). Iris Flower Visualization using Python [Dataset]. https://www.kaggle.com/datasets/imharshkashyap/iris-flower-visualization-using-python
    Available download formats: zip (1307 bytes)
    Dataset updated
    Oct 24, 2023
    Authors
    Harsh Kashyap
    Description

    The "Iris Flower Visualization using Python" project is a data science project that focuses on exploring and visualizing the famous Iris flower dataset. The Iris dataset is a well-known dataset in the field of machine learning and data science, containing measurements of four features (sepal length, sepal width, petal length, and petal width) for three different species of Iris flowers (Setosa, Versicolor, and Virginica).

    In this project, Python is used as the primary programming language along with popular libraries such as pandas, matplotlib, seaborn, and plotly. The project aims to provide a comprehensive visual analysis of the Iris dataset, allowing users to gain insights into the relationships between the different features and the distinct characteristics of each Iris species.

    The project begins by loading the Iris dataset into a pandas DataFrame, followed by data preprocessing and cleaning if necessary. Various visualization techniques are then applied to showcase the dataset's characteristics and patterns. The project includes the following visualizations:

    1. Scatter Plot: Visualizes the relationship between two features, such as sepal length and sepal width, using points on a 2D plane. Different species are represented by different colors or markers, allowing for easy differentiation.

    2. Pair Plot: Displays pairwise relationships between all features in the dataset. This matrix of scatter plots provides a quick overview of the relationships and distributions of the features.

    3. Andrews Curves: Represents each sample as a curve, with the shape of the curve representing the corresponding Iris species. This visualization technique allows for the identification of distinct patterns and separability between species.

    4. Parallel Coordinates: Plots each feature on a separate vertical axis and connects the values for each data sample using lines. This visualization technique helps in understanding the relative importance and range of each feature for different species.

    5. 3D Scatter Plot: Creates a 3D plot with three features represented on the x, y, and z axes. This visualization allows for a more comprehensive understanding of the relationships between multiple features simultaneously.
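
As a rough illustration of the Andrews Curves idea in the list above, the sketch below evaluates the standard Andrews function f(t) = x1/sqrt(2) + x2 sin(t) + x3 cos(t) + x4 sin(2t) + ... for one sample. It uses only the Python standard library and is not taken from the project itself; the helper names (`andrews_curve`, `sample_curve`) and the two four-feature samples are illustrative assumptions, and a real notebook would more likely rely on a plotting library.

```python
import math


def andrews_curve(sample, t):
    """Evaluate the Andrews curve of one sample at angle t (radians).

    f(t) = x1/sqrt(2) + x2*sin(t) + x3*cos(t) + x4*sin(2t) + x5*cos(2t) + ...
    """
    value = sample[0] / math.sqrt(2)
    for i, x in enumerate(sample[1:], start=1):
        harmonic = (i + 1) // 2                     # 1, 1, 2, 2, 3, 3, ...
        trig = math.sin if i % 2 == 1 else math.cos  # odd terms sin, even terms cos
        value += x * trig(harmonic * t)
    return value


def sample_curve(sample, points=5):
    """Evaluate the curve on an evenly spaced grid over [-pi, pi]."""
    step = 2 * math.pi / (points - 1)
    return [andrews_curve(sample, -math.pi + k * step) for k in range(points)]


# Illustrative samples (sepal length, sepal width, petal length, petal width);
# the labels are hypothetical and only serve to tell the two curves apart.
samples = {
    "sample_a": (5.1, 3.5, 1.4, 0.2),
    "sample_b": (6.3, 3.3, 6.0, 2.5),
}

for label, features in samples.items():
    curve = sample_curve(features)
    print(label, [round(v, 3) for v in curve])
```

Plotting each list against the same grid of t values gives one curve per sample; samples with similar measurements produce curves that stay close together, which is what makes species separable by eye.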

    Throughout the project, appropriate labels, titles, and color schemes are used to enhance the visualizations' interpretability. The interactive nature of some visualizations, such as the 3D Scatter Plot, allows users to rotate and zoom in on the plot for a more detailed examination.

    The "Iris Flower Visualization using Python" project serves as an excellent example of how data visualization techniques can be applied to gain insights and understand the characteristics of a dataset. It provides a foundation for further analysis and exploration of the Iris dataset or similar datasets in the field of data science and machine learning.

  3. Data from: Teaching and Learning Data Visualization: Ideas and Assignments

    • tandf.figshare.com
    • figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Deborah Nolan; Jamis Perrett (2023). Teaching and Learning Data Visualization: Ideas and Assignments [Dataset]. http://doi.org/10.6084/m9.figshare.1627940.v1
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Taylor & Francis: https://taylorandfrancis.com/
    Authors
    Deborah Nolan; Jamis Perrett
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This article discusses how to make statistical graphics a more prominent element of the undergraduate statistics curricula. The focus is on several different types of assignments that exemplify how to incorporate graphics into a course in a pedagogically meaningful way. These assignments include having students deconstruct and reconstruct plots, copy masterful graphs, create one-minute visual revelations, convert tables into “pictures,” and develop interactive visualizations, for example, with the virtual earth as a plotting canvas. In addition to describing the goals and details of each assignment, we also discuss the broader topic of graphics and key concepts that we think warrant inclusion in the statistics curricula. We advocate that more attention needs to be paid to this fundamental field of statistics at all levels, from introductory undergraduate through graduate level courses. With the rapid rise of tools to visualize data, for example, Google trends, GapMinder, ManyEyes, and Tableau, and the increased use of graphics in the media, understanding the principles of good statistical graphics, and having the ability to create informative visualizations is an ever more important aspect of statistics education. Supplementary materials containing code and data for the assignments are available online.

  4. Data manipulation and visualization exercise

    • kaggle.com
    zip
    Updated Oct 8, 2023
    Cite
    Pawan Saini (2023). Data manipulation and visualization exercise [Dataset]. https://www.kaggle.com/datasets/pawansaini01/data-manipulation-and-visualization-exercise
    Available download formats: zip (7845632 bytes)
    Dataset updated
    Oct 8, 2023
    Authors
    Pawan Saini
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Pawan Saini

    Released under CC0: Public Domain

    Contents

  5. Data_Sheet_1_Toward a Taxonomy for Adaptive Data Visualization in Analytics...

    • frontiersin.figshare.com
    • figshare.com
    xlsx
    Updated Jun 2, 2023
    Cite
    Tristan Poetzsch; Panagiotis Germanakos; Lynn Huestegge (2023). Data_Sheet_1_Toward a Taxonomy for Adaptive Data Visualization in Analytics Applications.xlsx [Dataset]. http://doi.org/10.3389/frai.2020.00009.s001
    Available download formats: xlsx
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Frontiers
    Authors
    Tristan Poetzsch; Panagiotis Germanakos; Lynn Huestegge
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data analytics as a field is currently at a crucial point in its development, as a commoditization takes place in the context of increasing amounts of data, more user diversity, and automated analysis solutions, the latter potentially eliminating the need for expert analysts. A central hypothesis of the present paper is that data visualizations should be adapted to both the user and the context. This idea was initially addressed in Study 1, which demonstrated substantial interindividual variability among a group of experts when freely choosing an option to visualize data sets. To lay the theoretical groundwork for a systematic, taxonomic approach, a user model combining user traits, states, strategies, and actions was proposed and further evaluated empirically in Studies 2 and 3. The results implied that for adapting to user traits, statistical expertise is a relevant dimension that should be considered. Additionally, for adapting to user states different user intentions such as monitoring and analysis should be accounted for. These results were used to develop a taxonomy which adapts visualization recommendations to these (and other) factors. A preliminary attempt to validate the taxonomy in Study 4 tested its visualization recommendations with a group of experts. While the corresponding results were somewhat ambiguous overall, some aspects nevertheless supported the claim that a user-adaptive data visualization approach based on the principles outlined in the taxonomy can indeed be useful. While the present approach to user adaptivity is still in its infancy and should be extended (e.g., by testing more participants), the general approach appears to be very promising.

  6. Dataset for data visualization

    • kaggle.com
    zip
    Updated Aug 6, 2024
    Cite
    Nadeem Qamar (2024). Dataset for data visualization [Dataset]. https://www.kaggle.com/datasets/nadeemkaggle123/dataset-for-data-visualization/code
    Available download formats: zip (425673 bytes)
    Dataset updated
    Aug 6, 2024
    Authors
    Nadeem Qamar
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Nadeem Qamar

    Released under MIT

    Contents

  7. A dataset that rewards people for visual data exploration

    • figshare.com
    txt
    Updated Mar 15, 2017
    Cite
    Francisco Rodriguez-Sanchez (2017). A dataset that rewards people for visual data exploration [Dataset]. http://doi.org/10.6084/m9.figshare.4753675.v1
    Available download formats: txt
    Dataset updated
    Mar 15, 2017
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Francisco Rodriguez-Sanchez
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Visual data exploration is a key step in any data analysis, but it is often ignored by practitioners who want to jump quickly to model output. This dataset, intended mostly for use in statistics lectures and training sessions, provides a small but unexpected reward to people who actually plot it. Made with http://robertgrantstats.co.uk/drawmydata.html. Thanks to Robert Grant for the app.

  8. Data Visualization Software Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 29, 2025
    Cite
    Growth Market Reports (2025). Data Visualization Software Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-visualization-software-market
    Available download formats: pptx, csv, pdf
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Visualization Software Market Outlook



    According to our latest research, the global Data Visualization Software market size reached USD 8.2 billion in 2024, reflecting the sector's rapid adoption across industries. With a robust CAGR of 10.8% projected from 2025 to 2033, the market is expected to grow significantly, attaining a value of USD 20.3 billion by 2033. This dynamic expansion is primarily driven by the increasing demand for actionable business insights, the proliferation of big data analytics, and the growing need for real-time decision-making tools across enterprises worldwide.
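
The growth figures quoted above can be sanity-checked with the usual compound-growth formula, end = start * (1 + rate) ** years. The sketch below is only a rough check, not the report's own method: it assumes nine compounding years (2024 to 2033), and the function and variable names are illustrative.

```python
def project(value, rate, years):
    """Compound `value` forward at an annual growth `rate` for `years` years."""
    return value * (1 + rate) ** years


def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


BASE_2024 = 8.2       # USD billion, as quoted in the report
TARGET_2033 = 20.3    # USD billion, as quoted in the report
QUOTED_CAGR = 0.108   # 10.8%, as quoted in the report
YEARS = 2033 - 2024   # assumption: nine compounding periods

projected_2033 = project(BASE_2024, QUOTED_CAGR, YEARS)
implied_rate = implied_cagr(BASE_2024, TARGET_2033, YEARS)

print(f"Projected 2033 value at the quoted CAGR: USD {projected_2033:.2f} billion")
print(f"CAGR implied by the two endpoints: {implied_rate:.2%}")
```

Under these assumptions the projection lands slightly above the quoted USD 20.3 billion, and the rate implied by the two endpoints is slightly below 10.8%. Differences of this size are typical of rounding or of a different choice of base year.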




    One of the most powerful growth factors for the Data Visualization Software market is the surge in big data generation and the corresponding need for advanced analytics solutions. Organizations are increasingly dealing with massive and complex datasets that traditional reporting tools cannot handle efficiently. Modern data visualization software enables users to interpret these vast datasets quickly, presenting trends, patterns, and anomalies in intuitive graphical formats. This empowers organizations to make informed decisions faster, boosting overall operational efficiency and competitive advantage. Furthermore, the integration of artificial intelligence and machine learning capabilities into data visualization platforms is enhancing their analytical power, allowing for predictive and prescriptive insights that were previously unattainable.




    Another significant driver of the Data Visualization Software market is the widespread digital transformation initiatives across various sectors. Enterprises are investing heavily in digital technologies to streamline operations, improve customer experiences, and unlock new revenue streams. Data visualization tools have become integral to these transformations, serving as a bridge between raw data and strategic business outcomes. By offering interactive dashboards, real-time reporting, and customizable analytics, these solutions enable users at all organizational levels to engage with data meaningfully. The democratization of data access facilitated by user-friendly visualization software is fostering a data-driven culture, encouraging innovation and agility across industries such as BFSI, healthcare, retail, and manufacturing.




    The increasing adoption of cloud-based data visualization solutions is also fueling market growth. Cloud deployment offers scalability, flexibility, and cost-effectiveness, making advanced analytics accessible to organizations of all sizes, including small and medium enterprises (SMEs). Cloud-based platforms support seamless integration with other business applications, facilitate remote collaboration, and provide robust security features. As businesses continue to embrace remote and hybrid work models, the demand for cloud-based data visualization tools is expected to rise, further accelerating market expansion. Vendors are responding with enhanced offerings, including AI-driven analytics, embedded BI, and self-service visualization capabilities, catering to the evolving needs of modern enterprises.



    In the realm of warehouse management systems (WMS), the integration of WMS Data Visualization Tools is becoming increasingly vital. These tools offer a comprehensive view of warehouse operations, enabling managers to visualize data related to inventory levels, order processing, and shipment tracking in real-time. By leveraging advanced visualization techniques, WMS data visualization tools help in identifying bottlenecks, optimizing resource allocation, and improving overall efficiency. The ability to transform complex data sets into intuitive visual formats empowers warehouse managers to make informed decisions swiftly, thereby enhancing productivity and reducing operational costs. As the demand for streamlined logistics and supply chain management continues to grow, the adoption of WMS data visualization tools is expected to rise, driving further innovation in the sector.




    Regionally, North America continues to dominate the Data Visualization Software market due to early technology adoption, a strong presence of leading vendors, and a mature analytics landscape. However, the Asia Pacific region is witnessing the fastest growth, driven by rapid digitalization, increasing IT investments, and the emergence of data-centric business models in countries like China, India

  9. Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene...

    • frontiersin.figshare.com
    docx
    Updated Mar 22, 2024
    Cite
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder (2024). Data_Sheet_1_“R” U ready?: a case study using R to analyze changes in gene expression during evolution.docx [Dataset]. http://doi.org/10.3389/feduc.2024.1379910.s001
    Available download formats: docx
    Dataset updated
    Mar 22, 2024
    Dataset provided by
    Frontiers
    Authors
    Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.

  10. OSM Visualize Data

    • data.depositar.io
    geojson, ipynb, pbf +2
    Updated Aug 29, 2025
    Cite
    Pyrosm Visualize (2025). OSM Visualize Data [Dataset]. https://data.depositar.io/dataset/osm-visualize-data
    Available download formats: shp (12801023), geojson (93524401), ipynb (22126802), geojson (14808500), pbf (302549264), geojson (6293228), geojson (51289357), zip (818487462), shp (22309758), shp (3762381)
    Dataset updated
    Aug 29, 2025
    Dataset provided by
    Pyrosm Visualize
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset belongs to the Taiwan-building-footprints project. It contains an example of the visualization code and the data needed to run the code. More code and information can be found on the GitHub repo and Jupyter Book.

    The ZIP file contains 80 images showcasing the results of various visualization options, with 4 images for each county. These images are the same as those shown in the Jupyter Book, but this ZIP file contains the original .png files without compression.

  11. [ ingredients by cuisine data vizualization for Data Bloom 2024 ]

    • rochester.figshare.com
    pdf
    Updated Sep 17, 2025
    Cite
    Harris Mazhar (2025). [ ingredients by cuisine data vizualization for Data Bloom 2024 ] [Dataset]. http://doi.org/10.60593/ur.d.30064042.v2
    Available download formats: pdf
    Dataset updated
    Sep 17, 2025
    Dataset provided by
    University of Rochester
    Authors
    Harris Mazhar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This visualization is a two-page dashboard, with the focus on page 1. That page has three fully interactive visuals (a word cloud, a bubble chart, and a bar chart; interacting with any one of them updates the entire dashboard) along with selectable filters. You can use these to see, in real time, correlations between ingredients and cuisines, to visualize which cuisine leans towards which kinds of ingredients, and even to explore variants of specific ingredients. The second page contains filters that show more numerical data, allowing side-by-side comparisons of ingredients within two separate cuisines, or of the extent to which two cuisines use the same ingredient. This viz was submitted as part of the Data Bloom 2024 Viz competition. This viz was created using PowerBI and is based on the following data source: Kaggle (https://www.kaggle.com/datasets/kaggle/recipe-ingredients-dataset/data). PowerBI or a free viewer is required to render and view the full dynamic visualization within the PBIX file.

  12. High Interactivity Visualization Software for Large Computational Data Sets,...

    • data.nasa.gov
    application/rdfxml +5
    Updated Jun 26, 2018
    Cite
    (2018). High Interactivity Visualization Software for Large Computational Data Sets, Phase II [Dataset]. https://data.nasa.gov/dataset/High-Interactivity-Visualization-Software-for-Larg/ttzp-wtjx
    Available download formats: application/rdfxml, xml, csv, application/rssxml, tsv, json
    Dataset updated
    Jun 26, 2018
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Existing scientific visualization tools have specific limitations for large-scale scientific data sets. Of these, four limitations can be seen as paramount: (i) memory management, (ii) remote visualization, (iii) interactivity, and (iv) specificity. In Phase I, we proposed and successfully developed a prototype of a collection of computer tools and libraries called SciViz that overcome these limitations and enable researchers to visualize large-scale data sets (greater than 200 gigabytes) on HPC resources remotely from their workstations at interactive rates. A key element of our technology is its stack-oriented, rather than framework-driven, approach, which allows it to interoperate with common existing scientific visualization software, thereby eliminating the need for the user to switch to and learn new software. The result is a versatile 3D visualization capability that will significantly decrease the time to knowledge discovery from large, complex data sets.

    Typical visualization activity can be organized into a simple stack of steps that leads to the visualization result. These steps can broadly be classified into data retrieval, data analysis, visual representation, and rendering. Our approach will be to continue with the technique selected in Phase I of utilizing existing visualization tools at each point in the visualization stack and to develop specific tools that address the core limitations identified and seamlessly integrate them into the visualization stack. Specifically, we intend to complete technical objectives in four areas that will complete the development of visualization tools for interactive visualization of very large data sets in each layer of the visualization stack. These four areas are: Feature Objectives, C++ Conversion and Optimization, Testing Objectives, and Domain Specifics and Integration. The technology will be developed and tested at NASA and the San Diego Supercomputer Center.

  13. Data from: 3D Visualization of Zoning Plans

    • data.groningen.nl
    • data.overheid.nl
    • +2more
    pdf
    Updated Sep 17, 2024
    Cite
    Groningen (2024). 3D Visualization of Zoning Plans [Dataset]. https://data.groningen.nl/dataset/3d-visualization-of-zoning-plans
    Available download formats: pdf
    Dataset updated
    Sep 17, 2024
    Dataset provided by
    Groningen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Traditionally, zoning plans have been represented on a 2D map. However, visualizing a zoning plan in 2D has several limitations, such as visualizing heights of buildings. Furthermore, a zoning plan is abstract, which can be hard for citizens to interpret. Therefore, the goal of this research is to explore how a zoning plan can be visualized in 3D and how it can be visualized so that it is understandable for the public. The 3D visualization of a zoning plan is applied in a case study, presented in Google Earth, and a survey is executed to verify how the respondents perceive the zoning plan from the case study. An important factor of zoning plans is interpretation, since it determines if the public is able to understand what is visualized by the zoning plan. This is challenging, since a zoning plan is abstract and consists of much detailed information and many difficult terms. In the case study several techniques are used to visualize the zoning plan in 3D. The survey shows that visualizing heights in 3D gives a good impression of the maximum heights and is considered an important advantage in comparison to 2D. The survey also made clear that including existing buildings is useful, which can help the public recognize the area more easily. Another important factor is interactivity. Interactivity can range from letting people navigate through a zoning plan area to, as in the case study, letting users click on a certain area or object in the plan, after which a menu pops up showing more detailed information about that object. The survey made clear that using a popup menu is useful, but this technique did not work optimally. Navigating in Google Earth was also judged positively. Information intensity is also an important factor. Information intensity concerns the level of detail of a 3D representation of an object. Zoning plans are generally not meant to be visualized in a high level of detail, but should be represented abstractly. The survey could not implicitly point out whether the zoning plan shows too much or too little detail, but it could point out that the majority of the respondents answered that the zoning plan does not show too much information. The interface used for the case study, Google Earth, has a substantial influence on the interpretation of the zoning plan. The legend in Google Earth is unclear and an explanation of the zoning plan is lacking, which is required to make the zoning plan more understandable. This research has shown that 3D can stimulate the interpretation of zoning plans, because users can get a better impression of the plan and it is clearer than a current 2D zoning plan. However, the interpretation of a zoning plan, even in 3D, still is complex.

  14. Blog | Help CDC Visualize Vital Statistics

    • catalog.data.gov
    • data.virginia.gov
    • +1more
    Updated Mar 26, 2025
    Cite
    Paula Braun (2025). Blog | Help CDC Visualize Vital Statistics [Dataset]. https://catalog.data.gov/dataset/blog-help-cdc-visualize-vital-statistics
    Dataset updated
    Mar 26, 2025
    Dataset provided by
    Paula Braun
    Description

    This blog post was posted by Paula Braun on January 16, 2015.

  15. Detect, Count, And Visualize Object Detection Dataset

    • universe.roboflow.com
    zip
    Updated Mar 18, 2025
    Cite
    Guinea (2025). Detect, Count, And Visualize Object Detection Dataset [Dataset]. https://universe.roboflow.com/guinea/detect-count-and-visualize-object-detection-y4dag/model/7
    Available download formats: zip
    Dataset updated
    Mar 18, 2025
    Dataset authored and provided by
    Guinea
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Items Bounding Boxes
    Description

    Detect, Count, And Visualize Object Detection

    ## Overview
    
    Detect, Count, And Visualize Object Detection is a dataset for object detection tasks - it contains Items annotations for 211 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  16. Netflix Data: Cleaning, Analysis and Visualization

    • kaggle.com
    zip
    Updated Aug 26, 2022
    Cite
    Abdulrasaq Ariyo (2022). Netflix Data: Cleaning, Analysis and Visualization [Dataset]. https://www.kaggle.com/datasets/ariyoomotade/netflix-data-cleaning-analysis-and-visualization
    Explore at:
    zip(276607 bytes)Available download formats
    Dataset updated
    Aug 26, 2022
    Authors
    Abdulrasaq Ariyo
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Netflix is a popular streaming service that offers a vast catalog of movies, TV shows, and original content. This dataset is a cleaned version of the original dataset, which can be found here. The data consists of titles added to Netflix from 2008 to 2021. The oldest title dates from 1925 and the newest from 2021. The dataset is cleaned with PostgreSQL and visualized with Tableau. The purpose of this dataset is to test my data cleaning and visualization skills. The cleaned data can be found below and the Tableau dashboard can be found here.

    Data Cleaning

    We are going to:

    1. Treat the nulls
    2. Treat the duplicates
    3. Populate missing rows
    4. Drop unneeded columns
    5. Split columns

    Extra steps and further explanation of the process are given in the code comments.

    --View dataset
    
    SELECT * 
    FROM netflix;
    
    
    --The show_id column is the unique id for the dataset, therefore we are going to check for duplicates
    --Any show_id returned here appears more than once

    SELECT show_id, COUNT(*)
    FROM netflix
    GROUP BY show_id
    HAVING COUNT(*) > 1
    ORDER BY show_id DESC;
    
    --No duplicates
    
    --Check null values across columns
    
    SELECT COUNT(*) FILTER (WHERE show_id IS NULL) AS showid_nulls,
        COUNT(*) FILTER (WHERE type IS NULL) AS type_nulls,
        COUNT(*) FILTER (WHERE title IS NULL) AS title_nulls,
        COUNT(*) FILTER (WHERE director IS NULL) AS director_nulls,
        COUNT(*) FILTER (WHERE movie_cast IS NULL) AS movie_cast_nulls,
        COUNT(*) FILTER (WHERE country IS NULL) AS country_nulls,
        COUNT(*) FILTER (WHERE date_added IS NULL) AS date_added_nulls,
        COUNT(*) FILTER (WHERE release_year IS NULL) AS release_year_nulls,
        COUNT(*) FILTER (WHERE rating IS NULL) AS rating_nulls,
        COUNT(*) FILTER (WHERE duration IS NULL) AS duration_nulls,
        COUNT(*) FILTER (WHERE listed_in IS NULL) AS listed_in_nulls,
        COUNT(*) FILTER (WHERE description IS NULL) AS description_nulls
    FROM netflix;
    
    --We can see that there are NULLs:
    --director_nulls = 2634
    --movie_cast_nulls = 825
    --country_nulls = 831
    --date_added_nulls = 10
    --rating_nulls = 4
    --duration_nulls = 3
    

    Nulls make up about 30% of the director column, so I will not delete those rows. Instead, I will use another column to populate it. To do that, we first want to find out whether there is a relationship between the movie_cast column and the director column.
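
A share like the one quoted above can be computed directly from the table. The following is a minimal sketch, not the author's code: it uses Python's built-in sqlite3 module with a tiny made-up stand-in for the netflix table, and the helper name `null_share` is hypothetical.

```python
import sqlite3


def null_share(conn, table, column):
    """Return the fraction of rows in `table` whose `column` is NULL."""
    # COUNT(*) counts every row; COUNT(column) skips NULLs.
    total, non_null = conn.execute(
        f"SELECT COUNT(*), COUNT({column}) FROM {table}"
    ).fetchone()
    return (total - non_null) / total if total else 0.0


# Toy stand-in for the netflix table (made-up rows, not the real data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE netflix (show_id TEXT, director TEXT)")
conn.executemany(
    "INSERT INTO netflix VALUES (?, ?)",
    [("s1", "A"), ("s2", None), ("s3", "B"), ("s4", None)],
)

# Two of the four toy rows have no director.
print(null_share(conn, "netflix", "director"))
```

The table and column names are interpolated into the query string, so this helper is only suitable for trusted, hard-coded identifiers.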

    -- Below, we find out if some directors are likely to work with particular cast
    
    WITH cte AS
    (
    SELECT title, CONCAT(director, '---', movie_cast) AS director_cast 
    FROM netflix
    )
    
    SELECT director_cast, COUNT(*) AS count
    FROM cte
    GROUP BY director_cast
    HAVING COUNT(*) > 1
    ORDER BY COUNT(*) DESC;
    
    --With this, we can now populate NULL rows in director
    --using their record with movie_cast
    
    UPDATE netflix 
    SET director = 'Alastair Fothergill'
    WHERE movie_cast = 'David Attenborough'
    AND director IS NULL ;
    
    --Repeat this step to populate the rest of the director nulls
    --Populate the rest of the NULL in director as "Not Given"
    
    UPDATE netflix 
    SET director = 'Not Given'
    WHERE director IS NULL;
    
    --When I was doing this, I found a less complex and faster way to populate a column which I will use next
    

    Just like the director column, I will not delete the nulls in country. Since the country column is related to the director and the movie, we are going to populate the country column using the director column.

    --Populate the country using the director column
    
    SELECT COALESCE(nt.country, nt2.country)
    FROM netflix AS nt
    JOIN netflix AS nt2
    ON nt.director = nt2.director
    AND nt.show_id <> nt2.show_id
    WHERE nt.country IS NULL;

    --Only borrow the country from rows that actually have one
    UPDATE netflix
    SET country = nt2.country
    FROM netflix AS nt2
    WHERE netflix.director = nt2.director AND netflix.show_id <> nt2.show_id
    AND netflix.country IS NULL
    AND nt2.country IS NOT NULL;
    
    
    --To confirm if there are still directors linked to country that refuse to update
    
    SELECT director, country, date_added
    FROM netflix
    WHERE country IS NULL;
    
    --Populate the rest of the NULL in country as "Not Given"
    
    UPDATE netflix 
    SET country = 'Not Given'
    WHERE country IS NULL;
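
The two-step fill above (borrow the country from another title by the same director, then fall back to 'Not Given') can be tried on a small scale. This is a sketch under assumptions, not the author's PostgreSQL: it runs on Python's built-in sqlite3 with made-up rows, and it swaps the UPDATE ... FROM join for a correlated subquery, using MIN() as an arbitrary tie-break when a director has several countries.

```python
import sqlite3

# Toy stand-in for the netflix table (made-up rows, not the real data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE netflix (show_id TEXT, director TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO netflix VALUES (?, ?, ?)",
    [
        ("s1", "Dir A", "India"),
        ("s2", "Dir A", None),  # can be filled from s1
        ("s3", "Dir B", None),  # no other title by Dir B
    ],
)

# Step 1: borrow the country from another title by the same director.
# MIN() is an assumed tie-break; rows with no donor stay NULL.
conn.execute(
    """
    UPDATE netflix
    SET country = (
        SELECT MIN(nt2.country)
        FROM netflix AS nt2
        WHERE nt2.director = netflix.director
          AND nt2.show_id <> netflix.show_id
          AND nt2.country IS NOT NULL
    )
    WHERE country IS NULL
    """
)

# Step 2: label whatever is still missing.
conn.execute("UPDATE netflix SET country = 'Not Given' WHERE country IS NULL")

rows = conn.execute(
    "SELECT show_id, country FROM netflix ORDER BY show_id"
).fetchall()
print(rows)
```

Filtering out donor rows with a NULL country matters: without it, a row could be "filled" with another NULL and the first step would silently do nothing for that title.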
    

    Only 10 of the more than 8,000 rows have a null date_added, so deleting them should not noticeably affect our analysis or visualization.

    --Show date_added nulls
    
    SELECT show_id, date_added
    FROM netflix_clean
    WHERE date_added IS NULL;
    
    --DELETE nulls
    
    DELETE F...
    
  17. Set Visualization Tools Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Dataintelo (2025). Set Visualization Tools Market Research Report 2033 [Dataset]. https://dataintelo.com/report/set-visualization-tools-market
    Explore at:
    Available download formats: csv, pdf, pptx
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Set Visualization Tools Market Outlook



    According to our latest research, the global set visualization tools market size reached USD 3.6 billion in 2024, with a robust year-over-year growth driven by the surging demand for advanced data analysis and visualization solutions across industries. The market is projected to expand at a CAGR of 11.7% from 2025 to 2033, reaching a forecasted value of USD 10.1 billion by 2033. This remarkable growth trajectory is primarily attributed to the increasing adoption of big data analytics, artificial intelligence, and digital transformation initiatives among enterprises, government bodies, and academic institutions worldwide.
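
The relationship between a starting value, a CAGR, and an end value is plain compound-growth arithmetic, so the quoted figures can be checked by the reader. The sketch below is illustrative only: the function names are hypothetical, and treating 2024 as the base year with nine compounding periods to 2033 is an assumption, since the description does not state its convention.

```python
def project_value(start_value, cagr, years):
    """Value after compounding `start_value` at rate `cagr` for `years` periods."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that turns `start_value` into `end_value`."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the description (USD billion); the 9-year span is an assumption.
projected_2033 = project_value(3.6, 0.117, 9)
rate_2024_to_2033 = implied_cagr(3.6, 10.1, 9)
print(f"{projected_2033:.2f}", f"{rate_2024_to_2033:.3%}")
```

Changing the number of periods shows how sensitive the end value is to the choice of base year.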




    One of the primary growth factors for the set visualization tools market is the escalating volume, velocity, and variety of data generated across sectors such as business intelligence, scientific research, and education. Organizations are increasingly recognizing the value of transforming complex, multidimensional datasets into intuitive, interactive visual representations to facilitate better decision-making, uncover hidden insights, and drive operational efficiency. The proliferation of IoT devices, cloud computing, and advanced analytics platforms has further amplified the need for sophisticated set visualization tools that can seamlessly integrate with existing data ecosystems, enabling users to analyze relationships, intersections, and trends within large, heterogeneous datasets.




    Another significant driver propelling the market growth is the rapid digitalization of enterprises and the growing emphasis on data-driven strategies. Businesses are leveraging set visualization tools to enhance their business intelligence capabilities, monitor key performance indicators, and gain a competitive edge in an increasingly data-centric landscape. These tools empower organizations to visualize overlaps, gaps, and anomalies in data sets, supporting functions such as market segmentation, customer profiling, and risk management. As companies continue to invest in advanced analytics and visualization solutions, the demand for customizable, scalable, and user-friendly set visualization platforms is poised to witness sustained growth throughout the forecast period.




    Furthermore, the integration of artificial intelligence and machine learning algorithms into set visualization tools is revolutionizing the market, enabling automated pattern recognition, predictive analytics, and real-time data exploration. This technological evolution is not only enhancing the accuracy and efficiency of data analysis but also democratizing access to complex analytical capabilities for non-technical users. The growing focus on enhancing user experience, interoperability, and cross-platform compatibility is fostering innovation and differentiation among solution providers, further accelerating market expansion. Additionally, the increasing adoption of remote and hybrid work models is driving demand for cloud-based visualization tools that offer flexibility, scalability, and collaborative features.




    From a regional perspective, North America currently dominates the set visualization tools market, accounting for the largest revenue share in 2024, followed closely by Europe and Asia Pacific. The strong presence of leading technology vendors, high digital adoption rates, and significant investments in data analytics infrastructure are key factors underpinning North America's leadership. Meanwhile, Asia Pacific is emerging as the fastest-growing region, fueled by rapid digital transformation, expanding enterprise IT budgets, and a burgeoning ecosystem of startups and academic institutions. As organizations across all regions continue to prioritize data-driven decision-making, the global set visualization tools market is expected to maintain its upward momentum over the coming years.



    Component Analysis



    The set visualization tools market by component is primarily segmented into software and services, each playing a pivotal role in the overall ecosystem. Software solutions dominate the market, driven by the continuous evolution of visualization platforms that offer advanced features such as dynamic dashboards, drag-and-drop interfaces, and integration with diverse data sources. Vendors are focusing on enhancing the scalability, security, and customization capabilities of their software offerings to cater to the unique requirements of various industries. The growing trend of self-service analytics is further boo

  18. Data from: Multivariate Outliers and the O3 Plot

    • figshare.com
    • tandf.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Antony Unwin (2023). Multivariate Outliers and the O3 Plot [Dataset]. http://doi.org/10.6084/m9.figshare.7792115.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Antony Unwin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Identifying and dealing with outliers is an important part of data analysis. A new visualization, the O3 plot, is introduced to aid in the display and understanding of patterns of multivariate outliers. It uses the results of identifying outliers for every possible combination of dataset variables to provide insight into why particular cases are outliers. The O3 plot can be used to compare the results from up to six different outlier identification methods. There is an R package OutliersO3 implementing the plot. The article is illustrated with outlier analyses of German demographic and economic data. Supplementary materials for this article are available online.
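
The idea of identifying outliers for every possible combination of variables can be illustrated with a small sketch. It is not the OutliersO3 package or any of the methods compared in the article: as a stand-in, a case is flagged when the root-mean-square of its z-scores over the chosen variables exceeds a threshold. The function name, the rule, and the toy data are all assumptions.

```python
from itertools import combinations
from statistics import mean, pstdev


def outliers_by_combination(rows, names, threshold=2.0):
    """Map each variable combination to the indices of rows flagged as outliers."""
    # Standardise every variable (column) to z-scores.
    z_cols = []
    for col in zip(*rows):
        mu, sd = mean(col), pstdev(col)
        z_cols.append([(v - mu) / sd if sd else 0.0 for v in col])

    result = {}
    for k in range(1, len(names) + 1):
        for combo in combinations(range(len(names)), k):
            # Stand-in rule: root-mean-square z-score over the combination.
            flagged = [
                i
                for i in range(len(rows))
                if (sum(z_cols[j][i] ** 2 for j in combo) / len(combo)) ** 0.5
                > threshold
            ]
            result[tuple(names[j] for j in combo)] = flagged
    return result


# Toy data: the first case is extreme in x only.
rows = [(100, 1)] + [(0, y) for y in range(2, 11)]
flags = outliers_by_combination(rows, ["x", "y"])
for combo, idx in flags.items():
    print(combo, idx)
```

Reading the result by case rather than by combination shows which variables make a case an outlier, which is the kind of insight the description attributes to the O3 plot. The number of combinations grows as 2^p - 1 for p variables, so this brute-force loop only suits a handful of variables.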

  19. Data from: Data products for visualizing of past, current, and alternate...

    • research.usc.edu.au
    • researchdata.edu.au
    zip
    Updated Sep 14, 2021
    Cite
    Sanjeev K Srivastava; Gary Scott; Jo Rosier (2021). Data products for visualizing of past, current, and alternate scenarios for an ecologically sensitive coastal spit at a local scale [Dataset]. https://research.usc.edu.au/esploro/outputs/dataset/Data-products-for-visualizing-of-past/99450756102621
    Explore at:
    Available download formats: zip (1175901733 bytes), zip (92133340 bytes)
    Dataset updated
    Sep 14, 2021
    Dataset provided by
    University of the Sunshine Coast
    Authors
    Sanjeev K Srivastava; Gary Scott; Jo Rosier
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2018
    Description

    This study presents data products to visualize past, current and alternate scenarios for an ecologically sensitive and development-prone area in a sub-tropical coastal spit. Data products are created using a diverse range of geodesign tools that include existing and archived high-resolution active and passive remote sensing datasets, and existing, derived, and digitized spatial layers, together with procedural modelling. The final products include 3D and interactive CityEngine Webscene files and fly-throughs in a generic movie format. While the fly-through movies can be played on standard digital devices, the CityEngine Webscenes, once uploaded to the ArcGIS website, require an Internet-ready device for visualization and interaction.

  20. Data from: Visualization of Molecular Fingerprints

    • acs.figshare.com
    zip
    Updated May 30, 2023
    Cite
    John R. Owen; Ian T. Nabney; José L. Medina-Franco; Fabian López-Vallejo (2023). Visualization of Molecular Fingerprints [Dataset]. http://doi.org/10.1021/ci1004042.s002
    Explore at:
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    ACS Publications
    Authors
    John R. Owen; Ian T. Nabney; José L. Medina-Franco; Fabian López-Vallejo
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    A visualization plot of a data set of molecular data is a useful tool for gaining insight into a set of molecules. In chemoinformatics, most visualization plots are of molecular descriptors, and the statistical model most often used to produce a visualization is principal component analysis (PCA). This paper takes PCA, together with four other statistical models (NeuroScale, GTM, LTM, and LTM-LIN), and evaluates their ability to produce clustering in visualizations not of molecular descriptors but of molecular fingerprints. Two different tasks are addressed: understanding structural information (particularly combinatorial libraries) and relating structure to activity. The quality of the visualizations is compared both subjectively (by visual inspection) and objectively (with global distance comparisons and local k-nearest-neighbor predictors). On the data sets used to evaluate clustering by structure, LTM is found to perform significantly better than the other models. In particular, the clusters in LTM visualization space are consistent with the relationships between the core scaffolds that define the combinatorial sublibraries. On the data sets used to evaluate clustering by activity, LTM again gives the best performance but by a smaller margin. The results of this paper demonstrate the value of using both a nonlinear projection map and a Bernoulli noise model for modeling binary data.

Customer Sale Dataset for Data Visualization

A clean, beginner-friendly dataset with date, numeric, and categorical features

