U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Existing scientific visualization tools have specific limitations for large-scale scientific data sets. Of these, four limitations can be seen as paramount: (i) memory management, (ii) remote visualization, (iii) interactivity, and (iv) specificity. In Phase I, we proposed and successfully developed a prototype of a collection of computer tools and libraries called SciViz that overcome these limitations and enable researchers to visualize large-scale data sets (greater than 200 gigabytes) on HPC resources, remotely from their workstations, at interactive rates. A key element of our technology is its stack-oriented, rather than framework-driven, approach, which allows it to interoperate with common existing scientific visualization software, thereby eliminating the need for the user to switch to and learn new software. The result is a versatile 3D visualization capability that will significantly decrease the time to knowledge discovery from large, complex data sets.
Typical visualization activity can be organized into a simple stack of steps that leads to the visualization result. These steps can broadly be classified into data retrieval, data analysis, visual representation, and rendering. Our approach is to continue with the technique selected in Phase I of utilizing existing visualization tools at each point in the visualization stack, while developing specific tools that address the core limitations identified and integrating them seamlessly into the stack. Specifically, we intend to accomplish technical objectives in four areas that will complete the development of tools for interactive visualization of very large data sets in each layer of the visualization stack. These four areas are: Feature Objectives, C++ Conversion and Optimization, Testing Objectives, and Domain Specifics and Integration. The technology will be developed and tested at NASA and the San Diego Supercomputer Center.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially, this involves a step-by-step analysis of the small Iris dataset built into R, which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge includes basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets.
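The Iris workflow described above (summary statistics, correlations, histograms, scatter plots) takes only a handful of commands. The case study itself uses R; purely as an illustration, here is a minimal sketch of the same steps in Python, assuming pandas, seaborn, and matplotlib are available and using seaborn's bundled copy of the iris data:

```python
# Illustrative only: the case study uses R; this is the equivalent workflow in Python.
import seaborn as sns
import matplotlib.pyplot as plt

iris = sns.load_dataset("iris")              # sepal/petal measurements for three species

print(iris.describe())                        # summary statistics for each numeric column
print(iris.drop(columns="species").corr())    # correlation matrix of the numeric columns

iris["sepal_length"].hist(bins=20)            # histogram of one variable
plt.xlabel("sepal length (cm)")
plt.show()

sns.scatterplot(data=iris, x="petal_length", y="petal_width", hue="species")
plt.show()
```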
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please also see the latest version of the repository.
The explosion in the volume of biological imaging data challenges the available technologies for data interrogation and its intersection with related published bioinformatics data sets. Moreover, intersecting highly rich and complex datasets from different sources, provided as flat csv files, requires advanced informatics skills, which is time-consuming and not accessible to all. Here, we provide a “user manual” for our new paradigm for systematically filtering and analysing a dataset with more than 1300 microscopy data figures using Multi-Dimensional Viewer (MDV) (link), a solution for interactive multimodal data visualisation and exploration. The primary data we use are derived from our published systematic analysis of 200 YFP traps, which reveals common discordance between mRNA and protein across the nervous system (eprint link). This manual provides the raw image data together with the expert annotations of the mRNA and protein distribution, as well as the associated bioinformatics data. We provide an explanation, with specific examples, of how to use MDV to make the multiple data types interoperable and explore them together. We also provide the open-source Python code (github link) used to annotate the figures, which could be adapted to any other kind of data annotation task.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ParaView is an open-source, multi-platform data analysis and visualization application. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. Data exploration can be done interactively in 3D or programmatically using ParaView’s batch processing capabilities. ParaView was developed to analyze extremely large datasets using distributed-memory computing resources. It can be run on supercomputers to analyze datasets of petascale size as well as on laptops for smaller data; it has become an integral tool at many national laboratories, universities, and companies, and has won several awards related to high-performance computation.
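The batch processing capability mentioned above is exposed through ParaView's Python interface (pvpython/pvbatch). A minimal sketch of a scripted pipeline, using standard paraview.simple calls, might look like the following; the input file name and the "pressure" array name are placeholders, not part of any real dataset:

```python
# Minimal pvbatch-style sketch; "data.vtk" and the "pressure" array are placeholders.
from paraview.simple import *

reader = OpenDataFile("data.vtk")            # read the dataset
contour = Contour(Input=reader)              # add a contour filter to the pipeline
contour.ContourBy = ["POINTS", "pressure"]   # assumed point-data array name
contour.Isosurfaces = [0.5]                  # isovalue(s) to extract

Show(contour)                                # add the filter output to the view
Render()                                     # render the scene (offscreen in batch mode)
SaveScreenshot("contour.png")                # write the rendered image to disk
```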
https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.15454/AGU4QE
WIDEa is R-based software aiming to provide users with a range of functionalities to explore, manage, clean, and analyse "big" environmental and (in/ex situ) experimental data. These functionalities are the following: 1. loading/reading different data types: basic (called normal), temporal, and infrared spectra of the mid/near region (called IR), with frequency (wavenumber) used as the unit (in cm-1); 2. interactive data visualization from a multitude of graph representations: 2D/3D scatter-plot, box-plot, hist-plot, bar-plot, correlation matrix; 3. manipulation of variables: concatenation of qualitative variables, transformation of quantitative variables by generic functions in R; 4. application of mathematical/statistical methods; 5. creation/management of data (named flag data) considered atypical; 6. study of normal distribution model results for different strategies: calibration (checking assumptions on residuals) and validation (comparison between measured and fitted values). The model form can be more or less complex: mixed effects, main/interaction effects, weighted residuals.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Today, people around the world live and work within the reach of Geographic Information System (GIS) applications and services such as Google Earth, GPS, and many more. Big Data visualization tools are increasingly transforming the world of GIS. GIS has diverse applications, from geo-positioning services to 3D demonstrations and virtual reality, and Big Data and its visualization tools have boosted the field. This article explores how Big Data visualization has expanded the field of geospatial analysis, with the intention of presenting practicable GIS-based tools required to stay ahead in this field.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: In recent years, numerous AI tools have been employed to equip learners with diverse technical skills such as coding, data analysis, and other competencies related to computational sciences. However, the desired outcomes have not been consistently achieved. This study aims to analyze the perspectives of students and professionals from non-computational fields on the use of generative AI tools, augmented with visualization support, to tackle data analytics projects. The focus is on promoting the development of coding skills and fostering a deep understanding of the solutions generated. Consequently, our research seeks to introduce innovative approaches for incorporating visualization and generative AI tools into educational practices. Methods: This article examines how learners perform, and what perspectives they hold, when using traditional tools versus LLM-based tools to acquire data analytics skills. To explore this, we conducted a case study with a cohort of 59 participants, students and professionals without computational thinking skills, who developed a data analytics project in the context of a short Data Analytics session. Our case study examined the participants' performance using traditional programming tools, ChatGPT, and LIDA with GPT as an advanced generative AI tool. Results: The results show the transformative potential of approaches that integrate advanced generative AI tools like GPT with specialized frameworks such as LIDA. The higher levels of participant preference indicate the superiority of these approaches over traditional development methods. Our findings also suggest that the learning curves for the different approaches vary significantly, since learners encountered technical difficulties in developing the project and interpreting the results. The integration of LIDA with GPT can significantly enhance the learning of advanced skills, especially those related to data analytics. We aim to establish this study as a foundation for the methodical adoption of generative AI tools in educational settings, paving the way for more effective and comprehensive training in these critical areas. Discussion: It is important to highlight that when using general-purpose generative AI tools such as ChatGPT, users must be aware of the data analytics process and take responsibility for filtering out potential errors or incompleteness in the requirements of a data analytics project. These deficiencies can be mitigated by using more advanced tools specialized in supporting data analytics tasks, such as LIDA with GPT; however, users still need advanced programming knowledge to properly configure this connection via an API. There is a significant opportunity for generative AI tools to improve their performance, providing accurate, complete, and convincing results for data analytics projects and thereby increasing user confidence in adopting these technologies. We hope this work underscores the opportunities and needs for integrating advanced LLMs into educational practices, particularly in developing computational thinking skills.
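As a concrete illustration of the kind of API configuration the discussion refers to, here is a minimal, hypothetical sketch of wiring a general-purpose LLM into a data-analytics step via the OpenAI Python client; the dataset name, prompt, and model name are placeholders rather than details from the study, and, as the study stresses, the returned code must be reviewed by the user before it is run:

```python
# Hypothetical sketch: asking an LLM to draft visualization code for a dataset.
# The user remains responsible for checking the generated code for errors before running it.
from openai import OpenAI
import pandas as pd

df = pd.read_csv("sales.csv")               # placeholder dataset
prompt = (
    "Write matplotlib code that plots monthly revenue from a DataFrame "
    f"with columns {list(df.columns)}. Return only Python code."
)

client = OpenAI()                           # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",                    # assumed model name
    messages=[{"role": "user", "content": prompt}],
)
generated_code = response.choices[0].message.content
print(generated_code)                       # review before executing (e.g. with exec())
```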
The Data Visualization Workshop II: Data Wrangling was a web-based event held on October 18, 2017. This workshop report summarizes the individual perspectives of a group of visualization experts from the public, private, and academic sectors who met online to discuss how to improve the creation and use of high-quality visualizations. The specific focus of this workshop was on the complexities of "data wrangling": finding appropriate data sources that are both accessible and usable, and then shaping and combining that data to facilitate the most accurate and meaningful analysis possible. The workshop was organized as a 3-hour web event and moderated by members of the Human Computer Interaction and Information Management Task Force of the Networking and Information Technology Research and Development Program's Big Data Interagency Working Group. Report prepared by the Human Computer Interaction and Information Management Task Force, Big Data Interagency Working Group, Networking & Information Technology Research & Development Subcommittee, Committee on Technology of the National Science & Technology Council.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A diverse selection of 1000 empirical time series, along with the results of an hctsa feature extraction, using v1.06 of hctsa and Matlab 2019b, computed on a server at The University of Sydney. The results of the computation are in the hctsa file HCTSA_Empirical1000.mat, for use in Matlab with v1.06 of hctsa. The same data are also provided in .csv format: hctsa_datamatrix.csv contains the results of the feature computation, with information about rows (time series) in hctsa_timeseries-info.csv, information about columns (features) in hctsa_features.csv (and the corresponding hctsa code used to compute each feature in hctsa_masterfeatures.csv); the data of the individual time series (one time series per line, as described in hctsa_timeseries-info.csv) are in hctsa_timeseries-data.csv. These .csv files were produced by running >> OutputToCSV(HCTSA_Empirical1000.mat,true,true); in hctsa. The input file, INP_Empirical1000.mat, is for use with hctsa and contains the time-series data and metadata for the 1000 time series. For example, massive feature extraction from these data on the user's machine, using hctsa, can proceed as >> TS_Init('INP_Empirical1000.mat');. Some visualizations of the dataset are in CarpetPlot.png (the first 1000 samples of all time series as a carpet (color) plot) and 150TS-250samples.png (conventional time-series plots of the first 250 samples of a sample of 150 time series from the dataset). More visualizations can be produced by the user with TS_PlotTimeSeries from the hctsa package. See the links in the references for more comprehensive documentation on performing methodological comparison using this dataset, and on how to download and use v1.06 of hctsa.
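For users who prefer not to work in Matlab, the .csv exports listed above can be explored directly. A minimal pandas sketch is given below; the exact header layout of the exported files is an assumption and may need adjusting:

```python
# Sketch for exploring the hctsa .csv exports with pandas.
# The header layout of the exports is assumed; adjust header=/names= as needed.
import pandas as pd
import matplotlib.pyplot as plt

X = pd.read_csv("hctsa_datamatrix.csv", header=None)   # feature matrix: time series x features
ts_info = pd.read_csv("hctsa_timeseries-info.csv")      # metadata: one row per time series
features = pd.read_csv("hctsa_features.csv")             # metadata: one row per feature

print(X.shape)                    # expected: (1000, number of computed features)
print(ts_info.head())
print(features.head())

# Simple check: distribution of one (arbitrary) feature across the 1000 time series
X.iloc[:, 0].hist(bins=50)
plt.show()
```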
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To create the dataset, the top 10 countries leading in the incidence of COVID-19 in the world were selected as of October 22, 2020 (on the eve of the second wave of the pandemic), which are presented in the Global 500 ranking for 2020: USA, India, Brazil, Russia, Spain, France, and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 rating were selected, separately for 2020 and 2019. Arithmetic averages were calculated, along with the change (increase) in indicators such as the profit and profitability of enterprises, their ranking position (competitiveness), asset value, and number of employees. The arithmetic mean values of these indicators across all countries in the sample were then found, characterizing the situation in international entrepreneurship as a whole in the context of the COVID-19 crisis in 2020, on the eve of the second wave of the pandemic. The data are collected in a single Microsoft Excel table. The dataset is a unique database that combines COVID-19 statistics and entrepreneurship statistics, and it is flexible: it can be supplemented with data from other countries and newer statistics on the COVID-19 pandemic. Because the dataset contains formulas rather than ready-made numbers, when values in the original table at the beginning of the dataset are added and/or changed, most of the subsequent tables are automatically recalculated and the graphs are updated. This allows the dataset to be used not just as an array of data, but as an analytical tool for automating scientific research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes not only tabular data but also charts that provide data visualization. It contains not only actual but also forecast data on morbidity and mortality from COVID-19 for the period of the second wave of the pandemic in 2020. The forecasts are presented in the form of a normal distribution of predicted values and the probability of their occurrence in practice. This allows for a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship: various predicted morbidity and mortality rates can be substituted into the risk assessment tables to obtain automatically calculated consequences (changes) for the characteristics of international entrepreneurship. It is also possible to substitute the actual values identified during and following the second wave of the pandemic to check the reliability of the pre-made forecasts and conduct a plan-versus-actual analysis. The dataset contains not only the numerical values of the initial and predicted values of the set of studied indicators, but also their qualitative interpretation, reflecting the presence and level of risks of the pandemic and COVID-19 crisis for international entrepreneurship.
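The scenario analysis described above is implemented as Excel formulas in the workbook itself. Purely as an illustration of the underlying normal-distribution calculation, a sketch with placeholder forecast parameters (not values taken from the dataset) could be:

```python
# Illustrative only: the dataset implements this logic as Excel formulas.
# The mean and standard deviation below are placeholder values, not from the dataset.
from scipy.stats import norm

mean_cases = 60_000     # assumed forecast daily incidence
sd_cases = 12_000       # assumed forecast standard deviation

scenario = 75_000       # a pessimistic scenario to test
prob_exceed = 1 - norm.cdf(scenario, loc=mean_cases, scale=sd_cases)
print(f"Probability that incidence exceeds {scenario}: {prob_exceed:.1%}")
```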
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Petascale computing is leading to significant breakthroughs in a number of fields and is revolutionizing the way science is conducted. Data is not knowledge, however, and the challenge has been how to analyze and gain insight from the massive quantities of data that are generated. In order to address the petascale visualization challenges, we propose to develop scientific visualization software that would enable real-time visualization of extremely large data sets. We plan to accomplish this by extending the ParaView visualization architecture to extreme scales. ParaView is open-source software installed at HPC sites including NASA's Pleiades and has a large user base in diverse areas of science and engineering. Our proposed solution will significantly enhance the scientific return from NASA HPC investments by providing the next generation of open-source data analysis and visualization tools for very large datasets. To test our solution on real-world data with a complex pipeline, we have partnered with SciberQuest, who have recently performed the largest kinetic simulations of the magnetosphere, using 25K cores on Pleiades and 100K cores on Kraken. Given that I/O is the main bottleneck for scientific visualization at large scales, we propose to work closely with the Pleiades systems team and provide an efficient, prepackaged, general-purpose I/O component for ParaView for structured and unstructured data across a spectrum of scales and access patterns, with a focus on the Lustre file system used by Pleiades.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Presentation Date: Friday, March 15, 2019. Location: Barnstable, MA. Abstract: A presentation to a crowd of Barnstable High "AstroJunkies" about how we use physics, statistics, and visualizations to turn information from large, public, astronomical data sets across many wavelengths into a better understanding of the structure of the Milky Way.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository accompanying the article “DEVILS: a tool for the visualization of large datasets with a high dynamic range” contains the following:
Extended Material of the article
An example raw dataset corresponding to the images shown in Fig. 3
A workflow description that demonstrates the use of the DEVILS workflow with BigStitcher.
Two scripts (“CLAHE_Parameters_test.ijm” and “DEVILS_Parallel_tests.groovy”) used for Figures S2, S3, and S4.
https://dataintelo.com/privacy-and-policy
The global data visualization market size was valued at approximately USD 6.5 billion in 2023 and is projected to reach USD 19.8 billion by 2032, growing at a robust CAGR of 12.8% during the forecast period. This impressive growth can be attributed to the escalating need for organizations to make data-driven decisions, the proliferation of big data, and the increasing adoption of advanced analytics tools.
One of the primary growth factors driving the data visualization market is the exponential increase in data generation across various industries. With the advent of IoT, social media proliferation, and digital transformation, organizations are inundated with vast amounts of data. The need to interpret this data to derive meaningful insights has never been greater. Data visualization tools enable businesses to transform raw data into graphical representations, facilitating easier understanding and more informed decision-making.
Another significant growth driver is the increasing adoption of business intelligence (BI) and analytics solutions. Enterprises are progressively recognizing the value of BI tools in gaining competitive advantages. Data visualization is a critical component of these BI platforms, providing interactive and dynamic representations of data that can be manipulated to uncover trends, patterns, and correlations. This ability to visualize complex data sets enhances strategic planning and operational efficiencies.
The rising demand for personalized customer experiences is also contributing to market growth. In sectors like retail, BFSI, and healthcare, understanding customer behavior and preferences is paramount. Data visualization tools help organizations analyze customer data in real-time, enabling them to tailor offerings and improve customer engagement. The ability to visualize data in an intuitive manner accelerates the speed at which businesses can respond to market changes and customer needs.
Marketing Dashboards have become an essential tool for businesses seeking to optimize their marketing strategies through data visualization. These dashboards provide a comprehensive view of marketing performance by aggregating data from various sources such as social media, email campaigns, and web analytics. By presenting this data in an easily digestible format, marketing teams can quickly identify trends, track campaign effectiveness, and make informed decisions to enhance their marketing efforts. The ability to customize these dashboards allows organizations to focus on key performance indicators that are most relevant to their objectives, ultimately leading to more targeted and successful marketing initiatives.
From a regional perspective, North America holds a significant share of the data visualization market, driven by the presence of major technology providers and high adoption rates of advanced analytics tools. However, the Asia Pacific region is expected to witness the highest growth rate during the forecast period. This growth is fueled by the increasing digitalization initiatives, rising investments in IT infrastructure, and the growing awareness of data-driven decision-making in emerging economies such as India and China.
The data visualization market comprises two primary components: software and services. The software segment is further categorized into standalone visualization tools and integrated data visualization solutions. Standalone visualization tools are designed specifically for data visualization purposes, offering features such as interactive dashboards, real-time analytics, and customizable visualizations. Integrated solutions, on the other hand, are part of larger business intelligence or analytics platforms, providing seamless integration with other data management and analysis tools.
The services segment includes consulting, implementation, and support services. Consulting services help organizations identify the right data visualization tools and strategies to meet their specific needs. Implementation services ensure the successful deployment and integration of visualization solutions within the existing IT infrastructure. Support services provide ongoing maintenance, updates, and troubleshooting to ensure the smooth functioning of the data visualization tools.
Within the software segment, the demand for cloud-based data visualization solutions is growing rapidly.
https://darus.uni-stuttgart.de/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18419/DARUS-3884
Understanding the link between visual attention and users’ needs when visually exploring information visualisations is under-explored, due to a lack of large and diverse datasets to facilitate these analyses. To fill this gap, we introduce SalChartQA, a novel crowd-sourced dataset that uses the BubbleView interface as a proxy for human gaze and a question-answering (QA) paradigm to induce different information needs in users. SalChartQA contains 74,340 answers to 6,000 questions on 3,000 visualisations. Informed by our analyses demonstrating the tight correlation between the question and visual saliency, we propose the first computational method to predict question-driven saliency on information visualisations. Our method outperforms state-of-the-art saliency models, improving several metrics, such as the correlation coefficient and the Kullback-Leibler divergence. These results show the importance of information needs for shaping attention behaviour and pave the way for new applications, such as task-driven optimisation of visualisations or explainable AI in chart question-answering. The files of this dataset are documented in README.md.
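The two metrics quoted above are standard saliency-evaluation measures. A minimal numpy sketch of how a predicted saliency map might be scored against a ground-truth map (both assumed to be non-negative 2D arrays of the same shape; the random maps below are placeholders, not dataset values) is:

```python
# Sketch of the correlation coefficient (CC) and KL divergence metrics for saliency maps.
import numpy as np

def cc(pred: np.ndarray, gt: np.ndarray) -> float:
    """Pearson correlation between the flattened maps."""
    return float(np.corrcoef(pred.ravel(), gt.ravel())[0, 1])

def kl_divergence(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-12) -> float:
    """KL(gt || pred) after normalising both maps to probability distributions."""
    p = gt.ravel() / (gt.sum() + eps)
    q = pred.ravel() / (pred.sum() + eps)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

rng = np.random.default_rng(0)
pred, gt = rng.random((64, 64)), rng.random((64, 64))   # placeholder maps
print(cc(pred, gt), kl_divergence(pred, gt))
```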
https://www.verifiedmarketresearch.com/privacy-policy/
The Data Visualization Software Market has been growing at a moderate pace with substantial growth rates over the last few years, and it is estimated that the market will grow significantly in the forecast period, i.e., 2024 to 2031.
Global Data Visualization Software Market Drivers
The Data Visualization Software Market is influenced by various driving factors. These may include:
Growing Need for Analytics and Big Data: Businesses are producing enormous volumes of data, which they must analyse and visualise with the help of effective technologies to gain insights that can be put to use.
Expanding Use of Business Intelligence (BI) Products: The need for data visualisation software is being driven by the increasing use of BI tools by businesses to improve their decision-making procedures.
Increase in Cloud-Based Solution Adoption: Businesses of all sizes are adopting cloud-based data visualisation solutions due to their affordability, scalability, and flexibility.
IoT Device Proliferation: Massive data volumes are being generated by the proliferation of IoT devices, necessitating the use of sophisticated visualisation tools for efficient analysis and interpretation.
Increased Requirement for Data Analysis in Real Time: The market for data visualisation software is being driven by the need for real-time data analysis to support dynamic corporate settings and quick decision-making.
Increasing Application of Machine Learning (ML) and Artificial Intelligence (AI): When AI and ML are combined with data visualisation tools, it becomes easier to analyse large, complicated data sets, forecast trends, and offer deeper insights.
Increase in Business Strategies Driven by Data: In order to remain competitive, businesses are increasingly embracing data-driven strategies, which is driving up demand for advanced data visualisation tools.
Rising Awareness of the Advantages of Data Visualisation: The growth of data visualisation is being driven by a growing understanding of its benefits, which include enhanced communication and data comprehension across a range of industries.
Improvements in Tools and Techniques for Visualisation: Market expansion is being driven by ongoing advancements in visualisation technologies, as well as the creation of more feature-rich and user-friendly tools.
Widening the Scope of Applications: The market is growing as a result of the expanding application of data visualisation in industries such as healthcare, finance, retail, and education.
https://creativecommons.org/publicdomain/zero/1.0/
Data visualization is the graphical representation of information and data. By using visual elements like charts, graphs, and maps, data visualization tools provide an accessible way to see and understand trends, outliers, and patterns in data.
In the world of Big Data, data visualization tools and technologies are essential to analyze massive amounts of information and make data-driven decisions.
32 cheat sheets: These cover A-to-Z techniques and tricks that can be used for visualization, Python and R visualization cheat sheets, types of charts and their significance, storytelling with data, etc.
32 charts: The corpus also contains a significant amount of information on data visualization charts, along with their Python code, d3.js code, and presentations relating to the respective charts, explained in a clear manner!
Some recommended data visualization books every data scientist should read:
If you find any books, cheat sheets, or charts missing, or would like to suggest new documents, please let me know in the discussion section!
A kind request to Kaggle users: please create notebooks on different visualization charts, as per your interest, using a dataset of your own; many beginners and experts could find them useful!
Create interactive EDA using animation with a combination of data visualization charts, to give an idea of how to tackle data and extract insights from it (see the sketch after this description).
Feel free to use the discussion platform of this dataset to ask questions or raise any queries related to the data visualization corpus and data visualization techniques.
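For the animated, interactive EDA suggested above, one common route is plotly's animation_frame argument. A minimal sketch using plotly's bundled gapminder sample (any dataset with a time-like column works the same way) is:

```python
# Sketch of an animated, interactive scatter plot for exploratory analysis.
import plotly.express as px

df = px.data.gapminder()                 # bundled sample data with a 'year' column
fig = px.scatter(
    df, x="gdpPercap", y="lifeExp",
    size="pop", color="continent",
    animation_frame="year",              # one frame per year drives the animation
    hover_name="country", log_x=True,
)
fig.show()
```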
Through a human factors study, we evaluated the use of contour and glyph visualizations for two modern power systems models: an urban distribution model and a large-scale transmission model. This dataset provides model data and scripts for recreating the power flow data and visualizations used in the study, in addition to the study results and statistical analysis.
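To make the two visual encodings concrete, here is a minimal matplotlib sketch, on a synthetic 2D field rather than the study's power-system data, of a contour view and a glyph (arrow) view of the same quantity:

```python
# Illustrative contour vs. glyph (quiver) rendering of a synthetic 2D field;
# the study itself uses power-system model data, not this toy example.
import numpy as np
import matplotlib.pyplot as plt

x, y = np.meshgrid(np.linspace(-2, 2, 25), np.linspace(-2, 2, 25))
v = np.exp(-(x**2 + y**2))              # scalar field, e.g. a voltage-like quantity
dvy, dvx = np.gradient(v)               # numerical gradient of the scalar field

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
ax1.contourf(x, y, v, levels=15)        # contour visualization of the scalar field
ax1.set_title("Contour")
ax2.quiver(x, y, dvx, dvy)              # glyph (arrow) visualization of the gradient
ax2.set_title("Glyphs")
plt.show()
```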
The global big data and business analytics (BDA) market was valued at 168.8 billion U.S. dollars in 2018 and is forecast to grow to 215.7 billion U.S. dollars by 2021. In 2021, more than half of BDA spending will go towards services. IT services is projected to make up around 85 billion U.S. dollars, and business services will account for the remainder.
Big data: High volume, high velocity, and high variety: one or more of these characteristics is used to define big data, the kind of data sets that are too large or too complex for traditional data processing applications. Fast-growing mobile data traffic, cloud computing traffic, as well as the rapid development of technologies such as artificial intelligence (AI) and the Internet of Things (IoT), all contribute to the increasing volume and complexity of data sets. For example, connected IoT devices are projected to generate 79.4 ZBs of data in 2025.
Business analytics: Advanced analytics tools, such as predictive analytics and data mining, help to extract value from the data and generate business insights. The size of the business intelligence and analytics software application market is forecast to reach around 16.5 billion U.S. dollars in 2022. Growth in this market is driven by a focus on digital transformation, a demand for data visualization dashboards, and an increased adoption of cloud.