There's a story behind every dataset and here's your opportunity to share yours.
What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too.
We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research.
Your data will be in front of the world's largest data science community. What questions do you want to see answered?
Analyze the closing price of all the stocks. Analyze the total volume of stock traded each day. Analyze the daily price change in each stock. Analyze the monthly mean of the closing price. Analyze whether the stock prices of these tech companies are correlated. Analyze the daily return of each stock and how the returns are correlated. Perform a value-at-risk analysis for the tech companies.
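A minimal pandas sketch of these analyses, assuming a hypothetical CSV of daily closing prices with one column per ticker (the file name and column layout are illustrative, not part of the dataset description):

```python
import pandas as pd

# Hypothetical CSV of daily closing prices, one column per ticker.
prices = pd.read_csv("tech_stock_closes.csv", index_col="Date", parse_dates=True)

daily_change = prices.diff()                      # daily price change
monthly_mean_close = prices.resample("M").mean()  # monthly mean of the close
daily_returns = prices.pct_change().dropna()      # daily percentage returns

price_corr = prices.corr()           # are the closing prices correlated?
return_corr = daily_returns.corr()   # are the daily returns correlated?

# Historical (empirical) value at risk at the 95% level: the daily loss that
# is exceeded only 5% of the time.
var_95 = daily_returns.quantile(0.05)
print(var_95)
```

The same pattern extends to the daily traded volume by reading a volume column instead of the close.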
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This dataset contains clinical and diagnostic features related to Breast Cancer, designed for comprehensive Exploratory Data Analysis (EDA) and subsequent predictive modeling.
It is derived from digitized images of Fine Needle Aspirates (FNA) of breast masses.
The dataset features quantitative measurements, typically calculated from the characteristics of cell nuclei, including:
- Radius
- Texture
- Perimeter
- Area
- Smoothness
- Compactness
- Concavity
- Concave Points
- Symmetry
- Fractal Dimension
These features are provided as mean, standard error, and "worst" (largest) values.
The primary goal of this resource is to support the validation of EDA techniques necessary for clinical data science:
- Data quality assessment (missing values, inconsistencies)
- Feature assessment (distributions, correlations)
- Visualization for diagnostic modeling
The primary target variable is the binary classification of the tissue sample: Malignant vs. Benign.
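This description matches the Wisconsin Diagnostic Breast Cancer data bundled with scikit-learn, so a quick EDA pass can be sketched against that copy (the exact source is an assumption):

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Load the Wisconsin Diagnostic Breast Cancer data (mean / error / worst
# versions of the ten nucleus features, plus a malignant/benign target).
data = load_breast_cancer(as_frame=True)
df = data.frame

# Data quality assessment: missing values and obvious range inconsistencies.
print(df.isna().sum().sum(), "missing values")
print(df.describe().T[["min", "max"]])

# Feature assessment: correlations of the "mean" features.
mean_cols = [c for c in df.columns if c.startswith("mean ")]
print(df[mean_cols].corr().round(2))

# Class balance of the binary target (0 = malignant, 1 = benign in sklearn).
print(df["target"].value_counts())
```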
Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
In Chapter 8 of the thesis, six sonification models are presented as examples for the framework of Model-Based Sonification developed in Chapter 7. Sonification models determine the rendering of the sonification and the possible interactions. The "model in mind" helps the user to interpret the sound with respect to the data.
Data Sonograms use spherical expanding shock waves to excite linear oscillators which are represented by point masses in model space.
File:
- Iris dataset: started in plot (a) at S0 (https://pub.uni-bielefeld.de/download/2920448/2920454), (b) at S1, (c) at S2
- 10d noisy circle dataset: started in plot (c) at S0 (mean) (https://pub.uni-bielefeld.de/download/2920448/2920451), (d) at S1 (edge)
- 10d Gaussian: plot (d), started at S0
- 3 clusters: Example 1
- 3 clusters, invisible columns used as output variables: Example 2 (https://pub.uni-bielefeld.de/download/2920448/2920450)

Description: Data Sonogram sound examples for synthetic datasets and the Iris dataset

Duration: about 5 s
This sonification model explores features of a data distribution by computing the trajectories of test particles which are injected into model space and move according to Newton's laws of motion in a potential given by the dataset.
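A minimal numerical sketch of the particle-trajectory idea just described, assuming a potential built from Gaussian wells centred on the data points; the kernel width, step size, and 2-D toy data are illustrative and not the thesis implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 2))          # toy dataset defining the potential
sigma = 0.5                               # kernel width (assumed)

def force(x):
    # Attractive force from a potential made of Gaussian wells at each data point.
    diff = data - x
    w = np.exp(-np.sum(diff**2, axis=1) / (2 * sigma**2))
    return (w[:, None] * diff).sum(axis=0) / sigma**2

# Integrate Newton's equations for one injected test particle.
x = np.array([2.0, 2.0])                  # injection point
v = np.zeros(2)
dt = 0.01
trajectory = []
for _ in range(2000):
    v += dt * force(x)
    x += dt * v
    trajectory.append(x.copy())

trajectory = np.asarray(trajectory)       # e.g. map kinetic energy to sound
```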
The McMC Sonification Model defines an exploratory process in the domain of a given density p such that the acoustic representation summarizes features of p, in particular its modes.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
Unsupervised exploratory data analysis (EDA) is often the first step in understanding complex data sets. While summary statistics are among the most efficient and convenient tools for exploring and describing sets of data, they are often overlooked in EDA. In this paper, we show multiple case studies that compare the performance, including clustering, of a series of summary statistics in EDA. The summary statistics considered here are pattern recognition entropy (PRE), the mean, standard deviation (STD), 1-norm, range, sum of squares (SSQ), and X4, which are compared with principal component analysis (PCA), multivariate curve resolution (MCR), and/or cluster analysis. PRE and the other summary statistics are direct methods for analyzing data; they are not factor-based approaches. To quantify the performance of summary statistics, we use the concept of the “critical pair,” which is employed in chromatography. The data analyzed here come from different analytical methods. Hyperspectral images, including one of a biological material, are also analyzed. In general, PRE outperforms the other summary statistics, especially in image analysis, although a suite of summary statistics is useful in exploring complex data sets. While PRE results were generally comparable to those from PCA and MCR, PRE is easier to apply. For example, there is no need to determine the number of factors that describe a data set. Finally, we introduce the concept of divided spectrum-PRE (DS-PRE) as a new EDA method. DS-PRE increases the discrimination power of PRE. We also show that DS-PRE can be used to provide the inputs for the k-nearest neighbor (kNN) algorithm. We recommend PRE and DS-PRE as rapid new tools for unsupervised EDA.
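As a rough illustration of applying such row-wise summary statistics to a data matrix, here is a NumPy sketch; the PRE formula used (Shannon entropy of each spectrum rescaled to sum to one) is an assumption about its exact definition, and X4 and the factor-based comparisons are omitted:

```python
import numpy as np

def summary_statistics(X):
    """Row-wise summary statistics for a data matrix X (samples x channels)."""
    # PRE sketched as Shannon entropy of each spectrum normalized to sum to one
    # (an assumption about its definition); the others follow usual definitions.
    p = np.abs(X) / np.abs(X).sum(axis=1, keepdims=True)
    pre = -np.sum(p * np.log(np.where(p > 0, p, 1.0)), axis=1)
    return {
        "PRE": pre,
        "mean": X.mean(axis=1),
        "STD": X.std(axis=1),
        "1-norm": np.abs(X).sum(axis=1),
        "range": X.max(axis=1) - X.min(axis=1),
        "SSQ": np.sum(X**2, axis=1),
    }

# Example: two noisy spectral groups; each statistic yields one value per sample.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 200)), rng.normal(0.5, 1.0, (20, 200))])
stats = summary_statistics(X)
```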
Terms of use: https://paper.erudition.co.in/terms
Question paper solutions for the chapter "Exploratory Data Analytics and Descriptive Statistics" of Data Analytics Skills for Managers, 5th Semester, Bachelor in Business Administration, 2020-2021.
CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
This is a cleaned version of a Netflix movies dataset originally used for exploratory data analysis (EDA). The dataset contains information such as:
Missing values have been handled with appropriate methods (mean or median imputation, or an "unknown" placeholder), and new features such as rating_level and popular have been added for deeper analysis.
The dataset is ready for:
- EDA
- Data visualization
- Machine learning tasks
- Dashboard building
Used in the accompanying notebook
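A minimal pandas sketch of the cleaning and feature-engineering steps described above; the column names, thresholds, and file name are illustrative assumptions, not taken from the dataset itself:

```python
import pandas as pd

# Hypothetical column names; the actual cleaned file may differ.
df = pd.read_csv("netflix_movies.csv")

# Missing-value handling: mean/median for numeric fields, "unknown" for text.
df["rating"] = df["rating"].fillna(df["rating"].mean())
df["duration"] = df["duration"].fillna(df["duration"].median())
df["country"] = df["country"].fillna("unknown")

# Engineered features (bin edges and popularity cutoff are illustrative).
df["rating_level"] = pd.cut(df["rating"],
                            bins=[0, 5, 7, 10],
                            labels=["low", "medium", "high"])
df["popular"] = (df["vote_count"] > df["vote_count"].median()).astype(int)
```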
This is a dataset of online orders placed at a retail business. Each row represents one transaction within an order. The task is to dig into this dataset and surface insights the retail business can use to make strategic decisions. The dataset has 6,000 rows.
Invoice No: The unique number assigned to this particular row/transaction
StockCode: The code of the item purchased
Description: The description of the item purchased
Quantity: The quantity of the item purchased
InvoiceDate: The date on which the item was purchased
UnitPrice: The price at which the item was purchased
CustomerID: The ID of the customer who made this transaction
Country: The country in which this transaction took place
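A short pandas sketch of a first exploratory pass over these columns; the file name and the space-free column spellings are assumptions about the raw export:

```python
import pandas as pd

# Columns assumed to match the listing above (e.g. "InvoiceNo", "InvoiceDate").
orders = pd.read_csv("online_retail_orders.csv", parse_dates=["InvoiceDate"])

orders["Revenue"] = orders["Quantity"] * orders["UnitPrice"]

revenue_by_country = orders.groupby("Country")["Revenue"].sum().sort_values(ascending=False)
top_items = orders.groupby("Description")["Quantity"].sum().nlargest(10)
orders_per_day = orders.set_index("InvoiceDate").resample("D")["InvoiceNo"].nunique()

print(revenue_by_country.head())
print(top_items)
```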
CC0 1.0 (https://spdx.org/licenses/CC0-1.0.html)
read-tv
The main paper is about read-tv, open-source software for longitudinal data visualization. We uploaded sample use-case surgical flow disruption data to highlight read-tv's capabilities. We scrubbed the data of protected health information and uploaded it as a single CSV file. The original data are described below.
Data source
Surgical workflow disruptions, defined as “deviations from the natural progression of an operation thereby potentially compromising the efficiency or safety of care”, provide a window on the systems of work through which it is possible to analyze mismatches between the work demands and the ability of the people to deliver the work. They have been shown to be sensitive to different intraoperative technologies, surgical errors, surgical experience, room layout, checklist implementation and the effectiveness of the supporting team. The significance of flow disruptions lies in their ability to provide a hitherto unavailable perspective on the quality and efficiency of the system. This allows for a systematic, quantitative and replicable assessment of risks in surgical systems, evaluation of interventions to address them, and assessment of the role that technology plays in exacerbation or mitigation.
In 2014, Drs Catchpole and Anger were awarded NIBIB R03 EB017447 to investigate flow disruptions in robotic surgery, which has resulted in the detailed, multi-level analysis of over 4,000 flow disruptions. Direct observation of 89 RAS (robotic-assisted surgery) cases found a mean of 9.62 flow disruptions per hour, which varies across different surgical phases and is predominantly caused by coordination, communication, equipment, and training problems.
Methods
This section does not describe the methods of read-tv software development, which can be found in the associated manuscript from JAMIA Open (JAMIO-2020-0121.R1). It describes the methods involved in the surgical workflow disruption data collection. A curated, PHI-free (protected health information) version of this dataset was used as a use case for this manuscript.
Observer training
Trained human factors researchers conducted each observation following the completion of observer training. The researchers were two full-time research assistants based in the department of surgery at site 3 who visited the other two sites to collect data. Human Factors experts guided and trained each observer in the identification and standardized collection of FDs. The observers were also trained in the basic components of robotic surgery in order to be able to tangibly isolate and describe such disruptive events.
Comprehensive observer training was ensured with both classroom and floor training. Observers were required to review relevant literature, understand general practice guidelines for observing in the OR (e.g., where to stand, what to avoid, who to speak to), and conduct practice observations. The practice observations were broken down into three phases, all performed under the direct supervision of an experienced observer. During phase one, the trainees oriented themselves to the real-time events of both the OR and the general steps in RAS. The trainee was also introduced to the OR staff and any other involved key personnel. During phase two, the trainer and trainee observed three RAS procedures together to practice collecting FDs and become familiar with the data collection tool. Phase three was dedicated to determining inter-rater reliability by having the trainer and trainee simultaneously, yet independently, conduct observations for at least three full RAS procedures. Observers were considered fully trained if, after three full case observations, intra-class correlation coefficients (based on number of observed disruptions per phase) were greater than 0.80, indicating good reliability.
Data collection
Following the completion of training, observers individually conducted observations in the OR. All relevant RAS cases were pre-identified on a monthly basis by scanning the surgical schedule and recording a list of procedures. All procedures observed were conducted with the Da Vinci Xi surgical robot, with the exception of one procedure at Site 2, which was performed with the Si robot. Observers attended those cases that fit within their allotted work hours and schedule. Observers used Microsoft Surface Pro tablets configured with a customized data collection tool developed using Microsoft Excel to collect data. The data collection tool divided procedures into five phases, as opposed to the four phases previously used in similar research, to more clearly distinguish between task demands throughout the procedure. Phases consisted of phase 1 - patient in the room to insufflation, phase 2 - insufflation to surgeon on console (including docking), phase 3 - surgeon on console to surgeon off console, phase 4 - surgeon off console to patient closure, and phase 5 - patient closure to patient leaves the operating room. During each procedure, FDs were recorded into the appropriate phase, and a narrative, timestamp, and classification (based on a robot-specific FD taxonomy) were also recorded.
Each FD was categorized into one of ten categories: communication, coordination, environment, equipment, external factors, other, patient factors, surgical task considerations, training, or unsure. The categorization system is modeled after previous studies, as well as the examples provided for each FD category.
Once in the OR, observers remained as unobtrusive as possible. They stood at an appropriate vantage point in the room without getting in the way of team members. Once an appropriate time presented itself, observers introduced themselves to the circulating nurse and informed them of the reason for their presence. Observers did not directly engage in conversations with operating room staff; however, if a staff member approached them with questions or comments, they would respond.
Data Reduction and PHI (Protected Health Information) Removal
This dataset uses 41 of the aforementioned surgeries. All columns have been removed except disruption type, a numeric timestamp for the number of minutes into the day, and surgical phase. In addition, each surgical case had its initial disruption set to 12 noon (720 minutes).
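A small pandas sketch of that timestamp normalization, assuming hypothetical column names for the case identifier and the minutes-into-day timestamp:

```python
import pandas as pd

# Hypothetical column names; the released CSV may use different labels.
fd = pd.read_csv("flow_disruptions.csv")   # columns: case_id, minutes, type, phase

# Shift each case so that its first disruption occurs at 720 minutes (12 noon),
# preserving the relative timing of all later disruptions within the case.
first = fd.groupby("case_id")["minutes"].transform("min")
fd["minutes"] = fd["minutes"] - first + 720
```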
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
ABSTRACT. The aim of this study was to employ the principal component technique on physiological data and environmental thermohygrometric variables correlated with the detection of clinical and subclinical mastitis in dairy cattle. A total of 24 lactating Girolando cows with different clinical conditions were selected (healthy, and with clinical or subclinical mastitis). The following physiological variables were recorded: udder surface temperature, ST (°C); eyeball temperature, ET (°C); rectum temperature, RT (°C); respiratory frequency, RF (mov. min-1). Thermohygrometric variables included air temperature, AirT (°C), and relative humidity, RU (%). ST was determined by means of thermal images, with four images per animal, on these quarters: front left side (FL), front right side (FR), rear right side (RR) and rear left side (RL), totaling 96 images. Exploratory data analysis was run with the multivariate statistical technique of principal components, comprising nine variables: ST on the FL, FR, RL and RR quarters; ET; RT; RF; AirT and RU. The representative quarters of the animals with clinical and subclinical mastitis showed udder temperatures 8.55 and 2.46 °C higher than those of healthy animals, respectively. The ETs of the animals with subclinical and clinical mastitis were, respectively, 7.9 and 8.0% higher than those of healthy animals. Rectum temperatures were 2.9% (subclinical mastitis) and 5.5% (clinical mastitis) higher compared to those of healthy animals. Respiratory frequencies were 40.3% (subclinical mastitis) and 61.6% (clinical mastitis) higher compared to those of healthy animals. The first component explained 91% of the total variance for the variables analyzed. The principal component technique allowed verifying the variables correlated with the animals' clinical condition and the degree of dependence between the study variables.
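A minimal scikit-learn sketch of the principal component step described in the abstract; the file name and the column names for the nine variables are hypothetical:

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical column names for the nine variables mentioned in the abstract.
cols = ["ST_FL", "ST_FR", "ST_RL", "ST_RR", "ET", "RT", "RF", "AirT", "RU"]
df = pd.read_csv("mastitis_measurements.csv")

X = StandardScaler().fit_transform(df[cols])
pca = PCA(n_components=2).fit(X)

print(pca.explained_variance_ratio_)          # share of variance per component
loadings = pd.DataFrame(pca.components_.T, index=cols, columns=["PC1", "PC2"])
print(loadings)                               # which variables drive each component
```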
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Big data, with N × P dimension where N is extremely large, has created new challenges for data analysis, particularly in the realm of creating meaningful clusters of data. Clustering techniques, such as K-means or hierarchical clustering, are popular methods for performing exploratory analysis on large datasets. Unfortunately, these methods are not always possible to apply to big data due to memory or time constraints generated by calculations of order P·N(N−1)/2. To circumvent this problem, typically the clustering technique is applied to a random sample drawn from the dataset; however, a weakness is that the structure of the dataset, particularly at the edges, is not necessarily maintained. We propose a new solution through the concept of “data nuggets”, which reduces a large dataset into a small collection of nuggets of data, each containing a center, weight, and scale parameter. The data nuggets are then input into algorithms that compute methods such as principal components analysis and clustering in a more computationally efficient manner. We show the consistency of the data-nugget-based covariance estimator and apply the methodology of data nuggets to perform exploratory analysis of a flow cytometry dataset containing over one million observations using PCA and K-means clustering for weighted observations. Supplementary materials for this article are available online.
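A rough sketch of the two-stage idea, assuming nuggets are formed here by an ordinary k-means pass (the paper's own nugget construction differs) and then fed, with their weights, into a weighted k-means:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(200_000, 10))            # stand-in for a large N x P dataset

# Stage 1: reduce the data to a few hundred "nuggets": a center, a weight
# (number of points absorbed), and a scale (within-nugget spread).
reducer = KMeans(n_clusters=500, n_init=1, random_state=0).fit(X)
centers = reducer.cluster_centers_
weights = np.bincount(reducer.labels_, minlength=500)
scales = np.array([X[reducer.labels_ == k].std() for k in range(500)])

# Stage 2: run the exploratory clustering on the nuggets, weighting each
# center by the number of original observations it represents.
clusterer = KMeans(n_clusters=5, random_state=0)
clusterer.fit(centers, sample_weight=weights)
```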
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. This study used secondary analysis of data from several different sources to examine the impact of increased oil development on domestic violence, dating violence, sexual assault, and stalking (DVDVSAS) in the Bakken region of Montana and North Dakota. Distributed here is the code used for the secondary analysis; the data are not available through other public means. Please refer to the User Guide distributed with this study for a list of instructions on how to obtain all other data used in this study. This collection contains a secondary analysis of the Uniform Crime Reports (UCR). UCR data serve as periodic nationwide assessments of reported crimes not available elsewhere in the criminal justice system. Each year, participating law enforcement agencies contribute reports to the FBI either directly or through their state reporting programs. Distributed here are the codes used to create the datasets and perform the secondary analysis. Please refer to the User Guide, distributed with this study, for more information. This collection also contains a secondary analysis of the National Incident Based Reporting System (NIBRS), a component of the Uniform Crime Reporting Program (UCR) and an incident-based reporting system for crimes known to the police. For each crime incident coming to the attention of law enforcement, a variety of data were collected about the incident. These data included the nature and types of specific offenses in the incident, characteristics of the victim(s) and offender(s), types and value of property stolen and recovered, and characteristics of persons arrested in connection with a crime incident. NIBRS collects data on each single incident and arrest within 22 offense categories, made up of 46 specific crimes called Group A offenses. In addition, there are 11 Group B offense categories for which only arrest data were reported. NIBRS data on different aspects of crime incidents, such as offenses, victims, offenders, and arrestees, can be examined as different units of analysis. Distributed here are the codes used to create the datasets and perform the secondary analysis. Please refer to the User Guide, distributed with this study, for more information. The collection includes 17 SPSS syntax files. Qualitative data collected for this study are not available as part of the data collection at this time.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Vitamin D insufficiency appears to be prevalent in SLE patients. Multiple factors potentially contribute to lower vitamin D levels, including limited sun exposure, the use of sunscreen, darker skin complexion, aging, obesity, specific medical conditions, and certain medications. The study aims to assess the risk factors associated with low vitamin D levels in SLE patients in the southern part of Bangladesh, a region noted for a high prevalence of SLE. The research additionally investigates the possible correlation between vitamin D and the SLEDAI score, seeking to understand the potential benefits of vitamin D in enhancing disease outcomes for SLE patients. The study incorporates a dataset consisting of 50 patients from the southern part of Bangladesh and evaluates their clinical and demographic data. An initial exploratory data analysis is conducted to gain insights into the data, which includes calculating means and standard deviations, performing correlation analysis, and generating heat maps. Relevant inferential statistical tests, such as the Student’s t-test, are also employed. In the machine learning part of the analysis, this study utilizes supervised learning algorithms, specifically Linear Regression (LR) and Random Forest (RF). To optimize the hyperparameters of the RF model and mitigate the risk of overfitting given the small dataset, a 3-Fold cross-validation strategy is implemented. The study also calculates bootstrapped confidence intervals to provide robust uncertainty estimates and further validate the approach. A comprehensive feature importance analysis is carried out using RF feature importance, permutation-based feature importance, and SHAP values. The LR model yields an RMSE of 4.83 (CI: 2.70, 6.76) and MAE of 3.86 (CI: 2.06, 5.86), whereas the RF model achieves better results, with an RMSE of 2.98 (CI: 2.16, 3.76) and MAE of 2.68 (CI: 1.83,3.52). Both models identify Hb, CRP, ESR, and age as significant contributors to vitamin D level predictions. Despite the lack of a significant association between SLEDAI and vitamin D in the statistical analysis, the machine learning models suggest a potential nonlinear dependency of vitamin D on SLEDAI. These findings highlight the importance of these factors in managing vitamin D levels in SLE patients. The study concludes that there is a high prevalence of vitamin D insufficiency in SLE patients. Although a direct linear correlation between the SLEDAI score and vitamin D levels is not observed, machine learning models suggest the possibility of a nonlinear relationship. Furthermore, factors such as Hb, CRP, ESR, and age are identified as more significant in predicting vitamin D levels. Thus, the study suggests that monitoring these factors may be advantageous in managing vitamin D levels in SLE patients. Given the immunological nature of SLE, the potential role of vitamin D in SLE disease activity could be substantial. Therefore, it underscores the need for further large-scale studies to corroborate this hypothesis.
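A condensed scikit-learn sketch of the modelling pipeline described above (linear regression and a random forest tuned with 3-fold cross-validation, plus a bootstrapped confidence interval for the error metric); the column names, hyperparameter grid, and file name are assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import mean_squared_error

# Hypothetical column names based on the predictors mentioned in the abstract.
df = pd.read_csv("sle_vitamin_d.csv")
X = df[["Hb", "CRP", "ESR", "age", "SLEDAI"]]
y = df["vitamin_d"]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

lr = LinearRegression().fit(X_tr, y_tr)
lr_rmse = np.sqrt(mean_squared_error(y_te, lr.predict(X_te)))

# 3-fold cross-validation to tune the random forest on a small dataset.
rf = GridSearchCV(RandomForestRegressor(random_state=0),
                  {"n_estimators": [100, 300], "max_depth": [2, 4, None]},
                  cv=3).fit(X_tr, y_tr)

# Bootstrapped confidence interval for the random forest test RMSE.
rng = np.random.default_rng(0)
rmses = []
for _ in range(1000):
    idx = rng.integers(0, len(y_te), len(y_te))
    pred = rf.predict(X_te.iloc[idx])
    rmses.append(np.sqrt(mean_squared_error(y_te.iloc[idx], pred)))
print("LR RMSE:", lr_rmse)
print("RF RMSE 95% CI:", np.percentile(rmses, [2.5, 97.5]))
```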
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
This dataset contains the data used for the publication entitled "Exploratory investigation of historical decorative laminates by means of vibrational spectroscopic techniques".
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
ABSTRACT The purpose of this technical note is to verify the stationarity of flows in the Iguaçu River Basin, considering 14 fluviometric stations. For this purpose, three sets of annual flow series were studied: mean flows, maximum flows and 7-day minimum flows. Initially, an exploratory analysis of the data was performed, based on establishing two candidate change points in the flow characteristics and applying parametric and nonparametric statistical tests for equality of mean and variance. Finally, composite tests were used, considering the basin divided into Upper and Lower Iguaçu. The exploratory analysis suggested an apparent change in the behavior of the flows, partly in the 1970s and partly in the 1980s. The series were therefore divided in two different ways: at the midpoint and at the change point suggested by the exploratory analysis. Among the statistical tests, the Mann-Whitney test was chosen because it does not depend on the underlying distribution of the flow series and because it is strongly recommended by other authors. It was concluded that the change in flows occurred over a relatively short interval of time and can be treated as a step (jump) non-stationarity. In general, the most recent change occurred downstream of the Uniao da Vitória fluviometric station. A change in the behavior of the mean and maximum flows was evident; however, this phenomenon was not observed in the evaluation of the minimum flows.
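A small SciPy sketch of the split-and-test procedure described in the note; the annual mean-flow series and the change year are synthetic and purely illustrative:

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Synthetic annual mean flows with an illustrative step change in 1975,
# standing in for the 1970s/1980s breaks discussed in the note.
rng = np.random.default_rng(0)
years = np.arange(1940, 2010)
flows = rng.normal(300, 40, len(years)) + np.where(years < 1975, 0.0, 60.0)

before = flows[years < 1975]
after = flows[years >= 1975]

# Nonparametric test of whether the two sub-series share the same distribution.
stat, p = mannwhitneyu(before, after, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.4f}")
```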
Prior research has shown that during development, there is increased segregation between, and increased integration within, prototypical resting-state functional brain networks. Functional networks are typically defined by static functional connectivity over extended periods of rest. However, little is known about how time-varying properties of functional networks change with age. Likewise, a comparison of standard approaches to functional connectivity may provide a nuanced view of how network integration and segregation are reflected across the lifespan. Therefore, this exploratory study evaluated common approaches to static and dynamic functional network connectivity in a publicly available dataset of subjects ranging from 8 to 75 years of age. Analyses evaluated relationships between age and static resting-state functional connectivity, variability (standard deviation) of connectivity, and mean dwell time of functional network states defined by recurring patterns of whole-brain connectivity. Results showed that older age was associated with decreased static connectivity between nodes of different canonical networks, particularly between the visual system and nodes in other networks. Age was not significantly related to variability of connectivity. Mean dwell time of a network state reflecting high connectivity between visual regions decreased with age, but older age was also associated with increased mean dwell time of a network state reflecting high connectivity within and between canonical sensorimotor and visual networks. Results support a model of increased network segregation over the lifespan and also highlight potential pathways of top-down regulation among networks.
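A tiny NumPy sketch of the mean-dwell-time measure used here: given a sequence of network-state labels over time (one per sliding window), average the lengths of consecutive runs of each state. The state sequence below is synthetic:

```python
import numpy as np

def mean_dwell_time(states):
    """Mean number of consecutive windows spent in each state before switching."""
    states = np.asarray(states)
    change = np.flatnonzero(np.diff(states) != 0) + 1
    runs = np.split(states, change)                 # consecutive runs of one state
    dwell = {}
    for run in runs:
        dwell.setdefault(int(run[0]), []).append(len(run))
    return {state: float(np.mean(lengths)) for state, lengths in dwell.items()}

# Synthetic state sequence for one subject (e.g. from clustering windowed connectivity).
states = [0, 0, 0, 1, 1, 2, 2, 2, 2, 0, 0, 1]
print(mean_dwell_time(states))   # {0: 2.5, 1: 1.5, 2: 4.0}
```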
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
The UKCP18 exploratory extended time-mean sea level projections are provided as a spatially continuous dataset around the UK coastline for the period 2007-2300. These exploratory projections have been devised to be used seamlessly with the UKCP18 21st Century projections and provide very similar values for the period up to 2100. Users should be aware that post-2100 projections have a far greater degree of uncertainty than the 21st Century projections and should therefore be treated as illustrative of the potential future changes. Note that we cannot rule out substantially larger sea level rise in the coming centuries than is represented in the projections presented here. The data consist of annual time series of the projected change in the time-mean coastal water level relative to the average value for the period 1981-2000. Projections are available for the RCP2.6, RCP4.5 and RCP8.5 climate change scenarios (Meinshausen et al, 2011). As with the 21st Century projections, nine percentiles are provided to characterise the projection uncertainty, based on underlying modelling uncertainty. However, users should view these uncertainties with a much lower degree of confidence for the period post-2100.
This dataset was updated in March 2023 to correct a minor processing error in the earlier version of the UKCP18 site-specific sea level projections relating to the adjustment applied to convert from the IPCC AR5 baseline of 1986-2005 to the baseline period of 1981-2000. The update results in about a 1 cm increase compared to the original data release for all UKCP18 site-specific sea level projections at all timescales. Further details can be found in the accompanying technical note.
Please note this dataset supersedes previous versions on the Climate Data Portal. It has been uploaded following an update to the dataset in March 2023. This means sea level rise is approximately 1 cm higher (larger) compared to the original data release (i.e. the previous version available on this portal) for all UKCP18 site-specific sea level projections at all timescales. For more details please refer to the technical note.

What does the data show?
The exploratory extended time-mean sea-level projections to 2300 show the amount of sea-level change (in cm) for each coastal location (grid box) around the British Isles for several emission scenarios. Sea-level rise is the primary mechanism by which we expect coastal flood risk to change in the UK in the future. The amount of sea-level rise depends on the location around the British Isles and increases with higher emission scenarios. Here, we provide the relative time-mean sea-level projections to 2300, i.e. the local sea-level change experienced at a particular location compared to the 1981-2000 average, produced as part of UKCP18. For each grid box the time-mean sea-level change projections are provided for the end of each decade (e.g. 2010, 2020, 2030, etc.) for three emission scenarios known as Representative Concentration Pathways (RCPs) and for three percentiles.

The emission scenarios are:
- RCP2.6
- RCP4.5
- RCP8.5

The percentiles are:
- 5th percentile
- 50th percentile
- 95th percentile

Important limitations of the data
We cannot rule out substantial additional sea-level rise associated with ice sheet instability processes that are not represented in the UKCP18 projections, as discussed in the recent IPCC Sixth Assessment Report (AR6). These exploratory projections show sea levels continue to increase beyond 2100 even with large reductions in greenhouse gas emissions. It should be noted that these projections have a greater degree of uncertainty than the 21st Century projections and should therefore be treated as illustrative of the potential future changes. They are designed to be used alongside the 21st Century projections for those interested in exploring post-2100 changes.

What are the naming conventions and how do I explore the data?
The data is supplied so that each row corresponds to the combination of an RCP emissions scenario and a percentile value, e.g. 'RCP45_50' is the RCP4.5 scenario and the 50th percentile. This can be viewed and filtered by the field 'RCP and Percentile'. The columns (fields) correspond to the end of each decade and are named by the sea level anomaly at year x, e.g. '2050 seaLevelAnom' is the sea level anomaly at 2050 compared to the 1981-2000 average. Please note that the styling and filtering options are independent of each other, and the attribute you wish to style the data by can be set differently to the one you filter by. Please ensure that you have selected the RCP/percentile and decade you want to both filter and style the data by. Select the cell you are interested in to view all values. To understand how to explore the data please refer to the New Users ESRI Storymap.

What are the emission scenarios?
The 21st Century time-mean sea level projections were produced using some of the future emission scenarios used in the IPCC Fifth Assessment Report (AR5). These are RCP2.6, RCP4.5 and RCP8.5, which are based on the concentration of greenhouse gases and aerosols in the atmosphere. RCP2.6 is an aggressive mitigation pathway, where greenhouse gas emissions are strongly reduced.
RCP4.5 is an intermediate 'stabilisation' pathway, where greenhouse gas emissions are reduced by varying levels. RCP8.5 is a high emission pathway, where greenhouse gas emissions continue to grow unmitigated. Further information is available in the Understanding Climate Data ESRI Storymap and the RCP Guidance on the UKCP18 website.

What are the percentiles?
The UKCP18 sea-level projections are based on a large Monte Carlo simulation that represents 450,000 possible outcomes in terms of global mean sea-level change. The Monte Carlo simulation is designed to sample the uncertainties across the different components of sea-level rise, and the amount of warming we see for a given emissions scenario across CMIP5 climate models. The percentiles are used to characterise the uncertainty in the Monte Carlo projections based on the statistical distribution of the 450,000 individual simulation members. For example, the 50th percentile represents the central estimate (median) amongst the model projections, whilst the 95th percentile value means 95% of the model distribution is below that value and, similarly, the 5th percentile value means 5% of the model distribution is below that value. The range between the 5th and 95th percentiles represents the projection range amongst models and corresponds to the IPCC AR5 "likely range". It should be noted that there may be a greater than 10% chance that the real-world sea level rise lies outside this range.

Data source
This data is an extract of a larger dataset (every year and more percentiles) which is available on CEDA at https://catalogue.ceda.ac.uk/uuid/a077f4058cda4cd4b37ccfbdf1a6bd29. Data has been extracted from the v20221219 version (downloaded 17/04/2023) of three files:
- seaLevelAnom_marine-sim_rcp26_ann_2007-2300.nc
- seaLevelAnom_marine-sim_rcp45_ann_2007-2300.nc
- seaLevelAnom_marine-sim_rcp85_ann_2007-2300.nc

Useful links to find out more
For a comprehensive description of the underpinning science, evaluation and results, see the UKCP18 Marine Projections Report (Palmer et al, 2018). For a discussion of ice sheet instability processes in the latest IPCC assessment report, see Fox-Kemper et al (2021). Technical note for the update to the underpinning data: https://www.metoffice.gov.uk/binaries/content/assets/metofficegovuk/pdf/research/ukcp/ukcp_tech_note_sea_level_mar23.pdf. Further information is in the Met Office Climate Data Portal Understanding Climate Data ESRI Storymap.
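A short pandas sketch of working with a tabular extract of this data, following the naming conventions described above; the file name is hypothetical, and the field names follow the text:

```python
import pandas as pd

# Hypothetical export of the portal layer; fields follow the naming convention
# described above ('RCP and Percentile', '2050 seaLevelAnom', ...).
sea = pd.read_csv("ukcp18_extended_sea_level.csv")

# Central estimate (50th percentile) under RCP4.5.
rcp45_median = sea[sea["RCP and Percentile"] == "RCP45_50"]

# Sea level anomaly (cm, relative to 1981-2000) at the end of selected decades.
print(rcp45_median[["2050 seaLevelAnom", "2100 seaLevelAnom", "2300 seaLevelAnom"]])
```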
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
When creating choropleth maps, mapmakers often bin (i.e. group, classify) quantitative data values into groups to help show that certain areas fall within a similar range of values. For instance, a mapmaker may divide counties into groups of high, middle, and low life expectancy (measured in years). It is well known that different binning methods (e.g. natural breaks, quantiles) yield different groupings, meaning the same data can be presented differently depending on how it is divided into bins. To help guide a wide variety of users, we present a new, open-source, web-based, geospatial visualization tool, Exploropleth, that lets users interact with a catalog of established data binning methods, and subsequently compare, customize, and export custom maps. This tool advances the state of the art by providing multiple binning methods in one view and supporting administrative unit reclassification on-the-fly. We interviewed 16 cartographers and geographic information systems (GIS) experts from 13 government organizations, non-government organizations (NGOs), and federal agencies who identified opportunities to integrate Exploropleth into their existing mapmaking workflow, and found that the tool has the potential to educate students as well as mapmakers with varying levels of experience. Exploropleth is open-source and publicly available at https://exploropleth.github.io.
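A small pandas sketch contrasting two of the binning methods mentioned above (equal-interval and quantile bins); the county-level life expectancy values are illustrative:

```python
import pandas as pd

# Illustrative county-level life expectancy values (years).
life_exp = pd.Series([71.2, 73.5, 74.1, 75.0, 76.3, 77.8, 78.2, 79.9, 81.4, 83.0])

# Equal-interval bins: same width, possibly very uneven group sizes.
equal_interval = pd.cut(life_exp, bins=3, labels=["low", "middle", "high"])

# Quantile bins: roughly equal group sizes, data-dependent break points.
quantiles = pd.qcut(life_exp, q=3, labels=["low", "middle", "high"])

print(pd.DataFrame({"value": life_exp,
                    "equal_interval": equal_interval,
                    "quantile": quantiles}))
```

The same data can therefore land in different classes depending on the binning method, which is exactly the comparison Exploropleth is designed to make visible.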
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Mean and standard deviation of SI by BMI group and maternal age.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Mean, standard deviation and percentiles of SI and HR within 2 hours after birth.