26 datasets found

Data from: Outlier classification using autoencoders: application for...
osti.gov
Updated Jun 2, 2021
Cite
Bianchi, F. M.; Brunner, D.; Kube, R.; LaBombard, B. (2021). Outlier classification using autoencoders: application for fluctuation driven flows in fusion plasmas [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/1882649-outlier-classification-using-autoencoders-application-fluctuation-driven-flows-fusion-plasmas
Explore at:
Dataset updated
Jun 2, 2021
Dataset provided by
Office of Science (http://www.er.doe.gov/)
United States Department of Energy (http://energy.gov/)
Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Plasma Science and Fusion Center
Authors
Bianchi, F. M.; Brunner, D.; Kube, R.; LaBombard, B.
Description
Understanding the statistics of fluctuation driven flows in the boundary layer of magnetically confined plasmas is desired to accurately model the lifetime of the vacuum vessel components. Mirror Langmuir probes (MLPs) are a novel diagnostic that uniquely allow us to sample the plasma parameters on a time scale shorter than the characteristic time scale of their fluctuations. Sudden large-amplitude fluctuations in the plasma degrade the precision and accuracy of the plasma parameters reported by MLPs for cases in which the probe bias range is of insufficient amplitude. While some data samples can readily be classified as valid and invalid, we find that such a classification may be ambiguous for up to 40% of data sampled for the plasma parameters and bias voltages considered in this study. In this contribution, we employ an autoencoder (AE) to learn a low-dimensional representation of valid data samples. By definition, the coordinates in this space are the features that mostly characterize valid data. Ambiguous data samples are classified in this space using standard classifiers for vectorial data. In this way, we avoid defining complicated threshold rules to identify outliers, which require strong assumptions and introduce biases in the analysis. By removing the outliers that are identified in the latent low-dimensional space of the AE, we find that the average conductive and convective radial heat fluxes are between approximately 5% and 15% lower than when removing outliers identified by threshold values. For contributions to the radial heat flux due to triple correlations, the difference is up to 40%.
Data from: A Diagnostic Procedure for Detecting Outliers in Linear...
tandf.figshare.com
figshare.com
txt
Updated Feb 9, 2024
Cite
Dongjun You; Michael Hunter; Meng Chen; Sy-Miin Chow (2024). A Diagnostic Procedure for Detecting Outliers in Linear State–Space Models [Dataset]. http://doi.org/10.6084/m9.figshare.12162075.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.12162075.v1
Dataset updated
Feb 9, 2024
Dataset provided by
Taylor & Francis
Authors
Dongjun You; Michael Hunter; Meng Chen; Sy-Miin Chow
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Outliers can be more problematic in longitudinal data than in independent observations due to the correlated nature of such data. It is common practice to discard outliers as they are typically regarded as a nuisance or an aberration in the data. However, outliers can also convey meaningful information concerning potential model misspecification, and ways to modify and improve the model. Moreover, outliers that occur among the latent variables (innovative outliers) have distinct characteristics compared to those impacting the observed variables (additive outliers), and are best evaluated with different test statistics and detection procedures. We demonstrate and evaluate the performance of an outlier detection approach for multi-subject state-space models in a Monte Carlo simulation study, with corresponding adaptations to improve power and reduce false detection rates. Furthermore, we demonstrate the empirical utility of the proposed approach using data from an ecological momentary assessment study of emotion regulation together with an open-source software implementation of the procedures.
Data from: Mining Distance-Based Outliers in Near Linear Time
catalog.data.gov
datasets.ai
+1 more
Updated Apr 11, 2025
Cite
Dashlink (2025). Mining Distance-Based Outliers in Near Linear Time [Dataset]. https://catalog.data.gov/dataset/mining-distance-based-outliers-in-near-linear-time
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Dashlink
Description
Full title: Mining Distance-Based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule

Abstract: Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal of finding fast algorithms for this task. We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used. We test our algorithm on real high-dimensional data sets with millions of examples and show that the near linear scaling holds over several orders of magnitude. Our average case analysis suggests that much of the efficiency is because the time to process non-outliers, which are the majority of examples, does not depend on the size of the data set.
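The nested loop with randomization and pruning described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `top_outliers`, the choice of "mean distance to the k nearest neighbours" as the outlier score, and the parameters `k`, `n_out` and `seed` are assumptions made for the example, and the input is assumed to hold more than `k` points.

```python
import math
import random


def top_outliers(points, k=3, n_out=2, seed=0):
    """Return (score, point) pairs for the n_out points with the largest
    mean distance to their k nearest neighbours (hypothetical score choice)."""
    rng = random.Random(seed)
    data = list(points)
    rng.shuffle(data)  # random order makes early pruning likely
    cutoff = 0.0       # score of the weakest outlier kept so far
    top = []           # at most n_out (score, point) pairs, best first
    for i, p in enumerate(data):
        neighbours = []  # the k smallest distances seen so far for p
        pruned = False
        for j, q in enumerate(data):
            if i == j:
                continue
            d = math.dist(p, q)
            if len(neighbours) < k:
                neighbours.append(d)
            elif d < max(neighbours):
                neighbours[neighbours.index(max(neighbours))] = d
            # Pruning rule: the running mean is an upper bound on p's final
            # score, so once it drops below the cutoff p cannot be a top outlier.
            if len(neighbours) == k and sum(neighbours) / k < cutoff:
                pruned = True
                break
        if pruned:
            continue
        score = sum(neighbours) / k
        top.append((score, p))
        top.sort(reverse=True)
        top = top[:n_out]
        if len(top) == n_out:
            cutoff = top[-1][0]
    return top


# Made-up data: a 5x5 grid plus two isolated points.
grid = [(x, y) for x in range(5) for y in range(5)]
result = top_outliers(grid + [(50, 50), (-40, 30)], k=3, n_out=2)
print(result)
```

Most grid points are discarded after only a few distance computations because their running score falls below the cutoff early, which is the effect the abstract credits for the near linear behaviour.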
sequence outliers-supp data
ieee-dataport.org
Updated Jun 14, 2019
Cite
Lakshmi Sujeeun (2019). sequence outliers-supp data [Dataset]. https://ieee-dataport.org/documents/sequence-outliers-supp-data
Explore at:
Dataset updated
Jun 14, 2019
Authors
Lakshmi Sujeeun
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Guidelines for benchmarking and outlier detection in clinical quality...
bridges.monash.edu
bin
Updated Mar 26, 2025
Cite
Jessy Hansen; Arul Earnest; Ahmad Reza Pourghaderi; Susannah Ahern (2025). Guidelines for benchmarking and outlier detection in clinical quality registries - simulation and model build code [Dataset]. http://doi.org/10.26180/28665671.v1
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.26180/28665671.v1
Dataset updated
Mar 26, 2025
Dataset provided by
Monash University
Authors
Jessy Hansen; Arul Earnest; Ahmad Reza Pourghaderi; Susannah Ahern
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Contains the summary dataset, simulation Stata code and model build R code for the study titled "Benchmarking methods for detection of underperforming healthcare providers in clinical quality registries – implementation guidelines".

Contains:

guidelines_data_preparation.do: Stata code for running the simulations (using the user-written hiersim command available at https://doi.org/10.26180/24480889) and preparing the summary performance dataset.

sim_extra_sum.dta: Summary performance dataset containing the average accuracy of outlier detection methods for simulations of clinical quality registry data of varied data parameters.

guidelines_model_build.R: R code for developing generalised linear models for predicting the accuracy of outlier detection based on registry data parameters.
Outliers Dataset
universe.roboflow.com
zip
Updated May 18, 2022
Cite
Renz (2022). Outliers Dataset [Dataset]. https://universe.roboflow.com/renz/outliers/dataset/4
Explore at:
Available download formats: zip
Dataset updated
May 18, 2022
Dataset authored and provided by
Renz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Leathers Bounding Boxes
Description
Here are a few use cases for this project:

Leather Quality Inspection: "Outliers" could be used by the leather manufacturing companies to evaluate the quality of their products. Any abnormal textures, cuts, or stains on leather fabrics could be instantly detected by the model, enabling quality control teams to speed up their inspection process.

Product Verification in E-commerce: Online retail shops selling leather products could use "Outliers" to verify the product images uploaded by sellers. Detecting any issues or inconsistencies in leather product images can help provide a quality certification and maintain a high standard of product listings.

Consumer Review Analysis: "Outliers" could be used to verify customer complaints or reviews regarding purchased leather goods. By analyzing pictures provided by customers, companies could effectively respond to any valid product defects.

Pre-Loved Goods Inspection: The model could assist second-hand goods websites or thrift stores in verifying the condition of leather goods. This could ensure that only goods meeting certain quality standards are accepted and sold.

Restoration and Repair Services: For businesses dealing with the restoration of antique or damaged leather items, "Outliers" could be used to spot problematic areas and assess the work needed for restoration or repair. This could help improve the service and pricing accuracy.
Data for Filtering Organized 3D Point Clouds for Bin Picking Applications
datasets.ai
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
+1 more
0, 34, 47
Updated Aug 6, 2024
Cite
National Institute of Standards and Technology (2024). Data for Filtering Organized 3D Point Clouds for Bin Picking Applications [Dataset]. https://datasets.ai/datasets/data-for-filtering-organized-3d-point-clouds-for-bin-picking-applications
Explore at:
Available download formats: 0, 34, 47
Dataset updated
Aug 6, 2024
Dataset authored and provided by
National Institute of Standards and Technology (http://www.nist.gov/)
Description
Contains scans of a bin filled with different parts (screws, nuts, rods, spheres, sprockets). For each part type, an RGB image and an organized 3D point cloud obtained with a structured light sensor are provided. In addition, an unorganized 3D point cloud representing an empty bin and a small Matlab script to read the files are provided. The 3D data contain many outliers, and the data were used to demonstrate a new filtering technique.
AI Histology QC Outlier Detection Tool Market Research Report 2033
growthmarketreports.com
csv, pdf, pptx
Updated Aug 4, 2025
Cite
Growth Market Reports (2025). AI Histology QC Outlier Detection Tool Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/ai-histology-qc-outlier-detection-tool-market
Explore at:
Available download formats: pptx, csv, pdf
Dataset updated
Aug 4, 2025
Dataset authored and provided by
Growth Market Reports
Time period covered
2024 - 2032
Area covered
Global
Description
AI Histology QC Outlier Detection Tool Market Outlook

According to our latest research, the global AI Histology QC Outlier Detection Tool market size reached USD 412 million in 2024, with a robust compound annual growth rate (CAGR) of 18.7% observed over the past year. The market’s expansion is primarily driven by the increasing adoption of artificial intelligence in digital pathology and the rising demand for high-precision quality control in histological workflows. By 2033, the market is forecasted to reach USD 1.97 billion, reflecting the accelerating integration of AI-powered QC outlier detection tools across clinical and research environments worldwide.
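As a rough arithmetic check on the figures quoted above, the sketch below computes the compound annual growth rate implied by the 2024 and 2033 market sizes and, conversely, what the quoted 18.7% rate gives when compounded forward. The helper names `implied_cagr` and `project`, and the assumption of nine annual compounding periods between 2024 and 2033, are illustrative choices and not part of the report.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate taking start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value after compounding `rate` annually for `years` years."""
    return start_value * (1.0 + rate) ** years


start_musd = 412.0    # 2024 market size quoted above, in USD million
end_musd = 1970.0     # 2033 forecast quoted above, in USD million
years = 2033 - 2024   # assumption: nine compounding periods

rate = implied_cagr(start_musd, end_musd, years)
print(f"implied CAGR 2024-2033: {rate:.1%}")
print(f"412 compounded at 18.7% for {years} years: "
      f"{project(start_musd, 0.187, years):.0f} USD million")
```

Small differences between the two numbers can come from rounding or from a different base year, so this should be read as a consistency check only.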

The surge in demand for AI Histology QC Outlier Detection Tools is primarily attributed to the pressing need for accuracy and consistency in histopathological diagnostics. Traditional quality control processes in histology are labor-intensive and prone to human error, which can result in diagnostic discrepancies and impact patient outcomes. The deployment of advanced AI-driven QC outlier detection tools addresses these challenges by automating the identification of anomalies and artifacts in histological slides, ensuring standardized results and significantly reducing turnaround times. Moreover, the integration of machine learning algorithms enables these systems to continuously improve their detection capabilities, further enhancing diagnostic reliability and supporting the growing trend towards digitization in pathology laboratories.

Another significant growth driver for the AI Histology QC Outlier Detection Tool market is the increasing prevalence of cancer and other chronic diseases that require histopathological examination for diagnosis and treatment planning. The rising global cancer burden, coupled with the shortage of skilled pathologists, is pushing healthcare providers to adopt AI-powered solutions that can streamline workflow efficiency and mitigate diagnostic bottlenecks. These tools not only facilitate faster and more accurate detection of outliers in tissue samples but also support pathologists in prioritizing cases that require immediate attention. As a result, healthcare institutions are investing heavily in AI-based QC solutions to optimize resource utilization, improve patient care, and comply with stringent regulatory standards for laboratory quality assurance.

Technological advancements and strategic collaborations between AI developers, pathology labs, and healthcare providers are further accelerating market growth. The ongoing development of sophisticated image analysis algorithms, cloud-based platforms, and interoperability standards is enabling seamless integration of AI QC tools into existing laboratory information systems. Additionally, government initiatives aimed at promoting digital health transformation and funding for AI research in medical diagnostics are creating a favorable environment for market expansion. The proliferation of digital pathology infrastructure, particularly in developed regions, is expected to drive the adoption of AI QC outlier detection tools, while emerging markets are witnessing growing interest as healthcare systems modernize and invest in advanced diagnostic technologies.

From a regional perspective, North America currently dominates the AI Histology QC Outlier Detection Tool market, accounting for a significant share of global revenues in 2024. The region’s leadership is underpinned by a well-established healthcare infrastructure, high adoption rates of digital pathology, and strong presence of leading AI technology providers. Europe follows closely, supported by robust investments in healthcare innovation and a proactive regulatory landscape. Meanwhile, the Asia Pacific region is poised for the fastest growth over the forecast period, driven by increasing healthcare expenditure, expanding cancer screening programs, and rising awareness of the benefits of AI-powered diagnostic solutions. Latin America and the Middle East & Africa are also expected to witness steady growth as digital transformation initiatives gain momentum in these regions.
Outlier Responses Reflect Sensitivity to Statistical Structure in the Human...
plos.figshare.com
tiff
Updated Jun 1, 2023
Cite
Marta I. Garrido; Maneesh Sahani; Raymond J. Dolan (2023). Outlier Responses Reflect Sensitivity to Statistical Structure in the Human Brain [Dataset]. http://doi.org/10.1371/journal.pcbi.1002999
Explore at:
Available download formats: tiff
Unique identifier
https://doi.org/10.1371/journal.pcbi.1002999
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Marta I. Garrido; Maneesh Sahani; Raymond J. Dolan
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We constantly look for patterns in the environment that allow us to learn its key regularities. These regularities are fundamental in enabling us to make predictions about what is likely to happen next. The physiological study of regularity extraction has focused primarily on repetitive sequence-based rules within the sensory environment, or on stimulus-outcome associations in the context of reward-based decision-making. Here we ask whether we implicitly encode non-sequential stochastic regularities, and detect violations therein. We addressed this question using a novel experimental design and both behavioural and magnetoencephalographic (MEG) metrics associated with responses to pure-tone sounds with frequencies sampled from a Gaussian distribution. We observed that sounds in the tail of the distribution evoked a larger response than those that fell at the centre. This response resembled the mismatch negativity (MMN) evoked by surprising or unlikely events in traditional oddball paradigms. Crucially, responses to physically identical outliers were greater when the distribution was narrower. These results show that humans implicitly keep track of the uncertainty induced by apparently random distributions of sensory events. Source reconstruction suggested that the statistical-context-sensitive responses arose in a temporo-parietal network, areas that have been associated with attention orientation to unexpected events. Our results demonstrate a very early neurophysiological marker of the brain's ability to implicitly encode complex statistical structure in the environment. We suggest that this sensitivity provides a computational basis for our ability to make perceptual inferences in noisy environments and to make decisions in an uncertain world.
Data from: Missing Data in the Uniform Crime Reports (UCR), 1977-2000...
catalog.data.gov
icpsr.umich.edu
Updated Mar 12, 2025
Cite
National Institute of Justice (2025). Missing Data in the Uniform Crime Reports (UCR), 1977-2000 [United States] [Dataset]. https://catalog.data.gov/dataset/missing-data-in-the-uniform-crime-reports-ucr-1977-2000-united-states-4b340
Explore at:
Dataset updated
Mar 12, 2025
Dataset provided by
National Institute of Justice (http://nij.ojp.gov/)
Area covered
United States
Description
This study reexamined and recoded missing data in the Uniform Crime Reports (UCR) for the years 1977 to 2000 for all police agencies in the United States. The principal investigator conducted a data cleaning of 20,067 Originating Agency Identifiers (ORIs) contained within the Offenses-Known UCR data from 1977 to 2000. Data cleaning involved performing agency name checks and creating new numerical codes for different types of missing data. These include codes that identify whether a record was aggregated to a particular month, whether no data were reported (true missing), whether more than one index crime was missing, and whether a particular index crime (motor vehicle theft, larceny, burglary, assault, robbery, rape, murder) was missing. They also include researcher-assigned missing value codes according to the "rule of 20", codes for outlier values, and codes for whether an ORI was covered by another agency and whether an agency did not exist during a particular time period.
Sample of 45 H{alpha}EW outliers - Dataset - B2FIND
b2find.eudat.eu
Updated Oct 23, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2023). Sample of 45 H{alpha}EW outliers - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/7782063a-207c-571b-bad5-80eedba236cf
Explore at:
Dataset updated
Oct 23, 2023
Description
In this work, we calibrate the relationship between H{alpha} emission and M-dwarf ages. We compile a sample of 892 M-dwarfs with H{alpha} equivalent width (H{alpha}EW) measurements from the literature that are either comoving with a white dwarf of known age (21 stars) or in a known young association (871 stars). In this sample we identify 7 M-dwarfs that are new candidate members of known associations. By dividing the stars into active and inactive categories according to their H{alpha}EW and spectral type (SpT), we find that the fraction of active dwarfs decreases with increasing age, and the form of the decline depends on SpT. Using the compiled sample of age calibrators, we find that H{alpha} EW and fractional H{alpha} luminosity (L_H{alpha}/L_bol) decrease with increasing age. H{alpha}EW for SpT<~M7 decreases gradually up until ~1Gyr. For older ages, we found only two early M dwarfs that are both inactive and seem to continue the gradual decrease. We also found 14 mid-type M-dwarfs, out of which 11 are inactive and present a significant decrease in H{alpha}EW, suggesting that the magnetic activity decreases rapidly after ~1Gyr. We fit L_H{alpha}/L_bol versus age with a broken power law and find an index of -0.11_-0.01_^+0.02^ for ages >1Gyr) leaves this part of the relation far less constrained. Finally, from repeated independent measurements for the same stars, we find that 94% of them have a level of H{alpha}EW variability <~5{AA} at young ages (<1Gyr).
Data from: Is local selection so widespread in river organisms? Fractal...
search.dataone.org
datadryad.org
Updated Jun 27, 2025
Cite
Christophe Lemaire (2025). Is local selection so widespread in river organisms? Fractal geometry of river networks leads to high bias in outlier detection [Dataset]. http://doi.org/10.5061/dryad.8m30f
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.8m30f
Dataset updated
Jun 27, 2025
Dataset provided by
Dryad Digital Repository
Authors
Christophe Lemaire
Time period covered
Jul 17, 2020
Description
Identifying local adaptation is crucial in conservation biology in order to define ecotypes and establish management guidelines. Local adaptation is often inferred from the detection of loci showing a high differentiation between populations, the so-called FST outliers. Methods of detection of loci under selection are reputed to be robust in most spatial population models. However, using simulations we showed that FST outlier tests provided a high rate of false positives (up to 60%) in fractal environments such as river networks. Surprisingly, the number of sampled demes was correlated with parameters of population genetic structure, such as the variance of FSTs, and hence strongly influenced the rate of outliers. This unappreciated property of river networks therefore needs to be accounted for in genetic studies on adaptation and conservation of river organisms.
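For readers unfamiliar with the term, the sketch below shows what a naive "FST outlier" scan looks like: it computes FST per biallelic locus from per-deme allele frequencies using the textbook heterozygosity form FST = (HT - HS) / HT, then flags loci in the upper tail of the empirical distribution. It is not the detection method evaluated in the study; the function names, the equal weighting of demes, the quantile threshold, and the allele frequencies are all assumptions made for illustration.

```python
def fst(freqs):
    """F_ST for one biallelic locus from per-deme allele frequencies
    (demes weighted equally, an assumption of this sketch)."""
    p_bar = sum(freqs) / len(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity of the pooled population
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)  # mean within-deme value
    return 0.0 if h_t == 0.0 else (h_t - h_s) / h_t


def fst_outliers(loci, quantile=0.95):
    """Names of loci whose F_ST is at or above the empirical `quantile` of all loci."""
    scores = {name: fst(freqs) for name, freqs in loci.items()}
    ranked = sorted(scores.values())
    threshold = ranked[min(int(quantile * len(ranked)), len(ranked) - 1)]
    return [name for name, s in scores.items() if s >= threshold]


# Made-up allele frequencies in two demes for four hypothetical loci.
loci = {"a": [0.5, 0.5], "b": [0.4, 0.6], "c": [0.1, 0.9], "d": [0.45, 0.55]}
print(fst_outliers(loci, quantile=0.75))
```

A scan like this has no model of how demes are connected, which is the kind of situation where the abstract reports inflated false positive rates for river networks.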
mumpcepy: A Python implementation of the Method of Uncertainty Minimization...
datasets.ai
catalog.data.gov
0
Updated Aug 6, 2024
Cite
National Institute of Standards and Technology (2024). mumpcepy: A Python implementation of the Method of Uncertainty Minimization using Polynomial Chaos Expansions [Dataset]. https://datasets.ai/datasets/mumpcepy-a-python-implementation-of-the-method-of-uncertainty-minimization-using-polynomia-c2fc3
Explore at:
Available download formats: 0
Dataset updated
Aug 6, 2024
Dataset authored and provided by
National Institute of Standards and Technology
Description
The Method of Uncertainty Minimization using Polynomial Chaos Expansions (MUM-PCE) was developed as a software tool to constrain physical models against experimental measurements. These models contain parameters that cannot be easily determined from first principles and so must be measured, and some which cannot even be easily measured. In such cases, the models are validated and tuned against a set of global experiments which may depend on the underlying physical parameters in a complex way. The measurement uncertainty will affect the uncertainty in the parameter values.
Outliers - Website research page
data.bathspa.ac.uk
pdf
Updated Jun 1, 2023
Cite
Rosemary Snell (2023). Outliers - Website research page [Dataset]. http://doi.org/10.17870/bathspa.11538207.v1
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.17870/bathspa.11538207.v1
Dataset updated
Jun 1, 2023
Dataset provided by
BathSPAdata
Authors
Rosemary Snell
License
http://rightsstatements.org/vocab/InC/1.0/
Description
Outliers is a research project articulated through a solo exhibition held at No 20 Arts in London. It contained a body of 26 works including paintings, drawings and photographs that were the culmination of a research trip to Greenland. This body of work aimed to explore how the medium of paint could be manipulated to not only represent the dramatic and transient nature of the icescapes of Greenland but to also emulate and explore the properties of snow and ice themselves. This item contains a text reproduction of a blog about the project originally appearing at link below. This content is provided as contextualising information. The work is under copyright and may not be used without permission. Use of this repository acknowledges cooperation with its policies and relevant copyright law.
Genomic regions underlying metabolic and neuronal signaling pathways are...
datasetcatalog.nlm.nih.gov
datadryad.org
+1 more
Updated Apr 2, 2020
Cite
Wagner, Dominique; Lovette, Irby; Chen, Nancy; Taylor, Scott; Curry, Robert (2020). Genomic regions underlying metabolic and neuronal signaling pathways are temporally consistent outliers in a moving avian hybrid zone [Dataset]. http://doi.org/10.5061/dryad.j3tx95x8c
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.j3tx95x8c
Dataset updated
Apr 2, 2020
Authors
Wagner, Dominique; Lovette, Irby; Chen, Nancy; Taylor, Scott; Curry, Robert
Description
The study of hybrid zones can provide insight into the genetic basis of species differences that are relevant for the maintenance of reproductive isolation. Hybrid zones can also provide insight into climate change, species distributions, and evolution. The hybrid zone between black-capped chickadees (Poecile atricapillus) and Carolina chickadees (P. carolinensis) is shifting northward in response to increasing winter temperatures but is not increasing in width. This pattern indicates strong selection against chickadees with admixed genomes. Using high-resolution genomic data, we identified regions of the genomes that are outliers in both time points and do not introgress between the species; these regions may be involved in the maintenance of reproductive isolation. Genes involved in metabolic regulation processes were overrepresented in this dataset. Several gene ontology categories were also temporally consistent—including glutamate signaling, synaptic transmission, and catabolic processes—but the nucleotide variants leading to this pattern were not. Our results support recent findings that hybrids between black-capped and Carolina chickadees have higher basal metabolic rates than either parental species and suffer spatial memory and problem-solving deficits. Metabolic breakdown, as well as spatial memory and problem-solving, in hybrid chickadees may act as strong postzygotic isolation mechanisms in this moving hybrid zone.
Statistical results for Is, ΔD, and α after outlier removal.
plos.figshare.com
xls
Updated May 15, 2025
Cite
Hong Zhang; Yong Cao; Qiang Luo; Wei Qi (2025). Statistical results for Is, ΔD, and α after outlier removal. [Dataset]. http://doi.org/10.1371/journal.pone.0321740.t002
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0321740.t002
Dataset updated
May 15, 2025
Dataset provided by
PLOS ONE
Authors
Hong Zhang; Yong Cao; Qiang Luo; Wei Qi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Statistical results for Is, ΔD, and α after outlier removal.
Data from: Predictive Control Charts (PCC): A Bayesian approach in online...
tandf.figshare.com
txt
Updated May 30, 2023
Cite
Konstantinos Bourazas; Dimitrios Kiagias; Panagiotis Tsiamyrtzis (2023). Predictive Control Charts (PCC): A Bayesian approach in online monitoring of short runs [Dataset]. http://doi.org/10.6084/m9.figshare.14588607.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.14588607.v1
Dataset updated
May 30, 2023
Dataset provided by
Taylor & Francis
Authors
Konstantinos Bourazas; Dimitrios Kiagias; Panagiotis Tsiamyrtzis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Performing online monitoring for short horizon data is a challenging, though cost effective benefit. Self-starting methods attempt to address this issue adopting a hybrid scheme that executes calibration and monitoring simultaneously. In this work, we propose a Bayesian alternative that will utilize prior information and possible historical data (via power priors), offering a head-start in online monitoring, putting emphasis on outlier detection. For cases of complete prior ignorance, the objective Bayesian version will be provided. Charting will be based on the predictive distribution and the methodological framework will be derived in a general way, to facilitate discrete and continuous data from any distribution that belongs to the regular exponential family (with Normal, Poisson and Binomial being the most representative). Being in the Bayesian arena, we will be able to not only perform process monitoring, but also draw online inference regarding the unknown process parameter(s). An extended simulation study will evaluate the proposed methodology against frequentist based competitors and it will cover topics regarding prior sensitivity and model misspecification robustness. A continuous and a discrete real data set will illustrate its use in practice. Technical details, algorithms, guidelines on prior elicitation and R-codes are provided in appendices and supplementary material. Short production runs and online phase I monitoring are among the best candidates to benefit from the developed methodology.
Data from: Tolerated outlier prediction method of excavation damaged zone...
tandf.figshare.com
xlsx
Updated Dec 2, 2024
Cite
Yaxi Shen; Shunchuan Wu; Yongbing Wang; Jiaxin Wang; Shuxian Wang; Shigui Huang (2024). Tolerated outlier prediction method of excavation damaged zone thickness of drift based on interpretable SOA-QRF ensemble learning [Dataset]. http://doi.org/10.6084/m9.figshare.25585923.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.25585923.v1
Dataset updated
Dec 2, 2024
Dataset provided by
Taylor & Francis
Authors
Yaxi Shen; Shunchuan Wu; Yongbing Wang; Jiaxin Wang; Shuxian Wang; Shigui Huang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Drift excavation induces excavation damaged zones (EDZ) due to stress redistribution, impacting drift stability and rock deformation support. Predicting EDZ thickness is crucial, but traditional machine learning models are susceptible to potential outliers in the dataset. Directly eliminating outliers, however, impacts training effectiveness. This study introduces an EDZ thickness prediction model utilising quantile loss and random forest (RF) optimised by the seagull optimisation algorithm (SOA), enabling median regression with tolerated outlier performance. 209 sets of data sets containing 34 mine borehole data were used to establish the prediction model. Evaluation using R², explained variance score (EVS), mean absolute error (MAE), and mean square error (MSE) demonstrates the superior accuracy of the proposed SOA-QRF model compared to traditional models. Based on the discussion on the treatment of outliers, the outcomes indicate that the SOA-QRF model is more suitable for datasets with outliers and is able to effectuate tolerated outlier prediction. Additionally, three interpretation methods were utilised to explain the SOA-QRF model, enhancing the transparency of the model’s prediction process and facilitating the analysis of dispatcher regulation.
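The role of the quantile loss in tolerating outliers can be illustrated with a small sketch. It shows only the loss itself (often called the pinball loss) and why minimising it at q = 0.5 yields a median rather than a mean; it does not reproduce the SOA-QRF model, and the function names and thickness values below are made up for the example.

```python
def pinball_loss(y_true, y_pred, q=0.5):
    """Mean quantile (pinball) loss; at q=0.5 it equals half the mean absolute error."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        total += max(q * diff, (q - 1.0) * diff)
    return total / len(y_true)


def best_constant(y, q=0.5):
    """Constant prediction, chosen among the observed values, that minimises the loss."""
    return min(y, key=lambda c: pinball_loss(y, [c] * len(y), q))


# Hypothetical thickness values with one gross outlier.
thickness = [1.1, 1.2, 1.3, 1.4, 9.0]
median_like = best_constant(thickness, q=0.5)
mean_value = sum(thickness) / len(thickness)
print("pinball-optimal constant:", median_like)
print("arithmetic mean:", round(mean_value, 2))
```

The pinball-optimal constant stays with the bulk of the data, while the arithmetic mean (the minimiser of squared error) is pulled toward the outlier. This is the sense in which median regression can keep outliers in the training set without letting them dominate the fit.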
Methane in NEEM-2011-S1 ice core from North Greenland, 1800 years continuous...
service.tib.eu
Updated Dec 1, 2024
(2024). Methane in NEEM-2011-S1 ice core from North Greenland, 1800 years continuous record: outliers, v2 [Dataset]. https://service.tib.eu/ldmservice/dataset/png-doi-10-1594-pangaea-899038
Dataset updated
Dec 1, 2024
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
North Greenland
Description
Methane concentration from the Greenland NEEM-2011-S1 ice core from 71 to 408 m depth (~270-1961 CE). Methane concentrations were analysed online by laser spectrometer (SARA, Spectroscopy by Amplified Resonant Absorption, developed at Laboratoire Interdisciplinaire de Physique, Grenoble, France) on gas extracted from an ice core processed using a continuous melter system (Desert Research Institute). Methane data have a 5 second integration time (raw data acquisition rate 0.6 Hz). Analytical precision, from the Allan variance test, is 0.9 ppb (2 sigma). Long-term reproducibility is 2.6% (2 sigma). Gaps in the record are due to problems during online analysis, which was conducted in August-September 2011. Note: the latitude and longitude provided are for the main NEEM borehole; the NEEM-2011-S1 core was drilled 200 m away in 2011 to a depth of 410 m. Methane concentrations are reported on the NOAA2004 scale (instrument calibrated on dry synthetic air standards). A correction factor of 1.079 has been applied to all data to correct for methane dissolution in the melted ice core sample prior to gas extraction. The correction factor was calculated using empirical data (concentrations not aligned/tied to existing discrete methane measurements). Additional methods description is provided in:
Stowasser, C., Buizert, C., Gkinis, V., Chappellaz, J., Schupbach, S., Bigler, M., Fain, X., Sperlich, P., Baumgartner, M., Schilt, A., Blunier, T., 2012. Continuous measurements of methane mixing ratios from ice cores. Atmos. Meas. Tech. 5, 999-1013.
Morville, J., Kassi, S., Chenevier, M., Romanini, D., 2005. Fast, low-noise, mode-by-mode, cavity-enhanced absorption spectroscopy by diode-laser self-locking. Appl. Phys. B Lasers Opt. 80, 1027-1038.
NEEM (North Greenland Eemian Ice Drilling) project information: http://neem.dk/
Regression analysis for the WTP for pain relief for piglet castration (df =...
plos.figshare.com
xls
Updated May 31, 2023
Ulrich J. Frey; Frauke Pirscher (2023). Regression analysis for the WTP for pain relief for piglet castration (df = 1203, adjusted R-squared: 0.69). [Dataset]. http://doi.org/10.1371/journal.pone.0202193.t005
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0202193.t005
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Ulrich J. Frey; Frauke Pirscher
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Regression analysis for the WTP for pain relief for piglet castration (df = 1203, adjusted R-squared: 0.69).




Data from: Outlier classification using autoencoders: application for...

Data from: A Diagnostic Procedure for Detecting Outliers in Linear...

Data from: Mining Distance-Based Outliers in Near Linear Time

sequence outliers-supp data

Guidelines for benchmarking and outlier detection in clinical quality...

Outliers Dataset

Data for Filtering Organized 3D Point Clouds for Bin Picking Applications

AI Histology QC Outlier Detection Tool Market Research Report 2033

AI Histology QC Outlier Detection Tool Market Outlook

Outlier Responses Reflect Sensitivity to Statistical Structure in the Human...

Data from: Missing Data in the Uniform Crime Reports (UCR), 1977-2000...

Sample of 45 H{alpha}EW outliers - Dataset - B2FIND

Data from: Is local selection so widespread in river organisms? Fractal...

mumpcepy: A Python implementation of the Method of Uncertainty Minimization...

Outliers - Website research page

Genomic regions underlying metabolic and neuronal signaling pathways are...

Statistical results for Is, ΔD, and α after outlier removal.

Data from: Predictive Control Charts (PCC): A Bayesian approach in online...

Data from: Tolerated outlier prediction method of excavation damaged zone...

Methane in NEEM-2011-S1 ice core from North Greenland, 1800 years continuous...

Regression analysis for the WTP for pain relief for piglet castration (df =...
