26 datasets found
  1. Data from: Outlier classification using autoencoders: application for...

    • osti.gov
    Updated Jun 2, 2021
    Cite
    Bianchi, F. M.; Brunner, D.; Kube, R.; LaBombard, B. (2021). Outlier classification using autoencoders: application for fluctuation driven flows in fusion plasmas [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/1882649-outlier-classification-using-autoencoders-application-fluctuation-driven-flows-fusion-plasmas
    Dataset updated
    Jun 2, 2021
    Dataset provided by
    Office of Science (http://www.er.doe.gov/)
    United States Department of Energy (http://energy.gov/)
    Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Plasma Science and Fusion Center
    Authors
    Bianchi, F. M.; Brunner, D.; Kube, R.; LaBombard, B.
    Description

    Understanding the statistics of fluctuation-driven flows in the boundary layer of magnetically confined plasmas is desired to accurately model the lifetime of the vacuum vessel components. Mirror Langmuir probes (MLPs) are a novel diagnostic that uniquely allow us to sample the plasma parameters on a time scale shorter than the characteristic time scale of their fluctuations. Sudden large-amplitude fluctuations in the plasma degrade the precision and accuracy of the plasma parameters reported by MLPs for cases in which the probe bias range is of insufficient amplitude. While some data samples can readily be classified as valid and invalid, we find that such a classification may be ambiguous for up to 40% of data sampled for the plasma parameters and bias voltages considered in this study. In this contribution, we employ an autoencoder (AE) to learn a low-dimensional representation of valid data samples. By definition, the coordinates in this space are the features that mostly characterize valid data. Ambiguous data samples are classified in this space using standard classifiers for vectorial data. In this way, we avoid defining complicated threshold rules to identify outliers, which require strong assumptions and introduce biases in the analysis. By removing the outliers that are identified in the latent low-dimensional space of the AE, we find that the average conductive and convective radial heat fluxes are between approximately 5% and 15% lower than when removing outliers identified by threshold values. For contributions to the radial heat flux due to triple correlations, the difference is up to 40%.

  2. Data from: A Diagnostic Procedure for Detecting Outliers in Linear...

    • tandf.figshare.com
    • figshare.com
    txt
    Updated Feb 9, 2024
    Cite
    Dongjun You; Michael Hunter; Meng Chen; Sy-Miin Chow (2024). A Diagnostic Procedure for Detecting Outliers in Linear State–Space Models [Dataset]. http://doi.org/10.6084/m9.figshare.12162075.v1
    Available download formats: txt
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Dongjun You; Michael Hunter; Meng Chen; Sy-Miin Chow
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Outliers can be more problematic in longitudinal data than in independent observations due to the correlated nature of such data. It is common practice to discard outliers as they are typically regarded as a nuisance or an aberration in the data. However, outliers can also convey meaningful information concerning potential model misspecification, and ways to modify and improve the model. Moreover, outliers that occur among the latent variables (innovative outliers) have distinct characteristics compared to those impacting the observed variables (additive outliers), and are best evaluated with different test statistics and detection procedures. We demonstrate and evaluate the performance of an outlier detection approach for multi-subject state-space models in a Monte Carlo simulation study, with corresponding adaptations to improve power and reduce false detection rates. Furthermore, we demonstrate the empirical utility of the proposed approach using data from an ecological momentary assessment study of emotion regulation together with an open-source software implementation of the procedures.

  3. Data from: Mining Distance-Based Outliers in Near Linear Time

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Apr 11, 2025
    Cite
    Dashlink (2025). Mining Distance-Based Outliers in Near Linear Time [Dataset]. https://catalog.data.gov/dataset/mining-distance-based-outliers-in-near-linear-time
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Full title: Mining Distance-Based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule

    Abstract: Defining outliers by their distance to neighboring examples is a popular approach to finding unusual examples in a data set. Recently, much work has been conducted with the goal of finding fast algorithms for this task. We show that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used. We test our algorithm on real high-dimensional data sets with millions of examples and show that the near linear scaling holds over several orders of magnitude. Our average case analysis suggests that much of the efficiency is because the time to process non-outliers, which are the majority of examples, does not depend on the size of the data set.
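    The nested loop with randomization and pruning described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `top_outliers`, its parameters, and the choice to score each point by the distance to its k-th nearest neighbour are assumptions made for the example.

```python
import heapq
import math
import random


def top_outliers(points, k=2, n_outliers=1, seed=0):
    """Return (score, index) pairs for the n_outliers strongest outliers.

    Assumption for this sketch: a point's score is the Euclidean distance
    to its k-th nearest neighbour.
    """
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)  # random order is what makes early pruning likely

    cutoff = 0.0  # score of the weakest outlier currently kept
    top = []      # min-heap of (score, index)

    for i in order:
        nearest = []  # max-heap (negated distances) of the k closest so far
        pruned = False
        for j in order:
            if i == j:
                continue
            d = math.dist(points[i], points[j])
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # Pruning rule: the k-th neighbour distance can only shrink as
            # more points are scanned, so once it falls below the cutoff
            # this point cannot enter the top list and the scan stops.
            if len(nearest) == k and -nearest[0] < cutoff:
                pruned = True
                break
        if pruned or len(nearest) < k:
            continue
        score = -nearest[0]
        if len(top) < n_outliers:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n_outliers:
            cutoff = top[0][0]
    return sorted(top, reverse=True)


points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(top_outliers(points, k=2, n_outliers=1))
```

    In the toy data the isolated point at index 4 is reported as the single outlier. Most other points are abandoned after only a few distance computations once the cutoff has risen, which is the effect the abstract credits for the near linear scaling.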

  4. sequence outliers-supp data

    • ieee-dataport.org
    Updated Jun 14, 2019
    Cite
    Lakshmi Sujeeun (2019). sequence outliers-supp data [Dataset]. https://ieee-dataport.org/documents/sequence-outliers-supp-data
    Dataset updated
    Jun 14, 2019
    Authors
    Lakshmi Sujeeun
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  5. Guidelines for benchmarking and outlier detection in clinical quality...

    • bridges.monash.edu
    bin
    Updated Mar 26, 2025
    Cite
    Jessy Hansen; Arul Earnest; Ahmad Reza Pourghaderi; Susannah Ahern (2025). Guidelines for benchmarking and outlier detection in clinical quality registries - simulation and model build code [Dataset]. http://doi.org/10.26180/28665671.v1
    Available download formats: bin
    Dataset updated
    Mar 26, 2025
    Dataset provided by
    Monash University
    Authors
    Jessy Hansen; Arul Earnest; Ahmad Reza Pourghaderi; Susannah Ahern
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Contains the summary dataset, simulation Stata code and model build R code for the study titled "Benchmarking methods for detection of underperforming healthcare providers in clinical quality registries – implementation guidelines".

    Contains:

    • guidelines_data_preparation.do: Stata code for running the simulations (using the user written hiersim command available at https://doi.org/10.26180/24480889) and preparing the summary performance dataset.
    • sim_extra_sum.dta: Summary performance dataset containing the average accuracy of outlier detection methods for simulations of clinical quality registry data of varied data parameters.
    • guidelines_model_build.R: R code for developing generalised linear models for predicting the accuracy of outlier detection based on registry data parameters.

  6. Outliers Dataset

    • universe.roboflow.com
    zip
    Updated May 18, 2022
    Cite
    Renz (2022). Outliers Dataset [Dataset]. https://universe.roboflow.com/renz/outliers/dataset/4
    Available download formats: zip
    Dataset updated
    May 18, 2022
    Dataset authored and provided by
    Renz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Leathers Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. Leather Quality Inspection: "Outliers" could be used by the leather manufacturing companies to evaluate the quality of their products. Any abnormal textures, cuts, or stains on leather fabrics could be instantly detected by the model, enabling quality control teams to speed up their inspection process.

    2. Product Verification in E-commerce: Online retail shops selling leather products could use "Outliers" to verify the product images uploaded by sellers. Detecting any issues or inconsistencies in leather product images can help provide a quality certification and maintain a high standard of product listings.

    3. Consumer Review Analysis: "Outliers" could be used to verify customer complaints or reviews regarding purchased leather goods. By analyzing pictures provided by customers, companies could effectively respond to any valid product defects.

    4. Pre-Loved Goods Inspection: The model could assist second-hand goods websites or thrift stores in verifying the condition of leather goods. This could ensure that only goods meeting certain quality standards are accepted and sold.

    5. Restoration and Repair Services: For businesses dealing with the restoration of antique or damaged leather items, "Outliers" could be used to spot problematic areas and assess the work needed for restoration or repair. This could help improve the service and pricing accuracy.

  7. Data for Filtering Organized 3D Point Clouds for Bin Picking Applications

    • datasets.ai
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +1 more
    0, 34, 47
    Updated Aug 6, 2024
    Cite
    National Institute of Standards and Technology (2024). Data for Filtering Organized 3D Point Clouds for Bin Picking Applications [Dataset]. https://datasets.ai/datasets/data-for-filtering-organized-3d-point-clouds-for-bin-picking-applications
    Available download formats: 0, 34, 47
    Dataset updated
    Aug 6, 2024
    Dataset authored and provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    Contains scans of a bin filled with different parts (screws, nuts, rods, spheres, sprockets). For each part type, an RGB image and an organized 3D point cloud obtained with a structured light sensor are provided. In addition, an unorganized 3D point cloud representing an empty bin and a small Matlab script to read the files are also provided. The 3D data contain a lot of outliers, and the data were used to demonstrate a new filtering technique.

  8. AI Histology QC Outlier Detection Tool Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 4, 2025
    Cite
    Growth Market Reports (2025). AI Histology QC Outlier Detection Tool Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/ai-histology-qc-outlier-detection-tool-market
    Available download formats: pptx, csv, pdf
    Dataset updated
    Aug 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    AI Histology QC Outlier Detection Tool Market Outlook



    According to our latest research, the global AI Histology QC Outlier Detection Tool market size reached USD 412 million in 2024, with a robust compound annual growth rate (CAGR) of 18.7% observed over the past year. The market’s expansion is primarily driven by the increasing adoption of artificial intelligence in digital pathology and the rising demand for high-precision quality control in histological workflows. By 2033, the market is forecasted to reach USD 1.97 billion, reflecting the accelerating integration of AI-powered QC outlier detection tools across clinical and research environments worldwide.




    The surge in demand for AI Histology QC Outlier Detection Tools is primarily attributed to the pressing need for accuracy and consistency in histopathological diagnostics. Traditional quality control processes in histology are labor-intensive and prone to human error, which can result in diagnostic discrepancies and impact patient outcomes. The deployment of advanced AI-driven QC outlier detection tools addresses these challenges by automating the identification of anomalies and artifacts in histological slides, ensuring standardized results and significantly reducing turnaround times. Moreover, the integration of machine learning algorithms enables these systems to continuously improve their detection capabilities, further enhancing diagnostic reliability and supporting the growing trend towards digitization in pathology laboratories.




    Another significant growth driver for the AI Histology QC Outlier Detection Tool market is the increasing prevalence of cancer and other chronic diseases that require histopathological examination for diagnosis and treatment planning. The rising global cancer burden, coupled with the shortage of skilled pathologists, is pushing healthcare providers to adopt AI-powered solutions that can streamline workflow efficiency and mitigate diagnostic bottlenecks. These tools not only facilitate faster and more accurate detection of outliers in tissue samples but also support pathologists in prioritizing cases that require immediate attention. As a result, healthcare institutions are investing heavily in AI-based QC solutions to optimize resource utilization, improve patient care, and comply with stringent regulatory standards for laboratory quality assurance.




    Technological advancements and strategic collaborations between AI developers, pathology labs, and healthcare providers are further accelerating market growth. The ongoing development of sophisticated image analysis algorithms, cloud-based platforms, and interoperability standards is enabling seamless integration of AI QC tools into existing laboratory information systems. Additionally, government initiatives aimed at promoting digital health transformation and funding for AI research in medical diagnostics are creating a favorable environment for market expansion. The proliferation of digital pathology infrastructure, particularly in developed regions, is expected to drive the adoption of AI QC outlier detection tools, while emerging markets are witnessing growing interest as healthcare systems modernize and invest in advanced diagnostic technologies.




    From a regional perspective, North America currently dominates the AI Histology QC Outlier Detection Tool market, accounting for a significant share of global revenues in 2024. The region’s leadership is underpinned by a well-established healthcare infrastructure, high adoption rates of digital pathology, and strong presence of leading AI technology providers. Europe follows closely, supported by robust investments in healthcare innovation and a proactive regulatory landscape. Meanwhile, the Asia Pacific region is poised for the fastest growth over the forecast period, driven by increasing healthcare expenditure, expanding cancer screening programs, and rising awareness of the benefits of AI-powered diagnostic solutions. Latin America and the Middle East & Africa are also expected to witness steady growth as digital transformation initiatives gain momentum in these regions.




  9. Outlier Responses Reflect Sensitivity to Statistical Structure in the Human...

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Marta I. Garrido; Maneesh Sahani; Raymond J. Dolan (2023). Outlier Responses Reflect Sensitivity to Statistical Structure in the Human Brain [Dataset]. http://doi.org/10.1371/journal.pcbi.1002999
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Marta I. Garrido; Maneesh Sahani; Raymond J. Dolan
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We constantly look for patterns in the environment that allow us to learn its key regularities. These regularities are fundamental in enabling us to make predictions about what is likely to happen next. The physiological study of regularity extraction has focused primarily on repetitive sequence-based rules within the sensory environment, or on stimulus-outcome associations in the context of reward-based decision-making. Here we ask whether we implicitly encode non-sequential stochastic regularities, and detect violations therein. We addressed this question using a novel experimental design and both behavioural and magnetoencephalographic (MEG) metrics associated with responses to pure-tone sounds with frequencies sampled from a Gaussian distribution. We observed that sounds in the tail of the distribution evoked a larger response than those that fell at the centre. This response resembled the mismatch negativity (MMN) evoked by surprising or unlikely events in traditional oddball paradigms. Crucially, responses to physically identical outliers were greater when the distribution was narrower. These results show that humans implicitly keep track of the uncertainty induced by apparently random distributions of sensory events. Source reconstruction suggested that the statistical-context-sensitive responses arose in a temporo-parietal network, areas that have been associated with attention orientation to unexpected events. Our results demonstrate a very early neurophysiological marker of the brain's ability to implicitly encode complex statistical structure in the environment. We suggest that this sensitivity provides a computational basis for our ability to make perceptual inferences in noisy environments and to make decisions in an uncertain world.

  10. Data from: Missing Data in the Uniform Crime Reports (UCR), 1977-2000...

    • catalog.data.gov
    • icpsr.umich.edu
    Updated Mar 12, 2025
    Cite
    National Institute of Justice (2025). Missing Data in the Uniform Crime Reports (UCR), 1977-2000 [United States] [Dataset]. https://catalog.data.gov/dataset/missing-data-in-the-uniform-crime-reports-ucr-1977-2000-united-states-4b340
    Dataset updated
    Mar 12, 2025
    Dataset provided by
    National Institute of Justice (http://nij.ojp.gov/)
    Area covered
    United States
    Description

    This study reexamined and recoded missing data in the Uniform Crime Reports (UCR) for the years 1977 to 2000 for all police agencies in the United States. The principal investigator conducted a data cleaning of 20,067 Originating Agency Identifiers (ORIs) contained within the Offenses-Known UCR data from 1977 to 2000. Data cleaning involved performing agency name checks and creating new numerical codes for different types of missing data including missing data codes that identify whether a record was aggregated to a particular month, whether no data were reported (true missing), if more than one index crime was missing, if a particular index crime (motor vehicle theft, larceny, burglary, assault, robbery, rape, murder) was missing, researcher assigned missing value codes according to the "rule of 20", outlier values, whether an ORI was covered by another agency, and whether an agency did not exist during a particular time period.

  11. Sample of 45 H{alpha}EW outliers - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Oct 23, 2023
    Cite
    (2023). Sample of 45 H{alpha}EW outliers - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/7782063a-207c-571b-bad5-80eedba236cf
    Dataset updated
    Oct 23, 2023
    Description

    In this work, we calibrate the relationship between H{alpha} emission and M-dwarf ages. We compile a sample of 892 M-dwarfs with H{alpha} equivalent width (H{alpha}EW) measurements from the literature that are either comoving with a white dwarf of known age (21 stars) or in a known young association (871 stars). In this sample we identify 7 M-dwarfs that are new candidate members of known associations. By dividing the stars into active and inactive categories according to their H{alpha}EW and spectral type (SpT), we find that the fraction of active dwarfs decreases with increasing age, and the form of the decline depends on SpT. Using the compiled sample of age calibrators, we find that H{alpha} EW and fractional H{alpha} luminosity (L_H{alpha}/L_bol) decrease with increasing age. H{alpha}EW for SpT<~M7 decreases gradually up until ~1Gyr. For older ages, we found only two early M dwarfs that are both inactive and seem to continue the gradual decrease. We also found 14 mid-type M-dwarfs, out of which 11 are inactive and present a significant decrease in H{alpha}EW, suggesting that the magnetic activity decreases rapidly after ~1Gyr. We fit L_H{alpha}/L_bol versus age with a broken power law and find an index of -0.11_-0.01_^+0.02^ for ages >1Gyr) leaves this part of the relation far less constrained. Finally, from repeated independent measurements for the same stars, we find that 94% of them have a level of H{alpha}EW variability <~5{AA} at young ages (<1Gyr).

  12. Data from: Is local selection so widespread in river organisms? Fractal...

    • search.dataone.org
    • datadryad.org
    Updated Jun 27, 2025
    Cite
    Christophe Lemaire (2025). Is local selection so widespread in river organisms? Fractal geometry of river networks leads to high bias in outlier detection [Dataset]. http://doi.org/10.5061/dryad.8m30f
    Dataset updated
    Jun 27, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Christophe Lemaire
    Time period covered
    Jul 17, 2020
    Description

    Identifying local adaptation is crucial in conservation biology in order to define ecotypes and establish management guidelines. Local adaptation is often inferred from the detection of loci showing a high differentiation between populations, the so-called FST outliers. Methods of detection of loci under selection are reputed to be robust in most spatial population models. However, using simulations we showed that FST outlier tests provided a high rate of false positives (up to 60%) in fractal environments such as river networks. Surprisingly, the number of sampled demes was correlated with parameters of population genetic structure, such as the variance of FSTs, and hence strongly influenced the rate of outliers. This unappreciated property of river networks therefore needs to be accounted for in genetic studies on adaptation and conservation of river organisms.

  13. mumpcepy: A Python implementation of the Method of Uncertainty Minimization...

    • datasets.ai
    • catalog.data.gov
    0
    Updated Aug 6, 2024
    Cite
    National Institute of Standards and Technology (2024). mumpcepy: A Python implementation of the Method of Uncertainty Minimization using Polynomial Chaos Expansions [Dataset]. https://datasets.ai/datasets/mumpcepy-a-python-implementation-of-the-method-of-uncertainty-minimization-using-polynomia-c2fc3
    Available download formats: 0
    Dataset updated
    Aug 6, 2024
    Dataset authored and provided by
    National Institute of Standards and Technology
    Description

    The Method of Uncertainty Minimization using Polynomial Chaos Expansions (MUM-PCE) was developed as a software tool to constrain physical models against experimental measurements. These models contain parameters that cannot be easily determined from first principles and so must be measured, and some which cannot even be easily measured. In such cases, the models are validated and tuned against a set of global experiments which may depend on the underlying physical parameters in a complex way. The measurement uncertainty will affect the uncertainty in the parameter values.

  14. Outliers - Website research page

    • data.bathspa.ac.uk
    pdf
    Updated Jun 1, 2023
    Cite
    Rosemary Snell (2023). Outliers - Website research page [Dataset]. http://doi.org/10.17870/bathspa.11538207.v1
    Available download formats: pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    BathSPAdata
    Authors
    Rosemary Snell
    License

    http://rightsstatements.org/vocab/InC/1.0/

    Description

    Outliers is a research project articulated through a solo exhibition held at No 20 Arts in London. It contained a body of 26 works including paintings, drawings and photographs that were the culmination of a research trip to Greenland. This body of work aimed to explore how the medium of paint could be manipulated to not only represent the dramatic and transient nature of the icescapes of Greenland but to also emulate and explore the properties of snow and ice themselves. This item contains a text reproduction of a blog about the project originally appearing at link below. This content is provided as contextualising information. The work is under copyright and may not be used without permission. Use of this repository acknowledges cooperation with its policies and relevant copyright law.

  15. Genomic regions underlying metabolic and neuronal signaling pathways are...

    • datasetcatalog.nlm.nih.gov
    • datadryad.org
    • +1 more
    Updated Apr 2, 2020
    Cite
    Wagner, Dominique; Lovette, Irby; Chen, Nancy; Taylor, Scott; Curry, Robert (2020). Genomic regions underlying metabolic and neuronal signaling pathways are temporally consistent outliers in a moving avian hybrid zone [Dataset]. http://doi.org/10.5061/dryad.j3tx95x8c
    Dataset updated
    Apr 2, 2020
    Authors
    Wagner, Dominique; Lovette, Irby; Chen, Nancy; Taylor, Scott; Curry, Robert
    Description

    The study of hybrid zones can provide insight into the genetic basis of species differences that are relevant for the maintenance of reproductive isolation. Hybrid zones can also provide insight into climate change, species distributions, and evolution. The hybrid zone between black-capped chickadees (Poecile atricapillus) and Carolina chickadees (P. carolinensis) is shifting northward in response to increasing winter temperatures but is not increasing in width. This pattern indicates strong selection against chickadees with admixed genomes. Using high-resolution genomic data, we identified regions of the genomes that are outliers in both time points and do not introgress between the species; these regions may be involved in the maintenance of reproductive isolation. Genes involved in metabolic regulation processes were overrepresented in this dataset. Several gene ontology categories were also temporally consistent—including glutamate signaling, synaptic transmission, and catabolic processes—but the nucleotide variants leading to this pattern were not. Our results support recent findings that hybrids between black-capped and Carolina chickadees have higher basal metabolic rates than either parental species and suffer spatial memory and problem-solving deficits. Metabolic breakdown, as well as spatial memory and problem-solving, in hybrid chickadees may act as strong postzygotic isolation mechanisms in this moving hybrid zone.

  16. Statistical results for Is, ΔD, and α after outlier removal.

    • plos.figshare.com
    xls
    Updated May 15, 2025
    Cite
    Hong Zhang; Yong Cao; Qiang Luo; Wei Qi (2025). Statistical results for Is, ΔD, and α after outlier removal. [Dataset]. http://doi.org/10.1371/journal.pone.0321740.t002
    Available download formats: xls
    Dataset updated
    May 15, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Hong Zhang; Yong Cao; Qiang Luo; Wei Qi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Statistical results for Is, ΔD, and α after outlier removal.

  17. Data from: Predictive Control Charts (PCC): A Bayesian approach in online...

    • tandf.figshare.com
    txt
    Updated May 30, 2023
    Cite
    Konstantinos Bourazas; Dimitrios Kiagias; Panagiotis Tsiamyrtzis (2023). Predictive Control Charts (PCC): A Bayesian approach in online monitoring of short runs [Dataset]. http://doi.org/10.6084/m9.figshare.14588607.v1
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Konstantinos Bourazas; Dimitrios Kiagias; Panagiotis Tsiamyrtzis
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performing online monitoring for short horizon data is a challenging, though cost effective benefit. Self-starting methods attempt to address this issue adopting a hybrid scheme that executes calibration and monitoring simultaneously. In this work, we propose a Bayesian alternative that will utilize prior information and possible historical data (via power priors), offering a head-start in online monitoring, putting emphasis on outlier detection. For cases of complete prior ignorance, the objective Bayesian version will be provided. Charting will be based on the predictive distribution and the methodological framework will be derived in a general way, to facilitate discrete and continuous data from any distribution that belongs to the regular exponential family (with Normal, Poisson and Binomial being the most representative). Being in the Bayesian arena, we will be able to not only perform process monitoring, but also draw online inference regarding the unknown process parameter(s). An extended simulation study will evaluate the proposed methodology against frequentist based competitors and it will cover topics regarding prior sensitivity and model misspecification robustness. A continuous and a discrete real data set will illustrate its use in practice. Technical details, algorithms, guidelines on prior elicitation and R-codes are provided in appendices and supplementary material. Short production runs and online phase I monitoring are among the best candidates to benefit from the developed methodology.

  18. Data from: Tolerated outlier prediction method of excavation damaged zone...

    • tandf.figshare.com
    xlsx
    Updated Dec 2, 2024
    Cite
    Yaxi Shen; Shunchuan Wu; Yongbing Wang; Jiaxin Wang; Shuxian Wang; Shigui Huang (2024). Tolerated outlier prediction method of excavation damaged zone thickness of drift based on interpretable SOA-QRF ensemble learning [Dataset]. http://doi.org/10.6084/m9.figshare.25585923.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Dec 2, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Yaxi Shen; Shunchuan Wu; Yongbing Wang; Jiaxin Wang; Shuxian Wang; Shigui Huang
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Drift excavation induces excavation damaged zones (EDZ) due to stress redistribution, impacting drift stability and rock deformation support. Predicting EDZ thickness is crucial, but traditional machine learning models are susceptible to potential outliers in the dataset. Directly eliminating outliers, however, impacts training effectiveness. This study introduces an EDZ thickness prediction model utilising quantile loss and random forest (RF) optimised by the seagull optimisation algorithm (SOA), enabling median regression with tolerated outlier performance. A total of 209 data sets containing 34 mine borehole data were used to establish the prediction model. Evaluation using R², explained variance score (EVS), mean absolute error (MAE), and mean square error (MSE) demonstrates the superior accuracy of the proposed SOA-QRF model compared to traditional models. Based on the discussion of the treatment of outliers, the outcomes indicate that the SOA-QRF model is more suitable for datasets with outliers and is able to perform tolerated outlier prediction. Additionally, three interpretation methods were utilised to explain the SOA-QRF model, enhance the transparency of the model’s prediction process, and facilitate the analysis of dispatcher regulation.

  19. Methane in NEEM-2011-S1 ice core from North Greenland, 1800 years continuous...

    • service.tib.eu
    Updated Dec 1, 2024
    + more versions
    Cite
    (2024). Methane in NEEM-2011-S1 ice core from North Greenland, 1800 years continuous record: outliers, v2 [Dataset]. https://service.tib.eu/ldmservice/dataset/png-doi-10-1594-pangaea-899038
    Explore at:
    Dataset updated
    Dec 1, 2024
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    North Greenland
    Description

    Description and Notes Description: Methane concentration from the Greenland NEEM-2011-S1 Ice Core from 71 to 408 m depth (~270-1961 CE). Methane concentrations analysed online by laser spectrometer (SARA, Spectroscopy by Amplified Resonant Absorption, developed at Laboratoire Interdisciplinaire de Physique, Grenoble, France) on gas extracted from an ice core processed using a continuous melter system (Desert Research Institute). Methane data have a 5 second integration time (raw data acquisition rate 0.6 Hz). Analytical precision, from Allan Variance test, is 0.9 ppb (2 sigma). Long-term reproducibility is 2.6% (2 sigma). Gaps in the record are due to problems during online analysis. Online analysis conducted August-September 2011. Note: Lat-Long provided is for main NEEM borehole. The NEEM-2011-S1 core was drilled 200 m away in 2011 to 410 m depth. Methane concentrations are reported on NOAA2004 scale (instrument calibrated on dry synthetic air standards). A correction factor of 1.079 has been applied to all data to correct for methane dissolution in melted ice core sample prior to gas extraction. Correction factor calculated using empirical data (concentrations not aligned/tied to existing discrete methane measurements). Additional methods description provided in: Stowasser, C., Buizert, C., Gkinis, V., Chappellaz, J., Schupbach, S., Bigler, M., Fain, X., Sperlich, P., Baumgartner, M., Schilt, A., Blunier, T., 2012. Continuous measurements of methane mixing ratios from ice cores. Atmos. Meas. Tech. 5, 999-1013. Morville, J., Kassi, S., Chenevier, M., Romanini, D., 2005. Fast, low-noise, mode-by-mode, cavity-enhanced absorption spectroscopy by diode-laser self-locking. Appl. Phys. B Lasers Opt. 80, 1027-1038. * NEEM (North Greenland Eemian Ice Drilling) project information http://neem.dk/

  20. Regression analysis for the WTP for pain relief for piglet castration (df =...

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Ulrich J. Frey; Frauke Pirscher (2023). Regression analysis for the WTP for pain relief for piglet castration (df = 1203, adjusted R-squared: 0.69). [Dataset]. http://doi.org/10.1371/journal.pone.0202193.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ulrich J. Frey; Frauke Pirscher
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Regression analysis for the WTP for pain relief for piglet castration (df = 1203, adjusted R-squared: 0.69).
