100+ datasets found
  1. Data from: A Comparison of Filter-based Approaches for Model-based Prognostics

    • catalog.data.gov
    • s.cnmilf.com
    • +2 more
    Updated Apr 11, 2025
    Cite
    Dashlink (2025). A Comparison of Filter-based Approaches for Model-based Prognostics [Dataset]. https://catalog.data.gov/dataset/a-comparison-of-filter-based-approaches-for-model-based-prognostics
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Model-based prognostics approaches use domain knowledge about a system and its failure modes through the use of physics-based models. Model-based prognosis is generally divided into two sequential problems: a joint state-parameter estimation problem, in which, using the model, the health of a system or component is determined based on the observations; and a prediction problem, in which, using the model, the state-parameter distribution is simulated forward in time to compute end of life and remaining useful life. The first problem is typically solved through the use of a state observer, or filter. The choice of filter depends on the assumptions that may be made about the system, and on the desired algorithm performance. In this paper, we review three separate filters for the solution to the first problem: the Daum filter, an exact nonlinear filter; the unscented Kalman filter, which approximates nonlinearities through the use of a deterministic sampling method known as the unscented transform; and the particle filter, which approximates the state distribution using a finite set of discrete, weighted samples, called particles. Using a centrifugal pump as a case study, we conduct a number of simulation-based experiments investigating the performance of the different algorithms as applied to prognostics.
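
    As a rough illustration of the first (state estimation) problem, the sketch below runs a bootstrap particle filter, one of the three filters reviewed, on a toy scalar degradation model; the dynamics, noise levels, and particle count are illustrative assumptions, not the paper's centrifugal-pump model.

    import numpy as np

    rng = np.random.default_rng(0)

    def propagate(x, dt=1.0):
        # hypothetical degradation dynamics: health decays slowly with process noise
        return x - 0.01 * dt + rng.normal(0.0, 0.005, size=x.shape)

    def likelihood(y, x, sigma=0.02):
        # measurement model: the health state is observed with Gaussian noise
        return np.exp(-0.5 * ((y - x) / sigma) ** 2)

    def particle_filter(observations, n_particles=1000):
        particles = rng.normal(1.0, 0.05, n_particles)   # initial health around 1.0
        estimates = []
        for y in observations:
            particles = propagate(particles)             # predict
            weights = likelihood(y, particles)           # update with the new observation
            weights /= weights.sum()
            estimates.append(np.sum(weights * particles))
            idx = rng.choice(n_particles, size=n_particles, p=weights)
            particles = particles[idx]                   # resample to avoid weight degeneracy
        return np.array(estimates)

    # toy usage: a slowly degrading health signal observed with noise
    truth = 1.0 - 0.01 * np.arange(50)
    observations = truth + rng.normal(0.0, 0.02, truth.shape)
    print(particle_filter(observations)[:5])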

  2. A dataset for comparing filtering methods used to wave and non-wave flow at the surface of the Agulhas region

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 4, 2023
    Cite
    Xiao, Qiyu (2023). A dataset for comparing filtering methods used to wave and non-wave flow at the surface of the Agulhas region [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6561067
    Explore at:
    Dataset updated
    Jan 4, 2023
    Dataset provided by
    Abernathey, Ryan P
    Xiao, Qiyu
    Jones, C Spencer
    Smith, K Shafer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises sea surface height (SSH) and velocity data at the ocean surface in two small regions near the Agulhas retroflection. The unfiltered SSH and a horizontal velocity field are provided, along with the same fields after various kinds of filtering, as described in the accompanying manuscript, Using Lagrangian filtering to remove waves from the ocean surface velocity field (https://doi.org/10.31223/X5D352). The code repository for this work is https://github.com/cspencerjones/separating-balanced .

    Two time-resolutions are provided: two weeks of hourly data and 70 days of daily data.

    Seventy_daysA.nc contains daily data for region A and Seventy_daysB.nc contains daily data for region B, including unfiltered, Lagrangian-filtered and omega-filtered velocity and sea-surface height.

    two_weeksA.nc contains hourly data for region A and two_weeksB.nc contains hourly data for region B, including unfiltered and Lagrangian-filtered velocity and sea-surface height.

    Note that region A has been moved in version 2 of this dataset.
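
    A minimal sketch of opening these files, assuming xarray with a NetCDF backend is installed; the variable names inside the files are not listed here, so inspect the printed dataset to find the unfiltered and filtered fields.

    import xarray as xr

    daily_a = xr.open_dataset("Seventy_daysA.nc")    # 70 days of daily data, region A
    hourly_a = xr.open_dataset("two_weeksA.nc")      # two weeks of hourly data, region A
    print(daily_a)                                   # lists the available variables and coordinates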

    See the manuscript and code repository for more information.

    This work was supported by NASA award 80NSSC20K1142.

  3. Data for Filtering Organized 3D Point Clouds for Bin Picking Applications

    • datasets.ai
    • catalog.data.gov
    Updated Aug 6, 2024
    + more versions
    Cite
    National Institute of Standards and Technology (2024). Data for Filtering Organized 3D Point Clouds for Bin Picking Applications [Dataset]. https://datasets.ai/datasets/data-for-filtering-organized-3d-point-clouds-for-bin-picking-applications
    Explore at:
    Available download formats
    Dataset updated
    Aug 6, 2024
    Dataset authored and provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    Contains scans of a bin filled with different parts (screws, nuts, rods, spheres, sprockets). For each part type, an RGB image and an organized 3D point cloud obtained with a structured-light sensor are provided. In addition, an unorganized 3D point cloud representing an empty bin and a small Matlab script to read the files are also provided. The 3D data contain many outliers and were used to demonstrate a new filtering technique.
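
    The specific filtering technique demonstrated with this data is not described here; as a generic illustration of removing outliers from a point cloud, a simple statistical k-nearest-neighbour filter might look like the sketch below (the thresholds are arbitrary assumptions).

    import numpy as np

    def remove_statistical_outliers(points, k=8, std_ratio=2.0):
        """Drop points whose mean distance to their k nearest neighbours is more than
        std_ratio standard deviations above the global mean of that statistic."""
        # brute-force pairwise distances; use a KD-tree for large clouds
        d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
        knn_mean = np.sort(d, axis=1)[:, 1:k + 1].mean(axis=1)
        keep = knn_mean < knn_mean.mean() + std_ratio * knn_mean.std()
        return points[keep]

    cloud = np.random.default_rng(0).normal(size=(500, 3))   # stand-in for a scanned cloud
    print(remove_statistical_outliers(cloud).shape)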

  4. Data from: Bagged filters for partially observed interacting systems

    • tandf.figshare.com
    zip
    Updated Jun 6, 2023
    Cite
    Edward L. Ionides; Kidus Asfaw; Joonha Park; Aaron A. King (2023). Bagged filters for partially observed interacting systems [Dataset]. http://doi.org/10.6084/m9.figshare.16553426.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Edward L. Ionides; Kidus Asfaw; Joonha Park; Aaron A. King
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Bagging (i.e., bootstrap aggregating) involves combining an ensemble of bootstrap estimators. We consider bagging for inference from noisy or incomplete measurements on a collection of interacting stochastic dynamic systems. Each system is called a unit, and each unit is associated with a spatial location. A motivating example arises in epidemiology, where each unit is a city: the majority of transmission occurs within a city, with smaller yet epidemiologically important interactions arising from disease transmission between cities. Monte Carlo filtering methods used for inference on nonlinear non-Gaussian systems can suffer from a curse of dimensionality as the number of units increases. We introduce bagged filter (BF) methodology which combines an ensemble of Monte Carlo filters, using spatiotemporally localized weights to select successful filters at each unit and time. We obtain conditions under which likelihood evaluation using a BF algorithm can beat a curse of dimensionality, and we demonstrate applicability even when these conditions do not hold. BF can out-perform an ensemble Kalman filter on a coupled population dynamics model describing infectious disease transmission. A block particle filter also performs well on this task, though the bagged filter respects smoothness and conservation laws that a block particle filter can violate.
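
    A toy sketch of the bagging idea under simplifying assumptions: an ensemble of independent bootstrap particle filters is run on the same single-unit data and their likelihood estimates are averaged. The actual BF method additionally uses spatiotemporally localized weights across interacting units, which this sketch omits.

    import numpy as np

    rng = np.random.default_rng(1)

    def pf_loglik(obs, n=200):
        """Log of an (unnormalised) likelihood estimate from one bootstrap particle filter on a toy AR(1) model."""
        x = rng.normal(0, 1, n)
        ll = 0.0
        for y in obs:
            x = 0.9 * x + rng.normal(0, 0.5, n)            # propagate particles
            w = np.exp(-0.5 * ((y - x) / 0.3) ** 2)        # observation weights (unnormalised)
            ll += np.log(w.mean() + 1e-300)
            w /= w.sum()
            x = x[rng.choice(n, n, p=w)]                   # resample
        return ll

    obs = rng.normal(0, 1, 30)
    replicate_lls = np.array([pf_loglik(obs) for _ in range(20)])   # 20 bagged replicates
    m = replicate_lls.max()                                         # combine by log-mean-exp
    print("combined estimate:", m + np.log(np.mean(np.exp(replicate_lls - m))))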

  5. Data from: Generative Filtering for Recursive Bayesian Inference with Streaming Data

    • tandf.figshare.com
    pdf
    Updated Feb 13, 2025
    Cite
    Ian Taylor; Andee Kaplan; Brenda Betancourt (2025). Generative Filtering for Recursive Bayesian Inference with Streaming Data [Dataset]. http://doi.org/10.6084/m9.figshare.28047072.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Feb 13, 2025
    Dataset provided by
    Taylor & Francis
    Authors
    Ian Taylor; Andee Kaplan; Brenda Betancourt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In the streaming data setting, where data arrive continuously or in frequent batches and there is no pre-determined amount of total data, Bayesian models can employ recursive updates, incorporating each new batch of data into the model parameters’ posterior distribution. Filtering methods are currently used to perform these updates efficiently; however, they suffer from eventual degradation as the number of unique values within the filtered samples decreases. We propose Generative Filtering, a method for efficiently performing recursive Bayesian updates in the streaming setting. Generative Filtering retains the speed of a filtering method while using parallel updates to avoid degenerate distributions after repeated applications. We derive rates of convergence for Generative Filtering and conditions for the use of sufficient statistics instead of fully storing all past data. We investigate the alleviation of filtering degradation through simulation and an ecological time series of counts. Supplementary materials for this article are available online.
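
    To make the streaming-update setting concrete, here is a minimal sketch of exact recursive Bayesian updating with a conjugate Beta-Binomial model, where the posterior after each batch becomes the prior for the next. It illustrates the problem setting only; it is not the Generative Filtering algorithm, which targets models without such conjugacy.

    import numpy as np

    rng = np.random.default_rng(0)
    alpha, beta = 1.0, 1.0                       # Beta(1, 1) prior on a success probability

    for batch in range(5):                       # batches arrive over time
        data = rng.binomial(1, 0.3, size=100)    # new batch of Bernoulli observations
        alpha += data.sum()                      # conjugate update: posterior becomes the next prior
        beta += len(data) - data.sum()
        print(f"batch {batch}: posterior mean = {alpha / (alpha + beta):.3f}")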

  6. Dataset of book subjects that contain An introduction to wavelets and other filtering methods in finance and economics

    • workwithdata.com
    Updated Nov 7, 2024
    Cite
    Work With Data (2024). Dataset of book subjects that contain An introduction to wavelets and other filtering methods in finance and economics [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=j0-book&fop0=%3D&fval0=An+introduction+to+wavelets+and+other+filtering+methods+in+finance+and+economics&j=1&j0=books
    Explore at:
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 4 rows and is filtered where the book is An introduction to wavelets and other filtering methods in finance and economics. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  7. nllb-filtering

    • huggingface.co
    Updated Aug 17, 2022
    + more versions
    Cite
    Yaya (2022). nllb-filtering [Dataset]. https://huggingface.co/datasets/yaya-sy/nllb-filtering
    Explore at:
    Dataset updated
    Aug 17, 2022
    Authors
    Yaya
    Description

    Dataset Card for No Language Left Behind (NLLB - 200vo)

      Dataset Summary
    

    This dataset was created based on metadata for mined bitext released by Meta AI. It contains bitext for 148 English-centric and 1465 non-English-centric language pairs using the stopes mining library and the LASER3 encoders (Heffernan et al., 2022). The complete dataset is ~450GB. CCMatrix contains previous versions of mined instructions.

      How to use the data
    

    There are two ways… See the full description on the dataset page: https://huggingface.co/datasets/yaya-sy/nllb-filtering.
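
    A generic sketch of streaming the dataset with the Hugging Face datasets library; the repository id comes from the citation above, while the split name and record fields are assumptions to verify on the dataset page.

    from datasets import load_dataset

    ds = load_dataset("yaya-sy/nllb-filtering", split="train", streaming=True)
    for i, example in enumerate(ds):
        print(example)        # inspect the bitext fields
        if i == 2:
            break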

  8. Data from: Current methods for automated filtering of multiple sequence alignments frequently worsen single-gene phylogenetic inference

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    application/gzip, txt +1
    Updated Jul 19, 2024
    Cite
    Ge Tan; Matthieu Muffato; Christian Ledergerber; Javier Herrero; Nick Goldman; Manuel Gil; Christophe Dessimoz; Ge Tan; Matthieu Muffato; Christian Ledergerber; Javier Herrero; Nick Goldman; Manuel Gil; Christophe Dessimoz (2024). Data from: Current methods for automated filtering of multiple sequence alignments frequently worsen single-gene phylogenetic inference [Dataset]. http://doi.org/10.5061/dryad.pc5j0
    Explore at:
    Available download formats: application/gzip, txt, zip
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ge Tan; Matthieu Muffato; Christian Ledergerber; Javier Herrero; Nick Goldman; Manuel Gil; Christophe Dessimoz; Ge Tan; Matthieu Muffato; Christian Ledergerber; Javier Herrero; Nick Goldman; Manuel Gil; Christophe Dessimoz
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Phylogenetic inference is generally performed on the basis of multiple sequence alignments (MSA). Because errors in an alignment can lead to errors in tree estimation, there is a strong interest in identifying and removing unreliable parts of the alignment. In recent years several automated filtering approaches have been proposed, but despite their popularity, a systematic and comprehensive comparison of different alignment filtering methods on real data has been lacking. Here, we extend and apply recently introduced phylogenetic tests of alignment accuracy on a large number of gene families and contrast the performance of unfiltered versus filtered alignments in the context of single-gene phylogeny reconstruction. Based on multiple genome-wide empirical and simulated data sets, we show that the trees obtained from filtered MSAs are on average worse than those obtained from unfiltered MSAs. Furthermore, alignment filtering often leads to an increase in the proportion of well-supported branches that are actually wrong. We confirm that our findings hold for a wide range of parameters and methods. Although our results suggest that light filtering (up to 20% of alignment positions) has little impact on tree accuracy and may save some computation time, contrary to widespread practice, we do not generally recommend the use of current alignment filtering methods for phylogenetic inference. By providing a way to rigorously and systematically measure the impact of filtering on alignments, the methodology set forth here will guide the development of better filtering algorithms.
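
    As a concrete but generic example of what "filtering alignment positions" means, the sketch below drops the gappiest columns of an alignment, capped at a fraction of all positions; it is not one of the published filtering tools evaluated in this study.

    import numpy as np

    def filter_columns(msa, max_fraction=0.2, gap="-"):
        """Remove the gappiest columns, capped at max_fraction of all positions."""
        arr = np.array([list(seq) for seq in msa])
        gap_frac = (arr == gap).mean(axis=0)
        n_drop = int(max_fraction * arr.shape[1])
        drop = set(np.argsort(gap_frac)[::-1][:n_drop]) if n_drop else set()
        drop = {c for c in drop if gap_frac[c] > 0}          # only drop columns that contain gaps
        keep = [c for c in range(arr.shape[1]) if c not in drop]
        return ["".join(row[keep]) for row in arr]

    msa = ["ACG-TACGT-", "ACGGTAC-TA", "ACG-TACGTA"]
    print(filter_columns(msa))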

  9. Data from: Distributed Cubature Information Filtering Method for State Estimation in Bearing-only Sensor Network

    • ieee-dataport.org
    Updated Oct 8, 2023
    Cite
    Zhan Chen (2023). Distributed Cubature Information Filtering Method for State Estimation in Bearing-only Sensor Network [Dataset]. https://ieee-dataport.org/documents/distributed-cubature-information-filtering-method-state-estimation-bearing-only-sensor
    Explore at:
    Dataset updated
    Oct 8, 2023
    Authors
    Zhan Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this brief

  10. Toxicity-Bias-Filtering

    • huggingface.co
    Cite
    DATE-LM (Data Attribution Evaluation in Language Models), Toxicity-Bias-Filtering [Dataset]. https://huggingface.co/datasets/DataAttributionEval/Toxicity-Bias-Filtering
    Explore at:
    Dataset authored and provided by
    DATE-LM (Data Attribution Evaluation in Language Models)
    Description

    Overview

    This dataset is designed to evaluate the effectiveness of toxicity and bias filtering methods. The objective is to detect and filter a small subset of toxic or unsafe examples that have been injected into a larger, predominantly safe training set, using a reference set that exposes unsafe model behavior. All models are evaluated using the same training and reference sets. We provide two evaluation settings, denoted by the suffixes Hom (Homogeneous) and Het (Heterogeneous).… See the full description on the dataset page: https://huggingface.co/datasets/DataAttributionEval/Toxicity-Bias-Filtering.

  11. Best FE on clean and filtered data

    • kaggle.com
    Updated Mar 29, 2020
    Cite
    Icaro Freire (2020). Best FE on clean and filtered data [Dataset]. https://www.kaggle.com/icarofreire/best-filter-and-featureengineering/notebooks
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 29, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Icaro Freire
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Content

    The two CSV files here are the train and test data in Kaggle's Ion Switching Competition, with drift removed and filtered with a Kalman filter to reduce noise.
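
    A minimal 1-D Kalman filter of the kind used to smooth the signal, shown here on synthetic data; the process and measurement variances, and the idea of applying it to a "signal" column of the CSVs, are assumptions rather than the exact settings used for this dataset.

    import numpy as np

    def kalman_1d(z, q=1e-4, r=1e-2):
        """Constant-level model: x_t = x_{t-1} + process noise; z_t = x_t + measurement noise."""
        x, p = float(z[0]), 1.0
        out = np.empty(len(z))
        for t, zt in enumerate(z):
            p = p + q                      # predict
            k = p / (p + r)                # Kalman gain
            x = x + k * (zt - x)           # update with the new sample
            p = (1 - k) * p
            out[t] = x
        return out

    rng = np.random.default_rng(0)
    noisy = np.repeat([0.0, 1.0, 3.0], 200) + rng.normal(0, 0.3, 600)   # synthetic stepped signal
    print(kalman_1d(noisy)[:5])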

    Acknowledgements

    These ideas were posted by @cdeotte and @teejmahal20; I just ran the filter and the FE and saved the data.

  12. Data from: SAR Image Enhancement using Particle Filters

    • catalog.data.gov
    • data.nasa.gov
    • +1 more
    Updated Apr 10, 2025
    + more versions
    Cite
    Dashlink (2025). SAR Image Enhancement using Particle Filters [Dataset]. https://catalog.data.gov/dataset/sar-image-enhancement-using-particle-filters
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    In this paper, we propose a novel approach to reduce the noise in Synthetic Aperture Radar (SAR) images using particle filters. Interpretation of SAR images is a difficult problem, since they are contaminated with a multiplicative noise, which is known as the “Speckle Noise”. In literature, the general approach for removing the speckle is to use the local statistics, which are computed in a square window. Here, we propose to use particle filters, which is a sequential Bayesian technique. The proposed method also uses the local statistics to denoise the images. Since this is a Bayesian approach, the computed statistics of the window can be exploited as a priori information. Moreover, particle filters are sequential methods, which are more appropriate to handle the heterogeneous structure of the image. Computer simulations show that the proposed method provides better edge-preserving results with satisfactory speckle removal, when compared to the results obtained by Gamma Maximum a posteriori (MAP) filter.
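
    The paper's method is a particle filter; as a simpler baseline that also uses local statistics computed in a square window, a basic Lee-type speckle filter is sketched below. This is a generic illustration, not the proposed algorithm.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def lee_filter(img, window=7, noise_var=0.05):
        mean = uniform_filter(img, window)          # local mean in a square window
        sq_mean = uniform_filter(img ** 2, window)
        var = sq_mean - mean ** 2                   # local variance
        gain = var / (var + noise_var)              # pull toward the local mean in flat areas
        return mean + gain * (img - mean)

    speckled = np.random.default_rng(0).gamma(shape=1.0, scale=1.0, size=(128, 128))
    print(lee_filter(speckled).shape)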

  13. Supplementary evaluation files for the paper: Grid-Based Bayesian Filtering Methods for Pedestrian Dead Reckoning Indoor Positioning Using Smartphones

    • zenodo.org
    • data.niaid.nih.gov
    Updated Aug 11, 2020
    Cite
    Miroslav Opiela; Miroslav Opiela (2020). Supplementary evaluation files for the paper: Grid-Based Bayesian Filtering Methods for Pedestrian Dead Reckoning Indoor Positioning Using Smartphones [Dataset]. http://doi.org/10.5281/zenodo.3975389
    Explore at:
    Dataset updated
    Aug 11, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Miroslav Opiela; Miroslav Opiela
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This package contains evaluation supplementary files for the paper: Grid-Based Bayesian Filtering Methods for Pedestrian Dead Reckoning Indoor Positioning Using Smartphones by Miroslav Opiela and František Galčík.

    Contents:

    • ground_truth - real positions of checkpoints for given input files
    • input - sensor measurements recordings with initial positions (also after floor transitions) and checkpoint labels
    • maps - processed map models containing positions of points and connections (e.g., walls) in a custom coordinate system. The reference to GNSS and map rotation is included in maps-meta.xml
    • output - data processed by the localization system. JSON containing the applied method, its configuration, and all estimated positions. Errors for every folder are summarized in the CSV file
    • visualization - trajectories visualized for selected output files
    • readme.txt - describes data formats used for particular files in this dataset and summarizes output files

    Venues

    Data are recorded in three buildings:

    • codename: SA1, SA1_rotated - recorded by the author in the faculty building (Park Angelinum 9, 04001, Košice, Slovakia) using a Lenovo tablet
    • codename: AtlantisR0, AtlantisR-1, AtlantisR+1, AtlantisR+2 - the shopping mall Atlantis Le Centre (Boulevard Salvador Allende, 44800 Saint-Herblain, France). The dataset is from the IPIN 2018 competition; loc_20180922_160206 was recorded by the author using a Xiaomi Mi 5.
    • codename: CNR_0, CNR_1, CNR_2 - the research institute building CNR (Via Giuseppe Moruzzi, 56127 Pisa, Italy). The dataset is from the IPIN 2019 competition.

    Used datasets

    A subset of the input data is derived from logfiles provided by the organizers of the IPIN 2018 and IPIN 2019 competitions:

    • Jimenez, A.R.; Mendoza-Silva, G.M.; Ortiz, M.; Perez-Navarro, A.; Perul, J.; Seco, F.; Torres-Sospedra, J. Datasets and Supporting Materials for the IPIN 2018 Competition Track 3 (Smartphone-based, off-site). http://dx.doi.org/10.5281/zenodo.2823964
    • Jiménez, A. R.; Perez-Navarro, A.; Crivello, A.; Mendoza-Silva, G.; Ortiz, M.; Perul, J.; Seco, F. and Torres-Sospedra, J. Datasets and Supporting Materials for the IPIN 2019 Competition Track 3 (Smartphone-based, off-site), Zenodo 2019. http://dx.doi.org/10.5281/zenodo.3606765

    Funding

    The work was partially supported by the Slovak Grant Agency of the Ministry of Education and Academy of Science of the Slovak Republic under grant no. 1/0056/18 and by the Slovak Research and Development Agency under the contract no. APVV-15-0091.

    Contact

    For any further questions, please contact:

    Miroslav Opiela, miroslav.opiela@upjs.sk Institute of Computer Science, Faculty of Science, P. J. Šafárik University (UPJS), Košice, Slovakia

  14. Data from: Removing Spikes While Preserving Data and Noise using Wavelet Filter Banks

    • catalog.data.gov
    • datasets.ai
    • +3 more
    Updated Apr 11, 2025
    Cite
    Dashlink (2025). Removing Spikes While Preserving Data and Noise using Wavelet Filter Banks [Dataset]. https://catalog.data.gov/dataset/removing-spikes-while-preserving-data-and-noise-using-wavelet-filter-banks
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Many diagnostic datasets suffer from the adverse effects of spikes that are embedded in data and noise. For example, this is true for electrical power system data where the switches, relays, and inverters are major contributors to these effects. Spikes are mostly harmful to the analysis of data in that they throw off real-time detection of abnormal conditions, and classification of faults. Since noise and spikes are mixed together and embedded within the data, removal of the unwanted signals from the data is not always easy and may result in losing the integrity of the information carried by the data. Additionally, in some applications noise and spikes need to be filtered independently. The proposed algorithm is a multi-resolution filtering approach based on Haar wavelets that is capable of removing spikes while incurring insignificant damage to other data. In particular, noise in the data, which is a useful indicator that a sensor is healthy and not stuck, can be preserved using our approach. Presented here is the theoretical background with some examples from a realistic testbed.
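
    A minimal multi-resolution illustration of the idea, assuming PyWavelets is available: decompose with Haar wavelets, clip only the finest-scale detail coefficients (which carry narrow spikes), and reconstruct. The threshold rule is an assumption; this is not the algorithm from the paper.

    import numpy as np
    import pywt

    rng = np.random.default_rng(0)
    signal = np.sin(np.linspace(0, 8 * np.pi, 1024)) + rng.normal(0, 0.1, 1024)
    spiky = signal.copy()
    spiky[::97] += 5.0                                    # inject narrow spikes

    coeffs = pywt.wavedec(spiky, "haar", level=4)
    d1 = coeffs[-1]                                       # finest-scale detail coefficients
    thresh = 4.0 * np.median(np.abs(d1)) / 0.6745         # robust threshold (assumed factor)
    coeffs[-1] = np.clip(d1, -thresh, thresh)             # clip spikes, keep ordinary noise
    despiked = pywt.waverec(coeffs, "haar")
    print(np.abs(despiked - signal).max())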

  15. Data from: Multi-resolution filters for massive spatio-temporal data

    • tandf.figshare.com
    zip
    Updated Jun 4, 2023
    Cite
    Marcin Jurek; Matthias Katzfuss (2023). Multi-resolution filters for massive spatio-temporal data [Dataset]. http://doi.org/10.6084/m9.figshare.13865000.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Marcin Jurek; Matthias Katzfuss
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Spatio-temporal datasets are rapidly growing in size. For example, environmental variables are measured with increasing resolution by increasing numbers of automated sensors mounted on satellites and aircraft. Using such data, which are typically noisy and incomplete, the goal is to obtain complete maps of the spatio-temporal process, together with uncertainty quantification. We focus here on real-time filtering inference in linear Gaussian state-space models. At each time point, the state is a spatial field evaluated on a very large spatial grid, making exact inference using the Kalman filter computationally infeasible. Instead, we propose a multi-resolution filter (MRF), a highly scalable and fully probabilistic filtering method that resolves spatial features at all scales. We prove that the MRF matrices exhibit a particular block-sparse multi-resolution structure that is preserved under filtering operations through time. We describe connections to existing methods, including hierarchical matrices from numerical mathematics. We also discuss inference on time-varying parameters using an approximate Rao-Blackwellized particle filter, in which the integrated likelihood is computed using the MRF. Using a simulation study and a real satellite-data application, we show that the MRF strongly outperforms competing approaches. Supplementary materials include Python code for reproducing the simulations, some detailed properties of the MRF and auxiliary theoretical results.
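
    To see why exact filtering becomes infeasible when the state is a huge spatial grid, the sketch below performs one exact Kalman filter step with dense n x n matrices; the MRF replaces this dense algebra with a block-sparse multi-resolution approximation, which is not shown here.

    import numpy as np

    def kalman_step(x, P, y, F, Q, H, R):
        x_pred = F @ x                               # predict
        P_pred = F @ P @ F.T + Q                     # O(n^3) for dense matrices
        S = H @ P_pred @ H.T + R                     # innovation covariance
        K = P_pred @ H.T @ np.linalg.inv(S)          # Kalman gain
        x_new = x_pred + K @ (y - H @ x_pred)        # update
        P_new = (np.eye(len(x)) - K @ H) @ P_pred
        return x_new, P_new

    n = 200                                          # toy grid size; real grids are far larger
    x, P = np.zeros(n), np.eye(n)
    F, Q, H, R = 0.95 * np.eye(n), 0.1 * np.eye(n), np.eye(n), 0.2 * np.eye(n)
    x, P = kalman_step(x, P, np.random.default_rng(0).normal(size=n), F, Q, H, R)
    print(x.shape, P.shape)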

  16. A dataset for comparing filtering methods used to separate balanced and unbalanced flow at the surface of the Agulhas region

    • zenodo.org
    tar
    Updated Jan 3, 2023
    Cite
    C Spencer Jones; C Spencer Jones; Qiyu Xiao; Ryan P Abernathey; K Shafer Smith; Qiyu Xiao; Ryan P Abernathey; K Shafer Smith (2023). A dataset for comparing filtering methods used to separate balanced and unbalanced flow at the surface of the Agulhas region [Dataset]. http://doi.org/10.5281/zenodo.6561068
    Explore at:
    Available download formats: tar
    Dataset updated
    Jan 3, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    C Spencer Jones; C Spencer Jones; Qiyu Xiao; Ryan P Abernathey; K Shafer Smith; Qiyu Xiao; Ryan P Abernathey; K Shafer Smith
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset comprises sea surface height (SSH) and velocity data at the ocean surface in two small regions near the Agulhas retroflection. The unfiltered SSH and a horizontal velocity field are provided, along with the same fields after various kinds of filtering, as described in the accompanying manuscript, Separating balanced and unbalanced flow at the surface of the Agulhas region using Lagrangian filtering. The code repository for this work is https://github.com/cspencerjones/separating-balanced .

    Two time-resolutions are provided: two weeks of hourly data and 70 days of daily data. See the manuscript for more information.

    This work was supported by NASA award 80NSSC20K1142.

  17. Data from: Advances in Uncertainty Representation and Management for Particle Filtering Applied to Prognostics

    • catalog.data.gov
    • cloud.csiss.gmu.edu
    Updated Apr 11, 2025
    + more versions
    Cite
    Dashlink (2025). Advances in Uncertainty Representation and Management for Particle Filtering Applied to Prognostics [Dataset]. https://catalog.data.gov/dataset/advances-in-uncertainty-representation-and-management-for-particle-filtering-applied-to-pr
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    Particle filters (PF) have been established as the de facto state of the art in failure prognosis. They combine advantages of the rigors of Bayesian estimation to nonlinear prediction while also providing uncertainty estimates with a given solution. Within the context of particle filters, this paper introduces several novel methods for uncertainty representations and uncertainty management. The prediction uncertainty is modeled via a rescaled Epanechnikov kernel and is assisted with resampling techniques and regularization algorithms. Uncertainty management is accomplished through parametric adjustments in a feedback correction loop of the state model and its noise distributions. The correction loop provides the mechanism to incorporate information that can improve solution accuracy and reduce uncertainty bounds. In addition, this approach results in reduction in computational burden. The scheme is illustrated with real vibration feature data from a fatigue-driven fault in a critical aircraft component.
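
    A sketch of the resampling-plus-regularization step mentioned above: systematic resampling followed by kernel jittering of the resampled particles. For simplicity the jitter here is Gaussian with a rule-of-thumb bandwidth, whereas the paper uses a rescaled Epanechnikov kernel.

    import numpy as np

    rng = np.random.default_rng(0)

    def systematic_resample(weights):
        n = len(weights)
        positions = (np.arange(n) + rng.random()) / n
        cum = np.cumsum(weights)
        cum[-1] = 1.0                                   # guard against floating-point round-off
        return np.searchsorted(cum, positions)

    def resample_and_regularize(particles, weights, bandwidth_scale=0.5):
        idx = systematic_resample(weights / weights.sum())
        new = particles[idx]
        h = bandwidth_scale * new.std() * len(new) ** (-1 / 5)   # rule-of-thumb bandwidth
        return new + rng.normal(0.0, h, size=new.shape)          # regularization jitter

    particles = rng.normal(0, 1, 500)
    weights = np.exp(-0.5 * particles ** 2)
    print(resample_and_regularize(particles, weights)[:5])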

  18. Internet Filtering Software Industry Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 22, 2025
    Cite
    Market Report Analytics (2025). Internet Filtering Software Industry Report [Dataset]. https://www.marketreportanalytics.com/reports/internet-filtering-software-industry-88261
    Explore at:
    Available download formats: doc, pdf, ppt
    Dataset updated
    Apr 22, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Internet Filtering Software market is experiencing robust growth, projected to reach a substantial size by 2033. A compound annual growth rate (CAGR) of 14% from 2025 to 2033 indicates a significant upward trajectory driven by several key factors. The increasing adoption of cloud-based solutions, coupled with heightened concerns surrounding cybersecurity threats and data privacy regulations, is fueling market expansion. Businesses across various sectors, including BFSI (Banking, Financial Services, and Insurance), IT & Telecom, Government, and Education, are actively investing in robust internet filtering software to protect their sensitive data and comply with regulatory mandates. The market is segmented by component (solution and services), deployment mode (cloud and on-premises), filtering type (DNS, keyword, URL, and other filtering methods), and industry vertical. The cloud deployment model is witnessing accelerated adoption due to its scalability, cost-effectiveness, and ease of management. Furthermore, the rising prevalence of sophisticated cyber threats, including malware and phishing attacks, necessitates advanced filtering capabilities, driving demand for comprehensive solutions that go beyond basic URL filtering.

    The competitive landscape comprises established players like Broadcom, Cisco, Palo Alto Networks, and McAfee, alongside emerging innovative companies. However, factors such as the high initial investment cost for implementing comprehensive solutions and the complexity of managing sophisticated filtering systems might pose challenges to market growth. Future growth will depend heavily on ongoing innovation in threat detection, seamless integration with existing IT infrastructure, and the increasing awareness of the need for robust internet security among organizations of all sizes. The increasing sophistication of cyberattacks and the evolving regulatory landscape are likely to continue driving demand for advanced internet filtering solutions over the forecast period.

    The Asia Pacific region is expected to witness substantial growth due to increasing internet penetration and the rising adoption of internet-connected devices in developing economies. North America and Europe, while already relatively mature markets, are anticipated to continue showing moderate growth driven by continuous upgrades to existing systems and the adoption of advanced features. The continuous emergence of new and advanced threats will remain a pivotal driving force behind the sustained growth of this market. Competition is expected to remain high, with companies investing heavily in R&D to develop and deploy cutting-edge solutions. Strategic partnerships and acquisitions will likely play a crucial role in shaping the market landscape in the coming years.

    Key drivers for this market are: Strict Government Regulations and the Need for Compliance; Growing BYOD Trend; Growing Online Malware and the Increasing Refinement Levels of Web Attacks. Potential restraints include: Strict Government Regulations and the Need for Compliance; Growing BYOD Trend; Growing Online Malware and the Increasing Refinement Levels of Web Attacks. Notable trends are: BFSI to Drive the Market Growth.

  19. Wildchat-RIP-Filtered-by-70b-Llama

    • huggingface.co
    Updated Feb 26, 2025
    + more versions
    Cite
    AI at Meta (2025). Wildchat-RIP-Filtered-by-70b-Llama [Dataset]. https://huggingface.co/datasets/facebook/Wildchat-RIP-Filtered-by-70b-Llama
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 26, 2025
    Dataset authored and provided by
    AI at Meta
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    RIP is a method for preference data filtering. The core idea is that low-quality input prompts lead to high variance and low-quality responses. By measuring the quality of rejected responses and the reward gap between chosen and rejected preference pairs, RIP effectively filters prompts to enhance dataset quality. We release 4k prompts filtered from 20k Wildchat prompts. For each prompt, we provide 32 responses from Llama-3.3-70B-Instruct and their corresponding rewards obtained from ArmoRM.… See the full description on the dataset page: https://huggingface.co/datasets/facebook/Wildchat-RIP-Filtered-by-70b-Llama.
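
    A sketch of the filtering signal described above: keep a prompt only if its rejected responses are not too poor and the chosen-rejected reward gap is not too large (low variance). Field names and thresholds are hypothetical; see the dataset page for the released fields.

    def rip_filter(examples, min_rejected_reward=0.0, max_gap=0.5):
        kept = []
        for ex in examples:
            rewards = sorted(ex["rewards"], reverse=True)    # rewards of the sampled responses
            chosen, rejected = rewards[0], rewards[-1]
            if rejected >= min_rejected_reward and (chosen - rejected) <= max_gap:
                kept.append(ex)
        return kept

    examples = [
        {"prompt": "p1", "rewards": [0.9, 0.4, -0.2]},    # low rejected reward, large gap: dropped
        {"prompt": "p2", "rewards": [0.1, 0.05, 0.02]},   # consistent responses: kept
    ]
    print([ex["prompt"] for ex in rip_filter(examples)])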

  20. Data_Sheet_1_Effects of Rare Microbiome Taxa Filtering on Statistical Analysis.pdf

    • frontiersin.figshare.com
    pdf
    Updated May 30, 2023
    Cite
    Quy Cao; Xinxin Sun; Karun Rajesh; Naga Chalasani; Kayla Gelow; Barry Katz; Vijay H. Shah; Arun J. Sanyal; Ekaterina Smirnova (2023). Data_Sheet_1_Effects of Rare Microbiome Taxa Filtering on Statistical Analysis.pdf [Dataset]. http://doi.org/10.3389/fmicb.2020.607325.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers
    Authors
    Quy Cao; Xinxin Sun; Karun Rajesh; Naga Chalasani; Kayla Gelow; Barry Katz; Vijay H. Shah; Arun J. Sanyal; Ekaterina Smirnova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The accuracy of microbial community detection in 16S rRNA marker-gene and metagenomic studies suffers from contamination and sequencing errors that lead to either falsely identifying microbial taxa that were not in the sample or misclassifying the taxa of DNA fragment reads. Removing contaminants and filtering rare features are two common approaches to deal with this problem. While contaminant detection methods use auxiliary sequencing process information to identify known contaminants, filtering methods remove taxa that are present in a small number of samples and have small counts in the samples where they are observed. The latter approach reduces the extreme sparsity of microbiome data and has been shown to correctly remove contaminant taxa in cultured “mock” datasets, where the true taxa compositions are known. Although filtering is frequently used, careful evaluation of its effect on the data analysis and scientific conclusions remains unreported. Here, we assess the effect of filtering on the alpha and beta diversity estimation as well as its impact on identifying taxa that discriminate between disease states.

    Results: The effect of filtering on microbiome data analysis is illustrated on four datasets: two mock quality control datasets where the same cultured samples with known microbial composition are processed at different labs and two disease study datasets. Results show that in microbiome quality control datasets, filtering reduces the magnitude of differences in alpha diversity and alleviates technical variability between labs while preserving the between samples similarity (beta diversity). In the disease study datasets, DESeq2 and linear discriminant analysis Effect Size (LEfSe) methods were used to identify taxa that are differentially abundant across groups of samples, and random forest models were used to rank features with the largest contribution toward disease classification. Results reveal that filtering retains significant taxa and preserves the model classification ability measured by the area under the receiver operating characteristic curve (AUC). The comparison between the filtering and the contaminant removal method shows that they have complementary effects and are advised to be used in conjunction.

    Conclusions: Filtering reduces the complexity of microbiome data while preserving their integrity in downstream analysis. This leads to mitigation of the classification methods' sensitivity and reduction of technical variability, allowing researchers to generate more reproducible and comparable results in microbiome data analysis.
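
    A generic sketch of the prevalence-and-abundance filtering rule described above: drop taxa that appear in few samples and have small total counts. The thresholds are illustrative; the paper evaluates specific filtering rules alongside contaminant-removal methods.

    import numpy as np

    def filter_rare_taxa(counts, min_prevalence=0.1, min_total=10):
        """counts: (n_samples, n_taxa) matrix of read counts."""
        prevalence = (counts > 0).mean(axis=0)     # fraction of samples in which each taxon appears
        total = counts.sum(axis=0)                 # total reads per taxon
        keep = (prevalence >= min_prevalence) & (total >= min_total)
        return counts[:, keep], keep

    counts = np.random.default_rng(0).poisson(0.3, size=(50, 200))   # toy count table
    filtered, kept = filter_rare_taxa(counts)
    print(counts.shape, "->", filtered.shape)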
