Model-based prognostics approaches use domain knowledge about a system and its failure modes through the use of physics-based models. Model-based prognosis is generally divided into two sequential problems: a joint state-parameter estimation problem, in which, using the model, the health of a system or component is determined based on the observations; and a prediction problem, in which, using the model, the state-parameter distribution is simulated forward in time to compute end of life and remaining useful life. The first problem is typically solved through the use of a state observer, or filter. The choice of filter depends on the assumptions that may be made about the system, and on the desired algorithm performance. In this paper, we review three separate filters for the solution to the first problem: the Daum filter, an exact nonlinear filter; the unscented Kalman filter, which approximates nonlinearities through the use of a deterministic sampling method known as the unscented transform; and the particle filter, which approximates the state distribution using a finite set of discrete, weighted samples, called particles. Using a centrifugal pump as a case study, we conduct a number of simulation-based experiments investigating the performance of the different algorithms as applied to prognostics.
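As a concrete illustration of the sigma-point idea behind the unscented Kalman filter mentioned above, here is a minimal unscented transform in numpy; the nonlinear map f, the dimensions, and the scaling parameters are illustrative assumptions, not the paper's pump model.

```python
import numpy as np

def unscented_transform(mean, cov, f, alpha=1.0, beta=2.0, kappa=2.0):
    """Propagate a Gaussian (mean, cov) through a nonlinear map f
    using 2n+1 deterministically chosen sigma points."""
    n = len(mean)
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * cov)            # matrix square root
    sigma = np.vstack([mean, mean + S.T, mean - S.T])  # 2n+1 sigma points
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))     # mean weights
    wc = wm.copy()                                     # covariance weights
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    Y = np.array([f(s) for s in sigma])                # push points through f
    y_mean = wm @ Y
    d = Y - y_mean
    y_cov = (wc[:, None] * d).T @ d
    return y_mean, y_cov

# Illustrative use: a mildly nonlinear 2-D map (assumed, not the pump model)
f = lambda x: np.array([x[0] + 0.1 * x[1] ** 2, 0.9 * x[1]])
m, P = unscented_transform(np.zeros(2), np.eye(2), f)
```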
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises sea surface height (SSH) and velocity data at the ocean surface in two small regions near the Agulhas retroflection. The unfiltered SSH and a horizontal velocity field are provided, along with the same fields after various kinds of filtering, as described in the accompanying manuscript, Using Lagrangian filtering to remove waves from the ocean surface velocity field (https://doi.org/10.31223/X5D352). The code repository for this work is https://github.com/cspencerjones/separating-balanced.
Two time-resolutions are provided: two weeks of hourly data and 70 days of daily data.
Seventy_daysA.nc contains daily data for region A and Seventy_daysB.nc contains daily data for region B, including unfiltered, Lagrangian-filtered, and omega-filtered velocity and sea-surface height.
two_weeksA.nc contains hourly data for region A and two_weeksB.nc contains hourly data for region B, including unfiltered and Lagrangian-filtered velocity and sea-surface height.
Note that region A has been moved in version 2 of this dataset.
See the manuscript and code repository for more information.
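A minimal sketch of opening these files with xarray; the file names are taken from the list above, but the variable names inside the netCDF files are not documented here, so the snippet prints them rather than assuming them.

```python
import xarray as xr

# Daily and hourly files for region A, named as in the description above
daily = xr.open_dataset("Seventy_daysA.nc")
hourly = xr.open_dataset("two_weeksA.nc")

# The names of the unfiltered, Lagrangian-filtered, and omega-filtered
# variables are not listed here, so inspect them before going further.
print(daily.data_vars)
print(hourly.data_vars)
```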
This work was supported by NASA award 80NSSC20K1142.
Contains scans of a bin filled with different parts (screws, nuts, rods, spheres, sprockets). For each part type, an RGB image and an organized 3D point cloud obtained with a structured-light sensor are provided. In addition, an unorganized 3D point cloud representing an empty bin and a small Matlab script to read the files are also provided. The 3D data contain many outliers, and the data were used to demonstrate a new filtering technique.
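The new filtering technique itself is not described in this summary; as a generic stand-in, the sketch below applies a standard statistical outlier removal (a k-nearest-neighbor distance test) to an unorganized point cloud with numpy and scipy.

```python
import numpy as np
from scipy.spatial import cKDTree

def remove_outliers(points, k=8, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbors is
    more than std_ratio standard deviations above the global mean."""
    tree = cKDTree(points)
    dists, _ = tree.query(points, k=k + 1)  # k+1: each point's nearest neighbor is itself
    mean_d = dists[:, 1:].mean(axis=1)
    keep = mean_d < mean_d.mean() + std_ratio * mean_d.std()
    return points[keep]

# Illustrative use on random points standing in for a bin scan
cloud = np.random.default_rng(0).random((10000, 3))
clean = remove_outliers(cloud)
```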
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Bagging (i.e., bootstrap aggregating) involves combining an ensemble of bootstrap estimators. We consider bagging for inference from noisy or incomplete measurements on a collection of interacting stochastic dynamic systems. Each system is called a unit, and each unit is associated with a spatial location. A motivating example arises in epidemiology, where each unit is a city: the majority of transmission occurs within a city, with smaller yet epidemiologically important interactions arising from disease transmission between cities. Monte Carlo filtering methods used for inference on nonlinear non-Gaussian systems can suffer from a curse of dimensionality as the number of units increases. We introduce bagged filter (BF) methodology which combines an ensemble of Monte Carlo filters, using spatiotemporally localized weights to select successful filters at each unit and time. We obtain conditions under which likelihood evaluation using a BF algorithm can beat a curse of dimensionality, and we demonstrate applicability even when these conditions do not hold. BF can outperform an ensemble Kalman filter on a coupled population dynamics model describing infectious disease transmission. A block particle filter also performs well on this task, though the bagged filter respects smoothness and conservation laws that a block particle filter can violate.
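A toy sketch of the bagging idea, not the paper's BF algorithm: the replicates below are plain simulations of assumed toy dynamics rather than full Monte Carlo filters, and each unit's estimate reweights the replicates by that unit's local measurement likelihood, mirroring the spatiotemporally localized weights described above.

```python
import numpy as np

rng = np.random.default_rng(0)
U, T, B = 5, 50, 20                      # units, time points, bagged replicates

# Toy coupled linear-Gaussian dynamics with noisy per-unit observations
A = 0.85 * np.eye(U) + 0.02              # weak coupling between units
x_true, y = np.zeros((T, U)), np.zeros((T, U))
for t in range(1, T):
    x_true[t] = A @ x_true[t - 1] + rng.normal(0, 0.5, U)
    y[t] = x_true[t] + rng.normal(0, 1.0, U)

# Each replicate evolves independently; at each unit and time the replicates
# are reweighted by the local likelihood of that unit's observation only.
x_rep = np.zeros((B, T, U))
est = np.zeros((T, U))
for t in range(1, T):
    for b in range(B):
        x_rep[b, t] = A @ x_rep[b, t - 1] + rng.normal(0, 0.5, U)
    w = np.exp(-0.5 * (y[t] - x_rep[:, t]) ** 2)   # localized weights, (B, U)
    w /= w.sum(axis=0)
    est[t] = (w * x_rep[:, t]).sum(axis=0)         # per-unit bagged estimate
```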
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the streaming data setting, where data arrive continuously or in frequent batches and there is no pre-determined amount of total data, Bayesian models can employ recursive updates, incorporating each new batch of data into the model parameters’ posterior distribution. Filtering methods are currently used to perform these updates efficiently; however, they suffer from eventual degradation as the number of unique values within the filtered samples decreases. We propose Generative Filtering, a method for efficiently performing recursive Bayesian updates in the streaming setting. Generative Filtering retains the speed of a filtering method while using parallel updates to avoid degenerate distributions after repeated applications. We derive rates of convergence for Generative Filtering and conditions for the use of sufficient statistics instead of fully storing all past data. We investigate the alleviation of filtering degradation through simulation and an ecological time series of counts. Supplementary materials for this article are available online.
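Generative Filtering itself is not reproduced here; as a minimal illustration of recursive Bayesian updating in a stream, the conjugate sketch below updates a Beta posterior for a Bernoulli rate batch by batch, each posterior serving as the next batch's prior.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta = 1.0, 1.0                # Beta(1, 1) prior on the success rate

# Batches arrive one at a time; each update touches only the new batch,
# since (alpha, beta) are sufficient statistics for all past data.
for _ in range(10):
    batch = rng.binomial(1, 0.3, size=50)
    alpha += batch.sum()
    beta += len(batch) - batch.sum()
    print(f"posterior mean: {alpha / (alpha + beta):.3f}")
```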
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 4 rows and is filtered to the book An Introduction to Wavelets and Other Filtering Methods in Finance and Economics. It features 10 columns, including number of authors, number of books, earliest publication date, and latest publication date.
Dataset Card for No Language Left Behind (NLLB - 200vo)
Dataset Summary
This dataset was created based on metadata for mined bitext released by Meta AI. It contains bitext for 148 English-centric and 1465 non-English-centric language pairs using the stopes mining library and the LASER3 encoders (Heffernan et al., 2022). The complete dataset is ~450GB. CCMatrix contains previous versions of mined instructions.
How to use the data
There are two ways… See the full description on the dataset page: https://huggingface.co/datasets/yaya-sy/nllb-filtering.
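One way to load the data is with the Hugging Face datasets library; given the ~450GB size, streaming is the safer default. Whether a configuration (e.g., a language pair) must be named is an assumption to verify on the dataset page.

```python
from datasets import load_dataset

# Repository id from the description above; a config name may be required,
# so check the dataset page if this call asks for one.
ds = load_dataset("yaya-sy/nllb-filtering", split="train", streaming=True)
print(next(iter(ds)))
```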
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Phylogenetic inference is generally performed on the basis of multiple sequence alignments (MSA). Because errors in an alignment can lead to errors in tree estimation, there is a strong interest in identifying and removing unreliable parts of the alignment. In recent years several automated filtering approaches have been proposed, but despite their popularity, a systematic and comprehensive comparison of different alignment filtering methods on real data has been lacking. Here, we extend and apply recently introduced phylogenetic tests of alignment accuracy on a large number of gene families and contrast the performance of unfiltered versus filtered alignments in the context of single-gene phylogeny reconstruction. Based on multiple genome-wide empirical and simulated data sets, we show that the trees obtained from filtered MSAs are on average worse than those obtained from unfiltered MSAs. Furthermore, alignment filtering often leads to an increase in the proportion of well-supported branches that are actually wrong. We confirm that our findings hold for a wide range of parameters and methods. Although our results suggest that light filtering (up to 20% of alignment positions) has little impact on tree accuracy and may save some computation time, contrary to widespread practice, we do not generally recommend the use of current alignment filtering methods for phylogenetic inference. By providing a way to rigorously and systematically measure the impact of filtering on alignments, the methodology set forth here will guide the development of better filtering algorithms.
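The filtering methods compared in the study are not reproduced here; as a generic illustration of what alignment filtering does, the sketch below drops MSA columns whose gap fraction exceeds a threshold, the kind of light filtering of alignment positions discussed above.

```python
import numpy as np

def filter_columns(msa, max_gap_frac=0.5):
    """Remove alignment columns whose gap fraction exceeds max_gap_frac.
    msa: list of equal-length aligned sequences, with '-' for gaps."""
    arr = np.array([list(s) for s in msa])
    gap_frac = (arr == "-").mean(axis=0)
    keep = gap_frac <= max_gap_frac
    return ["".join(row) for row in arr[:, keep]]

msa = ["ACG-TA", "AC--TA", "ACGCTA"]
print(filter_columns(msa))   # the column gapped in 2 of 3 sequences is dropped
```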
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This dataset is designed to evaluate the effectiveness of toxicity and bias filtering methods. The objective is to detect and filter a small subset of toxic or unsafe examples that have been injected into a larger, predominantly safe training set, using a reference set that exposes unsafe model behavior. All models are evaluated using the same training and reference sets. We provide two evaluation settings, denoted by the suffixes Hom (Homogeneous) and Het (Heterogeneous).… See the full description on the dataset page: https://huggingface.co/datasets/DataAttributionEval/Toxicity-Bias-Filtering.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
The two CSV files here are the train and test data from Kaggle's Ion Switching competition, with drift removed and filtered with a Kalman filter to reduce noise.
These ideas were posted by @cdeotte and @teejmahal20; I just ran the filter and the feature engineering (FE) and saved the data.
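Not the notebook code itself, but a generic sketch of the denoising step described above: a 1-D random-walk Kalman filter, with illustrative (untuned) noise variances q and r.

```python
import numpy as np

def kalman_denoise(signal, q=1e-5, r=0.01):
    """1-D random-walk Kalman filter; q is process-noise variance,
    r is measurement-noise variance."""
    x, p = float(signal[0]), 1.0
    out = np.empty(len(signal))
    for i, z in enumerate(signal):
        p = p + q              # predict
        k = p / (p + r)        # Kalman gain
        x = x + k * (z - x)    # update with measurement z
        p = (1 - k) * p
        out[i] = x
    return out

# Illustrative use on a noisy step signal
t = np.linspace(0, 1, 1000)
noisy = (t > 0.5).astype(float) + np.random.default_rng(3).normal(0, 0.1, t.size)
smooth = kalman_denoise(noisy)
```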
In this paper, we propose a novel approach to reduce the noise in Synthetic Aperture Radar (SAR) images using particle filters. Interpretation of SAR images is a difficult problem, since they are contaminated with a multiplicative noise known as “speckle noise”. In the literature, the general approach for removing speckle is to use local statistics computed in a square window. Here, we propose to use particle filters, a sequential Bayesian technique. The proposed method also uses local statistics to denoise the images; since this is a Bayesian approach, the computed statistics of the window can be exploited as a priori information. Moreover, particle filters are sequential methods, which are more appropriate for handling the heterogeneous structure of the image. Computer simulations show that the proposed method provides better edge-preserving results with satisfactory speckle removal when compared to the results obtained by the Gamma Maximum a Posteriori (MAP) filter.
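The particle-filter method itself is not reproduced in this abstract; as a sketch of the local-statistics baseline it is contrasted against, here is a basic Lee-style despeckling filter computed in a square window (the window size and noise variance are illustrative).

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(img, size=7, noise_var=0.25):
    """Local-statistics despeckling: shrink each pixel toward the window
    mean in proportion to the estimated local signal variance."""
    mean = uniform_filter(img, size)
    mean_sq = uniform_filter(img ** 2, size)
    var = mean_sq - mean ** 2
    gain = np.clip((var - noise_var) / np.maximum(var, 1e-12), 0.0, 1.0)
    return mean + gain * (img - mean)

img = np.abs(np.random.default_rng(4).normal(1.0, 0.5, (128, 128)))  # stand-in tile
despeckled = lee_filter(img)
```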
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This package contains evaluation supplementary files for the paper: Grid-Based Bayesian Filtering Methods for Pedestrian Dead Reckoning Indoor Positioning Using Smartphones by Miroslav Opiela and František Galčík.
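The paper's algorithms are not included in this summary; as a generic illustration of grid-based Bayesian filtering, the sketch below runs one predict/update cycle of a 1-D histogram filter (the motion model and sensor are assumptions, and np.roll wraps at the edges as a simplification).

```python
import numpy as np

def grid_bayes_step(belief, step_probs, likelihood):
    """One cycle of a 1-D histogram (grid) Bayes filter.
    belief: probability per grid cell
    step_probs: {cell offset: probability} motion model
    likelihood: per-cell measurement likelihood"""
    pred = np.zeros_like(belief)
    for offset, p in step_probs.items():   # predict: convolve with motion model
        pred += p * np.roll(belief, offset)
    post = pred * likelihood               # update: multiply by the likelihood
    return post / post.sum()               # normalize

belief = np.full(100, 1 / 100)                     # uniform over 100 cells
motion = {0: 0.1, 1: 0.8, 2: 0.1}                  # noisy "one step forward"
lik = np.exp(-0.5 * ((np.arange(100) - 42) / 3.0) ** 2)  # sensor peaked at cell 42
belief = grid_bayes_step(belief, motion, lik)
```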
Contents:
Venues
Data are recorded in three buildings:
Used datasets
A subset of the input data is derived from logfiles provided by the organizers of the IPIN 2018 and IPIN 2019 competitions:
Funding
The work was partially supported by the Slovak Grant Agency of the Ministry of Education and Academy of Science of the Slovak Republic under grant no. 1/0056/18 and by the Slovak Research and Development Agency under the contract no. APVV-15-0091.
Contact
For any further questions, please contact:
Miroslav Opiela, miroslav.opiela@upjs.sk Institute of Computer Science, Faculty of Science, P. J. Šafárik University (UPJS), Košice, Slovakia
Many diagnostic datasets suffer from the adverse effects of spikes that are embedded in data and noise. For example, this is true for electrical power system data where the switches, relays, and inverters are major contributors to these effects. Spikes are mostly harmful to the analysis of data in that they throw off real-time detection of abnormal conditions, and classification of faults. Since noise and spikes are mixed together and embedded within the data, removal of the unwanted signals from the data is not always easy and may result in losing the integrity of the information carried by the data. Additionally, in some applications noise and spikes need to be filtered independently. The proposed algorithm is a multi-resolution filtering approach based on Haar wavelets that is capable of removing spikes while incurring insignificant damage to other data. In particular, noise in the data, which is a useful indicator that a sensor is healthy and not stuck, can be preserved using our approach. Presented here is the theoretical background with some examples from a realistic testbed.
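The paper's multi-resolution algorithm is not given in the abstract; the sketch below illustrates the core idea with a single-level Haar decomposition in numpy, zeroing only spike-sized detail coefficients so that ordinary noise, the useful sensor-health indicator mentioned above, is preserved.

```python
import numpy as np

def haar_despike(x, spike_thresh=5.0):
    """One-level Haar transform: detail coefficients far above the noise
    level (by a MAD test) mark spikes and are zeroed; the rest are kept."""
    x = np.asarray(x, dtype=float)
    n = len(x) - len(x) % 2                  # even length for pairing
    a = (x[0:n:2] + x[1:n:2]) / np.sqrt(2)   # approximation coefficients
    d = (x[0:n:2] - x[1:n:2]) / np.sqrt(2)   # detail coefficients
    mad = np.median(np.abs(d - np.median(d))) + 1e-12
    d[np.abs(d) > spike_thresh * mad] = 0.0  # remove spike-sized details only
    out = x.copy()
    out[0:n:2] = (a + d) / np.sqrt(2)        # inverse Haar transform
    out[1:n:2] = (a - d) / np.sqrt(2)
    return out

# Illustrative use: background noise is preserved, the isolated spike is not
sig = np.random.default_rng(5).normal(0, 0.1, 256)
sig[100] += 10.0
clean = haar_despike(sig)
```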
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Spatio-temporal datasets are rapidly growing in size. For example, environmental variables are measured with increasing resolution by increasing numbers of automated sensors mounted on satellites and aircraft. Using such data, which are typically noisy and incomplete, the goal is to obtain complete maps of the spatio-temporal process, together with uncertainty quantification. We focus here on real-time filtering inference in linear Gaussian state-space models. At each time point, the state is a spatial field evaluated on a very large spatial grid, making exact inference using the Kalman filter computationally infeasible. Instead, we propose a multi-resolution filter (MRF), a highly scalable and fully probabilistic filtering method that resolves spatial features at all scales. We prove that the MRF matrices exhibit a particular block-sparse multi-resolution structure that is preserved under filtering operations through time. We describe connections to existing methods, including hierarchical matrices from numerical mathematics. We also discuss inference on time-varying parameters using an approximate Rao-Blackwellized particle filter, in which the integrated likelihood is computed using the MRF. Using a simulation study and a real satellite-data application, we show that the MRF strongly outperforms competing approaches. Supplementary materials include Python code for reproducing the simulations, some detailed properties of the MRF and auxiliary theoretical results.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises sea surface height (SSH) and velocity data at the ocean surface in two small regions near the Agulhas retroflection. The unfiltered SSH and a horizontal velocity field are provided, along with the same fields after various kinds of filtering, as described in the accompanying manuscript, Separating balanced and unbalanced flow at the surface of the Agulhas region using Lagrangian filtering. The code repository for this work is https://github.com/cspencerjones/separating-balanced.
Two time-resolutions are provided: two weeks of hourly data and 70 days of daily data. See the manuscript for more information.
This work was supported by NASA award 80NSSC20K1142.
Particle filters (PF) have been established as the de facto state of the art in failure prognosis. They combine the rigor of Bayesian estimation with nonlinear prediction while also providing uncertainty estimates alongside a given solution. Within the context of particle filters, this paper introduces several novel methods for uncertainty representation and uncertainty management. The prediction uncertainty is modeled via a rescaled Epanechnikov kernel and is assisted with resampling techniques and regularization algorithms. Uncertainty management is accomplished through parametric adjustments in a feedback correction loop of the state model and its noise distributions. The correction loop provides the mechanism to incorporate information that can improve solution accuracy and reduce uncertainty bounds. In addition, this approach results in a reduction in computational burden. The scheme is illustrated with real vibration feature data from a fatigue-driven fault in a critical aircraft component.
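A minimal sketch of the regularization step described above: after resampling, particles are jittered with draws from a rescaled Epanechnikov kernel (sampled via Devroye's three-uniforms trick); the bandwidth and the 1-D toy setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

def epanechnikov(n):
    """Sample the Epanechnikov kernel on [-1, 1]: draw three uniforms and
    return u2 if |u3| is the largest, else u3 (Devroye's method)."""
    u = rng.uniform(-1, 1, (3, n))
    pick_u2 = (np.abs(u[2]) >= np.abs(u[1])) & (np.abs(u[2]) >= np.abs(u[0]))
    return np.where(pick_u2, u[1], u[2])

def regularized_resample(particles, weights, bandwidth=0.1):
    """Multinomial resampling followed by kernel jitter, restoring the
    diversity that plain resampling collapses."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx] + bandwidth * epanechnikov(len(particles))

# Illustrative 1-D use
parts = rng.normal(0, 1, 500)
w = np.exp(-0.5 * (parts - 0.7) ** 2)
parts = regularized_resample(parts, w / w.sum())
```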
https://www.marketreportanalytics.com/privacy-policy
The Internet Filtering Software market is experiencing robust growth, projected to reach a substantial size by 2033. A compound annual growth rate (CAGR) of 14% from 2025 to 2033 indicates a significant upward trajectory driven by several key factors. The increasing adoption of cloud-based solutions, coupled with heightened concerns surrounding cybersecurity threats and data privacy regulations, is fueling market expansion. Businesses across various sectors, including BFSI (Banking, Financial Services, and Insurance), IT & Telecom, Government, and Education, are actively investing in robust internet filtering software to protect their sensitive data and comply with regulatory mandates. The market is segmented by component (solution and services), deployment mode (cloud and on-premises), filtering type (DNS, keyword, URL, and other filtering methods), and industry vertical. The cloud deployment model is witnessing accelerated adoption due to its scalability, cost-effectiveness, and ease of management. Furthermore, the rising prevalence of sophisticated cyber threats, including malware and phishing attacks, necessitates advanced filtering capabilities, driving demand for comprehensive solutions that go beyond basic URL filtering. The competitive landscape comprises established players like Broadcom, Cisco, Palo Alto Networks, and McAfee, alongside emerging innovative companies. However, factors such as the high initial investment cost for implementing comprehensive solutions and the complexity of managing sophisticated filtering systems might pose challenges to market growth. Future growth will depend heavily on ongoing innovation in threat detection, seamless integration with existing IT infrastructure, and the increasing awareness of the need for robust internet security among organizations of all sizes. The increasing sophistication of cyberattacks and the evolving regulatory landscape are likely to continue driving demand for advanced internet filtering solutions over the forecast period.

The Asia Pacific region is expected to witness substantial growth due to increasing internet penetration and the rising adoption of internet-connected devices in developing economies. North America and Europe, while already relatively mature markets, are anticipated to continue showing moderate growth driven by continuous upgrades to existing systems and the adoption of advanced features. The continuous emergence of new and advanced threats will remain a pivotal driving force behind the sustained growth of this market. Competition is expected to remain high, with companies investing heavily in R&D to develop and deploy cutting-edge solutions. Strategic partnerships and acquisitions will likely play a crucial role in shaping the market landscape in the coming years.

Key drivers for this market are: Strict Government Regulations and the Need for Compliance; Growing BYOD Trend; Growing Online Malware and the Increasing Refinement Levels of Web Attacks. Potential restraints include: Strict Government Regulations and the Need for Compliance; Growing BYOD Trend; Growing Online Malware and the Increasing Refinement Levels of Web Attacks. Notable trends are: BFSI to Drive the Market Growth.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
RIP is a method for preference data filtering. The core idea is that low-quality input prompts lead to high variance and low-quality responses. By measuring the quality of rejected responses and the reward gap between chosen and rejected preference pairs, RIP effectively filters prompts to enhance dataset quality. We release 4k prompts filtered from 20k WildChat prompts. For each prompt, we provide 32 responses from Llama-3.3-70B-Instruct and their corresponding rewards obtained from ArmoRM.… See the full description on the dataset page: https://huggingface.co/datasets/facebook/Wildchat-RIP-Filtered-by-70b-Llama.
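A toy sketch of the filtering signal described above, not the released pipeline: given a reward per sampled response, each prompt is scored by its rejected-response reward and its chosen-rejected gap. The combination rule and its direction here (prefer high rejected reward, small gap) are assumptions to check against the paper.

```python
import numpy as np

def rip_style_filter(rewards, keep_frac=0.2):
    """rewards: (n_prompts, n_responses) array, e.g. 32 rewards per prompt.
    Rank prompts by rejected (minimum) reward and by chosen-rejected gap,
    then keep the top keep_frac."""
    rejected = rewards.min(axis=1)
    gap = rewards.max(axis=1) - rejected
    # rank-combine the two signals (illustrative weighting)
    score = rejected.argsort().argsort() - gap.argsort().argsort()
    n_keep = int(len(rewards) * keep_frac)
    return np.argsort(score)[-n_keep:]

rewards = np.random.default_rng(7).normal(size=(1000, 32))
kept = rip_style_filter(rewards)   # indices of retained prompts
```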
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: The accuracy of microbial community detection in 16S rRNA marker-gene and metagenomic studies suffers from contamination and sequencing errors that lead to either falsely identifying microbial taxa that were not in the sample or misclassifying the taxa of DNA fragment reads. Removing contaminants and filtering rare features are two common approaches to deal with this problem. While contaminant detection methods use auxiliary sequencing process information to identify known contaminants, filtering methods remove taxa that are present in a small number of samples and have small counts in the samples where they are observed. The latter approach reduces the extreme sparsity of microbiome data and has been shown to correctly remove contaminant taxa in cultured “mock” datasets, where the true taxa compositions are known. Although filtering is frequently used, careful evaluation of its effect on the data analysis and scientific conclusions remains unreported. Here, we assess the effect of filtering on the alpha and beta diversity estimation as well as its impact on identifying taxa that discriminate between disease states.

Results: The effect of filtering on microbiome data analysis is illustrated on four datasets: two mock quality control datasets where the same cultured samples with known microbial composition are processed at different labs, and two disease study datasets. Results show that in microbiome quality control datasets, filtering reduces the magnitude of differences in alpha diversity and alleviates technical variability between labs while preserving the between-samples similarity (beta diversity). In the disease study datasets, DESeq2 and linear discriminant analysis Effect Size (LEfSe) methods were used to identify taxa that are differentially abundant across groups of samples, and random forest models were used to rank features with the largest contribution toward disease classification. Results reveal that filtering retains significant taxa and preserves the model classification ability measured by the area under the receiver operating characteristic curve (AUC). The comparison between the filtering and the contaminant removal method shows that they have complementary effects and are advised to be used in conjunction.

Conclusions: Filtering reduces the complexity of microbiome data while preserving their integrity in downstream analysis. This leads to mitigation of the classification methods' sensitivity and reduction of technical variability, allowing researchers to generate more reproducible and comparable results in microbiome data analysis.
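A minimal pandas sketch of the rare-feature filtering described in the Background: taxa are dropped when they fail to reach a minimum count in a minimum number of samples (both thresholds illustrative).

```python
import pandas as pd

def filter_rare_taxa(counts, min_samples=3, min_count=10):
    """counts: DataFrame with samples as rows and taxa as columns.
    Keep a taxon only if it reaches min_count reads in at least
    min_samples samples."""
    keep = (counts >= min_count).sum(axis=0) >= min_samples
    return counts.loc[:, keep]

counts = pd.DataFrame(
    {"taxonA": [50, 0, 30, 12], "taxonB": [1, 0, 0, 2], "taxonC": [9, 8, 100, 40]}
)
filtered = filter_rare_taxa(counts, min_samples=2, min_count=10)
print(filtered.columns.tolist())   # ['taxonA', 'taxonC']; taxonB is dropped
```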