100+ datasets found
  1. Dataset of book subjects that contain Assessing and improving prediction and...

    • workwithdata.com
    Updated Nov 7, 2024
    Cite
    The citation is currently not available for this dataset.
    Dataset updated
    Nov 7, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects. It has 7 rows and is filtered where the book is Assessing and improving prediction and classification : theory and algorithms in C++. It features 10 columns including number of authors, number of books, earliest publication date, and latest publication date.

  2. Algorithms for the paper "Finite orbits of the pure braid group on the...

    • repository.lboro.ac.uk
    pdf
    Updated Sep 8, 2017
    Cite
    Pierpaolo Calligaris; Marta Mazzocco (2017). Algorithms for the paper "Finite orbits of the pure braid group on the monodromy of the 2-variable Garnier system" [Dataset]. http://doi.org/10.17028/rd.lboro.4924181.v2
    Available download formats: pdf
    Dataset updated
    Sep 8, 2017
    Dataset provided by
    Loughborough University
    Authors
    Pierpaolo Calligaris; Marta Mazzocco
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    These are the algorithms needed in the paper "Finite orbits of the pure braid group on the monodromy of the 2-variable Garnier system" by P. Calligaris and M. Mazzocco. More information on using these algorithms can be found in the file "README.pdf". An updated version of the algorithms was uploaded on 8th September 2017.

  3. Data for: A fuzzy c-means algorithm based on the relationship among...

    • data.mendeley.com
    Updated Mar 31, 2020
    Cite
    Xueguan Song (2020). Data for: A fuzzy c-means algorithm based on the relationship among attributes of data and its application in tunnel boring machine [Dataset]. http://doi.org/10.17632/f6h8gfhsmf.1
    Dataset updated
    Mar 31, 2020
    Authors
    Xueguan Song
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The code and datasets for the SVR-FCM algorithm.

  4. Data from: Count-Based Morgan Fingerprint: A More Efficient and...

    • acs.figshare.com
    xlsx
    Updated Jul 5, 2023
    Cite
    Shifa Zhong; Xiaohong Guan (2023). Count-Based Morgan Fingerprint: A More Efficient and Interpretable Molecular Representation in Developing Machine Learning-Based Predictive Regression Models for Water Contaminants’ Activities and Properties [Dataset]. http://doi.org/10.1021/acs.est.3c02198.s002
    Available download formats: xlsx
    Dataset updated
    Jul 5, 2023
    Dataset provided by
    ACS Publications
    Authors
    Shifa Zhong; Xiaohong Guan
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    In this study, we introduce the count-based Morgan fingerprint (C-MF) to represent chemical structures of contaminants and develop machine learning (ML)-based predictive models for their activities and properties. Compared with the binary Morgan fingerprint (B-MF), C-MF not only qualifies the presence or absence of an atom group but also quantifies its counts in a molecule. We employ six different ML algorithms (ridge regression, SVM, KNN, RF, XGBoost, and CatBoost) to develop models on 10 contaminant-related data sets based on C-MF and B-MF to compare them in terms of the model’s predictive performance, interpretation, and applicability domain (AD). Our results show that C-MF outperforms B-MF in nine of 10 data sets in terms of model predictive performance. The advantage of C-MF over B-MF is dependent on the ML algorithm, and the performance enhancements are proportional to the difference in the chemical diversity of data sets calculated by B-MF and C-MF. Model interpretation results show that the C-MF-based model can elucidate the effect of atom group counts on the target and have a wider range of SHAP values. AD analysis shows that C-MF-based models have an AD similar to that of B-MF-based ones. Finally, we developed a “ContaminaNET” platform to deploy these C-MF-based models for free use.
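    The count-versus-binary distinction can be sketched with a toy example. This is an illustration only, using hypothetical atom-group tokens and a tiny bit space, not RDKit's actual Morgan hashing:

    ```python
    from collections import Counter

    # Illustrative sketch (hypothetical atom-group tokens, tiny bit space; not
    # RDKit's Morgan implementation): a binary fingerprint (B-MF) records only
    # the presence or absence of an atom group, while a count-based one (C-MF)
    # records how many times the group occurs in the molecule.
    NBITS = 16  # tiny bit space, for illustration only

    def fingerprints(atom_groups):
        """atom_groups: iterable of hashable substructure identifiers."""
        counts = Counter(hash(g) % NBITS for g in atom_groups)
        c_mf = [counts.get(b, 0) for b in range(NBITS)]   # count-based (C-MF)
        b_mf = [1 if v else 0 for v in c_mf]              # binary (B-MF)
        return b_mf, c_mf

    # two hypothetical molecules sharing the same groups in different counts
    b1, c1 = fingerprints(["C-OH", "C=O", "C-OH"])
    b2, c2 = fingerprints(["C-OH", "C=O"])
    print(b1 == b2, c1 == c2)   # → True False: B-MF cannot tell them apart, C-MF can
    ```

    This is exactly the extra information the abstract credits for the C-MF models' wider range of SHAP values: the count of each atom group, not just its presence.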

  5. Simulating Degradation Data for Prognostic Algorithm Development

    • catalog.data.gov
    • data.nasa.gov
    Updated Apr 10, 2025
    + more versions
    Cite
    Dashlink (2025). Simulating Degradation Data for Prognostic Algorithm Development [Dataset]. https://catalog.data.gov/dataset/simulating-degradation-data-for-prognostic-algorithm-development
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    PHM08 Challenge Dataset is now publicly available at the NASA Prognostics Repository + Download

    INTRODUCTION - WHY SIMULATE DEGRADATION DATA? Of the various challenges encountered in prognostics algorithm development, the non-availability of suitable validation data is most often the bottleneck in the technology certification process. Prognostics imposes several requirements on the training data beyond what is commonly available from various applications. It requires not only data containing fault signatures but also fault evolution trends with corresponding time indexes (in number of hours or number of operational cycles). In general there are three sources from which data is usually available: fielded applications, experimental test-beds, and computer simulations (see Figure 1). From a prognostics point of view, data collection paradoxically suffers from the situation that systems that do run to failure often did not have warning instrumentation installed, hence little or no record of what went wrong; those that are continuously monitored, on the other hand, are prevented from running to failure or are subject to maintenance that eliminates the signatures of fault evolution. Conducting experiments that replicate real-world situations is extremely expensive in terms of the time required for a healthy system to run to failure, and is often dangerous. Accelerated ageing may be useful to some extent but may not emulate normal wear patterns. Furthermore, to manage uncertainty, multiple datasets must be collected to quantify variations resulting from multiple sources, which makes it all the more unattainable. Simulations can be fast, inexpensive, and provide a number of options to design experiments, but their usefulness is contingent on the availability of high-fidelity models that represent the real systems fairly well.

    However, once such a model is available, simulations offer the flexibility to rerun various experiments with added knowledge from the system as it becomes available. While availability of real fault evolution data from fielded systems would be more desirable, generating data using a high-fidelity model and integrating it with the knowledge gathered from the partial data obtained from the real systems is by far the most practical approach for prognostics algorithm development, validation, and verification. In this presentation we discuss some key elements that must be kept in mind while generating datasets suitable for prognostics. Furthermore, with the help of an example, we show how a dynamical system model can be supported with suitable degradation models from the respective domain knowledge to create suitable data. The example is discussed next.

    APPLICATION DOMAIN Tracking and predicting the progression of damage in aircraft turbomachinery has been an active area of study within the Condition Based Maintenance (CBM) community. A general approach has been to correlate flow and efficiency losses to degradation signatures in various components of the engine. Once such a mapping is available, the next task is to estimate this loss of flow and efficiency by inferring information from measurable sensor outputs, which ultimately is used to assess the level of degradation in the system.

    SYSTEM MODEL: C-MAPSS C-MAPSS (Commercial Modular Aero-Propulsion System Simulation) is a recently released tool that simulates a realistic large (~90,000 lb) commercial turbofan engine. It allows the user to choose and design operational profiles, controllers, environmental conditions, thrust levels, etc. to simulate a scenario of interest. An extensive list of output va

  6. C++ GPU implementation of the Boolean matrix factorization algorithm C-Salt

    • springernature.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Sibylle Hess; Katharina Morik (2023). C++ GPU implementation of the Boolean matrix factorization algorithm C-Salt [Dataset]. http://doi.org/10.6084/m9.figshare.5441365.v1
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Authors
    Sibylle Hess; Katharina Morik
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The dataset comprises the C++ GPU implementation of the Boolean matrix factorization algorithm C-Salt. The included ReadMe file describes how the implementation can be set up and used. Two of the files, the IJulia notebooks CSaltEvalRealWorld.ipynb and CSaltGenerateSynthData.ipynb, can be used to generate synthetic data as proposed in the paper and to evaluate the quality measurements for results on the submitted text data.

    The data includes two IPython (.ipynb) notebook files, 6 tab-separated value (.tsv) files, two .hpp files, and two .cu files. The .ipynb files can be exported to HTML, PDF, reStructuredText, and LaTeX formats. The .tsv files can be opened using open-source text editors. .hpp is a header format used by C++, and .cu files are associated with the NVIDIA CUDA Toolkit.

    Abstract: Given labelled data represented by a binary matrix, we consider the task of deriving a Boolean matrix factorization which identifies commonalities and specifications among the classes. While existing works focus on rank-one factorizations which are either specific or common to the classes, we derive class-specific alterations from common factorizations as well. Therewith, we broaden the applicability of our new method to datasets whose class-dependencies have a more complex structure. On the basis of synthetic and real-world datasets, we show on the one hand that our method is able to alter structure which corresponds to our model assumption, and on the other hand that our model assumption is justified in real-world applications. Our method is parameter-free.
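    The Boolean-product model underlying such a factorization can be sketched in a few lines of NumPy. This illustrates the general technique only, not the C-Salt GPU code itself:

    ```python
    import numpy as np

    # Minimal sketch of Boolean matrix factorization (not the C-Salt code): a
    # binary matrix D is approximated by the Boolean product of factors
    # U (n x r) and V (r x m), i.e. D_hat[i, j] = OR_k (U[i, k] AND V[k, j]).
    U = np.array([[1, 0],
                  [1, 1],
                  [0, 1]], dtype=bool)
    V = np.array([[1, 1, 0],
                  [0, 1, 1]], dtype=bool)

    # Boolean product: AND along the shared rank axis, then OR over it
    D_hat = (U[:, :, None] & V[None, :, :]).any(axis=1)

    D = np.array([[1, 1, 0],
                  [1, 1, 1],
                  [0, 1, 1]], dtype=bool)
    error = np.logical_xor(D, D_hat).sum()   # number of mismatched cells
    print(D_hat.astype(int))
    print(error)  # 0: this toy D is reconstructed exactly at rank 2
    ```

    Fitting U and V to a given D (and, as in C-Salt, deriving class-specific alterations) is the hard part; this only shows how a candidate factorization is evaluated.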

  7. Data for the article "A New Efficient Algorithm Addressing the Problem of...

    • ieee-dataport.org
    Updated Nov 9, 2021
    Cite
    Paul Kozyra (2021). Data for the article "A New Efficient Algorithm Addressing the Problem of Reliability Evaluation of Multistate Flow Networks with Budget Constraint in Terms of Minimal Cuts" [Dataset]. http://doi.org/10.21227/8fmh-3d09
    Dataset updated
    Nov 9, 2021
    Dataset provided by
    Institute of Electrical and Electronics Engineers (http://www.ieee.ro/)
    Authors
    Paul Kozyra
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Many real-world systems can be modeled by multistate flow networks (MFNs), and their reliability evaluation features in the design and control of these systems. Considering a cost constraint makes the problem of reliability evaluation of an MFN more realistic. For a given demand value d and a given cost limit c, the reliability of an MFN at level (d, c) is the probability of transmitting at least d units from the source node to the sink node through the network within the cost of c. This article addresses the so-called (d, c)-MC problem, i.e., the problem of reliability evaluation of an MFN with a cost constraint in terms of minimal cuts. It presents new results on which a new algorithm is based. This algorithm finds all (d, c)-MC candidates without duplicates and verifies them more efficiently than existing ones. The complexity results for this algorithm and an example of its use are provided. Finally, numerical experiments with R language implementations of the presented algorithm and other competitive algorithms are considered. Both the time complexity analysis and the numerical experiments demonstrate that the presented algorithm is more efficient than other existing ones in the majority of cases.
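    The reliability notion at level (d, c) can be illustrated with a toy Monte Carlo sketch on a two-edge network. Everything below (the network, capacity distributions, and costs) is invented for illustration; the article's algorithm is exact and based on minimal cuts, not simulation:

    ```python
    import random

    # Toy illustration of reliability at level (d, c): the probability that at
    # least d units reach the sink within total cost c. Two parallel
    # source->sink edges with hypothetical capacity distributions and unit costs.
    CAPACITY_PMF = [
        {0: 0.1, 1: 0.3, 2: 0.6},   # edge 0: capacity -> probability
        {0: 0.2, 1: 0.3, 2: 0.5},   # edge 1
    ]
    UNIT_COST = [1.0, 2.0]          # cost per unit transmitted on each edge

    def sample_capacity(pmf):
        r, acc = random.random(), 0.0
        for cap, p in pmf.items():
            acc += p
            if r < acc:
                return cap
        return max(pmf)

    def cheapest_cost_to_send(d, caps):
        # For parallel edges, filling the cheapest edges first minimizes cost.
        remaining, cost = d, 0.0
        for i in sorted(range(len(caps)), key=lambda i: UNIT_COST[i]):
            used = min(remaining, caps[i])
            cost += used * UNIT_COST[i]
            remaining -= used
        return cost if remaining == 0 else float("inf")

    def reliability(d, c, trials=100_000):
        random.seed(0)
        hits = sum(
            cheapest_cost_to_send(d, [sample_capacity(p) for p in CAPACITY_PMF]) <= c
            for _ in range(trials)
        )
        return hits / trials

    print(reliability(d=2, c=3.0))  # analytically 0.6 + 0.3*0.8 = 0.84
    ```

    On a general network, computing the cheapest way to push d units requires a min-cost flow rather than this greedy step, which is where the minimal-cut machinery of the article comes in.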

  8. Data from: The Least Cost Directed Perfect Awareness Problem - Benchmark...

    • data.mendeley.com
    Updated Nov 11, 2024
    + more versions
    Cite
    Felipe Pereira (2024). The Least Cost Directed Perfect Awareness Problem - Benchmark Instances and Solutions [Dataset]. http://doi.org/10.17632/xgtjgzf28r.3
    Dataset updated
    Nov 11, 2024
    Authors
    Felipe Pereira
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    This dataset contains complementary data to the paper "The Least Cost Directed Perfect Awareness Problem: Complexity, Algorithms and Computations" [1]. Here, we make available two sets of instances of the combinatorial optimization problem studied in that paper, which deals with the spread of information on social networks. We also provide the best known solutions and bounds obtained through computational experiments for each instance.

    The first input set includes 300 synthetic instances composed of graphs that resemble real-world social networks. These graphs were produced with a generator proposed in [2]. The second set consists of 14 instances built from graphs obtained by crawling Twitter [3].

    The directories "synthetic_instances" and "twitter_instances" contain files that describe both sets of instances, all of which follow the format: the first two lines correspond to:

    where

    where

    where and

    The directories "solutions_for_synthetic_instances" and "solutions_for_twitter_instances" contain files that describe the best known solutions for both sets of instances, all of which follow the format: the first line corresponds to:

    where is the number of vertices in the solution. Each of the next lines contains:

    where

    where

    Lastly, two files, namely, "bounds_for_synthetic_instances.csv" and "bounds_for_twitter_instances.csv", enumerate the values of the best known lower and upper bounds for both sets of instances.

    This work was supported by grants from Santander Bank, Brazil, Brazilian National Council for Scientific and Technological Development (CNPq), Brazil, São Paulo Research Foundation (FAPESP), Brazil.

    Caveat: the opinions, hypotheses and conclusions or recommendations expressed in this material are the responsibility of the authors and do not necessarily reflect the views of Santander, CNPq, or FAPESP.

    References

    [1] F. C. Pereira, P. J. de Rezende. The Least Cost Directed Perfect Awareness Problem: Complexity, Algorithms and Computations. Submitted. 2023.

    [2] B. Bollobás, C. Borgs, J. Chayes, and O. Riordan. Directed scale-free graphs. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’03, pages 132–139, 2003.

    [3] C. Schweimer, C. Gfrerer, F. Lugstein, D. Pape, J. A. Velimsky, R. Elsässer, and B. C. Geiger. Generating simple directed social network graphs for information spreading. In Proceedings of the ACM Web Conference 2022, WWW ’22, pages 1475–1485, 2022.

  9. C++QEDv2: The multi-array concept and compile-time algorithms in the...

    • elsevier.digitalcommonsdata.com
    Updated Jun 1, 2012
    Cite
    András Vukics (2012). C++QEDv2: The multi-array concept and compile-time algorithms in the definition of composite quantum systems [Dataset]. http://doi.org/10.17632/p4c8n83bb6.1
    Dataset updated
    Jun 1, 2012
    Authors
    András Vukics
    Description

    Abstract: C++QED is a versatile framework for simulating open quantum dynamics. It allows one to build arbitrarily complex quantum systems from elementary free subsystems and interactions, and to simulate their time evolution with the available time-evolution drivers. Through this framework, we introduce a design which should be generic for high-level representations of composite quantum systems. It relies heavily on the object-oriented and generic programming paradigms on one hand, and on the other hand, com...

    Title of program: C++QED. Catalogue Id: AELU_v1_0

    Nature of problem: Definition of (open) composite quantum systems out of elementary building blocks [1]. Manipulation of such systems, with emphasis on dynamical simulations such as Master-equation evolution [2] and Monte Carlo wave-function simulation [3].

    Versions of this program held in the CPC repository in Mendeley Data: AELU_v1_0, C++QED, 10.1016/j.cpc.2012.02.004; AELU_v2_0, C++QED, 10.1016/j.cpc.2014.04.011.

    This program has been imported from the CPC Program Library held at Queen's University Belfast (1969-2019)

  10. Multiresponse algorithms for community-level modeling: review of theory,...

    • datadryad.org
    • dataone.org
    zip
    Updated Nov 2, 2018
    Cite
    Diego Nieto-Lugilde; Katlin C. Maguire; Jessica L. Blois; John W. Williams; Matthew C. Fitzpatrick; Kaitlin C. Maguire (2018). Multiresponse algorithms for community-level modeling: review of theory, applications, and comparison to species distribution models [Dataset]. http://doi.org/10.5061/dryad.99dc0
    Available download formats: zip
    Dataset updated
    Nov 2, 2018
    Dataset provided by
    Dryad
    Authors
    Diego Nieto-Lugilde; Katlin C. Maguire; Jessica L. Blois; John W. Williams; Matthew C. Fitzpatrick; Kaitlin C. Maguire
    Time period covered
    2018
    Description

    Studies applying Community Level Models: summary table of empirical studies applying Community Level Models (CLMs), grouped by CLM algorithm (CLMs_review-Dryad-Data.xlsx).

    Community Level Modeling vignette: Rmarkdown file with code for the Community Level Modeling tutorial "Fitting multispecies models in R" (CLMs_review-Dryad-Code.Rmd).

  11. Data Structures in Java: Active Learning Techniques to Enhance Learning in...

    • data.mendeley.com
    Updated Jun 24, 2024
    Cite
    Rubén Baena-Navarro (2024). Data Structures in Java: Active Learning Techniques to Enhance Learning in STEM [Dataset]. http://doi.org/10.17632/2fh69dt922.1
    Dataset updated
    Jun 24, 2024
    Authors
    Rubén Baena-Navarro
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains a collection of scholarly articles focused on the implementation of active learning techniques in data structures courses, with a particular emphasis on Java programming and its application in enhancing student learning in STEM (Science, Technology, Engineering, and Mathematics) disciplines. This collection provides a comprehensive view of various teaching strategies that promote deeper and more meaningful learning through active methods. Each included article has been selected for its relevance, accessibility (Open Access), and contribution to educational practice in programming and data structures.

    Keywords: Active learning, data structures, Java programming, STEM, education, teaching strategies, student engagement.

    This dataset provides a solid foundation for research and implementation of active learning techniques in data structures and programming courses, benefiting educators and students in the STEM field.

    Dataset Contents:

    Learning more about active learning Author: Graeme Stemp-Morlock DOI: 10.1145/1498765.1498771 Publication Date: April 1, 2009 Abstract: Discusses how active learning algorithms can reduce label complexity compared to passive methods.

    A Compendium of Rationales and Techniques for Active Learning Author: C. Reiness DOI: 10.1187/CBE.20-08-0177 Publication Date: October 1, 2020 Abstract: Provides a collection of strategies for promoting active learning.

    Defining Active Learning: A Restricted Systemic Review Authors: Peter Doolittle, Krista Wojdak, Amanda Walters DOI: 10.20343/teachlearninqu.11.25 Publication Date: September 22, 2023 Abstract: Defines active learning as a student-centered approach to knowledge construction focusing on higher-order thinking.

    The Curious Construct of Active Learning Authors: D. Lombardi, T. Shipley DOI: 10.1177/1529100620973974 Publication Date: April 1, 2021 Abstract: Discusses the different interpretations of active learning in STEM domains.

    Active Learning to Classify Macromolecular Structures in situ for Less Supervision in Cryo-Electron Tomography Authors: Xuefeng Du, Haohan Wang, Zhenxi Zhu, Xiangrui Zeng, Yi-Wei Chang, Jing Zhang, E. Xing, Min Xu DOI: 10.1093/bioinformatics/btab123 Publication Date: February 23, 2021 Abstract: Proposes a hybrid active learning framework to reduce labeling burden in cryo-ET tasks.

  12. LOADING SIMULATION PROGRAM C

    • catalog.data.gov
    Updated Jul 12, 2021
    Cite
    U.S. Environmental Protection Agency, Region 4 (2021). LOADING SIMULATION PROGRAM C [Dataset]. https://catalog.data.gov/dataset/loading-simulation-program-c
    Dataset updated
    Jul 12, 2021
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    LSPC is the Loading Simulation Program in C++, a watershed modeling system that includes streamlined Hydrologic Simulation Program Fortran (HSPF) algorithms for simulating hydrology, sediment, and general water quality.

  13. Replication Data for Novel, provable algorithms for efficient ensemble-based...

    • dataverse.harvard.edu
    Updated Sep 25, 2019
    Cite
    Anna Lowegard (2019). Replication Data for Novel, provable algorithms for efficient ensemble-based computational protein design and their application to the redesign of the c-Raf-RBD:KRas protein-protein interface [Dataset]. http://doi.org/10.7910/DVN/VHIRNM
    Available download formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 25, 2019
    Dataset provided by
    Harvard Dataverse
    Authors
    Anna Lowegard
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    These files contain all of the necessary code to re-create the results in the manuscript entitled "Novel, provable algorithms for efficient ensemble-based computational protein design and their application to the redesign of the c-Raf-RBD:KRas protein-protein interface" along with all of the results described therein.

  14. Data from: Can geodemographic clustering be fair? Incorporating social...

    • figshare.com
    zip
    Updated Dec 14, 2024
    Cite
    Yue Lin; George Grekousis (2024). Can geodemographic clustering be fair? Incorporating social fairness in crisp and fuzzy approaches through a unified framework [Dataset]. http://doi.org/10.6084/m9.figshare.25719732.v2
    Available download formats: zip
    Dataset updated
    Dec 14, 2024
    Dataset provided by
    figshare
    Authors
    Yue Lin; George Grekousis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Geodemographic analysis involves clustering geographic areas into socio-demographically homogeneous groups. However, most existing methods prioritize overall effectiveness, measured by minimizing total costs, potentially misrepresenting specific subgroups within the data. Despite a growing literature on fair clustering, it focuses almost exclusively on crisp clustering, failing to address the inherent fuzziness of the real world. This study addresses these gaps by introducing a socially-fair geodemographic clustering (SFGC) framework, which modifies classical fuzzy c-means (FCM) by incorporating a new cost function that, instead of minimizing total costs, minimizes the maximum average cost across all subgroups. SFGC also introduces a gradient descent-based algorithm to optimize this new cost function. In addition, SFGC can be directly adapted to crisp clustering, facilitating practical implementation and comparison of clustering algorithms.
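    The difference between the classical total-cost objective and the min-max subgroup objective can be sketched for a fixed crisp assignment. This is a minimal illustration of the two cost functions, not the authors' SFGC code; the data, centroids, and subgroup labels are invented:

    ```python
    import numpy as np

    # Sketch: classical clustering minimizes the total cost over all points;
    # the socially-fair objective instead minimizes the maximum *average* cost
    # taken across subgroups, protecting the worst-off group.
    rng = np.random.default_rng(0)

    X = rng.normal(size=(200, 2))            # points (e.g. area profiles)
    group = rng.integers(0, 2, size=200)     # hypothetical subgroup label per point
    centroids = np.array([[-1.0, 0.0], [1.0, 0.0]])

    # squared distance of each point to its nearest centroid (crisp assignment)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    cost = d2.min(axis=1)

    total_cost = cost.sum()                                  # classical objective
    group_avg = [cost[group == g].mean() for g in (0, 1)]    # average cost per subgroup
    fair_cost = max(group_avg)                               # socially-fair objective

    print(total_cost, fair_cost)
    ```

    A centroid update that lowers `total_cost` can still raise `fair_cost` by concentrating error in one subgroup, which is why SFGC optimizes the min-max objective directly (with a gradient-based scheme in the fuzzy case).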

  15. Data from: Comparison of Prognostic Algorithms for Estimating Remaining...

    • s.cnmilf.com
    • data.nasa.gov
    • +2 more
    Updated Apr 10, 2025
    + more versions
    Cite
    Dashlink (2025). Comparison of Prognostic Algorithms for Estimating Remaining Useful Life of Batteries [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/comparison-of-prognostic-algorithms-for-estimating-remaining-useful-life-of-batteries
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    We were interested here in particular in conditions where un-modeled effects are present as manifested by the different degradation curve at 45°C. Although all algorithms were given the same amount of information to the degree practical, there were considerable differences in performance. Specifically, the combined Bayesian regression-estimation approach implemented as a RVM-PF framework has significant advantages over conventional methods of RUL estimation like ARIMA and EKF. ARIMA, being a purely data-driven method, does not incorporate any physics of the process into the computation, and hence ends up with wide uncertainty margins that make it unsuitable for long-term predictions. Additionally, it may not be possible to eliminate all non-stationarity from a dataset even after repeated differencing, thus adding to prediction inaccuracy. EKF, though robust against non-stationarity, suffers from the inability to accommodate un-modeled effects and can diverge quickly as shown. We did not explore other variations of the Kalman Filter that might provide better performance such as the unscented Kalman Filter. The Bayesian statistical approach, on the other hand, appears to be well suited to handle various sources of uncertainties since it defines probability distributions over both parameters and variables and integrates out the nuisance terms. Also, it does not simply provide a mean estimate of the time-to-failure; rather it generates a probability distribution over time that best encapsulates the uncertainties inherent in the system model and measurements and in the core concept of failure prediction.

  16. Cohen's d Confidence Sets Results: Algorithm 3 MiddleConfidenceInterval c12

    • neurovault.org
    nifti
    Updated Oct 19, 2020
    + more versions
    Cite
    (2020). Cohen's d Confidence Sets Results: Algorithm 3 MiddleConfidenceInterval c12 [Dataset]. http://identifiers.org/neurovault.image:426896
    Available download formats: nifti
    Dataset updated
    Oct 19, 2020
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Middle confidence interval!


    Collection description

    In this repository, we share all of the Human Connectome Project results maps used in the manuscript Spatial Confidence Sets for Standardized Effect Size Images (Bowring, Telschow, Schwartzman, Nichols; 2020).

    Images are named using the following format: the 'Algorithm 1 LowerConfidenceInterval c05' image is the (blue) lower CS map obtained for the targeted Cohen's d effect size c = 0.5 using Algorithm 1 as described in the manuscript; the 'Algorithm 3 MiddleConfidenceInterval c12' image is the (yellow) point estimate map obtained for the targeted Cohen's d effect size c = 1.2 using Algorithm 3; etc.

    Finally, the 'SnPM filtered' image is the thresholded (p < 0.05 FWE; obtained via permutation) statistical results map from applying a group-level one sample t-test to the 80 subjects' data.

    Subject species

    homo sapiens

    Map type

    R

  17. Data from: Spatial and temporal variability of the freezing level in...

    • zenodo.org
    bin, text/x-python +1
    Updated Jan 16, 2025
    Cite
    Nicolás D. García-Lee; Claudio Bravo; Álvaro González-Reyes; Piero Mardones (2025). Spatial and temporal variability of the freezing level in Patagonia's atmosphere [Dataset]. http://doi.org/10.5281/zenodo.14673822
    Available download formats: text/x-python, bin, txt
    Dataset updated
    Jan 16, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nicolás D. García-Lee; Claudio Bravo; Álvaro González-Reyes; Piero Mardones
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    May 27, 2024
    Description

    Short Summary:

    This repository houses the Python preprocessing scripts used to generate the metadata for the García-Lee et al. (2024) dataset. These files and scripts give access to the algorithm, and to examples, for generating gridded products in netCDF format, specifically featuring the 0°C isotherm field.

    Dependencies:

    • numpy (tested with 1.24.4 in py3)
    • pandas (tested with 2.0.3 in py3)
    • netCDF4 (tested with 1.6.0 in py3)
    • re (tested with 2.2.1 in py3)
    • glob
    • OS: Tested on Windows.

    Technical Info:

    • 1_H0_Detect.py (Python script): 0°C isotherm detection algorithm.
    • 2_H0_Daily_Mean.py (Python script): calculation of the daily mean.
    • ISO0_1959_2021_GRID.rar (netCDF): 0°C isotherm data at 6-hour intervals (1959-2021), in meters above sea level (m a.s.l.).
    • ERA5_PATAGONIA_6H_T_GPH_1959.rar (netCDF): raw ERA5 data example for 1959, temperature (K) and geopotential (m**2 s**-2).
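    The detection step in 1_H0_Detect.py amounts to locating, in each grid-column temperature profile, the height at which the temperature crosses 0°C. The sketch below is only an illustration of that idea, not the script's actual code; the function name, array layout, and surface-upward level ordering are assumptions:

```python
import numpy as np

G = 9.80665  # standard gravity; ERA5 geopotential (m**2 s**-2) / G gives height (m)

def freezing_level(temp_c, height_m):
    """Height (m a.s.l.) of the lowest 0°C crossing, linearly interpolated.

    temp_c, height_m: 1-D profiles ordered from the surface upward.
    Returns NaN when the profile never crosses 0°C.
    """
    for k in range(len(temp_c) - 1):
        t0, t1 = temp_c[k], temp_c[k + 1]
        if t0 >= 0.0 > t1:  # the crossing lies between levels k and k+1
            w = t0 / (t0 - t1)  # fractional position of the crossing in the layer
            return height_m[k] + w * (height_m[k + 1] - height_m[k])
    return float("nan")

# Example: the profile crosses 0°C halfway between 1000 m and 3000 m
print(freezing_level(np.array([10.0, 5.0, -5.0]),
                     np.array([0.0, 1000.0, 3000.0])))  # → 2000.0
```

    Applied per grid cell and time step, this yields the kind of gridded 0°C-isotherm field stored in ISO0_1959_2021_GRID.rar.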

    Extra:

    The file 'Observations and Charts.pdf' shows averages, standard deviations, bias, and trends of the 0°C isotherm for Puerto Montt, Río Gallegos, Comodoro Rivadavia, and Punta Arenas. These values were estimated using both observations and reanalysis ERA5 data.

    Reference:

    García-Lee, N., Bravo, C., González-Reyes, Á., and Mardones, P.: Spatial and temporal variability of the freezing level in Patagonia's atmosphere, Weather Clim. Dynam., 5, 1137–1151, https://doi.org/10.5194/wcd-5-1137-2024, 2024.

  18. Efficient Matlab Programs

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Apr 10, 2025
    Cite
    Dashlink (2025). Efficient Matlab Programs [Dataset]. https://catalog.data.gov/dataset/efficient-matlab-programs
    Explore at:
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    Matlab has a reputation for running slowly. Here are some pointers on how to speed computations, to an often unexpected degree. Subjects currently covered: matrix coding, implicit multithreading on a multicore machine, sparse matrices, and sub-block computation to avoid memory overflow.

    Matrix Coding - 1

    Matlab documentation notes that efficient computation depends on using the matrix facilities, and that mathematically identical algorithms can have very different runtimes, but it is a bit coy about just what these differences are. A simple but telling example: the following is the core of the GD-CLS algorithm of Berry et al., copied from Fig. 1 of Shahnaz et al., 2006, "Document clustering using nonnegative matrix factorization":

        for jj = 1:maxiter
            A = W'*W + lambda*eye(k);
            for ii = 1:n
                b = W'*V(:,ii);
                H(:,ii) = A \ b;
            end
            H = H .* (H>0);
            W = W .* (V*H') ./ (W*(H*H') + 1e-9);
        end

    Replacing the column-wise update of H with a matrix update gives:

        for jj = 1:maxiter
            A = W'*W + lambda*eye(k);
            B = W'*V;
            H = A \ B;
            H = H .* (H>0);
            W = W .* (V*H') ./ (W*(H*H') + 1e-9);
        end

    These were tested on an 8049 x 8660 sparse bag-of-words matrix V (non-zero density 0.0083), with W of size 8049 x 50, H of size 50 x 8660, maxiter = 50, lambda = 0.1, and identical initial W. They were run consecutively, multithreaded on an 8-processor Sun server, starting at ~7:30 PM, with tic-toc timing recorded. Runtimes were respectively 6586.2 and 70.5 seconds, a 93:1 difference. The maximum absolute pairwise difference between W matrix values was 6.6e-14. Similar speedups have been consistently observed in other cases; in one algorithm, combining matrix operations with efficient use of the sparse matrix facilities gave a 3600:1 speedup. For speed alone, C-style iterative programming should be avoided wherever possible. In addition, when a couple of lines of matrix code can substitute for an entire C-style function, program clarityy is much improved.
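    The loop-versus-matrix contrast carries over directly to other array languages. As a small algebraic cross-check (a NumPy sketch with made-up sizes, not the benchmark above), the column-wise and whole-matrix forms of the H update agree to rounding error:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, lam = 40, 30, 5, 0.1
V = rng.random((n, m))   # data matrix
W = rng.random((n, k))   # current left factor

A = W.T @ W + lam * np.eye(k)

# Column-wise update: one small linear solve per column of H
H_loop = np.empty((k, m))
for ii in range(m):
    H_loop[:, ii] = np.linalg.solve(A, W.T @ V[:, ii])

# Matrix update: a single solve for all columns at once
H_mat = np.linalg.solve(A, W.T @ V)

print(np.allclose(H_loop, H_mat))  # → True
```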
    Matrix Coding - 2

    Applied to integration, the speed gains are not so great, largely due to the time taken to set up and deal with the boundaries; the anonymous-function setup time is negligible. I demonstrate on a simple uniform-step, linearly interpolated 1-D integration of cos() from 0 to pi, which should yield zero:

        tic;
        step = .00001; fun = @cos; start = 0; endit = pi;
        enda = floor((endit - start)/step)*step + start;
        delta = (endit - enda)/step;
        intF = fun(start)/2;
        intF = intF + fun(endit)*delta/2;
        intF = intF + fun(enda)*(delta+1)/2;
        for ii = start+step:step:enda-step
            intF = intF + fun(ii);
        end
        intF = intF*step
        toc;

        intF = -2.910164109692914e-14
        Elapsed time is 4.091038 seconds.

    Replacing the inner summation loop with the matrix equivalent speeds things up a bit:

        tic;
        step = .00001; fun = @cos; start = 0; endit = pi;
        enda = floor((endit - start)/step)*step + start;
        delta = (endit - enda)/step;
        intF = fun(start)/2;
        intF = intF + fun(endit)*delta/2;
        intF = intF + fun(enda)*(delta+1)/2;
        intF = intF + sum(fun(start+step:step:enda-step));
        intF = intF*step
        toc;

        intF = -2.868419946011613e-14
        Elapsed time is 0.141564 seconds.

    The core computation take

  19. Data from: Benchmark data for sulcal pits extraction algorithms

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Sep 30, 2015
    Cite
    Guillaume Auzias; Lucile Brun; Christine Deruelle; Olivier Coulon (2015). Benchmark data for sulcal pits extraction algorithms [Dataset]. http://doi.org/10.7910/DVN/PRK2U1
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Sep 30, 2015
    Dataset provided by
    Harvard Dataverse
    Authors
    Guillaume Auzias; Lucile Brun; Christine Deruelle; Olivier Coulon
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    France, Marseilles
    Dataset funded by
    Fondation de France (OTP 38872)
    OASIS database grants (P50 AG05681, P01 AG03991, R01 AG021910, P20 MH071616, U24 RR021382)
    Fondation Orange (S1 2013-050)
    Description

    This article contains data related to the research article "G. Auzias, L. Brun, C. Deruelle, O. Coulon, Deep sulcal landmarks: Algorithmic and conceptual improvements in the definition and extraction of sulcal pits, Neuroimage. 111 (2015) 12–25. doi:10.1016/j.neuroimage.2015.02.008". These data can be used as a benchmark for quantitative evaluation of sulcal pits extraction algorithms. In particular, they allow a quantitative comparison with our method, and the assessment of the consistency of sulcal pits extraction across two well-matched populations.

  20. Software for control of autonomous robots using fuzzy logic controllers tuned by genetic algorithms

    • datadryad.org
    • data.mendeley.com
    • +1more
    zip
    Updated Oct 28, 2019
    Cite
    Corneliu Arsene (2019). Software for control of autonomous robots using fuzzy logic controllers tuned by genetic algorithms [Dataset]. http://doi.org/10.5061/dryad.9ghx3ffcw
    Explore at:
    zipAvailable download formats
    Dataset updated
    Oct 28, 2019
    Dataset provided by
    Dryad
    Authors
    Corneliu Arsene
    Time period covered
    2019
    Description

    This software implements the autonomous control of a robot using a fuzzy logic controller tuned by a genetic algorithm. The software was written in the C programming language for Windows (SDK). A description of the software can be found in the research publication "Arsene, C.T.C., & Zalzala, A.M.S., "Control of autonomous robots using fuzzy logic controllers tuned by genetic algorithms", In Proc. Congress on Evolutionary Computation, Vol. 1, pp. 428-35, Washington DC, 1999, IEEE Computer Science Press, ISBN 0-7803-5536-9". The software could possibly also be used for the simulation of nano-robots.
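    For readers unfamiliar with the approach, the two core ingredients of such a controller, membership functions and defuzzification, can be sketched in a few lines. This is a generic illustration rather than the C software described above; the triangle parameters stand in for the kind of values a genetic algorithm would tune:

```python
def tri(x, a, b, c):
    """Triangular membership: rises on [a, b], falls on [b, c], zero outside."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def defuzzify(rules, x):
    """Weighted-centroid defuzzification over singleton rule outputs.

    rules: list of ((a, b, c), output) pairs; the (a, b, c) triples are
    the kind of parameters a genetic algorithm would evolve.
    """
    num = sum(tri(x, *mf) * out for mf, out in rules)
    den = sum(tri(x, *mf) for mf, _ in rules)
    return num / den if den else 0.0

# Two rules on a distance-to-obstacle input:
# "near -> turn hard" (output 1.0), "far -> go straight" (output 0.0)
rules = [((0.0, 1.0, 2.0), 1.0), ((1.0, 2.0, 3.0), 0.0)]
print(defuzzify(rules, 1.5))  # → 0.5
```

    A genetic algorithm would evaluate candidate parameter sets by simulating the robot and scoring tracking or obstacle-avoidance performance, keeping the best-scoring triangles.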
