3 datasets found
  1. Software Defect Dataset by NASA

    • kaggle.com
    zip
    Updated Jul 10, 2023
    Cite
    Radowanul Haque (2023). Software Defect Dataset by NASA [Dataset]. https://www.kaggle.com/datasets/radowanulhaque/software-defect
    Explore at:
    Available download formats: zip (315165 bytes)
    Dataset updated
    Jul 10, 2023
    Authors
    Radowanul Haque
    Description

    The Software Defect Dataset is a rich and meticulously curated collection of data designed to facilitate research and analysis in the field of software quality and defect prediction. This dataset serves as a valuable resource for researchers, practitioners, and enthusiasts seeking to enhance the reliability and stability of software systems.

  2. Software Defects

    • kaggle.com
    Updated May 3, 2025
    + more versions
    Cite
    Yasir Hussein Shakir (2025). Software Defects [Dataset]. https://www.kaggle.com/datasets/yasserhessein/software-defects
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 3, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yasir Hussein Shakir
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    A total of twelve NASA software defect data sets have been utilized in this research. The PROMISE software engineering repository (http://promise.site.uottawa.ca/SERepository/) provided five of the data sets (part I), which included CM1, JM1, KC1, KC2, and PC1. The remaining seven data sets (part II) came from the tera-PROMISE Repository (http://openscience.us/repo/defect/mccabehalsted/).

    Tong, Haonan; Liu, Bin; Wang, Shihai (2017), “Benchmark data sets”, Mendeley Data, V1, doi: 10.17632/923xvkk5mm.1

  3. DataSheet2_Batch effect correction methods for NASA GeneLab transcriptomic...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Jun 1, 2023
    + more versions
    Cite
    Boyko, Valery; Samson, Finsam; Sanders, Lauren M.; Chen, Yi-Chun; Gebre, Samrawit; Galazka, Jonathan M.; Chok, Hamed; Acuna, Ana Uriarte; Costes, Sylvain V.; Dinh, Marie; Polo, San-Huei Lai; Saravia-Butler, Amanda M. (2023). DataSheet2_Batch effect correction methods for NASA GeneLab transcriptomic datasets.PDF [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001023928
    Explore at:
    Dataset updated
    Jun 1, 2023
    Authors
    Boyko, Valery; Samson, Finsam; Sanders, Lauren M.; Chen, Yi-Chun; Gebre, Samrawit; Galazka, Jonathan M.; Chok, Hamed; Acuna, Ana Uriarte; Costes, Sylvain V.; Dinh, Marie; Polo, San-Huei Lai; Saravia-Butler, Amanda M.
    Description

    Introduction: RNA sequencing (RNA-seq) data from space biology experiments promise to yield invaluable insights into the effects of spaceflight on terrestrial biology. However, sample numbers from each study are low due to limited crew availability, hardware, and space. To increase statistical power, spaceflight RNA-seq datasets from different missions are often aggregated together. However, this can introduce technical variation or “batch effects”, often due to differences in sample handling, sample processing, and sequencing platforms. Several computational methods have been developed to correct for technical batch effects, thereby reducing their impact on true biological signals.

    Methods: In this study, we combined 7 mouse liver RNA-seq datasets from NASA GeneLab (part of the NASA Open Science Data Repository) to evaluate several common batch effect correction methods (ComBat and ComBat-seq from the sva R package, and Median Polish, Empirical Bayes, and ANOVA from the MBatch R package). Principal component analysis (PCA) was used to identify library preparation method and mission as the primary sources of batch effect among the technical variables in the combined dataset. We next quantitatively evaluated the ability of each of the indicated methods to correct for each identified technical batch variable using the following criteria: BatchQC, PCA, dispersion separability criterion, log fold change correlation, and differential gene expression analysis. Each batch variable/correction method combination was then assessed using a custom scoring approach to identify the optimal correction method for the combined dataset, by geometrically probing the space of all allowable scoring functions to yield an aggregate volume-based scoring measure.

    Results and Discussion: Using the method described for the combined dataset in this study, the library preparation variable/ComBat correction method pair outranked the other candidate pairs, suggesting that this combined dataset should be corrected for library preparation using the ComBat correction method prior to downstream analysis. We describe the GeneLab multi-study analysis and visualization portal which will allow users to access the publicly available space biology ‘omics data, select multiple studies to combine for analysis, and examine the presence or absence of batch effects using multiple metrics. If the user chooses to perform batch effect correction, the scoring approach described here can be implemented to identify the optimal correction method to use for their specific combined dataset prior to analysis.


Software Defect Dataset by NASA

Dataset from PROMISE Software Engineering Repository