2 datasets found
  1. Top 10 performing oversamplers for DTS2 versus baseline (no oversampling and...

    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Kevin Teh; Paul Armitage; Solomon Tesfaye; Dinesh Selvarajah; Iain D. Wilkinson (2023). Top 10 performing oversamplers for DTS2 versus baseline (no oversampling and SMOTE) averaged across four classifiers. [Dataset]. http://doi.org/10.1371/journal.pone.0243907.t003
    Explore at:
    Available download formats
    xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Kevin Teh; Paul Armitage; Solomon Tesfaye; Dinesh Selvarajah; Iain D. Wilkinson
    License

    Attribution 4.0 (CC BY 4.0)
    https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Top 10 performing oversamplers for DTS2 versus baseline (no oversampling and SMOTE) averaged across four classifiers.

  2. Data from: iGlu_AdaBoost: Identification of Lysine Glutarylation Using the...

    • acs.figshare.com
    zip
    Updated Jun 3, 2023
    Cite
    Lijun Dou; Xiaoling Li; Lichao Zhang; Huaikun Xiang; Lei Xu (2023). iGlu_AdaBoost: Identification of Lysine Glutarylation Using the AdaBoost Classifier [Dataset]. http://doi.org/10.1021/acs.jproteome.0c00314.s001
    Explore at:
    Available download formats
    zip
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    ACS Publications
    Authors
    Lijun Dou; Xiaoling Li; Lichao Zhang; Huaikun Xiang; Lei Xu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0)
    https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Lysine glutarylation is a newly reported post-translational modification (PTM) that plays significant roles in regulating metabolic and mitochondrial processes. Accurate identification of protein glutarylation is the first step toward investigating its molecular functions and applications. Because traditional biological sequencing techniques are time-consuming and expensive, and protein data are growing explosively, building precise computational models to identify glutarylation rapidly is a popular and feasible solution. In this work, we proposed a novel AdaBoost-based predictor called iGlu_AdaBoost to distinguish glutarylation and non-glutarylation sequences. Here, the top 37 features were chosen from a total of 1768 combined features using Chi2 following incremental feature selection (IFS) to build the model, including 188D, the composition of k-spaced amino acid pairs (CKSAAP), and enhanced amino acid composition (EAAC). With the help of the hybrid-sampling method SMOTE-Tomek, the AdaBoost algorithm achieved recall, specificity, and AUC values of 87.48%, 72.49%, and 0.89 over 10-fold cross-validation, and 72.73%, 71.92%, and 0.63 on an independent test, respectively. Further feature analysis indicated that the positively charged amino acids R and K play critical roles in glutarylation recognition. Our model showed good generalization ability and consistent prediction results for positive and negative samples, which is comparable to four published tools. The proposed predictor is an efficient tool for finding potential glutarylation sites and provides helpful suggestions for further research on glutarylation mechanisms and related disease treatments.
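
    The description above reports recall and specificity separately for the positive and negative classes. As a minimal sketch of how these two metrics are conventionally derived from confusion-matrix counts, the snippet below may help; the function names and the counts are hypothetical illustrations and are not taken from the dataset or from the authors' code.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were predicted positive."""
    if tp + fn == 0:
        raise ValueError("no positive samples")
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of actual negatives that were predicted negative."""
    if tn + fp == 0:
        raise ValueError("no negative samples")
    return tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not results from the paper).
    tp, fn, tn, fp = 8, 2, 15, 5
    print(f"recall={recall(tp, fn):.2%} specificity={specificity(tn, fp):.2%}")
```

    On an imbalanced dataset, reporting both values shows whether a classifier favors one class, which is presumably why the description lists them side by side.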


