Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Leave-one-out cross-validation results obtained with BioHEL, SVM, RF and PAM on the three microarray datasets using three feature selection methods (CFS, PLSS, RFS); AVG = average accuracy, STDDEV = standard deviation; the highest accuracies achieved with BioHEL and the best alternative are both shown in bold type for each dataset.
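Purely as an illustration of the protocol this table summarizes (not the authors' code): with scikit-learn, SelectKBest can stand in for the CFS/PLSS/RFS feature selectors and a linear SVC for the classifiers (BioHEL and PAM have no direct scikit-learn equivalents). Placing selection inside the pipeline ensures it is refit on the training fold of every leave-one-out split, so the held-out sample never influences feature selection.

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for a microarray dataset: few samples, many features.
X, y = make_classification(n_samples=60, n_features=500, n_informative=20,
                           random_state=0)

# Selection inside the pipeline => refit on the 59 training samples of each fold.
model = make_pipeline(SelectKBest(f_classif, k=50), SVC(kernel="linear"))
scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print(f"AVG = {scores.mean():.3f}, STDDEV = {scores.std():.3f}")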
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Logistic regression leave-one-out cross-validation classification rate.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
*Once again, one sample is held out as the test sample and the remaining 28 samples form the training set. The four features (columns “1”, “2”, “3”, and “4”) of each miRNA are computed from the genomic coordinates of the miRNA, the miRNA-hosting intron, and the host gene. ER denotes the experimental results and PR the prediction results; the symbol “+” indicates high co-expression and the symbol “−” indicates low co-expression.
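A minimal sketch of that protocol (the classifier and the feature values below are illustrative placeholders, not the study's actual model or data): each of the 29 samples is held out in turn, a model is fit on the remaining 28, and the held-out sample receives a predicted “+” or “−” to compare against its experimental label.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(0)
X = rng.normal(size=(29, 4))              # the four coordinate-derived features
er = np.array([1] * 15 + [0] * 14)        # experimental results: 1 = "+", 0 = "-"

pr = np.empty(29, dtype=int)              # prediction results
for train, test in LeaveOneOut().split(X):
    clf = LogisticRegression().fit(X[train], er[train])   # fit on the other 28
    pr[test] = clf.predict(X[test])

print("PR/ER agreement:", (pr == er).mean())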
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This appendix contains all supplementary materials for the accepted manuscript JSSC-2023-0150, including detailed proofs of all theorems presented in the paper as well as additional simulation results.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
A new design and implementation of a control system for an anthropomorphic robotic hand has been developed for the Bioinformatics and Autonomous Learning Laboratory (BALL) at ESPOL. Myoelectric signals were acquired using a bioelectric data acquisition board (CYTON BOARD), using six of the eight available channels. These signals had an amplitude of 200 µV and were sampled at a frequency of 250 Hz.
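As a hedged illustration of a typical first processing step for such signals (not taken from the original implementation): a six-channel recording sampled at 250 Hz can be band-pass filtered before feature extraction. The 20–110 Hz band below is an assumption imposed by the 125 Hz Nyquist limit of the stated sampling rate.

import numpy as np
from scipy.signal import butter, filtfilt

FS = 250.0        # sampling frequency (Hz), as stated above
N_CHANNELS = 6    # channels used on the acquisition board

# 4th-order Butterworth band-pass, 20-110 Hz (Nyquist = 125 Hz).
b, a = butter(4, [20.0, 110.0], btype="bandpass", fs=FS)

emg = np.random.randn(N_CHANNELS, 5 * int(FS))   # placeholder: 5 s of noise
filtered = filtfilt(b, a, emg, axis=1)           # zero-phase filtering per channel
print(filtered.shape)                            # (6, 1250)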
Molecular tip-dating of phylogenetic trees is a growing discipline that uses DNA sequences sampled at different points in time to co-estimate the timing of evolutionary events and rates of molecular evolution. In this context, BEAST, a program for Bayesian analysis of molecular sequences, is the most widely used phylogenetic tool. Here, we introduce TipDatingBeast, an R package built to assist the implementation of various phylogenetic tip-dating tests using BEAST. TipDatingBeast currently contains two main functions. The first prepares date-randomization analyses, which assess the temporal signal of a dataset. The second performs leave-one-out analyses, which test the consistency of independent calibration sequences and pinpoint those that may introduce bias. We apply these functions to an empirical dataset and provide practical guidance for interpreting the results.
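TipDatingBeast itself is an R package and re-runs BEAST for each replicate; purely to convey the logic of the date-randomization test, here is a toy Python sketch in which the clock rate is approximated by a root-to-tip regression slope (an assumption for illustration only). A temporal signal is supported when the rate estimated from the true dates falls outside the range obtained under randomized dates.

import numpy as np

def clock_rate(divergences, dates):
    """Toy stand-in for a full Bayesian rate estimate: slope of root-to-tip
    divergence regressed on sampling date."""
    return np.polyfit(dates, divergences, 1)[0]

def date_randomization_test(divergences, dates, n_reps=20, seed=1):
    rng = np.random.default_rng(seed)
    observed = clock_rate(divergences, dates)
    randomized = [clock_rate(divergences, rng.permutation(dates))
                  for _ in range(n_reps)]
    return observed, min(randomized), max(randomized)

# Toy data: divergence accumulates with sampling year, plus noise.
dates = np.arange(2000, 2020, dtype=float)
div = 1e-3 * (dates - 2000) + np.random.default_rng(0).normal(0, 1e-4, dates.size)
obs, lo, hi = date_randomization_test(div, dates)
print(f"observed rate {obs:.2e}; randomized range [{lo:.2e}, {hi:.2e}]")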
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Leave-one-out cross-validation (jackknife test) success rates achieved by a random guess and by the network-based method.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Results of each tested prioritization method on the STRINGv8.2 network. Mean and standard deviation of four evaluation measures (AUC, MAP, and the percentages of left-out genes ranked in the top 10 and top 20), obtained from 10 complete leave-one-out cross-validations on the 29 disease sets using 10 distinct, previously generated candidate sets. ‘SRec’: percentage of left-out genes (out of the 620 seeds in the original seed sets) effectively ranked, that is, yielding a ranking score larger than zero. ‘DRec’: percentage of recovered diseases among the 29 diseases with seeds (a disease is recovered if at least one of its left-out genes obtains a ranking score larger than zero). ‘SEval’: percentage of left-out genes (out of the 620 seeds originally in the seed sets) present in the network. All evaluation measures (AUC, MAP, TOP 10 and TOP 20) were computed over only the left-out genes present in each network (SEval), rather than all the genes originally in the seed sets. Parameters: HDiffusion (, ), PRank (, ).
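A sketch of the evaluation loop behind such tables, with prioritize as a hypothetical placeholder for a real method such as HDiffusion or PRank: each seed gene is left out in turn, the remaining seeds drive the prioritization of a candidate set that contains the left-out gene, and the rank of that gene feeds measures like TOP 10.

import numpy as np

rng = np.random.default_rng(0)

def prioritize(network, seeds, candidates):
    """Hypothetical placeholder: a real method scores candidates by their
    network proximity to the seed genes; here, random scores."""
    return {g: rng.random() for g in candidates}

def loo_top_k(network, seeds, candidate_pool, k=10):
    hits = 0
    for left_out in seeds:
        train_seeds = [s for s in seeds if s != left_out]
        scores = prioritize(network, train_seeds, candidate_pool + [left_out])
        ranking = sorted(scores, key=scores.get, reverse=True)
        hits += (ranking.index(left_out) + 1) <= k    # left-out gene in top k?
    return hits / len(seeds)

genes = [f"g{i}" for i in range(110)]
print(loo_top_k(network=None, seeds=genes[:10], candidate_pool=genes[10:], k=10))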
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
DP denotes our dynamic programming algorithm; RC and RP denote the recursive combination and recursive partition algorithms, respectively.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
There is still no consensus on how to select models in Bayesian phylogenetics, and more generally in applied Bayesian statistics. Bayes factors are often presented as the method of choice, yet other approaches have been proposed, such as cross-validation or information criteria. Each of these paradigms raises specific computational challenges, but they also differ in their statistical meaning, being motivated by different objectives: either testing hypotheses or finding the best-approximating model. These alternative goals entail different compromises, and as a result, Bayes factors, cross-validation and information criteria may be valid for addressing different questions. Here, the question of Bayesian model selection is revisited, with a focus on the problem of finding the best-approximating model. Several model selection approaches were re-implemented, numerically assessed and compared: Bayes factors; cross-validation (CV) in its different forms (k-fold or leave-one-out); and the widely applicable information criterion (wAIC), which is asymptotically equivalent to leave-one-out cross-validation (LOO-CV). Using a combination of analytical results and empirical and simulation analyses, it is shown that Bayes factors are unduly conservative. In contrast, cross-validation represents a more adequate formalism for selecting the model returning the best approximation of the data-generating process and the most accurate estimates of the parameters of interest. Among alternative CV schemes, LOO-CV and its asymptotic equivalent, the wAIC, stand out as the best choices, conceptually and computationally, given that both can be computed simultaneously from standard MCMC runs under the posterior distribution.
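For reference, these are the standard definitions (not reproduced from the paper itself): given S posterior draws \theta^{(1)}, \dots, \theta^{(S)} from a single MCMC run and observations y_1, \dots, y_n,

\widehat{\mathrm{lppd}} = \sum_{i=1}^{n} \log \Big( \frac{1}{S} \sum_{s=1}^{S} p\big(y_i \mid \theta^{(s)}\big) \Big), \qquad
\hat{p}_{\mathrm{wAIC}} = \sum_{i=1}^{n} \operatorname{Var}_{s} \Big[ \log p\big(y_i \mid \theta^{(s)}\big) \Big], \qquad
\mathrm{wAIC} = -2 \big( \widehat{\mathrm{lppd}} - \hat{p}_{\mathrm{wAIC}} \big).

The first term is the log pointwise predictive density and the second is an effective-number-of-parameters penalty; as n grows, the wAIC converges to the out-of-sample predictive score that LOO-CV estimates directly (Watanabe, 2010), which is why both can be obtained from the same posterior sample.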
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Acc2: average acceleration (g) where non-wear time was imputed using all wear-time data at a similar time of day for that participant; RMSE: root mean square error; body side: monitor attachment to the dominant vs. non-dominant wrist; ***: p
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
“Ratio” stands for the percentage of correctly predicted domains in the cross-validation. The only features used are GO term frequencies obtained from radius-one neighboring domains.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
In the second column, the number in brackets is the pre-defined number of components. A smaller prediction error indicates a better prediction model.
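Assuming the components refer to a latent-variable model such as PLS regression (the caption does not name the model, so this is an assumption), the leave-one-out prediction error across pre-defined component counts can be compared as follows:

import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 12))
y = X[:, :3].sum(axis=1) + rng.normal(0, 0.1, 40)    # toy response

for n_comp in (1, 2, 3, 5):
    pred = cross_val_predict(PLSRegression(n_components=n_comp), X, y,
                             cv=LeaveOneOut())
    mse = float(np.mean((np.ravel(pred) - y) ** 2))  # LOO prediction error
    print(f"({n_comp} components): LOO prediction error {mse:.4f}")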
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
AUC under the leave-one-out cross-validation scheme is calculated for different weight parameters to confirm that miREFScan is robust to the choice of parameter values.
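A generic sketch of such a robustness check; score_with_weight is a hypothetical placeholder, since miREFScan's actual scoring function is not reproduced here, and the leave-one-out layer is omitted for brevity:

import numpy as np
from sklearn.metrics import roc_auc_score

def score_with_weight(X, w):
    """Hypothetical stand-in for a weighted combination of two evidence scores."""
    return w * X[:, 0] + (1.0 - w) * X[:, 1]

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)                      # toy labels
X = rng.normal(size=(200, 2)) + y[:, None]       # toy features informative of y

for w in (0.1, 0.3, 0.5, 0.7, 0.9):
    auc = roc_auc_score(y, score_with_weight(X, w))
    print(f"weight {w:.1f}: AUC {auc:.3f}")      # similar AUCs => robust to w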
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This overview shows the individual importance of each of the features in the overall classification model. The large majority of features have very little impact on the model, i.e., a decrease in performance of only 1–2% when they are removed. The only two features that make a substantial difference are the Prefix and the token context (Token_Bi3), which affect the overall performance by almost 15%.
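A minimal leave-one-feature-out ablation of the kind this overview reports (the model and data are placeholders, not the original classifier or feature set):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           random_state=0)
clf = LogisticRegression(max_iter=1000)
full = cross_val_score(clf, X, y, cv=5).mean()   # baseline with all features

for j in range(X.shape[1]):
    drop = cross_val_score(clf, np.delete(X, j, axis=1), y, cv=5).mean()
    print(f"feature {j}: performance change {100 * (drop - full):+.1f}%")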
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Results of each tested prioritization method on the NCBI PPI network. Mean and standard deviation of four evaluation measures (AUC, MAP, and the percentages of left-out genes ranked in the top 10 and top 20), obtained from 10 complete leave-one-out cross-validations on the 29 disease sets using 10 distinct, previously generated candidate sets. ‘SRec’: percentage of left-out genes (out of the 620 seeds in the original seed sets) effectively ranked, that is, yielding a ranking score larger than zero. ‘DRec’: percentage of recovered diseases among the 29 diseases with seeds (a disease is recovered if at least one of its left-out genes obtains a ranking score larger than zero). ‘SEval’: percentage of left-out genes (out of the 620 seeds originally in the seed sets) present in the network. All evaluation measures (AUC, MAP, TOP 10 and TOP 20) were computed over only the left-out genes present in each network (SEval), rather than all the genes originally in the seed sets. Parameters: HDiffusion (, ), PRank (, ).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Results are reported for three panel sizes (P1: 500 SNPs, P2: 800 SNPs, P3: 1000 SNPs). For each panel size and each population, we report the average error and the standard deviation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
ModelND is trained on PositiveH, PositiveC, and NegativeND. ModelNDNN is trained using the same positive set and a negative set that includes NegativeND together with NegativeNN. Accuracy (Acc), precision (Pre), and recall (Rec) were calculated for both kernels, RBF and polynomial.
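A sketch of this comparison with scikit-learn (toy data stands in for the PositiveH/PositiveC/NegativeND/NegativeNN sets, which are not reproduced here):

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Evaluate the same split under both kernels and report Acc/Pre/Rec.
for kernel in ("rbf", "poly"):
    pred = SVC(kernel=kernel).fit(X_tr, y_tr).predict(X_te)
    print(kernel,
          f"Acc={accuracy_score(y_te, pred):.3f}",
          f"Pre={precision_score(y_te, pred):.3f}",
          f"Rec={recall_score(y_te, pred):.3f}")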
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Absolute and relative rankings of the known FEB and GEFS+ genes (see Table 1) for LOOCVs using different artificial locus sizes (up to 2+1 genes). Results are shown for both gene expression datasets, HBA and GEO. Ranks among the best 10% are shown in boldface; ranks among the top 10 are additionally marked with a single star, and ranks among the top 3 with three stars.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Predictive power (PP) in leave-one-out validation of the respective optimal predictor selections for relative mid-parent heterosis under the two different testcross set-ups.