Adjust the parameters by extracting only the rows that contain no missing values. The best result was obtained when using ver4.
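For clarity, a minimal sketch of the row-filtering step described above (the file name and DataFrame are hypothetical):

import pandas as pd

df = pd.read_csv('ver4.csv')      # hypothetical file name for the ver4 data
complete_rows = df.dropna()       # keep only the rows without any missing values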
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Quantitative gait analysis is important for understanding the non-typical walking patterns associated with mobility impairments. Conventional linear statistical methods and machine learning (ML) models are commonly used to assess gait performance and related changes in the gait parameters. Nonetheless, explainable machine learning provides an alternative technique for distinguishing the significant and influential gait changes stemming from a given intervention. The goal of this work was to demonstrate the use of explainable ML models in gait analysis for prosthetic rehabilitation in both population- and sample-based interpretability analyses. Models were developed to classify amputee gait with two types of prosthetic knee joints. Sagittal-plane gait patterns of 21 individuals with unilateral transfemoral amputations were video-recorded, and 19 spatiotemporal and kinematic gait parameters were extracted and included in the models. Four ML models (logistic regression, support vector machine, random forest, and LightGBM) were assessed and tested for accuracy and precision. The Shapley Additive exPlanations (SHAP) framework was applied to examine global and local interpretability. The random forest model yielded the highest classification accuracy (98.3%). The SHAP framework quantified the level of influence of each gait parameter in the models; knee flexion-related parameters were found to be the most influential factors in determining the models' outcomes. The sample-based explainable ML provided additional insights over the population-based analyses, including an understanding of the effect of the knee type on the walking style of a specific sample and whether or not it agreed with the global interpretations. It was concluded that explainable ML models can be powerful tools for the assessment of gait-related clinical interventions, revealing important parameters that may be overlooked when using conventional statistical methods.
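As a minimal sketch of the population- and sample-based SHAP analysis described above (the names rf, for a fitted random forest classifier, and X, for a table of the 19 gait parameters, are assumptions, and the exact indexing can differ between shap versions):

import shap

explainer = shap.TreeExplainer(rf)          # rf: fitted random forest classifier (assumed)
shap_values = explainer(X)                  # X: DataFrame of the 19 gait parameters (assumed)

# Population-based (global) interpretability: overall influence of each gait parameter.
shap.plots.beeswarm(shap_values[:, :, 1])   # SHAP values for the positive class

# Sample-based (local) interpretability: parameter contributions for one participant.
shap.plots.waterfall(shap_values[0, :, 1])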
In this dataset, I've compiled and shared the best-fitted models, with parameters optimized using GridSearchCV. These model parameters have been carefully selected and tuned to provide strong predictive performance for the given task.
The dataset includes pickle files containing the best parameter settings for different machine learning algorithms. Here's what you'll find:
CatBoost Classifier Parameters (catboost.pkl): Unleash the power of gradient boosting with categorical features. The pickle file contains a model with tuned hyperparameters for the CatBoost model.
LightGBM Classifier Parameters (lgbm.pkl): Experience the efficiency and accuracy of LightGBM. The pickle file holds the model with optimized hyperparameters for the LightGBM model.
Random Forest Classifier Parameters (rf.pkl): Embrace the classic Random Forest algorithm. The pickle file presents the model with the best hyperparameters for the Random Forest model.
TabNet Classifier Parameters (tab_net.pkl): Dive into the world of TabNet's attention mechanisms. The pickle file showcases the ideal hyperparameters for the TabNet model. You can use this model directly for prediction, but make sure to use the same columns and the exact same feature engineering as in my notebook.
XGBoost Classifier Parameters (xgb.pkl): Harness the power of XGBoost's gradient boosting techniques. The pickle file includes the model with the finest hyperparameter settings for the XGBoost model.
These pickle files provide a snapshot of the hyperparameters that have yielded exceptional results in terms of accuracy and generalization. They are a valuable resource for anyone aiming to enhance their predictive modeling skills or participate in the Spaceship Titanic competition.
Feel free to explore and utilize these best-fitted model parameters in your analysis and modeling endeavors. Let's continue to learn, collaborate, and push the boundaries of data science together.
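As a minimal sketch of how one of these pickle files could be loaded and used for prediction (the preprocessing helper below is hypothetical, and the features must reproduce the notebook's feature engineering exactly):

import pickle
import pandas as pd

# Load one of the tuned models, e.g. the LightGBM classifier.
with open('lgbm.pkl', 'rb') as f:
    model = pickle.load(f)

# The test frame must use the same columns and the exact same feature
# engineering as the original notebook.
X_test = pd.read_csv('test.csv')
# X_test = apply_feature_engineering(X_test)   # hypothetical helper, not provided here

predictions = model.predict(X_test)
probabilities = model.predict_proba(X_test)[:, 1]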
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Optimization range of LightGBM parameters.
CC0 1.0 (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Various methods were used to tackle TPS Oct 2021, including LightGBM, CatBoost, and XGBoost, in which hyperparameters play an important role.
One way to find good hyperparameters is to use a hyperparameter optimization framework such as Optuna. Another way is to reuse parameters that have already been published and have produced good results.
In this dataset, I collected all the LightGBM, CatBoost, and XGBoost parameters shared during TPS Oct 2021.
All parameter sets were evaluated under the same conditions. I used the following setup to compute a score (ROC AUC) for each parameter set. This is not the final score, since it is measured on only 20% of the data, but it provides a consistent basis for comparing the parameter sets.
import datatable as dt
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Load the 20% sample and drop the 'id' column.
train_20 = dt.fread('sample_train_20.csv', columns=lambda cols: [col.name not in ('id',) for col in cols]).to_pandas()
y = train_20['target']
X = train_20.drop(columns=['target'])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=(1 / 5), random_state=59)

# model_from_csv and params_from_csv stand for the model class and the parameter
# set taken from the corresponding row of this dataset.
model = model_from_csv(**params_from_csv)
model.fit(
    X_train, y_train,
    eval_set=[(X_test, y_test)],
    eval_metric=['auc'],
    verbose=False,
    early_stopping_rounds=600
)
y_predicted = model.predict_proba(X_test)
auc = roc_auc_score(y_test, y_predicted[:, 1])
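The parameter sets collected here were found in various ways; as a rough, hypothetical sketch of how a comparable search could be run with Optuna (reusing the X_train/X_test split and the roc_auc_score import from the snippet above; the search space is purely illustrative):

import optuna
from lightgbm import LGBMClassifier

def objective(trial):
    # Illustrative search space; the ranges actually used by contributors may differ.
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 200, 2000),
        'learning_rate': trial.suggest_float('learning_rate', 1e-3, 0.3, log=True),
        'num_leaves': trial.suggest_int('num_leaves', 16, 256),
        'subsample': trial.suggest_float('subsample', 0.5, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1.0),
    }
    model = LGBMClassifier(**params)
    model.fit(X_train, y_train)
    return roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50)
print(study.best_params)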
I will try to create similar datasets for future competitions as well. If you find this dataset helpful, please do not forget to upvote it. Thank you in advance.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A fault diagnosis method for oil-immersed transformers based on principal component analysis and SSA-LightGBM is proposed to address the low diagnostic accuracy caused by the complexity of oil-immersed transformer faults. Firstly, data on gases dissolved in the oil are collected, and a 17-dimensional fault feature matrix is constructed using the uncoded ratio method. The feature matrix is then standardized to obtain joint features. Secondly, principal component analysis is used for feature fusion to eliminate information redundancy between variables and construct fused features. Finally, a transformer diagnostic model based on SSA-LightGBM is constructed, and ten-fold cross-validation is used to verify the classification ability of the model. The experimental results show that the proposed SSA-LightGBM model achieves an average fault diagnosis accuracy of 93.6% after SSA optimization, which is 3.6% higher than before optimization. Compared with the GA-LightGBM and GWO-LightGBM fault diagnosis models, SSA-LightGBM improves the diagnostic accuracy by 8.1% and 5.7%, respectively, verifying that this method can effectively improve the fault diagnosis performance of oil-immersed transformers and outperforms other similar methods.
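As a minimal, hypothetical sketch of the PCA-plus-LightGBM stage of this pipeline (the SSA hyperparameter search itself is not shown; the data and parameters below are placeholders, not the paper's):

import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score
from lightgbm import LGBMClassifier

X = np.random.rand(200, 17)           # stand-in for the 17-dimensional gas-ratio features
y = np.random.randint(0, 6, 200)      # stand-in for the fault classes

pipeline = make_pipeline(
    StandardScaler(),                  # standardize the feature matrix
    PCA(n_components=0.95),            # feature fusion: keep components explaining 95% of the variance
    LGBMClassifier(n_estimators=300, learning_rate=0.05),  # placeholder parameters (SSA-tuned in the paper)
)

# Ten-fold cross-validation, matching the paper's evaluation protocol.
scores = cross_val_score(pipeline, X, y, cv=10)
print(scores.mean())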
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Hyperparameters of LightGBM.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of the results of P-feature and TF-feature in LightGBM.
Introduction: Machine learning (ML) is an effective tool for predicting mental states and is a key technology in digital psychiatry. This study aimed to develop ML algorithms to predict the upper-tertile group of various anxiety symptoms based on multimodal data from virtual reality (VR) therapy sessions for patients with social anxiety disorder (SAD) and to evaluate their predictive performance across each data type.
Methods: This study included 32 SAD-diagnosed individuals and finalized a dataset of 132 samples from 25 participants. It utilized multimodal (physiological and acoustic) data from VR sessions that simulated social anxiety scenarios. The study employed the extended Geneva minimalistic acoustic parameter set for acoustic feature extraction and extracted statistical attributes from the time series-based physiological responses. We developed ML models that predict the upper-tertile group for various anxiety symptoms in SAD using Random Forest, extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost) models. The best parameters were explored through grid search or random search, and the models were validated using stratified cross-validation and leave-one-out cross-validation.
Results: The CatBoost model, using multimodal features, exhibited high performance, particularly for the Social Phobia Scale, with an area under the receiver operating characteristic curve (AUROC) of 0.852. It also showed strong performance in predicting cognitive symptoms, with the highest AUROC of 0.866 for the Post-Event Rumination Scale. For generalized anxiety, the LightGBM prediction for the State-Trait Anxiety Inventory-trait yielded an AUROC of 0.819. In the same analysis, models using only physiological features had AUROCs of 0.626, 0.744, and 0.671, whereas models using only acoustic features had AUROCs of 0.788, 0.823, and 0.754.
Conclusions: This study showed that an ML algorithm using integrated multimodal data can predict upper-tertile anxiety symptoms in patients with SAD with higher performance than acoustic or physiological data obtained during a VR session alone. The results can be used as evidence for personalized VR sessions and demonstrate the strength of multimodal data in clinical use.
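As a minimal, hypothetical sketch of the grid-search-with-stratified-cross-validation protocol described above, scored by AUROC (the feature matrix, labels, and parameter grid are placeholders):

import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from catboost import CatBoostClassifier

X = np.random.rand(132, 40)        # stand-in for the multimodal (acoustic + physiological) features
y = np.random.randint(0, 2, 132)   # stand-in for the upper-tertile label

param_grid = {'depth': [4, 6, 8], 'learning_rate': [0.03, 0.1]}  # hypothetical grid
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    CatBoostClassifier(verbose=0),
    param_grid,
    scoring='roc_auc',
    cv=cv,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)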
Timely prediction of memory failures is crucial for the stable operation of data centers. However, existing methods often rely on a single classifier, which can lead to inaccurate or unstable predictions. To address this, we propose a new ensemble model for predicting CE-driven memory failures, that is, failures in which a surge of correctable errors (CEs) in memory causes server downtime. Our model combines several strong-performing classifiers, such as Random Forest, LightGBM, and XGBoost, and assigns each a different weight based on its performance. By optimizing the decision-making process, the model improves prediction accuracy. We validate the model using memory data from Alibaba's data center, and the results show an accuracy of over 84%, outperforming existing single- and dual-classifier models and further confirming its strong predictive performance.
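As a minimal, hypothetical sketch of a performance-weighted soft-voting ensemble over the three classifiers named above (the data and the weights are illustrative only; the paper's actual weighting scheme may differ):

import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from lightgbm import LGBMClassifier
from xgboost import XGBClassifier

X = np.random.rand(500, 20)        # stand-in for memory telemetry features
y = np.random.randint(0, 2, 500)   # stand-in for the failure label

# In practice the weights would be derived from each classifier's validation performance.
ensemble = VotingClassifier(
    estimators=[
        ('rf', RandomForestClassifier(n_estimators=200)),
        ('lgbm', LGBMClassifier()),
        ('xgb', XGBClassifier()),
    ],
    voting='soft',
    weights=[1.0, 1.2, 1.1],       # illustrative performance-based weights
)
ensemble.fit(X, y)
failure_probability = ensemble.predict_proba(X)[:, 1]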
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MAPE of all methods without noise.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistical analysis of characteristics of the Framingham data set.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MAPE of all methods in the noise experiment.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Neural network hyperparameters.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of all optimal model results.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistical analysis of characteristics of the Z-Alizadeh Sani data set.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of three scoring systems (BRFSS_2015 dataset).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The results of the Shapiro–Wilk test.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Performance comparison of different models on the BRFSS_2015 data set.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistical analysis of characteristics of the BRFSS_2015 data set.