GNU General Public License v3.0 (GPL-3.0): https://www.gnu.org/licenses/gpl-3.0.html
Feature selection is an important technique for data mining before a machine learning algorithm is applied. Despite its importance, most studies of feature selection are restricted to batch learning. Unlike traditional batch learning methods, online learning represents a promising family of efficient and scalable machine learning algorithms for large-scale applications. Most existing studies of online learning require access to all the attributes/features of training instances. Such a classical setting is not always appropriate for real-world applications where data instances are of high dimensionality or where acquiring the full set of attributes/features is expensive. To address this limitation, we investigate the problem of Online Feature Selection (OFS), in which an online learner is only allowed to maintain a classifier involving a small, fixed number of features. The key challenge of OFS is how to make accurate predictions using a small, fixed number of active features, in contrast to the classical setup of online learning where all features can be used for prediction. We tackle this challenge by studying sparsity regularization and truncation techniques. Specifically, this article addresses two different tasks of online feature selection: (1) learning with full inputs, where the learner is allowed to access all features to decide the subset of active features, and (2) learning with partial inputs, where the learner may access only a limited number of features for each instance. We present novel algorithms for each of the two problems and give their performance analysis. We evaluate the proposed algorithms for online feature selection on several public datasets, and demonstrate their applications to real-world problems including image classification in computer vision and microarray gene expression analysis in bioinformatics. The encouraging results of our experiments validate the efficacy and efficiency of the proposed techniques.
Related Publications:
Hoi, S. C., Wang, J., Zhao, P., & Jin, R. (2012). Online feature selection for mining big data. In Proceedings of the 1st International Workshop on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications (pp. 93-100). ACM. http://dx.doi.org/10.1145/2351316.2351329 Full text available in InK: http://ink.library.smu.edu.sg/sis_research/2402/
Wang, J., Zhao, P., Hoi, S. C., & Jin, R. (2014). Online feature selection and its applications. IEEE Transactions on Knowledge and Data Engineering, 26(3), 698-710. http://dx.doi.org/10.1109/TKDE.2013.32 Full text available in InK: http://ink.library.smu.edu.sg/sis_research/2277/
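To make the truncation idea concrete, the sketch below implements a linear classifier updated by online (sub)gradient descent on the hinge loss, with the weight vector truncated to its B largest-magnitude entries after each round. This is a minimal illustration in R under our own naming (ofs_truncate, B, eta, lambda), not the paper's reference implementation.

```r
# Minimal sketch of truncation-based online feature selection for a
# linear classifier with hinge loss; labels y must be in {-1, +1}.
# Names (ofs_truncate, B, eta, lambda) are illustrative, not the
# paper's reference implementation.
ofs_truncate <- function(X, y, B = 10, eta = 0.2, lambda = 0.01) {
  d <- ncol(X)
  w <- numeric(d)
  for (t in seq_len(nrow(X))) {
    x <- X[t, ]
    if (y[t] * sum(w * x) < 1) {
      # mistake round: subgradient step with L2 shrinkage
      w <- (1 - eta * lambda) * w + eta * y[t] * x
    } else {
      # no loss: shrink only (sparsity regularization)
      w <- (1 - eta * lambda) * w
    }
    # truncation: keep only the B largest-magnitude weights
    if (sum(w != 0) > B) {
      keep <- order(abs(w), decreasing = TRUE)[seq_len(B)]
      w_new <- numeric(d)
      w_new[keep] <- w[keep]
      w <- w_new
    }
  }
  w  # at most B non-zero entries: the active feature set
}
```

The returned weight vector has at most B non-zero entries; those entries are the active features the learner is allowed to keep.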
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Feature preparation

Preprocessing was applied to the data, such as creating dummy variables and performing transformations (centering, scaling, Yeo-Johnson) using the preProcess() function from the "caret" package in R. The correlation among the variables was examined and no serious multicollinearity problems were found. A stepwise variable selection was performed using a logistic regression model. The final set of variables included:
- Demographic: age, body mass index, sex, ethnicity, smoking
- History of disease: heart disease, migraine, insomnia, gastrointestinal disease
- COVID-19 history: COVID vaccination, rashes, conjunctivitis, shortness of breath, chest pain, cough, runny nose, dysgeusia, muscle and joint pain, fatigue, fever, COVID-19 reinfection, and ICU admission

These variables were used to train and test various machine learning models.

Model selection and training

The data was randomly split into 80% training and 20% testing subsets. The "h2o" package in R version 4.3.1 was employed to implement different algorithms. AutoML was first used, which automatically explored a range of models with different configurations. Gradient Boosting Machines (GBM), Random Forest (RF), and Regularized Generalized Linear Model (GLM) were identified as the best-performing models on our data, and their parameters were fine-tuned. An ensemble method that stacked different models together was also used, as it can sometimes improve accuracy. The models were evaluated using the area under the curve (AUC) and C-statistics as diagnostic measures. The model with the highest AUC was selected for further analysis using the confusion matrix, accuracy, sensitivity, specificity, and F1 and F2 scores. The optimal prediction threshold was determined by plotting the sensitivity, specificity, and accuracy and choosing their point of intersection, as it balanced the trade-off between the three metrics. The model's predictions were also plotted and classified by quartile range: below the 1st quartile (very low), between the 1st and 2nd quartiles (low), between the 2nd and 3rd quartiles (moderate), and above the 3rd quartile (high).

Metric formulas:
- C-statistic: (TPR + TNR - 1) / 2
- Sensitivity/Recall: TP / (TP + FN)
- Specificity: TN / (TN + FP)
- Accuracy: (TP + TN) / (TP + TN + FP + FN)
- F1 score: 2 * (precision * recall) / (precision + recall)

Model interpretation

We used the variable importance plot, which is a measure of how much each variable contributes to the predictive power of a machine learning model. In the h2o package, variable importance for GBM and RF is calculated by measuring the decrease in the model's error when a variable is split on. The more a variable's splits decrease the error, the more important that variable is considered to be. The error is calculated as SE = MSE * N = VAR * N, and it is then scaled between 0 and 1 and plotted. We also used the SHAP summary plot, a graphical tool to visualize the impact of input features on the predictions of a machine learning model. SHAP stands for SHapley Additive exPlanations, a method that calculates the contribution of each feature to the prediction by averaging over all possible subsets of features [28]. The SHAP summary plot shows the distribution of the SHAP values for each feature across the data instances. We use the h2o.shap_summary_plot() function in R to generate the SHAP summary plot for our GBM model, passing the model object and the test data as arguments, and optionally specifying the columns (features) to include in the plot. The plot shows the SHAP values for each feature on the x-axis and the features on the y-axis. The color indicates whether the feature value is low (blue) or high (red). The plot also shows the distribution of the feature values as a density plot on the right.
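As a rough end-to-end illustration of this pipeline, the sketch below assumes a data frame dat with a binary column outcome; the object names, AutoML budget, and split seed are our own placeholders, not the authors' actual code.

```r
# Minimal sketch of the pipeline above; 'dat' and its 'outcome' column
# are illustrative placeholders, not the authors' actual objects.
library(caret)
library(h2o)

# Preprocessing: center, scale, and Yeo-Johnson transform the predictors
pp <- preProcess(dat[, setdiff(names(dat), "outcome")],
                 method = c("center", "scale", "YeoJohnson"))
dat_pp <- predict(pp, dat)

h2o.init()
hf <- as.h2o(dat_pp)
hf$outcome <- as.factor(hf$outcome)  # binary classification target

# 80/20 train-test split
splits <- h2o.splitFrame(hf, ratios = 0.8, seed = 1)
train <- splits[[1]]
test  <- splits[[2]]

# AutoML explores GBMs, RFs, GLMs, stacked ensembles, etc.
aml  <- h2o.automl(y = "outcome", training_frame = train,
                   max_models = 20, seed = 1)
best <- aml@leader

# Evaluate on the held-out test set, then inspect the model
h2o.auc(h2o.performance(best, newdata = test))
h2o.varimp_plot(best)              # variable importance (GBM/RF/GLM)
h2o.shap_summary_plot(best, test)  # SHAP summary (tree-based models)
```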
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the reproducibility package for the paper "On the Anatomy of Real-World R Code for Static Analysis", accepted at MSR 2024.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Improving the flame retardancy of polymeric materials used in engineering applications is an increasingly important strategy for limiting fire hazards. However, the wide variety of flame retardant polymeric nanocomposite compositions prevents quick identification of the optimal design for a specific application. In this study, we built a flame retardancy database of more than 800 polymeric nanocomposites, including information on polymer flammability, thermal stability, and nanofiller properties. We then applied five machine learning algorithms to predict the flame retardancy index (FRI) for different types of flame retardant polymeric nanocomposites. Among them, extreme gradient boosting regression gives the best prediction, with a coefficient of determination (R2) of 0.94 and a root-mean-square error of 0.17. In addition, we studied how the physical features of polymeric nanocomposites affect flame retardancy using the correlation matrix and feature importance plot, which in turn were used to guide the design of polymeric nanocomposites for flame retardant applications. Following these guidelines, a high-performance flame retardant polymeric nanocomposite was designed and synthesized, and the experimental FRI was compared with the machine learning prediction (6% prediction error). This result demonstrates fast identification of the flame retardancy of polymeric nanocomposites without large-scale fire tests, which could accelerate the design of functional polymeric nanocomposites in the flame retardant field.
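As a rough sketch of how such a gradient boosting regression could be set up, the snippet below fits an XGBoost regressor to a descriptor matrix X and an FRI vector y and draws a feature importance plot. The split, hyperparameters, and names are illustrative assumptions, not the study's actual configuration.

```r
# Rough sketch of an XGBoost regression for a flame retardancy index;
# X (descriptor matrix), y (FRI values), and all hyperparameters are
# illustrative assumptions, not the study's actual configuration.
library(xgboost)

set.seed(1)
idx    <- sample(nrow(X), floor(0.8 * nrow(X)))
dtrain <- xgb.DMatrix(X[idx, , drop = FALSE],  label = y[idx])
dtest  <- xgb.DMatrix(X[-idx, , drop = FALSE], label = y[-idx])

model <- xgb.train(params = list(objective = "reg:squarederror",
                                 eta = 0.1, max_depth = 4),
                   data = dtrain, nrounds = 300)

pred <- predict(model, dtest)
rmse <- sqrt(mean((pred - y[-idx])^2))
r2   <- 1 - sum((pred - y[-idx])^2) / sum((y[-idx] - mean(y[-idx]))^2)

# Feature importance plot, analogous to the one used to guide design
xgb.plot.importance(xgb.importance(model = model))
```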
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Metal–organic frameworks (MOFs), an emerging class of nanoporous materials, have drawn considerable attention as promising adsorbents for gas separations. Among various separation applications, CO2/CO separation is of particular interest owing to its industrial relevance. While searching for promising MOFs from tens of thousands of candidates represents a great challenge, this study conducts large-scale molecular simulations to identify top-performing CO2 adsorbents, followed by investigating structure–property relationships for their design. Optimal MOFs are found to possess features such as metal nodes of greater metallic charges and dipole moments with a relatively confined pore structure. With the large-scale data at our disposal, machine learning models capable of predicting the CO2-to-CO selectivity and adsorption uptakes are also established. Specifically, three algorithms including support vector regression (SVR), extreme gradient boosting (XGBoost), and random forest (RF) models are employed. The results show that the RF algorithm demonstrates the best accuracy, and the r value for the predicted CO2-to-CO selectivity (S) can be as large as ∼0.88. The relative importance of the adopted features is also investigated with results suggesting that the adsorption of CO2 initiates more preferentially than that of CO due to the stronger van der Waals interaction and electrostatic contribution between CO2 and the metal sites. Finally, a design rule is proposed for the optimal design of CO2-selective materials. Overall, this work demonstrates a successful hybrid approach combining molecular simulations and machine learning for screening highly CO2/CO selective MOFs and offering insights into the design of optimal adsorbents.
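For the machine learning side, a random forest regression of the kind reported could look roughly like the following, assuming a data frame mofs of structural and chemical descriptors with a numeric selectivity column; all names are illustrative placeholders, not the study's data.

```r
# Rough sketch of a random forest model for CO2-to-CO selectivity,
# assuming a data frame 'mofs' of MOF descriptors with a numeric
# 'selectivity' column; all names are illustrative placeholders.
library(randomForest)

set.seed(1)
idx <- sample(nrow(mofs), floor(0.8 * nrow(mofs)))
rf  <- randomForest(selectivity ~ ., data = mofs[idx, ],
                    ntree = 500, importance = TRUE)

# Pearson r between predicted and simulated selectivity (~0.88 reported)
pred <- predict(rf, mofs[-idx, ])
cor(pred, mofs$selectivity[-idx])

# Relative importance of the adopted features
varImpPlot(rf)
```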
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Parameter setting table for training feature extraction network.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The performance of electrochemical sensors is influenced by various factors. To enhance the effectiveness of these sensors, it is crucial to find the right balance among these factors. Researchers and engineers continually explore innovative approaches to enhance sensitivity, selectivity, and reliability. Machine learning (ML) techniques facilitate the analysis and predictive modeling of sensor performance by establishing quantitative relationships between parameters and their effects. This work presents a case study on developing a molecularly imprinted polymer (MIP)-based sensor for detecting doxorubicin (Dox), emphasizing the use of ML-based ensemble models to improve performance and reliability. Four ML models, including Decision Tree (DT), eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and K-Nearest Neighbors (KNN), are used to evaluate the effect of each parameter on prediction performance, using the SHapley Additive exPlanations (SHAP) method to determine feature importance. Based on this analysis, removing a less influential feature and introducing a new feature significantly improved the model's predictive capabilities. Applying the min-max scaling technique ensures that all features contribute proportionally to the model learning process. Additionally, multiple ML models, including Linear Regression (LR), KNN, DT, RF, Adaptive Boosting (AdaBoost), Gradient Boosting (GB), Support Vector Regression (SVR), XGBoost, Bagging, Partial Least Squares (PLS), and Ridge Regression, are applied to the data set, and their performance in predicting the sensor output current is compared. To further enhance prediction performance, a novel ensemble model is proposed that integrates DT, RF, GB, XGBoost, and Bagging regressors, leveraging their combined strengths to offset individual weaknesses. The main benefit of this work lies in its ability to enhance MIP-based sensor performance by developing a novel stacking regressor ensemble model, which improves prediction performance and reliability. This methodology is broadly applicable to the development of other sensors with different transducers and sensing elements. Through extensive simulation results, the proposed stacking regressor ensemble model demonstrated superior predictive performance compared to individual ML models. The model achieved an R-squared (R2) of 0.993, significantly reducing the root-mean-square error (RMSE) to 0.436 and the mean absolute error (MAE) to 0.244. These improvements enhanced the sensitivity and reliability of the MIP-based electrochemical sensor, demonstrating a substantial performance gain over individual ML models.
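A stacking regressor of this kind can be sketched as follows: base regressors are trained on a training split, their predictions on a holdout split become meta-features, and a simple meta-learner combines them. For brevity the sketch uses only DT, RF, and GB base learners (the paper's ensemble also stacks XGBoost and Bagging), with min-max scaling applied first; dat and the response name current are illustrative assumptions.

```r
# Sketch of a stacking regressor: base learners' holdout predictions
# become meta-features for a linear meta-learner. Only DT, RF, and GB
# are shown for brevity; 'dat' and its response column 'current' are
# illustrative assumptions, not the paper's data.
library(rpart)
library(randomForest)
library(gbm)

set.seed(1)
idx   <- sample(nrow(dat), floor(0.7 * nrow(dat)))
train <- dat[idx, ]
hold  <- dat[-idx, ]

# Min-max scaling (fit on train) so features contribute proportionally
rng <- lapply(train[, names(train) != "current"], range)
scale_mm <- function(df) {
  for (v in names(rng))
    df[[v]] <- (df[[v]] - rng[[v]][1]) / (rng[[v]][2] - rng[[v]][1])
  df
}
train <- scale_mm(train)
hold  <- scale_mm(hold)

# Base regressors
dt <- rpart(current ~ ., data = train)
rf <- randomForest(current ~ ., data = train, ntree = 500)
gb <- gbm(current ~ ., data = train, distribution = "gaussian",
          n.trees = 500)

# Meta-features: each base learner's prediction on the holdout split
meta <- data.frame(dt = predict(dt, hold),
                   rf = predict(rf, hold),
                   gb = predict(gb, hold, n.trees = 500),
                   current = hold$current)

stack <- lm(current ~ ., data = meta)  # simple linear meta-learner
```

In practice the meta-features are often built from out-of-fold predictions rather than a single holdout split, which reduces the risk of the meta-learner overfitting.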
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Hydrogen energy holds promise for controlling emissions but is limited by production cost and method. Chemical looping hydrogen production (CLHP) provides a more efficient and environmentally sustainable route to high-purity hydrogen than conventional methods. Yet CLHP involves a series of operational variables, and optimizing the operating conditions is the critical issue for large-scale hydrogen production. In this study, support vector machine (SVM), decision tree (DT), random forest (RF), artificial neural network (ANN), and physics-informed neural network (PINN) models are developed to predict hydrogen production rates by analyzing multiple process variables. Through analysis of the database and experiments, we integrated physical consistency as prior physical knowledge into the PINN to reduce its dependence on data. All models are tuned through hyperparameter optimization for best performance. A comparison of the five machine learning models reveals that the DT and RF models exhibit a characteristic step-like pattern in their predictions, while the SVM and ANN models produce outputs that often diverge from the expected trend. The PINN model performs well, with R2, mean squared error, and mean absolute percentage error scores of 0.882, 1.228, and 18.1%, respectively, and its results are highly interpretable owing to the physics-informed structure. The CLHP process is then studied, and relationships between hydrogen yield and operating temperature, gas flow rate, and mass fraction of iron oxide are established. This work shows the differences in the prediction curves between the models. By training various general models and comparing their predictive performance on chemical looping data, we can gain valuable insights to guide subsequent predictions for CLHP, which will benefit the design of oxygen carriers and the optimization of the CLHP process.
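The reported scores follow their standard definitions; for reference, a minimal sketch in R, with y the observed and yhat the predicted hydrogen production rate (illustrative names):

```r
# Standard definitions of the reported scores, assuming vectors of
# observed (y) and predicted (yhat) hydrogen production rates.
r2   <- function(y, yhat) 1 - sum((y - yhat)^2) / sum((y - mean(y))^2)
mse  <- function(y, yhat) mean((y - yhat)^2)
mape <- function(y, yhat) mean(abs((y - yhat) / y)) * 100  # percent
```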
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Model Performance Metrics After Feature Fusion and Normalization.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
mAP comparison of three kinds of industrial equipment detection by the ROMS R-CNN algorithm with different structures and functions.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of parameters and complexity of different network structures in feature extraction.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Performance Evaluation of Feature Fusion and Normalization, AFR, and Hybrid Methods.