This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Stacking Ensemble Technique for Multiple Medical Datasets Classification: A Generalized Prediction Model
19
Citations
7
Authors
2023
Year
Abstract
Precise early detection of diseases can reduce their progression and lethality, but handling complex medical data is not straightforward. Machine Learning (ML) can help the research community considerably here by predicting disease status at early stages. This study aimed to develop a generalized ML-based model that can classify frequently occurring diseases with better performance and reliability. Four datasets were collected from different repositories: the MRI and Alzheimer's Dataset (MAD), the SPECTF Heart Dataset (SHD), the Early Stage Diabetes Dataset (ESDD), and the Lower Back Pain Dataset (LBPD). They were analyzed and evaluated according to classifier performance in order to propose the prediction model. Numerous studies on this topic are available, but there is still room for improvement. To overcome the shortcomings of previous research, we began with data preprocessing and then applied six classification techniques: Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Decision Tree (DT), Random Forest (RF), and Extra Tree (ET). Each was evaluated with 10-fold cross-validation after the best parameters identified by randomized search were assigned manually. The three best-performing classifiers (LR, RF, and SVM) were then selected, with their hyper-parameters, to build an ensemble model using the stacking ensemble technique. Our generalized stacking ensemble model outperformed all other classifiers used in this study, as well as those reported by other researchers, in terms of accuracy, reaching 96.97% on MAD, 95.08% on SHD, 98.90% on ESDD, and 91.34% on LBPD.
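The stacking setup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data in place of the four medical datasets, leaves every hyper-parameter at or near its default instead of running a randomized search, and picks logistic regression as the meta-learner, which the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): MAD, SHD, ESDD and LBPD are not used here.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)

# The three base learners named in the abstract: LR, RF and SVM.
# Scaling inside a pipeline is an illustrative preprocessing choice.
base_learners = [
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("svm", make_pipeline(StandardScaler(), SVC())),
]

# The meta-learner is trained on out-of-fold predictions of the base learners.
# Using logistic regression here is an assumption.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# 10-fold cross-validation with accuracy as the metric, as in the abstract.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(stack, X, y, cv=cv, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"Mean 10-fold accuracy: {mean_accuracy:.4f}")
```

Accuracy on this synthetic data says nothing about the figures reported above. To move closer to the described workflow, each base learner could first be tuned with `RandomizedSearchCV` and the resulting parameters passed into the stack.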
Similar works
Biostatistical Analysis
1996 · 35,448 citations
UCI Machine Learning Repository
2007 · 24,318 citations
An introduction to ROC analysis
2005 · 20,758 citations
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7,139 citations
A method of comparing the areas under receiver operating characteristic curves derived from the same cases.
1983 · 7,071 citations