This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
EVALUATING LOGISTIC REGRESSION, SVM, KNN, AND ENSEMBLE MODELS FOR ACCURATE HEART DISEASE RISK PREDICTION
Citations: 0
Authors: 2
Year: 2026
Abstract
Cardiovascular disease remains the most significant contributor to global mortality, which makes early and precise risk assessment important within preventive healthcare frameworks. As clinical data has become more widely available, machine learning approaches have increasingly been adopted to assist medical decision-making, particularly for interpreting complex, high-dimensional health information. This research investigates how well six supervised machine learning models predict the likelihood of cardiovascular disease: Logistic Regression, Support Vector Machine, k-Nearest Neighbors, Decision Tree, Random Forest, and Gradient Boosting. The study is based on the Cleveland Heart Disease dataset from the UCI Machine Learning Repository, which includes 303 patient samples with a total of 76 recorded attributes. From this dataset, 14 clinically significant variables frequently reported in previous studies were selected for analysis. Because the dataset is relatively small and may contain redundant or low-impact features, a feature selection approach was applied to improve model robustness, reduce overfitting, and enhance interpretability. Data preparation involved cleaning, normalization, feature selection, and splitting the data into training and test sets. The models were evaluated using accuracy, precision, recall, and F1-score. The experimental results show that the Random Forest and Logistic Regression models achieved the highest predictive performance, followed by k-Nearest Neighbors and Support Vector Machine. These results indicate that supervised machine learning techniques, when supported by appropriate feature selection methods, are effective as decision-support tools for the early detection of cardiovascular disease.
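The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary label encoding (1 = disease present, 0 = absent), which the abstract does not state. The function name `evaluate_binary` and the toy label lists are hypothetical and are not taken from the paper or its results.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate_binary(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels.

    Assumption: 1 marks the positive class (disease present).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no positive predictions
    # or no positive samples), a common convention rather than a rule.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return BinaryMetrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Hypothetical toy labels for illustration only, not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(evaluate_binary(y_true, y_pred))
```

In a screening setting, recall is often watched most closely, since a false negative corresponds to a patient with disease who is not flagged. Accuracy alone can hide such misses when the classes are imbalanced.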
Similar works
Biostatistical Analysis
1996 · 35,449 citations
UCI Machine Learning Repository
2007 · 24,319 citations
An introduction to ROC analysis
2005 · 20,877 citations
Prediction of Coronary Heart Disease Using Risk Factor Categories
1998 · 9,594 citations
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7,166 citations