This is an overview page with metadata for this scholarly work. The full article is available from the publisher.
Cardiovascular Disease Risk Prediction Models
Citations: 0
Authors: 1
Year: 2025
Abstract
Introduction. Non-communicable diseases, especially cardiovascular pathologies, remain the leading cause of mortality worldwide, creating a significant burden on society, the economy, and healthcare systems. Heart attacks and strokes are particularly dangerous because they often develop suddenly and without symptoms, which complicates timely diagnosis and prevention. Identifying patients at increased risk can improve disease prevention and clinical outcomes and enhance the quality of medical care. In recent years, growing attention has been directed toward the use of artificial intelligence, machine learning, and big data processing techniques, particularly the analysis of unstructured medical texts, to improve the accuracy of medical predictions. The analysis of medical reports, patient histories, and other textual information can reveal hidden patterns that are inaccessible to traditional manual review and can greatly contribute to personalized treatment strategies.
The aim of the study is to improve the model for predicting the risk of myocardial infarction by introducing new methods of preprocessing medical reports and feature selection. In addition, the study aims to develop a new model for determining the risk level of cerebral vascular lesions. The work focuses on integrating these models into modern information systems used in medical institutions and testing them on real clinical datasets.
Results. The study proposed and evaluated several approaches for improving myocardial infarction risk prediction, including text translation, lemmatization, and automated extraction of medical terms. Building on an extended version of the existing methodology, a new model was developed to predict cerebral vascular lesions. The analysis was conducted using the depersonalized "Eskulap" database, which contains records of more than 22,000 patients. The improved models demonstrated strong performance, achieving 80% accuracy (AUC = 0.898) for myocardial infarction and 86% accuracy (AUC = 0.92) for cerebral vascular lesions. The new model has already been successfully implemented in a medical center.
Conclusions. The proposed methods for improving the analysis of medical texts, including preprocessing, automated selection of relevant features, lemmatization, and adaptation to language-specific characteristics, enhanced the quality of risk prediction for cardiovascular and cerebrovascular diseases. The development of the new model for predicting cerebral vascular lesions further confirmed the effectiveness of this approach, and its implementation demonstrates the feasibility of integrating such solutions into clinical, insurance, and scientific practice. The model supports personalized prevention and treatment, facilitates the identification of high-risk groups, optimizes resource allocation, and improves clinical decision-making. It may also be used for calculating insurance rates or guiding targeted funding by governmental and municipal institutions. The model also has strong potential for further development through the integration of additional data sources (such as laboratory indicators, instrumental examination results, and medical images), the adoption of more advanced ensemble algorithms, and deeper incorporation of expert assessments. Taken together, these results reinforce the conclusion that machine learning is a promising tool for analyzing unstructured medical texts, supporting clinical decision-making, and improving overall healthcare efficiency.
Keywords: non-communicable diseases, myocardial infarction, stroke, machine learning, risk prediction, Multinomial Naive Bayes, medical texts, data analysis.
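The abstract lists Multinomial Naive Bayes among its keywords and reports AUC values, but it does not describe the pipeline in detail. The following is therefore only a minimal sketch of how such a text-based risk classifier and its AUC could be computed, not the authors' implementation. The function names (`tokenize`, `train_mnb`, `risk_score`, `auc`), the smoothing parameter `alpha`, and the toy reports are all assumptions made up for illustration; they are not taken from the "Eskulap" data, and a real pipeline would lemmatize and select features before counting tokens.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase whitespace split; a stand-in for real lemmatization (assumption).
    return text.lower().split()


def train_mnb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with additive (Laplace) smoothing."""
    vocab = {tok for doc in docs for tok in tokenize(doc)}
    class_docs = Counter(labels)
    token_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        token_counts[label].update(tokenize(doc))
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(token_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_lik": {
                tok: math.log((token_counts[label][tok] + alpha) / total)
                for tok in vocab
            },
        }
    return model


def risk_score(model, text, positive=1, negative=0):
    """Log-odds of the positive class; tokens outside the vocabulary are skipped."""
    def log_joint(label):
        params = model[label]
        return params["log_prior"] + sum(
            params["log_lik"][tok]
            for tok in tokenize(text)
            if tok in params["log_lik"]
        )
    return log_joint(positive) - log_joint(negative)


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy reports (1 = high risk, 0 = low risk), invented for this sketch.
train_docs = [
    "chest pain hypertension smoker",
    "hypertension diabetes chest pain",
    "routine checkup no complaints",
    "seasonal allergy routine checkup",
]
train_labels = [1, 1, 0, 0]
test_docs = ["chest pain smoker", "routine allergy checkup"]
test_labels = [1, 0]

model = train_mnb(train_docs, train_labels)
scores = [risk_score(model, doc) for doc in test_docs]
auc_value = auc(scores, test_labels)
print(scores, auc_value)
```

The score is a log-odds value, so a positive number means the high-risk class is more likely under the model. AUC is computed from the ranking of these scores rather than from a fixed threshold, which is why an abstract can report accuracy and AUC as separate figures.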
Similar works
Biostatistical Analysis
1996 · 35,446 citations
UCI Machine Learning Repository
2007 · 24,290 citations
An introduction to ROC analysis
2005 · 20,692 citations
The use of the area under the ROC curve in the evaluation of machine learning algorithms
1997 · 7,122 citations
A method of comparing the areas under receiver operating characteristic curves derived from the same cases.
1983 · 7,066 citations