OpenAlex · Updated hourly · Last updated: 20.05.2026, 04:55

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Machine learning for predicting the prognosis of patients with thymoma and thymic carcinoma

2025 · 1 citation · Journal of Thoracic Disease · Open Access

Citations: 1 · Authors: 12 · Year: 2025

Abstract

Background: Thymoma and thymic carcinoma are the most common tumors of the anterior mediastinum. However, there is little research on applying machine learning (ML) approaches to the prognostic prediction of thymoma and thymic carcinoma. This study aims to develop ML-based predictive models that accurately forecast the 5-year survival of patients with thymoma and thymic carcinoma.

Methods: Patients with malignant thymic neoplasms were identified in the Surveillance, Epidemiology, and End Results (SEER) 17 database, and their demographic and clinicopathological characteristics were collected. ML classifiers, including elastic net regularized logistic regression, random forest (RF), non-linear support vector machine (SVM), extreme gradient boosting (XGBoost), and categorical boosting (CatBoost), were trained. The hyperparameters of the algorithms were optimized by a grid search with five repeats of 10-fold cross-validation. Ensemble models were built from the three algorithms with the highest area under the receiver operating characteristic (ROC) curve (AUC) in the validation set. The best-performing model among the single models and the ensemble model was selected as the final model. Calibration curves and decision curves were used to evaluate calibration performance and clinical utility. For comparison, we constructed a baseline logistic regression model consisting of age and Masaoka stage.

Results: After data cleaning, 1,363 patients and 841 patients were included in the overall survival (OS) dataset and the disease-specific survival (DSS) dataset, respectively. CatBoost [AUC: 0.755; 95% confidence interval (CI): 0.698–0.811] had the best performance in the OS prediction for the original dataset. The ensemble model achieved the highest prognostic efficiency for the original dataset, with an AUC of 0.833 (95% CI: 0.765–0.901). Calibration showed favorable goodness of fit, which was further verified with the Hosmer-Lemeshow test (CatBoost: χ²=12.63, P=0.13; ensemble model: χ²=7.61, P=0.47). The decision curve showed that the final model provided a high net benefit. The model significantly distinguished patients by prognosis (all P values <0.001). Finally, World Health Organization (WHO) histological classification, Masaoka stage, and age were the variables that contributed significantly to the models' prediction of OS and DSS.

Conclusions: We trained ML-based predictive models that could accurately predict the 5-year OS and DSS of patients with thymoma and thymic carcinoma.
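
The abstract reports a decision curve but does not spell out how net benefit is computed. As a rough illustration only, the sketch below uses the standard decision-curve definition, net benefit = TP/n − (FP/n) · pt/(1 − pt), where pt is the threshold probability. The function names, the toy labels and the toy risks are hypothetical and are not taken from the study; the label 1 is assumed to mark the event of interest.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    Standard decision-curve definition: TP/n - (FP/n) * pt / (1 - pt).
    Assumes y_true uses 1 for the event and 0 otherwise.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold: how many false positives one true positive is worth.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: act on every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data, NOT from the study: 1 = event, 0 = no event.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]

for pt in (0.2, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"threshold={pt}: model={model_nb:.3f}, treat-all={all_nb:.3f}")
```

A decision curve plots these values over a range of thresholds. A model is usually considered clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.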

Similar works