This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Identification of relevant features using SEQENS to improve supervised machine learning models predicting AML treatment outcome
Citations: 2
Authors: 13
Year: 2025
Abstract
BACKGROUND AND OBJECTIVE: This study has two main objectives. The first is to evaluate a feature selection methodology based on SEQENS, an algorithm for identifying relevant variables. The second is to validate machine learning models that predict the risk of complications in patients with acute myeloid leukemia (AML) using data available at diagnosis. Predictions are made at three time points: 90 days, six months, and one year post-diagnosis. These objectives are fundamental steps toward a tool that assists clinicians in therapeutic decision-making and provides insight into the risk factors associated with AML complications.
METHODS: A dataset of 568 patients, including demographic, clinical, genetic (variant allele frequency, VAF), and cytogenetic information, was created by combining data from Hospital 12 de Octubre (Madrid, Spain) and Instituto de Investigación Sanitaria La Fe (Valencia, Spain). Feature selection based on an enhanced version of SEQENS was conducted for each time point. Four classifiers (XGBoost, Multi-Layer Perceptron, Logistic Regression, and Decision Tree) were then compared to assess the impact of feature selection on model performance.
RESULTS: SEQENS identified different relevant features for each prediction horizon, with Age, TP53, -7/7q, and EZH2 consistently relevant across all time points. The models were evaluated using 5-fold cross-validation; XGBoost achieved the highest average ROC-AUC scores: 0.81, 0.84, and 0.82 for the 90-day, 6-month, and 1-year predictions, respectively. In general, performance remained stable or improved after applying SEQENS-based feature selection. Evaluation on an external test set of 54 patients yielded ROC-AUC scores of 0.72 (90-day), 0.75 (6-month), and 0.68 (1-year).
CONCLUSIONS: The models achieved performance levels that suggest they could serve as therapeutic decision support tools at different times after diagnosis. The selected variables align with the European LeukemiaNet (ELN) 2022 risk classification, and the SEQENS-based feature selection effectively reduced the feature set while maintaining prediction accuracy.
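The evaluation protocol named in the abstract (average ROC-AUC over 5-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the data are synthetic, the function names are hypothetical, and the nearest-centroid scorer is a simple stand-in, not one of the four classifiers used in the study and not SEQENS.

```python
import random
from statistics import mean


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with a roughly equal class ratio in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def centroid_scorer(X, y):
    """Hypothetical stand-in model: higher score means closer to the positive-class centroid."""
    def centroid(cls):
        rows = [x for x, label in zip(X, y) if label == cls]
        return [mean(col) for col in zip(*rows)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    c_pos, c_neg = centroid(1), centroid(0)
    return lambda x: dist2(x, c_neg) - dist2(x, c_pos)


def cross_validated_auc(X, y, k=5, seed=0):
    """Fit on k-1 folds, score the held-out fold, and average ROC-AUC over the k folds."""
    aucs = []
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        score = centroid_scorer([X[i] for i in train_idx], [y[i] for i in train_idx])
        aucs.append(roc_auc([y[i] for i in test_idx], [score(X[i]) for i in test_idx]))
    return mean(aucs)


def make_synthetic(n=200, seed=1):
    """Synthetic two-feature data: feature 0 is informative, feature 1 is pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


X, y = make_synthetic()
print(round(cross_validated_auc(X, y), 3))
```

Stratifying the folds keeps every held-out fold populated with both classes, which ROC-AUC requires. Comparing models with and without a selected feature subset would amount to calling `cross_validated_auc` on the corresponding columns of `X`.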
Related works
The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia
2016 · 10,150 citations
Human acute myeloid leukemia is organized as a hierarchy that originates from a primitive hematopoietic cell
1997 · 6,923 citations
Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel
2016 · 5,838 citations
Proposals for the Classification of the Acute Leukaemias French-American-British (FAB) Co-operative Group
1976 · 5,594 citations
Genomic and Epigenomic Landscapes of Adult De Novo Acute Myeloid Leukemia
2013 · 5,119 citations