Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.
Clinical reasoning from real-world oncology reports using large language models
0
Zitationen
3
Autoren
2025
Jahr
Abstract
Objective: To evaluate the ability of large language models (LLMs) to perform structured information extraction and guideline-based clinical inferences from radiology and pathology reports in real-world oncology. Methods: We constructed a Question Answering (Q&A) benchmark dataset using 3650 radiological and 588 pathological reports from 1632 patients. The tasks included direct extraction of genomic and histological findings, as well as clinical reasoning tasks, such as Response Evaluation Criteria in Solid Tumors (RECIST)-based tumor response classification and American Joint Committee on Cancer (AJCC)-based tumor-node-metastasis (TNM) staging. We compared the performance of the Gemma family of open-source LLMs (Gemma 4B, a lightweight 4-billion parameter model, and Gemma 12B, a larger 12-billion parameter model) with and without structured reasoning prompts designed according to clinical guidelines. Results: The 12B model achieved high performance in direct extraction tasks from pathology reports, with F1-score ranging from 92.6 to 93.3 across genomic and histological variables. Furthermore, when guided by structured reasoning prompts, it also showed substantial improvements in reasoning tasks, achieving an F1-score of 81.5 (95% CI: 79.8-83.3) for tumor response, 74.3 (95% CI: 70.8-77.8) for T-stage, 87.1 (95% CI: 85.1-89.0) for N-stage, and 90.8 (95% CI: 89.1-92.2) for M-stage. In contrast, the 4B model showed inconsistent performance and was sometimes degraded under reasoning prompts. Conclusion: This study shows that LLMs can perform complex guideline-based clinical reasoning using real-world radiology reports. By combining the RECIST/AJCC criteria with structured prompts, we demonstrated how LLMs can move beyond surface-level extraction to support nuanced inference in oncology, with implications for future clinical applications.