This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Generative Large Language Models for Detection of Speech Recognition Errors in Radiology Reports

2024 · 58 citations · Radiology Artificial Intelligence · Open Access

Abstract

This study evaluated the ability of generative large language models (LLMs) to detect speech recognition errors in radiology reports. A dataset of 3233 CT and MRI reports was assessed by radiologists for speech recognition errors. Errors were categorized as clinically significant or not clinically significant. Performances of five generative LLMs (GPT-3.5-turbo, GPT-4, text-davinci-003, Llama-v2–70B-chat, and Bard) were compared in detecting these errors, using manual error detection as the reference standard. Prompt engineering was used to optimize model performance. GPT-4 demonstrated high accuracy in detecting clinically significant errors (precision, 76.9%; recall, 100%; F1 score, 86.9%) and not clinically significant errors (precision, 93.9%; recall, 94.7%; F1 score, 94.3%). Text-davinci-003 achieved F1 scores of 72% and 46.6% for clinically significant and not clinically significant errors, respectively. GPT-3.5-turbo obtained 59.1% and 32.2% F1 scores, while Llama-v2–70B-chat scored 72.8% and 47.7%. Bard showed the lowest accuracy, with F1 scores of 47.5% and 20.9%. GPT-4 effectively identified challenging errors of nonsense phrases and internally inconsistent statements. Longer reports, resident dictation, and overnight shifts were associated with higher error rates. In conclusion, advanced generative LLMs show potential for automatic detection of speech recognition errors in radiology reports.

Keywords: CT, Large Language Model, Machine Learning, MRI, Natural Language Processing, Radiology Reports, Speech, Unsupervised Learning

Supplemental material is available for this article.

© RSNA, 2024
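The precision, recall, and F1 figures quoted in the abstract can be cross-checked with the standard definition of F1 as the harmonic mean of precision and recall. The sketch below is only an illustration under that assumption; the function name `f1_score` and the `reported` table are hypothetical helpers, not code from the article, and the page does not state how the authors computed their scores.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # convention when both are zero; avoids division by zero
    return 2 * precision * recall / (precision + recall)


# GPT-4 figures quoted in the abstract, converted from percentages to fractions:
# (precision, recall, reported F1)
reported = {
    "clinically significant": (0.769, 1.000, 0.869),
    "not clinically significant": (0.939, 0.947, 0.943),
}

for label, (precision, recall, f1_reported) in reported.items():
    f1 = f1_score(precision, recall)
    print(f"{label}: computed F1 = {f1:.3f}, reported F1 = {f1_reported:.3f}")
```

Under this definition, the computed values agree with the reported GPT-4 F1 scores to the rounding used in the abstract.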

Institutions

Western Health (AU)

Topics

Radiology practices and education
Artificial Intelligence in Healthcare and Education
Radiomics and Machine Learning in Medical Imaging
