Dies ist eine Übersichtsseite mit Metadaten zu dieser wissenschaftlichen Arbeit. Der vollständige Artikel ist beim Verlag verfügbar.

Systematic analysis of ChatGPT, Google search and Llama 2 for clinical decision support tasks

2024·184 Zitationen·Nature CommunicationsOpen Access

Volltext beim Verlag öffnen

184

Zitationen

Autoren

2024

Jahr

Abstract

It is likely that individuals are turning to Large Language Models (LLMs) to seek health advice, much like searching for diagnoses on Google. We evaluate clinical accuracy of GPT-3·5 and GPT-4 for suggesting initial diagnosis, examination steps and treatment of 110 medical cases across diverse clinical disciplines. Moreover, two model configurations of the Llama 2 open source LLMs are assessed in a sub-study. For benchmarking the diagnostic task, we conduct a naïve Google search for comparison. Overall, GPT-4 performed best with superior performances over GPT-3·5 considering diagnosis and examination and superior performance over Google for diagnosis. Except for treatment, better performance on frequent vs rare diseases is evident for all three approaches. The sub-study indicates slightly lower performances for Llama models. In conclusion, the commercial LLMs show growing potential for medical question answering in two successive major releases. However, some weaknesses underscore the need for robust and regulated AI models in health care. Open source LLMs can be a viable option to address specific needs regarding data privacy and transparency of training.

Autoren

Institutionen

University of Münster(DE)

Themen

Artificial Intelligence in Healthcare and EducationMachine Learning in HealthcareTopic Modeling

Volltext beim Verlag öffnen

Systematic analysis of ChatGPT, Google search and Llama 2 for clinical decision support tasks

Abstract

Ähnliche Arbeiten

Autoren

Institutionen

Themen