This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

Performance of Large Language Models in Supporting Medical Diagnosis and Treatment: An Evaluation on the 2024 PNA Exam

2025 · 0 citations

Abstract

The integration of Large Language Models (LLMs) into healthcare holds significant potential to enhance diagnostic accuracy and support medical treatment planning. This study evaluates the performance of a range of contemporary LLMs on the 2024 Portuguese National Exam for medical specialty access (PNA), a standardized medical knowledge assessment. Our results highlight considerable variation in accuracy and cost-effectiveness, with several models demonstrating performance comparable to or exceeding human benchmarks for medical students on this specific task. We analyze leading models based on a combined score of accuracy, cost, and potential data contamination risk. We extensively discuss insights from comprehensive benchmarks like HealthBench, detailing its methodology and findings on model behavior across diverse health contexts. We further examine reasoning methodologies like Chain-of-Thought and Chain-of-Draft, emerging model architectures, and underscore the potential for LLMs to function as valuable complementary tools aiding medical professionals, within a robust ethical and regulatory framework.

Institutions

Institute of Mechanical Engineering and Industrial Management (PT)

Topics

Artificial Intelligence in Healthcare and Education; Machine Learning in Healthcare; Explainable Artificial Intelligence (XAI)
