This is an overview page with metadata for this scientific paper. The full article is available from the publisher.

ChatGPT-3.5 and the Polish thoracic surgery specialty examination: a performance evaluation

2025 · 0 citations · Polish Journal of Cardio-Thoracic Surgery · Open Access

Abstract

Introduction: The incredibly rapid development of artificial intelligence (AI) in recent years has created new opportunities for its application in medical advancements. This raises questions about the reliability and limitations of AI.

Aim: The aim of the present study was to evaluate the effectiveness of the ChatGPT-3.5 language model in solving the test component of the National Specialist Examination (PES) in the field of thoracic surgery.

Material and methods: A total of 120 test questions from the 2015 PES examination were analyzed. They were grouped according to subject matter, clinical character, and cognitive requirements. Each question was submitted five times, in independent sessions. The following statistical tests were applied: χ², Kruskal-Wallis, Mann-Whitney, and Spearman's rank correlation. The consistency of the answers was assessed using Fleiss' κ coefficient.

Results: The AI tool achieved a score of 42.2% correct answers, with the passing threshold set at 60%. A statistically significant difference was found between clinical and non-clinical questions (p = 0.041). Correct answers were characterized by a higher confidence coefficient (p < 0.001). No correlation was observed between confidence and psychometric indicators. The response consistency was assessed as moderate (κ = 0.341).

Conclusions: The result obtained by ChatGPT-3.5 is equivalent to a failing score on the examination. The confidence of responses correlated with their correctness, whereas limitations in clinical knowledge and consistency indicate the need for caution when using this model to assess specialized knowledge.
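To illustrate the consistency measure named in the abstract, the following minimal Python sketch computes Fleiss' κ for repeated answers to the same questions. The function name `fleiss_kappa`, the answer labels A to E, and the sample data are assumptions made for illustration only; they are not taken from the study, and the authors' actual computation may have differed.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated the same number of times.

    ratings    -- list of lists; ratings[i] holds the answers given to item i
    categories -- all possible answer labels (hypothetical here: "A".."E")
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    # counts[i][c] = how many of the repeated runs chose category c for item i
    counts = [Counter(r) for r in ratings]

    # Observed agreement per item, then averaged over all items
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Agreement expected by chance, from the overall category proportions
    total = n_items * n_raters
    p_cat = [sum(c[cat] for c in counts) / total for cat in categories]
    p_e = sum(p ** 2 for p in p_cat)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (p_bar - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    options = ["A", "B", "C", "D", "E"]  # assumed five-option questions
    # Made-up answers: three questions, each submitted five times
    sample = [
        ["A", "A", "A", "B", "A"],
        ["C", "D", "C", "C", "E"],
        ["B", "B", "B", "B", "B"],
    ]
    print(f"Fleiss' kappa: {fleiss_kappa(sample, options):.3f}")
```

Here each of the five submissions of a question plays the role of one "rater", so κ measures how often the model repeats the same answer beyond what chance agreement would produce.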

Institutions

Medical University of Silesia (PL)

Topics

Artificial Intelligence in Healthcare and Education; Clinical Reasoning and Diagnostic Skills; Radiomics and Machine Learning in Medical Imaging
