This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Clarity Without Credibility? Human Versus AI Abstracts in Otolaryngology
Citations: 0
Authors: 13
Year: 2026
Abstract
Objective: This study evaluated whether otolaryngologists can distinguish between human- and machine-written abstracts. The primary question was whether large language models (LLMs) produce abstracts comparable in clarity and usefulness to human-authored work, and whether reviewers can identify authorship accurately.

Methods: A blinded cross-sectional design was used. Forty-eight abstracts were evaluated: 24 human-authored and 24 generated by four LLMs. Human abstracts were drawn from articles published after July 2025 to minimize overlap with LLM training data. Twenty otolaryngologists independently reviewed all abstracts. Using a structured rubric, raters classified authorship; rated clarity, usefulness, and confidence on 5-point scales; and provided optional free-text explanations. Group comparisons were performed with chi-square and Mann–Whitney tests, and model-level analyses with Kruskal–Wallis tests.

Results: Overall recognition accuracy was 44.7%. Human-written abstracts were misclassified as AI more often than AI-generated abstracts were mistaken for human. Human abstracts received significantly higher clarity and usefulness scores than LLM abstracts, though effect sizes were small. Confidence did not correlate with correctness, indicating miscalibrated rater judgments. Model-level performance varied: Grok-generated abstracts were most easily identified as AI, whereas GPT-5 and Claude 3.5 abstracts more frequently resembled human writing. Free-text rationales commonly cited style, vagueness, or lack of detail when AI authorship was suspected.

Conclusion: LLMs generate abstracts that increasingly resemble human scientific writing yet still lag in perceived usefulness and credibility. Clinicians were only moderately successful at detecting authorship and were frequently confident in incorrect classifications. These findings highlight both the promise and the risks of AI-assisted scientific communication.
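The Methods section names three standard nonparametric tests, which suit the study's ordinal 5-point rubric scores better than parametric alternatives. As a minimal illustrative sketch (not the authors' published analysis code), the comparisons could be run with SciPy as follows; the DataFrame layout, column names, and the file name abstract_ratings.csv are all assumptions for illustration.

```python
# Illustrative sketch only: the data layout and column names are assumed,
# not taken from the paper.
import pandas as pd
from scipy import stats

# Hypothetical long-format data, one row per rating:
# source ("human"/"ai"), model (e.g., "GPT-5"), guessed_ai (bool),
# clarity (1-5), usefulness (1-5), confidence (1-5)
ratings = pd.read_csv("abstract_ratings.csv")

human = ratings[ratings["source"] == "human"]
ai = ratings[ratings["source"] == "ai"]

# Chi-square: does the authorship classification depend on true authorship?
table = pd.crosstab(ratings["source"], ratings["guessed_ai"])
chi2, p_chi, dof, _ = stats.chi2_contingency(table)

# Mann-Whitney U: compare ordinal clarity/usefulness scores between groups.
u_clar, p_clar = stats.mannwhitneyu(human["clarity"], ai["clarity"])
u_use, p_use = stats.mannwhitneyu(human["usefulness"], ai["usefulness"])

# Kruskal-Wallis: model-level differences among the four LLMs.
groups = [g["clarity"].to_numpy() for _, g in ai.groupby("model")]
h_stat, p_kw = stats.kruskal(*groups)

print(f"chi2={chi2:.2f} (p={p_chi:.3f}), "
      f"clarity U={u_clar:.0f} (p={p_clar:.3f}), "
      f"Kruskal-Wallis H={h_stat:.2f} (p={p_kw:.3f})")
```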
Related Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,316 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,177 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,575 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,468 citations
Authors
Institutions
- Sheba Medical Center (IL)
- Stanford University (US)
- Medical University of Vienna (AT)
- Ospedale Santa Maria della Misericordia di Udine (IT)
- Université Claude Bernard Lyon 1 (FR)
- Centre National de la Recherche Scientifique (FR)
- Hospices Civils de Lyon (FR)
- Hôpital Lyon Sud (FR)
- Biologie Tissulaire et Ingénierie Thérapeutique (FR)
- The University of Texas Health Science Center (US)
- University of Missouri–Kansas City (US)
- Tel Aviv University (IL)
- Meir Medical Center (IL)