This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Assessing the Capability of Large Language Model Chatbots in Generating Plain Language Summaries
Citations: 10
Authors: 8
Year: 2025
Abstract
Background: Plain language summaries (PLSs) make scientific research accessible to a broad non-expert audience. However, crafting an effective PLS can be challenging, particularly for non-native English-speaking researchers. Large language model (LLM) chatbots have the potential to assist in generating summaries, but their effectiveness compared to human-generated PLSs remains underexplored.

Methods: This cross-sectional study compared 30 human-written PLSs with PLSs generated by LLM chatbots, viz., ChatGPT (OpenAI, San Francisco, CA), Claude (Anthropic, San Francisco, CA), Copilot (Microsoft Corp., Redmond, WA), Gemini (Google, Mountain View, CA), Meta AI (Meta, Menlo Park, CA), and Perplexity (Perplexity AI, Inc., San Francisco, CA). Readability was assessed with the Flesch reading (FR) ease score, and understandability with the Flesch-Kincaid (FK) grade level. Three authors rated each text on seven predefined criteria, and their average score was used to compare the quality of the PLSs.

Results: Compared with human-written PLSs, the chatbots generated PLSs with lower FK grade levels (p < 0.0001), and all except Copilot achieved higher FR ease scores. The overall score of human-written PLSs was 8.89±0.26. Although the variance among the scores was statistically significant (F = 7.16, p = 0.0012), the post-hoc test showed no difference between human-generated PLSs and those of any individual chatbot (ChatGPT 8.8±0.34, Claude 8.89±0.33, Copilot 8.69±0.4, Gemini 8.56±0.56, Meta AI 8.98±0.23, and Perplexity 8.8±0.3).

Conclusion: LLM chatbots can generate PLSs with better readability, understandable to readers with less formal education, and of quality comparable to those written by human authors. Hence, authors can generate PLSs with LLM chatbots, which is particularly beneficial for researchers in developing countries. However, while LLM chatbots improve readability, they may also introduce minor inaccuracies, so LLM-generated PLSs should always be checked for accuracy and relevance.
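The FR ease score and FK grade level used in the study are standard formulas based on words per sentence and syllables per word. As a minimal sketch (using a naive vowel-group syllable counter rather than the dictionary-based tools typically used in such studies), they can be computed as:

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic: one syllable per run of consecutive vowels.
    # Real readability tools use pronunciation dictionaries instead.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_scores(text: str) -> tuple[float, float]:
    # Split crudely on sentence-ending punctuation and word characters.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)

    wps = len(words) / sentences   # average words per sentence
    spw = syllables / len(words)   # average syllables per word

    # Standard Flesch reading ease and Flesch-Kincaid grade level formulas.
    reading_ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade_level = 0.39 * wps + 11.8 * spw - 15.59
    return reading_ease, grade_level
```

A higher reading ease score and a lower grade level both indicate text that is easier to read, which is the direction of the improvements the study reports for chatbot-generated PLSs.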
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,336 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,207 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,607 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,476 citations