
This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Assessing readability and accuracy of content produced by the American College of Prosthodontists and large language models for patient education in prosthodontics

2025 · Journal of Prosthodontics · Open Access · 2 citations · 7 authors

Abstract

PURPOSE: This study aims to evaluate the readability and accuracy of content produced by ChatGPT, Copilot, Gemini, and the American College of Prosthodontists (ACP) for patient education in prosthodontics.

MATERIALS AND METHODS: A series of 26 questions was selected from the ACP's list of questions (GoToAPro.org FAQs), together with their published answers. Answers to the same questions were generated with ChatGPT-3.5, Copilot, and Gemini. The word counts of the responses from the chatbots and the ACP were recorded. Readability was calculated using the Flesch Reading Ease Scale and the Flesch-Kincaid Grade Level. The responses were also evaluated for accuracy, completeness, and overall quality. Descriptive statistics were used to calculate means and standard deviations (SD). One-way analysis of variance was performed, followed by Tukey's multiple comparisons test, to test differences across the chatbots, the ACP, and the selected topics. The Pearson correlation coefficient was used to examine the relationships between variables. The significance level was set at α = 0.05.

RESULTS: ChatGPT responses had a higher word count, while ACP responses had a lower word count (p < 0.001). The cumulative scores for the prosthodontist topic had the lowest Flesch Reading Ease Scale score, while the brushing and flossing topics had the highest (p < 0.001). The brushing and flossing topics also had the lowest Flesch-Kincaid Grade Level score, whereas the prosthodontist topic had the highest (p < 0.001). Across the chatbots and the ACP, accuracy was lowest for denture topics and highest for brushing and flossing topics (p = 0.006).

CONCLUSIONS: This study highlights the potential of large language models to enhance patients' prosthodontic education. However, the variability in readability and accuracy across platforms underscores the need for dental professionals to critically evaluate the content generated by these tools before recommending them to patients.
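For readers unfamiliar with the two readability measures named in the methods, the following is a minimal Python sketch of the standard Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The abstract does not say which tool the authors used, so this is only an illustration: the function names, the vowel-group syllable heuristic, and the sample sentence are assumptions made here, and dedicated readability software will count syllables more carefully.

```python
import re


def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease formula; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid Grade Level formula; approximates a US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def count_syllables(word):
    """Crude heuristic (an assumption, not the study's method): count vowel groups."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent, except in "-le" endings such as "table".
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def text_counts(text):
    """Return (word count, sentence count, syllable count) for a plain-text answer."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the study's materials.
    sample = "Brush twice a day. Floss once a day."
    w, s, syl = text_counts(sample)
    print("Reading ease:", round(flesch_reading_ease(w, s, syl), 2))
    print("Grade level:", round(flesch_kincaid_grade(w, s, syl), 2))
```

Both formulas depend only on average sentence length and average syllables per word, which is consistent with short, plainly worded topics such as brushing and flossing scoring as easier to read than the prosthodontist topic.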

Similar works