This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Assessment of Artificial Intelligence-based Translation Tools for Emergency Department Discharge Instructions
Citations: 0 · Authors: 4 · Year: 2026
Abstract
Introduction: Emergency departments (ED) in the United States serve as a safety net for millions, including those with limited English proficiency (LEP). Eight percent of individuals living in the United States have LEP, placing them at risk for language barriers that can adversely affect the quality and safety of their care. Many hospitals lack language-concordant care, especially at the time of discharge. Miscommunication at discharge can lead to adverse health outcomes, including medication errors, poor compliance, and unnecessary return visits to the ED. Our objectives in this study were to evaluate the quality and safety of artificial intelligence (AI)-generated translations of physician-written, patient-specific ED discharge instructions and to assess performance across varying levels of instruction complexity. Methods: Emergency physicians wrote free-form discharge instructions representing the patient-specific guidance typically provided at the time of ED discharge. Four topics were selected: abdominal pain; chest pain; wrist fracture; and vaginal bleeding in pregnancy. These instructions were intentionally developed to vary in linguistic complexity and were assessed using the Flesch Reading Ease and Flesch-Kincaid Grade Level scales. Instructions were translated into Albanian, Brazilian Portuguese, and Vietnamese using the AI-based translation tools ChatGPT-4, Microsoft Copilot, and Google Translate. Translations were evaluated for semantic and syntactic accuracy. Criteria included adequacy, fluency, meaning, and severity on a 5-point scale (1 = lowest accuracy, 5 = highest accuracy). Preference and formality were rated on a 3-point scale (1 = lowest, 3 = highest). The primary outcome was the quality and safety of AI-generated translations of patient-specific discharge instructions. Secondary outcomes included the ability to handle varying instruction complexity.
Professional medical translators primarily responsible for the written translation of medical text evaluated and scored the translations for accuracy and quality metrics. Results: Overall adequacy, fluency, meaning, and severity scores were similar across models. Mean scores for ChatGPT-4 (3.79), Microsoft Copilot (3.60), and Google Translate (3.50) showed no statistically significant differences. Albanian translation was an exception, with ChatGPT-4 scoring significantly higher (3.75) than Google Translate (3.19) (P < .001). No other significant differences were observed for Brazilian Portuguese or Vietnamese. ChatGPT-4 was also the highest rated for Albanian and Brazilian Portuguese. Microsoft Copilot and Google Translate produced a combined total of five potentially harmful translation errors, whereas none were identified for ChatGPT-4. Conclusion: Miscommunication during discharge can lead to negative patient outcomes. This study evaluated ChatGPT-4, Microsoft Copilot, and Google Translate in translating ED instructions into Albanian, Brazilian Portuguese, and Vietnamese. ChatGPT-4 performed best overall, produced no harmful translations, and significantly outperformed Google Translate in Albanian. While AI-based translation tools show promise, human oversight remains necessary to mitigate risks from translation inaccuracies.