This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
P1149 AI-assisted Inflammatory Bowel Disease management for non-specialist physicians: A comparative evaluation of GPT-4 and DeepSeek
Citations: 0
Authors: 5
Year: 2026
Abstract
Background: Inflammatory bowel disease (IBD), including ulcerative colitis and Crohn’s disease, poses significant diagnostic and therapeutic challenges. A global shortage of IBD specialists, especially in China, often means that non-specialist clinicians must manage these cases. While artificial intelligence (AI) tools such as GPT-4 and DeepSeek may assist in clinical decision-making, their utility in supporting non-specialists remains underexplored. This study aims to assess the performance of GPT-4 and DeepSeek in providing clinical guidance on IBD management to non-specialist physicians.
Methods: A quantitative evaluation was conducted through expert assessments and non-specialist physician feedback. The study comprised 20 clinical questions submitted by non-specialist physicians, along with a designed case analysis, all of which were entered into both AI models. Seven IBD experts rated the accuracy and completeness of the responses on a 5-point Likert scale, while 12 non-specialist physicians assessed their clinical usefulness.
Results: Both AI models scored above 4.0 for accuracy, completeness, and usefulness. DeepSeek outperformed GPT-4 in diagnosis (P < 0.001), treatment (P < 0.001), and follow-up monitoring (accuracy P = 0.023, completeness P < 0.001). Non-specialist physicians rated DeepSeek as more useful for diagnosis (P = 0.004) and follow-up monitoring (P = 0.005), with no significant difference for treatment options. GPT-4 played a critical role in managing special populations. In specific case analyses, GPT-4 offered superior recommendations for subsequent treatment compared with DeepSeek, while no differences were noted in other aspects. Although the two models demonstrated distinct strengths and weaknesses, they shared common limitations: both were limited in providing personalized treatment recommendations, and both lacked reference sources.
Conclusion: GPT-4 and DeepSeek both provide valuable guidance for non-specialist physicians, but each has distinct strengths and limitations. Importantly, high-risk scenarios still warrant red-flag escalation. Integrating their complementary advantages is essential for developing IBD-specialized AI-assisted protocols to enhance IBD management.
References:
1. Torres J, Bonovas S, Doherty G, et al. ECCO Guidelines on Therapeutics in Crohn’s Disease: Medical Treatment. Journal of Crohn’s and Colitis 2020; 14: 4-22.
2. Haupt CE, Marks M. AI-Generated Medical Advice-GPT and Beyond. JAMA 2023; 329: 1349-1350.
3. Zhang Y, Wan X, Kong Q, et al. Evaluating large language models as patient education tools for inflammatory bowel disease: A comparative study. World J Gastroenterol 2025; 31.
4. Peng Y, Malin BA, Rousseau JF, et al. From GPT to DeepSeek: Significant gaps remain in realizing AI in healthcare. J Biomed Inform 2025; 163: 104791.
Conflict of interest:
Dr. Liu, Yan: No conflict of interest
Guo, Hong: No conflict of interest
Song, Xiaomei: No conflict of interest
Xiang, Lingya: No conflict of interest
Wei, Tan: No conflict of interest
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,380 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,243 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,671 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,496 citations