This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Token-splitting improves GPT-4.1 performance on plastic surgery exams: implications for AI-Assisted medical education
Citations: 0
Authors: 3
Year: 2025
Abstract
Large language models (LLMs), such as ChatGPT, have demonstrated impressive performance on general medical examinations; however, their effectiveness declines significantly on specialized board examinations due to limited domain-specific training data and the computational constraints inherent in their self-attention mechanisms. This study investigates a novel token-splitting strategy informed by Cognitive Load Theory (CLT), aimed at overcoming these limitations by optimizing cognitive processing and enhancing knowledge retention in specialized educational contexts. We implemented the token-splitting approach by segmenting Taiwan plastic surgery board examination materials and associated textbook content into cognitively manageable segments ranging from 4,000 to 20,000 tokens. These segmented inputs were provided to GPT-4.1 via its standard ChatGPT web interface. Model performance was rigorously evaluated by comparing accuracy and efficiency across token lengths and question complexities classified according to Bloom's taxonomy. The GPT-4.1 model using the token-splitting strategy significantly outperformed the baseline (unmodified) model, achieving notably higher accuracy. The optimal segmentation length was determined to be 6,000 tokens, effectively balancing cognitive coherence with information retention and model attention. Errors observed at this optimal length primarily resulted from content absent from the textual materials or requiring multimodal interpretation (e.g., image-based reasoning). Provided that relevant textual content was adequately segmented, GPT-4.1 consistently demonstrated high accuracy (from 75.88% to 92.93%). These findings show that a token-splitting approach grounded in Cognitive Load Theory significantly enhances LLM performance on specialized medical board examinations.
This accessible, user-friendly strategy provides educators and clinicians with a practical means to improve AI-assisted education outcomes without requiring complex technical skills or infrastructure. Future research and development integrating multimodal capabilities and adaptive segmentation strategies promise to further optimize educational applications and clinical decision-making support.
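The segmentation step described in the abstract can be sketched as follows. This is a minimal illustration only: the study's exact tokenizer is not specified (inputs were pasted into the ChatGPT web interface), so tokens are approximated here by whitespace-separated words, and the function name is hypothetical.

```python
def split_into_segments(text: str, max_tokens: int = 6000) -> list[str]:
    """Split text into consecutive segments of at most max_tokens tokens.

    Tokens are approximated by whitespace-separated words; the study's
    actual tokenization is not described, so this is an illustrative
    sketch rather than a reproduction of the authors' method.
    """
    words = text.split()
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]

# Example: a 15,000-word document split at the study's reported
# optimum of 6,000 tokens per segment yields three segments.
doc = " ".join(f"w{i}" for i in range(15_000))
chunks = split_into_segments(doc, max_tokens=6_000)
print([len(c.split()) for c in chunks])  # [6000, 6000, 3000]
```

In practice one would feed each segment to the model in turn; the study's reported optimum of 6,000 tokens is what motivates the default above.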
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,539 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,426 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,921 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,586 citations