OpenAlex · Updated hourly · Last updated: 28 March 2026, 20:37

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Sentiment analysis to optimise CBME narrative feedback review

2025 · 1 citation · Medical Education · Open Access

Citations: 1 · Authors: 6 · Year: 2025

Abstract

In 2014, the Royal College of Physicians and Surgeons of Canada initiated the transition to Competency-Based Medical Education (CBME) with the Competence by Design (CBD) framework, which was adopted by the Queen's Radiology Department in 2017. CBD requires regular, criterion-referenced assessments of Entrustable Professional Activities (EPAs; essential tasks performed by radiologists) through direct observations. A Competence Committee reviews residents' performance twice a year to identify improvement and make promotion decisions. However, manually searching through large volumes of narrative feedback is challenging. Unlike quantitative data, narrative feedback also lacks systematic organisation, complicating objective decision-making and feedback construction [1]. To address this gap, we sorted narrative feedback by sentiment by developing natural language processing techniques based on the robustly optimised Bidirectional Encoder Representations from Transformers approach (RoBERTa) model. Reports that label comments as negative, positive or neutral allow faculty to find pertinent comments more easily within large datasets when reviewing resident progress. Our fine-tuned RoBERTa model demonstrated strong predictive ability, with an F1 score of 0.90 in classifying sentences. Using a three-label classification tool with a neutral category helps to sort many ambiguous statements that are neither positive nor negative. This is standard for sentiment analysis and reduces the misclassification risk. Because we prioritised accuracy, we introduced no more than three classes; adding more would not only involve more complex coding but also increase the misclassification risk. When training RoBERTa, we reviewed our original resident feedback and found that the overwhelming majority of comments were positive or neutral. Sentiment analysis of 1124 sentences showed the following distribution: negative (12.6%), neutral (26.5%), and positive (60.8%).
To offset this imbalance, we generated additional synthetic sentences with negative sentiment. It is therefore important to acknowledge that no training dataset is perfect and that perpetuated biases can skew results in the actual analysis, which could interfere with fair evaluations. The sentiment analysis process revealed how the nuanced nature of narrative feedback can make interpretation difficult and require further discussion. For example, the model misclassified the comment ‘continue to work on providing differentials’ as negative; it should have been neutral because suggestions for a trainee to improve do not necessarily indicate weak performance. Consequently, raters spent additional time reaching a consensus on this sentiment. Misclassification can thus stem from inherent characteristics of the data rather than shortcomings of the tool itself. Given these limitations, RoBERTa model-based predictions require continuous monitoring and development to address any data biases. Next steps to develop this tool will involve collecting feedback from faculty regarding the practical utility of generated resident reports and how categorising comments by sentiment could streamline the review process for promotion. In collaboration with leaders in medical education, the modelling techniques employed here can likewise be extended to other residency programs. Overall, our RoBERTa model is unique in analysing narrative feedback for radiology residents to address considerable assessment burdens under CBD. Sentiment labelling provides a scalable solution to support faculty with classifying comments and evaluating performance in the CBME era.

Zier Zhou: Investigation; writing – original draft; writing – review and editing. Faraz Honarvar: Writing – original draft; investigation. Arsalan P. Rizwan: Investigation; writing – review and editing. Andrew D. Chung: Writing – review and editing. Nick Rogoza: Data curation; software. Benjamin Y. M. Kwan: Conceptualization; writing – review and editing; supervision.

The authors declare that they have no conflict of interest.

The data that support the findings of this study are available from the corresponding author upon reasonable request.
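
The three-label evaluation and report grouping described in the abstract can be illustrated with a minimal sketch. This is not the authors' code and it does not run RoBERTa: it takes already-predicted labels as given. The function names are hypothetical, and the abstract does not state how the F1 score of 0.90 was averaged, so the macro-average shown here is an assumption. Apart from the one comment quoted in the abstract, the example sentences and all labels are invented for illustration.

```python
from collections import defaultdict

# The three sentiment classes named in the abstract.
LABELS = ("negative", "neutral", "positive")


def per_class_f1(gold, pred, labels=LABELS):
    """Return {label: F1}, scoring each label one-vs-rest."""
    scores = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-class F1 (averaging scheme is an assumption)."""
    scores = per_class_f1(gold, pred, labels)
    return sum(scores.values()) / len(labels)


def group_by_sentiment(comments, pred):
    """Group comments under their predicted label, as a report might."""
    report = defaultdict(list)
    for comment, label in zip(comments, pred):
        report[label].append(comment)
    return dict(report)


# Toy data: only the first comment comes from the abstract; the rest are invented.
comments = [
    "continue to work on providing differentials",
    "Excellent report structure.",
    "Missed a key finding on the study.",
    "Reviewed the case with staff.",
]
gold = ["neutral", "positive", "negative", "neutral"]   # rater consensus (toy)
pred = ["negative", "positive", "negative", "neutral"]  # model output (toy)

if __name__ == "__main__":
    print(per_class_f1(gold, pred))
    print(round(macro_f1(gold, pred), 3))
    for label, items in group_by_sentiment(comments, pred).items():
        print(label, items)
```

In this toy run the quoted comment is counted as a false positive for the negative class and a false negative for the neutral class, which lowers both per-class scores while leaving the positive class unaffected.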

Topics

Artificial Intelligence in Healthcare and Education · Innovations in Medical Education · Biomedical and Engineering Education