
Accuracy of automated scoring of verbal paired associates in a remote data collection context

2020 · 0 citations · 4 authors · Alzheimer's & Dementia · Open Access

Abstract

Background: The validation of remote testing methodologies has increased in relevance given the impact of COVID-19 on clinical trials. Verbal cognitive testing, which requires skilled raters, has not previously been feasible for remote testing. In previous research we demonstrated that remote verbal cognitive testing using automatic speech recognition (ASR) was possible and showed the expected pattern of results. Here, we manually score responses to a verbal paired associates (VPA) test and explore the impact of participant-level demographic and technology-related factors on scoring accuracy.

Methods: From a pool of 5,742 recordings of participants aged 17-86 years, 150 were randomly selected for manual review (age 30-70, M = 52.5). Participants were all fluent English speakers and completed the VPA test via a device-agnostic web app on their own devices. We recorded participant demographics and information about the operating system, browser, and device on which the tasks were completed. Manual scoring was completed offline by trained raters through the Neurovocalix system.

Results: There was excellent agreement between human scoring and ASR, with a Spearman correlation of 0.93 (p < 0.0001) and an ICC(A,1) agreement of 1 (F(148,148) = 6951, p < 0.0001). The distribution of ASR errors was skewed, with a median of 0 and a mean of 0.97. The maximum number of scoring errors was 10, observed in two cases where the ASR system did not detect correct responses due to very high levels of background noise or poor audio quality. We found no significant effect of age, gender, education, or device on scoring errors. Key themes reported by raters as affecting scoring accuracy were 1) slowness in responding, meaning that the whole word was not recorded, 2) the presence of significant background noise or poor audio quality, and 3) certain accents leading to misrecognition of specific words.

Conclusion: These results represent a comprehensive evaluation of the accuracy of automated scoring of verbal responses collected in a remote context. Overall, accuracy is excellent. These results both demonstrate the potential utility of this approach to remote data collection and suggest avenues for increasing the accuracy of automated scoring.
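For context, the sketch below shows how the two agreement statistics named in the Results, the Spearman correlation and ICC(A,1), could be computed in Python. The score arrays are hypothetical stand-ins (the study's data are not reproduced here), and pingouin's ICC2 estimate (two-way random effects, absolute agreement, single rater) is used as the closest available match to McGraw and Wong's ICC(A,1).

# Minimal sketch of the two agreement statistics reported above, computed on
# hypothetical stand-in scores; the study's actual data are not public here.
import numpy as np
import pandas as pd
import pingouin as pg
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 150  # number of manually reviewed recordings, as in the abstract

# Hypothetical per-recording VPA scores: human raters vs. the ASR system.
human = rng.integers(0, 9, size=n)
asr = np.clip(human + rng.integers(-1, 2, size=n), 0, 8)

# Spearman rank correlation between the two scorers.
rho, p = spearmanr(human, asr)
print(f"Spearman rho = {rho:.2f} (p = {p:.2g})")

# ICC via pingouin: ICC2 (two-way random effects, absolute agreement,
# single rater) is the closest match to McGraw & Wong's ICC(A,1).
long = pd.DataFrame({
    "recording": np.tile(np.arange(n), 2),
    "rater": np.repeat(["human", "asr"], n),
    "score": np.concatenate([human, asr]),
})
icc = pg.intraclass_corr(data=long, targets="recording",
                         raters="rater", ratings="score")
print(icc.loc[icc["Type"] == "ICC2", ["Type", "ICC", "F", "pval"]])

Run against real paired human/ASR scores, this reports the single-rater absolute-agreement ICC alongside the rank correlation, mirroring the analysis described in the Results.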

Topics

Artificial Intelligence in Healthcare and Education · Telemedicine and Telehealth Implementation