This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Crowdsourced and AI-generated Age of Acquisition (AoA) Norms for Vocabulary in Print: Extending the Kuperman et al. (2012) norms
Citations: 3
Authors: 4
Year: 2025
Abstract
This paper revisits the Age of Acquisition (AoA) norms of Kuperman et al. (2012). Three studies were conducted. Study 1 reports a crowdsourcing 'megastudy' obtaining 790,024 estimates from participants providing the age they could first read and write 11,074 early-acquired words from Kuperman et al. (2012). The study aimed to differentiate between oral language receptive AoA and print AoA. The results correlate well with the original estimates and offer slightly higher AoA values for print knowledge. They are released as useful supplements to the original norms. Study 2 explored the potential of Large Language Models (LLMs), specifically GPT-4o, to replicate these new crowdsourced AoA norms. The findings indicated a strong correlation between AI-generated estimates and human judgments, showing the utility of AI in generating AoA estimates. The results confirm that AI is a valuable resource of norms for psycholinguistic and educational research, of particular value for under-resourced languages and researchers with limited resources. Based on the successful application of AI in Study 2, Study 3 extended the method to the entire set of words in the English Crowdsourcing Project (ECP), producing AI-generated AoA estimates for approximately 62,000 English words. This provides a substantial database of AoA norms that are found to correlate very highly with human-generated estimates (r = .86) and perform well in accounting for word processing times in regression analyses. The AI-generated results have some important limitations, including overestimating the vocabulary acquired at some ages. All resources are available in the Open Science Framework for further exploration.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,553 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,444 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,943 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,792 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations