This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
GPTViet: An Open-Source Vietnamese Foundation Model from Pretraining to Domain Specialization
Citations: 0
Authors: 6
Year: 2025
Abstract
As open-source Large Language Models (LLMs) increasingly rival proprietary counterparts, the need for foundation models tailored to specific linguistic and cultural contexts becomes critical. This paper presents GPTViet, a series of foundation LLMs for the Vietnamese language. Built upon the LLaMA architecture, GPTViet was developed by curating a high-quality Vietnamese corpus and performing extensive fine-tuning across a range of model sizes (8B to 70B parameters). Evaluations demonstrate that GPTViet models significantly outperform their respective base models on Vietnamese-specific tasks, as measured by standard benchmarks and our custom-built VietExam benchmark. The practical utility of this work is showcased through a domain-specific application, VietHealth 70B, for medical consultation. In keeping with the open-source principles of Llama, GPTViet and all its derivatives are publicly released under an open-source license. This initiative provides the Vietnamese research and development community with a powerful, adaptable foundation to accelerate the creation of diverse intelligent applications. For more information, please visit https://github.com/VietnamAIHub/GPTViet; a demo is available at http://gptviet.ioit.ac.vn/.