This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
ChatTogoVar: a TogoVar-based retrieval-augmented generation system for precise genomic variant interpretation
0
Citations
3
Authors
2026
Year
Abstract
Large language models (LLMs) have recently been adopted to assist in the interpretation of human genomic variants. However, general-purpose LLMs can produce incorrect outputs (commonly termed 'hallucinations'), particularly on specialized queries, raising concerns about their reliability for variant interpretation. To mitigate this risk, we developed ChatTogoVar, a retrieval-augmented generation system that queries TogoVar, a variant database integrating information such as allele frequency and clinical significance, and incorporates the retrieved results into its prompts. We constructed a benchmark of 150 questions sampled from a predefined pool of 1500 template-variant combinations (50 templates × 30 variants). ChatTogoVar achieved the highest score for 135/150 questions (90%), outperforming both a general-purpose LLM and an existing specialized system. Automated LLM-based scoring of the full 1500-question pool confirmed the same trend. These results suggest that integrating a reliable variant database with an LLM can improve the accuracy of variant interpretation and that ChatTogoVar may serve as a practical tool to support genomic medicine and personalized healthcare.
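The benchmark arithmetic in the abstract can be illustrated with a minimal sketch. The template and variant names below are hypothetical placeholders, since the abstract does not list them, and uniform random sampling with a fixed seed is an assumption; the abstract says only that the 150 questions were "sampled" from the pool.

```python
import itertools
import random

# Hypothetical placeholders: the actual templates and variants are not given in the abstract.
templates = [f"template_{i:02d}" for i in range(1, 51)]  # 50 question templates
variants = [f"variant_{j:02d}" for j in range(1, 31)]    # 30 variants

# Full pool: every template paired with every variant (50 x 30 = 1500 combinations).
pool = list(itertools.product(templates, variants))

# Benchmark subset: 150 combinations drawn without replacement.
# Uniform sampling and the seed value are assumptions made for this sketch.
rng = random.Random(0)
benchmark = rng.sample(pool, 150)


def top_score_share(top_count: int, total: int) -> float:
    """Fraction of questions on which a system achieved the highest score."""
    return top_count / total


# The abstract reports the highest score on 135 of 150 benchmark questions.
share = top_score_share(135, 150)
print(len(pool), len(benchmark), share)
```

The fixed seed only makes the draw reproducible; it says nothing about how the authors selected their subset.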
Similar works
Trimmomatic: a flexible trimmer for Illumina sequence data
2014 · 68,538 citations
Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology
2015 · 31,513 citations
BEDTools: a flexible suite of utilities for comparing genomic features
2010 · 30,027 citations
HTSeq—a Python framework to work with high-throughput sequencing data
2014 · 22,482 citations
A global reference for human genetic variation
2015 · 19,701 citations