This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
NIA Genetics of Alzheimer’s Disease Data Storage Site (NIAGADS) AI‐Enhanced Search
Citations: 0
Authors: 9
Year: 2024
Abstract
BACKGROUND: NIAGADS is a national data repository that offers qualified investigators access to genomic data for Alzheimer's disease (AD) and related dementias. In addition, NIAGADS has made a substantial effort to curate, harmonize, standardize, and disseminate AD-relevant variant, gene, and sequence annotations from publications, functional genomics datasets, and summary statistics deposited at NIAGADS. These results are made available to the public in a collection of interactive knowledgebases (AD Variant Portal, FILER Functional Genomics Repository, VariXam, Alzheimer's GenomicsDB & Genome Browser), all of which are accessible programmatically via the NIAGADS API. However, as these offerings grow, navigating them can be challenging. Here, we introduce AI-based enhancements to NIAGADS sites to help guide researchers and facilitate data discovery.
METHOD: We leverage OpenAI's generative AI to build and train three large language models (LLMs) based on NIAGADS documentation, step-by-step recipes for data-access requests, subject-specific vocabularies, and the OpenAPI specification defining the NIAGADS API that allows programmatic access to the NIAGADS knowledgebases. For users of the API and to enhance search interfaces, we build on the LLMs to construct a framework for handling complex natural language instructions. The framework decomposes an inquiry into tasks and subtasks and then plans, selects, and optionally executes API calls and parses the results.
RESULT: Developing these LLMs allows NIAGADS to improve user experiences by integrating topic-specific chatbots and generative AI search tools into NIAGADS sites. Rule-based chatbots that leverage conversational AI on the NIAGADS portal and Data Sharing Service will respond to inquiries with answers inferred from the LLMs, with responses improving with user feedback. These bots will also supplement help requests, suggesting solutions to common inquiries. Planner-enhanced generative AI based on the API-specification-trained LLMs will be tied to knowledgebase searches and filters in resources such as the GenomicsDB and FILER, allowing users to ask sophisticated natural language questions that require multiple API calls to answer.
CONCLUSION: Introducing AI-enhanced search creates an interactive opportunity for NIAGADS users to learn new information or discover resources and tools they can use to supplement their research, which, in turn, improves NIAGADS's ability to support AD genetics research.
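The decompose, plan, and optionally execute flow described in the METHOD section can be illustrated with a minimal sketch. It is not the authors' implementation: a toy keyword rule stands in for the LLM planner, and the operation names (`lookup_gene`, `list_variants`), the `Task` type, and the `plan` and `run` helpers are hypothetical and do not correspond to actual NIAGADS API endpoints.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    """One planned API call. The operation name is hypothetical."""
    operation: str
    params: Dict[str, str]


def plan(inquiry: str) -> List[Task]:
    """Toy stand-in for an LLM planner: split an inquiry into ordered tasks.

    Assumption: the gene symbol is the token that follows the word "gene".
    """
    tokens = inquiry.rstrip("?.").split()
    lowered = [t.lower() for t in tokens]
    if "gene" not in lowered or lowered.index("gene") + 1 >= len(tokens):
        return []  # nothing recognizable to plan
    gene = tokens[lowered.index("gene") + 1]
    tasks = [Task("lookup_gene", {"gene": gene})]
    if "variants" in lowered:
        # Subtask that depends on the gene resolved by the first task.
        tasks.append(Task("list_variants", {"gene": gene}))
    return tasks


def run(tasks: List[Task],
        executor: Optional[Callable[[Task], object]] = None) -> List[dict]:
    """Return the plan as-is, or execute each task when an executor is given."""
    results = []
    for task in tasks:
        entry = {"operation": task.operation, "params": task.params}
        if executor is None:
            entry["status"] = "planned"  # dry run: no call is made
        else:
            entry["status"] = "executed"
            entry["result"] = executor(task)
        results.append(entry)
    return results
```

Calling `run(plan("List variants in gene APOE"))` returns the two planned calls without executing them. Passing an `executor` callable, for example one that wraps an HTTP client, would execute each task in order, mirroring the "optionally executes" step in the abstract.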
Similar works
Trimmomatic: a flexible trimmer for Illumina sequence data
2014 · 68,844 citations
Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology
2015 · 31,716 citations
BEDTools: a flexible suite of utilities for comparing genomic features
2010 · 30,145 citations
HTSeq—a Python framework to work with high-throughput sequencing data
2014 · 22,541 citations
A global reference for human genetic variation
2015 · 19,778 citations