This is an overview page with metadata for this scientific work. The full article is available from the publisher.
GEMEO: the first patient world model for rare disease, grounding generative clinical trajectories in the genome and a biomedical knowledge graph
Citations: 0
Authors: 3
Year: 2026
Abstract
GEMEO is, to our knowledge, the first patient world model built specifically for rare disease, and the first trained on real data from a national single-payer system (the Brazilian Unified Health System, SUS). It is a three-pillar architecture (Propose, Simulate, Verify): (A) a proposer that grounds candidate first-onset events in both a biomedical knowledge graph (PrimeKG) and the patient's genome via an ensemble of genomic foundation models (AlphaMissense, Evo 2, AlphaGenome); (B) a Causal Diffusion Forcing transformer trained with a recurrence-aware objective so it predicts genuinely novel events rather than echoing the patient's past; and (C) an agentic verifier that adjudicates every prediction with a traceable evidence path. On a new public benchmark designed to be immune to event autocorrelation (RareBench-BR Trajectory), GEMEO attains new-onset prediction Top-1 of 53.7% (95% CI 51.4-56.1) versus a 38.2% frequency baseline, and beats count-based methods on every long-context task: will-change AUROC 0.906, time-to-transition 0.827, and treatment discontinuation 0.838 versus 0.696. Its genomic pillar, validated on real ClinVar variants, scores variant pathogenicity at AUROC 0.93 (AlphaMissense, missense), 0.82 (Evo 2, zero-shot), and 0.73 (AlphaGenome, splice). We validate Level-1 (state-conditioned) and Level-2 (action-conditioned) capability on the clinical world-model rubric of Qazi et al., with the architecture designed for the counterfactual rollout of Level 3. This deposit contains the preprint (gemeo.pdf) and a snapshot of the open release (reference architecture, conformance suite, benchmark result files, and single-GPU reproducers). Dual licensing: the code, reference implementation, and conformance suite are released under Apache-2.0; the model weights and the RareBench-BR Trajectory benchmark are released under CC-BY-NC-4.0. Code: https://github.com/rarasAI/gemeo. Weights: https://huggingface.co/Raras-AI/gemeo-sus.
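The headline comparison in the abstract (Top-1 accuracy with a 95% interval, set against a frequency baseline) can be illustrated with a minimal sketch. The abstract does not say how the interval was computed, so the percentile bootstrap used here is an assumption. The function names are hypothetical, and the data in the usage note below is toy data, not drawn from RareBench-BR Trajectory.

```python
import random
from collections import Counter


def top1_accuracy(predictions, targets):
    """Fraction of cases whose top-ranked prediction equals the true event."""
    if len(predictions) != len(targets) or not targets:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p == t for p, t in zip(predictions, targets)) / len(targets)


def frequency_baseline(train_targets, n_cases):
    """Always predict the single most frequent event in the training targets."""
    most_common_event, _ = Counter(train_targets).most_common(1)[0]
    return [most_common_event] * n_cases


def bootstrap_ci(predictions, targets, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for Top-1 accuracy, resampling cases.

    Assumption: the source does not state its interval method; this is one
    common choice, not necessarily the one the authors used.
    """
    rng = random.Random(seed)
    n = len(targets)
    scores = []
    for _ in range(n_resamples):
        # Draw n cases with replacement and score the resampled set.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(sum(predictions[i] == targets[i] for i in idx) / n)
    scores.sort()
    lower = scores[int((alpha / 2) * n_resamples)]
    upper = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

With toy labels `targets = ["A", "B", "A", "C", "A", "B"]` and model outputs `["A", "B", "C", "C", "A", "A"]`, `top1_accuracy` gives 4/6, while `frequency_baseline(targets, 6)` always predicts "A" and scores 3/6. The abstract's 53.7% versus 38.2% comparison has the same shape: a model's Top-1 score set against always predicting the most common event.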