Research only. Not a medical device. No clinical use without physician oversight and applicable regulatory clearance.
GEMEO/SUS — Recurrence-Aware Patient World Model
The flagship instance of GEMEO Architecture v2.0 — a patient world model for rare disease, trained on real Brazilian SUS (DATASUS) data. Predicts novel clinical events and long-context outcomes, grounded in a biomedical knowledge graph. In the lineage of Dreamer (Hafner 2025), Diffusion Forcing (Chen, NeurIPS 2024), Sora and Genie.
Family: gemeo-arch (architecture) · gemeo-sus (this, flagship) · gemeo-twin-stack (6-mode app layer) · rarebench-br-trajectory (benchmark)
State-of-the-art results
Evaluated on the public RareBench-BR Trajectory v2 benchmark + a new-onset task, with mandatory baselines on the same candidate space and 95% bootstrap CIs. GEMEO leads on every novelty and long-context task:
| Task | GEMEO | Strong baseline | Margin |
|---|---|---|---|
| New-onset prediction (Top-1) | 53.7% | 38.2% (frequency) | +15.5 pp |
| Will-change (AUROC) | 0.906 | 0.889 (count-based) | +0.017 |
| Transition-within-12mo (AUROC) | 0.827 | 0.790 (count-based) | +0.037 |
| Treatment discontinuation (AUROC) | 0.838 | 0.696 (count-based) | +0.142 |
Long-context outcomes are where the world model's learned representation pulls ahead of count-based methods. The largest gain is on treatment discontinuation (+0.142 AUROC), which matters because dropout drives bad outcomes in rare disease. This is consistent with what the EHR literature reports for context-rich tasks (arXiv 2511.00782). The recurrence-aware objective trains the model to predict novel events rather than repeats.
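The benchmark's exact bootstrap procedure is not described in this card. As a minimal sketch, assuming a percentile bootstrap over per-patient 0/1 outcomes, a 95% CI for a Top-1 accuracy could be computed as follows. The function name `bootstrap_ci` and the input data are hypothetical and are not the benchmark's results.

```python
import random


def bootstrap_ci(correct, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of a list of 0/1 outcomes."""
    rng = random.Random(seed)
    n = len(correct)
    means = []
    for _ in range(n_boot):
        # Resample patients with replacement and recompute the metric.
        sample = [correct[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo_idx = int(n_boot * alpha / 2)   # index of the 2.5th percentile
    hi_idx = n_boot - lo_idx - 1       # index of the 97.5th percentile
    return sum(correct) / n, means[lo_idx], means[hi_idx]


# Hypothetical per-patient Top-1 hits (1 = correct); illustrative only.
hits = [1] * 54 + [0] * 46
point, lo, hi = bootstrap_ci(hits)
print(f"Top-1 = {point:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```

The same resampling loop applies to AUROC if the mean is swapped for an AUROC function evaluated on each resampled set of (label, score) pairs.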
Architecture
```
GEMEO/SUS (19.97M params)
├── Token embedding (tied with LM head)
├── PositionalFeatureEmbed(age, calendar_year, position)
├── 8 × Transformer blocks (SwiGLU + RMSNorm + RoPE + AdaLN-Zero)
│   └── Gated PrimeKG cross-attention (tanh(α), α init=0)
└── Tied LM head
```
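The tree lists a gated cross-attention with a tanh(α) gate and α initialized to 0. The sketch below illustrates that gating pattern in PyTorch; it is not the repository's implementation. The class name `GatedCrossAttention`, the use of `nn.LayerNorm` (the tree lists RMSNorm), and all tensor sizes are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class GatedCrossAttention(nn.Module):
    """Residual cross-attention whose contribution is scaled by tanh(alpha)."""

    def __init__(self, d_model, n_heads):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)  # assumption: stand-in for RMSNorm
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # alpha starts at 0, so tanh(alpha) = 0 and the block is an identity map.
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, x, kg):
        # x:  (batch, seq_len, d_model) token states, used as queries
        # kg: (batch, n_nodes, d_model) graph node embeddings, used as keys/values
        attended, _ = self.attn(self.norm(x), kg, kg, need_weights=False)
        return x + torch.tanh(self.alpha) * attended


torch.manual_seed(0)
block = GatedCrossAttention(d_model=32, n_heads=4)
x = torch.randn(2, 5, 32)    # hypothetical token states
kg = torch.randn(2, 7, 32)   # hypothetical ego-subgraph node embeddings
out = block(x, kg)
```

With α at its initial value the block returns its input unchanged, so adding the knowledge-graph pathway to a warm-started model does not perturb its predictions until training moves α away from zero.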
- Backbone: Causal Diffusion Forcing transformer (per-token σ; Chen et al., NeurIPS 2024).
- Training objective: recurrence-weighted loss (RAVEN, arXiv 2603.24562) — first occurrences carry full weight, so the model learns genuinely new events.
- Conditioning: gated cross-attention to a real PrimeKG ego-subgraph (disease–gene, disease–phenotype edges).
- Recipe: warm-start, warmup-stable-decay (WSD) LR schedule, bf16, single H100, ~5 min, ≈ $0.40.
- MEDS v0.4.1 substrate · 42,265 DATASUS rare-disease trajectories.
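The recurrence-weighted objective in the list above can be illustrated with a small sketch. The card only states that first occurrences carry full weight; the down-weight for repeats (`repeat_weight=0.1`), the weighted-mean normalization, and the token names below are assumptions for illustration and may differ from the RAVEN formulation.

```python
def recurrence_weights(tokens, repeat_weight=0.1):
    """Return 1.0 for the first occurrence of each token, repeat_weight afterwards."""
    seen = set()
    weights = []
    for tok in tokens:
        weights.append(repeat_weight if tok in seen else 1.0)
        seen.add(tok)
    return weights


def weighted_nll(nlls, weights):
    """Weighted mean of per-token negative log-likelihoods."""
    return sum(w * l for w, l in zip(weights, nlls)) / sum(weights)


# Hypothetical trajectory and per-token losses; not real patient data.
tokens = ["DX_A", "PROC_1", "DX_A", "PROC_2", "PROC_1"]
nlls = [2.0, 1.0, 0.2, 3.0, 0.1]

w = recurrence_weights(tokens)   # [1.0, 1.0, 0.1, 1.0, 0.1]
loss = weighted_nll(nlls, w)
```

Repeated codes are usually easy to predict, so an unweighted loss can be driven down by copying history. Down-weighting repeats shifts the gradient toward the first appearance of each event.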
Subgroup fairness is clean across the pediatric, adult, and elderly age bands. Two checkpoints are bundled: cdf_v6_raven.pt is the recurrence-aware flagship, and cdf_v7_10digit_rbt.pt is the 10-digit variant used for the RBT-v2 transition evaluation.
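The training recipe names a WSD learning-rate schedule, commonly read as warmup-stable-decay. A minimal sketch of such a schedule follows. The function name `wsd_lr`, the peak rate, the phase fractions, and the linear decay shape are assumptions; the card does not state the values used.

```python
def wsd_lr(step, total_steps, peak_lr=3e-4, warmup_frac=0.05,
           decay_frac=0.2, min_lr=0.0):
    """Warmup-stable-decay: linear warmup, constant plateau, linear decay."""
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    decay_start = total_steps - decay_steps
    if step < warmup_steps:
        # Ramp linearly up to peak_lr over the warmup phase.
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        # Hold the peak rate for the stable phase.
        return peak_lr
    # Interpolate linearly from peak_lr toward min_lr over the decay phase.
    progress = (step - decay_start) / max(1, decay_steps)
    return peak_lr + (min_lr - peak_lr) * progress


schedule = [wsd_lr(s, total_steps=1000) for s in range(1000)]
```

Because the plateau holds a constant rate, a run can be extended or branched from a stable-phase checkpoint and decayed later, which suits short warm-started runs.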
Usage
```python
import sys
import torch
import torch.nn as nn

sys.path.append("src")  # makes diffusion_forcing_v13 importable
from diffusion_forcing_v13 import CDFv13Transformer, CDFv13Config

class PositionalFeatureEmbed(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.age_proj = nn.Linear(1, d // 4)
        self.year_proj = nn.Linear(1, d // 4)
        self.pos_proj = nn.Linear(1, d // 4)
        self.combine = nn.Linear(3 * (d // 4), d)
        self.norm = nn.LayerNorm(d)

    def forward(self, ages, years, positions):
        # Scale each feature to [0, 1] before projecting it.
        a = ages.clamp(0, 100) / 100
        y = (years - 2010).clamp(0, 20) / 20
        p = (positions / 512).clamp(0, 1)
        e = torch.cat([self.age_proj(a.unsqueeze(-1)), self.year_proj(y.unsqueeze(-1)),
                       self.pos_proj(p.unsqueeze(-1))], -1)
        return self.norm(self.combine(e))

# weights_only=False unpickles arbitrary objects; only load checkpoints you trust.
ck = torch.load("cdf_v6_raven.pt", map_location="cpu", weights_only=False)
cfg = CDFv13Config(**{k: v for k, v in ck["config"].items() if k in CDFv13Config.__dataclass_fields__})
model = CDFv13Transformer(cfg)
model.load_state_dict(ck["model_state"])
pfe = PositionalFeatureEmbed(cfg.d_model)
pfe.load_state_dict(ck["pos_feat_state"])
```
Scope
GEMEO leads on novelty (new-onset) and long-context outcomes (discontinuation, time-to-transition, will-change). For single-step procedure transitions, count-based methods remain competitive; the world model's value is in long-range trajectory reasoning. Rigorous counterfactual/interventional validation and the full three-pillar loop (KG proposer + agentic verifier) are not covered by this release; both extend naturally to a multimodal substrate (notes, WES, labs).
Citation
```bibtex
@misc{gemeo_sus_2026,
  title  = {GEMEO/SUS: Recurrence-Aware Patient World Model for Rare Disease},
  author = {Timmers, Dimas and the Raras AI team},
  year   = {2026},
  url    = {https://huggingface.co/Raras-AI/gemeo-sus}
}
```