Acceptance gates
The eight phase gates that keep pragmatiq honest — what each asserts and how to run it.
pragmatiq is built in eight phases, each with an acceptance gate: an executable script under scripts/gates/ that must pass before the next phase begins. Gates are the reproducible contract: they run the real pipeline and assert the property that phase promises.
| Gate | Asserts |
|---|---|
| gate_1 | Synthetic generator acceptance: realism + byte-identical determinism. |
| gate_2 | Key–value–time tokenizer: round-trip, [UNK] fallbacks, vocab range. |
| gate_3 | Sharding + varlen collation: packed (varlen) attention ≡ padded attention. |
| gate_4 | Model: the encoder stack and tied MLM head match the spec. |
| gate_5 | End-to-end training: the probe beats the same-classifier raw-count baseline. |
| gate_6 | AML GNN ablation: a graph over transfers recovers signal an isolated probe misses. |
| gate_7 | Inference / serving / demo. |
| gate_8 | Polish: nano end-to-end on CPU, validation, packaging + attribution. |
Running them

    bash scripts/gates/gate_5.sh                         # CI scale (small N, CPU, minutes)
    PRAGMATIQ_GATE_FULL=1 bash scripts/gates/gate_5.sh   # full scale (GPU box)

Each gate is runnable standalone and honors PRAGMATIQ_GATE_FULL=1 for full-scale runs.
The CI-scale defaults run on a CPU in minutes; the full-scale variants (50k–100k users) are
intended for a GPU host — see Pretrain (and scale).
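
As an illustration of the scale switch, a gate script could branch on the variable roughly as follows. This is only a sketch: the helper name `gate_scale` is hypothetical, and treating exactly the value `1` as "full" (with anything else falling back to CI scale) is an assumption, not something the real gate scripts are confirmed to do.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of pragmatiq: report which scale to run at.
# Assumption: only the exact value "1" selects the full-scale run.
gate_scale() {
  if [ "${PRAGMATIQ_GATE_FULL:-0}" = "1" ]; then
    echo "full"   # full scale: intended for a GPU host
  else
    echo "ci"     # CI scale: small N, CPU, minutes
  fi
}
```

Using `${PRAGMATIQ_GATE_FULL:-0}` keeps the check safe when the variable is unset, so the CI-scale default needs no configuration.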
Never start phase N+1 with a red gate N
The gates are ordered. A red gate means the property it guards is broken; the fix comes before any new work on top of it.
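
The ordering rule can be enforced mechanically by running the gates in sequence and stopping at the first red one. The sketch below is not a script shipped with pragmatiq: `run_gates` is a hypothetical name, and it assumes every gate follows the `gate_N.sh` naming seen in `gate_5.sh` and signals a red gate with a nonzero exit status.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of pragmatiq: run gate_1..gate_8 in order
# and stop at the first red gate.
# Assumptions: gates are named gate_N.sh and exit nonzero when red.
run_gates() {
  local dir="${1:-scripts/gates}"   # gate directory; defaults to the documented path
  local n
  for n in 1 2 3 4 5 6 7 8; do
    echo "gate_${n}: running"
    if ! bash "${dir}/gate_${n}.sh"; then
      # Report on stderr and skip every later gate.
      echo "gate_${n}: red, fix it before any work that builds on it" >&2
      return 1
    fi
    echo "gate_${n}: green"
  done
}
```

Sourcing this and calling `run_gates` from the repository root would run the CI-scale defaults. Stopping early is deliberate: a later gate's result means little while an earlier property is broken.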