Paper fidelity

pragmatiq's goal is not novelty over PRAGMA; it is to make the implementation path concrete and honest. This page is the explicit ledger of what comes from the paper and what is ours.

Component	Source
Key–value–time tokenization + `8·ln(1+Δt/8)` time transform	Paper
TimeRoPE on continuous log-seconds (profile = since milestone; history = to last event)	Paper
Profile / event / history encoders; tied 3d MLM head + label smoothing	Paper
Masking 15% token / 10% event / 10% key, 10% `[UNK]`-as-dropout	Paper
Model sizes 10M / 100M / 1B	Paper
Pre-training caps (event ≤24, profile ≤200, ≤6500 events/user)	Paper
PRAGMA+Nemotron frozen-text-embedding variant (MSE)	Paper · opt-in
Synthetic data generator	Our addition
AML transfer-graph GraphSAGE ablation	Our addition
Gradient-boosting downstream probe	Our default
`nano` CPU/CI model size	Our addition
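
As an illustration of the first row, the `8·ln(1+Δt/8)` time transform can be sketched in a few lines. This is a minimal sketch, not pragmatiq's actual code: the function name `time_transform` and the `scale` parameter are hypothetical, and the unit of `Δt` is an assumption left to the caller, since this page does not state it.

```python
import math


def time_transform(delta_t: float, scale: float = 8.0) -> float:
    """Compress a non-negative time gap as scale * ln(1 + delta_t / scale).

    `scale` defaults to the constant 8 from the formula on this page.
    The name and signature are illustrative, not pragmatiq's API.
    """
    if delta_t < 0:
        raise ValueError("delta_t must be non-negative")
    # log1p keeps precision when delta_t / scale is very small.
    return scale * math.log1p(delta_t / scale)


if __name__ == "__main__":
    for gap in (0.0, 1.0, 8.0, 80.0, 800.0):
        print(f"{gap:>6} -> {time_transform(gap):.3f}")
```

With this form, gaps well below 8 pass through almost unchanged, while larger gaps grow only logarithmically, so long silences do not dominate the encoded time axis.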

How fidelity is kept honest

The core representation above is implemented to match the paper, and every module is checked against a single internal spec.
Where the paper is silent on an engineering detail, pragmatiq picks a default, exposes it in config, and records it — see the Decisions log.
Where pragmatiq goes beyond the paper (the generator, the AML graph), it is presented standalone and labelled, never folded into a fidelity claim.

The paper uses real Revolut data; pragmatiq cannot, so the synthetic generator stands in. That is the single largest gap between this implementation and the paper's results, and it is why the headline numbers here are about signal recovery on synthetic data, not about matching the paper's absolute metrics.

pragmatiq is an independent implementation inspired by the PRAGMA paper (arXiv 2604.08649) and is not affiliated with or endorsed by Revolut.
