AML GNN ablation

Run the three-way GraphSAGE ablation over the transfer graph and read the result.

This is the hands-on runbook for pragmatiq's AML graph extension (our addition, not in the paper). The concept page explains why the extension exists; this page covers how to run it.

Build a run with transfers + AML labels

pip install -e ".[gnn]"
pragmatiq synth generate --out data/aml --n-users 12000 --n-workers 8 \
  --config configs/data/synthetic.yaml         # produces transfers.parquet + labels/aml.parquet
pragmatiq tokenize data/aml --out data/aml/tok
pragmatiq pretrain data/aml/tok --name aml --model-size small
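
Before starting the ablation, it can help to confirm that the earlier steps wrote the paths the next command reads. The sketch below is a hypothetical helper, not part of pragmatiq; the three relative paths come from the commands above, and it assumes they all sit under data/aml.

```python
from pathlib import Path

# Relative paths taken from the commands above; the helper itself is hypothetical.
EXPECTED = (
    "transfers.parquet",   # written by `pragmatiq synth generate`
    "labels/aml.parquet",  # written by `pragmatiq synth generate`
    "tok",                 # --out target of `pragmatiq tokenize`
)


def missing_outputs(root):
    """Return the expected paths under `root` that do not exist yet."""
    root = Path(root)
    return [str(root / rel) for rel in EXPECTED if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_outputs("data/aml")
    print("ready for the ablation" if not missing else f"missing: {missing}")
```

An empty list means every path is present. The check looks only at existence, not at file contents or schema.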

Run the ablation

pragmatiq gnn data/aml/tok --run runs/aml \
  --transfers data/aml/transfers.parquet \
  --aml-label data/aml/labels/aml.parquet \
  --epochs 150

This evaluates three setups and reports held-out ROC-AUC (mean ± std over seeds) for each: (a) a probe on isolated embeddings, (b) GraphSAGE + pragmatiq features, (c) GraphSAGE + hand-crafted features.
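
To make the reported number concrete, here is a minimal stdlib-only sketch of ROC-AUC on a held-out set and of the mean ± std summary across seeds. It is not pragmatiq's implementation: the function names are hypothetical, and it assumes the sample standard deviation, which may differ from what the CLI prints.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """ROC-AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(per_seed_auc):
    """Return (mean, sample std) of per-seed ROC-AUC values; needs two or more seeds."""
    return mean(per_seed_auc), stdev(per_seed_auc)
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a rank-based implementation would scale better on a full book.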

Read it

The gated claim is relational recovery: (b) > (a) by a clear margin, meaning the transfer graph carries AML signal that an isolated per-user embedding cannot see. One caveat: on the default synthetic book the mules are structurally distinctive, so hand-crafted degree is a strong baseline. The caveats are documented on the concept page and in the 04_aml_gnn notebook.
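
One way to turn "a clear margin" into a check is to compare per-seed ROC-AUC values for a graph setup against those for the isolated probe (a). The sketch below is an assumption throughout: the function names, the default margin of 0.02, and the rule that the gap must also exceed the seed-to-seed spread are placeholders, not pragmatiq's gate.

```python
from statistics import mean, stdev


def recovery_gap(graph_aucs, probe_aucs):
    """Mean ROC-AUC of the graph setup minus mean ROC-AUC of the isolated probe."""
    return mean(graph_aucs) - mean(probe_aucs)


def clears_margin(graph_aucs, probe_aucs, margin=0.02):
    """True if the gap exceeds both `margin` and the combined seed-to-seed spread.

    `margin` is a hypothetical placeholder, not a threshold defined by pragmatiq.
    """
    gap = recovery_gap(graph_aucs, probe_aucs)
    noise = stdev(graph_aucs) + stdev(probe_aucs)  # needs two or more seeds each
    return gap > margin and gap > noise
```

Requiring the gap to beat the spread as well as a fixed margin guards against reading a difference that is within seed noise as relational signal.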
