CLI

The pragmatiq command-line interface — a thin wrapper over the Python API.

The CLI is intentionally thin: each command parses arguments and calls a function in pragmatiq/api.py, so the CLI, notebooks, and production callers all use the same library surface. Run any command with --help for its full options.
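As a rough illustration of the "parse arguments, then call the library" pattern described above, here is a minimal sketch. It assumes argparse, although this page does not say which CLI framework pragmatiq uses. The `validate` function is a hypothetical stand-in for a function in pragmatiq/api.py, whose real name and signature are not shown here.

```python
import argparse


def validate(dataset: str) -> dict:
    """Hypothetical stand-in for a function in pragmatiq/api.py.

    The real function name, signature, and return value are assumptions.
    """
    return {"dataset": dataset, "ok": True}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand that only forwards to the API."""
    parser = argparse.ArgumentParser(prog="pragmatiq")
    sub = parser.add_subparsers(dest="command", required=True)

    p_validate = sub.add_parser(
        "validate", help="Check a dataset against the data contract."
    )
    p_validate.add_argument("dataset")
    # The handler holds no logic of its own: it unpacks the parsed
    # arguments and delegates to the library function.
    p_validate.set_defaults(func=lambda ns: validate(ns.dataset))
    return parser


def main(argv=None):
    """Parse argv and dispatch to the selected API function."""
    args = build_parser().parse_args(argv)
    return args.func(args)
```

With this shape, a notebook or production caller can call `validate("data/synth")` directly and get the same behaviour as the command line, because the handler adds nothing beyond argument parsing.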

Command                        What it does
pragmatiq synth generate       Generate a synthetic dataset (--config, --n-users, --seed, --n-workers).
pragmatiq synth calibrate      Fit generator priors to shareable aggregate statistics (no raw data).
pragmatiq validate             Check a dataset against the data contract.
pragmatiq tokenize             Fit the tokenizer and write tokenized shards + index.
pragmatiq pretrain             Pretrain a model (--model-size, --config, --resume auto).
pragmatiq embed                Embed every user with a trained model → parquet.
pragmatiq probe                Probe a model on a label table (--probe-model gbdt|logistic|lightgbm).
pragmatiq finetune             LoRA fine-tune a model's adapters + head on a label table.
pragmatiq uplift               Evaluate communication-campaign uplift.
pragmatiq gnn                  Run the AML transfer-graph GraphSAGE ablation.
pragmatiq runs list / compare  Inspect and compare runs.
pragmatiq benchmark            Throughput benchmark for embedding.
pragmatiq quickstart           End-to-end smoke: synth → tokenize → pretrain → probe.

A typical session

pragmatiq synth generate --out data/synth --config configs/data/synthetic.yaml
pragmatiq tokenize data/synth --out data/tok --n-workers 8
pragmatiq pretrain data/tok --name demo --model-size small --config configs/pretrain.yaml
pragmatiq embed data/tok --run runs/demo --out embeddings.parquet
pragmatiq probe data/tok --run runs/demo --label data/synth/labels/default_12m.parquet

Each command maps 1:1 onto a function documented in the Python API reference.
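The session above can also be driven from a script. The following is a minimal sketch, not part of pragmatiq: `SESSION` and `run_session` are hypothetical names, and the command lines are copied verbatim from the typical session. It defaults to a dry run that only prints the commands. With `dry_run=False` it assumes the `pragmatiq` executable is on `PATH`.

```python
import shlex
import subprocess

# Command lines copied from the typical session, in order.
SESSION = [
    "pragmatiq synth generate --out data/synth --config configs/data/synthetic.yaml",
    "pragmatiq tokenize data/synth --out data/tok --n-workers 8",
    "pragmatiq pretrain data/tok --name demo --model-size small --config configs/pretrain.yaml",
    "pragmatiq embed data/tok --run runs/demo --out embeddings.parquet",
    "pragmatiq probe data/tok --run runs/demo --label data/synth/labels/default_12m.parquet",
]


def run_session(lines=SESSION, dry_run=True):
    """Run each command in order and return the argv lists that were handled.

    With dry_run=True the commands are only printed. Otherwise each one is
    executed, and check=True raises CalledProcessError on the first failure
    so that later steps never run against missing outputs.
    """
    handled = []
    for line in lines:
        argv = shlex.split(line)
        if dry_run:
            print(shlex.join(argv))
        else:
            subprocess.run(argv, check=True)
        handled.append(argv)
    return handled
```

Stopping at the first failing step matters here because each command consumes the previous command's output directory or run.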
