Getting started
Muster is a monorepo of small packages around one CLI. Initialize a workspace, point it at a provider, run, and check the ledger.
1 · Initialize
$ muster init
Created or reused Muster config: .muster/config.json
Default provider: Codex CLI with model gpt-5.5
Next: muster doctor
muster doctor checks the config and the routing policy (one active runtime per run), then probes the /models endpoint of every configured provider. muster doctor --fix repairs the config and the data directory.
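The provider probe can be pictured roughly as follows. This is a minimal sketch, not Muster's implementation: it assumes an OpenAI-compatible server, where GET <base-url>/models returns an object of the form { data: [{ id }] }. The names modelsUrl, parseModelIds, probeProvider and the injected fetchJson helper are hypothetical.

```typescript
// Hypothetical sketch of a provider probe; not Muster's actual code.
type FetchJson = (url: string, headers: Record<string, string>) => Promise<unknown>;

// Join a base URL such as "http://localhost:8000/v1" with the /models path.
function modelsUrl(baseUrl: string): string {
  return baseUrl.replace(/\/+$/, "") + "/models";
}

// Extract model ids from an OpenAI-style list response: { data: [{ id: "..." }] }.
function parseModelIds(body: unknown): string[] {
  if (typeof body !== "object" || body === null) return [];
  const data = (body as { data?: unknown }).data;
  if (!Array.isArray(data)) return [];
  const ids: string[] = [];
  for (const entry of data) {
    if (typeof entry === "object" && entry !== null) {
      const id = (entry as { id?: unknown }).id;
      if (typeof id === "string") ids.push(id);
    }
  }
  return ids;
}

// Probe one provider. Failures are reported rather than thrown, so a
// doctor-style command can keep checking the remaining providers.
async function probeProvider(
  baseUrl: string,
  apiKey: string | undefined,
  fetchJson: FetchJson,
): Promise<{ ok: boolean; models: string[]; error?: string }> {
  const headers: Record<string, string> = {};
  if (apiKey) headers["authorization"] = "Bearer " + apiKey;
  try {
    const models = parseModelIds(await fetchJson(modelsUrl(baseUrl), headers));
    return { ok: models.length > 0, models };
  } catch (err) {
    return { ok: false, models: [], error: String(err) };
  }
}
```

Injecting fetchJson keeps the sketch independent of any particular HTTP client; in practice it could be a thin wrapper around fetch that returns the parsed JSON body.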
2 · Add a provider
$ muster provider add openai  # any preset id; see Providers below
$ muster provider add anthropic --model claude-sonnet-4-6
$ muster provider add-openai-compatible myllm http://localhost:8000/v1 served-model
$ muster runtime use-provider native openai  # bind the native runtime to it
3 · Run
$ muster run "Summarize what changed on uat-erp since Friday." --scope user:me
recalled 1 scoped memories into context
run=ea59ad5f… runtime=native model=stub/stub-model task=simple_qa status=completed
tokens in=54~ out=64~
Every run is recorded as an episode with scoped memory recall and token estimates. Options: --runtime pi, --provider anthropic, --model …, --session memory|create|continue, --task-kind coding, --sensitive.
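The summary that muster run prints is a sequence of space-separated key=value pairs, which makes it easy to pick apart in scripts. The helper below is a hypothetical sketch (parseRunSummary is not part of Muster) and assumes values never contain spaces, as in the sample output above.

```typescript
// Hypothetical helper: split a `key=value key=value ...` summary line into a record.
// Assumes values contain no spaces, as in the sample run output.
function parseRunSummary(line: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const token of line.trim().split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq <= 0) continue; // skip tokens that are not key=value pairs
    fields[token.slice(0, eq)] = token.slice(eq + 1);
  }
  return fields;
}

const summary = parseRunSummary(
  "run=ea59ad5f… runtime=native model=stub/stub-model task=simple_qa status=completed",
);
// summary.runtime is "native"; summary.model is "stub/stub-model"
```

For anything beyond quick scripting, the recorded episodes are likely a better source than scraped terminal output.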
4 · Audit
$ muster tokens    # per-run ledger + totals by model + waste flags
$ muster episodes  # every recorded run
$ muster verify    # line-level integrity check of all JSONL stores
$ muster status    # one-screen mission control
Want to see what these look like with real output? The portal preview is rendered entirely from a captured workspace.
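To make "line-level integrity check" concrete, here is a minimal sketch of what validating a JSONL store line by line could look like. It is an illustration only: checkJsonl and LineProblem are hypothetical names, and the real muster verify may check more (or different) properties than "every line is a JSON object".

```typescript
// Hypothetical line-level check for a JSONL store; not Muster's implementation.
interface LineProblem {
  line: number; // 1-based line number
  reason: string;
}

function checkJsonl(text: string): LineProblem[] {
  const problems: LineProblem[] = [];
  const lines = text.split("\n");
  lines.forEach((raw, index) => {
    const line = raw.trim();
    // A trailing newline leaves one empty final element; that is not an error.
    if (line === "" && index === lines.length - 1) return;
    if (line === "") {
      problems.push({ line: index + 1, reason: "empty line" });
      return;
    }
    try {
      const value: unknown = JSON.parse(line);
      if (typeof value !== "object" || value === null || Array.isArray(value)) {
        problems.push({ line: index + 1, reason: "not a JSON object" });
      }
    } catch {
      problems.push({ line: index + 1, reason: "invalid JSON" });
    }
  });
  return problems;
}
```

Reporting line numbers rather than failing on the first error lets an operator see every damaged record in one pass.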
Providers
Muster is not tied to any single API or AI provider. Anything speaking the OpenAI-compatible chat protocol works out of the box; CLI runtimes (Claude Code, Codex, Pi) and local servers are first-class too. List presets any time with muster provider presets. Source: packages/core/src/providers-catalog.ts.
Cloud providers
| id | provider | key env | default model |
|---|---|---|---|
| openai | OpenAI | OPENAI_API_KEY | gpt-5.4 |
| anthropic | Anthropic Claude (API) | ANTHROPIC_API_KEY | claude-sonnet-4-6 |
| xai | xAI Grok | XAI_API_KEY | grok-4 |
| kimi | Moonshot Kimi | MOONSHOT_API_KEY | kimi-k2-0905-preview |
| deepseek | DeepSeek | DEEPSEEK_API_KEY | deepseek-chat |
| mistral | Mistral | MISTRAL_API_KEY | mistral-large-latest |
| gemini | Google Gemini (OpenAI-compatible endpoint) | GEMINI_API_KEY | gemini-2.5-pro |
| qwen | Alibaba Qwen (DashScope) | DASHSCOPE_API_KEY | qwen-max |
| zhipu | Zhipu GLM | ZHIPU_API_KEY | glm-4.6 |
| perplexity | Perplexity | PERPLEXITY_API_KEY | sonar-pro |
| groq | Groq | GROQ_API_KEY | llama-3.3-70b-versatile |
| cerebras | Cerebras | CEREBRAS_API_KEY | llama-3.3-70b |
Aggregators (one key, many models)
| id | provider | key env | default model |
|---|---|---|---|
| openrouter | OpenRouter | OPENROUTER_API_KEY | anthropic/claude-sonnet-4.6 |
| together | Together AI | TOGETHER_API_KEY | meta-llama/Llama-3.3-70B-Instruct-Turbo |
| fireworks | Fireworks AI | FIREWORKS_API_KEY | accounts/fireworks/models/llama-v3p3-70b-instruct |
Local / self-hosted (open source, air-gap friendly)
| id | provider | base URL | notes |
|---|---|---|---|
| lmstudio | LM Studio (local) | http://localhost:1234/v1 | No API key. Start the LM Studio server first. |
| vllm | vLLM (self-hosted) | http://localhost:8000/v1 | Point --base-url at your vLLM server. |
| sglang | SGLang (self-hosted) | http://localhost:30000/v1 | Point --base-url at your SGLang server. |
CLI-auth runtimes (subscription auth, no raw API key)
- codex-cli: OpenAI Codex CLI, uses your local `codex` login (default gpt-5.5)

Also available without presets:

- any OpenAI-compatible endpoint: muster provider add-openai-compatible <id> <base-url> <model> [--api-key-env VAR]
- Claude Code CLI runtime: muster run "..." --runtime claude-code (uses your local `claude` login)
- Pi-managed providers: muster run "..." --runtime pi --provider anthropic (uses Pi auth)
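"OpenAI-compatible" means in practice that the server accepts a POST to <base-url>/chat/completions with a model name and a list of messages. The sketch below builds such a request without sending it; buildChatRequest, ChatMessage and ChatRequest are illustrative names and say nothing about how Muster constructs its requests internally.

```typescript
// Sketch of the request an OpenAI-compatible provider expects; names are illustrative.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build (but do not send) a chat completion request for any OpenAI-compatible base URL.
function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  apiKey?: string, // local servers such as LM Studio need no key
): ChatRequest {
  const headers: Record<string, string> = { "content-type": "application/json" };
  if (apiKey) headers["authorization"] = "Bearer " + apiKey;
  return {
    url: baseUrl.replace(/\/+$/, "") + "/chat/completions",
    headers,
    body: JSON.stringify({ model, messages }),
  };
}

const req = buildChatRequest("http://localhost:8000/v1", "served-model", [
  { role: "user", content: "ping" },
]);
// req.url is "http://localhost:8000/v1/chat/completions"
```

Because only the base URL, model name and optional key vary, the same shape covers cloud providers, aggregators and local servers alike.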
Flows
Flows are deterministic pipelines-as-data: validated definitions, durable runs, budgets, and approval gates that show the approver the actual output, not a step name. Source spec: docs/FLOW_ENGINE_SPEC.md.
id: weekly-ticket-digest
budgetTokens: 50000  # hard ceiling, run aborts cleanly past it
steps:
  - id: fetch
    kind: tool  # deterministic tool call, no model
    tool: frappe_dataset_fetch_for_artifact
    args: { doctype: HD Ticket, filters: { status: Open } }
  - id: summarize
    kind: agent  # model step, routed via normal run loop
    prompt: "Summarize these tickets per team: {{fetch.rows}}"
    taskKind: artifact
  - id: approve
    kind: gate  # halts; resumable
    show: summarize.text  # approver sees the ACTUAL output, not a step name
    expiresHours: 48
  - id: post
    kind: tool
    tool: frappe_records_create
    when: approve.granted
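The way a gate halts a run and a later approval resumes it can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the engine described in docs/FLOW_ENGINE_SPEC.md: the Step and RunState types and the advance function are hypothetical, budgets and rejection are left out, and a step whose when condition is not met is simply skipped.

```typescript
// Hypothetical sketch of gate-halting execution; not the real flow engine.
type Step =
  | { id: string; kind: "tool" | "agent"; when?: string }
  | { id: string; kind: "gate"; show: string };

interface RunState {
  status: "running" | "awaiting_approval" | "completed";
  done: string[]; // ids of finished (or skipped) steps, in order
  granted: string[]; // ids of gates that have been approved
}

// Execute steps in order, skipping finished ones, and stop at the first unapproved gate.
function advance(steps: Step[], state: RunState, execute: (id: string) => void): RunState {
  const done = state.done.slice();
  for (const step of steps) {
    if (done.indexOf(step.id) !== -1) continue;
    if (step.kind === "gate") {
      if (state.granted.indexOf(step.id) === -1) {
        return { ...state, done, status: "awaiting_approval" };
      }
    } else {
      // `when: approve.granted` runs the step only if that gate was approved.
      const gate = step.when?.replace(/\.granted$/, "");
      if (gate === undefined || state.granted.indexOf(gate) !== -1) execute(step.id);
    }
    done.push(step.id);
  }
  return { ...state, done, status: "completed" };
}

const steps: Step[] = [
  { id: "fetch", kind: "tool" },
  { id: "summarize", kind: "agent" },
  { id: "approve", kind: "gate", show: "summarize.text" },
  { id: "post", kind: "tool", when: "approve.granted" },
];
const ran: string[] = [];
const start: RunState = { status: "running", done: [], granted: [] };
const halted = advance(steps, start, (id) => { ran.push(id); });
// halted.status is "awaiting_approval"; only fetch and summarize have run
```

Calling advance again with the gate id added to granted picks up where the run stopped, which is the property that lets gate state live in the run record rather than in process memory.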
Core properties
- Preflight: muster flow check <id> validates schema, tool existence, template variables and permissions before anything runs.
- Durable runs as data: every step result appends to .muster/data/flows/<run>.jsonl; gate state lives in the run record, not a magic token, so it survives restarts.
- Replay & diff: muster flow replay <run> and muster flow diff <run-a> <run-b> provide regression detection for automations.
- Budgeted: per-flow token ceiling enforced by the ledger; cost reported per step in muster tokens.
- Loops: muster flow loop <flow-id> --cron "0 9 * * 1" binds a flow to the scheduler.
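One of the preflight checks listed above, template variable validation, is easy to illustrate: every {{step.field}} reference in a prompt should name a step that appears earlier in the flow. The following is a minimal sketch of that single check; StepDef and unresolvedTemplateRefs are hypothetical names, and the real muster flow check covers more than this.

```typescript
// Hypothetical preflight check: every {{step.field}} must name an earlier step.
interface StepDef {
  id: string;
  prompt?: string;
}

function unresolvedTemplateRefs(steps: StepDef[]): string[] {
  const seen: string[] = [];
  const errors: string[] = [];
  for (const step of steps) {
    // Matches {{name.anything}} and captures the step name before the dot.
    const pattern = /\{\{\s*([A-Za-z0-9_-]+)\.[^}]*\}\}/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(step.prompt ?? "")) !== null) {
      const ref = match[1] ?? "";
      if (seen.indexOf(ref) === -1) {
        errors.push(step.id + ": unknown step '" + ref + "'");
      }
    }
    seen.push(step.id); // a step becomes referenceable only after it is defined
  }
  return errors;
}
```

Running this over the weekly-ticket-digest definition would return no errors, since {{fetch.rows}} refers to the fetch step defined before summarize.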
Lifecycle
$ muster flow save deploy-digest.json    # validate + store the definition
$ muster flow check deploy-digest        # preflight=ok
$ muster flow run deploy-digest          # runs until a gate → awaiting_approval
$ muster flow approve flowrun_89e4607c   # resumes; remaining steps execute
$ muster flow show flowrun_89e4607c      # full step-by-step run record
A complete captured lifecycle — gate halt, approval, resume — is on the portal preview under Flows.
Gateway & surfaces
One gateway process, one message envelope, any frontend. Six webhook adapters ship in packages/gateway/src/adapters — telegram, slack, discord, whatsapp, gchat, teams — plus a browser client in packages/surface. Discord interactions are verified with native ed25519 signatures.
$ muster gateway init              # writes .muster/gateway.json (token inside)
$ muster gateway start --port 7460
$ muster pairing list              # pending senders
$ muster pairing approve <code>    # first message from an unknown sender requires pairing
Web client
The whole client is one authenticated POST (from packages/surface/demo/index.html):
const surface = (url, token) => ({
  async send(text) {
    const r = await fetch(url + "/v1/messages", { method: "POST",
      headers: { "content-type": "application/json", authorization: "Bearer " + token },
      body: JSON.stringify({ surfaceId: "web:demo", conversationId: "demo",
                             senderId: "demo-user", text }) });
    return r.json();
  }
});
If the sender is unpaired, the gateway answers status=pairing_required with a code; the operator approves it with muster pairing approve <code>. Chat adapters speak the same envelope, so a Telegram DM and a Teams message hit the same run loop with the same governance.
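A client will usually want to branch on the pairing case rather than treat it as a failure. The sketch below shows one way to do that. Only the status value pairing_required comes from the description above; the GatewayReply shape, the code field name and describeReply are assumptions made for illustration.

```typescript
// Sketch of client-side handling; the reply shape beyond `status` is an assumption.
interface GatewayReply {
  status: string;
  code?: string; // assumed field name for the pairing code
}

function describeReply(reply: GatewayReply): string {
  if (reply.status === "pairing_required") {
    return reply.code
      ? "Ask the operator to run: muster pairing approve " + reply.code
      : "Pairing required, but no code was returned.";
  }
  return "Message accepted with status " + reply.status + ".";
}
```

A web client could show the returned string to the user and retry the original message once the operator has approved the code.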
Capability packs
Capability packs are sandboxed tool bundles with a manifest, declared permissions, secrets and evals. Inspect before you load:
$ muster capability inspect capability-packs/frappe
$ muster capability load capability-packs/frappe [--allow-high-risk]
frappe — Frappe/ERPNext Federated Bridge
From capability-packs/frappe/manifest.json (v0.1.0): permission-scoped Frappe/ERPNext tools with federated OAuth identity. Every read and write executes as the paired Frappe user; Frappe remains the only authorization authority. Sandbox: network_limited; declared secrets: FRAPPE_SITE_URL, FRAPPE_API_TOKEN; ships with identity-resolution and permission-scoping eval suites.
| implemented tools | planned tools |
|---|---|
| frappe_identity_resolve | frappe_customization_context |
| frappe_semantic_data_resolve_lite | frappe_semantic_data_resolve · frappe_records_update |
| frappe_records_create | frappe_permission_check · frappe_dataset_fetch_for_artifact |
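The "inspect before you load" step amounts to comparing what a pack declares against what the environment provides. The sketch below illustrates that idea with a hypothetical manifest shape: PackManifest, its field names (including highRisk) and loadBlockers are assumptions for illustration, not the actual manifest.json schema.

```typescript
// Hypothetical manifest shape; field names are illustrative, not the real schema.
interface PackManifest {
  name: string;
  version: string;
  sandbox: string;
  secrets: string[];
  highRisk?: boolean;
}

// Return reasons a pack should not be loaded; an empty array means it may load.
function loadBlockers(
  manifest: PackManifest,
  env: Record<string, string | undefined>,
  allowHighRisk: boolean,
): string[] {
  const blockers: string[] = [];
  for (const secret of manifest.secrets) {
    if (!env[secret]) blockers.push("missing secret " + secret);
  }
  if (manifest.highRisk && !allowHighRisk) {
    blockers.push("high-risk pack requires --allow-high-risk");
  }
  return blockers;
}

// Values taken from the frappe pack description above.
const frappe: PackManifest = {
  name: "frappe",
  version: "0.1.0",
  sandbox: "network_limited",
  secrets: ["FRAPPE_SITE_URL", "FRAPPE_API_TOKEN"],
};
```

With only FRAPPE_SITE_URL set, loadBlockers(frappe, env, false) would report the missing FRAPPE_API_TOKEN instead of loading a pack that cannot authenticate.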
CLI reference
Verbatim from muster help (packages/cli/src/index.ts):
Muster v0

Usage:
muster init
muster doctor [--fix]
muster status
muster chat "your prompt"
muster claude inspect
muster claude ask "prompt" [--model sonnet] [--effort low] [--timeout-ms 30000]
muster episodes
muster feedback <episode-id> --useful|--not-useful [--correct] [--reason "..."]
muster candidates
muster eval seed <episode-id> [--expect "..."] [--forbid "..."]
muster eval run [path-or-dir]
muster capability inspect <path>
muster capability load <path> [--allow-high-risk]
muster context graph [episode-id] [--scope tenant:hybrow] [--latest]
muster memory add --summary "..." --scope user:me --provenance manual
muster memory search --scope user:me [--query "..."] [--include-global]
muster memory promote <memory-id> --to tenant:acme [--allow-global]
muster tui
muster tui ask "your prompt"
muster provider list
muster provider add-openai-compatible <id> <base-url> <model> [--api-key-env OPENAI_API_KEY]
muster provider add-codex-cli <id> <model>
muster provider presets
muster provider add <preset> [--model X] [--api-key-env VAR] [--base-url URL] (openai, anthropic, xai, kimi, deepseek, groq, openrouter, ...)
muster runtime use-provider <runtime-id> <provider-id> [model]
muster pi inspect [--home /path/to/home]
muster pi models [--provider anthropic] [--available] [--agent-dir ~/.pi/agent]
muster pi tools [--agent-dir ~/.pi/agent] [--tools read,grep,find,ls]
muster pi commands [--agent-dir ~/.pi/agent] [--tools read,grep,find,ls]
muster pi tui ["optional startup prompt"] [--agent-dir ~/.pi/agent] [--session create|continue|memory] [--session-dir path]
muster pi ask "prompt" [--provider openai] [--model gpt-4o-mini] [--transport sdk|cli] [--session memory|create|continue] [--session-dir path] [--timeout-ms 30000]
muster state export [--output packages/ui/public/muster-state.json]
muster state show
muster migrate openclaw --dry-run
muster migrate hermes --dry-run
muster migrate pi --dry-run
muster run "prompt" [--runtime pi] [--provider anthropic] [--model claude-sonnet-4-5] [--session memory|create|continue] [--scope user:me] [--task-kind coding] [--sensitive]
muster tokens [--limit 20]
muster profile create|list|use|current [name]
muster schedule add "*/5 * * * *" "prompt" | list | remove <id> | run-due
muster evolve <suite.json> [--runtime pi] [--provider anthropic] [--model ...] [--iterations 2]
muster evolve selfcheck
muster flow save <file.json> | list | check <id> | run <id>
muster flow runs | show <run-id> | approve <run-id> | reject <run-id>
muster gateway init
muster gateway start [--port 7460]
muster pairing list | approve <code>
muster flow replay <run-id> [--live-agents]
muster flow diff <run-id-a> <run-id-b>
muster flow loop <flow-id> --cron "0 9 * * 1"
muster verify

Design rule:
One active runtime per run. Providers/models can route dynamically by task.