Getting started

Muster is a monorepo of small packages built around one CLI. To get started, initialize a workspace, point it at a provider, run a prompt, and check the ledger.

1 · Initialize

$ muster init
Created or reused Muster config: .muster/config.json
Default provider: Codex CLI with model gpt-5.5
Next: muster doctor

muster doctor checks the config and the routing policy (one active runtime per run), then probes every configured provider's /models endpoint. muster doctor --fix repairs the config and data directory.

2 · Add a provider

$ muster provider add openai                 # any preset id — see Providers below
$ muster provider add anthropic --model claude-sonnet-4-6
$ muster provider add-openai-compatible myllm http://localhost:8000/v1 served-model
$ muster runtime use-provider native openai  # bind the native runtime to it

3 · Run

$ muster run "Summarize what changed on uat-erp since Friday." --scope user:me
recalled 1 scoped memories into context
run=ea59ad5f… runtime=native model=stub/stub-model task=simple_qa status=completed
tokens in=54~ out=64~

Every run is recorded as an episode with scoped memory recall and token estimates. Options: --runtime pi, --provider anthropic, --model …, --session memory|create|continue, --task-kind coding, --sensitive.
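
When Muster is driven from a script, the options above can be assembled into an argument vector. The following is a minimal sketch and not part of Muster: `buildRunArgs` and `RunOptions` are hypothetical names, and only the flags listed above are used.

```typescript
// Hypothetical helper: maps typed options onto the `muster run` flags listed above.
type RunOptions = {
  runtime?: string;                            // --runtime pi
  provider?: string;                           // --provider anthropic
  model?: string;                              // --model <name>
  session?: "memory" | "create" | "continue";  // --session memory|create|continue
  scope?: string;                              // --scope user:me
  taskKind?: string;                           // --task-kind coding
  sensitive?: boolean;                         // --sensitive (flag without a value)
};

function buildRunArgs(prompt: string, opts: RunOptions = {}): string[] {
  const args: string[] = ["run", prompt];
  const valued: Array<[string, string | undefined]> = [
    ["--runtime", opts.runtime],
    ["--provider", opts.provider],
    ["--model", opts.model],
    ["--session", opts.session],
    ["--scope", opts.scope],
    ["--task-kind", opts.taskKind],
  ];
  // Only options that were actually set produce a flag/value pair.
  for (const [flag, value] of valued) {
    if (value !== undefined) args.push(flag, value);
  }
  if (opts.sensitive) args.push("--sensitive");
  return args;
}

// Same invocation as the example in step 3, expressed as an argv array.
const example = buildRunArgs("Summarize what changed on uat-erp since Friday.", {
  scope: "user:me",
});
console.log(JSON.stringify(example));
```

Returning an array rather than a joined string keeps the prompt as a single argument, so no shell quoting is needed when the result is handed to a process spawner.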

4 · Audit

$ muster tokens        # per-run ledger + totals by model + waste flags
$ muster episodes      # every recorded run
$ muster verify        # line-level integrity check of all JSONL stores
$ muster status        # one-screen mission control
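
As an illustration of what a line-level check over a JSONL store involves, the sketch below parses each line independently and reports the 1-based numbers of lines that are not valid JSON objects. It shows the general technique only: the checks muster verify actually performs are not described here, and `checkJsonl` is a hypothetical name.

```typescript
type JsonlReport = { total: number; badLines: number[] };

// Parses a JSONL document line by line; one bad line never hides the others.
function checkJsonl(content: string): JsonlReport {
  const badLines: number[] = [];
  let total = 0;
  content.split("\n").forEach((line, index) => {
    if (line.trim() === "") return;            // tolerate blank lines and a trailing newline
    total += 1;
    try {
      const value: unknown = JSON.parse(line);
      // Assumption: each record is expected to be a JSON object, not a scalar or array.
      if (typeof value !== "object" || value === null || Array.isArray(value)) {
        badLines.push(index + 1);
      }
    } catch {
      badLines.push(index + 1);                // truncated or otherwise malformed line
    }
  });
  return { total, badLines };
}

// Three records, the second one truncated mid-write.
const sample = '{"run":"a","status":"completed"}\n{"run":"b","status":\n{"run":"c"}\n';
console.log(checkJsonl(sample)); // total is 3, badLines contains only line 2
```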

Want to see what these look like with real output? The portal preview is rendered entirely from a captured workspace.

Providers

Muster is not tied to any single API or AI provider. Anything that speaks the OpenAI-compatible chat protocol works out of the box; CLI runtimes (Claude Code, Codex, Pi) and local servers are first-class too. List the presets at any time with muster provider presets. Source: packages/core/src/providers-catalog.ts.

Cloud providers

id	provider	key env	default model
openai	OpenAI	OPENAI_API_KEY	gpt-5.4
anthropic	Anthropic Claude (API)	ANTHROPIC_API_KEY	claude-sonnet-4-6
xai	xAI Grok	XAI_API_KEY	grok-4
kimi	Moonshot Kimi	MOONSHOT_API_KEY	kimi-k2-0905-preview
deepseek	DeepSeek	DEEPSEEK_API_KEY	deepseek-chat
mistral	Mistral	MISTRAL_API_KEY	mistral-large-latest
gemini	Google Gemini (OpenAI-compatible endpoint)	GEMINI_API_KEY	gemini-2.5-pro
qwen	Alibaba Qwen (DashScope)	DASHSCOPE_API_KEY	qwen-max
zhipu	Zhipu GLM	ZHIPU_API_KEY	glm-4.6
perplexity	Perplexity	PERPLEXITY_API_KEY	sonar-pro
groq	Groq	GROQ_API_KEY	llama-3.3-70b-versatile
cerebras	Cerebras	CEREBRAS_API_KEY	llama-3.3-70b
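
To make the table concrete, here is a minimal sketch of how a preset id could resolve into a usable provider configuration. The `Preset` shape and `resolvePreset` are hypothetical and do not reproduce packages/core/src/providers-catalog.ts; the three entries are copied from the table above, and the optional override plays the role of --model.

```typescript
type Preset = { label: string; keyEnv: string; defaultModel: string };

// A few rows from the cloud provider table, keyed by preset id.
const PRESETS: Record<string, Preset> = {
  openai: { label: "OpenAI", keyEnv: "OPENAI_API_KEY", defaultModel: "gpt-5.4" },
  anthropic: { label: "Anthropic Claude (API)", keyEnv: "ANTHROPIC_API_KEY", defaultModel: "claude-sonnet-4-6" },
  groq: { label: "Groq", keyEnv: "GROQ_API_KEY", defaultModel: "llama-3.3-70b-versatile" },
};

type ResolvedProvider = { id: string; model: string; keyEnv: string; keyPresent: boolean };

// Looks up a preset, applies an optional model override, and reports whether
// the API key variable is set, without ever reading the key into the result.
function resolvePreset(
  id: string,
  env: Record<string, string | undefined>,
  modelOverride?: string,
): ResolvedProvider {
  if (!Object.prototype.hasOwnProperty.call(PRESETS, id)) {
    throw new Error("unknown preset: " + id);
  }
  const preset = PRESETS[id];
  const key = env[preset.keyEnv];
  return {
    id,
    model: modelOverride ?? preset.defaultModel,
    keyEnv: preset.keyEnv,
    keyPresent: typeof key === "string" && key.length > 0,
  };
}

console.log(resolvePreset("anthropic", { ANTHROPIC_API_KEY: "sk-example" }));
```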

Aggregators (one key, many models)

id	provider	key env	default model
openrouter	OpenRouter	OPENROUTER_API_KEY	anthropic/claude-sonnet-4.6
together	Together AI	TOGETHER_API_KEY	meta-llama/Llama-3.3-70B-Instruct-Turbo
fireworks	Fireworks AI	FIREWORKS_API_KEY	accounts/fireworks/models/llama-v3p3-70b-instruct

Local / self-hosted (open source, air-gap friendly)

id	provider	base URL	notes
lmstudio	LM Studio (local)	http://localhost:1234/v1	No API key. Start the LM Studio server first.
vllm	vLLM (self-hosted)	http://localhost:8000/v1	Point --base-url at your vLLM server.
sglang	SGLang (self-hosted)	http://localhost:30000/v1	Point --base-url at your SGLang server.
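
For reference, an OpenAI-compatible server expects a chat completion request of the following shape. The sketch only builds the request as plain data so it can be inspected without a server running; `buildChatRequest` is a hypothetical name, and leaving out the authorization header when no key is given is an assumption that suits keyless local servers such as LM Studio.

```typescript
type ChatRequest = {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
};

// Builds a POST to <base-url>/chat/completions with a single user message.
function buildChatRequest(baseUrl: string, model: string, prompt: string, apiKey?: string): ChatRequest {
  const headers: Record<string, string> = { "content-type": "application/json" };
  if (apiKey) headers["authorization"] = "Bearer " + apiKey;
  return {
    url: baseUrl.replace(/\/+$/, "") + "/chat/completions", // tolerate a trailing slash
    method: "POST",
    headers,
    body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
  };
}

// The vLLM base URL from the table, with the model name used in step 2.
const chatReq = buildChatRequest("http://localhost:8000/v1/", "served-model", "ping");
console.log(chatReq.url); // http://localhost:8000/v1/chat/completions
```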

CLI-auth runtimes (subscription auth, no raw API key)

codex-cli      OpenAI Codex CLI — uses your local `codex` login (default gpt-5.5)

Also available without presets:
  any OpenAI-compatible endpoint:  muster provider add-openai-compatible <id> <base-url> <model> [--api-key-env VAR]
  Claude Code CLI runtime:         muster run "..." --runtime claude-code   (uses your local `claude` login)
  Pi-managed providers:            muster run "..." --runtime pi --provider anthropic   (uses Pi auth)

Flows

Flows are deterministic pipelines-as-data: validated definitions, durable runs, budgets, and approval gates that show the approver the actual output, not a step name. Source spec: docs/FLOW_ENGINE_SPEC.md.

id: weekly-ticket-digest
budgetTokens: 50000          # hard ceiling, run aborts cleanly past it
steps:
  - id: fetch
    kind: tool               # deterministic tool call, no model
    tool: frappe_dataset_fetch_for_artifact
    args: { doctype: HD Ticket, filters: { status: Open } }
  - id: summarize
    kind: agent              # model step, routed via normal run loop
    prompt: "Summarize these tickets per team: {{fetch.rows}}"
    taskKind: artifact
  - id: approve
    kind: gate               # halts; resumable
    show: summarize.text     # approver sees the ACTUAL output, not a step name
    expiresHours: 48
  - id: post
    kind: tool
    tool: frappe_records_create
    when: approve.granted
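
The budgetTokens ceiling in this definition can be pictured as a running total that is checked before each step's usage is accepted. This is a minimal sketch under assumptions: `TokenBudget` is a hypothetical class, and whether Muster aborts before or after the step that crosses the ceiling is not specified here, so the sketch simply refuses the charge that would exceed it.

```typescript
type StepCost = { stepId: string; tokens: number };

class TokenBudget {
  private readonly ceiling: number;
  private readonly costs: StepCost[] = [];

  constructor(ceiling: number) {
    this.ceiling = ceiling;
  }

  // Total tokens accepted so far.
  spent(): number {
    return this.costs.reduce((sum, c) => sum + c.tokens, 0);
  }

  // Accepts the charge only if the total stays at or under the ceiling.
  tryCharge(stepId: string, tokens: number): boolean {
    if (this.spent() + tokens > this.ceiling) return false;
    this.costs.push({ stepId, tokens });
    return true;
  }

  // Per-step breakdown, returned as a copy so callers cannot mutate the ledger.
  perStep(): StepCost[] {
    return this.costs.slice();
  }
}

const budget = new TokenBudget(50000);
console.log(budget.tryCharge("summarize", 42000)); // true
console.log(budget.tryCharge("post", 9000));       // false: 51000 would exceed 50000
console.log(budget.spent());                       // 42000
```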

Core properties

Preflight — muster flow check <id> validates schema, tool existence, template variables and permissions before anything runs.
Durable runs as data — every step result appends to .muster/data/flows/<run>.jsonl; gate state lives in the run record, not a magic token, so it survives restarts.
Replay & diff — muster flow replay <run> and muster flow diff <run-a> <run-b>: regression detection for automations.
Budgeted — per-flow token ceiling enforced by the ledger; cost reported per step in muster tokens.
Loops — muster flow loop <flow-id> --cron "0 9 * * 1" binds a flow to the scheduler.
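
One of the preflight checks listed above, template-variable validation, can be sketched as follows: every {{step.field}} reference in a prompt must name a step that appears earlier in the flow. The `FlowStep` type and `checkTemplates` function are hypothetical and cover only this one rule; schema, tool-existence, and permission checks are left out.

```typescript
type FlowStep = { id: string; kind: "tool" | "agent" | "gate"; prompt?: string };

// Returns one message per template reference that does not point at an earlier step.
function checkTemplates(steps: FlowStep[]): string[] {
  const problems: string[] = [];
  const seen: Record<string, true> = Object.create(null);
  for (const step of steps) {
    // Matches {{stepId.anything}} and captures the step id.
    const pattern = /\{\{\s*([A-Za-z0-9_-]+)\.[^}]*\}\}/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(step.prompt ?? "")) !== null) {
      const ref = match[1];
      if (!seen[ref]) {
        problems.push('step "' + step.id + '" references unknown or later step "' + ref + '"');
      }
    }
    seen[step.id] = true; // a step becomes referenceable only after it has run
  }
  return problems;
}

// The fetch/summarize pair from the flow definition above passes the check.
const ok = checkTemplates([
  { id: "fetch", kind: "tool" },
  { id: "summarize", kind: "agent", prompt: "Summarize these tickets per team: {{fetch.rows}}" },
]);
console.log(ok.length); // 0

// Swapping the order makes the reference point forward, which is reported.
const bad = checkTemplates([
  { id: "summarize", kind: "agent", prompt: "Summarize these tickets per team: {{fetch.rows}}" },
  { id: "fetch", kind: "tool" },
]);
console.log(bad);
```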

Lifecycle

$ muster flow save deploy-digest.json     # validate + store the definition
$ muster flow check deploy-digest         # preflight=ok
$ muster flow run deploy-digest           # runs until a gate → awaiting_approval
$ muster flow approve flowrun_89e4607c    # resumes; remaining steps execute
$ muster flow show flowrun_89e4607c       # full step-by-step run record
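
The halt-and-resume behaviour in this lifecycle can be modelled as a small state machine over the step list. The sketch below is an in-memory illustration only: `FlowRun`, `startRun`, `advance`, and `approve` are hypothetical names, steps do no real work, conditions such as `when` are ignored, and the only status string taken from the text above is awaiting_approval.

```typescript
type RunStatus = "running" | "awaiting_approval" | "completed";
type GateStep = { id: string; kind: "tool" | "agent" | "gate" };
type FlowRun = { steps: GateStep[]; next: number; status: RunStatus; executed: string[] };

// Executes steps in order until a gate is reached or the list is exhausted.
function advance(run: FlowRun): FlowRun {
  const executed = run.executed.slice();
  let next = run.next;
  while (next < run.steps.length) {
    const step = run.steps[next];
    if (step.kind === "gate") {
      // Park on the gate; the position is part of the run record itself.
      return { steps: run.steps, next, status: "awaiting_approval", executed };
    }
    executed.push(step.id);
    next += 1;
  }
  return { steps: run.steps, next, status: "completed", executed };
}

function startRun(steps: GateStep[]): FlowRun {
  return advance({ steps, next: 0, status: "running", executed: [] });
}

// Approving moves past the gate the run is parked on and resumes execution.
function approve(run: FlowRun): FlowRun {
  if (run.status !== "awaiting_approval") throw new Error("run is not waiting on a gate");
  return advance({ steps: run.steps, next: run.next + 1, status: "running", executed: run.executed });
}

const halted = startRun([
  { id: "fetch", kind: "tool" },
  { id: "summarize", kind: "agent" },
  { id: "approve", kind: "gate" },
  { id: "post", kind: "tool" },
]);
console.log(halted.status, halted.executed); // awaiting_approval after fetch and summarize
const done = approve(halted);
console.log(done.status, done.executed);     // completed after fetch, summarize, post
```

Because the position and status are plain fields on the run value, the same record could be written to disk and picked up after a restart, which is the property the durable-runs bullet describes.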

A complete captured lifecycle — gate halt, approval, resume — is on the portal preview under Flows.

Gateway & surfaces

One gateway process, one message envelope, any frontend. Six webhook adapters ship in packages/gateway/src/adapters — telegram, slack, discord, whatsapp, gchat, teams — plus a browser client in packages/surface. Discord interactions are verified with native ed25519 signatures.

$ muster gateway init                # writes .muster/gateway.json (token inside)
$ muster gateway start --port 7460
$ muster pairing list                # pending senders
$ muster pairing approve <code>      # first message from an unknown sender requires pairing

Web client

The whole client is one authenticated POST (from packages/surface/demo/index.html):

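// Minimal surface client: each message is a single authenticated POST to the gateway.
const surface = (url, token) => ({
  async send(text) {
    const r = await fetch(url + "/v1/messages", { method: "POST",
      headers: { "content-type": "application/json", authorization: "Bearer " + token },
      // The envelope identifies the surface, the conversation, and the sender.
      body: JSON.stringify({ surfaceId: "web:demo", conversationId: "demo",
                             senderId: "demo-user", text }) });
    return r.json();  // parsed gateway reply
  }
});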

If the sender is unpaired, the gateway answers status=pairing_required with a code; the operator approves it with muster pairing approve <code>. Chat adapters speak the same envelope, so a Telegram DM and a Teams message hit the same run loop with the same governance.
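
The pairing handshake described above amounts to a small registry keyed by sender. The sketch below is an in-memory illustration with hypothetical names (`PairingRegistry`, `handle`, `approve`); only the pairing_required status and the operator-approved code come from the text, and the code format shown is made up.

```typescript
type GatewayReply =
  | { status: "pairing_required"; code: string }
  | { status: "accepted" };

class PairingRegistry {
  // Prototype-free objects, so sender ids like "constructor" cannot collide with built-ins.
  private readonly paired: Record<string, true> = Object.create(null);
  private readonly pendingBySender: Record<string, string> = Object.create(null);
  private counter = 0;

  // Called for every inbound message; unknown senders get a pairing code instead of a run.
  handle(senderId: string): GatewayReply {
    if (this.paired[senderId]) return { status: "accepted" };
    let code: string | undefined = this.pendingBySender[senderId];
    if (code === undefined) {
      this.counter += 1;
      code = "PAIR-" + this.counter; // placeholder; a real gateway would use an unguessable code
      this.pendingBySender[senderId] = code;
    }
    return { status: "pairing_required", code };
  }

  // Operator action, analogous to `muster pairing approve <code>`.
  approve(code: string): boolean {
    for (const senderId of Object.keys(this.pendingBySender)) {
      if (this.pendingBySender[senderId] === code) {
        delete this.pendingBySender[senderId];
        this.paired[senderId] = true;
        return true;
      }
    }
    return false;
  }
}

const registry = new PairingRegistry();
const first = registry.handle("demo-user");
console.log(first.status);                        // pairing_required
if (first.status === "pairing_required") registry.approve(first.code);
console.log(registry.handle("demo-user").status); // accepted
```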

Capability packs

Capability packs are sandboxed tool bundles with a manifest, declared permissions, secrets and evals. Inspect before you load:

$ muster capability inspect capability-packs/frappe
$ muster capability load capability-packs/frappe [--allow-high-risk]

frappe — Frappe/ERPNext Federated Bridge

From capability-packs/frappe/manifest.json (v0.1.0): permission-scoped Frappe/ERPNext tools with federated OAuth identity. Every read and write executes as the paired Frappe user; Frappe remains the only authorization authority. Sandbox: network_limited; declared secrets: FRAPPE_SITE_URL, FRAPPE_API_TOKEN; ships with identity-resolution and permission-scoping eval suites.
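
Since a pack declares the secrets it needs, a loader can refuse to start when any of them is missing. The sketch below illustrates that check only; the `PackManifest` field names are assumptions and do not reproduce the actual layout of manifest.json, and `missingSecrets` is a hypothetical name.

```typescript
// Assumed, simplified manifest shape; the real manifest.json may differ.
type PackManifest = { name: string; version: string; sandbox: string; secrets: string[] };

// Returns the declared secrets that are unset or empty in the given environment.
function missingSecrets(manifest: PackManifest, env: Record<string, string | undefined>): string[] {
  return manifest.secrets.filter((key) => {
    const value = env[key];
    return value === undefined || value === "";
  });
}

// Values taken from the description of the frappe pack above.
const frappe: PackManifest = {
  name: "frappe",
  version: "0.1.0",
  sandbox: "network_limited",
  secrets: ["FRAPPE_SITE_URL", "FRAPPE_API_TOKEN"],
};

// Only the site URL is set here, so the token is reported as missing.
console.log(missingSecrets(frappe, { FRAPPE_SITE_URL: "https://erp.example.com" }));
```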

implemented tools	planned tools
frappe_identity_resolve	frappe_customization_context
frappe_semantic_data_resolve_lite	frappe_semantic_data_resolve · frappe_records_update
frappe_records_create	frappe_permission_check · frappe_dataset_fetch_for_artifact

CLI reference

Verbatim from muster help (packages/cli/src/index.ts):

Muster v0

Usage:
  muster init
  muster doctor [--fix]
  muster status
  muster chat "your prompt"
  muster claude inspect
  muster claude ask "prompt" [--model sonnet] [--effort low] [--timeout-ms 30000]
  muster episodes
  muster feedback <episode-id> --useful|--not-useful [--correct] [--reason "..."]
  muster candidates
  muster eval seed <episode-id> [--expect "..."] [--forbid "..."]
  muster eval run [path-or-dir]
  muster capability inspect <path>
  muster capability load <path> [--allow-high-risk]
  muster context graph [episode-id] [--scope tenant:hybrow] [--latest]
  muster memory add --summary "..." --scope user:me --provenance manual
  muster memory search --scope user:me [--query "..."] [--include-global]
  muster memory promote <memory-id> --to tenant:acme [--allow-global]
  muster tui
  muster tui ask "your prompt"
  muster provider list
  muster provider add-openai-compatible <id> <base-url> <model> [--api-key-env OPENAI_API_KEY]
  muster provider add-codex-cli <id> <model>
  muster provider presets
  muster provider add <preset> [--model X] [--api-key-env VAR] [--base-url URL]   (openai, anthropic, xai, kimi, deepseek, groq, openrouter, ...)
  muster runtime use-provider <runtime-id> <provider-id> [model]
  muster pi inspect [--home /path/to/home]
  muster pi models [--provider anthropic] [--available] [--agent-dir ~/.pi/agent]
  muster pi tools [--agent-dir ~/.pi/agent] [--tools read,grep,find,ls]
  muster pi commands [--agent-dir ~/.pi/agent] [--tools read,grep,find,ls]
  muster pi tui ["optional startup prompt"] [--agent-dir ~/.pi/agent] [--session create|continue|memory] [--session-dir path]
  muster pi ask "prompt" [--provider openai] [--model gpt-4o-mini] [--transport sdk|cli] [--session memory|create|continue] [--session-dir path] [--timeout-ms 30000]
  muster state export [--output packages/ui/public/muster-state.json]
  muster state show
  muster migrate openclaw --dry-run
  muster migrate hermes --dry-run
  muster migrate pi --dry-run
  muster run "prompt" [--runtime pi] [--provider anthropic] [--model claude-sonnet-4-5] [--session memory|create|continue] [--scope user:me] [--task-kind coding] [--sensitive]
  muster tokens [--limit 20]
  muster profile create|list|use|current [name]
  muster schedule add "*/5 * * * *" "prompt" | list | remove <id> | run-due
  muster evolve <suite.json> [--runtime pi] [--provider anthropic] [--model ...] [--iterations 2]
  muster evolve selfcheck
  muster flow save <file.json> | list | check <id> | run <id>
  muster flow runs | show <run-id> | approve <run-id> | reject <run-id>
  muster gateway init
  muster gateway start [--port 7460]
  muster pairing list | approve <code>
  muster flow replay <run-id> [--live-agents]
  muster flow diff <run-id-a> <run-id-b>
  muster flow loop <flow-id> --cron "0 9 * * 1"
  muster verify

Design rule:
  One active runtime per run. Providers/models can route dynamically by task.