Constraint: ROADMAP-only dogfood follow-up for 21:00 nudge on rebuilt claw git_sha cca6f682
Rejected: implementation change to plugin list serializer; request was one concrete follow-up if no backlog item
Confidence: high after repeated bounded samples
Scope-risk: narrow
Directive: Keep plugin inventory schema issue distinct from broad help JSON opacity
Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; repeated timeout 8 ./rust/target/debug/claw plugins list --output-format json; ./rust/target/debug/claw plugins help --output-format json; git diff --check; scripts/fmt.sh --check
Not-tested: runtime behavior change, because this commit only documents the gap
Constraint: ROADMAP-only dogfood follow-up for 20:30 nudge on rebuilt claw git_sha ee41b266
Rejected: implementation change to MCP show status schema; request was one concrete follow-up if no backlog item
Confidence: high after bounded successful repro
Scope-risk: narrow
Directive: Replaces invalid hang/nondeterminism PRs with verified status contract gap
Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw mcp show does-not-exist --output-format json; git diff --check; scripts/fmt.sh --check
Not-tested: runtime behavior change, because this commit only documents the gap
Constraint: ROADMAP-only dogfood follow-up for 20:00 nudge on rebuilt claw git_sha c6c01bea
Rejected: implementation change to native-agent detail dispatch; request was one concrete follow-up if no backlog item
Confidence: high
Scope-risk: narrow
Directive: Keep agent detail fallback distinct from #328/#329 native-agent source/schema issues; closed invalid hang hypotheses first
Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw agents list --output-format json; ./rust/target/debug/claw agents show analyst --output-format json; git diff --check; scripts/fmt.sh --check
Not-tested: runtime behavior change, because this commit only documents the gap
Constraint: ROADMAP-only dogfood follow-up for 18:30 nudge on rebuilt claw git_sha a510f734
Rejected: implementation change to config slash dispatcher; request was one concrete follow-up if no backlog item
Confidence: high
Scope-risk: narrow
Directive: Keep /config section discovery issue distinct from #342 /commands and #343 /models correction issues
Tested: ./rust/target/debug/claw --resume latest /config help --output-format json; /config list; /config show; bare /config; git diff --check; scripts/fmt.sh --check
Not-tested: runtime behavior change, because this commit only documents the gap
Constraint: ROADMAP-only dogfood follow-up for 16:00 nudge on rebuilt claw git_sha 58569131
Rejected: code change in the command dispatcher | request was specifically to add one ROADMAP.md-only item
Confidence: high
Scope-risk: narrow
Directive: Keep /tasks distinct from #340; this is unsupported command stub JSON, not session help
Tested: git diff --check; scripts/fmt.sh --check
Not-tested: runtime behavior change, because this commit only documents the gap
Capture the dogfood evidence as a roadmap item so the stdout JSON error-envelope contract can be fixed and regression-tested later.
Constraint: User requested exactly one ROADMAP.md-only item #340 from current origin/main.
Confidence: high
Scope-risk: narrow
Tested: git diff --check; scripts/fmt.sh --check
Not-tested: Runtime behavior unchanged; documentation-only roadmap entry.
Constraint: ROADMAP.md-only restore of lost #337 from PR #2852 / Jobdori dogfood evidence
Rejected: Renumbering adjacent items | preserving existing #338 and surrounding roadmap entries keeps history stable
Confidence: high
Scope-risk: narrow
Directive: Keep #337 before #338 and do not collapse the dirty-file detail requirement into the broader help/status backlog
Tested: git diff --check; scripts/fmt.sh --check
Not-tested: Product behavior changes; documentation-only change
Constraint: Respond to 14:30 dogfood nudge with one direct claw-code pinpoint.
Evidence: rebuilt actual debug binary at git_sha 24ccb59b; compared top-level help --output-format json with resume-safe /help --output-format json.
Finding: same help surface uses message in top-level JSON and text in slash/resume JSON.
Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw help --output-format json; ./rust/target/debug/claw --resume latest /help --output-format json; git diff --check; scripts/fmt.sh --check.
Not-tested: full Rust suite; roadmap-only documentation change.
Constraint: Scope requested ROADMAP.md only with exactly one new #328 pinpoint from direct claw dogfood.
Rejected: Implementing the agents-help fix now | user requested roadmap-only evidence item.
Confidence: high
Scope-risk: narrow
Directive: Keep agent help source roots derived from the same loader registry as agents list; do not hand-maintain a divergent root list.
Tested: cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json; ./rust/target/debug/claw version --output-format json; ./rust/target/debug/claw agents help --output-format json; ./rust/target/debug/claw agents --output-format json; git diff --check; scripts/fmt.sh --check
Not-tested: Full Rust test suite; roadmap-only documentation change.
Constraint: Scope limited to ROADMAP.md and one new pinpoint #327 from actual rebuilt claw dogfood.
Rejected: Code fix in this branch | user requested roadmap-only filing.
Confidence: high
Scope-risk: narrow
Directive: Keep mcp help source lists derived from actual config discovery, not hard-coded partial docs.
Tested: ./rust/target/debug/claw version --output-format json; ./rust/target/debug/claw mcp --help; ./rust/target/debug/claw mcp help --output-format json; temp .claw.json mcp list proof; git diff --check; scripts/fmt.sh --check
Not-tested: Full Rust test suite, documentation-only change.
Document the dogfood gap where help JSON stays parseable but hides command metadata inside a prose message, so future implementation can expose machine-readable command, slash-command, and resume-safety fields.
Constraint: user requested ROADMAP.md-only pinpoint for issue #325 from origin/main d607ff36.
Rejected: implementing the schema now | requested fix shape is roadmap documentation only.
Confidence: high
Scope-risk: narrow
Directive: keep message for humans while adding schema/versioned structured help metadata when implementing.
Tested: git diff --check; scripts/fmt.sh --check
Not-tested: runtime CLI behavior unchanged by docs-only change
Constraint: Documentation-only follow-up from current main e7074f47 after PR #2838; edit scope limited to ROADMAP.md.
Rejected: Implementing provenance detection now | user requested roadmap entry only.
Confidence: high
Scope-risk: narrow
Directive: Future implementation should compare embedded build git_sha/build date to workspace HEAD/dirty state without leaking secrets.
Tested: git diff --check; scripts/fmt.sh --check
Not-tested: Runtime provenance behavior; this commit only records the roadmap requirement.
Keep claw --help's resume-safe slash command summary aligned with the interactive command list by filtering STUB_COMMANDS and adding regression coverage.
Operator status previously treated any tmux pane in a workspace as equivalent to active work. The new classifier uses tmux pane command/path metadata as a soft signal, treats plain shells as idle, and adds dirty-worktree abandoned markers to status and session-list output for clawhip consumers.
Constraint: Keep issue #320 prototype minimal and additive without new dependencies
Rejected: Screen-scraping pane output | fragile and broader than needed for lifecycle classification
Confidence: high
Scope-risk: narrow
Tested: cargo test -p rusty-claude-cli
Tested: cargo check -p rusty-claude-cli
Not-tested: cargo clippy -p rusty-claude-cli --all-targets -- -D warnings is blocked by pre-existing commands crate clippy::unnecessary_wraps warnings
The formatting wrapper should remain safe when invoked through different current directories or shell contexts, so resolve the script directory before entering the Rust workspace and forwarding cargo fmt arguments.
Constraint: Wrapper must be runnable from repo root while forwarding flags like --check
Rejected: Leave relative dirname cd | less robust if invocation context changes
Confidence: high
Scope-risk: narrow
Tested: scripts/fmt.sh --check
Tested: git diff --check
The Rust crate layout expects formatting to run from the rust directory, so add a root-level wrapper that preserves the working command while forwarding user flags like --check. Documentation now points contributors at the wrapper instead of the misleading virtual-workspace manifest invocation.
Constraint: Root-level cargo fmt --manifest-path rust/Cargo.toml is misleading for this virtual workspace
Rejected: Document cd rust && cargo fmt directly | a root wrapper gives one stable repo-root command
Confidence: high
Scope-risk: narrow
Tested: scripts/fmt.sh --check
Tested: git diff --check
Run rustfmt from the Rust workspace so CI format checks pass without changing behavior.
Constraint: Scope is formatting-only across tracked Rust files
Confidence: high
Scope-risk: narrow
Tested: cd rust && cargo fmt --check
Tested: git diff --check
Reject empty --allowedTools inputs instead of treating them as an empty restriction, and surface status JSON metadata that distinguishes default unrestricted tools from flag-provided allow lists.
Confidence: high
Scope-risk: narrow
Tested: cargo test -p rusty-claude-cli rejects_empty_allowed_tools_flag -- --nocapture
Tested: cargo test -p tools allowed_tools_rejects_empty_token_lists -- --nocapture
Tested: cargo check -p rusty-claude-cli -p tools
Tested: cargo test -p rusty-claude-cli -p tools
Not-tested: full workspace cargo fmt --check is blocked by pre-existing unrelated formatting drift
Worker boot could previously stall on an interactive MCP/tool permission prompt while readiness and startup-timeout surfaces only had generic idle/no-evidence shapes. This adds a first-class blocked lifecycle state, structured event payload, startup evidence fields, and regression coverage so callers can report the exact server/tool gate instead of pane-scraping.
Constraint: ROADMAP #200 requires tool/server identity, prompt age, and session-only versus always-allow capability in status/evidence surfaces
Rejected: Treat MCP/tool prompts as trust gates | conflates distinct prompts and loses tool identity
Rejected: Leave allow-scope as pane text only | clawhip still could not classify the blocker without scraping
Confidence: high
Scope-risk: moderate
Directive: Keep tool_permission_required distinct from trust_required; downstream claws rely on server/tool payload plus allow-scope metadata
Tested: cargo test -p runtime tool_permission
Tested: cargo fmt -p runtime -- --check && cargo clippy -p runtime --all-targets -- -D warnings && cargo test -p runtime
Tested: cargo test --workspace
Not-tested: live interactive MCP permission prompt in tmux
The pull brought the branch current with origin/main while replaying local follow-up work. Conflict resolution kept the roadmap/progress additions and integrated the runtime event/trust changes with upstream's newer surfaces.
The trust allowlist now treats worktree_pattern as an additional required predicate, including the missing-worktree case, so auto-trust cannot fall back to cwd-only matching when a worktree constraint was declared. The runtime formatting cleanup keeps clippy/fmt green after the merge.
Constraint: Local branch was 109 commits behind origin/main with dirty tracked follow-up work.
Rejected: Drop the autostash after conflict resolution | keeping it preserves a reversible safety backup for unrelated recovery.
Confidence: high
Scope-risk: moderate
Directive: Do not relax worktree_pattern matching without preserving the missing-worktree regression.
Tested: git diff --cached --check; cargo fmt -p runtime -- --check; cargo clippy -p runtime --all-targets -- -D warnings; cargo test -p runtime; cargo test --workspace; architect verification approved
Not-tested: Live tmux/worker auto-trust behavior outside unit/integration tests
## Gap
#77 Phase 1 added machine-readable error kind discriminants and #156 extended
them to text-mode output. However, the hint field is still prose derived from
splitting existing error text — not a stable registry-backed remediation
contract.
Downstream claws inspecting the hint field still need to parse human wording
to decide whether to retry, escalate, or terminate.
## Fix Shape
1. Remediation registry: remediation_for(kind, operation) -> Remediation struct
with action (retry/escalate/terminate/configure), target, and stable message
2. Stable hint outputs per error class (no more prose splitting)
3. Golden fixture tests replacing split_error_hint() string hacks
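A minimal sketch of what the registry in item 1 could look like. The `Action` variants come from the list above; the struct layout, the match arms, and every message string are hypothetical placeholders, not the shipped contract.

```rust
/// Action set named in the fix shape (retry/escalate/terminate/configure).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Retry,
    Escalate,
    Terminate,
    Configure,
}

/// Hypothetical remediation record: a stable action, a target, and a message.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Remediation {
    pub action: Action,
    pub target: &'static str,
    pub message: &'static str,
}

/// Registry lookup keyed by (kind, operation).
/// The arms and wording below are illustrative assumptions only.
pub fn remediation_for(kind: &str, operation: &str) -> Remediation {
    match (kind, operation) {
        ("missing_credentials", _) => Remediation {
            action: Action::Configure,
            target: "credentials",
            message: "Configure an API key, then rerun.",
        },
        ("session_not_found", "resume") => Remediation {
            action: Action::Terminate,
            target: "session",
            message: "No such session in this workspace partition.",
        },
        ("api_http_error", _) => Remediation {
            action: Action::Retry,
            target: "request",
            message: "Transient API failure; retry the request.",
        },
        // Anything not in the registry is escalated rather than guessed at.
        _ => Remediation {
            action: Action::Escalate,
            target: "operator",
            message: "Unclassified error; escalate.",
        },
    }
}
```

With a table like this, a downstream claw can branch on `action` alone and never needs to read `message`.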
## Source
gaebal-gajae dogfood sweep 2026-04-22 05:30 KST
## Problem
#77 Phase 1 added machine-readable error `kind` discriminants to JSON error
payloads. Text-mode (stderr) errors still emit prose-only output with no
structured classification.
Observability tools (log aggregators, CI error parsers) parsing stderr can't
distinguish error classes without regex-scraping the prose.
## Fix
Added `[error-kind: <class>]` prefix line to all text-mode error output.
The prefix appears before the error prose, making it immediately parseable by
line-based log tools without any substring matching.
**Examples:**
## Impact
- Stderr observers (log aggregators, CI systems) can now parse error class
from the first line without regex or substring scraping
- Same classifier function used for JSON (#77 P1) and text modes
- Text-mode output remains human-readable (error prose unchanged)
- Prefix format follows syslog/structured-logging conventions
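For illustration, the prefix line and a matching line-based reader might look like the sketch below. Both function names are hypothetical, and placing the prefix on its own first line is an assumption drawn from the "prefix line" wording above.

```rust
/// Build text-mode error output: an `[error-kind: <class>]` line, then the
/// unchanged human-readable prose. (Hypothetical helper name.)
pub fn format_text_error(kind: &str, prose: &str) -> String {
    format!("[error-kind: {kind}]\n{prose}")
}

/// What a log tool could do: read only the first line and pull out the class.
/// Returns None when the output carries no prefix line.
pub fn parse_error_kind(output: &str) -> Option<&str> {
    output
        .lines()
        .next()?
        .strip_prefix("[error-kind: ")?
        .strip_suffix(']')
}
```

The reader checks only the first line and never inspects the prose.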
## Tests
All 179 rusty-claude-cli tests pass. Verified on 3 different error classes.
Closes ROADMAP #156.
## Problem
All JSON error payloads had the same two-field envelope:
```json
{"type": "error", "error": "<prose with hint baked in>"}
```
Five distinct error classes were indistinguishable at the schema level:
- missing_credentials (no API key)
- missing_worker_state (no state file)
- session_not_found / session_load_failed
- cli_parse (unrecognized args)
- invalid_model_syntax
Downstream claws had to regex-scrape the prose to route failures.
## Fix
1. **Added `classify_error_kind()`** — prefix/keyword classifier that returns a
snake_case discriminant token for 12 known error classes:
`missing_credentials`, `missing_manifests`, `missing_worker_state`,
`session_not_found`, `session_load_failed`, `no_managed_sessions`,
`cli_parse`, `invalid_model_syntax`, `unsupported_command`,
`unsupported_resumed_command`, `confirmation_required`, `api_http_error`,
plus `unknown` fallback.
2. **Added `split_error_hint()`** — splits multi-line error messages into
(short_reason, optional_hint) so the runbook prose stops being stuffed
into the `error` field.
3. **Extended JSON envelope** at 4 emit sites:
- Main error sink (line ~213)
- Session load failure in resume_session
- Stub command (unsupported_command)
- Unknown resumed command (unsupported_resumed_command)
## New JSON shape
```json
{
"type": "error",
"error": "short reason (first line)",
"kind": "missing_credentials",
"hint": "Hint: export ANTHROPIC_API_KEY..."
}
```
`kind` is always present. `hint` is null when no runbook follows.
`error` now carries only the short reason, not the full multi-line prose.
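A rough sketch of the reason/hint split described in Fix item 2, assuming the split point is the first newline. The real `split_error_hint()` signature and edge-case handling are not shown in this message and may differ.

```rust
/// Split a multi-line error message into (short_reason, optional_hint).
/// Assumption: the first line is the reason and everything after it is the
/// runbook hint; an empty remainder yields None.
pub fn split_error_hint(message: &str) -> (String, Option<String>) {
    match message.split_once('\n') {
        Some((reason, rest)) => {
            let hint = rest.trim();
            let hint = if hint.is_empty() {
                None
            } else {
                Some(hint.to_string())
            };
            (reason.trim().to_string(), hint)
        }
        // Single-line messages have no hint at all.
        None => (message.trim().to_string(), None),
    }
}
```

A `None` hint here would correspond to `"hint": null` in the JSON shape above.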
## Tests
Added 2 new regression tests:
- `classify_error_kind_returns_correct_discriminants` — all 9 known classes + fallback
- `split_error_hint_separates_reason_from_runbook` — with and without hints
All 179 rusty-claude-cli tests pass. Full workspace green.
Closes ROADMAP #77 Phase 1.
## Problem
Two session error messages advertised `.claw/sessions/` as the managed-session
location, but the actual on-disk layout is `.claw/sessions/<workspace_fingerprint>/`
where the fingerprint is a 16-char FNV-1a hash of the CWD path.
Users see error messages like:
```
no managed sessions found in .claw/sessions/
```
But the real directory is:
```
.claw/sessions/8497f4bcf995fc19/
```
The error copy was a direct lie — it made workspace-fingerprint partitioning
invisible and left users confused about whether sessions were lost or just in
a different partition.
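For readers unfamiliar with the fingerprint, a 16-character FNV-1a digest of a path string can be computed roughly as follows. This sketch assumes the 64-bit FNV-1a variant rendered as lowercase hex; the function name is hypothetical and the actual `workspace_fingerprint` implementation may differ in detail.

```rust
/// 64-bit FNV-1a over the raw bytes of `input`, rendered as 16 hex chars.
/// Assumed variant and encoding; not copied from the claw source.
pub fn fnv1a_64_hex(input: &str) -> String {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;

    let mut hash = OFFSET_BASIS;
    for byte in input.as_bytes() {
        // FNV-1a: XOR the byte in first, then multiply by the prime.
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(PRIME);
    }
    format!("{hash:016x}")
}
```

Because the digest depends only on the path string, two different CWDs land in different `.claw/sessions/<fingerprint>/` directories.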
## Fix
Updated two error formatters to accept the resolved `sessions_root` path
and extract the actual workspace-fingerprint directory:
1. **format_missing_session_reference**: now shows the actual fingerprint dir
and explains that it's a workspace-specific partition
2. **format_no_managed_sessions**: now shows the actual fingerprint dir and
includes a note that sessions from other CWDs are intentionally invisible
Updated all three call sites to pass `&self.sessions_root` to the formatters.
## Examples
**Before:**
```
no managed sessions found in .claw/sessions/
```
**After:**
```
no managed sessions found in .claw/sessions/8497f4bcf995fc19/
Start `claw` to create a session, then rerun with `--resume latest`.
Note: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.
```
```
session not found: nonexistent-id
Hint: managed sessions live in .claw/sessions/8497f4bcf995fc19/ (workspace-specific partition).
Try `latest` for the most recent session or `/session list` in the REPL.
```
## Impact
- Users can now tell from the error message that they're looking in the right
directory (the one their current CWD maps to)
- The workspace-fingerprint partitioning stops being invisible
- Operators understand why sessions from adjacent CWDs don't appear
- Error copy matches the actual on-disk structure
## Tests
All 466 runtime tests pass. Verified on two real workspaces with actual
workspace-fingerprint directories.
Closes ROADMAP #80.
## Problem
Three interactive slash commands are documented in `claw --help` but have no
corresponding section in USAGE.md:
- `/ultraplan [task]` — Run a deep planning prompt with multi-step reasoning
- `/teleport <symbol-or-path>` — Jump to a file or symbol by searching the workspace
- `/bughunter [scope]` — Inspect the codebase for likely bugs
New users see these commands in the help output but don't know:
- What each command does
- How to use it
- When to use it vs. other commands
- What kind of results to expect
## Fix
Added new section "Advanced slash commands (Interactive REPL only)" to USAGE.md
with documentation for all three commands:
1. **`/ultraplan`** — multi-step reasoning for complex tasks
- Example: `/ultraplan refactor the auth module to use async/await`
- Output: structured plan with numbered steps and reasoning
2. **`/teleport`** — navigate to a file or symbol
- Example: `/teleport UserService`, `/teleport src/auth.rs`
- Output: file content with the requested symbol highlighted
3. **`/bughunter`** — scan for likely bugs
- Example: `/bughunter src/handlers`, `/bughunter` (all)
- Output: list of suspicious patterns with explanations
## Impact
Users can now discover these commands and understand when to use them without
having to guess or search external sources. Bridges the gap between `--help`
output and full documentation.
Also filed ROADMAP #155 documenting the gap.
Closes ROADMAP #155.
## Problem
When a user types `claw --model gpt-4` or `--model qwen-plus`, they get:
```
error: invalid model syntax: 'gpt-4'. Expected provider/model (e.g., anthropic/claude-opus-4-6) or known alias
```
USAGE.md documents that "The error message now includes a hint that names the detected env var" — but this hint does not actually exist. The user has to re-read USAGE.md or guess the correct prefix.
## Fix
Enhance `validate_model_syntax` to detect when a model name looks like it belongs to a different provider:
1. **OpenAI models** (starts with `gpt-` or `gpt_`):
```
Did you mean `openai/gpt-4`? (Requires OPENAI_API_KEY env var)
```
2. **Qwen/DashScope models** (starts with `qwen`):
```
Did you mean `qwen/qwen-plus`? (Requires DASHSCOPE_API_KEY env var)
```
3. **Grok/xAI models** (starts with `grok`):
```
Did you mean `xai/grok-3`? (Requires XAI_API_KEY env var)
```
Unrelated invalid models (e.g., `asdfgh`) do not get a spurious hint.
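The prefix rules above can be sketched as a small helper. The function name `provider_hint` is hypothetical; the actual logic lives inside `validate_model_syntax` and may be structured differently.

```rust
/// Return a "Did you mean" hint when a bare model name looks like it belongs
/// to a known provider, or None for unrelated input.
pub fn provider_hint(model: &str) -> Option<String> {
    let (prefix, env_var) = if model.starts_with("gpt-") || model.starts_with("gpt_") {
        ("openai", "OPENAI_API_KEY")
    } else if model.starts_with("qwen") {
        ("qwen", "DASHSCOPE_API_KEY")
    } else if model.starts_with("grok") {
        ("xai", "XAI_API_KEY")
    } else {
        // No recognizable provider prefix: do not emit a spurious hint.
        return None;
    };
    Some(format!(
        "Did you mean `{prefix}/{model}`? (Requires {env_var} env var)"
    ))
}
```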
## Verification
- `claw --model gpt-4` → hints `openai/gpt-4` + `OPENAI_API_KEY`
- `claw --model qwen-plus` → hints `qwen/qwen-plus` + `DASHSCOPE_API_KEY`
- `claw --model grok-3` → hints `xai/grok-3` + `XAI_API_KEY`
- `claw --model asdfgh` → generic error (no hint)
## Tests
Added 3 new assertions in `parses_multiple_diagnostic_subcommands`:
- GPT model error hints openai/ prefix and OPENAI_API_KEY
- Qwen model error hints qwen/ prefix and DASHSCOPE_API_KEY
- Unrelated models don't get a spurious hint
All 177 rusty-claude-cli tests pass.
Closes ROADMAP #154.
## Problem
Users frequently ask after building:
- "Where is the claw binary?"
- "Did the build actually work?"
- "Why can't I run `claw` from anywhere?"
This happens because `cargo build` puts the binary in `rust/target/debug/claw`
(or `rust/target/release/claw`), and new users don't know:
1. Where to find it
2. How to test it
3. How to add it to PATH (optional but common follow-up)
## Fix
Added new section "Post-build: locate the binary and verify" to README covering:
1. **Binary location table:** debug vs. release, macOS/Linux vs. Windows paths
2. **Verification commands:** Test the binary with `--help` and `doctor`
3. **Three ways to add to PATH:**
- Symlink (macOS/Linux): `ln -s ... /usr/local/bin/claw`
- cargo install: `cargo install --path . --force`
- Shell profile update: add rust/target/debug to $PATH
4. **Troubleshooting:** Common errors ("command not found", "permission denied",
debug vs. release build speed)
## Impact
New users can now:
- Find the binary immediately after build
- Run it and verify with `claw doctor`
- Know their options for system-wide access
Also filed ROADMAP #153 documenting the gap.
Closes ROADMAP #153.
## Problem
Users commonly type `claw doctor --json`, `claw status --json`, or
`claw system-prompt --json` expecting JSON output. These fail with
``unrecognized argument `--json` for subcommand`` with no hint that
`--output-format json` is the correct flag.
## Discovery
Filed as #152 during 21:17 dogfood nudge. The #127 worktree contained
a more comprehensive patch but conflicted with #141 (unified --help).
On re-investigation of main, Bugs 1 and 3 from #127 are already closed
(positional arg rejection works, no double "error:" prefix). Only
Bug 2 (the `--json` hint) remained.
## Fix
Two call sites add the hint:
1. `parse_single_word_command_alias`'s diagnostic-verb suffix path:
when rest[1] == "--json", append "Did you mean `--output-format json`?"
2. `parse_system_prompt_options` unknown-option path: same hint when
the option is exactly `--json`.
## Verification
Before:
    $ claw doctor --json
    error: unrecognized argument `--json` for subcommand `doctor`
    Run `claw --help` for usage.
After:
    $ claw doctor --json
    error: unrecognized argument `--json` for subcommand `doctor`
    Did you mean `--output-format json`?
    Run `claw --help` for usage.
Covers: `doctor --json`, `status --json`, `sandbox --json`,
`system-prompt --json`, and any other diagnostic verb that routes
through `parse_single_word_command_alias`.
Other unrecognized args (`claw doctor garbage`) correctly don't
trigger the hint.
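As a sketch of the behavior (not the actual parser code), the message assembly could look like this. The helper name is hypothetical, and the leading `error: ` prefix is assumed to be added by the caller.

```rust
/// Build the unrecognized-argument message, adding the `--output-format json`
/// hint only when the offending argument is exactly `--json`.
pub fn unrecognized_arg_message(subcommand: &str, arg: &str) -> String {
    let mut msg = format!("unrecognized argument `{arg}` for subcommand `{subcommand}`");
    if arg == "--json" {
        msg.push_str("\nDid you mean `--output-format json`?");
    }
    msg.push_str("\nRun `claw --help` for usage.");
    msg
}
```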
## Tests
- 2 new assertions in `parses_multiple_diagnostic_subcommands`:
- `claw doctor --json` produces hint
- `claw doctor garbage` does NOT produce hint
- 177 rusty-claude-cli tests pass
- Workspace tests green
Closes ROADMAP #152.
Filed from nudge directive at 21:17 KST. Implementation exists on worktree
`jobdori-127-verb-suffix` but needs rebase due to merge with #141.
Ready for Phase 1 implementation once conflicts resolved.
## Problem
`workspace_fingerprint(path)` hashes the raw path string without
canonicalization. Two equivalent paths (e.g. `/tmp/foo` vs
`/private/tmp/foo` on macOS) produce different fingerprints and
therefore different session stores. #150 fixed the test-side symptom;
this fixes the underlying product contract.
## Discovery path
#150 fix (canonicalize in test) was a workaround. Q's ack on #150
surfaced the deeper gap: the function itself is still fragile for
any caller passing a non-canonical path:
1. Embedded callers with a raw `--data-dir` path
2. Programmatic `SessionStore::from_cwd(user_path)` calls
3. NixOS store paths, Docker bind mounts, case-insensitive normalization
The REPL's default flow happens to work because `env::current_dir()`
returns canonical paths on macOS. But any caller passing a raw path
risks silent session-store divergence.
## Fix
Canonicalize inside `SessionStore::from_cwd()` and `from_data_dir()`
before computing the fingerprint. Kept `workspace_fingerprint()` itself
as a pure function for determinism — canonicalization is the entry
point's responsibility.
```rust
let canonical_cwd = fs::canonicalize(cwd).unwrap_or_else(|_| cwd.to_path_buf());
let sessions_root = canonical_cwd.join(".claw").join("sessions").join(workspace_fingerprint(&canonical_cwd));
```
Falls back to the raw path if canonicalize fails (directory doesn't
exist yet).
## Test-side updates
Three legacy-session tests expected the non-canonical base path to
match the store's workspace_root. Updated them to canonicalize
`base` after creation — same defensive pattern as #150, now
explicit across all three tests.
## Regression test
Added `session_store_from_cwd_canonicalizes_equivalent_paths` that
creates two stores from equivalent paths (raw vs canonical) and
asserts they resolve to the same sessions_dir.
## Verification
- `cargo test -p runtime session_store_` — 9/9 pass
- `cargo test --workspace` — all green, no FAILED markers
- No behavior change for existing users (REPL default flow already
used canonical paths)
## Backward compatibility
Users on macOS who always went through `env::current_dir()`:
no hash change, sessions resume identically.
Users who ever called with a non-canonical path: hash would change,
but those sessions were already broken (couldn't be resumed from a
canonical-path cwd). Net improvement.
Closes ROADMAP #151.
## #150 Fix: resume_latest test flake
**Problem:** `resume_latest_restores_the_most_recent_managed_session` intermittently
fails when run in the workspace suite or multiple times in sequence, but passes in
isolation.
**Root cause:** `workspace_fingerprint(path)` hashes the path string without
canonicalization. On macOS, `/tmp` is a symlink to `/private/tmp`. The test
creates a temp dir via `std::env::temp_dir().join(...)` which returns
`/var/folders/...` (non-canonical). When the subprocess spawns,
`env::current_dir()` returns the canonical path `/private/var/folders/...`.
The two fingerprints differ, so the subprocess looks in
`.claw/sessions/<hash1>` while files are in `.claw/sessions/<hash2>`.
Session discovery fails.
**Fix:** Call `fs::canonicalize(&project_dir)` after creating the directory
to ensure test and subprocess use identical path representations.
**Verification:** 5 consecutive runs of the full test suite — all pass.
Previously: 5/5 failed when run in sequence.
## #246 Filing: Reminder cron outcome ambiguity (control-loop blocker)
The `clawcode-dogfood-cycle-reminder` cron times out repeatedly with no
structured feedback on whether the nudge was delivered, skipped, or died in-flight.
**Phase 1 outcome schema** — add explicit field to cron result:
- `delivered` — nudge posted to Discord
- `timed_out_before_send` — died before posting
- `timed_out_after_send` — posted but cleanup timed out
- `skipped_due_to_active_cycle` — previous cycle active
- `aborted_gateway_draining` — daemon shutdown
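One possible encoding of the outcome field, sketched as a Rust enum. The variant-to-string mapping follows the list above; the type name and the `nudge_was_posted` helper are hypothetical.

```rust
/// Phase 1 cron outcome values, one variant per listed string.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CronOutcome {
    Delivered,
    TimedOutBeforeSend,
    TimedOutAfterSend,
    SkippedDueToActiveCycle,
    AbortedGatewayDraining,
}

impl CronOutcome {
    /// Stable wire token for the cron result field.
    pub fn as_str(self) -> &'static str {
        match self {
            CronOutcome::Delivered => "delivered",
            CronOutcome::TimedOutBeforeSend => "timed_out_before_send",
            CronOutcome::TimedOutAfterSend => "timed_out_after_send",
            CronOutcome::SkippedDueToActiveCycle => "skipped_due_to_active_cycle",
            CronOutcome::AbortedGatewayDraining => "aborted_gateway_draining",
        }
    }

    /// Hypothetical helper: did the nudge reach Discord, per the descriptions?
    pub fn nudge_was_posted(self) -> bool {
        matches!(
            self,
            CronOutcome::Delivered | CronOutcome::TimedOutAfterSend
        )
    }
}
```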
Assigned to gaebal-gajae (cron/orchestration domain). Unblocks trustworthy
dogfood cycle observability.
Closes ROADMAP #150. Filed ROADMAP #246.
## Problem
`runtime::config::tests::validates_unknown_top_level_keys_with_line_and_field_name`
intermittently fails during `cargo test --workspace` (witnessed during
#147 and #148 workspace runs) but passes deterministically in isolation.
Example failure from workspace run:
    test result: FAILED. 464 passed; 1 failed
## Root cause
`runtime/src/config.rs::tests::temp_dir()` used nanosecond timestamp
alone for namespace isolation:
    std::env::temp_dir().join(format!("runtime-config-{nanos}"))
Under parallel test execution on fast machines with coarse clock
resolution, two tests start within the same nanosecond bucket and
collide on the same path. One test's `fs::remove_dir_all(root)` then
races another's in-flight `fs::create_dir_all()`.
Other crates already solved this pattern:
- plugins::tests::temp_dir(label) — label-parameterized
- runtime::git_context::tests::temp_dir(label) — label-parameterized
runtime/src/config.rs was missed.
## Fix
Added process id + monotonically-incrementing atomic counter to the
namespace, making every callsite provably unique regardless of clock
resolution or scheduling:
    static COUNTER: AtomicU64 = AtomicU64::new(0);
    let pid = std::process::id();
    let seq = COUNTER.fetch_add(1, Ordering::Relaxed);
    std::env::temp_dir().join(format!("runtime-config-{pid}-{nanos}-{seq}"))
Chose counter+pid over the label-parameterized pattern to avoid
touching all 20 callsites in the same commit (mechanical noise with
no added safety — counter alone is sufficient).
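For context, a self-contained version of the helper might look like the sketch below. The way `nanos` is derived is an assumption (the snippet above does not show it); the pid and counter parts mirror the fix.

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Process-wide sequence number; each call gets a distinct value.
static COUNTER: AtomicU64 = AtomicU64::new(0);

/// Build a unique temp path from pid + timestamp + sequence number.
pub fn temp_dir() -> PathBuf {
    // Assumed derivation of `nanos`; falls back to 0 if the clock is before
    // the Unix epoch. Uniqueness does not depend on this value.
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let pid = std::process::id();
    let seq = COUNTER.fetch_add(1, Ordering::Relaxed);
    std::env::temp_dir().join(format!("runtime-config-{pid}-{nanos}-{seq}"))
}
```

`Ordering::Relaxed` is enough here because the counter only needs to hand out distinct values and does not order any other memory access.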
## Verification
Before: one failure per workspace run (config test flake).
After: 5 consecutive `cargo test --workspace` runs — zero config
test failures. Only pre-existing `resume_latest` flake remains
(orthogonal, unrelated to this change).
    for i in 1 2 3 4 5; do cargo test --workspace; done
    # All 5 runs: config tests green. Only resume_latest flake appears.
    cargo test -p runtime
    # 465 passed; 0 failed
## ROADMAP.md
Added Pinpoint #149 documenting the gap, root cause, and fix.
Closes ROADMAP #149.
## Scope
Two deltas in one commit:
### #128 closure (docs)
Re-verified on main HEAD `4cb8fa0`: malformed `--model` strings already
rejected at parse time (`validate_model_syntax` in parse_args). All
historical repro cases now produce specific errors:
    claw --model '' → error: model string cannot be empty
    claw --model 'bad model' → error: invalid model syntax: 'bad model' contains spaces
    claw --model 'sonet' → error: invalid model syntax: 'sonet'. Expected provider/model or known alias
    claw --model '@invalid' → error: invalid model syntax: '@invalid'. Expected provider/model ...
    claw --model 'totally-not-real-xyz' → error: invalid model syntax: ...
    claw --model sonnet → ok, resolves to claude-sonnet-4-6
    claw --model anthropic/claude-opus-4-6 → ok, passes through
Marked #128 CLOSED in ROADMAP with repro block. Residual provenance gap
split off as #148.
### #148 implementation
**Problem.** After #128 closure, `claw status --output-format json`
still surfaces only the resolved model string. No way for a claw to
distinguish whether `claude-sonnet-4-6` came from `--model sonnet`
(alias resolution) vs `--model claude-sonnet-4-6` (pass-through) vs
`ANTHROPIC_MODEL` env vs `.claw.json` config vs compiled-in default.
Debug forensics had to re-read argv instead of reading a structured
field. Clawhip orchestrators sending `--model` couldn't confirm the
flag was honored vs falling back to default.
**Fix.** Added two fields to status JSON envelope:
- `model_source`: "flag" | "env" | "config" | "default"
- `model_raw`: user's input before alias resolution (null on default)
Text mode appends a `Model source` line under `Model`, showing the
source and raw input (e.g. `Model source flag (raw: sonnet)`).
**Resolution order** (mirrors resolve_repl_model but with source
attribution):
1. If `--model` / `--model=` flag supplied → source: flag, raw: flag value
2. Else if ANTHROPIC_MODEL set → source: env, raw: env value
3. Else if `.claw.json` model key set → source: config, raw: config value
4. Else → source: default, raw: null
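The resolution order can be sketched as a pure function over the three candidate inputs. This is an illustration only: `resolve_model_source` is a hypothetical name, and the real code reads the flag, `ANTHROPIC_MODEL`, and `.claw.json` itself rather than taking them as arguments.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelSource {
    Flag,
    Env,
    Config,
    Default,
}

impl ModelSource {
    /// Wire value for the `model_source` field.
    pub fn as_str(self) -> &'static str {
        match self {
            ModelSource::Flag => "flag",
            ModelSource::Env => "env",
            ModelSource::Config => "config",
            ModelSource::Default => "default",
        }
    }
}

/// Return (source, raw input) using flag > env > config > default precedence.
/// The raw value is None only for the compiled-in default.
pub fn resolve_model_source(
    flag: Option<&str>,
    env: Option<&str>,
    config: Option<&str>,
) -> (ModelSource, Option<String>) {
    if let Some(raw) = flag {
        return (ModelSource::Flag, Some(raw.to_string()));
    }
    if let Some(raw) = env {
        return (ModelSource::Env, Some(raw.to_string()));
    }
    if let Some(raw) = config {
        return (ModelSource::Config, Some(raw.to_string()));
    }
    (ModelSource::Default, None)
}
```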
## Changes
### rust/crates/rusty-claude-cli/src/main.rs
- Added `ModelSource` enum (Flag/Env/Config/Default) with `as_str()`.
- Added `ModelProvenance` struct (resolved, raw, source) with
three constructors: `default_fallback()`, `from_flag(raw)`, and
`from_env_or_config_or_default(cli_model)`.
- Added `model_flag_raw: Option<String>` field to `CliAction::Status`.
- Parse loop captures raw input in `--model` and `--model=` arms.
- Extended `parse_single_word_command_alias` to thread
`model_flag_raw: Option<&str>` through.
- Extended `print_status_snapshot` signature to accept
`model_flag_raw: Option<&str>`. Resolves provenance at dispatch time
(flag provenance from arg; else probe env/config/default).
- Extended `status_json_value` signature with
`provenance: Option<&ModelProvenance>`. On Some, adds `model_source`
and `model_raw` fields; on None (legacy resume paths), omits them
for backward compat.
- Extended `format_status_report` signature with optional provenance.
On Some, renders `Model source` line after `Model`.
- Updated all existing callers (REPL /status, resume /status, tests)
to pass None (legacy paths don't carry flag provenance).
- Added 2 regression assertions in parse_args test covering both
`--model sonnet` and `--model=...` forms.
### ROADMAP.md
- Marked #128 CLOSED with re-verification block.
- Filed #148 documenting the provenance gap split, fix shape, and
acceptance criteria.
## Live verification
$ claw --model sonnet --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-sonnet-4-6", "model_source": "flag", "model_raw": "sonnet"}
$ claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-opus-4-6", "model_source": "default", "model_raw": null}
$ ANTHROPIC_MODEL=haiku claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-haiku-4-5-20251213", "model_source": "env", "model_raw": "haiku"}
$ echo '{"model":"claude-opus-4-7"}' > .claw.json && claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-opus-4-7", "model_source": "config", "model_raw": "claude-opus-4-7"}
$ claw --model sonnet status
Status
Model claude-sonnet-4-6
Model source flag (raw: sonnet)
Permission mode danger-full-access
...
## Tests
- rusty-claude-cli bin: 177 tests pass (2 new assertions for #148)
- Full workspace green except pre-existing resume_latest flake (unrelated)
Closes ROADMAP #128, #148.
## Problem
The `"prompt"` subcommand arm enforced `if prompt.trim().is_empty()`
and returned a specific error. The fallthrough `other` arm in the same
match block — which routes any unrecognized first positional arg to
`CliAction::Prompt` — had no such guard. Result:
$ claw ""
error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN ...
$ claw " "
error: missing Anthropic credentials; ...
$ claw "" ""
error: missing Anthropic credentials; ...
$ claw --output-format json ""
{"error":"missing Anthropic credentials; ...","type":"error"}
An empty prompt should never reach the credentials check. Worse: with
valid credentials, the literal empty string gets sent to Claude as a
user prompt, either burning tokens for nothing or triggering a model-
side refusal. Same prompt-misdelivery family as #145.
## Root cause
In `parse_subcommand()`, the final `other =>` arm in the top-level
match only guards against typos (#108 guard via `looks_like_subcommand_typo`)
and then unconditionally builds `CliAction::Prompt { prompt: rest.join(" ") }`.
An empty/whitespace-only join passes through.
## Changes
### rust/crates/rusty-claude-cli/src/main.rs
Added the same `if joined.trim().is_empty()` guard already used in the
`"prompt"` arm to the fallthrough path. Error message distinguishes it
from the `prompt` subcommand path:
empty prompt: provide a subcommand (run `claw --help`) or a
non-empty prompt string
Runs AFTER the typo guard (so `claw sttaus` still suggests `status`)
and BEFORE CliAction::Prompt construction (so no network call ever
happens for empty inputs).
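A sketch of the guard ordering described above, under stated assumptions: `parse_fallthrough` is a hypothetical stand-in for the `other =>` arm, the typo guard is stubbed with a single hard-coded suggestion, and `CliAction` is reduced to the one variant needed here.

```rust
#[derive(Debug, PartialEq)]
enum CliAction {
    Prompt { prompt: String },
}

/// Stub for the #108 typo guard; the real heuristic is not shown in this commit.
fn looks_like_subcommand_typo(word: &str) -> Option<&'static str> {
    match word {
        "sttaus" => Some("status"),
        _ => None,
    }
}

/// Hypothetical model of the fallthrough arm: typo guard first, then the
/// empty-prompt guard, and only then construction of the Prompt action.
fn parse_fallthrough(rest: &[&str]) -> Result<CliAction, String> {
    if let [single] = rest {
        if let Some(suggestion) = looks_like_subcommand_typo(single) {
            return Err(format!(
                "unknown subcommand: {single}. Did you mean {suggestion}"
            ));
        }
    }
    let joined = rest.join(" ");
    if joined.trim().is_empty() {
        return Err(
            "empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string"
                .to_string(),
        );
    }
    Ok(CliAction::Prompt { prompt: joined })
}
```

Because both guards return before `CliAction::Prompt` is built, no credentials check or network call can follow an empty input.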
### Regression tests
Added 4 assertions in the existing parse_args test:
- parse_args([""]) → Err("empty prompt: ...")
- parse_args([" "]) → Err("empty prompt: ...")
- parse_args(["", ""]) → Err("empty prompt: ...")
- parse_args(["sttaus"]) → Err("unknown subcommand: ...") [verifies #108 typo guard still takes precedence]
### ROADMAP.md
Added Pinpoint #147 documenting the gap, verification, root cause,
fix shape, and acceptance. Joins the prompt-misdelivery cluster
alongside #145.
## Live verification
$ claw ""
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string
$ claw " "
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string
$ claw --output-format json ""
{"error":"empty prompt: provide a subcommand ...","type":"error"}
$ claw prompt "" # unchanged: subcommand-specific error preserved
error: prompt subcommand requires a prompt string
$ claw hello # unchanged: typo guard still fires
error: unknown subcommand: hello.
Did you mean help
$ claw "real prompt here" # unchanged: real prompts still reach API
error: api returned 401 Unauthorized (with dummy key, as expected)
All empty/whitespace-only paths exit 1. No network call. No misleading
credentials error.
## Tests
- rusty-claude-cli bin: 177 tests pass (4 new assertions)
- Full workspace green except pre-existing resume_latest flake (unrelated)
Closes ROADMAP #147.
## Problem
`claw config` and `claw diff` are pure-local read-only introspection
commands (config merges .claw.json + .claw/settings.json from disk; diff
shells out to `git diff --cached` + `git diff`). Neither needs a session
context, yet both rejected direct CLI invocation:
$ claw config
error: `claw config` is a slash command. Use `claw --resume SESSION.jsonl /config` ...
$ claw diff
error: `claw diff` is a slash command. ...
This forced clawing operators to spin up a full session just to inspect
static disk state, and broke natural pipelines like
`claw config --output-format json | jq`.
## Root cause
Sibling of #145: `SlashCommand::Config { section }` and
`SlashCommand::Diff` had working renderers (`render_config_report`,
`render_config_json`, `render_diff_report`, `render_diff_json_for`)
exposed for resume sessions, but the top-level CLI parser in
`parse_subcommand()` had no arms for them. Zero-arg `config`/`diff`
hit `parse_single_word_command_alias`'s fallback to
`bare_slash_command_guidance`, producing the misleading guidance.
## Changes
### rust/crates/rusty-claude-cli/src/main.rs
- Added `CliAction::Config { section, output_format }` and
`CliAction::Diff { output_format }` variants.
- Added `"config"` / `"diff"` arms to the top-level parser in
`parse_subcommand()`. `config` accepts an optional section name
(env|hooks|model|plugins) matching SlashCommand::Config semantics.
`diff` takes no positional args. Both reject extra trailing args
with a clear error.
- Added `"config" | "diff" => None` to
`parse_single_word_command_alias` so bare invocations fall through
to the new parser arms instead of the slash-guidance error.
- Added dispatch in run() that calls existing renderers: text mode uses
`render_config_report` / `render_diff_report`; JSON mode uses
`render_config_json` / `render_diff_json_for` with
`serde_json::to_string_pretty`.
- Added 5 regression assertions in parse_args test covering:
parse_args(["config"]), parse_args(["config", "env"]),
parse_args(["config", "--output-format", "json"]),
parse_args(["diff"]), parse_args(["diff", "--output-format", "json"]).
### ROADMAP.md
Added Pinpoint #146 documenting the gap, verification, root cause,
fix shape, and acceptance. Explicitly notes which other slash commands
(`hooks`, `usage`, `context`, etc.) are NOT candidates because they
are session-state-modifying.
## Live verification
$ claw config # no config files
Config
Working directory /private/tmp/cd-146-verify
Loaded files 0
Merged keys 0
Discovered files
user missing ...
project missing ...
local missing ...
Exit 0.
$ claw config --output-format json
{
"cwd": "...",
"files": [...],
...
}
$ claw diff # no git
Diff
Result no git repository
Detail ...
Exit 0.
$ claw diff --output-format json # inside claw-code
{
"kind": "diff",
"result": "changes",
"staged": "",
"unstaged": "diff --git ..."
}
Exit 0.
## Tests
- rusty-claude-cli bin: 177 tests pass (5 new assertions in parse_args)
- Full workspace green except pre-existing resume_latest flake (unrelated)
## Not changed
`hooks`, `usage`, `context`, `tasks`, `theme`, `voice`, `rename`,
`copy`, `color`, `effort`, `branch`, `rewind`, `ide`, `tag`,
`output-style`, `add-dir` — all session-mutating or interactive-only;
correctly remain slash-only.
Closes ROADMAP #146.
## Problem
`claw plugins` (and `claw plugins list`, `claw plugins --help`,
`claw plugins info <name>`, etc.) fell through the top-level subcommand
match and got routed into the prompt-execution path. Result: a purely
local introspection command triggered an Anthropic API call and surfaced
`missing Anthropic credentials` to the user. With valid credentials, it
would actually send the literal string "plugins" as a user prompt to
Claude, burning tokens for a local query.
$ claw plugins
error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY before calling the Anthropic API
$ ANTHROPIC_API_KEY=dummy claw plugins
⠋ 🦀 Thinking...
✘ ❌ Request failed
error: api returned 401 Unauthorized
Meanwhile siblings (`agents`, `mcp`, `skills`) all worked correctly:
$ claw agents
No agents found.
$ claw mcp
MCP
Working directory ...
Configured servers 0
## Root cause
`CliAction::Plugins` exists, has a working dispatcher
(`LiveCli::print_plugins`), and is produced inside the REPL via
`SlashCommand::Plugins`. But the top-level CLI parser in
`parse_subcommand()` had arms for `agents`, `mcp`, `skills`, `status`,
`doctor`, `init`, `export`, `prompt`, etc., and **no arm for
`plugins`**. The dispatch never ran from the CLI entry point.
## Changes
### rust/crates/rusty-claude-cli/src/main.rs
Added a `"plugins"` arm to the top-level match in `parse_subcommand()`
that produces `CliAction::Plugins { action, target, output_format }`,
following the same positional convention as `mcp` (`action` = first
positional, `target` = second). Rejects >2 positional args with a clear
error.
Added four regression assertions in the existing `parse_args` test:
- `plugins` alone → `CliAction::Plugins { action: None, target: None }`
- `plugins list` → action: Some("list"), target: None
- `plugins enable <name>` → action: Some("enable"), target: Some(...)
- `plugins --output-format json` → action: None, output_format: Json
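The positional convention can be sketched with a slice pattern. This is an illustration only: `PluginsArgs` and `parse_plugins_positionals` are hypothetical names, `output_format` handling is left out, and the error wording is an assumption.

```rust
/// Hypothetical reduced form of `CliAction::Plugins` (no output_format).
#[derive(Debug, PartialEq)]
struct PluginsArgs {
    action: Option<String>,
    target: Option<String>,
}

/// First positional is the action, second is the target, anything more is rejected.
fn parse_plugins_positionals(positionals: &[&str]) -> Result<PluginsArgs, String> {
    match positionals {
        [] => Ok(PluginsArgs { action: None, target: None }),
        [action] => Ok(PluginsArgs {
            action: Some(action.to_string()),
            target: None,
        }),
        [action, target] => Ok(PluginsArgs {
            action: Some(action.to_string()),
            target: Some(target.to_string()),
        }),
        // Error text is illustrative, not the CLI's actual message.
        [_, _, extra, ..] => Err(format!(
            "unexpected extra argument for `claw plugins`: {extra}"
        )),
    }
}
```

An explicit arm like this is what keeps `plugins` out of the prompt fallthrough, so no API call is attempted for a local query.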
### ROADMAP.md
Added Pinpoint #145 documenting the gap, verification, root cause,
fix shape, and acceptance.
## Live verification
$ claw plugins # no credentials set
Plugins
example-bundled v0.1.0 disabled
sample-hooks v0.1.0 disabled
$ claw plugins --output-format json # no credentials set
{
"action": "list",
"kind": "plugin",
"message": "Plugins\n example-bundled ...\n sample-hooks ...",
"reload_runtime": false,
"target": null
}
Exit 0 in all modes. No network call. No "missing credentials" error.
## Tests
- rusty-claude-cli bin: 177 tests pass (new plugin assertions included)
- Full workspace green except pre-existing resume_latest flake (unrelated)
Closes ROADMAP #145.
Filing + Phase 1 fix in one commit (sibling of #143).
## Context
With #143 Phase 1 landed (`claw status` degrades), `claw mcp` was the
remaining diagnostic surface that hard-failed on a malformed `.claw.json`.
Same input, same parse error, same partial-success violation. Fresh
dogfood at 18:59 KST caught it on main HEAD `e2a43fc`.
## Changes
### ROADMAP.md
Added Pinpoint #144 documenting the gap and acceptance criteria. Joins
the partial-success / Principle #5 cluster with #143.
### rust/crates/commands/src/lib.rs
`render_mcp_report_for()` + `render_mcp_report_json_for()` now catch the
ConfigError at loader.load() instead of propagating:
- **Text mode** prepends a "Config load error" block (same shape as
#143's status output) before the MCP listing. The listing still renders
with empty servers so the output structure is preserved.
- **JSON mode** adds top-level `status: "ok" | "degraded"` +
`config_load_error: string | null` fields alongside existing fields
(`kind`, `action`, `working_directory`, `configured_servers`,
`servers[]`). On clean runs, `status: "ok"` and
`config_load_error: null`. On parse failure, `status: "degraded"`,
`config_load_error: "..."`, `servers: []`, exit 0.
- Both list and show actions get the same treatment.
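A minimal sketch of the degrade-instead-of-propagate shape, assuming the loader result is reduced to `Result<Vec<String>, String>`. `McpEnvelope` and `mcp_envelope` are hypothetical names; the real renderers build a fuller JSON value.

```rust
/// Hypothetical reduced envelope: only the fields added or affected by #144.
#[derive(Debug, PartialEq)]
struct McpEnvelope {
    status: &'static str,
    config_load_error: Option<String>,
    servers: Vec<String>,
}

/// Turn a config load result into an envelope instead of bubbling the error up.
fn mcp_envelope(loaded: Result<Vec<String>, String>) -> McpEnvelope {
    match loaded {
        Ok(servers) => McpEnvelope {
            status: "ok",
            config_load_error: None,
            servers,
        },
        // On parse failure the listing still renders, just with no servers.
        Err(err) => McpEnvelope {
            status: "degraded",
            config_load_error: Some(err),
            servers: Vec::new(),
        },
    }
}
```

Both branches return a value, so the caller can always exit 0 and let consumers branch on `status`.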
### Regression test
`commands::tests::mcp_degrades_gracefully_on_malformed_mcp_config_144`:
- Injects the same malformed .claw.json as #143 (one valid + one broken
mcpServers entry).
- Asserts mcp list returns Ok (not Err).
- Asserts top-level status: "degraded" and config_load_error names the
malformed field path.
- Asserts show action also degrades.
- Asserts clean path returns status: "ok" with config_load_error null.
## Live verification
$ claw mcp --output-format json
{
"action": "list",
"kind": "mcp",
"status": "degraded",
"config_load_error": ".../.claw.json: mcpServers.missing-command: missing string field command",
"working_directory": "/Users/yeongyu/clawd",
"configured_servers": 0,
"servers": []
}
Exit 0.
## Contract alignment after this commit
All three diagnostic surfaces match now:
- `doctor` — degraded envelope with typed check entries ✅
- `status` — degraded envelope with config_load_error ✅ (#143)
- `mcp` — degraded envelope with config_load_error ✅ (this commit)
Phase 2 (typed-error object joining taxonomy §4.44) tracked separately
across all three surfaces.
Full workspace test green except pre-existing resume_latest flake (unrelated).
Closes ROADMAP #144 phase 1.
Previously `claw status` hard-failed on any config parse error, emitting
a bare error string and exiting 1. This took down the entire health
surface for a single malformed MCP entry, even though workspace, git,
model, permission, and sandbox state could all be reported independently.
`claw doctor` already degraded gracefully on the exact same input.
This commit matches `claw status` to that contract.
Changes:
- Add `StatusContext::config_load_error: Option<String>` to capture parse
errors without aborting.
- Rewrite `status_context()` to match on `ConfigLoader::load()`: on Err,
fall back to default `SandboxConfig` for sandbox resolution and record
the parse error, then continue populating workspace/git/memory fields.
- JSON output gains top-level `status: "ok" | "degraded"` marker and a
`config_load_error` string (null on clean runs). All other existing
fields preserved for backward compat.
- Text output prepends a "Config load error" block with Details + Hint
when config failed to parse, then a "Status (degraded)" header on the
main block. Clean runs show the usual "Status" header.
- Doctor path updated to pass the config load error through StatusContext.
Regression test `status_degrades_gracefully_on_malformed_mcp_config_143`:
- Injects a .claw.json with one valid + one malformed mcpServers entry
- Asserts status_context() returns Ok (not Err)
- Asserts config_load_error names the malformed field path
- Asserts workspace/sandbox fields still populated in JSON
- Asserts top-level status is 'degraded'
- Asserts clean config path still returns status: 'ok'
Verified live on /Users/yeongyu/clawd (contains deliberately broken MCP entries):
$ claw status --output-format json
{ "status": "degraded",
"config_load_error": ".../mcpServers.missing-command: missing string field command",
"model": "claude-opus-4-6",
"workspace": {...},
"sandbox": {...},
... }
Phase 2 (typed error object joining the §4.44 taxonomy) tracked separately.
Full workspace test green except pre-existing resume_latest flake (unrelated).
Closes ROADMAP #143 phase 1.
Add two missing sections documenting the recently-fixed commands:
- **Initialize a repository**: Shows both text and JSON output modes for
`claw init`. Explains that structured JSON fields (created[], updated[],
skipped[], artifacts[]) allow claws to detect per-artifact state without
substring-matching prose. Documents idempotency.
- **Inspect worker state**: Documents `claw state` and the prerequisite
that a worker must have executed at least once. Includes the helpful error
message and remediation hints (claw or claw prompt <text>) so users
discovering the command for the first time see actionable guidance.
These sections complement the product fixes in #142 (init JSON structure)
and #139 (state error actionability) by documenting the contract from a
user perspective.
Related: ROADMAP #142 (structured init output), #139 (worker-state discoverability).
Previously `claw state` errored with "no worker state file found ... — run a
worker first" but there is no `claw worker` subcommand, so claws had no
discoverable path from the error to a fix.
Changes:
- Rewrite the missing-state error to name the two concrete commands that
produce .claw/worker-state.json:
* `claw` (interactive REPL, writes state on first turn)
* `claw prompt <text>` (one non-interactive turn)
Also tell the user what to rerun: `claw state [--output-format json]`.
- Expand the State --help topic with "Produces state", "Observes state",
and "Exit codes" lines so the worker-state contract is discoverable
before the user hits the error.
- Add regression test state_error_surfaces_actionable_worker_commands_139
asserting the error contains `claw prompt`, REPL mention, and the
rerun path, plus that the help topic documents the producer contract.
Verified live:
$ claw state
error: no worker state file found at .claw/worker-state.json
Hint: worker state is written by the interactive REPL or a non-interactive prompt.
Run: claw # start the REPL (writes state on first turn)
Or: claw prompt <text> # run one non-interactive turn
Then rerun: claw state [--output-format json]
JSON mode preserves the full hint inside the error envelope so CI/claws
can match on `claw prompt` without losing the canonical prefix.
Full workspace test green except pre-existing resume_latest flake (unrelated).
Closes ROADMAP #139.
Previously `claw init --output-format json` emitted a valid JSON envelope but
packed the entire human-formatted output into a single `message` string. Claw
scripts had to substring-match human language to tell `created` from `skipped`.
Changes:
- Add InitStatus::json_tag() returning machine-stable "created"|"updated"|"skipped"
(unlike label() which includes the human " (already exists)" suffix).
- Add InitReport::NEXT_STEP constant so claws can read the next-step hint
without grepping the message string.
- Add InitReport::artifacts_with_status() to partition artifacts by state.
- Add InitReport::artifact_json_entries() for the structured artifacts[] array.
- Rewrite run_init + init_json_value to emit first-class fields alongside the
legacy message string (kept for text consumers): project_path, created[],
updated[], skipped[], artifacts[], next_step, message.
- Update the slash-command Init dispatch to use the same structured JSON.
- Add regression test artifacts_with_status_partitions_fresh_and_idempotent_runs
asserting both fresh + idempotent runs produce the right partitioning and
that the machine-stable tag is bare 'skipped' not label()'s phrasing.
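The split between the human label and the machine tag can be sketched as follows. The method names come from the list above, but the bodies are assumptions: the `label()` strings other than the " (already exists)" suffix are guesses, and `artifacts_with_status` is written here as a free function over hypothetical `(name, status)` pairs.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InitStatus {
    Created,
    Updated,
    Skipped,
}

impl InitStatus {
    /// Human-facing text; may carry extra phrasing.
    fn label(self) -> &'static str {
        match self {
            Self::Created => "created",
            Self::Updated => "updated",
            Self::Skipped => "skipped (already exists)",
        }
    }

    /// Machine-stable tag; never carries human phrasing.
    fn json_tag(self) -> &'static str {
        match self {
            Self::Created => "created",
            Self::Updated => "updated",
            Self::Skipped => "skipped",
        }
    }
}

/// Return the names of all artifacts whose status equals `wanted`.
fn artifacts_with_status<'a>(
    artifacts: &[(&'a str, InitStatus)],
    wanted: InitStatus,
) -> Vec<&'a str> {
    artifacts
        .iter()
        .filter(|(_, status)| *status == wanted)
        .map(|(name, _)| *name)
        .collect()
}
```

With a stable tag, a claw can compare against `"skipped"` directly instead of substring-matching the `message` field.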
Verified output:
- Fresh dir: created[] has 4 entries, skipped[] empty
- Idempotent call: created[] empty, skipped[] has 4 entries
- project_path, next_step as first-class keys
- message preserved verbatim for backward compat
Full workspace test green except pre-existing resume_latest flake (unrelated).
Closes ROADMAP #142.
Previously, `claw <subcommand> --help` had four different behaviors:
- 7 subcommands returned subcommand-specific help (correct)
- init/export/state/version silently fell back to global `claw --help`
- system-prompt/dump-manifests errored with `unknown <cmd> option: --help`
- bootstrap-plan printed its phase list instead of help text
Changes:
- Extend LocalHelpTopic enum with Init, State, Export, Version, SystemPrompt,
DumpManifests, BootstrapPlan variants.
- Extend parse_local_help_action() to resolve those 7 subcommands to their
local help topic instead of falling through to the main dispatch.
- Remove init/state/export/version from the explicit wants_help=true matcher
so they reach parse_local_help_action() before being routed to global help.
- Add render_help_topic() entries for the 7 new topics with consistent
Usage/Purpose/Output/Formats/Related structure.
- Add regression test subcommand_help_flag_has_one_contract_across_all_subcommands_141
asserting every documented subcommand + both --help and -h variants resolve
to a HelpTopic with non-empty text that contains a Usage line.
Verification:
- All 14 subcommands now return subcommand-specific help (live dogfood).
- Full workspace test green except pre-existing resume_latest flake.
Closes ROADMAP #141.
Previously this test inherited the cargo test runner's CWD, which could contain
a stale .claw/settings.json with "permissionMode": "acceptEdits" written by
another test. The deprecated-field resolver then silently downgraded the
default permission mode to WorkspaceWrite, breaking the test's assertion.
Fix: wrap the assertion in with_current_dir() + env_lock() so the test runs in
an isolated temp directory with no stale config.
Full workspace test now passes except for pre-existing resume_latest flake
(unrelated to #140, environment-dependent, tracked separately).
Closes ROADMAP #140.
Updated LocalHelpTopic help strings to surface --output-format support:
- Status, Sandbox, Doctor, Acp all now show [--output-format <format>]
- Added 'Formats: text (default), json' line to each
Diagnostic verbs support JSON output but help text didn't advertise it.
Post-#127 fix: help text now matches actual CLI surface.
Verified: cargo build passes, claw doctor --help shows output-format.
Refs: #127
USAGE.md now documents:
- `--output-format json` for machine-readable diagnostics
- Note about parse-time rejection of invalid suffix args (post-#127 fix)
Verifies that diagnostic verbs support JSON output for scripting,
and documents the behavior change from #127 (invalid args rejected
at parse time instead of falling through to prompt dispatch).
Refs: #127
Diagnostic verbs (help, version, status, sandbox, doctor, state) now
reject unrecognized suffix arguments at parse time instead of silently
falling through to Prompt dispatch.
Fixes: claw doctor --json (and similar) no longer accepts --json silently
and attempts to send it to the LLM as a prompt. Now properly emits:
'unrecognized argument `--json` for subcommand `doctor`'
Joins the parser-level trust gap quintet: #108 + #117 + #119 + #122 + #127.
Prevents token burn on rejected arguments.
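A sketch of the parse-time rejection, assuming only the flags named elsewhere in this log are legal after a diagnostic verb. `reject_unrecognized_suffix` is a hypothetical name and the allowed-flag set below is an assumption; the error string follows the format quoted above.

```rust
/// Diagnostic verbs covered by the #127 fix.
const DIAGNOSTIC_VERBS: &[&str] = &["help", "version", "status", "sandbox", "doctor", "state"];

/// Hypothetical check: walk the suffix args and fail on the first unknown one.
fn reject_unrecognized_suffix(verb: &str, rest: &[&str]) -> Result<(), String> {
    if !DIAGNOSTIC_VERBS.contains(&verb) {
        return Ok(());
    }
    let mut index = 0;
    while index < rest.len() {
        match rest[index] {
            // Assumed allow-list: output format (two spellings) and help.
            "--output-format" => index += 2,
            arg if arg.starts_with("--output-format=") => index += 1,
            "--help" | "-h" => index += 1,
            other => {
                return Err(format!(
                    "unrecognized argument `{other}` for subcommand `{verb}`"
                ));
            }
        }
    }
    Ok(())
}
```

Returning an error here, before dispatch, is what prevents `--json` from being forwarded to the LLM as prompt text.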
Verified: cargo build --workspace passes, claw doctor --json errors cleanly.
Refs: #127, ROADMAP
Adds ship provenance detection and emission in execute_bash_async():
- Detects git push to main/master commands
- Captures current branch, HEAD commit, git user as actor
- Emits ship.prepared event with ShipProvenance payload
- Logs to stderr as interim routing (event stream integration pending)
This is the first wired provenance event — schema (§4.44.5) now has
runtime emission at actual git operation boundary.
Verified: cargo build --workspace passes.
Next: wire ship.commits_selected, ship.merged, ship.pushed_main events.
Refs: §4.44.5.1, ROADMAP #4.44.5
#122: doctor invocation now checks stale-base condition
- Calls run_stale_base_preflight(None) in render_doctor_report()
- Emits stale-base warnings to stderr when branch is behind main
- Fixes inconsistency: doctor 'ok' vs prompt 'stale base' warning
#125: git_state field reflects non-git directories
- When !in_git_repo, git_state = 'not in git repo' instead of 'clean'
- Fixes contradiction: in_git_repo: false but git_state: 'clean'
- Applied in both doctor text output and status JSON
Verified: cargo build --workspace passes.
Refs: ROADMAP #122 (dd73962), #125 (debbcbe)
Adds run_stale_base_preflight(None) call to render_doctor_report() so that
claw doctor emits stale-base warnings to stderr when the current branch is
behind main. Previously doctor reported 'ok' even when branch was stale,
creating inconsistency with prompt path warnings.
Fixes silent-state inventory gap: doctor now consistent with prompt/repl
stale-base checking. No behavior change for non-stale branches.
Verified: cargo build --workspace passes, no test failures.
Ref: ROADMAP #122 dogfood filing @ dd73962
Dogfood cycle 2026-04-20 identified that §4.44.5 ship/provenance event schema
is implemented (ShipProvenance struct, ship.* constructors, tests pass) but
actual git push/merge/commit-range operations do not yet emit these events.
Events remain dead code—constructors exist but are never called during real
workflows. This pinpoint tracks the missing wiring: locating actual git
operation call sites in main.rs/tools/lib.rs/worker_boot.rs and intercepting
to emit ship.prepared/commits_selected/merged/pushed_main with real metadata
(source_branch, commit_range, merge_method, actor, pr_number).
Acceptance: at least one real git push emits all 4 events with actual payload
values, claw state JSON surfaces ship provenance.
Ref: dogfood gaebal-gajae @ 1495672954573291571 (15:30 KST)
Adds structured ship provenance surface to eliminate delivery-path opacity:
New lane events:
- ship.prepared — intent to ship established
- ship.commits_selected — commit range locked
- ship.merged — merge completed with provenance
- ship.pushed_main — delivery to main confirmed
ShipProvenance struct carries:
- source_branch, base_commit
- commit_count, commit_range
- merge_method (direct_push/fast_forward/merge_commit/squash_merge/rebase_merge)
- actor, pr_number
Constructor methods added to LaneEvent for all four ship events.
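A rough sketch of the payload shape, for orientation only. Field names and merge-method wire values are taken from the lists above; the field types, the optionality of `pr_number`, and the `describe` helper are assumptions, and the real struct presumably derives serde traits rather than hand-rolling `as_str`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MergeMethod {
    DirectPush,
    FastForward,
    MergeCommit,
    SquashMerge,
    RebaseMerge,
}

impl MergeMethod {
    /// Wire value as listed in the commit message.
    fn as_str(self) -> &'static str {
        match self {
            Self::DirectPush => "direct_push",
            Self::FastForward => "fast_forward",
            Self::MergeCommit => "merge_commit",
            Self::SquashMerge => "squash_merge",
            Self::RebaseMerge => "rebase_merge",
        }
    }
}

/// Field types are assumed; only the names come from the source.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ShipProvenance {
    source_branch: String,
    base_commit: String,
    commit_count: usize,
    commit_range: String,
    merge_method: MergeMethod,
    actor: String,
    pr_number: Option<u64>, // assumed optional: a direct push has no PR
}

/// Hypothetical one-line summary of a ship event for logs.
fn describe(event: &str, p: &ShipProvenance) -> String {
    format!(
        "{event} {} -> main via {} ({} commits by {})",
        p.source_branch,
        p.merge_method.as_str(),
        p.commit_count,
        p.actor
    )
}
```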
Tests:
- Wire value serialization for ship events
- Round-trip deserialization
- Canonical event name coverage
Runtime: 465 tests pass
ROADMAP updated with IMPLEMENTED status
This closes the gap where 56 commits pushed to main had no structured
provenance trail — now emits first-class events for clawhip consumption.
Added structured delivery-path contract to surface branch → merge → main-push
provenance as first-class events. Filed from the 56-commit 2026-04-20 push
that exposed the gap.
Also fixes: ApiError test compilation — add suggested_action: None to 4 sites
- Line ~8414: opaque_provider_wrapper_surfaces_failure_class_session_and_trace
- Line ~8436: retry_exhaustion_uses_retry_failure_class_for_generic_provider_wrapper
- Line ~8499: provider_context_window_errors_are_reframed_with_same_guidance
- Line ~8533: retry_wrapped_context_window_errors_keep_recovery_guidance
Dogfooded 2026-04-18 on main HEAD debbcbe from /tmp/cdBB2.
Non-git directory:
$ mkdir /tmp/cdBB2 && cd /tmp/cdBB2 # NO git init
$ claw --output-format json status | jq .workspace.git_state
'clean' # should be null — not in a git repo
$ claw --output-format json doctor | jq '.checks[]
| select(.name=="workspace") | {in_git_repo, git_state}'
{"in_git_repo": false, "git_state": "clean"}
# CONTRADICTORY: not in git BUT git is 'clean'
Trace:
main.rs:2550-2554 parse_git_workspace_summary:
let Some(status) = status else {
return summary; // all-zero default when no git
};
All-zero GitWorkspaceSummary → is_clean() (changed_files==0)
→ true → headline() = 'clean'
main.rs:4950 status JSON: git_summary.headline() for git_state
main.rs:1856 doctor workspace: same headline() for git_state
Fix shape (~25 lines):
- Return Option<GitWorkspaceSummary> when status is None
- headline() returns Option<String>: None when no git
- Status JSON: git_state: null when not in git
- Doctor: omit git_state when in_git_repo: false, or set null
- Optional: claw init skip .gitignore in non-git dirs
- Regression: non-git → null, git clean → 'clean',
detached HEAD → 'clean' + 'detached HEAD'
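The `Option`-returning fix shape can be sketched like this. It is an illustration of the proposal above, not the landed code: the status text is assumed to look like `git status --short --branch` output, `git_state` is a hypothetical helper, and the summary struct is reduced to a single counter.

```rust
/// Reduced stand-in for the real summary struct.
#[derive(Debug, Default, PartialEq)]
struct GitWorkspaceSummary {
    changed_files: usize,
}

impl GitWorkspaceSummary {
    fn headline(&self) -> &'static str {
        if self.changed_files == 0 { "clean" } else { "dirty" }
    }
}

/// Return None when there is no git status at all, instead of an all-zero default.
fn parse_git_workspace_summary(status: Option<&str>) -> Option<GitWorkspaceSummary> {
    let status = status?;
    // Assumed format: a `## branch` header line followed by one line per changed file.
    let changed_files = status
        .lines()
        .filter(|line| !line.trim().is_empty() && !line.starts_with("##"))
        .count();
    Some(GitWorkspaceSummary { changed_files })
}

/// Hypothetical value for the `git_state` JSON field: None serializes as null.
fn git_state(status: Option<&str>) -> Option<&'static str> {
    parse_git_workspace_summary(status).map(|summary| summary.headline())
}
```

Moving the "no git" case into the type removes the contradiction: an all-zero summary can no longer be mistaken for a clean repository.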
Joins Truth-audit — 'clean' is a lie for non-git dirs.
Adjacent to #89 (claw blind to mid-rebase) — same field,
different missing state.
Joins #100 (status/doctor JSON gaps) — another field whose
value doesn't reflect reality.
Natural bundle: #89 + #100 + #125 — git-state-completeness
triple: rebase/merge invisible (#89) + stale-base unplumbed
(#100) + non-git 'clean' lie (#125). Complete git_state
field failure coverage.
Filed in response to Clawhip pinpoint nudge 1495016073085583442
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD bb76ec9 from /tmp/cdAA2.
--model flag has zero validation:
claw --model sonet status → model:'sonet' (typo passthrough)
claw --model '' status → model:'' (empty accepted)
claw --model garbage status → model:'garbage' (any string)
Valid aliases do resolve:
sonnet → claude-sonnet-4-6
opus → claude-opus-4-6
Config aliases also resolve via resolve_model_alias_with_config
But unresolved strings pass through silently. Typo 'sonet'
becomes literal model ID sent to API → fails late with
'model not found' after full context assembly.
Compare:
--reasoning-effort: validates low|medium|high. Has guard.
--permission-mode: validates against known set. Has guard.
--model: no guard. Any string.
--base-commit: no guard (#122). Same pattern.
status JSON:
{model: 'sonet'} — shows resolved name only.
No model_source (flag/config/default).
No model_raw (pre-resolution input).
No model_valid (known to any provider).
Claw can't distinguish typo from exact model from alias.
Trace:
main.rs:470-480 --model parsing:
model = value.clone(); index += 2;
No validation. Raw string stored.
main.rs:1032-1046 resolve_model_alias_with_config:
resolves known aliases. Unknown strings pass through.
main.rs:~4951 status JSON builder:
reports resolved model. No source/raw/valid fields.
Fix shape (~65 lines):
- Reject empty string at parse time
- Warn on unresolved aliases with fuzzy-match suggestion
- Add model_source, model_raw to status JSON
- Add model-validity check to doctor
- Regression per failure mode
Joins #105 (4-surface model disagreement) — model pair:
#105 status ignores config model, doctor mislabels
#124 --model flag unvalidated, no provenance in JSON
Joins #122 (--base-commit zero validation) — unvalidated-flag
pair: same parser pattern, no guards.
Joins Silent-flag/documented-but-unenforced as 17th.
Joins Truth-audit — status model field has no provenance.
Joins Parallel-entry-point asymmetry as 10th.
Filed in response to Clawhip pinpoint nudge 1495000973914144819
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD d1608ae from /tmp/cdYY.
Four related findings:
1. --base-commit has zero validation:
$ claw --base-commit doctor
warning: worktree HEAD (...) does not match expected
base commit (doctor). Session may run against a stale
codebase.
error: missing Anthropic credentials; ...
# 'doctor' used as base-commit value literally.
# Subcommand absorbed. Prompt fallthrough. Billable.
2. Greedy swallow of next flag:
$ claw --base-commit --model sonnet status
warning: ...does not match expected base commit (--model)
# '--model' taken as value. status never dispatched.
3. Garbage values silently accepted:
$ claw --base-commit garbage status
Status ...
# No validation. No warning (status path doesn't run check).
4. Stale-base signal missing from JSON surfaces:
$ claw --output-format json --base-commit $BASE status
{"kind":"status", ...}
# no stale_base, no base_commit, no base_commit_mismatch.
Stale-base check runs ONLY on Prompt path, as stderr prose.
Trace:
main.rs:487-494 --base-commit parsing:
'base-commit' => {
let value = args.get(index + 1).ok_or_else(...)?;
base_commit = Some(value.clone());
index += 2;
}
No format check. No reject-on-flag-prefix. No reject-on-
known-subcommand.
Compare main.rs:498-510 --reasoning-effort:
validates 'low' | 'medium' | 'high'. Has guard.
stale_base.rs check_base_commit runs on Prompt/turn path
only. No Status/Doctor handler includes base_commit field.
grep -E 'stale_base|base_commit_matches|base_commit:'
rust/crates/rusty-claude-cli/src/main.rs | grep -E 'status|doctor'
→ zero matches.
Fix shape (~40 lines):
- Reject values starting with '-' (flag-like)
- Reject known-subcommand names as values
- Optionally run 'git cat-file -e {value}' to verify real commit
- Plumb base_commit + base_commit_matches + stale_base_warning
into Status and Doctor JSON surfaces
- Emit warning as structured JSON event too (not just stderr)
- Regression per failure mode
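A minimal sketch of the first two guards in the fix shape above, plus a format check. The function name validate_base_commit_value, the error wording, and the 7-40 hex-character rule are assumptions for illustration, not the parser's actual code. The subcommand list is the one named in #100. A real fix could replace the hex check with 'git cat-file -e'.

```rust
/// Subcommands that must never be swallowed as a --base-commit value.
/// Assumption: this list mirrors the subcommands named in #100.
const KNOWN_SUBCOMMANDS: &[&str] = &[
    "status", "doctor", "sandbox", "init", "export", "mcp", "skills", "agents",
];

/// Validate a candidate `--base-commit` value before accepting it.
/// Hypothetical helper; returns the value on success, a message on rejection.
pub fn validate_base_commit_value(value: &str) -> Result<String, String> {
    // Guard 1: a flag-like token means the real value is missing.
    if value.starts_with('-') {
        return Err(format!("--base-commit expects a commit, got flag `{value}`"));
    }
    // Guard 2: a known subcommand name means the value was omitted.
    if KNOWN_SUBCOMMANDS.contains(&value) {
        return Err(format!("--base-commit expects a commit, got subcommand `{value}`"));
    }
    // Guard 3 (assumption): accept only 7-40 hex characters.
    let looks_like_sha =
        (7..=40).contains(&value.len()) && value.chars().all(|c| c.is_ascii_hexdigit());
    if !looks_like_sha {
        return Err(format!("--base-commit value `{value}` is not a commit SHA"));
    }
    Ok(value.to_string())
}
```

With these guards, all three repros above ('doctor', '--model', 'garbage') are rejected before dispatch instead of reaching the Prompt path.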
Joins Silent-flag/documented-but-unenforced (#96-#101, #104,
#108, #111, #115, #116, #117, #118, #119, #121) as 15th.
Joins Parser-level trust gaps: #108 + #117 + #119 + #122 —
billable-token silent-burn via parser too-eager consumption.
Joins Parallel-entry-point asymmetry (#91, #101, #104, #105,
#108, #114, #117) as 8th — stale-base implemented for Prompt
but absent from Status/Doctor.
Joins Truth-audit — 'expected base commit (doctor)' lies by
including user's mistake as truth.
Cross-cluster with Unplumbed-subsystem (#78, #96, #100, #102,
#103, #107, #109, #111, #113, #121) — stale-base signal in
runtime but not JSON.
Natural bundles:
Parser-level trust gap quintet (grown):
#108 + #117 + #119 + #122 — billable-token silent-burn
via parser too-eager consumption.
#100 + #122 — stale-base diagnostic-integrity pair:
#100 stale-base subsystem unplumbed (general)
#122 --base-commit accepts anything, greedy, Status/Doctor
JSON unplumbed (specific)
Filed in response to Clawhip pinpoint nudge 1494978319920136232
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 3848ea6 from /tmp/cdUU.
The 'this is a slash command' helpful-error only fires when
invoked EXACTLY bare. Adding ANY argument silently falls through
to Prompt dispatch and burns billable tokens.
$ claw --output-format json hooks
{"error":"`claw hooks` is a slash command. Use `claw
--resume SESSION.jsonl /hooks`..."}
# clean error
$ claw --output-format json hooks --help
{"error":"missing Anthropic credentials; ..."}
# Prompt fallthrough. The CLI tried to send 'hooks --help'
# to the LLM as a user prompt.
9 known slash-only verbs affected:
hooks, plan, theme, tasks, subagent, agent, providers,
tokens, cache
All exhibit identical pattern:
bare verb → clean error
verb + any arg (--help, list, on, off, --json, etc) →
Prompt fallthrough, billable LLM call
User pattern: 'claw status --help' prints usage. So users
naturally try 'claw hooks --help' expecting same. Gets
charged for prompt 'hooks --help' to LLM instead.
Trace:
main.rs:745-763 entry point:
if rest.len() != 1 { return None; } <-- THE BUG
match rest[0].as_str() {
'help' => ...,
'version' => ...,
other => bare_slash_command_guidance(other).map(Err),
}
main.rs:765-793 bare_slash_command_guidance:
looks up command in slash_command_specs()
returns helpful error string
WORKS CORRECTLY — just never called when args present
Claude Code convention: 'claude hooks --help' prints usage,
'claude hooks list' lists hooks. claw-code silently charges.
Compare sibling bugs:
#108 typo'd verb + args → Prompt (typo path)
#117 -p 'text' --arg → Prompt with swallowed flags (greedy -p)
#119 known slash-verb + any arg → Prompt (too-narrow guidance)
All three are silent-billable-token-burn. Same underlying cause:
too-narrow parser detection + greedy Prompt dispatch.
Fix shape (~35 lines):
- Remove rest.len() != 1 gate. Widen to:
if rest.is_empty() { return None; }
let first = rest[0].as_str();
if rest.len() == 1 {
// existing bare-verb handling
}
if let Some(guidance) = bare_slash_command_guidance(first) {
return Some(Err(format!(
"{} The extra argument `{}` was not recognized.",
guidance, rest[1..].join(" ")
)));
}
None
- Subcommand --help support: catch --help for all recognized
slash verbs, print SlashCommandSpec.description
- Regression tests: 'claw <verb> --help' prints help,
'claw <verb> any arg' prints guidance, no Prompt fallthrough
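The widened gate can also be sketched as a self-contained function. The name slash_verb_guard and the guidance wording are assumptions. The verb list is the nine slash-only verbs listed above. The real code looks verbs up through slash_command_specs() rather than a constant.

```rust
/// The nine slash-only verbs named in this finding.
const SLASH_ONLY_VERBS: &[&str] = &[
    "hooks", "plan", "theme", "tasks", "subagent", "agent", "providers", "tokens", "cache",
];

/// Hypothetical guard: returns Some(Err(..)) whenever the first token is a
/// slash-only verb, with or without trailing arguments. None means the
/// caller should continue with normal dispatch.
pub fn slash_verb_guard(rest: &[String]) -> Option<Result<(), String>> {
    let first = rest.first()?.as_str();
    if !SLASH_ONLY_VERBS.contains(&first) {
        return None;
    }
    let guidance = format!(
        "`claw {first}` is a slash command. Use `claw --resume SESSION.jsonl /{first}`."
    );
    if rest.len() == 1 {
        // Bare verb: existing behaviour, guidance only.
        return Some(Err(guidance));
    }
    // Verb plus arguments: same guidance, and name the unrecognized tail.
    Some(Err(format!(
        "{guidance} The extra arguments `{}` were not recognized.",
        rest[1..].join(" ")
    )))
}
```

The key property is that 'hooks --help' now returns a structured error and never reaches the Prompt arm.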
Joins Silent-flag/documented-but-unenforced (#96-#101, #104,
#108, #111, #115, #116, #117, #118) as 14th.
Joins Claude Code migration parity (#103, #109, #116, #117)
as 5th — muscle memory from claude <verb> --help burns tokens.
Joins Truth-audit — 'missing credentials' is a lie; real cause
is CLI invocation was interpreted as chat prompt.
Cross-cluster with Parallel-entry-point asymmetry — slash-verb
with args is another entry point differing from bare form.
Natural bundles:
#108 + #117 + #119 — billable-token silent-burn triangle:
typo fallthrough (#108) +
flag swallow (#117) +
known-slash-verb fallthrough (#119)
#108 + #111 + #118 + #119 — parser-level trust gap quartet:
typo fallthrough + 2-way collapse + 3-way collapse +
known-verb fallthrough
Filed in response to Clawhip pinpoint nudge 1494948121099243550
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD ad02761 from /tmp/cdRR.
Three related gaps in one finding:
1. Unknown keys are strict ERRORS, not warnings:
{"permissions":{"defaultMode":"default"},"futureField":"x"}
$ claw --output-format json status
# stdout: empty
# stderr: {"type":"error","error":"unknown key futureField"}
# exit: 1
2. Claude Code migration parity broken:
$ cp .claude.json .claw.json
# .claude.json has apiKeyHelper (real Claude Code field)
$ claw --output-format json status
# stderr: unknown key apiKeyHelper → exit 1
No 'this is a Claude Code field we don't support, ignored' message.
3. Only errors[0] is reported — iterative discovery required:
3 unknown fields → 3 edit-run-fix cycles to fix them all.
Error-routing split with --output-format json:
success → stdout
errors → stderr (structured JSON)
Empty stdout on config errors. A claw piping stdout silently
gets nothing. Must capture both streams.
No escape hatch. No --ignore-unknown-config, no --strict flag,
no strictValidation config option.
Trace:
config.rs:282-291 ConfigLoader gate:
let validation = validate_config_file(...);
if !validation.is_ok() {
let first_error = &validation.errors[0];
return Err(ConfigError::Parse(first_error.to_string()));
}
all_warnings.extend(validation.warnings);
config_validate.rs:19-47 DiagnosticKind::UnknownKey:
level: DiagnosticLevel::Error (not Warning)
config_validate.rs schema allow-list is hard-coded. No
forward-compat extension (no x-* reserved namespace, no
additionalProperties: true, no opt-in lax mode).
grep 'apiKeyHelper' rust/crates/runtime/ → 0 matches.
Claude-Code-native fields not tolerated as no-ops.
grep -rE 'ignore.*unknown|--no-validate|strict.*validation' \
rust/crates/ → 0 matches. No escape hatch.
Fix shape (~100 lines):
- Downgrade UnknownKey Error → Warning default. ~5 lines.
- Add strict mode flag: .claw.json strictValidation: true OR
--strict-config CLI flag. Default off. ~15 lines.
- Collect all diagnostics, don't halt on first. ~20 lines.
- TOLERATED_CLAUDE_CODE_FIELDS allow-list: apiKeyHelper, env
etc. emit migration-hint warning 'not yet supported; ignored'
instead of hard-fail. ~30 lines.
- Emit structured error envelope on stdout too, not just stderr.
--output-format json stdout includes config_diagnostics[]. ~15.
- Wire suggestion: Option<String> for UnknownKey via fuzzy
match ('permisions' → 'permissions'). ~15 lines.
- Regression tests per outcome.
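A rough sketch of the first four bullets in the fix shape above: collect every diagnostic, downgrade unknown keys to warnings unless strict mode is on, and give tolerated Claude Code fields a migration hint. The types, the function name, and the KNOWN_KEYS subset are hypothetical and do not match config_validate.rs.

```rust
#[derive(Debug, PartialEq)]
pub enum Level {
    Warning,
    Error,
}

#[derive(Debug, PartialEq)]
pub struct Diagnostic {
    pub level: Level,
    pub message: String,
}

/// Hypothetical subset of the schema allow-list.
const KNOWN_KEYS: &[&str] = &["permissions", "model", "hooks"];
/// Claude Code fields tolerated as no-ops (names taken from the fix shape).
const TOLERATED_CLAUDE_CODE_FIELDS: &[&str] = &["apiKeyHelper", "env"];

/// Validate top-level keys and return ALL diagnostics, not just the first.
pub fn validate_top_level_keys(keys: &[&str], strict: bool) -> Vec<Diagnostic> {
    let mut diagnostics = Vec::new();
    for key in keys {
        if KNOWN_KEYS.contains(key) {
            continue;
        }
        if TOLERATED_CLAUDE_CODE_FIELDS.contains(key) {
            // Migration hint: never fatal, even in strict mode.
            diagnostics.push(Diagnostic {
                level: Level::Warning,
                message: format!("`{key}` is a Claude Code field: not yet supported; ignored"),
            });
            continue;
        }
        // Unknown key: warning by default, error only when strict.
        let level = if strict { Level::Error } else { Level::Warning };
        diagnostics.push(Diagnostic { level, message: format!("unknown key {key}") });
    }
    diagnostics
}
```

A config with three unknown fields then yields three diagnostics in one run rather than three edit-run-fix cycles.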
Joins Claude Code migration parity (#103, #109) as 3rd member —
most severe migration break. #103 silently drops .md files,
#109 stderr-prose warnings, #116 outright hard-fails.
Joins Reporting-surface/config-hygiene (#90, #91, #92, #110,
#115) on error-routing-vs-stdout axis.
Joins Silent-flag/documented-but-unenforced (#96-#101, #104,
#108, #111, #115) — only first error reported, rest silent.
Cross-cluster with Truth-audit — validation.is_ok() hides all
but first structured problem.
Natural bundles:
#103 + #109 + #116 — Claude Code migration parity triangle:
loss of compat (.md dropped) +
loss of structure (stderr prose warnings) +
loss of forward-compat (unknowns hard-fail)
#109 + #116 — config validation reporting surface:
only first warning surfaces structurally (#109)
only first error surfaces structurally AND halts (#116)
Filed in response to Clawhip pinpoint nudge 1494925472239321160
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 43eac4d from /tmp/cdNN and /tmp/cdOO.
Three related findings on session reference resolution asymmetry:
1. /clear divergence (primary):
- /clear --confirm rewrites session_id inside the file header
but reuses the old filename.
- /session list reads meta header, reports new id.
- --resume looks up by filename stem, not meta header.
- Net: /session list reports ids that --resume can't resolve.
Concrete:
claw --resume ses /clear --confirm
→ new_session_id: session-1776481564268-1
→ file still named ses.jsonl, meta session_id now the new id
claw --resume ses /session list
→ active: session-1776481564268-1
claw --resume session-1776481564268-1
→ ERROR session not found
2. .bak files filtered out of /session list silently:
ls .claw/sessions/<bucket>/
ses.jsonl ses.jsonl.before-clear-<ts>.bak
/session list → only ses.jsonl visible, .bak zero discoverability
is_managed_session_file only matches .jsonl and .json.
3. 0-byte session files fabricate phantom sessions:
touch .claw/sessions/<bucket>/emptyses.jsonl
claw --resume emptyses /session list
→ active: session-<ms>-0
→ sessions: [session-<ms>-1]
Two different fabricated ids, neither persisted to disk.
--resume either fabricated id → 'session not found'.
Trace:
session_control.rs:86-116 resolve_reference:
handle.id = session_id_from_path(&path) (filename stem)
.unwrap_or_else(|| ref.to_string())
Meta header NEVER consulted for ref → id mapping.
session_control.rs:118-137 resolve_managed_path:
for ext in [jsonl, json]:
path = sessions_root / '{ref}.{ext}'
if path.exists(): return
Lookup key is filename. Zero fallback to meta scan.
session_control.rs:228-285 collect_sessions_from_dir:
on load success: summary.id = session.session_id (meta)
on load failure: summary.id = path.file_stem() (filename)
/session list thus reports meta ids for good files.
/clear handler rewrites session_id in-place, writes to same
session_path. File keeps old name, gets new id inside.
is_managed_session_file filters .jsonl/.json only. .bak invisible.
Fix shape (~90 lines):
- /clear preserves filename's identity (Option A: keep session_id,
wipe content). /session fork handles new-id semantics (#113).
- resolve_reference falls back to meta-header scan when filename
lookup fails. Covers legacy divergent files.
- /session list surfaces backups via --include-backups flag OR
separate backups: [] array with structured metadata.
- 0-byte session files produce SessionError::EmptySessionFile
instead of silent fabrication. Structured error, not phantom.
- regression tests per failure mode.
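The meta-header fallback from the fix shape above can be sketched over an in-memory listing. SessionFile and this resolve_reference signature are illustrative assumptions. The real function works on paths under the sessions root.

```rust
/// One managed session file: its filename stem and the session_id stored
/// in its meta header. Hypothetical shape for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct SessionFile {
    pub file_stem: String,
    pub meta_session_id: String,
}

/// Resolve a reference by filename stem first, then fall back to a scan of
/// meta-header ids so that ids reported by /session list stay resumable.
pub fn resolve_reference<'a>(files: &'a [SessionFile], reference: &str) -> Option<&'a SessionFile> {
    files
        .iter()
        .find(|f| f.file_stem == reference)
        .or_else(|| files.iter().find(|f| f.meta_session_id == reference))
}
```

With the fallback in place, the id that /session list prints after /clear --confirm resolves to the same file as the original filename.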
Joins Session-handling: #93 + #112 + #113 + #114 — reference
resolution + concurrent-modification + programmatic management +
reference/enumeration asymmetry. Complete session-handling cluster.
Joins Truth-audit — /session list output factually wrong about
what is resumable.
Cross-cluster with Parallel-entry-point asymmetry (#91, #101,
#104, #105, #108) — entry points reading same underlying data
produce mutually inconsistent identifiers.
Natural bundle: #93 + #112 + #113 + #114 (session-handling
quartet — complete coverage).
Alternative bundle: #104 + #114 — /clear filename semantics +
/export filename semantics both hide identity in filename.
Filed in response to Clawhip pinpoint nudge 1494895272936079493
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD a049bd2 from /tmp/cdII.
5 concurrent /compact on same session → 4 succeed, 1 races with
raw ENOENT. Same pattern with concurrent /clear --confirm.
Trace:
session.rs:204-212 save_to_path:
rotate_session_file_if_needed(path)?
write_atomic(path, &snapshot)?
cleanup_rotated_logs(path)?
Three steps. No lock around sequence.
session.rs:1085-1094 rotate_session_file_if_needed:
metadata(path) → rename(path, rot_path)
Classic TOCTOU. Race window between check and rename.
session.rs:1063-1071 write_atomic:
writes .tmp-{ts}-{counter}, renames to path
Atomic per rename, not per multi-step sequence.
cleanup_rotated_logs deletes .rot-{ts} files older than 3 most
recent. Can race against another process reading that rot file.
No flock, no advisory lock file, no fcntl.
grep -E 'flock|FileLock|advisory' session.rs → zero matches.
SessionError::Io Display forwards io::Error Display:
'No such file or directory (os error 2)'
No domain translation to 'session file vanished during save'
or 'concurrent modification detected, retry safe'.
Fix shape (~90 lines + test):
- advisory lock: .claw/sessions/<bucket>/<session>.jsonl.lock
exclusive flock for duration of save_to_path (fs2 crate)
- domain error variants:
SessionError::ConcurrentModification {path, operation}
SessionError::SessionFileVanished {path}
- error-to-JSON mapping:
{error_kind: 'concurrent_modification', retry_safe: true}
- retry-policy hints on idempotent ops (/compact, /clear)
- regression test: spawn 10 concurrent /compact, assert all
success OR structured ConcurrentModification (no raw os_error)
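A std-only sketch of the locking idea. Note the swap: instead of the flock/fs2 approach proposed above, it uses a sibling lock file created with create_new, which fails atomically if the file already exists. SaveLock, SaveError and save_with_lock are hypothetical names, and the plain fs::write stands in for the real rotate/write_atomic/cleanup sequence.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::{Path, PathBuf};

/// RAII guard: the lock file is removed when the guard is dropped.
pub struct SaveLock {
    path: PathBuf,
}

impl SaveLock {
    /// Create `<session>.lock`; fails with AlreadyExists if it is held.
    pub fn acquire(session_path: &Path) -> io::Result<SaveLock> {
        let mut name = session_path.as_os_str().to_owned();
        name.push(".lock");
        let path = PathBuf::from(name);
        OpenOptions::new().write(true).create_new(true).open(&path)?;
        Ok(SaveLock { path })
    }
}

impl Drop for SaveLock {
    fn drop(&mut self) {
        let _ = fs::remove_file(&self.path);
    }
}

/// Domain error instead of a raw "os error 2".
#[derive(Debug, PartialEq)]
pub enum SaveError {
    ConcurrentModification,
    Io(io::ErrorKind),
}

/// Hold the lock for the whole save sequence (here: a single write).
pub fn save_with_lock(session_path: &Path, snapshot: &str) -> Result<(), SaveError> {
    let _guard = SaveLock::acquire(session_path).map_err(|e| match e.kind() {
        io::ErrorKind::AlreadyExists => SaveError::ConcurrentModification,
        kind => SaveError::Io(kind),
    })?;
    fs::write(session_path, snapshot).map_err(|e| SaveError::Io(e.kind()))
}
```

One trade-off of this variant: a crashed process leaves a stale lock file behind, which an OS-level flock would not. A production fix would need stale-lock detection or the flock approach.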
Affected operations:
- /compact (session save_to_path after compaction)
- /clear --confirm (save_to_path after new session)
- /export (may hit rotation boundary)
- Turn-persist (append_persisted_message can race rotation)
Not inherently a bug if sessions are single-writer, but
workspace-bucket scoping at session_control.rs:31-32 assumes
one claw per workspace. Parallel ulw lanes, CI matrix runners,
orchestration loops all violate that assumption.
Joins truth-audit (error lies by omission about what happened).
New micro-cluster 'session handling' with #93. Adjacent to
#104 on session-file-handling axis.
Natural bundle: #93 + #112 (session semantic correctness +
concurrency error clarity).
Filed in response to Clawhip pinpoint nudge 1494880177099116586
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD b2366d1 from /tmp/cdHH.
Specification mismatch at the command-dispatch layer:
commands/src/lib.rs:716-720 SlashCommandSpec registry:
name: 'providers', summary: 'List available model providers'
commands/src/lib.rs:1386 parser:
'doctor' | 'providers' => SlashCommand::Doctor
So /providers dispatches to SlashCommand::Doctor. A claw calling
/providers expecting {kind: 'providers', providers: [...]} gets
{kind: 'doctor', checks: [auth, config, install_source, workspace,
sandbox, system]} instead. Same top-level kind field name,
completely different payload.
Help text lies twice:
--help slash listing: '/providers List available model providers'
--help Resume-safe summary: includes /providers
Unlike STUB_COMMANDS (#96) which fail noisily, /providers fails
QUIETLY — returns wrong subsystem output.
Runtime has provider data:
ProviderKind::{Anthropic, Xai, OpenAi, ...} at main.rs:1143-1147
resolve_repl_model with provider-prefix routing
pricing_for_model with per-provider costs
provider_fallbacks config field
Scaffolding is present; /providers just doesn't use it.
By contrast /tokens → Stats and /cache → Stats are semantically
reasonable (Stats has the requested data). /providers → Doctor
is genuinely bizarre.
Fix shape:
A. Implement: SlashCommand::Providers variant + render helper
using ProviderKind + provider_fallbacks + env-var check (~60)
B. Remove: delete 'providers' from registry + parser (~3 lines)
then /providers becomes 'unknown, did you mean /doctor?'
Either way: fix --help to match.
Parallel to #78 (claw plugins CLI variant never constructed,
falls through to prompt). Both are 'declared in spec, not
implemented as declared.' #78 fails noisy, #111 fails quiet.
Joins silent-flag cluster (#96-#101, #104, #108) — 8th
doc-vs-impl mismatch. Joins unplumbed-subsystem (#78, #96,
#100, #102, #103, #107, #109) as 8th declared-but-not-
delivered surface. Joins truth-audit.
Natural bundles:
#78 + #96 + #111 — declared-but-not-as-declared triangle
#96 + #108 + #111 — full --help/dispatch hygiene quartet
(help-filter-leaks + subcommand typo fallthrough + slash
mis-dispatch)
Filed in response to Clawhip pinpoint nudge 1494872623782301817
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 16244ce from /tmp/cdGG/nested/deep/dir.
ConfigLoader::discover at config.rs:242-270 hardcodes every
project/local path as self.cwd.join(...):
- self.cwd.join('.claw.json')
- self.cwd.join('.claw').join('settings.json')
- self.cwd.join('.claw').join('settings.local.json')
No ancestor walk. No consultation of project_root.
Concrete:
cd /tmp/cdGG && git init && echo '{"permissions":{"defaultMode":"read-only"}}' > .claw.json
cd /tmp/cdGG/nested/deep/dir
claw status → permission_mode: 'danger-full-access' (fallback)
claw doctor → 'Config files loaded 0/0, defaults are active'
But project_root: /tmp/cdGG is correctly detected via git walk.
Same config file, same repo, invisible from subdirectory.
Meanwhile CLAUDE.md discovery walks ancestors unbounded (per #85
over-discovery). Same subsystem category, opposite policy, no doc.
Security-adjacent per #87: permission-mode fallback is
danger-full-access. cd'ing to a subdirectory silently upgrades
from read-only (configured) → danger-full-access (fallback) —
workspace-location-dependent permission drift.
Fix shape (~90 lines):
- add project_root_for(&cwd) helper (reuse git-root walker from
render_doctor_report)
- config search: user → project_root/.claw.json →
project_root/.claw/settings.json → cwd/.claw.json (overlay) →
cwd/.claw/settings.* (overlays)
- optionally walk intermediate ancestors
- surface 'where did my config come from' in doctor (pairs with
#106 + #109 provenance)
- warn when cwd has no config but project_root does
- documentation parity with CLAUDE.md
- regression tests per cwd depth + overlay precedence
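The proposed search order can be sketched as a pure path computation. The function name project_config_candidates is an assumption, the user-level layer is left out, and the order follows the fix shape above: project_root entries first, then cwd overlays.

```rust
use std::path::{Path, PathBuf};

/// Candidate project/local config paths, lowest precedence first.
/// Later entries overlay earlier ones. Duplicates (cwd == project_root)
/// are dropped so a file is never loaded twice.
pub fn project_config_candidates(project_root: &Path, cwd: &Path) -> Vec<PathBuf> {
    let ordered = [
        project_root.join(".claw.json"),
        project_root.join(".claw").join("settings.json"),
        cwd.join(".claw.json"),
        cwd.join(".claw").join("settings.json"),
        cwd.join(".claw").join("settings.local.json"),
    ];
    let mut out: Vec<PathBuf> = Vec::new();
    for candidate in ordered {
        if !out.contains(&candidate) {
            out.push(candidate);
        }
    }
    out
}
```

For the repro above, running from /tmp/cdGG/nested/deep/dir would now include /tmp/cdGG/.claw.json as the first candidate, so the read-only permission mode is no longer lost.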
Joins truth-audit (doctor says 'ok, defaults active' when config
exists). Joins discovery-overreach as opposite-direction sibling:
#85: skills ancestor walk UNBOUNDED (over-discovery)
#88: CLAUDE.md ancestor walk enables injection
#110: config NO ancestor walk (under-discovery)
Natural bundle: #85 + #110 (ancestor policy unification), or
#85 + #88 + #110 (full three-way ancestor-walk audit).
Filed in response to Clawhip pinpoint nudge 1494865079567519834
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 91c79ba from /tmp/cdCC.
Unrecognized first-positional tokens fall through the
_other => Ok(CliAction::Prompt { ... }) arm at main.rs:707.
Per --help this is 'Shorthand non-interactive prompt mode' —
documented behavior — but it eats known-subcommand typos too:
claw doctorr → Prompt("doctorr") → LLM API call
claw skilsl → Prompt("skilsl") → LLM API call
claw statuss → Prompt("statuss") → LLM API call
claw deply → Prompt("deply") → LLM API call
With credentials set, each burns real tokens. Without creds,
returns 'missing Anthropic credentials' — indistinguishable
from a legitimate prompt failure. No 'did you mean' suggestion.
Infrastructure exists:
slash command typos:
claw --resume s /skilsl
→ 'Unknown slash command: /skilsl. Did you mean /skill, /skills'
flag typos:
claw --fake-flag
→ structured error 'unknown option: --fake-flag'
subcommand typos:
→ silently become LLM prompts
The did-you-mean helper exists for slash commands. Flag
validation exists. Only subcommand dispatch has the silent-
fallthrough.
Fix shape (~60 lines):
- suggest_similar_subcommand(token) using levenshtein ≤ 2
against the ~16-item known-subcommand list
- gate the Prompt fallthrough on a shape heuristic:
single-token + near-match → return structured error with
did-you-mean. Otherwise fall through unchanged.
- preserve shorthand-prompt mode for multi-word inputs,
quoted inputs, and non-near-match tokens
- regression tests per typo shape + legit prompt + quoted
workaround
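A minimal sketch of the near-match helper from the fix shape above. The subcommand list is a hypothetical subset of the ~16 real entries, and the function bodies are illustrative rather than the existing slash-command helper.

```rust
/// Levenshtein distance via the standard two-row dynamic programme.
pub fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            // substitution, deletion, insertion
            curr.push((prev[j] + cost).min(prev[j + 1] + 1).min(curr[j] + 1));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Hypothetical subset of the known-subcommand list.
const KNOWN_SUBCOMMANDS: &[&str] = &[
    "status", "doctor", "skills", "agents", "mcp", "init", "export", "sandbox",
];

/// Return the closest known subcommand within edit distance 2, if any.
pub fn suggest_similar_subcommand(token: &str) -> Option<&'static str> {
    KNOWN_SUBCOMMANDS
        .iter()
        .copied()
        .map(|cmd| (levenshtein(token, cmd), cmd))
        .filter(|(distance, _)| *distance <= 2)
        .min_by_key(|(distance, _)| *distance)
        .map(|(_, cmd)| cmd)
}
```

A single-token input with a suggestion would return a structured did-you-mean error. Tokens with no near match, such as 'deply', fall through to shorthand-prompt mode unchanged.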
Cross-claw orchestration hazard: claws constructing subcommand
names from config or other claws' output have a latent 'typo →
live LLM call' vector. Over CI matrix with 1% typo rate, that's
billed-token waste + structural signal loss (error handler
can't distinguish typo from legit prompt failure).
Joins silent-flag cluster (#96-#101, #104) on subcommand axis —
6th instance of 'malformed input silently produces unintended
behavior.' Joins parallel-entry-point asymmetry (#91, #101,
#104, #105) — slash vs subcommand disagree on typo handling.
Natural bundles: #96 + #98 + #108 (--help/dispatch surface
hygiene triangle), #91 + #101 + #104 + #105 + #108 (parallel-
entry-point 5-way).
Filed in response to Clawhip pinpoint nudge 1494849975530815590
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD a436f9e from /tmp/cdBB.
Complete hook invisibility across JSON diagnostic surfaces:
1. doctor: no check_hooks_health function exists. check_config_health
emits 'Config files loaded N/M, MCP servers N, Discovered file X'
— NO hook count, no hook event breakdown, no hook health.
.claw.json with 3 hooks (including /does/not/exist and
curl-pipe-sh remote-exec payload) → doctor: ok, has_failures: false.
2. /hooks list: in STUB_COMMANDS (main.rs:7272) → returns 'not yet
implemented in this build'. Parallel /mcp list / /agents list /
/skills list work fine. /hooks has no sibling.
3. /config hooks: reports loaded_files and merged_keys but NOT
hook bodies, NOT hook source files, NOT per-event breakdown.
4. Hook progress events route to eprintln! as prose:
CliHookProgressReporter (main.rs:6660-6695) emits
'[hook PreToolUse] tool_name: command' to stderr unconditionally.
NEVER into --output-format json. A claw piping stderr to
/dev/null (common in pipelines) loses all hook visibility.
5. parse_optional_hooks_config_object (config.rs:766) accepts any
non-empty string. No fs::metadata() check, no which() check,
no shell-syntax sanity check.
6. shell_command (hooks.rs:739-754) runs 'sh -lc <command>' with
full shell expansion: env vars, globs, pipes, remote
curl pipes.
Compounds with #106: downstream .claw/settings.local.json can
silently replace the entire upstream hook array via the
deep_merge_objects replace-semantic. A team-level audit hook in
~/.claw/settings.json is erasable and replaceable by an
attacker-controlled hook with zero visibility anywhere
machine-readable.
Fix shape (~220 lines, all additive):
- check_hooks_health doctor check (like #102's check_mcp_health)
- status JSON exposes {pre_tool_use, post_tool_use,
post_tool_use_failure} with source-file provenance
- implement /hooks list (remove from STUB_COMMANDS)
- route HookProgressEvent into JSON turn-summary as hook_events[]
- validate hook commands at config-load, classify execution_kind
- regression tests
Joins truth-audit (#80-#87, #89, #100, #102, #103, #105) — doctor
lies when hooks are broken or hostile. Joins unplumbed-subsystem
(#78, #96, #100, #102, #103) — HookProgressEvent exists,
JSON-invisible. Joins subsystem-doctor-coverage (#100, #102, #103)
as fourth opaque subsystem. Cross-cluster with permission-audit
(#94, #97, #101, #106) because hooks ARE a permission mechanism.
Natural bundle: #102 + #103 + #107 (subsystem-doctor-coverage
3-way becomes 4-way). Plus #106 + #107 (policy-erasure + policy-
visibility = complete hook-security story).
Filed in response to Clawhip pinpoint nudge 1494834879127486544
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 71e7729 from /tmp/cdAA.
deep_merge_objects at config.rs:1216-1230 recurses into nested
objects but REPLACES arrays. So:
~/.claw/settings.json: {"permissions":{"deny":["Bash(rm *)"]}}
.claw.json: {"permissions":{"deny":["Bash(sudo *)"]}}
Merged: {"permissions":{"deny":["Bash(sudo *)"]}}
User's Bash(rm *) deny rule SILENTLY LOST. No warning. doctor: ok.
Worst case:
~/.claw/settings.json: {deny: [...strict list...]}
.claw/settings.local.json: {deny: []}
Merged: {deny: []}
Every deny rule from every upstream layer silently removed by a
workspace-local file. Any team/org security policy distributed
via user-home config is trivially erasable.
Arrays affected:
permissions.allow/deny/ask
hooks.PreToolUse/PostToolUse/PostToolUseFailure
plugins.externalDirectories
MCP servers are merged BY-KEY (merge_mcp_servers at :709) so
distinct server names across layers coexist. Author chose
merge-by-key for MCP but not for policy arrays. Design is
internally inconsistent.
extend_unique + push_unique helpers EXIST at :1232-1244 that do
union-merge with dedup. They are not called on the config-merge
axis for any policy array.
Fix shape (~100 lines):
- union-merge permissions.allow/deny/ask via extend_unique
- union-merge hooks.* arrays
- union-merge plugins.externalDirectories
- explicit replace-semantic opt-in via 'deny!' sentinel or
'permissions.replace: [...]' form (opt-in, not default)
- doctor surfaces policy provenance per rule (also helps #94)
- emit warning when replace-sentinel is used
- regression tests for union + explicit replace + multi-layer
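Union-merge with dedup across layers can be sketched as below. The push_unique here is a stand-in written for this example and its signature is an assumption. It is not the helper at config.rs:1232-1244.

```rust
/// Append `value` unless it is already present (order-preserving dedup).
fn push_unique(target: &mut Vec<String>, value: &str) {
    if !target.iter().any(|existing| existing == value) {
        target.push(value.to_string());
    }
}

/// Merge one policy array (for example permissions.deny) across config
/// layers, lowest precedence first. No layer can erase an upstream rule.
pub fn union_merge_layers(layers: &[Vec<&str>]) -> Vec<String> {
    let mut merged = Vec::new();
    for layer in layers {
        for rule in layer {
            push_unique(&mut merged, rule);
        }
    }
    merged
}
```

Under this merge an empty deny array in .claw/settings.local.json contributes nothing, so the upstream rules survive unless the explicit replace sentinel is used.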
Joins permission-audit sweep as 4-way composition-axis finding
(#94, #97, #101, #106). Joins truth-audit (doctor says 'ok'
while every deny rule was silently deleted).
Natural bundle: #94 + #106 (rule validation + rule composition).
Plus #91 + #94 + #97 + #101 + #106 as 5-way policy-surface-audit.
Filed in response to Clawhip pinpoint nudge 1494827325085454407
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 6580903 from /tmp/cdZ.
.claw.json with {"model":"haiku"} produces:
claw status → model: 'claude-opus-4-6' (DEFAULT_MODEL, config ignored)
claw doctor → 'Resolved model haiku' (raw alias, label lies)
turn dispatch → claude-haiku-4-5-20251213 (actually-resolved canonical)
ANTHROPIC_MODEL=sonnet → status still says claude-opus-4-6
FOUR separate understandings of 'active model':
1. config file (alias as written)
2. doctor (alias mislabeled as 'Resolved')
3. status (hardcoded DEFAULT_MODEL ignoring config entirely)
4. turn dispatch (canonical, alias-resolved, what turns actually use)
Trace:
main.rs:59 DEFAULT_MODEL const = claude-opus-4-6
main.rs:400 parse_args starts model = DEFAULT_MODEL
main.rs:753 Status dispatch: model.to_string() — never calls
resolve_repl_model, never reads config or env
main.rs:1125 resolve_repl_model: source of truth for actual
model, consults ANTHROPIC_MODEL env + config + alias table.
Called from Prompt and Repl dispatch. NOT from Status.
main.rs:1701 check_config_health: 'Resolved model {model}'
where model is raw configured string, not resolved.
Label says Resolved, value is pre-resolution alias.
Orchestration hazard: a claw picks tool strategy based on
status.model assuming it reflects what turns will use. Status
lies: always reports DEFAULT_MODEL unless --model flag was
passed. Config and env var completely ignored by status.
Fix shape (~30 lines):
- call resolve_repl_model from print_status_snapshot
- add effective_model field to status JSON (or rename/enrich)
- fix doctor 'Resolved model' label (either rename to 'Configured'
or actually alias-resolve before emitting)
- honor ANTHROPIC_MODEL env in status
- regression tests per model source with cross-surface equality
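A sketch of a single resolution function that status, doctor and turn dispatch could share. The precedence flag > env > config > default is an assumption, since the finding only says resolve_repl_model consults env, config and the alias table. The alias table is hypothetical apart from the haiku mapping observed above, and the source field is illustrative.

```rust
const DEFAULT_MODEL: &str = "claude-opus-4-6";

#[derive(Debug, PartialEq)]
pub struct EffectiveModel {
    /// Canonical, alias-resolved model id.
    pub model: String,
    /// Which layer supplied it: "flag", "env", "config" or "default".
    pub source: &'static str,
}

/// Hypothetical alias table; only the haiku entry comes from the finding.
fn resolve_alias(name: &str) -> &str {
    match name {
        "haiku" => "claude-haiku-4-5-20251213",
        other => other,
    }
}

/// Resolve the model once, recording provenance.
/// Assumed precedence: --model flag, ANTHROPIC_MODEL, config, default.
pub fn effective_model(
    flag: Option<&str>,
    env: Option<&str>,
    config: Option<&str>,
) -> EffectiveModel {
    let (raw, source) = flag
        .map(|m| (m, "flag"))
        .or(env.map(|m| (m, "env")))
        .or(config.map(|m| (m, "config")))
        .unwrap_or((DEFAULT_MODEL, "default"));
    EffectiveModel { model: resolve_alias(raw).to_string(), source }
}
```

If every surface called one such function, a cross-surface equality regression test becomes a direct comparison of its return value.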
Joins truth-audit (#80-#84, #86, #87, #89, #100, #102, #103).
Joins two-paths-diverge (#91, #101, #104) — now 4-way with #105.
Joins doctor-surface-coverage triangle (#100 + #102 + #105).
Filed in response to Clawhip pinpoint nudge 1494819785676947543
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 6a16f08 from /tmp/cdX.
Two-part gap on agent subsystem:
1. File-format gate silently discards .md (YAML frontmatter):
commands/src/lib.rs:3180-3220 load_agents_from_roots filters
extension() != 'toml' and silently continues. No log, no warn.
.claw/agents/foo.md → agents list count: 0, doctor: ok.
Same file renamed to .toml → discovered instantly.
2. No content validation inside accepted .toml:
model='nonexistent/model-that-does-not-exist' → accepted.
tools=['DoesNotExist', 'AlsoFake'] → accepted.
reasoning_effort string → unvalidated.
No check against model registry, tool registry, or
reasoning-effort enum — all machinery exists elsewhere
(#97 validates tools for --allowedTools flag).
Compounded:
- agents help JSON lists sources but NOT accepted file formats.
Operators have zero documentation-surface way to diagnose
'why does my .md file not work?'
- Doctor check set has no agents check. 3 files present with
1 silently skipped → summary: 'ok'.
- Skills use .md (SKILL.md). MCP uses .json (.claw.json).
Agents uses .toml. Three subsystems, three formats, no
cross-subsystem consistency or documentation.
- Claude Code convention is .md with YAML frontmatter.
Migrating operators copy that and silently fail.
Fix shape (~100 lines):
- accept .md with YAML frontmatter via existing
parse_skill_frontmatter helper
- validate model/tools/reasoning_effort against existing
registries; emit status: 'invalid' + validation_errors
instead of silently accepting
- agents list summary.skipped: [{path, reason}]
- add agents doctor check (total/active/skipped/invalid)
- agents help: accepted_formats list
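The summary.skipped idea can be sketched as a partition step that records a reason for every file that is not loaded. Skipped and partition_agent_files are hypothetical names. This sketch still accepts only .toml and does not implement the .md frontmatter support.

```rust
use std::path::Path;

/// One skipped agent file with a machine-readable reason.
#[derive(Debug, PartialEq)]
pub struct Skipped {
    pub path: String,
    pub reason: String,
}

/// Split candidate agent files into accepted paths and skipped entries,
/// instead of silently continuing past unsupported extensions.
pub fn partition_agent_files(paths: &[&str]) -> (Vec<String>, Vec<Skipped>) {
    let mut accepted = Vec::new();
    let mut skipped = Vec::new();
    for p in paths {
        match Path::new(p).extension().and_then(|ext| ext.to_str()) {
            Some("toml") => accepted.push(p.to_string()),
            Some(other) => skipped.push(Skipped {
                path: p.to_string(),
                reason: format!("unsupported extension .{other}"),
            }),
            None => skipped.push(Skipped {
                path: p.to_string(),
                reason: "no file extension".to_string(),
            }),
        }
    }
    (accepted, skipped)
}
```

The skipped list is what agents list and a doctor check would surface, so a .md file shows up with a reason rather than as count: 0.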
Joins truth-audit (#80-#84, #86, #87, #89, #100, #102) on
silent-ok-while-ignoring axis. Joins silent-flag (#96-#101) at
subsystem scale. Joins unplumbed-subsystem (#78, #96, #100,
#102) as 5th unreachable surface: load_agents_from_roots
present, parse_skill_frontmatter present, validation helpers
present, agents path calls none of them.
Also opens new 'Claude Code migration parity' cross-cluster:
claw-code silently breaks the expected convention migration
path for a first-class subsystem.
Natural bundles: #102 + #103 (subsystem-doctor-coverage),
#78 + #96 + #100 + #102 + #103 (unplumbed-surface quintet).
Filed in response to Clawhip pinpoint nudge 1494804679962661187
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD eabd257 from /tmp/cdW2.
A .claw.json pointing at command='/does/not/exist' as an MCP server
cheerfully reports:
mcp show unreachable → found: true
mcp list → configured_servers: 1, status field absent
doctor → config: ok, MCP servers: 1, has_failures: false
The broken server is invisible until agent tries to call a tool
from it mid-turn — burning tokens on failed tool call and forcing
retry loop.
Trace:
main.rs:1701-1780 check_config_health counts via
runtime_config.mcp().servers().len()
No which(). No TcpStream::connect(). No filesystem touch.
render_doctor_report has 6 checks (auth/config/install_source/
workspace/sandbox/system). No check_mcp_health exists.
commands/src/lib.rs mcp list/show emit config-side repr only.
No status field, no reachable field, no startup_state.
runtime/mcp_stdio.rs HAS startup machinery with error types,
but only invoked at turn-execution time — too late for
preflight.
Roadmap prescribes this exact surface:
- Phase 1 §3.5 Boot preflight / doctor contract explicitly lists
'MCP config presence and server reachability expectations'
- Phase 2 §4 canonical lane event schema includes lane.ready
- Phase 4.4.4 event provenance / environment labeling
- Product Principle #5 'Partial success is first-class' —
'MCP startup can succeed for some servers and fail for
others, with structured degraded-mode reporting'
All four unimplementable without preflight + per-server status.
Fix shape (~110 lines):
- check_mcp_health: which(command) for stdio, 1s TcpStream
connect for http/sse. Aggregate ok/warn/fail with per-server
detail lines.
- mcp list/show: add status field
(configured/resolved/command_not_found/connect_refused/
startup_failed). --probe flag for deeper handshake.
- doctor top-level: degraded_mode: bool, startup_summary.
- Wire preflight into prompt/repl bootstrap; emit one-time
mcp_preflight event.
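The stdio half of check_mcp_health could start from a small which-style probe. McpStatus and probe_stdio_command are hypothetical names. The sketch checks only that a regular file exists, not the executable bit, and leaves out the TcpStream probe for http/sse servers.

```rust
use std::path::{Path, PathBuf};

/// Preflight result for one stdio MCP server (subset of the proposed states).
#[derive(Debug, PartialEq)]
pub enum McpStatus {
    Resolved(PathBuf),
    CommandNotFound,
}

/// Minimal `which`: a command containing a path separator is checked
/// directly; a bare name is searched in the given PATH directories.
pub fn probe_stdio_command(command: &str, path_dirs: &[PathBuf]) -> McpStatus {
    let direct = Path::new(command);
    if direct.components().count() > 1 {
        return if direct.is_file() {
            McpStatus::Resolved(direct.to_path_buf())
        } else {
            McpStatus::CommandNotFound
        };
    }
    path_dirs
        .iter()
        .map(|dir| dir.join(command))
        .find(|candidate| candidate.is_file())
        .map(McpStatus::Resolved)
        .unwrap_or(McpStatus::CommandNotFound)
}
```

A caller would typically build path_dirs with std::env::split_paths over the PATH variable. For the repro above, '/does/not/exist' yields CommandNotFound at preflight instead of failing mid-turn.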
Joins unplumbed-subsystem cross-cluster (#78, #100, #102) —
subsystem exists, diagnostic surface JSON-invisible. Joins
truth-audit (#80-#84, #86, #87, #89, #100) — doctor: ok lies
when MCP broken.
Natural bundle: #78 + #96 + #100 + #102 unplumbed-surface
quartet. Also #100 + #102 as pure doctor-surface-coverage 2-way.
Filed in response to Clawhip pinpoint nudge 1494797126041862285
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 63a0d30 from /tmp/cdU + /tmp/cdO*.
Three-fold gap:
1. status/doctor JSON workspace object has 13 fields; none of them
contain: head_sha, head_short_sha, expected_base, base_source,
stale_base_state, upstream, ahead, behind, merge_base, is_detached,
is_bare, is_worktree. A claw cannot answer 'is this lane at the
expected base?' from the JSON surface alone.
2. --base-commit flag is silently accepted by status/doctor/sandbox/
init/export/mcp/skills/agents and silently dropped on dispatch.
Same silent-no-op class as #98. A claw running
'claw --base-commit $expected status' gets zero effect: the flag
parses into a local that is discarded at dispatch.
3. runtime::stale_base subsystem is FULLY implemented with 30+ tests
(BaseCommitState, BaseCommitSource, resolve_expected_base,
read_claw_base_file, check_base_commit, format_stale_base_warning).
run_stale_base_preflight at main.rs:3058 calls it from Prompt/Repl
only, writes output to stderr as human prose. .claw-base file is
honored internally but invisible to status/doctor JSON. Complete
implementation, wrong dispatch points.
Plus: detached HEAD reported as magic string 'git_branch: "detached HEAD"'
without accompanying SHA. Bare repo/worktree/submodule indistinguishable
from regular repo in JSON. parse_git_status_branch has latent dot-split
truncation bug on branch names like 'feat.ui' with upstream.
Hits roadmap Product Principle #4 (Branch freshness before blame) and
Phase 2 §4.2 (branch.stale_against_main event) directly — both
unimplementable without commit identity in the JSON surface.
Fix shape (~80 lines plumbing):
- add head_sha/head_short_sha/is_detached/head_ref/is_bare/is_worktree
- add base_commit: {source, expected, state}
- add upstream: {ref, ahead, behind, merge_base}
- wire --base-commit into CliAction::Status + CliAction::Doctor
- add stale_base doctor check
- fix parse_git_status_branch dot-split at :2541
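One way the dot-split fix could look, as a sketch rather than the
actual parse_git_status_branch: BranchHeader and parse_branch_header
are hypothetical names. It assumes the input is the '## ' header line
of 'git status --short --branch' and splits on the literal '...'
separator, so a branch like 'feat.ui' survives intact.

```rust
/// Hypothetical parsed form of the "## branch...upstream [ahead N, behind M]" line.
#[derive(Debug, PartialEq, Eq)]
pub struct BranchHeader {
    pub branch: String,
    pub upstream: Option<String>,
    pub ahead: u32,
    pub behind: u32,
}

pub fn parse_branch_header(line: &str) -> Option<BranchHeader> {
    let rest = line.strip_prefix("## ")?;
    // Split off the trailing "[ahead N, behind M]" block, if any.
    let (refs, counts) = match rest.rsplit_once(" [") {
        Some((refs, tail)) => (refs, tail.strip_suffix(']')),
        None => (rest, None),
    };
    // Split on the three-dot separator, never on a single '.'.
    let (branch, upstream) = match refs.split_once("...") {
        Some((branch, upstream)) => (branch, Some(upstream.to_string())),
        None => (refs, None),
    };
    let mut ahead = 0;
    let mut behind = 0;
    for part in counts.unwrap_or("").split(", ") {
        if let Some(n) = part.strip_prefix("ahead ") {
            ahead = n.parse().ok()?;
        } else if let Some(n) = part.strip_prefix("behind ") {
            behind = n.parse().ok()?;
        }
        // Anything else (for example "gone") is ignored in this sketch.
    }
    Some(BranchHeader {
        branch: branch.to_string(),
        upstream,
        ahead,
        behind,
    })
}
```

Detached HEAD and unborn-branch headers are not special-cased here;
they would still need the head_sha / is_detached fields listed above.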
Cross-cluster: truth-audit/diagnostic-integrity (#80-#87, #89) +
silent-flag (#96-#99) + unplumbed-subsystem (#78). Natural bundles:
#89+#100 (git-state completeness) and #78+#96+#100 (unplumbed surface).
Milestone: ROADMAP #100.
Filed in response to Clawhip pinpoint nudge 1494782026660712672
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 0e263be from /tmp/cdN.
parse_system_prompt_args at main.rs:1162-1190 does:
cwd = PathBuf::from(value);
date.clone_from(value);
Zero validation. Both values flow through to
SystemPromptBuilder::render_env_context (prompt.rs:175-186) and
render_project_context (prompt.rs:289-293) where they are formatted
into the system prompt output verbatim via format!().
Two injection points per value:
- # Environment context
- 'Working directory: {cwd}'
- 'Date: {date}'
- # Project context
- 'Working directory: {cwd}'
- 'Today's date is {date}.'
Demonstrated attacks:
--date 'not-a-date' → accepted
--date '9999-99-99' → accepted
--date '1900-01-01' → accepted
--date "2025-01-01'; DROP TABLE users;--" → accepted verbatim
--date $'2025-01-01\nMALICIOUS: ignore all previous rules'
→ newline breaks out of bullet into standalone system-prompt
instruction line that the LLM will read as separate guidance
--cwd '/does/not/exist' → silently accepted, rendered verbatim
--cwd '' → empty 'Working directory: ' line
--cwd $'/tmp\nMALICIOUS: pwn' → newline injection same pattern
--help documents format as '[--cwd PATH] [--date YYYY-MM-DD]'.
Parser enforces neither. Same class as #96 / #98 — documented
constraint, unenforced at parse boundary.
Severity note: most severe of the #96/#97/#98/#99 silent-flag
class because the failure mode is prompt injection, not a silent
feature no-op. A claw or CI pipeline piping tainted
$REPO_PATH / $USER_INPUT into claw system-prompt is a
vector for LLM manipulation.
Fix shape:
1. parse --date as chrono::NaiveDate::parse_from_str(value, '%Y-%m-%d')
2. validate --cwd via std::fs::canonicalize(value)
3. defense-in-depth: debug_assert no-newlines at render boundary
4. regression tests for each rejected case
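A sketch of the parse-boundary checks, assuming hypothetical helpers
validate_date and validate_cwd. The date check is a std-only stand-in
for the chrono parse named in item 1, used here only so the example
has no dependencies; the real fix would presumably keep chrono.

```rust
use std::path::PathBuf;

/// Strict YYYY-MM-DD check (std-only stand-in for a chrono parse).
pub fn validate_date(value: &str) -> Result<(u32, u32, u32), String> {
    let bytes = value.as_bytes();
    let shape_ok = bytes.len() == 10
        && bytes.iter().enumerate().all(|(i, b)| match i {
            4 | 7 => *b == b'-',
            _ => b.is_ascii_digit(),
        });
    if !shape_ok {
        return Err(format!("invalid --date {value:?}: expected YYYY-MM-DD"));
    }
    // Safe to slice by byte index: the shape check guarantees ASCII.
    let year: u32 = value[0..4].parse().map_err(|_| "bad year".to_string())?;
    let month: u32 = value[5..7].parse().map_err(|_| "bad month".to_string())?;
    let day: u32 = value[8..10].parse().map_err(|_| "bad day".to_string())?;
    let leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    let days_in_month = match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if leap => 29,
        2 => 28,
        _ => return Err(format!("invalid --date {value:?}: month out of range")),
    };
    if day == 0 || day > days_in_month {
        return Err(format!("invalid --date {value:?}: day out of range"));
    }
    Ok((year, month, day))
}

/// Reject empty or multi-line values, then require the path to exist.
pub fn validate_cwd(value: &str) -> Result<PathBuf, String> {
    if value.is_empty() || value.contains(&['\n', '\r'][..]) {
        return Err(format!("invalid --cwd {value:?}"));
    }
    std::fs::canonicalize(value).map_err(|e| format!("invalid --cwd {value:?}: {e}"))
}
```

A strict date parse rejects the newline payload as a side effect,
since nothing longer than ten characters passes the shape check. A
value like 1900-01-01 is still a valid calendar date and would need
a separate range policy if it is meant to be refused.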
Cross-cluster: sibling of #83 (system-prompt date = build date)
and #84 (dump-manifests bakes abs path) — all three are about
the system-prompt / manifest surface trusting compile-time or
operator-supplied values that should be validated.
Filed in response to Clawhip pinpoint nudge 1494774477009981502
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 7a172a2 from /tmp/cdM.
--help at main.rs:8251 documents --compact as 'text mode only;
useful for piping.' The implementation knows the constraint but
never enforces it at the parse boundary — the flag is silently
dropped in every non-{Prompt+Text} dispatch path:
1. --output-format json prompt: run_turn_with_output (:3807-3817)
has no CliOutputFormat::Json if compact arm; JSON branch
ignores compact entirely
2. status/sandbox/doctor/init/export/mcp/skills/agents: those
CliAction variants have no compact field at all; parse_args
parses --compact into a local bool and then discards it
because dispatch has nowhere to put it
3. claw --compact with piped stdin: the stdin fallthrough at
main.rs:614 hardcodes compact: false regardless of the
user-supplied --compact — actively overriding operator intent
No error, no warning, no diagnostic. A claw using
claw --compact --output-format json '...' to get pipe-friendly
output gets full verbose JSON silently.
Fix shape:
- reject --compact + --output-format json at parse time (~5 lines)
- reject --compact on non-Prompt subcommands with a named error
(~15 lines)
- honor --compact in stdin-piped Prompt fallthrough: change
compact: false to compact at :614 (1 line)
- optionally add CliOutputFormat::Json if compact arm if
compact-JSON is desirable
Joins silent-flag no-op class with #96 (Resume-safe leak) and
#97 (silent-empty allow-set). Natural bundle #96+#97+#98 covers
the --help/flag-validation hygiene triangle.
Filed in response to Clawhip pinpoint nudge 1494766926826700921
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 3ab920a from /tmp/cdL.
Silent vs loud asymmetry for equivalent mis-input at the
tool-allow-list knob:
- `--allowedTools "nonsense"` → loud structured error naming
every valid tool (works as intended)
- `--allowedTools ""` (shell-expansion failure, $TOOLS expanded
empty) → silent Ok(Some(BTreeSet::new())) → all tools blocked
- `--allowedTools ",,"` → same silent empty set
- `.claw.json` with `allowedTools` → fails config load with
'unknown key allowedTools' — config-file surface locked out,
CLI flag is the only knob, and the CLI flag has the footgun
Trace: tools/src/lib.rs:192-248 normalize_allowed_tools. Input
values=[""] is NOT empty (len=1) so the early None guard at
main.rs:1048 skips. Inner split/filter on empty-only tokens
produces zero elements; the error-producing branch never runs.
Returns Ok(Some(empty)), which downstream filter treats as
'allow zero tools' instead of 'allow all tools.'
No observable recovery: status JSON exposes kind/model/
permission_mode/sandbox/usage/workspace but no allowed_tools
field. doctor check set has no tool_restrictions category. A
lane that silently restricted itself to zero tools gets no
signal until an actual tool call fails at runtime.
Fix shape: reject empty-token input at parse time with a clear
error. Add explicit --allowedTools none opt-in if zero-tool
lanes are desirable. Surface active allow-set in status JSON
and as a doctor check. Consider supporting allowedTools in
.claw.json or improving its rejection message.
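A sketch of the stricter normalization, not the real
normalize_allowed_tools: the signature, the 'known' registry
parameter, and the comma-only tokenization are assumptions made for
illustration. The point is that an empty token becomes an error and
the zero-tool lane needs the explicit 'none' opt-in.

```rust
use std::collections::BTreeSet;

/// Ok(None)            -> flag absent, no restriction
/// Ok(Some(empty set)) -> explicit "none" opt-in, zero tools allowed
/// Err(..)             -> empty token or unknown tool name
pub fn normalize_allowed_tools(
    values: &[String],
    known: &[&str],
) -> Result<Option<BTreeSet<String>>, String> {
    if values.is_empty() {
        return Ok(None);
    }
    if values.len() == 1 && values[0].trim() == "none" {
        return Ok(Some(BTreeSet::new()));
    }
    let mut allowed = BTreeSet::new();
    for value in values {
        for token in value.split(',').map(str::trim) {
            if token.is_empty() {
                // Covers "" and ",," (for example an empty $TOOLS expansion).
                return Err(format!(
                    "--allowedTools: empty tool name in {value:?}; use 'none' to allow zero tools"
                ));
            }
            if !known.contains(&token) {
                return Err(format!("--allowedTools: unknown tool {token:?}"));
            }
            allowed.insert(token.to_string());
        }
    }
    Ok(Some(allowed))
}
```

With this shape the downstream filter never sees an empty set unless
the operator asked for one by name.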
Joins permission-audit sweep (#50/#87/#91/#94) on the
tool-allow-list axis. Sibling of #86 on the truth-audit side:
both are 'misconfigured claws have no observable signal.'
Filed in response to Clawhip pinpoint nudge 1494759381068419115
in #clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 8db8e49 from /tmp/cdK. Partial
regression of ROADMAP #39 / #54 at the help-output layer.
'claw --help' emits two separate slash-command enumerations:
(1) Interactive slash commands block -- correctly filtered via
render_slash_command_help_filtered(STUB_COMMANDS) at main.rs:8268
(2) Resume-safe commands one-liner -- UNFILTERED, emits every entry
from resume_supported_slash_commands() at main.rs:8270-8278
Programmatic cross-check: intersect the Resume-safe listing with
STUB_COMMANDS (60+ entries at main.rs:7240-7320) returns 62
overlaps: budget, rate-limit, metrics, diagnostics, workspace,
reasoning, changelog, bookmarks, allowed-tools, tool-details,
language, max-tokens, temperature, system-prompt, output-style,
privacy-settings, keybindings, thinkback, insights, stickers,
advisor, brief, summary, vim, and more. All advertised as
resume-safe; all produce 'Did you mean /X' stub-guard errors when
actually invoked in resume mode.
Fix shape: one-line filter at main.rs:8270 adding
.filter(|spec| !STUB_COMMANDS.contains(&spec.name)) or extract
shared helper resume_supported_slash_commands_filtered. Add
regression test parallel to stub_commands_absent_from_repl_
completions that parses the Resume-safe line and asserts no entry
matches STUB_COMMANDS.
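The shared-helper variant could be sketched as below. SlashCommandSpec
here is a simplified stand-in with only the two fields the filter
needs, and resume_safe_names is a hypothetical name; the real spec
type and STUB_COMMANDS list live in main.rs.

```rust
/// Simplified stand-in for the real slash-command spec type.
pub struct SlashCommandSpec {
    pub name: &'static str,
    pub resume_supported: bool,
}

/// Names to advertise as resume-safe: resume-supported and not a stub.
pub fn resume_safe_names(specs: &[SlashCommandSpec], stubs: &[&str]) -> Vec<&'static str> {
    specs
        .iter()
        .filter(|spec| spec.resume_supported)
        .filter(|spec| !stubs.contains(&spec.name))
        .map(|spec| spec.name)
        .collect()
}
```

The regression test then reduces to asserting that the returned list
and the stub list are disjoint.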
Filed in response to Clawhip pinpoint nudge 1494751832399024178 in
#clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD b7539e6 from /tmp/cdJ. Three
stacked gaps on the skill-install surface:
(1) User-scope only install. default_skill_install_root at
commands/src/lib.rs returns CLAW_CONFIG_HOME/skills ->
CODEX_HOME/skills -> HOME/.claw/skills -- all user-level. No
project-scope code path. Installing from workspace A writes to
~/.claw/skills/X and makes X active:true in every other
workspace with source.id=user_claw.
(2) No uninstall. claw --help enumerates /skills
[list|install|help|<skill>] -- no uninstall. 'claw skills
uninstall X' falls through to prompt-dispatch. REPL /skill is
identical. Removing a bad skill requires manual rm -rf on the
installed path parsed out of install receipt output.
(3) No scope signal. Install receipt shows 'Registry
/Users/yeongyu/.claw/skills' but the operator is never asked
project vs user, and JSON receipt does not distinguish install
scope.
Doubly compounds with #85 (skill discovery ancestor walk): an
attacker who can write under an ancestor OR can trick the operator
into one bad 'skills install' lands a skill in the user-level
registry that's active in every future claw invocation.
This runs contrary to the three-tier scope that settings already
use (User / Project / Local via ConfigSource). Skills collapse all
three onto User at install time.
Fix shape (~60 lines): --scope user|project|local flag on skills
install (no default in --output-format json mode, prompt
interactively); claw skills uninstall + /skills uninstall
slash-command; installed_path per skill record in --output-format
json skills output.
Filed in response to Clawhip pinpoint nudge 1494744278423961742 in
#clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 7f76e6b from /tmp/cdI. Three
stacked failures on the permission-rule surface:
(1) Typo tolerance. parse_optional_permission_rules at
runtime/src/config.rs:780-798 is just optional_string_array with
no per-entry validation. Typo rules like 'Reed', 'Bsh(echo:*)',
'WebFech' load silently; doctor reports config: ok.
(2) Case-sensitive match against lowercase runtime names.
PermissionRule::matches does self.tool_name != tool_name strict
compare. Runtime registers tools lowercase (bash).
Claude Code convention / MCP docs use capitalized (Bash). So
'deny: ["Bash(rm:*)"]' never fires because tool_name='bash' !=
rule.tool_name='Bash'. Cross-harness config portability fails
open, not closed.
(3) Loaded rules invisible. status JSON has no permission_rules
field. doctor has no rules check. A clawhip preflight asking
'does this lane actually deny Bash(rm:*)?' has no
machine-readable answer; has to re-parse .claw.json and
re-implement parse semantics.
Contrast: --allowedTools CLI flag HAS tool-name validation with a
50+ tool registry. The same registry is not consulted when parsing
permissions.allow/deny/ask. Asymmetric validation, same shape as
#91 (config accepts more permission-mode labels than CLI).
Fix shape (~30-45 lines): validate rule tool names against the
same registry --allowedTools uses; case-fold tool_name compare in
PermissionRule::matches; expose loaded rules in status/doctor JSON
with unknown_tool flag.
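A sketch of the case-fold compare and the registry check. The
PermissionRule shape below is a reduced stand-in, and rule_tool_name
and unknown_tool are hypothetical helpers; the rule grammar is
assumed to be 'Tool(pattern)' as in the examples above.

```rust
/// Reduced stand-in for the runtime rule type.
pub struct PermissionRule {
    pub tool_name: String,
}

impl PermissionRule {
    /// Case-insensitive compare so "Bash" in config matches runtime "bash".
    pub fn matches_tool(&self, tool_name: &str) -> bool {
        self.tool_name.eq_ignore_ascii_case(tool_name)
    }
}

/// Extract the tool part of a raw rule such as "Bash(rm:*)".
pub fn rule_tool_name(raw: &str) -> &str {
    raw.split('(').next().unwrap_or(raw).trim()
}

/// True when the rule names a tool the registry does not know.
pub fn unknown_tool(rule_tool: &str, registry: &[&str]) -> bool {
    !registry
        .iter()
        .any(|known| known.eq_ignore_ascii_case(rule_tool))
}
```

unknown_tool is what a doctor check or the status JSON flag would
report for typos like 'Bsh(echo:*)'.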
Filed in response to Clawhip pinpoint nudge 1494736729582862446 in
#clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD bab66bb from /tmp/cdH.
SessionStore::resolve_reference at runtime/src/session_control.rs:
86-116 branches on a textual heuristic -- looks_like_path =
direct.extension().is_some() || direct.components().count() > 1.
Similar-looking references trigger two different code paths.
Repros:
- 'claw --resume session-123' -> managed store lookup (no extension,
no slash) -> 'session not found: session-123'
- 'claw --resume session-123.jsonl' -> workspace-relative file path
(extension triggers path branch) -> opens /cwd/session-123.jsonl,
succeeds if present
- 'claw --resume /etc/passwd' -> absolute path opened verbatim,
fails only because JSONL parse errors ('invalid JSONL record at
line 1: unexpected character: #')
- 'claw --resume /etc/hosts' -> same; file is read, structural
details (first char, line number) leak in error
- symlink inside .claw/sessions/<fp>/passwd-symlink.jsonl pointing
at /etc/passwd -> claw --resume passwd-symlink follows it
Clawability impact: operators copying session ids from /session
list naturally try adding .jsonl and silently hit the wrong branch.
Orchestrators round-tripping session ids through --resume cannot
do any path normalization without flipping lookup modes. No
workspace scoping, so any readable file on disk is a valid target.
Symlinks inside managed path escape the workspace silently.
Fix shape (~15 lines minimum): canonicalize the resolved candidate
and assert prefix match with workspace_root before opening; return
OutsideWorkspace typed error otherwise. Optional cleanup: split
--resume <id> and --resume-file <path> into explicit shapes.
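A sketch of the containment check, assuming a hypothetical
resolve_within helper and a ResolveError type with the
OutsideWorkspace variant named above. Canonicalizing before the
prefix check is what closes the symlink case, because the comparison
runs on the resolved target rather than the path as written.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

#[derive(Debug)]
pub enum ResolveError {
    Io(io::Error),
    OutsideWorkspace { candidate: PathBuf },
}

/// Resolve `candidate` (absolute or workspace-relative) and refuse anything
/// whose canonical path is not under the canonical workspace root.
pub fn resolve_within(workspace_root: &Path, candidate: &Path) -> Result<PathBuf, ResolveError> {
    let root = fs::canonicalize(workspace_root).map_err(ResolveError::Io)?;
    let joined = if candidate.is_absolute() {
        candidate.to_path_buf()
    } else {
        root.join(candidate)
    };
    // canonicalize follows symlinks and fails if the file does not exist.
    let resolved = fs::canonicalize(&joined).map_err(ResolveError::Io)?;
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(ResolveError::OutsideWorkspace { candidate: resolved })
    }
}
```

Path::starts_with compares whole components, so /ws-other is not
treated as being inside /ws.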
Filed in response to Clawhip pinpoint nudge 1494729188895359097 in
#clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD d0de86e from /tmp/cdE. MCP
command, args, url, headers, headersHelper config fields are
loaded and passed to execve/URL-parse verbatim. No ${VAR}
interpolation, no ~/ home expansion, no preflight check, no doctor
warning.
Repros:
- {'command':'~/bin/my-server','args':['~/config/file.json']} ->
execve('~/bin/my-server', ['~/config/file.json']) -> ENOENT at
MCP connect time.
- {'command':'${HOME}/bin/my-server','args':['--tenant=${TENANT_ID}']}
-> literal ${HOME}/bin/my-server handed to execve; literal
${TENANT_ID} passed to the server as tenant argument.
- {'headers':{'Authorization':'Bearer ${API_TOKEN}'}} -> literal
string 'Bearer ${API_TOKEN}' sent as HTTP header.
Trace: parse_mcp_server_config in runtime/src/config.rs stores
strings raw; McpStdioProcess::spawn at mcp_stdio.rs:1150-1170 is
Command::new(&transport.command).args(&transport.args).spawn().
grep interpolate/expand_env/substitute/${ across runtime/src/
returns empty outside format-string literals.
Clawability impact: every public MCP server README uses ${VAR}/~/
in examples; copy-pasted configs load with doctor:ok and fail
opaquely at spawn with generic ENOENT that has lost the context
about why. Operators forced to hardcode secrets in .claw.json
(triggering #90) or wrap commands in shell scripts -- both worse
security postures than the ecosystem norm. Cross-harness round-trip
from Claude Code /.mcp.json breaks when interpolation is present.
Fix shape (~50 lines): config-load-time interpolation of ${VAR}
and leading ~/ in command/args/url/headers/headers_helper; missing-
variable warnings captured into ConfigLoader all_warnings; optional
{'config':{'expand_env':false}} toggle; mcp_config_interpolation
doctor check that flags literal ${ / ~/ remaining after substitution.
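A sketch of load-time expansion for a single config string.
expand_config_value is a hypothetical name, and the environment is
passed in as a lookup closure (an assumption, chosen so the function
can be tested without touching the process environment). Unset
variables are left literal and recorded as warnings, matching the
'missing-variable warnings' item above.

```rust
/// Expand a leading "~/" and every "${VAR}" in `raw`.
/// Unset variables stay literal and produce a warning.
pub fn expand_config_value(
    raw: &str,
    home: Option<&str>,
    lookup: &dyn Fn(&str) -> Option<String>,
    warnings: &mut Vec<String>,
) -> String {
    let mut out = String::with_capacity(raw.len());
    let mut rest = raw;
    if let (Some(tail), Some(home)) = (raw.strip_prefix("~/"), home) {
        out.push_str(home.trim_end_matches('/'));
        out.push('/');
        rest = tail;
    }
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = &after[..end];
                match lookup(name) {
                    Some(value) => out.push_str(&value),
                    None => {
                        warnings.push(format!("unset variable ${{{name}}}"));
                        // Keep the literal "${NAME}" so a doctor check can flag it.
                        out.push_str(&rest[start..start + 2 + end + 1]);
                    }
                }
                rest = &after[end + 1..];
            }
            None => {
                // Unterminated "${": keep the remainder as-is.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

In the real loader the lookup would presumably wrap std::env::var and
the warnings would feed ConfigLoader all_warnings.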
Filed in response to Clawhip pinpoint nudge 1494721628917989417 in
#clawcode-building-in-public.
Dogfooded 2026-04-18 on main HEAD 478ba55 from /tmp/cdC. Two
permission-mode parsers disagree on valid labels:
- Config parse_permission_mode_label (runtime/src/config.rs:851-862)
accepts 8 labels and collapses 5 aliases onto 3 canonical modes.
- CLI normalize_permission_mode (rusty-claude-cli/src/main.rs:5455-
5461) accepts only the 3 canonical labels.
Same binary, same intent, opposite verdicts:
.claw.json {"defaultMode":"plan"} -> silent ReadOnly + doctor ok
--permission-mode plan -> rejected with 'unsupported permission mode'
Semantic collapses of note:
- 'default' -> ReadOnly (name says nothing about what default means)
- 'plan' -> ReadOnly (upstream plan-mode semantics don't exist in
claw; ExitPlanMode tool exists but has no matching PermissionMode
variant)
- 'acceptEdits'/'auto' -> WorkspaceWrite (ambiguous names)
- 'dontAsk' -> DangerFullAccess (FOOTGUN: sounds like 'quiet mode',
actually the most permissive; community copy-paste bypasses every
danger-keyword audit)
Status JSON exposes canonicalized permission_mode only; original
label lost. Claw reading status cannot distinguish 'plan' from
explicit 'read-only', or 'dontAsk' from explicit 'danger-full-access'.
Fix shape (~20-30 lines): align the two parsers to accept/reject
identical labels; add permission_mode_raw to status JSON (paired
with permission_mode_source from #87); either remove the 'dontAsk'
alias or trigger a doctor warn when raw='dontAsk'; optionally
introduce a real PermissionMode::Plan runtime variant.
Filed in response to Clawhip pinpoint nudge 1494714078965403848 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD 64b29f1 from /tmp/cdB. The MCP
details surface correctly redacts env -> env_keys and headers ->
header_keys (deliberate precedent for 'show config without secrets'),
but dumps args, url, and headersHelper verbatim even though all
three standardly carry inline credentials.
Repros:
(1) args leak: {'args':['--api-key','sk-secret-ABC123','--token=...',
'--url=https://user:password@host/db']} appears unredacted in
both details.args and the summary string.
(2) URL leak: 'url':'https://user:SECRET@api.example.com/mcp' and
matching summary.
(3) headersHelper leak: helper command path + its secret-bearing
argv emitted whole.
Trace: mcp_server_details_json at commands/src/lib.rs:3972-3999 is
the single redaction point. env/headers get key-only projection;
args/url/headers_helper are carved out with no comment explaining
why. Text surface at :3873-3920 mirrors the same leak.
Clawability shape: mcp list --output-format json is exactly the
surface orchestrators scrape for preflight and that logs / Discord
announcements / claw export / CI artifacts will carry. Asymmetric
redaction sends the wrong signal -- consumers assume secret-aware,
the leak is unexpected and easy to miss. Standard MCP wiring
patterns (--api-key, postgres://user:pass@, token helper scripts)
all hit the leak.
Fix shape (~40-60 lines): redact args with secret heuristic
(--api-key, --token, --password, high-entropy tails, user:pass@);
redact URL basic-auth + query-string secrets; split headersHelper
argv and apply args heuristic; add optional --show-sensitive
opt-in; add mcp_secret_posture doctor check. No MCP runtime
behavior changes -- only reporting surface.
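A sketch of the args and URL part of that heuristic. SECRET_FLAGS,
redact_userinfo, redact_args, and the '<redacted>' placeholder are
all hypothetical; the flag list covers only the three flags named
above, and high-entropy detection and query-string secrets are left
out.

```rust
/// Flags whose value is treated as secret (illustrative list only).
const SECRET_FLAGS: [&str; 3] = ["--api-key", "--token", "--password"];

/// Replace "user:pass@" userinfo in a URL-shaped string with a placeholder.
pub fn redact_userinfo(value: &str) -> String {
    if let Some(scheme_end) = value.find("://") {
        let authority_start = scheme_end + 3;
        let authority_end = value[authority_start..]
            .find('/')
            .map_or(value.len(), |i| authority_start + i);
        if let Some(at) = value[authority_start..authority_end].rfind('@') {
            return format!(
                "{}<redacted>@{}",
                &value[..authority_start],
                &value[authority_start + at + 1..]
            );
        }
    }
    value.to_string()
}

/// Redact "--flag value" and "--flag=value" forms plus URL userinfo.
pub fn redact_args(args: &[String]) -> Vec<String> {
    let mut out = Vec::with_capacity(args.len());
    let mut redact_next = false;
    for arg in args {
        if redact_next {
            out.push("<redacted>".to_string());
            redact_next = false;
        } else if SECRET_FLAGS.contains(&arg.as_str()) {
            out.push(arg.clone());
            redact_next = true;
        } else if let Some((flag, _)) = arg
            .split_once('=')
            .filter(|(flag, _)| SECRET_FLAGS.contains(flag))
        {
            out.push(format!("{flag}=<redacted>"));
        } else {
            out.push(redact_userinfo(arg));
        }
    }
    out
}
```

The same redact_args pass could be applied to the split headersHelper
argv, and redact_userinfo to the url field.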
Filed in response to Clawhip pinpoint nudge 1494706529918517390 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD 9882f07. A rebase halted on
conflict leaves .git/rebase-merge/ on disk + HEAD detached on the
rebase intermediate commit. 'claw --output-format json status'
reports git_state='dirty ... 1 conflicted', git_branch='detached
HEAD', no rebase flag. 'claw --output-format json doctor' reports
workspace: {status:ok, summary:'project root detected on branch
detached HEAD'}.
Trace: parse_git_workspace_summary at rusty-claude-cli/src/main.rs:
2550-2587 scans git status --short output only; no .git/rebase-
merge, .git/rebase-apply, .git/MERGE_HEAD, .git/CHERRY_PICK_HEAD,
.git/BISECT_LOG check anywhere in rust/crates/. check_workspace_
health emits Ok so long as a project root was detected.
Clawability impact: preflight blindness (doctor ok on paused lane),
stale-branch detection breaks (freshness vs base is meaningless
when HEAD is a rebase intermediate), no recovery surface (no
abort/resume hints), same 'surface lies about runtime truth' family
as #80-#87.
Fix shape (~20 lines): detect marker files, expose typed
workspace.git_operation field (kind/paused/abort_hint/resume_hint),
flip workspace doctor verdict to warn when git_operation != null.
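A sketch of the marker-file detection, assuming a hypothetical
GitOperation enum and detect_git_operation helper that take the
resolved .git directory. The marker names are the ones listed in the
trace above; the abort hints are the standard git commands for each
state.

```rust
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GitOperation {
    Rebase,
    Merge,
    CherryPick,
    Bisect,
}

impl GitOperation {
    pub fn abort_hint(self) -> &'static str {
        match self {
            Self::Rebase => "git rebase --abort",
            Self::Merge => "git merge --abort",
            Self::CherryPick => "git cherry-pick --abort",
            Self::Bisect => "git bisect reset",
        }
    }
}

/// Return the first in-progress operation whose marker exists under `git_dir`.
pub fn detect_git_operation(git_dir: &Path) -> Option<GitOperation> {
    let markers = [
        ("rebase-merge", GitOperation::Rebase),
        // rebase-apply is also used by `git am`; treated as a rebase here.
        ("rebase-apply", GitOperation::Rebase),
        ("MERGE_HEAD", GitOperation::Merge),
        ("CHERRY_PICK_HEAD", GitOperation::CherryPick),
        ("BISECT_LOG", GitOperation::Bisect),
    ];
    markers
        .iter()
        .find(|(name, _)| git_dir.join(name).exists())
        .map(|(_, op)| *op)
}
```

In a worktree the .git entry is a file pointing elsewhere, so the
caller would need to resolve the real git directory first.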
Filed in response to Clawhip pinpoint nudge 1494698980091756678 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD 82bd8bb from
/tmp/claude-md-injection/inner/work. discover_instruction_files at
runtime/src/prompt.rs:203-224 walks cursor.parent() until None with
no project-root bound, no HOME containment, no git boundary. Four
candidate paths per ancestor (CLAUDE.md, CLAUDE.local.md,
.claw/CLAUDE.md, .claw/instructions.md) are loaded and inlined
verbatim into the agent's system prompt under '# Claude instructions'.
Repro: /tmp/claude-md-injection/CLAUDE.md containing adversarial
guidance appears under 'CLAUDE.md (scope: /private/tmp/claude-md-
injection)' in claw system-prompt from any nested CWD. git init
inside the worker does not terminate the walk. /tmp/CLAUDE.md alone
is sufficient -- /tmp is world-writable with sticky bit on macOS/
Linux, so any local user can plant agent guidance for every other
user's claw invocation under /tmp/anything.
Worse than #85 (skills ancestor walk): no agent action required
(injection fires on every turn before first user message), lower
bar for the attacker (raw Markdown, no frontmatter), standard
world-writable drop point (/tmp), no doctor signal. Same structural
fix family though: prompt.rs:203, commands/src/lib.rs:2795
(skills), and commands/src/lib.rs:2724 (agents) all need the same
project_root / HOME bound.
Fix shape (~30-50 lines): bound ancestor walk at project root /
HOME; add doctor check that surfaces loaded instruction files with
paths; add settings.json opt-in toggle for monorepo ancestor
inheritance with 'source: ancestor' annotation.
Filed in response to Clawhip pinpoint nudge 1494691430096961767 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD d6003be against /tmp/cd8. Fresh
workspace, no config, no env, no CLI flag: claw status reports
'Permission mode danger-full-access'. 'claw doctor' has no
permission-mode check at all -- zero lines mention it.
Trace: rusty-claude-cli/src/main.rs:1099-1107 default_permission_mode
falls back to PermissionMode::DangerFullAccess when env/config miss.
runtime/src/permissions.rs:7-15 PermissionMode ordinal puts
DangerFullAccess above WorkspaceWrite/ReadOnly, so current_mode >=
required_mode gate at :260-264 auto-approves every tool spec requiring
DangerFullAccess or below -- including bash and PowerShell.
check_sandbox_health exists at :1895-1910 but no parallel
check_permission_health. Status JSON exposes permission_mode but no
permission_mode_source field -- fallback indistinguishable from
deliberate choice.
Interacts badly with #86: corrupt .claw.json silently drops the
user's 'plan' choice AND escalates to danger-full-access fallback,
and doctor reports Config: ok across both failures.
Fix shape (~30-40 lines): add permission doctor check (warn when
effective=DangerFullAccess via fallback); add permission_mode_source
to status JSON; optionally flip fallback to WorkspaceWrite/Prompt
for non-interactive invocations.
Filed in response to Clawhip pinpoint nudge 1494683886658257071 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD 586a92b against /tmp/cd7. A valid
.claw.json with permissions.defaultMode=plan applies correctly
(claw status shows Permission mode read-only). Corrupt the same
file to junk text and: (1) claw status reverts to
danger-full-access, (2) claw doctor still reports
Config: status=ok, summary='runtime config loaded successfully',
with loaded_config_files=0 and discovered_files_count=1 side by
side in the same check.
Trace: read_optional_json_object at runtime/src/config.rs:674-692
sets is_legacy_config = (file_name == '.claw.json') and on parse
failure returns Ok(None) instead of Err(ConfigError::Parse). No
warning, no eprintln. ConfigLoader::load() continues past the None,
reports overall success. Doctor check at
rusty-claude-cli/src/main.rs:1725-1754 emits DiagnosticLevel::Ok
whenever load() returned Ok, even with loaded 0/1.
Compare a non-legacy settings path at .claw/settings.json with
identical corruption: doctor correctly fails loudly. Same file
contents, different filename -> opposite diagnostic verdict.
Intent was presumably legacy compat with stale historical .claw.json.
Implementation now masks live user-written typos. A clawhip preflight
that gates on 'status != ok' never sees this. Same surface-lies-
about-runtime-truth shape as #80-#84, at the config layer.
Fix shape (~20-30 lines): replace silent skip with warn-and-skip
carrying the parse error; flip doctor verdict when
loaded_count < present_count; expose skipped_files in JSON surface.
Filed in response to Clawhip pinpoint nudge 1494676332507041872 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD 2eb6e0c. discover_skill_roots at
commands/src/lib.rs:2795 iterates cwd.ancestors() unbounded -- no
project-root check, no HOME containment, no git boundary. Any
.claw/skills, .omc/skills, .agents/skills, .codex/skills,
.claude/skills directory on any ancestor path up to / is enumerated
and marked active: true in 'claw --output-format json skills'.
Repro 1 (cross-tenant skill injection): write
/tmp/trap/.agents/skills/rogue/SKILL.md; cd /tmp/trap/inner/work
and 'claw skills' shows rogue as active, sourced as Project roots.
git init inside the inner CWD does NOT stop the walk.
Repro 2 (CWD-dependent skill set): CWD under $HOME yields
~/.agents/skills contents; CWD outside $HOME hides them. Same user,
same binary, 26-skill delta driven by CWD alone.
Security shape: any attacker-writable ancestor becomes a skill
injection primitive. Skill descriptions are free-form Markdown fed
into the agent context -- crafted descriptions become prompt
injection. tools/src/lib.rs:3295 independently walks ancestors for
dispatch, so the injected skill is also executable via slash
command, not just listed.
Fix shape (~30-50 lines): bound ancestor walk at project root
(ConfigLoader::project_root), optionally also at $HOME; require
explicit settings.json toggle for monorepo ancestor inheritance;
mirror fix in tools/src/lib.rs::push_project_skill_lookup_roots so
listed and dispatchable skill surfaces match.
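A sketch of the bounded walk, with a hypothetical bounded_skill_roots
helper. It assumes both paths are already canonicalized and, for
brevity, only emits the .claw/skills candidate; the other skill
directory names would be added the same way. The same take_while
bound would apply to the instruction-file walk in prompt.rs.

```rust
use std::path::{Path, PathBuf};

/// Candidate skill roots from `cwd` upward, stopping at `project_root`.
/// Returns nothing when `cwd` is not inside `project_root`.
pub fn bounded_skill_roots(cwd: &Path, project_root: &Path) -> Vec<PathBuf> {
    cwd.ancestors()
        // ancestors() yields cwd first, then each parent up to the root.
        .take_while(|dir| dir.starts_with(project_root))
        .map(|dir| dir.join(".claw").join("skills"))
        .collect()
}
```

With project_root at /tmp/trap/inner, a CWD of /tmp/trap/inner/work
yields two candidates and never reaches /tmp/trap.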
Filed in response to Clawhip pinpoint nudge 1494668784382771280 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD 70a0f0c from /tmp/cd4.
'claw dump-manifests' with no arguments emits:
error: Manifest source files are missing.
repo root: /Users/yeongyu/clawd/claw-code
missing: src/commands.ts, src/tools.ts, src/entrypoints/cli.tsx
That path is the *build machine*'s absolute filesystem layout, baked
in via env!('CARGO_MANIFEST_DIR') at rusty-claude-cli/src/main.rs:2016.
strings on the binary reveals the raw path verbatim. JSON surface
(--output-format json) leaks the same path identically.
Three problems: (1) broken default for any user running a distributed
binary because the path won't exist on their machine; (2) privacy
leak -- build user's $HOME segment embedded in the binary and
surfaced to every recipient; (3) reproducibility violation -- two
binaries built from the same commit on different machines produce
different runtime behavior. Same compile-time-vs-runtime family as
ROADMAP #83 (build date injected as 'today').
Fix shape (<=20 lines): drop env!('CARGO_MANIFEST_DIR') from the
runtime default, require CLAUDE_CODE_UPSTREAM / --manifests-dir /
settings entry, reword error to name the required config instead of
leaking a path the user never asked for. Optional polish: add a
settings.json [upstream] entry.
Acceptance: strings <binary> | grep '^/Users/' returns empty for the
shipped binary. Default error surface contains zero absolute paths
from the build machine.
Filed in response to Clawhip pinpoint nudge 1494661235336282248 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD e58c194 against /tmp/cd3. Binary
built 2026-04-10; today is 2026-04-17. 'claw system-prompt' emits
'Today's date is 2026-04-10.' The same DEFAULT_DATE constant
(rusty-claude-cli/src/main.rs:69-72) is threaded into
build_system_prompt() at :6173-6180 and every ClaudeCliSession /
StreamingCliSession / non-interactive runner (lines 3649, 3746,
4165, 4211, ...), so the stale date lives in the LIVE agent prompt,
not just the system-prompt subcommand.
Agents reason from 'today = compile day,' which silently breaks any
task that depends on real time (freshness, deadlines, staleness,
expiry). Violates ROADMAP principle #4 (branch freshness before
blame) and mixes compile-time context into runtime behavior,
producing different prompts for two agents on the same main HEAD
built a week apart.
Fix shape (~30 lines): compute current_date at runtime via
chrono::Utc::now().date_naive(), sweep DEFAULT_DATE call sites in
main.rs, keep --date override and --version's build-date meaning,
add CLAWD_OVERRIDE_DATE env escape for reproducible tests.
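A std-only sketch of computing the date at runtime. The fix shape
above names chrono; this stand-in uses the well-known days-to-civil
conversion instead so the example has no dependencies.
civil_from_days and current_date_utc are hypothetical names, and the
override is passed as a parameter rather than read from the
environment.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Days since 1970-01-01 -> (year, month, day), proleptic Gregorian.
pub fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month index
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month, day)
}

/// Current UTC date as YYYY-MM-DD, unless an override is supplied
/// (for example from a test-only environment variable).
pub fn current_date_utc(override_date: Option<&str>) -> String {
    if let Some(date) = override_date {
        return date.to_string();
    }
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map_or(0, |d| d.as_secs());
    let (y, m, d) = civil_from_days((secs / 86_400) as i64);
    format!("{y:04}-{m:02}-{d:02}")
}
```

The override value should go through the same date validation as
--date before it reaches the prompt.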
Filed in response to Clawhip pinpoint nudge 1494653681222811751 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD 1743e60 against /tmp/claw-dogfood-2.
claw --output-format json sandbox on macOS reports filesystem_active=
true, filesystem_mode=workspace-only but the actual enforcement is
only HOME/TMPDIR env-var rebasing at bash.rs:205-209 / :228-232.
build_linux_sandbox_command is cfg(target_os=linux)-gated and returns
None on macOS, so the fallback path is sh -lc <command> with env
tweaks and nothing else. Direct escape proof: a child with
HOME=/ws/.sandbox-home TMPDIR=/ws/.sandbox-tmp writes
/tmp/claw-escape-proof.txt and mkdir /tmp/claw-probe-target without
error.
Clawability problem: claws/orchestrators read SandboxStatus JSON and
branch on filesystem_active && filesystem_mode=='workspace-only' to
decide whether a worker can safely touch /tmp or $HOME. Today that
branch lies on macOS.
Fix shape option A (low-risk, ~15 lines): compute filesystem_active
only where an enforcement path exists, so macOS reports false by
default and fallback_reason surfaces the real story. Option B:
wire a Seatbelt (sandbox-exec) profile for actual macOS enforcement.
Filed in response to Clawhip pinpoint nudge 1494646135317598239 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD a48575f inside claw-code itself
and reproduced on /tmp/claw-split-17. SessionStore::from_cwd at
session_control.rs:32-40 uses the raw CWD as input to
workspace_fingerprint() (line 295-303), not the project root
surfaced in claw status. Result: two CWDs in the same git repo
(e.g. ~/clawd/claw-code vs ~/clawd/claw-code/rust) report the same
Project root in status but land in two disjoint .claw/sessions/
<fp>/ partitions. claw --resume latest from one CWD returns
'no managed sessions found' even though the adjacent CWD has a
live session visible via /session list.
Status-layer truth (Project root) and session-layer truth
(fingerprint-of-CWD) disagree and neither surface exposes the
disagreement -- classic split-truth per ROADMAP pain point #2.
Fix shape (<=40 lines): (a) fingerprint the project root instead
of raw CWD, or (b) surface partition key explicitly in status.
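Option (a) sketched with a standard 64-bit FNV-1a. The trace above
only says the fingerprint is 16-char FNV-1a hex; hashing the lossy
string form of the path is an assumption here, and the function
bodies are illustrative rather than copied from session_control.rs.

```rust
use std::path::Path;

/// Standard 64-bit FNV-1a.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // offset basis
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Partition key derived from the project root, not the raw CWD,
/// so every CWD inside one repo lands in the same sessions directory.
pub fn workspace_fingerprint(project_root: &Path) -> String {
    format!("{:016x}", fnv1a_64(project_root.to_string_lossy().as_bytes()))
}
```

Switching the input changes existing partition names, so a migration
or a fallback lookup under the old CWD-based key would be needed for
sessions already on disk.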
Filed in response to Clawhip pinpoint nudge 1494638583481372833
in #clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD 688295e against /tmp/claw-d4.
SessionStore::from_cwd at session_control.rs:32-40 places sessions
under .claw/sessions/<workspace_fingerprint>/ (16-char FNV-1a hex
at line 295-303), but format_no_managed_sessions and
format_missing_session_reference at line 516-526 advertise plain
.claw/sessions/ with no fingerprint context.
Concrete repro: fresh workspace, no sessions yet, .claw/sessions/
contains foo/ (hash dir, empty) + ffffffffffffffff/foreign.jsonl
(foreign workspace session). 'claw --resume latest' still says
'no managed sessions found in .claw/sessions/' even though that
directory is not empty -- the sessions just belong to other
workspace partitions.
Fix shape is ~30 lines: plumb the resolved sessions_root/workspace
into the two format helpers, optionally enumerate sibling partitions
so error copy tells the operator where sessions from other workspaces
are and why they're invisible.
Filed in response to Clawhip pinpoint nudge 1494615932222439456 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 on main HEAD 9deaa29. init.rs:38-113 already
builds a fully-typed InitReport { project_root, artifacts: Vec<
InitArtifact { name, status: InitStatus }> } but main.rs:5436-5454
calls .render() on it and throws the structure away, emitting only
{kind, message: '<prose>'} via init_json_value(). Downstream claws
have to regex 'created|updated|skipped' out of the message string
to know per-artifact state.
version/system-prompt/acp/bootstrap-plan all emit structured payloads
on the same binary -- init is the sole odd-one-out. Fix shape is ~20
lines: add InitReport::to_json_value + InitStatus::as_str, switch
run_init to hold the report instead of .render()-ing it eagerly,
preserve message for backward compat, add output_format_contract
regression.
Filed in response to Clawhip pinpoint nudge 1494608389068558386 in
#clawcode-building-in-public.
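A rough sketch of the proposed shape, assuming InitStatus has exactly the three variants implied by the 'created|updated|skipped' prose. The field types and the `names_with_status` helper are hypothetical and only illustrate how a machine-stable tag removes the need for regex matching.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum InitStatus {
    Created,
    Updated,
    Skipped,
}

impl InitStatus {
    /// Machine-stable tag, independent of the human-readable message.
    fn as_str(self) -> &'static str {
        match self {
            InitStatus::Created => "created",
            InitStatus::Updated => "updated",
            InitStatus::Skipped => "skipped",
        }
    }
}

struct InitArtifact {
    name: &'static str,
    status: InitStatus,
}

/// Hypothetical helper: names of all artifacts that ended in `wanted` state.
fn names_with_status(artifacts: &[InitArtifact], wanted: InitStatus) -> Vec<&'static str> {
    artifacts
        .iter()
        .filter(|artifact| artifact.status == wanted)
        .map(|artifact| artifact.name)
        .collect()
}
```

A JSON serializer built on top of this could emit each artifact's `name` and `status.as_str()` while keeping the legacy `message` field for backward compatibility.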
Dogfooded 2026-04-17 on main HEAD d05c868. CliAction::Plugins variant
is declared at main.rs:303-307 and wired to LiveCli::print_plugins at
main.rs:202-206, but parse_args has no "plugins" arm, so
claw plugins / claw plugins list / claw --output-format json plugins
all fall through to the LLM-prompt catch-all and emit a missing
Anthropic credentials error. This is the sole documented-shaped
subcommand that does NOT resolve to a local CLI route:
agents, mcp, skills, acp, init, dump-manifests, bootstrap-plan,
system-prompt, export all work. grep confirms CliAction::Plugins has
exactly one hit in crates/ (the handler), not a constructor anywhere.
Filed with a ~15 line parser fix shape plus help/test wiring, matching
the pattern already used by agents/mcp/skills.
Filed in response to Clawhip pinpoint nudge 1494600832652546151 in
#clawcode-building-in-public.
Dogfooded 2026-04-17 against main HEAD 00d0eb6. Five distinct failure
classes (missing credentials, missing manifests, missing worker state,
session not found, CLI parse) all emit the same {type,error} envelope
with no machine-readable kind/code, so downstream claws have to regex
the prose to route failures. Success payloads already carry a stable
'kind' discriminator; error payloads do not. Fix shape proposes an
ErrorKind discriminant plus hint/context fields to match the success
side contract.
Filed in response to Clawhip pinpoint nudge 1494593284180414484 in
#clawcode-building-in-public.
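One possible shape for the discriminant, sketched under the assumption that each of the five failure classes gets its own variant. The variant names, the snake_case tags, and the `parse` helper are hypothetical; the filing only proposes that some ErrorKind exists.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ErrorKind {
    MissingCredentials,
    MissingManifests,
    MissingWorkerState,
    SessionNotFound,
    CliParse,
}

impl ErrorKind {
    const ALL: [ErrorKind; 5] = [
        ErrorKind::MissingCredentials,
        ErrorKind::MissingManifests,
        ErrorKind::MissingWorkerState,
        ErrorKind::SessionNotFound,
        ErrorKind::CliParse,
    ];

    /// Stable tag a JSON error envelope could carry next to the prose.
    fn as_str(self) -> &'static str {
        match self {
            ErrorKind::MissingCredentials => "missing_credentials",
            ErrorKind::MissingManifests => "missing_manifests",
            ErrorKind::MissingWorkerState => "missing_worker_state",
            ErrorKind::SessionNotFound => "session_not_found",
            ErrorKind::CliParse => "cli_parse",
        }
    }

    /// Consumer side: route on the tag instead of regexing the message.
    fn parse(tag: &str) -> Option<ErrorKind> {
        Self::ALL.iter().copied().find(|kind| kind.as_str() == tag)
    }
}
```

With a tag like this in the envelope, a downstream claw could branch on `ErrorKind::parse(tag)` and treat `None` as an unrecognized failure class.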
Add ModelTokenLimit entries for kimi-k2.5 and kimi-k1.5 to enable
preflight context window validation. Per Moonshot AI documentation:
- Context window: 256,000 tokens
- Max output: 16,384 tokens
Includes 3 unit tests:
- returns_context_window_metadata_for_kimi_models
- kimi_alias_resolves_to_kimi_k25_token_limits
- preflight_blocks_oversized_requests_for_kimi_models
All tests pass, clippy clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
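A minimal sketch of how such entries could feed a preflight check, using the limits quoted above. The field names and the `preflight_fits` rule (input plus requested output must fit in the context window) are assumptions for illustration, not the actual implementation.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ModelTokenLimit {
    max_output_tokens: u32,
    context_window_tokens: u32,
}

/// Returns limits for known models; `None` means no metadata is registered.
fn model_token_limit(model: &str) -> Option<ModelTokenLimit> {
    match model {
        "kimi-k2.5" | "kimi-k1.5" => Some(ModelTokenLimit {
            max_output_tokens: 16_384,
            context_window_tokens: 256_000,
        }),
        _ => None,
    }
}

/// Hypothetical preflight rule: a request fits when estimated input plus
/// requested output stays within the context window.
fn preflight_fits(model: &str, estimated_input_tokens: u32, requested_output_tokens: u32) -> bool {
    match model_token_limit(model) {
        Some(limit) => {
            estimated_input_tokens.saturating_add(requested_output_tokens)
                <= limit.context_window_tokens
        }
        // Models without metadata skip validation.
        None => true,
    }
}
```

In this sketch, a 250,000-token prompt with the full 16,384-token output budget is rejected for kimi-k2.5, while the same request for a model without an entry passes through unchecked.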
All 23 stories (US-001 through US-023) are now complete.
Updated status from "in_progress" to "completed".
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add "kimi" to the strip_routing_prefix matches so that models like
"kimi/kimi-k2.5" have their prefix stripped before sending to the
DashScope API (consistent with qwen/openai/xai/grok handling).
Also add unit test strip_routing_prefix_strips_kimi_provider_prefix.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
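The prefix stripping described above might look roughly like the following. This is a sketch assuming the function takes and returns a string slice and matches only the providers listed in the message; the real signature may differ.

```rust
/// Strips a known routing prefix such as "kimi/" and leaves other names alone.
fn strip_routing_prefix(model: &str) -> &str {
    match model.split_once('/') {
        Some((prefix, rest))
            if matches!(prefix, "qwen" | "openai" | "xai" | "grok" | "kimi") =>
        {
            rest
        }
        _ => model,
    }
}
```

Under this sketch, "kimi/kimi-k2.5" becomes "kimi-k2.5", while a bare "kimi-k2.5" or a name with an unlisted prefix is returned unchanged.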
- Move US-023 from inProgressStories to completedStories
- All acceptance criteria met and verified
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changes in rust/crates/api/src/providers/mod.rs:
- Add 'kimi' alias to MODEL_REGISTRY resolving to 'kimi-k2.5' with DashScope config
- Add kimi/kimi- prefix routing to DashScope endpoint in metadata_for_model()
- Add resolve_model_alias() handling for kimi -> kimi-k2.5
- Add unit tests: kimi_prefix_routes_to_dashscope, kimi_alias_resolves_to_kimi_k2_5
Users can now use:
- --model kimi (resolves to kimi-k2.5)
- --model kimi-k2.5 (auto-routes to DashScope)
- --model kimi/kimi-k2.5 (explicit provider prefix)
All 127 tests pass, clippy clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add structured error context to API failures:
- Request ID tracking across retries with full context in error messages
- Provider-specific error code mapping with actionable suggestions
- Suggested user actions for common error types (401, 403, 413, 429, 500, 502-504)
- Added suggested_action field to ApiError::Api variant
- Updated enrich_bearer_auth_error to preserve suggested_action
Files changed:
- rust/crates/api/src/error.rs: Add suggested_action field, update Display
- rust/crates/api/src/providers/openai_compat.rs: Add suggested_action_for_status()
- rust/crates/api/src/providers/anthropic.rs: Update error handling
All tests pass, clippy clean.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
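As an illustration of the status-to-suggestion mapping, here is a minimal sketch. The signature is assumed, and every message string is a placeholder chosen for this example; the messages actually returned by suggested_action_for_status() may differ.

```rust
/// Maps an HTTP status to a short, actionable hint (placeholder wording).
fn suggested_action_for_status(status: u16) -> Option<&'static str> {
    match status {
        401 => Some("Check API key"),
        403 => Some("Check that the key is allowed to use this model"),
        413 => Some("Reduce prompt size"),
        429 => Some("Wait and retry; the provider is rate limiting requests"),
        500 | 502..=504 => Some("Retry later; the provider reported a server error"),
        _ => None,
    }
}
```

Returning `Option` keeps the error path explicit: statuses without a known remedy produce no suggestion rather than a generic one.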
Added criterion benchmarks and optimized flatten_tool_result_content:
- Added criterion dev-dependency and request_building benchmark suite
- Optimized flatten_tool_result_content to pre-allocate capacity and avoid
intermediate Vec construction (was collecting to Vec then joining)
- Made key functions public for benchmarking: translate_message,
build_chat_completion_request, flatten_tool_result_content,
is_reasoning_model, model_rejects_is_error_field
Benchmark results:
- flatten_tool_result_content/single_text: ~17ns
- translate_message/text_only: ~200ns
- build_chat_completion_request/10 messages: ~16.4µs
- is_reasoning_model detection: ~26-42ns
All 119 unit tests and 29 integration tests pass.
cargo clippy passes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
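The pre-allocation idea can be sketched as follows. The input type (a slice of text parts) and the newline separator are assumptions made for this example; the real flatten_tool_result_content operates on the provider's own content types.

```rust
/// Joins text parts with '\n' using one up-front allocation instead of
/// collecting into an intermediate Vec and then joining.
fn flatten_text_parts(parts: &[&str]) -> String {
    if parts.is_empty() {
        return String::new();
    }
    // Total bytes: every part plus one separator between neighbours.
    let total: usize = parts.iter().map(|part| part.len()).sum::<usize>() + (parts.len() - 1);
    let mut out = String::with_capacity(total);
    for (index, part) in parts.iter().enumerate() {
        if index > 0 {
            out.push('\n');
        }
        out.push_str(part);
    }
    out
}
```

Because the capacity is computed before any bytes are written, the output string should not need to reallocate while it is being filled.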
Created comprehensive MODEL_COMPATIBILITY.md documenting:
- Kimi models is_error exclusion (prevents 400 Bad Request)
- Reasoning models tuning parameter stripping (o1, o3, o4, grok-3-mini, qwen-qwq)
- GPT-5 max_completion_tokens requirement
- Qwen model routing through DashScope
Includes implementation details, key functions table, guide for adding new
models, and testing commands. Cross-referenced with existing code comments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added 4 unit tests to verify is_error field handling for kimi models:
- model_rejects_is_error_field_detects_kimi_models: Detects kimi-k2.5, kimi-k1.5, dashscope/kimi-k2.5 (case insensitive)
- translate_message_includes_is_error_for_non_kimi_models: Verifies gpt-4o, grok-3, claude include is_error
- translate_message_excludes_is_error_for_kimi_models: Verifies kimi models exclude is_error (prevents 400 Bad Request)
- build_chat_completion_request_kimi_vs_non_kimi_tool_results: Full integration test for request building
All 119 unit tests and 29 integration tests pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- translate_message now conditionally includes is_error field
- kimi models (kimi-k2.5, kimi-k1.5, etc.) exclude is_error
- Other models (openai, grok, xai) keep is_error support
- Update prd.json: mark US-001 through US-007 as passes: true
- Add progress.txt: detailed implementation summary for all stories
All acceptance criteria verified:
- US-001: Startup failure evidence bundle + classifier
- US-002: Lane event schema with provenance and deduplication
- US-003: Stale branch detection with policy integration
- US-004: Recovery recipes with ledger
- US-005: Typed task packet format with TaskScope
- US-006: Policy engine for autonomous coding
- US-007: Plugin/MCP lifecycle maturity
Adds typed worker.startup_no_evidence event with evidence bundle when worker
startup times out. The classifier attempts to down-rank the vague bucket into
specific failure classifications:
- trust_required
- prompt_misdelivery
- prompt_acceptance_timeout
- transport_dead
- worker_crashed
- unknown
Evidence bundle includes:
- Last known worker lifecycle state
- Pane/command being executed
- Prompt-send timestamp
- Prompt-acceptance state
- Trust-prompt detection result
- Transport health summary
- MCP health summary
- Elapsed seconds since worker creation
Includes 6 regression tests covering:
- Evidence bundle serialization
- Transport dead classification
- Trust required classification
- Prompt acceptance timeout
- Worker crashed detection
- Unknown fallback
Closes Phase 1.6 from ROADMAP.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 09:05:33 +00:00
36 changed files with 13370 additions and 235 deletions
@@ -7,7 +7,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- Frameworks: none detected from the supported starter markers.
## Verification
- Run Rust verification from `rust/`: `cargo fmt`, `cargo clippy --workspace --all-targets -- -D warnings`, `cargo test --workspace`
- Run Rust verification from repo root: `scripts/fmt.sh --check`; for formatting use `scripts/fmt.sh`. Run Rust clippy/tests from `rust/`: `cargo clippy --workspace --all-targets -- -D warnings`, `cargo test --workspace`
- `src/` and `tests/` are both present; update both surfaces together when behavior changes.
**Git Bash / WSL** are optional alternatives, not requirements. If you prefer bash-style paths (`/c/Users/you/...` instead of `C:\Users\you\...`), Git Bash (ships with Git for Windows) works well. In Git Bash, the `MINGW64` prompt is expected and normal — not a broken install.
## Post-build: locate the binary and verify
After running `cargo build --workspace`, the `claw` binary is built but **not** automatically installed to your system. Here's where to find it and how to verify the build succeeded.
### Binary location
After `cargo build --workspace` in `claw-code/rust/`:
**Debug build (default, faster compile):**
- **macOS/Linux:** `rust/target/debug/claw`
- **Windows:** `rust/target/debug/claw.exe`
**Release build (optimized, slower compile):**
- **macOS/Linux:** `rust/target/release/claw`
- **Windows:** `rust/target/release/claw.exe`
If you ran `cargo build` without `--release`, the binary is in the `debug/` folder.
### Verify the build succeeded
Test the binary directly using its path:
```bash
# macOS/Linux (debug build)
./rust/target/debug/claw --help
./rust/target/debug/claw doctor
# Windows PowerShell (debug build)
.\rust\target\debug\claw.exe --help
.\rust\target\debug\claw.exe doctor
```
If these commands succeed, the build is working. `claw doctor` is your first health check — it validates your API key, model access, and tool configuration.
### Optional: Add to PATH
If you want to run `claw` from any directory without the full path, choose one of these approaches:
Build and install to Cargo's default location (`~/.cargo/bin/`, which is usually on PATH):
```bash
# From the claw-code/rust/ directory
cargo install --path . --force
# Then from anywhere
claw --help
```
**Option 3: Update shell profile (bash/zsh)**
Add this line to `~/.bashrc` or `~/.zshrc`:
```bash
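# Use the absolute path to your checkout; $(pwd) in a profile would resolve
# to whatever directory each new shell happens to start in.
export PATH="/path/to/claw-code/rust/target/debug:$PATH"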
```
Reload your shell:
```bash
source ~/.bashrc # or source ~/.zshrc
claw --help
```
### Troubleshooting
- **"command not found: claw"** — The binary is in `rust/target/debug/claw`, but it's not on your PATH. Use the full path `./rust/target/debug/claw` or symlink/install as above.
- **"permission denied"** — On macOS/Linux, you may need `chmod +x rust/target/debug/claw` if the executable bit isn't set (rare).
- **Debug vs. release** — If `claw` itself runs slowly, you are probably using a debug build (the default). Add `--release` to `cargo build` for a faster binary; the release build takes longer to compile (5–10 minutes).
> [!NOTE]
> **Auth:** claw requires an **API key** (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, etc.) — Claude subscription login is not a supported auth path.
Run the workspace test suite:
Run the workspace test suite after verifying the binary works:
Or run doctor directly with JSON output for scripting:
```bash
cd rust
./target/debug/claw doctor --output-format json
```
**Note:** Diagnostic verbs (`doctor`, `status`, `sandbox`, `version`) support `--output-format json` for machine-readable output. Invalid suffix arguments (e.g., `--json`) are now rejected at parse time rather than falling through to prompt dispatch.
### Initialize a repository
Set up a new repository with `.claw` config, `.claw.json`, `.gitignore` entries, and a `CLAUDE.md` guidance file:
```bash
cd /path/to/your/repo
./target/debug/claw init
```
Text mode (human-readable) shows an artifact creation summary with the project path and next steps. The command is idempotent: running it again in the same repo marks already-created files as "skipped".
JSON mode for scripting:
```bash
./target/debug/claw init --output-format json
```
Returns structured output with `project_path`, `created[]`, `updated[]`, `skipped[]` arrays (one per artifact), and `artifacts[]` carrying each file's `name` and machine-stable `status` tag. The legacy `message` field preserves backward compatibility.
**Why structured fields matter:** Claws can detect per-artifact state (`created` vs `updated` vs `skipped`) without substring-matching human prose. Use the `created[]`, `updated[]`, and `skipped[]` arrays for conditional follow-up logic (e.g., only commit if files were actually created, not just updated).
The `claw state` command reads `.claw/worker-state.json`, which is written by the interactive REPL or a one-shot prompt when a worker executes a task. This file contains the worker ID, session reference, model, and permission mode.
Prerequisite: You must run `claw` (interactive REPL) or `claw prompt <text>` at least once in the repository to produce the worker state file.
```bash
cd rust
./target/debug/claw state
```
JSON mode:
```bash
./target/debug/claw state --output-format json
```
If you run `claw state` before any worker has executed, you will see a helpful error:
```
error: no worker state file found at .claw/worker-state.json
Hint: worker state is written by the interactive REPL or a non-interactive prompt.
Run: claw # start the REPL (writes state on first turn)
Or: claw prompt <text> # run one non-interactive turn
```
These commands are available inside the interactive REPL (`claw` with no args). They extend the assistant with workspace analysis, planning, and navigation features.
### `/ultraplan` — Deep planning with multi-step reasoning
**Purpose:** Break down a complex task into steps using extended reasoning.
```bash
# Start the REPL
claw
# Inside the REPL
/ultraplan refactor the auth module to use async/await
/ultraplan design a caching layer for database queries
/ultraplan analyze this module for performance bottlenecks
```
Output: A structured plan with numbered steps, reasoning for each step, and expected outcomes. Use this when you want the assistant to think through a problem in detail before coding.
### `/teleport` — Jump to a file or symbol
**Purpose:** Quickly navigate to a file, function, class, or struct by name.
```bash
# Jump to a symbol
/teleport UserService
/teleport authenticate_user
/teleport RequestHandler
# Jump to a file
/teleport src/auth.rs
/teleport crates/runtime/lib.rs
/teleport ./ARCHITECTURE.md
```
Output: The file content, with the requested symbol highlighted or the file fully loaded. Useful for exploring the codebase without manually navigating directories. If multiple matches exist, the assistant shows the top candidates.
### `/bughunter` — Scan for likely bugs and issues
**Purpose:** Analyze code for common pitfalls, anti-patterns, and potential bugs.
```bash
# Scan the entire workspace
/bughunter
# Scan a specific directory or file
/bughunter src/handlers
/bughunter rust/crates/runtime
/bughunter src/auth.rs
```
Output: A list of suspicious patterns with explanations (e.g., "unchecked unwrap()", "potential race condition", "missing error handling"). Each finding includes the file, line number, and suggested fix. Use this as a first pass before a full code review.
This document describes model-specific handling in the OpenAI-compatible provider. When adding new models or providers, review this guide to ensure proper compatibility.
The `openai_compat.rs` provider translates Claude Code's internal message format to OpenAI-compatible chat completion requests. Different models have varying requirements for:
- Tool result message fields (`is_error`)
- Sampling parameters (temperature, top_p, etc.)
- Token limit fields (`max_tokens` vs `max_completion_tokens`)
- Base URL routing
## Model-Specific Handling
### Kimi Models (is_error Exclusion)
**Affected models:** `kimi-k2.5`, `kimi-k1.5`, `kimi-moonshot`, and any model with `kimi` in the name (case-insensitive)
**Behavior:** The `is_error` field is **excluded** from tool result messages.
**Rationale:** Kimi models (via Moonshot AI and DashScope) reject the `is_error` field with a 400 Bad Request error:
**Note:** Some Qwen models are also reasoning models (see [Reasoning Models](#reasoning-models-tuning-parameter-stripping) above) and receive both treatments.
## Implementation Details
### File Location
All model-specific logic is in:
```
rust/crates/api/src/providers/openai_compat.rs
```
### Key Functions
| Function | Purpose |
|----------|---------|
| `model_rejects_is_error_field()` | Detects models that don't support `is_error` in tool results |
| `is_reasoning_model()` | Detects reasoning models that need tuning param stripping |
| `translate_message()` | Converts internal messages to OpenAI format (applies `is_error` logic) |
| `build_chat_completion_request()` | Constructs full request payload (applies all model-specific logic) |
### Provider Prefix Handling
All model detection functions strip provider prefixes (e.g., `dashscope/kimi-k2.5` → `kimi-k2.5`) before matching:
```rust
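// Lowercase into an owned String first so the slice below can borrow from it.
let lowered = model.to_ascii_lowercase();
let canonical = lowered
    .rsplit('/')
    .next()
    .unwrap_or(lowered.as_str());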
```
This ensures consistent detection regardless of whether models are referenced with or without provider prefixes.
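Putting the prefix handling together with the Kimi rule, a detection function could be sketched like this. `canonical_model_name` and `rejects_is_error` are hypothetical names for illustration, and the substring rule follows the "any model with `kimi` in the name" description rather than the actual source.

```rust
/// Lowercases the model id and keeps only the segment after the last '/'.
fn canonical_model_name(model: &str) -> String {
    let lowered = model.to_ascii_lowercase();
    let name = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    name.to_string()
}

/// Sketch of an `is_error` compatibility check: true for any Kimi model.
fn rejects_is_error(model: &str) -> bool {
    canonical_model_name(model).contains("kimi")
}
```

With this shape, `dashscope/kimi-k2.5`, `Kimi-K1.5`, and `kimi-k2.5` all take the same branch, while `gpt-4o` keeps the `is_error` field.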
## Adding New Models
When adding support for new models:
1. **Check if the model is a reasoning model**
   - Does it reject temperature/top_p parameters?
   - Add to `is_reasoning_model()` detection
2. **Check tool result compatibility**
   - Does it reject the `is_error` field?
   - Add to `model_rejects_is_error_field()` detection
3. **Check token limit field**
   - Does it require `max_completion_tokens` instead of `max_tokens`?
   - Update the `max_tokens_key` logic
4. **Add tests**
   - Unit test for detection function
   - Integration test in `build_chat_completion_request`
5. **Update this documentation**
   - Add the model to the affected lists
   - Document any special behavior
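For step 3, the token limit field selection might be sketched as below. The function name mirrors the `max_tokens_key` logic mentioned above, but the `gpt-5` prefix rule is an assumption for illustration; check the real matching in `openai_compat.rs` before extending it.

```rust
/// Picks the request field that carries the output token limit.
fn max_tokens_key(model: &str) -> &'static str {
    let lowered = model.to_ascii_lowercase();
    // Strip any provider prefix, as the other detection functions do.
    let name = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    if name.starts_with("gpt-5") {
        "max_completion_tokens"
    } else {
        "max_tokens"
    }
}
```

A new model that needs `max_completion_tokens` would add its own condition here, together with a unit test for both the prefixed and unprefixed model id.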
## Testing
### Running Model-Specific Tests
```bash
# All OpenAI compatibility tests
cargo test --package api providers::openai_compat
# Specific test categories
cargo test --package api model_rejects_is_error_field
cargo test --package api reasoning_model
cargo test --package api gpt5
cargo test --package api qwen
```
### Test Files
- Unit tests: `rust/crates/api/src/providers/openai_compat.rs` (in `mod tests`)
"description":"When startup times out, emit typed worker.startup_no_evidence event with evidence bundle including last known worker lifecycle state, pane command, prompt-send timestamp, prompt-acceptance state, trust-prompt detection result, and transport/MCP health summary. Classifier should down-rank into specific failure classes.",
"acceptanceCriteria":[
"worker.startup_no_evidence event emitted on startup timeout with evidence bundle",
"Session identity completeness at creation (title, workspace, purpose)",
"Duplicate terminal-event suppression with fingerprinting",
"Lane ownership/scope binding in events",
"Nudge acknowledgment with dedupe contract",
"clawhip consumes typed lane events instead of pane scraping"
],
"passes":true,
"priority":"P0"
},
{
"id":"US-003",
"title":"Phase 3 - Stale-branch detection before broad verification",
"description":"Before broad test runs, compare current branch to main and detect if known fixes are missing. Emit branch.stale_against_main event and suggest/auto-run rebase/merge-forward.",
"acceptanceCriteria":[
"Branch freshness comparison against main implemented",
"branch.stale_against_main event emitted when behind",
"Auto-rebase/merge-forward policy integration",
"Avoid misclassifying stale-branch failures as new regressions"
],
"passes":true,
"priority":"P1"
},
{
"id":"US-004",
"title":"Phase 3 - Recovery recipes with ledger",
"description":"Encode automatic recoveries for common failures (trust prompt, prompt misdelivery, stale branch, compile red, MCP startup). Expose recovery attempt ledger with recipe id, attempt count, state, timestamps, failure summary.",
"acceptanceCriteria":[
"Recovery recipes defined for: trust_prompt_unresolved, prompt_delivered_to_shell, stale_branch, compile_red_after_refactor, MCP_handshake_failure, partial_plugin_startup",
"title":"Phase 4 - Policy engine for autonomous coding",
"description":"Encode automation rules: if green + scoped diff + review passed -> merge to dev; if stale branch -> merge-forward before broad tests; if startup blocked -> recover once, then escalate; if lane completed -> emit closeout and cleanup session.",
"acceptanceCriteria":[
"Policy rules engine implemented",
"Rules: green + scoped diff + review -> merge",
"Rules: stale branch -> merge-forward before tests",
"Rules: startup blocked -> recover once, then escalate",
"description":"First-class plugin/MCP lifecycle contract: config validation, startup healthcheck, discovery result, degraded-mode behavior, shutdown/cleanup. Close gaps in end-to-end lifecycle.",
"acceptanceCriteria":[
"Plugin/MCP config validation contract",
"Startup healthcheck with structured results",
"Discovery result reporting",
"Degraded-mode behavior documented and implemented",
"Shutdown/cleanup contract",
"Partial startup and per-server failures reported structurally"
],
"passes":true,
"priority":"P2"
},
{
"id":"US-008",
"title":"Fix kimi-k2.5 model API compatibility",
"description":"The kimi-k2.5 model (and other kimi models) reject API requests containing the is_error field in tool result messages. The OpenAI-compatible provider currently always includes is_error for all models. Need to make this field conditional based on model support.",
"acceptanceCriteria":[
"translate_message function accepts model parameter",
"is_error field excluded for kimi models (kimi-k2.5, kimi-k1.5, etc.)",
"is_error field included for models that support it (openai, grok, xai, etc.)",
"build_chat_completion_request passes model to translate_message",
"Tests verify is_error presence/absence based on model",
"cargo test passes",
"cargo clippy passes",
"cargo fmt passes"
],
"passes":true,
"priority":"P0"
},
{
"id":"US-009",
"title":"Add unit tests for kimi model compatibility fix",
"description":"During dogfooding we discovered the existing test coverage for model-specific is_error handling is insufficient. Need to add dedicated tests for model_rejects_is_error_field function and translate_message behavior with different models.",
"Test translate_message includes is_error for gpt-4, grok-3, claude models",
"Test translate_message excludes is_error for kimi models",
"Test build_chat_completion_request produces correct payload for kimi vs non-kimi",
"All new tests pass",
"cargo test --package api passes"
],
"passes":true,
"priority":"P1"
},
{
"id":"US-010",
"title":"Add model compatibility documentation",
"description":"Document which models require special handling (is_error exclusion, reasoning model tuning param stripping, etc.) in a MODEL_COMPATIBILITY.md file for operators and contributors.",
"acceptanceCriteria":[
"MODEL_COMPATIBILITY.md created in docs/ or repo root",
"title":"Performance optimization: reduce API request serialization overhead",
"description":"The translate_message function creates intermediate JSON Value objects that could be optimized. Profile and optimize the hot path for API request building, especially for conversations with many tool results.",
"acceptanceCriteria":[
"Profile current request building with criterion or similar",
"Identify bottlenecks in translate_message and build_chat_completion_request",
"title":"Trust prompt resolver with allowlist auto-trust",
"description":"Add allowlisted auto-trust behavior for known repos/worktrees. Trust prompts currently block TUI startup and require manual intervention. Implement automatic trust resolution for pre-approved repositories.",
"acceptanceCriteria":[
"TrustAllowlist config structure with repo patterns",
"Auto-trust behavior for allowlisted repos/worktrees",
"trust_required event emitted when trust prompt detected",
"trust_resolved event emitted when trust is granted",
"description":"When the same session emits contradictory lifecycle events (idle, error, completed, transport/server-down) in close succession, expose deterministic final truth. Attach monotonic sequence/causal ordering metadata, classify terminal vs advisory events, reconcile duplicate/out-of-order terminal events into one canonical lane outcome.",
"description":"Every emitted event should declare its source (live_lane, test, healthcheck, replay, transport) so claws do not mistake test noise for production truth. Include environment/channel label, emitter identity, and confidence/trust level.",
"acceptanceCriteria":[
"EventProvenance enum with live_lane, test, healthcheck, replay, transport variants",
"Environment/channel label attached to all events",
"Emitter identity field on events",
"Confidence/trust level field for downstream automation",
"Tests verify provenance labeling and filtering"
],
"passes":true,
"priority":"P1"
},
{
"id":"US-015",
"title":"Phase 2 - Session identity completeness at creation time",
"description":"A newly created session should emit stable title, workspace/worktree path, and lane/session purpose at creation time. If any field is not yet known, emit explicit typed placeholder reason rather than bare unknown string.",
"description":"When the same session emits repeated completed/failed/terminal notifications, collapse duplicates before they trigger repeated downstream reactions. Attach canonical terminal-event fingerprint per lane/session outcome.",
"acceptanceCriteria":[
"Canonical terminal-event fingerprint attached per lane/session outcome",
"Suppress/coalesce repeated terminal notifications within reconciliation window",
"Preserve raw event history for audit while exposing one actionable outcome downstream",
"Surface when later duplicate materially differs from original terminal payload",
"Tests verify deduplication and material difference detection"
],
"passes":true,
"priority":"P2"
},
{
"id":"US-017",
"title":"Phase 2 - Lane ownership / scope binding",
"description":"Each session and lane event should declare who owns it and what workflow scope it belongs to. Attach owner/assignee identity, workflow scope (claw-code-dogfood, external-git-maintenance, infra-health, manual-operator), and mark whether watcher is expected to act, observe only, or ignore.",
"acceptanceCriteria":[
"Owner/assignee identity attached to sessions and lane events",
"Workflow scope field (claw-code-dogfood, external-git-maintenance, etc.)",
"Watcher action expectation field (act, observe-only, ignore)",
"Preserve scope through session restarts, resumes, and late terminal events",
"description":"Periodic clawhip nudges should carry nudge id/cycle id and delivery timestamp. Expose whether claw has already acknowledged or responded for that cycle. Distinguish new nudge, retry nudge, and stale duplicate.",
"acceptanceCriteria":[
"Nudge id / cycle id and delivery timestamp attached",
"Acknowledgment state exposed (already acknowledged or not)",
"Distinguish new nudge vs retry nudge vs stale duplicate",
"Allow downstream summaries to bind reported pinpoint back to triggering nudge id",
"Tests verify nudge deduplication and acknowledgment tracking"
],
"passes":true,
"priority":"P2"
},
{
"id":"US-019",
"title":"Phase 2 - Stable roadmap-id assignment for newly filed pinpoints",
"description":"When a claw records a new pinpoint/follow-up, assign or expose a stable tracking id immediately. Expose that id in structured event/report payload and preserve across edits, reorderings, and summary compression.",
"acceptanceCriteria":[
"Canonical roadmap id assigned at filing time",
"Roadmap id exposed in structured event/report payload",
"Same id preserved across edits, reorderings, summary compression",
"Distinguish 'new roadmap filing' from 'update to existing roadmap item'",
"Tests verify stable id assignment and update detection"
],
"passes":true,
"priority":"P2"
},
{
"id":"US-020",
"title":"Phase 2 - Roadmap item lifecycle state contract",
"description":"Each roadmap pinpoint should carry machine-readable lifecycle state (filed, acknowledged, in_progress, blocked, done, superseded). Attach last state-change timestamp and preserve lineage when one pinpoint supersedes or merges into another.",
"acceptanceCriteria":[
"Lifecycle state enum with filed, acknowledged, in_progress, blocked, done, superseded",
"Last state-change timestamp attached",
"New report can declare first filing, status update, or closure",
"Preserve lineage when one pinpoint supersedes or merges into another",
"Tests verify lifecycle state transitions"
],
"passes":true,
"priority":"P2"
},
{
"id":"US-021",
"title":"Request body size pre-flight check for OpenAI-compatible provider",
"description":"Implement pre-flight request body size estimation to prevent 400 Bad Request errors from API gateways with size limits. Based on dogfood findings with kimi-k2.5 testing, DashScope API has a 6MB request body limit that was exceeded by large system prompts.",
"acceptanceCriteria":[
"Pre-flight size estimation before sending requests to OpenAI-compatible providers",
"Clear error message when request exceeds provider-specific size limit",
"Configuration for different provider limits (6MB DashScope, 100MB OpenAI, etc.)",
"Unit tests for size estimation and limit checking",
"Integration with existing error handling for actionable user messages"
],
"passes":true,
"priority":"P1"
},
{
"id":"US-022",
"title":"Enhanced error context for API failures",
"description":"Add structured error context to API failures including request ID tracking across retries, provider-specific error code mapping, and suggested user actions based on error type (e.g., 'Reduce prompt size' for 413, 'Check API key' for 401).",
"acceptanceCriteria":[
"Request ID tracking across retries with full context in error messages",
"Provider-specific error code mapping with actionable suggestions",
"Suggested user actions for common error types (401, 403, 413, 429, 500, 502-504)",
"Unit tests for error context extraction",
"All existing tests pass and clippy is clean"
],
"passes":true,
"priority":"P1"
},
{
"id":"US-023",
"title":"Add automatic routing for kimi models to DashScope",
"description":"Based on dogfood findings with kimi-k2.5 testing, users must manually prefix with dashscope/kimi-k2.5 instead of just using kimi-k2.5. Add automatic routing for kimi/ and kimi- prefixed models to DashScope (similar to qwen models), and add a 'kimi' alias to the model registry.",
"acceptanceCriteria":[
"kimi/ and kimi- prefix routing to DashScope in metadata_for_model()",
"'kimi' alias in MODEL_REGISTRY that resolves to 'kimi-k2.5'",
"resolve_model_alias() handles the kimi alias correctly",
"Unit tests for kimi routing (similar to qwen routing tests)",
"All tests pass and clippy is clean"
],
"passes":true,
"priority":"P1"
},
{
"id":"US-024",
"title":"Add token limit metadata for kimi models",
"description":"The model_token_limit() function has no entries for kimi-k2.5 or kimi-k1.5, causing preflight context window validation to skip these models. Add token limit metadata to enable preflight checks and accurate max token defaults. Per Moonshot AI documentation, kimi-k2.5 supports 256K context window and 16K max output tokens.",
- US-015: Session identity completeness at creation time
- US-016: Duplicate terminal-event suppression
- US-017: Lane ownership / scope binding
- US-018: Nudge acknowledgment / dedupe contract
- US-019: Stable roadmap-id assignment
- US-020: Roadmap item lifecycle state contract
Iteration 8: 2026-04-16
------------------------
US-021 COMPLETED (Request body size pre-flight check - from dogfood findings)
- Files:
- rust/crates/api/src/error.rs (new error variant)
- rust/crates/api/src/providers/openai_compat.rs
- Added RequestBodySizeExceeded error variant with actionable message
- Added max_request_body_bytes to OpenAiCompatConfig:
- DashScope: 6MB (6_291_456 bytes) - from dogfood with kimi-k2.5
- OpenAI: 100MB (104_857_600 bytes)
- xAI: 50MB (52_428_800 bytes)
- Added estimate_request_body_size() for pre-flight checks
- Added check_request_body_size() for validation
- Pre-flight check integrated in send_raw_request()
- Tests: 5 new tests for size estimation and limit checking
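A minimal sketch of how the pre-flight check described above could fit together. It is not the code in `openai_compat.rs`. The `Provider` enum is a hypothetical stand-in for `OpenAiCompatConfig`, the error is modeled as a standalone struct rather than a variant in `error.rs`, and the size estimate is assumed to be the byte length of the already-serialized body. Only the byte limits come from the notes above.

```rust
use std::fmt;

/// Hypothetical stand-in for the per-provider config; the real code is
/// described as storing `max_request_body_bytes` on `OpenAiCompatConfig`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    DashScope,
    OpenAi,
    Xai,
}

impl Provider {
    /// Byte limits as listed in the iteration notes.
    pub fn max_request_body_bytes(self) -> usize {
        match self {
            Provider::DashScope => 6_291_456,
            Provider::OpenAi => 104_857_600,
            Provider::Xai => 52_428_800,
        }
    }
}

/// Simplified model of the `RequestBodySizeExceeded` error (assumed shape).
#[derive(Debug, PartialEq, Eq)]
pub struct RequestBodySizeExceeded {
    pub estimated_bytes: usize,
    pub max_bytes: usize,
}

impl fmt::Display for RequestBodySizeExceeded {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "request body is about {} bytes, over the provider limit of {} bytes; reduce prompt size",
            self.estimated_bytes, self.max_bytes
        )
    }
}

impl std::error::Error for RequestBodySizeExceeded {}

/// Assumption: the estimate is the byte length of the serialized body.
pub fn estimate_request_body_size(serialized_body: &str) -> usize {
    serialized_body.len()
}

/// Returns an error before any network call if the body exceeds the limit.
pub fn check_request_body_size(
    provider: Provider,
    serialized_body: &str,
) -> Result<(), RequestBodySizeExceeded> {
    let estimated_bytes = estimate_request_body_size(serialized_body);
    let max_bytes = provider.max_request_body_bytes();
    if estimated_bytes > max_bytes {
        Err(RequestBodySizeExceeded { estimated_bytes, max_bytes })
    } else {
        Ok(())
    }
}
```

In this sketch a body of exactly the limit passes and one byte more is rejected, so the caller can surface the actionable message without spending a round trip on a request the provider would refuse.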
PROJECT STATUS: COMPLETE (21/21 stories)
Iteration 2026-04-29 - ROADMAP #96 COMPLETED
------------------------------------------------
- Pulled origin/main: already up to date.
- Selected ROADMAP #96 as a small repo-local Immediate Backlog item: the `claw --help` Resume-safe command summary leaked slash-command stubs despite the main Interactive command listing filtering them.
This file provides guidance to Claw Code (clawcode.dev) when working with code in this repository.
## Detected stack
- Languages: Rust.
- Frameworks: none detected from the supported starter markers.
## Verification
- From the repository root, run Rust formatting with `scripts/fmt.sh` (or `scripts/fmt.sh --check` for CI-style checks). From this `rust/` directory, the equivalent command is `../scripts/fmt.sh`. Root-level `cargo fmt --manifest-path rust/Cargo.toml` is not the supported formatting command.
- From this `rust/` directory, run Rust verification with `cargo clippy --workspace --all-targets -- -D warnings` and `cargo test --workspace`.
## Working agreement
- Prefer small, reviewable changes and keep generated bootstrap files aligned with actual repo workflows.
- Keep shared defaults in `.claw.json`; reserve `.claw/settings.local.json` for machine-local overrides.
- Do not overwrite existing `CLAUDE.md` content automatically; update it intentionally when repo workflows change.
// #144: degrade gracefully on config parse failure (same contract
// as #143 for `status`). Text mode prepends a "Config load error"
// block before the MCP list; the list falls back to empty.
match loader.load() {
Ok(runtime_config) => Ok(render_mcp_summary_report(
cwd,
runtime_config.mcp().servers(),
)),
Err(err) => {
let empty = std::collections::BTreeMap::new();
Ok(format!(
"Config load error\n Status fail\n Summary runtime config failed to load; reporting partial MCP view\n Details {err}\n Hint `claw doctor` classifies config parse errors; fix the listed field and rerun\n\n{}",
// #144: same degradation for `mcp show`; if config won't parse,
// the specific server lookup can't succeed, so report the parse
// error with context.
match loader.load() {
Ok(runtime_config) => Ok(render_mcp_server_report(
cwd,
server_name,
runtime_config.mcp().get(server_name),
)),
Err(err) => Ok(format!(
"Config load error\n Status fail\n Summary runtime config failed to load; cannot resolve `{server_name}`\n Details {err}\n Hint `claw doctor` classifies config parse errors; fix the listed field and rerun"
// #80: show the actual workspace-fingerprint directory instead of lying about .claw/sessions/
let fingerprint_dir = sessions_root
.file_name()
.and_then(|f| f.to_str())
.unwrap_or("<unknown>");
format!(
"session not found: {reference}\nHint: managed sessions live in .claw/sessions/. Try `{LATEST_SESSION_REFERENCE}` for the most recent session or `/session list` in the REPL."
"session not found: {reference}\nHint: managed sessions live in .claw/sessions/{fingerprint_dir}/ (workspace-specific partition).\nTry `{LATEST_SESSION_REFERENCE}` for the most recent session or `/session list` in the REPL."
// #80: show the actual workspace-fingerprint directory instead of lying about .claw/sessions/
let fingerprint_dir = sessions_root
.file_name()
.and_then(|f| f.to_str())
.unwrap_or("<unknown>");
format!(
"no managed sessions found in .claw/sessions/\nStart `claw` to create a session, then rerun with `--resume {LATEST_SESSION_REFERENCE}`."
"no managed sessions found in .claw/sessions/{fingerprint_dir}/\nStart `claw` to create a session, then rerun with `--resume {LATEST_SESSION_REFERENCE}`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible."
)
}
@@ -744,6 +762,40 @@ mod tests {
assert_eq!(fp_a1.len(), 16, "fingerprint must be a 16-char hex string");
}
/// #151 regression: equivalent paths (e.g. `/tmp/foo` vs `/private/tmp/foo`
/// on macOS where `/tmp` is a symlink to `/private/tmp`) must resolve to
/// the same session store. Previously they diverged because
/// `workspace_fingerprint()` hashed the raw path string. Now