Commit Graph

396 Commits

Author SHA1 Message Date
bellman
db6f30fa33 verify plugin lifecycle JSON contract
Lock the plugin inventory JSON contract so lifecycle state and lifecycle summary fields stay visible to orchestrators while allowing bundled plugins to coexist in isolated inventories.

Constraint: G007-plugin-mcp Task 1 requires plugin/MCP lifecycle contract evidence without mutating .omx/ultragoal.

Rejected: Assuming an empty plugin inventory in tests | bundled plugins are auto-synced and should not make lifecycle contract verification brittle.

Confidence: high

Scope-risk: narrow

Directive: Keep plugin inventory JSON machine-readable for lifecycle_state, lifecycle, status, and load_failures; do not collapse it back to message-only JSON.

Tested: cargo test -p plugins plugin_registry_report_collects_load_failures_without_dropping_valid_plugins -- --nocapture; cargo test -p commands renders_plugins_report -- --nocapture; cargo test -p rusty-claude-cli --test output_format_contract plugins_json_surfaces_lifecycle_contract_when_plugin_is_installed -- --nocapture; cargo test -p rusty-claude-cli --test output_format_contract inventory_commands_emit_structured_json_when_requested -- --nocapture; cargo check --workspace; cargo fmt --all -- --check

Not-tested: cargo clippy -p rusty-claude-cli --test output_format_contract -- -D warnings is blocked by pre-existing runtime::policy_engine::LaneContext clippy::struct_excessive_bools.

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-15 09:59:02 +09:00
bellman
983ceb939c omx(team): auto-checkpoint worker-1 [1] 2026-05-15 09:57:03 +09:00
bellman
cac73b4410 omx(team): auto-checkpoint worker-3 [4] 2026-05-15 09:57:00 +09:00
bellman
9ae6aa3f30 Keep plugin introspection available when MCP config is malformed
Route plugin command rendering through the same degraded config envelope used by status and MCP, falling back to empty runtime config when config loading fails so local plugin listing remains inspectable.

Constraint: Task 4 requires malformed MCP config consistency across status, doctor, mcp, and plugins surfaces.

Rejected: Hard-failing plugins on ConfigLoader errors | inconsistent with status/mcp degraded-mode contract and hides local plugin diagnostics.

Confidence: high

Scope-risk: narrow

Directive: Keep config_load_error/status fields aligned across local introspection commands when adding new config-dependent surfaces.

Tested: cargo test -p rusty-claude-cli malformed_mcp_config -- --nocapture; cargo test -p commands mcp_degrades_gracefully_on_malformed_mcp_config_144 -- --nocapture; cargo check -p rusty-claude-cli; cargo fmt --all -- --check; claw plugins --output-format json malformed-MCP smoke.

Not-tested: full workspace clippy remains blocked by pre-existing clippy warnings in runtime and rusty-claude-cli unrelated to this change.
2026-05-15 09:56:56 +09:00
bellman
985c6e97f9 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 09:56:50 +09:00
bellman
c522dc970f Preserve plugin lifecycle JSON in G007 CLI output
Constraint: G007 worker integrations made plugin command JSON degraded-aware but omitted the structured plugin/load-failure arrays expected by inventory contracts.\nRejected: Drop lifecycle arrays from tests | G007 requires plugin lifecycle state to stay machine-readable across plugin surfaces.\nConfidence: high\nScope-risk: narrow\nDirective: Keep  carrying plugin entries, lifecycle state, and load failures even when config loading degrades.\nTested: cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo check --manifest-path rust/Cargo.toml -p runtime -p tools -p rusty-claude-cli -p commands -p plugins; cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli --test output_format_contract plugins_json_surfaces_lifecycle_contract_when_plugin_is_installed -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli --test output_format_contract inventory_commands_emit_structured_json_when_requested -- --nocapture; git diff --check\nNot-tested: full workspace suite\n\nCo-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-15 09:56:46 +09:00
bellman
2454f012b6 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 09:52:29 +09:00
bellman
80b8984b62 omx(team): auto-checkpoint worker-4 [5] 2026-05-15 09:49:36 +09:00
bellman
b01192dde7 omx(team): auto-checkpoint worker-3 [4] 2026-05-15 09:49:33 +09:00
bellman
1a6e475f74 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 09:49:22 +09:00
bellman
0cd1eabb5d Keep G007 plugin command integration compiling
Constraint: G007 worker integrations added plugin command surfaces but left the REPL handler referencing a pre-refactor variable.\nRejected: Revert the worker plugin-command surface | the parser/degraded-config changes are part of the G007 scope and only needed a narrow compile repair.\nConfidence: high\nScope-risk: narrow\nDirective: Keep plugin CLI and REPL command paths routed through plugins_command_payload_for so malformed config can degrade consistently.\nTested: cargo check --manifest-path rust/Cargo.toml -p runtime -p tools -p rusty-claude-cli -p commands -p plugins; cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli parse_args_plugins -- --nocapture\nNot-tested: full G007 team suite pending worker completion\n\nCo-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-15 09:49:16 +09:00
bellman
f2ba3648d6 omx(team): auto-checkpoint worker-3 [4] 2026-05-15 09:45:57 +09:00
bellman
0a14f8511e omx(team): auto-checkpoint worker-4 [5] 2026-05-15 09:45:33 +09:00
bellman
f7235ca932 Make G006 task policy state machine executable
Typed task packets, policy decisions, lane board status, and session liveness now have concrete runtime contracts and focused regressions for Stream 4.

Constraint: G006 requires task/lane operation without pane scraping while preserving legacy task packet callers.
Rejected: waiting on stale worker worktrees | all G006 worker worktrees remained at main with no commits, so leader integrated the verified slice directly.
Confidence: high
Scope-risk: moderate
Directive: Keep task packet serde defaults when adding fields so older packets continue to deserialize.
Tested: git diff --check; cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo check --manifest-path rust/Cargo.toml -p runtime -p tools -p rusty-claude-cli; cargo test --manifest-path rust/Cargo.toml -p runtime task_packet -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p runtime policy_engine -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p runtime task_registry -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p runtime session_heartbeat -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p tools run_task_packet_creates_packet_backed_task -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p tools lane_completion -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli status_json_surfaces -- --nocapture
Not-tested: full workspace test suite; PR/issue reconciliation deferred to G011/G012

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-15 09:29:26 +09:00
bellman
8f7eaffcef Close the G005 verification gaps before checkpoint
Constraint: G005 requires stale-base doctor consistency, green-contract policy integration, hung-test evidence, and a durable verification map before ultragoal checkpointing.\nRejected: Treat worker task status alone as complete | worker-2 lifecycle was stale-failed despite landed recovery evidence, so leader verification and explicit map are required.\nConfidence: medium\nScope-risk: moderate\nDirective: Keep PR/issue reconciliation deferred to G011/G012; do not mutate .omx/ultragoal outside checkpoint commands.\nTested: git diff --check; cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo check --manifest-path rust/Cargo.toml -p rusty-claude-cli; cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli workspace_health_warns_when_stale_base_diverged -- --nocapture; cargo check --manifest-path rust/Cargo.toml -p tools\nNot-tested: full workspace test suite due known unrelated permission/lifecycle failures from worker evidence.\n\nCo-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-14 18:38:22 +09:00
bellman
aec291caab omx(team): auto-checkpoint worker-4 [unknown] 2026-05-14 17:51:53 +09:00
bellman
43b182882a Lock doctor JSON boot preflight contract
Constraint: G003 boot/session work adds a structured doctor boot-preflight check that must be visible in JSON output.
Rejected: reducing the doctor check count back to six | boot preflight is an explicit G003 acceptance surface.
Confidence: high
Scope-risk: narrow
Directive: Keep doctor/status JSON contract tests aligned with boot_preflight schema fields when extending preflight diagnostics.
Tested: git diff --check; cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo test --manifest-path rust/Cargo.toml -p runtime trusted_roots -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p runtime startup -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p runtime worker_boot -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p tools path_scope -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p rusty-claude-cli --test output_format_contract -- --nocapture; cargo check --manifest-path rust/Cargo.toml --workspace
Not-tested: full cargo test --workspace remains deferred during active G003 team reconciliation.

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-14 17:51:47 +09:00
bellman
307b23d27f omx(team): auto-checkpoint worker-4 [unknown] 2026-05-14 17:50:36 +09:00
bellman
79d3b809f9 omx(team): auto-checkpoint worker-4 [unknown] 2026-05-14 17:46:16 +09:00
bellman
9ab569e626 omx(team): auto-checkpoint worker-2 [3] 2026-05-14 17:18:55 +09:00
YeonGyu-Kim
75c08bc982 fix: REPL display, /compact panic, identity leak, DeepSeek reasoning, thinking blocks
Five interrelated fixes from parallel Hephaestus sessions:

1. fix(repl): display assistant text after spinner (#2981, #2982, #2937)
   - Added final_assistant_text() call after run_turn spinner completes
   - REPL now shows response text like run_prompt_json does

2. fix(compact): handle Thinking content blocks (#2985)
   - Added ContentBlock::Thinking variant throughout compact summarizer
   - Prevents panic when /compact encounters thinking blocks

3. fix(prompt): provider-aware model identity (#2822)
   - New ModelFamilyIdentity enum (Claude vs Generic)
   - Non-Anthropic models no longer say 'I am Claude'
   - model_family_identity_for() detects provider and sets identity

4. fix(openai): preserve DeepSeek reasoning_content (#2821)
   - Stream parser now captures reasoning_content from OpenAI-compat
   - Emits ThinkingDelta/SignatureDelta events for reasoning models
   - Thinking blocks included in conversation history for re-send

5. feat(runtime): Thinking block support across codebase
   - AssistantEvent::Thinking variant in conversation.rs
   - ContentBlock::Thinking in session serialization
   - Thinking-aware compact summarization
   - Tests for thinking block ordering and content

Closes #2981, #2982, #2937, #2985, #2822, #2821
2026-05-06 15:32:34 +09:00
YeonGyu-Kim
28998422e2 Merge pull request #2984 from andhai/pr/openai-token-limit-hardening
openai: harden token-limit handling and default output-token caps
2026-05-06 14:53:24 +09:00
YeonGyu-Kim
b4733b67a6 Merge pull request #2949 from Yeachan-Heo/fix/export-help-json
Fix export help JSON output
2026-05-06 14:52:09 +09:00
YeonGyu-Kim
d074d1c046 fix(mcp): exit 1 when JSON envelope contains ok:false (#2995)
* fix(mcp): exit 1 when JSON envelope contains ok:false

mcp info, mcp describe, and mcp list-filter all return
{"action":"error","ok":false,...} but previously exited 0,
requiring automation callers to inspect the envelope field.

After this fix: print_mcp detects ok:false in the rendered JSON
value and calls process::exit(1) after printing, so the exit code
reflects the semantic error in the envelope.

Unaffected: mcp list, mcp show, mcp help all have no ok field and
continue to exit 0 (they are not error paths).

Closes ROADMAP #68 (partial — agents bogus/mcp show nonexistent
found:false remain exit:0 as they use different envelope shapes).

* feat(scripts): add dogfood-build.sh — build from checkout and verify provenance

Builds claw from the current HEAD, then checks that the binary's
git_sha matches git rev-parse --short HEAD. Exits non-zero if the
binary is stale or provenance is opaque (git_sha: null).

Usage:
  CLAW=$(bash scripts/dogfood-build.sh)   # fail-fast if stale
  $CLAW version --output-format json       # provenance confirmed

Addresses ROADMAP #69: dogfooders using a stale installed binary
cannot attribute behavior to specific commits. This script makes
dogfood round zero unambiguous.

Also documents the safe workaround for contributors who have a
stale system-installed binary.
2026-05-05 06:09:11 +09:00
YeonGyu-Kim
caeac828b5 fix(permissions): return guidance for multi-word forms instead of falling through to LLM (#2994)
claw permissions list / claw permissions allow <tool> / claw permissions deny <tool>
all fell through to the prompt/LLM path because parse_subcommand had no
arm for "permissions". The single-word bare form was already intercepted
by bare_slash_command_guidance, but any form with rest.len() > 1 bypassed
the single-word guard and landed in the _other => CliAction::Prompt branch.

Fix: add a "permissions" arm in parse_subcommand that returns a structured
guidance Err so all multi-word forms get the same exit:1 + JSON error as
the bare single-word form, without any LLM call or session creation.

Verified: all invocation forms (bare, list, read-only, workspace-write,
allow/deny <tool>) exit 1 with kind:unknown guidance JSON. Zero sessions.
2026-05-05 05:35:50 +09:00
YeonGyu-Kim
85435ad4b5 fix(plugins): route plugin and marketplace aliases through local handler (#2993)
claw plugin list / claw marketplace / claw marketplace list all fell
through to the prompt/LLM path because parse_subcommand only matched
"plugins" (the primary name) while the canonical spec aliases
"plugin" and "marketplace" were unhandled.

This manifested as auth errors and session creation on direct
invocation — dogfood confirmed Gaebal's binary created one session
via plugin prompt fallback.

Fix: extend the plugins arm in parse_subcommand to also match
"plugin" | "marketplace" so all three forms route to the same
CliAction::Plugins without network calls or session creation.

Verified: all six forms (bare + list subcommand for each name) return
kind:plugin JSON, exit 0, and create zero sessions.

Closes ROADMAP #55 partial (plugins/marketplace bypass complete).
2026-05-05 05:16:00 +09:00
YeonGyu-Kim
65aa559733 fix: support /plugins slash command in resume mode (#2973)
* fix: support /plugins slash command in resume mode

Move SlashCommand::Plugins out of the 'unsupported resumed slash
command' catch-all and add a handler arm in run_resume_command that
calls handle_plugins_slash_command for list/help actions.

Mutation actions (install/uninstall/enable/disable) are rejected with
a clear error since there is no runtime to reload in resume mode.

Add /plugins coverage to resumed_inventory_commands test in
output_format_contract.rs: kind, action, reload_runtime, target.

Before: claw --resume session.jsonl /plugins --output-format json
-> {error: 'unsupported resumed slash command', type: 'error'}, exit 1

After: claw --resume session.jsonl /plugins --output-format json
-> {kind: 'plugin', action: 'list', ...}, exit 0

* style: cargo fmt line wrap in run_resume_command plugins handler

* fix: block /plugins update in resume mode, fix comment

Address REQUEST_CHANGES from OMX review:
1. Add 'update' to the blocked mutation actions in resume mode
   (previously only install/uninstall/enable/disable were blocked)
2. Fix comment: 'Only list is supported' instead of 'Only list/help'
   since /plugins help doesn't actually parse as a valid action

* style: cargo fmt after conflict resolution
2026-05-05 04:55:39 +09:00
YeonGyu-Kim
ac8a24b30b fix(config): emit section and section_value in JSON output for config subcommands (#2990)
`claw config model --output-format json` and all other section subcommands
(`env`, `hooks`, `plugins`) returned identical output with no section field
— the section arg was parsed but discarded (_section parameter).

Fix: render_config_json now:
- Passes section through to handler
- Looks up the section value via runtime_config.get(), converting the
  internal JsonValue to serde_json::Value via render()+parse
- Emits `section` (string) and `section_value` (JSON value or null)
  in the response envelope
- Returns ok:false + error for unsupported section tokens

Test: config_section_json_emits_section_and_value asserts:
- No section field when no section arg
- section + section_value fields present for all known sections
- ok:false + error for unknown section

Pinpoint: ROADMAP #126
2026-05-05 04:50:33 +09:00
YeonGyu-Kim
9b97c4d832 fix(tests): isolate CLAW_CONFIG_HOME in resumed_status JSON test (#2992)
resumed_status_command_emits_structured_json_when_requested was reading
the real ~/.claw/settings.json, causing loaded_config_files to be 1
instead of the expected 0 on machines with user config present.

Root cause: unlike other tests (e.g. resumed_config_command_loads_settings_files),
this test did not pass an isolated CLAW_CONFIG_HOME env var to run_claw,
so claw fell back to the real HOME and loaded the developer's settings file.

Fix: create a temp config-home dir and pass it as CLAW_CONFIG_HOME via
run_claw_with_env. This gives the assertion a clean 0-file baseline.

Unblocks PRs #2973, #2988, #2990 which all failed this same test on main.

Ref: ROADMAP #65
2026-05-05 04:49:46 +09:00
YeonGyu-Kim
1206f4131d fix(resume): emit structured JSON for /agents --output-format json (#2987)
Resumed /agents --output-format json was returning a human-readable
text render wrapped in a JSON envelope field instead of the actual
structured agent list. The run_resume_command handler was calling
handle_agents_slash_command (text) for the json field instead of
handle_agents_slash_command_json.

Fix: use handle_agents_slash_command_json for the json outcome field,
matching the pattern already used by /skills and /plugins.

Test: extended resumed_inventory_commands_emit_structured_json_when_requested
to cover /agents, asserting kind=="agents", action=="list",
agents is an array, and count is a number (not a text render).
2026-05-05 04:20:52 +09:00
YeonGyu-Kim
c99330372c fix(version): add build_date and executable_path to version JSON output
`claw version --output-format json` was missing build_date and
executable_path, making it impossible to identify which binary is
running or correlate it with a specific build/commit.

Fix: version_json_value() now includes:
- build_date: compile-time BUILD_DATE env (already in text output)
- executable_path: std::env::current_exe() at runtime

Test: version_emits_json_when_requested extended to assert both fields
are strings in the JSON envelope.

Pinpoint: ROADMAP #507
2026-05-05 04:20:12 +09:00
Andreas Haida
9a512633a5 Cap OpenAI default output tokens using model metadata 2026-05-03 22:16:12 +02:00
Andreas Haida
6ac13ffdad Handle OpenAI token-limit errors as context-window failures 2026-05-03 22:16:12 +02:00
YeonGyu-Kim
8e45f1850c test(output_format_contract): add plugins json coverage to inventory_commands test (#2972)
Add four assertions to inventory_commands_emit_structured_json_when_requested:
- kind == "plugin"
- action == "list"
- reload_runtime is boolean
- target is null when no plugin is targeted

Closes the only major --output-format json surface with zero contract
coverage. All other surfaces (agents, mcp, skills, status, sandbox,
doctor, help, version, acp, bootstrap-plan, system-prompt, init, diff,
config) already had test assertions.
2026-05-01 06:03:31 +09:00
Yeachan-Heo
51b9e6b37f Fix export help JSON output 2026-04-30 09:04:11 +00:00
Sigrid Jin (ง'̀-'́)ง oO
1011a83823 Merge pull request #2829 from ultraworkers/fix/issue-320-session-lifecycle-classification
Fix session lifecycle classification for idle tmux shells
2026-04-29 16:11:58 +09:00
Yeachan-Heo
1376d92064 Filter stub commands from resume-safe help
Keep claw --help's resume-safe slash command summary aligned with the interactive command list by filtering STUB_COMMANDS and adding regression coverage.
2026-04-29 03:31:34 +00:00
Yeachan-Heo
be53e04671 Classify saved sessions by live work rather than pane existence
Operator status previously treated any tmux pane in a workspace as equivalent to active work. The new classifier uses tmux pane command/path metadata as a soft signal, treats plain shells as idle, and adds dirty-worktree abandoned markers to status and session-list output for clawhip consumers.

Constraint: Keep issue #320 prototype minimal and additive without new dependencies

Rejected: Screen-scraping pane output | fragile and broader than needed for lifecycle classification

Confidence: high

Scope-risk: narrow

Tested: cargo test -p rusty-claude-cli

Tested: cargo check -p rusty-claude-cli

Not-tested: cargo clippy -p rusty-claude-cli --all-targets -- -D warnings is blocked by pre-existing commands crate clippy::unnecessary_wraps warnings
2026-04-28 13:12:37 +00:00
Yeachan-Heo
74ea754d29 Restore Rust formatting compliance
Run rustfmt from the Rust workspace so CI format checks pass without changing behavior.

Constraint: Scope is formatting-only across tracked Rust files

Confidence: high

Scope-risk: narrow

Tested: cd rust && cargo fmt --check

Tested: git diff --check
2026-04-28 09:19:16 +00:00
Yeachan-Heo
77afde768c Clarify allowed tool status handling
Reject empty --allowedTools inputs instead of treating them as an empty restriction, and surface status JSON metadata that distinguishes default unrestricted tools from flag-provided allow lists.

Confidence: high
Scope-risk: narrow
Tested: cargo test -p rusty-claude-cli rejects_empty_allowed_tools_flag -- --nocapture
Tested: cargo test -p tools allowed_tools_rejects_empty_token_lists -- --nocapture
Tested: cargo check -p rusty-claude-cli -p tools
Tested: cargo test -p rusty-claude-cli -p tools
Not-tested: full workspace cargo fmt --check is blocked by pre-existing unrelated formatting drift
2026-04-28 05:44:14 +00:00
YeonGyu-Kim
f1e4ad7574 feat: #156 — error classification for text-mode output (Phase 2 of #77)
## Problem

#77 Phase 1 added machine-readable error `kind` discriminants to JSON error
payloads. Text-mode (stderr) errors still emit prose-only output with no
structured classification.

Observability tools (log aggregators, CI error parsers) parsing stderr can't
distinguish error classes without regex-scraping the prose.

## Fix

Added `[error-kind: <class>]` prefix line to all text-mode error output.
The prefix appears before the error prose, making it immediately parseable by
line-based log tools without any substring matching.

**Examples:**

## Impact

- Stderr observers (log aggregators, CI systems) can now parse error class
  from the first line without regex or substring scraping
- Same classifier function used for JSON (#77 P1) and text modes
- Text-mode output remains human-readable (error prose unchanged)
- Prefix format follows syslog/structured-logging conventions

## Tests

All 179 rusty-claude-cli tests pass. Verified on 3 different error classes.

Closes ROADMAP #156.
2026-04-22 00:21:32 +09:00
YeonGyu-Kim
9362900b1b feat: #77 Phase 1 — machine-readable error classification in JSON error payloads
## Problem

All JSON error payloads had the same three-field envelope:
```json
{"type": "error", "error": "<prose with hint baked in>"}
```

Five distinct error classes were indistinguishable at the schema level:
- missing_credentials (no API key)
- missing_worker_state (no state file)
- session_not_found / session_load_failed
- cli_parse (unrecognized args)
- invalid_model_syntax

Downstream claws had to regex-scrape the prose to route failures.

## Fix

1. **Added `classify_error_kind()`** — prefix/keyword classifier that returns a
   snake_case discriminant token for 12 known error classes:
   `missing_credentials`, `missing_manifests`, `missing_worker_state`,
   `session_not_found`, `session_load_failed`, `no_managed_sessions`,
   `cli_parse`, `invalid_model_syntax`, `unsupported_command`,
   `unsupported_resumed_command`, `confirmation_required`, `api_http_error`,
   plus `unknown` fallback.

2. **Added `split_error_hint()`** — splits multi-line error messages into
   (short_reason, optional_hint) so the runbook prose stops being stuffed
   into the `error` field.

3. **Extended JSON envelope** at 4 emit sites:
   - Main error sink (line ~213)
   - Session load failure in resume_session
   - Stub command (unsupported_command)
   - Unknown resumed command (unsupported_resumed_command)

## New JSON shape

```json
{
  "type": "error",
  "error": "short reason (first line)",
  "kind": "missing_credentials",
  "hint": "Hint: export ANTHROPIC_API_KEY..."
}
```

`kind` is always present. `hint` is null when no runbook follows.
`error` now carries only the short reason, not the full multi-line prose.

## Tests

Added 2 new regression tests:
- `classify_error_kind_returns_correct_discriminants` — all 9 known classes + fallback
- `split_error_hint_separates_reason_from_runbook` — with and without hints

All 179 rusty-claude-cli tests pass. Full workspace green.

Closes ROADMAP #77 Phase 1.
2026-04-21 22:38:13 +09:00
YeonGyu-Kim
3cfe6e2b14 feat: #154 — hint provider prefix and env var when model name looks like different provider
## Problem

When a user types `claw --model gpt-4` or `--model qwen-plus`, they get:
```
error: invalid model syntax: 'gpt-4'. Expected provider/model (e.g., anthropic/claude-opus-4-6) or known alias
```

USAGE.md documents that "The error message now includes a hint that names the detected env var" — but this hint does not actually exist. The user has to re-read USAGE.md or guess the correct prefix.

## Fix

Enhance `validate_model_syntax` to detect when a model name looks like it belongs to a different provider:

1. **OpenAI models** (starts with `gpt-` or `gpt_`):
   ```
   Did you mean `openai/gpt-4`? (Requires OPENAI_API_KEY env var)
   ```

2. **Qwen/DashScope models** (starts with `qwen`):
   ```
   Did you mean `qwen/qwen-plus`? (Requires DASHSCOPE_API_KEY env var)
   ```

3. **Grok/xAI models** (starts with `grok`):
   ```
   Did you mean `xai/grok-3`? (Requires XAI_API_KEY env var)
   ```

Unrelated invalid models (e.g., `asdfgh`) do not get a spurious hint.

## Verification

- `claw --model gpt-4` → hints `openai/gpt-4` + `OPENAI_API_KEY`
- `claw --model qwen-plus` → hints `qwen/qwen-plus` + `DASHSCOPE_API_KEY`
- `claw --model grok-3` → hints `xai/grok-3` + `XAI_API_KEY`
- `claw --model asdfgh` → generic error (no hint)

## Tests

Added 3 new assertions in `parses_multiple_diagnostic_subcommands`:
- GPT model error hints openai/ prefix and OPENAI_API_KEY
- Qwen model error hints qwen/ prefix and DASHSCOPE_API_KEY
- Unrelated models don't get a spurious hint

All 177 rusty-claude-cli tests pass.

Closes ROADMAP #154.
2026-04-21 21:40:48 +09:00
YeonGyu-Kim
79352a2d20 feat: #152 — hint --output-format json when user types --json on diagnostic verbs
## Problem

Users commonly type `claw doctor --json`, `claw status --json`, or
`claw system-prompt --json` expecting JSON output. These fail with
`unrecognized argument \`--json\` for subcommand` with no hint that
`--output-format json` is the correct flag.

## Discovery

Filed as #152 during 21:17 dogfood nudge. The #127 worktree contained
a more comprehensive patch but conflicted with #141 (unified --help).
On re-investigation of main, Bugs 1 and 3 from #127 are already closed
(positional arg rejection works, no double "error:" prefix). Only
Bug 2 (the `--json` hint) remained.

## Fix

Two call sites add the hint:

1. `parse_single_word_command_alias`'s diagnostic-verb suffix path:
   when rest[1] == "--json", append "Did you mean \`--output-format json\`?"

2. `parse_system_prompt_options` unknown-option path: same hint when
   the option is exactly `--json`.

## Verification

Before:
  $ claw doctor --json
  error: unrecognized argument `--json` for subcommand `doctor`
  Run `claw --help` for usage.

After:
  $ claw doctor --json
  error: unrecognized argument `--json` for subcommand `doctor`
  Did you mean `--output-format json`?
  Run `claw --help` for usage.

Covers: `doctor --json`, `status --json`, `sandbox --json`,
`system-prompt --json`, and any other diagnostic verb that routes
through `parse_single_word_command_alias`.

Other unrecognized args (`claw doctor garbage`) correctly don't
trigger the hint.

## Tests

- 2 new assertions in `parses_multiple_diagnostic_subcommands`:
  - `claw doctor --json` produces hint
  - `claw doctor garbage` does NOT produce hint
- 177 rusty-claude-cli tests pass
- Workspace tests green

Closes ROADMAP #152.
2026-04-21 21:23:17 +09:00
YeonGyu-Kim
eaa077bf91 fix: #150 — eliminate symlink canonicalization flake in resume_latest test + file #246 (reminder outcome ambiguity)
## #150 Fix: resume_latest test flake

**Problem:** `resume_latest_restores_the_most_recent_managed_session` intermittently
fails when run in the workspace suite or multiple times in sequence, but passes in
isolation.

**Root cause:** `workspace_fingerprint(path)` hashes the path string without
canonicalization. On macOS, `/tmp` is a symlink to `/private/tmp`. The test
creates a temp dir via `std::env::temp_dir().join(...)` which returns
`/var/folders/...` (non-canonical). When the subprocess spawns,
`env::current_dir()` returns the canonical path `/private/var/folders/...`.
The two fingerprints differ, so the subprocess looks in
`.claw/sessions/<hash1>` while files are in `.claw/sessions/<hash2>`.
Session discovery fails.

**Fix:** Call `fs::canonicalize(&project_dir)` after creating the directory
to ensure test and subprocess use identical path representations.

**Verification:** 5 consecutive runs of the full test suite — all pass.
Previously: 5/5 failed when run in sequence.

## #246 Filing: Reminder cron outcome ambiguity (control-loop blocker)

The `clawcode-dogfood-cycle-reminder` cron times out repeatedly with no
structured feedback on whether the nudge was delivered, skipped, or died in-flight.

**Phase 1 outcome schema** — add explicit field to cron result:
- `delivered` — nudge posted to Discord
- `timed_out_before_send` — died before posting
- `timed_out_after_send` — posted but cleanup timed out
- `skipped_due_to_active_cycle` — previous cycle active
- `aborted_gateway_draining` — daemon shutdown

Assigned to gaebal-gajae (cron/orchestration domain). Unblocks trustworthy
dogfood cycle observability.

Closes ROADMAP #150. Filed ROADMAP #246.
2026-04-21 21:01:09 +09:00
YeonGyu-Kim
f84c7c4ed5 feat: #148 + #128 closure — model provenance in claw status JSON/text
## Scope

Two deltas in one commit:

### #128 closure (docs)

Re-verified on main HEAD `4cb8fa0`: malformed `--model` strings already
rejected at parse time (`validate_model_syntax` in parse_args). All
historical repro cases now produce specific errors:

  claw --model ''                       → error: model string cannot be empty
  claw --model 'bad model'              → error: invalid model syntax: 'bad model' contains spaces
  claw --model 'sonet'                  → error: invalid model syntax: 'sonet'. Expected provider/model or known alias
  claw --model '@invalid'               → error: invalid model syntax: '@invalid'. Expected provider/model ...
  claw --model 'totally-not-real-xyz'   → error: invalid model syntax: ...
  claw --model sonnet                   → ok, resolves to claude-sonnet-4-6
  claw --model anthropic/claude-opus-4-6 → ok, passes through

Marked #128 CLOSED in ROADMAP with repro block. Residual provenance gap
split off as #148.

### #148 implementation

**Problem.** After #128 closure, `claw status --output-format json`
still surfaces only the resolved model string. No way for a claw to
distinguish whether `claude-sonnet-4-6` came from `--model sonnet`
(alias resolution) vs `--model claude-sonnet-4-6` (pass-through) vs
`ANTHROPIC_MODEL` env vs `.claw.json` config vs compiled-in default.

Debug forensics had to re-read argv instead of reading a structured
field. Clawhip orchestrators sending `--model` couldn't confirm the
flag was honored vs falling back to default.

**Fix.** Added two fields to status JSON envelope:
- `model_source`: "flag" | "env" | "config" | "default"
- `model_raw`: user's input before alias resolution (null on default)

Text mode appends a `Model source` line under `Model`, showing the
source and raw input (e.g. `Model source     flag (raw: sonnet)`).

**Resolution order** (mirrors resolve_repl_model but with source
attribution):
1. If `--model` / `--model=` flag supplied → source: flag, raw: flag value
2. Else if ANTHROPIC_MODEL set → source: env, raw: env value
3. Else if `.claw.json` model key set → source: config, raw: config value
4. Else → source: default, raw: null

## Changes

### rust/crates/rusty-claude-cli/src/main.rs

- Added `ModelSource` enum (Flag/Env/Config/Default) with `as_str()`.
- Added `ModelProvenance` struct (resolved, raw, source) with
  three constructors: `default_fallback()`, `from_flag(raw)`, and
  `from_env_or_config_or_default(cli_model)`.
- Added `model_flag_raw: Option<String>` field to `CliAction::Status`.
- Parse loop captures raw input in `--model` and `--model=` arms.
- Extended `parse_single_word_command_alias` to thread
  `model_flag_raw: Option<&str>` through.
- Extended `print_status_snapshot` signature to accept
  `model_flag_raw: Option<&str>`. Resolves provenance at dispatch time
  (flag provenance from arg; else probe env/config/default).
- Extended `status_json_value` signature with
  `provenance: Option<&ModelProvenance>`. On Some, adds `model_source`
  and `model_raw` fields; on None (legacy resume paths), omits them
  for backward compat.
- Extended `format_status_report` signature with optional provenance.
  On Some, renders `Model source` line after `Model`.
- Updated all existing callers (REPL /status, resume /status, tests)
  to pass None (legacy paths don't carry flag provenance).
- Added 2 regression assertions in parse_args test covering both
  `--model sonnet` and `--model=...` forms.

### ROADMAP.md

- Marked #128 CLOSED with re-verification block.
- Filed #148 documenting the provenance gap split, fix shape, and
  acceptance criteria.

## Live verification

$ claw --model sonnet --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-sonnet-4-6", "model_source": "flag", "model_raw": "sonnet"}

$ claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-opus-4-6", "model_source": "default", "model_raw": null}

$ ANTHROPIC_MODEL=haiku claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-haiku-4-5-20251213", "model_source": "env", "model_raw": "haiku"}

$ echo '{"model":"claude-opus-4-7"}' > .claw.json && claw --output-format json status | jq '{model,model_source,model_raw}'
{"model": "claude-opus-4-7", "model_source": "config", "model_raw": "claude-opus-4-7"}

$ claw --model sonnet status
Status
  Model            claude-sonnet-4-6
  Model source     flag (raw: sonnet)
  Permission mode  danger-full-access
  ...

## Tests

- rusty-claude-cli bin: 177 tests pass (2 new assertions for #148)
- Full workspace green except pre-existing resume_latest flake (unrelated)

Closes ROADMAP #128, #148.
2026-04-21 20:48:46 +09:00
YeonGyu-Kim
4cb8fa059a feat: #147 — reject empty / whitespace-only prompts at CLI fallthrough
## Problem

The `"prompt"` subcommand arm enforced `if prompt.trim().is_empty()`
and returned a specific error. The fallthrough `other` arm in the same
match block — which routes any unrecognized first positional arg to
`CliAction::Prompt` — had no such guard. Result:

$ claw ""
error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN ...

$ claw "   "
error: missing Anthropic credentials; ...

$ claw "" ""
error: missing Anthropic credentials; ...

$ claw --output-format json ""
{"error":"missing Anthropic credentials; ...","type":"error"}

An empty prompt should never reach the credentials check. Worse: with
valid credentials, the literal empty string gets sent to Claude as a
user prompt, either burning tokens for nothing or triggering a model-
side refusal. Same prompt-misdelivery family as #145.

## Root cause

In `parse_subcommand()`, the final `other =>` arm in the top-level
match only guards against typos (#108 guard via `looks_like_subcommand_typo`)
and then unconditionally builds `CliAction::Prompt { prompt: rest.join(" ") }`.
An empty/whitespace-only join passes through.

## Changes

### rust/crates/rusty-claude-cli/src/main.rs

Added the same `if joined.trim().is_empty()` guard already used in the
`"prompt"` arm to the fallthrough path. Error message distinguishes it
from the `prompt` subcommand path:

  empty prompt: provide a subcommand (run `claw --help`) or a
  non-empty prompt string

Runs AFTER the typo guard (so `claw sttaus` still suggests `status`)
and BEFORE CliAction::Prompt construction (so no network call ever
happens for empty inputs).

### Regression tests

Added 4 assertions in the existing parse_args test:
- parse_args([""]) → Err("empty prompt: ...")
- parse_args(["   "]) → Err("empty prompt: ...")
- parse_args(["", ""]) → Err("empty prompt: ...")
- parse_args(["sttaus"]) → Err("unknown subcommand: ...") [verifies #108 typo guard still takes precedence]

### ROADMAP.md

Added Pinpoint #147 documenting the gap, verification, root cause,
fix shape, and acceptance. Joins the prompt-misdelivery cluster
alongside #145.

## Live verification

$ claw ""
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string

$ claw "   "
error: empty prompt: provide a subcommand (run `claw --help`) or a non-empty prompt string

$ claw --output-format json ""
{"error":"empty prompt: provide a subcommand ...","type":"error"}

$ claw prompt ""   # unchanged: subcommand-specific error preserved
error: prompt subcommand requires a prompt string

$ claw hello        # unchanged: typo guard still fires
error: unknown subcommand: hello.
  Did you mean     help

$ claw "real prompt here"   # unchanged: real prompts still reach API
error: api returned 401 Unauthorized (with dummy key, as expected)

All empty/whitespace-only paths exit 1. No network call. No misleading
credentials error.

## Tests

- rusty-claude-cli bin: 177 tests pass (4 new assertions)
- Full workspace green except pre-existing resume_latest flake (unrelated)

Closes ROADMAP #147.
2026-04-21 20:35:17 +09:00
YeonGyu-Kim
f877acacbf feat: #146 — wire claw config and claw diff as standalone subcommands
## Problem

`claw config` and `claw diff` are pure-local read-only introspection
commands (config merges .claw.json + .claw/settings.json from disk; diff
shells out to `git diff --cached` + `git diff`). Neither needs a session
context, yet both rejected direct CLI invocation:

$ claw config
error: `claw config` is a slash command. Use `claw --resume SESSION.jsonl /config` ...

$ claw diff
error: `claw diff` is a slash command. ...

This forced clawing operators to spin up a full session just to inspect
static disk state, and broke natural pipelines like
`claw config --output-format json | jq`.

## Root cause

Sibling of #145: `SlashCommand::Config { section }` and
`SlashCommand::Diff` had working renderers (`render_config_report`,
`render_config_json`, `render_diff_report`, `render_diff_json_for`)
exposed for resume sessions, but the top-level CLI parser in
`parse_subcommand()` had no arms for them. Zero-arg `config`/`diff`
hit `parse_single_word_command_alias`'s fallback to
`bare_slash_command_guidance`, producing the misleading guidance.

## Changes

### rust/crates/rusty-claude-cli/src/main.rs

- Added `CliAction::Config { section, output_format }` and
  `CliAction::Diff { output_format }` variants.
- Added `"config"` / `"diff"` arms to the top-level parser in
  `parse_subcommand()`. `config` accepts an optional section name
  (env|hooks|model|plugins) matching SlashCommand::Config semantics.
  `diff` takes no positional args. Both reject extra trailing args
  with a clear error.
- Added `"config" | "diff" => None` to
  `parse_single_word_command_alias` so bare invocations fall through
  to the new parser arms instead of the slash-guidance error.
- Added dispatch in run() that calls existing renderers: text mode uses
  `render_config_report` / `render_diff_report`; JSON mode uses
  `render_config_json` / `render_diff_json_for` with
  `serde_json::to_string_pretty`.
- Added 5 regression assertions in parse_args test covering:
  parse_args(["config"]), parse_args(["config", "env"]),
  parse_args(["config", "--output-format", "json"]),
  parse_args(["diff"]), parse_args(["diff", "--output-format", "json"]).

### ROADMAP.md

Added Pinpoint #146 documenting the gap, verification, root cause,
fix shape, and acceptance. Explicitly notes which other slash commands
(`hooks`, `usage`, `context`, etc.) are NOT candidates because they
are session-state-modifying.

## Live verification

$ claw config   # no config files
Config
  Working directory /private/tmp/cd-146-verify
  Loaded files      0
  Merged keys       0
Discovered files
  user    missing ...
  project missing ...
  local   missing ...
Exit 0.

$ claw config --output-format json
{
  "cwd": "...",
  "files": [...],
  ...
}

$ claw diff   # no git
Diff
  Result           no git repository
  Detail           ...
Exit 0.

$ claw diff --output-format json   # inside claw-code
{
  "kind": "diff",
  "result": "changes",
  "staged": "",
  "unstaged": "diff --git ..."
}
Exit 0.

## Tests

- rusty-claude-cli bin: 177 tests pass (5 new assertions in parse_args)
- Full workspace green except pre-existing resume_latest flake (unrelated)

## Not changed

`hooks`, `usage`, `context`, `tasks`, `theme`, `voice`, `rename`,
`copy`, `color`, `effort`, `branch`, `rewind`, `ide`, `tag`,
`output-style`, `add-dir` — all session-mutating or interactive-only;
correctly remain slash-only.

Closes ROADMAP #146.
2026-04-21 20:07:28 +09:00
YeonGyu-Kim
7d63699f9f feat: #145 — wire claw plugins subcommand to CLI parser (prompt misdelivery fix)
## Problem

`claw plugins` (and `claw plugins list`, `claw plugins --help`,
`claw plugins info <name>`, etc.) fell through the top-level subcommand
match and got routed into the prompt-execution path. Result: a purely
local introspection command triggered an Anthropic API call and surfaced
`missing Anthropic credentials` to the user. With valid credentials, it
would actually send the literal string "plugins" as a user prompt to
Claude, burning tokens for a local query.

$ claw plugins
error: missing Anthropic credentials; export ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY before calling the Anthropic API

$ ANTHROPIC_API_KEY=dummy claw plugins
⠋ 🦀 Thinking...
✘  Request failed
error: api returned 401 Unauthorized

Meanwhile siblings (`agents`, `mcp`, `skills`) all worked correctly:

$ claw agents
No agents found.
$ claw mcp
MCP
  Working directory ...
  Configured servers 0

## Root cause

`CliAction::Plugins` exists, has a working dispatcher
(`LiveCli::print_plugins`), and is produced inside the REPL via
`SlashCommand::Plugins`. But the top-level CLI parser in
`parse_subcommand()` had arms for `agents`, `mcp`, `skills`, `status`,
`doctor`, `init`, `export`, `prompt`, etc., and **no arm for
`plugins`**. The dispatch never ran from the CLI entry point.

## Changes

### rust/crates/rusty-claude-cli/src/main.rs

Added a `"plugins"` arm to the top-level match in `parse_subcommand()`
that produces `CliAction::Plugins { action, target, output_format }`,
following the same positional convention as `mcp` (`action` = first
positional, `target` = second). Rejects >2 positional args with a clear
error.

Added four regression assertions in the existing `parse_args` test:
- `plugins` alone → `CliAction::Plugins { action: None, target: None }`
- `plugins list` → action: Some("list"), target: None
- `plugins enable <name>` → action: Some("enable"), target: Some(...)
- `plugins --output-format json` → action: None, output_format: Json

### ROADMAP.md

Added Pinpoint #145 documenting the gap, verification, root cause,
fix shape, and acceptance.

## Live verification

$ claw plugins   # no credentials set
Plugins
  example-bundled      v0.1.0      disabled
  sample-hooks         v0.1.0      disabled

$ claw plugins --output-format json   # no credentials set
{
  "action": "list",
  "kind": "plugin",
  "message": "Plugins\n  example-bundled ...\n  sample-hooks ...",
  "reload_runtime": false,
  "target": null
}

Exit 0 in all modes. No network call. No "missing credentials" error.

## Tests

- rusty-claude-cli bin: 177 tests pass (new plugin assertions included)
- Full workspace green except pre-existing resume_latest flake (unrelated)

Closes ROADMAP #145.
2026-04-21 19:36:49 +09:00
YeonGyu-Kim
e2a43fcd49 feat: #143 phase 1 — claw status degrades gracefully on malformed config
Previously `claw status` hard-failed on any config parse error, emitting
a bare error string and exiting 1. This took down the entire health
surface for a single malformed MCP entry, even though workspace, git,
model, permission, and sandbox state could all be reported independently.

`claw doctor` already degraded gracefully on the exact same input.
This commit matches `claw status` to that contract.

Changes:
- Add `StatusContext::config_load_error: Option<String>` to capture parse
  errors without aborting.
- Rewrite `status_context()` to match on `ConfigLoader::load()`: on Err,
  fall back to default `SandboxConfig` for sandbox resolution and record
  the parse error, then continue populating workspace/git/memory fields.
- JSON output gains top-level `status: "ok" | "degraded"` marker and a
  `config_load_error` string (null on clean runs). All other existing
  fields preserved for backward compat.
- Text output prepends a "Config load error" block with Details + Hint
  when config failed to parse, then a "Status (degraded)" header on the
  main block. Clean runs show the usual "Status" header.
- Doctor path updated to pass the config load error through StatusContext.

Regression test `status_degrades_gracefully_on_malformed_mcp_config_143`:
- Injects a .claw.json with one valid + one malformed mcpServers entry
- Asserts status_context() returns Ok (not Err)
- Asserts config_load_error names the malformed field path
- Asserts workspace/sandbox fields still populated in JSON
- Asserts top-level status is 'degraded'
- Asserts clean config path still returns status: 'ok'

Verified live on /Users/yeongyu/clawd (contains deliberately broken MCP entries):
  $ claw status --output-format json
  { "status": "degraded",
    "config_load_error": ".../mcpServers.missing-command: missing string field command",
    "model": "claude-opus-4-6",
    "workspace": {...},
    "sandbox": {...},
    ... }

Phase 2 (typed error object joining #4.44 taxonomy) tracked separately.

Full workspace test green except pre-existing resume_latest flake (unrelated).

Closes ROADMAP #143 phase 1.
2026-04-21 18:37:42 +09:00