docs(roadmap): add #681 — unsupported mcp mutation verbs return help with exit 0

2026-05-25 15:06:44 +00:00 · 2026-05-24 20:01:23 +00:00
1 changed files with 27 additions and 23 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -6429,38 +6429,42 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)

 450. **`prompt` emits `kind:"missing_credentials"` JSON on STDERR (not stdout), leaving stdout at 0 bytes — automation pattern `output=$(claw prompt hello --output-format json)` captures nothing on auth-absent failure; `doctor` correctly surfaces `auth.status:"warn"` with `api_key_present:false` but exposes no `prompt_ready:false` field that automation can check before invoking `prompt`** — dogfooded 2026-05-16 by Jobdori on `a35ee9a0` in response to Clawhip pinpoint nudge at `1505208225321062521`. Exact reproduction (isolated env, no creds, fresh git repo, HEAD `a35ee9a0`): `timeout 5 env -i HOME=$ISOLATED_HOME PATH=$PATH CLAW_CONFIG_HOME=$PROBE/.claw-cfg claw prompt hello --output-format json > stdout.txt 2> stderr.txt` → stdout = **0 bytes**, stderr = 195 bytes containing `{"error":"missing Anthropic credentials…","exit_code":1,"hint":null,"kind":"missing_credentials","type":"error"}`, exit code 1. Confirms Gaebal's `1505208553793781792` pinpoint that `prompt` timeout + zero bytes was the prior state — HEAD `a35ee9a0` now correctly exits 1 with `kind:"missing_credentials"` **but the envelope is still routed to stderr** (issue #447 class, same class as prior entries #422, #435). **Contrast with `doctor`:** `claw doctor --output-format json 2>/dev/null` succeeds to stdout with `checks[auth].status:"warn"`, `api_key_present:false`, `auth_token_present:false` — but the auth check has no `prompt_ready:false` field. Automation that gates on `doctor` before invoking `prompt` must re-derive readiness from `api_key_present && auth_token_present` — there is no single canonical boolean. **Three compound problems:** (a) **stdout-empty on `--output-format json` failure**: same class as #447; `prompt`'s error envelope goes to stderr, not stdout. The canonical automation idiom `if ! result=$(claw prompt "q" --output-format json); then echo "$result" | jq .kind; fi` sees `$result=""` on failure — the jq call gets nothing. All `--output-format json` error paths must route JSON to stdout per #447 contract; (b) **`doctor` missing `prompt_ready` field**: `doctor --output-format json` already knows auth is absent (`api_key_present:false`) but surfaces no derived `prompt_ready:bool` or `prompt_blocked_reason:string` field. Automation must infer readiness from `api_key_present || auth_token_present || legacy_*_present` — a 5-field OR across legacy fields that is fragile as auth mechanisms evolve. A single `prompt_ready:false` (with `prompt_blocked_reason:"auth_missing"`) inside the `auth` check would give downstream a stable contract; (c) **`claw prompt` with no auth does no preflight and fires straight at the API**: the preflight check that `doctor` runs (auth discovery) is not reused by `prompt` to emit a fast typed error before attempting the network call. Both Gaebal's pinpoint (prompt hanging silently on older HEAD) and the current behavior (prompt hitting auth gate after a brief API attempt) stem from the same root: prompt does not short-circuit at the point where `doctor` already knows auth is absent. If `doctor` can emit `kind:"doctor"` with `auth.status:"warn"` in ~20ms without a network call, `prompt` should emit `kind:"missing_credentials"` in the same window and output it to stdout. **Required fix shape:** (a) `prompt --output-format json` must write the `kind:"missing_credentials"` JSON envelope to **stdout**, not stderr — same fix as #447 for all error envelopes; (b) add `prompt_ready:bool` and `prompt_blocked_reason:string|null` to the `auth` check in `doctor --output-format json`; derive it as `api_key_present || auth_token_present || legacy_saved_oauth_present`; (c) `prompt` must run the credential preflight check (same codepath as doctor's auth check) before attempting any API call and emit `{"kind":"missing_credentials","prompt_blocked_reason":"auth_missing"}` on **stdout** with exit 1 if the check fails; (d) `--output-format json` stdout routing fix must cover: `prompt`, `session list` (cross-ref #449), `skills uninstall` (cross-ref #431), `resume` (cross-ref #435), `acp serve` (cross-ref #443) — the full `kind:"missing_credentials"` class; (e) regression test: `claw prompt hello --output-format json` with no creds writes JSON to stdout (0 bytes stderr), exits 1, `kind:"missing_credentials"`, in under 200ms (no network attempt). **Why this matters:** `prompt` is the primary consumer entry point. Auth-absent failure routing to stderr breaks every automation wrapper that captures `$(claw prompt ... --output-format json)`. The `doctor` preflight metadata gap means auth-readiness checks require parsing 5 legacy fields instead of reading one boolean. Cross-references #447 (all JSON error envelopes on stderr), #449 (session list hits auth gate), #431 (skills uninstall hits auth gate), #357 (auth gate on local ops cluster), #422 (exit-code parity). Source: Jobdori live dogfood, `a35ee9a0`, 2026-05-16.

-470. **`--reasoning-effort` is accepted by local diagnostic subcommands (`status`, `doctor`, `version`, etc.) even though it is a prompt/API-request knob, then disappears from every diagnostic JSON surface; valid values (`low|medium|high`) parse and exit 0 on `status` / `doctor` with no `reasoning_effort` field, no `ignored_flags` warning, and no indication that the flag will only affect future prompt turns. Invalid values have their own strict parser (`LOW`, `Low`, `low `, ` low`, empty, `extreme`, `1`) but that error sentinel is another `classify_error_kind` orphan (`kind:"unknown"`, `hint:null`), repeating the #463/#464 pattern on a third flag-value parser** — dogfooded 2026-05-24 for the 19:30 Clawhip nudge at message `1508190377130197222`, reproduced on local `./rust/target/debug/claw` `git_sha 003b739d` (origin/main `f8e1bb72`) in a clean isolated env.
+681. **Unsupported `claw mcp` mutation verbs (`add`, `remove`, `delete`, `enable`, `disable`) return a help payload with `exit=0` instead of a typed unsupported/not-implemented error: `claw mcp add demo -- /bin/echo hi --output-format json` emits `{kind:"mcp", action:"help", unexpected:"add demo -- /bin/echo hi", usage:{...}}` on stdout, no stderr, and no file changes. The command clearly did not add anything, but shell/CI sees success; `unexpected` is the only clue, and it is embedded in a help object rather than a `status:"unsupported"` / `kind:"error"` envelope** — dogfooded 2026-05-24 for the 20:00 Clawhip nudge at message `1508197932497899740`, reproduced on local `./rust/target/debug/claw` `git_sha 003b739d` (origin/main `f8e1bb72`) in a clean isolated env. Number intentionally jumps to #681 because Jobdori publicly filed/announced ROADMAP #680 in this channel; avoiding distributed queue collision.

    Reproduction:

    ```bash
-    $ claw --output-format json --reasoning-effort high status | jq 'keys'
-    ["allowed_tools","config_load_error","kind","model","model_raw","model_source","permission_mode","sandbox","status","usage","workspace"]
-    # no reasoning_effort, no ignored_flags, no active_request_options
-
-    $ claw --output-format json --reasoning-effort high doctor | jq '.checks[].name'
-    "auth" "config" "install_source" "workspace" "sandbox" "system"
-    # no reasoning/model/request-options check carrying reasoning_effort
-
-    $ claw --output-format json --reasoning-effort LOW status
-    {"error":"invalid value for --reasoning-effort: 'LOW'; must be low, medium, or high","hint":null,"kind":"unknown","type":"error"}
-
-    $ claw --output-format json --reasoning-effort 'low ' status
-    {"error":"invalid value for --reasoning-effort: 'low '; must be low, medium, or high","hint":null,"kind":"unknown","type":"error"}
+    $ env -i HOME=/tmp/iso29/home PATH=/usr/bin:/bin TERM=dumb \
+        claw mcp add demo -- /bin/echo hi --output-format json
+    {
+      "action": "help",
+      "kind": "mcp",
+      "unexpected": "add demo -- /bin/echo hi",
+      "usage": {
+        "direct_cli": "claw mcp [list|show <server>|help]",
+        "slash_command": "/mcp [list|show <server>|help]",
+        "sources": [".claw/settings.json", ".claw/settings.local.json"]
+      }
+    }
+    # exit 0, stderr empty, no .claw/settings.json written
    ```

-    Valid matrix: `low`, `medium`, `high` all exit 0 on `status` and `doctor` but are invisible. Invalid matrix: `LOW`, `Low`, `low `, ` low`, empty string, `extreme`, `1` all exit 1 with `kind:"unknown"` and `hint:null`.
+    Same shape for:

-    **Root cause traced:** `rust/crates/rusty-claude-cli/src/main.rs:712-731` validates `--reasoning-effort` as a global flag and stores `reasoning_effort = Some(value)`, but the local diagnostic renderers never receive or serialize it. `status_json_value()` around `main.rs:5688-5714` exposes model, permission mode, allowed tools, workspace, sandbox, config load error, and usage — not request options. `doctor` builds fixed checks (`auth`, `config`, `install_source`, `workspace`, `sandbox`, `system`) and has no request-options check. The invalid-value error string is:
-
-    ```rust
-    "invalid value for --reasoning-effort: '{value}'; must be low, medium, or high"
+    ```bash
+    claw mcp add demo /bin/echo hi --output-format json
+    claw mcp remove demo --output-format json
+    claw mcp delete demo --output-format json
+    claw mcp disable demo --output-format json
+    claw mcp enable demo --output-format json
    ```

-    but `classify_error_kind()` has no branch for it, so every invalid reasoning-effort parse error becomes `kind:"unknown"`.
+    All return `exit=0`, `kind:"mcp"`, `action:"help"`, `unexpected:"<verb ...>"` and no stderr. The supported action list in `mcp help --output-format json` is only `list|show <server>|help`, so these mutation verbs are unsupported by contract, but the failure mode is success/help.

-    **Why distinct from existing items:** ROADMAP #98 covers `--compact` silently ignored outside prompt/text output. This entry covers `--reasoning-effort`, a different prompt/API knob with different parser and provider semantics. ROADMAP #464 covers invalid `--output-format` enum parsing; this entry covers `--reasoning-effort` enum parsing plus valid-value invisibility. ROADMAP #468 covers duplicate global flags; this entry covers single valid flag accepted by diagnostic subcommands with no visible effect. ROADMAP #34/#35 cover OpenAI-compat reasoning-effort transport support; this entry is about CLI diagnostic surfaces accepting the knob without exposing whether it is active. No existing entry documents `--reasoning-effort high status` being a successful no-op/invisible state.
+    **Root cause shape:** the MCP command parser treats unknown sub-actions as a help-topic request and preserves the raw tail in `unexpected`, but does not convert that branch into a command failure. The help JSON schema has no `status`, `ok`, `error_kind`, `exit_ok`, or `unsupported_action` field, so automation must special-case `action:"help" && unexpected != null` to detect that a requested operation did not run.

-    **Why this matters:** (1) **Prompt/API knobs on diagnostic commands are misleading.** A launcher may run `claw --reasoning-effort high status` to verify a lane is configured for high reasoning; status says ok but never confirms the knob. (2) **Automation cannot audit request options.** `status` is the lightweight preflight surface; if reasoning effort affects cost/latency/model behavior, claws need to know its active value before prompt execution. (3) **Same enum-parser quality gap repeats.** `LOW` and trailing whitespace are common wrapper/env-file outputs; the parser rejects them with `kind:"unknown"` instead of `kind:"invalid_reasoning_effort"`, no suggestion, no trim/case handling. (4) **Silent successful no-op and loud unknown invalid are both bad.** Valid values disappear; invalid values are misclassified. Together they make the knob hard to reason about programmatically. (5) **Regression tests are too shallow.** Existing tests around `rejects_invalid_reasoning_effort_value` assert the substring exists, and `accepts_valid_reasoning_effort_values` asserts parse state only. They do not assert JSON envelope kind/hint, diagnostic-surface visibility, or prompt-vs-diagnostic applicability.
+    **Why distinct from existing items:** ROADMAP #327 covers `mcp help` source-list mismatch (`.claw.json` omitted while accepted). This entry covers unsupported mutation verbs returning success. ROADMAP #347 covers `mcp show <missing>` returning `status:"ok"` with `found:false`; this entry is broader/different: unsupported *verbs* (`add/remove/enable/disable/delete`) return help with exit 0 and no command-status field. ROADMAP #102/#129 cover MCP server liveness/runtime startup; this is control-plane mutation command semantics. ROADMAP #78 covered `plugins` route falling through to prompt; this is the `mcp` route resolving locally but reporting failed mutation as successful help. No existing entry found for `mcp add/remove` write-target or unsupported mutation verbs.

-    **Required fix shape:** (a) Decide contract: either reject prompt/API-only knobs (`--reasoning-effort`, maybe future `--max-output-tokens`) on local diagnostic subcommands with a structured `unsupported_flag_for_subcommand` error, or expose them in `status`/`doctor` as `request_options.reasoning_effort` / `active_request_options`. (b) If accepted, add `reasoning_effort` to `status --output-format json` and a `request_options` doctor check so preflight can verify it. (c) Register `invalid_reasoning_effort` in `classify_error_kind` and split a real `hint` field (valid values: `low`, `medium`, `high`). (d) Normalize or suggest common variants: trim whitespace, lower-case `LOW`/`Low`, or produce `Did you mean: low?`. (e) Add regression coverage for valid visibility on `status`/`doctor`, invalid-value JSON kind/hint, and prompt-path preservation. **Acceptance check:** `claw --output-format json --reasoning-effort high status | jq -e '.request_options.reasoning_effort == "high" or .ignored_flags[]?.flag == "--reasoning-effort"'` should pass; current output has neither. `claw --output-format json --reasoning-effort LOW status 2>&1 | jq -e '.kind == "invalid_reasoning_effort"'` should pass; current kind is `unknown`. Source: gaebal-gajae dogfood for the 2026-05-24 19:30 Clawhip nudge. Number intentionally skips #469 because Jobdori publicly reported a local #469 (`/compact` slash divergence) not yet pushed; this avoids collision.
+    **Why this matters:** (1) **Automation false success.** A setup script can run `claw mcp add demo ...` and proceed because exit code is 0, even though no server was added. (2) **Mutation verbs are common user expectation.** MCP CLIs often support `add`/`remove`; users will try them. If not implemented, fail closed with a clear typed error. (3) **No write-target clarity.** Since help claims sources `.claw/settings.json` / `.claw/settings.local.json`, a user may reasonably expect `add` to write one. Returning success/help leaves them guessing. (4) **Machine contract ambiguity.** `unexpected` inside help is not an error field; claws should not have to infer failure from a non-null optional help attribute. (5) **Compounds #327.** The help object shown in the failure path also repeats the stale source-list problem, so the unsupported-action error points at incomplete config-source docs.
+
+    **Required fix shape:** (a) For unsupported MCP sub-actions, return a typed JSON error such as `{type:"error", kind:"unsupported_mcp_action", action:"add", supported_actions:["list","show","help"], hint:"MCP mutation commands are not implemented; edit .claw/settings.json manually or use ..."}` and exit non-zero. (b) Preserve a human help fallback only for explicit `mcp help` / `mcp --help`, not for attempted mutations. (c) If mutation verbs are intended roadmap features, add `not_implemented` status with non-zero exit and no file writes. (d) Add tests for `add/remove/enable/disable/delete` proving they are distinguishable from successful help and successful list/show. (e) When mutation support lands, include explicit write-target/source-layer semantics (`project`, `local`, `user`) instead of guessing. **Acceptance check:** `claw mcp add demo -- /bin/echo hi --output-format json >/tmp/out 2>/tmp/err; test $? -ne 0 && jq -e '.kind == "unsupported_mcp_action" and .action == "add"' /tmp/err` should pass; currently exit is 0 and stdout is a help object. Source: gaebal-gajae dogfood for the 2026-05-24 20:00 Clawhip nudge. Coordination note: avoided Jobdori-claimed #680/session-sort and F/CLAW_CONFIG_HOME; targeted MCP mutation semantics after pre-grep showed common MCP lifecycle gaps already covered.