mirror of
https://github.com/instructkr/claude-code.git
synced 2026-05-13 17:36:44 +00:00
Compare commits
1 Commits
docs/roadm
...
docs/roadm
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
3b5bafc847 |
30
ROADMAP.md
30
ROADMAP.md
@@ -6262,32 +6262,4 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
|
||||
|
||||
323. **`status --output-format json` reports `session.session = "live-repl"` while simultaneously reporting `session_lifecycle.kind = "saved_only"` — contradictory session identity in a single status snapshot** — dogfooded 2026-04-29 by Jobdori on current main (`804d96b`). Running `claw status --output-format json` from an active REPL-style invocation produced `"session": "live-repl"` in the `workspace` block and `"session_lifecycle": {"kind": "saved_only", "pane_id": null, ...}` in the same object. Those two fields carry contradictory claims: `"live-repl"` asserts there is an active interactive session, while `"saved_only"` asserts there is no live tmux pane hosting the session — the session exists only as a saved artifact. A downstream claw reading this snapshot cannot tell which claim to trust: is this a running session whose pane is undetectable, or a saved-only session that the `session` field is misclassifying? Root cause: `"live-repl"` is a fallback sentinel emitted by `main.rs:6070` when `context.session_path` is `None`, while `session_lifecycle` is computed independently by `classify_session_lifecycle_for()` from tmux pane discovery; the two fields share no common source and can diverge. **Required fix shape:** (a) derive both `session.session` and `session_lifecycle.kind` from the same lifecycle classification result so they cannot diverge; (b) replace the `"live-repl"` free-form sentinel with a structured `session_kind` field (`live_repl`, `saved`, `resume`, etc.) that carries the same type vocabulary as `session_lifecycle.kind`; (c) when `session_lifecycle.kind = "saved_only"`, never emit `"session": "live-repl"` (or vice versa); (d) add a regression test proving `status --output-format json` never emits `session.kind = "live_repl"` and `session_lifecycle.kind = "saved_only"` simultaneously. **Why this matters:** `status --output-format json` is the machine-readable truth surface for session state; if two fields in the same snapshot contradict each other, every lane, monitor, and orchestrator has to pick a winner instead of reading a coherent state. Source: Jobdori live dogfood on mengmotaHost, claw-code `804d96b`, 2026-04-29.
|
||||
|
||||
324. **Stale local debug binaries can impersonate the current workspace because version/status/doctor do not compare embedded build provenance to repo HEAD** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `e7074f47` after PR #2838. The working tree was at `e7074f47`, but running `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `1f901988`. `status` and `doctor` remained green and exposed no warning that the executable under test was stale relative to the workspace HEAD, nor any structured build-provenance freshness signal that downstream claws could use to decide whether the observed behavior came from the checked-out code or an older debug artifact. This is a repo-identity opacity gap: the JSON truth surfaces can look authoritative while actually describing a different binary lineage than the source tree being dogfooded. **Required fix shape:** (a) compare the embedded build `git_sha` / build date with the current workspace git HEAD and dirty state when the binary can discover a containing worktree; (b) expose redaction-safe structured fields in `version --output-format json`, `status --output-format json`, and `doctor --output-format json`, including `binary_provenance`, `workspace_head`, and `stale_binary` (with enough reason/detail to distinguish clean match, dirty workspace, unknown workspace, and definite stale SHA mismatch); (c) warn in human/text mode when executing a stale local debug binary such as `./rust/target/debug/claw` so dogfooders do not trust old behavior as current-main evidence; (d) avoid leaking secrets or absolute sensitive paths beyond the existing workspace-identification policy; (e) add regression/fixture coverage for matching HEAD, dirty workspace, no-worktree/unknown provenance, and stale embedded SHA cases. **Why this matters:** status/doctor/version are supposed to be the machine-readable basis for dogfood truth. If a stale binary can report a different `git_sha` than the checked-out repo without any freshness warning, claws can file or verify bugs against the wrong code and waste cycles chasing already-fixed or not-yet-built behavior. Source: gaebal-gajae dogfood follow-up from current main `e7074f47` after PR #2838; observed `./rust/target/debug/claw version --output-format json` reporting `git_sha` `1f901988` with no stale-binary-vs-workspace-HEAD warning.
|
||||
|
||||
325. **`help --output-format json` returns valid JSON but hides the actual help schema inside one prose `message` string** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `d607ff36`. Running `./rust/target/debug/claw help --output-format json` produces parseable JSON, but the object only exposes top-level keys like `kind` and `message`; all command names, global flags, slash-command metadata, aliases, resume-safety, output-format support, auth/preflight notes, and descriptions are flattened into one human-oriented prose blob. That technically satisfies “valid JSON” while still forcing automation to scrape the same help text humans read, making `/issue`, `/help`, and resume-safety contracts opaque to claws. **Required fix shape:** (a) keep `message` as the compact human-rendered help summary, but add a documented structured schema with `schema` / `schema_version` fields; (b) expose first-class arrays/objects such as `commands[]`, `options[]`, and `slash_commands[]` with stable fields including `name`, `aliases`, `description`, `args`, `output_formats_supported`, `resume_safe`, `interactive_only`, and `creates_external_side_effects`; (c) include auth and creation preflight metadata where relevant, especially for GitHub/issue flows (`auth_preflight`, `creation_unavailable`, `gh_cli_authenticated`, `github_token_present`, or equivalent non-secret state); (d) make `/issue`, `/help`, aliases, and resume-dispatch safety machine-readable from the JSON payload instead of recoverable only by parsing prose markers; (e) add regression coverage proving `help --output-format json` is valid JSON and that `/issue`, `/help`, resume-safe vs interactive-only slash commands, aliases, descriptions, supported output formats, and side-effect/auth-preflight fields are present and internally consistent. **Why this matters:** help JSON is the discoverability surface automation uses before invoking commands. If it is just prose wrapped in JSON, claws cannot safely decide whether a command can run non-interactively, resume from a saved session, create external GitHub side effects, or requires auth/preflight without brittle text scraping. Source: gaebal-gajae dogfood follow-up from current main `d607ff36`; observed `./rust/target/debug/claw help --output-format json` returning valid JSON with only `{kind,message}` at the top level while the actionable command schema remained buried in `message`.
|
||||
|
||||
326. **`status --output-format json` underreports active workspace pane inventory when one tmux session has multiple panes/processes in the same project** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `b90875fa` while responding to the claw-code dogfood nudge. The active OMX session `claw-code-issue-326-dogfood-pinpoint` was running in `/mnt/offloading/Workspace/claw-code` with two panes: `%9384` (`cmd=node`, active pane) and `%9385` (`cmd=node`, inactive sidecar pane). `tmux list-panes -a -F '#{session_name}:#{window_index}.#{pane_index} #{pane_id} pid=#{pane_pid} cmd=#{pane_current_command} cwd=#{pane_current_path} active=#{pane_active}'` showed both panes in the same session/workspace, but `./rust/target/debug/claw status --output-format json` collapsed the workspace lifecycle to a single object: `session_lifecycle.kind = "running_process"`, `pane_id = "%9384"`, `pane_command = "node"`, with no `panes[]`, process count, sidecar/secondary-pane inventory, or ambiguity marker. A downstream claw reading only status JSON would believe there is exactly one live process for that workspace even though the control plane has multiple panes in the same task session. **Required fix shape:** (a) expose a structured active-session inventory in `status --output-format json`, including `panes[]` or `processes[]` with pane id, command, cwd, active flag, and session/window identity for all matching workspace panes; (b) keep the compact `session_lifecycle` summary, but add an explicit `pane_count` / `has_sidecar_panes` / `inventory_truncated` signal so summaries cannot masquerade as complete truth; (c) define how to classify primary vs sidecar/inactive panes without losing them, and make the chosen primary pane provenance visible; (d) add regression coverage for a tmux session with two panes in one workspace proving status JSON reports both panes or marks the inventory as partial. **Why this matters:** status JSON is the machine-readable lane truth surface. If it reports only the primary pane while hiding secondary panes, clawhip and other claws can miss sidecar workers, blocked helpers, stale subprocesses, or duplicated control-plane processes and make bad restart/cleanup/routing decisions from an undercounted session snapshot. Source: gaebal-gajae dogfood session `claw-code-issue-326-dogfood-pinpoint`; observed `claw status --output-format json` returning only `%9384` while `tmux list-panes` showed `%9384` and `%9385` in the same claw-code workspace.
|
||||
|
||||
327. **`claw mcp help` omits `.claw.json` from its documented config sources even though `claw mcp` still loads MCP servers from `.claw.json`** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `981aff7c` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json` so `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `981aff7c` matching the workspace. Running `./rust/target/debug/claw mcp --help` printed `Sources .claw/settings.json, .claw/settings.local.json`, and `./rust/target/debug/claw mcp help --output-format json` returned `"sources": [".claw/settings.json", ".claw/settings.local.json"]`. In the same rebuilt binary, a temp workspace containing only a project `.claw.json` with `{"mcpServers":{"demo":{"command":"/bin/echo","args":["hi"]}}}` made `./rust/target/debug/claw mcp --output-format json` report `configured_servers: 1` and `servers[0].name: "demo"`. The MCP lifecycle surface therefore tells users and claws that `.claw.json` is not a source while actively accepting it as one. This is distinct from #322's JSON warning corruption, #323/#326's status lifecycle contradictions, #324's stale-binary provenance gap, and #325's top-level help schema flattening: the pinpoint is a concrete MCP subcommand source-of-truth mismatch in both text and JSON help. **Required fix shape:** (a) derive the `mcp help` source list from the same `ConfigLoader::discover`/settings-source registry that `mcp list` actually uses instead of hard-coding a partial list; (b) include all supported MCP config sources in stable order, including legacy/project `.claw.json`, user `~/.claw/settings.json`, project `.claw/settings.json`, and local `.claw/settings.local.json` as applicable; (c) add source metadata to `mcp --output-format json` entries so each server can be attributed to the file/layer that provided it; (d) add a regression proving a server loaded from `.claw.json` is accompanied by help/JSON source metadata that names `.claw.json`, and that help stays in sync when config source discovery changes. **Why this matters:** MCP setup is already a high-friction lifecycle path; if the command that diagnoses MCP servers omits a still-supported source, operators can move or delete the wrong config file, and automation cannot tell whether `.claw.json` support is intentional compatibility or accidental legacy behavior. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using the rebuilt actual `./rust/target/debug/claw`; temp-workspace proof showed `.claw.json` loads one MCP server while `mcp help` documents only `.claw/settings*.json` sources.
|
||||
|
||||
328. **`claw agents help` omits the `.codex/agents` roots that `claw agents` actually loads from, so native-agent discovery provenance is misleading** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `ee85fed6` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json`; `./rust/target/debug/claw version --output-format json` then reported embedded `git_sha` `ee85fed6`, matching the workspace. Running `./rust/target/debug/claw agents help --output-format json` returned `usage.sources = [".claw/agents", "~/.claw/agents", "$CLAW_CONFIG_HOME/agents"]`, with no `.codex/agents` or `~/.codex/agents` entry. In the same environment, `./rust/target/debug/claw agents --output-format json` listed native agents such as `analyst` with source `{id: "user_claw", label: "User home roots"}` even though `/home/bellman/.claw/agents` does not exist and `/home/bellman/.codex/agents/analyst.toml` does exist. The agents lifecycle surface therefore documents one set of roots while loading from another, and the loaded-agent provenance collapses the real Codex root behind a generic `user_claw` label. This is distinct from #327's MCP source-list mismatch: the affected subsystem is native-agent discovery, where claws choose delegation/staffing lanes from `claw agents` and need to know which root supplied each agent. **Required fix shape:** (a) derive `agents help` source roots from the same registry/search path used by the agent loader instead of a hard-coded `.claw`-only list; (b) include all supported native-agent roots in stable order, including project/user `.codex/agents` roots alongside `.claw/agents` and `$CLAW_CONFIG_HOME/agents`; (c) make each `agents --output-format json` entry expose non-secret source provenance precise enough to distinguish `user_codex`, `project_codex`, `user_claw`, and `project_claw` (without leaking unnecessary absolute paths); (d) add a regression proving an agent loaded from `~/.codex/agents` is accompanied by help-source metadata naming that root and per-agent provenance that does not mislabel it as generic `user_claw`. **Why this matters:** agent selection is a control-plane decision. If help says only `.claw/agents` are searched while the runtime actually consumes `.codex/agents`, claws and operators can edit the wrong directory, misdiagnose missing/stale agents, or trust the wrong ownership boundary for delegated work. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using rebuilt `./rust/target/debug/claw`; proof commands showed `agents help` omitting `.codex/agents` while `agents` loaded `analyst` from the existing `/home/bellman/.codex/agents/analyst.toml` with no `/home/bellman/.claw/agents` directory present.
|
||||
329. **Resume-safe slash `/agents --output-format json` downgrades structured agent inventory into prose even though top-level `claw agents --output-format json` returns machine-readable entries** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `0f7578c0` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json`; `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `0f7578c0`, matching the workspace. Running `./rust/target/debug/claw --resume latest /agents --output-format json` returned only `{"kind":"agents","text":"Agents\n 20 active agents..."}`: the agent names, source ids, models, reasoning effort, active/shadowed state, and working-directory context are all flattened into one human prose string. In the same rebuilt binary and same workspace, `./rust/target/debug/claw agents --output-format json` returned a structured object with top-level `agents[]`, `count`, `summary`, `working_directory`, and per-agent fields such as `name`, `description`, `model`, `reasoning_effort`, `active`, `shadowed_by`, and `source`. The resume-safe slash surface therefore looks JSON-shaped while throwing away exactly the structured inventory that automation needs, and it diverges from the already-existing top-level command schema. This is distinct from #325's broad help JSON opacity and #328's source-root mismatch: the pinpoint is the `/agents` slash command losing structured inventory in resume mode even though the non-slash agents command already has it. **Required fix shape:** (a) make resume-safe `/agents --output-format json` reuse the same serializer/schema as `claw agents --output-format json` instead of wrapping rendered text; (b) preserve per-agent source/provenance fields, model/reasoning metadata, active/shadowed state, count/summary, and working-directory context; (c) keep `text` or `message` as an optional human summary only, not the sole payload; (d) add regression coverage proving top-level `claw agents --output-format json` and resume-safe `/agents --output-format json` expose equivalent structured agent inventory for the same workspace. **Why this matters:** `/agents` is the in-session delegation/staffing truth surface. Claws operating through `--resume latest` need to choose agents without scraping prose; losing structure at the slash boundary makes automated staffing brittle and contradicts the top-level command contract. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using rebuilt `./rust/target/debug/claw`; proof commands showed slash `/agents` JSON had only `kind,text` while top-level `agents` JSON had `agents[]` and provenance metadata.
|
||||
337. **`status` / `/session list --output-format json` session lifecycle reports `workspace_dirty: true` and `abandoned: true` but omits dirty-file detail and abandonment cause, making automated GC unable to distinguish live work from crash leftovers** — restored from PR #2852 / Jobdori dogfood on current main (`0f7578c`). The evidence bundle listed 10 sessions and every listed session had `workspace_dirty: true` plus `abandoned: true`; each lifecycle object exposed `abandoned: true`, `kind: "saved_only"`, `pane_id: null`, and `workspace_dirty: true`, but did not include `dirty_file_count`, `dirty_file_paths` / summary, or `abandoned_reason`. That leaves cleanup policy with only a boolean dirty/abandoned pair: it cannot tell whether a saved-only session contains intentional uncommitted user work, a harmless stale pane artifact, or crash leftovers that are safe to collect. **Required fix shape:** (a) add `dirty_file_count: u32` to session lifecycle/status payloads whenever dirty state is evaluated; (b) add an `abandoned_reason` enum such as `pane_closed`, `process_killed`, `session_replaced`, `workspace_missing`, or `unknown` instead of a bare boolean-only abandonment signal; (c) optionally add summarized `dirty_file_paths` / `dirty_file_summary` with truncation metadata so automation can present useful evidence without leaking excessive path detail; (d) add regression coverage proving dirty abandoned saved-only sessions include file count, abandonment reason, and stable behavior when path summaries are omitted or truncated. **Why this matters:** session GC must not delete live user work, but it also cannot leave every crash leftover forever. A lifecycle object that says only `workspace_dirty: true` and `abandoned: true` forces cleanup tooling to guess instead of applying a safe policy from structured evidence. Source: PR #2852 / Jobdori dogfood; all 10 listed sessions shared the same dirty+abandoned shape, and the sample lifecycle object had `abandoned: true`, `kind: "saved_only"`, `pane_id: null`, `workspace_dirty: true`, with no dirty-file count, path summary, or abandonment reason.
|
||||
|
||||
338. **Top-level `help --output-format json` and resume-safe `/help --output-format json` use different payload fields for the same help surface (`message` vs `text`)** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `24ccb59b` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json`; `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `24ccb59b`, matching the workspace. Running `./rust/target/debug/claw help --output-format json` returned a valid JSON object with keys `kind,message`, while `./rust/target/debug/claw --resume latest /help --output-format json` returned the same conceptual help surface with keys `kind,text`. Both are prose-only help payloads, but automation now has to special-case whether help was reached through the top-level command dispatcher or the resume-safe slash dispatcher before it can even locate the rendered help body. This is distinct from #325's broader structured-schema absence: the pinpoint here is a concrete JSON field-name contract drift between two help entrypoints that should be equivalent or explicitly versioned. **Required fix shape:** (a) define one canonical help JSON body field such as `message` or `text` and use it consistently across top-level `help`, slash `/help`, and resume-safe `/help`; (b) if backward compatibility requires both fields temporarily, emit both with identical contents plus a `schema_version` and deprecation metadata; (c) add regression coverage proving `claw help --output-format json` and `claw --resume latest /help --output-format json` expose the same top-level field contract and `kind=help`; (d) document whether slash-command JSON is intended to share schemas with top-level command JSON or carry its own explicit schema namespace. **Why this matters:** help JSON is the bootstrap discoverability surface for claws. If the same help concept moves its body between `message` and `text` depending on invocation path, every orchestrator needs brittle per-entrypoint parsers before it can inspect commands, flags, or resume safety. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using rebuilt `./rust/target/debug/claw`; proof commands showed top-level help JSON keys `kind,message` and resume-safe slash help JSON keys `kind,text` on the same rebuilt binary.
|
||||
|
||||
339. **`/session delete` is not resume-safe while `/session list` is, making session GC impossible from `--resume` mode automation** — dogfooded 2026-04-30 by Jobdori on `24ccb59`. Running `claw --output-format json --resume latest /session list` succeeds and returns the full session list (10 sessions, all `workspace_dirty: true, abandoned: true`). Running `claw --output-format json --resume latest /session delete <id>` returns `{"command":"...","error":"unsupported resumed slash command","type":"error"}` — again using `"type"` not `"kind"` (#336 vocab violation). An automation lane that discovers abandoned sessions via `--resume` cannot delete any of them via the same path; it must spawn an interactive REPL session just to issue delete, breaking the machine-readable JSON surface contract. **Required fix shape:** (a) mark `/session delete` as resume-safe; (b) return `{"kind":"session_deleted","session_id":"<id>","path":"<deleted_path>"}` on success; (c) require `--force` only for dirty/active sessions; (d) add regression coverage. Source: Jobdori live dogfood, mengmotaHost, `24ccb59`, 2026-04-30.
|
||||
|
||||
340. **Resume-safe `/session help --output-format json` writes its primary JSON error envelope to stderr and uses `type` instead of the session JSON `kind` vocabulary** — dogfooded 2026-04-29 on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `dc47482e`. Running `./rust/target/debug/claw --resume latest /session help --output-format json` wrote no stdout bytes, but wrote a JSON error object to stderr: `{"command":"/session help","error":"Unknown /session action ...","type":"error"}`. Meanwhile `/session list --output-format json` wrote valid stdout JSON with `kind=session_list`. The JSON output contract is therefore split across stderr for an error/help-ish action and switches vocabulary from `kind` to `type`; automation that reads stdout sees empty/non-JSON output and cannot handle errors consistently with successful session JSON responses. **Required fix shape:** (a) all `--output-format json` command responses, including resumed slash errors, should emit the primary JSON envelope on stdout; (b) use `kind:"error"` or a documented error schema consistently instead of an ad hoc `type` field; (c) reserve stderr prose for text mode or optional non-primary diagnostics, not the machine-readable envelope; (d) add a regression for `/session help` or an unsupported `/session` action under `--resume` proving stdout contains the structured JSON error envelope and stderr does not carry the only parseable payload. **Why this matters:** claws need one stdout JSON contract for both success and failure. If a help-ish session error is silently moved to stderr and shaped differently from `session_list`, orchestration lanes cannot distinguish an unsupported action from transport corruption or an empty response without bespoke stderr parsing. Source: gaebal-gajae dogfood follow-up for the 15:30 nudge on rebuilt `./rust/target/debug/claw` `dc47482e`.
|
||||
|
||||
341. **Resume-safe `/tasks --output-format json` emits an unsupported-command JSON error only on stderr and mixes `kind` with `type` classification vocabularies** — dogfooded 2026-04-29 for the 16:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `58569131`. Running `./rust/target/debug/claw --resume latest /tasks --output-format json` wrote no stdout bytes, but wrote a JSON error object to stderr: `{"command":"/tasks","error":"/tasks is not yet implemented in this build","kind":"unsupported_command","type":"error"}`. The unsupported command envelope therefore has two separate top-level classification vocabularies (`kind=unsupported_command` and `type=error`) and places the only parseable payload on stderr, while successful JSON commands use stdout and a `kind`-only classification. This is distinct from #340 because it is not session help; it shows implemented-but-unsupported command stubs can emit a dual-vocabulary error envelope. **Required fix shape:** (a) in `--output-format json` mode, emit the primary JSON envelope on stdout for unsupported resumed slash commands such as `/tasks`; (b) document and use one error discriminator, preferably `kind:"error"` plus `code:"unsupported_command"`, or `kind:"unsupported_command"` plus `status:"error"`, but not `type`; (c) reserve stderr for non-primary diagnostics or text-mode prose, never as the sole JSON payload; (d) add regression coverage for `/tasks` under `--resume` with JSON output proving stdout contains the structured error envelope, stderr is not the only parseable stream, and the envelope uses the documented single-vocabulary discriminator. **Why this matters:** claws need the same stdout JSON contract for implemented successes and implemented-but-unsupported stubs. If `/tasks` errors can silently move to stderr and advertise both `kind` and `type`, automation must special-case command stubs instead of applying one JSON error parser. Source: gaebal-gajae dogfood follow-up for the 16:00 nudge on rebuilt `./rust/target/debug/claw` `58569131`.
|
||||
342. **Resume-safe `/commands --output-format json` is rejected as an unknown slash command even though the error points users at `/help` for slash-command discovery, leaving no structured command-index alias** — dogfooded 2026-04-29 for the 16:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `f65b2b4f`. Running `./rust/target/debug/claw --resume latest /commands --output-format json` wrote no stdout bytes and emitted only stderr JSON: `{"command":"/commands","error":"Unknown slash command: /commands\n Help /help lists available slash commands","type":"error"}`. In the same rebuilt binary, `./rust/target/debug/claw --resume latest /help --output-format json` succeeded on stdout but exposed only prose keys `kind,text`. The discoverability path therefore has two gaps at once: the intuitive `/commands` index/alias is unavailable, and the fallback suggestion is buried inside an error string rather than surfaced as structured `suggested_command` / `discovery_command` metadata. This is distinct from #340 and #341: the pinpoint is not merely stderr-only JSON error placement, but the absence of a machine-readable slash-command discovery alias/index and typed correction guidance when users or claws try the natural `/commands` form. **Required fix shape:** (a) either implement `/commands` as a resume-safe alias for slash-command discovery or return a typed `unknown_command` JSON envelope with `suggested_command:"/help"` and `discovery_command:"/help"` fields; (b) make the primary JSON error envelope follow the stdout JSON contract and single-discriminator schema from #340/#341; (c) expose structured slash-command inventory from the discovery surface rather than requiring callers to scrape `text`; (d) add regression coverage proving `/commands --output-format json` either returns the structured command inventory or returns a structured correction that automation can follow without parsing prose. **Why this matters:** claws need a predictable way to discover valid slash commands before invoking them. If the natural command-index spelling fails with stderr-only JSON and a human-formatted hint, orchestration has to guess, parse prose, and special-case command discovery before it can even learn the supported command surface. Source: gaebal-gajae dogfood follow-up for the 16:30 nudge on rebuilt `./rust/target/debug/claw` `f65b2b4f`.
|
||||
343. **Resume-safe `/models --output-format json` suggests `/model` as a correction even though `/model` is itself unsupported in the same resume-safe JSON path** — dogfooded 2026-04-29 for the 17:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `a1bfcd41`. Running `./rust/target/debug/claw --resume latest /models --output-format json` wrote no stdout bytes and emitted stderr JSON: `{"command":"/models","error":"Unknown slash command: /models\n Did you mean /model, /tokens\n Help /help lists available slash commands","type":"error"}`. Immediately following the suggested correction with `./rust/target/debug/claw --resume latest /model --output-format json` also wrote no stdout bytes and returned `{"command":"/model","error":"unsupported resumed slash command","type":"error"}`. The correction path therefore points automation from an unknown plural form to a command that cannot run in the same resume-safe noninteractive mode, while `/tokens --output-format json` succeeds and exposes only token counters. This is distinct from #342's missing `/commands` discovery alias: the pinpoint here is dead-end suggestion quality and resume-safety awareness in `Did you mean` guidance. **Required fix shape:** (a) make unknown-command suggestions context-aware so resume-mode JSON only suggests commands that are actually resume-safe for the current invocation, or labels non-resume-safe suggestions with `resume_safe:false`; (b) expose suggestions as structured `suggestions[]` objects with `command`, `resume_safe`, `reason`, and optional `replacement_for` fields instead of burying them in the `error` string; (c) if `/model` remains interactive-only, suggest a machine-readable status/config/model inspection command that works under `--resume`, or return a typed `interactive_only` blocker; (d) add regression coverage proving `/models --output-format json` does not recommend an unusable `/model` command without structured resume-safety metadata. **Why this matters:** claws follow correction hints automatically. A suggestion that leads straight into another unsupported resumed slash command turns error recovery into a loop and makes command discovery less trustworthy than no suggestion at all. Source: gaebal-gajae dogfood follow-up for the 17:00 nudge on rebuilt `./rust/target/debug/claw` `a1bfcd41`.
|
||||
344. **Resume-safe `/config help --output-format json` is treated as an unsupported config section instead of a structured config-section discovery surface** — dogfooded 2026-04-29 for the 18:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `a510f734`. Running `./rust/target/debug/claw --resume latest /config help --output-format json` wrote no stdout bytes and emitted stderr JSON: `{"command":"/config help","error":"Unsupported /config section 'help'. Use env, hooks, model, or plugins.\n Usage /config [env|hooks|model|plugins]\n\n/config\n Summary Inspect Claude config files or merged sections\n Usage /config [env|hooks|model|plugins]\n Category Config\n Resume Supported with --resume SESSION.jsonl","type":"error"}`. The same shape appears for natural discovery forms such as `/config list` and `/config show`, while bare `/config --output-format json` succeeds and returns config-file data. The config surface is therefore resume-supported, but its section discovery/help path is only available as a human-formatted error string on stderr, with no structured `sections[]`, no `help` alias, and no typed `unsupported_section` metadata. This is distinct from #342's missing slash-command index and #343's dead-end suggestion: the pinpoint is a command-specific subcommand/section discovery contract for an otherwise working resume-safe command. **Required fix shape:** (a) make `/config help` or `/config sections` resume-safe and return stdout JSON containing supported sections such as `env`, `hooks`, `model`, and `plugins`; (b) for unsupported config sections, emit a typed JSON envelope with `kind:"error"` or equivalent plus `code:"unsupported_config_section"`, `section`, and structured `supported_sections[]`; (c) keep human usage text optional, not the only machine-readable recovery path; (d) add regression coverage proving `/config help --output-format json` or its canonical replacement exposes structured section metadata and that `/config list`/`show` errors include structured supported-section guidance. **Why this matters:** config inspection is a control-plane surface. Claws should not have to intentionally trigger an error and scrape prose to learn which config sections can be inspected under `--resume`; section discovery needs the same machine-readable contract as the config payload itself. Source: gaebal-gajae dogfood follow-up for the 18:30 nudge on rebuilt `./rust/target/debug/claw` `a510f734`.
|
||||
345. **Resume-safe `/config env|hooks|model|plugins --output-format json` accepts different section names but returns the same generic config-file summary for every section** — dogfooded 2026-04-29 for the 19:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `a510f734`. Running `./rust/target/debug/claw --resume latest /config env --output-format json`, `/config hooks`, `/config model`, and `/config plugins` all wrote stdout JSON successfully and no stderr, but each response had the same top-level shape and values: `kind:"config"`, `cwd`, `files[]`, `loaded_files:1`, and `merged_keys:1`. None of the outputs included the requested `section`, section-specific keys, hook/model/plugin/env data, `section_missing`, `section_empty`, or truncation metadata; the `env`, `hooks`, `model`, and `plugins` arguments appear to be accepted while producing an indistinguishable generic config summary. This is distinct from #344's missing config-section discovery/help path: the pinpoint here is that the advertised section-specific entrypoints do not produce section-specific machine-readable payloads once invoked. **Required fix shape:** (a) include a `section` field in `/config <section> --output-format json` responses; (b) return section-specific structured payloads for `env`, `hooks`, `model`, and `plugins`, with explicit empty/missing states when applicable; (c) preserve the config-file provenance summary separately from the requested section content so callers can tell what was inspected; (d) add regression coverage proving the four supported sections produce distinguishable JSON contracts and do not silently collapse to the bare `/config` summary. **Why this matters:** config inspection is used to diagnose model, hook, plugin, and env lifecycle issues. If every supported section returns the same generic file list, claws cannot tell whether a section is empty, unsupported, redacted, or simply ignored, and config troubleshooting remains prose/error archaeology instead of structured state inspection. Source: gaebal-gajae dogfood follow-up for the 19:00 nudge on rebuilt `./rust/target/debug/claw` `a510f734`.
|
||||
346. **Top-level `agents show <name> --output-format json` accepts a natural agent-detail request but falls back to generic help JSON instead of returning the selected agent or a typed unsupported-detail error** — dogfooded 2026-04-29 for the 20:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `c6c01bea`. Running `./rust/target/debug/claw agents list --output-format json` returned a valid stdout JSON inventory with `kind:"agents"`, `action:"list"`, and an `agents[]` entry named `analyst`. Immediately running `./rust/target/debug/claw agents show analyst --output-format json` returned success on stdout but did not return the `analyst` detail object; instead it returned generic help-shaped JSON: `{"action":"help","kind":"agents","unexpected":"show analyst","usage":{"direct_cli":"claw agents [list|help]","slash_command":"/agents [list|help]",...}}`. Both stderr streams were empty. The command therefore accepts a natural detail-inspection spelling, recognizes it only as `unexpected`, and hides the absence of an agent-detail surface behind a successful help fallback rather than a typed `unsupported_agents_action` / `agent_detail_unavailable` error. This is distinct from #328 and #329: those cover source/provenance mismatch and slash `/agents` inventory flattening, while this pinpoint is the missing top-level agent detail/inspection contract after inventory discovery succeeds. **Required fix shape:** (a) either implement `agents show <name> --output-format json` returning the selected agent's structured fields and provenance, or return a non-success typed JSON error with `code:"unsupported_agents_action"`, `requested_action:"show"`, and `supported_actions:["list","help"]`; (b) include `agent_name` and whether the name exists in the current inventory when rejecting detail inspection; (c) avoid `action:"help"` success envelopes for unsupported subcommands because they make failed detail inspection look like intentional help output; (d) add regression coverage proving `agents show analyst --output-format json` does not silently collapse to generic help when `analyst` exists in `agents list`. **Why this matters:** claws discover agents first, then need to inspect a chosen agent before delegation. If the natural detail command returns successful generic help instead of a selected-agent payload or typed unsupported-action error, automation cannot distinguish typo, unsupported detail view, missing agent, or successful help request without comparing unrelated inventory output. Source: gaebal-gajae dogfood follow-up for the 20:00 nudge on rebuilt `./rust/target/debug/claw` `c6c01bea`; earlier false hang hypotheses for `mcp help` and `agents list` were closed after bounded repros succeeded.
|
||||
347. **Top-level `mcp show <missing-server> --output-format json` reports a missing server as `status:"ok"` instead of a typed not-found/error status** — dogfooded 2026-04-29 for the 20:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `ee41b266`. After rebuilding and verifying the binary provenance, running `./rust/target/debug/claw mcp show does-not-exist --output-format json` returned stdout JSON with `{"action":"show","config_load_error":null,"found":false,"kind":"mcp","message":"server `does-not-exist` is not configured","server_name":"does-not-exist","status":"ok"}` and no stderr. `found:false` is useful, but pairing it with `status:"ok"` makes the command-level outcome ambiguous: a missing requested server is not an OK inspection result for automation that needs to distinguish successful detail retrieval from a not-found lookup. This is distinct from #327's MCP source-list mismatch and the invalid #2874/#2879/#2880 hang/nondeterminism hypotheses that were closed after bounded repros. **Required fix shape:** (a) return a typed not-found status such as `status:"not_found"` or `kind:"error"` plus `code:"mcp_server_not_found"` while preserving `server_name` and optional `available_servers[]`; (b) document whether `found:false` objects are considered success or error and keep that convention consistent across text and JSON modes; (c) ensure process exit semantics match the JSON status contract or expose a separate `exit_ok`/`lookup_status` field; (d) add regression coverage proving missing-server lookup is distinguishable from successful server detail retrieval without parsing the human `message`. **Why this matters:** MCP inspection is a control-plane diagnostic. If a missing server returns `status:"ok"`, claws can silently treat a failed lookup as healthy MCP state unless they special-case `found:false`, which defeats the purpose of a clear machine-readable status field. Source: gaebal-gajae dogfood follow-up for the 20:30 nudge on rebuilt `./rust/target/debug/claw` `ee41b266`.
|
||||
348. **Top-level `plugins list --output-format json` returns plugin inventory only as a prose `message` string instead of structured `plugins[]` entries** — dogfooded 2026-04-29 for the 21:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `cca6f682`. Running `./rust/target/debug/claw plugins list --output-format json` repeatedly returned valid stdout JSON with `{"action":"list","kind":"plugin","message":"Plugins\n example-bundled v0.1.0 disabled\n sample-hooks v0.1.0 disabled","reload_runtime":false,"target":null}` and no stderr. The actual plugin names, versions, and enabled/disabled states are present only inside the human-formatted `message` table; there is no `plugins[]` array, no per-plugin `name`, `version`, `enabled`, `source`, `load_error`, or lifecycle/action metadata. This is distinct from #325's broad help JSON opacity and the config/MCP/agent items: the affected surface is plugin lifecycle inventory, where automation needs a structured list before enabling, disabling, updating, or uninstalling plugins. **Required fix shape:** (a) add `plugins[]` with stable per-plugin fields such as `name`, `version`, `enabled`, `source`, `configured`, `load_status`, and optional `error`; (b) keep `message` only as a human summary, not the sole inventory payload; (c) expose counts and truncation metadata if the list can be large; (d) add regression coverage proving `plugins list --output-format json` can be parsed without scraping the prose message and that disabled/enabled state survives as booleans/enums. **Why this matters:** plugin lifecycle management is a control-plane path. If the JSON inventory is just a text table, claws must scrape spacing-sensitive prose before deciding whether a plugin is installed, disabled, broken, or safe to mutate. Source: gaebal-gajae dogfood follow-up for the 21:00 nudge on rebuilt `./rust/target/debug/claw` `cca6f682`.
|
||||
349. **Top-level `plugins show <name> --output-format json` returns success-shaped JSON for an unsupported plugin action instead of a typed unsupported-action error** — dogfooded 2026-04-29 for the 21:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `a2a38df9`. After rebuilding and verifying the binary provenance, repeated bounded runs of `./rust/target/debug/claw plugins show does-not-exist --output-format json` returned stdout JSON with `{"action":"show","kind":"plugin","message":"Unknown /plugins action 'show'. Use list, install, enable, disable, uninstall, or update.","reload_runtime":false,"target":"does-not-exist"}` and no stderr. The command therefore reports the requested unsupported action as the top-level `action:"show"` and exits successfully while hiding the failure class inside a human `message`; it does not provide `status:"unsupported_action"`, `code:"plugin_action_unsupported"`, or structured `supported_actions[]`. This is distinct from #348's prose-only plugin inventory schema: #348 covers `plugins list` payload shape, while this pinpoint covers unsupported plugin action classification and recovery metadata. **Required fix shape:** (a) return a typed stdout JSON error or explicit non-ok status for unsupported plugin actions, with `requested_action`, `supported_actions`, and `target` fields; (b) do not label the primary `action` as the unsupported requested verb unless a separate `status`/`code` makes the failure unambiguous; (c) keep the human message optional and avoid making it the only way to detect the unsupported action; (d) add regression coverage proving `plugins show foo --output-format json` is machine-classifiable as unsupported without scraping prose. **Why this matters:** plugin lifecycle automation follows action/status fields. If an unsupported mutation/inspection verb returns success-shaped JSON and only says "Unknown" in prose, claws can treat a failed preflight as a valid plugin show result and continue toward unsafe lifecycle actions. Source: gaebal-gajae dogfood follow-up for the 21:30 nudge on rebuilt `./rust/target/debug/claw` `a2a38df9`; invalid hang PR #2885 was closed after repeated bounded repros returned stdout JSON.
|
||||
|
||||
363. **`/status --output-format json` workspace file-change fields (`unstaged_files`, `staged_files`, `untracked_files`, `changed_files`) are integer counts, not path arrays; file identities are unavailable from the status response** — dogfooded 2026-04-30 by Jobdori on `a2a38df9`. Running `/status --output-format json` returns: `{"workspace":{"unstaged_files":0,"staged_files":0,"untracked_files":4,"changed_files":4,...}}`. All four file-change fields are integers, not lists of file paths. There is no `files[]` or `paths[]` sub-array, no `{path, change_type}` per entry. An orchestrator knowing that 4 files are untracked cannot determine which files without a separate `git status` call. This compounds pinpoint #352 (`/diff` returns raw patch text without a `files[]` array): the two commands together expose change count and raw diff text, but neither provides a clean `{path, change_type, insertions, deletions}` inventory. The workspace is therefore observable only through two-step prose-or-count approaches. **Required fix shape:** (a) change the four fields to arrays of `{path, change_type}` objects, or add a parallel `unstaged_paths[]` / `untracked_paths[]` alongside the existing counts; (b) make the counts derivable from the length of the arrays (remove duplication); (c) if the count-only design is intentional for performance, add a `detail: false` default and `--workspace-detail` flag to expand to path arrays; (d) add regression coverage confirming `changed_files` equals the length of the corresponding paths array. Source: Jobdori live dogfood, mengmotaHost, `a2a38df9`, 2026-04-30.
|
||||
332. **`doctor --output-format json` has no top-level `status` field, breaking the `result["status"] == "ok"` check pattern that works for all other JSON commands** — dogfooded 2026-04-29 by Jobdori on current main (`e7074f4`). Running `claw doctor --output-format json | python3 -c "import sys,json; print(json.load(sys.stdin).get('status'))"` prints `None`. The doctor JSON uses `has_failures` (bool) + `summary: {failures, ok, total, warnings}` to express aggregate health, while every other JSON command (`status`, `sandbox`, `stats`, `cost`, etc.) uses a top-level `"status": "ok"|"warn"|"error"` field. A downstream lane checking `result.get("status") == "ok"` will silently misread a doctor output with 2 warnings as if no status were present, instead of getting `"warn"`. **Required fix shape:** (a) add a top-level `"status": "ok" | "warn" | "error"` field to doctor JSON output derived from the same `has_failures`/warning-count logic already present; (b) ensure the field is always present and uses the same vocabulary as other JSON commands; (c) keep `has_failures` and `summary` for backward compat but document `status` as the canonical machine-readable aggregate verdict; (d) add regression coverage proving `claw doctor --output-format json` always includes a non-null `"status"` field. **Why this matters:** `status` is the idiomatic machine-readable health verdict in claw's JSON surface; when `doctor` skips it, automation layers that unify health checks across commands (`doctor` + `status` + `sandbox`) must special-case doctor or silently miss warnings. Source: Jobdori live dogfood on mengmotaHost, claw-code `e7074f4`, 2026-04-29.
|
||||
|
||||
Reference in New Issue
Block a user