Compare commits


3 Commits

Author SHA1 Message Date
YeonGyu-Kim
4de868a485 docs(roadmap): fix #425 cross-reference #401 -> #400 (status session mismatch) 2026-05-01 04:06:31 +09:00
YeonGyu-Kim
aa182758d5 docs(roadmap): add #425 — changed_files silently includes untracked files, false-positive dirty signal 2026-05-01 03:33:08 +09:00
YeonGyu-Kim
57096b0a1a docs(roadmap): add no-session kind drift item
Adds ROADMAP #422 documenting the concrete export/resume no-session ErrorKind drift.
2026-05-01 02:46:12 +09:00


@@ -6301,4 +6301,6 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
380. **Top-level `tokens --help --output-format json` hangs with zero stdout/stderr instead of returning bounded command help JSON** — dogfooded 2026-04-30 for the 02:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `d95b230c`. After verifying #358 covered `cost --help`, a fresh adjacent probe on the token-budget surface showed the same silent failure class: repeated bounded runs of `timeout 8 ./rust/target/debug/claw tokens --help --output-format json` exited `124` with `stdout=0` and `stderr=0`. In the same rebuilt binary, `version --output-format json` returned promptly with version/build metadata, proving the binary itself and JSON output path are reachable. This is distinct from #358's cost help hang: the affected surface is the sibling `tokens` command help, which agents use before estimating prompt/session token budgets. **Required fix shape:** (a) make `tokens --help --output-format json` return static/bounded stdout JSON with `kind:"help"` or `kind:"tokens"`, `action:"help"`, usage, options, examples, supported output formats, and related slash/direct commands; (b) ensure help rendering does not initialize slow token accounting, session, or provider state; (c) if any dynamic provider is consulted, return a typed JSON timeout/unavailable error instead of hanging; (d) add regression coverage proving tokens help in JSON mode returns within a deterministic budget. **Why this matters:** token budgeting is a preflight clawability surface. If help hangs silently, automation cannot safely discover how to inspect or constrain token usage before running expensive prompts, and budget-aware wrappers stall at the discovery step. Source: gaebal-gajae dogfood follow-up for the 02:30 nudge on rebuilt `./rust/target/debug/claw` `d95b230c`.
381. **Top-level `cache --help --output-format json` hangs with zero stdout/stderr instead of returning bounded command help JSON** — dogfooded 2026-04-30 for the 03:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `d95b230c`. After #358 and #380 landed for the cost/tokens preflight help hangs, a fresh adjacent probe on the cache-control surface showed the same silent failure class: repeated bounded runs of `timeout --kill-after=1s 8s ./rust/target/debug/claw cache --help --output-format json` exited `124` with `stdout=0` and `stderr=0`. In the same rebuilt binary, `version --output-format json` returned promptly with version/build metadata, proving the binary itself and JSON output path are reachable. This is distinct from the separate `/cache` slash-command envelope mismatch class: the affected surface here is top-level `cache` command help, where agents need bounded local discovery before deciding whether to inspect, clear, or summarize cache state. **Required fix shape:** (a) make `cache --help --output-format json` return static/bounded stdout JSON with `kind:"help"` or `kind:"cache"`, `action:"help"`, usage, options, examples, supported output formats, and related slash/direct commands; (b) ensure help rendering does not initialize slow cache/session/provider state; (c) if any dynamic provider is consulted, return a typed JSON timeout/unavailable error instead of hanging; (d) add regression coverage proving cache help in JSON mode returns within a deterministic budget. **Why this matters:** cache inspection and cleanup are recovery/control-plane operations. If cache help hangs silently, claws cannot safely discover cache semantics before attempting cleanup, and automation stalls before it can choose a non-destructive cache action. Source: gaebal-gajae dogfood follow-up for the 03:00 nudge on rebuilt `./rust/target/debug/claw` `d95b230c`.
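Until the help hangs in #380 and #381 are fixed, a wrapper can bound the probe itself. The following is a minimal sketch, assuming a Python-based wrapper; `probe_json_command` and its outcome labels are hypothetical names, not part of `claw`. It runs a command under a hard time budget and classifies the result as `ok`, `timeout`, or `invalid_json`, so the caller gets a typed outcome instead of stalling.

```python
import json
import subprocess
import sys


def probe_json_command(argv, budget_s=8.0):
    """Run argv under a hard time budget and classify the outcome.

    Hypothetical helper: the outcome labels are illustrative and are
    not an envelope that claw itself emits.
    """
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=budget_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so no
        # orphaned process is left behind.
        return {"outcome": "timeout", "exit_code": None, "payload": None}
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        # Covers empty stdout as well as non-JSON text.
        return {"outcome": "invalid_json", "exit_code": proc.returncode, "payload": None}
    return {"outcome": "ok", "exit_code": proc.returncode, "payload": payload}


if __name__ == "__main__":
    # Command line taken from the entries above; pass a different argv to probe
    # another surface.
    argv = sys.argv[1:] or [
        "./rust/target/debug/claw", "tokens", "--help", "--output-format", "json",
    ]
    print(probe_json_command(argv))
```

A regression test for fix shape (d) could assert that the outcome is `ok` within the chosen budget for both `tokens --help` and `cache --help`.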
417. **`mcp show <nonexistent> --output-format json` returns `found:false` + `status:"ok"` with exit 0 — a "not found" result with a success status is contradictory and causes automation to false-positive** — dogfooded 2026-04-30 by Jobdori on `e939777f`. Running `claw mcp show nonexistent --output-format json` when the server is not configured returns `{"action":"show","config_load_error":null,"found":false,"kind":"mcp","message":"server \`nonexistent\` is not configured","server_name":"nonexistent","status":"ok","working_directory":"..."}` with exit code 0. `found:false` means the server does not exist in config; `status:"ok"` means the show operation completed without internal error — but these two fields give conflicting signals: a claw checking `if status == "ok" then server_is_present` will false-positive on every nonexistent server query. The only unambiguous signal is `found:false`, but `status:"ok"` directly contradicts it. `exit 0` compounds this: standard CLI convention is exit 1 for "not found" lookups. **This is distinct from ROADMAP #102** which covers the deeper liveness-probe gap (server configured but command unreachable); #417 targets the simpler semantic contradiction in the `show` response when the server is not configured at all. **Required fix shape:** (a) use `status:"not_found"` (or `"missing"`) instead of `status:"ok"` when `found:false`; (b) exit with a non-zero code (exit 1 or exit 2) when the queried server is not found, consistent with standard lookup-command semantics; (c) if `status:"ok"` is intentionally reserved for "the show command itself ran without error" (not "the server exists"), rename it to `query_status:"ok"` and add a separate `found_status:"not_found"|"found"` field to prevent ambiguity; (d) add regression coverage proving `mcp show <nonexistent> --output-format json` returns a non-ok or semantically distinct status and a non-zero exit code. 
**Why this matters:** automation that orchestrates MCP server checks before starting a session will loop over a server list with `mcp show` and check `status`. If `status:"ok"` means "command ran OK" not "server exists", every missing server looks healthy to a standard status-check loop. Source: Jobdori live dogfood, `e939777f`, 2026-04-30.
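The false positive described in #417 can be illustrated with the observed payload. This is a sketch of consumer-side logic only; `naive_is_present` and `server_is_present` are hypothetical helper names, and the second one shows the interim workaround of keying on `found` rather than `status`.

```python
import json

# Payload as reported in the entry above for a server that is not configured.
OBSERVED = (
    '{"action":"show","config_load_error":null,"found":false,"kind":"mcp",'
    '"message":"server `nonexistent` is not configured",'
    '"server_name":"nonexistent","status":"ok","working_directory":"..."}'
)


def naive_is_present(payload):
    # The status-only check that a typical loop would use.
    return payload.get("status") == "ok"


def server_is_present(payload):
    # Interim workaround: treat `found` as the only presence signal.
    return payload.get("found") is True


payload = json.loads(OBSERVED)
print("naive check:", naive_is_present(payload))
print("found check:", server_is_present(payload))
```

With the current output the two checks disagree, which is the contradiction the fix shape is meant to remove.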
422. **`export --output-format json` and `--resume latest` report the same "no managed sessions" scenario using two different `kind` codes — `no_managed_sessions` vs `session_load_failed` — making "no session found" undetectable by a single kind-code check** — dogfooded 2026-04-30 KST (UTC+9) by Jobdori on `e939777f`. Running `claw export --output-format json` with no session present returns (on stderr, exit 1): `{"error":"no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"no_managed_sessions","type":"error"}`. Running `claw --resume latest /status --output-format json` with no session present returns (on stderr, exit 1): `{"error":"failed to restore session: no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"session_load_failed","type":"error"}`. Both describe the same root condition — there are no sessions to operate on — but they expose it via different `kind` discriminants. Automation that checks `kind == "no_managed_sessions"` to detect a cold workspace will miss the `--resume` path's `session_load_failed`, and vice versa. A wrapper that guards "run with --resume only if a session exists" must special-case both codes. The hint text is identical between them, suggesting the messages are logically equivalent. Additionally neither code matches the proposed canonical names `session_not_found` / `session_load_failed` as stable `ErrorKind` discriminants described in ROADMAP #77's fix shape, which explicitly proposes typed error-kind codes for session lifecycle failures. 
**Required fix shape:** (a) unify "no sessions found for this workspace fingerprint" under a single canonical `kind` code — either `no_managed_sessions` or `session_not_found` — used consistently by every command path that encounters an empty session registry; (b) if `session_load_failed` is a more general category (covering e.g. corrupt session files, IO errors, schema version mismatches), it should nest a concrete `reason:"no_managed_sessions"` or `reason:"session_not_found"` sub-field so callers can distinguish "empty registry" from "found but unreadable"; (c) align with the canonical error-kind contract proposed in #77; (d) add regression coverage proving `export` and `--resume latest` in an empty workspace both return an error with the same top-level `kind` code. **Why this matters:** session guard-rails in orchestration need a single stable `kind` to detect cold workspaces without enumerating all possible no-session synonyms. Two divergent codes for the same condition make defensive automation brittle and contradict the promise of machine-readable error envelopes. Source: Jobdori live dogfood, `e939777f`, 2026-04-30 KST (UTC+9).
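Until the codes in #422 are unified, a guard has to enumerate both. The sketch below assumes a Python wrapper that has already parsed the stderr envelope; `is_cold_workspace` is a hypothetical helper. Matching on the error text for `session_load_failed` is a brittle stopgap, and the `reason` sub-field is only the shape proposed in (b), not something `claw` emits today.

```python
# Kind codes that mean "no sessions exist for this workspace fingerprint".
# `session_not_found` is the name proposed in the fix shape, not a current code.
NO_SESSION_KINDS = {"no_managed_sessions", "session_not_found"}


def is_cold_workspace(envelope):
    """Return True when an error envelope means the session registry is empty."""
    if envelope.get("type") != "error":
        return False
    kind = envelope.get("kind")
    if kind in NO_SESSION_KINDS:
        return True
    if kind == "session_load_failed":
        # Proposed shape (b): a nested reason, if it is ever added.
        if envelope.get("reason") in NO_SESSION_KINDS:
            return True
        # Stopgap for the current output: fall back to the message text.
        return "no managed sessions found" in envelope.get("error", "")
    return False


# Envelopes as reported above (hint field omitted for brevity).
export_err = {
    "error": "no managed sessions found in .claw/sessions/<fingerprint>/",
    "kind": "no_managed_sessions",
    "type": "error",
}
resume_err = {
    "error": "failed to restore session: no managed sessions found in .claw/sessions/<fingerprint>/",
    "kind": "session_load_failed",
    "type": "error",
}
print(is_cold_workspace(export_err), is_cold_workspace(resume_err))
```

Once a single canonical `kind` exists, the text-matching branch can be deleted and the guard reduces to one set lookup.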
425. **`status --output-format json` workspace.`changed_files` silently includes untracked files in its count alongside staged and unstaged tracked files — a claw checking `changed_files > 0` to detect "uncommitted changes" will false-positive when only untracked files exist** — dogfooded 2026-04-30 KST (UTC+9) by Jobdori on `e939777f`, traced to source. In a workspace with only one untracked file and no staged/unstaged tracked changes: `claw status --output-format json` returns `{"workspace":{"changed_files":1,"staged_files":0,"unstaged_files":0,"untracked_files":1,"git_state":"dirty · 1 files · 1 untracked",...}}`. `changed_files` is 1 even though `staged_files` and `unstaged_files` are both 0. Source trace: `rust/crates/rusty-claude-cli/src/main.rs:3230-3255`: `parse_git_workspace_summary` increments `changed_files` for every non-header, non-blank line in `git status --short --branch` output before branching on `??` (untracked). The `??` branch increments `untracked_files` then calls `continue`, but `changed_files` was already incremented — so `changed_files = staged_files + unstaged_files + untracked_files` (plus conflicted). Meanwhile `git_state` is computed from `is_clean() = changed_files == 0`, so a repo with only untracked files produces `git_state:"dirty"` and `changed_files:N` while `staged_files == 0 && unstaged_files == 0`. A claw using the idiomatic guard `if workspace.changed_files > 0 { block_commit_or_rebase }` will block even when there are only new untracked files that will not be part of any commit. A claw that wants a "clean for commit" signal must check `staged_files == 0 && unstaged_files == 0` instead of `changed_files == 0`, but the field name `changed_files` does not signal this requirement. This is distinct from #125 (git_state:"clean" in non-git dirs): #125 covers a false "clean" in non-git dirs; this covers a false "dirty" in git repos with only untracked files. 
**Required fix shape:** (a) rename `changed_files` to `total_git_status_entries` or similar to accurately reflect it counts all `git status --short` entries including untracked, OR exclude untracked from `changed_files` and rename the field to `tracked_changed_files`; (b) OR keep `changed_files` but document it explicitly as "includes untracked" in the schema and add a separate `tracked_changed_files = staged_files + unstaged_files` computed field; (c) fix `is_clean()` to also require `untracked_files == 0` if untracked files are meant to be included in the "dirty" signal, or to only use `staged_files + unstaged_files` if they are not; (d) add regression coverage proving that a workspace with only untracked files returns `changed_files > 0` and the documentation makes the semantic explicit. **Why this matters:** `changed_files` is the primary integer gate for "is this workspace clean enough to proceed?" in CI/automation flows. A silent untracked-file false positive causes automation to block valid operations (commits, rebases, PR creation) or fail health checks unnecessarily. Source: Jobdori live dogfood + source trace, `e939777f`, 2026-04-30 KST (UTC+9). Cross-reference: #125 (git_state clean in non-git dir), #400 (status workspace session/session_id mismatch).
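The counting order described in the #425 source trace can be reproduced in a few lines. This is an illustrative Python sketch of that logic, not the Rust implementation; `summarize_short_status` is a hypothetical name, and `tracked_changed_files` follows the option in (a) of excluding untracked entries.

```python
def summarize_short_status(output):
    """Count entries in `git status --short --branch` output.

    Mirrors the order described in the trace: every entry line bumps
    changed_files before the untracked (`??`) check runs.
    """
    changed_files = 0
    untracked_files = 0
    for line in output.splitlines():
        # Skip blank lines and the `## branch...` header line.
        if not line.strip() or line.startswith("## "):
            continue
        changed_files += 1
        if line.startswith("??"):
            untracked_files += 1
            continue
    return {
        "changed_files": changed_files,
        "untracked_files": untracked_files,
        # Proposed field: entries that could actually end up in a commit.
        "tracked_changed_files": changed_files - untracked_files,
    }


only_untracked = "## main...origin/main\n?? notes.txt\n"
print(summarize_short_status(only_untracked))
```

For the untracked-only input, `changed_files` is 1 while `tracked_changed_files` is 0, which is the distinction a commit or rebase guard needs.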