docs(roadmap): fix #425 cross-reference #401 → #400 (status session mismatch)

docs(roadmap): add #425 — changed_files silently includes untracked files, false-positive dirty signal
docs(roadmap): add no-session kind drift item
2026-05-13 17:36:44 +00:00 · 2026-05-01 04:06:31 +09:00 · 2026-05-01 03:33:08 +09:00 · 2026-05-01 02:46:12 +09:00 · 2026-04-30 12:09:06 +09:00 · 2026-04-30 03:06:16 +00:00
1 changed files with 6 additions and 0 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -6298,3 +6298,9 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
 356. **Top-level `status --help --output-format json` exits successfully but emits plain text help instead of JSON** — dogfooded 2026-04-30 for the 01:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `74338dc6`. After rebuilding and verifying the binary provenance, repeated bounded runs of `./rust/target/debug/claw status --help --output-format json` exited `0` with `stdout=326` and `stderr=0`, but stdout was plain text (`Status`, `Usage`, `Purpose`, `Output`, `Formats`, `Related`) rather than a JSON object. In the same rebuilt binary, `version --output-format json` returned proper stdout JSON with version/build metadata, proving the JSON output path itself is reachable. This is distinct from #354/#355 memory/session JSON help/list hangs: the status help path returns promptly, but ignores the requested JSON format. **Required fix shape:** (a) make `status --help --output-format json` emit valid stdout JSON with `kind:"help"` or `kind:"status"`, `action:"help"`, usage, options, examples, supported output formats, and related slash/direct commands; (b) preserve text help for default/text mode only; (c) add a `format:"json"` or equivalent field so callers can assert the contract without parsing prose; (d) add regression coverage proving status help with JSON format parses as JSON and does not silently fall back to plain text. **Why this matters:** help is the discovery surface automation uses before invoking status. If `--output-format json` is accepted but help remains plain text, claws must scrape formatting-sensitive prose or special-case help output, defeating the point of machine-readable CLI contracts. Source: gaebal-gajae dogfood follow-up for the 01:00 nudge on rebuilt `./rust/target/debug/claw` `74338dc6`; invalid hang PR #2907 was closed after repeated bounded repros returned promptly.
 357. **Top-level `doctor --help --output-format json` exits successfully but emits plain text help instead of JSON** — dogfooded 2026-04-30 for the 01:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `52a909ce`. After rebuilding and verifying the binary provenance, repeated bounded runs of `./rust/target/debug/claw doctor --help --output-format json` exited `0` with `stdout=343` and `stderr=0`, but stdout was plain text (`Doctor`, `Usage`, `Purpose`, `Output`, `Formats`, `Related`) rather than a JSON object. In the same rebuilt binary, `status --help --output-format json` also returned promptly as plain text (#356), confirming a broader help-format fallback class while keeping this pinpoint on the doctor surface. This is distinct from #354/#355 memory/session JSON help/list hangs: doctor help returns promptly, but ignores the requested JSON format. **Required fix shape:** (a) make `doctor --help --output-format json` emit valid stdout JSON with `kind:"help"` or `kind:"doctor"`, `action:"help"`, usage, checks, options, examples, supported output formats, and related slash/direct commands; (b) preserve text help for default/text mode only; (c) add a `format:"json"` or equivalent field so callers can assert the contract without parsing prose; (d) add regression coverage proving doctor help with JSON format parses as JSON and does not silently fall back to plain text. **Why this matters:** doctor is the diagnostic entrypoint users reach for when things are broken. If JSON help falls back to prose, claws cannot discover diagnostic semantics or present structured recovery instructions without scraping formatting-sensitive text. Source: gaebal-gajae dogfood follow-up for the 01:30 nudge on rebuilt `./rust/target/debug/claw` `52a909ce`; invalid hang PR #2911 was closed after repeated bounded repros returned promptly.
 358. **Top-level `cost --help --output-format json` hangs with zero stdout/stderr instead of returning bounded command help JSON** — dogfooded 2026-04-30 for the 02:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `d95b230c`. After rebuilding and verifying the binary provenance, repeated bounded runs of `timeout 8 ./rust/target/debug/claw cost --help --output-format json` exited `124` with `stdout=0` and `stderr=0`. In the same rebuilt binary, `version --output-format json` returned promptly with version/build metadata, proving the binary itself and the JSON output path are reachable; the hang is specific to the cost help path, though other help surfaces have separate known JSON contract issues (#356/#357). **Required fix shape:** (a) make `cost --help --output-format json` return static/bounded stdout JSON with `kind:"help"` or `kind:"cost"`, `action:"help"`, usage, options, examples, supported output formats, and related slash/direct commands; (b) ensure help rendering does not initialize slow cost/session/accounting providers; (c) if any dynamic provider is accidentally consulted, return a typed JSON timeout/unavailable error instead of hanging; (d) add regression coverage proving cost help in JSON mode returns within a deterministic budget. **Why this matters:** cost/tokens surfaces are commonly consumed by automation for budgeting. If even cost help can hang silently, claws cannot discover cost command semantics or present safe budget diagnostics before running potentially slow accounting paths. Source: gaebal-gajae dogfood follow-up for the 02:00 nudge on rebuilt `./rust/target/debug/claw` `d95b230c`.
+380. **Top-level `tokens --help --output-format json` hangs with zero stdout/stderr instead of returning bounded command help JSON** — dogfooded 2026-04-30 for the 02:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `d95b230c`. After verifying #358 covered `cost --help`, a fresh adjacent probe on the token-budget surface showed the same silent failure class: repeated bounded runs of `timeout 8 ./rust/target/debug/claw tokens --help --output-format json` exited `124` with `stdout=0` and `stderr=0`. In the same rebuilt binary, `version --output-format json` returned promptly with version/build metadata, proving the binary itself and JSON output path are reachable. This is distinct from #358's cost help hang: the affected surface is the sibling `tokens` command help, which agents use before estimating prompt/session token budgets. **Required fix shape:** (a) make `tokens --help --output-format json` return static/bounded stdout JSON with `kind:"help"` or `kind:"tokens"`, `action:"help"`, usage, options, examples, supported output formats, and related slash/direct commands; (b) ensure help rendering does not initialize slow token accounting, session, or provider state; (c) if any dynamic provider is consulted, return a typed JSON timeout/unavailable error instead of hanging; (d) add regression coverage proving tokens help in JSON mode returns within a deterministic budget. **Why this matters:** token budgeting is a preflight clawability surface. If help hangs silently, automation cannot safely discover how to inspect or constrain token usage before running expensive prompts, and budget-aware wrappers stall at the discovery step. Source: gaebal-gajae dogfood follow-up for the 02:30 nudge on rebuilt `./rust/target/debug/claw` `d95b230c`.
+381. **Top-level `cache --help --output-format json` hangs with zero stdout/stderr instead of returning bounded command help JSON** — dogfooded 2026-04-30 for the 03:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `d95b230c`. After #358 and #380 landed for the cost/tokens preflight help hangs, a fresh adjacent probe on the cache-control surface showed the same silent failure class: repeated bounded runs of `timeout --kill-after=1s 8s ./rust/target/debug/claw cache --help --output-format json` exited `124` with `stdout=0` and `stderr=0`. In the same rebuilt binary, `version --output-format json` returned promptly with version/build metadata, proving the binary itself and JSON output path are reachable. This is distinct from the separate `/cache` slash-command envelope mismatch class: the affected surface here is top-level `cache` command help, where agents need bounded local discovery before deciding whether to inspect, clear, or summarize cache state. **Required fix shape:** (a) make `cache --help --output-format json` return static/bounded stdout JSON with `kind:"help"` or `kind:"cache"`, `action:"help"`, usage, options, examples, supported output formats, and related slash/direct commands; (b) ensure help rendering does not initialize slow cache/session/provider state; (c) if any dynamic provider is consulted, return a typed JSON timeout/unavailable error instead of hanging; (d) add regression coverage proving cache help in JSON mode returns within a deterministic budget. **Why this matters:** cache inspection and cleanup are recovery/control-plane operations. If cache help hangs silently, claws cannot safely discover cache semantics before attempting cleanup, and automation stalls before it can choose a non-destructive cache action. Source: gaebal-gajae dogfood follow-up for the 03:00 nudge on rebuilt `./rust/target/debug/claw` `d95b230c`.
+
+422. **`export --output-format json` and `--resume latest` report the same "no managed sessions" scenario using two different `kind` codes — `no_managed_sessions` vs `session_load_failed` — making "no session found" undetectable by a single kind-code check** — dogfooded 2026-04-30 KST (UTC+9) by Jobdori on `e939777f`. Running `claw export --output-format json` with no session present returns (on stderr, exit 1): `{"error":"no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"no_managed_sessions","type":"error"}`. Running `claw --resume latest /status --output-format json` with no session present returns (on stderr, exit 1): `{"error":"failed to restore session: no managed sessions found in .claw/sessions/<fingerprint>/","hint":"Start \`claw\` to create a session, then rerun with \`--resume latest\`.\nNote: claw partitions sessions per workspace fingerprint; sessions from other CWDs are invisible.","kind":"session_load_failed","type":"error"}`. Both describe the same root condition — there are no sessions to operate on — but they expose it via different `kind` discriminants. Automation that checks `kind == "no_managed_sessions"` to detect a cold workspace will miss the `--resume` path's `session_load_failed`, and vice versa. A wrapper that guards "run with --resume only if a session exists" must special-case both codes. The hint text is identical between them, suggesting the messages are logically equivalent. Additionally neither code matches the proposed canonical names `session_not_found` / `session_load_failed` as stable `ErrorKind` discriminants described in ROADMAP #77's fix shape, which explicitly proposes typed error-kind codes for session lifecycle failures. **Required fix shape:** (a) unify "no sessions found for this workspace fingerprint" under a single canonical `kind` code — either `no_managed_sessions` or `session_not_found` — used consistently by every command path that encounters an empty session registry; (b) if `session_load_failed` is a more general category (covering e.g. corrupt session files, IO errors, schema version mismatches), it should nest a concrete `reason:"no_managed_sessions"` or `reason:"session_not_found"` sub-field so callers can distinguish "empty registry" from "found but unreadable"; (c) align with the canonical error-kind contract proposed in #77; (d) add regression coverage proving `export` and `--resume latest` in an empty workspace both return an error with the same top-level `kind` code. **Why this matters:** session guard-rails in orchestration need a single stable `kind` to detect cold workspaces without enumerating all possible no-session synonyms. Two divergent codes for the same condition make defensive automation brittle and contradict the promise of machine-readable error envelopes. Source: Jobdori live dogfood, `e939777f`, 2026-04-30 KST (UTC+9).
+
+425. **`status --output-format json` workspace.`changed_files` silently includes untracked files in its count alongside staged and unstaged tracked files — a claw checking `changed_files > 0` to detect "uncommitted changes" will false-positive when only untracked files exist** — dogfooded 2026-04-30 KST (UTC+9) by Jobdori on `e939777f`, traced to source. In a workspace with only one untracked file and no staged/unstaged tracked changes: `claw status --output-format json` returns `{"workspace":{"changed_files":1,"staged_files":0,"unstaged_files":0,"untracked_files":1,"git_state":"dirty · 1 files · 1 untracked",...}}`. `changed_files` is 1 even though `staged_files` and `unstaged_files` are both 0. Source trace: `rust/crates/rusty-claude-cli/src/main.rs:3230-3255` — `parse_git_workspace_summary` increments `changed_files` for every non-header, non-blank line in `git status --short --branch` output before branching on `??` (untracked). The `??` branch increments `untracked_files` then calls `continue`, but `changed_files` was already incremented — so `changed_files = staged_files + unstaged_files + untracked_files` (plus conflicted). Meanwhile `git_state` is computed from `is_clean() = changed_files == 0`, so a repo with only untracked files produces `git_state:"dirty"` and `changed_files:N` while `staged_files == 0 && unstaged_files == 0`. A claw using the idiomatic guard `if workspace.changed_files > 0 { block_commit_or_rebase }` will block even when there are only new untracked files that will not be part of any commit. A claw using `changed_files == 0` as a "clean for commit" signal must additionally check `staged_files == 0 && unstaged_files == 0` — but the field name `changed_files` does not signal this requirement. This is distinct from #125 (git_state:"clean" in non-git dirs): #125 covers a false "clean" in non-git dirs; this covers a false "dirty" in git repos with only untracked files. **Required fix shape:** (a) rename `changed_files` to `total_git_status_entries` or similar to accurately reflect it counts all `git status --short` entries including untracked, OR exclude untracked from `changed_files` and rename the field to `tracked_changed_files`; (b) OR keep `changed_files` but document it explicitly as "includes untracked" in the schema and add a separate `tracked_changed_files = staged_files + unstaged_files` computed field; (c) fix `is_clean()` to also require `untracked_files == 0` if untracked files are meant to be included in the "dirty" signal, or to only use `staged_files + unstaged_files` if they are not; (d) add regression coverage proving that a workspace with only untracked files returns `changed_files > 0` and the documentation makes the semantic explicit. **Why this matters:** `changed_files` is the primary integer gate for "is this workspace clean enough to proceed?" in CI/automation flows. A silent untracked-file false positive causes automation to block valid operations (commits, rebases, PR creation) or fail health checks unnecessarily. Source: Jobdori live dogfood + source trace, `e939777f`, 2026-04-30 KST (UTC+9). Cross-reference: #125 (git_state clean in non-git dir), #400 (status workspace session/session_id mismatch).
Author	SHA1	Message	Date
YeonGyu-Kim	4de868a485	docs(roadmap): fix #425 cross-reference #401 → #400 (status session mismatch)	2026-05-01 04:06:31 +09:00
YeonGyu-Kim	aa182758d5	docs(roadmap): add #425 — changed_files silently includes untracked files, false-positive dirty signal	2026-05-01 03:33:08 +09:00
YeonGyu-Kim	57096b0a1a	docs(roadmap): add no-session kind drift item Adds ROADMAP #422 documenting the concrete export/resume no-session ErrorKind drift.	2026-05-01 02:46:12 +09:00
Bellman	e939777f92	Merge pull request #2921 from ultraworkers/docs/roadmap-381-cache-help-json-hangs docs(roadmap): add #381 — cache help json hangs	2026-04-30 12:09:06 +09:00
Yeachan-Heo	1093e26792	Document cache help JSON hang Constraint: ROADMAP-only dogfood follow-up for 03:00 nudge on rebuilt claw git_sha `d95b230c` Rejected: implementation change to cache help; request was one concrete follow-up if no backlog item Confidence: high after repeated bounded samples plus version JSON responsiveness sanity check Scope-risk: narrow Directive: Continue preflight help hang coverage with distinct cache surface pinpoint Tested: repeated timeout --kill-after=1s 8s ./rust/target/debug/claw cache --help --output-format json; timeout --kill-after=1s 8s ./rust/target/debug/claw version --output-format json; git diff --check; scripts/fmt.sh --check Not-tested: runtime behavior change, because this commit only documents the gap	2026-04-30 03:06:16 +00:00
Bellman	44cca2054d	Merge pull request #2918 from ultraworkers/docs/roadmap-380-tokens-help-json-hangs docs(roadmap): add #380 — tokens help json hangs	2026-04-30 11:38:50 +09:00
Yeachan-Heo	6dc7b26d82	Document tokens help JSON hang Constraint: ROADMAP-only dogfood follow-up for 02:30 nudge on rebuilt claw git_sha `d95b230c` Rejected: implementation change to tokens help; request was one concrete follow-up if no backlog item Confidence: high after repeated bounded samples plus version JSON responsiveness sanity check Scope-risk: narrow Directive: Continue cost/token preflight help coverage with a distinct tokens surface pinpoint Tested: repeated timeout 8 ./rust/target/debug/claw tokens --help --output-format json; timeout 8 ./rust/target/debug/claw version --output-format json; git diff --check; scripts/fmt.sh --check Not-tested: runtime behavior change, because this commit only documents the gap	2026-04-30 02:35:19 +00:00