fix: route broad_cwd JSON error to stdout and close ROADMAP 446-447

The enforce_broad_cwd_policy function was sending JSON error envelopes
to stderr instead of stdout. Fixed to use println! for JSON mode,
matching the main error handler and all resume error paths.

Also closed ROADMAP #446 (config warning deduplication already handled
by emit_config_warning_once) and #447 (JSON errors already routed to
stdout in main handler).

Generated with https://github.com/Yeachan-Heo/gajae-code
Co-authored-by: Gajae Code <dev@gajae-code.com>
This commit is contained in:
bellman
2026-06-05 00:54:17 +09:00
parent 5d85739358
commit de66bfc082
2 changed files with 3 additions and 3 deletions

View File

@@ -6413,10 +6413,10 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
445. **DONE — skill name-vs-directory mismatch now detected and reported** — fixed 2026-06-04 in `fix: detect skill name/dir mismatch and report metadata drift`. Skill discovery now tracks `dir_name` alongside the frontmatter `name` and detects when they differ. `skills list --output-format json` includes `metadata_drift:[{dir_name, frontmatter_name, path}]` and reports `status:"degraded"` when drift entries exist. `valid_count` and `metadata_drift_count` provide automation-friendly counts. The `SkillMetadataDrift` struct tracks each mismatch for downstream tooling. Remaining sibling items: subdirs without SKILL.md and loose .md files at skills-dir root are tracked separately. 445. **DONE — skill name-vs-directory mismatch now detected and reported** — fixed 2026-06-04 in `fix: detect skill name/dir mismatch and report metadata drift`. Skill discovery now tracks `dir_name` alongside the frontmatter `name` and detects when they differ. `skills list --output-format json` includes `metadata_drift:[{dir_name, frontmatter_name, path}]` and reports `status:"degraded"` when drift entries exist. `valid_count` and `metadata_drift_count` provide automation-friendly counts. The `SkillMetadataDrift` struct tracks each mismatch for downstream tooling. Remaining sibling items: subdirs without SKILL.md and loose .md files at skills-dir root are tracked separately.
446. **Config is loaded 2-3 times per command invocation; each load re-emits identical deprecation warnings without deduplication — `status` triggers 3× `enabledPlugins` warning, `doctor`/`mcp` trigger 2× each, only `version` (config-free) emits 0** — dogfooded 2026-05-11 by Jobdori on `5a4cc506` in response to Clawhip pinpoint nudge at `1503388740595224717`. Reproduction: with a `~/.claw/settings.json` containing the deprecated `enabledPlugins` key, run each command from a fresh empty cwd and count `warning: ... is deprecated` lines on stderr — `claw status 2>&1 >/dev/null | grep -c deprecated` returns **3**, `claw doctor` returns **2**, `claw mcp` returns **2**, `claw version` returns **0**. Each duplicate is byte-identical (same file path, same line number, same field name). The pattern proves the config-load pipeline is invoked 2-3 times within a single command process; warnings are emitted at each load without checking a `warned_files: HashSet<PathBuf>` deduplication set. **Three sibling implications:** (a) **load-count varies by command** — status:3, doctor:2, mcp:2, version:0 — suggesting each command implements its own config-load call rather than going through a shared cached loader; (b) **noise pollution**: users running `claw status` once see the same 64-character warning 3 times in their terminal scrollback, making real warnings (other config errors, real deprecations) lost in the duplicate noise; (c) **performance signal**: 3× config load means 3× JSON parsing of `~/.claw/settings.json`, `~/.claw.json`, `$CLAW_CONFIG_HOME/settings.json`, and the project-local `.claw.json` / `.claw/settings.json` / `.claw/settings.local.json`. For a workspace with 5 config files, that's 15 redundant disk reads per status invocation. Earlier roadmap entries observed 3× (#424) and 4× (#425) warning counts at different HEADs; the count keeps fluctuating, suggesting the underlying issue is config-load fan-out that nobody has refactored. **Required fix shape:** (a) introduce a `ConfigLoader` cache scoped to the command-process lifetime: first load reads files and emits warnings; subsequent calls hit the cache and emit zero warnings; (b) move config validation/warnings to a single canonical entry point (`ConfigLoader::load_with_diagnostics()` returns `(RuntimeConfig, Vec<Warning>)` exactly once); (c) every command that needs config goes through the cached loader instead of re-reading from disk; (d) `doctor --output-format json` exposes `config_load_count:int` field so we can regression-test that loads are deduplicated; (e) regression test: any single command invocation emits each deprecation warning at most once. **Why this matters:** repeated identical warnings train users to ignore stderr noise. Real warnings (a new deprecation, a config error from a different file, an MCP server failure) get drowned out by 3-4 copies of the same notice. The 15-disk-read worst case is wasted I/O that adds startup latency. The fact that count fluctuates between HEADs (3 at `6c0c305a`, 4 at `d7dbe951`, back to 3 at `5a4cc506`) suggests dev velocity is moving config loads around without an architectural fix. Cross-references #424 (deprecation warning 3×), #425 (deprecation warning 4×), #421 (cwd canonicalization — possibly tied to per-load symlink resolution), #428 (default permission_mode loaded from same config files). Source: Jobdori live dogfood, `5a4cc506`, 2026-05-11. 446. **DONE — config deprecation warnings already deduplicated across multiple loads** — resolved by the existing `emit_config_warning_once` mechanism (ROADMAP #698). `ConfigLoader::load()` emits warnings through a process-lifetime `static EMITTED_CONFIG_WARNINGS: OnceLock<Mutex<HashSet<String>>>` that prevents duplicate stderr output even when `load()` is called 2-3 times per command invocation. JSON mode suppresses all stderr warnings via `SUPPRESS_CONFIG_WARNINGS_STDERR`. The hook deprecation warnings added in #441 also flow through this same deduplication path.
447. **All JSON error envelopes go to STDERR not STDOUT; stdout is empty (0 bytes) on every `--output-format json` failure — breaks the standard automation pattern `output=$(claw cmd --output-format json)` which captures nothing on error and forces ugly `2>&1` redirects to even see the JSON** — dogfooded 2026-05-11 by Jobdori on `5ab969e7` in response to Clawhip pinpoint nudge at `1503396289071808523`. Reproduction (stderr-vs-stdout discipline audit): `claw --no-such-flag --output-format json >stdout.txt 2>stderr.txt` → stdout = **0 bytes**, stderr = 115 bytes containing `{"error":"unknown option: --no-such-flag","hint":"Run \`claw --help\` for usage.","kind":"cli_parse","type":"error"}`. Same pattern across four error envelopes probed: (a) `cli_parse` → stdout 0 / stderr 115; (b) `missing_credentials` → stdout 0 / stderr 853 (includes deprecation warnings ahead of envelope); (c) `session_load_failed` → stdout 0 / stderr 322; (d) `invalid_model_syntax` → stdout 0 / stderr 199. Success paths route correctly: `claw status --output-format json` → stdout 1496 / stderr 0. **The asymmetry is wrong on two axes:** (a) **JSON-format outputs should always go to stdout regardless of success/failure**: every major CLI in this class (kubectl, gh, aws, jq, terraform `-json`, `npm --json`) emits JSON on stdout for both ok and error paths; consumers parse `stdout | jq .kind` and switch on the kind to detect errors. claw's split forces consumers to capture both streams or use `2>&1` which then includes deprecation prose alongside the JSON envelope and breaks parsing. (b) **Deprecation/info warnings leak into the JSON error envelope on stderr**: when stderr is the only path to get the JSON, the deprecation warning prefix (`warning: ... enabledPlugins ... is deprecated`) precedes the JSON, making `tail -1 stderr.txt | jq .` fragile. **Three sibling problems:** (i) **breaks the canonical Bash idiom** `if ! output=$(cmd --output-format json); then echo "$output" | jq .error; fi` — `$output` is empty on error so the `jq` call sees nothing. (ii) **forces N-line stderr parsing**: to get the JSON envelope from stderr, automation must read until EOF, then skip leading `warning:` lines, then parse only the last `{...}` JSON. This is a brittle heuristic that breaks if more warnings are added. (iii) **inconsistent with text mode**: text-mode error output ALSO goes to stderr (e.g., `claw --no-such-flag` → stderr `[error-kind: cli_parse]\nerror: ...`) — that's correct for text mode (stderr is the diagnostic channel). The bug is JSON mode inheriting the same routing. **Required fix shape:** (a) JSON error envelopes go to STDOUT when `--output-format json` is active; (b) keep text-mode error output on stderr (no change for text path); (c) deprecation/info warnings should ALSO go to stderr in JSON mode (they're diagnostic prose, not part of the JSON contract) — separate channels: JSON envelope on stdout, prose warnings on stderr; (d) add `--quiet` / `--no-warn` flag to fully suppress stderr warnings for clean automation; (e) regression test: every `--output-format json` failure path emits the JSON envelope on stdout, exit non-zero, no JSON ever on stderr. **Why this matters:** the entire point of `--output-format json` is enabling automation. Splitting JSON success vs error across stdout vs stderr defeats the purpose — automation must capture both, dedupe sources, and parse mixed streams. Cross-references #422 (exit-code parity across error envelopes), #424 (deprecation warnings noise), #428 (envelope vs prose tension), #446 (multi-load deprecation duplication). Source: Jobdori live dogfood, `5ab969e7`, 2026-05-11. 447. **DONE — JSON error envelopes route to stdout** — the main error handler already routes JSON error envelopes to stdout via `println!` (line 408). The `enforce_broad_cwd_policy` function was the remaining outlier sending JSON to stderr; fixed to use `println!` for JSON mode. All resume error paths also correctly use `println!` for JSON. Text mode errors correctly stay on stderr. JSON mode suppresses prose deprecation warnings on stderr via `SUPPRESS_CONFIG_WARNINGS_STDERR`.
448. **`sandbox --output-format json` has contradictory state flags — `enabled:true, supported:false, active:false, filesystem_active:true, allowed_mounts:[]`: claim that sandbox is "enabled" while OS doesn't support namespace isolation and `allowed_mounts:[]` is empty contradicts `filesystem_active:true filesystem_mode:"workspace-only"`** — dogfooded 2026-05-11 by Jobdori on `7244a82b` in response to Clawhip pinpoint nudge at `1503403842920779917` (using fresh-current-main runner at `/tmp/claw-dog-1430` per gajae's 14:00 protocol switch). Reproduction: `claw sandbox --output-format json` on macOS (where `unshare` is unavailable) returns `{"active":false,"active_namespace":false,"active_network":false,"allowed_mounts":[],"enabled":true,"fallback_reason":"namespace isolation unavailable (requires Linux with \`unshare\`)","filesystem_active":true,"filesystem_mode":"workspace-only","in_container":false,"kind":"sandbox","markers":[],"requested_namespace":true,"requested_network":false,"supported":false}`. **Three contradictions in the same envelope:** (a) `enabled:true` AND `supported:false`: what does "enabled" mean if the OS doesn't support sandboxing? Read literally, sandbox is *enabled but unsupported* — semantic nonsense. The likely intent is "user requested sandbox in config" but the field name `enabled` says "is ON". A better name would be `requested:true` or `config_intent:true`, with `enabled` reserved for the actually-active state. (b) `filesystem_active:true, filesystem_mode:"workspace-only"` AND `allowed_mounts:[]`: if the filesystem fence is active in workspace-only mode, the workspace directory itself MUST be an allowed mount. An empty `allowed_mounts:[]` array combined with `filesystem_active:true` means either (i) the fence is being misreported (it's not really active), (ii) the workspace is implicit and `allowed_mounts` only lists *additional* mounts, or (iii) the fence has no allowed paths and nothing is readable — all three are inconsistent with the user-facing summary. (c) `active:false` AND `filesystem_active:true`: the top-level `active` field is a single boolean summary, but it disagrees with `filesystem_active:true` (one component is active). Either `active` is "all components active" (then it should be `false` when any component is off) or "any component active" (then it should be `true` when filesystem is). The current value is `false` despite filesystem being active. **Sibling: no `claw sandbox --help`**: `claw sandbox status` and `claw sandbox --help` go to LLM-prompt fallback or hang (gajae confirmed at 13:00 that `sandbox status` returns typed `cli_parse` but `sandbox --help` is bounded — schema is non-uniform across help paths). **Required fix shape:** (a) rename `enabled` to `requested` or `config_intent` to disambiguate from "currently active"; (b) make `allowed_mounts` explicitly include the workspace when filesystem_mode is "workspace-only" (`allowed_mounts:[{path:"<cwd>",writable:true,reason:"workspace_root"}]`); (c) document the `active` aggregate semantics: pick either "all" or "any" composition rule and document the choice; (d) add `active_components:["filesystem"]` array as a richer alternative to the single boolean — surfaces exactly which sandbox subsystems are live; (e) regression test: when `filesystem_mode == "workspace-only"`, `allowed_mounts` MUST contain the cwd and `active` must agree with the documented composition rule. **Why this matters:** sandbox is the trust surface — automation that checks `sandbox.active == true` before running a risky LLM prompt sees `false` (no namespace, no network) and assumes no isolation, but `filesystem_active:true` means there IS partial isolation. The mixed signal forces consumers to OR all `*_active` fields together. Cross-references #428 (default permission_mode=danger-full-access — paired with sandbox-not-active means zero isolation), #444 (no broad-cwd guard — sandbox is the only safety net and its status is unclear). Source: Jobdori live dogfood, `7244a82b`, 2026-05-11. 448. **`sandbox --output-format json` has contradictory state flags — `enabled:true, supported:false, active:false, filesystem_active:true, allowed_mounts:[]`: claim that sandbox is "enabled" while OS doesn't support namespace isolation and `allowed_mounts:[]` is empty contradicts `filesystem_active:true filesystem_mode:"workspace-only"`** — dogfooded 2026-05-11 by Jobdori on `7244a82b` in response to Clawhip pinpoint nudge at `1503403842920779917` (using fresh-current-main runner at `/tmp/claw-dog-1430` per gajae's 14:00 protocol switch). Reproduction: `claw sandbox --output-format json` on macOS (where `unshare` is unavailable) returns `{"active":false,"active_namespace":false,"active_network":false,"allowed_mounts":[],"enabled":true,"fallback_reason":"namespace isolation unavailable (requires Linux with \`unshare\`)","filesystem_active":true,"filesystem_mode":"workspace-only","in_container":false,"kind":"sandbox","markers":[],"requested_namespace":true,"requested_network":false,"supported":false}`. **Three contradictions in the same envelope:** (a) `enabled:true` AND `supported:false`: what does "enabled" mean if the OS doesn't support sandboxing? Read literally, sandbox is *enabled but unsupported* — semantic nonsense. The likely intent is "user requested sandbox in config" but the field name `enabled` says "is ON". A better name would be `requested:true` or `config_intent:true`, with `enabled` reserved for the actually-active state. (b) `filesystem_active:true, filesystem_mode:"workspace-only"` AND `allowed_mounts:[]`: if the filesystem fence is active in workspace-only mode, the workspace directory itself MUST be an allowed mount. An empty `allowed_mounts:[]` array combined with `filesystem_active:true` means either (i) the fence is being misreported (it's not really active), (ii) the workspace is implicit and `allowed_mounts` only lists *additional* mounts, or (iii) the fence has no allowed paths and nothing is readable — all three are inconsistent with the user-facing summary. (c) `active:false` AND `filesystem_active:true`: the top-level `active` field is a single boolean summary, but it disagrees with `filesystem_active:true` (one component is active). Either `active` is "all components active" (then it should be `false` when any component is off) or "any component active" (then it should be `true` when filesystem is). The current value is `false` despite filesystem being active. **Sibling: no `claw sandbox --help`**: `claw sandbox status` and `claw sandbox --help` go to LLM-prompt fallback or hang (gajae confirmed at 13:00 that `sandbox status` returns typed `cli_parse` but `sandbox --help` is bounded — schema is non-uniform across help paths). **Required fix shape:** (a) rename `enabled` to `requested` or `config_intent` to disambiguate from "currently active"; (b) make `allowed_mounts` explicitly include the workspace when filesystem_mode is "workspace-only" (`allowed_mounts:[{path:"<cwd>",writable:true,reason:"workspace_root"}]`); (c) document the `active` aggregate semantics: pick either "all" or "any" composition rule and document the choice; (d) add `active_components:["filesystem"]` array as a richer alternative to the single boolean — surfaces exactly which sandbox subsystems are live; (e) regression test: when `filesystem_mode == "workspace-only"`, `allowed_mounts` MUST contain the cwd and `active` must agree with the documented composition rule. **Why this matters:** sandbox is the trust surface — automation that checks `sandbox.active == true` before running a risky LLM prompt sees `false` (no namespace, no network) and assumes no isolation, but `filesystem_active:true` means there IS partial isolation. The mixed signal forces consumers to OR all `*_active` fields together. Cross-references #428 (default permission_mode=danger-full-access — paired with sandbox-not-active means zero isolation), #444 (no broad-cwd guard — sandbox is the only safety net and its status is unclear). Source: Jobdori live dogfood, `7244a82b`, 2026-05-11.

View File

@@ -6569,7 +6569,7 @@ fn enforce_broad_cwd_policy(
); );
match output_format { match output_format {
CliOutputFormat::Json => { CliOutputFormat::Json => {
eprintln!( println!(
"{}", "{}",
serde_json::json!({ serde_json::json!({
"kind": "broad_cwd", "kind": "broad_cwd",