docs(roadmap): add #330 — resume mode stats/cost always zero

2026-05-13 17:36:44 +00:00 · 2026-04-29 20:02:55 +09:00
parent e7074f47ee
commit 52161c781a
1 changed files with 2 additions and 0 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -6261,3 +6261,5 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
 322. **Config deprecation warnings are emitted to stderr even under `--output-format json`, making JSON output unparseable from combined stdout+stderr capture** — dogfooded 2026-04-29 by Jobdori on current main (`8e22f75`). Running `cargo run --bin claw -- doctor --output-format json 2>&1 | python3 -c "import sys,json; json.loads(sys.stdin.read())"` fails with `Expecting value: line 1 column 1 (char 0)` because a `warning: /path/settings.json: field "enabledPlugins" is deprecated. Use "plugins.enabled" instead` line is emitted to stderr before the JSON body begins. When a caller captures combined output (the common automation pattern: `2>&1`, subprocess `STDOUT | STDERR`, PTY capture, or tmux pane scrape) the warning prefix breaks JSON parse for every downstream consumer. Root cause: `rust/crates/runtime/src/config.rs` line ~300 calls `eprintln!("warning: {warning}")` unconditionally during `ClawSettings::load_merged()` regardless of active output format. **Required fix shape:** (a) thread the active `CliOutputFormat` through the config loading path and suppress or defer human-readable warning strings when `json` mode is active; (b) instead, collect deprecation diagnostics and inject them into the JSON output as a top-level `"warnings": [...]` array (same field already used by `doctor`); (c) ensure the JSON body is always the first bytes on stdout and all prose warnings stay on stderr or are suppressed in json mode; (d) add regression coverage proving `claw <any-cmd> --output-format json` stdout is valid JSON regardless of config deprecation state. **Why this matters:** `--output-format json` is the automation/claw contract; if config warnings can silently corrupt the JSON stream, every orchestration layer that captures combined output gets broken parse-on-warning with no stable fallback. Source: Jobdori live dogfood on mengmotaHost, claw-code main `8e22f75`, 2026-04-29.

 323. **`status --output-format json` reports `session.session = "live-repl"` while simultaneously reporting `session_lifecycle.kind = "saved_only"` — contradictory session identity in a single status snapshot** — dogfooded 2026-04-29 by Jobdori on current main (`804d96b`). Running `claw status --output-format json` from an active REPL-style invocation produced `"session": "live-repl"` in the `workspace` block and `"session_lifecycle": {"kind": "saved_only", "pane_id": null, ...}` in the same object. Those two fields carry contradictory claims: `"live-repl"` asserts there is an active interactive session, while `"saved_only"` asserts there is no live tmux pane hosting the session — the session exists only as a saved artifact. A downstream claw reading this snapshot cannot tell which claim to trust: is this a running session whose pane is undetectable, or a saved-only session that the `session` field is misclassifying? Root cause: `"live-repl"` is a fallback sentinel emitted by `main.rs:6070` when `context.session_path` is `None`, while `session_lifecycle` is computed independently by `classify_session_lifecycle_for()` from tmux pane discovery; the two fields share no common source and can diverge. **Required fix shape:** (a) derive both `session.session` and `session_lifecycle.kind` from the same lifecycle classification result so they cannot diverge; (b) replace the `"live-repl"` free-form sentinel with a structured `session_kind` field (`live_repl`, `saved`, `resume`, etc.) that carries the same type vocabulary as `session_lifecycle.kind`; (c) when `session_lifecycle.kind = "saved_only"`, never emit `"session": "live-repl"` (or vice versa); (d) add a regression test proving `status --output-format json` never emits `session.kind = "live_repl"` and `session_lifecycle.kind = "saved_only"` simultaneously. **Why this matters:** `status --output-format json` is the machine-readable truth surface for session state; if two fields in the same snapshot contradict each other, every lane, monitor, and orchestrator has to pick a winner instead of reading a coherent state. Source: Jobdori live dogfood on mengmotaHost, claw-code `804d96b`, 2026-04-29.
+
+330. **`/stats` and `/cost` in `--resume` mode always return zero token counts regardless of saved session usage** — dogfooded 2026-04-29 by Jobdori on current main (`e7074f4`). Running `claw --output-format json --resume latest /stats` and `claw --output-format json --resume latest /cost` both return `{"input_tokens": 0, "output_tokens": 0, "total_tokens": 0, ...}` even when the session file was created by a real interactive claw run. The same zero-fill is produced by `claw --output-format json status` (usage sub-object). Inspection of the default session path (`~/.claw/sessions/.../session-*.jsonl`) shows the file has only 2 events (`session_meta`, `message`) with no usage/token records embedded. Two candidate root causes: (a) session serialization never writes usage events into the JSONL file even after real prompt exchanges, so resume-mode has no usage to replay; (b) resume-mode stat accumulation reads usage from memory-side counters that are always zero because no API calls happened in the resume-only invocation. Either way, the operator effect is the same: `--resume` + `/stats`/`/cost` cannot report what the session actually consumed. **Required fix shape:** (a) persist per-turn usage records (input/output/cache tokens per exchange) into the session JSONL at write time so resume-mode can reconstruct cumulative counts by replay; (b) expose a `"total_usage"` summary block at the tail of every session JSONL so resume-mode can read it in O(1) without full replay; (c) ensure `/stats` and `/cost` in resume-mode sum the persisted usage, not memory-side live counters; (d) add regression coverage proving `--resume SESSION.jsonl /stats` returns non-zero token counts after a session that made real API calls. **Why this matters:** token usage visibility is critical for quota management and cost attribution; if `--resume` mode always shows zeros, operators and orchestration lanes cannot trust usage data from saved sessions and must rely on external billing dashboards instead of in-band tooling. Source: Jobdori live dogfood on mengmotaHost, claw-code `e7074f4`, 2026-04-29.