diff --git a/ROADMAP.md b/ROADMAP.md index 93db094b..b55bcb71 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -6573,3 +6573,5 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed) 508. **Config/plan-mode writes use direct `std::fs::write` with no atomic rename or advisory lock, so crashes and concurrent tool calls can corrupt `settings.json`, `settings.local.json`, or plan-mode state** — dogfooded 2026-05-20 from the `#clawcode-building-in-public` 19:00 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@e2b96ea`. Code inspection: `execute_config` reads a settings JSON object, mutates it, and persists through `write_json_object`; `EnterPlanMode`/`ExitPlanMode` also call `write_json_object` for `.claw/settings.local.json` and `write_plan_mode_state` for `.claw/tool-state/plan-mode.json`. Both `write_json_object` and `write_plan_mode_state` create parent directories and then call `std::fs::write(path, serde_json::to_string_pretty(...))` directly on the destination. There is no temp file + fsync + rename, and no lock around the read-modify-write window. By contrast, the OAuth credentials writer already uses a safer `.tmp` then rename pattern. Two concurrent `Config`/plan-mode calls can both read the same old document and last-writer-wins one update; a crash/interruption mid-write can leave a truncated JSON settings file that later startup/doctor must diagnose. **Required fix shape:** (a) introduce a shared `atomic_write_json(path, value)` helper using same-directory temp file, flush/fsync, and rename; (b) wrap read-modify-write config mutations in an advisory lock (`settings.json.lock` / `settings.local.json.lock`) or a compare-and-swap retry loop; (c) use the same helper for `write_plan_mode_state`, `write_agent_manifest` where appropriate, and any future JSON state files; (d) add stress/regression tests with two concurrent config writes and a simulated partial-write failure proving no malformed JSON and no lost sibling setting. **Why this matters:** config and plan-mode are control-plane state. A supposedly safe tool call should not be able to silently lose another setting or leave the workspace unable to load settings after an interrupted write; that turns a small config action into startup friction and stale permission-mode confusion. Source: gaebal-gajae dogfood response to Clawhip message `1506733276037910700` on 2026-05-20. 509. **`write_file`/`edit_file` tool results echo full file contents (`content`, `original_file`) even though they also compute structured patches, so large file edits can double-return megabytes of text into the model context** — dogfooded 2026-05-20 from the `#clawcode-building-in-public` 19:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@8c62fff`. Code inspection: `runtime/src/file_ops.rs::write_file` reads the prior file (if any), writes the new content, then returns `WriteFileOutput { content: content.to_owned(), original_file, structured_patch: make_patch(...) }`. `edit_file` similarly reads the full original file, computes `updated`, writes it, and returns `EditFileOutput { old_string, new_string, original_file: original_file.clone(), structured_patch: make_patch(&original_file, &updated), ... }`. The tool already has a structured patch/diff, but the serialized result still includes full pre/post content fields. Updating a 1MB file can return roughly 1MB `content` plus 1MB `original_file` plus patch metadata; a tiny `edit_file` change on a large file returns the entire original file even when a short diff would suffice. This is the file-edit sibling of the NotebookEdit/TodoWrite/REPL output-amplification cluster. **Required fix shape:** (a) make write/edit results patch-first and omit full `content`/`original_file` by default; (b) include bounded previews plus `original_bytes`, `new_bytes`, `content_truncated`, and `original_truncated` metadata when useful; (c) expose an explicit debug/full-output flag only for small files or trusted callers; (d) add regressions for editing/writing a large file proving serialized tool output remains bounded while the structured patch still identifies the change. **Why this matters:** file editing is the core coding surface. Returning full file bodies after every update wastes context, raises costs, and can force compaction precisely during code-review/debug loops where the model only needs a concise diff and path/byte metadata. Source: gaebal-gajae dogfood response to Clawhip message `1506740829912567824` on 2026-05-20. + +510. **`read_file` with no `limit` can return the entire 10MB text file into the model context, because the file-size guard is a disk-read cap, not an output budget** — dogfooded 2026-05-20 from the `#clawcode-building-in-public` 20:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@154e7ed`. Code inspection: `runtime/src/file_ops.rs::read_file` rejects files larger than `MAX_READ_SIZE` (10MB) and binary-looking files, then reads the entire file via `fs::read_to_string`, splits into `lines`, and when `limit` is absent sets `end_index = lines.len()`. The serialized `ReadFileOutput.file.content` is therefore the full selected content; for any text file at or below 10MB, the default read emits all of it. `limit` is line-based and optional, with no byte/token cap, no `content_truncated` metadata, and no default windowing. This is distinct from #509 write/edit amplification: a read-only exploratory call can still inject megabytes into context by omitting `limit`, even though most callers need the first window plus total metadata. **Required fix shape:** (a) add a default output byte/line cap for `read_file` (for example first 200 lines / 64KB) unless the caller explicitly requests a bounded range; (b) enforce a hard serialized-output byte cap even when `limit` is huge; (c) return `truncated`, `total_lines`, `selected_start_line`, `selected_end_line`, and `total_bytes` so callers can page intentionally; (d) add regressions for 1MB and 10MB text files proving default read output is bounded and explicit paging works without exceeding the cap. **Why this matters:** `read_file` is allowed in read-only mode and is the first tool a claw uses during debugging. A single accidental full-file read of a generated JSON/log/source bundle can consume the context window and force compaction before any useful analysis happens. Source: gaebal-gajae dogfood response to Clawhip message `1506755925225111724` on 2026-05-20.