docs(roadmap): add grep search pre-limit scan gap

2026-05-21 21:26:45 +00:00 · 2026-05-20 21:00:44 +00:00
parent 02f288724f
commit 0864f39512
1 changed files with 2 additions and 0 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -6575,3 +6575,5 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
 509. **`write_file`/`edit_file` tool results echo full file contents (`content`, `original_file`) even though they also compute structured patches, so large file edits can double-return megabytes of text into the model context** — dogfooded 2026-05-20 from the `#clawcode-building-in-public` 19:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@8c62fff`. Code inspection: `runtime/src/file_ops.rs::write_file` reads the prior file (if any), writes the new content, then returns `WriteFileOutput { content: content.to_owned(), original_file, structured_patch: make_patch(...) }`. `edit_file` similarly reads the full original file, computes `updated`, writes it, and returns `EditFileOutput { old_string, new_string, original_file: original_file.clone(), structured_patch: make_patch(&original_file, &updated), ... }`. The tool already has a structured patch/diff, but the serialized result still includes full pre/post content fields. Updating a 1MB file can return roughly 1MB `content` plus 1MB `original_file` plus patch metadata; a tiny `edit_file` change on a large file returns the entire original file even when a short diff would suffice. This is the file-edit sibling of the NotebookEdit/TodoWrite/REPL output-amplification cluster. **Required fix shape:** (a) make write/edit results patch-first and omit full `content`/`original_file` by default; (b) include bounded previews plus `original_bytes`, `new_bytes`, `content_truncated`, and `original_truncated` metadata when useful; (c) expose an explicit debug/full-output flag only for small files or trusted callers; (d) add regressions for editing/writing a large file proving serialized tool output remains bounded while the structured patch still identifies the change. **Why this matters:** file editing is the core coding surface. Returning full file bodies after every update wastes context, raises costs, and can force compaction precisely during code-review/debug loops where the model only needs a concise diff and path/byte metadata. Source: gaebal-gajae dogfood response to Clawhip message `1506740829912567824` on 2026-05-20.

 510. **`read_file` with no `limit` can return the entire 10MB text file into the model context, because the file-size guard is a disk-read cap, not an output budget** — dogfooded 2026-05-20 from the `#clawcode-building-in-public` 20:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@154e7ed`. Code inspection: `runtime/src/file_ops.rs::read_file` rejects files larger than `MAX_READ_SIZE` (10MB) and binary-looking files, then reads the entire file via `fs::read_to_string`, splits into `lines`, and when `limit` is absent sets `end_index = lines.len()`. The serialized `ReadFileOutput.file.content` is therefore the full selected content; for any text file at or below 10MB, the default read emits all of it. `limit` is line-based and optional, with no byte/token cap, no `content_truncated` metadata, and no default windowing. This is distinct from #509 write/edit amplification: a read-only exploratory call can still inject megabytes into context by omitting `limit`, even though most callers need the first window plus total metadata. **Required fix shape:** (a) add a default output byte/line cap for `read_file` (for example first 200 lines / 64KB) unless the caller explicitly requests a bounded range; (b) enforce a hard serialized-output byte cap even when `limit` is huge; (c) return `truncated`, `total_lines`, `selected_start_line`, `selected_end_line`, and `total_bytes` so callers can page intentionally; (d) add regressions for 1MB and 10MB text files proving default read output is bounded and explicit paging works without exceeding the cap. **Why this matters:** `read_file` is allowed in read-only mode and is the first tool a claw uses during debugging. A single accidental full-file read of a generated JSON/log/source bundle can consume the context window and force compaction before any useful analysis happens. Source: gaebal-gajae dogfood response to Clawhip message `1506755925225111724` on 2026-05-20.
+
+511. **`grep_search` collects every file and every matching content line before applying `head_limit`, so a small requested result can still scan/read/store an unbounded workspace worth of data** — dogfooded 2026-05-20 from the `#clawcode-building-in-public` 21:00 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@02f2887`. Code inspection: `grep_search_impl` first calls `collect_search_files(&base_path)?`, which walks the entire tree into a `Vec<PathBuf>` with no ignored-dir policy, file-count cap, or early stop. For every candidate it then does `fs::read_to_string(&file_path)` with no per-file size guard (unlike `read_file`'s 10MB max) and, for `output_mode == "content"`, pushes every matched/context line into `content_lines`. Only after the full traversal does it call `apply_limit(filenames, input.head_limit, input.offset)` and later `apply_limit(content_lines, head_limit, offset)`. The default limit is 250 output items, but it is not an execution budget: a repo with huge generated text files or thousands of matches still pays full read/regex/memory cost before returning 250 lines. **Required fix shape:** (a) stream search results and stop once `offset + head_limit` content lines/files have been collected, while continuing only if `count` mode explicitly needs totals; (b) add skip dirs/file-size guards shared with `glob_search` (`.git`, `node_modules`, `target`, etc.) and binary detection; (c) expose `truncated:true`, `files_scanned`, `files_skipped_size`, and `matches_seen` metadata; (d) add regression fixtures with a huge file and many matches proving `head_limit:1` does not read/accumulate the entire workspace. **Why this matters:** grep is a read-only diagnostic primitive. `head_limit` currently protects only the final JSON size, not runtime CPU/memory or accidental context blowups, so common searches in generated/vendor-heavy repos can look like tool hangs even when the caller asked for one line. Source: gaebal-gajae dogfood response to Clawhip message `1506763474955403414` on 2026-05-20.
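
Illustrative sketch for fix shape (a) of #511, not the project's code: one way the early stop could look if the search consumed files lazily instead of collecting them first. `BoundedSearch` and `bounded_search` are hypothetical names; the fields `truncated`, `files_scanned`, and `matches_seen` mirror the metadata proposed in the entry. Two simplifying assumptions: plain substring matching stands in for the regex, and a lazy iterator of `(name, reader)` pairs stands in for the directory walk. Here `truncated` means only that the scan stopped at the budget, not that further matches are known to exist.

```rust
use std::io::BufRead;

/// Hypothetical result shape; field names follow the metadata proposed in #511.
#[derive(Debug, Default, PartialEq)]
pub struct BoundedSearch {
    pub lines: Vec<String>,
    pub files_scanned: usize,
    pub matches_seen: usize,
    /// True when the scan stopped at the budget before exhausting the input.
    pub truncated: bool,
}

/// Scans `files` lazily and stops once `offset + head_limit` matches were seen.
/// Substring matching is an assumption standing in for the real regex.
pub fn bounded_search<I, R>(
    files: I,
    needle: &str,
    offset: usize,
    head_limit: usize,
) -> BoundedSearch
where
    I: IntoIterator<Item = (String, R)>,
    R: BufRead,
{
    let budget = offset.saturating_add(head_limit);
    let mut out = BoundedSearch::default();
    if budget == 0 {
        // Nothing was requested, so no file is opened at all.
        out.truncated = true;
        return out;
    }
    'files: for (name, reader) in files {
        out.files_scanned += 1;
        // Lines are read one at a time, so a huge file is never held in memory.
        for (idx, line) in reader.lines().enumerate() {
            // An I/O or UTF-8 error abandons the rest of this file only.
            let Ok(line) = line else { continue 'files };
            if !line.contains(needle) {
                continue;
            }
            out.matches_seen += 1;
            if out.matches_seen > offset {
                out.lines.push(format!("{name}:{}:{line}", idx + 1));
            }
            if out.matches_seen >= budget {
                // Budget reached: stop reading this file and all later ones.
                out.truncated = true;
                break 'files;
            }
        }
    }
    out
}
```

Because the file source is an iterator, a caller asking for `head_limit: 1` opens only as many files as it takes to find the first match. A `count` mode that needs totals would have to bypass the early `break`.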