docs(roadmap): add binary overwrite misreported create

This commit is contained in:
Yeachan-Heo
2026-05-20 22:30:53 +00:00
parent ea29fbeb44
commit 07a12d4cf3

View File

@@ -6581,3 +6581,5 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
512. **`glob_search` traverses and stores all matches before truncating to 100, then sorts them by metadata, so `truncated:true` still means the full workspace scan already happened** — dogfooded 2026-05-20 from the `#clawcode-building-in-public` 21:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@0864f39`. Code inspection: `glob_search_impl` expands brace patterns, derives a walk root, then runs `WalkDir::new(&walk_root).filter_entry(...)` and pushes every matching file into `matches`. Only after all patterns and all entries are exhausted does it sort the entire `matches` vector by `fs::metadata(path).modified()` and compute `truncated = matches.len() > 100`, then `take(100)` for output. The ignored-dir list helps common vendor dirs, but there is no max traversal count, max match count, timeout/deadline, or early stop once enough results are known. A broad pattern like `**/*` in a generated workspace can collect/sort/stat tens or hundreds of thousands of paths just to return 100 names. **Required fix shape:** (a) add execution budgets for glob traversal (`max_entries_scanned`, `max_matches_collected`, optional deadline); (b) stream/top-k results instead of collecting every match before truncation, or make `sort_by_mtime` opt-in when exact newest-100 is required; (c) return `entries_scanned`, `matches_seen`, `truncated_reason`, and `ignored_dirs` metadata; (d) share traversal budget primitives with `grep_search` so read-only discovery tools fail/degrade consistently; (e) add a regression with >1000 generated files proving `glob_search` returns promptly without storing/sorting every match when capped. **Why this matters:** glob is a safe-looking read-only discovery tool, but broad globs in large repos are a common source of startup friction and apparent hangs. Output truncation alone is not enough; the work done before truncation must also be bounded and observable. Source: gaebal-gajae dogfood response to Clawhip message `1506771028834123986` on 2026-05-20.
513. **`make_patch` is not a diff hunk generator; it emits the entire old file as removed lines plus the entire new file as added lines, so `structured_patch` itself can double the output size for every write/edit** — dogfooded 2026-05-20 from the `#clawcode-building-in-public` 22:00 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@b3b9eb2`. Code inspection: `runtime/src/file_ops.rs::make_patch(original, updated)` builds one `StructuredPatchHunk` by iterating `for line in original.lines() { lines.push(format!("-{line}")); }` and then `for line in updated.lines() { lines.push(format!("+{line}")); }`; `old_lines` and `new_lines` are the full line counts. This means the supposedly structured patch is a whole-file delete+whole-file add, not a minimal/contextual diff. Combined with #509's full `content`/`original_file` fields, editing a 1MB file can return: full original file, full new file, and a `structured_patch` containing full original+full new again. Even after removing raw content fields, `structured_patch` would still be an output-amplification bug unless it becomes a real bounded diff. **Required fix shape:** (a) replace `make_patch` with a real line diff/hunk algorithm that emits only changed hunks plus configurable context; (b) cap patch lines/bytes with `patch_truncated`, `omitted_hunks`, and full old/new byte counts; (c) for full-file rewrites, return a summary (`rewrite:true`, changed line counts, previews) rather than every line; (d) add regressions for one-line edit in a 10k-line file proving patch output is O(changed lines + context), not O(file size). **Why this matters:** structured patches are the right contract for coding tools, but a whole-file pseudo-patch creates the same context-window blowup as raw file echoes while looking machine-friendly. Review/debug loops need concise, truthful diffs. Source: gaebal-gajae dogfood response to Clawhip message `1506778574143754422` on 2026-05-20.
514. **`write_file` silently overwrites existing binary/non-UTF8 files while reporting them as creates because the previous-file read uses `fs::read_to_string(...).ok()` and drops decode/errors** — dogfooded 2026-05-20 from the `#clawcode-building-in-public` 22:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@ea29fbe`. Code inspection: `runtime/src/file_ops.rs::write_file` enforces only the new content byte length, resolves the destination, then does `let original_file = fs::read_to_string(&absolute_path).ok();`. Any read error — file missing, permission denied, directory race, or existing binary/non-UTF8 content — is collapsed to `None`. The function then creates parents and `fs::write(&absolute_path, content)?`. Because `kind` is computed as `if original_file.is_some() { "update" } else { "create" }`, overwriting a binary file is reported as a create with no warning and no binary guard. `read_file` has binary detection; `write_file` does not apply it before clobbering existing files. **Required fix shape:** (a) distinguish `NotFound` from other read/decode errors instead of using `.ok()`; (b) if the destination exists and is not valid UTF-8 or appears binary, require an explicit `overwrite_binary:true` / `force:true` flag or return `kind:"binary_overwrite_refused"`; (c) report `existed:true` based on metadata, not successful UTF-8 decoding; (d) preserve structured diff only for text files; (e) add regressions for overwriting a binary file and a permission-denied/unreadable file proving the tool does not silently treat them as creates. **Why this matters:** write tools are allowed to create/update source artifacts, but silently clobbering a binary asset or unreadable file is data-lossy and misleading. The model/operator needs to know whether a file existed and whether a safe text diff is possible before replacement. Source: gaebal-gajae dogfood response to Clawhip message `1506786124176293919` on 2026-05-20.