docs(roadmap): add glob brace fanout gap

2026-05-22 13:46:44 +00:00 · 2026-05-22 10:30:50 +00:00
parent ab5754a524
commit 571b4c2cd1
1 changed files with 2 additions and 0 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -6703,3 +6703,5 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
 574. **Kimi compatibility helpers only strip the `kimi/` routing prefix, so documented `dashscope/kimi-*` and `moonshot/kimi-*` slugs can leak provider prefixes onto the wire** — dogfooded 2026-05-22 from the `#clawcode-building-in-public` 09:00 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@c1bb355`. Active tmux session at probe time: `omx-issue-2462-madmax-lock-diagnostic`; no active claw-code implementation session. Channel context included Jobdori's reasoning-history #581 finding in `openai_compat.rs`; this probe inspected the same provider file for another Kimi/OpenAI-compatible model-routing contract gap. Code inspection: `model_rejects_is_error_field` intentionally recognizes `dashscope/kimi-k2.5` and `moonshot/kimi-k2.5` by stripping any prefix with `rsplit('/')`, and tests assert those slugs reject `is_error`. But `wire_model_for_base_url` / `strip_routing_prefix` only strip prefixes matching `openai|xai|grok|qwen|kimi`; `dashscope` and `moonshot` are not included. Therefore a request model like `dashscope/kimi-k2.5` can be treated as Kimi for tool-result compatibility while the serialized `model` sent to a DashScope/Moonshot-compatible endpoint remains `dashscope/kimi-k2.5` instead of the expected `kimi-k2.5`. Existing tests cover `strip_routing_prefix("kimi/kimi-k2.5")` but not `dashscope/kimi-k2.5` or `moonshot/kimi-k2.5`, despite those exact prefixes being listed in Kimi compatibility tests. **Required fix shape:** (a) decide the supported routing prefixes for Kimi/DashScope/Moonshot and use one shared prefix-strip helper for compatibility checks and wire model serialization; (b) add `dashscope` and `moonshot` where appropriate, or reject those prefixed slugs early with a typed configuration error; (c) add tests proving `wire_model_for_base_url` and `strip_routing_prefix` produce `kimi-k2.5` for every documented Kimi prefix; (d) keep OpenRouter/non-default OpenAI base-url slash-slug preservation semantics intact; (e) update docs/config examples so model slugs and provider routing prefixes are unambiguous. **Why this matters:** provider compatibility logic and wire serialization must agree on the model identity. If one path says “this is Kimi” while another sends a prefixed slug the backend may not understand, users get avoidable 400/model-not-found errors that look like provider instability. Source: gaebal-gajae dogfood response to Clawhip message `1507307060998443059` on 2026-05-22.
 575. **`grep_search` walks `.git`, `node_modules`, `target`, and other heavy directories even though `glob_search` skips them, so grep can waste startup/context time scanning generated/vendor files by default** — dogfooded 2026-05-22 from the `#clawcode-building-in-public` 10:00 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@819f67b`. Active tmux sessions at probe time: none. Code inspection: `rust/crates/runtime/src/file_ops.rs` defines `GLOB_SEARCH_IGNORED_DIRS` (`.git`, `node_modules`, `.build`, `target`, `dist`, `coverage`) and `glob_search_impl` applies it via `WalkDir::filter_entry(|entry| !should_skip_glob_dir(entry))`. But `grep_search_impl` calls `collect_search_files(&base_path)`, and `collect_search_files` uses raw `WalkDir::new(base_path)` with no `filter_entry` or ignore list. As a result, a default grep over a repo can descend through `.git/objects`, Rust `target/`, vendored `node_modules`, coverage, and dist outputs, then silently skip unreadable/non-UTF8 files (#571) or spend time decoding generated blobs before any model-visible answer. Existing tests prove `glob_search_skips_common_heavy_directories`, but no equivalent grep test exists. **Required fix shape:** (a) make `collect_search_files` share the same ignored-directory policy as `glob_search`, or define an explicit grep ignore policy with opt-in overrides; (b) add skipped/ignored directory counts to grep output so operators know scope was pruned; (c) add regressions where `.git`/`target` contain matching files but default grep ignores them, while direct path-to-file behavior remains explicit; (d) allow explicit include/override if users really want generated/vendor search; (e) align docs/tool schema so glob and grep default search scope semantics match. **Why this matters:** grep is a high-frequency dogfood/search tool. Scanning generated and VCS internals by default creates startup friction, noisy false matches, and token waste, especially in Rust/JS workspaces with huge `target` or `node_modules` trees. Source: gaebal-gajae dogfood response to Clawhip message `1507322157561151671` on 2026-05-22.
 576. **`glob_search` brace expansion has no fan-out cap, so a single pattern with large brace groups can explode into thousands of `WalkDir` traversals before any timeout/result limit applies** — dogfooded 2026-05-22 from the `#clawcode-building-in-public` 10:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@ab5754a`. Active tmux sessions at probe time: none. Code inspection: `rust/crates/runtime/src/file_ops.rs::expand_braces` recursively expands the first `{...}` group by splitting on every comma and calling itself on each alternative. `glob_search_impl` then iterates every expanded pattern and creates a separate `WalkDir` traversal for each one before applying the hard `take(100)` result cap. There is no maximum number of alternatives, no recursion-depth/expanded-pattern cap, and no early deduplication by shared walk root. A user/model-supplied pattern like `**/*.{a,b,c,...hundreds}` or multiple brace groups can produce hundreds/thousands of full repo walks even though only 100 filenames can be returned. Existing tests cover a tiny `*.{rs,toml}` happy path and unmatched braces, but not expansion fan-out or timeout behavior. **Required fix shape:** (a) impose a small maximum expanded-pattern count and return `InvalidInput`/typed error when exceeded; (b) deduplicate shared walk roots or compile multiple patterns per root so alternatives do not trigger repeated full-tree walks; (c) include `expanded_pattern_count` and truncation/fanout metadata in diagnostics; (d) add regressions for large single brace groups, multiple nested brace groups, and normal small brace patterns; (e) keep the final 100-result cap but do not rely on it as protection against pre-result traversal explosions. **Why this matters:** glob is a model-facing discovery tool. Unbounded brace fan-out turns a compact pattern into many expensive filesystem walks, creating startup friction and an easy self-inflicted DoS before the existing result limit can help. Source: gaebal-gajae dogfood response to Clawhip message `1507329710257078353` on 2026-05-22.