docs(roadmap): add indented fence streaming gap

2026-05-22 13:46:44 +00:00 · 2026-05-22 12:30:49 +00:00
parent 9f762b26fa
commit 8c4b33d9b1
1 changed files with 2 additions and 0 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -6711,3 +6711,5 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
 578. **`WebSearch` accepts `CLAWD_WEB_SEARCH_BASE_URL` with any scheme/host and follows the shared redirect policy, so a local environment variable can turn search into a private-network fetch without guardrails** — dogfooded 2026-05-22 from the `#clawcode-building-in-public` 11:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@bff67c2`. Active tmux session at probe time: `gajae-issue-339-backlog-zero-candidate-selection`; no active claw-code implementation session. Code inspection: `rust/crates/tools/src/lib.rs::build_search_url` checks `CLAWD_WEB_SEARCH_BASE_URL`, parses it with `reqwest::Url::parse`, appends `q`, and returns it unchanged. Unlike `WebFetch`, there is no `normalize_fetch_url`-style scheme upgrade or host validation at all. `execute_web_search` then uses the same `build_http_client` as WebFetch, which follows up to 10 redirects. A poisoned or stale local env var can point search at `http://localhost`, RFC1918 services, metadata endpoints, or a redirector to those targets; the tool will fetch and parse generic links before domain allow/block filters are applied to extracted result URLs, not to the search endpoint itself. Existing tests/logic focus on hit filtering, not search-provider endpoint validation. **Required fix shape:** (a) validate `CLAWD_WEB_SEARCH_BASE_URL` at parse time against allowed schemes/hosts or require an explicit unsafe/local-test opt-in; (b) apply the same per-redirect target validation required by #577 to WebSearch; (c) distinguish configured test search providers from production search with a typed config/source field in output; (d) add regressions for env base URLs pointing to localhost/private/metadata and safe HTTPS test endpoints; (e) document the env var as test-only or constrain it to trusted public search domains. **Why this matters:** search is often treated as a low-risk public-web tool. An unvalidated base URL makes it an environment-controlled internal fetch surface, and result-domain filters do not protect the initial request or redirects. Source: gaebal-gajae dogfood response to Clawhip message `1507344805167108257` on 2026-05-22.

 579. **Streaming flushes pending markdown at `ContentBlockStop` before pending tool calls are rendered, so text immediately followed by a tool call can be displayed before the tool-use event that actually interrupted/ended the content block** — dogfooded 2026-05-22 from the `#clawcode-building-in-public` 12:00 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@d8864ff`. Active tmux sessions at probe time: none. Channel context included Jobdori's #587 finding that `MarkdownStreamState::push` can hold prose until flush; this probe inspected the caller ordering. Code inspection: in `rust/crates/rusty-claude-cli/src/main.rs` stream handling, `ContentBlockDelta::TextDelta` buffers text through `markdown_stream.push`. On `ApiStreamEvent::ContentBlockStop`, the code first calls `markdown_stream.flush(&renderer)` and writes any pending prose, then only afterward checks `pending_tool.take()` and renders `format_tool_call_start`. If a provider emits a text block that has no safe boundary (single paragraph or unclosed markdown) followed by a tool-use block, the held text is flushed at the stop event just before the pending tool call display. Operators watching the terminal can see the assistant's prose burst immediately before the tool start marker, even though the next actionable event is the tool call; any timing/ordering cue that the model paused to call a tool is blurred by the delayed text flush. There is no test asserting streamed text/tool display ordering when markdown buffering holds content until block stop. **Required fix shape:** (a) make stream rendering preserve block/event order explicitly, perhaps flushing text at the exact text block stop and rendering tool-use starts at their own block starts/stops with clear separators; (b) when a text block is followed by a tool block, include a newline/phase boundary so delayed text does not visually merge into the tool call; (c) add streaming tests with single-paragraph text immediately followed by a tool call and with safe-boundary text, asserting terminal output order and separators; (d) consider the #587 word-boundary fallback so prose is not entirely held until the tool boundary; (e) keep persisted `AssistantEvent` ordering aligned with displayed output. **Why this matters:** streaming UI is also an event log. If buffered prose appears only when the next block stops and visually collides with a tool-use marker, users cannot tell whether the model is still speaking, has switched to tool execution, or the stream stalled. Source: gaebal-gajae dogfood response to Clawhip message `1507352360379482193` on 2026-05-22.
+
+580. **Streaming safe-boundary detection treats any indented triple-backtick line as a fence opener but will not close it if the closing fence is indented more than three spaces, so quoted/list code blocks can freeze streaming until final flush** — dogfooded 2026-05-22 from the `#clawcode-building-in-public` 12:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@9f762b2`. Active tmux sessions at probe time: `gajae-pr-340-backlog-zero-candidate-selection-final-review`, `gajae-pr-340-backlog-zero-rereview2`; no active claw-code implementation session. Code inspection: `rust/crates/rusty-claude-cli/src/render.rs::parse_fence_opener` counts only literal spaces and accepts fence openers with indent <=3. Once inside a fence, `line_closes_fence` also requires the closing marker indent <=3. That matches CommonMark top-level fenced blocks, but streaming markdown from models often nests code fences under bullets, block quotes, or copied indentation where both opener and closer are indented four or more spaces. In that case the opener with >3 spaces is ignored (fine), but mixed/normalized output can still open at <=3 and then fail to close if the closer is rendered with extra list/quote indentation; `open_fence` remains set, `last_boundary` stops updating, and `MarkdownStreamState::push` returns `None` until `flush()`. Existing tests cover nested fence marker lengths and tilde/backtick distinction, but not list/blockquote/indented fence streaming behavior or a mismatched indentation close. **Required fix shape:** (a) decide whether the stream boundary detector should follow strict CommonMark or be tolerant for model-generated/list-nested fences; (b) if tolerant, close fences when the same marker appears after quote/list prefixes or consistent extra indentation, while still avoiding false closes inside literal code; (c) add tests for bullet-nested fences, blockquote fences, and an opener at indent <=3 with a closer at indent >3; (d) include a max-buffer/word-boundary fallback from #587 so a malformed fence cannot suppress all output indefinitely; (e) keep rendering tests aligned with boundary tests so visual output and streaming segmentation share one markdown dialect. **Why this matters:** model answers frequently include code inside bullets or quoted explanations. If the stream boundary tracker misses the closing fence, the terminal looks stalled even though deltas are arriving, turning a formatting edge case into startup/streaming opacity. Source: gaebal-gajae dogfood response to Clawhip message `1507359904959172758` on 2026-05-22.