From 71f85541bd32840f33fabb53dc7afa7e62c9c1a4 Mon Sep 17 00:00:00 2001 From: Yeachan-Heo Date: Fri, 22 May 2026 20:31:24 +0000 Subject: [PATCH] docs(roadmap): add mcp tool bridge registry drift gap --- ROADMAP.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ROADMAP.md b/ROADMAP.md index a06f26b7..0ecee482 100644 --- a/ROADMAP.md +++ b/ROADMAP.md 
@@ -6745,3 +6745,5 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed) ` (or any value near available memory) and force allocation before any payload bytes arrive; the surrounding `run_process_request` timeout does not protect the allocation itself. This is distinct from HTTP body caps (#503) and SSE parser buffering (#506): it is the MCP JSON-RPC stdio framing layer. Existing tests assert lowercase `Content-Length`, missing/mismatched IDs, timeout, and retry/reset behavior, but none assert a maximum accepted frame length. **Required fix shape:** (a) add a conservative `MAX_MCP_STDIO_FRAME_BYTES` default and optional per-server override; (b) after parsing `Content-Length`, reject values above the cap with `io::ErrorKind::InvalidData` carrying `content_length` and `max_frame_bytes`; (c) read the body through a bounded buffer/helper so allocation is capped and timeout/error surfaces stay typed as MCP invalid response; (d) add regression scripts that emit huge `Content-Length` with no body and oversized body, proving no large allocation and a structured invalid-response error; (e) include frame-size metadata in MCP degraded/error reports so operators can distinguish protocol abuse from transport EOF. **Why this matters:** MCP servers are extension processes. The client must treat their stdio as untrusted protocol input; one oversized length header should not be able to OOM a prompt startup, tool discovery, or resource read before degraded-mode reporting can fire. Source: gaebal-gajae dogfood response to Clawhip message `1507465601499660349` on 2026-05-22. 595. 
**OAuth authorize URL builder allows `extra_params` to override core PKCE/OAuth parameters after they were already set, so plugin/config extras can replace `state`, `code_challenge`, `redirect_uri`, or `response_type`** — dogfooded 2026-05-22 from the `#clawcode-building-in-public` 20:00 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@46f3bff`. Active tmux sessions at probe time: none; no active claw-code implementation session. Focused validation: `cd rust && cargo test -p runtime oauth -- --nocapture` passed 9/9, confirming current OAuth tests cover happy-path URL/form/callback parsing but not reserved extra-param collisions. Code inspection: `runtime/src/oauth.rs::OAuthAuthorizationRequest::build_url` creates a `params` vector containing core parameters (`response_type=code`, `client_id`, `redirect_uri`, `scope`, `state`, `code_challenge`, `code_challenge_method`), then blindly `extend`s `self.extra_params` into the same query. `with_extra_param` accepts any key and stores it in a `BTreeMap`, with no reserved-name validation. A caller that sets `with_extra_param("state", "attacker")`, `code_challenge`, `redirect_uri`, `response_type`, `client_id`, or `scope` produces a URL with duplicate query parameters where the extra value appears after the core value. Because many OAuth parsers use last-value-wins semantics, this can desynchronize the locally expected state/PKCE verifier from the authorization-server-visible values, or change redirect/scope semantics. Jobdori separately filed duplicate callback parameters (#603); this is the outbound sibling: duplicates are generated by the client itself before the browser redirect, not just accepted on callback. 
**Required fix shape:** (a) define a reserved parameter set for OAuth authorization requests (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`, `code_challenge`, `code_challenge_method`) and reject attempts to add them via `with_extra_param`; (b) make `with_extra_param` return `Result` or validate in `build_url` with a typed error rather than silently emitting duplicates; (c) add tests for reserved collisions (`state`, `code_challenge`, `redirect_uri`) and a safe extension like `login_hint`; (d) if an override is intentionally supported, make it explicit and update the stored expected state/verifier/redirect to match so callback/token exchange cannot drift; (e) document provider-specific extra params as additive-only. **Why this matters:** `state` and PKCE are the OAuth anti-CSRF/proof-of-possession controls. Letting arbitrary extras duplicate or override them in the authorization URL creates prompt/auth lifecycle ambiguity and can turn a provider-specific hint hook into a security-sensitive parameter injection footgun. Source: gaebal-gajae dogfood response to Clawhip message `1507473155273265172` on 2026-05-22. + +596. **MCP tool bridge gates `call_tool` on a stale in-memory registry snapshot before doing live discovery, so newly discovered tools can be rejected and removed tools can be offered until runtime failure** — dogfooded 2026-05-22 from the `#clawcode-building-in-public` 20:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@20c9d9d`. Active tmux sessions at probe time: none; no active claw-code implementation session. Focused validation: `cd rust && cargo test -p runtime mcp_tool_bridge -- --nocapture` passed 19/19, confirming current bridge tests cover pre-registered happy-path tools but not registry/manager drift. 
Code inspection: `runtime/src/mcp_tool_bridge.rs::McpToolRegistry::call_tool` first locks `self.inner`, requires `state.status == Connected`, and checks `state.tools.iter().any(|t| t.name == tool_name)`. Only after that snapshot gate does it drop the registry lock and call `spawn_tool_call`, which creates a runtime and runs `manager.discover_tools().await` followed by `manager.call_tool(...)`. The live discovery result updates the manager/tool index, but the registry snapshot used for admission is never refreshed from that discovery. Therefore a tool that becomes available after startup/discovery refresh is still rejected as `tool not found` if it is absent from the stale registry, while a tool that disappeared can remain listed/accepted by the registry and then fail later as an `UnknownTool`/runtime error from the manager. Existing tests explicitly register `echo` in the registry before calling the live manager, so they lock in the assumption that the registry already matches runtime discovery. **Required fix shape:** (a) make `call_tool` reconcile registry state from `manager.discover_tools()` before the tool-existence gate, or delegate existence checks entirely to the authoritative manager and then update the registry snapshot; (b) add a registry `last_discovered_at`/generation or per-server tool-source metadata so `list_tools` can report stale/degraded state; (c) add regressions where the registry starts empty but the manager discovers `echo`, and where the registry advertises a stale tool that the manager no longer exposes; (d) ensure `list_tools`, `ToolSearch`, and `call_tool` agree on the same generation/tool surface; (e) surface drift as a typed MCP lifecycle/degraded event rather than a generic `tool not found`. **Why this matters:** the bridge is a control-plane contract between model-visible MCP tools and the live server manager. 
If admission checks use an old snapshot while execution uses fresh discovery, claws get stale tool availability evidence and confusing failures exactly where autonomous recovery needs a single source of truth. Source: gaebal-gajae dogfood response to Clawhip message `1507480700868362384` on 2026-05-22.
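Review bot: fix-shape item (a) of the new entry 596 tells `call_tool` to run `manager.discover_tools()` before the existence gate, yet the same entry records that `spawn_tool_call` already runs `manager.discover_tools().await` before `manager.call_tool(...)`, so the recommended path as written implies two discovery round-trips per invocation. Suggest the entry say which of its two alternatives is preferred (refresh the registry snapshot from the discovery `spawn_tool_call` already performs, or gate on the manager and write the result back) and that any reconciliation keep the existing `state.status == Connected` check, since delegating existence checks "entirely to the authoritative manager" would otherwise drop the one gate that stops a disconnected server's tools from being offered. It would also help for regression item (c) to assert on the typed drift event proposed in (e) rather than on the generic `tool not found` string, given that the entry attributes the current 19/19 pass to tests pre-registering `echo`.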