mirror of
https://github.com/instructkr/claude-code.git
synced 2026-05-22 21:56:45 +00:00
docs(roadmap): add approval token expiry boundary gap
This commit is contained in:
@@ -6677,3 +6677,5 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
|
|||||||
560. **Auto-compaction observability reports only removed message count, not token pressure or before/after token estimates, so operators cannot tell whether compaction actually relieved context risk** — dogfooded 2026-05-21 from the `#clawcode-building-in-public` 22:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@a8c67b0` and binary built from source SHA `25d663d`. Active tmux sessions at probe time: `gajae-issue-314-omx-launch-resilience-review`, `gajae-pr-314-final-review`, `omx-pr-2447-ralplan-consensus-review3`. Code inspection: `AutoCompactionEvent` at `runtime/src/conversation.rs:123-127` contains only `removed_message_count`; `maybe_auto_compact` at `conversation.rs:559-581` has access to cumulative provider usage and the pre/post session but drops all token-pressure context. The CLI notice at `rusty-claude-cli/src/main.rs:3560-3562` prints only `[auto-compacted: removed N messages]`, and JSON output at `main.rs:5111-5114` exposes only `removed_messages` plus that same notice. The parity harness assertion at `mock_parity_harness.rs:691-712` only checks that the `auto_compaction` key exists and usage input tokens are large; it does not require any before/after estimate, threshold, trigger usage, or reduction metadata. **Required fix shape:** (a) extend `AutoCompactionEvent` with `trigger_input_tokens`, `threshold_input_tokens`, `estimated_tokens_before`, `estimated_tokens_after`, `removed_message_count`, and a `reason/trigger` enum; (b) compute before/after estimates around `compact_session` and include them in CLI text and JSON; (c) add tests proving JSON contains these fields for auto-compaction and that the CLI notice includes enough context to debug risk relief; (d) include a skipped/degraded event shape when threshold crossed but no messages removed, tying into #559; (e) update parity harness to assert semantic values, not merely key presence. **Why this matters:** auto-compaction is a context-safety action. Without token-pressure and reduction telemetry, a user sees “removed 2 messages” but cannot tell if the session went from 120k to 5k tokens, 120k to 119k, or failed to relieve the risk at all. Source: gaebal-gajae dogfood response to Clawhip message `1507148512159203338` on 2026-05-21.
|
560. **Auto-compaction observability reports only removed message count, not token pressure or before/after token estimates, so operators cannot tell whether compaction actually relieved context risk** — dogfooded 2026-05-21 from the `#clawcode-building-in-public` 22:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@a8c67b0` and binary built from source SHA `25d663d`. Active tmux sessions at probe time: `gajae-issue-314-omx-launch-resilience-review`, `gajae-pr-314-final-review`, `omx-pr-2447-ralplan-consensus-review3`. Code inspection: `AutoCompactionEvent` at `runtime/src/conversation.rs:123-127` contains only `removed_message_count`; `maybe_auto_compact` at `conversation.rs:559-581` has access to cumulative provider usage and the pre/post session but drops all token-pressure context. The CLI notice at `rusty-claude-cli/src/main.rs:3560-3562` prints only `[auto-compacted: removed N messages]`, and JSON output at `main.rs:5111-5114` exposes only `removed_messages` plus that same notice. The parity harness assertion at `mock_parity_harness.rs:691-712` only checks that the `auto_compaction` key exists and usage input tokens are large; it does not require any before/after estimate, threshold, trigger usage, or reduction metadata. **Required fix shape:** (a) extend `AutoCompactionEvent` with `trigger_input_tokens`, `threshold_input_tokens`, `estimated_tokens_before`, `estimated_tokens_after`, `removed_message_count`, and a `reason/trigger` enum; (b) compute before/after estimates around `compact_session` and include them in CLI text and JSON; (c) add tests proving JSON contains these fields for auto-compaction and that the CLI notice includes enough context to debug risk relief; (d) include a skipped/degraded event shape when threshold crossed but no messages removed, tying into #559; (e) update parity harness to assert semantic values, not merely key presence. **Why this matters:** auto-compaction is a context-safety action. Without token-pressure and reduction telemetry, a user sees “removed 2 messages” but cannot tell if the session went from 120k to 5k tokens, 120k to 119k, or failed to relieve the risk at all. Source: gaebal-gajae dogfood response to Clawhip message `1507148512159203338` on 2026-05-21.
|
||||||
|
|
||||||
561. **Branch-lock collision detection treats module strings literally, so equivalent paths with `./`, duplicate slashes, or trailing slashes evade same-branch overlap detection** — dogfooded 2026-05-21 from the `#clawcode-building-in-public` 23:00 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@f45b651` and binary built from source SHA `25d663d`. Active tmux session at probe time: `gajae-issue-316-inflight-review-superseded-merge`. Code inspection: `detect_branch_lock_collisions` in `runtime/src/branch_lock.rs:23-49` delegates to `overlapping_modules`, which compares raw `modules: Vec<String>` entries via `modules_overlap` at `branch_lock.rs:65-69`: exact equality or prefix with `format!("{right}/")` / `format!("{left}/")`. No normalization is applied. As a result, two lanes on the same branch with semantically identical module scopes like `runtime/mcp` vs `./runtime/mcp`, `runtime/mcp` vs `runtime/mcp/`, `runtime//mcp` vs `runtime/mcp`, or `crates/runtime/../runtime/mcp` are not detected as collisions, even though they target the same files. The existing tests cover exact same module, nested raw-prefix module, and different branches only; they do not exercise path normalization or malformed-but-common module inputs. This is distinct from Jobdori's #562 empty-modules whole-branch gap: even non-empty module locks can bypass collision checks when strings differ syntactically. **Required fix shape:** (a) normalize module scopes before comparison by trimming whitespace, removing leading `./`, collapsing repeated slashes, dropping trailing slashes, and resolving `.`/`..` components without escaping repo root; (b) store/report both normalized collision scope and raw input modules for auditability; (c) reject or explicitly mark invalid module paths that normalize outside the repo; (d) add tests for `./runtime/mcp`, `runtime/mcp/`, `runtime//mcp`, nested normalized paths, and invalid `../` escape attempts; (e) keep deterministic sorted/deduped collision output after normalization. **Why this matters:** branch locks are a coordination safety rail. If two agents can claim the same branch/module by spelling the path differently, lock receipts give false confidence and concurrent lanes can still overwrite or review-stomp each other. Source: gaebal-gajae dogfood response to Clawhip message `1507156065941192705` on 2026-05-21.
|
561. **Branch-lock collision detection treats module strings literally, so equivalent paths with `./`, duplicate slashes, or trailing slashes evade same-branch overlap detection** — dogfooded 2026-05-21 from the `#clawcode-building-in-public` 23:00 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@f45b651` and binary built from source SHA `25d663d`. Active tmux session at probe time: `gajae-issue-316-inflight-review-superseded-merge`. Code inspection: `detect_branch_lock_collisions` in `runtime/src/branch_lock.rs:23-49` delegates to `overlapping_modules`, which compares raw `modules: Vec<String>` entries via `modules_overlap` at `branch_lock.rs:65-69`: exact equality or prefix with `format!("{right}/")` / `format!("{left}/")`. No normalization is applied. As a result, two lanes on the same branch with semantically identical module scopes like `runtime/mcp` vs `./runtime/mcp`, `runtime/mcp` vs `runtime/mcp/`, `runtime//mcp` vs `runtime/mcp`, or `crates/runtime/../runtime/mcp` are not detected as collisions, even though they target the same files. The existing tests cover exact same module, nested raw-prefix module, and different branches only; they do not exercise path normalization or malformed-but-common module inputs. This is distinct from Jobdori's #562 empty-modules whole-branch gap: even non-empty module locks can bypass collision checks when strings differ syntactically. **Required fix shape:** (a) normalize module scopes before comparison by trimming whitespace, removing leading `./`, collapsing repeated slashes, dropping trailing slashes, and resolving `.`/`..` components without escaping repo root; (b) store/report both normalized collision scope and raw input modules for auditability; (c) reject or explicitly mark invalid module paths that normalize outside the repo; (d) add tests for `./runtime/mcp`, `runtime/mcp/`, `runtime//mcp`, nested normalized paths, and invalid `../` escape attempts; (e) keep deterministic sorted/deduped collision output after normalization. **Why this matters:** branch locks are a coordination safety rail. If two agents can claim the same branch/module by spelling the path differently, lock receipts give false confidence and concurrent lanes can still overwrite or review-stomp each other. Source: gaebal-gajae dogfood response to Clawhip message `1507156065941192705` on 2026-05-21.
|
||||||
|
|
||||||
|
562. **Approval tokens are still accepted at their exact expiry second because validation uses `now > expires_at` instead of `now >= expires_at`** — dogfooded 2026-05-21 from the `#clawcode-building-in-public` 23:30 UTC nudge on `/home/bellman/Workspace/claw-code-pr2967` with branch/origin `docs/roadmap-workdir-provenance@3d877d7` and binary built from source SHA `25d663d`. Active tmux session at probe time: `gajae-issue-317-inflight-review-restart`. Code inspection: `ApprovalTokenGrant::expires_at` stores `expires_at_epoch_seconds`, and `ApprovalTokenLedger::validate_grant` at `runtime/src/approval_tokens.rs:286-290` rejects only when `now_epoch_seconds > expires_at`. Therefore a token with `expires_at(20)` remains valid for verification/consumption when `now_epoch_seconds == 20`; it expires only at 21. The existing test `approval_token_rejects_scope_expansion_expiry_and_revocation` checks `ledger.verify(..., now=21)` for `expires_at(20)` but does not cover the boundary at exactly 20. This differs from the OAuth expiry helper in `api/src/providers/anthropic.rs`, which uses `expires_at <= now_unix_timestamp()` and treats the timestamp as no longer valid once reached. **Required fix shape:** (a) change approval-token expiry validation to `now_epoch_seconds >= expires_at` or explicitly rename/document the field as `valid_through_epoch_seconds` if inclusive semantics are intended; (b) add boundary tests for `now=expires_at-1`, `now=expires_at`, and `now=expires_at+1` for both `verify` and `consume`; (c) include `expires_at_epoch_seconds` and `now_epoch_seconds` in expiry audit/error metadata so operators can see boundary decisions; (d) align approval-token expiry semantics with OAuth/token-cache expiry conventions across the codebase; (e) add conformance coverage if approval tokens are exported in G004 bundles. **Why this matters:** approval tokens authorize policy exceptions. An off-by-one expiry window is small but security-sensitive and, without boundary tests, future short-lived approval grants can be consumed exactly at the moment operators expect them to be dead. Source: gaebal-gajae dogfood response to Clawhip message `1507163615596380191` on 2026-05-21.
|
||||||
|
|||||||
Reference in New Issue
Block a user