Map G003 boot session verification

Document the current G003 worker boot, trust, session-control, and preflight verification surfaces so leader integration can sequence worker-owned patches without mutating Ultragoal state.

Constraint: Task 2 is audit-only/coordination; no .omx/ultragoal mutation and no shared implementation/test edits.
Rejected: Fixing clippy warnings in runtime integration tests | outside audit-only scope and owned by integration cleanup.
Confidence: high
Scope-risk: narrow
Directive: Keep this map updated when G003 worker splits or verification commands change.
Tested: ../scripts/fmt.sh --check; cargo test -p runtime worker_boot -- --nocapture; cargo test -p tools worker_ -- --nocapture; cargo check -p runtime -p tools -p commands
Not-tested: cargo clippy -p runtime -p tools -p commands --all-targets --no-deps -- -D warnings fails on pre-existing runtime integration_tests duration_suboptimal_units warnings.
2026-05-14 09:56:44 +00:00 · 2026-05-14 17:48:18 +09:00
parent 79d3b809f9
commit 2012718749
1 changed files with 96 additions and 0 deletions
--- a/docs/g003-boot-session-verification-map.md
+++ b/docs/g003-boot-session-verification-map.md
@@ -0,0 +1,96 @@
+# G003 boot/session/preflight verification map
+
+Generated by `worker-1` for OMX team task 2 on 2026-05-14.
+
+## Scope and coordination
+
+- Active goal context: `G003-boot-session` / Stream 1 reliable worker boot and session control.
+- Boundary: this artifact is an audit/integration map only. It does not mutate `.omx/ultragoal` and it does not change shared implementation or tests.
+- Current worker split from leader mailbox:
+  - `worker-1`: task 1 worker boot / prompt SLA plus this task 2 audit map.
+  - `worker-2`: default trusted roots / trust resolver.
+  - `worker-3`: startup-no-evidence classifier.
+  - `worker-4`: session control plus preflight/doctor JSON surfaces.
+- Native subagent probes were attempted for Task 2 (`test probe` and `debug/root-cause probe`), but both failed with `429 Too Many Requests` before returning any findings; the map below is therefore based on direct repository inspection.
+
+## Implementation surface map
+
+### Worker boot lifecycle and prompt SLA
+
+- `rust/crates/runtime/src/worker_boot.rs`
+  - Core state types: `WorkerStatus`, `WorkerFailureKind`, `WorkerEventKind`, `WorkerEventPayload`, `StartupFailureClassification`, `StartupEvidenceBundle`, `WorkerTaskReceipt`, and `WorkerReadySnapshot`.
+  - Control plane: `WorkerRegistry::{create,get,observe,resolve_trust,send_prompt,await_ready,restart,terminate,observe_completion,observe_startup_timeout}`.
+  - Lifecycle states currently covered in code: `spawning`, `trust_required`, `tool_permission_required`, `ready_for_prompt`, `running`, `finished`, and `failed`.
+  - Prompt delivery semantics currently use `Running` events and fields `prompt_in_flight`, `last_prompt`, `expected_receipt`, `replay_prompt`, and `prompt_delivery_attempts`.
+  - Startup-no-evidence surface: `observe_startup_timeout` builds `StartupEvidenceBundle` and classifies trust, tool permission, prompt acceptance timeout, prompt misdelivery, transport death, worker crash, or unknown.
+  - File observability surface: `emit_state_file` writes `.claw/worker-state.json` with status, readiness, trust state, prompt-in-flight flag, last event, and update age.
+
+- `rust/crates/tools/src/lib.rs`
+  - Tool APIs expose the worker control plane through `WorkerCreate`, `WorkerGet`, `WorkerObserve`, `WorkerResolveTrust`, `WorkerAwaitReady`, `WorkerSendPrompt`, `WorkerRestart`, `WorkerTerminate`, and `WorkerObserveCompletion`.
+  - `WorkerCreate` merges `ConfigLoader::trusted_roots()` with per-call `trusted_roots` before calling `WorkerRegistry::create`.
+  - Tool-level tests exercise worker create/observe/send/restart/terminate/completion and state-file transitions.
+
+### Trust resolver and default trusted roots
+
+- `rust/crates/runtime/src/trust_resolver.rs`
+  - `TrustConfig`, `TrustAllowlistEntry`, and `TrustResolver` model trust prompts, allowlist/denylist policy, auto-trust, manual approval, and emitted trust events.
+  - `path_matches_trusted_root` and internal `path_matches` canonicalize paths when possible.
+  - Hazard: prefix matching must avoid accidental sibling matches such as `/tmp/work` matching `/tmp/work-evil`; worker-2 owns any changes here.
+
+- `rust/crates/runtime/src/config.rs`
+  - `trustedRoots` is parsed by `parse_optional_trusted_roots` and exposed through `RuntimeConfig::trusted_roots()` / feature config accessors.
+  - Current default is empty when unset; any project default roots work belongs to worker-2.
+
+### Session control
+
+- `rust/crates/runtime/src/session_control.rs`
+  - `SessionStore` namespaces sessions by canonical workspace fingerprint.
+  - Key API: `from_cwd`, `from_data_dir`, `create_handle`, `resolve_reference`, `resolve_managed_path`, `list_sessions`, `latest_session`, `load_session`, and `fork_session`.
+  - Guardrail: `validate_loaded_session` rejects cross-workspace sessions and allows legacy sessions only when their path remains inside the current workspace.
+  - Worker-4 owns changes to this lane.
+
+### CLI doctor/status/preflight and bootstrap-adjacent surfaces
+
+- `rust/crates/commands/src/lib.rs`
+  - Slash command definitions include `/status`, `/sandbox`, and `/doctor`.
+  - JSON rendering for command surfaces exists through handler functions and tests in the same module.
+
+- `rust/crates/tools/src/lib.rs`
+  - Bash and PowerShell tool runners include `workspace_test_branch_preflight`, which returns structured output with `return_code_interpretation: preflight_blocked:branch_divergence` for broad workspace tests on stale branches.
+  - Tests around `bash_workspace_tests_are_blocked_when_branch_is_behind_main` and targeted-test skipping protect this preflight behavior.
+
+## Existing focused verification commands
+
+Run from `rust/` unless noted.
+
+- Worker boot runtime contract:
+  - `cargo test -p runtime worker_boot -- --nocapture`
+- Worker tool API contract:
+  - `cargo test -p tools worker_ -- --nocapture`
+- Session control contract:
+  - `cargo test -p runtime session_control -- --nocapture`
+- Trust resolver/config trusted roots:
+  - `cargo test -p runtime trust_resolver -- --nocapture`
+  - `cargo test -p runtime -- config::tests::parses_trusted_roots_from_settings config::tests::trusted_roots_default_is_empty_when_unset --nocapture`
+- Preflight/tool branch guardrails:
+  - `cargo test -p tools -- bash_workspace_tests_are_blocked_when_branch_is_behind_main bash_targeted_tests_skip_branch_preflight --nocapture`
+- Formatting/type/lint baseline:
+  - `../scripts/fmt.sh --check`
+  - `cargo check -p runtime -p tools -p commands`
+  - `cargo clippy -p runtime -p tools -p commands --all-targets --no-deps -- -D warnings`
+
+## Gaps and hazards for leader integration
+
+- Prompt SLA event naming is partially implicit: `send_prompt` emits `WorkerEventKind::Running`; it does not expose separate `prompt.sent`, `prompt.accepted`, `prompt.acceptance_delayed`, or `prompt.acceptance_timeout` event names. The current equivalent evidence is `prompt_in_flight`, `Running`, `observe_completion`, and startup-timeout classification.
+- `StartupFailureClassification::PromptAcceptanceTimeout` is covered in `worker_boot` tests; full terminal/transport integration should still be verified by the leader or worker-3 if a real pane watcher exists outside the in-memory registry.
+- Default trusted roots are parsed and merged into `WorkerCreate`, but unset config currently means no default roots. Worker-2 owns any change to default root selection.
+- Session control protects workspace fingerprints at load/fork time; worker-4 owns CLI/doctor/preflight JSON contract changes.
+- Clippy with `-D warnings` currently fails on pre-existing `duration_suboptimal_units` warnings in the runtime integration tests (observed during task 1 verification and unrelated to this map); do not block this docs-only map on them unless the leader re-scopes cleanup.
+
+## Recommended safe integration order
+
+1. Integrate worker boot / prompt SLA changes first and run `cargo test -p runtime worker_boot -- --nocapture` plus `cargo test -p tools worker_ -- --nocapture`.
+2. Integrate trust-root changes and rerun trust/config tests plus the worker create config merge test.
+3. Integrate startup-no-evidence classifier changes and rerun `cargo test -p runtime worker_boot -- --nocapture`.
+4. Integrate session control / preflight / doctor JSON changes and rerun session-control, commands JSON, and preflight tests.
+5. Run final formatting, targeted cargo check/clippy, then broader workspace tests with known full-workspace failures documented separately.
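
Note on the trust-root hazard listed under `trust_resolver.rs` (`/tmp/work` matching `/tmp/work-evil`): the difference between string-prefix and component-wise matching can be sketched as follows. This is a minimal illustration, not the repository's `path_matches_trusted_root`; the helper names `is_within_root` and `naive_prefix_match` are hypothetical, and the sketch assumes both paths are already canonical.

```rust
use std::path::Path;

/// Hypothetical helper (not the repository's implementation).
/// `Path::starts_with` compares whole path components, so a sibling
/// directory that merely shares a string prefix does not match.
/// Assumption: both paths are already canonical (no `..` or symlinks).
fn is_within_root(candidate: &Path, root: &Path) -> bool {
    candidate.starts_with(root)
}

/// Hypothetical counter-example: plain string-prefix comparison,
/// which accepts sibling directories by accident.
fn naive_prefix_match(candidate: &str, root: &str) -> bool {
    candidate.starts_with(root)
}

fn main() {
    let root = Path::new("/tmp/work");

    // The root itself and anything beneath it match.
    assert!(is_within_root(Path::new("/tmp/work"), root));
    assert!(is_within_root(Path::new("/tmp/work/crate/src"), root));

    // The sibling directory is rejected by component-wise matching...
    assert!(!is_within_root(Path::new("/tmp/work-evil"), root));

    // ...but accepted by the naive string-prefix check.
    assert!(naive_prefix_match("/tmp/work-evil", "/tmp/work"));

    println!("component-wise matching rejects the sibling path");
}
```

Whether the current resolver already behaves this way should be confirmed by worker-2 against `rust/crates/runtime/src/trust_resolver.rs`; the sketch only shows the property a regression test could assert.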