claude-code/docs/g003-boot-session-verification-map.md
bellman 2012718749 Map G003 boot session verification
Document the current G003 worker boot, trust, session-control, and preflight verification surfaces so leader integration can sequence worker-owned patches without mutating Ultragoal state.

Constraint: Task 2 is audit-only/coordination; no .omx/ultragoal mutation and no shared implementation/test edits.
Rejected: Fixing clippy warnings in runtime integration tests | outside audit-only scope and owned by integration cleanup.
Confidence: high
Scope-risk: narrow
Directive: Keep this map updated when G003 worker splits or verification commands change.
Tested: ../scripts/fmt.sh --check; cargo test -p runtime worker_boot -- --nocapture; cargo test -p tools worker_ -- --nocapture; cargo check -p runtime -p tools -p commands
Not-tested: cargo clippy -p runtime -p tools -p commands --all-targets --no-deps -- -D warnings fails on pre-existing runtime integration_tests duration_suboptimal_units warnings.
2026-05-14 17:50:30 +09:00

G003 boot/session/preflight verification map

Generated by worker-1 for OMX team task 2 on 2026-05-14.

Scope and coordination

  • Active goal context: G003-boot-session / Stream 1 reliable worker boot and session control.
  • Boundary: this artifact is an audit/integration map only. It does not mutate .omx/ultragoal and it does not change shared implementation or tests.
  • Current worker split from leader mailbox:
    • worker-1: task 1 worker boot / prompt SLA plus this task 2 audit map.
    • worker-2: default trusted roots / trust resolver.
    • worker-3: startup-no-evidence classifier.
    • worker-4: session control plus preflight/doctor JSON surfaces.
  • Two native subagent probes were attempted for Task 2 (a test probe and a debug/root-cause probe), but both failed with 429 Too Many Requests before returning findings, so the map below is based on direct repository inspection.

Implementation surface map

Worker boot lifecycle and prompt SLA

  • rust/crates/runtime/src/worker_boot.rs

    • Core state types: WorkerStatus, WorkerFailureKind, WorkerEventKind, WorkerEventPayload, StartupFailureClassification, StartupEvidenceBundle, WorkerTaskReceipt, and WorkerReadySnapshot.
    • Control plane: WorkerRegistry::{create,get,observe,resolve_trust,send_prompt,await_ready,restart,terminate,observe_completion,observe_startup_timeout}.
    • Lifecycle states currently covered in code: spawning, trust_required, tool_permission_required, ready_for_prompt, running, finished, and failed.
    • Prompt delivery semantics currently use Running events and fields prompt_in_flight, last_prompt, expected_receipt, replay_prompt, and prompt_delivery_attempts.
    • Startup-no-evidence surface: observe_startup_timeout builds StartupEvidenceBundle and classifies trust, tool permission, prompt acceptance timeout, prompt misdelivery, transport death, worker crash, or unknown.
    • File observability surface: emit_state_file writes .claw/worker-state.json with status, readiness, trust state, prompt-in-flight flag, last event, and update age.
  • rust/crates/tools/src/lib.rs

    • Tool APIs expose the worker control plane through WorkerCreate, WorkerGet, WorkerObserve, WorkerResolveTrust, WorkerAwaitReady, WorkerSendPrompt, WorkerRestart, WorkerTerminate, and WorkerObserveCompletion.
    • WorkerCreate merges ConfigLoader::trusted_roots() with per-call trusted_roots before calling WorkerRegistry::create.
    • Tool-level tests exercise worker create/observe/send/restart/terminate/completion and state-file transitions.
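
As a reading aid for the lifecycle states listed above, the following is a minimal sketch that models them as a Rust enum with their snake_case labels. It is not the code in worker_boot.rs: LifecycleState, label, needs_operator_action, and accepts_prompt are hypothetical names, and the rules that only ready_for_prompt accepts a prompt and that the two *_required states wait on an operator are assumptions inferred from the state names.

```rust
/// Hypothetical mirror of the lifecycle states named in this map.
/// The real type is `WorkerStatus` in worker_boot.rs; its exact shape may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LifecycleState {
    Spawning,
    TrustRequired,
    ToolPermissionRequired,
    ReadyForPrompt,
    Running,
    Finished,
    Failed,
}

impl LifecycleState {
    /// Snake_case label as listed in the map.
    pub fn label(self) -> &'static str {
        match self {
            Self::Spawning => "spawning",
            Self::TrustRequired => "trust_required",
            Self::ToolPermissionRequired => "tool_permission_required",
            Self::ReadyForPrompt => "ready_for_prompt",
            Self::Running => "running",
            Self::Finished => "finished",
            Self::Failed => "failed",
        }
    }

    /// Assumption: these two states wait on an operator or policy decision.
    pub fn needs_operator_action(self) -> bool {
        matches!(self, Self::TrustRequired | Self::ToolPermissionRequired)
    }

    /// Assumption: a prompt is only deliverable from `ready_for_prompt`.
    pub fn accepts_prompt(self) -> bool {
        matches!(self, Self::ReadyForPrompt)
    }
}
```

A caller checking a worker before prompt delivery could gate on accepts_prompt and surface needs_operator_action as a blocking reason; whether the real registry enforces the same gate should be confirmed against the worker_boot tests.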

Trust resolver and default trusted roots

  • rust/crates/runtime/src/trust_resolver.rs

    • TrustConfig, TrustAllowlistEntry, and TrustResolver model trust prompts, allowlist/denylist policy, auto-trust, manual approval, and emitted trust events.
    • path_matches_trusted_root and internal path_matches canonicalize paths when possible.
    • Hazard: prefix matching must avoid accidental sibling matches such as /tmp/work matching /tmp/work-evil; worker-2 owns any changes here.
  • rust/crates/runtime/src/config.rs

    • trustedRoots is parsed by parse_optional_trusted_roots and exposed through RuntimeConfig::trusted_roots() / feature config accessors.
    • Current default is empty when unset; any work on project default roots belongs to worker-2.
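
The sibling-match hazard noted above can be illustrated with a small sketch. It contrasts a raw string prefix check with std::path::Path::starts_with, which compares whole path components. The helper names is_under_root and naive_string_prefix are hypothetical and are not the functions in trust_resolver.rs; the sketch also skips canonicalization, which the real code performs when possible.

```rust
use std::path::Path;

/// Hypothetical helper: component-wise containment check.
/// `Path::starts_with` only matches whole components, so
/// `/tmp/work-evil` is not considered to be under `/tmp/work`.
pub fn is_under_root(candidate: &Path, root: &Path) -> bool {
    candidate.starts_with(root)
}

/// The hazardous variant: a raw string prefix comparison, which
/// treats `/tmp/work-evil` as if it were inside `/tmp/work`.
pub fn naive_string_prefix(candidate: &str, root: &str) -> bool {
    candidate.starts_with(root)
}
```

With these helpers, /tmp/work/repo is accepted under /tmp/work by both checks, while /tmp/work-evil is accepted only by the naive string comparison. A regression test along these lines may be a useful addition for worker-2.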

Session control

  • rust/crates/runtime/src/session_control.rs
    • SessionStore namespaces sessions by canonical workspace fingerprint.
    • Key API: from_cwd, from_data_dir, create_handle, resolve_reference, resolve_managed_path, list_sessions, latest_session, load_session, and fork_session.
    • Guardrail: validate_loaded_session rejects cross-workspace sessions and allows legacy sessions only when their path remains inside the current workspace.
    • Worker-4 owns changes to this lane.
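
The guardrail described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the body of validate_loaded_session: SessionRecord, SessionCheck, and check_session are hypothetical names, and representing a legacy session as one without a stored fingerprint is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical, simplified session record.
/// Assumption: legacy sessions carry no workspace fingerprint.
pub struct SessionRecord {
    pub workspace_fingerprint: Option<String>,
    pub path: PathBuf,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SessionCheck {
    Accepted,
    AcceptedLegacy,
    RejectedCrossWorkspace,
    RejectedLegacyOutsideWorkspace,
}

/// Reject sessions from another workspace; allow legacy sessions only
/// when their path is still inside the current workspace root.
pub fn check_session(
    record: &SessionRecord,
    current_fingerprint: &str,
    workspace_root: &Path,
) -> SessionCheck {
    match record.workspace_fingerprint.as_deref() {
        Some(fp) if fp == current_fingerprint => SessionCheck::Accepted,
        Some(_) => SessionCheck::RejectedCrossWorkspace,
        None if record.path.starts_with(workspace_root) => SessionCheck::AcceptedLegacy,
        None => SessionCheck::RejectedLegacyOutsideWorkspace,
    }
}
```

The four outcomes correspond to the cases that the session_control tests would be expected to cover; the actual error types and fingerprint format are not shown in this map.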

CLI doctor/status/preflight and bootstrap-adjacent surfaces

  • rust/crates/commands/src/lib.rs

    • Slash command definitions include /status, /sandbox, and /doctor.
    • JSON rendering for command surfaces exists through handler functions and tests in the same module.
  • rust/crates/tools/src/lib.rs

    • Bash and PowerShell tool runners include workspace_test_branch_preflight, which returns structured output with return_code_interpretation: preflight_blocked:branch_divergence for broad workspace tests on stale branches.
    • Tests around bash_workspace_tests_are_blocked_when_branch_is_behind_main and targeted-test skipping protect this preflight behavior.
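
The preflight behavior described above (broad workspace tests blocked on stale branches, targeted tests skipped) can be sketched as follows. This is a hypothetical illustration rather than the implementation of workspace_test_branch_preflight: the names PreflightOutcome, is_broad_workspace_test, and sketch_preflight, the commits_behind_main input, and the heuristic that "broad" means cargo test with --workspace are all assumptions. Only the return_code_interpretation string is taken from this map.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum PreflightOutcome {
    Proceed,
    Blocked { return_code_interpretation: &'static str },
}

/// Assumed heuristic: a command is a broad workspace test when it
/// invokes `cargo test` and passes `--workspace`.
pub fn is_broad_workspace_test(command: &str) -> bool {
    let tokens: Vec<&str> = command.split_whitespace().collect();
    let runs_cargo_test = tokens
        .windows(2)
        .any(|pair| pair[0] == "cargo" && pair[1] == "test");
    runs_cargo_test && tokens.contains(&"--workspace")
}

/// Block broad test runs when the branch is behind main; let targeted
/// runs and up-to-date branches through.
pub fn sketch_preflight(command: &str, commits_behind_main: u32) -> PreflightOutcome {
    if is_broad_workspace_test(command) && commits_behind_main > 0 {
        PreflightOutcome::Blocked {
            return_code_interpretation: "preflight_blocked:branch_divergence",
        }
    } else {
        PreflightOutcome::Proceed
    }
}
```

Under these assumptions, cargo test --workspace on a branch three commits behind main is blocked, while cargo test -p tools worker_ on the same branch proceeds.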

Existing focused verification commands

Run from rust/ unless noted.

  • Worker boot runtime contract:
    • cargo test -p runtime worker_boot -- --nocapture
  • Worker tool API contract:
    • cargo test -p tools worker_ -- --nocapture
  • Session control contract:
    • cargo test -p runtime session_control -- --nocapture
  • Trust resolver/config trusted roots:
    • cargo test -p runtime trust_resolver -- --nocapture
    • cargo test -p runtime -- config::tests::parses_trusted_roots_from_settings config::tests::trusted_roots_default_is_empty_when_unset --nocapture
  • Preflight/tool branch guardrails:
    • cargo test -p tools -- bash_workspace_tests_are_blocked_when_branch_is_behind_main bash_targeted_tests_skip_branch_preflight --nocapture
  • Formatting/type/lint baseline:
    • ../scripts/fmt.sh --check
    • cargo check -p runtime -p tools -p commands
    • cargo clippy -p runtime -p tools -p commands --all-targets --no-deps -- -D warnings

Gaps and hazards for leader integration

  • Prompt SLA event naming is partially implicit: send_prompt emits WorkerEventKind::Running; it does not expose separate prompt.sent, prompt.accepted, prompt.acceptance_delayed, or prompt.acceptance_timeout event names. The current equivalent evidence is prompt_in_flight, Running, observe_completion, and startup-timeout classification.
  • StartupFailureClassification::PromptAcceptanceTimeout is covered in worker_boot tests; full terminal/transport integration should still be verified by the leader or worker-3 if a real pane watcher exists outside the in-memory registry.
  • Default trusted roots are parsed and merged into WorkerCreate, but unset config currently means no default roots. Worker-2 owns any change to default root selection.
  • Session control protects workspace fingerprints at load/fork time; worker-4 owns CLI/doctor/preflight JSON contract changes.
  • Full-workspace clippy currently has known unrelated runtime findings observed during task 1 verification; do not block this docs-only map on those unless leader re-scopes cleanup.
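
For the prompt SLA naming gap in the first item above, the sketch below shows one way a consumer might derive the missing event names from the evidence that already exists. The mapping is an assumption made for illustration, not behavior of the registry: PromptEvidence and derived_prompt_event are hypothetical names, and treating an observed completion as proof of acceptance is a simplification.

```rust
/// Hypothetical snapshot of the prompt-delivery evidence listed in this map.
#[derive(Debug, Clone, Copy, Default)]
pub struct PromptEvidence {
    /// A `Running` event was emitted after `send_prompt`.
    pub running_event_seen: bool,
    /// The `prompt_in_flight` flag is currently set.
    pub prompt_in_flight: bool,
    /// `observe_completion` has reported a completion.
    pub completion_observed: bool,
    /// Startup-timeout classification returned a prompt acceptance timeout.
    pub acceptance_timeout_classified: bool,
}

/// Assumed mapping from existing evidence to the event names that the
/// registry does not currently expose. Returns `None` when no prompt
/// activity can be inferred.
pub fn derived_prompt_event(evidence: PromptEvidence) -> Option<&'static str> {
    if evidence.acceptance_timeout_classified {
        Some("prompt.acceptance_timeout")
    } else if evidence.completion_observed {
        Some("prompt.accepted")
    } else if evidence.running_event_seen && evidence.prompt_in_flight {
        Some("prompt.sent")
    } else {
        None
    }
}
```

The sketch deliberately omits prompt.acceptance_delayed, since this map identifies no existing evidence that would distinguish a delay from a prompt that is simply still in flight.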

Suggested integration order

  1. Integrate worker boot / prompt SLA changes first and run cargo test -p runtime worker_boot -- --nocapture plus cargo test -p tools worker_ -- --nocapture.
  2. Integrate trust-root changes and rerun trust/config tests plus the worker create config merge test.
  3. Integrate startup-no-evidence classifier changes and rerun cargo test -p runtime worker_boot -- --nocapture.
  4. Integrate session control / preflight / doctor JSON changes and rerun session-control, commands JSON, and preflight tests.
  5. Run final formatting, targeted cargo check/clippy, then broader workspace tests with known full-workspace failures documented separately.