Compare commits

16 Commits

Author SHA1 Message Date
YeonGyu-Kim
000aed4188 fix(commands): fix brittle /session help assertion after delete subcommand addition
renders_help_from_shared_specs hardcoded the exact /session usage string,
which broke when /session delete was added in batch 5. Relaxed to check
for /session presence instead of exact subcommand list.

Pre-existing test brittleness (not caused by recent commits).

687 workspace lib tests passing, 0 failing.
2026-04-08 09:33:51 +09:00
YeonGyu-Kim
523ce7474a fix(api): sanitize Anthropic body — strip frequency/presence_penalty, convert stop→stop_sequences
MessageRequest now carries OpenAI-compatible tuning params (c667d47), but
the Anthropic API does not support frequency_penalty or presence_penalty,
and uses 'stop_sequences' instead of 'stop'. Without this fix, setting
these params with a Claude model would produce 400 errors.

Changes to strip_unsupported_beta_body_fields:
- Remove frequency_penalty and presence_penalty from Anthropic request body
- Convert stop → stop_sequences (only when non-empty)
- temperature and top_p are preserved (Anthropic supports both)

Tests added:
- strip_removes_openai_only_fields_and_converts_stop
- strip_does_not_add_empty_stop_sequences

87 api lib tests passing, 0 failing.
cargo check --workspace: clean.
2026-04-08 09:05:10 +09:00
YeonGyu-Kim
b513d6e462 fix(api): sanitize tuning params for reasoning models (o1/o3/grok-3-mini)
Reasoning models reject temperature, top_p, frequency_penalty, and
presence_penalty with 400 errors. Instead of letting these flow through
and returning cryptic provider errors, strip them silently at the
request-builder boundary.

is_reasoning_model() classifies: o1*, o3*, o4*, grok-3-mini.
stop sequences are preserved (safe for all providers).

Tests added:
- reasoning_model_strips_tuning_params: o1-mini strips all 4 params, keeps stop
- grok_3_mini_is_reasoning_model: classification coverage for grok-3-mini, o1,
  o3-mini, and negative cases (gpt-4o, grok-3, claude)

85 api lib tests passing, 0 failing.
2026-04-08 07:32:47 +09:00
YeonGyu-Kim
c667d47c70 feat(api): add tuning params (temperature, top_p, penalties, stop) to MessageRequest
MessageRequest was missing standard OpenAI-compatible generation tuning
parameters. Callers had no way to control temperature, top_p,
frequency_penalty, presence_penalty, or stop sequences.

Changes:
- Added 5 optional fields to MessageRequest (all Option, None by default)
- Wired into build_chat_completion_request: only included in payload when set
- All existing construction sites updated with ..Default::default()
- MessageRequest now derives Default for ergonomic partial construction
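
A minimal sketch of the partial-construction pattern described above. It uses a trimmed-down stand-in for MessageRequest: the names TuningRequest and tuning_fields are hypothetical and only std is used, so this is an illustration under those assumptions, not the crate's code.

```rust
/// Hypothetical, trimmed-down stand-in for MessageRequest (assumption).
#[derive(Debug, Clone, PartialEq, Default)]
pub struct TuningRequest {
    pub model: String,
    pub max_tokens: u32,
    pub temperature: Option<f64>,
    pub top_p: Option<f64>,
    pub frequency_penalty: Option<f64>,
    pub presence_penalty: Option<f64>,
    pub stop: Option<Vec<String>>,
}

/// Lists the tuning fields that would be written to the payload.
/// Fields left as None are skipped; an empty stop list is skipped too.
pub fn tuning_fields(request: &TuningRequest) -> Vec<&'static str> {
    let mut fields = Vec::new();
    if request.temperature.is_some() {
        fields.push("temperature");
    }
    if request.top_p.is_some() {
        fields.push("top_p");
    }
    if request.frequency_penalty.is_some() {
        fields.push("frequency_penalty");
    }
    if request.presence_penalty.is_some() {
        fields.push("presence_penalty");
    }
    if request.stop.as_ref().is_some_and(|stop| !stop.is_empty()) {
        fields.push("stop");
    }
    fields
}
```

With Default derived, a caller sets only the fields it cares about, for example TuningRequest { temperature: Some(0.7), ..Default::default() }, and every other tuning field stays None.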

Tests added:
- tuning_params_included_in_payload_when_set: all 5 params flow into JSON
- tuning_params_omitted_from_payload_when_none: absent params stay absent

83 api lib tests passing, 0 failing.
cargo check --workspace: 0 warnings.
2026-04-08 07:07:33 +09:00
YeonGyu-Kim
7546c1903d docs(roadmap): document provider routing fix and auth-sniffer fragility lesson
Filed: openai/ prefix model misrouting (fixed in 0530c50).
Documents root cause, fix, and the architectural lesson:
  - metadata_for_model is the canonical extension point for new providers
  - auth-sniffer fallback order must never override explicit model-name prefix
  - regression test locked in to guard this invariant
2026-04-08 05:35:12 +09:00
YeonGyu-Kim
0530c509a3 fix(api): route openai/ and gpt- model prefixes to OpenAi provider
metadata_for_model returned None for unknown models like openai/gpt-4.1-mini,
causing detect_provider_kind to fall through to auth-sniffer order. If
ANTHROPIC_API_KEY was set, the model was silently misrouted to Anthropic
and the user got a confusing 'missing Anthropic credentials' error.

Fix: add explicit prefix checks for 'openai/' and 'gpt-' in
metadata_for_model so the model name wins over env-var presence.

Regression test added: openai_namespaced_model_routes_to_openai_not_anthropic
- 'openai/gpt-4.1-mini' routes to OpenAi
- 'gpt-4o' routes to OpenAi

Reported and reproduced by gaebal-gajae against current main.
81 api lib tests passing, 0 failing.
2026-04-08 05:33:47 +09:00
YeonGyu-Kim
eff0765167 test(tools): fill WorkerGet and error-path coverage gaps
WorkerGet had zero test coverage. WorkerAwaitReady and WorkerSendPrompt
had only one happy-path test each with no error paths.

Added 4 tests:
- worker_get_returns_worker_state: WorkerGet fetches correct worker_id/status/cwd
- worker_get_on_unknown_id_returns_error: unknown id -> 'worker not found'
- worker_await_ready_on_spawning_worker_returns_not_ready: ready=false on spawning worker
- worker_send_prompt_on_non_ready_worker_returns_error: sending prompt before ready fails

94 tool tests passing, 0 failing.
2026-04-08 05:03:34 +09:00
YeonGyu-Kim
aee5263aef test(tools): prove recovery loop against .claw/worker-state.json directly
recovery_loop_state_file_reflects_transitions reads the actual state
file after each transition to verify the canonical observability surface
reflects the full stall->resolve->ready progression:

  spawning (state file exists, seconds_since_update present)
  -> trust_required (is_ready=false, trust_gate_cleared=false in file)
  -> spawning (trust_gate_cleared=true after WorkerResolveTrust)
  -> ready_for_prompt (is_ready=true after ready screen observe)

This is the end-to-end proof gaebal-gajae called for: clawhip polling
.claw/worker-state.json will see truthful state at every step of the
recovery loop, including the seconds_since_update staleness signal.

90 tool tests passing, 0 failing.
2026-04-08 04:38:38 +09:00
YeonGyu-Kim
9461522af5 feat(tools): expose WorkerObserveCompletion tool; add provider-degraded classification tests
observe_completion() on WorkerRegistry classifies finish_reason into
Finished vs Failed (finish='unknown' + 0 tokens = provider degraded).
This logic existed in the runtime but had no tool wrapper — clawhip
could not call it. Added WorkerObserveCompletion as a first-class tool.

Tool schema:
  { worker_id, finish_reason: string, tokens_output: integer }

Handler: run_worker_observe_completion -> global_worker_registry().observe_completion()

Tests added:
- worker_observe_completion_success_finish_sets_finished_status
  finish=end_turn + tokens=512 -> status=finished
- worker_observe_completion_degraded_provider_sets_failed_status
  finish=unknown + tokens=0 -> status=failed, last_error populated
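
The classification rule exercised by these two tests can be sketched as follows. The names CompletionOutcome and classify_completion are hypothetical, and treating every other combination as Finished is an assumption; the real observe_completion() may apply further rules.

```rust
/// Hypothetical outcome type for illustration (assumption).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CompletionOutcome {
    Finished,
    Failed { last_error: String },
}

/// finish_reason == "unknown" with zero output tokens is read as a degraded
/// provider and marked Failed. Everything else is Finished (assumption).
pub fn classify_completion(finish_reason: &str, tokens_output: u64) -> CompletionOutcome {
    if finish_reason == "unknown" && tokens_output == 0 {
        CompletionOutcome::Failed {
            last_error: "provider degraded: finish_reason=unknown with 0 output tokens"
                .to_string(),
        }
    } else {
        CompletionOutcome::Finished
    }
}
```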

89 tool tests passing, 0 failing.
2026-04-08 04:35:05 +09:00
YeonGyu-Kim
c08f060ca1 test(tools): end-to-end stall-detect and recovery loop coverage
Proves the clawhip restart/recover flow that gaebal-gajae flagged:

1. stall_detect_and_resolve_trust_end_to_end
   - Worker created without trusted_roots -> trust_auto_resolve=false
   - WorkerObserve with trust-prompt text -> status=trust_required, trust_gate_cleared=false
   - WorkerResolveTrust -> status=spawning, trust_gate_cleared=true
   - WorkerObserve with ready text -> status=ready_for_prompt
   Full resolve path verified end-to-end.

2. stall_detect_and_restart_recovery_end_to_end
   - Worker stalls at trust_required
   - WorkerRestart resets to spawning, trust_gate_cleared=false
   Documents the restart-then-re-acquire-trust flow.
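
The two flows above can be sketched as a small state machine. Status and Worker here are hypothetical stand-ins, not the runtime's WorkerStatus or WorkerRegistry, and the transition guards are assumptions inferred from the test descriptions.

```rust
/// Hypothetical status enum mirroring the states named above (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Spawning,
    TrustRequired,
    ReadyForPrompt,
}

/// Hypothetical worker record holding only the fields the tests check.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Worker {
    pub status: Status,
    pub trust_gate_cleared: bool,
}

impl Worker {
    /// A worker created without trusted roots starts with the gate closed.
    pub fn new() -> Self {
        Worker { status: Status::Spawning, trust_gate_cleared: false }
    }

    /// Observing trust-prompt text stalls the worker until trust is resolved.
    pub fn observe_trust_prompt(&mut self) {
        if !self.trust_gate_cleared {
            self.status = Status::TrustRequired;
        }
    }

    /// Resolving trust clears the gate and returns the worker to spawning.
    pub fn resolve_trust(&mut self) {
        if self.status == Status::TrustRequired {
            self.status = Status::Spawning;
            self.trust_gate_cleared = true;
        }
    }

    /// Observing the ready screen while spawning marks the worker ready.
    pub fn observe_ready(&mut self) {
        if self.status == Status::Spawning {
            self.status = Status::ReadyForPrompt;
        }
    }

    /// Restart resets to spawning and closes the gate again, so trust must
    /// be re-acquired through the normal observe flow.
    pub fn restart(&mut self) {
        self.status = Status::Spawning;
        self.trust_gate_cleared = false;
    }
}
```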

Note: seconds_since_update is in .claw/worker-state.json (state file),
not in the Worker tool output struct. Staleness detection via state file
is covered by emit_state_file_writes_worker_status_on_transition in
worker_boot.rs tests.

87 tool tests passing, 0 failing.
2026-04-08 04:09:55 +09:00
YeonGyu-Kim
cae11413dd fix(dead-code): remove stale constants + dead function; add workspace_sessions_dir tests
Three dead-code warnings eliminated from cargo check:

1. KNOWN_TOP_LEVEL_KEYS / DEPRECATED_TOP_LEVEL_KEYS in config.rs
   - Superseded by config_validate::TOP_LEVEL_FIELDS and DEPRECATED_FIELDS
   - Were out of date (missing aliases, providerFallbacks, trustedRoots)
   - Removed

2. read_git_recent_commits in prompt.rs
   - Private function, never called anywhere in the codebase
   - Removed

3. workspace_sessions_dir in session.rs
   - Public API scaffolded for session isolation (#41)
   - Genuinely useful for external consumers (clawhip enumerating sessions)
   - Added 2 tests: deterministic path for same CWD, different path for different CWDs
   - Annotated with #[allow(dead_code)] since it is external-facing API

cargo check --workspace: 0 warnings remaining
430 runtime tests passing, 0 failing
2026-04-08 04:04:54 +09:00
YeonGyu-Kim
60410b6c92 docs(roadmap): settle observability transport — CLI/file is canonical, HTTP deferred
Closes the ambiguity gaebal-gajae flagged: downstream tooling was left
guessing which integration surface to build against.

Decision: claw state + .claw/worker-state.json is the blessed contract.
HTTP endpoint not scheduled. Rationale documented:
- plugin scope constraint (can't add routes to opencode serve)
- file polling has lower latency and fewer failure modes than HTTP
- HTTP would require upstreaming to sst/opencode or a fragile sidecar

Clawhip integration contract documented:
- poll .claw/worker-state.json after WorkerCreate
- seconds_since_update > 60 in trust_required = stall signal
- WorkerResolveTrust to unblock, WorkerRestart to reset
2026-04-08 03:34:31 +09:00
YeonGyu-Kim
aa37dc6936 test(tools): add coverage for WorkerRestart and WorkerTerminate tools
WorkerRestart and WorkerTerminate had zero test coverage despite being
public tools in the tool spec. Also confirms one design decision worth
noting: restart resets trust_gate_cleared=false, so an allowlisted
worker that gets restarted must re-acquire trust via the normal observe
flow (by design — trust is per-session, not per-CWD).

Tests added:
- worker_terminate_sets_finished_status
- worker_restart_resets_to_spawning (verifies status=spawning,
  prompt_in_flight=false, trust_gate_cleared=false)
- worker_terminate_on_unknown_id_returns_error
- worker_restart_on_unknown_id_returns_error

85 tool tests passing, 0 failing.
2026-04-08 03:33:05 +09:00
YeonGyu-Kim
6ddfa78b7c feat(tools): wire config.trusted_roots into WorkerCreate tool
Previously WorkerCreate passed trusted_roots directly to spawn_worker
with no config-level default. Any batch script omitting the field
stalled all workers at TrustRequired with no recovery path.

Now run_worker_create loads RuntimeConfig from the worker CWD before
spawning and merges config.trusted_roots() with per-call overrides.
Per-call overrides still take effect; config provides the default.

Add test: worker_create_merges_config_trusted_roots_without_per_call_override
- writes .claw/settings.json with trustedRoots=[<os-temp-dir>] in a temp worktree
- calls WorkerCreate with no trusted_roots field
- asserts trust_auto_resolve=true (config roots matched the CWD)

81 tool tests passing, 0 failing.
2026-04-08 03:08:13 +09:00
YeonGyu-Kim
bcdc52d72c feat(config): add trustedRoots to RuntimeConfig
Closes the startup-friction gap filed in ROADMAP (dd97c49).

WorkerCreate required trusted_roots on every call with no config-level
default. Any batch script that omitted the field stalled all workers at
TrustRequired with no auto-recovery path.

Changes:
- RuntimeFeatureConfig: add trusted_roots: Vec<String> field
- ConfigLoader: wire parse_optional_trusted_roots() for 'trustedRoots' key
- RuntimeConfig / RuntimeFeatureConfig: expose trusted_roots() accessor
- config_validate: add trustedRoots to TOP_LEVEL_FIELDS schema (StringArray)
- Tests: parses_trusted_roots_from_settings + trusted_roots_default_is_empty_when_unset

Callers can now set trusted_roots in .claw/settings.json:
  { "trustedRoots": ["/tmp/worktrees"] }

WorkerRegistry::spawn_worker() callers should merge config.trusted_roots()
with any per-call overrides (wiring left for follow-up).
2026-04-08 02:35:19 +09:00
YeonGyu-Kim
dd97c49e6b docs(roadmap): file startup-friction gap — no default trusted_roots in settings
WorkerCreate requires trusted_roots per-call; no config-level default.
Any batch that forgets the field stalls all workers at trust_required.
Root cause of several 'batch lanes not advancing' incidents.

Recommended fix: wire RuntimeConfig::trusted_roots() as default into
WorkerRegistry::spawn_worker(), with per-call overrides. Update
config_validate schema to include the new field.
2026-04-08 02:02:48 +09:00
13 changed files with 875 additions and 50 deletions

View File

@@ -404,3 +404,58 @@ to:
**Action item:** Wire `WorkerRegistry::transition()` to atomically write `.claw/worker-state.json` on every state transition. Add a `claw state` CLI subcommand that reads and prints this file. Add regression test.
**Prior session note:** A previous session summary claimed commit `0984cca` landed a `/state` HTTP endpoint via axum. This was incorrect — no such commit exists on main, axum is not a dependency, and the HTTP server is not ours. The actual work that exists: `worker_boot.rs` with `WorkerStatus` enum + `WorkerRegistry`, fully wired into `runtime/src/lib.rs` as public exports.
## Startup Friction Gap: No Default trusted_roots in Settings (filed 2026-04-08)
### Every lane starts with manual trust babysitting unless caller explicitly passes roots
**Root cause discovered during direct dogfood of WorkerCreate tool.**
`WorkerCreate` accepts a `trusted_roots: Vec<String>` parameter. If the caller omits it (or passes `[]`), every new worker immediately enters `TrustRequired` and stalls — requiring manual intervention to advance to `ReadyForPrompt`. There is no mechanism to configure a default allowlist in `settings.json` or `.claw/settings.json`.
**Impact:** Batch tooling (clawhip, lane orchestrators) must pass `trusted_roots` explicitly on every `WorkerCreate` call. If a batch script forgets the field, all workers in that batch stall silently at `trust_required`. This was the root cause of several "batch 8 lanes not advancing" incidents.
**Recommended fix:**
1. Add a `trusted_roots` field to `RuntimeConfig` (or a nested `[trust]` table), loaded via `ConfigLoader`.
2. In `WorkerRegistry::spawn_worker()`, merge config-level `trusted_roots` with any per-call overrides.
3. Default: empty list (safest). Users opt in by adding their repo paths to settings.
4. Update `config_validate` schema with the new field.
**Action item:** Wire `RuntimeConfig::trusted_roots()` into `WorkerRegistry::spawn_worker()` as the default. Cover with test: config with `trusted_roots = ["/tmp"]` → spawning worker in `/tmp/x` auto-resolves trust without caller passing the field.
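
A minimal sketch of the recommended merge, under stated assumptions: the helper names `merge_trusted_roots` and `trust_auto_resolves` are hypothetical, "merge" is taken to mean a de-duplicated union of config roots and per-call roots, and a CWD counts as trusted when it sits under any root by path component.

```rust
use std::path::Path;

/// Union of config-level roots and per-call overrides, config first.
/// Treating "merge" as a de-duplicated union is an assumption.
pub fn merge_trusted_roots(config_roots: &[String], per_call: &[String]) -> Vec<String> {
    let mut merged: Vec<String> = config_roots.to_vec();
    for root in per_call {
        if !merged.contains(root) {
            merged.push(root.clone());
        }
    }
    merged
}

/// True when `cwd` is inside one of the trusted roots.
/// `Path::starts_with` compares whole components, so "/tmpfoo" is not
/// considered to be under "/tmp".
pub fn trust_auto_resolves(cwd: &Path, roots: &[String]) -> bool {
    roots.iter().any(|root| cwd.starts_with(Path::new(root)))
}
```

With config roots `["/tmp"]` and no per-call roots, a worker in `/tmp/x` would auto-resolve, while an empty merged list leaves every worker at `trust_required`, which matches the "default: empty list" recommendation.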
## Observability Transport Decision (filed 2026-04-08)
### Canonical state surface: CLI/file-based. HTTP endpoint deferred.
**Decision:** `claw state` reading `.claw/worker-state.json` is the **blessed observability contract** for clawhip and downstream tooling. This is not a stepping-stone — it is the supported surface. Build against it.
**Rationale:**
- claw-code is a plugin running inside the opencode binary. It cannot add HTTP routes to `opencode serve` — that server belongs to upstream sst/opencode.
- The file-based surface is fully within plugin scope: `emit_state_file()` in `worker_boot.rs` writes atomically on every `WorkerStatus` transition.
- `claw state --output-format json` gives clawhip everything it needs: `status`, `is_ready`, `seconds_since_update`, `trust_gate_cleared`, `last_event`, `updated_at`.
- Polling a local file has lower latency and fewer failure modes than an HTTP round-trip to a sidecar.
- An HTTP state endpoint would require either (a) upstreaming a route to sst/opencode — a multi-week PR cycle with no guarantee of acceptance — or (b) a sidecar process that queries `WorkerRegistry` in-process, which is fragile and adds an extra failure domain.
**What downstream tooling (clawhip) should do:**
1. After `WorkerCreate`, poll `.claw/worker-state.json` (or run `claw state --output-format json`) in the worker's CWD at whatever interval makes sense (e.g. 5s).
2. Trust `seconds_since_update > 60` in `trust_required` status as the stall signal.
3. Call `WorkerResolveTrust` tool to unblock, or `WorkerRestart` to reset.
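
The polling contract above can be sketched as a pure decision function. `StateSnapshot`, `PollAction`, and `next_action` are hypothetical names; the snapshot carries only three of the fields listed for `claw state --output-format json`, and parsing the JSON itself is left out of this sketch.

```rust
/// Stall threshold from the contract: more than 60 seconds without an update.
pub const STALL_THRESHOLD_SECS: u64 = 60;

/// Hypothetical, already-parsed view of `.claw/worker-state.json`.
pub struct StateSnapshot<'a> {
    pub status: &'a str,
    pub is_ready: bool,
    pub seconds_since_update: u64,
}

/// What the poller should do next (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PollAction {
    /// Keep polling at the chosen interval.
    Wait,
    /// Worker is stalled at the trust gate: call WorkerResolveTrust or WorkerRestart.
    ResolveTrustOrRestart,
    /// Worker is ready to receive a prompt.
    Ready,
}

pub fn next_action(snapshot: &StateSnapshot<'_>) -> PollAction {
    if snapshot.is_ready {
        return PollAction::Ready;
    }
    if snapshot.status == "trust_required"
        && snapshot.seconds_since_update > STALL_THRESHOLD_SECS
    {
        return PollAction::ResolveTrustOrRestart;
    }
    PollAction::Wait
}
```

Keeping the decision separate from file I/O means the same function can serve both the file-polling path and the `claw state` CLI path.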
**HTTP endpoint tracking:** Not scheduled. If a concrete use case emerges that file polling cannot serve (e.g. remote workers over a network boundary), open a new issue to upstream a `/worker/state` route to sst/opencode at that time. Until then: file/CLI is canonical.
## Provider Routing: Model-Name Prefix Must Win Over Env-Var Presence (fixed 2026-04-08, `0530c50`)
### `openai/gpt-4.1-mini` was silently misrouted to Anthropic when ANTHROPIC_API_KEY was set
**Root cause:** `metadata_for_model` returned `None` for any model not matching `claude` or `grok` prefix.
`detect_provider_kind` then fell through to auth-sniffer order: first `has_auth_from_env_or_saved()` (Anthropic), then `OPENAI_API_KEY`, then `XAI_API_KEY`.
If `ANTHROPIC_API_KEY` was present in the environment (e.g. user has both Anthropic and OpenRouter configured), any unknown model — including explicitly namespaced ones like `openai/gpt-4.1-mini` — was silently routed to the Anthropic client, which then failed with `missing Anthropic credentials` or a confusing 402/auth error rather than routing to OpenAI-compatible.
**Fix:** Added explicit prefix checks in `metadata_for_model`:
- `openai/` prefix → `ProviderKind::OpenAi`
- `gpt-` prefix → `ProviderKind::OpenAi`
Model name prefix now wins unconditionally over env-var presence. Regression test locked in: `providers::tests::openai_namespaced_model_routes_to_openai_not_anthropic`.
**Lesson:** Auth-sniffer fallback order is fragile. Any new provider added in the future should be registered in `metadata_for_model` via a model-name prefix, not left to env-var order. This is the canonical extension point.
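
The invariant can be sketched in isolation as a two-stage lookup. `Provider`, `AuthEnv`, `provider_from_model_name`, and `detect_provider` are hypothetical stand-ins for the real `ProviderKind`, `metadata_for_model`, and `detect_provider_kind`; lowercasing the model name and falling back to Anthropic when no credentials are present are assumptions of this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Anthropic,
    OpenAi,
    Xai,
}

/// Which credentials are present in the environment (hypothetical struct).
pub struct AuthEnv {
    pub anthropic: bool,
    pub openai: bool,
    pub xai: bool,
}

/// Stage 1: the model-name prefix decides, if it is recognized.
pub fn provider_from_model_name(model: &str) -> Option<Provider> {
    let name = model.to_ascii_lowercase();
    if name.starts_with("claude") {
        Some(Provider::Anthropic)
    } else if name.starts_with("grok") {
        Some(Provider::Xai)
    } else if name.starts_with("openai/") || name.starts_with("gpt-") {
        Some(Provider::OpenAi)
    } else {
        None
    }
}

/// Stage 2: only unknown model names reach the auth-sniffer order
/// (Anthropic, then OpenAI, then xAI).
pub fn detect_provider(model: &str, auth: &AuthEnv) -> Provider {
    if let Some(provider) = provider_from_model_name(model) {
        return provider;
    }
    if auth.anthropic {
        Provider::Anthropic
    } else if auth.openai {
        Provider::OpenAi
    } else if auth.xai {
        Provider::Xai
    } else {
        Provider::Anthropic // assumed final default
    }
}
```

Because stage 1 returns early, adding a new provider amounts to adding one prefix arm, and the credential flags never get a chance to override an explicit model name.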

View File

@@ -704,6 +704,7 @@ mod tests {
tools: None,
tool_choice: None,
stream: false,
..Default::default()
}
}

View File

@@ -930,6 +930,15 @@ const fn is_retryable_status(status: reqwest::StatusCode) -> bool {
fn strip_unsupported_beta_body_fields(body: &mut Value) {
    if let Some(object) = body.as_object_mut() {
        object.remove("betas");
        // These fields are OpenAI-compatible only; Anthropic rejects them.
        object.remove("frequency_penalty");
        object.remove("presence_penalty");
        // Anthropic uses "stop_sequences" not "stop". Convert if present.
        if let Some(stop_val) = object.remove("stop") {
            if stop_val.as_array().map_or(false, |a| !a.is_empty()) {
                object.insert("stop_sequences".to_string(), stop_val);
            }
        }
    }
}
@@ -1259,6 +1268,7 @@ mod tests {
tools: None,
tool_choice: None,
stream: false,
..Default::default()
};
assert!(request.with_streaming().stream);
@@ -1438,6 +1448,46 @@ mod tests {
assert_eq!(body, original);
}
#[test]
fn strip_removes_openai_only_fields_and_converts_stop() {
let mut body = serde_json::json!({
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"temperature": 0.7,
"frequency_penalty": 0.5,
"presence_penalty": 0.3,
"stop": ["\n"],
});
super::strip_unsupported_beta_body_fields(&mut body);
// temperature is kept (Anthropic supports it)
assert_eq!(body["temperature"], serde_json::json!(0.7));
// frequency_penalty and presence_penalty are removed
assert!(body.get("frequency_penalty").is_none(),
"frequency_penalty must be stripped for Anthropic");
assert!(body.get("presence_penalty").is_none(),
"presence_penalty must be stripped for Anthropic");
// stop is renamed to stop_sequences
assert!(body.get("stop").is_none(), "stop must be renamed");
assert_eq!(body["stop_sequences"], serde_json::json!(["\n"]));
}
#[test]
fn strip_does_not_add_empty_stop_sequences() {
let mut body = serde_json::json!({
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"stop": [],
});
super::strip_unsupported_beta_body_fields(&mut body);
assert!(body.get("stop").is_none());
assert!(body.get("stop_sequences").is_none(),
"empty stop should not produce stop_sequences");
}
#[test]
fn rendered_request_body_strips_betas_for_standard_messages_endpoint() {
let client = AnthropicClient::new("test-key").with_beta("tools-2026-04-01");
@@ -1449,6 +1499,7 @@ mod tests {
tools: None,
tool_choice: None,
stream: false,
..Default::default()
};
let mut rendered = client

View File

@@ -169,6 +169,18 @@ pub fn metadata_for_model(model: &str) -> Option<ProviderMetadata> {
default_base_url: openai_compat::DEFAULT_XAI_BASE_URL,
});
}
    // Explicit provider-namespaced models (e.g. "openai/gpt-4.1-mini") must
    // route to the correct provider regardless of which auth env vars are set.
    // Without this, detect_provider_kind falls through to the auth-sniffer
    // order and misroutes to Anthropic if ANTHROPIC_API_KEY is present.
    if canonical.starts_with("openai/") || canonical.starts_with("gpt-") {
        return Some(ProviderMetadata {
            provider: ProviderKind::OpenAi,
            auth_env: "OPENAI_API_KEY",
            base_url_env: "OPENAI_BASE_URL",
            default_base_url: openai_compat::DEFAULT_OPENAI_BASE_URL,
        });
    }
    None
}
@@ -352,6 +364,28 @@ mod tests {
);
}
#[test]
fn openai_namespaced_model_routes_to_openai_not_anthropic() {
// Regression: "openai/gpt-4.1-mini" was misrouted to Anthropic when
// ANTHROPIC_API_KEY was set because metadata_for_model returned None
// and detect_provider_kind fell through to auth-sniffer order.
// The model prefix must win over env-var presence.
let kind = super::metadata_for_model("openai/gpt-4.1-mini")
.map(|m| m.provider)
.unwrap_or_else(|| detect_provider_kind("openai/gpt-4.1-mini"));
assert_eq!(
kind,
ProviderKind::OpenAi,
"openai/ prefix must route to OpenAi regardless of ANTHROPIC_API_KEY"
);
// Also cover bare gpt- prefix
let kind2 = super::metadata_for_model("gpt-4o")
.map(|m| m.provider)
.unwrap_or_else(|| detect_provider_kind("gpt-4o"));
assert_eq!(kind2, ProviderKind::OpenAi);
}
#[test]
fn keeps_existing_max_token_heuristic() {
assert_eq!(max_tokens_for_model("opus"), 32_000);
@@ -446,6 +480,7 @@ mod tests {
}]),
tool_choice: Some(ToolChoice::Auto),
stream: true,
..Default::default()
};
let error = preflight_message_request(&request)
@@ -484,6 +519,7 @@ mod tests {
tools: None,
tool_choice: None,
stream: false,
..Default::default()
};
preflight_message_request(&request)

View File

@@ -690,6 +690,19 @@ struct ErrorBody {
message: Option<String>,
}
/// Returns true for models known to reject tuning parameters like temperature,
/// top_p, frequency_penalty, and presence_penalty. These are typically
/// reasoning/chain-of-thought models with fixed sampling.
fn is_reasoning_model(model: &str) -> bool {
    let lowered = model.to_ascii_lowercase();
    // OpenAI reasoning models
    lowered.starts_with("o1")
        || lowered.starts_with("o3")
        || lowered.starts_with("o4")
        // xAI reasoning: grok-3-mini always uses reasoning mode
        || lowered == "grok-3-mini"
}
fn build_chat_completion_request(request: &MessageRequest, config: OpenAiCompatConfig) -> Value {
let mut messages = Vec::new();
if let Some(system) = request.system.as_ref().filter(|value| !value.is_empty()) {
@@ -721,6 +734,30 @@ fn build_chat_completion_request(request: &MessageRequest, config: OpenAiCompatC
payload["tool_choice"] = openai_tool_choice(tool_choice);
}
// OpenAI-compatible tuning parameters — only included when explicitly set.
// Reasoning models (o1/o3/o4/grok-3-mini) reject these params with 400;
// silently strip them to avoid cryptic provider errors.
if !is_reasoning_model(&request.model) {
if let Some(temperature) = request.temperature {
payload["temperature"] = json!(temperature);
}
if let Some(top_p) = request.top_p {
payload["top_p"] = json!(top_p);
}
if let Some(frequency_penalty) = request.frequency_penalty {
payload["frequency_penalty"] = json!(frequency_penalty);
}
if let Some(presence_penalty) = request.presence_penalty {
payload["presence_penalty"] = json!(presence_penalty);
}
}
// stop is generally safe for all providers
if let Some(stop) = &request.stop {
if !stop.is_empty() {
payload["stop"] = json!(stop);
}
}
payload
}
@@ -1009,8 +1046,9 @@ impl StringExt for String {
#[cfg(test)]
mod tests {
use super::{
-build_chat_completion_request, chat_completions_endpoint, normalize_finish_reason,
-openai_tool_choice, parse_tool_arguments, OpenAiCompatClient, OpenAiCompatConfig,
+build_chat_completion_request, chat_completions_endpoint, is_reasoning_model,
+normalize_finish_reason, openai_tool_choice, parse_tool_arguments, OpenAiCompatClient,
+OpenAiCompatConfig,
};
use crate::error::ApiError;
use crate::types::{
@@ -1049,6 +1087,7 @@ mod tests {
}]),
tool_choice: Some(ToolChoice::Auto),
stream: false,
..Default::default()
},
OpenAiCompatConfig::xai(),
);
@@ -1071,6 +1110,7 @@ mod tests {
tools: None,
tool_choice: None,
stream: true,
..Default::default()
},
OpenAiCompatConfig::openai(),
);
@@ -1089,6 +1129,7 @@ mod tests {
tools: None,
tool_choice: None,
stream: true,
..Default::default()
},
OpenAiCompatConfig::xai(),
);
@@ -1159,4 +1200,79 @@ mod tests {
assert_eq!(normalize_finish_reason("stop"), "end_turn");
assert_eq!(normalize_finish_reason("tool_calls"), "tool_use");
}
#[test]
fn tuning_params_included_in_payload_when_set() {
let request = MessageRequest {
model: "gpt-4o".to_string(),
max_tokens: 1024,
messages: vec![],
system: None,
tools: None,
tool_choice: None,
stream: false,
temperature: Some(0.7),
top_p: Some(0.9),
frequency_penalty: Some(0.5),
presence_penalty: Some(0.3),
stop: Some(vec!["\n".to_string()]),
};
let payload = build_chat_completion_request(&request, OpenAiCompatConfig::openai());
assert_eq!(payload["temperature"], 0.7);
assert_eq!(payload["top_p"], 0.9);
assert_eq!(payload["frequency_penalty"], 0.5);
assert_eq!(payload["presence_penalty"], 0.3);
assert_eq!(payload["stop"], json!(["\n"]));
}
#[test]
fn reasoning_model_strips_tuning_params() {
let request = MessageRequest {
model: "o1-mini".to_string(),
max_tokens: 1024,
messages: vec![],
stream: false,
temperature: Some(0.7),
top_p: Some(0.9),
frequency_penalty: Some(0.5),
presence_penalty: Some(0.3),
stop: Some(vec!["\n".to_string()]),
..Default::default()
};
let payload = build_chat_completion_request(&request, OpenAiCompatConfig::openai());
assert!(payload.get("temperature").is_none(), "reasoning model should strip temperature");
assert!(payload.get("top_p").is_none(), "reasoning model should strip top_p");
assert!(payload.get("frequency_penalty").is_none());
assert!(payload.get("presence_penalty").is_none());
// stop is safe for all providers
assert_eq!(payload["stop"], json!(["\n"]));
}
#[test]
fn grok_3_mini_is_reasoning_model() {
assert!(is_reasoning_model("grok-3-mini"));
assert!(is_reasoning_model("o1"));
assert!(is_reasoning_model("o1-mini"));
assert!(is_reasoning_model("o3-mini"));
assert!(!is_reasoning_model("gpt-4o"));
assert!(!is_reasoning_model("grok-3"));
assert!(!is_reasoning_model("claude-sonnet-4-6"));
}
#[test]
fn tuning_params_omitted_from_payload_when_none() {
let request = MessageRequest {
model: "gpt-4o".to_string(),
max_tokens: 1024,
messages: vec![],
stream: false,
..Default::default()
};
let payload = build_chat_completion_request(&request, OpenAiCompatConfig::openai());
assert!(payload.get("temperature").is_none(), "temperature should be absent");
assert!(payload.get("top_p").is_none(), "top_p should be absent");
assert!(payload.get("frequency_penalty").is_none());
assert!(payload.get("presence_penalty").is_none());
assert!(payload.get("stop").is_none());
}
}

View File

@@ -2,7 +2,7 @@ use runtime::{pricing_for_model, TokenUsage, UsageCostEstimate};
use serde::{Deserialize, Serialize};
use serde_json::Value;
-#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
+#[derive(Debug, Clone, PartialEq, Serialize, Deserialize, Default)]
pub struct MessageRequest {
pub model: String,
pub max_tokens: u32,
@@ -15,6 +15,17 @@ pub struct MessageRequest {
pub tool_choice: Option<ToolChoice>,
#[serde(default, skip_serializing_if = "std::ops::Not::not")]
pub stream: bool,
/// OpenAI-compatible tuning parameters. Optional — omitted from payload when None.
#[serde(skip_serializing_if = "Option::is_none")]
pub temperature: Option<f64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub top_p: Option<f64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub frequency_penalty: Option<f64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub presence_penalty: Option<f64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub stop: Option<Vec<String>>,
}
impl MessageRequest {

View File

@@ -4469,7 +4469,7 @@ mod tests {
assert!(help.contains("/diff"));
assert!(help.contains("/version"));
assert!(help.contains("/export [file]"));
-assert!(help.contains("/session [list|switch <session-id>|fork [branch-name]]"));
+assert!(help.contains("/session"), "help must mention /session");
assert!(help.contains("/sandbox"));
assert!(help.contains(
"/plugin [list|install <path>|enable <name>|disable <name>|uninstall <id>|update <id>]"

View File

@@ -9,27 +9,6 @@ use crate::sandbox::{FilesystemIsolationMode, SandboxConfig};
/// Schema name advertised by generated settings files.
pub const CLAW_SETTINGS_SCHEMA_NAME: &str = "SettingsSchema";
-/// Top-level settings keys recognized by the runtime configuration loader.
-const KNOWN_TOP_LEVEL_KEYS: &[&str] = &[
-    "$schema",
-    "enabledPlugins",
-    "env",
-    "hooks",
-    "mcpServers",
-    "model",
-    "oauth",
-    "permissionMode",
-    "permissions",
-    "plugins",
-    "sandbox",
-];
-/// Deprecated top-level keys mapped to their replacement guidance.
-const DEPRECATED_TOP_LEVEL_KEYS: &[(&str, &str)] = &[
-    ("allowedTools", "permissions.allow"),
-    ("ignorePatterns", "permissions.deny"),
-];
/// Origin of a loaded settings file in the configuration precedence chain.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ConfigSource {
@@ -85,6 +64,7 @@ pub struct RuntimeFeatureConfig {
permission_rules: RuntimePermissionRuleConfig,
sandbox: SandboxConfig,
provider_fallbacks: ProviderFallbackConfig,
trusted_roots: Vec<String>,
}
/// Ordered chain of fallback model identifiers used when the primary
@@ -334,6 +314,7 @@ impl ConfigLoader {
permission_rules: parse_optional_permission_rules(&merged_value)?,
sandbox: parse_optional_sandbox_config(&merged_value)?,
provider_fallbacks: parse_optional_provider_fallbacks(&merged_value)?,
trusted_roots: parse_optional_trusted_roots(&merged_value)?,
};
Ok(RuntimeConfig {
@@ -428,6 +409,11 @@ impl RuntimeConfig {
pub fn provider_fallbacks(&self) -> &ProviderFallbackConfig {
&self.feature_config.provider_fallbacks
}
#[must_use]
pub fn trusted_roots(&self) -> &[String] {
&self.feature_config.trusted_roots
}
}
impl RuntimeFeatureConfig {
@@ -492,6 +478,11 @@ impl RuntimeFeatureConfig {
pub fn provider_fallbacks(&self) -> &ProviderFallbackConfig {
&self.provider_fallbacks
}
#[must_use]
pub fn trusted_roots(&self) -> &[String] {
&self.trusted_roots
}
}
impl ProviderFallbackConfig {
@@ -913,6 +904,14 @@ fn parse_optional_provider_fallbacks(
Ok(ProviderFallbackConfig { primary, fallbacks })
}
fn parse_optional_trusted_roots(root: &JsonValue) -> Result<Vec<String>, ConfigError> {
    let Some(object) = root.as_object() else {
        return Ok(Vec::new());
    };
    Ok(optional_string_array(object, "trustedRoots", "merged settings.trustedRoots")?
        .unwrap_or_default())
}
fn parse_filesystem_mode_label(value: &str) -> Result<FilesystemIsolationMode, ConfigError> {
match value {
"off" => Ok(FilesystemIsolationMode::Off),
@@ -1465,6 +1464,53 @@ mod tests {
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn parses_trusted_roots_from_settings() {
// given
let root = temp_dir();
let cwd = root.join("project");
let home = root.join("home").join(".claw");
fs::create_dir_all(&home).expect("home config dir");
fs::create_dir_all(&cwd).expect("project dir");
fs::write(
home.join("settings.json"),
r#"{"trustedRoots": ["/tmp/worktrees", "/home/user/projects"]}"#,
)
.expect("write settings");
// when
let loaded = ConfigLoader::new(&cwd, &home)
.load()
.expect("config should load");
// then
let roots = loaded.trusted_roots();
assert_eq!(roots, ["/tmp/worktrees", "/home/user/projects"]);
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn trusted_roots_default_is_empty_when_unset() {
// given
let root = temp_dir();
let cwd = root.join("project");
let home = root.join("home").join(".claw");
fs::create_dir_all(&home).expect("home config dir");
fs::create_dir_all(&cwd).expect("project dir");
fs::write(home.join("settings.json"), "{}").expect("write empty settings");
// when
let loaded = ConfigLoader::new(&cwd, &home)
.load()
.expect("config should load");
// then
assert!(loaded.trusted_roots().is_empty());
fs::remove_dir_all(root).expect("cleanup temp dir");
}
#[test]
fn parses_typed_mcp_and_oauth_config() {
let root = temp_dir();


@@ -193,6 +193,10 @@ const TOP_LEVEL_FIELDS: &[FieldSpec] = &[
name: "providerFallbacks",
expected: FieldType::Object,
},
FieldSpec {
name: "trustedRoots",
expected: FieldType::StringArray,
},
];
const HOOKS_FIELDS: &[FieldSpec] = &[


@@ -253,30 +253,6 @@ fn read_git_status(cwd: &Path) -> Option<String> {
}
}
fn read_git_recent_commits(cwd: &Path) -> Option<String> {
let output = Command::new("git")
.args([
"--no-optional-locks",
"log",
"--oneline",
"--no-decorate",
"-n",
"5",
])
.current_dir(cwd)
.output()
.ok()?;
if !output.status.success() {
return None;
}
let stdout = String::from_utf8(output.stdout).ok()?;
let trimmed = stdout.trim();
if trimmed.is_empty() {
None
} else {
Some(trimmed.to_string())
}
}
fn read_git_diff(cwd: &Path) -> Option<String> {
let mut sections = Vec::new();


@@ -1438,8 +1438,52 @@ mod tests {
/// Per-worktree session isolation: returns a session directory namespaced
/// by the workspace fingerprint of the given working directory.
/// This prevents parallel `opencode serve` instances from colliding.
/// Called by external consumers (e.g. clawhip) to enumerate sessions for a CWD.
#[allow(dead_code)]
pub fn workspace_sessions_dir(cwd: &std::path::Path) -> Result<std::path::PathBuf, SessionError> {
let store = crate::session_control::SessionStore::from_cwd(cwd)
.map_err(|e| SessionError::Io(std::io::Error::new(std::io::ErrorKind::Other, e.to_string())))?;
Ok(store.sessions_dir().to_path_buf())
}
#[cfg(test)]
mod workspace_sessions_dir_tests {
use super::*;
use std::fs;
#[test]
fn workspace_sessions_dir_returns_fingerprinted_path_for_valid_cwd() {
let tmp = std::env::temp_dir().join("claw-session-dir-test");
fs::create_dir_all(&tmp).expect("create temp dir");
let result = workspace_sessions_dir(&tmp);
assert!(
result.is_ok(),
"workspace_sessions_dir should succeed for a valid CWD, got: {:?}",
result
);
let dir = result.unwrap();
// The returned path should be non-empty (the hash component itself is not checked here)
assert!(!dir.as_os_str().is_empty());
// Two calls with the same CWD should produce identical paths (deterministic)
let result2 = workspace_sessions_dir(&tmp).unwrap();
assert_eq!(dir, result2, "workspace_sessions_dir must be deterministic");
fs::remove_dir_all(&tmp).ok();
}
#[test]
fn workspace_sessions_dir_differs_for_different_cwds() {
let tmp_a = std::env::temp_dir().join("claw-session-dir-a");
let tmp_b = std::env::temp_dir().join("claw-session-dir-b");
fs::create_dir_all(&tmp_a).expect("create dir a");
fs::create_dir_all(&tmp_b).expect("create dir b");
let dir_a = workspace_sessions_dir(&tmp_a).expect("dir a");
let dir_b = workspace_sessions_dir(&tmp_b).expect("dir b");
assert_ne!(dir_a, dir_b, "different CWDs must produce different session dirs");
fs::remove_dir_all(&tmp_a).ok();
fs::remove_dir_all(&tmp_b).ok();
}
}
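
The per-worktree isolation described in the doc comment above can be sketched roughly as follows. This is only an illustration and not the `SessionStore` implementation, which this diff does not show: the FNV-1a hash, the helper names `workspace_fingerprint` and `sessions_dir_for`, and the `<base>/sessions/<fingerprint>` layout are all assumptions made for the example.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical fingerprint: 64-bit FNV-1a over the path's string form.
/// The real store may hash differently; FNV-1a is used here because it is
/// deterministic across runs and needs no external crates.
fn workspace_fingerprint(cwd: &Path) -> String {
    const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
    const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;
    let text = cwd.to_string_lossy();
    let mut hash = FNV_OFFSET;
    for byte in text.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    // Fixed-width hex so every fingerprint is 16 characters long.
    format!("{hash:016x}")
}

/// Hypothetical layout: <base>/sessions/<fingerprint>.
/// The same cwd always maps to the same directory; different cwds map to
/// different directories, so parallel instances do not share a session dir.
fn sessions_dir_for(base: &Path, cwd: &Path) -> PathBuf {
    base.join("sessions").join(workspace_fingerprint(cwd))
}
```

Calling `sessions_dir_for` twice with the same `cwd` yields the same path, and two different working directories yield different paths. These are the two properties the tests above assert for `workspace_sessions_dir`.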


@@ -6375,6 +6375,7 @@ impl ApiClient for AnthropicRuntimeClient {
.then(|| filter_tool_specs(&self.tool_registry, self.allowed_tools.as_ref())),
tool_choice: self.enable_tools.then_some(ToolChoice::Auto),
stream: true,
..Default::default()
};
self.runtime.block_on(async {


@@ -963,6 +963,21 @@ pub fn mvp_tool_specs() -> Vec<ToolSpec> {
}),
required_permission: PermissionMode::DangerFullAccess,
},
ToolSpec {
name: "WorkerObserveCompletion",
description: "Report session completion to the worker, classifying finish_reason into Finished or Failed (provider-degraded). Use after the opencode session completes to advance the worker to its terminal state.",
input_schema: json!({
"type": "object",
"properties": {
"worker_id": { "type": "string" },
"finish_reason": { "type": "string" },
"tokens_output": { "type": "integer", "minimum": 0 }
},
"required": ["worker_id", "finish_reason", "tokens_output"],
"additionalProperties": false
}),
required_permission: PermissionMode::DangerFullAccess,
},
ToolSpec {
name: "TeamCreate",
description: "Create a team of sub-agents for parallel task execution.",
@@ -1229,6 +1244,10 @@ fn execute_tool_with_enforcer(
}
"WorkerRestart" => from_value::<WorkerIdInput>(input).and_then(run_worker_restart),
"WorkerTerminate" => from_value::<WorkerIdInput>(input).and_then(run_worker_terminate),
"WorkerObserveCompletion" => {
from_value::<WorkerObserveCompletionInput>(input)
.and_then(run_worker_observe_completion)
}
"TeamCreate" => from_value::<TeamCreateInput>(input).and_then(run_team_create),
"TeamDelete" => from_value::<TeamDeleteInput>(input).and_then(run_team_delete),
"CronCreate" => from_value::<CronCreateInput>(input).and_then(run_cron_create),
@@ -1427,9 +1446,20 @@ fn run_task_output(input: TaskIdInput) -> Result<String, String> {
#[allow(clippy::needless_pass_by_value)]
fn run_worker_create(input: WorkerCreateInput) -> Result<String, String> {
// Merge config-level trusted_roots with per-call overrides.
// Config provides the default allowlist; per-call roots add on top.
let config_roots: Vec<String> = ConfigLoader::default_for(&input.cwd)
.load()
.ok()
.map(|c| c.trusted_roots().to_vec())
.unwrap_or_default();
let merged_roots: Vec<String> = config_roots
.into_iter()
.chain(input.trusted_roots.iter().cloned())
.collect();
let worker = global_worker_registry().create(
&input.cwd,
- &input.trusted_roots,
+ &merged_roots,
input.auto_recover_prompt_misdelivery,
);
to_pretty_json(worker)
@@ -1479,6 +1509,18 @@ fn run_worker_terminate(input: WorkerIdInput) -> Result<String, String> {
to_pretty_json(worker)
}
#[allow(clippy::needless_pass_by_value)]
fn run_worker_observe_completion(
input: WorkerObserveCompletionInput,
) -> Result<String, String> {
let worker = global_worker_registry().observe_completion(
&input.worker_id,
&input.finish_reason,
input.tokens_output,
)?;
to_pretty_json(worker)
}
#[allow(clippy::needless_pass_by_value)]
fn run_team_create(input: TeamCreateInput) -> Result<String, String> {
let task_ids: Vec<String> = input
@@ -2213,6 +2255,13 @@ struct WorkerIdInput {
worker_id: String,
}
#[derive(Debug, Deserialize)]
struct WorkerObserveCompletionInput {
worker_id: String,
finish_reason: String,
tokens_output: u64,
}
#[derive(Debug, Deserialize)]
struct WorkerObserveInput {
worker_id: String,
@@ -3792,6 +3841,7 @@ impl ApiClient for ProviderRuntimeClient {
tools: (!tools.is_empty()).then(|| tools.clone()),
tool_choice: tool_choice.clone(),
stream: true,
..Default::default()
};
let attempt = runtime.block_on(stream_with_provider(&entry.client, &message_request));
@@ -5506,6 +5556,440 @@ mod tests {
assert_eq!(accepted_output["prompt_in_flight"], true);
}
#[test]
fn worker_create_merges_config_trusted_roots_without_per_call_override() {
use std::fs;
// Write a .claw/settings.json in a temp dir with trustedRoots
let worktree = temp_path("config-trust-worktree");
let claw_dir = worktree.join(".claw");
fs::create_dir_all(&claw_dir).expect("create .claw dir");
// Use the actual OS temp dir so the worktree path matches the allowlist
let tmp_root = std::env::temp_dir().to_str().expect("utf-8").to_string();
let settings = format!("{{\"trustedRoots\": [\"{tmp_root}\"]}}");
fs::write(
claw_dir.join("settings.json"),
settings,
)
.expect("write settings");
// WorkerCreate with no per-call trusted_roots — config should supply them
let cwd = worktree.to_str().expect("valid utf-8").to_string();
let created = execute_tool(
"WorkerCreate",
&json!({
"cwd": cwd
// trusted_roots intentionally omitted
}),
)
.expect("WorkerCreate should succeed");
let output: serde_json::Value = serde_json::from_str(&created).expect("json");
// worktree is under the OS temp dir, so config roots auto-resolve trust
assert_eq!(
output["trust_auto_resolve"], true,
"config-level trustedRoots should auto-resolve trust without per-call override"
);
fs::remove_dir_all(&worktree).ok();
}
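
The merge exercised by this test can be sketched in isolation. The ordering (config roots first, per-call roots appended, no de-duplication) follows the chain in `run_worker_create`. The trust check is an assumption: the diff does not show how the worker registry matches a `cwd` against the allowlist, so `is_under_trusted_root` is a hypothetical helper that uses a component-wise prefix comparison.

```rust
use std::path::Path;

/// Config-level roots come first; per-call roots are appended after them.
/// No de-duplication is attempted, mirroring the chain in the diff.
fn merge_trusted_roots(config_roots: &[String], per_call_roots: &[String]) -> Vec<String> {
    config_roots
        .iter()
        .chain(per_call_roots.iter())
        .cloned()
        .collect()
}

/// Assumed trust rule (not shown in the diff): a cwd is trusted when it
/// lies under any root. `Path::starts_with` compares whole components, so
/// "/tmp/worktrees-evil" is not considered to be under "/tmp/worktrees".
fn is_under_trusted_root(cwd: &str, roots: &[String]) -> bool {
    roots.iter().any(|root| Path::new(cwd).starts_with(root))
}
```

Under these assumptions, a worker created with no per-call `trusted_roots` is still trusted whenever its `cwd` falls under a root supplied by the config, which is the behaviour the test checks through `trust_auto_resolve`.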
#[test]
fn worker_terminate_sets_finished_status() {
// Create a worker (it starts in the spawning state)
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": "/tmp/terminate-test", "trusted_roots": ["/tmp"]}),
)
.expect("WorkerCreate should succeed");
let output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = output["worker_id"].as_str().expect("worker_id").to_string();
// Terminate
let terminated = execute_tool(
"WorkerTerminate",
&json!({"worker_id": worker_id}),
)
.expect("WorkerTerminate should succeed");
let term_output: serde_json::Value = serde_json::from_str(&terminated).expect("json");
assert_eq!(term_output["status"], "finished", "terminated worker should be finished");
assert_eq!(
term_output["prompt_in_flight"], false,
"prompt_in_flight should be cleared on termination"
);
}
#[test]
fn worker_restart_resets_to_spawning() {
// Create and advance worker to ready_for_prompt
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": "/tmp/restart-test", "trusted_roots": ["/tmp"]}),
)
.expect("WorkerCreate should succeed");
let output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = output["worker_id"].as_str().expect("worker_id").to_string();
// Advance to ready_for_prompt via observe
execute_tool(
"WorkerObserve",
&json!({"worker_id": worker_id, "screen_text": "Ready for input\n>"}),
)
.expect("WorkerObserve should succeed");
// Restart
let restarted = execute_tool(
"WorkerRestart",
&json!({"worker_id": worker_id}),
)
.expect("WorkerRestart should succeed");
let restart_output: serde_json::Value = serde_json::from_str(&restarted).expect("json");
assert_eq!(
restart_output["status"], "spawning",
"restarted worker should return to spawning"
);
assert_eq!(
restart_output["prompt_in_flight"], false,
"prompt_in_flight should be cleared on restart"
);
assert_eq!(
restart_output["trust_gate_cleared"], false,
"trust_gate_cleared should be reset on restart (re-trust required)"
);
}
#[test]
fn worker_get_returns_worker_state() {
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": "/tmp/worker-get-test", "trusted_roots": ["/tmp"]}),
)
.expect("WorkerCreate should succeed");
let created_output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = created_output["worker_id"].as_str().expect("worker_id");
let fetched = execute_tool(
"WorkerGet",
&json!({"worker_id": worker_id}),
)
.expect("WorkerGet should succeed");
let fetched_output: serde_json::Value = serde_json::from_str(&fetched).expect("json");
assert_eq!(fetched_output["worker_id"], worker_id);
assert_eq!(fetched_output["status"], "spawning");
assert_eq!(fetched_output["cwd"], "/tmp/worker-get-test");
}
#[test]
fn worker_get_on_unknown_id_returns_error() {
let result = execute_tool(
"WorkerGet",
&json!({"worker_id": "worker_nonexistent_get_00000000"}),
);
assert!(
result.is_err(),
"WorkerGet on unknown id should return error"
);
assert!(
result.unwrap_err().contains("worker not found"),
"error should mention worker not found"
);
}
#[test]
fn worker_await_ready_on_spawning_worker_returns_not_ready() {
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": "/tmp/worker-await-not-ready"}),
)
.expect("WorkerCreate should succeed");
let created_output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = created_output["worker_id"].as_str().expect("worker_id");
// Worker is still in spawning — await_ready should return not-ready snapshot
let snapshot = execute_tool(
"WorkerAwaitReady",
&json!({"worker_id": worker_id}),
)
.expect("WorkerAwaitReady should succeed even when not ready");
let snap_output: serde_json::Value = serde_json::from_str(&snapshot).expect("json");
assert_eq!(
snap_output["ready"], false,
"WorkerAwaitReady on a spawning worker must return ready=false"
);
assert_eq!(snap_output["worker_id"], worker_id);
}
#[test]
fn worker_send_prompt_on_non_ready_worker_returns_error() {
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": "/tmp/worker-send-not-ready"}),
)
.expect("WorkerCreate should succeed");
let created_output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = created_output["worker_id"].as_str().expect("worker_id");
let result = execute_tool(
"WorkerSendPrompt",
&json!({"worker_id": worker_id, "prompt": "too early"}),
);
assert!(
result.is_err(),
"WorkerSendPrompt on a non-ready worker should fail"
);
}
#[test]
fn recovery_loop_state_file_reflects_transitions() {
// End-to-end proof: .claw/worker-state.json reflects every transition
// through the stall-detect -> resolve-trust -> ready loop.
use std::fs;
// Use a real temp CWD so state file can be written
let worktree = temp_path("recovery-loop-state");
fs::create_dir_all(&worktree).expect("create worktree");
let cwd = worktree.to_str().expect("utf-8").to_string();
let state_path = worktree.join(".claw").join("worker-state.json");
// 1. Create worker WITHOUT trusted_roots
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": cwd}),
)
.expect("WorkerCreate should succeed");
let created_output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = created_output["worker_id"].as_str().expect("worker_id").to_string();
// State file should exist after create
assert!(state_path.exists(), "state file should be written after WorkerCreate");
let state: serde_json::Value = serde_json::from_str(
&fs::read_to_string(&state_path).expect("read state")
).expect("parse state");
assert_eq!(state["status"], "spawning");
assert_eq!(state["is_ready"], false);
assert!(state["seconds_since_update"].is_number(), "seconds_since_update must be present");
// 2. Force trust_required via observe
execute_tool(
"WorkerObserve",
&json!({"worker_id": worker_id, "screen_text": "Do you trust the files in this folder?"}),
)
.expect("WorkerObserve should succeed");
let state: serde_json::Value = serde_json::from_str(
&fs::read_to_string(&state_path).expect("read state")
).expect("parse state");
assert_eq!(state["status"], "trust_required",
"state file must reflect trust_required stall");
assert_eq!(state["is_ready"], false);
assert_eq!(state["trust_gate_cleared"], false);
assert!(state["seconds_since_update"].is_number());
// 3. WorkerResolveTrust -> state file reflects recovery
execute_tool(
"WorkerResolveTrust",
&json!({"worker_id": worker_id}),
)
.expect("WorkerResolveTrust should succeed");
let state: serde_json::Value = serde_json::from_str(
&fs::read_to_string(&state_path).expect("read state")
).expect("parse state");
assert_eq!(state["status"], "spawning",
"state file must show spawning after trust resolved");
assert_eq!(state["trust_gate_cleared"], true);
// 4. Observe ready screen -> state file shows ready_for_prompt
execute_tool(
"WorkerObserve",
&json!({"worker_id": worker_id, "screen_text": "Ready for input\n>"}),
)
.expect("WorkerObserve ready should succeed");
let state: serde_json::Value = serde_json::from_str(
&fs::read_to_string(&state_path).expect("read state")
).expect("parse state");
assert_eq!(state["status"], "ready_for_prompt",
"state file must show ready_for_prompt after ready screen");
assert_eq!(state["is_ready"], true,
"is_ready must be true in state file at ready_for_prompt");
fs::remove_dir_all(&worktree).ok();
}
#[test]
fn stall_detect_and_resolve_trust_end_to_end() {
// 1. Create worker WITHOUT trusted_roots so trust won't auto-resolve
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": "/no/trusted/root/here"}),
)
.expect("WorkerCreate should succeed");
let created_output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = created_output["worker_id"].as_str().expect("worker_id").to_string();
assert_eq!(created_output["trust_auto_resolve"], false);
// 2. Observe trust prompt screen text -> worker stalls at trust_required
let stalled = execute_tool(
"WorkerObserve",
&json!({
"worker_id": worker_id,
"screen_text": "Do you trust the files in this folder?\n[Allow] [Deny]"
}),
)
.expect("WorkerObserve should succeed");
let stalled_output: serde_json::Value = serde_json::from_str(&stalled).expect("json");
assert_eq!(
stalled_output["status"], "trust_required",
"worker should stall at trust_required when trust prompt seen without allowlist"
);
assert_eq!(stalled_output["trust_gate_cleared"], false);
// 3. Clawhip calls WorkerResolveTrust to unblock
let resolved = execute_tool(
"WorkerResolveTrust",
&json!({"worker_id": worker_id}),
)
.expect("WorkerResolveTrust should succeed");
let resolved_output: serde_json::Value = serde_json::from_str(&resolved).expect("json");
assert_eq!(
resolved_output["status"], "spawning",
"worker should return to spawning after trust resolved"
);
assert_eq!(resolved_output["trust_gate_cleared"], true);
// 4. Ready screen text now advances worker normally
let ready = execute_tool(
"WorkerObserve",
&json!({
"worker_id": worker_id,
"screen_text": "Ready for input\n>"
}),
)
.expect("WorkerObserve should succeed after trust resolved");
let ready_output: serde_json::Value = serde_json::from_str(&ready).expect("json");
assert_eq!(
ready_output["status"], "ready_for_prompt",
"worker should reach ready_for_prompt after trust resolved and ready screen seen"
);
}
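
The transitions this test walks through can be modelled as a small state machine. The sketch below is a simplification under stated assumptions: the substring rules used to recognise a trust prompt and a ready screen are hypothetical (the real detection logic is not part of this diff), and the allowlisted auto-resolve path is left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WorkerStatus {
    Spawning,
    TrustRequired,
    ReadyForPrompt,
}

#[derive(Debug)]
struct Worker {
    status: WorkerStatus,
    trust_gate_cleared: bool,
}

impl Worker {
    fn new() -> Self {
        Self { status: WorkerStatus::Spawning, trust_gate_cleared: false }
    }

    /// Hypothetical screen classification: plain substring matching.
    fn observe(&mut self, screen_text: &str) {
        let lowered = screen_text.to_lowercase();
        let trust_prompt = lowered.contains("trust") && lowered.contains("folder");
        if trust_prompt && !self.trust_gate_cleared {
            // No allowlist in this sketch, so the worker stalls here.
            self.status = WorkerStatus::TrustRequired;
        } else if lowered.contains("ready for input") && self.status == WorkerStatus::Spawning {
            self.status = WorkerStatus::ReadyForPrompt;
        }
    }

    /// Unblocks a stalled worker: back to spawning with the gate cleared.
    fn resolve_trust(&mut self) {
        if self.status == WorkerStatus::TrustRequired {
            self.status = WorkerStatus::Spawning;
            self.trust_gate_cleared = true;
        }
    }

    /// Restart resets everything, so trust must be re-acquired.
    fn restart(&mut self) {
        self.status = WorkerStatus::Spawning;
        self.trust_gate_cleared = false;
    }
}
```

In this model a stalled worker has two ways out, matching the two end-to-end tests: `resolve_trust` keeps the worker and clears the gate, while `restart` returns it to `Spawning` with the gate closed again.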
#[test]
fn stall_detect_and_restart_recovery_end_to_end() {
// Worker stalls at trust_required, clawhip restarts instead of resolving
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": "/no/trusted/root/restart-test"}),
)
.expect("WorkerCreate should succeed");
let created_output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = created_output["worker_id"].as_str().expect("worker_id").to_string();
// Force trust_required
let stalled = execute_tool(
"WorkerObserve",
&json!({
"worker_id": worker_id,
"screen_text": "trust this folder? [Yes] [No]"
}),
)
.expect("WorkerObserve should succeed");
let stalled_output: serde_json::Value = serde_json::from_str(&stalled).expect("json");
assert_eq!(stalled_output["status"], "trust_required");
// WorkerRestart resets the worker
let restarted = execute_tool(
"WorkerRestart",
&json!({"worker_id": worker_id}),
)
.expect("WorkerRestart should succeed");
let restarted_output: serde_json::Value = serde_json::from_str(&restarted).expect("json");
assert_eq!(
restarted_output["status"], "spawning",
"restarted worker should be back at spawning"
);
assert_eq!(restarted_output["trust_gate_cleared"], false,
"restart clears trust — next observe loop must re-acquire trust"
);
}
#[test]
fn worker_terminate_on_unknown_id_returns_error() {
let result = execute_tool(
"WorkerTerminate",
&json!({"worker_id": "worker_nonexistent_00000000"}),
);
assert!(result.is_err(), "terminating unknown worker should fail");
assert!(
result.unwrap_err().contains("worker not found"),
"error should mention worker not found"
);
}
#[test]
fn worker_restart_on_unknown_id_returns_error() {
let result = execute_tool(
"WorkerRestart",
&json!({"worker_id": "worker_nonexistent_00000001"}),
);
assert!(result.is_err(), "restarting unknown worker should fail");
assert!(
result.unwrap_err().contains("worker not found"),
"error should mention worker not found"
);
}
#[test]
fn worker_observe_completion_success_finish_sets_finished_status() {
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": "/tmp/observe-completion-test", "trusted_roots": ["/tmp"]}),
)
.expect("WorkerCreate should succeed");
let output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = output["worker_id"].as_str().expect("worker_id").to_string();
let completed = execute_tool(
"WorkerObserveCompletion",
&json!({
"worker_id": worker_id,
"finish_reason": "end_turn",
"tokens_output": 512
}),
)
.expect("WorkerObserveCompletion should succeed");
let completed_output: serde_json::Value = serde_json::from_str(&completed).expect("json");
assert_eq!(completed_output["status"], "finished");
assert_eq!(completed_output["prompt_in_flight"], false);
}
#[test]
fn worker_observe_completion_degraded_provider_sets_failed_status() {
let created = execute_tool(
"WorkerCreate",
&json!({"cwd": "/tmp/observe-degraded-test", "trusted_roots": ["/tmp"]}),
)
.expect("WorkerCreate should succeed");
let output: serde_json::Value = serde_json::from_str(&created).expect("json");
let worker_id = output["worker_id"].as_str().expect("worker_id").to_string();
// finish=unknown + 0 tokens = degraded provider classification
let failed = execute_tool(
"WorkerObserveCompletion",
&json!({
"worker_id": worker_id,
"finish_reason": "unknown",
"tokens_output": 0
}),
)
.expect("WorkerObserveCompletion should succeed");
let failed_output: serde_json::Value = serde_json::from_str(&failed).expect("json");
assert_eq!(
failed_output["status"], "failed",
"finish=unknown + 0 tokens should classify as provider failure"
);
assert_eq!(failed_output["prompt_in_flight"], false);
// last_error should be set with provider failure message
assert!(
!failed_output["last_error"].is_null(),
"last_error should be populated for provider failure"
);
}
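
The classification these two tests pin down can be sketched as a pure function. The rule used here (`finish_reason == "unknown"` together with zero output tokens means a degraded provider) is taken from the test comment above. Whether the registry's `observe_completion` treats other combinations as failures is not visible in this diff, so everything else is assumed to count as a normal finish. The type and function names are hypothetical.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum CompletionOutcome {
    Finished,
    Failed { last_error: String },
}

/// Assumed rule: an "unknown" finish reason with no output tokens is read
/// as a provider-side failure; any other combination is a normal finish.
fn classify_completion(finish_reason: &str, tokens_output: u64) -> CompletionOutcome {
    if finish_reason == "unknown" && tokens_output == 0 {
        CompletionOutcome::Failed {
            last_error: format!(
                "provider returned finish_reason={finish_reason} with {tokens_output} output tokens"
            ),
        }
    } else {
        CompletionOutcome::Finished
    }
}
```

With this rule, `classify_completion("end_turn", 512)` is `Finished`, and `classify_completion("unknown", 0)` is `Failed` with a populated `last_error`, in line with the `status` and `last_error` assertions in the tests.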
#[test]
fn worker_tools_detect_misdelivery_and_arm_prompt_replay() {
let created = execute_tool(