mirror of
https://github.com/instructkr/claude-code.git
synced 2026-05-18 11:46:45 +00:00
Compare commits
90 Commits
ac45bbec15...feat/134-1
| Author | SHA1 | Date | |
|---|---|---|---|
| | 7235260c61 | | |
| | 230d97a8fa | | |
| | 2b7095e4ae | | |
| | f55612ea47 | | |
| | 8b52e77f23 | | |
| | 2c42f8bcc8 | | |
| | f266505546 | | |
| | 50e3fa3a83 | | |
| | a51b2105ed | | |
| | a3270db602 | | |
| | 12f1f9a74e | | |
| | 2678fa0af5 | | |
| | b9990bb27c | | |
| | f33c315c93 | | |
| | 5c579e4a09 | | |
| | 8a8ca8a355 | | |
| | b0b579ebe9 | | |
| | c956f78e8a | | |
| | dd73962d0b | | |
| | 027efb2f9f | | |
| | 866f030713 | | |
| | d2a83415dc | | |
| | 8122029eba | | |
| | d284ef774e | | |
| | 7370546c1c | | |
| | b56841c5f4 | | |
| | debbcbe7fb | | |
| | bb76ec9730 | | |
| | 2bf2a11943 | | |
| | d1608aede4 | | |
| | b81e6422b4 | | |
| | 78592221ec | | |
| | 3848ea64e3 | | |
| | b9331ae61b | | |
| | f2d653896d | | |
| | ad02761918 | | |
| | ca09b6b374 | | |
| | 43eac4d94b | | |
| | 8b25daf915 | | |
| | a049bd29b1 | | |
| | b2366d113a | | |
| | 16244cec34 | | |
| | 21b2773233 | | |
| | 91c79baf20 | | |
| | a436f9e2d6 | | |
| | 71e77290b9 | | |
| | 6580903d20 | | |
| | 7447232688 | | |
| | 6a16f0824d | | |
| | eabd257968 | | |
| | d63d58f3d0 | | |
| | 63a0d30f57 | | |
| | 0e263bee42 | | |
| | 7a172a2534 | | |
| | 3ab920ac30 | | |
| | 8db8e4902b | | |
| | b7539e679e | | |
| | 7f76e6bbd6 | | |
| | bab66bb226 | | |
| | d0de86e8bc | | |
| | 478ba55063 | | |
| | 64b29f16d5 | | |
| | 9882f07e7d | | |
| | 82bd8bbf77 | | |
| | d6003be373 | | |
| | 586a92ba79 | | |
| | 2eb6e0c1ee | | |
| | 70a0f0cf44 | | |
| | e58c1947c1 | | |
| | 1743e600e1 | | |
| | a48575fd83 | | |
| | 688295ea6c | | |
| | 9deaa29710 | | |
| | d05c8686b8 | | |
| | 00d0eb61d4 | | |
| | 8d8e2c3afd | | |
| | d037f9faa8 | | |
| | 330dc28fc2 | | |
| | cec8d17ca8 | | |
| | 4cb1db9faa | | |
| | 5e65b33042 | | |
| | 87b982ece5 | | |
| | f65d15fb2f | | |
| | 3e4e1585b5 | | |
| | 110d568bcf | | |
| | 866ae7562c | | |
| | 6376694669 | | |
| | 1d5748f71f | | |
| | 77fb62a9f1 | | |
| | 21909da0b5 | | |
5
.claw.json
Normal file
@@ -0,0 +1,5 @@
{
  "aliases": {
    "quick": "haiku"
  }
}
3881
ROADMAP.md
File diff suppressed because it is too large
9
USAGE.md
@@ -43,6 +43,15 @@ cd rust
/doctor
```

Or run doctor directly with JSON output for scripting:

```bash
cd rust
./target/debug/claw doctor --output-format json
```

**Note:** Diagnostic verbs (`doctor`, `status`, `sandbox`, `version`) support `--output-format json` for machine-readable output. Invalid suffix arguments (e.g., `--json`) are now rejected at parse time rather than falling through to prompt dispatch.

### Interactive REPL

```bash
236
docs/MODEL_COMPATIBILITY.md
Normal file
@@ -0,0 +1,236 @@
# Model Compatibility Guide

This document describes model-specific handling in the OpenAI-compatible provider. When adding new models or providers, review this guide to ensure proper compatibility.

## Table of Contents

- [Overview](#overview)
- [Model-Specific Handling](#model-specific-handling)
  - [Kimi Models (is_error Exclusion)](#kimi-models-is_error-exclusion)
  - [Reasoning Models (Tuning Parameter Stripping)](#reasoning-models-tuning-parameter-stripping)
  - [GPT-5 (max_completion_tokens)](#gpt-5-max_completion_tokens)
  - [Qwen Models (DashScope Routing)](#qwen-models-dashscope-routing)
- [Implementation Details](#implementation-details)
- [Adding New Models](#adding-new-models)
- [Testing](#testing)

## Overview

The `openai_compat.rs` provider translates Claude Code's internal message format to OpenAI-compatible chat completion requests. Different models have varying requirements for:

- Tool result message fields (`is_error`)
- Sampling parameters (temperature, top_p, etc.)
- Token limit fields (`max_tokens` vs `max_completion_tokens`)
- Base URL routing

## Model-Specific Handling

### Kimi Models (is_error Exclusion)

**Affected models:** `kimi-k2.5`, `kimi-k1.5`, `kimi-moonshot`, and any other model whose name starts with `kimi-` (case-insensitive, after stripping a provider prefix such as `dashscope/`)

**Behavior:** The `is_error` field is **excluded** from tool result messages.

**Rationale:** Kimi models (via Moonshot AI and DashScope) reject the `is_error` field with a 400 Bad Request error:
```json
{
  "error": {
    "type": "invalid_request_error",
    "message": "Unknown field: is_error"
  }
}
```

**Detection:**
```rust
fn model_rejects_is_error_field(model: &str) -> bool {
    // Lowercase first so the match is case-insensitive.
    let lowered = model.to_ascii_lowercase();
    // Drop any provider prefix, e.g. `dashscope/kimi-k2.5` -> `kimi-k2.5`.
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("kimi-")
}
```

**Testing:** See `model_rejects_is_error_field_detects_kimi_models` and related tests in `openai_compat.rs`.

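
To make the effect of this check concrete, here is a minimal sketch of how a tool result message might be assembled so that `is_error` is present only for models that accept it. The helper `tool_result_fields` and its key/value representation are hypothetical, chosen to keep the example free of external crates; the real `translate_message()` in `openai_compat.rs` may be structured quite differently.

```rust
/// Same rule as the detection function above: lowercase, strip any
/// provider prefix, then look for the `kimi-` prefix.
fn model_rejects_is_error_field(model: &str) -> bool {
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("kimi-")
}

/// Hypothetical helper (not part of the real provider): returns the
/// key/value pairs of an OpenAI-style tool result message.
fn tool_result_fields(
    model: &str,
    tool_call_id: &str,
    content: &str,
    is_error: bool,
) -> Vec<(&'static str, String)> {
    let mut fields = vec![
        ("role", "tool".to_string()),
        ("tool_call_id", tool_call_id.to_string()),
        ("content", content.to_string()),
    ];
    // Only emit `is_error` when the target model is known to accept it.
    if !model_rejects_is_error_field(model) {
        fields.push(("is_error", is_error.to_string()));
    }
    fields
}
```

Under these assumptions, calling `tool_result_fields("dashscope/kimi-k2.5", ...)` yields three fields, while the same call with `"gpt-4"` yields four, including `is_error`.
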
---

### Reasoning Models (Tuning Parameter Stripping)

**Affected models:**
- OpenAI: `o1`, `o1-*`, `o3`, `o3-*`, `o4`, `o4-*`
- xAI: `grok-3-mini`
- Alibaba DashScope: `qwen-qwq-*`, `qwq-*`, `qwen3-*-thinking`

**Behavior:** The following tuning parameters are **stripped** from requests:
- `temperature`
- `top_p`
- `frequency_penalty`
- `presence_penalty`

**Rationale:** Reasoning/chain-of-thought models use fixed sampling strategies and reject these parameters with 400 errors.

**Exception:** `reasoning_effort` is included for compatible models when explicitly set.

**Detection:**
```rust
fn is_reasoning_model(model: &str) -> bool {
    // Bind the lowercased String first so `canonical` can borrow from it.
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("o1")
        || canonical.starts_with("o3")
        || canonical.starts_with("o4")
        || canonical == "grok-3-mini"
        || canonical.starts_with("qwen-qwq")
        || canonical.starts_with("qwq")
        || (canonical.starts_with("qwen3") && canonical.contains("-thinking"))
}
```

**Testing:** See `reasoning_model_strips_tuning_params`, `grok_3_mini_is_reasoning_model`, and `qwen_reasoning_variants_are_detected` tests.

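
As an illustration of the stripping behavior, the sketch below applies the detection rule above to a small parameter struct. `TuningParams` and `effective_tuning_params` are hypothetical names introduced for this example, and the sketch leaves out `reasoning_effort`; it is not the provider's actual request-building code.

```rust
/// Hypothetical container for the four tuning parameters listed above.
#[derive(Debug, Clone, PartialEq, Default)]
struct TuningParams {
    temperature: Option<f64>,
    top_p: Option<f64>,
    frequency_penalty: Option<f64>,
    presence_penalty: Option<f64>,
}

/// Detection rule as documented above.
fn is_reasoning_model(model: &str) -> bool {
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("o1")
        || canonical.starts_with("o3")
        || canonical.starts_with("o4")
        || canonical == "grok-3-mini"
        || canonical.starts_with("qwen-qwq")
        || canonical.starts_with("qwq")
        || (canonical.starts_with("qwen3") && canonical.contains("-thinking"))
}

/// Hypothetical helper: returns the parameters that would actually be sent.
/// Reasoning models get an empty set; other models keep what was requested.
fn effective_tuning_params(model: &str, requested: TuningParams) -> TuningParams {
    if is_reasoning_model(model) {
        TuningParams::default()
    } else {
        requested
    }
}
```

With this shape, a caller can always pass the user's requested parameters and let the helper decide what survives, instead of branching on the model at every call site.
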
---

### GPT-5 (max_completion_tokens)

**Affected models:** All models starting with `gpt-5`

**Behavior:** Uses `max_completion_tokens` instead of `max_tokens` in the request payload.

**Rationale:** GPT-5 models require the `max_completion_tokens` field. Legacy `max_tokens` causes request validation failures:
```json
{
  "error": {
    "message": "Unknown field: max_tokens"
  }
}
```

**Implementation:**
```rust
let max_tokens_key = if wire_model.starts_with("gpt-5") {
    "max_completion_tokens"
} else {
    "max_tokens"
};
```

**Testing:** See `gpt5_uses_max_completion_tokens_not_max_tokens` and `non_gpt5_uses_max_tokens` tests.

---

### Qwen Models (DashScope Routing)

**Affected models:** All models with `qwen` prefix

**Behavior:** Routed to DashScope (`https://dashscope.aliyuncs.com/compatible-mode/v1`) rather than default providers.

**Rationale:** Qwen models are hosted by Alibaba Cloud's DashScope service, not OpenAI or Anthropic.

**Configuration:**
```rust
pub const DEFAULT_DASHSCOPE_BASE_URL: &str = "https://dashscope.aliyuncs.com/compatible-mode/v1";
```

**Authentication:** Uses `DASHSCOPE_API_KEY` environment variable.

**Note:** Some Qwen models are also reasoning models (see [Reasoning Models](#reasoning-models-tuning-parameter-stripping) above) and receive both treatments.

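
The routing rule can be sketched as a small lookup that pairs the base URL with the environment variable holding the key. The `Route` struct and `dashscope_route_for` function are hypothetical, and matching the `qwen` prefix case-insensitively on the raw model name is an assumption of this sketch; the actual routing code may apply different matching rules.

```rust
pub const DEFAULT_DASHSCOPE_BASE_URL: &str =
    "https://dashscope.aliyuncs.com/compatible-mode/v1";

/// Hypothetical routing result: where to send the request and which
/// environment variable is expected to hold the API key.
#[derive(Debug, PartialEq)]
struct Route {
    base_url: &'static str,
    api_key_env: &'static str,
}

/// Hypothetical helper: returns a DashScope route for `qwen`-prefixed
/// models and `None` for everything else.
fn dashscope_route_for(model: &str) -> Option<Route> {
    // Assumption: the prefix check is case-insensitive.
    let lowered = model.to_ascii_lowercase();
    if lowered.starts_with("qwen") {
        Some(Route {
            base_url: DEFAULT_DASHSCOPE_BASE_URL,
            api_key_env: "DASHSCOPE_API_KEY",
        })
    } else {
        None
    }
}
```

A caller could then read the key with `std::env::var(route.api_key_env)` and fall back to the default provider when the function returns `None`.
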
## Implementation Details

### File Location
All model-specific logic is in:
```
rust/crates/api/src/providers/openai_compat.rs
```

### Key Functions

| Function | Purpose |
|----------|---------|
| `model_rejects_is_error_field()` | Detects models that don't support `is_error` in tool results |
| `is_reasoning_model()` | Detects reasoning models that need tuning param stripping |
| `translate_message()` | Converts internal messages to OpenAI format (applies `is_error` logic) |
| `build_chat_completion_request()` | Constructs full request payload (applies all model-specific logic) |

### Provider Prefix Handling

All model detection functions strip provider prefixes (e.g., `dashscope/kimi-k2.5` → `kimi-k2.5`) before matching:

```rust
// Bind the lowercased String first so `canonical` can borrow from it.
let lowered = model.to_ascii_lowercase();
let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
```

This ensures consistent detection regardless of whether models are referenced with or without provider prefixes.

## Adding New Models

When adding support for new models:

1. **Check if the model is a reasoning model**
   - Does it reject temperature/top_p parameters?
   - Add to `is_reasoning_model()` detection

2. **Check tool result compatibility**
   - Does it reject the `is_error` field?
   - Add to `model_rejects_is_error_field()` detection

3. **Check token limit field**
   - Does it require `max_completion_tokens` instead of `max_tokens`?
   - Update the `max_tokens_key` logic

4. **Add tests**
   - Unit test for detection function
   - Integration test in `build_chat_completion_request`

5. **Update this documentation**
   - Add the model to the affected lists
   - Document any special behavior

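
When working through the checklist above, it can help to see the first three checks side by side for a single model. The sketch below gathers them into one hypothetical `ModelQuirks` summary; `ModelQuirks`, `canonical_model_name`, and `quirks_for` are illustrative names that do not exist in the codebase, and applying the `gpt-5` check to the prefix-stripped name is an assumption of this sketch.

```rust
/// Hypothetical summary of the per-model behaviors described in this guide.
#[derive(Debug, PartialEq)]
struct ModelQuirks {
    strips_tuning_params: bool,
    omits_is_error: bool,
    max_tokens_key: &'static str,
}

/// Lowercases the name and drops any provider prefix
/// (e.g. `dashscope/kimi-k2.5` becomes `kimi-k2.5`).
fn canonical_model_name(model: &str) -> String {
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.to_string()
}

/// Hypothetical helper: evaluates the three documented checks in one place.
fn quirks_for(model: &str) -> ModelQuirks {
    let name = canonical_model_name(model);
    let strips_tuning_params = name.starts_with("o1")
        || name.starts_with("o3")
        || name.starts_with("o4")
        || name == "grok-3-mini"
        || name.starts_with("qwen-qwq")
        || name.starts_with("qwq")
        || (name.starts_with("qwen3") && name.contains("-thinking"));
    ModelQuirks {
        strips_tuning_params,
        omits_is_error: name.starts_with("kimi-"),
        max_tokens_key: if name.starts_with("gpt-5") {
            "max_completion_tokens"
        } else {
            "max_tokens"
        },
    }
}
```

A table-driven unit test over `quirks_for` is one way to cover step 4 for a new model without touching the network.
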
## Testing

### Running Model-Specific Tests

```bash
# All OpenAI compatibility tests
cargo test --package api providers::openai_compat

# Specific test categories
cargo test --package api model_rejects_is_error_field
cargo test --package api reasoning_model
cargo test --package api gpt5
cargo test --package api qwen
```

### Test Files

- Unit tests: `rust/crates/api/src/providers/openai_compat.rs` (in `mod tests`)
- Integration tests: `rust/crates/api/tests/openai_compat_integration.rs`

### Verifying Model Detection

To verify a model is detected correctly without making API calls:

```rust
#[test]
fn my_new_model_is_detected() {
    // is_error handling
    assert!(model_rejects_is_error_field("my-model"));

    // Reasoning model detection
    assert!(is_reasoning_model("my-model"));

    // Provider prefix handling
    assert!(model_rejects_is_error_field("provider/my-model"));
}
```

---

*Last updated: 2026-04-16*

For questions or updates, see the implementation in `rust/crates/api/src/providers/openai_compat.rs`.
356
prd.json
Normal file
@@ -0,0 +1,356 @@
{
  "version": "1.0",
  "description": "Clawable Coding Harness - Clear roadmap stories and commit each",
  "stories": [
    {
      "id": "US-001",
      "title": "Phase 1.6 - startup-no-evidence evidence bundle + classifier",
      "description": "When startup times out, emit typed worker.startup_no_evidence event with evidence bundle including last known worker lifecycle state, pane command, prompt-send timestamp, prompt-acceptance state, trust-prompt detection result, and transport/MCP health summary. Classifier should down-rank into specific failure classes.",
      "acceptanceCriteria": [
        "worker.startup_no_evidence event emitted on startup timeout with evidence bundle",
        "Evidence bundle includes: last lifecycle state, pane command, prompt-send timestamp, prompt-acceptance state, trust-prompt detection, transport/MCP health",
        "Classifier attempts to categorize into: trust_required, prompt_misdelivery, prompt_acceptance_timeout, transport_dead, worker_crashed, or unknown",
        "Tests verify evidence bundle structure and classifier behavior"
      ],
      "passes": true,
      "priority": "P0"
    },
    {
      "id": "US-002",
      "title": "Phase 2 - Canonical lane event schema (4.x series)",
      "description": "Define typed events for lane lifecycle: lane.started, lane.ready, lane.prompt_misdelivery, lane.blocked, lane.red, lane.green, lane.commit.created, lane.pr.opened, lane.merge.ready, lane.finished, lane.failed, branch.stale_against_main. Also implement event ordering, reconciliation, provenance, deduplication, and projection contracts.",
      "acceptanceCriteria": [
        "LaneEvent enum with all required variants defined",
        "Event ordering with monotonic sequence metadata attached",
        "Event provenance labels (live_lane, test, healthcheck, replay, transport)",
        "Session identity completeness at creation (title, workspace, purpose)",
        "Duplicate terminal-event suppression with fingerprinting",
        "Lane ownership/scope binding in events",
        "Nudge acknowledgment with dedupe contract",
        "clawhip consumes typed lane events instead of pane scraping"
      ],
      "passes": true,
      "priority": "P0"
    },
    {
      "id": "US-003",
      "title": "Phase 3 - Stale-branch detection before broad verification",
      "description": "Before broad test runs, compare current branch to main and detect if known fixes are missing. Emit branch.stale_against_main event and suggest/auto-run rebase/merge-forward.",
      "acceptanceCriteria": [
        "Branch freshness comparison against main implemented",
        "branch.stale_against_main event emitted when behind",
        "Auto-rebase/merge-forward policy integration",
        "Avoid misclassifying stale-branch failures as new regressions"
      ],
      "passes": true,
      "priority": "P1"
    },
    {
      "id": "US-004",
      "title": "Phase 3 - Recovery recipes with ledger",
      "description": "Encode automatic recoveries for common failures (trust prompt, prompt misdelivery, stale branch, compile red, MCP startup). Expose recovery attempt ledger with recipe id, attempt count, state, timestamps, failure summary.",
      "acceptanceCriteria": [
        "Recovery recipes defined for: trust_prompt_unresolved, prompt_delivered_to_shell, stale_branch, compile_red_after_refactor, MCP_handshake_failure, partial_plugin_startup",
        "Recovery attempt ledger with: recipe id, attempt count, state, timestamps, failure summary, escalation reason",
        "One automatic recovery attempt before escalation",
        "Ledger emitted as structured event data"
      ],
      "passes": true,
      "priority": "P1"
    },
    {
      "id": "US-005",
      "title": "Phase 4 - Typed task packet format",
      "description": "Define structured task packet with fields: objective, scope, repo/worktree, branch policy, acceptance tests, commit policy, reporting contract, escalation policy.",
      "acceptanceCriteria": [
        "TaskPacket struct with all required fields",
        "TaskScope resolution (workspace/module/single-file/custom)",
        "Validation and serialization support",
        "Integration into tools/src/lib.rs"
      ],
      "passes": true,
      "priority": "P1"
    },
    {
      "id": "US-006",
      "title": "Phase 4 - Policy engine for autonomous coding",
      "description": "Encode automation rules: if green + scoped diff + review passed -> merge to dev; if stale branch -> merge-forward before broad tests; if startup blocked -> recover once, then escalate; if lane completed -> emit closeout and cleanup session.",
      "acceptanceCriteria": [
        "Policy rules engine implemented",
        "Rules: green + scoped diff + review -> merge",
        "Rules: stale branch -> merge-forward before tests",
        "Rules: startup blocked -> recover once, then escalate",
        "Rules: lane completed -> closeout and cleanup"
      ],
      "passes": true,
      "priority": "P2"
    },
    {
      "id": "US-007",
      "title": "Phase 5 - Plugin/MCP lifecycle maturity",
      "description": "First-class plugin/MCP lifecycle contract: config validation, startup healthcheck, discovery result, degraded-mode behavior, shutdown/cleanup. Close gaps in end-to-end lifecycle.",
      "acceptanceCriteria": [
        "Plugin/MCP config validation contract",
        "Startup healthcheck with structured results",
        "Discovery result reporting",
        "Degraded-mode behavior documented and implemented",
        "Shutdown/cleanup contract",
        "Partial startup and per-server failures reported structurally"
      ],
      "passes": true,
      "priority": "P2"
    },
    {
      "id": "US-008",
      "title": "Fix kimi-k2.5 model API compatibility",
      "description": "The kimi-k2.5 model (and other kimi models) reject API requests containing the is_error field in tool result messages. The OpenAI-compatible provider currently always includes is_error for all models. Need to make this field conditional based on model support.",
      "acceptanceCriteria": [
        "translate_message function accepts model parameter",
        "is_error field excluded for kimi models (kimi-k2.5, kimi-k1.5, etc.)",
        "is_error field included for models that support it (openai, grok, xai, etc.)",
        "build_chat_completion_request passes model to translate_message",
        "Tests verify is_error presence/absence based on model",
        "cargo test passes",
        "cargo clippy passes",
        "cargo fmt passes"
      ],
      "passes": true,
      "priority": "P0"
    },
    {
      "id": "US-009",
      "title": "Add unit tests for kimi model compatibility fix",
      "description": "During dogfooding we discovered the existing test coverage for model-specific is_error handling is insufficient. Need to add dedicated tests for model_rejects_is_error_field function and translate_message behavior with different models.",
      "acceptanceCriteria": [
        "Test model_rejects_is_error_field identifies kimi-k2.5, kimi-k1.5, dashscope/kimi-k2.5",
        "Test translate_message includes is_error for gpt-4, grok-3, claude models",
        "Test translate_message excludes is_error for kimi models",
        "Test build_chat_completion_request produces correct payload for kimi vs non-kimi",
        "All new tests pass",
        "cargo test --package api passes"
      ],
      "passes": true,
      "priority": "P1"
    },
    {
      "id": "US-010",
      "title": "Add model compatibility documentation",
      "description": "Document which models require special handling (is_error exclusion, reasoning model tuning param stripping, etc.) in a MODEL_COMPATIBILITY.md file for operators and contributors.",
      "acceptanceCriteria": [
        "MODEL_COMPATIBILITY.md created in docs/ or repo root",
        "Document kimi models is_error exclusion",
        "Document reasoning models (o1, o3, grok-3-mini) tuning param stripping",
        "Document gpt-5 max_completion_tokens requirement",
        "Document qwen model routing through dashscope",
        "Cross-reference with existing code comments"
      ],
      "passes": true,
      "priority": "P2"
    },
    {
      "id": "US-011",
      "title": "Performance optimization: reduce API request serialization overhead",
      "description": "The translate_message function creates intermediate JSON Value objects that could be optimized. Profile and optimize the hot path for API request building, especially for conversations with many tool results.",
      "acceptanceCriteria": [
        "Profile current request building with criterion or similar",
        "Identify bottlenecks in translate_message and build_chat_completion_request",
        "Implement optimizations (Vec pre-allocation, reduced cloning, etc.)",
        "Benchmark before/after showing improvement",
        "No functional changes or API breakage"
      ],
      "passes": true,
      "priority": "P2"
    },
    {
      "id": "US-012",
      "title": "Trust prompt resolver with allowlist auto-trust",
      "description": "Add allowlisted auto-trust behavior for known repos/worktrees. Trust prompts currently block TUI startup and require manual intervention. Implement automatic trust resolution for pre-approved repositories.",
      "acceptanceCriteria": [
        "TrustAllowlist config structure with repo patterns",
        "Auto-trust behavior for allowlisted repos/worktrees",
        "trust_required event emitted when trust prompt detected",
        "trust_resolved event emitted when trust is granted",
        "Non-allowlisted repos remain gated (manual trust required)",
        "Integration with worker boot lifecycle",
        "Tests for allowlist matching and event emission"
      ],
      "passes": true,
      "priority": "P1"
    },
    {
      "id": "US-013",
      "title": "Phase 2 - Session event ordering + terminal-state reconciliation",
      "description": "When the same session emits contradictory lifecycle events (idle, error, completed, transport/server-down) in close succession, expose deterministic final truth. Attach monotonic sequence/causal ordering metadata, classify terminal vs advisory events, reconcile duplicate/out-of-order terminal events into one canonical lane outcome.",
      "acceptanceCriteria": [
        "Monotonic sequence / causal ordering metadata attached to session lifecycle events",
        "Terminal vs advisory event classification implemented",
        "Reconcile duplicate or out-of-order terminal events into one canonical outcome",
        "Distinguish 'session terminal state unknown because transport died' from real 'completed'",
        "Tests verify reconciliation behavior with out-of-order event bursts"
      ],
      "passes": true,
      "priority": "P1"
    },
    {
      "id": "US-014",
      "title": "Phase 2 - Event provenance / environment labeling",
      "description": "Every emitted event should declare its source (live_lane, test, healthcheck, replay, transport) so claws do not mistake test noise for production truth. Include environment/channel label, emitter identity, and confidence/trust level.",
      "acceptanceCriteria": [
        "EventProvenance enum with live_lane, test, healthcheck, replay, transport variants",
        "Environment/channel label attached to all events",
        "Emitter identity field on events",
        "Confidence/trust level field for downstream automation",
        "Tests verify provenance labeling and filtering"
      ],
      "passes": true,
      "priority": "P1"
    },
    {
      "id": "US-015",
      "title": "Phase 2 - Session identity completeness at creation time",
      "description": "A newly created session should emit stable title, workspace/worktree path, and lane/session purpose at creation time. If any field is not yet known, emit explicit typed placeholder reason rather than bare unknown string.",
      "acceptanceCriteria": [
        "Session creation emits stable title, workspace/worktree path, purpose immediately",
        "Explicit typed placeholder when fields unknown (not bare 'unknown' strings)",
        "Later-enriched metadata reconciles onto same session identity without ambiguity",
        "Tests verify session identity completeness and placeholder handling"
      ],
      "passes": true,
      "priority": "P1"
    },
    {
      "id": "US-016",
      "title": "Phase 2 - Duplicate terminal-event suppression",
      "description": "When the same session emits repeated completed/failed/terminal notifications, collapse duplicates before they trigger repeated downstream reactions. Attach canonical terminal-event fingerprint per lane/session outcome.",
      "acceptanceCriteria": [
        "Canonical terminal-event fingerprint attached per lane/session outcome",
        "Suppress/coalesce repeated terminal notifications within reconciliation window",
        "Preserve raw event history for audit while exposing one actionable outcome downstream",
        "Surface when later duplicate materially differs from original terminal payload",
        "Tests verify deduplication and material difference detection"
      ],
      "passes": true,
      "priority": "P2"
    },
    {
      "id": "US-017",
      "title": "Phase 2 - Lane ownership / scope binding",
      "description": "Each session and lane event should declare who owns it and what workflow scope it belongs to. Attach owner/assignee identity, workflow scope (claw-code-dogfood, external-git-maintenance, infra-health, manual-operator), and mark whether watcher is expected to act, observe only, or ignore.",
      "acceptanceCriteria": [
        "Owner/assignee identity attached to sessions and lane events",
        "Workflow scope field (claw-code-dogfood, external-git-maintenance, etc.)",
        "Watcher action expectation field (act, observe-only, ignore)",
        "Preserve scope through session restarts, resumes, and late terminal events",
        "Tests verify ownership and scope binding"
      ],
      "passes": true,
      "priority": "P2"
    },
    {
      "id": "US-018",
      "title": "Phase 2 - Nudge acknowledgment / dedupe contract",
      "description": "Periodic clawhip nudges should carry nudge id/cycle id and delivery timestamp. Expose whether claw has already acknowledged or responded for that cycle. Distinguish new nudge, retry nudge, and stale duplicate.",
      "acceptanceCriteria": [
        "Nudge id / cycle id and delivery timestamp attached",
        "Acknowledgment state exposed (already acknowledged or not)",
        "Distinguish new nudge vs retry nudge vs stale duplicate",
        "Allow downstream summaries to bind reported pinpoint back to triggering nudge id",
        "Tests verify nudge deduplication and acknowledgment tracking"
      ],
      "passes": true,
      "priority": "P2"
    },
    {
      "id": "US-019",
      "title": "Phase 2 - Stable roadmap-id assignment for newly filed pinpoints",
      "description": "When a claw records a new pinpoint/follow-up, assign or expose a stable tracking id immediately. Expose that id in structured event/report payload and preserve across edits, reorderings, and summary compression.",
      "acceptanceCriteria": [
        "Canonical roadmap id assigned at filing time",
        "Roadmap id exposed in structured event/report payload",
        "Same id preserved across edits, reorderings, summary compression",
        "Distinguish 'new roadmap filing' from 'update to existing roadmap item'",
        "Tests verify stable id assignment and update detection"
      ],
      "passes": true,
      "priority": "P2"
    },
    {
      "id": "US-020",
      "title": "Phase 2 - Roadmap item lifecycle state contract",
      "description": "Each roadmap pinpoint should carry machine-readable lifecycle state (filed, acknowledged, in_progress, blocked, done, superseded). Attach last state-change timestamp and preserve lineage when one pinpoint supersedes or merges into another.",
      "acceptanceCriteria": [
        "Lifecycle state enum with filed, acknowledged, in_progress, blocked, done, superseded",
        "Last state-change timestamp attached",
        "New report can declare first filing, status update, or closure",
        "Preserve lineage when one pinpoint supersedes or merges into another",
        "Tests verify lifecycle state transitions"
      ],
      "passes": true,
      "priority": "P2"
    },
|
||||
{
|
||||
"id": "US-021",
|
||||
"title": "Request body size pre-flight check for OpenAI-compatible provider",
|
||||
"description": "Implement pre-flight request body size estimation to prevent 400 Bad Request errors from API gateways with size limits. Based on dogfood findings with kimi-k2.5 testing, DashScope API has a 6MB request body limit that was exceeded by large system prompts.",
|
||||
"acceptanceCriteria": [
|
||||
"Pre-flight size estimation before sending requests to OpenAI-compatible providers",
|
||||
"Clear error message when request exceeds provider-specific size limit",
|
||||
"Configuration for different provider limits (6MB DashScope, 100MB OpenAI, etc.)",
|
||||
"Unit tests for size estimation and limit checking",
|
||||
"Integration with existing error handling for actionable user messages"
|
||||
],
|
||||
"passes": true,
|
||||
"priority": "P1"
|
||||
},
|
||||
{
|
||||
"id": "US-022",
|
||||
"title": "Enhanced error context for API failures",
|
||||
"description": "Add structured error context to API failures including request ID tracking across retries, provider-specific error code mapping, and suggested user actions based on error type (e.g., 'Reduce prompt size' for 413, 'Check API key' for 401).",
|
||||
"acceptanceCriteria": [
|
||||
"Request ID tracking across retries with full context in error messages",
|
||||
"Provider-specific error code mapping with actionable suggestions",
|
||||
"Suggested user actions for common error types (401, 403, 413, 429, 500, 502-504)",
|
||||
"Unit tests for error context extraction",
|
||||
"All existing tests pass and clippy is clean"
|
||||
],
|
||||
"passes": true,
|
||||
"priority": "P1"
|
||||
},
|
||||
{
|
||||
"id": "US-023",
|
||||
"title": "Add automatic routing for kimi models to DashScope",
|
||||
      "description": "Based on dogfood findings with kimi-k2.5 testing, users must manually prefix with dashscope/kimi-k2.5 instead of just using kimi-k2.5. Add automatic routing for kimi/ and kimi- prefixed models to DashScope (similar to qwen models), and add a 'kimi' alias to the model registry.",
      "acceptanceCriteria": [
        "kimi/ and kimi- prefix routing to DashScope in metadata_for_model()",
        "'kimi' alias in MODEL_REGISTRY that resolves to 'kimi-k2.5'",
        "resolve_model_alias() handles the kimi alias correctly",
        "Unit tests for kimi routing (similar to qwen routing tests)",
        "All tests pass and clippy is clean"
      ],
      "passes": true,
      "priority": "P1"
    },
    {
      "id": "US-024",
      "title": "Add token limit metadata for kimi models",
      "description": "The model_token_limit() function has no entries for kimi-k2.5 or kimi-k1.5, causing preflight context window validation to skip these models. Add token limit metadata to enable preflight checks and accurate max token defaults. Per Moonshot AI documentation, kimi-k2.5 supports 256K context window and 16K max output tokens.",
      "acceptanceCriteria": [
        "model_token_limit('kimi-k2.5') returns Some(ModelTokenLimit { max_output_tokens: 16384, context_window_tokens: 256000 })",
        "model_token_limit('kimi-k1.5') returns appropriate limits",
        "model_token_limit('kimi') follows alias chain (kimi → kimi-k2.5) and returns k2.5 limits",
        "preflight_message_request() validates context window for kimi models (via generic preflight, no provider-specific code needed)",
        "Unit tests verify limits and preflight behavior for kimi models",
        "All tests pass and clippy is clean"
      ],
      "passes": true,
      "priority": "P1"
    }
  ],
  "metadata": {
    "lastUpdated": "2026-04-17",
    "completedStories": ["US-001", "US-002", "US-003", "US-004", "US-005", "US-006", "US-007", "US-008", "US-009", "US-010", "US-011", "US-012", "US-013", "US-014", "US-015", "US-016", "US-017", "US-018", "US-019", "US-020", "US-021", "US-022", "US-023", "US-024"],
    "inProgressStories": [],
    "totalStories": 24,
    "status": "completed"
  }
}
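The kimi routing and token-limit stories above can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: `resolve_model_alias`, `model_token_limit`, and `ModelTokenLimit` are named in the acceptance criteria, but their bodies here are assumptions, and `routes_to_dashscope` is a hypothetical helper standing in for the routing decision made inside `metadata_for_model()`. The story does not state limits for kimi-k1.5, so the sketch leaves that model out.

```rust
/// Token limits for one model. Field names follow the acceptance criteria.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ModelTokenLimit {
    pub max_output_tokens: u32,
    pub context_window_tokens: u32,
}

/// Assumed alias table: only the `kimi` alias from the story is modeled.
pub fn resolve_model_alias(model: &str) -> &str {
    match model {
        "kimi" => "kimi-k2.5",
        other => other,
    }
}

/// Hypothetical helper: true when the (alias-resolved) model name carries a
/// prefix that the story says should be routed to DashScope.
pub fn routes_to_dashscope(model: &str) -> bool {
    let lower = model.to_ascii_lowercase();
    let canonical = resolve_model_alias(&lower);
    ["qwen/", "qwen-", "kimi/", "kimi-"]
        .iter()
        .any(|prefix| canonical.starts_with(*prefix))
}

/// Looks up limits after following the alias chain (kimi -> kimi-k2.5).
/// Returns None for models this sketch does not know about.
pub fn model_token_limit(model: &str) -> Option<ModelTokenLimit> {
    let lower = model.to_ascii_lowercase();
    match resolve_model_alias(&lower) {
        "kimi-k2.5" => Some(ModelTokenLimit {
            max_output_tokens: 16_384,
            context_window_tokens: 256_000,
        }),
        _ => None,
    }
}
```

Resolving the alias before both the prefix check and the limit lookup means a bare `kimi` behaves the same as `kimi-k2.5` in either path, which is the behavior the acceptance criteria ask for.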
133
progress.txt
Normal file
@@ -0,0 +1,133 @@
Ralph Iteration Summary - claw-code Roadmap Implementation
===========================================================

Iteration 1: 2026-04-16
------------------------

US-001 COMPLETED (Phase 1.6 - startup-no-evidence evidence bundle + classifier)
- Files: rust/crates/runtime/src/worker_boot.rs
- Added StartupFailureClassification enum with 6 variants
- Added StartupEvidenceBundle with 8 fields
- Implemented classify_startup_failure() logic
- Added observe_startup_timeout() method to Worker
- Tests: 6 new tests verifying classification logic

US-002 COMPLETED (Phase 2 - Canonical lane event schema)
- Files: rust/crates/runtime/src/lane_events.rs
- Added EventProvenance enum with 5 labels
- Added SessionIdentity, LaneOwnership structs
- Added LaneEventMetadata with sequence/ordering
- Added LaneEventBuilder for construction
- Implemented is_terminal_event(), dedupe_terminal_events()
- Tests: 10 new tests for events and deduplication

US-005 COMPLETED (Phase 4 - Typed task packet format)
- Files:
  - rust/crates/runtime/src/task_packet.rs
  - rust/crates/runtime/src/task_registry.rs
  - rust/crates/tools/src/lib.rs
- Added TaskScope enum (Workspace, Module, SingleFile, Custom)
- Updated TaskPacket with scope_path and worktree fields
- Added validate_scope_requirements() validation logic
- Fixed all test compilation errors in dependent modules
- Tests: Updated existing tests to use new types

PRE-EXISTING IMPLEMENTATIONS (verified working):
------------------------------------------------

US-003 COMPLETE (Phase 3 - Stale-branch detection)
- Files: rust/crates/runtime/src/stale_branch.rs
- BranchFreshness enum (Fresh, Stale, Diverged)
- StaleBranchPolicy (AutoRebase, AutoMergeForward, WarnOnly, Block)
- StaleBranchEvent with structured events
- check_freshness() with git integration
- apply_policy() with policy resolution
- Tests: 12 unit tests + 5 integration tests passing

US-004 COMPLETE (Phase 3 - Recovery recipes with ledger)
- Files: rust/crates/runtime/src/recovery_recipes.rs
- FailureScenario enum with 7 scenarios
- RecoveryStep enum with actionable steps
- RecoveryRecipe with step sequences
- RecoveryLedger for attempt tracking
- RecoveryEvent for structured emission
- attempt_recovery() with escalation logic
- Tests: 15 unit tests + 1 integration test passing

US-006 COMPLETE (Phase 4 - Policy engine for autonomous coding)
- Files: rust/crates/runtime/src/policy_engine.rs
- PolicyRule with condition/action/priority
- PolicyCondition (And, Or, GreenAt, StaleBranch, etc.)
- PolicyAction (MergeToDev, RecoverOnce, Escalate, etc.)
- LaneContext for evaluation context
- evaluate() for rule matching
- Tests: 18 unit tests + 6 integration tests passing

US-007 COMPLETE (Phase 5 - Plugin/MCP lifecycle maturity)
- Files: rust/crates/runtime/src/plugin_lifecycle.rs
- ServerStatus enum (Healthy, Degraded, Failed)
- ServerHealth with capabilities tracking
- PluginState with full lifecycle states
- PluginLifecycle event tracking
- PluginHealthcheck structured results
- DiscoveryResult for capability discovery
- DegradedMode behavior
- Tests: 11 unit tests passing

VERIFICATION STATUS:
------------------
- cargo build --workspace: PASSED
- cargo test --workspace: PASSED (476+ unit tests, 12 integration tests)
- cargo clippy --workspace: PASSED

All 7 stories from prd.json now have passes: true

Iteration 2: 2026-04-16
------------------------

US-009 COMPLETED (Add unit tests for kimi model compatibility fix)
- Files: rust/crates/api/src/providers/openai_compat.rs
- Added 4 comprehensive unit tests:
  1. model_rejects_is_error_field_detects_kimi_models - verifies detection of kimi-k2.5, kimi-k1.5, dashscope/kimi-k2.5, case insensitivity
  2. translate_message_includes_is_error_for_non_kimi_models - verifies gpt-4o, grok-3, claude include is_error
  3. translate_message_excludes_is_error_for_kimi_models - verifies kimi models exclude is_error (prevents 400 Bad Request)
  4. build_chat_completion_request_kimi_vs_non_kimi_tool_results - full integration test for request building
- Tests: 4 new tests, 119 unit tests total in api crate (+4), all passing
- Integration tests: 29 passing (no regressions)

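The behavior the US-009 tests pin down can be sketched as below. This is an assumption-laden illustration, not the code in openai_compat.rs: only the name `model_rejects_is_error_field` and the tested inputs come from the notes above. The `ToolMessage` struct and `translate_tool_result` function are hypothetical stand-ins for the real `translate_message` output, which is not shown here.

```rust
/// Sketch: treat any model whose final path segment starts with "kimi"
/// (case-insensitive) as rejecting the `is_error` field. The real matching
/// rule may differ; this one covers the inputs listed in the test notes.
pub fn model_rejects_is_error_field(model: &str) -> bool {
    let lower = model.to_ascii_lowercase();
    // Drop a provider prefix such as "dashscope/" before matching.
    let bare = lower.rsplit('/').next().unwrap_or(lower.as_str());
    bare.starts_with("kimi")
}

/// Hypothetical, simplified tool-result message. `is_error` is optional so
/// that it can be omitted entirely for models that reject it.
#[derive(Debug, PartialEq, Eq)]
pub struct ToolMessage {
    pub tool_call_id: String,
    pub content: String,
    pub is_error: Option<bool>,
}

/// Hypothetical translation step: include `is_error` only when the target
/// model accepts it.
pub fn translate_tool_result(
    model: &str,
    tool_use_id: &str,
    content: &str,
    is_error: bool,
) -> ToolMessage {
    ToolMessage {
        tool_call_id: tool_use_id.to_string(),
        content: content.to_string(),
        is_error: if model_rejects_is_error_field(model) {
            None
        } else {
            Some(is_error)
        },
    }
}
```

Modeling the field as `Option<bool>` keeps the omission explicit at the type level; a serializer would then have to skip `None` so the key never reaches the wire for kimi models.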
US-010 COMPLETED (Add model compatibility documentation)
- Files: docs/MODEL_COMPATIBILITY.md
- Created comprehensive documentation covering:
  1. Kimi Models (is_error Exclusion) - documents the 400 Bad Request issue and solution
  2. Reasoning Models (Tuning Parameter Stripping) - covers o1, o3, o4, grok-3-mini, qwen-qwq, qwen3-thinking
  3. GPT-5 (max_completion_tokens) - documents max_tokens vs max_completion_tokens requirement
  4. Qwen Models (DashScope Routing) - explains routing and authentication
- Added implementation details section with key functions
- Added "Adding New Models" guide for future contributors
- Added testing section with example commands
- Cross-referenced with existing code comments in openai_compat.rs
- cargo clippy passes

US-011 COMPLETED (Performance optimization: reduce API request serialization overhead)
- Files:
  - rust/crates/api/Cargo.toml (added criterion dev-dependency and bench config)
  - rust/crates/api/benches/request_building.rs (new benchmark suite)
  - rust/crates/api/src/providers/openai_compat.rs (optimizations)
  - rust/crates/api/src/lib.rs (public exports for benchmarks)
- Optimizations implemented:
  1. flatten_tool_result_content: Pre-allocate String capacity and avoid intermediate Vec
     - Before: collected to Vec<String> then joined
     - After: single String with pre-calculated capacity, push directly
  2. Made key functions public for benchmarking: translate_message, build_chat_completion_request,
     flatten_tool_result_content, is_reasoning_model, model_rejects_is_error_field
- Benchmark results:
  - flatten_tool_result_content/single_text: ~17ns
  - flatten_tool_result_content/multi_text (10 blocks): ~46ns
  - flatten_tool_result_content/large_content (50 blocks): ~11.7µs
  - translate_message/text_only: ~200ns
  - translate_message/tool_result: ~348ns
  - build_chat_completion_request/10 messages: ~16.4µs
  - build_chat_completion_request/100 messages: ~209µs
  - is_reasoning_model detection: ~26-42ns depending on model
- All tests pass (119 unit tests + 29 integration tests)
- cargo clippy passes
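The before/after shape of the `flatten_tool_result_content` optimization in US-011 can be illustrated with a simplified sketch. Several things here are assumptions made to keep the example self-contained: the `Json` variant holds an already-serialized `String` instead of a `serde_json::Value`, the newline separator is a guess, and `flatten_before` / `flatten_after` are hypothetical names for the two versions.

```rust
/// Simplified stand-in for the crate's content block type.
/// Assumption: JSON is stored pre-serialized to avoid a serde_json dependency.
pub enum ToolResultContentBlock {
    Text { text: String },
    Json { value: String },
}

/// Borrow the textual form of a block without allocating.
fn block_str(block: &ToolResultContentBlock) -> &str {
    match block {
        ToolResultContentBlock::Text { text } => text.as_str(),
        ToolResultContentBlock::Json { value } => value.as_str(),
    }
}

/// "Before": one String per block collected into a Vec, then joined.
pub fn flatten_before(content: &[ToolResultContentBlock]) -> String {
    content
        .iter()
        .map(|block| block_str(block).to_string())
        .collect::<Vec<String>>()
        .join("\n")
}

/// "After": compute the final length up front, allocate once, push directly.
pub fn flatten_after(content: &[ToolResultContentBlock]) -> String {
    let text_len: usize = content.iter().map(|block| block_str(block).len()).sum();
    // One separator between each pair of adjacent blocks.
    let capacity = text_len + content.len().saturating_sub(1);
    let mut out = String::with_capacity(capacity);
    for (index, block) in content.iter().enumerate() {
        if index > 0 {
            out.push('\n');
        }
        out.push_str(block_str(block));
    }
    out
}
```

Both versions produce the same string; the second avoids the per-block `String` allocations and the intermediate `Vec`, which is the saving the progress notes describe.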
5
rust/.claw.json
Normal file
@@ -0,0 +1,5 @@
{
  "permissions": {
    "defaultMode": "dontAsk"
  }
}
4
rust/.gitignore
vendored
@@ -1,3 +1,7 @@
target/
.omx/
.clawd-agents/
# Claw Code local artifacts
.claw/settings.local.json
.claw/sessions/
.clawhip/
15
rust/CLAUDE.md
Normal file
@@ -0,0 +1,15 @@
# CLAUDE.md

This file provides guidance to Claw Code (clawcode.dev) when working with code in this repository.

## Detected stack
- Languages: Rust.
- Frameworks: none detected from the supported starter markers.

## Verification
- Run Rust verification from the repo root: `cargo fmt`, `cargo clippy --workspace --all-targets -- -D warnings`, `cargo test --workspace`

## Working agreement
- Prefer small, reviewable changes and keep generated bootstrap files aligned with actual repo workflows.
- Keep shared defaults in `.claw.json`; reserve `.claw/settings.local.json` for machine-local overrides.
- Do not overwrite existing `CLAUDE.md` content automatically; update it intentionally when repo workflows change.
264
rust/Cargo.lock
generated
@@ -17,10 +17,23 @@ dependencies = [
|
||||
"memchr",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "anes"
|
||||
version = "0.1.6"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "4b46cbb362ab8752921c97e041f5e366ee6297bd428a31275b9fcf1e380f7299"
|
||||
|
||||
[[package]]
|
||||
name = "anstyle"
|
||||
version = "1.0.14"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "940b3a0ca603d1eade50a4846a2afffd5ef57a9feac2c0e2ec2e14f9ead76000"
|
||||
|
||||
[[package]]
|
||||
name = "api"
|
||||
version = "0.1.0"
|
||||
dependencies = [
|
||||
"criterion",
|
||||
"reqwest",
|
||||
"runtime",
|
||||
"serde",
|
||||
@@ -35,6 +48,12 @@ version = "1.1.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1505bd5d3d116872e7271a6d4e16d81d0c8570876c8de68093a09ac269d8aac0"
|
||||
|
||||
[[package]]
|
||||
name = "autocfg"
|
||||
version = "1.5.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "c08606f8c3cbf4ce6ec8e28fb0014a2c086708fe954eaa885384a6165172e7e8"
|
||||
|
||||
[[package]]
|
||||
name = "base64"
|
||||
version = "0.22.1"
|
||||
@@ -77,6 +96,12 @@ version = "1.11.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1e748733b7cbc798e1434b6ac524f0c1ff2ab456fe201501e6497c8417a4fc33"
|
||||
|
||||
[[package]]
|
||||
name = "cast"
|
||||
version = "0.3.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "37b2a672a2cb129a2e41c10b1224bb368f9f37a2b16b612598138befd7b37eb5"
|
||||
|
||||
[[package]]
|
||||
name = "cc"
|
||||
version = "1.2.58"
|
||||
@@ -99,6 +124,58 @@ version = "0.2.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "613afe47fcd5fac7ccf1db93babcb082c5994d996f20b8b159f2ad1658eb5724"
|
||||
|
||||
[[package]]
|
||||
name = "ciborium"
|
||||
version = "0.2.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "42e69ffd6f0917f5c029256a24d0161db17cea3997d185db0d35926308770f0e"
|
||||
dependencies = [
|
||||
"ciborium-io",
|
||||
"ciborium-ll",
|
||||
"serde",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "ciborium-io"
|
||||
version = "0.2.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "05afea1e0a06c9be33d539b876f1ce3692f4afea2cb41f740e7743225ed1c757"
|
||||
|
||||
[[package]]
|
||||
name = "ciborium-ll"
|
||||
version = "0.2.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "57663b653d948a338bfb3eeba9bb2fd5fcfaecb9e199e87e1eda4d9e8b240fd9"
|
||||
dependencies = [
|
||||
"ciborium-io",
|
||||
"half",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "clap"
|
||||
version = "4.6.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1ddb117e43bbf7dacf0a4190fef4d345b9bad68dfc649cb349e7d17d28428e51"
|
||||
dependencies = [
|
||||
"clap_builder",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "clap_builder"
|
||||
version = "4.6.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "714a53001bf66416adb0e2ef5ac857140e7dc3a0c48fb28b2f10762fc4b5069f"
|
||||
dependencies = [
|
||||
"anstyle",
|
||||
"clap_lex",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "clap_lex"
|
||||
version = "1.1.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "c8d4a3bb8b1e0c1050499d1815f5ab16d04f0959b233085fb31653fbfc9d98f9"
|
||||
|
||||
[[package]]
|
||||
name = "clipboard-win"
|
||||
version = "5.4.1"
|
||||
@@ -144,6 +221,67 @@ dependencies = [
|
||||
"cfg-if",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "criterion"
|
||||
version = "0.5.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "f2b12d017a929603d80db1831cd3a24082f8137ce19c69e6447f54f5fc8d692f"
|
||||
dependencies = [
|
||||
"anes",
|
||||
"cast",
|
||||
"ciborium",
|
||||
"clap",
|
||||
"criterion-plot",
|
||||
"is-terminal",
|
||||
"itertools",
|
||||
"num-traits",
|
||||
"once_cell",
|
||||
"oorandom",
|
||||
"plotters",
|
||||
"rayon",
|
||||
"regex",
|
||||
"serde",
|
||||
"serde_derive",
|
||||
"serde_json",
|
||||
"tinytemplate",
|
||||
"walkdir",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "criterion-plot"
|
||||
version = "0.5.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "6b50826342786a51a89e2da3a28f1c32b06e387201bc2d19791f622c673706b1"
|
||||
dependencies = [
|
||||
"cast",
|
||||
"itertools",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "crossbeam-deque"
|
||||
version = "0.8.6"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "9dd111b7b7f7d55b72c0a6ae361660ee5853c9af73f70c3c2ef6858b950e2e51"
|
||||
dependencies = [
|
||||
"crossbeam-epoch",
|
||||
"crossbeam-utils",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "crossbeam-epoch"
|
||||
version = "0.9.18"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "5b82ac4a3c2ca9c3460964f020e1402edd5753411d7737aa39c3714ad1b5420e"
|
||||
dependencies = [
|
||||
"crossbeam-utils",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "crossbeam-utils"
|
||||
version = "0.8.21"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "d0a5c400df2834b80a4c3327b3aad3a4c4cd4de0629063962b03235697506a28"
|
||||
|
||||
[[package]]
|
||||
name = "crossterm"
|
||||
version = "0.28.1"
|
||||
@@ -169,6 +307,12 @@ dependencies = [
|
||||
"winapi",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "crunchy"
|
||||
version = "0.2.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "460fbee9c2c2f33933d720630a6a0bac33ba7053db5344fac858d4b8952d77d5"
|
||||
|
||||
[[package]]
|
||||
name = "crypto-common"
|
||||
version = "0.1.7"
|
||||
@@ -209,6 +353,12 @@ dependencies = [
|
||||
"syn",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "either"
|
||||
version = "1.15.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "48c757948c5ede0e46177b7add2e67155f70e33c07fea8284df6576da70b3719"
|
||||
|
||||
[[package]]
|
||||
name = "endian-type"
|
||||
version = "0.1.2"
|
||||
@@ -245,7 +395,7 @@ checksum = "0ce92ff622d6dadf7349484f42c93271a0d49b7cc4d466a936405bacbe10aa78"
|
||||
dependencies = [
|
||||
"cfg-if",
|
||||
"rustix 1.1.4",
|
||||
- "windows-sys 0.52.0",
+ "windows-sys 0.59.0",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -380,12 +530,29 @@ version = "0.3.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0cc23270f6e1808e30a928bdc84dea0b9b4136a8bc82338574f23baf47bbd280"
|
||||
|
||||
[[package]]
|
||||
name = "half"
|
||||
version = "2.7.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "6ea2d84b969582b4b1864a92dc5d27cd2b77b622a8d79306834f1be5ba20d84b"
|
||||
dependencies = [
|
||||
"cfg-if",
|
||||
"crunchy",
|
||||
"zerocopy",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "hashbrown"
|
||||
version = "0.16.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "841d1cc9bed7f9236f321df977030373f4a4163ae1a7dbfe1a51a2c1a51d9100"
|
||||
|
||||
[[package]]
|
||||
name = "hermit-abi"
|
||||
version = "0.5.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "fc0fef456e4baa96da950455cd02c081ca953b141298e41db3fc7e36b1da849c"
|
||||
|
||||
[[package]]
|
||||
name = "home"
|
||||
version = "0.5.12"
|
||||
@@ -622,6 +789,26 @@ dependencies = [
|
||||
"serde",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "is-terminal"
|
||||
version = "0.4.17"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "3640c1c38b8e4e43584d8df18be5fc6b0aa314ce6ebf51b53313d4306cca8e46"
|
||||
dependencies = [
|
||||
"hermit-abi",
|
||||
"libc",
|
||||
"windows-sys 0.61.2",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "itertools"
|
||||
version = "0.10.5"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b0fd2260e829bddf4cb6ea802289de2f86d6a7a690192fbe91b3f46e0f2c8473"
|
||||
dependencies = [
|
||||
"either",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "itoa"
|
||||
version = "1.0.18"
|
||||
@@ -755,6 +942,15 @@ version = "0.2.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "c6673768db2d862beb9b39a78fdcb1a69439615d5794a1be50caa9bc92c81967"
|
||||
|
||||
[[package]]
|
||||
name = "num-traits"
|
||||
version = "0.2.19"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "071dfc062690e90b734c0b2273ce72ad0ffa95f0c74596bc250dcfd960262841"
|
||||
dependencies = [
|
||||
"autocfg",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "once_cell"
|
||||
version = "1.21.4"
|
||||
@@ -783,6 +979,12 @@ dependencies = [
|
||||
"pkg-config",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "oorandom"
|
||||
version = "11.1.5"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "d6790f58c7ff633d8771f42965289203411a5e5c68388703c06e14f24770b41e"
|
||||
|
||||
[[package]]
|
||||
name = "parking_lot"
|
||||
version = "0.12.5"
|
||||
@@ -837,6 +1039,34 @@ dependencies = [
|
||||
"time",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "plotters"
|
||||
version = "0.3.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "5aeb6f403d7a4911efb1e33402027fc44f29b5bf6def3effcc22d7bb75f2b747"
|
||||
dependencies = [
|
||||
"num-traits",
|
||||
"plotters-backend",
|
||||
"plotters-svg",
|
||||
"wasm-bindgen",
|
||||
"web-sys",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "plotters-backend"
|
||||
version = "0.3.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "df42e13c12958a16b3f7f4386b9ab1f3e7933914ecea48da7139435263a4172a"
|
||||
|
||||
[[package]]
|
||||
name = "plotters-svg"
|
||||
version = "0.3.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "51bae2ac328883f7acdfea3d66a7c35751187f870bc81f94563733a154d7a670"
|
||||
dependencies = [
|
||||
"plotters-backend",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "plugins"
|
||||
version = "0.1.0"
|
||||
@@ -1015,6 +1245,26 @@ dependencies = [
|
||||
"getrandom 0.3.4",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rayon"
|
||||
version = "1.12.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "fb39b166781f92d482534ef4b4b1b2568f42613b53e5b6c160e24cfbfa30926d"
|
||||
dependencies = [
|
||||
"either",
|
||||
"rayon-core",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rayon-core"
|
||||
version = "1.13.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "22e18b0f0062d30d4230b2e85ff77fdfe4326feb054b9783a3460d8435c8ab91"
|
||||
dependencies = [
|
||||
"crossbeam-deque",
|
||||
"crossbeam-utils",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "redox_syscall"
|
||||
version = "0.5.18"
|
||||
@@ -1138,7 +1388,7 @@ dependencies = [
|
||||
"errno",
|
||||
"libc",
|
||||
"linux-raw-sys 0.4.15",
|
||||
- "windows-sys 0.52.0",
+ "windows-sys 0.59.0",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -1522,6 +1772,16 @@ dependencies = [
|
||||
"zerovec",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tinytemplate"
|
||||
version = "1.2.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "be4d6b5f19ff7664e8c98d03e2139cb510db9b0a60b55f8e8709b689d939b6bc"
|
||||
dependencies = [
|
||||
"serde",
|
||||
"serde_json",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tinyvec"
|
||||
version = "1.11.0"
|
||||
|
||||
@@ -13,5 +13,12 @@ serde_json.workspace = true
telemetry = { path = "../telemetry" }
tokio = { version = "1", features = ["io-util", "macros", "net", "rt-multi-thread", "time"] }

[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }

[lints]
workspace = true

[[bench]]
name = "request_building"
harness = false
329
rust/crates/api/benches/request_building.rs
Normal file
@@ -0,0 +1,329 @@
// Benchmarks for API request building performance
// Benchmarks are exempt from strict linting as they are test/performance code
#![allow(
    clippy::cognitive_complexity,
    clippy::doc_markdown,
    clippy::explicit_iter_loop,
    clippy::format_in_format_args,
    clippy::missing_docs_in_private_items,
    clippy::must_use_candidate,
    clippy::needless_pass_by_value,
    clippy::clone_on_copy,
    clippy::too_many_lines,
    clippy::uninlined_format_args
)]

use api::{
    build_chat_completion_request, flatten_tool_result_content, is_reasoning_model,
    translate_message, InputContentBlock, InputMessage, MessageRequest, OpenAiCompatConfig,
    ToolResultContentBlock,
};
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion};
use serde_json::json;

/// Create a sample message request with various content types
fn create_sample_request(message_count: usize) -> MessageRequest {
    let mut messages = Vec::with_capacity(message_count);

    for i in 0..message_count {
        match i % 4 {
            0 => messages.push(InputMessage::user_text(format!("Message {}", i))),
            1 => messages.push(InputMessage {
                role: "assistant".to_string(),
                content: vec![
                    InputContentBlock::Text {
                        text: format!("Assistant response {}", i),
                    },
                    InputContentBlock::ToolUse {
                        id: format!("call_{}", i),
                        name: "read_file".to_string(),
                        input: json!({"path": format!("/tmp/file{}", i)}),
                    },
                ],
            }),
            2 => messages.push(InputMessage {
                role: "user".to_string(),
                content: vec![InputContentBlock::ToolResult {
                    tool_use_id: format!("call_{}", i - 1),
                    content: vec![ToolResultContentBlock::Text {
                        text: format!("Tool result content {}", i),
                    }],
                    is_error: false,
                }],
            }),
            _ => messages.push(InputMessage {
                role: "assistant".to_string(),
                content: vec![InputContentBlock::ToolUse {
                    id: format!("call_{}", i),
                    name: "write_file".to_string(),
                    input: json!({"path": format!("/tmp/out{}", i), "content": "data"}),
                }],
            }),
        }
    }

    MessageRequest {
        model: "gpt-4o".to_string(),
        max_tokens: 1024,
        messages,
        stream: false,
        system: Some("You are a helpful assistant.".to_string()),
        temperature: Some(0.7),
        top_p: None,
        tools: None,
        tool_choice: None,
        frequency_penalty: None,
        presence_penalty: None,
        stop: None,
        reasoning_effort: None,
    }
}

/// Benchmark translate_message with various message types
fn bench_translate_message(c: &mut Criterion) {
    let mut group = c.benchmark_group("translate_message");

    // Text-only message
    let text_message = InputMessage::user_text("Simple text message".to_string());
    group.bench_with_input(
        BenchmarkId::new("text_only", "single"),
        &text_message,
        |b, msg| {
            b.iter(|| translate_message(black_box(msg), black_box("gpt-4o")));
        },
    );

    // Assistant message with tool calls
    let assistant_message = InputMessage {
        role: "assistant".to_string(),
        content: vec![
            InputContentBlock::Text {
                text: "I'll help you with that.".to_string(),
            },
            InputContentBlock::ToolUse {
                id: "call_1".to_string(),
                name: "read_file".to_string(),
                input: json!({"path": "/tmp/test"}),
            },
            InputContentBlock::ToolUse {
                id: "call_2".to_string(),
                name: "write_file".to_string(),
                input: json!({"path": "/tmp/out", "content": "data"}),
            },
        ],
    };
    group.bench_with_input(
        BenchmarkId::new("assistant_with_tools", "2_tools"),
        &assistant_message,
        |b, msg| {
            b.iter(|| translate_message(black_box(msg), black_box("gpt-4o")));
        },
    );

    // Tool result message
    let tool_result_message = InputMessage {
        role: "user".to_string(),
        content: vec![InputContentBlock::ToolResult {
            tool_use_id: "call_1".to_string(),
            content: vec![ToolResultContentBlock::Text {
                text: "File contents here".to_string(),
            }],
            is_error: false,
        }],
    };
    group.bench_with_input(
        BenchmarkId::new("tool_result", "single"),
        &tool_result_message,
        |b, msg| {
            b.iter(|| translate_message(black_box(msg), black_box("gpt-4o")));
        },
    );

    // Tool result for kimi model (is_error excluded)
    group.bench_with_input(
        BenchmarkId::new("tool_result_kimi", "kimi-k2.5"),
        &tool_result_message,
        |b, msg| {
            b.iter(|| translate_message(black_box(msg), black_box("kimi-k2.5")));
        },
    );

    // Large content message
    let large_content = "x".repeat(10000);
    let large_message = InputMessage::user_text(large_content);
    group.bench_with_input(
        BenchmarkId::new("large_text", "10kb"),
        &large_message,
        |b, msg| {
            b.iter(|| translate_message(black_box(msg), black_box("gpt-4o")));
        },
    );

    group.finish();
}

/// Benchmark build_chat_completion_request with various message counts
fn bench_build_request(c: &mut Criterion) {
    let mut group = c.benchmark_group("build_chat_completion_request");
    let config = OpenAiCompatConfig::openai();

    for message_count in [10, 50, 100].iter() {
        let request = create_sample_request(*message_count);
        group.bench_with_input(
            BenchmarkId::new("message_count", message_count),
            &request,
            |b, req| {
                b.iter(|| build_chat_completion_request(black_box(req), config.clone()));
            },
        );
    }

    // Benchmark with reasoning model (tuning params stripped)
    let mut reasoning_request = create_sample_request(50);
    reasoning_request.model = "o1-mini".to_string();
    group.bench_with_input(
        BenchmarkId::new("reasoning_model", "o1-mini"),
        &reasoning_request,
        |b, req| {
            b.iter(|| build_chat_completion_request(black_box(req), config.clone()));
        },
    );

    // Benchmark with gpt-5 (max_completion_tokens)
    let mut gpt5_request = create_sample_request(50);
    gpt5_request.model = "gpt-5".to_string();
    group.bench_with_input(
        BenchmarkId::new("gpt5", "gpt-5"),
        &gpt5_request,
        |b, req| {
            b.iter(|| build_chat_completion_request(black_box(req), config.clone()));
        },
    );

    group.finish();
}

/// Benchmark flatten_tool_result_content
fn bench_flatten_tool_result(c: &mut Criterion) {
    let mut group = c.benchmark_group("flatten_tool_result_content");

    // Single text block
    let single_text = vec![ToolResultContentBlock::Text {
        text: "Simple result".to_string(),
    }];
    group.bench_with_input(
        BenchmarkId::new("single_text", "1_block"),
        &single_text,
        |b, content| {
            b.iter(|| flatten_tool_result_content(black_box(content)));
        },
    );

    // Multiple text blocks
    let multi_text: Vec<ToolResultContentBlock> = (0..10)
        .map(|i| ToolResultContentBlock::Text {
            text: format!("Line {}: some content here\n", i),
        })
        .collect();
    group.bench_with_input(
        BenchmarkId::new("multi_text", "10_blocks"),
        &multi_text,
        |b, content| {
            b.iter(|| flatten_tool_result_content(black_box(content)));
        },
    );

    // JSON content blocks
    let json_content: Vec<ToolResultContentBlock> = (0..5)
        .map(|i| ToolResultContentBlock::Json {
            value: json!({"index": i, "data": "test content", "nested": {"key": "value"}}),
        })
        .collect();
    group.bench_with_input(
        BenchmarkId::new("json_content", "5_blocks"),
        &json_content,
        |b, content| {
            b.iter(|| flatten_tool_result_content(black_box(content)));
        },
    );

    // Mixed content
    let mixed_content = vec![
        ToolResultContentBlock::Text {
            text: "Here's the result:".to_string(),
        },
        ToolResultContentBlock::Json {
            value: json!({"status": "success", "count": 42}),
        },
        ToolResultContentBlock::Text {
            text: "Processing complete.".to_string(),
        },
    ];
    group.bench_with_input(
        BenchmarkId::new("mixed_content", "text+json"),
        &mixed_content,
        |b, content| {
            b.iter(|| flatten_tool_result_content(black_box(content)));
        },
    );

    // Large content - simulating typical tool output
    let large_content: Vec<ToolResultContentBlock> = (0..50)
        .map(|i| {
            if i % 3 == 0 {
                ToolResultContentBlock::Json {
                    value: json!({"line": i, "content": "x".repeat(100)}),
                }
            } else {
                ToolResultContentBlock::Text {
                    text: format!("Line {}: {}", i, "some output content here"),
                }
            }
        })
        .collect();
    group.bench_with_input(
        BenchmarkId::new("large_content", "50_blocks"),
        &large_content,
        |b, content| {
            b.iter(|| flatten_tool_result_content(black_box(content)));
        },
    );

    group.finish();
}

/// Benchmark is_reasoning_model detection
fn bench_is_reasoning_model(c: &mut Criterion) {
    let mut group = c.benchmark_group("is_reasoning_model");

    let models = vec![
        ("gpt-4o", false),
        ("o1-mini", true),
        ("o3", true),
        ("grok-3", false),
        ("grok-3-mini", true),
        ("qwen/qwen-qwq-32b", true),
        ("qwen/qwen-plus", false),
    ];

    for (model, expected) in models {
        group.bench_with_input(
            BenchmarkId::new(model, if expected { "reasoning" } else { "normal" }),
            model,
            |b, m| {
                b.iter(|| is_reasoning_model(black_box(m)));
            },
        );
    }

    group.finish();
}

criterion_group!(
    benches,
    bench_translate_message,
    bench_build_request,
    bench_flatten_tool_result,
    bench_is_reasoning_model
);
criterion_main!(benches);
@@ -53,6 +53,8 @@ pub enum ApiError {
        request_id: Option<String>,
        body: String,
        retryable: bool,
        /// Suggested user action based on error type (e.g., "Reduce prompt size" for 413)
        suggested_action: Option<String>,
    },
    RetriesExhausted {
        attempts: u32,
@@ -63,6 +65,11 @@ pub enum ApiError {
|
||||
attempt: u32,
|
||||
base_delay: Duration,
|
||||
},
|
||||
RequestBodySizeExceeded {
|
||||
estimated_bytes: usize,
|
||||
max_bytes: usize,
|
||||
provider: &'static str,
|
||||
},
|
||||
}
|
||||
|
||||
impl ApiError {
|
||||
@@ -129,7 +136,8 @@ impl ApiError {
|
||||
| Self::Io(_)
|
||||
| Self::Json { .. }
|
||||
| Self::InvalidSseFrame(_)
|
||||
| Self::BackoffOverflow { .. } => false,
|
||||
| Self::BackoffOverflow { .. }
|
||||
| Self::RequestBodySizeExceeded { .. } => false,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -147,7 +155,8 @@ impl ApiError {
|
||||
| Self::Io(_)
|
||||
| Self::Json { .. }
|
||||
| Self::InvalidSseFrame(_)
|
||||
| Self::BackoffOverflow { .. } => None,
|
||||
| Self::BackoffOverflow { .. }
|
||||
| Self::RequestBodySizeExceeded { .. } => None,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -172,6 +181,7 @@ impl ApiError {
|
||||
"provider_transport"
|
||||
}
|
||||
Self::InvalidApiKeyEnv(_) | Self::Io(_) | Self::Json { .. } => "runtime_io",
|
||||
Self::RequestBodySizeExceeded { .. } => "request_size",
|
||||
}
|
||||
}
|
||||
|
||||
@@ -194,7 +204,8 @@ impl ApiError {
|
||||
| Self::Io(_)
|
||||
| Self::Json { .. }
|
||||
| Self::InvalidSseFrame(_)
|
||||
| Self::BackoffOverflow { .. } => false,
|
||||
| Self::BackoffOverflow { .. }
|
||||
| Self::RequestBodySizeExceeded { .. } => false,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -223,12 +234,14 @@ impl ApiError {
|
||||
| Self::Io(_)
|
||||
| Self::Json { .. }
|
||||
| Self::InvalidSseFrame(_)
|
||||
| Self::BackoffOverflow { .. } => false,
|
||||
| Self::BackoffOverflow { .. }
|
||||
| Self::RequestBodySizeExceeded { .. } => false,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Display for ApiError {
|
||||
#[allow(clippy::too_many_lines)]
|
||||
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
|
||||
match self {
|
||||
Self::MissingCredentials {
|
||||
@@ -324,6 +337,14 @@ impl Display for ApiError {
|
||||
f,
|
||||
"retry backoff overflowed on attempt {attempt} with base delay {base_delay:?}"
|
||||
),
|
||||
Self::RequestBodySizeExceeded {
|
||||
estimated_bytes,
|
||||
max_bytes,
|
||||
provider,
|
||||
} => write!(
|
||||
f,
|
||||
"request body size ({estimated_bytes} bytes) exceeds {provider} limit ({max_bytes} bytes); reduce prompt length or context before retrying"
|
||||
),
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -469,6 +490,7 @@ mod tests {
|
||||
request_id: Some("req_jobdori_123".to_string()),
|
||||
body: String::new(),
|
||||
retryable: true,
|
||||
suggested_action: None,
|
||||
};
|
||||
|
||||
assert!(error.is_generic_fatal_wrapper());
|
||||
@@ -491,6 +513,7 @@ mod tests {
|
||||
request_id: Some("req_nested_456".to_string()),
|
||||
body: String::new(),
|
||||
retryable: true,
|
||||
suggested_action: None,
|
||||
}),
|
||||
};
|
||||
|
||||
@@ -511,6 +534,7 @@ mod tests {
|
||||
request_id: Some("req_ctx_123".to_string()),
|
||||
body: String::new(),
|
||||
retryable: false,
|
||||
suggested_action: None,
|
||||
};
|
||||
|
||||
assert!(error.is_context_window_failure());
|
||||
|
||||
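The hunks above add a `RequestBodySizeExceeded` variant and classify it as non-retryable in each match arm. A minimal std-only sketch of that idea follows. `SketchError` is a hypothetical, trimmed-down stand-in for `ApiError`; the `Api` variant's fields and its message are assumptions, and only the size-exceeded message text is copied from the diff.

```rust
use std::fmt;

/// Hypothetical, reduced stand-in for the diff's `ApiError`.
#[derive(Debug, PartialEq)]
pub enum SketchError {
    /// HTTP-level failure; the caller decides `retryable` from the status code.
    Api { status: u16, retryable: bool },
    /// Raised before any network I/O when the serialized payload is too large.
    RequestBodySizeExceeded {
        estimated_bytes: usize,
        max_bytes: usize,
        provider: &'static str,
    },
}

impl SketchError {
    /// Resending an oversized body unchanged cannot succeed, so it is never retryable.
    pub fn is_retryable(&self) -> bool {
        match self {
            Self::Api { retryable, .. } => *retryable,
            Self::RequestBodySizeExceeded { .. } => false,
        }
    }
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Placeholder wording; the real `Api` message is not shown in this hunk.
            Self::Api { status, .. } => write!(f, "api returned status {status}"),
            // Message text copied from the diff.
            Self::RequestBodySizeExceeded {
                estimated_bytes,
                max_bytes,
                provider,
            } => write!(
                f,
                "request body size ({estimated_bytes} bytes) exceeds {provider} limit ({max_bytes} bytes); reduce prompt length or context before retrying"
            ),
        }
    }
}
```

Keeping the variant out of the retry path matters because a retry loop would otherwise resend the same oversized payload until attempts run out.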
@@ -19,7 +19,10 @@ pub use prompt_cache::{
|
||||
PromptCacheStats,
|
||||
};
|
||||
pub use providers::anthropic::{AnthropicClient, AnthropicClient as ApiClient, AuthSource};
|
||||
pub use providers::openai_compat::{OpenAiCompatClient, OpenAiCompatConfig};
|
||||
pub use providers::openai_compat::{
|
||||
build_chat_completion_request, flatten_tool_result_content, is_reasoning_model,
|
||||
model_rejects_is_error_field, translate_message, OpenAiCompatClient, OpenAiCompatConfig,
|
||||
};
|
||||
pub use providers::{
|
||||
detect_provider_kind, max_tokens_for_model, max_tokens_for_model_with_override,
|
||||
resolve_model_alias, ProviderKind,
|
||||
|
||||
@@ -885,6 +885,7 @@ async fn expect_success(response: reqwest::Response) -> Result<reqwest::Response
|
||||
request_id,
|
||||
body,
|
||||
retryable,
|
||||
suggested_action: None,
|
||||
})
|
||||
}
|
||||
|
||||
@@ -909,6 +910,7 @@ fn enrich_bearer_auth_error(error: ApiError, auth: &AuthSource) -> ApiError {
|
||||
request_id,
|
||||
body,
|
||||
retryable,
|
||||
suggested_action,
|
||||
} = error
|
||||
else {
|
||||
return error;
|
||||
@@ -921,6 +923,7 @@ fn enrich_bearer_auth_error(error: ApiError, auth: &AuthSource) -> ApiError {
|
||||
request_id,
|
||||
body,
|
||||
retryable,
|
||||
suggested_action,
|
||||
};
|
||||
}
|
||||
let Some(bearer_token) = auth.bearer_token() else {
|
||||
@@ -931,6 +934,7 @@ fn enrich_bearer_auth_error(error: ApiError, auth: &AuthSource) -> ApiError {
|
||||
request_id,
|
||||
body,
|
||||
retryable,
|
||||
suggested_action,
|
||||
};
|
||||
};
|
||||
if !bearer_token.starts_with("sk-ant-") {
|
||||
@@ -941,6 +945,7 @@ fn enrich_bearer_auth_error(error: ApiError, auth: &AuthSource) -> ApiError {
|
||||
request_id,
|
||||
body,
|
||||
retryable,
|
||||
suggested_action,
|
||||
};
|
||||
}
|
||||
// Only append the hint when the AuthSource is pure BearerToken. If both
|
||||
@@ -955,6 +960,7 @@ fn enrich_bearer_auth_error(error: ApiError, auth: &AuthSource) -> ApiError {
|
||||
request_id,
|
||||
body,
|
||||
retryable,
|
||||
suggested_action,
|
||||
};
|
||||
}
|
||||
let enriched_message = match message {
|
||||
@@ -968,6 +974,7 @@ fn enrich_bearer_auth_error(error: ApiError, auth: &AuthSource) -> ApiError {
|
||||
request_id,
|
||||
body,
|
||||
retryable,
|
||||
suggested_action,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1555,6 +1562,7 @@ mod tests {
|
||||
request_id: Some("req_varleg_001".to_string()),
|
||||
body: String::new(),
|
||||
retryable: false,
|
||||
suggested_action: None,
|
||||
};
|
||||
|
||||
// when
|
||||
@@ -1595,6 +1603,7 @@ mod tests {
|
||||
request_id: None,
|
||||
body: String::new(),
|
||||
retryable: true,
|
||||
suggested_action: None,
|
||||
};
|
||||
|
||||
// when
|
||||
@@ -1623,6 +1632,7 @@ mod tests {
|
||||
request_id: None,
|
||||
body: String::new(),
|
||||
retryable: false,
|
||||
suggested_action: None,
|
||||
};
|
||||
|
||||
// when
|
||||
@@ -1650,6 +1660,7 @@ mod tests {
|
||||
request_id: None,
|
||||
body: String::new(),
|
||||
retryable: false,
|
||||
suggested_action: None,
|
||||
};
|
||||
|
||||
// when
|
||||
@@ -1674,6 +1685,7 @@ mod tests {
|
||||
request_id: None,
|
||||
body: String::new(),
|
||||
retryable: false,
|
||||
suggested_action: None,
|
||||
};
|
||||
|
||||
// when
|
||||
|
||||
@@ -122,6 +122,15 @@ const MODEL_REGISTRY: &[(&str, ProviderMetadata)] = &[
|
||||
default_base_url: openai_compat::DEFAULT_XAI_BASE_URL,
|
||||
},
|
||||
),
|
||||
(
|
||||
"kimi",
|
||||
ProviderMetadata {
|
||||
provider: ProviderKind::OpenAi,
|
||||
auth_env: "DASHSCOPE_API_KEY",
|
||||
base_url_env: "DASHSCOPE_BASE_URL",
|
||||
default_base_url: openai_compat::DEFAULT_DASHSCOPE_BASE_URL,
|
||||
},
|
||||
),
|
||||
];
|
||||
|
||||
#[must_use]
|
||||
@@ -144,7 +153,10 @@ pub fn resolve_model_alias(model: &str) -> String {
|
||||
"grok-2" => "grok-2",
|
||||
_ => trimmed,
|
||||
},
|
||||
ProviderKind::OpenAi => trimmed,
|
||||
ProviderKind::OpenAi => match *alias {
|
||||
"kimi" => "kimi-k2.5",
|
||||
_ => trimmed,
|
||||
},
|
||||
})
|
||||
})
|
||||
.map_or_else(|| trimmed.to_string(), ToOwned::to_owned)
|
||||
@@ -194,6 +206,16 @@ pub fn metadata_for_model(model: &str) -> Option<ProviderMetadata> {
|
||||
default_base_url: openai_compat::DEFAULT_DASHSCOPE_BASE_URL,
|
||||
});
|
||||
}
|
||||
// Kimi models (kimi-k2.5, kimi-k1.5, etc.) via DashScope compatible-mode.
|
||||
// Routes kimi/* and kimi-* model names to DashScope endpoint.
|
||||
if canonical.starts_with("kimi/") || canonical.starts_with("kimi-") {
|
||||
return Some(ProviderMetadata {
|
||||
provider: ProviderKind::OpenAi,
|
||||
auth_env: "DASHSCOPE_API_KEY",
|
||||
base_url_env: "DASHSCOPE_BASE_URL",
|
||||
default_base_url: openai_compat::DEFAULT_DASHSCOPE_BASE_URL,
|
||||
});
|
||||
}
|
||||
None
|
||||
}
|
||||
|
||||
@@ -267,6 +289,12 @@ pub fn model_token_limit(model: &str) -> Option<ModelTokenLimit> {
|
||||
max_output_tokens: 64_000,
|
||||
context_window_tokens: 131_072,
|
||||
}),
|
||||
// Kimi models via DashScope (Moonshot AI)
|
||||
// Source: https://platform.moonshot.cn/docs/intro
|
||||
"kimi-k2.5" | "kimi-k1.5" => Some(ModelTokenLimit {
|
||||
max_output_tokens: 16_384,
|
||||
context_window_tokens: 256_000,
|
||||
}),
|
||||
_ => None,
|
||||
}
|
||||
}
|
||||
@@ -554,6 +582,34 @@ mod tests {
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn kimi_prefix_routes_to_dashscope() {
|
||||
// Kimi models via DashScope (kimi-k2.5, kimi-k1.5, etc.)
|
||||
let meta = super::metadata_for_model("kimi-k2.5")
|
||||
.expect("kimi-k2.5 must resolve to DashScope metadata");
|
||||
assert_eq!(meta.auth_env, "DASHSCOPE_API_KEY");
|
||||
assert_eq!(meta.base_url_env, "DASHSCOPE_BASE_URL");
|
||||
assert!(meta.default_base_url.contains("dashscope.aliyuncs.com"));
|
||||
assert_eq!(meta.provider, ProviderKind::OpenAi);
|
||||
|
||||
// With provider prefix
|
||||
let meta2 = super::metadata_for_model("kimi/kimi-k2.5")
|
||||
.expect("kimi/kimi-k2.5 must resolve to DashScope metadata");
|
||||
assert_eq!(meta2.auth_env, "DASHSCOPE_API_KEY");
|
||||
assert_eq!(meta2.provider, ProviderKind::OpenAi);
|
||||
|
||||
// Different kimi variants
|
||||
let meta3 = super::metadata_for_model("kimi-k1.5")
|
||||
.expect("kimi-k1.5 must resolve to DashScope metadata");
|
||||
assert_eq!(meta3.auth_env, "DASHSCOPE_API_KEY");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn kimi_alias_resolves_to_kimi_k2_5() {
|
||||
assert_eq!(super::resolve_model_alias("kimi"), "kimi-k2.5");
|
||||
assert_eq!(super::resolve_model_alias("KIMI"), "kimi-k2.5"); // case insensitive
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn keeps_existing_max_token_heuristic() {
|
||||
assert_eq!(max_tokens_for_model("opus"), 32_000);
|
||||
@@ -694,6 +750,69 @@ mod tests {
|
||||
.expect("models without context metadata should skip the guarded preflight");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn returns_context_window_metadata_for_kimi_models() {
|
||||
// kimi-k2.5
|
||||
let k25_limit = model_token_limit("kimi-k2.5")
|
||||
.expect("kimi-k2.5 should have token limit metadata");
|
||||
assert_eq!(k25_limit.max_output_tokens, 16_384);
|
||||
assert_eq!(k25_limit.context_window_tokens, 256_000);
|
||||
|
||||
// kimi-k1.5
|
||||
let k15_limit = model_token_limit("kimi-k1.5")
|
||||
.expect("kimi-k1.5 should have token limit metadata");
|
||||
assert_eq!(k15_limit.max_output_tokens, 16_384);
|
||||
assert_eq!(k15_limit.context_window_tokens, 256_000);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn kimi_alias_resolves_to_kimi_k25_token_limits() {
|
||||
// The "kimi" alias resolves to "kimi-k2.5" via resolve_model_alias()
|
||||
let alias_limit = model_token_limit("kimi")
|
||||
.expect("kimi alias should resolve to kimi-k2.5 limits");
|
||||
let direct_limit = model_token_limit("kimi-k2.5")
|
||||
.expect("kimi-k2.5 should have limits");
|
||||
assert_eq!(alias_limit.max_output_tokens, direct_limit.max_output_tokens);
|
||||
assert_eq!(
|
||||
alias_limit.context_window_tokens,
|
||||
direct_limit.context_window_tokens
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn preflight_blocks_oversized_requests_for_kimi_models() {
|
||||
let request = MessageRequest {
|
||||
model: "kimi-k2.5".to_string(),
|
||||
max_tokens: 16_384,
|
||||
messages: vec![InputMessage {
|
||||
role: "user".to_string(),
|
||||
content: vec![InputContentBlock::Text {
|
||||
text: "x".repeat(1_000_000), // Large input to exceed context window
|
||||
}],
|
||||
}],
|
||||
system: Some("Keep the answer short.".to_string()),
|
||||
tools: None,
|
||||
tool_choice: None,
|
||||
stream: true,
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
let error = preflight_message_request(&request)
|
||||
.expect_err("oversized request should be rejected for kimi models");
|
||||
|
||||
match error {
|
||||
ApiError::ContextWindowExceeded {
|
||||
model,
|
||||
context_window_tokens,
|
||||
..
|
||||
} => {
|
||||
assert_eq!(model, "kimi-k2.5");
|
||||
assert_eq!(context_window_tokens, 256_000);
|
||||
}
|
||||
other => panic!("expected context-window preflight failure, got {other:?}"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn parse_dotenv_extracts_keys_handles_comments_quotes_and_export_prefix() {
|
||||
// given
|
||||
|
||||
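The provider hunks above combine two steps: the `kimi` alias resolves to `kimi-k2.5`, and any `kimi/` or `kimi-` name routes to the DashScope credentials. A small std-only sketch of how these two steps might compose is given here. `Route`, `resolve_alias`, and `route_for` are hypothetical names, and the assumption that routing runs on the alias-resolved, lowercased name is inferred from the tests rather than shown in the diff.

```rust
/// Hypothetical, reduced record; the real `ProviderMetadata` carries more fields.
#[derive(Debug, PartialEq, Clone, Copy)]
pub struct Route {
    pub auth_env: &'static str,
    pub base_url_env: &'static str,
}

/// Resolve a short alias to a concrete model id, case-insensitively.
/// Unknown names are returned trimmed but otherwise unchanged (an assumption).
pub fn resolve_alias(model: &str) -> String {
    let trimmed = model.trim();
    match trimmed.to_ascii_lowercase().as_str() {
        "kimi" => "kimi-k2.5".to_string(),
        _ => trimmed.to_string(),
    }
}

/// Route `kimi/*` and `kimi-*` names to the DashScope environment variables.
pub fn route_for(model: &str) -> Option<Route> {
    let canonical = resolve_alias(model).to_ascii_lowercase();
    if canonical.starts_with("kimi/") || canonical.starts_with("kimi-") {
        return Some(Route {
            auth_env: "DASHSCOPE_API_KEY",
            base_url_env: "DASHSCOPE_BASE_URL",
        });
    }
    None
}
```

Matching on the `kimi-` prefix rather than an exact list means future variants route correctly without a registry change, at the cost of also capturing any unrelated model whose name happens to start with `kimi-`.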
@@ -31,12 +31,22 @@ pub struct OpenAiCompatConfig {
|
||||
pub api_key_env: &'static str,
|
||||
pub base_url_env: &'static str,
|
||||
pub default_base_url: &'static str,
|
||||
/// Maximum request body size in bytes. Provider-specific limits:
|
||||
/// - `DashScope`: 6 MiB (`6_291_456` bytes) - observed in dogfood testing
|
||||
/// - `OpenAI`: 100 MiB (`104_857_600` bytes)
|
||||
/// - `xAI`: 50 MiB (`52_428_800` bytes)
|
||||
pub max_request_body_bytes: usize,
|
||||
}
|
||||
|
||||
const XAI_ENV_VARS: &[&str] = &["XAI_API_KEY"];
|
||||
const OPENAI_ENV_VARS: &[&str] = &["OPENAI_API_KEY"];
|
||||
const DASHSCOPE_ENV_VARS: &[&str] = &["DASHSCOPE_API_KEY"];
|
||||
|
||||
// Provider-specific request body size limits in bytes
|
||||
const XAI_MAX_REQUEST_BODY_BYTES: usize = 52_428_800; // 50 MiB
|
||||
const OPENAI_MAX_REQUEST_BODY_BYTES: usize = 104_857_600; // 100 MiB
|
||||
const DASHSCOPE_MAX_REQUEST_BODY_BYTES: usize = 6_291_456; // 6 MiB (observed limit in dogfood)
|
||||
|
||||
impl OpenAiCompatConfig {
|
||||
#[must_use]
|
||||
pub const fn xai() -> Self {
|
||||
@@ -45,6 +55,7 @@ impl OpenAiCompatConfig {
|
||||
api_key_env: "XAI_API_KEY",
|
||||
base_url_env: "XAI_BASE_URL",
|
||||
default_base_url: DEFAULT_XAI_BASE_URL,
|
||||
max_request_body_bytes: XAI_MAX_REQUEST_BODY_BYTES,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -55,6 +66,7 @@ impl OpenAiCompatConfig {
|
||||
api_key_env: "OPENAI_API_KEY",
|
||||
base_url_env: "OPENAI_BASE_URL",
|
||||
default_base_url: DEFAULT_OPENAI_BASE_URL,
|
||||
max_request_body_bytes: OPENAI_MAX_REQUEST_BODY_BYTES,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -69,6 +81,7 @@ impl OpenAiCompatConfig {
|
||||
api_key_env: "DASHSCOPE_API_KEY",
|
||||
base_url_env: "DASHSCOPE_BASE_URL",
|
||||
default_base_url: DEFAULT_DASHSCOPE_BASE_URL,
|
||||
max_request_body_bytes: DASHSCOPE_MAX_REQUEST_BODY_BYTES,
|
||||
}
|
||||
}
|
||||
|
||||
@@ -183,6 +196,10 @@ impl OpenAiCompatClient {
|
||||
request_id,
|
||||
body,
|
||||
retryable: false,
|
||||
suggested_action: suggested_action_for_status(
|
||||
reqwest::StatusCode::from_u16(code.unwrap_or(400))
|
||||
.unwrap_or(reqwest::StatusCode::BAD_REQUEST),
|
||||
),
|
||||
});
|
||||
}
|
||||
}
|
||||
@@ -249,6 +266,9 @@ impl OpenAiCompatClient {
|
||||
&self,
|
||||
request: &MessageRequest,
|
||||
) -> Result<reqwest::Response, ApiError> {
|
||||
// Pre-flight check: verify request body size against provider limits
|
||||
check_request_body_size(request, self.config())?;
|
||||
|
||||
let request_url = chat_completions_endpoint(&self.base_url);
|
||||
self.http
|
||||
.post(&request_url)
|
||||
@@ -752,7 +772,12 @@ struct ErrorBody {
|
||||
/// Returns true for models known to reject tuning parameters like temperature,
|
||||
/// `top_p`, `frequency_penalty`, and `presence_penalty`. These are typically
|
||||
/// reasoning/chain-of-thought models with fixed sampling.
|
||||
fn is_reasoning_model(model: &str) -> bool {
|
||||
/// Returns true for models known to reject tuning parameters like temperature,
|
||||
/// `top_p`, `frequency_penalty`, and `presence_penalty`. These are typically
|
||||
/// reasoning/chain-of-thought models with fixed sampling.
|
||||
/// Public for benchmarking and testing purposes.
|
||||
#[must_use]
|
||||
pub fn is_reasoning_model(model: &str) -> bool {
|
||||
let lowered = model.to_ascii_lowercase();
|
||||
// Strip any provider/ prefix for the check (e.g. qwen/qwen-qwq -> qwen-qwq)
|
||||
let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
|
||||
@@ -776,7 +801,7 @@ fn strip_routing_prefix(model: &str) -> &str {
|
||||
let prefix = &model[..pos];
|
||||
// Only strip if the prefix before "/" is a known routing prefix,
|
||||
// not if "/" appears in the middle of the model name for other reasons.
|
||||
if matches!(prefix, "openai" | "xai" | "grok" | "qwen") {
|
||||
if matches!(prefix, "openai" | "xai" | "grok" | "qwen" | "kimi") {
|
||||
&model[pos + 1..]
|
||||
} else {
|
||||
model
|
||||
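Two different prefix-handling strategies appear in this file: `strip_routing_prefix` removes only a known routing prefix, while `is_reasoning_model` takes whatever follows the last `/`. The sketch below contrasts them. The body of `strip_routing_prefix` is reconstructed from the visible hunk lines and assumes the split position comes from `find('/')`; `last_segment` is a hypothetical helper name for the `rsplit('/')` expression.

```rust
/// Strip a leading `prefix/` only when the prefix is a known routing prefix.
/// Reconstructed from the hunk; the use of `find('/')` is an assumption.
pub fn strip_routing_prefix(model: &str) -> &str {
    match model.find('/') {
        Some(pos)
            if matches!(&model[..pos], "openai" | "xai" | "grok" | "qwen" | "kimi") =>
        {
            &model[pos + 1..]
        }
        // Unknown prefix or no slash: keep the name untouched.
        _ => model,
    }
}

/// Take whatever follows the last `/`, regardless of what the prefix is.
pub fn last_segment(model: &str) -> &str {
    model.rsplit('/').next().unwrap_or(model)
}
```

The two functions disagree on names such as `dashscope/kimi-k2.5`: the detection helper sees `kimi-k2.5`, but the wire model keeps the `dashscope/` prefix because `dashscope` is not in the routing list.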
@@ -786,7 +811,41 @@ fn strip_routing_prefix(model: &str) -> &str {
|
||||
}
|
||||
}
|
||||
|
||||
fn build_chat_completion_request(request: &MessageRequest, config: OpenAiCompatConfig) -> Value {
|
||||
/// Estimate the serialized JSON size of a request payload in bytes.
|
||||
/// This is a pre-flight check to avoid hitting provider-specific size limits.
|
||||
pub fn estimate_request_body_size(request: &MessageRequest, config: OpenAiCompatConfig) -> usize {
|
||||
let payload = build_chat_completion_request(request, config);
|
||||
// serde_json::to_vec gives us the exact byte size of the serialized JSON
|
||||
serde_json::to_vec(&payload).map_or(0, |v| v.len())
|
||||
}
|
||||
|
||||
/// Pre-flight check for request body size against provider limits.
|
||||
/// Returns `Ok(())` if the request is within limits, or an error with
|
||||
/// a clear message about the size limit being exceeded.
|
||||
pub fn check_request_body_size(
|
||||
request: &MessageRequest,
|
||||
config: OpenAiCompatConfig,
|
||||
) -> Result<(), ApiError> {
|
||||
let estimated_bytes = estimate_request_body_size(request, config);
|
||||
let max_bytes = config.max_request_body_bytes;
|
||||
|
||||
if estimated_bytes > max_bytes {
|
||||
Err(ApiError::RequestBodySizeExceeded {
|
||||
estimated_bytes,
|
||||
max_bytes,
|
||||
provider: config.provider_name,
|
||||
})
|
||||
} else {
|
||||
Ok(())
|
||||
}
|
||||
}
|
||||
|
||||
/// Builds a chat completion request payload from a `MessageRequest`.
|
||||
/// Public for benchmarking purposes.
|
||||
pub fn build_chat_completion_request(
|
||||
request: &MessageRequest,
|
||||
config: OpenAiCompatConfig,
|
||||
) -> Value {
|
||||
let mut messages = Vec::new();
|
||||
if let Some(system) = request.system.as_ref().filter(|value| !value.is_empty()) {
|
||||
messages.push(json!({
|
||||
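The pre-flight check in this hunk compares the serialized payload length against a per-provider ceiling before any network call. A std-only sketch of the comparison is shown here. `TooLarge` and `check_body_size` are hypothetical names, and the sketch takes already-serialized bytes instead of calling `serde_json`. One difference worth noting: in the diff, a serialization failure maps to an estimate of 0 bytes, which passes the check, whereas this sketch has no failure path because the bytes are supplied by the caller.

```rust
/// Limits copied from the constants in the diff (binary megabytes).
pub const DASHSCOPE_MAX: usize = 6 * 1024 * 1024;
pub const XAI_MAX: usize = 50 * 1024 * 1024;
pub const OPENAI_MAX: usize = 100 * 1024 * 1024;

/// Hypothetical error record mirroring the fields of `RequestBodySizeExceeded`.
#[derive(Debug, PartialEq)]
pub struct TooLarge {
    pub estimated_bytes: usize,
    pub max_bytes: usize,
    pub provider: &'static str,
}

/// Reject a serialized body that is strictly larger than the provider limit.
pub fn check_body_size(
    body: &[u8],
    max_bytes: usize,
    provider: &'static str,
) -> Result<(), TooLarge> {
    let estimated_bytes = body.len();
    if estimated_bytes > max_bytes {
        Err(TooLarge {
            estimated_bytes,
            max_bytes,
            provider,
        })
    } else {
        // A body exactly at the limit is still accepted, as in the diff.
        Ok(())
    }
}
```

Because the real check serializes the full payload to measure it, the request is effectively built twice per send; that trade-off buys a clear local error in place of an opaque provider rejection.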
@@ -794,8 +853,10 @@ fn build_chat_completion_request(request: &MessageRequest, config: OpenAiCompatC
|
||||
"content": system,
|
||||
}));
|
||||
}
|
||||
// Strip routing prefix (e.g., "openai/gpt-4" → "gpt-4") for the wire.
|
||||
let wire_model = strip_routing_prefix(&request.model);
|
||||
for message in &request.messages {
|
||||
messages.extend(translate_message(message));
|
||||
messages.extend(translate_message(message, wire_model));
|
||||
}
|
||||
// Sanitize: drop any `role:"tool"` message that does not have a valid
|
||||
// paired `role:"assistant"` with a `tool_calls` entry carrying the same
|
||||
@@ -806,9 +867,6 @@ fn build_chat_completion_request(request: &MessageRequest, config: OpenAiCompatC
|
||||
// still proceed with the remaining history intact.
|
||||
messages = sanitize_tool_message_pairing(messages);
|
||||
|
||||
// Strip routing prefix (e.g., "openai/gpt-4" → "gpt-4") for the wire.
|
||||
let wire_model = strip_routing_prefix(&request.model);
|
||||
|
||||
// gpt-5* requires `max_completion_tokens`; older OpenAI models accept both.
|
||||
// We send the correct field based on the wire model name so gpt-5.x requests
|
||||
// don't fail with "unknown field max_tokens".
|
||||
@@ -868,7 +926,25 @@ fn build_chat_completion_request(request: &MessageRequest, config: OpenAiCompatC
|
||||
payload
|
||||
}
|
||||
|
||||
fn translate_message(message: &InputMessage) -> Vec<Value> {
|
||||
/// Returns true for models that do NOT support the `is_error` field in tool results.
|
||||
/// Kimi models (via Moonshot AI/DashScope) reject this field with 400 Bad Request.
|
||||
/// Public for benchmarking and testing purposes.
|
||||
#[must_use]
|
||||
pub fn model_rejects_is_error_field(model: &str) -> bool {
|
||||
let lowered = model.to_ascii_lowercase();
|
||||
// Strip any provider/ prefix for the check
|
||||
let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
|
||||
// kimi models (kimi-k2.5, kimi-k1.5, kimi-moonshot, etc.)
|
||||
canonical.starts_with("kimi")
|
||||
}
|
||||
|
||||
/// Translates an `InputMessage` into OpenAI-compatible message format.
|
||||
/// Public for benchmarking purposes.
|
||||
#[must_use]
|
||||
pub fn translate_message(message: &InputMessage, model: &str) -> Vec<Value> {
|
||||
let supports_is_error = !model_rejects_is_error_field(model);
|
||||
match message.role.as_str() {
|
||||
"assistant" => {
|
||||
let mut text = String::new();
|
||||
@@ -914,12 +990,19 @@ fn translate_message(message: &InputMessage) -> Vec<Value> {
|
||||
tool_use_id,
|
||||
content,
|
||||
is_error,
|
||||
} => Some(json!({
|
||||
"role": "tool",
|
||||
"tool_call_id": tool_use_id,
|
||||
"content": flatten_tool_result_content(content),
|
||||
"is_error": is_error,
|
||||
})),
|
||||
} => {
|
||||
let mut msg = json!({
|
||||
"role": "tool",
|
||||
"tool_call_id": tool_use_id,
|
||||
"content": flatten_tool_result_content(content),
|
||||
});
|
||||
// Only include is_error for models that support it.
|
||||
// kimi models reject this field with 400 Bad Request.
|
||||
if supports_is_error {
|
||||
msg["is_error"] = json!(is_error);
|
||||
}
|
||||
Some(msg)
|
||||
}
|
||||
InputContentBlock::ToolUse { .. } => None,
|
||||
})
|
||||
.collect(),
|
||||
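The change above makes the `is_error` field conditional on the target model. The following std-only sketch illustrates the same branching without `serde_json`. `model_rejects_is_error_field` is copied from the diff; `tool_message_fields` and its representation of a message as key/value pairs are hypothetical simplifications.

```rust
/// Copied from the diff: true when the last path segment starts with "kimi".
pub fn model_rejects_is_error_field(model: &str) -> bool {
    let lowered = model.to_ascii_lowercase();
    let canonical = lowered.rsplit('/').next().unwrap_or(lowered.as_str());
    canonical.starts_with("kimi")
}

/// Hypothetical helper: build the fields of a `role: "tool"` message,
/// adding `is_error` only for models that accept it.
pub fn tool_message_fields(
    model: &str,
    tool_call_id: &str,
    content: &str,
    is_error: bool,
) -> Vec<(&'static str, String)> {
    let mut fields = vec![
        ("role", "tool".to_string()),
        ("tool_call_id", tool_call_id.to_string()),
        ("content", content.to_string()),
    ];
    if !model_rejects_is_error_field(model) {
        fields.push(("is_error", is_error.to_string()));
    }
    fields
}
```

One consequence of dropping the field is that a Kimi model only learns a tool call failed if the failure is also described in `content`.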
@@ -938,7 +1021,10 @@ fn translate_message(message: &InputMessage) -> Vec<Value> {
|
||||
/// `tool_calls` array containing an entry whose `id` matches the tool
|
||||
/// message's `tool_call_id`, the pair is valid and both are kept. Otherwise
|
||||
/// the tool message is dropped.
|
||||
fn sanitize_tool_message_pairing(messages: Vec<Value>) -> Vec<Value> {
|
||||
/// Remove `role:"tool"` messages from `messages` that have no valid paired
|
||||
/// `role:"assistant"` message with a matching `tool_calls[].id` immediately
|
||||
/// preceding them. Public for benchmarking purposes.
|
||||
pub fn sanitize_tool_message_pairing(messages: Vec<Value>) -> Vec<Value> {
|
||||
// Collect indices of tool messages that are orphaned.
|
||||
let mut drop_indices = std::collections::HashSet::new();
|
||||
for (i, msg) in messages.iter().enumerate() {
|
||||
@@ -994,15 +1080,36 @@ fn sanitize_tool_message_pairing(messages: Vec<Value>) -> Vec<Value> {
|
||||
.collect()
|
||||
}
|
||||
|
||||
fn flatten_tool_result_content(content: &[ToolResultContentBlock]) -> String {
|
||||
content
|
||||
/// Flattens tool result content blocks into a single string.
|
||||
/// Optimized to pre-allocate capacity and avoid intermediate `Vec` construction.
|
||||
#[must_use]
|
||||
pub fn flatten_tool_result_content(content: &[ToolResultContentBlock]) -> String {
|
||||
// Pre-calculate total capacity needed to avoid reallocations
|
||||
let total_len: usize = content
|
||||
.iter()
|
||||
.map(|block| match block {
|
||||
ToolResultContentBlock::Text { text } => text.clone(),
|
||||
ToolResultContentBlock::Json { value } => value.to_string(),
|
||||
ToolResultContentBlock::Text { text } => text.len(),
|
||||
ToolResultContentBlock::Json { value } => value.to_string().len(),
|
||||
})
|
||||
.collect::<Vec<_>>()
|
||||
.join("\n")
|
||||
.sum();
|
||||
|
||||
// Add capacity for newlines between blocks
|
||||
let capacity = total_len + content.len().saturating_sub(1);
|
||||
|
||||
let mut result = String::with_capacity(capacity);
|
||||
for (i, block) in content.iter().enumerate() {
|
||||
if i > 0 {
|
||||
result.push('\n');
|
||||
}
|
||||
match block {
|
||||
ToolResultContentBlock::Text { text } => result.push_str(text),
|
||||
ToolResultContentBlock::Json { value } => {
|
||||
// Serialize the JSON value and append it; `to_string()` still allocates a temporary String
|
||||
result.push_str(&value.to_string());
|
||||
}
|
||||
}
|
||||
}
|
||||
result
|
||||
}
|
||||
|
||||
/// Recursively ensure every object-type node in a JSON Schema has
|
||||
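The rewritten `flatten_tool_result_content` pre-sizes the output buffer but still serializes each JSON block through `to_string()`. A variant that formats directly into the buffer with `write!` might look like the sketch below. It is std-only: `Pair` is a hypothetical `Display` type standing in for `serde_json::Value`, and `Block` and `flatten` are hypothetical names. Only text lengths are pre-counted here, which is a deliberate simplification, not the diff's behavior.

```rust
use std::fmt::{self, Write};

/// Hypothetical stand-in for a JSON value: anything that implements `Display`.
pub struct Pair(pub &'static str, pub i64);

impl fmt::Display for Pair {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Renders as a one-key JSON object with a numeric value.
        write!(f, "{{\"{}\":{}}}", self.0, self.1)
    }
}

pub enum Block {
    Text(String),
    Json(Pair),
}

/// Join blocks with '\n', writing structured blocks straight into the buffer.
pub fn flatten(blocks: &[Block]) -> String {
    // Text lengths are known up front; structured blocks grow the buffer on demand.
    let text_len: usize = blocks
        .iter()
        .map(|b| match b {
            Block::Text(t) => t.len(),
            Block::Json(_) => 0,
        })
        .sum();
    let mut out = String::with_capacity(text_len + blocks.len().saturating_sub(1));
    for (i, block) in blocks.iter().enumerate() {
        if i > 0 {
            out.push('\n');
        }
        match block {
            Block::Text(t) => out.push_str(t),
            // `write!` formats into `out` without building a temporary String.
            Block::Json(v) => write!(out, "{v}").expect("writing to a String cannot fail"),
        }
    }
    out
}
```

Whether this beats the two-pass version depends on the mix of text and JSON blocks, so the `flatten_tool_result` benchmark group is the right place to compare them.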
@@ -1186,6 +1293,7 @@ fn parse_sse_frame(
|
||||
request_id: None,
|
||||
body: payload.clone(),
|
||||
retryable: false,
|
||||
suggested_action: suggested_action_for_status(status),
|
||||
});
|
||||
}
|
||||
}
|
||||
@@ -1243,6 +1351,8 @@ async fn expect_success(response: reqwest::Response) -> Result<reqwest::Response
|
||||
let parsed_error = serde_json::from_str::<ErrorEnvelope>(&body).ok();
|
||||
let retryable = is_retryable_status(status);
|
||||
|
||||
let suggested_action = suggested_action_for_status(status);
|
||||
|
||||
Err(ApiError::Api {
|
||||
status,
|
||||
error_type: parsed_error
|
||||
@@ -1254,6 +1364,7 @@ async fn expect_success(response: reqwest::Response) -> Result<reqwest::Response
|
||||
request_id,
|
||||
body,
|
||||
retryable,
|
||||
suggested_action,
|
||||
})
|
||||
}
|
||||
|
||||
@@ -1261,6 +1372,20 @@ const fn is_retryable_status(status: reqwest::StatusCode) -> bool {
|
||||
matches!(status.as_u16(), 408 | 409 | 429 | 500 | 502 | 503 | 504)
|
||||
}
|
||||
|
||||
/// Generate a suggested user action based on the HTTP status code.
|
||||
/// This provides actionable guidance when API requests fail.
|
||||
fn suggested_action_for_status(status: reqwest::StatusCode) -> Option<String> {
|
||||
match status.as_u16() {
|
||||
401 => Some("Check API key is set correctly and has not expired".to_string()),
|
||||
403 => Some("Verify API key has required permissions for this operation".to_string()),
|
||||
413 => Some("Reduce prompt size or context window before retrying".to_string()),
|
||||
429 => Some("Wait a moment before retrying; consider reducing request rate".to_string()),
|
||||
500 => Some("Provider server error - retry after a brief wait".to_string()),
|
||||
502..=504 => Some("Provider gateway error - retry after a brief wait".to_string()),
|
||||
_ => None,
|
||||
}
|
||||
}
|
||||
|
||||
fn normalize_finish_reason(value: &str) -> String {
|
||||
match value {
|
||||
"stop" => "end_turn",
|
||||
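The retryable-status check and the new suggested-action mapping both key off the numeric status code. A std-only sketch that pairs them is given here, using plain `u16` codes in place of `reqwest::StatusCode` and `&'static str` in place of `String`; both substitutions are simplifications for illustration. The status sets and hint texts are copied from the diff, and `describe` is a hypothetical helper.

```rust
/// Same status set as `is_retryable_status` in the diff.
pub const fn is_retryable_status(status: u16) -> bool {
    matches!(status, 408 | 409 | 429 | 500 | 502 | 503 | 504)
}

/// Same mapping as the diff, returning static strings to avoid allocation.
pub fn suggested_action_for_status(status: u16) -> Option<&'static str> {
    match status {
        401 => Some("Check API key is set correctly and has not expired"),
        403 => Some("Verify API key has required permissions for this operation"),
        413 => Some("Reduce prompt size or context window before retrying"),
        429 => Some("Wait a moment before retrying; consider reducing request rate"),
        500 => Some("Provider server error - retry after a brief wait"),
        502..=504 => Some("Provider gateway error - retry after a brief wait"),
        _ => None,
    }
}

/// Hypothetical helper: one-line summary combining both classifications.
pub fn describe(status: u16) -> String {
    let retry = if is_retryable_status(status) {
        "retryable"
    } else {
        "not retryable"
    };
    match suggested_action_for_status(status) {
        Some(action) => format!("HTTP {status} ({retry}): {action}"),
        None => format!("HTTP {status} ({retry})"),
    }
}
```

The two tables are intentionally independent: 413 carries a hint but is not retryable, while 408 and 409 are retryable but carry no hint.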
@@ -1794,4 +1919,292 @@ mod tests {
|
||||
"gpt-4o must not emit max_completion_tokens"
|
||||
);
|
||||
}
|
||||
|
||||
// ============================================================================
|
||||
// US-009: kimi model compatibility tests
|
||||
// ============================================================================
|
||||
|
||||
#[test]
|
||||
fn model_rejects_is_error_field_detects_kimi_models() {
|
||||
// kimi models (various formats) should be detected
|
||||
assert!(super::model_rejects_is_error_field("kimi-k2.5"));
|
||||
assert!(super::model_rejects_is_error_field("kimi-k1.5"));
|
||||
assert!(super::model_rejects_is_error_field("kimi-moonshot"));
|
||||
assert!(super::model_rejects_is_error_field("KIMI-K2.5")); // case insensitive
|
||||
assert!(super::model_rejects_is_error_field("dashscope/kimi-k2.5")); // with prefix
|
||||
assert!(super::model_rejects_is_error_field("moonshot/kimi-k2.5")); // different prefix
|
||||
|
||||
// Non-kimi models should NOT be detected
|
||||
assert!(!super::model_rejects_is_error_field("gpt-4o"));
|
||||
assert!(!super::model_rejects_is_error_field("gpt-4"));
|
||||
assert!(!super::model_rejects_is_error_field("claude-sonnet-4-6"));
|
||||
assert!(!super::model_rejects_is_error_field("grok-3"));
|
||||
assert!(!super::model_rejects_is_error_field("grok-3-mini"));
|
||||
assert!(!super::model_rejects_is_error_field("xai/grok-3"));
|
||||
assert!(!super::model_rejects_is_error_field("qwen/qwen-plus"));
|
||||
assert!(!super::model_rejects_is_error_field("o1-mini"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn translate_message_includes_is_error_for_non_kimi_models() {
|
||||
use crate::types::{InputContentBlock, InputMessage, ToolResultContentBlock};
|
||||
|
||||
// Test with gpt-4o (should include is_error)
|
||||
let message = InputMessage {
|
||||
role: "user".to_string(),
|
||||
content: vec![InputContentBlock::ToolResult {
|
||||
tool_use_id: "call_1".to_string(),
|
||||
content: vec![ToolResultContentBlock::Text {
|
||||
text: "Error occurred".to_string(),
|
||||
}],
|
||||
is_error: true,
|
||||
}],
|
||||
};
|
||||
|
||||
let translated = super::translate_message(&message, "gpt-4o");
|
||||
assert_eq!(translated.len(), 1);
|
||||
let tool_msg = &translated[0];
|
||||
assert_eq!(tool_msg["role"], json!("tool"));
|
||||
assert_eq!(tool_msg["tool_call_id"], json!("call_1"));
|
||||
assert_eq!(tool_msg["content"], json!("Error occurred"));
|
||||
assert!(
|
||||
tool_msg.get("is_error").is_some(),
|
||||
"gpt-4o should include is_error field"
|
||||
);
|
||||
assert_eq!(tool_msg["is_error"], json!(true));
|
||||
|
||||
// Test with grok-3 (should include is_error)
|
||||
let message2 = InputMessage {
|
||||
role: "user".to_string(),
|
||||
content: vec![InputContentBlock::ToolResult {
|
||||
tool_use_id: "call_2".to_string(),
|
||||
content: vec![ToolResultContentBlock::Text {
|
||||
text: "Success".to_string(),
|
||||
}],
|
||||
is_error: false,
|
||||
}],
|
||||
};
|
||||
|
||||
let translated2 = super::translate_message(&message2, "grok-3");
|
||||
assert!(
|
||||
translated2[0].get("is_error").is_some(),
|
||||
"grok-3 should include is_error field"
|
||||
);
|
||||
assert_eq!(translated2[0]["is_error"], json!(false));
|
||||
|
||||
// Test with claude model (should include is_error)
|
||||
let translated3 = super::translate_message(&message, "claude-sonnet-4-6");
|
||||
assert!(
|
||||
translated3[0].get("is_error").is_some(),
|
||||
"claude should include is_error field"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn translate_message_excludes_is_error_for_kimi_models() {
|
||||
use crate::types::{InputContentBlock, InputMessage, ToolResultContentBlock};
|
||||
|
||||
// Test with kimi-k2.5 (should EXCLUDE is_error)
|
||||
let message = InputMessage {
|
||||
role: "user".to_string(),
|
||||
content: vec![InputContentBlock::ToolResult {
|
||||
tool_use_id: "call_1".to_string(),
|
||||
content: vec![ToolResultContentBlock::Text {
|
||||
text: "Error occurred".to_string(),
|
||||
}],
|
||||
is_error: true,
|
||||
}],
|
||||
};
|
||||
|
||||
let translated = super::translate_message(&message, "kimi-k2.5");
|
||||
assert_eq!(translated.len(), 1);
|
||||
let tool_msg = &translated[0];
|
||||
assert_eq!(tool_msg["role"], json!("tool"));
|
||||
assert_eq!(tool_msg["tool_call_id"], json!("call_1"));
|
||||
assert_eq!(tool_msg["content"], json!("Error occurred"));
|
||||
assert!(
|
||||
tool_msg.get("is_error").is_none(),
|
||||
"kimi-k2.5 must NOT include is_error field (would cause 400 Bad Request)"
|
||||
);
|
||||
|
||||
// Test with kimi-k1.5
|
||||
let translated2 = super::translate_message(&message, "kimi-k1.5");
|
||||
assert!(
|
||||
translated2[0].get("is_error").is_none(),
|
||||
"kimi-k1.5 must NOT include is_error field"
|
||||
);
|
||||
|
||||
// Test with dashscope/kimi-k2.5 (with provider prefix)
|
||||
let translated3 = super::translate_message(&message, "dashscope/kimi-k2.5");
|
||||
assert!(
|
||||
translated3[0].get("is_error").is_none(),
|
||||
"dashscope/kimi-k2.5 must NOT include is_error field"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn build_chat_completion_request_kimi_vs_non_kimi_tool_results() {
|
||||
use crate::types::{InputContentBlock, InputMessage, ToolResultContentBlock};
|
||||
|
||||
// Helper to create a request with a tool result
|
||||
let make_request = |model: &str| MessageRequest {
|
||||
model: model.to_string(),
|
||||
max_tokens: 100,
|
||||
messages: vec![
|
||||
InputMessage {
|
||||
role: "assistant".to_string(),
|
||||
content: vec![InputContentBlock::ToolUse {
|
||||
id: "call_1".to_string(),
|
||||
name: "read_file".to_string(),
|
||||
input: serde_json::json!({"path": "/tmp/test"}),
|
||||
}],
|
||||
},
|
||||
InputMessage {
|
||||
role: "user".to_string(),
|
||||
content: vec![InputContentBlock::ToolResult {
|
||||
tool_use_id: "call_1".to_string(),
|
||||
content: vec![ToolResultContentBlock::Text {
|
||||
text: "file contents".to_string(),
|
||||
}],
|
||||
is_error: false,
|
||||
}],
|
||||
},
|
||||
],
|
||||
stream: false,
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
// Non-kimi model: should have is_error field
|
||||
let request_gpt = make_request("gpt-4o");
|
||||
let payload_gpt = build_chat_completion_request(&request_gpt, OpenAiCompatConfig::openai());
|
||||
let messages_gpt = payload_gpt["messages"].as_array().unwrap();
|
||||
let tool_msg_gpt = messages_gpt.iter().find(|m| m["role"] == "tool").unwrap();
|
||||
assert!(
|
||||
tool_msg_gpt.get("is_error").is_some(),
|
||||
"gpt-4o request should include is_error in tool result"
|
||||
);
|
||||
|
||||
// kimi model: should NOT have is_error field
|
||||
let request_kimi = make_request("kimi-k2.5");
|
||||
let payload_kimi =
|
||||
build_chat_completion_request(&request_kimi, OpenAiCompatConfig::dashscope());
|
||||
let messages_kimi = payload_kimi["messages"].as_array().unwrap();
|
||||
let tool_msg_kimi = messages_kimi.iter().find(|m| m["role"] == "tool").unwrap();
|
||||
assert!(
|
||||
tool_msg_kimi.get("is_error").is_none(),
|
||||
"kimi-k2.5 request must NOT include is_error in tool result (would cause 400)"
|
||||
);
|
||||
|
||||
// Verify both have the essential fields
|
||||
assert_eq!(tool_msg_gpt["tool_call_id"], json!("call_1"));
|
||||
assert_eq!(tool_msg_kimi["tool_call_id"], json!("call_1"));
|
||||
assert_eq!(tool_msg_gpt["content"], json!("file contents"));
|
||||
assert_eq!(tool_msg_kimi["content"], json!("file contents"));
|
||||
}
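The two tests above pin down one behaviour: tool-result messages sent to kimi models omit `is_error`, while other models keep it. A minimal sketch of such a gate follows. The helpers `is_kimi_model` and `tool_message_fields` are hypothetical names that do not appear in this diff, and the real `translate_message` builds JSON values rather than string pairs.

```rust
/// Hypothetical helper: true when the model name, ignoring any
/// `provider/` routing prefix, starts with "kimi".
fn is_kimi_model(model: &str) -> bool {
    let bare = model.rsplit('/').next().unwrap_or(model);
    bare.to_ascii_lowercase().starts_with("kimi")
}

/// Hypothetical helper: the fields of a `role: "tool"` message as
/// (key, value) pairs. `is_error` is only emitted for non-kimi models.
fn tool_message_fields(
    model: &str,
    tool_call_id: &str,
    content: &str,
    is_error: bool,
) -> Vec<(&'static str, String)> {
    let mut fields = vec![
        ("role", "tool".to_string()),
        ("tool_call_id", tool_call_id.to_string()),
        ("content", content.to_string()),
    ];
    if !is_kimi_model(model) {
        fields.push(("is_error", is_error.to_string()));
    }
    fields
}
```

Under these assumptions, `tool_message_fields("dashscope/kimi-k2.5", ...)` yields only `role`, `tool_call_id`, and `content`, which matches what the assertions above expect of the translated message.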
|
||||
|
||||
// ============================================================================
|
||||
// US-021: Request body size pre-flight check tests
|
||||
// ============================================================================
|
||||
|
||||
#[test]
|
||||
fn estimate_request_body_size_returns_reasonable_estimate() {
|
||||
let request = MessageRequest {
|
||||
model: "gpt-4o".to_string(),
|
||||
max_tokens: 100,
|
||||
messages: vec![InputMessage::user_text("Hello world".to_string())],
|
||||
stream: false,
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
let size = super::estimate_request_body_size(&request, OpenAiCompatConfig::openai());
|
||||
// Should be non-zero and reasonable for a small request
|
||||
assert!(size > 0, "estimated size should be positive");
|
||||
assert!(size < 10_000, "small request should be under 10KB");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn check_request_body_size_passes_for_small_requests() {
|
||||
let request = MessageRequest {
|
||||
model: "gpt-4o".to_string(),
|
||||
max_tokens: 100,
|
||||
messages: vec![InputMessage::user_text("Hello".to_string())],
|
||||
stream: false,
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
// Should pass for all providers with a small request
|
||||
assert!(super::check_request_body_size(&request, OpenAiCompatConfig::openai()).is_ok());
|
||||
assert!(super::check_request_body_size(&request, OpenAiCompatConfig::xai()).is_ok());
|
||||
assert!(super::check_request_body_size(&request, OpenAiCompatConfig::dashscope()).is_ok());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn check_request_body_size_fails_for_dashscope_when_exceeds_6mb() {
|
||||
// Create a request that exceeds DashScope's 6MB limit
|
||||
let large_content = "x".repeat(7_000_000); // 7MB of content
|
||||
let request = MessageRequest {
|
||||
model: "qwen-plus".to_string(),
|
||||
max_tokens: 100,
|
||||
messages: vec![InputMessage::user_text(large_content)],
|
||||
stream: false,
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
let result = super::check_request_body_size(&request, OpenAiCompatConfig::dashscope());
|
||||
assert!(result.is_err(), "should fail for 7MB request to DashScope");
|
||||
|
||||
let err = result.unwrap_err();
|
||||
match err {
|
||||
crate::error::ApiError::RequestBodySizeExceeded {
|
||||
estimated_bytes,
|
||||
max_bytes,
|
||||
provider,
|
||||
} => {
|
||||
assert_eq!(provider, "DashScope");
|
||||
assert_eq!(max_bytes, 6_291_456); // 6MB limit
|
||||
assert!(estimated_bytes > max_bytes);
|
||||
}
|
||||
_ => panic!("expected RequestBodySizeExceeded error, got {err:?}"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn check_request_body_size_allows_large_requests_for_openai() {
|
||||
// Create a request that exceeds DashScope's limit but is under OpenAI's 100MB limit
|
||||
let large_content = "x".repeat(10_000_000); // 10MB of content
|
||||
let request = MessageRequest {
|
||||
model: "gpt-4o".to_string(),
|
||||
max_tokens: 100,
|
||||
messages: vec![InputMessage::user_text(large_content)],
|
||||
stream: false,
|
||||
..Default::default()
|
||||
};
|
||||
|
||||
// Should pass for OpenAI (100MB limit)
|
||||
assert!(
|
||||
super::check_request_body_size(&request, OpenAiCompatConfig::openai()).is_ok(),
|
||||
"10MB request should pass for OpenAI's 100MB limit"
|
||||
);
|
||||
|
||||
// Should fail for DashScope (6MB limit)
|
||||
assert!(
|
||||
super::check_request_body_size(&request, OpenAiCompatConfig::dashscope()).is_err(),
|
||||
"10MB request should fail for DashScope's 6MB limit"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn provider_specific_size_limits_are_correct() {
|
||||
assert_eq!(OpenAiCompatConfig::dashscope().max_request_body_bytes, 6_291_456); // 6MB
|
||||
assert_eq!(OpenAiCompatConfig::openai().max_request_body_bytes, 104_857_600); // 100MB
|
||||
assert_eq!(OpenAiCompatConfig::xai().max_request_body_bytes, 52_428_800); // 50MB
|
||||
}
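The size-limit tests above imply a pre-flight check that compares an estimated body size against a per-provider ceiling. The sketch below shows one possible shape. `BodySizeExceeded`, `ProviderLimit`, `check_body_size`, and the `"OpenAI"` label are hypothetical; only the byte limits and the `"DashScope"` provider string come from the tests.

```rust
/// Hypothetical error type mirroring the fields the tests destructure.
#[derive(Debug, PartialEq, Eq)]
struct BodySizeExceeded {
    estimated_bytes: usize,
    max_bytes: usize,
    provider: &'static str,
}

/// Hypothetical per-provider limit, using the byte values asserted above.
struct ProviderLimit {
    provider: &'static str,
    max_request_body_bytes: usize,
}

const DASHSCOPE: ProviderLimit = ProviderLimit {
    provider: "DashScope",
    max_request_body_bytes: 6 * 1024 * 1024, // 6_291_456
};
const OPENAI: ProviderLimit = ProviderLimit {
    provider: "OpenAI", // label is an assumption
    max_request_body_bytes: 100 * 1024 * 1024, // 104_857_600
};

/// Reject a serialized body before sending if it exceeds the provider limit.
fn check_body_size(body: &[u8], limit: &ProviderLimit) -> Result<(), BodySizeExceeded> {
    let estimated_bytes = body.len();
    if estimated_bytes > limit.max_request_body_bytes {
        return Err(BodySizeExceeded {
            estimated_bytes,
            max_bytes: limit.max_request_body_bytes,
            provider: limit.provider,
        });
    }
    Ok(())
}
```

Failing locally with a structured error lets the caller report which provider limit was hit instead of surfacing an opaque rejection from the remote API.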
|
||||
|
||||
#[test]
|
||||
fn strip_routing_prefix_strips_kimi_provider_prefix() {
|
||||
// US-023: kimi prefix should be stripped for wire format
|
||||
assert_eq!(super::strip_routing_prefix("kimi/kimi-k2.5"), "kimi-k2.5");
|
||||
assert_eq!(super::strip_routing_prefix("kimi-k2.5"), "kimi-k2.5"); // no prefix, unchanged
|
||||
assert_eq!(super::strip_routing_prefix("kimi/kimi-k1.5"), "kimi-k1.5");
|
||||
}
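For illustration, the prefix stripping exercised above could look roughly like this. `strip_routing_prefix_sketch` and `ROUTING_PREFIXES` are hypothetical names, and only the `kimi/` prefix is confirmed by this test; the real function may recognise other prefixes.

```rust
/// Hypothetical list of routing prefixes; only `kimi/` is confirmed by the test above.
const ROUTING_PREFIXES: &[&str] = &["kimi/"];

/// Return the model name without a known routing prefix,
/// or the input unchanged when no known prefix is present.
fn strip_routing_prefix_sketch(model: &str) -> &str {
    ROUTING_PREFIXES
        .iter()
        .find_map(|prefix| model.strip_prefix(*prefix))
        .unwrap_or(model)
}
```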
|
||||
}
|
||||
|
||||
@@ -8,6 +8,7 @@ use tokio::process::Command as TokioCommand;
|
||||
use tokio::runtime::Builder;
|
||||
use tokio::time::timeout;
|
||||
|
||||
use crate::lane_events::{LaneEvent, ShipMergeMethod, ShipProvenance};
|
||||
use crate::sandbox::{
|
||||
build_linux_sandbox_command, resolve_sandbox_status_for_request, FilesystemIsolationMode,
|
||||
SandboxConfig, SandboxStatus,
|
||||
@@ -102,11 +103,76 @@ pub fn execute_bash(input: BashCommandInput) -> io::Result<BashCommandOutput> {
|
||||
runtime.block_on(execute_bash_async(input, sandbox_status, cwd))
|
||||
}
|
||||
|
||||
/// Detect `git push` commands that mention main/master and emit a ship provenance event.
fn detect_and_emit_ship_prepared(command: &str) {
    let trimmed = command.trim();
    // Simple detection: substring match on "git push" plus "main" or "master"
    if trimmed.contains("git push") && (trimmed.contains("main") || trimmed.contains("master")) {
        // Emit ship.prepared event
        let now = std::time::SystemTime::now()
            .duration_since(std::time::UNIX_EPOCH)
            .unwrap_or_default()
            .as_millis();
        let provenance = ShipProvenance {
            source_branch: get_current_branch().unwrap_or_else(|| "unknown".to_string()),
            base_commit: get_head_commit().unwrap_or_default(),
            commit_count: 0, // Would need to calculate from range
            commit_range: "unknown..HEAD".to_string(),
            merge_method: ShipMergeMethod::DirectPush,
            actor: get_git_actor().unwrap_or_else(|| "unknown".to_string()),
            pr_number: None,
        };
        let _event = LaneEvent::ship_prepared(format!("{}", now), &provenance);
        // Log to stderr as interim routing before event stream integration
        eprintln!(
            "[ship.prepared] branch={} -> main, commits={}, actor={}",
            provenance.source_branch, provenance.commit_count, provenance.actor
        );
    }
}
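The substring test in `detect_and_emit_ship_prepared` also fires for commands such as `git push origin maintenance-fix`, because "main" occurs inside the branch name. A stricter, token-based check is sketched below. `is_push_to_main` is a hypothetical helper and not part of this diff; it still ignores shell quoting and chained commands, so treat it as an approximation.

```rust
/// Hypothetical stricter check: true only when a whitespace-separated
/// `git push` invocation is followed by an argument that is exactly
/// `main` or `master`, or a `src:dst` refspec whose destination is one of them.
fn is_push_to_main(command: &str) -> bool {
    let tokens: Vec<&str> = command.split_whitespace().collect();
    tokens.windows(2).enumerate().any(|(i, pair)| {
        pair[0] == "git"
            && pair[1] == "push"
            // Everything after `git push` counts as a candidate argument.
            && tokens[i + 2..].iter().any(|arg| {
                // For `HEAD:master` the destination is the part after the last ':'.
                matches!(arg.rsplit(':').next(), Some("main" | "master"))
            })
    })
}
```

Matching whole tokens trades some recall (for example, a push wrapped in `bash -c "..."` is missed) for fewer false `ship.prepared` events.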
|
||||
|
||||
fn get_current_branch() -> Option<String> {
|
||||
let output = Command::new("git")
|
||||
.args(["branch", "--show-current"])
|
||||
.output()
|
||||
.ok()?;
|
||||
if output.status.success() {
|
||||
Some(String::from_utf8_lossy(&output.stdout).trim().to_string())
|
||||
} else {
|
||||
None
|
||||
}
|
||||
}
|
||||
|
||||
fn get_head_commit() -> Option<String> {
|
||||
let output = Command::new("git")
|
||||
.args(["rev-parse", "--short", "HEAD"])
|
||||
.output()
|
||||
.ok()?;
|
||||
if output.status.success() {
|
||||
Some(String::from_utf8_lossy(&output.stdout).trim().to_string())
|
||||
} else {
|
||||
None
|
||||
}
|
||||
}
|
||||
|
||||
fn get_git_actor() -> Option<String> {
|
||||
let name = Command::new("git")
|
||||
.args(["config", "user.name"])
|
||||
.output()
|
||||
.ok()
|
||||
.filter(|o| o.status.success())
|
||||
.map(|o| String::from_utf8_lossy(&o.stdout).trim().to_string())?;
|
||||
Some(name)
|
||||
}
|
||||
|
||||
async fn execute_bash_async(
|
||||
input: BashCommandInput,
|
||||
sandbox_status: SandboxStatus,
|
||||
cwd: std::path::PathBuf,
|
||||
) -> io::Result<BashCommandOutput> {
|
||||
// Detect and emit ship provenance for git push operations
|
||||
detect_and_emit_ship_prepared(&input.command);
|
||||
|
||||
let mut command = prepare_tokio_command(&input.command, &cwd, &sandbox_status, true);
|
||||
|
||||
let output_result = if let Some(timeout_ms) = input.timeout {
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
-#![allow(clippy::similar_names)]
+#![allow(clippy::similar_names, clippy::cast_possible_truncation)]
|
||||
use serde::{Deserialize, Serialize};
|
||||
use serde_json::Value;
|
||||
|
||||
@@ -38,6 +38,15 @@ pub enum LaneEventName {
|
||||
BranchStaleAgainstMain,
|
||||
#[serde(rename = "branch.workspace_mismatch")]
|
||||
BranchWorkspaceMismatch,
|
||||
/// Ship/provenance events — §4.44.5
|
||||
#[serde(rename = "ship.prepared")]
|
||||
ShipPrepared,
|
||||
#[serde(rename = "ship.commits_selected")]
|
||||
ShipCommitsSelected,
|
||||
#[serde(rename = "ship.merged")]
|
||||
ShipMerged,
|
||||
#[serde(rename = "ship.pushed_main")]
|
||||
ShipPushedMain,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
|
||||
@@ -73,11 +82,354 @@ pub enum LaneFailureClass {
|
||||
Infra,
|
||||
}
|
||||
|
||||
/// Provenance labels for event source classification.
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
|
||||
#[serde(rename_all = "snake_case")]
|
||||
pub enum EventProvenance {
|
||||
/// Event from a live, active lane
|
||||
LiveLane,
|
||||
/// Event from a synthetic test
|
||||
Test,
|
||||
/// Event from a healthcheck probe
|
||||
Healthcheck,
|
||||
/// Event from a replay (e.g., a log replay)
|
||||
Replay,
|
||||
/// Event from the transport layer itself
|
||||
Transport,
|
||||
}
|
||||
|
||||
/// Session identity metadata captured at creation time.
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct SessionIdentity {
|
||||
/// Stable title for the session
|
||||
pub title: String,
|
||||
/// Workspace/worktree path
|
||||
pub workspace: String,
|
||||
/// Lane/session purpose
|
||||
pub purpose: String,
|
||||
/// Placeholder reason if any field is unknown
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub placeholder_reason: Option<String>,
|
||||
}
|
||||
|
||||
impl SessionIdentity {
|
||||
/// Create complete session identity
|
||||
#[must_use]
|
||||
pub fn new(
|
||||
title: impl Into<String>,
|
||||
workspace: impl Into<String>,
|
||||
purpose: impl Into<String>,
|
||||
) -> Self {
|
||||
Self {
|
||||
title: title.into(),
|
||||
workspace: workspace.into(),
|
||||
purpose: purpose.into(),
|
||||
placeholder_reason: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Create session identity with placeholder for missing fields
|
||||
#[must_use]
|
||||
pub fn with_placeholder(
|
||||
title: impl Into<String>,
|
||||
workspace: impl Into<String>,
|
||||
purpose: impl Into<String>,
|
||||
reason: impl Into<String>,
|
||||
) -> Self {
|
||||
Self {
|
||||
title: title.into(),
|
||||
workspace: workspace.into(),
|
||||
purpose: purpose.into(),
|
||||
placeholder_reason: Some(reason.into()),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Lane ownership and workflow scope binding.
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct LaneOwnership {
|
||||
/// Owner/assignee identity
|
||||
pub owner: String,
|
||||
/// Workflow scope (e.g., claw-code-dogfood, external-git-maintenance)
|
||||
pub workflow_scope: String,
|
||||
/// Whether the watcher is expected to act, observe, or ignore
|
||||
pub watcher_action: WatcherAction,
|
||||
}
|
||||
|
||||
/// Watcher action expectation for a lane event.
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
|
||||
#[serde(rename_all = "snake_case")]
|
||||
pub enum WatcherAction {
|
||||
/// Watcher should take action on this event
|
||||
Act,
|
||||
/// Watcher should only observe
|
||||
Observe,
|
||||
/// Watcher should ignore this event
|
||||
Ignore,
|
||||
}
|
||||
|
||||
/// Event metadata for ordering, provenance, deduplication, and ownership.
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct LaneEventMetadata {
|
||||
/// Monotonic sequence number for event ordering
|
||||
pub seq: u64,
|
||||
/// Event provenance source
|
||||
pub provenance: EventProvenance,
|
||||
/// Session identity at creation
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub session_identity: Option<SessionIdentity>,
|
||||
/// Lane ownership and scope
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub ownership: Option<LaneOwnership>,
|
||||
/// Nudge ID for deduplication cycles
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub nudge_id: Option<String>,
|
||||
/// Event fingerprint for terminal event deduplication
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub event_fingerprint: Option<String>,
|
||||
/// Timestamp when event was observed/created
|
||||
pub timestamp_ms: u64,
|
||||
}
|
||||
|
||||
impl LaneEventMetadata {
|
||||
/// Create new event metadata
|
||||
#[must_use]
|
||||
pub fn new(seq: u64, provenance: EventProvenance) -> Self {
|
||||
Self {
|
||||
seq,
|
||||
provenance,
|
||||
session_identity: None,
|
||||
ownership: None,
|
||||
nudge_id: None,
|
||||
event_fingerprint: None,
|
||||
timestamp_ms: std::time::SystemTime::now()
|
||||
.duration_since(std::time::UNIX_EPOCH)
|
||||
.unwrap_or_default()
|
||||
.as_millis() as u64,
|
||||
}
|
||||
}
|
||||
|
||||
/// Add session identity
|
||||
#[must_use]
|
||||
pub fn with_session_identity(mut self, identity: SessionIdentity) -> Self {
|
||||
self.session_identity = Some(identity);
|
||||
self
|
||||
}
|
||||
|
||||
/// Add ownership info
|
||||
#[must_use]
|
||||
pub fn with_ownership(mut self, ownership: LaneOwnership) -> Self {
|
||||
self.ownership = Some(ownership);
|
||||
self
|
||||
}
|
||||
|
||||
/// Add nudge ID for dedupe
|
||||
#[must_use]
|
||||
pub fn with_nudge_id(mut self, nudge_id: impl Into<String>) -> Self {
|
||||
self.nudge_id = Some(nudge_id.into());
|
||||
self
|
||||
}
|
||||
|
||||
/// Attach a precomputed event fingerprint (used to dedupe terminal events)
|
||||
#[must_use]
|
||||
pub fn with_fingerprint(mut self, fingerprint: impl Into<String>) -> Self {
|
||||
self.event_fingerprint = Some(fingerprint.into());
|
||||
self
|
||||
}
|
||||
}
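`LaneEventMetadata::new` takes `seq` from the caller, and the diff does not show how callers keep it monotonic. One common approach is a shared atomic counter, sketched here. `SeqAllocator` is a hypothetical type, not something this change introduces.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical allocator handing out strictly increasing `seq` values.
struct SeqAllocator {
    next: AtomicU64,
}

impl SeqAllocator {
    /// Start counting at zero.
    const fn new() -> Self {
        Self {
            next: AtomicU64::new(0),
        }
    }

    /// Return the current value and advance the counter by one.
    /// `fetch_add` is atomic, so concurrent callers never see the same value.
    fn next_seq(&self) -> u64 {
        self.next.fetch_add(1, Ordering::Relaxed)
    }
}
```

A caller could then write `LaneEventMetadata::new(alloc.next_seq(), EventProvenance::LiveLane)`, assuming a single allocator is shared per event stream.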
|
||||
|
||||
/// Builder for constructing [`LaneEvent`]s with proper metadata.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct LaneEventBuilder {
|
||||
event: LaneEventName,
|
||||
status: LaneEventStatus,
|
||||
emitted_at: String,
|
||||
session_id: Option<String>,
|
||||
metadata: LaneEventMetadata,
|
||||
detail: Option<String>,
|
||||
failure_class: Option<LaneFailureClass>,
|
||||
data: Option<serde_json::Value>,
|
||||
}
|
||||
|
||||
impl LaneEventBuilder {
|
||||
/// Start building a new lane event
|
||||
#[must_use]
|
||||
pub fn new(
|
||||
event: LaneEventName,
|
||||
status: LaneEventStatus,
|
||||
emitted_at: impl Into<String>,
|
||||
seq: u64,
|
||||
provenance: EventProvenance,
|
||||
) -> Self {
|
||||
Self {
|
||||
event,
|
||||
status,
|
||||
emitted_at: emitted_at.into(),
|
||||
session_id: None,
|
||||
metadata: LaneEventMetadata::new(seq, provenance),
|
||||
detail: None,
|
||||
failure_class: None,
|
||||
data: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Add session identity
|
||||
#[must_use]
|
||||
pub fn with_session_identity(mut self, identity: SessionIdentity) -> Self {
|
||||
self.metadata = self.metadata.with_session_identity(identity);
|
||||
self
|
||||
}
|
||||
|
||||
/// Add boot-scoped session correlation id
|
||||
#[must_use]
|
||||
pub fn with_session_id(mut self, session_id: impl Into<String>) -> Self {
|
||||
self.session_id = Some(session_id.into());
|
||||
self
|
||||
}
|
||||
|
||||
/// Add ownership info
|
||||
#[must_use]
|
||||
pub fn with_ownership(mut self, ownership: LaneOwnership) -> Self {
|
||||
self.metadata = self.metadata.with_ownership(ownership);
|
||||
self
|
||||
}
|
||||
|
||||
/// Add nudge ID
|
||||
#[must_use]
|
||||
pub fn with_nudge_id(mut self, nudge_id: impl Into<String>) -> Self {
|
||||
self.metadata = self.metadata.with_nudge_id(nudge_id);
|
||||
self
|
||||
}
|
||||
|
||||
/// Add detail
|
||||
#[must_use]
|
||||
pub fn with_detail(mut self, detail: impl Into<String>) -> Self {
|
||||
self.detail = Some(detail.into());
|
||||
self
|
||||
}
|
||||
|
||||
/// Add failure class
|
||||
#[must_use]
|
||||
pub fn with_failure_class(mut self, failure_class: LaneFailureClass) -> Self {
|
||||
self.failure_class = Some(failure_class);
|
||||
self
|
||||
}
|
||||
|
||||
/// Add data payload
|
||||
#[must_use]
|
||||
pub fn with_data(mut self, data: serde_json::Value) -> Self {
|
||||
self.data = Some(data);
|
||||
self
|
||||
}
|
||||
|
||||
/// Compute fingerprint and build terminal event
|
||||
#[must_use]
|
||||
pub fn build_terminal(mut self) -> LaneEvent {
|
||||
let fingerprint = compute_event_fingerprint(&self.event, &self.status, self.data.as_ref());
|
||||
self.metadata = self.metadata.with_fingerprint(fingerprint);
|
||||
self.build()
|
||||
}
|
||||
|
||||
/// Build the event
|
||||
#[must_use]
|
||||
pub fn build(self) -> LaneEvent {
|
||||
LaneEvent {
|
||||
event: self.event,
|
||||
status: self.status,
|
||||
emitted_at: self.emitted_at,
|
||||
session_id: self.session_id,
|
||||
failure_class: self.failure_class,
|
||||
detail: self.detail,
|
||||
data: self.data,
|
||||
metadata: self.metadata,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Check if an event kind is terminal (finished, failed, superseded, closed, merged).
|
||||
#[must_use]
|
||||
pub fn is_terminal_event(event: LaneEventName) -> bool {
|
||||
matches!(
|
||||
event,
|
||||
LaneEventName::Finished
|
||||
| LaneEventName::Failed
|
||||
| LaneEventName::Superseded
|
||||
| LaneEventName::Closed
|
||||
| LaneEventName::Merged
|
||||
)
|
||||
}
|
||||
|
||||
/// Compute a fingerprint for terminal event deduplication.
#[must_use]
pub fn compute_event_fingerprint(
    event: &LaneEventName,
    status: &LaneEventStatus,
    data: Option<&serde_json::Value>,
) -> String {
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    let mut hasher = DefaultHasher::new();
    format!("{event:?}").hash(&mut hasher);
    format!("{status:?}").hash(&mut hasher);
    if let Some(d) = data {
        serde_json::to_string(d)
            .unwrap_or_default()
            .hash(&mut hasher);
    }
    format!("{:016x}", hasher.finish())
}
|
||||
|
||||
/// Deduplicate terminal events within a reconciliation window.
/// Returns only the first occurrence of each terminal fingerprint.
#[must_use]
pub fn dedupe_terminal_events(events: &[LaneEvent]) -> Vec<LaneEvent> {
    let mut seen_fingerprints = std::collections::HashSet::new();
    let mut result = Vec::new();

    for event in events {
        if is_terminal_event(event.event) {
            if let Some(fp) = &event.metadata.event_fingerprint {
                if seen_fingerprints.contains(fp) {
                    continue; // Skip duplicate terminal event
                }
                seen_fingerprints.insert(fp.clone());
            }
        }
        result.push(event.clone());
    }

    result
}
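The fingerprint-and-dedupe flow above can be exercised without serde using a small stand-in type. In this sketch `MiniEvent`, `fingerprint`, and `dedupe_terminal` are hypothetical simplifications: the payload is a plain string rather than a JSON value, and the fingerprint is computed on the fly instead of being read from metadata.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a lane event: a name, a terminal flag,
/// and an optional payload.
#[derive(Debug, Clone, PartialEq, Eq)]
struct MiniEvent {
    name: &'static str,
    terminal: bool,
    payload: Option<String>,
}

/// Hash the name and payload into a 16-digit hex fingerprint.
fn fingerprint(event: &MiniEvent) -> String {
    let mut hasher = DefaultHasher::new();
    event.name.hash(&mut hasher);
    if let Some(payload) = &event.payload {
        payload.hash(&mut hasher);
    }
    format!("{:016x}", hasher.finish())
}

/// Keep every non-terminal event, but only the first terminal event
/// per fingerprint. `HashSet::insert` returns false for a repeat.
fn dedupe_terminal(events: &[MiniEvent]) -> Vec<MiniEvent> {
    let mut seen = HashSet::new();
    events
        .iter()
        .filter(|event| !event.terminal || seen.insert(fingerprint(event)))
        .cloned()
        .collect()
}
```

One caveat worth keeping in mind: the standard library does not specify `DefaultHasher`'s algorithm and it may change between Rust releases, so fingerprints of this kind are suited to in-process deduplication rather than values persisted or compared across builds.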
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub enum BlockedSubphase {
|
||||
#[serde(rename = "blocked.trust_prompt")]
|
||||
TrustPrompt { gate_repo: String },
|
||||
#[serde(rename = "blocked.prompt_delivery")]
|
||||
PromptDelivery { attempt: u32 },
|
||||
#[serde(rename = "blocked.plugin_init")]
|
||||
PluginInit { plugin_name: String },
|
||||
#[serde(rename = "blocked.mcp_handshake")]
|
||||
McpHandshake { server_name: String, attempt: u32 },
|
||||
#[serde(rename = "blocked.branch_freshness")]
|
||||
BranchFreshness { behind_main: u32 },
|
||||
#[serde(rename = "blocked.test_hang")]
|
||||
TestHang {
|
||||
elapsed_secs: u32,
|
||||
test_name: Option<String>,
|
||||
},
|
||||
#[serde(rename = "blocked.report_pending")]
|
||||
ReportPending { since_secs: u32 },
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct LaneEventBlocker {
|
||||
#[serde(rename = "failureClass")]
|
||||
pub failure_class: LaneFailureClass,
|
||||
pub detail: String,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub subphase: Option<BlockedSubphase>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
@@ -94,21 +446,50 @@ pub struct LaneCommitProvenance {
|
||||
pub lineage: Vec<String>,
|
||||
}
|
||||
|
||||
/// Ship/provenance metadata — §4.44.5
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct ShipProvenance {
|
||||
pub source_branch: String,
|
||||
pub base_commit: String,
|
||||
pub commit_count: u32,
|
||||
pub commit_range: String,
|
||||
pub merge_method: ShipMergeMethod,
|
||||
pub actor: String,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub pr_number: Option<u32>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
|
||||
#[serde(rename_all = "snake_case")]
|
||||
pub enum ShipMergeMethod {
|
||||
DirectPush,
|
||||
FastForward,
|
||||
MergeCommit,
|
||||
SquashMerge,
|
||||
RebaseMerge,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct LaneEvent {
|
||||
pub event: LaneEventName,
|
||||
pub status: LaneEventStatus,
|
||||
#[serde(rename = "emittedAt")]
|
||||
pub emitted_at: String,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub session_id: Option<String>,
|
||||
#[serde(rename = "failureClass", skip_serializing_if = "Option::is_none")]
|
||||
pub failure_class: Option<LaneFailureClass>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub detail: Option<String>,
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub data: Option<Value>,
|
||||
/// Event metadata for ordering, provenance, dedupe, and ownership
|
||||
pub metadata: LaneEventMetadata,
|
||||
}
|
||||
|
||||
impl LaneEvent {
|
||||
/// Create a new lane event with minimal metadata (seq=0, provenance=LiveLane)
|
||||
/// Use `LaneEventBuilder` for events requiring full metadata.
|
||||
#[must_use]
|
||||
pub fn new(
|
||||
event: LaneEventName,
|
||||
@@ -119,9 +500,11 @@ impl LaneEvent {
|
||||
event,
|
||||
status,
|
||||
emitted_at: emitted_at.into(),
|
||||
session_id: None,
|
||||
failure_class: None,
|
||||
detail: None,
|
||||
data: None,
|
||||
metadata: LaneEventMetadata::new(0, EventProvenance::LiveLane),
|
||||
}
|
||||
}
|
||||
|
||||
@@ -172,16 +555,74 @@ impl LaneEvent {
|
||||
|
||||
#[must_use]
|
||||
pub fn blocked(emitted_at: impl Into<String>, blocker: &LaneEventBlocker) -> Self {
|
||||
-        Self::new(LaneEventName::Blocked, LaneEventStatus::Blocked, emitted_at)
+        let mut event = Self::new(LaneEventName::Blocked, LaneEventStatus::Blocked, emitted_at)
             .with_failure_class(blocker.failure_class)
-            .with_detail(blocker.detail.clone())
+            .with_detail(blocker.detail.clone());
+        if let Some(ref subphase) = blocker.subphase {
+            event =
+                event.with_data(serde_json::to_value(subphase).expect("subphase should serialize"));
+        }
+        event
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
pub fn failed(emitted_at: impl Into<String>, blocker: &LaneEventBlocker) -> Self {
|
||||
-        Self::new(LaneEventName::Failed, LaneEventStatus::Failed, emitted_at)
+        let mut event = Self::new(LaneEventName::Failed, LaneEventStatus::Failed, emitted_at)
             .with_failure_class(blocker.failure_class)
-            .with_detail(blocker.detail.clone())
+            .with_detail(blocker.detail.clone());
+        if let Some(ref subphase) = blocker.subphase {
+            event =
+                event.with_data(serde_json::to_value(subphase).expect("subphase should serialize"));
+        }
+        event
|
||||
}
|
||||
|
||||
/// Ship prepared — §4.44.5
|
||||
#[must_use]
|
||||
pub fn ship_prepared(emitted_at: impl Into<String>, provenance: &ShipProvenance) -> Self {
|
||||
Self::new(
|
||||
LaneEventName::ShipPrepared,
|
||||
LaneEventStatus::Ready,
|
||||
emitted_at,
|
||||
)
|
||||
.with_data(serde_json::to_value(provenance).expect("ship provenance should serialize"))
|
||||
}
|
||||
|
||||
/// Ship commits selected — §4.44.5
|
||||
#[must_use]
|
||||
pub fn ship_commits_selected(
|
||||
emitted_at: impl Into<String>,
|
||||
commit_count: u32,
|
||||
commit_range: impl Into<String>,
|
||||
) -> Self {
|
||||
Self::new(
|
||||
LaneEventName::ShipCommitsSelected,
|
||||
LaneEventStatus::Ready,
|
||||
emitted_at,
|
||||
)
|
||||
.with_detail(format!("{} commits: {}", commit_count, commit_range.into()))
|
||||
}
|
||||
|
||||
/// Ship merged — §4.44.5
|
||||
#[must_use]
|
||||
pub fn ship_merged(emitted_at: impl Into<String>, provenance: &ShipProvenance) -> Self {
|
||||
Self::new(
|
||||
LaneEventName::ShipMerged,
|
||||
LaneEventStatus::Completed,
|
||||
emitted_at,
|
||||
)
|
||||
.with_data(serde_json::to_value(provenance).expect("ship provenance should serialize"))
|
||||
}
|
||||
|
||||
/// Ship pushed to main — §4.44.5
|
||||
#[must_use]
|
||||
pub fn ship_pushed_main(emitted_at: impl Into<String>, provenance: &ShipProvenance) -> Self {
|
||||
Self::new(
|
||||
LaneEventName::ShipPushedMain,
|
||||
LaneEventStatus::Completed,
|
||||
emitted_at,
|
||||
)
|
||||
.with_data(serde_json::to_value(provenance).expect("ship provenance should serialize"))
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
@@ -207,6 +648,12 @@ impl LaneEvent {
|
||||
self.data = Some(data);
|
||||
self
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
pub fn with_session_id(mut self, session_id: impl Into<String>) -> Self {
|
||||
self.session_id = Some(session_id.into());
|
||||
self
|
||||
}
|
||||
}
|
||||
|
||||
#[must_use]
|
||||
@@ -254,8 +701,11 @@ mod tests {
|
||||
use serde_json::json;
|
||||
|
||||
use super::{
|
||||
-        dedupe_superseded_commit_events, LaneCommitProvenance, LaneEvent, LaneEventBlocker,
-        LaneEventName, LaneEventStatus, LaneFailureClass,
+        compute_event_fingerprint, dedupe_superseded_commit_events, dedupe_terminal_events,
+        is_terminal_event, BlockedSubphase, EventProvenance, LaneCommitProvenance, LaneEvent,
+        LaneEventBlocker, LaneEventBuilder, LaneEventMetadata, LaneEventName, LaneEventStatus,
+        LaneFailureClass, LaneOwnership, SessionIdentity, ShipMergeMethod, ShipProvenance,
+        WatcherAction,
|
||||
};
|
||||
|
||||
#[test]
|
||||
@@ -284,6 +734,10 @@ mod tests {
|
||||
LaneEventName::BranchWorkspaceMismatch,
|
||||
"branch.workspace_mismatch",
|
||||
),
|
||||
(LaneEventName::ShipPrepared, "ship.prepared"),
|
||||
(LaneEventName::ShipCommitsSelected, "ship.commits_selected"),
|
||||
(LaneEventName::ShipMerged, "ship.merged"),
|
||||
(LaneEventName::ShipPushedMain, "ship.pushed_main"),
|
||||
];
|
||||
|
||||
for (event, expected) in cases {
|
||||
@@ -324,6 +778,10 @@ mod tests {
|
||||
let blocker = LaneEventBlocker {
|
||||
failure_class: LaneFailureClass::McpStartup,
|
||||
detail: "broken server".to_string(),
|
||||
subphase: Some(BlockedSubphase::McpHandshake {
|
||||
server_name: "test-server".to_string(),
|
||||
attempt: 1,
|
||||
}),
|
||||
};
|
||||
|
||||
let blocked = LaneEvent::blocked("2026-04-04T00:00:00Z", &blocker);
|
||||
@@ -369,6 +827,34 @@ mod tests {
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn ship_provenance_events_serialize_to_expected_wire_values() {
|
||||
let provenance = ShipProvenance {
|
||||
source_branch: "feature/provenance".to_string(),
|
||||
base_commit: "dd73962".to_string(),
|
||||
commit_count: 6,
|
||||
commit_range: "dd73962..c956f78".to_string(),
|
||||
merge_method: ShipMergeMethod::DirectPush,
|
||||
actor: "Jobdori".to_string(),
|
||||
pr_number: None,
|
||||
};
|
||||
|
||||
let prepared = LaneEvent::ship_prepared("2026-04-20T14:30:00Z", &provenance);
|
||||
let prepared_json = serde_json::to_value(&prepared).expect("ship event should serialize");
|
||||
assert_eq!(prepared_json["event"], "ship.prepared");
|
||||
assert_eq!(prepared_json["data"]["commit_count"], 6);
|
||||
assert_eq!(prepared_json["data"]["source_branch"], "feature/provenance");
|
||||
|
||||
let pushed = LaneEvent::ship_pushed_main("2026-04-20T14:35:00Z", &provenance);
|
||||
let pushed_json = serde_json::to_value(&pushed).expect("ship event should serialize");
|
||||
assert_eq!(pushed_json["event"], "ship.pushed_main");
|
||||
assert_eq!(pushed_json["data"]["merge_method"], "direct_push");
|
||||
|
||||
let round_trip: LaneEvent =
|
||||
serde_json::from_value(pushed_json).expect("ship event should deserialize");
|
||||
assert_eq!(round_trip.event, LaneEventName::ShipPushedMain);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn commit_events_can_carry_worktree_and_supersession_metadata() {
|
||||
let event = LaneEvent::commit_created(
|
||||
@@ -420,4 +906,254 @@ mod tests {
|
||||
assert_eq!(retained.len(), 1);
|
||||
assert_eq!(retained[0].detail.as_deref(), Some("new"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn lane_event_metadata_includes_monotonic_sequence() {
|
||||
let meta1 = LaneEventMetadata::new(0, EventProvenance::LiveLane);
|
||||
let meta2 = LaneEventMetadata::new(1, EventProvenance::LiveLane);
|
||||
let meta3 = LaneEventMetadata::new(2, EventProvenance::Test);
|
||||
|
||||
assert_eq!(meta1.seq, 0);
|
||||
assert_eq!(meta2.seq, 1);
|
||||
assert_eq!(meta3.seq, 2);
|
||||
assert!(meta1.timestamp_ms <= meta2.timestamp_ms);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn event_provenance_round_trips_through_serialization() {
|
||||
let cases = [
|
||||
(EventProvenance::LiveLane, "live_lane"),
|
||||
(EventProvenance::Test, "test"),
|
||||
(EventProvenance::Healthcheck, "healthcheck"),
|
||||
(EventProvenance::Replay, "replay"),
|
||||
(EventProvenance::Transport, "transport"),
|
||||
];
|
||||
|
||||
for (provenance, expected) in cases {
|
||||
let json = serde_json::to_value(provenance).expect("should serialize");
|
||||
assert_eq!(json, serde_json::json!(expected));
|
||||
|
||||
let round_trip: EventProvenance =
|
||||
serde_json::from_value(json).expect("should deserialize");
|
||||
assert_eq!(round_trip, provenance);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn session_identity_is_complete_at_creation() {
|
||||
let identity = SessionIdentity::new("my-lane", "/tmp/repo", "implement feature X");
|
||||
|
||||
assert_eq!(identity.title, "my-lane");
|
||||
assert_eq!(identity.workspace, "/tmp/repo");
|
||||
assert_eq!(identity.purpose, "implement feature X");
|
||||
assert!(identity.placeholder_reason.is_none());
|
||||
|
||||
// Test with placeholder
|
||||
let with_placeholder = SessionIdentity::with_placeholder(
|
||||
"untitled",
|
||||
"/tmp/unknown",
|
||||
"unknown",
|
||||
"session created before title was known",
|
||||
);
|
||||
assert_eq!(
|
||||
with_placeholder.placeholder_reason,
|
||||
Some("session created before title was known".to_string())
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn lane_ownership_binding_includes_workflow_scope() {
|
||||
let ownership = LaneOwnership {
|
||||
owner: "claw-1".to_string(),
|
||||
workflow_scope: "claw-code-dogfood".to_string(),
|
||||
watcher_action: WatcherAction::Act,
|
||||
};
|
||||
|
||||
assert_eq!(ownership.owner, "claw-1");
|
||||
assert_eq!(ownership.workflow_scope, "claw-code-dogfood");
|
||||
assert_eq!(ownership.watcher_action, WatcherAction::Act);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn watcher_action_round_trips_through_serialization() {
|
||||
let cases = [
|
||||
(WatcherAction::Act, "act"),
|
||||
(WatcherAction::Observe, "observe"),
|
||||
(WatcherAction::Ignore, "ignore"),
|
||||
];
|
||||
|
||||
for (action, expected) in cases {
|
||||
let json = serde_json::to_value(action).expect("should serialize");
|
||||
assert_eq!(json, serde_json::json!(expected));
|
||||
|
||||
let round_trip: WatcherAction =
|
||||
serde_json::from_value(json).expect("should deserialize");
|
||||
assert_eq!(round_trip, action);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn is_terminal_event_detects_terminal_states() {
|
||||
assert!(is_terminal_event(LaneEventName::Finished));
|
||||
assert!(is_terminal_event(LaneEventName::Failed));
|
||||
assert!(is_terminal_event(LaneEventName::Superseded));
|
||||
assert!(is_terminal_event(LaneEventName::Closed));
|
||||
assert!(is_terminal_event(LaneEventName::Merged));
|
||||
|
||||
assert!(!is_terminal_event(LaneEventName::Started));
|
||||
assert!(!is_terminal_event(LaneEventName::Ready));
|
||||
assert!(!is_terminal_event(LaneEventName::Blocked));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn compute_event_fingerprint_is_deterministic() {
|
||||
let fp1 = compute_event_fingerprint(
|
||||
&LaneEventName::Finished,
|
||||
&LaneEventStatus::Completed,
|
||||
Some(&json!({"commit": "abc123"})),
|
||||
);
|
||||
let fp2 = compute_event_fingerprint(
|
||||
&LaneEventName::Finished,
|
||||
&LaneEventStatus::Completed,
|
||||
Some(&json!({"commit": "abc123"})),
|
||||
);
|
||||
|
||||
assert_eq!(fp1, fp2, "same inputs should produce same fingerprint");
|
||||
assert!(!fp1.is_empty());
|
||||
assert_eq!(fp1.len(), 16, "fingerprint should be 16 hex chars");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn compute_event_fingerprint_differs_for_different_inputs() {
|
||||
let fp1 =
|
||||
compute_event_fingerprint(&LaneEventName::Finished, &LaneEventStatus::Completed, None);
|
||||
let fp2 = compute_event_fingerprint(&LaneEventName::Failed, &LaneEventStatus::Failed, None);
|
||||
let fp3 = compute_event_fingerprint(
|
||||
&LaneEventName::Finished,
|
||||
&LaneEventStatus::Completed,
|
||||
Some(&json!({"commit": "abc123"})),
|
||||
);
|
||||
|
||||
assert_ne!(fp1, fp2, "different event/status should differ");
|
||||
assert_ne!(fp1, fp3, "different data should differ");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn dedupe_terminal_events_suppresses_duplicates() {
|
||||
let event1 = LaneEventBuilder::new(
|
||||
LaneEventName::Finished,
|
||||
LaneEventStatus::Completed,
|
||||
"2026-04-04T00:00:00Z",
|
||||
0,
|
||||
EventProvenance::LiveLane,
|
||||
)
|
||||
.build_terminal();
|
||||
|
||||
let event2 = LaneEventBuilder::new(
|
||||
LaneEventName::Started,
|
||||
LaneEventStatus::Running,
|
||||
"2026-04-04T00:00:01Z",
|
||||
1,
|
||||
EventProvenance::LiveLane,
|
||||
)
|
||||
.build();
|
||||
|
||||
let event3 = LaneEventBuilder::new(
|
||||
LaneEventName::Finished,
|
||||
LaneEventStatus::Completed,
|
||||
"2026-04-04T00:00:02Z",
|
||||
2,
|
||||
EventProvenance::LiveLane,
|
||||
)
|
||||
.build_terminal(); // Same fingerprint as event1
|
||||
|
||||
let deduped = dedupe_terminal_events(&[event1.clone(), event2.clone(), event3.clone()]);
|
||||
|
||||
assert_eq!(deduped.len(), 2, "should have 2 events after dedupe");
|
||||
assert_eq!(deduped[0].event, LaneEventName::Finished);
|
||||
assert_eq!(deduped[1].event, LaneEventName::Started);
|
||||
// event3 should be suppressed as duplicate of event1
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn lane_event_builder_constructs_event_with_metadata() {
|
||||
let event = LaneEventBuilder::new(
|
||||
LaneEventName::Started,
|
||||
LaneEventStatus::Running,
|
||||
"2026-04-04T00:00:00Z",
|
||||
42,
|
||||
EventProvenance::Test,
|
||||
)
|
||||
.with_session_id("boot-abc123def4567890")
|
||||
.with_session_identity(SessionIdentity::new("test-lane", "/tmp", "test"))
|
||||
.with_ownership(LaneOwnership {
|
||||
owner: "bot-1".to_string(),
|
||||
workflow_scope: "test-suite".to_string(),
|
||||
watcher_action: WatcherAction::Observe,
|
||||
})
|
||||
.with_nudge_id("nudge-123")
|
||||
.with_detail("starting test run")
|
||||
.build();
|
||||
|
||||
assert_eq!(event.event, LaneEventName::Started);
|
||||
assert_eq!(event.session_id.as_deref(), Some("boot-abc123def4567890"));
|
||||
assert_eq!(event.metadata.seq, 42);
|
||||
assert_eq!(event.metadata.provenance, EventProvenance::Test);
|
||||
assert_eq!(
|
||||
event.metadata.session_identity.as_ref().unwrap().title,
|
||||
"test-lane"
|
||||
);
|
||||
assert_eq!(event.metadata.ownership.as_ref().unwrap().owner, "bot-1");
|
||||
assert_eq!(event.metadata.nudge_id, Some("nudge-123".to_string()));
|
||||
assert_eq!(event.detail, Some("starting test run".to_string()));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn lane_event_metadata_round_trips_through_serialization() {
|
||||
let meta = LaneEventMetadata::new(5, EventProvenance::Healthcheck)
|
||||
.with_session_identity(SessionIdentity::new("lane-1", "/tmp", "purpose"))
|
||||
.with_nudge_id("nudge-abc");
|
||||
|
||||
let json = serde_json::to_value(&meta).expect("should serialize");
|
||||
assert_eq!(json["seq"], 5);
|
||||
assert_eq!(json["provenance"], "healthcheck");
|
||||
assert_eq!(json["nudge_id"], "nudge-abc");
|
||||
assert!(json["timestamp_ms"].as_u64().is_some());
|
||||
|
||||
let round_trip: LaneEventMetadata =
|
||||
serde_json::from_value(json).expect("should deserialize");
|
||||
assert_eq!(round_trip.seq, 5);
|
||||
assert_eq!(round_trip.provenance, EventProvenance::Healthcheck);
|
||||
assert_eq!(round_trip.nudge_id, Some("nudge-abc".to_string()));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn lane_event_session_id_round_trips_through_serialization() {
|
||||
let event = LaneEventBuilder::new(
|
||||
LaneEventName::Started,
|
||||
LaneEventStatus::Running,
|
||||
"2026-04-04T00:00:00Z",
|
||||
1,
|
||||
EventProvenance::LiveLane,
|
||||
)
|
||||
.with_session_id("boot-0123456789abcdef")
|
||||
.build();
|
||||
|
||||
let json = serde_json::to_value(&event).expect("should serialize");
|
||||
assert_eq!(json["session_id"], "boot-0123456789abcdef");
|
||||
|
||||
let round_trip: LaneEvent = serde_json::from_value(json).expect("should deserialize");
|
||||
assert_eq!(
|
||||
round_trip.session_id.as_deref(),
|
||||
Some("boot-0123456789abcdef")
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn lane_event_session_id_omits_field_when_absent() {
|
||||
let event = LaneEvent::started("2026-04-04T00:00:00Z");
|
||||
let json = serde_json::to_value(&event).expect("should serialize");
|
||||
|
||||
assert!(json.get("session_id").is_none());
|
||||
}
|
||||
}
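The tests above pin down a few properties without showing the implementation in this part of the diff: `compute_event_fingerprint` is deterministic, yields 16 hex characters, and changes when the event, status, or data changes; `dedupe_terminal_events` drops a later terminal event whose fingerprint was already seen while preserving order. The following is only a minimal sketch of one way to get those properties, assuming a 64-bit hash from the standard library's `DefaultHasher`. `SketchEvent`, `sketch_fingerprint`, and `sketch_dedupe_terminal` are hypothetical names, not the crate's API, and the real functions may hash different fields.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a lane event; not the crate's `LaneEvent`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SketchEvent {
    pub name: String,
    pub status: String,
    pub data: Option<String>,
    pub terminal: bool,
}

/// Hash the identifying fields and render the 64-bit result as 16 hex characters.
pub fn sketch_fingerprint(name: &str, status: &str, data: Option<&str>) -> String {
    let mut hasher = DefaultHasher::new();
    name.hash(&mut hasher);
    status.hash(&mut hasher);
    data.hash(&mut hasher);
    // A u64 formatted with `{:016x}` is always exactly 16 characters wide.
    format!("{:016x}", hasher.finish())
}

/// Keep every non-terminal event; keep a terminal event only the first time
/// its fingerprint is seen. Input order is preserved.
pub fn sketch_dedupe_terminal(events: &[SketchEvent]) -> Vec<SketchEvent> {
    let mut seen = HashSet::new();
    let mut kept = Vec::new();
    for event in events {
        if event.terminal {
            let fp = sketch_fingerprint(&event.name, &event.status, event.data.as_deref());
            // `insert` returns false when the fingerprint was already present.
            if !seen.insert(fp) {
                continue;
            }
        }
        kept.push(event.clone());
    }
    kept
}
```

The sketch leaves timestamps and sequence numbers out of the fingerprint, which would explain why `event1` and `event3` in the test collide even though they differ in both. Whether the real implementation makes the same choice is an assumption.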
|
||||
|
||||
@@ -36,6 +36,7 @@ mod remote;
|
||||
pub mod sandbox;
|
||||
mod session;
|
||||
pub mod session_control;
|
||||
mod session_identity;
|
||||
pub use session_control::SessionStore;
|
||||
mod sse;
|
||||
pub mod stale_base;
|
||||
@@ -83,8 +84,11 @@ pub use hooks::{
|
||||
HookAbortSignal, HookEvent, HookProgressEvent, HookProgressReporter, HookRunResult, HookRunner,
|
||||
};
|
||||
pub use lane_events::{
|
||||
dedupe_superseded_commit_events, LaneCommitProvenance, LaneEvent, LaneEventBlocker,
|
||||
LaneEventName, LaneEventStatus, LaneFailureClass,
|
||||
compute_event_fingerprint, dedupe_superseded_commit_events, dedupe_terminal_events,
|
||||
is_terminal_event, BlockedSubphase, EventProvenance, LaneCommitProvenance, LaneEvent,
|
||||
LaneEventBlocker, LaneEventBuilder, LaneEventMetadata, LaneEventName, LaneEventStatus,
|
||||
LaneFailureClass, LaneOwnership, SessionIdentity, ShipMergeMethod, ShipProvenance,
|
||||
WatcherAction,
|
||||
};
|
||||
pub use mcp::{
|
||||
mcp_server_signature, mcp_tool_name, mcp_tool_prefix, normalize_name_for_mcp,
|
||||
@@ -150,6 +154,9 @@ pub use session::{
|
||||
ContentBlock, ConversationMessage, MessageRole, Session, SessionCompaction, SessionError,
|
||||
SessionFork, SessionPromptEntry,
|
||||
};
|
||||
pub use session_identity::{
|
||||
begin_session, current_boot_session_id, end_session, is_active_session,
|
||||
};
|
||||
pub use sse::{IncrementalSseParser, SseEvent};
|
||||
pub use stale_base::{
|
||||
check_base_commit, format_stale_base_warning, read_claw_base_file, resolve_expected_base,
|
||||
|
||||
@@ -48,7 +48,9 @@ impl FailureScenario {
|
||||
WorkerFailureKind::TrustGate => Self::TrustPromptUnresolved,
|
||||
WorkerFailureKind::PromptDelivery => Self::PromptMisdelivery,
|
||||
WorkerFailureKind::Protocol => Self::McpHandshakeFailure,
|
||||
WorkerFailureKind::Provider => Self::ProviderFailure,
|
||||
WorkerFailureKind::Provider | WorkerFailureKind::StartupNoEvidence => {
|
||||
Self::ProviderFailure
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
84
rust/crates/runtime/src/session_identity.rs
Normal file
@@ -0,0 +1,84 @@
use std::collections::hash_map::DefaultHasher;
use std::env;
use std::hash::{Hash, Hasher};
use std::process;
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::OnceLock;
use std::time::{SystemTime, UNIX_EPOCH};

static BOOT_SESSION_ID: OnceLock<String> = OnceLock::new();
static BOOT_SESSION_COUNTER: AtomicU64 = AtomicU64::new(0);
static ACTIVE_SESSION: AtomicBool = AtomicBool::new(false);

#[must_use]
pub fn current_boot_session_id() -> &'static str {
    BOOT_SESSION_ID.get_or_init(resolve_boot_session_id)
}

pub fn begin_session() {
    ACTIVE_SESSION.store(true, Ordering::SeqCst);
}

pub fn end_session() {
    ACTIVE_SESSION.store(false, Ordering::SeqCst);
}

#[must_use]
pub fn is_active_session() -> bool {
    ACTIVE_SESSION.load(Ordering::SeqCst)
}

fn resolve_boot_session_id() -> String {
    match env::var("CLAW_SESSION_ID") {
        Ok(value) if !value.trim().is_empty() => value,
        _ => generate_boot_session_id(),
    }
}

fn generate_boot_session_id() -> String {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap_or_default()
        .as_nanos();
    let counter = BOOT_SESSION_COUNTER.fetch_add(1, Ordering::Relaxed);
    let mut hasher = DefaultHasher::new();
    process::id().hash(&mut hasher);
    nanos.hash(&mut hasher);
    counter.hash(&mut hasher);
    format!("boot-{:016x}", hasher.finish())
}

#[cfg(test)]
mod tests {
    use super::{begin_session, current_boot_session_id, end_session, is_active_session};

    #[test]
    fn given_current_boot_session_id_when_called_twice_then_it_is_stable() {
        let first = current_boot_session_id();
        let second = current_boot_session_id();

        assert_eq!(first, second);
        assert!(first.starts_with("boot-"));
    }

    #[test]
    fn given_current_boot_session_id_when_inspected_then_it_is_opaque_and_non_empty() {
        let session_id = current_boot_session_id();

        assert!(!session_id.trim().is_empty());
        assert_eq!(session_id.len(), 21);
        assert!(!session_id.contains(' '));
    }

    #[test]
    fn given_begin_and_end_session_when_checked_then_active_state_toggles() {
        end_session();
        assert!(!is_active_session());

        begin_session();
        assert!(is_active_session());

        end_session();
        assert!(!is_active_session());
    }
}
|
||||
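The boot session ID in `session_identity.rs` is either a non-blank `CLAW_SESSION_ID` override or a generated value of the form `boot-` plus 16 hex digits, which is where the length of 21 asserted in the tests comes from. The sketch below separates those two steps into pure functions so they can be exercised without touching the process environment. `format_boot_id` and `resolve_session_id` are hypothetical helpers introduced for illustration only; they are not part of the diff.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: hash the given seeds and format the result
/// in the same `boot-<16 hex digits>` shape used by the file above.
pub fn format_boot_id(pid: u32, nanos: u128, counter: u64) -> String {
    let mut hasher = DefaultHasher::new();
    pid.hash(&mut hasher);
    nanos.hash(&mut hasher);
    counter.hash(&mut hasher);
    format!("boot-{:016x}", hasher.finish())
}

/// Hypothetical helper: prefer a non-blank override, otherwise call the
/// fallback generator. Mirrors the match in `resolve_boot_session_id`.
pub fn resolve_session_id(
    override_value: Option<&str>,
    fallback: impl FnOnce() -> String,
) -> String {
    match override_value {
        Some(value) if !value.trim().is_empty() => value.to_string(),
        _ => fallback(),
    }
}
```

One consequence worth keeping in mind: because an override is returned verbatim, the `starts_with("boot-")` and `len() == 21` assertions in the file's own tests would presumably fail when the test process runs with `CLAW_SESSION_ID` set to a differently shaped value.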
@@ -1,11 +1,42 @@
|
||||
use serde::{Deserialize, Serialize};
|
||||
use std::fmt::{Display, Formatter};
|
||||
|
||||
/// Task scope resolution for defining the granularity of work.
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
|
||||
#[serde(rename_all = "snake_case")]
|
||||
pub enum TaskScope {
|
||||
/// Work across the entire workspace
|
||||
Workspace,
|
||||
/// Work within a specific module/crate
|
||||
Module,
|
||||
/// Work on a single file
|
||||
SingleFile,
|
||||
/// Custom scope defined by the user
|
||||
Custom,
|
||||
}
|
||||
|
||||
impl std::fmt::Display for TaskScope {
|
||||
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
|
||||
match self {
|
||||
Self::Workspace => write!(f, "workspace"),
|
||||
Self::Module => write!(f, "module"),
|
||||
Self::SingleFile => write!(f, "single-file"),
|
||||
Self::Custom => write!(f, "custom"),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub struct TaskPacket {
|
||||
pub objective: String,
|
||||
pub scope: String,
|
||||
pub scope: TaskScope,
|
||||
/// Scope path; validation requires it when scope is `Module`, `SingleFile`, or `Custom`
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub scope_path: Option<String>,
|
||||
pub repo: String,
|
||||
/// Worktree path for the task
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub worktree: Option<String>,
|
||||
pub branch_policy: String,
|
||||
pub acceptance_tests: Vec<String>,
|
||||
pub commit_policy: String,
|
||||
@@ -57,7 +88,6 @@ pub fn validate_packet(packet: TaskPacket) -> Result<ValidatedPacket, TaskPacket
|
||||
let mut errors = Vec::new();
|
||||
|
||||
validate_required("objective", &packet.objective, &mut errors);
|
||||
validate_required("scope", &packet.scope, &mut errors);
|
||||
validate_required("repo", &packet.repo, &mut errors);
|
||||
validate_required("branch_policy", &packet.branch_policy, &mut errors);
|
||||
validate_required("commit_policy", &packet.commit_policy, &mut errors);
|
||||
@@ -68,6 +98,9 @@ pub fn validate_packet(packet: TaskPacket) -> Result<ValidatedPacket, TaskPacket
|
||||
);
|
||||
validate_required("escalation_policy", &packet.escalation_policy, &mut errors);
|
||||
|
||||
// Validate scope-specific requirements
|
||||
validate_scope_requirements(&packet, &mut errors);
|
||||
|
||||
for (index, test) in packet.acceptance_tests.iter().enumerate() {
|
||||
if test.trim().is_empty() {
|
||||
errors.push(format!(
|
||||
@@ -83,6 +116,26 @@ pub fn validate_packet(packet: TaskPacket) -> Result<ValidatedPacket, TaskPacket
|
||||
}
|
||||
}
|
||||
|
||||
fn validate_scope_requirements(packet: &TaskPacket, errors: &mut Vec<String>) {
|
||||
// Scope path is required for Module, SingleFile, and Custom scopes
|
||||
let needs_scope_path = matches!(
|
||||
packet.scope,
|
||||
TaskScope::Module | TaskScope::SingleFile | TaskScope::Custom
|
||||
);
|
||||
|
||||
if needs_scope_path
|
||||
&& packet
|
||||
.scope_path
|
||||
.as_ref()
|
||||
.is_none_or(|p| p.trim().is_empty())
|
||||
{
|
||||
errors.push(format!(
|
||||
"scope_path is required for scope '{}'",
|
||||
packet.scope
|
||||
));
|
||||
}
|
||||
}
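The rule in `validate_scope_requirements` is small enough to restate as a standalone function: every scope except `Workspace` needs a non-blank `scope_path`. The sketch below is an illustration under assumptions, not the crate's code. `SketchScope` and `check_scope_path` are hypothetical names, the error message uses the `Debug` form of the variant rather than the `Display` form used in the diff, and `map_or(true, ..)` stands in for `Option::is_none_or`, which is believed to be available only from Rust 1.82 onward.

```rust
/// Hypothetical mirror of the `TaskScope` enum above, without serde.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchScope {
    Workspace,
    Module,
    SingleFile,
    Custom,
}

/// Return an error message when a scope that needs a path has none,
/// or has one that is empty after trimming whitespace.
pub fn check_scope_path(scope: SketchScope, scope_path: Option<&str>) -> Option<String> {
    let needs_path = !matches!(scope, SketchScope::Workspace);
    // `map_or(true, ..)` gives the same answer as `is_none_or(..)` on older toolchains.
    let missing = scope_path.map_or(true, |p| p.trim().is_empty());
    if needs_path && missing {
        Some(format!("scope_path is required for scope '{scope:?}'"))
    } else {
        None
    }
}
```

Returning `Option<String>` instead of pushing into a shared `Vec<String>` keeps the sketch self-contained; the accumulate-all-errors style in the diff is the better fit when several fields are validated together.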
|
||||
|
||||
fn validate_required(field: &str, value: &str, errors: &mut Vec<String>) {
|
||||
if value.trim().is_empty() {
|
||||
errors.push(format!("{field} must not be empty"));
|
||||
@@ -96,8 +149,10 @@ mod tests {
|
||||
fn sample_packet() -> TaskPacket {
|
||||
TaskPacket {
|
||||
objective: "Implement typed task packet format".to_string(),
|
||||
scope: "runtime/task system".to_string(),
|
||||
scope: TaskScope::Module,
|
||||
scope_path: Some("runtime/task system".to_string()),
|
||||
repo: "claw-code-parity".to_string(),
|
||||
worktree: Some("/tmp/wt-1".to_string()),
|
||||
branch_policy: "origin/main only".to_string(),
|
||||
acceptance_tests: vec![
|
||||
"cargo build --workspace".to_string(),
|
||||
@@ -119,9 +174,12 @@ mod tests {
|
||||
|
||||
#[test]
|
||||
fn invalid_packet_accumulates_errors() {
|
||||
use super::TaskScope;
|
||||
let packet = TaskPacket {
|
||||
objective: " ".to_string(),
|
||||
scope: String::new(),
|
||||
scope: TaskScope::Workspace,
|
||||
scope_path: None,
|
||||
worktree: None,
|
||||
repo: String::new(),
|
||||
branch_policy: "\t".to_string(),
|
||||
acceptance_tests: vec!["ok".to_string(), " ".to_string()],
|
||||
@@ -136,9 +194,6 @@ mod tests {
|
||||
assert!(error
|
||||
.errors()
|
||||
.contains(&"objective must not be empty".to_string()));
|
||||
assert!(error
|
||||
.errors()
|
||||
.contains(&"scope must not be empty".to_string()));
|
||||
assert!(error
|
||||
.errors()
|
||||
.contains(&"repo must not be empty".to_string()));
|
||||
|
||||
@@ -85,11 +85,12 @@ impl TaskRegistry {
|
||||
packet: TaskPacket,
|
||||
) -> Result<Task, TaskPacketValidationError> {
|
||||
let packet = validate_packet(packet)?.into_inner();
|
||||
Ok(self.create_task(
|
||||
packet.objective.clone(),
|
||||
Some(packet.scope.clone()),
|
||||
Some(packet),
|
||||
))
|
||||
// Use scope_path as description if available, otherwise use scope as string
|
||||
let description = packet
|
||||
.scope_path
|
||||
.clone()
|
||||
.or_else(|| Some(packet.scope.to_string()));
|
||||
Ok(self.create_task(packet.objective.clone(), description, Some(packet)))
|
||||
}
|
||||
|
||||
fn create_task(
|
||||
@@ -249,10 +250,13 @@ mod tests {
|
||||
|
||||
#[test]
|
||||
fn creates_task_from_packet() {
|
||||
use crate::task_packet::TaskScope;
|
||||
let registry = TaskRegistry::new();
|
||||
let packet = TaskPacket {
|
||||
objective: "Ship task packet support".to_string(),
|
||||
scope: "runtime/task system".to_string(),
|
||||
scope: TaskScope::Module,
|
||||
scope_path: Some("runtime/task system".to_string()),
|
||||
worktree: Some("/tmp/wt-task".to_string()),
|
||||
repo: "claw-code-parity".to_string(),
|
||||
branch_policy: "origin/main only".to_string(),
|
||||
acceptance_tests: vec!["cargo test --workspace".to_string()],
|
||||
|
||||
@@ -18,6 +18,8 @@ use std::time::{SystemTime, UNIX_EPOCH};
|
||||
|
||||
use serde::{Deserialize, Serialize};
|
||||
|
||||
use crate::current_boot_session_id;
|
||||
|
||||
fn now_secs() -> u64 {
|
||||
SystemTime::now()
|
||||
.duration_since(UNIX_EPOCH)
|
||||
@@ -56,6 +58,7 @@ pub enum WorkerFailureKind {
|
||||
PromptDelivery,
|
||||
Protocol,
|
||||
Provider,
|
||||
StartupNoEvidence,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
|
||||
@@ -78,6 +81,7 @@ pub enum WorkerEventKind {
|
||||
Restarted,
|
||||
Finished,
|
||||
Failed,
|
||||
StartupNoEvidence,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
|
||||
@@ -96,6 +100,46 @@ pub enum WorkerPromptTarget {
|
||||
Unknown,
|
||||
}
|
||||
|
||||
/// Classification of startup failure when no evidence is available.
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
|
||||
#[serde(rename_all = "snake_case")]
|
||||
pub enum StartupFailureClassification {
|
||||
/// Trust prompt is required but not detected/resolved
|
||||
TrustRequired,
|
||||
/// Prompt was delivered to wrong target (shell misdelivery)
|
||||
PromptMisdelivery,
|
||||
/// Prompt was sent but acceptance timed out
|
||||
PromptAcceptanceTimeout,
|
||||
/// Transport layer is dead/unresponsive
|
||||
TransportDead,
|
||||
/// Worker process crashed during startup
|
||||
WorkerCrashed,
|
||||
/// Cannot determine specific cause
|
||||
Unknown,
|
||||
}
|
||||
|
||||
/// Evidence bundle collected when worker startup times out without clear evidence.
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
|
||||
pub struct StartupEvidenceBundle {
|
||||
/// Last known worker lifecycle state before timeout
|
||||
pub last_lifecycle_state: WorkerStatus,
|
||||
/// The pane/command that was being executed
|
||||
pub pane_command: String,
|
||||
/// Timestamp when prompt was sent (if any), unix epoch seconds
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub prompt_sent_at: Option<u64>,
|
||||
/// Whether prompt acceptance was detected
|
||||
pub prompt_acceptance_state: bool,
|
||||
/// Result of trust prompt detection at timeout
|
||||
pub trust_prompt_detected: bool,
|
||||
/// Transport health summary (true = healthy/responsive)
|
||||
pub transport_healthy: bool,
|
||||
/// MCP health summary (true = all servers healthy)
|
||||
pub mcp_healthy: bool,
|
||||
/// Seconds since worker creation
|
||||
pub elapsed_seconds: u64,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
|
||||
#[serde(tag = "type", rename_all = "snake_case")]
|
||||
pub enum WorkerEventPayload {
|
||||
@@ -115,6 +159,10 @@ pub enum WorkerEventPayload {
|
||||
task_receipt: Option<WorkerTaskReceipt>,
|
||||
recovery_armed: bool,
|
||||
},
|
||||
StartupNoEvidence {
|
||||
evidence: StartupEvidenceBundle,
|
||||
classification: StartupFailureClassification,
|
||||
},
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
|
||||
@@ -560,6 +608,117 @@ impl WorkerRegistry {
|
||||
|
||||
Ok(worker.clone())
|
||||
}
|
||||
|
||||
/// Handle startup timeout by emitting typed `worker.startup_no_evidence` event with evidence bundle.
|
||||
/// The classifier attempts to narrow the vague bucket down to a specific failure classification.
|
||||
pub fn observe_startup_timeout(
|
||||
&self,
|
||||
worker_id: &str,
|
||||
pane_command: &str,
|
||||
transport_healthy: bool,
|
||||
mcp_healthy: bool,
|
||||
) -> Result<Worker, String> {
|
||||
let mut inner = self.inner.lock().expect("worker registry lock poisoned");
|
||||
let worker = inner
|
||||
.workers
|
||||
.get_mut(worker_id)
|
||||
.ok_or_else(|| format!("worker not found: {worker_id}"))?;
|
||||
|
||||
let now = now_secs();
|
||||
let elapsed = now.saturating_sub(worker.created_at);
|
||||
|
||||
// Build evidence bundle
|
||||
let evidence = StartupEvidenceBundle {
|
||||
last_lifecycle_state: worker.status,
|
||||
pane_command: pane_command.to_string(),
|
||||
prompt_sent_at: if worker.prompt_delivery_attempts > 0 {
|
||||
Some(worker.updated_at)
|
||||
} else {
|
||||
None
|
||||
},
|
||||
prompt_acceptance_state: worker.status == WorkerStatus::Running
|
||||
&& !worker.prompt_in_flight,
|
||||
trust_prompt_detected: worker
|
||||
.events
|
||||
.iter()
|
||||
.any(|e| e.kind == WorkerEventKind::TrustRequired),
|
||||
transport_healthy,
|
||||
mcp_healthy,
|
||||
elapsed_seconds: elapsed,
|
||||
};
|
||||
|
||||
// Classify the failure
|
||||
let classification = classify_startup_failure(&evidence);
|
||||
|
||||
// Emit failure with evidence
|
||||
worker.last_error = Some(WorkerFailure {
|
||||
kind: WorkerFailureKind::StartupNoEvidence,
|
||||
message: format!(
|
||||
"worker startup stalled after {elapsed}s — classified as {classification:?}"
|
||||
),
|
||||
created_at: now,
|
||||
});
|
||||
worker.status = WorkerStatus::Failed;
|
||||
worker.prompt_in_flight = false;
|
||||
|
||||
push_event(
|
||||
worker,
|
||||
WorkerEventKind::StartupNoEvidence,
|
||||
WorkerStatus::Failed,
|
||||
Some(format!(
|
||||
"startup timeout with evidence: last_state={:?}, trust_detected={}, prompt_accepted={}",
|
||||
evidence.last_lifecycle_state,
|
||||
evidence.trust_prompt_detected,
|
||||
evidence.prompt_acceptance_state
|
||||
)),
|
||||
Some(WorkerEventPayload::StartupNoEvidence {
|
||||
evidence,
|
||||
classification,
|
||||
}),
|
||||
);
|
||||
|
||||
Ok(worker.clone())
|
||||
}
|
||||
}
|
||||
|
||||
/// Classify startup failure based on evidence bundle.
/// Attempts to narrow the vague `startup-no-evidence` bucket down to a specific failure class.
fn classify_startup_failure(evidence: &StartupEvidenceBundle) -> StartupFailureClassification {
    // Check for transport death first
    if !evidence.transport_healthy {
        return StartupFailureClassification::TransportDead;
    }

    // Check for trust prompt that wasn't resolved
    if evidence.trust_prompt_detected
        && evidence.last_lifecycle_state == WorkerStatus::TrustRequired
    {
        return StartupFailureClassification::TrustRequired;
    }

    // Check for prompt acceptance timeout
    if evidence.prompt_sent_at.is_some()
        && !evidence.prompt_acceptance_state
        && evidence.last_lifecycle_state == WorkerStatus::Running
    {
        return StartupFailureClassification::PromptAcceptanceTimeout;
    }

    // Check for misdelivery when prompt was sent but not accepted
    if evidence.prompt_sent_at.is_some()
        && !evidence.prompt_acceptance_state
        && evidence.elapsed_seconds > 30
    {
        return StartupFailureClassification::PromptMisdelivery;
    }

    // If MCP is unhealthy but transport is fine, worker may have crashed
    if !evidence.mcp_healthy && evidence.transport_healthy {
        return StartupFailureClassification::WorkerCrashed;
    }

    // Default to unknown if no stronger classification exists
    StartupFailureClassification::Unknown
}
|
||||
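The classifier above is order-sensitive: the first matching rule wins, so a bundle that shows both a dead transport and an unresolved trust prompt is reported as a transport failure. The sketch below restates that precedence with simplified, hypothetical types (`SketchStatus`, `SketchClass`, `SketchEvidence`, `sketch_classify`) so the ordering can be checked in isolation. It assumes the same thresholds as the diff and replaces `prompt_sent_at: Option<u64>` with a plain `prompt_sent` flag.

```rust
/// Hypothetical subset of worker states that the classifier inspects.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchStatus {
    Spawning,
    TrustRequired,
    Running,
}

/// Hypothetical mirror of `StartupFailureClassification`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchClass {
    TransportDead,
    TrustRequired,
    PromptAcceptanceTimeout,
    PromptMisdelivery,
    WorkerCrashed,
    Unknown,
}

/// Hypothetical, simplified evidence bundle.
#[derive(Debug, Clone, Copy)]
pub struct SketchEvidence {
    pub last_state: SketchStatus,
    pub prompt_sent: bool,
    pub prompt_accepted: bool,
    pub trust_prompt_detected: bool,
    pub transport_healthy: bool,
    pub mcp_healthy: bool,
    pub elapsed_seconds: u64,
}

/// Apply the rules in the same order as the diff; the first match wins.
pub fn sketch_classify(e: &SketchEvidence) -> SketchClass {
    if !e.transport_healthy {
        return SketchClass::TransportDead;
    }
    if e.trust_prompt_detected && e.last_state == SketchStatus::TrustRequired {
        return SketchClass::TrustRequired;
    }
    let unaccepted = e.prompt_sent && !e.prompt_accepted;
    if unaccepted && e.last_state == SketchStatus::Running {
        return SketchClass::PromptAcceptanceTimeout;
    }
    if unaccepted && e.elapsed_seconds > 30 {
        return SketchClass::PromptMisdelivery;
    }
    // Transport is known to be healthy here, so only MCP health is left to check.
    if !e.mcp_healthy {
        return SketchClass::WorkerCrashed;
    }
    SketchClass::Unknown
}
```

Because the rules short-circuit, an unaccepted prompt on a `Running` worker is always an acceptance timeout, even after 30 seconds; misdelivery is only reachable from other states.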
|
||||
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
|
||||
@@ -611,6 +770,7 @@ fn push_event(
|
||||
#[derive(serde::Serialize)]
|
||||
struct StateSnapshot<'a> {
|
||||
worker_id: &'a str,
|
||||
session_id: &'a str,
|
||||
status: WorkerStatus,
|
||||
is_ready: bool,
|
||||
trust_gate_cleared: bool,
|
||||
@@ -633,6 +793,7 @@ fn emit_state_file(worker: &Worker) {
|
||||
let now = now_secs();
|
||||
let snapshot = StateSnapshot {
|
||||
worker_id: &worker.worker_id,
|
||||
session_id: current_boot_session_id(),
|
||||
status: worker.status,
|
||||
is_ready: worker.status == WorkerStatus::ReadyForPrompt,
|
||||
trust_gate_cleared: worker.trust_gate_cleared,
|
||||
@@ -1292,6 +1453,10 @@ mod tests {
|
||||
Some("spawning"),
|
||||
"initial status should be spawning"
|
||||
);
|
||||
assert_eq!(
|
||||
value["session_id"].as_str(),
|
||||
Some(current_boot_session_id())
|
||||
);
|
||||
assert_eq!(value["is_ready"].as_bool(), Some(false));
|
||||
|
||||
// Transition to ReadyForPrompt by observing trust-cleared text
|
||||
@@ -1337,4 +1502,215 @@ mod tests {
|
||||
.iter()
|
||||
.any(|event| event.kind == WorkerEventKind::Finished));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn startup_timeout_emits_evidence_bundle_with_classification() {
|
||||
let registry = WorkerRegistry::new();
|
||||
let worker = registry.create("/tmp/repo-timeout", &[], true);
|
||||
|
||||
// Simulate startup timeout with transport dead
|
||||
let timed_out = registry
|
||||
.observe_startup_timeout(&worker.worker_id, "cargo test", false, true)
|
||||
.expect("startup timeout observe should succeed");
|
||||
|
||||
assert_eq!(timed_out.status, WorkerStatus::Failed);
|
||||
let error = timed_out
|
||||
.last_error
|
||||
.expect("startup timeout error should exist");
|
||||
assert_eq!(error.kind, WorkerFailureKind::StartupNoEvidence);
|
||||
// Check for "TransportDead" (the Debug representation of the enum variant)
|
||||
assert!(
|
||||
error.message.contains("TransportDead"),
|
||||
"expected TransportDead in: {}",
|
||||
error.message
|
||||
);
|
||||
|
||||
let event = timed_out
|
||||
.events
|
||||
.iter()
|
||||
.find(|e| e.kind == WorkerEventKind::StartupNoEvidence)
|
||||
.expect("startup no evidence event should exist");
|
||||
|
||||
match event.payload.as_ref() {
|
||||
Some(WorkerEventPayload::StartupNoEvidence {
|
||||
evidence,
|
||||
classification,
|
||||
}) => {
|
||||
assert_eq!(
|
||||
evidence.last_lifecycle_state,
|
||||
WorkerStatus::Spawning,
|
||||
"last state should be spawning"
|
||||
);
|
||||
assert_eq!(evidence.pane_command, "cargo test");
|
||||
assert!(!evidence.transport_healthy);
|
||||
assert!(evidence.mcp_healthy);
|
||||
assert_eq!(*classification, StartupFailureClassification::TransportDead);
|
||||
}
|
||||
_ => panic!(
|
||||
"expected StartupNoEvidence payload, got {:?}",
|
||||
event.payload
|
||||
),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn startup_timeout_classifies_trust_required_when_prompt_blocked() {
|
||||
let registry = WorkerRegistry::new();
|
||||
let worker = registry.create("/tmp/repo-trust", &[], false);
|
||||
|
||||
// Simulate trust prompt detected but not resolved
|
||||
registry
|
||||
.observe(
|
||||
&worker.worker_id,
|
||||
"Do you trust the files in this folder?\n1. Yes, proceed\n2. No",
|
||||
)
|
||||
.expect("trust observe should succeed");
|
||||
|
||||
// Now simulate startup timeout
|
||||
let timed_out = registry
|
||||
.observe_startup_timeout(&worker.worker_id, "claw prompt", true, true)
|
||||
.expect("startup timeout observe should succeed");
|
||||
|
||||
let event = timed_out
|
||||
.events
|
||||
.iter()
|
||||
.find(|e| e.kind == WorkerEventKind::StartupNoEvidence)
|
||||
.expect("startup no evidence event should exist");
|
||||
|
||||
match event.payload.as_ref() {
|
||||
Some(WorkerEventPayload::StartupNoEvidence { classification, .. }) => {
|
||||
assert_eq!(
|
||||
*classification,
|
||||
StartupFailureClassification::TrustRequired,
|
||||
"should classify as trust_required when trust prompt detected"
|
||||
);
|
||||
}
|
||||
_ => panic!("expected StartupNoEvidence payload"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn startup_timeout_classifies_prompt_acceptance_timeout() {
|
||||
let registry = WorkerRegistry::new();
|
||||
let worker = registry.create("/tmp/repo-accept", &[], true);
|
||||
|
||||
// Get worker to ReadyForPrompt
|
||||
registry
|
||||
.observe(&worker.worker_id, "Ready for your input\n>")
|
||||
.expect("ready observe should succeed");
|
||||
|
||||
// Send prompt but don't get acceptance
|
||||
registry
|
||||
.send_prompt(&worker.worker_id, Some("Run tests"), None)
|
||||
.expect("prompt send should succeed");
|
||||
|
||||
// Simulate startup timeout while prompt is still in flight
|
||||
let timed_out = registry
|
||||
.observe_startup_timeout(&worker.worker_id, "claw prompt", true, true)
|
||||
.expect("startup timeout observe should succeed");
|
||||
|
||||
let event = timed_out
|
||||
.events
|
||||
.iter()
|
||||
.find(|e| e.kind == WorkerEventKind::StartupNoEvidence)
|
||||
.expect("startup no evidence event should exist");
|
||||
|
||||
match event.payload.as_ref() {
|
||||
Some(WorkerEventPayload::StartupNoEvidence {
|
||||
evidence,
|
||||
classification,
|
||||
}) => {
|
||||
assert!(
|
||||
evidence.prompt_sent_at.is_some(),
|
||||
"should have prompt_sent_at"
|
||||
);
|
||||
assert!(!evidence.prompt_acceptance_state, "prompt not yet accepted");
|
||||
assert_eq!(
|
||||
*classification,
|
||||
StartupFailureClassification::PromptAcceptanceTimeout
|
||||
);
|
||||
}
|
||||
_ => panic!("expected StartupNoEvidence payload"),
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn startup_evidence_bundle_serializes_correctly() {
|
||||
let bundle = StartupEvidenceBundle {
|
||||
last_lifecycle_state: WorkerStatus::Running,
|
||||
pane_command: "test command".to_string(),
|
||||
prompt_sent_at: Some(1_234_567_890),
|
||||
prompt_acceptance_state: false,
|
||||
trust_prompt_detected: true,
|
||||
transport_healthy: true,
|
||||
mcp_healthy: false,
|
||||
elapsed_seconds: 60,
|
||||
};
|
||||
|
||||
let json = serde_json::to_string(&bundle).expect("should serialize");
|
||||
assert!(json.contains("\"last_lifecycle_state\""));
|
||||
assert!(json.contains("\"pane_command\""));
|
||||
assert!(json.contains("\"prompt_sent_at\":1234567890"));
|
||||
assert!(json.contains("\"trust_prompt_detected\":true"));
|
||||
assert!(json.contains("\"transport_healthy\":true"));
|
||||
assert!(json.contains("\"mcp_healthy\":false"));
|
||||
|
||||
let deserialized: StartupEvidenceBundle =
|
||||
serde_json::from_str(&json).expect("should deserialize");
|
||||
assert_eq!(deserialized.last_lifecycle_state, WorkerStatus::Running);
|
||||
assert_eq!(deserialized.prompt_sent_at, Some(1_234_567_890));
|
||||
}
|
||||
|
||||
    #[test]
    fn classify_startup_failure_detects_transport_dead() {
        let evidence = StartupEvidenceBundle {
            last_lifecycle_state: WorkerStatus::Spawning,
            pane_command: "test".to_string(),
            prompt_sent_at: None,
            prompt_acceptance_state: false,
            trust_prompt_detected: false,
            transport_healthy: false,
            mcp_healthy: true,
            elapsed_seconds: 30,
        };

        let classification = classify_startup_failure(&evidence);
        assert_eq!(classification, StartupFailureClassification::TransportDead);
    }

    #[test]
    fn classify_startup_failure_defaults_to_unknown() {
        let evidence = StartupEvidenceBundle {
            last_lifecycle_state: WorkerStatus::Spawning,
            pane_command: "test".to_string(),
            prompt_sent_at: None,
            prompt_acceptance_state: false,
            trust_prompt_detected: false,
            transport_healthy: true,
            mcp_healthy: true,
            elapsed_seconds: 10,
        };

        let classification = classify_startup_failure(&evidence);
        assert_eq!(classification, StartupFailureClassification::Unknown);
    }

    #[test]
    fn classify_startup_failure_detects_worker_crashed() {
        // Worker-crashed scenario: transport healthy but MCP unhealthy.
        // No prompt is in flight (prompt_sent_at is None), so the evidence
        // cannot match PromptAcceptanceTimeout.
        let evidence = StartupEvidenceBundle {
            last_lifecycle_state: WorkerStatus::Spawning,
            pane_command: "test".to_string(),
            prompt_sent_at: None, // No prompt sent yet
            prompt_acceptance_state: false,
            trust_prompt_detected: false,
            transport_healthy: true,
            mcp_healthy: false, // MCP unhealthy but transport healthy suggests crash
            elapsed_seconds: 45,
        };

        let classification = classify_startup_failure(&evidence);
        assert_eq!(classification, StartupFailureClassification::WorkerCrashed);
    }
}

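The four classifier tests above pin down observable behavior, but the body of `classify_startup_failure` is not part of this excerpt. The following is a minimal sketch that satisfies those assertions, not the project's implementation. The precedence order (transport first, then an unaccepted prompt, then MCP health) is an assumption inferred from the test comments, and `Evidence`, `Classification`, and `classify` are hypothetical, trimmed stand-ins for the real types.

```rust
// Hypothetical stand-in for StartupEvidenceBundle, reduced to the fields
// that the tests in this diff actually vary.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Evidence {
    prompt_sent_at: Option<u64>,
    prompt_acceptance_state: bool,
    transport_healthy: bool,
    mcp_healthy: bool,
}

// Hypothetical stand-in for StartupFailureClassification; the real enum
// may have more variants than the four the tests mention.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Classification {
    TransportDead,
    PromptAcceptanceTimeout,
    WorkerCrashed,
    Unknown,
}

// Assumed precedence: a dead transport wins, then a prompt that was sent
// but never accepted, then an unhealthy MCP with a healthy transport.
fn classify(evidence: &Evidence) -> Classification {
    if !evidence.transport_healthy {
        return Classification::TransportDead;
    }
    if evidence.prompt_sent_at.is_some() && !evidence.prompt_acceptance_state {
        return Classification::PromptAcceptanceTimeout;
    }
    if !evidence.mcp_healthy {
        return Classification::WorkerCrashed;
    }
    Classification::Unknown
}

fn main() {
    let crashed = Evidence {
        prompt_sent_at: None,
        prompt_acceptance_state: false,
        transport_healthy: true,
        mcp_healthy: false,
    };
    println!("{:?}", classify(&crashed));
}
```

Under this ordering, evidence with a pending prompt and an unhealthy MCP is reported as a prompt-acceptance timeout rather than a crash, which is why the crash test sets `prompt_sent_at` to `None`.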
@@ -447,11 +447,14 @@ fn parse_args(args: &[String]) -> Result<CliAction, String> {
                 let value = args
                     .get(index + 1)
                     .ok_or_else(|| "missing value for --model".to_string())?;
+                validate_model_syntax(value)?;
                 model = resolve_model_alias_with_config(value);
                 index += 2;
             }
             flag if flag.starts_with("--model=") => {
-                model = resolve_model_alias_with_config(&flag[8..]);
+                let value = &flag[8..];
+                validate_model_syntax(value)?;
+                model = resolve_model_alias_with_config(value);
                 index += 1;
             }
             "--output-format" => {
@@ -743,6 +746,31 @@ fn parse_single_word_command_alias(
     permission_mode_override: Option<PermissionMode>,
     output_format: CliOutputFormat,
 ) -> Option<Result<CliAction, String>> {
+    if rest.is_empty() {
+        return None;
+    }
+
+    // Diagnostic verbs (help, version, status, sandbox, doctor, state) accept only the verb itself
+    // or --help / -h as a suffix. Any other suffix args are unrecognized.
+    let verb = &rest[0];
+    let is_diagnostic = matches!(
+        verb.as_str(),
+        "help" | "version" | "status" | "sandbox" | "doctor" | "state"
+    );
+
+    if is_diagnostic && rest.len() > 1 {
+        // Diagnostic verb with trailing args: reject unrecognized suffix
+        if is_help_flag(&rest[1]) && rest.len() == 2 {
+            // "doctor --help" is valid, routed to parse_local_help_action() instead
+            return None;
+        }
+        // Unrecognized suffix like "--json"
+        return Some(Err(format!(
+            "unrecognized argument `{}` for subcommand `{}`",
+            rest[1], verb
+        )));
+    }
+
     if rest.len() != 1 {
         return None;
     }
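The guard added in this hunk can be exercised in isolation. The sketch below lifts the same checks into a hypothetical helper, `check_diagnostic_suffix`, that returns `Ok(())` where the original falls through and `Err` where it rejects. The body of `is_help_flag` is an assumption based on the comment in the hunk (`--help` / `-h`); the real helper is not shown here.

```rust
// Assumed behavior of the real `is_help_flag`, based on the hunk's comment.
fn is_help_flag(arg: &str) -> bool {
    matches!(arg, "--help" | "-h")
}

// Hypothetical helper: Ok(()) means "no objection, keep parsing";
// Err carries the same message the hunk formats.
fn check_diagnostic_suffix(rest: &[String]) -> Result<(), String> {
    let verb = match rest.first() {
        Some(verb) => verb,
        None => return Ok(()),
    };
    let is_diagnostic = matches!(
        verb.as_str(),
        "help" | "version" | "status" | "sandbox" | "doctor" | "state"
    );
    if is_diagnostic && rest.len() > 1 {
        // "<verb> --help" is the only accepted two-token form.
        if rest.len() == 2 && is_help_flag(&rest[1]) {
            return Ok(());
        }
        return Err(format!(
            "unrecognized argument `{}` for subcommand `{}`",
            rest[1], verb
        ));
    }
    Ok(())
}

fn main() {
    let args: Vec<String> = ["doctor", "--json"].iter().map(|s| s.to_string()).collect();
    println!("{:?}", check_diagnostic_suffix(&args));
}
```

One consequence worth keeping in mind: because the check looks only at the first token, a free-form prompt that happens to start with a diagnostic verb (for example `help me debug`) is rejected as well. That appears to be why the prompt-shorthand test further down in this diff switches to a different input.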
@@ -1035,6 +1063,37 @@ fn resolve_model_alias_with_config(model: &str) -> String {
     resolve_model_alias(trimmed).to_string()
 }
 
+/// Validate model syntax at parse time.
+/// Accepts: known aliases (opus, sonnet, haiku) or provider/model pattern.
+/// Rejects: empty, whitespace-only, strings with spaces, or invalid chars.
+fn validate_model_syntax(model: &str) -> Result<(), String> {
+    let trimmed = model.trim();
+    if trimmed.is_empty() {
+        return Err("model string cannot be empty".to_string());
+    }
+    // Known aliases are always valid
+    match trimmed {
+        "opus" | "sonnet" | "haiku" => return Ok(()),
+        _ => {}
+    }
+    // Check for spaces (malformed)
+    if trimmed.contains(' ') {
+        return Err(format!(
+            "invalid model syntax: '{}' contains spaces. Use provider/model format or known alias",
+            trimmed
+        ));
+    }
+    // Check provider/model format: provider_id/model_id
+    let parts: Vec<&str> = trimmed.split('/').collect();
+    if parts.len() != 2 || parts[0].is_empty() || parts[1].is_empty() {
+        return Err(format!(
+            "invalid model syntax: '{}'. Expected provider/model (e.g., anthropic/claude-opus-4-6) or known alias (opus, sonnet, haiku)",
+            trimmed
+        ));
+    }
+    Ok(())
+}
+
 fn config_alias_for_current_dir(alias: &str) -> Option<String> {
     if alias.is_empty() {
         return None;
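To make the accept/reject boundary of `validate_model_syntax` concrete, here is the function from the hunk above copied into a standalone program with a few probe inputs. The inputs are illustrative choices, not cases taken from the repository's test suite.

```rust
// Copied from the hunk above so the example compiles on its own.
fn validate_model_syntax(model: &str) -> Result<(), String> {
    let trimmed = model.trim();
    if trimmed.is_empty() {
        return Err("model string cannot be empty".to_string());
    }
    // Known aliases are always valid
    match trimmed {
        "opus" | "sonnet" | "haiku" => return Ok(()),
        _ => {}
    }
    // Check for spaces (malformed)
    if trimmed.contains(' ') {
        return Err(format!(
            "invalid model syntax: '{}' contains spaces. Use provider/model format or known alias",
            trimmed
        ));
    }
    // Check provider/model format: provider_id/model_id
    let parts: Vec<&str> = trimmed.split('/').collect();
    if parts.len() != 2 || parts[0].is_empty() || parts[1].is_empty() {
        return Err(format!(
            "invalid model syntax: '{}'. Expected provider/model (e.g., anthropic/claude-opus-4-6) or known alias (opus, sonnet, haiku)",
            trimmed
        ));
    }
    Ok(())
}

fn main() {
    // Illustrative probes: alias, provider/model, bare id, extra slash, space, empty.
    for candidate in ["opus", "anthropic/claude-opus-4-6", "claude-opus", "a/b/c", "my model", ""] {
        match validate_model_syntax(candidate) {
            Ok(()) => println!("{candidate:?}: ok"),
            Err(error) => println!("{candidate:?}: {error}"),
        }
    }
}
```

Two details follow directly from the code as written. A bare model id without a slash, such as `claude-opus`, is now rejected, which matches the test changes later in this diff that switch `--model claude-opus` to `--model opus`. And although the doc comment mentions "invalid chars", the only character check is for the ASCII space, so an embedded tab or other unusual character in a `provider/model` string would still pass.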
@@ -1497,6 +1556,8 @@ fn render_doctor_report() -> Result<DoctorReport, Box<dyn std::error::Error>> {
         project_root,
         git_branch,
         git_summary,
+        active_session: false,
+        session_id: None,
         sandbox_status: resolve_sandbox_status(sandbox_config.sandbox(), &cwd),
     };
     Ok(DoctorReport {
@@ -2317,6 +2378,8 @@ struct ResumeCommandOutcome {
 struct StatusContext {
     cwd: PathBuf,
     session_path: Option<PathBuf>,
+    active_session: bool,
+    session_id: Option<String>,
     loaded_config_files: usize,
     discovered_config_files: usize,
     memory_file_count: usize,
@@ -2326,6 +2389,16 @@ struct StatusContext {
     sandbox_status: runtime::SandboxStatus,
 }
 
+#[derive(Debug, Clone, Deserialize)]
+struct WorkerStateSnapshot {
+    #[serde(default)]
+    status: Option<String>,
+    #[serde(default)]
+    session_id: Option<String>,
+    #[serde(default)]
+    prompt_in_flight: bool,
+}
+
 #[derive(Debug, Clone, Copy)]
 struct StatusUsage {
     message_count: usize,
@@ -4934,6 +5007,8 @@ fn status_json_value(
         "kind": "status",
         "model": model,
         "permission_mode": permission_mode,
+        "active_session": context.active_session,
+        "session_id": context.session_id,
         "usage": {
             "messages": usage.message_count,
             "turns": usage.turns,
@@ -4992,9 +5067,12 @@ fn status_context(
         parse_git_status_metadata(project_context.git_status.as_deref());
     let git_summary = parse_git_workspace_summary(project_context.git_status.as_deref());
     let sandbox_status = resolve_sandbox_status(runtime_config.sandbox(), &cwd);
+    let worker_state = read_worker_state_snapshot(&cwd);
     Ok(StatusContext {
         cwd,
         session_path: session_path.map(Path::to_path_buf),
+        active_session: worker_state.as_ref().is_some_and(worker_state_is_active),
+        session_id: worker_state.and_then(|snapshot| snapshot.session_id),
         loaded_config_files: runtime_config.loaded_entries().len(),
         discovered_config_files,
         memory_file_count: project_context.instruction_files.len(),
@@ -5005,6 +5083,20 @@ fn status_context(
     })
 }
 
+fn read_worker_state_snapshot(cwd: &Path) -> Option<WorkerStateSnapshot> {
+    let state_path = cwd.join(".claw").join("worker-state.json");
+    let raw = fs::read_to_string(state_path).ok()?;
+    serde_json::from_str(&raw).ok()
+}
+
+fn worker_state_is_active(snapshot: &WorkerStateSnapshot) -> bool {
+    snapshot.prompt_in_flight
+        || matches!(
+            snapshot.status.as_deref(),
+            Some("spawning" | "trust_required" | "ready_for_prompt" | "running")
+        )
+}
+
 fn format_status_report(
     model: &str,
     usage: StatusUsage,
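The activity predicate above can be tried without the JSON layer. This sketch keeps `worker_state_is_active` as it appears in the hunk but replaces the serde-derived struct with a plain one so it builds with the standard library alone. The status string `"finished"` is a made-up example of a value outside the active set, not one confirmed by the source.

```rust
// Plain stand-in for WorkerStateSnapshot (the real one derives Deserialize).
#[derive(Debug, Clone, Default)]
struct WorkerStateSnapshot {
    status: Option<String>,
    session_id: Option<String>,
    prompt_in_flight: bool,
}

// Same predicate as in the hunk: an in-flight prompt or one of four
// lifecycle strings marks the worker as active.
fn worker_state_is_active(snapshot: &WorkerStateSnapshot) -> bool {
    snapshot.prompt_in_flight
        || matches!(
            snapshot.status.as_deref(),
            Some("spawning" | "trust_required" | "ready_for_prompt" | "running")
        )
}

fn main() {
    // A missing or unparsable state file is modeled as None, as in
    // read_worker_state_snapshot; is_some_and then yields false.
    let missing: Option<WorkerStateSnapshot> = None;
    println!("missing file -> active: {}", missing.as_ref().is_some_and(worker_state_is_active));

    let running = Some(WorkerStateSnapshot {
        status: Some("running".to_string()),
        session_id: Some("boot-fixture-123".to_string()),
        prompt_in_flight: true,
    });
    println!(
        "running -> active: {}, session_id: {:?}",
        running.as_ref().is_some_and(worker_state_is_active),
        running.and_then(|snapshot| snapshot.session_id)
    );
}
```

Because `read_worker_state_snapshot` maps both I/O and parse failures to `None`, a corrupt `worker-state.json` is indistinguishable from an absent one and simply reports an idle session.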
@@ -5058,7 +5150,7 @@ fn format_status_report(
         context.git_summary.unstaged_files,
         context.git_summary.untracked_files,
         context.session_path.as_ref().map_or_else(
-            || "live-repl".to_string(),
+            || format_active_session(context),
             |path| path.display().to_string()
         ),
         context.loaded_config_files,
@@ -5074,6 +5166,17 @@ fn format_status_report(
     )
 }
 
+fn format_active_session(context: &StatusContext) -> String {
+    if context.active_session {
+        match context.session_id.as_deref() {
+            Some(session_id) => format!("active ({session_id})"),
+            None => "active".to_string(),
+        }
+    } else {
+        "idle".to_string()
+    }
+}
+
 fn format_sandbox_report(status: &runtime::SandboxStatus) -> String {
     format!(
         "Sandbox
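Taken together with the `map_or_else` change in the previous hunk, the text shown on the `Session` line depends on three fields. The sketch below reproduces that selection with a trimmed `StatusContext`. `session_label` is a hypothetical wrapper name for the `map_or_else` expression, which in the source sits inline in `format_status_report`.

```rust
use std::path::PathBuf;

// Trimmed stand-in holding only the fields that drive the Session line.
struct StatusContext {
    session_path: Option<PathBuf>,
    active_session: bool,
    session_id: Option<String>,
}

// Same logic as the function added in the hunk.
fn format_active_session(context: &StatusContext) -> String {
    if context.active_session {
        match context.session_id.as_deref() {
            Some(session_id) => format!("active ({session_id})"),
            None => "active".to_string(),
        }
    } else {
        "idle".to_string()
    }
}

// Hypothetical wrapper around the inline expression: an explicit session
// path takes priority; worker state is consulted only when there is none.
fn session_label(context: &StatusContext) -> String {
    context.session_path.as_ref().map_or_else(
        || format_active_session(context),
        |path| path.display().to_string(),
    )
}

fn main() {
    let context = StatusContext {
        session_path: None,
        active_session: true,
        session_id: Some("boot-fixture-456".to_string()),
    };
    println!("Session {}", session_label(&context));
}
```

This ordering is consistent with the tests in this diff: the unit test that sets `session_path` still expects `session.jsonl` on the Session line even though `active_session` is true, while the integration test without a session path expects `active (boot-fixture-456)`.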
@@ -5181,28 +5284,32 @@ fn sandbox_json_value(status: &runtime::SandboxStatus) -> serde_json::Value {
 fn render_help_topic(topic: LocalHelpTopic) -> String {
     match topic {
         LocalHelpTopic::Status => "Status
-  Usage claw status
+  Usage claw status [--output-format <format>]
   Purpose show the local workspace snapshot without entering the REPL
   Output model, permissions, git state, config files, and sandbox status
+  Formats text (default), json
   Related /status · claw --resume latest /status"
             .to_string(),
         LocalHelpTopic::Sandbox => "Sandbox
-  Usage claw sandbox
+  Usage claw sandbox [--output-format <format>]
   Purpose inspect the resolved sandbox and isolation state for the current directory
   Output namespace, network, filesystem, and fallback details
+  Formats text (default), json
   Related /sandbox · claw status"
             .to_string(),
         LocalHelpTopic::Doctor => "Doctor
-  Usage claw doctor
+  Usage claw doctor [--output-format <format>]
   Purpose diagnose local auth, config, workspace, sandbox, and build metadata
   Output local-only health report; no provider request or session resume required
+  Formats text (default), json
   Related /doctor · claw --resume latest /doctor"
             .to_string(),
         LocalHelpTopic::Acp => "ACP / Zed
-  Usage claw acp [serve]
+  Usage claw acp [serve] [--output-format <format>]
   Aliases claw --acp · claw -acp
   Purpose explain the current editor-facing ACP/Zed launch contract without starting the runtime
   Status discoverability only; `serve` is a status alias and does not launch a daemon yet
+  Formats text (default), json
   Related ROADMAP #64a (discoverability) · ROADMAP #76 (real ACP support) · claw --help"
             .to_string(),
     }
@@ -8421,6 +8528,7 @@ mod tests {
             request_id: Some("req_jobdori_789".to_string()),
             body: String::new(),
             retryable: true,
+            suggested_action: None,
         };
 
         let rendered = format_user_visible_api_error("session-issue-22", &error);
@@ -8443,6 +8551,7 @@ mod tests {
                 request_id: Some("req_jobdori_790".to_string()),
                 body: String::new(),
                 retryable: true,
+                suggested_action: None,
             }),
         };
 
@@ -8506,6 +8615,7 @@ mod tests {
             request_id: Some("req_ctx_456".to_string()),
             body: String::new(),
             retryable: false,
+            suggested_action: None,
         };
 
         let rendered = format_user_visible_api_error("session-issue-32", &error);
@@ -8537,6 +8647,7 @@ mod tests {
                 request_id: Some("req_ctx_retry_789".to_string()),
                 body: String::new(),
                 retryable: false,
+                suggested_action: None,
             }),
         };
 
@@ -8879,7 +8990,7 @@ mod tests {
         let args = vec![
             "--output-format=json".to_string(),
             "--model".to_string(),
-            "claude-opus".to_string(),
+            "opus".to_string(),
             "explain".to_string(),
             "this".to_string(),
         ];
@@ -8887,7 +8998,7 @@ mod tests {
             parse_args(&args).expect("args should parse"),
             CliAction::Prompt {
                 prompt: "explain this".to_string(),
-                model: "claude-opus".to_string(),
+                model: "claude-opus-4-6".to_string(),
                 output_format: CliOutputFormat::Json,
                 allowed_tools: None,
                 permission_mode: PermissionMode::DangerFullAccess,
@@ -9657,15 +9768,21 @@ mod tests {
     fn multi_word_prompt_still_uses_shorthand_prompt_mode() {
         let _guard = env_lock();
         std::env::remove_var("RUSTY_CLAUDE_PERMISSION_MODE");
-        // Input is ["help", "me", "debug"] so the joined prompt shorthand
-        // must be "help me debug". A previous batch accidentally rewrote
-        // the expected string to "$help overview" (copy-paste slip).
+        // Input is ["--model", "opus", "please", "debug", "this"] so the joined
+        // prompt shorthand must stay a normal multi-word prompt while still
+        // honoring alias validation at parse time.
         assert_eq!(
-            parse_args(&["help".to_string(), "me".to_string(), "debug".to_string()])
-                .expect("prompt shorthand should still work"),
+            parse_args(&[
+                "--model".to_string(),
+                "opus".to_string(),
+                "please".to_string(),
+                "debug".to_string(),
+                "this".to_string(),
+            ])
+            .expect("prompt shorthand should still work"),
             CliAction::Prompt {
-                prompt: "help me debug".to_string(),
-                model: DEFAULT_MODEL.to_string(),
+                prompt: "please debug this".to_string(),
+                model: "claude-opus-4-6".to_string(),
                 output_format: CliOutputFormat::Text,
                 allowed_tools: None,
                 permission_mode: crate::default_permission_mode(),
@@ -10279,6 +10396,8 @@ mod tests {
             &super::StatusContext {
                 cwd: PathBuf::from("/tmp/project"),
                 session_path: Some(PathBuf::from("session.jsonl")),
+                active_session: true,
+                session_id: Some("boot-status-test".to_string()),
                 loaded_config_files: 2,
                 discovered_config_files: 3,
                 memory_file_count: 4,
@@ -10307,10 +10426,10 @@ mod tests {
            status.contains("Git state dirty · 3 files · 1 staged, 1 unstaged, 1 untracked")
        );
        assert!(status.contains("Changed files 3"));
        assert!(status.contains("Session session.jsonl"));
        assert!(status.contains("Staged 1"));
        assert!(status.contains("Unstaged 1"));
        assert!(status.contains("Untracked 1"));
        assert!(status.contains("Session session.jsonl"));
        assert!(status.contains("Config files loaded 2/3"));
        assert!(status.contains("Memory files 4"));
        assert!(status.contains("Suggested flow /status → /diff → /commit"));

@@ -39,6 +39,8 @@ fn status_and_sandbox_emit_json_when_requested() {
 
     let status = assert_json_command(&root, &["--output-format", "json", "status"]);
     assert_eq!(status["kind"], "status");
+    assert_eq!(status["active_session"], false);
+    assert!(status["session_id"].is_null());
     assert!(status["workspace"]["cwd"].as_str().is_some());
 
     let sandbox = assert_json_command(&root, &["--output-format", "json", "sandbox"]);
@@ -384,6 +386,47 @@ fn resumed_version_and_init_emit_structured_json_when_requested() {
     assert!(root.join("CLAUDE.md").exists());
 }
 
+#[test]
+fn status_json_surfaces_active_session_and_boot_session_id_from_worker_state() {
+    let root = unique_temp_dir("status-worker-state-json");
+    fs::create_dir_all(&root).expect("temp dir should exist");
+    write_worker_state_fixture(&root, "running", "boot-fixture-123");
+
+    let status = assert_json_command(&root, &["--output-format", "json", "status"]);
+    assert_eq!(status["kind"], "status");
+    assert_eq!(status["active_session"], true);
+    assert_eq!(status["session_id"], "boot-fixture-123");
+}
+
+#[test]
+fn status_text_surfaces_active_session_and_boot_session_id_from_worker_state() {
+    let root = unique_temp_dir("status-worker-state-text");
+    fs::create_dir_all(&root).expect("temp dir should exist");
+    write_worker_state_fixture(&root, "running", "boot-fixture-456");
+
+    let output = run_claw(&root, &["status"], &[]);
+    assert!(output.status.success());
+    let stdout = String::from_utf8_lossy(&output.stdout);
+    assert!(stdout.contains("Session active (boot-fixture-456)"));
+}
+
+#[test]
+fn worker_state_fixture_round_trips_session_id_across_status_surface() {
+    let root = unique_temp_dir("status-worker-state-roundtrip");
+    fs::create_dir_all(&root).expect("temp dir should exist");
+    let session_id = "boot-roundtrip-789";
+    write_worker_state_fixture(&root, "running", session_id);
+
+    let status = assert_json_command(&root, &["--output-format", "json", "status"]);
+    assert_eq!(status["active_session"], true);
+    assert_eq!(status["session_id"], session_id);
+
+    let raw = fs::read_to_string(root.join(".claw").join("worker-state.json"))
+        .expect("worker state should exist");
+    let state: Value = serde_json::from_str(&raw).expect("worker state should be valid json");
+    assert_eq!(state["session_id"], session_id);
+}
+
 fn assert_json_command(current_dir: &Path, args: &[&str]) -> Value {
     assert_json_command_with_env(current_dir, args, &[])
 }
@@ -431,6 +474,26 @@ fn write_upstream_fixture(root: &Path) -> PathBuf {
     upstream
 }
 
+fn write_worker_state_fixture(root: &Path, status: &str, session_id: &str) {
+    let claw_dir = root.join(".claw");
+    fs::create_dir_all(&claw_dir).expect("worker state dir should exist");
+    fs::write(
+        claw_dir.join("worker-state.json"),
+        serde_json::to_string_pretty(&serde_json::json!({
+            "worker_id": "worker-test",
+            "session_id": session_id,
+            "status": status,
+            "is_ready": status == "ready_for_prompt",
+            "trust_gate_cleared": false,
+            "prompt_in_flight": status == "running",
+            "updated_at": 1,
+            "seconds_since_update": 0
+        }))
+        .expect("worker state json should serialize"),
+    )
+    .expect("worker state fixture should write");
+}
+
 fn write_session_fixture(root: &Path, session_id: &str, user_text: Option<&str>) -> PathBuf {
     let session_path = root.join("session.jsonl");
     let mut session = Session::new()

@@ -11,8 +11,8 @@ use api::{
 use plugins::PluginTool;
 use reqwest::blocking::Client;
 use runtime::{
-    check_freshness, dedupe_superseded_commit_events, edit_file, execute_bash, glob_search,
-    grep_search, load_system_prompt,
+    check_freshness, current_boot_session_id, dedupe_superseded_commit_events, edit_file,
+    execute_bash, glob_search, grep_search, load_system_prompt,
     lsp_client::LspRegistry,
     mcp_tool_bridge::McpToolRegistry,
     permission_enforcer::{EnforcementResult, PermissionEnforcer},
@@ -3535,7 +3535,9 @@ where
         created_at: created_at.clone(),
         started_at: Some(created_at),
         completed_at: None,
-        lane_events: vec![LaneEvent::started(iso8601_now())],
+        lane_events: vec![
+            LaneEvent::started(iso8601_now()).with_session_id(current_boot_session_id())
+        ],
         current_blocker: None,
         derived_state: String::from("working"),
         error: None,
@@ -3744,6 +3746,11 @@ fn persist_agent_terminal_state(
     error: Option<String>,
 ) -> Result<(), String> {
     let blocker = error.as_deref().map(classify_lane_blocker);
+    let session_id = manifest
+        .lane_events
+        .last()
+        .and_then(|event| event.session_id.clone())
+        .unwrap_or_else(|| current_boot_session_id().to_string());
     append_agent_output(
         &manifest.output_file,
         &format_agent_terminal_output(status, result, blocker.as_ref(), error.as_deref()),
@@ -3758,26 +3765,31 @@ fn persist_agent_terminal_state(
     if let Some(blocker) = blocker {
         next_manifest
             .lane_events
-            .push(LaneEvent::blocked(iso8601_now(), &blocker));
+            .push(LaneEvent::blocked(iso8601_now(), &blocker).with_session_id(session_id.clone()));
         next_manifest
             .lane_events
-            .push(LaneEvent::failed(iso8601_now(), &blocker));
+            .push(LaneEvent::failed(iso8601_now(), &blocker).with_session_id(session_id.clone()));
     } else {
         next_manifest.current_blocker = None;
         let mut finished_summary = build_lane_finished_summary(&next_manifest, result);
         finished_summary.data.disabled_cron_ids = disable_matching_crons(&next_manifest, result);
         next_manifest.lane_events.push(
-            LaneEvent::finished(iso8601_now(), finished_summary.detail).with_data(
-                serde_json::to_value(&finished_summary.data)
-                    .expect("lane summary metadata should serialize"),
-            ),
+            LaneEvent::finished(iso8601_now(), finished_summary.detail)
+                .with_data(
+                    serde_json::to_value(&finished_summary.data)
+                        .expect("lane summary metadata should serialize"),
+                )
+                .with_session_id(session_id.clone()),
         );
         if let Some(provenance) = maybe_commit_provenance(result) {
-            next_manifest.lane_events.push(LaneEvent::commit_created(
-                iso8601_now(),
-                Some(format!("commit {}", provenance.commit)),
-                provenance,
-            ));
+            next_manifest.lane_events.push(
+                LaneEvent::commit_created(
+                    iso8601_now(),
+                    Some(format!("commit {}", provenance.commit)),
+                    provenance,
+                )
+                .with_session_id(session_id),
+            );
         }
     }
     write_agent_manifest(&next_manifest)
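The session-id handling in `persist_agent_terminal_state` follows an inherit-or-fall-back pattern: reuse the id on the most recent lane event, otherwise ask for the current boot session id. A reduced sketch of that pattern follows. The `LaneEvent` shape, the `impl Into<String>` builder signature, the `inherited_session_id` helper name, and the constant returned by `current_boot_session_id` are all assumptions made for illustration; only the `.last().and_then(..).unwrap_or_else(..)` chain is taken from the diff.

```rust
// Hypothetical, reduced event type; the real LaneEvent carries more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LaneEvent {
    event: String,
    session_id: Option<String>,
}

impl LaneEvent {
    fn new(event: &str) -> Self {
        Self { event: event.to_string(), session_id: None }
    }

    // Builder-style setter; the parameter type is an assumption.
    fn with_session_id(mut self, session_id: impl Into<String>) -> Self {
        self.session_id = Some(session_id.into());
        self
    }
}

// Placeholder for runtime::current_boot_session_id, whose real behavior
// is not shown in this diff.
fn current_boot_session_id() -> &'static str {
    "boot-fallback"
}

// Same chain as in the diff: only the last event is consulted.
fn inherited_session_id(events: &[LaneEvent]) -> String {
    events
        .last()
        .and_then(|event| event.session_id.clone())
        .unwrap_or_else(|| current_boot_session_id().to_string())
}

fn main() {
    let mut events = vec![LaneEvent::new("lane.started").with_session_id("boot-abc")];
    let session_id = inherited_session_id(&events);
    events.push(LaneEvent::new("lane.finished").with_session_id(session_id));
    println!("{events:?}");
}
```

Since only the last event is inspected, a manifest whose most recent event has no `session_id` falls back to the current boot id even if an earlier event carried one. In the flow shown here the started event always sets the id, so the terminal events inherit it, which is what the manifest tests below assert.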
@@ -4459,6 +4471,7 @@ fn classify_lane_blocker(error: &str) -> LaneEventBlocker {
     LaneEventBlocker {
         failure_class: classify_lane_failure(error),
         detail,
+        subphase: None,
     }
 }
 
@@ -7760,6 +7773,9 @@ mod tests {
         assert!(manifest_contents.contains("\"status\": \"running\""));
         assert_eq!(manifest_json["laneEvents"][0]["event"], "lane.started");
         assert_eq!(manifest_json["laneEvents"][0]["status"], "running");
+        assert!(manifest_json["laneEvents"][0]["session_id"]
+            .as_str()
+            .is_some());
         assert!(manifest_json["currentBlocker"].is_null());
         let captured_job = captured
             .lock()
@@ -7837,10 +7853,17 @@ mod tests {
             completed_manifest_json["laneEvents"][0]["event"],
             "lane.started"
         );
+        let session_id = completed_manifest_json["laneEvents"][0]["session_id"]
+            .as_str()
+            .expect("startup session_id should exist");
         assert_eq!(
             completed_manifest_json["laneEvents"][1]["event"],
             "lane.finished"
         );
+        assert_eq!(
+            completed_manifest_json["laneEvents"][1]["session_id"],
+            session_id
+        );
         assert_eq!(
             completed_manifest_json["laneEvents"][1]["data"]["qualityFloorApplied"],
             false
@@ -7853,6 +7876,10 @@ mod tests {
             completed_manifest_json["laneEvents"][2]["event"],
             "lane.commit.created"
         );
+        assert_eq!(
+            completed_manifest_json["laneEvents"][2]["session_id"],
+            session_id
+        );
         assert_eq!(
             completed_manifest_json["laneEvents"][2]["data"]["commit"],
             "abc1234"
@@ -9554,9 +9581,12 @@ printf 'pwsh:%s' "$1"
 
     #[test]
     fn run_task_packet_creates_packet_backed_task() {
+        use runtime::task_packet::TaskScope;
         let result = run_task_packet(TaskPacket {
             objective: "Ship packetized runtime task".to_string(),
-            scope: "runtime/task system".to_string(),
+            scope: TaskScope::Module,
+            scope_path: Some("runtime/task system".to_string()),
+            worktree: Some("/tmp/wt-packet".to_string()),
             repo: "claw-code-parity".to_string(),
             branch_policy: "origin/main only".to_string(),
             acceptance_tests: vec![