Commit Graph

119 Commits

Author SHA1 Message Date
bellman
a212c662e5 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:30:00 +09:00
bellman
2cac66cd38 Stabilize provider compatibility integration verification
Keep integrated G008 provider changes formatted and compile-ready so worker follow-up commits can merge against a clean leader baseline.

Constraint: G008 provider verification must pass before ultragoal checkpointing.
Confidence: high
Scope-risk: narrow
Directive: Keep provider compatibility follow-ups rebased on this formatted baseline before retrying failed cherry-picks.
Tested: cargo test --manifest-path rust/Cargo.toml -p api providers:: -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p api --test openai_compat_integration -- --nocapture --test-threads=1
Not-tested: full workspace clippy; known pre-existing runtime policy_engine LaneContext clippy warning remains outside this change.

Co-authored-by: OmX <omx@oh-my-codex.dev>
2026-05-15 10:28:50 +09:00
bellman
685f078204 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:23:37 +09:00
bellman
e4ef0f7f19 omx(team): auto-checkpoint worker-4 [unknown] 2026-05-15 10:22:03 +09:00
bellman
76581f7239 omx(team): auto-checkpoint worker-3 [unknown] 2026-05-15 10:21:58 +09:00
bellman
82ec223ed4 omx(team): auto-checkpoint worker-2 [unknown] 2026-05-15 10:21:55 +09:00
bellman
a6ca5c489b omx(team): auto-checkpoint worker-4 [unknown] 2026-05-15 10:21:28 +09:00
bellman
3ff8743e79 omx(team): auto-checkpoint worker-2 [unknown] 2026-05-15 10:21:23 +09:00
bellman
29029bfc14 omx(team): auto-checkpoint worker-1 [1] 2026-05-15 10:21:18 +09:00
YeonGyu-Kim
75c08bc982 fix: REPL display, /compact panic, identity leak, DeepSeek reasoning, thinking blocks
Five interrelated fixes from parallel Hephaestus sessions:

1. fix(repl): display assistant text after spinner (#2981, #2982, #2937)
   - Added final_assistant_text() call after run_turn spinner completes
   - REPL now shows response text like run_prompt_json does

2. fix(compact): handle Thinking content blocks (#2985)
   - Added ContentBlock::Thinking variant throughout compact summarizer
   - Prevents panic when /compact encounters thinking blocks

3. fix(prompt): provider-aware model identity (#2822)
   - New ModelFamilyIdentity enum (Claude vs Generic)
   - Non-Anthropic models no longer say 'I am Claude'
   - model_family_identity_for() detects provider and sets identity

4. fix(openai): preserve DeepSeek reasoning_content (#2821)
   - Stream parser now captures reasoning_content from OpenAI-compat
   - Emits ThinkingDelta/SignatureDelta events for reasoning models
   - Thinking blocks included in conversation history for re-send

5. feat(runtime): Thinking block support across codebase
   - AssistantEvent::Thinking variant in conversation.rs
   - ContentBlock::Thinking in session serialization
   - Thinking-aware compact summarization
   - Tests for thinking block ordering and content

Closes #2981, #2982, #2937, #2985, #2822, #2821
2026-05-06 15:32:34 +09:00
Andreas Haida
9a512633a5 Cap OpenAI default output tokens using model metadata 2026-05-03 22:16:12 +02:00
Andreas Haida
6ac13ffdad Handle OpenAI token-limit errors as context-window failures 2026-05-03 22:16:12 +02:00
Yeachan-Heo
74ea754d29 Restore Rust formatting compliance
Run rustfmt from the Rust workspace so CI format checks pass without changing behavior.

Constraint: Scope is formatting-only across tracked Rust files

Confidence: high

Scope-risk: narrow

Tested: cd rust && cargo fmt --check

Tested: git diff --check
2026-04-28 09:19:16 +00:00
Yeachan-Heo
00d0eb61d4 US-024: Add token limit metadata for kimi models
Add ModelTokenLimit entries for kimi-k2.5 and kimi-k1.5 to enable
preflight context window validation. Per Moonshot AI documentation:
- Context window: 256,000 tokens
- Max output: 16,384 tokens

Includes 3 unit tests:
- returns_context_window_metadata_for_kimi_models
- kimi_alias_resolves_to_kimi_k25_token_limits
- preflight_blocks_oversized_requests_for_kimi_models

All tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-17 04:15:38 +00:00
Yeachan-Heo
d037f9faa8 Fix strip_routing_prefix to handle kimi provider prefix (US-023)
Add "kimi" to the strip_routing_prefix matches so that models like
"kimi/kimi-k2.5" have their prefix stripped before sending to the
DashScope API (consistent with qwen/openai/xai/grok handling).

Also add unit test strip_routing_prefix_strips_kimi_provider_prefix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:50:15 +00:00
Yeachan-Heo
cec8d17ca8 Implement US-023: Add automatic routing for kimi models to DashScope
Changes in rust/crates/api/src/providers/mod.rs:
- Add 'kimi' alias to MODEL_REGISTRY resolving to 'kimi-k2.5' with DashScope config
- Add kimi/kimi- prefix routing to DashScope endpoint in metadata_for_model()
- Add resolve_model_alias() handling for kimi -> kimi-k2.5
- Add unit tests: kimi_prefix_routes_to_dashscope, kimi_alias_resolves_to_kimi_k2_5

Users can now use:
- --model kimi (resolves to kimi-k2.5)
- --model kimi-k2.5 (auto-routes to DashScope)
- --model kimi/kimi-k2.5 (explicit provider prefix)

All 127 tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:44:21 +00:00
Yeachan-Heo
4cb1db9faa Implement US-022: Enhanced error context for API failures
Add structured error context to API failures:
- Request ID tracking across retries with full context in error messages
- Provider-specific error code mapping with actionable suggestions
- Suggested user actions for common error types (401, 403, 413, 429, 500, 502-504)
- Added suggested_action field to ApiError::Api variant
- Updated enrich_bearer_auth_error to preserve suggested_action

Files changed:
- rust/crates/api/src/error.rs: Add suggested_action field, update Display
- rust/crates/api/src/providers/openai_compat.rs: Add suggested_action_for_status()
- rust/crates/api/src/providers/anthropic.rs: Update error handling

All tests pass, clippy clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 19:15:00 +00:00
Yeachan-Heo
5e65b33042 US-021: Add request body size pre-flight check for OpenAI-compatible provider 2026-04-16 17:41:57 +00:00
Yeachan-Heo
87b982ece5 US-011: Performance optimization for API request serialization
Added criterion benchmarks and optimized flatten_tool_result_content:
- Added criterion dev-dependency and request_building benchmark suite
- Optimized flatten_tool_result_content to pre-allocate capacity and avoid
  intermediate Vec construction (was collecting to Vec then joining)
- Made key functions public for benchmarking: translate_message,
  build_chat_completion_request, flatten_tool_result_content,
  is_reasoning_model, model_rejects_is_error_field

Benchmark results:
- flatten_tool_result_content/single_text: ~17ns
- translate_message/text_only: ~200ns
- build_chat_completion_request/10 messages: ~16.4µs
- is_reasoning_model detection: ~26-42ns

All 119 unit tests and 29 integration tests pass.
cargo clippy passes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 11:11:45 +00:00
Yeachan-Heo
3e4e1585b5 US-009: Add comprehensive unit tests for kimi model compatibility fix
Added 4 unit tests to verify is_error field handling for kimi models:
- model_rejects_is_error_field_detects_kimi_models: Detects kimi-k2.5, kimi-k1.5, dashscope/kimi-k2.5 (case insensitive)
- translate_message_includes_is_error_for_non_kimi_models: Verifies gpt-4o, grok-3, claude include is_error
- translate_message_excludes_is_error_for_kimi_models: Verifies kimi models exclude is_error (prevents 400 Bad Request)
- build_chat_completion_request_kimi_vs_non_kimi_tool_results: Full integration test for request building

All 119 unit tests and 29 integration tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-16 10:54:48 +00:00
Yeachan-Heo
124e8661ed Remove the deprecated Claude subscription login path and restore a green Rust workspace
ROADMAP #37 was still open even though several earlier backlog items were
already closed. This change removes the local login/logout surface, stops
startup auth resolution from treating saved OAuth credentials as a supported
path, and updates diagnostics/help to point users at ANTHROPIC_API_KEY or
ANTHROPIC_AUTH_TOKEN only.

While proving the change with the user-requested workspace gates, clippy
surfaced additional pre-existing warning failures across the Rust workspace.
Those were cleaned up in-place so the required `cargo fmt`, `cargo clippy
--workspace --all-targets -- -D warnings`, and `cargo test --workspace`
sequence now passes end to end.

Constraint: User explicitly required full-workspace fmt/clippy/test before commit/push
Constraint: Existing dirty leader worktree had to be stashed before attempted OMX team worktree launch
Rejected: Keep login/logout but hide them from help | left unsupported auth flow and saved OAuth fallback intact
Rejected: Stop after ROADMAP #37 targeted tests | did not satisfy required full-workspace verification gate
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Do not reintroduce saved OAuth as a silent Anthropic startup fallback without an explicit supported auth policy
Tested: cargo fmt --all --check; cargo clippy --workspace --all-targets -- -D warnings; cargo test --workspace
Not-tested: Remote push effects beyond origin/main update
2026-04-11 17:24:44 +00:00
YeonGyu-Kim
1ecdb1076c fix(api): OPENAI_BASE_URL wins over Anthropic fallback for unknown models
When OPENAI_BASE_URL is set, the user explicitly configured an
OpenAI-compatible endpoint (Ollama, LM Studio, vLLM, etc.). Model names
like 'qwen2.5-coder:7b' or 'llama3:latest' don't match any recognized
prefix, so detect_provider_kind() fell through to Anthropic — asking for
Anthropic credentials even though the user clearly intended a local
provider.

Now: OPENAI_BASE_URL + OPENAI_API_KEY beats Anthropic env-check in the
cascade. OPENAI_BASE_URL alone (no API key — common for Ollama) is a
last-resort fallback before the Anthropic default.

Source: MaxDerVerpeilte in #claw-code (Ollama + qwen2.5-coder:7b);
traced by gaebal-gajae.
2026-04-10 12:37:39 +09:00
YeonGyu-Kim
6ae8850d45 fix(api): silence dead_code warning and remove duplicated #[test] attr
- Add #[allow(dead_code)] on test-only Delta struct (content field
  used for deserialization but not read in assertion)
- Remove duplicated #[test] attribute on
  assistant_message_without_tool_calls_omits_tool_calls_field

Zero warnings in cargo test --workspace.
2026-04-10 07:33:22 +09:00
YeonGyu-Kim
a3d0c9e5e7 fix(api): sanitize orphaned tool messages at request-building layer
Adds sanitize_tool_message_pairing() called from build_chat_completion_request()
after translate_message() runs. Drops any role:"tool" message whose
immediately-preceding non-tool message is role:"assistant" but has no
tool_calls entry matching the tool_call_id.

This is the second layer of the tool-pairing invariant defense:
- 6e301c8: compaction boundary fix (producer layer)
- this commit: request-builder sanitizer (sender layer)

Together these close the 400-error loop for resumed/compacted multi-turn
tool sessions on OpenAI-compatible backends.

Sanitization only fires when preceding message is role:assistant (not
user/system) to avoid dropping valid translation artifacts from mixed
user-message content blocks.

Regression tests: sanitize_drops_orphaned_tool_messages covers valid pair,
orphaned tool (no tool_calls in preceding assistant), mismatched id, and
two tool results both referencing the same assistant turn.

116 api + 159 CLI + 431 runtime tests pass. Fmt clean.
2026-04-10 01:35:00 +09:00
YeonGyu-Kim
ed42f8f298 fix(api): surface provider error in SSE stream frames (companion to ff416ff)
Same fix as ff416ff but for the streaming path. Some backends embed an
error JSON object in an SSE data: frame:

  data: {"error":{"message":"context too long","code":400}}

parse_sse_frame() was attempting to deserialize this as ChatCompletionChunk
and failing with 'missing field' / 'invalid type', hiding the actual
backend error message.

Fix: check for an 'error' key before full chunk deserialization, same as
the non-streaming path in ff416ff. Symmetric pair:
- ff416ff: non-streaming path (response body)
- this:    streaming path (SSE data: frame)

115 api + 159 CLI tests pass. Fmt clean.
2026-04-09 23:03:33 +09:00
YeonGyu-Kim
ff416ff3e7 fix(api): surface provider error body before attempting completion parse
When a local/proxy OpenAI-compatible backend returns an error object:
  {"error":{"message":"...","type":"...","code":...}}

claw was trying to deserialize it as a ChatCompletionResponse and
failing with the cryptic 'failed to parse OpenAI response: missing
field id', completely hiding the actual backend error message.

Fix: before full deserialization, check if the parsed JSON has an
'error' key and promote it directly to ApiError::Api so the user
sees the real error (e.g. 'The number of tokens to keep from the
initial prompt is greater than the context length').

Source: devilayu in #claw-code 2026-04-09 — local LM Studio context
limit error was invisible; user saw 'missing field id' instead.
159 CLI + 115 api tests pass. Fmt clean.
2026-04-09 22:33:07 +09:00
YeonGyu-Kim
6ac7d8cd46 fix(api): omit tool_calls field from assistant messages when empty
When serializing a multi-turn conversation for the OpenAI-compatible path,
assistant messages with no tool calls were always emitting 'tool_calls: []'.
Some providers reject requests where a prior assistant turn carries an
explicit empty tool_calls array (400 on subsequent turns after a plain
text assistant response).

Fix: only include 'tool_calls' in the serialized assistant message when
the vec is non-empty. Empty case omits the field entirely.

This is a companion fix to fd7aade (null tool_calls in stream delta).
The two bugs are symmetric: fd7aade handled inbound null -> empty vec;
this handles outbound empty vec -> field omitted.

Two regression tests added:
- assistant_message_without_tool_calls_omits_tool_calls_field
- assistant_message_with_tool_calls_includes_tool_calls_field

115 api tests pass. Fmt clean.

Source: gaebal-gajae repro 2026-04-09 (400 on multi-turn, companion to
null tool_calls stream-delta fix).
2026-04-09 22:06:25 +09:00
YeonGyu-Kim
fd7aade5b5 fix(api): tolerate null tool_calls in OpenAI-compat stream delta chunks
Some OpenAI-compatible providers emit 'tool_calls: null' in streaming
delta chunks instead of omitting the field or using an empty array:

  "delta": {"content":"","function_call":null,"tool_calls":null}

serde's #[serde(default)] only handles absent keys — an explicit null
value still fails deserialization with:
  'invalid type: null, expected a sequence'

Fix: replace #[serde(default)] with a custom deserializer helper
deserialize_null_as_empty_vec() that maps null -> Vec::default(),
keeping the existing absent-key default behaviour.

Regression test added: delta_with_null_tool_calls_deserializes_as_empty_vec
uses the exact provider response shape from gaebal-gajae's repro (2026-04-09).

112 api lib tests pass. Fmt clean.

Companion to gaebal-gajae's local 448cf2c — independently reproduced
and landed on main.
2026-04-09 21:39:52 +09:00
YeonGyu-Kim
eb044f0a02 fix(api): emit max_completion_tokens for gpt-5* on OpenAI-compat path — closes ROADMAP #35
gpt-5.x models reject requests with max_tokens and require max_completion_tokens.
Detect wire model starting with 'gpt-5' and switch the JSON key accordingly.
Older models (gpt-4o etc.) continue to receive max_tokens unchanged.

Two regression tests added:
- gpt5_uses_max_completion_tokens_not_max_tokens
- non_gpt5_uses_max_tokens

140 api tests pass, cargo fmt clean.
2026-04-09 09:33:45 +09:00
Jobdori
e4c3871882 feat(api): add reasoning_effort field to MessageRequest and OpenAI-compat path
Users of OpenAI-compatible reasoning models (o4-mini, o3, deepseek-r1,
etc.) had no way to control reasoning effort — the field was missing from
MessageRequest and never emitted in the request body.

Changes:
- Add `reasoning_effort: Option<String>` to `MessageRequest` in types.rs
  - Annotated with skip_serializing_if = "Option::is_none" for clean JSON
  - Accepted values: "low", "medium", "high" (passed through verbatim)
- In `build_chat_completion_request`, emit `"reasoning_effort"` when set
- Two unit tests:
  - `reasoning_effort_is_included_when_set`: o4-mini + "high" → field present
  - `reasoning_effort_omitted_when_not_set`: gpt-4o, no field → absent

Existing callers use `..Default::default()` and are unaffected.
One struct-literal test that listed all fields explicitly updated with
`reasoning_effort: None`.

The CLI flag to expose this to users is a follow-up (ROADMAP #34 partial).
This commit lands the foundational API-layer plumbing needed for that.

Partial ROADMAP #34.
2026-04-09 04:02:59 +09:00
Jobdori
beb09df4b8 style(api): cargo fmt fix on normalize_object_schema test assertions 2026-04-09 03:43:59 +09:00
Jobdori
e7e0fd2dbf fix(api): strict object schema for OpenAI /responses endpoint
OpenAI /responses validates tool function schemas strictly:
- object types must have "properties" (at minimum {})
- "additionalProperties": false is required

/chat/completions is lenient and accepts schemas without these fields,
but /responses rejects them with "object schema missing properties" /
"invalid_function_parameters".

Add normalize_object_schema() which recursively walks the JSON Schema
tree and fills in missing "properties"/{} and "additionalProperties":false
on every object-type node. Existing values are not overwritten.

Call it in openai_tool_definition() before building the request payload
so both /chat/completions and /responses receive strict-validator-safe
schemas.

Add unit tests covering:
- bare object schema gets both fields injected
- nested object schemas are normalised recursively
- existing additionalProperties is not overwritten

Fixes the live repro where gpt-5.4 via OpenAI compat accepted connection
and routing but rejected every tool call with schema validation errors.

Closes ROADMAP #33.
2026-04-09 03:03:43 +09:00
Jobdori
adcea6bceb fix(api): route DashScope models to dashscope config, not openai
ProviderClient::from_model_with_anthropic_auth was dispatching every
ProviderKind::OpenAi match to OpenAiCompatConfig::openai(), which reads
OPENAI_API_KEY and points at api.openai.com. But DashScope models
(qwen-plus, qwen/qwen3-coder, etc.) also return ProviderKind::OpenAi
from detect_provider_kind because DashScope speaks the OpenAI wire
format. The metadata layer correctly identifies them as needing
DASHSCOPE_API_KEY and the DashScope compatible-mode endpoint, but that
metadata was being ignored at dispatch time.

Result: users running `claw --model qwen-plus` with DASHSCOPE_API_KEY
set would get a "missing OPENAI_API_KEY" error instead of being routed
to DashScope.

Fix: consult providers::metadata_for_model in the OpenAi dispatch arm
and pick dashscope() vs openai() based on metadata.auth_env.

Adds a regression test asserting ProviderClient::from_model("qwen-plus")
builds with the DashScope base URL. Exposes a pub base_url() accessor
on OpenAiCompatClient so the test can verify the routing.

Authored by droid (Kimi K2.5 Turbo) via acpx, cleaned up by Jobdori
(removed unsafe blocks unnecessary under edition 2021, imported
ProviderClient from super, adopted EnvVarGuard pattern from
providers/mod.rs tests).

Co-Authored-By: Droid <noreply@factory.ai>
2026-04-08 18:04:37 +09:00
YeonGyu-Kim
ff1df4c7ac fix(api): auth-provider error copy — prefix-routing hints + sk-ant-* bearer detection — closes ROADMAP #28
Two live users in #claw-code on 2026-04-08 hit adjacent auth confusion:
varleg set OPENAI_API_KEY for OpenRouter but prefix routing didn't
activate without openai/ model prefix, and stanley078852 put sk-ant-*
in ANTHROPIC_AUTH_TOKEN (Bearer path) instead of ANTHROPIC_API_KEY
(x-api-key path) and got 401 Invalid bearer token.

Changes:
1. ApiError::MissingCredentials gained optional hint field (error.rs)
2. anthropic_missing_credentials_hint() sniffs OPENAI/XAI/DASHSCOPE
   env vars and suggests prefix routing when present (providers/mod.rs)
3. All 4 Anthropic auth paths wire the hint helper (anthropic.rs)
4. 401 + sk-ant-* in bearer token detected and hint appended
5. 'Which env var goes where' section added to USAGE.md

Tests: unit tests for all three improvements (no HTTP calls needed).
Workspace: all tests green, fmt clean, clippy warnings-only.

Source: live users varleg + stanley078852 in #claw-code 2026-04-08.

Co-authored-by: gaebal-gajae <gaebal-gajae@layofflabs.com>
2026-04-08 16:29:03 +09:00
YeonGyu-Kim
8c6dfe57e6 fix(api): restore local preflight guard ahead of count_tokens round-trip
CI has been red since be561bf ('Use Anthropic count tokens for preflight')
because that commit replaced the free-function preflight_message_request
(byte-estimate guard) with an instance method that silently returns Ok on
any count_tokens failure:

    let counted_input_tokens = match self.count_tokens(request).await {
        Ok(count) => count,
        Err(_) => return Ok(()),  // <-- silent bypass
    };

Two consequences:

1. client_integration::send_message_blocks_oversized_requests_before_the_http_call
   has been FAILING on every CI run since be561bf. The mock server in that
   test only has one HTTP response queued (a bare '{}' to satisfy the main
   request), so the count_tokens POST parses into an empty body that fails
   to deserialize into CountTokensResponse -> Err -> silent bypass -> the
   oversized 600k-char request proceeds to the mock instead of being
   rejected with ContextWindowExceeded as the test expects.

2. In production, any third-party Anthropic-compatible gateway that doesn't
   implement /v1/messages/count_tokens (OpenRouter, Cloudflare AI Gateway,
   etc.) would silently disable the preflight guard entirely, letting
   oversized requests hit the upstream only to fail there with a provider-
   side context-window error. This is exactly the 'opaque failure surface'
   ROADMAP #22 asked us to avoid.

Fix: call the free-function super::preflight_message_request(request)? as
the first step in the instance method, before any network round-trip. This
guarantees the byte-estimate guard always fires, whether or not the remote
count_tokens endpoint is reachable. The count_tokens refinement still runs
afterward when available for more precise token counting, but it is now
strictly additive — it can only catch more cases, never silently skip the
guard.

Test results:
- cargo test -p api --lib: 89 passed, 0 failed
- cargo test --release -p api (all test binaries): 118 passed, 0 failed
- cargo test --release -p api --test client_integration \
    send_message_blocks_oversized_requests_before_the_http_call: passes
- cargo fmt --check: clean

This unblocks the Rust CI workflow which has been red on every push since
be561bf landed.
2026-04-08 14:34:38 +09:00
YeonGyu-Kim
3ac97e635e feat(api): add qwen/ prefix routing for Alibaba DashScope provider
Users in Discord #clawcode-get-help (web3g) asked for Qwen 3.6 Plus via
native Alibaba DashScope API instead of OpenRouter, which has stricter
rate limits. This commit adds first-class routing for qwen/ and bare
qwen- prefixed model names.

Changes:
- DEFAULT_DASHSCOPE_BASE_URL constant: /compatible-mode/v1 endpoint
- OpenAiCompatConfig::dashscope() factory mirroring openai()/xai()
- DASHSCOPE_ENV_VARS + credential_env_vars() wiring
- metadata_for_model: qwen/ and qwen- prefix routes to DashScope with
  auth_env=DASHSCOPE_API_KEY, reuses ProviderKind::OpenAi because
  DashScope speaks the OpenAI REST shape
- is_reasoning_model: detect qwen-qwq, qwq-*, and *-thinking variants
  so tuning params (temperature, top_p, etc.) get stripped before
  payload assembly (same pattern as o1/o3/grok-3-mini)

Tests added:
- providers::tests::qwen_prefix_routes_to_dashscope_not_anthropic
- openai_compat::tests::qwen_reasoning_variants_are_detected

89 api lib tests passing, 0 failing. cargo fmt --check: clean.

Closes the user-reported gap: 'use Qwen 3.6 Plus via Alibaba API
directly, not OpenRouter' without needing OPENAI_BASE_URL override
or unsetting ANTHROPIC_API_KEY.
2026-04-08 14:06:26 +09:00
YeonGyu-Kim
82baaf3f22 fix(ci): update integration test MessageRequest initializers for new tuning fields
openai_compat_integration.rs and client_integration.rs had MessageRequest
constructions without the new tuning param fields (temperature, top_p,
frequency_penalty, presence_penalty, stop) added in c667d47.

Added ..Default::default() to all 4 sites. cargo fmt applied.

This was the root cause of CI red on main (E0063 compile error in
integration tests, not caught by --lib tests).
2026-04-08 11:43:51 +09:00
YeonGyu-Kim
c7b3296ef6 style: cargo fmt — fix CI formatting failures
Pre-existing formatting issues in anthropic.rs surfaced by CI cargo fmt check.
No functional changes.
2026-04-08 11:21:13 +09:00
YeonGyu-Kim
523ce7474a fix(api): sanitize Anthropic body — strip frequency/presence_penalty, convert stop→stop_sequences
MessageRequest now carries OpenAI-compatible tuning params (c667d47), but
the Anthropic API does not support frequency_penalty or presence_penalty,
and uses 'stop_sequences' instead of 'stop'. Without this fix, setting
these params with a Claude model would produce 400 errors.

Changes to strip_unsupported_beta_body_fields:
- Remove frequency_penalty and presence_penalty from Anthropic request body
- Convert stop → stop_sequences (only when non-empty)
- temperature and top_p are preserved (Anthropic supports both)

Tests added:
- strip_removes_openai_only_fields_and_converts_stop
- strip_does_not_add_empty_stop_sequences

87 api lib tests passing, 0 failing.
cargo check --workspace: clean.
2026-04-08 09:05:10 +09:00
YeonGyu-Kim
b513d6e462 fix(api): sanitize tuning params for reasoning models (o1/o3/grok-3-mini)
Reasoning models reject temperature, top_p, frequency_penalty, and
presence_penalty with 400 errors. Instead of letting these flow through
and returning cryptic provider errors, strip them silently at the
request-builder boundary.

is_reasoning_model() classifies: o1*, o3*, o4*, grok-3-mini.
stop sequences are preserved (safe for all providers).

Tests added:
- reasoning_model_strips_tuning_params: o1-mini strips all 4 params, keeps stop
- grok_3_mini_is_reasoning_model: classification coverage for grok-3-mini, o1,
  o3-mini, and negative cases (gpt-4o, grok-3, claude)

85 api lib tests passing, 0 failing.
2026-04-08 07:32:47 +09:00
YeonGyu-Kim
c667d47c70 feat(api): add tuning params (temperature, top_p, penalties, stop) to MessageRequest
MessageRequest was missing standard OpenAI-compatible generation tuning
parameters. Callers had no way to control temperature, top_p,
frequency_penalty, presence_penalty, or stop sequences.

Changes:
- Added 5 optional fields to MessageRequest (all Option, None by default)
- Wired into build_chat_completion_request: only included in payload when set
- All existing construction sites updated with ..Default::default()
- MessageRequest now derives Default for ergonomic partial construction

Tests added:
- tuning_params_included_in_payload_when_set: all 5 params flow into JSON
- tuning_params_omitted_from_payload_when_none: absent params stay absent

83 api lib tests passing, 0 failing.
cargo check --workspace: 0 warnings.
2026-04-08 07:07:33 +09:00
YeonGyu-Kim
0530c509a3 fix(api): route openai/ and gpt- model prefixes to OpenAi provider
metadata_for_model returned None for unknown models like openai/gpt-4.1-mini,
causing detect_provider_kind to fall through to auth-sniffer order. If
ANTHROPIC_API_KEY was set, the model was silently misrouted to Anthropic
and the user got a confusing 'missing Anthropic credentials' error.

Fix: add explicit prefix checks for 'openai/' and 'gpt-' in
metadata_for_model so the model name wins over env-var presence.

Regression test added: openai_namespaced_model_routes_to_openai_not_anthropic
- 'openai/gpt-4.1-mini' routes to OpenAi
- 'gpt-4o' routes to OpenAi

Reported and reproduced by gaebal-gajae against current main.
81 api lib tests passing, 0 failing.
2026-04-08 05:33:47 +09:00
YeonGyu-Kim
b3ccd92d24 feat: b6-pdf-extract-v2 follow-up work — batch 6 2026-04-07 16:11:51 +09:00
YeonGyu-Kim
1f968b359f feat: b6-openai-models — batch 6 2026-04-07 15:52:30 +09:00
YeonGyu-Kim
5bcbc86a2b feat: b5-slash-help — batch 5 upstream parity 2026-04-07 14:51:27 +09:00
YeonGyu-Kim
6a6c5acb02 feat: b5-reasoning-guard — batch 5 upstream parity 2026-04-07 14:51:27 +09:00
YeonGyu-Kim
f982f24926 fix(api): Windows env hint + .env file loading fallback
When API key missing on Windows, hint about setx. Load .env from CWD
as fallback with simple key=value parser.
2026-04-07 14:22:41 +09:00
YeonGyu-Kim
2a642871ad fix(api): enrich JSON parse errors with response body, provider, and model
Raw 'json_error: no field X' now includes truncated response body,
provider name, and model ID for debugging context.
2026-04-07 14:22:05 +09:00
YeonGyu-Kim
ce360e0ff3 fix(api): strip anthropic beta fields from non-beta requests
mikejiang: 'betas: Extra inputs are not permitted' 400 error.
Only include beta headers when request targets beta endpoint.
2026-04-07 14:22:05 +09:00
YeonGyu-Kim
ce22d8fb4f fix(api): add serde(default) to all usage/token parse paths in SSE stream
Sterling reported 'json_error: no field input/input_tokens' still firing
despite existing serde(default) in types.rs. Root cause: SSE streaming
path had a separate deserialization site that didn't use the same defaults.

- Add serde(default) to sse.rs UsageEvent deserialization
- Add serde(default) to types.rs Usage struct fields (input_tokens, output_tokens)
- Add regression test with empty-usage JSON response in streaming context
2026-04-07 13:44:22 +09:00