mirror of
https://github.com/instructkr/claude-code.git
synced 2026-06-04 11:36:44 +00:00
fix: route local OpenAI-compatible models
This commit is contained in:
20
USAGE.md
20
USAGE.md
@@ -298,6 +298,18 @@ cd rust
|
||||
./target/debug/claw --model "llama3.2" prompt "summarize this repository in one sentence"
|
||||
```
|
||||
|
||||
For Ollama tags with punctuation (for example `qwen2.5-coder:7b`), `OPENAI_BASE_URL` selects the local OpenAI-compatible route even when `OPENAI_API_KEY` is unset:
|
||||
|
||||
```bash
|
||||
export OPENAI_BASE_URL="http://127.0.0.1:11434/v1"
|
||||
unset OPENAI_API_KEY
|
||||
|
||||
cd rust
|
||||
./target/debug/claw --model "qwen2.5-coder:7b" prompt "reply with ready"
|
||||
```
|
||||
|
||||
If the local server exposes a slash-containing model ID, prefix it with `local/` so Claw selects the OpenAI-compatible transport while sending the remainder verbatim on the wire: `--model "local/Qwen/Qwen3.6-27B-FP8"`.
|
||||
|
||||
### OpenRouter
|
||||
|
||||
```bash
|
||||
@@ -340,7 +352,7 @@ Reasoning variants (`qwen-qwq-*`, `qwq-*`, `*-thinking`) automatically strip `te
|
||||
|
||||
The OpenAI-compatible backend also serves as the gateway for **OpenRouter**, **Ollama**, and any other service that speaks the OpenAI `/v1/chat/completions` wire format — just point `OPENAI_BASE_URL` at the service.
|
||||
|
||||
**Model-name prefix routing:** If a model name starts with `openai/`, `gpt-`, `qwen/`, `qwen-`, `kimi/`, or `kimi-`, the provider is selected by the prefix regardless of which env vars are set. This prevents accidental misrouting to Anthropic when multiple credentials exist in the environment. For the default OpenAI API, `openai/` is a routing prefix and is stripped before the request hits the wire. For a custom `OPENAI_BASE_URL`, slash-containing OpenAI-compatible slugs (for example OpenRouter-style `openai/gpt-4.1-mini`) are preserved so the gateway receives the model ID it expects.
|
||||
**Model-name prefix routing:** If a model name starts with `openai/`, `local/`, `gpt-`, `qwen/`, `qwen-`, `kimi/`, or `kimi-`, the provider is selected by the prefix regardless of which env vars are set. This prevents accidental misrouting to Anthropic when multiple credentials exist in the environment. For the default OpenAI API and local/private OpenAI-compatible endpoints, `openai/` is a routing prefix and is stripped before the request hits the wire. For non-local custom `OPENAI_BASE_URL` gateways, slash-containing OpenAI-compatible slugs (for example OpenRouter-style `openai/gpt-4.1-mini`) are preserved so the gateway receives the model ID it expects. The `local/` prefix is an explicit escape hatch for local slash-containing model IDs: it is stripped while the rest of the model ID is sent verbatim.
|
||||
|
||||
### Tested models and aliases
|
||||
|
||||
@@ -360,7 +372,7 @@ These are the models registered in the built-in alias table with known token lim
|
||||
| `gpt-4.1` / `gpt-4.1-mini` / `gpt-4.1-nano` | same | OpenAI-compatible | 32 768 | 1 047 576 |
|
||||
| `gpt-5.4` / `gpt-5.4-mini` / `gpt-5.4-nano` | same | OpenAI-compatible | 128 000 | 1 000 000 / 400 000 |
|
||||
|
||||
Any model name that does not match an alias is passed through verbatim after provider routing is resolved. This is how you use OpenRouter model slugs (`openai/gpt-4.1-mini` with a custom `OPENAI_BASE_URL`), Ollama tags (`llama3.2`), or full Anthropic model IDs (`claude-sonnet-4-20250514`).
|
||||
Any model name that does not match an alias is passed through verbatim after provider routing is resolved. This is how you use OpenRouter model slugs (`openai/gpt-4.1-mini` with a custom `OPENAI_BASE_URL`), Ollama tags (`llama3.2` or `qwen2.5-coder:7b`), slash-containing local IDs (`local/Qwen/Qwen3.6-27B-FP8`), or full Anthropic model IDs (`claude-sonnet-4-20250514`).
|
||||
|
||||
### User-defined aliases
|
||||
|
||||
@@ -382,9 +394,9 @@ Local project settings override user-level settings. Aliases resolve through the
|
||||
|
||||
1. If the resolved model name starts with `claude` → Anthropic.
|
||||
2. If it starts with `grok` → xAI.
|
||||
3. If it starts with `openai/` or `gpt-` → OpenAI-compatible.
|
||||
3. If it starts with `openai/`, `local/`, or `gpt-` → OpenAI-compatible.
|
||||
4. If it starts with `qwen/`, `qwen-`, `kimi/`, or `kimi-` → DashScope-compatible OpenAI wire format.
|
||||
5. If `OPENAI_BASE_URL` and `OPENAI_API_KEY` are set, unknown model names route to the OpenAI-compatible client for local/gateway servers.
|
||||
5. If `OPENAI_BASE_URL` is set, local-looking unknown model names such as `llama3.2` or `qwen2.5-coder:7b` route to the OpenAI-compatible client for local/gateway servers.
|
||||
6. Otherwise, `claw` checks which credential is set: Anthropic first, then OpenAI, then xAI. If only `OPENAI_BASE_URL` is set, it still routes to OpenAI-compatible for authless local servers.
|
||||
7. If nothing matches, it defaults to Anthropic.
|
||||
|
||||
|
||||
Reference in New Issue
Block a user