mirror of
https://github.com/instructkr/claude-code.git
synced 2026-06-06 20:26:45 +00:00
fix: route local OpenAI-compatible models
This commit is contained in:
@@ -13,7 +13,7 @@ If you need the most polished daily-driver experience for a specific non-Claude
|
||||
|
||||
## OpenAI-compatible routing basics
|
||||
|
||||
Set `OPENAI_BASE_URL` to the server’s `/v1` endpoint and set `OPENAI_API_KEY` to either the required token or a harmless placeholder for local servers that expect an Authorization header. The model name must match what the server exposes.
|
||||
Set `OPENAI_BASE_URL` to the server’s `/v1` endpoint and set `OPENAI_API_KEY` to either the required token or a harmless placeholder for local servers that expect an Authorization header. Authless local/private OpenAI-compatible servers can leave `OPENAI_API_KEY` unset. The model name must match what the server exposes.
|
||||
|
||||
```bash
|
||||
export OPENAI_BASE_URL="http://127.0.0.1:11434/v1"
|
||||
@@ -24,8 +24,8 @@ claw --model "qwen3:latest" prompt "Reply exactly HELLO_WORLD_123"
|
||||
Routing notes:
|
||||
|
||||
- Use the `openai/` prefix for OpenAI-compatible gateways when you need prefix routing to win over ambient Anthropic credentials, for example `--model "openai/gpt-4.1-mini"` with OpenRouter.
|
||||
- For local servers, prefer the exact model ID reported by the server (`qwen3:latest`, `llama3.2`, `Qwen/Qwen2.5-Coder-7B-Instruct`, etc.). If your local gateway exposes slash-containing IDs, use that exact slug.
|
||||
- If you have multiple provider keys in your environment, remove unrelated keys while smoke-testing a local route or choose a model prefix that unambiguously selects the intended provider.
|
||||
- For local servers, prefer the exact model ID reported by the server (`qwen3:latest`, `llama3.2`, etc.). If your local gateway exposes slash-containing IDs, prefix the exact slug with `local/` so Claw routes through OpenAI-compatible transport while sending the rest verbatim, for example `--model "local/Qwen/Qwen2.5-Coder-7B-Instruct"`.
|
||||
- If you have multiple provider keys in your environment, `OPENAI_BASE_URL` plus local-looking tags such as `llama3.2` or `qwen2.5-coder:7b` selects the local OpenAI-compatible route; use `local/` for slash-containing local IDs.
|
||||
- Tool workflows need model/server support for OpenAI-compatible tool calls. Plain prompt smoke tests can pass even when slash/tool workflows still fail because the server returns an incompatible tool-call shape.
|
||||
|
||||
## Raw `/v1/chat/completions` smoke test
|
||||
@@ -58,11 +58,11 @@ In another shell:
|
||||
|
||||
```bash
|
||||
export OPENAI_BASE_URL="http://127.0.0.1:11434/v1"
|
||||
export OPENAI_API_KEY="local-dev-token"
|
||||
unset OPENAI_API_KEY
|
||||
claw --model "qwen3:latest" prompt "Reply exactly HELLO_WORLD_123"
|
||||
```
|
||||
|
||||
If Ollama is running without auth and your build accepts authless local OpenAI-compatible servers, `unset OPENAI_API_KEY` is also acceptable. Use a placeholder token rather than a real cloud API key for local testing.
|
||||
If Ollama is running without auth, `unset OPENAI_API_KEY` is acceptable. Use a placeholder token rather than a real cloud API key if your local server requires an Authorization header.
|
||||
|
||||
## llama.cpp server
|
||||
|
||||
|
||||
Reference in New Issue
Block a user