Skip to content

Relay Setup

Relay is the preferred inference path for Hamr. Hamr expects an OpenAI-compatible API rooted at a base URL such as:

txt
http://127.0.0.1:1234/v1

The important endpoints are:

  • GET /models
  • POST /chat/completions

Local Relay Profile

Use .hamr.toml:

toml
[active]
provider = "relay"
model = "Qwen3.6-35B-A3B-UD-IQ3_XXS.gguf"
thinking = "off"

[providers.relay]
enabled = true
name = "Relay"
compatibility = "openai-compatible"
base_url = "http://127.0.0.1:1234/v1"

[[providers.relay.models]]
id = "Qwen3.6-35B-A3B-UD-IQ3_XXS.gguf"
display_name = "Qwen3.6 35B"
context_window = 131072
supports_thinking = false

Set model to the exact ID Relay reports from /models.

Provider Presets

Hamr normalizes these provider presets:

PresetUse
relayDefault local Relay endpoint at 127.0.0.1:1234
deepseekDeepSeek API
openaiOpenAI API
anthropicAnthropic API (anthropic-compatible format)
openrouterOpenRouter API
customAny custom OpenAI-compatible server

Normal local use should not require cloud-hosted APIs.

Headers And Keys

Use api_key_env to reference an environment variable holding your API key:

toml
[providers.deepseek]
api_key_env = "DEEPSEEK_API_KEY"

For endpoints requiring custom headers, use the [providers.<id>.headers] section:

toml
[providers.relay.headers]
"CF-Access-Client-Id" = "your-id.access"
"CF-Access-Client-Secret" = "your-secret"

Tool-Call Parsing

For local models emitting non-standard tool calls, set tool_call_parser in .hamr.toml:

toml
[providers.relay]
tool_call_parser = "qwen3_xml"

Hamr auto-detects the parser from your model name when not set explicitly. See Tool-Call Parsing for the full parser matrix.

Verification

Hamr runs your verification command after completing tasks in run mode. Set it in .hamr.toml:

toml
[verification]
defaultCommand = "bun run typecheck"

Testing Relay

Once Relay is running:

sh
hamr doctor --full

This probes /models and sends a small /chat/completions request to confirm end-to-end connectivity.

Skaft Software · MIT License