Configuring providers
yottacode can use native provider adapters where useful and OpenAI-compatible endpoints everywhere else.
Required settings
At startup, yottacode needs:
- model
- base_url
- api_key, for remote providers that require API-key auth
The openai-auth and copilot providers are exceptions: they use OAuth flows and store tokens under ~/.yottacode/auth/.
You can provide these settings through flags, environment variables, or ~/.yottacode/config.toml.
Environment variables
export YOTTACODE_PROVIDER=openai
export YOTTACODE_MODEL=<your-model-id>
export YOTTACODE_BASE_URL=https://api.openai.com/v1
export YOTTACODE_API_KEY=sk-...
Flags override environment variables:
yottacode --model <your-model-id> --base-url https://api.openai.com/v1 --api-key sk-...
Setup wizard
yottacode setup
The wizard writes provider profiles to ~/.yottacode/config.toml.
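The override order described above can be sketched in a few lines of Python. This is an illustration, not yottacode's actual resolver: the function name `resolve_settings` and the dictionary shapes are hypothetical, and placing the config file last is an assumption (this page only states that flags override environment variables).

```python
# Setting names mapped to the environment variables shown on this page.
ENV_VARS = {
    "provider": "YOTTACODE_PROVIDER",
    "model": "YOTTACODE_MODEL",
    "base_url": "YOTTACODE_BASE_URL",
    "api_key": "YOTTACODE_API_KEY",
}


def resolve_settings(flags, env, file_config):
    """Pick each setting from the first source that defines it.

    Order: flags, then environment, then config file. Flag-over-env is
    documented; treating the config file as the last fallback is an assumption.
    """
    resolved = {}
    for name, env_name in ENV_VARS.items():
        candidates = (flags.get(name), env.get(env_name), file_config.get(name))
        for value in candidates:
            if value:  # missing and empty values both count as "not set"
                resolved[name] = value
                break
    return resolved


settings = resolve_settings(
    flags={"model": "model-from-flag"},
    env={"YOTTACODE_MODEL": "model-from-env"},
    file_config={"provider": "openai"},
)
print(settings["model"])
```

In this sketch the flag value wins because it is checked first; removing the `flags` entry would let the environment value through.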
Provider profiles
Provider profiles live in ~/.yottacode/config.toml:
[active]
provider = "openai"
default_model = "<your-model-id>"
[[providers]]
name = "openai"
kind = "openai"
base_url = "https://api.openai.com/v1"
api_key_env = "OPENAI_API_KEY"
default_model = "<your-model-id>"
Warning
Don’t put raw API keys in config.toml. Use api_key_env and set the secret in your shell environment or in ~/.yottacode/.env (the setup wizard does this for you).
Provider configuration
export YOTTACODE_PROVIDER=openai
export YOTTACODE_MODEL=<your-model-id>
export YOTTACODE_BASE_URL=https://api.openai.com/v1
export YOTTACODE_API_KEY=sk-...
OpenAI reasoning models such as o1*, o3*, o4*, and gpt-5* are routed to the Responses API automatically when appropriate.
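The model-name patterns in the sentence above can be read as shell-style globs. A rough Python sketch of that check follows; the function name `matches_reasoning_pattern` is hypothetical, and the sketch covers only the name match, not whatever additional conditions "when appropriate" implies.

```python
from fnmatch import fnmatchcase

# Patterns copied from the sentence above.
REASONING_PATTERNS = ("o1*", "o3*", "o4*", "gpt-5*")


def matches_reasoning_pattern(model):
    """Return True if the model ID matches one of the listed reasoning-model globs."""
    return any(fnmatchcase(model, pattern) for pattern in REASONING_PATTERNS)


for model_id in ("o3-mini", "gpt-5", "gpt-4o"):
    print(model_id, matches_reasoning_pattern(model_id))
```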
Hosted tools:
- web_search is enabled by default
- code_interpreter can be enabled with YOTTACODE_ENABLE_CODE_INTERPRETER=1
Disable default hosted web search:
export YOTTACODE_DISABLE_WEB_SEARCH=1
/usage reports token usage for every provider and links the billing dashboard for the paid cloud ones; it does not compute a dollar figure (no provider exposes per-model pricing on the inference key). Ollama and NVIDIA NIM (openai-compatible pointed at integrate.api.nvidia.com) have no billing dashboard, so they report token counts only. See cost.md.
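The hosted-tool defaults described earlier in this section (web_search on unless disabled, code_interpreter off unless enabled) can be summarized in a short Python sketch. The function name `hosted_tools` is hypothetical, and treating only the literal value `1` as "set" is an assumption based on the examples on this page.

```python
def hosted_tools(env):
    """Return the hosted tools to request, based on the two documented variables."""
    tools = []
    # web_search is on by default; YOTTACODE_DISABLE_WEB_SEARCH=1 turns it off.
    if env.get("YOTTACODE_DISABLE_WEB_SEARCH") != "1":
        tools.append("web_search")
    # code_interpreter is off by default; YOTTACODE_ENABLE_CODE_INTERPRETER=1 turns it on.
    if env.get("YOTTACODE_ENABLE_CODE_INTERPRETER") == "1":
        tools.append("code_interpreter")
    return tools


print(hosted_tools({}))
print(hosted_tools({"YOTTACODE_DISABLE_WEB_SEARCH": "1"}))
```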
Diagnostics
Inside the TUI:
/provider
/doctor
From the shell:
yottacode doctor
yottacode doctor --json
/provider shows static resolved config. /doctor performs an active /models probe for endpoint reachability, auth, and model visibility.
Switch providers
yottacode provider list
yottacode provider use openai
In the TUI, use the provider picker or:
/provider use openai
Switching provider in an active session rebuilds the adapter while preserving the session history.
Image support
Image support varies by provider. Two capabilities matter:
| Provider | Pasted images | read_file images |
|---|---|---|
| Anthropic | yes | yes |
| OpenAI | yes | no |
| ChatGPT OAuth (openai-auth) | yes | no |
| GitHub Copilot | yes | no |
| Gemini | yes | no |
| xAI | yes | no |
| Ollama | no | no |
| OpenAI-compatible (NVIDIA NIM, etc.) | no | no |
Note
Pasted images: paste a screenshot path or file:/// URL in the input; the image is sent as a native content block the model can see. Providers marked “no” receive only the text marker (no image data), which avoids API errors from text-only models.
read_file images: read_file("photo.png") returns the image as a visual content block. Only Anthropic supports image blocks in tool results today, so other providers receive a text label with file metadata.
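The table and note above amount to a per-provider capability lookup with a text fallback. The Python sketch below illustrates that idea only: the provider keys other than openai, openai-auth, copilot, and openai-compatible are assumed identifiers, and the returned block shapes and the `image_block_or_marker` name are hypothetical rather than yottacode's real message format.

```python
# Capability table transcribed from this section: (pasted images, read_file images).
IMAGE_SUPPORT = {
    "anthropic": (True, True),
    "openai": (True, False),
    "openai-auth": (True, False),
    "copilot": (True, False),
    "gemini": (True, False),
    "xai": (True, False),
    "ollama": (False, False),
    "openai-compatible": (False, False),
}


def image_block_or_marker(provider, path, source):
    """Return an image block if the provider supports it, else a text marker.

    `source` is "pasted" or "read_file". Unknown providers get the text fallback.
    """
    pasted_ok, read_file_ok = IMAGE_SUPPORT.get(provider, (False, False))
    supported = pasted_ok if source == "pasted" else read_file_ok
    if supported:
        return {"type": "image", "path": path}
    # Text-only fallback: no image data is sent, which avoids API errors.
    return {"type": "text", "text": f"[image: {path}]"}


print(image_block_or_marker("anthropic", "photo.png", "read_file"))
print(image_block_or_marker("openai", "photo.png", "read_file"))
```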
Reasoning effort
Set how hard a reasoning-capable model thinks with --reasoning-effort (or YOTTACODE_REASONING_EFFORT) at launch, or /effort mid-session. The surface is uniform (default · low · medium · high), but each provider has a different underlying knob, so yottacode translates the level per provider:
| Provider | Underlying knob | Notes |
|---|---|---|
| OpenAI (gpt-5*, o1/o3/o4) | reasoning.effort enum | low/medium/high map 1:1. Non-reasoning models (e.g. gpt-4o) ignore it. |
| ChatGPT OAuth (openai-auth) | reasoning.effort enum | Same as OpenAI, on the Codex backend. |
| Anthropic (Claude) | extended-thinking token budget | Enables thinking with a budget sized as a fraction of the model’s max-output tokens (low ≈ 25%, high ≈ 75%); max_tokens is raised to leave room for the answer. A model the catalog doesn’t know falls back to a conservative budget so effort still engages; refresh the catalog (yotta-models refresh) for the full model-scaled budget. |
| Gemini (2.5) | thinkingConfig.thinkingBudget | Enables thinking with a budget scaled per level, capped to the Gemini 2.5 family’s valid range. |
| xAI (Grok) | reasoning_effort enum | Only grok-*-mini accepts it (low/high; medium folds to high). grok-4 reasons unconditionally and is left untouched. |
Default is unchanged. When effort is unset, yottacode injects no reasoning parameter at all, so every provider behaves exactly as it does without the setting. In particular, Anthropic and Gemini do not get extended thinking unless you ask for it (it costs extra tokens). /effort default (or off/none) returns to this state mid-session.
No per-model table to maintain. Whether a model supports thinking, and how big a thinking budget to allow, come from the model catalog (Capabilities.Thinking and MaxOutput, fetched from each provider’s list-models endpoint via yotta-models refresh), not from a hand-maintained list. A model the catalog doesn’t describe still works: enum providers (OpenAI/xAI) are gated on model-name prefixes, and Anthropic/Gemini fall back to a conservative thinking budget. If the catalog explicitly marks a model as non-thinking, the effort is a no-op rather than an error (surfaced as a /effort hint).
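The per-provider translation described in this section can be sketched roughly as follows. This is an illustration under stated assumptions, not yottacode's implementation: the function name `reasoning_params`, the provider keys `anthropic` and `xai`, the 50% fraction for medium, and the `FALLBACK_BUDGET` value are all hypothetical. Gemini is omitted because this page does not give its per-level budgets, and the max_tokens adjustment for Anthropic is left out for brevity.

```python
from fnmatch import fnmatchcase

# Hypothetical "conservative" budget for models the catalog does not describe.
FALLBACK_BUDGET = 1024

# low and high fractions come from the table above; medium = 50% is an assumption.
ANTHROPIC_FRACTIONS = {"low": 0.25, "medium": 0.50, "high": 0.75}


def reasoning_params(provider, model, effort, max_output=None, thinking=None):
    """Translate a uniform effort level into provider-specific request parameters.

    Returns {} when nothing should be injected. `max_output` and `thinking`
    stand in for the catalog's MaxOutput and Capabilities.Thinking fields.
    """
    if effort in (None, "default", "off", "none"):
        return {}  # unset effort: inject no reasoning parameter at all
    if thinking is False:
        return {}  # catalog explicitly marks the model as non-thinking: no-op
    if provider in ("openai", "openai-auth"):
        if model.startswith(("gpt-5", "o1", "o3", "o4")):
            return {"reasoning": {"effort": effort}}  # levels map 1:1
        return {}
    if provider == "xai":
        if fnmatchcase(model, "grok-*-mini"):
            # Only low/high are accepted; medium folds to high.
            return {"reasoning_effort": "low" if effort == "low" else "high"}
        return {}
    if provider == "anthropic":
        if max_output:
            budget = int(max_output * ANTHROPIC_FRACTIONS[effort])
        else:
            budget = FALLBACK_BUDGET
        return {"thinking": {"type": "enabled", "budget_tokens": budget}}
    return {}


print(reasoning_params("openai", "gpt-5", "medium"))
print(reasoning_params("xai", "grok-3-mini", "medium"))
print(reasoning_params("anthropic", "example-claude-model", "low", max_output=64000))
```

Returning an empty dict for every "does not apply" case mirrors the no-op behavior described above: the caller can merge the result into a request without special-casing unsupported models.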