Providers

yottacode can use native provider adapters where useful and OpenAI-compatible endpoints everywhere else.

Required settings

At startup, yottacode needs:

model
base_url
api_key for remote providers that require API-key auth

The openai-auth and copilot providers are exceptions: they use OAuth flows and store tokens under ~/.yottacode/auth/.

You can provide these settings through flags, environment variables, or ~/.yottacode/config.toml.

Environment variables

export YOTTACODE_PROVIDER=openai
export YOTTACODE_MODEL=<your-model-id>
export YOTTACODE_BASE_URL=https://api.openai.com/v1
export YOTTACODE_API_KEY=sk-...

Flags override environment variables:

yottacode --model <your-model-id> --base-url https://api.openai.com/v1 --api-key sk-...
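
The precedence rule can be sketched in Python. This is a minimal illustration, not yottacode's implementation: the function name resolve_setting is hypothetical, and ranking config.toml below environment variables is an assumption (the text above only states that flags override environment variables).

```python
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    flags: Mapping[str, Optional[str]],
    env: Mapping[str, str],
    config: Mapping[str, str],
) -> Optional[str]:
    """Return the first non-empty value for `name`.

    Order: command-line flag, then YOTTACODE_<NAME> environment variable,
    then config file value. The env-before-config order is assumed.
    """
    flag_value = flags.get(name)
    if flag_value:
        return flag_value
    env_value = env.get("YOTTACODE_" + name.upper())
    if env_value:
        return env_value
    return config.get(name) or None
```

For example, with no --model flag, YOTTACODE_MODEL set in the environment, and a model in the config file, this sketch returns the environment value.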

Setup wizard

yottacode setup

The wizard writes provider profiles to ~/.yottacode/config.toml. Inside a running session you can also add, remove, or switch providers from the /provider picker — no restart needed (see Switch providers).

Provider profiles

Provider profiles live in ~/.yottacode/config.toml:

[active]
provider      = "openai"
default_model = "<your-model-id>"

[[providers]]
name          = "openai"
kind          = "openai"
base_url      = "https://api.openai.com/v1"
api_key_env   = "OPENAI_API_KEY"
default_model = "<your-model-id>"

Warning

Don’t put raw API keys in config.toml. Use api_key_env and set the secret in your shell environment or in ~/.yottacode/.env (the setup wizard does this for you).
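
The indirection behind api_key_env can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, the .env file is assumed to hold simple KEY=VALUE lines, and checking the shell environment before the .env file is an assumed order.

```python
import os
from typing import Mapping, Optional


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        # Drop one layer of surrounding quotes, if present.
        values[key] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(
    profile: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
    dotenv: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Look up the secret named by the profile's api_key_env field."""
    var_name = profile.get("api_key_env")
    if not var_name:
        return None  # e.g. a local provider that needs no key
    environ = os.environ if environ is None else environ
    return environ.get(var_name) or (dotenv or {}).get(var_name)
```

The profile stores only the variable name, so config.toml can be shared or committed without exposing the secret itself.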

Provider configuration

Pick your provider for a copy-paste configuration example and provider-specific notes:

Anthropic: Native Messages API adapter for Claude models.
OpenAI: API keys, Responses API routing, hosted tools.
ChatGPT OAuth
GitHub Copilot: Device-code auth against your Copilot plan.
Gemini: Native Google generative-language adapter.
xAI: Grok models with web and X search.
Ollama: Local models, no API key required.
OpenAI-compatible: Any gateway speaking the OpenAI wire protocol.
NVIDIA NIM: 100+ hosted models from build.nvidia.com.

/usage reports token usage for every provider and links to the billing dashboard for the paid cloud providers. It does not compute a dollar figure, because no provider exposes per-model pricing on the inference key. Ollama and NVIDIA NIM (openai-compatible pointed at integrate.api.nvidia.com) have no billing dashboard, so /usage shows token counts only. See cost.md.

Diagnostics

Inside the TUI:

/provider
/doctor

From the shell:

yottacode doctor
yottacode doctor --json

/provider shows the static, resolved configuration. /doctor actively probes the /models endpoint to check reachability, auth, and model visibility.
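
A /models probe of this kind can be approximated with the Python standard library. The sketch below is not the doctor implementation: the function names are hypothetical, it assumes an OpenAI-style GET {base_url}/models endpoint with Bearer auth and a JSON body of the form {"data": [{"id": ...}]}, and treating 401/403 as an auth failure is an assumption.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def models_url(base_url: str) -> str:
    """Join the base URL and the /models path without doubling slashes."""
    return base_url.rstrip("/") + "/models"


def classify(status: int, body: str, model: str) -> dict:
    """Map an HTTP status and body onto the three checks named above."""
    result = {
        "reachable": True,
        "auth_ok": status not in (401, 403),
        "model_visible": False,
    }
    if status == 200:
        try:
            entries = json.loads(body).get("data", [])
        except (ValueError, AttributeError):
            entries = []
        result["model_visible"] = any(
            isinstance(entry, dict) and entry.get("id") == model
            for entry in entries
        )
    return result


def probe(base_url: str, api_key: Optional[str], model: str, timeout: float = 10.0) -> dict:
    """Send one GET request and classify the outcome."""
    headers = {"Authorization": f"Bearer {api_key}"} if api_key else {}
    request = urllib.request.Request(models_url(base_url), headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(response.status, response.read().decode("utf-8"), model)
    except urllib.error.HTTPError as err:  # server answered with an error status
        return classify(err.code, "", model)
    except (urllib.error.URLError, OSError):  # DNS failure, refused connection, timeout
        return {"reachable": False, "auth_ok": False, "model_visible": False}
```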

Switch providers

In the TUI, use the /provider picker, or switch to a saved profile directly:

/provider use openai

From the shell:

yottacode provider list
yottacode provider use openai

Switching providers in an active session rebuilds the adapter and preserves the session history.

Image support

Image support varies by provider. Two capabilities matter:

Provider	Pasted images	`read_file` images
Anthropic	yes	yes
OpenAI	yes	no
ChatGPT OAuth (`openai-auth`)	yes	no
GitHub Copilot	yes	no
Gemini	yes	no
xAI	yes	no
Ollama	no	no
OpenAI-compatible (NVIDIA NIM, etc.)	no	no

Note

Pasted images: paste a screenshot path or file:/// URL in the input; the image is sent as a native content block the model can see. Providers marked “no” receive only the text marker (no image data), which avoids API errors from text-only models.

read_file images: read_file("photo.png") returns the image as a visual content block. Only Anthropic supports image blocks in tool results today, so other providers receive a text label with file metadata.
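
The table can be read as a simple capability lookup. The sketch below encodes it in Python; the dictionary keys are illustrative identifiers (only openai, openai-auth, copilot, and openai-compatible appear elsewhere on this page), the function name is hypothetical, and defaulting unknown providers to the text fallback is an assumption.

```python
# (pasted images, read_file images) per provider, copied from the table above.
IMAGE_SUPPORT = {
    "anthropic": (True, True),
    "openai": (True, False),
    "openai-auth": (True, False),
    "copilot": (True, False),
    "gemini": (True, False),
    "xai": (True, False),
    "ollama": (False, False),
    "openai-compatible": (False, False),
}


def image_delivery(provider: str, source: str) -> str:
    """Return "image_block" or "text_fallback" for source "paste" or "read_file"."""
    if source not in ("paste", "read_file"):
        raise ValueError(f"unknown image source: {source!r}")
    # Unknown providers are treated as text-only (assumed, conservative default).
    pasted, from_tool = IMAGE_SUPPORT.get(provider, (False, False))
    supported = pasted if source == "paste" else from_tool
    return "image_block" if supported else "text_fallback"
```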

Reasoning effort

Set how hard a reasoning-capable model thinks with --reasoning-effort (or YOTTACODE_REASONING_EFFORT) at launch, or /effort mid-session. The surface is uniform — default · low · medium · high — but each provider has a different underlying knob, so yottacode translates the level per provider:

Provider	Underlying knob	Notes
OpenAI (`gpt-5*`, `o1`/`o3`/`o4`)	`reasoning.effort` enum	`low`/`medium`/`high` map 1:1. Non-reasoning models (e.g. `gpt-4o`) ignore it.
ChatGPT OAuth (`openai-auth`)	`reasoning.effort` enum	Same as OpenAI, on the Codex backend.
Anthropic (Claude)	extended-thinking token budget	Enables thinking with a budget sized as a fraction of the model’s max-output tokens (low ≈ 25%, high ≈ 75%); `max_tokens` is raised to leave room for the answer. A model the catalog doesn’t know falls back to a conservative budget so effort still engages — refresh the catalog (`yotta-models refresh`) for the full model-scaled budget.
Gemini (2.5)	`thinkingConfig.thinkingBudget`	Enables thinking with a budget scaled per level, capped to the Gemini 2.5 family’s valid range.
xAI (Grok)	`reasoning_effort` enum	Only `grok-*-mini` accepts it (`low`/`high`; `medium` folds to `high`). `grok-4` reasons unconditionally and is left untouched.

Default is unchanged. When effort is unset, yottacode injects no reasoning parameter at all — every provider behaves exactly as it does without the setting. In particular, Anthropic and Gemini do not get extended thinking unless you ask for it (it costs extra tokens). /effort default (or off/none) returns to this state mid-session.
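
The per-provider translation can be sketched as one function that returns the request parameters to merge in. This is an illustration, not yottacode's code: the function name and provider identifiers are hypothetical, the Anthropic medium fraction, the fallback budget, the answer reserve, and the Gemini budgets are assumed placeholder values, and only the parameter names and the low/high behavior come from the table above.

```python
from fnmatch import fnmatchcase
from typing import Optional

UNSET = (None, "default", "off", "none")
LEVELS = ("low", "medium", "high")
OPENAI_REASONING_PREFIXES = ("gpt-5", "o1", "o3", "o4")
ANTHROPIC_FRACTION = {"low": 0.25, "medium": 0.50, "high": 0.75}  # medium is assumed
ANTHROPIC_FALLBACK_BUDGET = 2048  # assumed conservative budget
ANSWER_RESERVE = 4096  # assumed room left for the answer
GEMINI_BUDGET = {"low": 1024, "medium": 8192, "high": 24576}  # assumed values


def reasoning_params(
    provider: str, model: str, effort: Optional[str], max_output: Optional[int] = None
) -> dict:
    """Return extra request parameters for the given effort level."""
    if effort in UNSET:
        return {}  # unset effort injects nothing
    if effort not in LEVELS:
        raise ValueError(f"unknown effort level: {effort!r}")
    if provider in ("openai", "openai-auth"):
        if model.startswith(OPENAI_REASONING_PREFIXES):
            return {"reasoning": {"effort": effort}}
        return {}  # non-reasoning models ignore the setting
    if provider == "xai":
        if fnmatchcase(model, "grok-*-mini"):
            # medium folds to high; only low and high are sent
            return {"reasoning_effort": "low" if effort == "low" else "high"}
        return {}  # e.g. grok-4 is left untouched
    if provider == "anthropic":
        if max_output:
            budget = int(max_output * ANTHROPIC_FRACTION[effort])
        else:
            budget = ANTHROPIC_FALLBACK_BUDGET
        return {
            "thinking": {"type": "enabled", "budget_tokens": budget},
            "max_tokens": budget + ANSWER_RESERVE,
        }
    if provider == "gemini":
        return {"thinkingConfig": {"thinkingBudget": GEMINI_BUDGET[effort]}}
    return {}
```

Returning an empty dictionary for the unset case keeps the default path identical to a request built without the feature.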

No per-model table to maintain. Whether a model supports thinking, and how big a thinking budget to allow, come from the model catalog (Capabilities.Thinking and MaxOutput, fetched from each provider’s list-models endpoint via yotta-models refresh) — not a hand-maintained list. A model the catalog doesn’t describe still works: enum providers (OpenAI/xAI) are gated on model-name prefixes, and Anthropic/Gemini fall back to a conservative thinking budget. If the catalog explicitly marks a model as non-thinking, the effort is a no-op rather than an error (surfaced as a /effort hint).
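
For the budget-based providers (Anthropic and Gemini), the catalog-driven decision described above reduces to three outcomes. The sketch below is a hypothetical model of that decision: CatalogEntry and effort_decision are illustrative names, and the fields only mirror Capabilities.Thinking and MaxOutput rather than reproduce the real catalog types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogEntry:
    thinking: Optional[bool]   # mirrors Capabilities.Thinking; None when unknown
    max_output: Optional[int]  # mirrors MaxOutput; None when unknown


def effort_decision(entry: Optional[CatalogEntry]) -> str:
    """Return "noop", "fallback", or "scaled" for a requested effort level."""
    if entry is None:
        return "fallback"  # model not in the catalog: conservative budget
    if entry.thinking is False:
        return "noop"      # explicitly non-thinking: ignore effort, show a hint
    if not entry.max_output:
        return "fallback"  # no output limit known to scale against
    return "scaled"        # budget sized from the model's max output
```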