NVIDIA NIM
NVIDIA NIM (NVIDIA Inference Microservices) exposes 100+ hosted models β Llama,
DeepSeek, Nemotron, Qwen, and more β behind a single OpenAI-compatible API at
https://integrate.api.nvidia.com/v1. yottacode talks to it through the
openai-compatible provider.
Configure
You need an nvapi- API key first β see Register and get an API key below.
In the TUI β add or switch providers without restarting:
/provider # open the picker β Add a profile: kind=openai-compatible,
# base URL https://integrate.api.nvidia.com/v1, API key, model
/provider use openai-compatible # switch to a saved profile
/model meta/llama-3.3-70b-instructFrom the command line β set environment variables:
export YOTTACODE_PROVIDER=openai-compatible
export YOTTACODE_MODEL=meta/llama-3.3-70b-instruct
export YOTTACODE_BASE_URL=https://integrate.api.nvidia.com/v1
export YOTTACODE_API_KEY=nvapi-...β¦or pass flags at launch (they override the environment):
yottacode --provider openai-compatible \
--model meta/llama-3.3-70b-instruct \
--base-url https://integrate.api.nvidia.com/v1 \
--api-key nvapi-...NVIDIA’s endpoint speaks the standard OpenAI wire protocol (/v1/chat/completions
and /v1/models), so no native adapter is required.
Register and get an API key
You need a free NVIDIA account and an API key (it starts with nvapi-).
Create a free NVIDIA Developer account
Sign up at developer.nvidia.com. The NVIDIA Developer Program is free to join.
Open the API catalog
Go to build.nvidia.com and browse the model catalog. Each model has a playground plus a code panel showing the OpenAI-compatible request.
Generate your key
Pick any model and click Get API Key. Copy the generated key β it begins
with nvapi-. The same key works for every model in the catalog.
Configure yottacode
Set the key and base URL, then choose a model β see Configure
above. To save a reusable profile, run the setup wizard and add an
openai-compatible entry:
yottacode setupTip
New accounts get a pool of free inference credits (1,000 at the time of writing) and a modest request rate limit. These terms are set by NVIDIA and can change β check build.nvidia.com for current limits.
Finding model IDs
The model ID is the namespaced slug shown on each model’s page (the model
field in the code sample), for example:
meta/llama-3.3-70b-instructmeta/llama-3.1-8b-instructdeepseek-ai/deepseek-r1nvidia/llama-3.1-nemotron-70b-instruct
You can also list what your key can reach from inside yottacode:
/modelsor from the shell:
yottacode doctorNotes
- No billing dashboard. Like other
openai-compatibleendpoints pointed atintegrate.api.nvidia.com, NVIDIA NIM reports token counts only in/usageβ there is no per-model dollar figure. See Usage and cost. - Hosted tools. Provider-native hosted tools (web search, code interpreter) are not available over the OpenAI-compatible endpoint; the model can still use yottacode’s local tools.
- Self-hosted NIM. If you run NIM containers yourself, point
YOTTACODE_BASE_URLat your own/v1endpoint instead β the configuration is otherwise identical.