Local OpenAI-compatible provider.
Targets an on-device inference server (Ollama / LM Studio / llama.cpp)
that exposes the OpenAI /v1/chat/completions API. Unlike the cloud
openai provider, the endpoint is per-request (req.base_url,
sourced from the user’s local_ai_base_url setting) and authentication
is optional, since local servers typically ignore the bearer token.
Gated by allow_local_ai at the executor layer (not allow_cloud_ai):
a localhost call is not network egress, so it stays usable even when
cloud AI is disabled.
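The gating rule above can be illustrated with a minimal sketch. Only the setting names allow_local_ai and allow_cloud_ai come from this page; the `ProviderKind` enum, the `AiSettings` struct, and the `is_permitted` function are hypothetical names invented for illustration and may not match the real executor.

```rust
/// Hypothetical provider classification; not taken from the actual crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderKind {
    Cloud,
    Local,
}

/// Hypothetical settings holder. Only the two field names mirror the
/// settings mentioned in the module description.
pub struct AiSettings {
    pub allow_cloud_ai: bool,
    pub allow_local_ai: bool,
}

/// Each provider kind is checked against its own flag, so disabling
/// cloud AI has no effect on the local provider.
pub fn is_permitted(kind: ProviderKind, settings: &AiSettings) -> bool {
    match kind {
        ProviderKind::Cloud => settings.allow_cloud_ai,
        ProviderKind::Local => settings.allow_local_ai,
    }
}
```

Under this sketch, a configuration with allow_cloud_ai off and allow_local_ai on still permits requests to the local server, which matches the behavior described above.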
Structs§
Functions§
- local_chat_url - Resolve the chat-completions endpoint from any accepted base form.
- local_embeddings_url - Resolve the embeddings endpoint from any accepted base form.
- local_models_url - Resolve the model-listing endpoint from any accepted base form.
- normalize_local_base - Reduce a user-entered server address to its origin/base, tolerating both a bare http://127.0.0.1:1234 and a fully-qualified http://127.0.0.1:1234/v1/chat/completions. Trailing slashes and the known OpenAI-compat path suffixes are stripped so the result can have a fresh path appended.
- parse_model_list
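The normalization described for normalize_local_base might look roughly like the sketch below. The signatures, the `KNOWN_SUFFIXES` list, and the choice of `/v1/chat/completions` as the appended path are assumptions based on the descriptions on this page, not the crate's actual source; the real functions may accept more forms or return a different type.

```rust
/// Assumed set of OpenAI-compat path suffixes, longest first so that a
/// full endpoint path is matched before the bare "/v1" prefix.
const KNOWN_SUFFIXES: [&str; 4] = [
    "/v1/chat/completions",
    "/v1/embeddings",
    "/v1/models",
    "/v1",
];

/// Sketch: reduce a user-entered address to its base by trimming
/// whitespace, trailing slashes, and at most one known suffix.
pub fn normalize_local_base(input: &str) -> String {
    let mut base = input.trim().trim_end_matches('/');
    for suffix in KNOWN_SUFFIXES.iter() {
        if let Some(stripped) = base.strip_suffix(*suffix) {
            base = stripped.trim_end_matches('/');
            break;
        }
    }
    base.to_string()
}

/// Sketch: append a fresh chat-completions path to the normalized base.
pub fn local_chat_url(input: &str) -> String {
    format!("{}/v1/chat/completions", normalize_local_base(input))
}
```

With this sketch, both http://127.0.0.1:1234 and http://127.0.0.1:1234/v1/chat/completions normalize to http://127.0.0.1:1234, so the chat URL is the same whichever form the user entered. Suffix matching here is case-sensitive, which is a simplification.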