
Module openai_compat


Shared OpenAI Chat Completions wire format.

Both the cloud openai provider (api.openai.com) and the local provider (Ollama / LM Studio / llama.cpp on localhost) speak the exact same /v1/chat/completions request and response shape. This module owns the body builder, non-streaming response parser, and the streaming SSE reader so the two providers can’t drift.
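As a rough illustration of why one wire format can serve both providers, the sketch below assumes the only per-provider differences are the base URL and whether a bearer token is sent. The helper names `chat_completions_url` and `auth_headers` are hypothetical and are not part of this module; the rule that an empty key means no bearer header is taken from the `run_tools_loop` description.

```rust
/// Hypothetical helper: join a provider base URL with the shared
/// Chat Completions path, tolerating a trailing slash on the base.
fn chat_completions_url(base_url: &str) -> String {
    format!("{}/v1/chat/completions", base_url.trim_end_matches('/'))
}

/// Hypothetical helper: build request headers as name/value pairs.
/// An empty `api_key` means no `Authorization` header (local servers).
fn auth_headers(api_key: &str) -> Vec<(String, String)> {
    let mut headers = vec![(
        "Content-Type".to_string(),
        "application/json".to_string(),
    )];
    if !api_key.is_empty() {
        headers.push(("Authorization".to_string(), format!("Bearer {api_key}")));
    }
    headers
}
```

Under these assumptions, a cloud call would use `chat_completions_url("https://api.openai.com")` with a non-empty key, while a local call would pass its localhost base URL and an empty key. The request body would be the same in both cases.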

Functions

build_body
Build a Chat Completions request body from a CloudAiRequest.
build_embeddings_body
Build an OpenAI-compatible /v1/embeddings body. The text to embed is the request prompt.
build_streaming_body
Build the request body for a streaming Chat Completions call. Same shape as build_body plus stream: true and a stream_options block that asks the server to send a final usage chunk so the caller can report token counts even on streaming responses.
parse_embeddings_response
Parse an OpenAI-compatible embeddings response into the first vector.
parse_response
Parse a Chat Completions response body. provider is the caller’s name, stamped into errors and the returned CloudAiResponse.provider.
run_tools_loop
Run the OpenAI-compatible tool-use loop: expose tools to the model and, on each turn, if the model requests tool calls, run them through dispatcher, append the results, and call the model again. The loop ends when the model answers without tool calls (that answer is returned as the CloudAiResponse) or when max_iters is exhausted. An empty api_key means no bearer header is sent (local servers). The loop is non-streaming because tool-call deltas don’t reassemble cleanly over SSE.
stream_response
Stream an OpenAI-compatible Chat Completions response, emitting each content delta through sink. Returns the accumulated full response once the stream closes so the caller still gets a CloudAiResponse to surface in the node output.
strip_reasoning
Strip inline reasoning so only the user-facing answer is surfaced: removes <think>…</think> spans (some local models inline their chain-of-thought there). Separate reasoning_content fields are never read into text, so they’re already excluded. Idempotent; trims surrounding whitespace.
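
The behaviour described for strip_reasoning can be approximated with the standard library alone. The following is a minimal sketch and not the module's actual implementation: the name `strip_reasoning_sketch` and its signature are hypothetical, and dropping everything after an unclosed `<think>` tag is an assumption, since the source does not say how that case is handled.

```rust
/// Hypothetical stand-in for `strip_reasoning`; the real signature
/// is not shown in the documentation above.
fn strip_reasoning_sketch(text: &str) -> String {
    const OPEN: &str = "<think>";
    const CLOSE: &str = "</think>";

    let mut out = String::with_capacity(text.len());
    let mut rest = text;

    // Copy the text before each opening tag, then skip past its closing tag.
    while let Some(start) = rest.find(OPEN) {
        out.push_str(&rest[..start]);
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => rest = &after_open[end + CLOSE.len()..],
            None => {
                // Assumption: an unclosed span runs to the end and is dropped.
                rest = "";
                break;
            }
        }
    }
    out.push_str(rest);

    // Trim surrounding whitespace, as the description states.
    out.trim().to_string()
}
```

For an input such as `"<think>plan the answer</think>\n\nHello!"` this sketch returns `"Hello!"`, and running it again on that result leaves it unchanged.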