Crate flow_models_server

Managed local LLM server (the “LM Studio role”).

flow-studio runs models on-device. A 35B MoE can’t run in-process, so we manage an OpenAI-compatible server subprocess (llama.cpp’s llama-server) and point the existing local provider at it. Inference stays on the machine; this is purely process lifecycle.

The engine binary is fetched on first use (see fetch) into the data dir, or supplied by the caller (a saved setting / the Model Hub / $PATH); this module then manages the subprocess + readiness polling.
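As a rough illustration of the readiness-polling half, the sketch below retries a probe until it succeeds or a deadline passes. It is not this crate's implementation: `wait_until_ready` and `port_is_open` are hypothetical names, and probing with a plain TCP connect is an assumption made to keep the example dependency-free.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: calls `probe` repeatedly until it returns `true`
/// or `timeout` has elapsed. Returns whether the probe ever succeeded.
fn wait_until_ready<F: FnMut() -> bool>(
    mut probe: F,
    timeout: Duration,
    interval: Duration,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if probe() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        // Back off briefly so a slow model load is not hammered with probes.
        thread::sleep(interval);
    }
}

/// Hypothetical probe (an assumption, not the crate's check): the server
/// counts as reachable once a TCP connection to its address can be opened.
#[allow(dead_code)]
fn port_is_open(addr: &SocketAddr) -> bool {
    TcpStream::connect_timeout(addr, Duration::from_millis(200)).is_ok()
}
```

A caller would typically spawn the subprocess first, then run something like `wait_until_ready(|| port_is_open(&addr), Duration::from_secs(60), Duration::from_millis(250))` before pointing the provider at the server. The timeout and interval values here are illustrative only.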

Modules§

fetch: Engine provisioning for the managed local-model server.

Structs§

LlamaParams: Per-model llama-server load parameters (the “Load” tab, à la LM Studio). Absent fields fall back to the server’s own defaults (no flag passed).
LlmServerHandle: Lifecycle manager for the local llama-server subprocess. Cloneable handle around shared interior state; one instance lives on FlowApp.
LlmServerStatus: Serialisable status reported to the frontend (Models tab indicator).
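
The “absent fields fall back to the server’s own defaults” behaviour described for LlamaParams can be sketched with optional fields that only emit a flag when set. The struct and field names below are hypothetical and are not the real LlamaParams definition; the three flags shown are assumed to match llama-server's command-line options.

```rust
/// Hypothetical stand-in for per-model load parameters.
/// `None` means "pass no flag and let the server use its own default".
#[derive(Default)]
struct LoadParamsSketch {
    ctx_size: Option<u32>,
    n_gpu_layers: Option<u32>,
    threads: Option<u32>,
}

impl LoadParamsSketch {
    /// Builds the argument list, skipping every field that is unset.
    fn to_args(&self) -> Vec<String> {
        let mut args = Vec::new();
        let mut push = |flag: &str, value: Option<u32>| {
            if let Some(v) = value {
                args.push(flag.to_string());
                args.push(v.to_string());
            }
        };
        push("--ctx-size", self.ctx_size);
        push("--n-gpu-layers", self.n_gpu_layers);
        push("--threads", self.threads);
        args
    }
}
```

The resulting vector can be handed to `std::process::Command::args` when spawning the subprocess. A default-constructed value yields an empty list, so the server starts with its built-in settings.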
