
Crate flow_models_server


Managed local LLM server (the “LM Studio role”).

flow-studio runs models on-device. A 35B MoE can’t run in-process, so we manage an OpenAI-compatible server subprocess (llama.cpp’s llama-server) and point the existing local provider at it. Inference stays on the machine; this is purely process lifecycle.

The engine binary is either fetched into the data dir on first use (see fetch) or supplied by the caller (a saved setting, the Model Hub, or $PATH); this module then manages the subprocess and polls it for readiness.
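The readiness polling mentioned above can be sketched as a small loop with a deadline. This is a minimal illustration, not the crate's actual implementation: `wait_until_ready` and `port_is_open` are hypothetical names, and the probe here only checks that the TCP port accepts connections. A real probe would more likely issue an HTTP request to the server's health endpoint, since an open port does not guarantee the model has finished loading.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: calls `probe` repeatedly until it reports ready
/// or `timeout` elapses. Returns `true` if the server became ready in time.
fn wait_until_ready<F>(mut probe: F, timeout: Duration, interval: Duration) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if probe() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(interval);
    }
}

/// Hypothetical probe: succeeds once something is listening on `addr`.
/// This is an assumption for illustration; it does not confirm the model is loaded.
fn port_is_open(addr: SocketAddr) -> bool {
    TcpStream::connect_timeout(&addr, Duration::from_millis(200)).is_ok()
}
```

A caller might spawn the subprocess first and then run something like `wait_until_ready(|| port_is_open(addr), Duration::from_secs(60), Duration::from_millis(250))`, treating `false` as a failed start. Passing the probe as a closure keeps the loop independent of how readiness is actually detected.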

Modules§

fetch
Engine provisioning for the managed local-model server.

Structs§

LlamaParams
Per-model llama-server load parameters (the “Load” tab, à la LM Studio). Absent fields fall back to the server’s own defaults (no flag passed).
LlmServerHandle
Lifecycle manager for the local llama-server subprocess. Cloneable handle around shared interior state; one instance lives on FlowApp.
LlmServerStatus
Serialisable status reported to the frontend (Models tab indicator).
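Two of the ideas in the struct descriptions above can be sketched together: optional load parameters that emit a flag only when set, and a cloneable handle around shared interior state. Everything here is a hypothetical stand-in, since this page does not show the real fields or methods: `ParamsSketch`, `StatusSketch`, and `HandleSketch` are invented names, and the three fields are assumptions chosen because `--ctx-size`, `--n-gpu-layers`, and `--threads` are existing llama-server options.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `LlamaParams`; the real fields are not shown here.
#[derive(Default, Clone)]
struct ParamsSketch {
    ctx_size: Option<u32>,
    n_gpu_layers: Option<u32>,
    threads: Option<u32>,
}

impl ParamsSketch {
    /// Builds CLI arguments, emitting a flag only for fields that are set,
    /// so absent fields fall back to the server's own defaults.
    fn to_args(&self) -> Vec<String> {
        let mut args = Vec::new();
        let mut push = |flag: &str, value: Option<u32>| {
            if let Some(v) = value {
                args.push(flag.to_string());
                args.push(v.to_string());
            }
        };
        push("--ctx-size", self.ctx_size);
        push("--n-gpu-layers", self.n_gpu_layers);
        push("--threads", self.threads);
        args
    }
}

/// Hypothetical status values; the real `LlmServerStatus` may differ.
#[derive(Debug, Clone, PartialEq)]
enum StatusSketch {
    Stopped,
    Starting,
    Ready,
}

/// Hypothetical stand-in for `LlmServerHandle`: cloning copies the `Arc`,
/// so every clone observes and mutates the same underlying state.
#[derive(Clone)]
struct HandleSketch {
    inner: Arc<Mutex<StatusSketch>>,
}

impl HandleSketch {
    fn new() -> Self {
        Self { inner: Arc::new(Mutex::new(StatusSketch::Stopped)) }
    }

    fn set(&self, status: StatusSketch) {
        *self.inner.lock().unwrap() = status;
    }

    fn status(&self) -> StatusSketch {
        self.inner.lock().unwrap().clone()
    }
}
```

With this shape, one handle can live on the application struct while clones are passed to background tasks, and a status change made through any clone is visible through all the others.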