Managed local LLM server (the “LM Studio role”).
flow-studio runs models on-device. A 35B MoE can’t run in-process, so we
manage an OpenAI-compatible server subprocess (llama.cpp’s llama-server)
and point the existing local provider at it. Inference stays on the machine;
this is purely process lifecycle.
The engine binary is fetched on first use (see fetch) into the data
dir, or supplied by the caller (a saved setting / the Model Hub / $PATH);
this module then manages the subprocess + readiness polling.
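A minimal sketch of what readiness polling can look like, using only the standard library. It assumes readiness is signalled by an HTTP 200 from `GET /health`, an endpoint llama-server exposes; the function names `probe_health` and `wait_until_ready`, the timeouts, and the hand-rolled HTTP request are illustrative assumptions, not this module's actual API.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpStream};
use std::time::{Duration, Instant};

/// One probe (hypothetical helper): true if `GET /health` answers with HTTP 200.
fn probe_health(addr: SocketAddr, io_timeout: Duration) -> bool {
    // A refused or timed-out connection just means "not ready yet".
    let Ok(mut stream) = TcpStream::connect_timeout(&addr, io_timeout) else {
        return false;
    };
    let _ = stream.set_read_timeout(Some(io_timeout));
    let _ = stream.set_write_timeout(Some(io_timeout));

    let request = format!("GET /health HTTP/1.1\r\nHost: {addr}\r\nConnection: close\r\n\r\n");
    if stream.write_all(request.as_bytes()).is_err() {
        return false;
    }

    // Read until the status line prefix ("HTTP/1.x 200") is available or the peer closes.
    let mut head = Vec::new();
    let mut chunk = [0u8; 64];
    while head.len() < 12 {
        match stream.read(&mut chunk) {
            Ok(0) | Err(_) => break,
            Ok(n) => head.extend_from_slice(&chunk[..n]),
        }
    }
    head.starts_with(b"HTTP/1.1 200") || head.starts_with(b"HTTP/1.0 200")
}

/// Polls `probe_health` every `interval` until it succeeds or `deadline` elapses.
fn wait_until_ready(addr: SocketAddr, deadline: Duration, interval: Duration) -> bool {
    let start = Instant::now();
    while start.elapsed() < deadline {
        if probe_health(addr, interval) {
            return true;
        }
        std::thread::sleep(interval);
    }
    false
}
```

A caller would typically spawn the subprocess first, then call `wait_until_ready` with a generous deadline, since loading a large model can take a while before the server starts answering with 200.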
Modules§
- fetch
- Engine provisioning for the managed local-model server.
Structs§
- LlamaParams
- Per-model llama-server load parameters (the “Load” tab, à la LM Studio). Absent fields fall back to the server’s own defaults (no flag passed).
- LlmServerHandle
- Lifecycle manager for the local llama-server subprocess. Cloneable handle around shared interior state; one instance lives on FlowApp.
- LlmServerStatus
- Serialisable status reported to the frontend (Models tab indicator).
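The two ideas in the struct descriptions above, "absent fields pass no flag" and "cloneable handle around shared interior state", can be sketched as follows. All type and field names here (`ParamsSketch`, `HandleSketch`, `StatusSketch`, `ctx_size`, and so on) are hypothetical stand-ins, since the real definitions are not shown on this page; the flags `--ctx-size`, `--n-gpu-layers`, and `--threads` are llama-server options chosen for illustration.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `LlamaParams`; the real fields may differ.
#[derive(Debug, Default, Clone)]
struct ParamsSketch {
    ctx_size: Option<u32>,
    n_gpu_layers: Option<u32>,
    threads: Option<u32>,
}

impl ParamsSketch {
    /// Emits a flag only for fields that are set, so unset fields
    /// fall back to the server's own defaults.
    fn to_args(&self) -> Vec<String> {
        let mut args = Vec::new();
        let mut push = |flag: &str, value: Option<u32>| {
            if let Some(v) = value {
                args.push(flag.to_string());
                args.push(v.to_string());
            }
        };
        push("--ctx-size", self.ctx_size);
        push("--n-gpu-layers", self.n_gpu_layers);
        push("--threads", self.threads);
        args
    }
}

/// Hypothetical, simplified status; the real `LlmServerStatus` is a serialisable type.
#[derive(Debug, Clone, PartialEq)]
enum StatusSketch {
    Stopped,
    Ready,
}

/// Hypothetical stand-in for `LlmServerHandle`: clones share one state.
#[derive(Clone)]
struct HandleSketch {
    state: Arc<Mutex<StatusSketch>>,
}

impl HandleSketch {
    fn new() -> Self {
        Self { state: Arc::new(Mutex::new(StatusSketch::Stopped)) }
    }

    /// Updates the shared status; visible through every clone.
    fn set(&self, status: StatusSketch) {
        *self.state.lock().unwrap() = status;
    }

    /// Returns a snapshot of the current status.
    fn status(&self) -> StatusSketch {
        self.state.lock().unwrap().clone()
    }
}
```

Because the handle wraps an `Arc`, cloning it is cheap and every clone observes the same status, which is what allows a single instance stored on the application object to be handed out to whichever code needs it.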