AI in Flow

Autonomy with a governor

AI models in Flow interpret, validate, and recommend. The orchestration engine retains all execution authority. Every loop is bounded, every fix is audited, and inference runs on your machine by default.

Local-first inference

Zero egress by default

Zero egress is one property of Flow's architecture, Reasoning-Augmented Orchestration, in which AI advises and the engine executes. AI models run on-device through a managed llama.cpp server, and local inference makes no network calls. The model process has no credential access, no file system access, and no ability to spawn processes. A PII sanitizer redacts credentials, hostnames, dataset names, and IPs before any prompt reaches a model, whether that model is local or cloud.
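To make the sanitizer step concrete, here is a minimal sketch of pattern-based redaction applied before a prompt is handed to any model. The rule names, the regular expressions, and the `sanitize` function are illustrative assumptions, not Flow's actual sanitizer; dataset-name detection is left out because the document does not say how such names are recognized.

```python
import re

# Hypothetical redaction rules, applied in order. Flow's real patterns are not
# described in this document; these are deliberately simple stand-ins.
RULES = [
    ("CREDENTIAL", re.compile(r"\b(?:api[_-]?key|token|password|secret)\s*[=:]\s*\S+", re.I)),
    ("IP", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("HOSTNAME", re.compile(r"\b(?:[a-z0-9-]+\.)+(?:internal|local|corp|com|net|org)\b", re.I)),
]


def sanitize(prompt: str) -> str:
    """Replace each sensitive match with a bracketed placeholder."""
    for label, pattern in RULES:
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt


if __name__ == "__main__":
    print(sanitize("ssh db01.corp.internal at 10.0.0.12 password=hunter2"))
```

Running the redaction on the prompt itself, rather than on the provider response, is what lets the same step cover both local and cloud models.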

Cloud providers such as Claude, OpenAI, Gemini, and NVIDIA are a deliberate, opt-in carve-out. They are off by default and gated by settings. Keys are held in the OS keyring, and execution history for cloud calls keeps metadata and a 200-character preview rather than full content unless you explicitly opt into full-content audit.
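The opt-in gate can be pictured as a check that runs before any cloud call is built. This is a sketch under assumptions: the `Settings` shape, the lowercase provider identifiers, and `select_provider` are hypothetical names, and key lookup in the OS keyring is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    # Hypothetical settings shape. The empty default mirrors "off by default".
    enabled_cloud: set = field(default_factory=set)


def select_provider(requested: str, settings: Settings) -> str:
    """Return the provider to use, refusing cloud providers that were not opted into."""
    if requested == "local":
        return "local"
    if requested not in settings.enabled_cloud:
        raise PermissionError(f"cloud provider {requested!r} is not enabled in settings")
    return requested
```

With default settings, `select_provider("openai", Settings())` raises, while `select_provider("local", Settings())` succeeds.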

The unified AI node

One node, capability-driven execution

An AI node binds any model from the Model Hub. The model's capabilities drive both the inspector options and how the node executes.

Reasoning

A thinking toggle passes a per-request reasoning flag to the provider. The reasoning traces are stripped, so only the answer surfaces.
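As an illustration of trace stripping, the sketch below assumes the trace arrives inline, wrapped in `<think>` tags, as some open-weight reasoning models emit it. How Flow actually receives and removes traces is not specified here, and providers that return reasoning in a separate field would not need this step.

```python
import re

# Assumption: reasoning is delimited by <think>...</think> in the raw completion.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_reasoning(raw: str) -> str:
    """Drop every inline reasoning block and return only the answer text."""
    return THINK_BLOCK.sub("", raw).strip()
```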

Vision

Image inputs, given as paths or URLs, are sent as multimodal content parts for models that support them.
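A sketch of how paths and URLs might be turned into content parts, assuming the OpenAI-style chat format in which an image is an `image_url` part and local files are inlined as base64 data URLs. The function names are hypothetical, and the exact wire format Flow uses per provider is not stated in this document.

```python
import base64
import mimetypes
from pathlib import Path


def image_part(source: str) -> dict:
    """Build one image content part from a URL or a local file path."""
    if source.startswith(("http://", "https://")):
        url = source  # remote images are passed by reference
    else:
        mime = mimetypes.guess_type(source)[0] or "application/octet-stream"
        data = base64.b64encode(Path(source).expanduser().read_bytes()).decode("ascii")
        url = f"data:{mime};base64,{data}"  # local images are inlined
    return {"type": "image_url", "image_url": {"url": url}}


def build_content(prompt: str, images: list[str]) -> list[dict]:
    """Text part first, followed by one part per image."""
    return [{"type": "text", "text": prompt}] + [image_part(s) for s in images]
```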

Tool use

You bind sandboxed adapters as tools, and the model calls them in a bounded in-node loop. Every call runs through the real, workspace-confined adapter.
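The two guarantees in this paragraph, a bounded loop and workspace confinement, can be sketched as follows. Everything named here is an assumption for illustration: the cap of four calls, the `confine` helper, and the `next_step` callback that stands in for the model.

```python
from pathlib import Path
from typing import Callable

MAX_TOOL_CALLS = 4  # hypothetical cap; Flow's actual limit is not stated


def confine(workspace: str, requested: str) -> Path:
    """Resolve a requested path and reject anything outside the workspace."""
    root = Path(workspace).resolve()
    target = (root / requested).resolve()
    if not target.is_relative_to(root):
        raise PermissionError(f"{requested!r} escapes the workspace")
    return target


def run_tool_loop(next_step: Callable, tools: dict, max_calls: int = MAX_TOOL_CALLS) -> str:
    """next_step(history) returns ("call", tool_name, arg) or ("answer", text)."""
    history = []
    for _ in range(max_calls):
        step = next_step(history)
        if step[0] == "answer":
            return step[1]
        _, name, arg = step
        history.append((name, tools[name](arg)))  # only bound tools can be called
    raise RuntimeError("tool-call budget exhausted")
```

Because the model can only name a tool and an argument, the adapter stays in control of what actually runs and where.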

Embeddings

Route to an embeddings call and get a vector back for similarity, clustering, and retrieval flows.
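For a similarity flow, the returned vector is typically compared with cosine similarity. The sketch below uses plain Python lists and a hypothetical `most_similar` helper; it shows the comparison step only, not the embeddings call itself.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query: list[float], corpus: dict[str, list[float]]) -> str:
    """Return the key of the corpus vector closest to the query."""
    return max(corpus, key=lambda key: cosine_similarity(query, corpus[key]))
```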

Classification

Constrain the model to a label set and branch on the result, for example: when {{node.label}} == 'retry'.
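One way to picture the label constraint is as a normalization step that never lets an out-of-set value reach the branch condition. This sketch validates after generation; whether Flow constrains decoding instead is not stated, and `parse_label` with its fallback argument is a hypothetical helper.

```python
def parse_label(raw: str, labels: set[str], fallback: str) -> str:
    """Normalize the model's reply and map anything outside the label set to a fallback."""
    candidate = raw.strip().strip("'\"").lower()
    return candidate if candidate in labels else fallback


if __name__ == "__main__":
    labels = {"retry", "abort", "continue"}
    label = parse_label(" Retry\n", labels, fallback="abort")
    if label == "retry":  # the Python analogue of the branch condition above
        print("taking the retry branch")
```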

Structured output

Provide a JSON schema and each field becomes branchable node output. This is enforced through the provider's native schema support where available.
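Where a provider has no native schema support, the same contract can be approximated by validating the reply before exposing its fields. The sketch below handles only flat object schemas with `properties` and `required`, which is an assumption made to keep it short; `extract_fields` is a hypothetical name.

```python
import json

# Minimal mapping from JSON Schema type names to Python types.
TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def extract_fields(raw_json: str, schema: dict) -> dict:
    """Parse a reply and return only the declared fields, checked against the schema."""
    data = json.loads(raw_json)
    fields = {}
    for name, spec in schema["properties"].items():
        if name not in data:
            if name in schema.get("required", []):
                raise ValueError(f"missing required field: {name}")
            continue
        value = data[name]
        # bool is a subclass of int in Python, so reject it for numeric types.
        wrong_bool = isinstance(value, bool) and spec["type"] != "boolean"
        if wrong_bool or not isinstance(value, TYPES[spec["type"]]):
            raise ValueError(f"field {name} is not of type {spec['type']}")
        fields[name] = value
    return fields
```

Each key of the returned dictionary corresponds to one output a downstream condition could branch on.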

Agentic flows

Generate, review, monitor, converge

An AI node in agentic mode turns a natural-language request into a flow graph. You preview that graph in a review modal before anything merges onto the canvas. After you apply it, the node flips into a monitor role. On an unhandled failure it proposes a corrective fix. Where possible that fix is a precise single-node patch, and otherwise it is a grouped sub-flow.

Autonomous mode hands the whole generate, run, observe, re-plan loop to the engine in one bounded call. Three budgets govern it: an iteration cap, a wall-clock ceiling, and a token ceiling. A per-step destructive-action gate pauses before any risky operation. Every interception is recorded with a before/after DSL snapshot that you can revert or replay from History.
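The interaction between the three budgets and the destructive-action gate can be sketched as a single loop. This is an illustration under assumptions, not the engine's implementation: `plan`, `execute`, and `approve` are hypothetical callbacks, actions are plain dictionaries with `tokens` and `destructive` keys, and snapshot recording is omitted.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Budgets:
    max_iterations: int
    max_seconds: float
    max_tokens: int


def run_autonomous(plan: Callable, execute: Callable, approve: Callable,
                   budgets: Budgets, clock: Callable = time.monotonic) -> str:
    """Plan, gate, and execute steps until convergence or until a budget is hit."""
    started, tokens_used = clock(), 0
    for iteration in range(budgets.max_iterations):
        if clock() - started >= budgets.max_seconds:
            return "wall-clock ceiling reached"
        action = plan(iteration)
        tokens_used += action["tokens"]
        if tokens_used > budgets.max_tokens:
            return "token ceiling reached"
        # The gate runs before execution, so a risky step never starts unapproved.
        if action["destructive"] and not approve(action):
            return "paused at destructive-action gate"
        if execute(action):
            return "converged"
    return "iteration cap reached"
```

Every exit path returns a reason, which is the kind of information an audit trail would need to explain why a run stopped.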

Agentic and autonomous runs

agentic.flow
flow "Fix failing build"
inspect[ai: "agentic"] {
  input: "diagnose the failing cargo build and fix it"
  workspace: "~/projects/api"
}
build[action: "Run command"] {
  adapter: "shell"
  command: "cargo build"
}
inspect --> build

Cloud carve-out

Local versus cloud, side by side

| Property | AI node, local provider | AI node, cloud provider |
| Network egress | None | Yes, to the provider's API |
| Credentials | None required | Provider API key (OS keyring, env fallback) |
| PII sanitization | Mandatory | Mandatory (same sanitizer) |
| Default policy | Always on | Off; opt-in via Settings |
| Persisted output | Full assistant text | Metadata + 200-char preview unless audit opted in |
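The persisted-output row can be read as a small rule applied when a history record is written. The sketch below follows that row; the record field names and the `history_record` function are assumptions for illustration.

```python
PREVIEW_CHARS = 200  # preview length taken from the comparison above


def history_record(provider: str, model: str, output: str, full_audit: bool = False) -> dict:
    """Build a history entry: full text for local runs, a short preview for cloud runs."""
    record = {"provider": provider, "model": model, "output_chars": len(output)}
    if provider == "local" or full_audit:
        record["output"] = output
    else:
        record["preview"] = output[:PREVIEW_CHARS]
    return record
```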

Describe it. Review the graph. Press Run.

Natural-language flow authoring runs through the same review-before-execute discipline as everything else.