Model Hub

The Model Hub is how AI models are distributed, updated, and governed. It works like an internal app store for AI capabilities. It is a distribution mechanism, not an inference server. Local inference always runs on the user’s machine, and the Hub never processes inference data.

The Hub opens from the workspace icon rail. When a Flow Hub service is configured, the Hub reads a structured catalog from that service's models registry and syncs it into a local snapshot. The embedded catalog is the offline floor: it is what the Hub falls back to when no registry catalog is available.

  • Browse / Installed / Updates tabs with search plus format and capability filters.
  • Per-model metadata. Each model lists its parameters, capabilities, license, tags, version history, hardware requirements, and download options across the gguf, mlx, and safetensors formats.
  • Device-compatibility check. A system probe reads RAM, free disk, OS, and architecture, then matches them against each model’s requirements. The result drives a per-model Compatible badge and a compatible-only filter.
  • Verified downloads. Artifacts stream to the local models directory with live progress and optional SHA-256 verification. Cancelling removes the partial file, and downloads keep running when you switch away from the Hub view.
  • Local hosting. Load starts the managed llama-server against the model, with per-model load settings for context, GPU layers, threads, flash attention, KV-cache type, and the reasoning toggle. See the local runtime.
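
The device-compatibility check in the list above can be sketched as a pure comparison between a probe result and a model's requirements. This is a minimal illustration, not the Hub's implementation: the `SystemProbe` and `Requirements` types, their field names, and both functions are hypothetical, and the probe values are passed in rather than read from the machine.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SystemProbe:
    # Hypothetical probe result; the real probe's fields may differ.
    ram_gb: float
    free_disk_gb: float
    os: str
    arch: str


@dataclass(frozen=True)
class Requirements:
    # Hypothetical per-model requirements taken from catalog metadata.
    min_ram_gb: float
    min_free_disk_gb: float
    os: Tuple[str, ...]
    arch: Tuple[str, ...]


def unmet_requirements(probe: SystemProbe, req: Requirements) -> List[str]:
    """Return the names of the requirements the machine does not meet."""
    unmet = []
    if probe.ram_gb < req.min_ram_gb:
        unmet.append("ram")
    if probe.free_disk_gb < req.min_free_disk_gb:
        unmet.append("disk")
    if probe.os not in req.os:
        unmet.append("os")
    if probe.arch not in req.arch:
        unmet.append("arch")
    return unmet


def compatible_only(probe: SystemProbe, catalog: Dict[str, Requirements]) -> List[str]:
    """Keep only the model ids whose requirements are all met."""
    return [mid for mid, req in catalog.items() if not unmet_requirements(probe, req)]
```

An empty list from `unmet_requirements` would correspond to the Compatible badge, and `compatible_only` to the compatible-only filter. Returning the unmet names rather than a bare boolean leaves room to tell the user why a model does not fit.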

A catalog download option names an artifact by its format, quantization, size, URL, and optional SHA-256. The pipeline resolves the entry and streams it to disk, emitting progress events as it goes. When a checksum is present, it hashes the file and compares the two values, and a mismatch deletes the partial file. The pipeline then finalizes the artifact under the catalog file name. Downloads are confined to the models directory, and deletion is path-confined.
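
The stream, verify, and finalize steps above can be sketched as follows. This is an illustration under stated assumptions, not the actual pipeline: `download_artifact`, `confined_path`, `ChecksumMismatch`, and the `.part` suffix are hypothetical names, the chunks come from any iterable instead of a real HTTP client, and the hash is computed while streaming rather than in a separate pass over the file.

```python
import hashlib
from pathlib import Path
from typing import Callable, Iterable, Optional


class ChecksumMismatch(Exception):
    """Raised when the streamed bytes do not match the catalog SHA-256."""


def confined_path(models_dir: Path, file_name: str) -> Path:
    """Resolve file_name under models_dir and reject anything that escapes it."""
    root = models_dir.resolve()
    target = (root / file_name).resolve()
    if root not in target.parents:
        raise ValueError(f"{file_name!r} is outside the models directory")
    return target


def download_artifact(
    chunks: Iterable[bytes],
    models_dir: Path,
    file_name: str,
    expected_sha256: Optional[str] = None,
    on_progress: Optional[Callable[[int], None]] = None,
) -> Path:
    target = confined_path(models_dir, file_name)
    partial = target.with_name(target.name + ".part")  # hypothetical suffix
    digest = hashlib.sha256()
    written = 0
    try:
        with partial.open("wb") as fh:
            for chunk in chunks:
                fh.write(chunk)
                digest.update(chunk)
                written += len(chunk)
                if on_progress is not None:
                    on_progress(written)  # progress event: bytes so far
        # Verification is optional: it runs only when the catalog has a checksum.
        if expected_sha256 is not None and digest.hexdigest() != expected_sha256.lower():
            raise ChecksumMismatch(file_name)
    except BaseException:
        # Any error or cancellation removes the partial file before re-raising.
        partial.unlink(missing_ok=True)
        raise
    partial.replace(target)  # finalize under the catalog file name
    return target
```

Writing to a temporary name and renaming only after verification means a file under its catalog name has always passed whatever check was requested.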

| Failure | What happens |
| --- | --- |
| Network or HTTP error | Download errors; the partial file is cleaned up |
| SHA-256 mismatch | Partial file removed; download errors |
| Disk full | Download errors mid-stream |
| User cancelled | Partial file removed; treated as a cancel, not a failure |

The models registry is approval-gated like the template registry. A model entry publishes to the Hub with its metadata and download options, while the artifacts themselves stay at their source URLs. The entry lands pending, and only an admin-approved version joins the served catalog, carrying its publisher and publish date. Versions are monotonic per model and are never reused.
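
A minimal in-memory sketch of the approval gate and monotonic versioning described above. The `Registry` and `Entry` classes, their method names, and the status strings are hypothetical. The sketch also assumes that an approved entry replaces a seed entry with the same id in the served catalog, which the text does not state.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    model_id: str
    version: int
    metadata: dict
    status: str = "pending"  # "pending" | "approved" | "rejected"


@dataclass
class Registry:
    seed: Dict[str, dict] = field(default_factory=dict)
    entries: List[Entry] = field(default_factory=list)
    last_version: Dict[str, int] = field(default_factory=dict)

    def publish(self, model_id: str, metadata: dict) -> Entry:
        # The counter only moves forward, so a rejected version number is never reused.
        version = self.last_version.get(model_id, 0) + 1
        self.last_version[model_id] = version
        entry = Entry(model_id, version, metadata)
        self.entries.append(entry)  # lands pending
        return entry

    def review(self, model_id: str, version: int, approve: bool) -> None:
        for entry in self.entries:
            if (entry.model_id, entry.version) == (model_id, version) and entry.status == "pending":
                entry.status = "approved" if approve else "rejected"
                return
        raise KeyError(f"no pending entry {model_id}@{version}")

    def served_catalog(self) -> Dict[str, dict]:
        # Seed entries plus the latest approved version of each published model.
        catalog = dict(self.seed)
        for entry in sorted(self.entries, key=lambda e: e.version):
            if entry.status == "approved":
                catalog[entry.model_id] = {**entry.metadata, "version": entry.version}
        return catalog
```

In this sketch a pending or rejected entry never reaches `served_catalog`, and publishing after a rejection yields the next version number rather than recycling the rejected one.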

Verification is hub-managed. A model’s verified badge is an admin toggle on the Hub, and publishers cannot set it themselves. The verified set is exactly the approved-model list that template governance checks. When an admin flips a model’s verification, the compliance badges of every template that references it move immediately.
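
One way to get badges that move immediately is to derive template compliance from the verified set on every read instead of caching it. The sketch below assumes a template is compliant when every model it references is verified; that rule, the `Hub` class, and the `actor_is_admin` flag are illustrative assumptions rather than the Hub's actual API.

```python
from typing import Dict, Iterable, List, Set


class Hub:
    def __init__(self) -> None:
        self._verified: Set[str] = set()
        self.audit_log: List[dict] = []

    def set_verified(self, model_id: str, verified: bool, *, actor_is_admin: bool) -> None:
        # Publishers cannot set the badge themselves; only an admin toggle changes it.
        if not actor_is_admin:
            raise PermissionError("verification is an admin action")
        if verified:
            self._verified.add(model_id)
        else:
            self._verified.discard(model_id)
        self.audit_log.append({"action": "verify", "model": model_id, "verified": verified})

    def compliance(self, templates: Dict[str, Iterable[str]]) -> Dict[str, bool]:
        # Computed from the current verified set, so a toggle is reflected on the next read.
        return {name: set(models) <= self._verified for name, models in templates.items()}
```

Because nothing is cached, there is no separate step to invalidate template badges when an admin flips a model's verification.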

Popularity counters are hub-managed too. Publish payloads never change a model’s downloads or stars. The served values carry over across versions, and they start at zero for ids the Hub has never seen. Admins can reset them. Every Hub action, whether publish, review, verify, or reset, lands in the Hub’s audit log.

| Subsystem | Status |
| --- | --- |
| Registry | Shipped - versioned model entries with approval gates, served as seed UNION latest-approved |
| Metadata API | Shipped - catalog, per-entry version history, recorded validation issues |
| Trust policy | Hub-managed verification shipped; publisher identity and signing planned |
| Distribution | App bundles, first-run download, or administrator-managed package folders (planned) |
| Versioning | Registry versions are monotonic today; app-compat pinning planned |
| Integrity | SHA-256 today; package signing planned for provenance |

Side-by-side candidate evaluation is a future deployment pattern. A candidate model version installs beside the active one, and both run on the same local input. Only the active result is shown, but both results are logged. After validation, the candidate is promoted or removed. This lets you compare accuracy on real workloads without affecting the workflow.
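
Since this pattern is planned rather than shipped, the following is only a sketch of the idea. `run_side_by_side`, the log record shape, and the version arguments are hypothetical, and the models are stand-in callables. Catching candidate errors is an assumption made so that a failing candidate cannot disturb the workflow.

```python
from typing import Callable, List, Optional


def run_side_by_side(
    active: Callable[[str], str],
    candidate: Callable[[str], str],
    local_input: str,
    log: List[dict],
    active_version: Optional[str] = None,
    candidate_version: Optional[str] = None,
) -> str:
    """Run both versions on the same input, log both, return only the active result."""
    active_output = active(local_input)
    candidate_output, candidate_error = None, None
    try:
        candidate_output = candidate(local_input)
    except Exception as exc:  # assumption: candidate failures are logged, not raised
        candidate_error = repr(exc)
    log.append({
        "active": {"version": active_version, "output": active_output},
        "candidate": {
            "version": candidate_version,
            "output": candidate_output,
            "error": candidate_error,
        },
    })
    return active_output
```

The caller only ever sees the active model's output. The paired log records are what a later validation step would compare before promoting or removing the candidate.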

Planned model observability preserves zero egress. It covers local invocation records such as model version, token counts, latency, confidence, and pass/fail, along with user feedback marks. Optional exports contain anonymized signal only, never spool content, prompts, or PII. Cloud AI nodes already minimize persisted content by default, keeping metadata plus a short preview unless content audit is explicitly enabled per node.
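
An allow-list is one simple way to keep an optional export limited to signal fields. The sketch below is an assumption about how that could look, not a description of the planned feature: the field names and `to_export` are hypothetical, and real anonymization may involve more than dropping fields.

```python
# Hypothetical field names for a local invocation record.
EXPORTABLE_FIELDS = frozenset({
    "model_version",
    "input_tokens",
    "output_tokens",
    "latency_ms",
    "confidence",
    "passed",
    "feedback",
})


def to_export(record: dict) -> dict:
    """Keep only allow-listed fields; prompts, content, and anything unknown are dropped."""
    return {key: value for key, value in record.items() if key in EXPORTABLE_FIELDS}
```

An allow-list fails closed: a field added to the local record later is excluded from exports until someone deliberately lists it.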