LlamaParams
Per-model llama-server load parameters (the Model Hub “Load settings”
panel). Absent fields fall back to the server’s own defaults.
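As a minimal sketch of what "absent fields" means in practice, the block below redeclares the shape locally from the properties listed on this page (so it compiles on its own) and builds a partial object. The numeric values are illustrative assumptions, not recommended settings.

```typescript
// Local copy of the documented shape, redeclared only so this sketch is
// self-contained. Every field is optional.
interface LlamaParams {
  batchSize?: number;
  cacheTypeK?: string;
  cacheTypeV?: string;
  ctxSize?: number;
  enableThinking?: boolean;
  flashAttn?: boolean;
  mlock?: boolean;
  mmap?: boolean;
  nGpuLayers?: number;
  parallel?: number;
  seed?: number;
  threads?: number;
}

// Only three fields are set here (values are illustrative). Everything else
// is absent, so the server's own defaults would apply to those settings.
const partialParams: LlamaParams = {
  ctxSize: 8192,
  nGpuLayers: 32,
  enableThinking: false,
};

// Lists the fields that were set explicitly.
const explicitlySet = Object.keys(partialParams);
console.log(explicitlySet);
```

Leaving a field out is different from setting it to a value such as `0` or `false`: only an omitted field defers to the server.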
Properties
batchSize?
optional batchSize?: number
cacheTypeK?
optional cacheTypeK?: string
cacheTypeV?
optional cacheTypeV?: string
ctxSize?
optional ctxSize?: number
enableThinking?
optional enableThinking?: boolean
Applies to reasoning models only. Setting this to false disables thinking, which is faster and uses less memory.
flashAttn?
optional flashAttn?: boolean
mlock?
optional mlock?: boolean
mmap?
optional mmap?: boolean
nGpuLayers?
optional nGpuLayers?: number
parallel?
optional parallel?: number
seed?
optional seed?: number
threads?
optional threads?: number
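One way to picture the fall-back behavior is a translation step that emits a llama-server argument only for fields that are present. The sketch below is an assumption about how such a mapping could look, not this project's actual implementation: `toServerArgs` and `MappedFields` are hypothetical names, and the flag spellings are the long options documented by llama.cpp's llama-server. `flashAttn` and `enableThinking` are deliberately left out, because how they are passed to the server is not stated on this page.

```typescript
// Hypothetical subset of the documented fields that this sketch maps.
type MappedFields = {
  ctxSize?: number;
  batchSize?: number;
  nGpuLayers?: number;
  threads?: number;
  parallel?: number;
  seed?: number;
  cacheTypeK?: string;
  cacheTypeV?: string;
  mlock?: boolean;
  mmap?: boolean;
};

// Hypothetical helper (not part of the documented API): builds an argument
// list, skipping every field that is absent so the server default applies.
function toServerArgs(p: MappedFields): string[] {
  const args: string[] = [];

  // Flags that take a value.
  const valued: Array<[string, number | string | undefined]> = [
    ["--ctx-size", p.ctxSize],
    ["--batch-size", p.batchSize],
    ["--n-gpu-layers", p.nGpuLayers],
    ["--threads", p.threads],
    ["--parallel", p.parallel],
    ["--seed", p.seed],
    ["--cache-type-k", p.cacheTypeK],
    ["--cache-type-v", p.cacheTypeV],
  ];
  for (const [flag, value] of valued) {
    if (value !== undefined) args.push(flag, String(value));
  }

  // Boolean switches: emitted only when the field asks for the non-default side.
  if (p.mlock === true) args.push("--mlock");
  if (p.mmap === false) args.push("--no-mmap");

  return args;
}

const exampleArgs = toServerArgs({ ctxSize: 8192, nGpuLayers: 32, mmap: false });
console.log(exampleArgs.join(" "));
```

Checking against `undefined` rather than truthiness keeps explicit values such as `seed: 0` from being silently dropped.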