This document describes Hugind server config parsing as implemented in
src/core/config/loader.rs and src/core/config/server.rs.
Config files are written by `hugind config init` to `~/.hugind/configs/<name>.yml`.
A config file has these top-level sections: `server`, `model`, `context`, `multimodal`, `sampling`, `lora`, `fit`, `quantize`, and `advanced`.
The base template is in src/resources/config.yml.
The `server` section:
- `host` (default `0.0.0.0`)
- `port` (default `8080`)
- `api_key` (optional bearer token)
- `max_slots` (defaults to `context.seq_max`)
- `system_prompt` (default `"You are a helpful assistant."`)
- `system_prompt_file` (optional path; errors if unreadable)
- `embeddings` (boolish; default `false`)
- `session_home` (optional path; defaults to `~/.hugind/sessions`)
- `unified_memory_mode` (boolish; default `false`)
- `verbose` (boolish; default `false`)
- `enable_thinking_default` (boolish; default `false`)
- `thinking_budget_tokens` (optional u32; `null` = no cap)
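Putting the defaults above together, a minimal `server` section might look like this (values are illustrative, not recommendations):

```yaml
server:
  host: 0.0.0.0
  port: 8080
  # api_key omitted: no bearer token required
  system_prompt: "You are a helpful assistant."
  embeddings: false
  verbose: false
  enable_thinking_default: false
  thinking_budget_tokens: null  # null = no cap
```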
The `model` section:
- `path` (`model.gguf` path)
- `name` (optional public model name)
- `mmproj_path` (optional vision projector path)
- Model params: `gpu_layers`, `split_mode`, `main_gpu`, `tensor_split`, `use_mmap`, `use_mlock`, etc.
Notes:
- `gpu_layers` also accepts the alias `n_gpu_layers`.
- Relative paths are resolved relative to the config file's directory.
- `~` in paths is expanded to the home directory.
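A sketch of a `model` section using the keys and path rules above (the model file path and name are placeholders):

```yaml
model:
  path: ./models/example.gguf  # relative: resolved against the config file's directory
  name: example                # optional public model name
  gpu_layers: 99               # the alias n_gpu_layers is also accepted
  use_mmap: true
```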
The `context` section:
- `size` (`n_ctx`)
- `batch_size` (`n_batch`)
- `ubatch_size` (`n_ubatch`)
- `seq_max` (`n_seq_max`)
- `threads` / `threads_batch`
- Attention/rope/pooling options
- KV cache settings (`cache_type_k`, `cache_type_v`, `offload_kqv`, etc.)
Notes:
- `flash_attention: true` auto-sets `flash_attn_type` to `on`.
- For vision models, a low `batch_size` is auto-raised to `8192`.
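For example, a `context` section combining the keys above (all values illustrative; the `f16` cache type values are an assumption, not taken from this document):

```yaml
context:
  size: 8192             # n_ctx
  batch_size: 2048       # n_batch
  ubatch_size: 512       # n_ubatch
  seq_max: 4             # n_seq_max
  flash_attention: true  # auto-sets flash_attn_type to on
  cache_type_k: f16      # assumed value; see src/resources/config.yml for defaults
  cache_type_v: f16
  offload_kqv: true
```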
The `fit` section:
- `enabled` (default `false`)
- `target_mib` (per-device memory targets in MiB)
- `min_ctx` (minimum context size when fitting)
When enabled, the engine adjusts context size at startup to fit available memory.
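A sketch of a `fit` section; the list shape of `target_mib` is an assumption based on its description as per-device targets:

```yaml
fit:
  enabled: true
  target_mib: [20480, 20480]  # assumed list form: one MiB target per device
  min_ctx: 4096               # never fit below this context size
```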
The `multimodal`, `sampling`, `lora`, `quantize`, and `advanced` sections map directly to structs in src/core/config/server.rs. See src/resources/config.yml for the full key set with defaults.
For boolish fields, Hugind accepts booleans and common strings:
- true: `true`, `on`, `yes`, `enabled`, `1`
- false: `false`, `off`, `no`, `disabled`, `0`
Unrecognized values produce a warning.
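For instance, the following spellings are equivalent under boolish parsing (a sketch using two boolish keys from the `server` section):

```yaml
server:
  embeddings: "yes"  # boolish: parsed as true
  verbose: "off"     # boolish: parsed as false
```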
`hugind config init` detects hardware and auto-configures:
- GPU layers (99 for NVIDIA/Apple Silicon, 0 for CPU-only)
- Flash attention (enabled for NVIDIA)
- KV offload (enabled when GPU available)
- Thread count (optimized for CPU architecture)
- Unified memory mode (enabled for Apple Silicon)
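Based on the list above, the auto-configured values on an NVIDIA GPU machine would resemble the following (a sketch; exact key placement and defaults follow src/resources/config.yml):

```yaml
model:
  gpu_layers: 99         # NVIDIA/Apple Silicon; 0 for CPU-only
context:
  flash_attention: true  # enabled for NVIDIA
  offload_kqv: true      # enabled when a GPU is available
server:
  unified_memory_mode: false  # true only on Apple Silicon
```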