v0.9.3
Highlights
- server: add support for flash attention v2
- server: add support for Llama v2
Features
- launcher: add debug logs
- server: rework quantization to support all models
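
As a rough illustration of using a server from this release (not part of the release notes): a minimal Python sketch of a client sending a generation request, e.g. to a Llama v2 model served by the new version. The host, port, `/generate` endpoint path, payload fields, and `generated_text` response field are assumptions and may differ in your deployment.

```python
# Hypothetical client sketch. Endpoint path, payload shape, port, and
# response fields are assumptions about the serving API, not confirmed
# by these release notes.
import json
import urllib.request


def generate(prompt: str, host: str = "http://localhost:8080") -> str:
    # Build a JSON generation request for the assumed /generate endpoint.
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": 50},
    }
    req = urllib.request.Request(
        f"{host}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Send the request and parse the JSON response body.
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # Assumes the response carries the completion in "generated_text".
    return body.get("generated_text", "")


if __name__ == "__main__":
    print(generate("Explain flash attention in one sentence."))
```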
Full Changelog: v0.9.2...v0.9.3