Add S₀ Tuning (PEFT for hybrid recurrent-attention models)#14

Open
JackYoung27 wants to merge 1 commit into xmindflow:main from S0-Tuning:add-s0-tuning
Conversation

@JackYoung27

S₀ tuning optimizes a single initial state matrix per recurrent layer while keeping all pretrained weights frozen. On Qwen3.5-4B it yields +23.6 pp on HumanEval (p < 0.001, 10 seeds) and +10.8 pp over LoRA, with zero inference overhead. Also validated on FalconH1-7B (Mamba-2).

Paper: https://arxiv.org/abs/2604.01168
Code: https://github.com/JackYoung27/s0-tuning
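To make the idea concrete, here is a minimal sketch of the training setup: a toy linear recurrent layer whose initial state `S0` is a learnable parameter, plus a helper that freezes every other weight. This is not the PR's actual implementation; the layer, the `apply_s0_tuning` helper, and all shapes are illustrative assumptions. Because `S0` only replaces the default zero initial state, the recurrence itself is unchanged at inference, which is where the zero-overhead claim comes from.

```python
import torch
import torch.nn as nn

class ToyRecurrentLayer(nn.Module):
    """Toy linear recurrent (SSM-style) layer with a learnable initial state S0.

    Hypothetical stand-in for a real Mamba-2 block, for illustration only.
    """
    def __init__(self, d_state: int, d_model: int):
        super().__init__()
        self.A = nn.Parameter(torch.randn(d_state, d_state) * 0.01)  # state transition
        self.B = nn.Parameter(torch.randn(d_state, d_model) * 0.01)  # input projection
        self.C = nn.Parameter(torch.randn(d_model, d_state) * 0.01)  # output projection
        # The initial state S0: under S0 tuning, the only trainable tensor.
        self.S0 = nn.Parameter(torch.zeros(d_state))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (seq_len, d_model)
        s = self.S0  # start the recurrence from the learned state, not zeros
        outs = []
        for t in range(x.shape[0]):
            s = self.A @ s + self.B @ x[t]
            outs.append(self.C @ s)
        return torch.stack(outs)

def apply_s0_tuning(model: nn.Module) -> nn.Module:
    """Freeze all parameters, then unfreeze only those named 'S0' (assumed naming)."""
    for name, p in model.named_parameters():
        p.requires_grad = name.endswith("S0")
    return model

model = nn.Sequential(ToyRecurrentLayer(8, 4), ToyRecurrentLayer(8, 4))
apply_s0_tuning(model)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['0.S0', '1.S0'] -- one state vector per recurrent layer
```

With only `d_state` trainable scalars per layer, the optimizer state and checkpoint deltas stay tiny, which is the usual PEFT trade-off this method pushes to an extreme.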

