CVPR EDGE 2026 (Poster)
Kiran Nair, Rodrigue Rizk, KC Santosh
USD Artificial Intelligence Research
Department of Computer Science, University of South Dakota, USA
- Mar. 23, 2026: Accepted as a Poster at CVPR EDGE 2026
- Apr. 04, 2026: Initial codebase released
Large Language Models (LLMs) achieve state-of-the-art performance but incur substantial computational and energy costs due to their dense, fixed-depth Transformer architectures. We introduce the Spike-Gated Residual Unit (S-GRU), a lightweight module that enables dynamic depth adaptation in pretrained Transformers without modifying backbone weights. By inserting spike-gated units into each residual block and optimizing a sparsity-aware objective controlled by a regularization coefficient, S-GRU learns which decoder layers can be skipped at inference time. On TinyLlama-1.1B, it activates 76.5% of the layers on average, yielding a 1.31x theoretical speedup at a modest cost in accuracy and perplexity (see the results table below).
Overview figure: (a) Transformer architecture with S-GRU applied to each decoder layer; (b) spike-gated residual and dynamic gating module.
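The mechanism can be pictured as a gate attached to each residual connection that decides whether the wrapped decoder layer runs or is bypassed. The sketch below is a minimal, hypothetical PyTorch illustration of that idea; the class names (`SpikeGate`, `SGRUBlock`), the pooling scheme, and the straight-through surrogate are assumptions made for exposition and are not taken from the released code.

```python
import torch
import torch.nn as nn


class SpikeGate(nn.Module):
    """Hypothetical spike gate: emits a (near-)binary keep/skip decision per sequence.

    A straight-through estimator keeps the hard threshold trainable while the
    backbone weights stay frozen.
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Pool over the sequence, then squash to a firing probability in (0, 1).
        prob = torch.sigmoid(self.proj(hidden_states.mean(dim=1)))  # (batch, 1)
        spike = (prob > 0.5).float()                                 # hard 0/1 decision
        # Straight-through: forward pass uses the spike, backward uses the soft prob.
        return spike + prob - prob.detach()


class SGRUBlock(nn.Module):
    """Wraps one frozen decoder layer; the layer contributes only when the gate fires."""

    def __init__(self, decoder_layer: nn.Module, hidden_size: int):
        super().__init__()
        self.layer = decoder_layer
        for p in self.layer.parameters():  # backbone weights are left untouched
            p.requires_grad = False
        self.gate = SpikeGate(hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        g = self.gate(hidden_states).view(-1, 1, 1)  # (batch, 1, 1)
        # For clarity the layer is always evaluated; an efficient implementation
        # would skip its forward pass entirely whenever the gate is silent.
        out = self.layer(hidden_states)
        # Gated residual: fall back to the identity path when the gate is 0.
        return g * out + (1.0 - g) * hidden_states
```

An efficient implementation would avoid computing the skipped layer altogether; the sketch keeps the control flow simple to show only the gating logic.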
- Python 3.10+
- PyTorch
- Transformers (Hugging Face)
Install dependencies:
```bash
pip install -r requirements.txt
```

| Model | Params | HellaSwag ↑ | Wiki-2 PPL ↓ | Active Layers ↓ | Speedup ↑ | C4 PPL ↓ | LAMBADA PPL ↓ | Avg. PPL ↓ |
|---|---|---|---|---|---|---|---|---|
| TinyLlama-1.1B | 1.1B | 44.0% | 10.18 | 100% | 1.00x | 11.00 | 22.30 | 14.49 |
| Phi-2 | 2.7B | 52.0% | 13.14 | 100% | 0.41x* | 16.71 | 37.02 | 22.29 |
| Qwen-2.5-1.5B | 1.5B | 52.0% | 12.49 | 100% | 0.73x* | 19.03 | 28.85 | 20.12 |
| DeepSeek-MoE | 1.3B | 46.0% | 11.26 | 100% | 1.00x* | 14.84 | 23.34 | 16.48 |
| S-GRU (Ours) | 1.1B | 41.0% | 12.45 | 76.5% | 1.31x | 16.50 | 28.68 | 19.21 |
\* Theoretical speedup relative to TinyLlama-1.1B base throughput. Avg. PPL is the mean of the Wiki-2, C4, and LAMBADA perplexities.
Run training:

```bash
python -m src.train \
    --model <model_name> \
    --device <device> \
    --lambda_sparsity <value> \
    --max_steps <steps> \
    --batch_size <batch_size>
```
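The `--lambda_sparsity` flag corresponds to the regularization coefficient mentioned in the abstract. As a hedged sketch of how such a sparsity-aware objective is typically formed (the function below and its signature are illustrative assumptions, not the repository's exact loss), the language-modeling loss can be combined with a penalty on how often the gates fire:

```python
import torch


def sparsity_aware_loss(lm_loss: torch.Tensor,
                        gate_activations: list[torch.Tensor],
                        lambda_sparsity: float) -> torch.Tensor:
    """Illustrative sparsity-aware objective (assumed form, not the repo's exact loss).

    lm_loss          -- standard next-token cross-entropy from the backbone
    gate_activations -- one (batch,) gate tensor per decoder layer
    lambda_sparsity  -- regularization coefficient set via --lambda_sparsity
    """
    # Fraction of layers kept active, averaged over the batch and over layers.
    active_fraction = torch.stack([g.mean() for g in gate_activations]).mean()
    # A larger coefficient pushes the gates toward skipping more layers.
    return lm_loss + lambda_sparsity * active_fraction
```

Under this reading, larger `--lambda_sparsity` values trade perplexity for fewer active layers, which is the trade-off summarized in the results table above.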
For questions or issues, please open a GitHub issue.
For direct contact: 📧 kiran.prasannannair@coyotes.usd.edu

