---
title: Welcome to SGLang
description: High-performance serving framework for large language and multimodal models.
keywords:
  - sglang
  - llm serving
  - multimodal
  - inference runtime
mode: wide
---

<a class="github-button" href="https://github.com/sgl-project/sglang" data-size="large" data-show-count="true" aria-label="Star sgl-project/sglang on GitHub">Star</a>
<a class="github-button" href="https://github.com/sgl-project/sglang/fork" data-icon="octicon-repo-forked" data-size="large" data-show-count="true" aria-label="Fork sgl-project/sglang on GitHub">Fork</a>

<script async defer src="https://buttons.github.io/buttons.js"></script>



Designed for low-latency, high-throughput inference with RadixAttention, prefix caching, and multi-GPU parallelism. Broad support for model families such as Llama, Qwen, and DeepSeek. Compatible with Hugging Face and OpenAI APIs. Native support across hardware platforms including NVIDIA, AMD, Intel Xeon, Google TPU, and Ascend NPU accelerators. Open-source with widespread adoption, powering 400k+ GPUs and integrated with major RL frameworks.
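To give an intuition for prefix caching: requests that share a prompt prefix can reuse the state computed for that prefix instead of recomputing it. The sketch below is a deliberately simplified toy, not SGLang's RadixAttention implementation (which uses a radix tree over token sequences and real KV-cache tensors); the class and method names are illustrative only.

```python
# Toy illustration of prefix caching: a later request that shares a
# token prefix with an earlier one reuses the cached state for that
# prefix. Conceptual sketch only -- not SGLang's actual data structure.
class PrefixCache:
    def __init__(self):
        # Maps token-prefix tuples to the "state" computed for them
        # (in a real server this would be KV-cache memory).
        self.cache = {}

    def lookup(self, tokens):
        # Return the longest cached prefix of `tokens` and its state.
        for end in range(len(tokens), 0, -1):
            prefix = tuple(tokens[:end])
            if prefix in self.cache:
                return prefix, self.cache[prefix]
        return (), None

    def insert(self, tokens, state):
        self.cache[tuple(tokens)] = state


cache = PrefixCache()
cache.insert([1, 2, 3], "kv-state-for-123")

# A new request extending the same prompt reuses the first 3 tokens.
hit, state = cache.lookup([1, 2, 3, 4, 5])
print(len(hit))  # number of tokens whose computation is skipped
print(state)
```

In a production system the win comes from shared system prompts and few-shot templates: every request paying only for its unique suffix is what makes high-throughput serving of templated workloads cheap.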

SGLang powers large-scale production deployments, generating trillions of tokens each day across more than 400,000 GPUs worldwide. It is hosted under the non-profit open-source organization LMSYS.


## Get Started

SGLang is an inference framework built for production-grade serving. It delivers low-latency, high-throughput inference across a wide range of setups, from a single GPU to large distributed clusters.

Install SGLang with pip, from source, or via Docker on your preferred hardware platform. Launch your first model server and send requests in minutes with OpenAI-compatible APIs.
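As a minimal sketch of what an OpenAI-compatible request looks like: the snippet below builds (but does not send) a chat-completion request using only the standard library. It assumes the server's commonly documented default port (30000) and the OpenAI-style `/v1/chat/completions` route; the model name `"default"` is a placeholder for whatever model your server was launched with.

```python
import json
import urllib.request

# Assumed defaults: SGLang typically listens on port 30000 and exposes
# OpenAI-compatible routes such as /v1/chat/completions.
url = "http://localhost:30000/v1/chat/completions"

payload = {
    "model": "default",  # placeholder; use your served model's name
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

# Build the request object; calling urllib.request.urlopen(request)
# would send it once a server is actually running.
request = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

print(request.full_url)
print(json.loads(request.data.decode())["model"])
```

Because the API surface matches OpenAI's, the official `openai` Python client also works against a running SGLang server by pointing its `base_url` at the server address.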

## News and latest blogs

{/* BEGIN_LMSYS_SGLANG_BLOG_CARDS */}

{/* END_LMSYS_SGLANG_BLOG_CARDS */}

## Learn more and join the community

### Stay connected

- Development roadmap to follow current priorities and upcoming work.
- Weekly public development meeting to hear updates and join open discussions.
- Slack for questions, feedback, and community support.
- X (Twitter) and LinkedIn for project updates.
- LMSYS blog for release notes, benchmarks, and technical deep dives.
- Learning materials for blogs, slides, and videos.