This page is a router. Pick the goal that matches what you're trying to do — each row points at the page that explains the how.
Requires Python 3.10 through 3.14.

```shell
pip install git+https://github.com/swiss-ai/model-launch.git
sml --version
```

Then pick your goal below. (Contributing to SML itself? Skip the install above and see Development for the editable-install flow.)
| Goal | Where to go |
|---|---|
| Try a model — vibe-check responses, see what it sounds like | Run an example script for a 1-shot launch, or use sml for the interactive menu. Both give you a live model in one command. |
| Run a model with low latency (chat, interactive demos) | Sizing → Latency tuning. Short version: smaller model, FP8/INT4 if quality allows, batch-1, no router. |
| Run a model at high throughput (batch eval, dataset processing) | Sizing → Throughput tuning for the layout, Benchmarking for measuring it. |
| Keep the model private — only I can reach it | Pass --disable-ocf so the replica never registers with the public gateway. See When to disable OCF. |
| Run a model that isn't in the catalog | Use sml advanced and point at the model's path on the cluster filesystem. Try this yourself first — see Adding a new model recipe; the SML team can't field a custom request for every model. |
| Keep a model running 24/7 | SML can't — SLURM jobs are time-limited. You want Kubernetes. See the 24/7 hosting answer for who to contact. |
| Drive SML from Claude Desktop / Cursor | MCP Server — wire up the JSON config snippet and you get launch/monitor/cancel as native tools. |
| Set up credentials for the first time | Initialization. Pick FirecREST (laptop) or SLURM (already on the cluster). |
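Putting the commands this page names together, a first session might look like the sketch below. This is illustrative only: it uses only the command names and the `--disable-ocf` flag mentioned above, and the exact subcommand/flag combinations are assumptions — check `sml --help` for the real syntax.

```shell
# Illustrative first session; command names come from this page,
# flag placement is an assumption (verify with `sml --help`).

# Try a catalog model via the interactive menu
sml

# Launch a non-catalog model by pointing at its path on the
# cluster filesystem (see Adding a new model recipe)
sml advanced

# Keep a launch private: --disable-ocf skips registration with the
# public gateway (assumed to attach to the launch command)
sml advanced --disable-ocf
```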
If you have a specific operational question — "why is my job stuck pending?", "where do metrics live?", "what's the difference between sml and sml advanced?" — start with the FAQ. Unfamiliar word? See the Glossary.