
feat: add NVIDIA Jetson GPU support #364

Open
toolboc wants to merge 5 commits into ahmetoner:main from toolboc:feat/jetson-gpu-support

Conversation


@toolboc toolboc commented Feb 27, 2026

Summary

Add Dockerfile.jetson and docker-compose.jetson.yml for building and running the whisper-asr-webservice on NVIDIA Jetson devices with full GPU acceleration.

Closes #359
Relates to #54, #133

Target Platform

  • NVIDIA Jetson Orin (Nano / NX / AGX) — JetPack 6.x, L4T R36.x, CUDA 12.6, aarch64
  • Configurable via build args for other Jetson generations

What's Included

| File | Purpose |
| --- | --- |
| `Dockerfile.jetson` | Multi-stage build with CUDA support |
| `docker-compose.jetson.yml` | Compose file for Jetson with GPU passthrough |
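
A compose file with Jetson GPU passthrough generally follows this shape. This is an illustrative sketch, not the PR's actual file; the image tag, default engine, and HF_TOKEN usage are taken from elsewhere in this PR, while the service name and layout are assumptions:

```yaml
services:
  whisper-asr-webservice:
    image: toolboc/whisper-asr-webservice-jetson:jp6.1-cu12.6-py3.10
    runtime: nvidia            # NVIDIA container runtime exposes the Jetson GPU
    environment:
      - ASR_ENGINE=whisperx    # default engine set in this PR's compose file
      - HF_TOKEN=${HF_TOKEN}   # optional: access to gated diarization models
    ports:
      - "9000:9000"
```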

Key Technical Decisions

  1. CTranslate2 compiled from source — PyPI ships CPU-only aarch64 wheels. Built with -DWITH_CUDA=ON -DCUDA_ARCH_LIST="8.7" against JetPack's CUDA toolkit.
  2. Jetson AI Lab pip index — PyTorch, torchaudio, and onnxruntime-gpu installed from https://pypi.jetson-ai-lab.io/jp6/cu126/+simple/ using --index-url (not --extra-index-url) because pip prefers manylinux_2_28 (CPU) over linux_aarch64 (CUDA) wheels when both are available.
  3. Bypasses Poetry resolver — poetry-core's PEP 517 metadata generation merges [tool.poetry.dependencies] source mappings with [project.optional-dependencies], producing incorrect version constraints (e.g. torch==2.7.1+cu126 instead of the actual version). Dependencies are installed explicitly via pip with a constraints file protecting CUDA packages.
  4. torchaudio compatibility shim — Jetson AI Lab torchaudio builds strip AudioMetaData, info(), and list_audio_backends(). A soundfile-based .pth monkey-patch restores them for pyannote.audio 3.x compatibility.
  5. torch.load compatibility shim — PyTorch >=2.6 defaults weights_only=True, but pyannote VAD checkpoints contain omegaconf.ListConfig globals. The shim defaults weights_only=False when None is passed (as lightning_fabric does).
  6. huggingface_hub use_auth_token → token shim — huggingface_hub 1.5.0 removed the deprecated use_auth_token parameter. pyannote.audio and whisperx still pass use_auth_token=. The shim translates it to token= for hf_hub_download, model_info, and hf_hub_url across all submodules.
  7. Guard step — Force-reinstalls torch, torchaudio, and onnxruntime-gpu from the Jetson index after dependency resolution, ensuring CPU-only PyPI wheels haven't overwritten them.
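
The torch.load shim in (5) amounts to wrapping the loader so a None default becomes False. The sketch below uses a stand-in loader instead of the real torch.load to stay self-contained; the function names are illustrative, not the PR's actual code:

```python
import functools

def _load(path, weights_only=True):
    """Stand-in for torch.load: PyTorch >= 2.6 defaults weights_only=True."""
    return {"path": path, "weights_only": weights_only}

def patch_weights_only_default(load_fn):
    """Wrap a load function so weights_only=None (or unset) becomes False,
    matching what lightning_fabric passes for pyannote VAD checkpoints."""
    @functools.wraps(load_fn)
    def wrapper(*args, weights_only=None, **kwargs):
        if weights_only is None:
            weights_only = False  # restore pre-2.6 behaviour for legacy checkpoints
        return load_fn(*args, weights_only=weights_only, **kwargs)
    return wrapper

# In the real shim this would replace torch.load at startup
load = patch_weights_only_default(_load)
```

Callers that explicitly request weights_only=True are unaffected; only the None/unset case falls back to the old behaviour.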

Verified On

  • Jetson Orin, JetPack 6.2.2, L4T R36.5.0, CUDA 12.6
  • torch.cuda.is_available() = True, device = Orin (8, 7)
  • CTranslate2 4.4.0 CUDA compute types: float16, bfloat16, int8, float32
  • All three ASR engines tested and passing:
    • faster_whisper — 200 OK, GPU transcription ✓
    • openai_whisper — 200 OK, GPU transcription ✓
    • whisperx — 200 OK, word-level timestamps ✓
  • HF_TOKEN support for gated model access (diarization) ✓

Build & Run

```shell
# Build
docker compose -f docker-compose.jetson.yml build

# Run
docker compose -f docker-compose.jetson.yml up
```

Docker Hub

Pre-built image available:

```shell
docker pull toolboc/whisper-asr-webservice-jetson:jp6.1-cu12.6-py3.10
```

Add Dockerfile.jetson and docker-compose.jetson.yml for building and
running the whisper-asr-webservice on NVIDIA Jetson devices (JetPack 6.x,
L4T R36.x, aarch64, CUDA 12.6).

Key features:
- Multi-stage build: CTranslate2 compiled from source with CUDA for Orin
- PyTorch, torchaudio, and onnxruntime-gpu from Jetson AI Lab pip index
- nvidia-cudss-cu12 for libcudss.so.0 required by Jetson torch wheels
- torchaudio compatibility shim for pyannote.audio 3.x (Jetson builds
  strip AudioMetaData/info()/list_audio_backends())
- Pip constraints file to protect pre-installed CUDA packages from being
  overwritten by CPU-only PyPI wheels during dependency resolution
- Guard step to force-reinstall CUDA packages from Jetson index
- Bypasses Poetry resolver (poetry-core PEP 517 metadata bug produces
  incorrect torch version constraints on aarch64)
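
The constraints-file and guard-step combination could be sketched roughly as follows. The `pip freeze | grep` pattern is an illustrative way to build the constraints file; the actual Dockerfile may pin the packages explicitly:

```shell
# Pin the already-installed CUDA builds so dependency resolution cannot replace them
pip freeze | grep -E '^(torch|torchaudio|onnxruntime-gpu)==' > /tmp/constraints.txt
pip install -c /tmp/constraints.txt .

# Guard step: force the Jetson wheels back in case anything slipped through
pip install --force-reinstall --no-deps \
    --index-url https://pypi.jetson-ai-lab.io/jp6/cu126/+simple/ \
    torch torchaudio onnxruntime-gpu
```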

Tested on Jetson Orin with JetPack 6.2.2:
- torch.cuda.is_available() = True (Orin, compute 8.7)
- CTranslate2 CUDA compute types: float16, bfloat16, int8, float32
- faster-whisper model loads on CUDA with float16
- All three ASR engines import successfully
- Webservice starts and serves on port 9000

nvidia-cudss-cu12 pulls in nvidia-cublas-cu12 (v12.9.1.4) as a transitive
dependency. When its lib path was included in LD_LIBRARY_PATH alongside
JetPack's system cuBLAS (v12.6.1.4), both versions loaded into the same
process, causing CUBLAS_STATUS_ALLOC_FAILED at runtime.

Fix:
- Remove nvidia/cublas/lib from LD_LIBRARY_PATH (system cuBLAS is correct)
- Uninstall nvidia-cublas-cu12 pip package after nvidia-cudss-cu12 install
  (we only need libcudss.so.0 from that package)
- Update torchaudio compat shim to also monkey-patch torch.load,
  defaulting weights_only=False when None is passed (PyTorch >=2.6
  changed the default to True, breaking pyannote VAD checkpoints
  that contain omegaconf globals)
- Update image tag to whisper-asr-webservice-jetson:jp6.1-cu12.6-py3.10
- Set default ASR_ENGINE to whisperx in compose file
- All three engines tested and verified: faster_whisper, openai_whisper, whisperx
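
In Dockerfile form, the cuBLAS fix described above could look roughly like this; the exact paths and ordering are assumptions, and the key point is keeping libcudss.so.0 while dropping the conflicting pip-installed cuBLAS:

```dockerfile
# nvidia-cudss-cu12 is needed only for libcudss.so.0; its transitive
# nvidia-cublas-cu12 (v12.9) conflicts with JetPack's system cuBLAS (v12.6)
RUN pip install nvidia-cudss-cu12 \
 && pip uninstall -y nvidia-cublas-cu12

# Do NOT add the pip nvidia/cublas/lib directory to LD_LIBRARY_PATH;
# JetPack's system cuBLAS is the correct one on Jetson
```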

huggingface_hub >= 1.0 removed the deprecated use_auth_token parameter
from hf_hub_download() and related functions, but pyannote.audio 3.x
and whisperx still pass it. The compatibility shim now translates
use_auth_token -> token at startup, before pyannote imports the
function, so HF_TOKEN works correctly for diarization model access.
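
The translation shim boils down to a keyword-rewriting wrapper. The sketch below uses a stand-in for hf_hub_download so it stays self-contained; in the real shim the same wrapper would be applied to hf_hub_download, model_info, and hf_hub_url at startup, before pyannote imports them:

```python
import functools

def _hf_hub_download(repo_id, filename, token=None):
    """Stand-in for huggingface_hub.hf_hub_download, which no longer
    accepts the deprecated use_auth_token parameter."""
    return {"repo_id": repo_id, "filename": filename, "token": token}

def translate_use_auth_token(fn):
    """Accept the legacy use_auth_token kwarg and forward it as token."""
    @functools.wraps(fn)
    def wrapper(*args, use_auth_token=None, **kwargs):
        if use_auth_token is not None and "token" not in kwargs:
            kwargs["token"] = use_auth_token  # rewrite legacy kwarg
        return fn(*args, **kwargs)
    return wrapper

hf_hub_download = translate_use_auth_token(_hf_hub_download)
```

Callers already using token= pass through untouched; only the legacy spelling is rewritten.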


Development

Successfully merging this pull request may close these issues.

Add Arm support for GPU container
