Agent-first CLI for audio/video transcription via Whisper.
Downloads, cleans, and transcribes media from URLs or local files with machine-readable output designed for AI agents.
bun add -g @crafter/trx
trx init

`trx init` installs dependencies (whisper-cli, yt-dlp, ffmpeg via Homebrew), downloads a Whisper model, and optionally installs the agent skill for your AI coding tool.
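
To illustrate the kind of dependency check that `trx init` and `trx doctor` imply, here is a minimal sketch that looks up the three tools named above on `PATH`. The function names `check_dependencies` and `missing_tools` are hypothetical and are not part of trx; the real implementation may check more than this (for example, the presence of the Whisper model file).

```python
import shutil
from typing import Callable, Dict, Iterable, List, Optional

# Tool names taken from the description of `trx init` above.
REQUIRED_TOOLS = ("whisper-cli", "yt-dlp", "ffmpeg")


def check_dependencies(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: which(tool) for tool in tools}


def missing_tools(status: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of tools that could not be found, sorted alphabetically."""
    return sorted(name for name, path in status.items() if path is None)
```

Calling `missing_tools(check_dependencies())` returns an empty list when all three tools are installed. The `which` parameter exists only so the lookup can be swapped out in tests.
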
If you already have trx set up and just want the agent skill:
npx skills add crafter-station/trx -g

# Transcribe a local file
trx recording.mp4
# Transcribe from URL (YouTube, Twitter, Instagram, etc.)
trx "https://youtube.com/watch?v=..."
# Agent-friendly JSON output
trx transcribe video.mp4 --output json
# Only get the text (saves tokens)
trx transcribe video.mp4 --fields text --output json
# Dry-run (validate without executing)
trx transcribe video.mp4 --dry-run --output json
# Specify language
trx transcribe video.mp4 --language es
# Schema introspection for agents
trx schema transcribe

| Command | Description |
|---|---|
| `trx <input>` | Shorthand for `trx transcribe` |
| `trx init` | Install deps + download Whisper model |
| `trx transcribe <input>` | Full transcription pipeline |
| `trx doctor` | Check dependency status |
| `trx schema <resource>` | JSON schema introspection |
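
An agent or script would typically call these commands through a subprocess. The sketch below builds the argument list for `trx transcribe` from the flags shown in the usage examples and parses the result. It assumes that `trx` is on `PATH` and that `--output json` writes a single JSON document to stdout; the shape of that document is not specified here, so the code returns it unmodified. The helper names are hypothetical.

```python
import json
import subprocess
from typing import Any, List, Optional


def build_transcribe_command(
    source: str,
    fields: Optional[str] = None,
    language: Optional[str] = None,
    dry_run: bool = False,
) -> List[str]:
    """Build the argv for `trx transcribe`, using only flags shown in the examples above."""
    cmd = ["trx", "transcribe", source, "--output", "json"]
    if fields:
        cmd += ["--fields", fields]  # passed through verbatim, e.g. "text"
    if language:
        cmd += ["--language", language]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_transcribe(source: str, **options: Any) -> Any:
    """Run trx and parse stdout as JSON (assumption: stdout holds one JSON document)."""
    cmd = build_transcribe_command(source, **options)
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

Passing the command as a list rather than a shell string avoids quoting problems with URLs that contain `?` or `&`. A caller that only needs the transcript text could use `run_transcribe("video.mp4", fields="text")`.
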
Built following agent-first CLI principles:
- `--output json` auto-detects: table for TTY, JSON when piped
- `--dry-run` validates before executing
- `--fields` limits response size to protect agent context windows
- `trx schema` runtime introspection (no docs needed)
- Input validation rejects control characters, path traversals, URL-encoded strings
- Ships with SKILL.md for Claude Code agent post-processing
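
Two of the principles above, input validation and TTY-based output detection, are easy to sketch. The code below is an illustration under stated assumptions, not trx's actual implementation: the exact rules trx applies are not documented here, so the checks (ASCII control characters, `..` path segments, `%XX` escapes) and the function names are hypothetical.

```python
import re
import sys
from typing import List, TextIO

# Matches a percent-escape such as %2e or %2F (assumed meaning of "URL-encoded").
_PERCENT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def find_input_problems(value: str) -> List[str]:
    """Return the reasons an input string would be rejected; empty means it passes."""
    problems = []
    # ASCII control characters: code points below 32, plus DEL (127).
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in value):
        problems.append("control character")
    # A ".." segment between slashes or backslashes indicates path traversal.
    if ".." in re.split(r"[\\/]", value):
        problems.append("path traversal")
    if _PERCENT_ESCAPE.search(value):
        problems.append("URL-encoded sequence")
    return problems


def default_output_format(stream: TextIO = sys.stdout) -> str:
    """Pick 'table' for an interactive terminal and 'json' when output is piped."""
    return "table" if stream.isatty() else "json"
```

Returning a list of reasons instead of a boolean lets a caller report every problem at once, which suits agents that retry with a corrected input.
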
The bundled skill (skills/trx/SKILL.md) enables AI agents to:
- Transcribe media via CLI
- Post-process output (fix punctuation, accents, technical terms, repeated phrases)
- Reference `whisper-fixes.md` for common Whisper mistake patterns
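
As one concrete example of the post-processing listed above, repeated phrases often show up as the same line emitted several times in a row. The sketch below collapses such runs. It is a simple illustration only: the contents of `whisper-fixes.md` are not reproduced here, the function name is hypothetical, and the bundled skill relies on the agent rather than on a fixed script like this one.

```python
from typing import Iterable, List


def collapse_repeated_lines(lines: Iterable[str], max_repeats: int = 1) -> List[str]:
    """Keep at most `max_repeats` consecutive copies of a line.

    Lines are compared case-insensitively with whitespace normalized, so
    "Thanks  for watching." and "thanks for watching." count as the same line.
    """
    result: List[str] = []
    run_key = None  # normalized form of the line in the current run
    run_length = 0  # how many consecutive times it has appeared
    for line in lines:
        key = " ".join(line.lower().split())
        if key == run_key:
            run_length += 1
        else:
            run_key, run_length = key, 1
        if run_length <= max_repeats:
            result.append(line)
    return result
```

Only consecutive duplicates are removed, so a phrase that legitimately recurs later in the transcript is preserved.
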
Input (URL or file)
|
v
[yt-dlp] Download media (if URL)
|
v
[ffmpeg] Clean audio (silence removal, noise reduction, normalization)
|
v
[whisper-cli] Transcribe (local Whisper model)
|
v
Output: .wav + .srt + .txt + JSON
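
The three stages above can be sketched as a list of commands to run in order. The flags below are real options of yt-dlp, ffmpeg, and whisper.cpp's `whisper-cli`, but which options trx actually passes is an assumption; the filter settings, file names (`download.wav`, `clean.wav`, `transcript`), and function names are hypothetical choices for illustration.

```python
from pathlib import Path
from typing import List


def is_url(source: str) -> bool:
    return source.startswith(("http://", "https://"))


def download_command(url: str, work_dir: str) -> List[str]:
    """yt-dlp: extract audio as WAV into work_dir/download.wav."""
    template = str(Path(work_dir) / "download.%(ext)s")
    return ["yt-dlp", "-x", "--audio-format", "wav", "-o", template, url]


def clean_command(src: str, dst: str) -> List[str]:
    """ffmpeg: silence removal, noise reduction, loudness normalization, 16 kHz mono."""
    filters = ",".join([
        "silenceremove=stop_periods=-1:stop_duration=1:stop_threshold=-50dB",
        "afftdn",    # FFT-based denoiser
        "loudnorm",  # EBU R128 loudness normalization
    ])
    return ["ffmpeg", "-y", "-i", src, "-af", filters,
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", dst]


def whisper_command(wav: str, model_path: str, out_prefix: str,
                    language: str = "auto", threads: int = 8) -> List[str]:
    """whisper-cli: transcribe and write <out_prefix>.srt and <out_prefix>.txt."""
    return ["whisper-cli", "-m", model_path, "-f", wav, "-l", language,
            "-t", str(threads), "-osrt", "-otxt", "-of", out_prefix]


def plan_pipeline(source: str, work_dir: str, model_path: str) -> List[List[str]]:
    """Return the commands for each stage; the download stage is skipped for local files."""
    steps: List[List[str]] = []
    media = source
    if is_url(source):
        steps.append(download_command(source, work_dir))
        media = str(Path(work_dir) / "download.wav")
    cleaned = str(Path(work_dir) / "clean.wav")
    steps.append(clean_command(media, cleaned))
    steps.append(whisper_command(cleaned, model_path, str(Path(work_dir) / "transcript")))
    return steps
```

Each entry in the returned list can be passed to `subprocess.run(step, check=True)`. The audio is resampled to 16 kHz mono because whisper.cpp expects 16 kHz WAV input.
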
Stored at ~/.trx/config.json after trx init:
{
  "modelPath": "~/.trx/models/ggml-small.bin",
  "modelSize": "small",
  "language": "auto",
  "threads": 8
}

Models: tiny (75MB) | base (142MB) | small (466MB) | medium (1.5GB) | large (3GB)
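
A script that wants to read this configuration could parse it as follows. This is a minimal sketch assuming the four keys shown above are always present; the `TrxConfig` class, `parse_config` function, and the validation rules are hypothetical and not part of trx.

```python
import json
import os
from dataclasses import dataclass

# Model names and approximate download sizes, as listed above.
MODEL_SIZES = {"tiny": "75MB", "base": "142MB", "small": "466MB",
               "medium": "1.5GB", "large": "3GB"}


@dataclass(frozen=True)
class TrxConfig:
    model_path: str
    model_size: str
    language: str
    threads: int


def parse_config(text: str) -> TrxConfig:
    """Parse the JSON text of config.json and expand '~' in the model path."""
    raw = json.loads(text)
    size = raw["modelSize"]
    if size not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    threads = int(raw["threads"])
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return TrxConfig(
        model_path=os.path.expanduser(raw["modelPath"]),
        model_size=size,
        language=raw["language"],
        threads=threads,
    )
```

To load the real file, read `os.path.expanduser("~/.trx/config.json")` and pass its contents to `parse_config`.
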
MIT