A side tool by BLOB Productions. Come join our Discord!
Hands-free camera switching for VALORANT observers. Say a digit ("one"
through "nine", or "zero") and the tool presses the matching number key
in VALORANT to switch to that player's camera.
Built for esports broadcasts where observers need to react faster than a keyboard allows, but works for anyone who wants voice-triggered number-key hotkeys in any Windows application.
- Offline. Uses Vosk locally. No cloud, no account, no network calls.
- Fast. About 120 ms from end of speech to keystroke by default (adjustable in `config.json`).
- Background-friendly by default. Sends keystrokes to VALORANT even
when it is not the focused window (via the
`target_window` config, which defaults to `"VALORANT"`).
⚠️ Disclaimer: not affiliated with Riot Games or VALORANT. This is an unofficial community tool. Use at your own risk. VALORANT's Vanguard anti-cheat has never, to BLOB's knowledge, flagged keystroke-level automation of this kind, but Riot's ToS and anti-cheat behavior can change without notice. BLOB takes no responsibility for account action. For broadcast/observer use in a controlled production environment (its intended use case), this has been stable.
- Recognize the spoken digits `zero` through `nine` and send the corresponding keystroke (`0` through `9`).
- Toggle mode (tap a hotkey to start/stop listening) or hold mode (listen only while a key is held).
- Per-digit debounce so a single "five" doesn't fire twice.
- Two keystroke backends:
- SendInput: fires into the foreground window. Works with DirectX and fullscreen games.
- PostMessage: fires into a specific window by title, so the game does not need focus.
- Closed-grammar recognizer (Vosk is restricted to the ten digit words), which dramatically improves accuracy in noisy rooms.
- Self-contained config file, sensible defaults, graceful handling of invalid values.
- No cloud services, no telemetry, no analytics. Everything stays on your machine.
- No arbitrary speech-to-text. The grammar is locked to the ten digits. This is by design: accuracy and latency are the priorities.
- No macOS / Linux support. The keystroke injection uses Win32 APIs.
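The closed grammar and per-digit debounce described above can be sketched in a few lines. This is a minimal illustration, not the tool's actual code: `DigitDebouncer`, `WORD_TO_KEY`, and `GRAMMAR` are hypothetical names, and the 300 ms default is taken from the `debounce_ms` value in the sample config.

```python
import json
import time

# The ten words the recognizer is restricted to, indexed by the digit they map to.
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]
WORD_TO_KEY = {word: str(i) for i, word in enumerate(DIGIT_WORDS)}

# Vosk takes a grammar as a JSON list of phrases; "[unk]" absorbs everything else.
# It would typically be passed as: vosk.KaldiRecognizer(model, 16000, GRAMMAR)
GRAMMAR = json.dumps(DIGIT_WORDS + ["[unk]"])


class DigitDebouncer:
    """Suppress repeats of the same digit inside a cooldown window (hypothetical helper)."""

    def __init__(self, debounce_ms=300, clock=time.monotonic):
        self.debounce_s = debounce_ms / 1000.0
        self.clock = clock
        self._last_fired = {}  # key -> time of the last accepted press

    def accept(self, word):
        """Return the key to press for `word`, or None if unknown or still cooling down."""
        key = WORD_TO_KEY.get(word)
        if key is None:
            return None  # not one of the ten digit words
        now = self.clock()
        last = self._last_fired.get(key)
        if last is not None and now - last < self.debounce_s:
            return None  # same digit again, too soon
        self._last_fired[key] = now
        return key
```

Because the cooldown is tracked per key, saying "five" then "six" in quick succession fires both, while a doubled "five" fires once.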
To keep detection both accurate and fast, the pipeline is tuned to react to subtle speech cues, which also makes it somewhat sensitive to background noise and to software that rewrites your mic signal.
If the program feels slower or less accurate than you'd like, try tuning your mic settings first. That usually fixes it before any config changes are needed.
A quick checklist, in rough order of impact, in Windows Settings → Sound → <your input device> → Properties (and in any vendor app that owns your mic: NVIDIA Broadcast, Krisp, Logitech G HUB, SteelSeries Sonar, Razer Synapse, Realtek Audio Console, etc.):
- Turn off noise suppression / noise cancellation
- Turn off echo cancellation (AEC)
- Turn off automatic gain control (AGC)
- Turn off voice clarity / voice focus / AI voice enhancement
- In Discord / OBS / game voice chat, disable their noise suppression too (it can run on the input device even when you're not talking)
- Prefer a plain wired headset in a reasonably quiet room
The tool captures at 16 kHz mono from the Windows default input
device, so anything that processes the signal upstream of that is
processing what the recognizer ultimately hears. Bluetooth headsets,
webcam mics, and laptop arrays all work, but may need a slightly more
forgiving vad_aggressiveness / trailing_silence_ms
(see Configuration).
- Download the latest release zip from the Releases page and unzip it.
- Right-click `VoiceObserver.exe` → Run as administrator (required if VALORANT is elevated, which it usually is). Windows SmartScreen may warn about an unsigned exe on first launch. Click More info → Run anyway.
- Press F6 to start listening (default hotkey).
- Say a digit: "one", "two", ... "nine", "zero".
- Press F6 again to pause.
- Press Ctrl+C in the console window to exit.
Any standard headset or desktop microphone works (see the mic settings checklist above).
If VALORANT runs as Administrator (common due to Vanguard anti-cheat), you must run this tool as Administrator too. Otherwise Windows silently blocks the keystrokes. User Interface Privilege Isolation (UIPI) forbids a non-elevated process from sending input to an elevated one.
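Since UIPI drops the keystrokes silently, a startup check is one way to surface the problem early. The sketch below is an assumption about how such a check could look, not the tool's own startup code; `is_elevated` and `elevation_warning` are hypothetical names, while `IsUserAnAdmin` is the real shell32 function.

```python
import ctypes
import sys


def is_elevated():
    """Return True if the current process has administrator rights on Windows."""
    if sys.platform != "win32":
        return False  # the tool is Windows-only; report "not elevated" elsewhere
    try:
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except (AttributeError, OSError):
        return False


def elevation_warning(elevated):
    """Return a warning string for a non-elevated process, or None if elevated."""
    if elevated:
        return None
    return ("Not running as Administrator: keystrokes sent to an elevated "
            "VALORANT window will be silently blocked by UIPI.")
```

Calling `elevation_warning(is_elevated())` at launch and printing any non-`None` result would make the failure mode visible instead of silent.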
Edit config.json next to the executable (or in the project root when
running from source). If the file is missing, one with defaults is
created automatically on first run.
{
"mode": "toggle",
"toggle_key": "F6",
"hold_key": "caps_lock",
"debounce_ms": 300,
"vad_aggressiveness": 3,
"trailing_silence_ms": 120,
"target_window": "VALORANT"
}

| Field | Description |
|---|---|
| `mode` | `"toggle"` (press to flip on/off) or `"hold"` (hold to listen) |
| `toggle_key` | Key for toggle mode, e.g. `"F6"`, `"F5"`, `"scroll_lock"` |
| `hold_key` | Key for hold mode, e.g. `"caps_lock"`, `"space"`, `"right_shift"` |
| `debounce_ms` | Cooldown (per digit) between repeat recognitions; prevents double-fires |
| `vad_aggressiveness` | 0 to 3. Higher = stricter silence detection, faster response. Lower it for noisy rooms. |
| `trailing_silence_ms` | How long to wait after speech ends before finalizing (30-2000 ms). Lower = faster but may clip short words. |
| `target_window` | Window title for PostMessage mode. Defaults to `"VALORANT"` so keystrokes land without the game needing focus. Set to `""` to fall back to SendInput (foreground-only) instead. |
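To illustrate what "sensible defaults, graceful handling of invalid values" could mean in practice, here is a minimal sketch of a loader for the fields in the table above. It assumes that an invalid or out-of-range value falls back to its default; the real `config.py` may clamp or warn instead. The function names `validate` and `load_config` are hypothetical.

```python
import json
from pathlib import Path

DEFAULTS = {
    "mode": "toggle",
    "toggle_key": "F6",
    "hold_key": "caps_lock",
    "debounce_ms": 300,
    "vad_aggressiveness": 3,
    "trailing_silence_ms": 120,
    "target_window": "VALORANT",
}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, int) and not isinstance(value, bool)


def validate(raw):
    """Return a full config dict; any invalid field silently keeps its default."""
    cfg = dict(DEFAULTS)
    if not isinstance(raw, dict):
        return cfg
    if raw.get("mode") in ("toggle", "hold"):
        cfg["mode"] = raw["mode"]
    for name in ("toggle_key", "hold_key"):
        if isinstance(raw.get(name), str) and raw[name].strip():
            cfg[name] = raw[name]
    if isinstance(raw.get("target_window"), str):
        cfg["target_window"] = raw["target_window"]  # "" is valid: SendInput fallback
    if _is_int(raw.get("debounce_ms")) and raw["debounce_ms"] >= 0:
        cfg["debounce_ms"] = raw["debounce_ms"]
    if _is_int(raw.get("vad_aggressiveness")) and 0 <= raw["vad_aggressiveness"] <= 3:
        cfg["vad_aggressiveness"] = raw["vad_aggressiveness"]
    if _is_int(raw.get("trailing_silence_ms")) and 30 <= raw["trailing_silence_ms"] <= 2000:
        cfg["trailing_silence_ms"] = raw["trailing_silence_ms"]
    return cfg


def load_config(path="config.json"):
    """Load and validate the config, writing a defaults file if none exists."""
    path = Path(path)
    if not path.exists():
        path.write_text(json.dumps(DEFAULTS, indent=2), encoding="utf-8")
        return dict(DEFAULTS)
    try:
        raw = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        raw = None  # unreadable or malformed file: use defaults
    return validate(raw)
```

With this approach a partial file such as `{ "trailing_silence_ms": 90 }` is enough; every omitted field takes its default.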
Fastest (quiet desk):

{ "trailing_silence_ms": 90, "vad_aggressiveness": 3 }

Noisy environment (arena, crowd, open mic):

{ "trailing_silence_ms": 150, "vad_aggressiveness": 2 }

Foreground-only (opt out of PostMessage):

{ "target_window": "" }

`target_window` does a class-name prefix match, so the default string `"VALORANT"` correctly matches VALORANT's actual class `VALORANTUnrealWindow` even though its title has trailing whitespace.
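The class-name prefix match can be sketched as follows. This is an assumption about the approach, not a copy of `key_sender.py`: `class_matches` and `find_windows` are hypothetical names, and the match is assumed to be case-sensitive. `EnumWindows` and `GetClassNameW` are the real user32 functions; on non-Windows platforms the lookup simply returns an empty list.

```python
import sys


def class_matches(class_name, target):
    """True if `class_name` starts with a non-empty `target` (case-sensitive, assumed)."""
    return bool(target) and class_name.startswith(target)


def find_windows(target):
    """Return handles of top-level windows whose class name starts with `target`."""
    if sys.platform != "win32":
        return []  # Win32-only lookup
    import ctypes
    from ctypes import wintypes

    user32 = ctypes.windll.user32
    user32.GetClassNameW.argtypes = [wintypes.HWND, wintypes.LPWSTR, ctypes.c_int]
    matches = []

    @ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND, wintypes.LPARAM)
    def on_window(hwnd, _lparam):
        buf = ctypes.create_unicode_buffer(256)
        user32.GetClassNameW(hwnd, buf, 256)
        if class_matches(buf.value, target):
            matches.append(hwnd)
        return True  # keep enumerating

    user32.EnumWindows(on_window, 0)
    return matches
```

Matching on the class name rather than the title sidesteps quirks such as trailing whitespace in the title, and an empty `target` never matches, which lines up with `""` meaning "use SendInput instead".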
Runtime (installed via requirements.txt):
| Package | Why |
|---|---|
| `vosk` | Offline speech recognition engine |
| `webrtcvad-wheels` | Voice Activity Detector (Windows-friendly wheel fork) |
| `pyaudio` | Microphone audio capture |
| `keyboard` | Global hotkey binding |
Build-only (not shipped with the exe):
| Package | Why |
|---|---|
| `pyinstaller` | Packages Python + model into a single folder |
| `pytest` | Runs the test suite |
External assets:
- Vosk model: `vosk-model-small-en-us-0.15` (~50 MB). Download from https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip. Licensed Apache 2.0.
Requires Python 3.10+ on Windows.
python -m venv venv
venv\Scripts\pip install -r requirements.txt
venv\Scripts\pip install pytest pyinstaller

Download the Vosk model (automated):

venv\Scripts\python -c "import urllib.request,zipfile,os; urllib.request.urlretrieve('https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip','model.zip'); zipfile.ZipFile('model.zip','r').extractall('.'); os.remove('model.zip')"

Or manually: grab
vosk-model-small-en-us-0.15.zip,
extract it, and place the `vosk-model-small-en-us-0.15` folder in the
project root.
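If the one-liner is hard to read or debug, the same download can be written as a short script. This is an equivalent sketch rather than a file shipped with the project; `fetch_model` is a hypothetical helper, and skipping the download when the folder already exists is an added convenience, not documented behavior.

```python
import os
import urllib.request
import zipfile

MODEL_URL = "https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip"
MODEL_DIR = "vosk-model-small-en-us-0.15"


def fetch_model(url=MODEL_URL, dest=".", model_dir=MODEL_DIR):
    """Download and unpack the model into `dest`; return the model folder path."""
    target = os.path.join(dest, model_dir)
    if os.path.isdir(target):
        return target  # already present, nothing to do
    archive = os.path.join(dest, "model.zip")
    urllib.request.urlretrieve(url, archive)
    try:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    finally:
        os.remove(archive)  # clean up the zip even if extraction fails
    return target
```

Saved as, say, `fetch_model.py` in the project root (a hypothetical filename), it could be run with `venv\Scripts\python -c "import fetch_model; print(fetch_model.fetch_model())"`.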
Run from source:
venv\Scripts\python src\main.py

Run tests:

venv\Scripts\pytest tests/ -v

Build standalone exe:

build.bat

Output lands in `dist\VoiceObserver\`. Zip that folder to
distribute.
Developer note:
`build.bat` deletes a broken PyInstaller contrib hook for `webrtcvad`. The upstream hook calls `copy_metadata('webrtcvad')`, but we install the `webrtcvad-wheels` package, which registers under a different metadata name. If the build breaks after upgrading PyInstaller, check whether the hook has been recreated.
Mic -> 30 ms frames -> webrtcvad -> SpeechDetector state machine
\-> Vosk (streaming decode) -> FinalResult() -> keystroke
- `webrtcvad` classifies every 30 ms frame as speech or silence.
- `SpeechDetector` is a 3-state machine (idle→speaking→trailing→idle) that tracks speech boundaries and emits the moment a short, complete utterance ends.
- Vosk receives every frame in parallel via `AcceptWaveform()` and decodes incrementally in real time, constrained to a closed grammar of the ten digit words.
- The moment `SpeechDetector` signals end-of-speech, we call `FinalResult()` to force Vosk to emit its current best hypothesis immediately, instead of waiting 500-800 ms for Vosk's own endpointer.
- The recognized digit is debounced and passed to `key_sender`, which uses either `SendInput` (foreground) or `PostMessage` (targeted window) to press the corresponding number key.
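The idle→speaking→trailing→idle machine can be sketched as below. This is a simplified illustration under stated assumptions, not the contents of `speech_detector.py`: the class shape, the `feed` method, and the rule "end the utterance after `trailing_silence_ms` worth of consecutive silent frames" are assumptions, and the real detector may add pre-padding or a maximum utterance length.

```python
from enum import Enum

SAMPLE_RATE = 16000   # Hz, mono
FRAME_MS = 30
# 16000 samples/s * 0.030 s = 480 samples; 16-bit PCM = 2 bytes each = 960 bytes
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2


class State(Enum):
    IDLE = "idle"
    SPEAKING = "speaking"
    TRAILING = "trailing"


class SpeechDetector:
    """Turn per-frame VAD decisions into a single end-of-utterance signal."""

    def __init__(self, trailing_silence_ms=120, frame_ms=FRAME_MS):
        # Consecutive silent frames needed to close an utterance (ceiling division)
        self.trailing_frames = max(1, -(-trailing_silence_ms // frame_ms))
        self.state = State.IDLE
        self._silent = 0

    def feed(self, is_speech):
        """Consume one VAD decision; return True exactly when an utterance ends."""
        if self.state is State.IDLE:
            if is_speech:
                self.state = State.SPEAKING
            return False
        if self.state is State.SPEAKING:
            if is_speech:
                return False
            self.state = State.TRAILING
            self._silent = 1
            return self._maybe_finish()
        # TRAILING: speech resumes the utterance, silence counts toward the end
        if is_speech:
            self.state = State.SPEAKING
            return False
        self._silent += 1
        return self._maybe_finish()

    def _maybe_finish(self):
        if self._silent >= self.trailing_frames:
            self.state = State.IDLE
            self._silent = 0
            return True
        return False
```

In a capture loop, each 960-byte frame would go to both `vad.is_speech(frame, 16000)` and `recognizer.AcceptWaveform(frame)`; when `feed()` returns `True`, calling `recognizer.FinalResult()` yields the hypothesis without waiting for Vosk's own endpointer. With the default 120 ms, that is four silent frames after the last speech frame.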
No GPU required. Runs comfortably on any modern laptop.
src/
main.py # Entry point, startup checks, console UI
voice_listener_vosk.py # VAD + Vosk hybrid pipeline
speech_detector.py # 3-state VAD state machine
config.py # Config loading and validation
key_sender.py # Win32 SendInput / PostMessage keystroke injection
hotkey_manager.py # Toggle / hold hotkey binding
tests/ # Pytest suite for each src/ module
third_party/ # Third-party license texts (Apache 2.0)
build.bat # PyInstaller build script
pyproject.toml # Project metadata + pytest config
requirements.txt # Runtime dependencies
LICENSE # MIT license
NOTICE # Third-party attributions
config.json # Runtime configuration (gitignored, generated on first run)
| Problem | Solution |
|---|---|
| "No microphone detected" at startup | Plug in your headset before launching. Check Windows Settings → Sound → Input. |
| Exe blocked by SmartScreen | Click More info → Run anyway. The exe is unsigned. |
| Keystrokes not reaching VALORANT | Run the tool as Administrator. Either VALORANT must be focused (SendInput mode) or `target_window` must be set (PostMessage mode). |
| Wrong digit / misrecognition | Speak clearly, one word at a time. Check your mic settings. Try lowering vad_aggressiveness to 2. |
| Recognizer feels slow | Lower trailing_silence_ms toward 90. Confirm mic effects are off. |
| First word after unpausing gets clipped | Raise trailing_silence_ms slightly, or raise pre_pad_ms (source only). |
| Tool stops responding after unplugging mic | Toggle the hotkey off then back on to reinitialize the audio stream. |
| Antivirus quarantines the exe | Add VoiceObserver.exe to exclusions. Win32 SendInput is a common false-positive trigger. |
| `target_window` says "NOT FOUND" | The window hasn't been created yet; start VALORANT, then press the hotkey. The tool retries on each keystroke. |
Contributions are welcome. Before opening a PR:
- Run the test suite: `venv\Scripts\pytest tests/ -v`
- Keep comments focused on why, not what.
- Don't add cloud/network dependencies. This is an offline tool by design.
- Windows-only is fine; cross-platform PRs are welcome but not required.
For bug reports, please include:
- Windows version
- Microphone model + whether you disabled all effects
- Contents of your `config.json`
- What you said, what happened, and what you expected
Built by BLOB Productions. For more of what we make, visit blob.productions.
MIT. See LICENSE.
The bundled Vosk model (vosk-model-small-en-us-0.15) is licensed under
Apache 2.0.
VALORANT is a trademark of Riot Games, Inc. This project is not affiliated with, endorsed by, or sponsored by Riot Games.