rogerchappel/voicepath

voicepath

Low-latency voice routing for agent apps. voicepath picks the best eligible TTS provider, falls back gracefully, emits useful telemetry, and preserves voice continuity so an agent does not randomly switch voices mid-sentence.

Install

npm install @voicepath/core

This repository exposes @voicepath/core: a local-first SDK and CLI for provider routing, latency budgets, fallback, telemetry, and voice continuity.

Quickstart

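import {
  createDeviceSpeechProvider,
  createElevenLabsProvider,
  createOpenAiVoiceProvider,
  createVoicePath
} from '@voicepath/core';

const voice = createVoicePath({
  policy: {
    // Latency budget for the first audio, in milliseconds.
    maxFirstAudioMs: 450,
    // Providers in order of preference.
    prefer: ['elevenlabs', 'openai', 'device'],
    // Provider to fall back to when the preferred ones are not eligible.
    fallback: 'device',
    // Keep the same provider and voice for the whole utterance.
    continuity: 'utterance',
    neverSwitchMidSentence: true
  },
  // Cloud providers are explicit opt-in; keys are read from the environment.
  providers: {
    elevenlabs: createElevenLabsProvider({ apiKey: process.env.ELEVENLABS_API_KEY }),
    openai: createOpenAiVoiceProvider({ apiKey: process.env.OPENAI_API_KEY }),
    device: createDeviceSpeechProvider()
  }
});

// Log telemetry events such as provider selection, fallback, and latency.
voice.events.subscribe((event) => console.log(event.type, event.payload));

await voice.speak({
  text: 'I found the PR and the tests passed.',
  voice: 'calm-operator',
  context: 'agent-status'
});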
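
To make the policy fields above concrete, here is a minimal sketch of how a deterministic selection rule over maxFirstAudioMs, prefer, and fallback could work. It is not the @voicepath/core implementation: the types, the selectProvider function, the reason strings, and the health and latency inputs are all hypothetical names chosen for illustration.

```typescript
// Hypothetical types for illustration only; not the @voicepath/core API.
type ProviderId = string;

interface ProviderStatus {
  healthy: boolean;
  // Estimated time to first audio, in milliseconds.
  expectedFirstAudioMs: number;
}

interface RoutingPolicy {
  maxFirstAudioMs: number;
  prefer: ProviderId[];
  fallback: ProviderId;
}

interface Selection {
  provider: ProviderId;
  // Providers that were skipped, in preference order, with the reason.
  skipped: { provider: ProviderId; reason: string }[];
  usedFallback: boolean;
}

// Walk the preference list and return the first provider that is configured,
// healthy, and within the latency budget. Otherwise return the fallback.
function selectProvider(
  policy: RoutingPolicy,
  status: Record<ProviderId, ProviderStatus | undefined>
): Selection {
  const skipped: Selection["skipped"] = [];
  for (const id of policy.prefer) {
    const current = status[id];
    if (!current) {
      skipped.push({ provider: id, reason: "not-configured" });
    } else if (!current.healthy) {
      skipped.push({ provider: id, reason: "unhealthy" });
    } else if (current.expectedFirstAudioMs > policy.maxFirstAudioMs) {
      skipped.push({ provider: id, reason: "over-latency-budget" });
    } else {
      return { provider: id, skipped, usedFallback: false };
    }
  }
  return { provider: policy.fallback, skipped, usedFallback: true };
}

const examplePolicy: RoutingPolicy = {
  maxFirstAudioMs: 450,
  prefer: ["elevenlabs", "openai", "device"],
  fallback: "device"
};

const selection = selectProvider(examplePolicy, {
  elevenlabs: { healthy: false, expectedFirstAudioMs: 300 },
  openai: { healthy: true, expectedFirstAudioMs: 600 },
  device: { healthy: true, expectedFirstAudioMs: 50 }
});

console.log(selection.provider); // "device"
```

Because the rule is a pure function of the policy and the status snapshot, the same inputs always give the same choice, and the skipped list gives a ready-made fallback reason to report in telemetry.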

Demo

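The demo runs without cloud credentials. Judging by the example messages, VOICEPATH_DEMO_CLOUD=offline exercises the fallback path and VOICEPATH_DEMO_CLOUD=healthy exercises the preferred cloud path: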

npm run demo -- "This should start quickly and never switch voices mid-sentence."
VOICEPATH_DEMO_CLOUD=offline npm run demo -- "Show fallback."
VOICEPATH_DEMO_CLOUD=healthy npm run demo -- "Show preferred cloud."

See docs/DEMO.md.

Personality

voicepath is the calm stage manager for agent voice: quick to start, honest when it falls back, and stubborn about not changing the actor mid-line.

Why voicepath is different

  • Deterministic policy engine for latency, quality, quota, health, and fallback.
  • Utterance planner locks provider/voice identity across planned segments.
  • Playback queue supports chunk ordering, cancellation, ducking, and BargeKit-style interruption.
  • Observable events report provider selection, fallback reasons, first-audio latency, interruptions, failures, and completion.
  • Cloud providers are explicit opt-in; no hidden network calls.
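
As a rough illustration of the utterance planner idea in the list above, the sketch below splits text at sentence boundaries and stamps every segment with the same provider and voice, so nothing downstream can change the identity partway through. The PlannedSegment type, splitSentences, and planUtterance are hypothetical names, and the sentence splitter is deliberately naive; this is not the library's planner.

```typescript
// Hypothetical shape for illustration only; not the @voicepath/core API.
interface PlannedSegment {
  readonly index: number;
  readonly text: string;
  readonly provider: string;
  readonly voice: string;
}

// Naive sentence splitter: a run of non-terminators followed by any
// trailing ".", "!" or "?". Abbreviations and decimals are not handled.
function splitSentences(text: string): string[] {
  const matches = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return matches.map((s) => s.trim()).filter((s) => s.length > 0);
}

// Plan a whole utterance up front. The provider and voice are chosen once
// and copied onto every segment; segments are frozen so later code cannot
// swap the identity mid-utterance.
function planUtterance(
  text: string,
  provider: string,
  voice: string
): readonly PlannedSegment[] {
  const segments = splitSentences(text).map((sentence, index) =>
    Object.freeze({ index, text: sentence, provider, voice })
  );
  return Object.freeze(segments);
}

const plan = planUtterance(
  "I found the PR. The tests passed!",
  "elevenlabs",
  "calm-operator"
);

console.log(plan.length); // 2
console.log(plan.map((segment) => segment.provider)); // [ 'elevenlabs', 'elevenlabs' ]
```

In a design like this, a provider change requested during playback would apply to the next planned utterance rather than to segments already in the plan.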

Recipes and privacy

Verify

npm run check
npm test
npm run build
npm run smoke
bash scripts/validate.sh

scripts/validate.sh runs repository checks and skips optional agent-qc when unavailable.

Contributing

See CONTRIBUTING.md.

Security

See SECURITY.md. Do not put provider credentials into telemetry payloads.
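
One way to follow that rule is to redact secret-looking fields before a payload is logged or exported. The sketch below is an assumption about how you might do this in your own event handler. It is not part of @voicepath/core, and the key pattern and the redactPayload helper are hypothetical.

```typescript
// Keys whose values should never reach logs. Extend to match your own fields.
const SENSITIVE_KEY = /(api[-_]?key|authorization|token|secret|password)/i;

// Return a deep copy of the payload with sensitive values replaced.
// Arrays and nested plain objects are walked; other values pass through.
function redactPayload(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(redactPayload);
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SENSITIVE_KEY.test(key) ? "[redacted]" : redactPayload(inner);
    }
    return out;
  }
  return value;
}

const safePayload = redactPayload({
  provider: "openai",
  apiKey: "sk-example",
  headers: { Authorization: "Bearer example" },
  firstAudioMs: 212
});

console.log(JSON.stringify(safePayload));
// {"provider":"openai","apiKey":"[redacted]","headers":{"Authorization":"[redacted]"},"firstAudioMs":212}
```

Key-name matching only catches secrets stored under recognizable keys, so it is safer still to keep credentials out of event payloads at the source.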

License

MIT
