feat: Add OpenAI API compatible STT provider by angelplusultra · Pull Request #5268 · Mintplex-Labs/anything-llm

angelplusultra · 2026-03-25T21:58:05Z

Pull Request Type

✨ feat (New feature)
🐛 fix (Bug fix)
♻️ refactor (Code refactoring without changing behavior)
💄 style (UI style changes)
🔨 chore (Build, CI, maintenance)
📝 docs (Documentation updates)

Relevant Issues

resolves #3812

Description

Adds a "Generic OpenAI Compatible" STT provider option, allowing users to use any OpenAI-compatible speech-to-text service (OpenAI Whisper, Groq, Deepgram, self-hosted faster-whisper, etc.) for voice-to-text transcription in the chat prompt input.
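For readers unfamiliar with the wire format, the sketch below shows what a call to an OpenAI-compatible transcription endpoint generally looks like: a multipart POST to `{baseUrl}/audio/transcriptions` with `file` and `model` fields and a bearer token, returning JSON with a `text` field. The `SttConfig` shape and the function names are hypothetical and are not taken from this PR's code.

```typescript
// Hypothetical settings shape; field names are illustrative, not from the PR.
interface SttConfig {
  baseUrl: string; // e.g. "https://api.openai.com/v1"
  apiKey: string;
  model: string; // e.g. "whisper-1"
}

// Join the base URL and the transcription path without doubling slashes.
function transcriptionUrl(baseUrl: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/audio/transcriptions`;
}

// Send recorded audio as multipart/form-data and return the transcript text.
async function transcribe(
  config: SttConfig,
  audio: Blob,
  filename: string
): Promise<string> {
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", config.model);

  const res = await fetch(transcriptionUrl(config.baseUrl), {
    method: "POST",
    // Content-Type is left unset so fetch can add the multipart boundary.
    headers: { Authorization: `Bearer ${config.apiKey}` },
    body: form,
  });
  if (!res.ok) throw new Error(`Transcription failed: HTTP ${res.status}`);

  const data = (await res.json()) as { text?: string };
  return data.text ?? "";
}
```

Because only the base URL, key, and model vary, the same function can in principle target any service that implements this endpoint.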

What changed:

New STT provider selection: Users can now choose between "System native" (browser Web Speech API) and "OpenAI Compatible" in Settings > Audio Preference > Speech-to-text
Server-backed transcription: When using the OpenAI Compatible provider, audio is recorded via the browser's MediaRecorder API and sent to a new POST /system/stt endpoint, which proxies it to the configured OpenAI-compatible transcription API (/audio/transcriptions)
Configurable settings: Base URL, API Key, and Model are configurable via the settings UI
Silence detection: Uses the Web Audio API's AnalyserNode to detect silence and auto-stops recording after 3.2 s of silence (matching native provider behavior)
Auto-submit support: Works with the existing "Auto Submit Speech Input" setting
CTRL+M shortcut: Works with both providers
Decoupled ChatContainer from react-speech-recognition: ChatContainer now uses a custom STOP_STT_EVENT to signal STT stop, making it provider-agnostic
Loading spinner on mic button: When using the server-backed provider, clicking the mic shows a spinner while the browser acquires the microphone via getUserMedia. Unlike the native Web Speech API (which manages the mic internally and returns synchronously), the server provider must explicitly request the raw audio stream from the OS — this hardware initialization takes 1-3 seconds depending on the device and browser. The spinner provides visual feedback during this unavoidable delay so users know the app is responding.
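The silence-detection item above can be sketched as a small, DOM-free helper. This is a minimal illustration, not the PR's implementation: it assumes frames come from `AnalyserNode.getByteTimeDomainData()` (bytes in 0..255, with 128 meaning zero amplitude), and the RMS threshold of 0.02 and the `SilenceDetector` class are hypothetical. Only the 3.2 s timeout comes from the description.

```typescript
// Root-mean-square level of one time-domain frame.
// Bytes are in 0..255, where 128 represents zero amplitude.
function rmsLevel(frame: Uint8Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    const sample = (frame[i] - 128) / 128; // normalize to roughly -1..1
    sumSquares += sample * sample;
  }
  return Math.sqrt(sumSquares / frame.length);
}

class SilenceDetector {
  private silentSince: number | null = null;
  private readonly timeoutMs: number;
  private readonly threshold: number;

  // timeoutMs defaults to the 3.2 s stated in the PR;
  // threshold is an assumed level below which a frame counts as silent.
  constructor(timeoutMs = 3200, threshold = 0.02) {
    this.timeoutMs = timeoutMs;
    this.threshold = threshold;
  }

  // Feed one frame; returns true once silence has lasted timeoutMs.
  update(frame: Uint8Array, nowMs: number): boolean {
    if (rmsLevel(frame) >= this.threshold) {
      this.silentSince = null; // any speech resets the timer
      return false;
    }
    if (this.silentSince === null) this.silentSince = nowMs;
    return nowMs - this.silentSince >= this.timeoutMs;
  }
}
```

In a browser, `update` would typically be called from a `requestAnimationFrame` or timer loop with `performance.now()` as the timestamp, and a `true` result would stop the MediaRecorder.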

Visuals (if applicable)

Additional Information

The native browser STT provider is completely unchanged — this is purely additive
Tested with OpenAI (whisper-1) and Groq (whisper-large-v3) successfully
Audio is recorded as audio/webm (browser default) and sent through multer memory storage — no files written to disk
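Since the recording's MIME type can carry parameters (for example `audio/webm;codecs=opus`), and a transcription service may rely on the uploaded filename's extension to identify the format, one way to pick a filename is sketched below. The mapping table and the function names are hypothetical and only illustrate the idea; they are not the PR's code.

```typescript
// Illustrative subset of audio container types; not an exhaustive list.
const EXTENSION_BY_MIME: Record<string, string> = {
  "audio/webm": "webm",
  "audio/ogg": "ogg",
  "audio/mp4": "mp4",
  "audio/mpeg": "mp3",
  "audio/wav": "wav",
};

// Map a MIME type to a file extension, falling back to "webm"
// (the recording format named in the PR description).
function extensionForMime(mimeType: string, fallback = "webm"): string {
  // Strip parameters such as ";codecs=opus" and normalize case.
  const essence = mimeType.replace(/;.*$/, "").trim().toLowerCase();
  return EXTENSION_BY_MIME[essence] ?? fallback;
}

// Build the filename attached to the multipart upload.
function uploadFilename(mimeType: string): string {
  return `recording.${extensionForMime(mimeType)}`;
}
```

For instance, `uploadFilename("audio/webm;codecs=opus")` yields `recording.webm`.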

Developer Validations

I ran yarn lint from the root of the repo & committed changes
Relevant documentation has been updated (if applicable)
I have tested my code functionality
Docker build succeeds locally

shatfield4

LGTM

codeCraft-Ritik

Nice work on this PR. The code is well-structured and easy to follow

angelplusultra added 3 commits March 25, 2026 14:27

implement generic openai stt provider

684f144

simplify server.jsx

a7bf476

lint and formatting

f1289ee

angelplusultra marked this pull request as draft March 25, 2026 21:58

angelplusultra changed the title from "feat: Add Generic OpenAI compatible STT provider" to "feat: Add OpenAI API compatible STT provider" Mar 25, 2026

angelplusultra marked this pull request as ready for review March 25, 2026 22:11

angelplusultra added 3 commits March 25, 2026 15:18

fix incorrect extension names

7f1f2bd

cleanup audioContext in endSTTSession

47e147d

add loading state for microphone getUserMedia

afd3b44

angelplusultra requested a review from shatfield4 March 25, 2026 22:47

angelplusultra assigned shatfield4 Mar 25, 2026

angelplusultra added the PR:needs review Needs review by core team label Mar 25, 2026

shatfield4 approved these changes Mar 27, 2026

shatfield4 assigned angelplusultra and unassigned shatfield4 Mar 27, 2026

angelplusultra requested a review from timothycarambat March 27, 2026 18:32

angelplusultra assigned timothycarambat and unassigned angelplusultra Mar 27, 2026

codeCraft-Ritik reviewed Mar 29, 2026

timothycarambat mentioned this pull request Mar 30, 2026

[FEAT]: Features and QoL recommendations by my personal opinion #5221

Closed

angelplusultra wants to merge 6 commits into master from stt-provider-expansion-openai-api-compatible

4 participants
