@@ -7,6 +7,258 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7 7
8 8 <!-- towncrier release notes start -->
9 9
10+ ## [0.0.101] - 2026-01-30
11+
12+ ### Added
13+
14+ - Additions for `AICFilter` and `AICVADAnalyzer`:
15+   - Added model downloading support to `AICFilter` with `model_id` and
16+     `model_download_dir` parameters.
17+   - Added `model_path` parameter to `AICFilter` for loading local `.aicmodel`
18+     files.
19+   - Added unit tests for `AICFilter` and `AICVADAnalyzer`.
20+ (PR [#3408](https://github.com/pipecat-ai/pipecat/pull/3408))
21+
22+ - Added handling for the `server_content.interrupted` signal in the Gemini
23+ Live service. This speeds up interruption response when the pipeline has no
24+ turn tracking of its own, such as that provided by local VAD + context
25+ aggregators. When the pipeline already tracks turns, the additional
26+ interruption does no harm.
27+ (PR [#3429](https://github.com/pipecat-ai/pipecat/pull/3429))
28+
29+ - Added new `GenesysFrameSerializer` for the Genesys AudioHook WebSocket
30+ protocol, enabling bidirectional audio streaming between Pipecat pipelines
31+ and Genesys Cloud contact center.
32+ (PR [#3500](https://github.com/pipecat-ai/pipecat/pull/3500))
33+
34+ - Added `reached_upstream_types` and `reached_downstream_types` read-only
35+ properties to `PipelineTask` for inspecting current frame filters.
36+ (PR [#3510](https://github.com/pipecat-ai/pipecat/pull/3510))
37+
38+ - Added `add_reached_upstream_filter()` and `add_reached_downstream_filter()`
39+ methods to `PipelineTask` for appending frame types.
40+ (PR [#3510](https://github.com/pipecat-ai/pipecat/pull/3510))
41+
42+ - Added `UserTurnCompletionLLMServiceMixin` for LLM services to detect and
43+ filter incomplete user turns. When enabled via `filter_incomplete_user_turns`
44+ in `LLMUserAggregatorParams`, the LLM outputs a turn completion marker at the
45+ start of each response: ✓ (complete), ○ (incomplete short), or ◐ (incomplete
46+ long). Incomplete turns are suppressed, and configurable timeouts
47+ automatically re-prompt the user.
48+ (PR [#3518](https://github.com/pipecat-ai/pipecat/pull/3518))
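
As a rough illustration of the marker scheme described above, the sketch below splits a response into a turn state and the remaining text. The `split_turn_marker` helper and the state labels are hypothetical and are not part of Pipecat; the mixin's actual parsing may differ.

```python
from typing import Optional, Tuple

# Marker characters come from the changelog entry; the state names are
# illustrative labels only, not Pipecat identifiers.
TURN_MARKERS = {
    "✓": "complete",
    "○": "incomplete_short",
    "◐": "incomplete_long",
}


def split_turn_marker(response: str) -> Tuple[Optional[str], str]:
    """Return (state, text) for a response that may start with a marker."""
    stripped = response.lstrip()
    if stripped and stripped[0] in TURN_MARKERS:
        return TURN_MARKERS[stripped[0]], stripped[1:].lstrip()
    # No marker found: report an unknown state and leave the text untouched.
    return None, response


print(split_turn_marker("✓ Sure, here you go."))
```

A caller could suppress the text whenever the state is one of the two incomplete values and start a re-prompt timer instead.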
49+
50+ - Added `FrameProcessor.broadcast_frame_instance(frame)` method to broadcast a
51+ frame instance by extracting its fields and creating new instances for each
52+ direction.
53+ (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
54+
55+ - `PipelineTask` now automatically adds `RTVIProcessor` and registers
56+ `RTVIObserver` when `enable_rtvi=True` (default), simplifying pipeline setup.
57+ (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
58+
59+ - Added `RTVIProcessor.create_rtvi_observer()` factory method for creating RTVI
60+ observers.
61+ (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
62+
63+ - Added `video_out_codec` parameter to `TransportParams` allowing configuration
64+ of the preferred video codec (e.g., `"VP8"`, `"H264"`, `"H265"`) for video
65+ output in `DailyTransport`.
66+ (PR [#3520](https://github.com/pipecat-ai/pipecat/pull/3520))
67+
68+ - Added `location` parameter to Google TTS services (`GoogleHttpTTSService`,
69+ `GoogleTTSService`, `GeminiTTSService`) for regional endpoint support.
70+ (PR [#3523](https://github.com/pipecat-ai/pipecat/pull/3523))
71+
72+ - Added new `PIPECAT_SMART_TURN_LOG_DATA` environment variable, which causes
73+ Smart Turn input data to be saved to disk.
74+ (PR [#3525](https://github.com/pipecat-ai/pipecat/pull/3525))
75+
76+ - Added `result_callback` parameter to `UserImageRequestFrame` to support
77+ deferred function call results.
78+ (PR [#3571](https://github.com/pipecat-ai/pipecat/pull/3571))
79+
80+ - Added `function_call_timeout_secs` parameter to `LLMService` to configure
81+ timeout for deferred function calls (defaults to 10.0 seconds).
82+ (PR [#3571](https://github.com/pipecat-ai/pipecat/pull/3571))
83+
84+ - Added `vad_analyzer` parameter to `LLMUserAggregatorParams`. VAD analysis is
85+ now handled inside the `LLMUserAggregator` rather than in the transport,
86+ keeping voice activity detection closer to where it is consumed. The
87+ `vad_analyzer` on `BaseInputTransport` is now deprecated.
88+
89+ ```python
90+ context_aggregator = LLMContextAggregatorPair(
91+     context,
92+     user_params=LLMUserAggregatorParams(
93+         vad_analyzer=SileroVADAnalyzer(),
94+     ),
95+ )
96+ ```
97+ (PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))
98+
99+ - Added `VADProcessor` for detecting speech in audio streams within a pipeline.
100+ Pushes `VADUserStartedSpeakingFrame`, `VADUserStoppedSpeakingFrame`, and
101+ `UserSpeakingFrame` downstream based on VAD state changes.
102+ (PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))
103+
104+ - Added `VADController` for managing voice activity detection state and
105+ emitting speech events independently of transport or pipeline processors.
106+ (PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))
107+
108+ - Added local `PiperTTSService` for offline text-to-speech using Piper voice
109+ models. The existing HTTP-based service has been renamed to
110+ `PiperHttpTTSService`.
111+ (PR [#3585](https://github.com/pipecat-ai/pipecat/pull/3585))
112+
113+ - `main()` in `pipecat.runner.run` now accepts an optional
114+ `argparse.ArgumentParser`, allowing bots to define custom CLI arguments
115+ accessible via `runner_args.cli_args`.
116+ (PR [#3590](https://github.com/pipecat-ai/pipecat/pull/3590))
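
A bot-specific parser for this feature might look like the sketch below. It uses only the standard library. The `--greeting` flag is a made-up example, and the exact way the parser is passed to `main()` is not stated in this entry, so that step appears only as a comment.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one bot-specific flag (hypothetical example)."""
    parser = argparse.ArgumentParser(description="Example bot")
    # `--greeting` is an illustrative custom argument, not a Pipecat option.
    parser.add_argument("--greeting", default="Hello!", help="First line the bot speaks")
    return parser


# Per the entry above, the parser would be handed to the runner's `main()`
# and the parsed values read back from `runner_args.cli_args` in the bot.
# Here the arguments are parsed locally to show what the namespace holds.
args = build_parser().parse_args(["--greeting", "Hi there"])
print(args.greeting)
```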
117+
118+ - Added `KokoroTTSService` for local text-to-speech synthesis using the
119+ Kokoro-82M model.
120+ (PR [#3595](https://github.com/pipecat-ai/pipecat/pull/3595))
121+
122+ ### Changed
123+
124+ - Updated `AICFilter` and `AICVADAnalyzer` to use aic-sdk ~= 2.0.1.
125+ (PR [#3408](https://github.com/pipecat-ai/pipecat/pull/3408))
126+
127+ - Improved the STT TTFB (Time To First Byte) measurement, which now reports
128+ the delay between when the user stops speaking and when the final
129+ transcription is received. Traditional TTFB is measured from a discrete
130+ request, but STT services receive continuous audio input, so we measure from
131+ speech end to final transcript, which captures the latency that matters for
132+ voice AI applications. In support of this change, added a `finalized` field
133+ to `TranscriptionFrame` to indicate when a transcript is the final result for
134+ an utterance.
135+ (PR [#3495](https://github.com/pipecat-ai/pipecat/pull/3495))
136+
137+ - `SarvamSTTService` now defaults `vad_signals` and `high_vad_sensitivity` to
138+ `None` (omitted from connection parameters), improving latency by ~300ms
139+ compared to the previous defaults.
140+ (PR [#3495](https://github.com/pipecat-ai/pipecat/pull/3495))
141+
142+ - Changed frame filter storage from tuples to sets in `PipelineTask`.
143+ (PR [#3510](https://github.com/pipecat-ai/pipecat/pull/3510))
144+
145+ - Changed default Inworld TTS model from `inworld-tts-1` to
146+ `inworld-tts-1.5-max`.
147+ (PR [#3531](https://github.com/pipecat-ai/pipecat/pull/3531))
148+
149+ - `FrameSerializer` now subclasses from `BaseObject` to enable event support.
150+ (PR [#3560](https://github.com/pipecat-ai/pipecat/pull/3560))
151+
152+ - Added support for TTFS in `SpeechmaticsSTTService` and set the default mode
153+ to `EXTERNAL` to support Pipecat-controlled VAD.
154+   - Changed dependency to `speechmatics-voice[smart]>=0.2.8`.
155+ (PR [#3562](https://github.com/pipecat-ai/pipecat/pull/3562))
156+
157+ - ⚠️ Changed function call handling to use timeout-based completion instead of
158+ immediate callback execution.
159+   - Function calls that defer their results (e.g., `UserImageRequestFrame`)
160+     now use a timeout mechanism.
161+   - The `result_callback` is invoked automatically when the deferred
162+     operation completes or after the timeout.
163+   - This change affects examples using `UserImageRequestFrame`: the
164+     `result_callback` should now be passed to the frame instead of being
165+     called immediately.
166+ (PR [#3571](https://github.com/pipecat-ai/pipecat/pull/3571))
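
The timeout-based pattern can be sketched with plain `asyncio`. This is not Pipecat's implementation: `wait_for_deferred_result` is a hypothetical helper, and passing `None` to the callback on timeout is an assumption made for the example.

```python
import asyncio


async def wait_for_deferred_result(deferred, result_callback, timeout_secs=10.0):
    """Invoke the callback when the deferred work finishes or times out."""
    try:
        result = await asyncio.wait_for(deferred, timeout=timeout_secs)
    except asyncio.TimeoutError:
        result = None  # assumption: the callback receives None on timeout
    await result_callback(result)


async def _demo():
    received = []

    async def callback(result):
        received.append(result)

    loop = asyncio.get_running_loop()

    finished = loop.create_future()
    finished.set_result("image ready")
    await wait_for_deferred_result(finished, callback, timeout_secs=1.0)

    never_finishes = loop.create_future()
    await wait_for_deferred_result(never_finishes, callback, timeout_secs=0.01)
    return received


results = asyncio.run(_demo())
print(results)
```

Either way the callback runs exactly once, so the caller is never left waiting on a deferred operation that does not complete.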
167+
168+ - Pipecat runner now uses `DAILY_ROOM_URL` instead of `DAILY_SAMPLE_ROOM_URL`.
169+ (PR [#3582](https://github.com/pipecat-ai/pipecat/pull/3582))
170+
171+ - Updates to `GradiumSTTService`:
172+   - Now flushes pending transcriptions when VAD detects the user stopped
173+     speaking, improving response latency.
174+   - Now supports `InputParams` for configuring `language` and
175+     `delay_in_frames` settings.
176+ (PR [#3587](https://github.com/pipecat-ai/pipecat/pull/3587))
177+
178+ ### Deprecated
179+
180+ - ⚠️ Deprecated `vad_analyzer` parameter on `BaseInputTransport`. Pass
181+ `vad_analyzer` to `LLMUserAggregatorParams` instead, or use `VADProcessor` in
182+ the pipeline.
183+ (PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))
184+
185+ ### Removed
186+
187+ - Removed deprecated `AICFilter` parameters: `enhancement_level`, `voice_gain`,
188+ `noise_gate_enable`.
189+ (PR [#3408](https://github.com/pipecat-ai/pipecat/pull/3408))
190+
191+ ### Fixed
192+
193+ - Fixed an issue where `OpenRouterLLMService`, when used with a Gemini model,
194+ did not handle multiple `"system"` messages as expected. Subsequent
195+ `"system"` messages are now converted into `"user"` messages, as in
196+ `GoogleLLMService`; previously, the latest `"system"` message overwrote the
197+ earlier ones.
198+ (PR [#3406](https://github.com/pipecat-ai/pipecat/pull/3406))
199+
200+ - Transports now properly broadcast `InputTransportMessageFrame` frames both
201+ upstream and downstream instead of only pushing downstream.
202+ (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
203+
204+ - Fixed `FrameProcessor.broadcast_frame()` to deep copy kwargs, preventing
205+ shared mutable references between the downstream and upstream frame
206+ instances.
207+ (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))
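
The sketch below shows why deep-copying kwargs matters when one set of arguments produces two frame instances. `ExampleFrame` and `broadcast` are stand-ins written for this illustration; they are not Pipecat types and do not mirror the real method signature.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ExampleFrame:
    """Stand-in for a frame class; not a Pipecat type."""

    metadata: dict = field(default_factory=dict)


def broadcast(frame_cls, **kwargs):
    """Create one instance per direction, each with its own copy of kwargs."""
    downstream = frame_cls(**copy.deepcopy(kwargs))
    upstream = frame_cls(**copy.deepcopy(kwargs))
    return downstream, upstream


down, up = broadcast(ExampleFrame, metadata={"tags": ["a"]})
down.metadata["tags"].append("b")  # mutating one instance...
print(up.metadata["tags"])         # ...leaves the other untouched
```

Without the deep copy, both instances would hold the same `metadata` dictionary, and a change made on one direction would show up on the other.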
208+
209+ - Fixed OpenAI LLM services to emit `ErrorFrame` on completion timeout,
210+ enabling proper error handling and `LLMSwitcher` failover.
211+ (PR [#3529](https://github.com/pipecat-ai/pipecat/pull/3529))
212+
213+ - Fixed a logging issue where non-ASCII characters (e.g., Japanese or Chinese)
214+ were being unnecessarily escaped to Unicode sequences when a function call
215+ occurred.
216+ (PR [#3536](https://github.com/pipecat-ai/pipecat/pull/3536))
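
This entry does not say where the escaping came from. One common cause of this symptom is JSON serialization with Python's default `ensure_ascii=True`, and the sketch below assumes that cause purely for illustration.

```python
import json

# Example function call arguments containing non-ASCII text.
arguments = {"city": "東京"}

escaped = json.dumps(arguments)                       # default: ensure_ascii=True
readable = json.dumps(arguments, ensure_ascii=False)  # keeps characters as-is

print(escaped)   # characters appear as \uXXXX escape sequences
print(readable)  # characters appear unchanged
```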
217+
218+ - Fixed audio track synchronization inside the `AudioBufferProcessor`,
219+ resolving timing issues where silence and audio were misaligned between user
220+ and bot buffers.
221+ (PR [#3541](https://github.com/pipecat-ai/pipecat/pull/3541))
222+
223+ - Fixed a race condition in `OpenAIRealtimeBetaLLMService` that could cause an
224+ error when truncating the conversation.
225+ (PR [#3567](https://github.com/pipecat-ai/pipecat/pull/3567))
226+
227+ - Fixed an infinite loop in `WebsocketService` that blocked the event loop when
228+ a remote server closed the connection gracefully.
229+ (PR [#3574](https://github.com/pipecat-ai/pipecat/pull/3574))
230+
231+ - Fixed `LLMUserAggregator` and `LLMAssistantAggregator` not emitting pending
232+ transcripts via `on_user_turn_stopped` and `on_assistant_turn_stopped` events
233+ when the conversation ends (`EndFrame`) or is cancelled (`CancelFrame`).
234+ (PR [#3575](https://github.com/pipecat-ai/pipecat/pull/3575))
235+
236+ - Added missing `LiveKitRunnerArguments` and `LiveKitTransport` support in
237+ runner utilities to enable LiveKit transport configuration.
238+ (PR [#3580](https://github.com/pipecat-ai/pipecat/pull/3580))
239+
240+ - Fixed a race condition in `OpenAIRealtimeLLMService` that could cause an error
241+ when truncating the conversation.
242+ (PR [#3581](https://github.com/pipecat-ai/pipecat/pull/3581))
243+
244+ - Fixed `PiperHttpTTSService` (old `PiperTTSService`) to resample audio output
245+ based on the model's sample rate parsed from the WAV header.
246+ (PR [#3585](https://github.com/pipecat-ai/pipecat/pull/3585))
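
Reading a sample rate from a WAV header can be done with the standard `wave` module, as sketched below. Both helpers are hypothetical, the service may parse the header differently, and the resampling step itself is not shown.

```python
import io
import wave


def wav_sample_rate(wav_bytes: bytes) -> int:
    """Read the sample rate declared in a WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getframerate()


def make_silent_wav(sample_rate: int, num_frames: int = 160) -> bytes:
    """Build a small mono 16-bit WAV in memory for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * num_frames)
    return buffer.getvalue()


print(wav_sample_rate(make_silent_wav(22050)))
```

Once the declared rate is known, audio can be resampled to the pipeline's output rate whenever the two differ.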
247+
248+ - Fixed `UserTurnController` to reset user turn timeout when interim
249+ transcriptions are received.
250+ (PR [#3594](https://github.com/pipecat-ai/pipecat/pull/3594))
251+
252+ - Fixed an issue in the `IVRNavigator` where the `TextFrame`s pushed had
253+ incorrect spacing. Now, the internal `IVRProcessor` pushes
254+ `AggregatedTextFrame`s when in conversation mode, which allows control over
255+ the spacing of the aggregated output text.
256+ (PR [#3604](https://github.com/pipecat-ai/pipecat/pull/3604))
257+
258+ - Fixed the `GeminiLiveLLMService` transcription timeout handler not being
259+ scheduled; the service now yields to the event loop after task creation.
260+ (PR [#3605](https://github.com/pipecat-ai/pipecat/pull/3605))
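
The general `asyncio` behavior behind this fix is that a newly created task does not start running until the current coroutine yields. The sketch below is a generic demonstration of that behavior, not the service's code.

```python
import asyncio


async def demo() -> list:
    events = []

    async def handler():
        events.append("handler started")

    task = asyncio.create_task(handler())
    events.append("task created")  # the handler has not run yet
    await asyncio.sleep(0)         # yield once so the loop can start the task
    events.append("after yield")
    await task
    return events


events = asyncio.run(demo())
print(events)
```

`await asyncio.sleep(0)` is the conventional way to hand control back to the event loop for a single iteration.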
261+
10 262 ## [0.0.100] - 2026-01-20
11 263
12 264 ### Added