Commit 7853e5c

Merge pull request #3606 from pipecat-ai/changelog-0.0.101
Release 0.0.101 - Changelog Update
2 parents ef51c2a + 614b8e1 commit 7853e5c

48 files changed

Lines changed: 252 additions & 66 deletions

CHANGELOG.md

Lines changed: 252 additions & 0 deletions
@@ -7,6 +7,258 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
<!-- towncrier release notes start -->

## [0.0.101] - 2026-01-30

### Added

- Additions for `AICFilter` and `AICVADAnalyzer`:
  - Added model downloading support to `AICFilter` with `model_id` and
    `model_download_dir` parameters.
  - Added `model_path` parameter to `AICFilter` for loading local `.aicmodel`
    files.
  - Added unit tests for `AICFilter` and `AICVADAnalyzer`.
  (PR [#3408](https://github.com/pipecat-ai/pipecat/pull/3408))

- Added handling for `server_content.interrupted` signal in the Gemini Live
  service for faster interruption response in the case where there isn't
  already turn tracking in the pipeline, e.g. local VAD + context aggregators.
  When there is already turn tracking in the pipeline, the additional
  interruption does no harm.
  (PR [#3429](https://github.com/pipecat-ai/pipecat/pull/3429))

- Added new `GenesysFrameSerializer` for the Genesys AudioHook WebSocket
  protocol, enabling bidirectional audio streaming between Pipecat pipelines
  and Genesys Cloud contact center.
  (PR [#3500](https://github.com/pipecat-ai/pipecat/pull/3500))

- Added `reached_upstream_types` and `reached_downstream_types` read-only
  properties to `PipelineTask` for inspecting current frame filters.
  (PR [#3510](https://github.com/pipecat-ai/pipecat/pull/3510))

- Added `add_reached_upstream_filter()` and `add_reached_downstream_filter()`
  methods to `PipelineTask` for appending frame types.
  (PR [#3510](https://github.com/pipecat-ai/pipecat/pull/3510))

- Added `UserTurnCompletionLLMServiceMixin` for LLM services to detect and
  filter incomplete user turns. When enabled via `filter_incomplete_user_turns`
  in `LLMUserAggregatorParams`, the LLM outputs a turn completion marker at the
  start of each response: ✓ (complete), ○ (incomplete short), or ◐ (incomplete
  long). Incomplete turns are suppressed, and configurable timeouts
  automatically re-prompt the user.
  (PR [#3518](https://github.com/pipecat-ai/pipecat/pull/3518))

- Added `FrameProcessor.broadcast_frame_instance(frame)` method to broadcast a
  frame instance by extracting its fields and creating new instances for each
  direction.
  (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))

- `PipelineTask` now automatically adds `RTVIProcessor` and registers
  `RTVIObserver` when `enable_rtvi=True` (default), simplifying pipeline setup.
  (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))

- Added `RTVIProcessor.create_rtvi_observer()` factory method for creating RTVI
  observers.
  (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))

- Added `video_out_codec` parameter to `TransportParams` allowing configuration
  of the preferred video codec (e.g., `"VP8"`, `"H264"`, `"H265"`) for video
  output in `DailyTransport`.
  (PR [#3520](https://github.com/pipecat-ai/pipecat/pull/3520))

- Added `location` parameter to Google TTS services (`GoogleHttpTTSService`,
  `GoogleTTSService`, `GeminiTTSService`) for regional endpoint support.
  (PR [#3523](https://github.com/pipecat-ai/pipecat/pull/3523))

- Added new `PIPECAT_SMART_TURN_LOG_DATA` environment variable, which causes
  Smart Turn input data to be saved to disk.
  (PR [#3525](https://github.com/pipecat-ai/pipecat/pull/3525))

- Added `result_callback` parameter to `UserImageRequestFrame` to support
  deferred function call results.
  (PR [#3571](https://github.com/pipecat-ai/pipecat/pull/3571))

- Added `function_call_timeout_secs` parameter to `LLMService` to configure
  timeout for deferred function calls (defaults to 10.0 seconds).
  (PR [#3571](https://github.com/pipecat-ai/pipecat/pull/3571))

- Added `vad_analyzer` parameter to `LLMUserAggregatorParams`. VAD analysis is
  now handled inside the `LLMUserAggregator` rather than in the transport,
  keeping voice activity detection closer to where it is consumed. The
  `vad_analyzer` on `BaseInputTransport` is now deprecated.

  ```python
  context_aggregator = LLMContextAggregatorPair(
      context,
      user_params=LLMUserAggregatorParams(
          vad_analyzer=SileroVADAnalyzer(),
      ),
  )
  ```
  (PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))

- Added `VADProcessor` for detecting speech in audio streams within a pipeline.
  Pushes `VADUserStartedSpeakingFrame`, `VADUserStoppedSpeakingFrame`, and
  `UserSpeakingFrame` downstream based on VAD state changes.
  (PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))

- Added `VADController` for managing voice activity detection state and
  emitting speech events independently of transport or pipeline processors.
  (PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))

- Added local `PiperTTSService` for offline text-to-speech using Piper voice
  models. The existing HTTP-based service has been renamed to
  `PiperHttpTTSService`.
  (PR [#3585](https://github.com/pipecat-ai/pipecat/pull/3585))

- `main()` in `pipecat.runner.run` now accepts an optional
  `argparse.ArgumentParser`, allowing bots to define custom CLI arguments
  accessible via `runner_args.cli_args`.
  (PR [#3590](https://github.com/pipecat-ai/pipecat/pull/3590))

- Added `KokoroTTSService` for local text-to-speech synthesis using the
  Kokoro-82M model.
  (PR [#3595](https://github.com/pipecat-ai/pipecat/pull/3595))
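
As a rough illustration of the turn completion markers described in the `UserTurnCompletionLLMServiceMixin` entry above, the sketch below splits a leading marker from a response string. It does not use Pipecat and is not the mixin's implementation: `TurnStatus`, `split_turn_marker`, and `should_speak` are hypothetical names, and returning `None` for a response without a marker is an assumption. Only the three marker characters come from the entry.

```python
from enum import Enum
from typing import Optional, Tuple


class TurnStatus(Enum):
    """Hypothetical labels for the three markers named in the changelog."""

    COMPLETE = "✓"
    INCOMPLETE_SHORT = "○"
    INCOMPLETE_LONG = "◐"


def split_turn_marker(response: str) -> Tuple[Optional[TurnStatus], str]:
    """Split a leading turn completion marker from an LLM response.

    Returns (status, remaining_text). If the response does not start with a
    known marker, status is None and the text is returned unchanged.
    """
    stripped = response.lstrip()
    for status in TurnStatus:
        if stripped.startswith(status.value):
            return status, stripped[len(status.value):].lstrip()
    return None, response


def should_speak(response: str) -> bool:
    """Assumed policy: only responses marked complete are passed on."""
    status, _ = split_turn_marker(response)
    return status is TurnStatus.COMPLETE
```

Under this assumed policy, a response that starts with ○ or ◐ is held back, and the marker itself is stripped before any text is passed along.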

### Changed

- Updated `AICFilter` and `AICVADAnalyzer` to use aic-sdk ~= 2.0.1.
  (PR [#3408](https://github.com/pipecat-ai/pipecat/pull/3408))

- Improved the STT TTFB (Time To First Byte) measurement, reporting the delay
  between when the user stops speaking and when the final transcription is
  received. Note: unlike traditional TTFB, which measures from a discrete
  request, STT services receive continuous audio input, so we measure from
  speech end to final transcript, which captures the latency that matters for
  voice AI applications. In support of this change, added a `finalized` field
  to `TranscriptionFrame` to indicate when a transcript is the final result for
  an utterance.
  (PR [#3495](https://github.com/pipecat-ai/pipecat/pull/3495))

- `SarvamSTTService` now defaults `vad_signals` and `high_vad_sensitivity` to
  `None` (omitted from connection parameters), improving latency by ~300ms
  compared to the previous defaults.
  (PR [#3495](https://github.com/pipecat-ai/pipecat/pull/3495))

- Changed frame filter storage from tuples to sets in `PipelineTask`.
  (PR [#3510](https://github.com/pipecat-ai/pipecat/pull/3510))

- Changed default Inworld TTS model from `inworld-tts-1` to
  `inworld-tts-1.5-max`.
  (PR [#3531](https://github.com/pipecat-ai/pipecat/pull/3531))

- `FrameSerializer` now subclasses `BaseObject` to enable event support.
  (PR [#3560](https://github.com/pipecat-ai/pipecat/pull/3560))

- Added support for TTFS in `SpeechmaticsSTTService` and set the default mode
  to `EXTERNAL` to support Pipecat-controlled VAD.
  - Changed dependency to `speechmatics-voice[smart]>=0.2.8`.
  (PR [#3562](https://github.com/pipecat-ai/pipecat/pull/3562))

- ⚠️ Changed function call handling to use timeout-based completion instead of
  immediate callback execution.
  - Function calls that defer their results (e.g., `UserImageRequestFrame`)
    now use a timeout mechanism.
  - The `result_callback` is invoked automatically when the deferred
    operation completes or after the timeout.
  - This change affects examples using `UserImageRequestFrame`: the
    `result_callback` should now be passed to the frame instead of being
    called immediately.
  (PR [#3571](https://github.com/pipecat-ai/pipecat/pull/3571))

- Pipecat runner now uses `DAILY_ROOM_URL` instead of `DAILY_SAMPLE_ROOM_URL`.
  (PR [#3582](https://github.com/pipecat-ai/pipecat/pull/3582))

- Updates to `GradiumSTTService`:
  - Now flushes pending transcriptions when VAD detects the user stopped
    speaking, improving response latency.
  - `GradiumSTTService` now supports `InputParams` for configuring `language`
    and `delay_in_frames` settings.
  (PR [#3587](https://github.com/pipecat-ai/pipecat/pull/3587))
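
The timeout-based completion described in the function call entry above can be sketched with plain `asyncio`. This is an illustration under stated assumptions, not Pipecat's code: `complete_with_timeout` and `demo` are hypothetical names, the `None` fallback on timeout is an assumption, and the 10.0 second default simply mirrors the `function_call_timeout_secs` default noted under Added.

```python
import asyncio
from typing import Any, Awaitable, Callable

ResultCallback = Callable[[Any], Awaitable[None]]


async def complete_with_timeout(
    deferred: "asyncio.Future[Any]",
    result_callback: ResultCallback,
    timeout_secs: float = 10.0,
) -> None:
    """Invoke result_callback once, with the deferred result or None on timeout."""
    try:
        result = await asyncio.wait_for(deferred, timeout=timeout_secs)
    except asyncio.TimeoutError:
        result = None  # assumed fallback value; the real behavior may differ
    await result_callback(result)


async def demo() -> list:
    received = []

    async def result_callback(result: Any) -> None:
        received.append(result)

    loop = asyncio.get_running_loop()

    # Case 1: the deferred operation finishes before the timeout.
    fast = loop.create_future()
    loop.call_later(0.01, fast.set_result, "image description")
    await complete_with_timeout(fast, result_callback, timeout_secs=1.0)

    # Case 2: the deferred operation never finishes, so the timeout fires.
    never = loop.create_future()
    await complete_with_timeout(never, result_callback, timeout_secs=0.05)

    return received
```

In both cases the callback runs exactly once, which is the property that lets the caller hand the callback off instead of invoking it immediately.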

### Deprecated

- ⚠️ Deprecated `vad_analyzer` parameter on `BaseInputTransport`. Pass
  `vad_analyzer` to `LLMUserAggregatorParams` instead or use `VADProcessor` in
  the pipeline.
  (PR [#3583](https://github.com/pipecat-ai/pipecat/pull/3583))

### Removed

- Removed deprecated `AICFilter` parameters: `enhancement_level`, `voice_gain`,
  `noise_gate_enable`.
  (PR [#3408](https://github.com/pipecat-ai/pipecat/pull/3408))

### Fixed

- Fixed an issue where `OpenRouterLLMService`, when used with a Gemini model,
  did not handle multiple `"system"` messages as expected. As in
  `GoogleLLMService`, subsequent `"system"` messages are now converted into
  `"user"` messages. Previously, the latest `"system"` message overwrote the
  earlier ones.
  (PR [#3406](https://github.com/pipecat-ai/pipecat/pull/3406))

- Transports now properly broadcast `InputTransportMessageFrame` frames both
  upstream and downstream instead of only pushing downstream.
  (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))

- Fixed `FrameProcessor.broadcast_frame()` to deep copy kwargs, preventing
  shared mutable references between the downstream and upstream frame
  instances.
  (PR [#3519](https://github.com/pipecat-ai/pipecat/pull/3519))

- Fixed OpenAI LLM services to emit `ErrorFrame` on completion timeout,
  enabling proper error handling and LLMSwitcher failover.
  (PR [#3529](https://github.com/pipecat-ai/pipecat/pull/3529))

- Fixed a logging issue where non-ASCII characters (e.g., Japanese, Chinese)
  were being unnecessarily escaped to Unicode sequences when a function call
  occurred.
  (PR [#3536](https://github.com/pipecat-ai/pipecat/pull/3536))

- Fixed how audio tracks are synchronized inside the `AudioBufferProcessor` to
  fix timing issues where silence and audio were misaligned between user and
  bot buffers.
  (PR [#3541](https://github.com/pipecat-ai/pipecat/pull/3541))

- Fixed race condition in `OpenAIRealtimeBetaLLMService` that could cause an
  error when truncating the conversation.
  (PR [#3567](https://github.com/pipecat-ai/pipecat/pull/3567))

- Fixed an infinite loop in `WebsocketService` that blocked the event loop when
  a remote server closed the connection gracefully.
  (PR [#3574](https://github.com/pipecat-ai/pipecat/pull/3574))

- Fixed `LLMUserAggregator` and `LLMAssistantAggregator` not emitting pending
  transcripts via `on_user_turn_stopped` and `on_assistant_turn_stopped` events
  when the conversation ends (`EndFrame`) or is cancelled (`CancelFrame`).
  (PR [#3575](https://github.com/pipecat-ai/pipecat/pull/3575))

- Added missing `LiveKitRunnerArguments` and `LiveKitTransport` support in
  runner utilities to enable LiveKit transport configuration.
  (PR [#3580](https://github.com/pipecat-ai/pipecat/pull/3580))

- Fixed race condition in `OpenAIRealtimeLLMService` that could cause an error
  when truncating the conversation.
  (PR [#3581](https://github.com/pipecat-ai/pipecat/pull/3581))

- Fixed `PiperHttpTTSService` (formerly `PiperTTSService`) to resample audio
  output based on the model's sample rate parsed from the WAV header.
  (PR [#3585](https://github.com/pipecat-ai/pipecat/pull/3585))

- Fixed `UserTurnController` to reset user turn timeout when interim
  transcriptions are received.
  (PR [#3594](https://github.com/pipecat-ai/pipecat/pull/3594))

- Fixed an issue in the `IVRNavigator` where the `TextFrame`s pushed had
  incorrect spacing. Now, the internal `IVRProcessor` pushes
  `AggregatedTextFrame`s when in conversation mode. This allows for controlling
  spacing of the outputted, aggregated text.
  (PR [#3604](https://github.com/pipecat-ai/pipecat/pull/3604))

- Fixed the `GeminiLiveLLMService` transcription timeout handler not being
  scheduled. The service now yields to the event loop after task creation.
  (PR [#3605](https://github.com/pipecat-ai/pipecat/pull/3605))
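
The `broadcast_frame()` fix above concerns a general Python pitfall that a small standalone example can show. The sketch below assumes nothing about Pipecat internals: `ExampleFrame`, `make_pair_shared`, and `make_pair_copied` are hypothetical stand-ins that contrast building two instances from the same kwargs with building them from deep copies.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExampleFrame:
    """Stand-in for a frame type; not a Pipecat class."""

    tags: List[str] = field(default_factory=list)


def make_pair_shared(frame_cls, **kwargs):
    """Problem pattern: both instances receive the same mutable objects."""
    return frame_cls(**kwargs), frame_cls(**kwargs)


def make_pair_copied(frame_cls, **kwargs):
    """Safer pattern: each instance gets its own deep copy of the kwargs."""
    downstream = frame_cls(**copy.deepcopy(kwargs))
    upstream = frame_cls(**copy.deepcopy(kwargs))
    return downstream, upstream
```

With the shared version, appending to `downstream.tags` also changes `upstream.tags`, because both attributes point at the same list. With the copied version, the two instances stay independent.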

## [0.0.100] - 2026-01-20

### Added

changelog/3406.fixed.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

changelog/3408.added.md

Lines changed: 0 additions & 4 deletions
This file was deleted.

changelog/3408.changed.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

changelog/3408.removed.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

changelog/3429.added.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

changelog/3495.changed.2.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

changelog/3495.changed.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

changelog/3500.added.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

changelog/3510.added.2.md

Lines changed: 0 additions & 1 deletion
This file was deleted.
