feat: add progress handler with cancellation to PyannoteDiarizationPipeline.diarize() #194

Merged
ivan-digital merged 4 commits into soniqo:main from SargerasWang:feature/diarization-progress-handler on Apr 7, 2026

@SargerasWang (Contributor) commented Apr 7, 2026

Summary

  • Add an overloaded diarize() method that accepts an optional progressHandler: ((Float, String) -> Bool)? callback
  • Progress is calculated from actual completed work units (completedUnits / totalUnits), no estimated weights
  • Total units = windowCount × 2 (segmentation pass + embedding extraction pass)
  • The handler returns Bool: true to continue, false to cancel
  • On cancellation, diarize() returns an empty DiarizationResult at the next window boundary (~50–200 ms latency)
  • The original diarize(audio:sampleRate:config:) API is unchanged and delegates to the new overload with nil handler
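
The work-unit accounting and Bool-based cancellation described above can be sketched roughly as follows. This is not the code from the PR: `SketchResult` and `sketchDiarize` are hypothetical stand-ins, and the per-window inference is replaced by a placeholder. Only the counting, the handler contract, and the nil-handler delegation follow the summary.

```swift
// Hypothetical stand-in for DiarizationResult; the real type's fields are not shown here.
struct SketchResult: Equatable {
    var segments: [String] = []
}

typealias ProgressHandler = (Float, String) -> Bool

/// Two passes over `windowCount` windows, one work unit per window per pass.
func sketchDiarize(windowCount: Int, progressHandler: ProgressHandler?) -> SketchResult {
    let totalUnits = windowCount * 2
    var completedUnits = 0
    var result = SketchResult()

    for stage in ["segmentation", "embedding"] {
        for window in 0..<windowCount {
            // Placeholder for the real per-window inference.
            if stage == "embedding" { result.segments.append("window-\(window)") }

            completedUnits += 1
            let progress = Float(completedUnits) / Float(totalUnits)
            // A `false` return cancels: stop at this window boundary with an empty result.
            if let handler = progressHandler, !handler(progress, stage) {
                return SketchResult()
            }
        }
    }
    return result
}

/// The handler-free entry point delegates with a nil handler.
func sketchDiarize(windowCount: Int) -> SketchResult {
    sketchDiarize(windowCount: windowCount, progressHandler: nil)
}

// Usage: record every reported progress value and never cancel.
var reported: [Float] = []
let full = sketchDiarize(windowCount: 2) { progress, _ in
    reported.append(progress)
    return true
}
```

Because progress is derived from a counter rather than from weighted stage estimates, each callback advances by exactly 1 / totalUnits.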

Motivation

For long audio files (e.g. 40+ minutes), diarize() can take many minutes. Without progress reporting, callers have no way to show meaningful progress to users. Additionally, users need the ability to cancel a long-running diarization — because diarize() is synchronous, Swift Task cancellation alone cannot interrupt it.

Changes

  • Sources/SpeechVAD/DiarizationPipeline.swift: new diarize(audio:sampleRate:config:progressHandler:) overload whose handler returns Bool; cancellation checks in the VAD pre-filter, segmentation loop, and embedding loop
  • Tests/SpeechVADTests/DiarizationPipelineTests.swift: unit tests (API signature, config defaults) and E2E tests (monotonic progress, cancellation returns an empty result, nil handler)
  • docs/inference/speaker-diarization.md: progress reporting and cancellation usage examples

Test plan

  • Unit tests pass (swift test --filter DiarizationPipelineTests --skip E2E)
  • E2E tests pass with model downloads
  • Existing diarize(audio:sampleRate:config:) API works identically (backward compatible)
  • Cancellation stops diarization within one window's inference time
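
A check for the "monotonic progress" property mentioned in the test plan might look like the sketch below. The helper name `isValidProgressSequence` is an assumption made for illustration; the PR's actual test code is not reproduced here, and the loop only simulates handler callbacks.

```swift
/// True when every value lies in [0, 1] and no value is smaller than its predecessor.
func isValidProgressSequence(_ values: [Float]) -> Bool {
    guard values.allSatisfy({ $0 >= 0 && $0 <= 1 }) else { return false }
    return zip(values, values.dropFirst()).allSatisfy { $0 <= $1 }
}

// Collect whatever a handler receives, then validate the whole sequence.
var recorded: [Float] = []
let handler: (Float, String) -> Bool = { progress, _ in
    recorded.append(progress)
    return true
}
// Simulated callbacks standing in for a real diarize() run.
for unit in 1...4 { _ = handler(Float(unit) / 4, "segmentation") }
let progressIsValid = isValidProgressSequence(recorded)
```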

Add an overloaded diarize() method that accepts an optional
progressHandler callback reporting (progress: Float, stage: String).

Progress is calculated from actual completed work units:
- Total units = windowCount * 2 (segmentation + embedding extraction)
- Each window completion increments the counter
- No estimated weights or magic numbers

The original diarize(audio:sampleRate:config:) method is unchanged
and delegates to the new overload with nil handler.

- Unit test: verify progressHandler overload compiles, DiarizationConfig
  defaults, DiarizationResult construction
- E2E test: verify progress values are monotonically non-decreasing and
  within [0, 1] range, original API still works without handler
- Docs: add progress reporting example to speaker-diarization.md

Change progressHandler return type from Void to Bool. Returning false
stops diarization at the next window boundary and returns an empty
DiarizationResult. This enables callers to cancel long-running
diarization without waiting for the full pipeline to complete.
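
One way a caller could drive this kind of cancellation is to have the handler consult a shared flag. The sketch below is illustrative only: `CancellationFlag` and `runWindows` are hypothetical names, and `runWindows` merely imitates a synchronous window loop rather than calling the real pipeline.

```swift
import Foundation

/// Thread-safe flag that a UI or another thread can set to request cancellation.
final class CancellationFlag {
    private let lock = NSLock()
    private var cancelled = false

    func cancel() {
        lock.lock(); defer { lock.unlock() }
        cancelled = true
    }

    var isCancelled: Bool {
        lock.lock(); defer { lock.unlock() }
        return cancelled
    }
}

/// Hypothetical synchronous job: returns the processed window indices,
/// or an empty array when the handler asks to cancel.
func runWindows(count: Int, progressHandler: ((Float, String) -> Bool)?) -> [Int] {
    var processed: [Int] = []
    for index in 0..<count {
        processed.append(index)
        let progress = Float(index + 1) / Float(count)
        if let handler = progressHandler, !handler(progress, "segmentation") {
            return []
        }
    }
    return processed
}

let flag = CancellationFlag()
var callbacks = 0
let outcome = runWindows(count: 10) { _, _ in
    callbacks += 1
    if callbacks == 3 { flag.cancel() }   // simulate a user pressing "Cancel"
    return !flag.isCancelled              // true = continue, false = cancel
}
```

When the call runs inside a Swift Task, returning `!Task.isCancelled` from the handler would play the same role as the flag.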
@SargerasWang changed the title from "feat: add progress handler to PyannoteDiarizationPipeline.diarize()" to "feat: add progress handler with cancellation to PyannoteDiarizationPipeline.diarize()" on Apr 7, 2026
…arize()

Same pattern as PyannoteDiarizationPipeline: progressHandler returns
Bool (true=continue, false=cancel). Progress is reported per streaming
chunk. The original no-handler overload delegates with nil.

@ivan-digital (Collaborator) left a comment

Clean implementation — backward compatible API, good progress granularity per window, cancellation at window boundaries is the right tradeoff. Tests cover the key cases well. Thanks for adding this, progress reporting for long diarization jobs has been a gap.

@ivan-digital ivan-digital merged commit 35fd73f into soniqo:main Apr 7, 2026
1 check passed