feat: add reusable qwen3 reference conditioning #156

Draft
gaelic-ghost wants to merge 3 commits into Blaizzy:main from gaelic-ghost:add/qwenReferenceConditioning

Conversation

@gaelic-ghost
Contributor

Summary

  • add a reusable Qwen3TTSReferenceConditioning surface for consumer-managed clone/reference preparation
  • split reusable reference-side preparation from request-specific generation input assembly
  • add generation overloads that accept conditioning:
  • keep the existing refAudio / refText path routed through the same underlying implementation
  • add tests covering the reusable conditioning flow

Why

Presently, when the same refAudio and refText are reused across multiple generations, the model repeats the same conditioning work each time. This change adds a surface that lets consumers do that work once and receive a value containing the conditioning, which they can store and reuse as they wish, without introducing instance-local cache state.
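The prepare-once, reuse-many pattern described above can be sketched roughly as follows. This is a toy illustration only: `ReferenceConditioning`, `ToyModel`, `prepareConditioning`, and the stored fields are hypothetical stand-ins, not the PR's actual Qwen3TTSReferenceConditioning type or generate signatures, and the "conditioning work" is faked with trivial arithmetic.

```swift
// Hypothetical stand-ins; none of these names or fields come from the PR itself.
struct ReferenceConditioning {
    let speakerEmbedding: [Float]   // assumed: derived from the reference audio
    let referenceTokens: [Int]      // assumed: derived from the reference text
}

final class ToyModel {
    // Counts how often the reference-side work runs.
    private(set) var preparationCount = 0

    // Reusable, reference-side work: depends only on refAudio / refText.
    func prepareConditioning(refAudio: [Float], refText: String) -> ReferenceConditioning {
        preparationCount += 1
        let mean = refAudio.isEmpty ? 0 : refAudio.reduce(0, +) / Float(refAudio.count)
        let tokens = refText.unicodeScalars.map { Int($0.value) }
        return ReferenceConditioning(speakerEmbedding: [mean], referenceTokens: tokens)
    }

    // Request-specific work: combines prepared conditioning with the new text.
    func generate(text: String, conditioning: ReferenceConditioning) -> String {
        "\(text) [ref tokens: \(conditioning.referenceTokens.count)]"
    }

    // Convenience path routed through the same underlying implementation.
    func generate(text: String, refAudio: [Float], refText: String) -> String {
        let conditioning = prepareConditioning(refAudio: refAudio, refText: refText)
        return generate(text: text, conditioning: conditioning)
    }
}

let model = ToyModel()
// Prepare once; the caller owns the value, so the model keeps no cache state.
let conditioning = model.prepareConditioning(refAudio: [0.1, 0.2, 0.3], refText: "hello")
let first = model.generate(text: "First line", conditioning: conditioning)
let second = model.generate(text: "Second line", conditioning: conditioning)
```

In this sketch the caller holds the prepared value, so both generations share one preparation pass and the model itself stays free of cache state.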

Notes

This is the branch I referenced in #149, restored as a clean branch on top of current main.

Why:
- external callers can use the public Qwen3TTSReferenceConditioning type in generate APIs
- without a public initializer, that type could not actually be constructed outside the module
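As background for the point above: Swift synthesizes a memberwise initializer for a struct, but that synthesized initializer is at most internal, even when the struct and its properties are public. A minimal sketch follows, using a hypothetical `ExampleConditioning` type whose fields are assumptions and not the real members of Qwen3TTSReferenceConditioning.

```swift
// Hypothetical type for illustration; the fields are assumptions.
public struct ExampleConditioning {
    public let referenceText: String
    public let language: String?

    // Without this explicit public initializer, only the synthesized
    // internal memberwise initializer exists, and code in other modules
    // cannot construct the type.
    public init(referenceText: String, language: String? = nil) {
        self.referenceText = referenceText
        self.language = language
    }
}

// With the public initializer, code outside the defining module can
// construct the value in the same way.
let example = ExampleConditioning(referenceText: "hello")
```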

Verification:
- swift build
- swift test
- xcodebuild build-for-testing -scheme MLXAudio-Package -destination 'platform=macOS' MACOSX_DEPLOYMENT_TARGET=14.0 CODE_SIGNING_ALLOWED=NO
- xcodebuild test-without-building -scheme MLXAudio-Package -destination 'platform=macOS' -skip-testing:'MLXAudioTests/SmokeTests' -parallel-testing-enabled NO CODE_SIGNING_ALLOWED=NO
}
}

private func resolveVoiceDesignGenerationSettings(
Collaborator

I'm not sure I get the idea of this struct -- it seems to be the same as GenerateParameters plus the language from the conditioning, but we already have the default generate params to resolve the sampling defaults, so introducing this additional type at the call sites looks like a net increase in code and complexity.

@lucasnewman
Collaborator

@gaelic-ghost Can you resolve the conflicts with the main branch and see the comment?
