Add isSpeaking function to NVDA controller client#20179
Conversation
|
Closing for now.
|
|
@seanbudd I'm confused. Can you explain how this is a non-trivial change? |
|
By this, I mean it doesn't break or affect anything, since it's just the controller client and creates a new interface. |
|
This is a new feature. For new features, we generally want some justification for why they should exist (e.g. user stories). |
|
@seanbudd Sure, but it's a new feature that my TTS library, Prism, can take advantage of, and would also eliminate screen reader calibration being needed in games and other environments which need to know when the screen reader has finished speaking. |
|
Thanks - can you open a feature request or developer facing changes issue to document the problem, and serve as a place for discussion |
|
I think we should avoid bumping the major version for nvdaController client, only bumping the minor version |
|
So do you mean not creating a new interface? I'm a bit hesitant to merge the two since I don't know how that would break unmaintained copies of the controller client (and there are quite a few floating around). |
|
I would suggest |
|
@seanbudd How do you do a version increment like that, allowing for version 1.0 at the same time? (I'm not super familiar with MIDL files and how versioning works.) |
seanbudd
left a comment
There was a problem hiding this comment.
Don't worry about the versioning, it doesn't matter much. Generally looks good to me
There was a problem hiding this comment.
Pull request overview
This PR adds an isSpeaking query to NVDA’s speech module and exposes it through the NVDA Controller RPC/client API so external processes can determine whether NVDA is currently producing speech audio.
Changes:
- Added
speech.isSpeaking()to report whether NVDA is currently speaking (per speech mode/paused state and speech manager status). - Exported
isSpeakingfromsource/speech/__init__.py. - Introduced a new controller RPC interface (
NvdaController3) and wired it through server/client glue (IDL/ACF, RPC server registration, exports, binding handle setup).
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| source/speech/speech.py | Adds isSpeaking() implementation based on speech state and speech manager “still speaking” tracking. |
| source/speech/init.py | Re-exports isSpeaking from the speech package. |
| source/NVDAHelper/init.py | Adds nvdaController_isSpeaking callback implementation and registers it with nvdaHelperLocal. |
| nvdaHelper/local/rpcSrv.cpp | Registers the new controller v3 RPC interface with the local RPC server. |
| nvdaHelper/local/nvdaHelperLocal.def | Exports the _nvdaController_isSpeaking function pointer for nvdaHelperLocal. |
| nvdaHelper/local/nvdaController.cpp | Adds the local wrapper and function pointer for nvdaController_isSpeaking. |
| nvdaHelper/interfaces/nvdaController/nvdaController.idl | Defines NvdaController3::isSpeaking. |
| nvdaHelper/interfaces/nvdaController/nvdaController.acf | Adds binding handle configuration for the new interface. |
| nvdaHelper/client/nvdaControllerClient.def | Exports nvdaController_isSpeaking from the controller client library. |
| nvdaHelper/client/client.cpp | Creates/frees an RPC binding handle for the new controller v3 interface. |
LeonarddeR
left a comment
There was a problem hiding this comment.
I'm not sure that you actually need this function for your goal, see #20188 (comment).
That said, I'm not against adding this at all, I believe that JAWS has such a function as well.
Note that this function will never tell consumers whether the speech that is currently being spoken is originating from your app. That can be done with SSML marks. If you need your app to detect speaking not caused by your app, then adding this function is the right approach I think.
|
@LeonarddeR I know SSML can do this, but SSML speech is not caught by add-ons such as speech history (which is a problem with speech history I'm guessing and not with NVDA core). |
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
…indp/nvda into nvda-controller-enhancements
In that case, I guess it is better to fixup speech history for that matter. SPeaking SSML uses speech.speak so it should call all extension points fine. |
…indp/nvda into nvda-controller-enhancements
Link to issue number:
N/A
Summary of the issue:
Right now, NVDA provides no way of detecting when it is speaking. This PR attempts to close that gap.
Description of user facing changes:
N/A
Description of developer facing changes:
A new method,
isSpeaking, is added to the speech module. This is a composite check, in that it returns false if either the speech mode is not talk or if speech is paused.Description of development approach:
In the NVDA codebase, "speaking" is a highly ambiguous term. It could mean that the synth driver is actively producing audio or speech is queued but may not be pushed to the synth driver. In this PR, we define "speaking" as the former: NVDA is "speaking" when the synth is actually speaking/producing audio, not when NVDA has speech queued. We do this because this naturally maps to what a user would consider as "speaking" intuitively.
Testing strategy:
This is a difficult property to actively test. Right now, I know that the implementation works over the RPC endpoint at least.
Known issues with pull request:
Code Review Checklist: