Add isSpeaking function to NVDA controller client #20179

Draft
ethindp wants to merge 9 commits into nvaccess:master from ethindp:nvda-controller-enhancements

Conversation

@ethindp
Contributor

@ethindp ethindp commented May 19, 2026

Link to issue number:

N/A

Summary of the issue:

Right now, NVDA provides no way of detecting when it is speaking. This PR attempts to close that gap.

Description of user facing changes:

N/A

Description of developer facing changes:

A new function, isSpeaking, is added to the speech module. It is a composite check: it returns false if the speech mode is not talk or if speech is paused.

Description of development approach:

In the NVDA codebase, "speaking" is an ambiguous term. It could mean that the synth driver is actively producing audio, or that speech is queued but has not necessarily been pushed to the synth driver yet. In this PR, we define "speaking" as the former: NVDA is "speaking" when the synth is actually producing audio, not when NVDA merely has speech queued. We chose this definition because it maps naturally to what a user would intuitively consider "speaking".
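
To make the definition concrete, here is a minimal sketch of the check as described above. It is not the code in this PR: `SpeechMode`, `SpeechState`, and the `synth_producing_audio` flag are hypothetical stand-ins for whatever NVDA's speech module and speech manager actually track.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class SpeechMode(Enum):
    """Hypothetical stand-in for NVDA's speech modes; only `talk` matters here."""
    off = auto()
    beeps = auto()
    talk = auto()


@dataclass
class SpeechState:
    """Hypothetical snapshot of the state the check consults."""
    mode: SpeechMode = SpeechMode.talk
    paused: bool = False
    synth_producing_audio: bool = False
    queued: list[str] = field(default_factory=list)


def is_speaking(state: SpeechState) -> bool:
    # Composite check: any mode other than talk, or paused speech, is "not speaking".
    if state.mode is not SpeechMode.talk or state.paused:
        return False
    # Queued speech alone does not count; only audio the synth is producing does.
    return state.synth_producing_audio
```

Under these assumptions, `is_speaking(SpeechState(queued=["hello"]))` is false, because the utterance is queued but the synth is not yet producing audio.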

Testing strategy:

This is a difficult property to actively test. Right now, I know that the implementation works over the RPC endpoint at least.

Known issues with pull request:

Code Review Checklist:

  • Documentation:
    • Change log entry
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • API is compatible with existing add-ons.
  • Security precautions taken.

@ethindp ethindp marked this pull request as ready for review May 19, 2026 23:05
@ethindp ethindp requested a review from a team as a code owner May 19, 2026 23:05
@ethindp ethindp requested a review from SaschaCowley May 19, 2026 23:05
@seanbudd seanbudd added the blocked/needs-product-decision A product decision needs to be made. Decisions about NVDA UX or supported use-cases. label May 20, 2026
@seanbudd seanbudd requested review from seanbudd and removed request for SaschaCowley May 20, 2026 00:07
@seanbudd
Member

seanbudd commented May 20, 2026

Closing for now.
Please open an issue first, per our contributing guidelines:

For anything other than minor bug fixes, ensure an issue has been filed and triaged. Please understand that we very likely will not accept non-trivial changes that are not discussed first.

@seanbudd seanbudd closed this May 20, 2026
@ethindp
Contributor Author

ethindp commented May 20, 2026

@seanbudd I'm confused. Can you explain how this is a non-trivial change?

@ethindp
Contributor Author

ethindp commented May 20, 2026

By this, I mean it doesn't break or affect anything, since it's just the controller client and creates a new interface.

@seanbudd
Member

This is a new feature. For new features, we generally want some justification for why they should exist (e.g. user stories).

@ethindp
Contributor Author

ethindp commented May 20, 2026

@seanbudd Sure, but it's a new feature that my TTS library, Prism, can take advantage of, and would also eliminate screen reader calibration being needed in games and other environments which need to know when the screen reader has finished speaking.
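
As an illustration of the use case above, a client could poll the new query until the screen reader goes quiet. This is only a sketch: the exported name `nvdaController_isSpeaking` appears in this PR's file summary, but its signature is not shown in this conversation, so the helper below takes any zero-argument callable instead of calling the client library directly. The helper name and its parameters are hypothetical.

```python
import time
from typing import Callable


def wait_until_silent(
    is_speaking: Callable[[], bool],
    timeout: float = 10.0,
    poll_interval: float = 0.05,
) -> bool:
    """Poll until is_speaking() returns False; return False if the timeout expires first."""
    deadline = time.monotonic() + timeout
    while is_speaking():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

Because the PR treats queued-but-not-yet-audible speech as "not speaking", a caller that cares about gaps between utterances may want to require several consecutive silent polls before proceeding.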

@seanbudd
Member

Thanks. Can you open a feature request or developer-facing changes issue to document the problem and serve as a place for discussion?

@seanbudd
Member

I think we should avoid bumping the major version of the nvdaController client and only bump the minor version.

@seanbudd seanbudd reopened this May 25, 2026
@seanbudd seanbudd added conceptApproved Similar 'triaged' for issues, PR accepted in theory, implementation needs review. and removed blocked/needs-product-decision A product decision needs to be made. Decisions about NVDA UX or supported use-cases. labels May 25, 2026
@ethindp
Contributor Author

ethindp commented May 25, 2026

So do you mean not creating a new interface? I'm a bit hesitant to merge the two since I don't know how that would break unmaintained copies of the controller client (and there are quite a few floating around).

@seanbudd
Member

I would suggest nvdaController_NvdaController2_v1_1_s_ifspec instead of nvdaController_NvdaController3_v1_0_s_ifspec

@ethindp
Contributor Author

ethindp commented May 26, 2026

@seanbudd How do you do a version increment like that, allowing for version 1.0 at the same time? (I'm not super familiar with MIDL files and how versioning works.)

Member

@seanbudd seanbudd left a comment

Don't worry about the versioning; it doesn't matter much. Generally looks good to me.

Comment thread nvdaHelper/client/client.cpp
Comment thread nvdaHelper/interfaces/nvdaController/nvdaController.idl Outdated
Comment thread source/NVDAHelper/__init__.py Outdated
Comment thread source/speech/speech.py Outdated
Contributor

Copilot AI left a comment

Pull request overview

This PR adds an isSpeaking query to NVDA’s speech module and exposes it through the NVDA Controller RPC/client API so external processes can determine whether NVDA is currently producing speech audio.

Changes:

  • Added speech.isSpeaking() to report whether NVDA is currently speaking (per speech mode/paused state and speech manager status).
  • Exported isSpeaking from source/speech/__init__.py.
  • Introduced a new controller RPC interface (NvdaController3) and wired it through server/client glue (IDL/ACF, RPC server registration, exports, binding handle setup).

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| source/speech/speech.py | Adds isSpeaking() implementation based on speech state and speech manager “still speaking” tracking. |
| source/speech/__init__.py | Re-exports isSpeaking from the speech package. |
| source/NVDAHelper/__init__.py | Adds nvdaController_isSpeaking callback implementation and registers it with nvdaHelperLocal. |
| nvdaHelper/local/rpcSrv.cpp | Registers the new controller v3 RPC interface with the local RPC server. |
| nvdaHelper/local/nvdaHelperLocal.def | Exports the _nvdaController_isSpeaking function pointer for nvdaHelperLocal. |
| nvdaHelper/local/nvdaController.cpp | Adds the local wrapper and function pointer for nvdaController_isSpeaking. |
| nvdaHelper/interfaces/nvdaController/nvdaController.idl | Defines NvdaController3::isSpeaking. |
| nvdaHelper/interfaces/nvdaController/nvdaController.acf | Adds binding handle configuration for the new interface. |
| nvdaHelper/client/nvdaControllerClient.def | Exports nvdaController_isSpeaking from the controller client library. |
| nvdaHelper/client/client.cpp | Creates/frees an RPC binding handle for the new controller v3 interface. |

Comment thread source/speech/speech.py Outdated
Comment thread source/speech/speech.py
Comment thread source/speech/speech.py
Comment thread source/NVDAHelper/__init__.py
Comment thread nvdaHelper/local/rpcSrv.cpp Outdated
Comment thread nvdaHelper/client/client.cpp Outdated
Comment thread nvdaHelper/local/nvdaController.cpp
@seanbudd seanbudd marked this pull request as draft May 26, 2026 04:41
Collaborator

@LeonarddeR LeonarddeR left a comment

I'm not sure that you actually need this function for your goal, see #20188 (comment).
That said, I'm not against adding this at all, I believe that JAWS has such a function as well.
Note that this function will never tell consumers whether the speech currently being spoken originates from your app. That can be done with SSML marks. If you need your app to detect speech not caused by your app, then I think adding this function is the right approach.
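
For readers unfamiliar with the SSML mark approach mentioned here, the idea is to append a named mark after the text, so that reaching the mark signals that your own utterance has finished. The sketch below only builds such an SSML string. The helper name is hypothetical, and handing the string to the controller client and receiving the mark notification are not shown.

```python
from xml.sax.saxutils import escape, quoteattr


def ssml_with_end_mark(text: str, mark_name: str) -> str:
    """Wrap text in <speak> and append a <mark> that is reached after the text."""
    # escape() protects &, < and > in the spoken text; quoteattr() quotes the mark name.
    return f"<speak>{escape(text)}<mark name={quoteattr(mark_name)}/></speak>"


print(ssml_with_end_mark("a < b", "done"))
# <speak>a &lt; b<mark name="done"/></speak>
```

Using a unique mark name per utterance would let an app tell its own utterances apart, which a global speaking query cannot do.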

@ethindp
Contributor Author

ethindp commented May 26, 2026

@LeonarddeR I know SSML can do this, but SSML speech is not caught by add-ons such as speech history (which is a problem with speech history I'm guessing and not with NVDA core).

@LeonarddeR
Collaborator

> @LeonarddeR I know SSML can do this, but SSML speech is not caught by add-ons such as speech history (which is a problem with speech history I'm guessing and not with NVDA core).

In that case, I guess it is better to fix speech history. Speaking SSML uses speech.speak, so it should call all extension points fine.

Labels

conceptApproved Similar 'triaged' for issues, PR accepted in theory, implementation needs review.

4 participants