fix: resolve 'TODO: last event is not final' error in streaming mode #638

mahendrarathore1742 wants to merge 4 commits into google:main
Conversation
(google#600) When using streaming mode (e.g., with a2a remote agents), Flow.Run() would return the error 'TODO: last event is not final' if the last event from runOneStep had Partial=true. This is a legitimate state that occurs when:

- The model reaches the max token limit during streaming
- A sub-agent (e.g., an a2a remote agent) emits partial streaming events

The turn is complete in these cases, so we simply return instead of erroring.

Added regression test TestFlowRunPartialLastEvent with 3 sub-cases covering single partial, multiple partial, and mixed partial/final response scenarios.

Fixes google#600
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the robustness of the streaming functionality by addressing an issue where a partial last event caused Flow.Run() to return an error even though the turn was complete.
Activity
Code Review
This pull request correctly addresses an error that occurred in streaming mode when the last event was a partial response. The fix involves simply returning instead of erroring out, which is the correct behavior for a completed turn. The addition of a comprehensive regression test with multiple sub-cases ensures this scenario is well-covered. My review includes a minor suggestion to improve the clarity and maintainability of the new test code.
```go
// partialOnlyModel is a mock LLM that always returns partial responses in streaming mode.
// This simulates the scenario where the model's last streaming chunk has Partial=true,
// for example when reaching a max token limit.
type partialOnlyModel struct {
	responses []*model.LLMResponse
}

func (m *partialOnlyModel) Name() string { return "partial-mock" }

func (m *partialOnlyModel) GenerateContent(_ context.Context, _ *model.LLMRequest, _ bool) iter.Seq2[*model.LLMResponse, error] {
	return func(yield func(*model.LLMResponse, error) bool) {
		for _, r := range m.responses {
			if !yield(r, nil) {
				return
			}
		}
	}
}
```
The name partialOnlyModel and its associated comment are slightly misleading. The comment states it "always returns partial responses", but it's used in a test case that includes a non-partial response. To improve clarity and future maintainability, I've suggested renaming it to mockLLM and updating the comment. You will also need to update its usage on line 666.
```diff
-// partialOnlyModel is a mock LLM that always returns partial responses in streaming mode.
-// This simulates the scenario where the model's last streaming chunk has Partial=true,
-// for example when reaching a max token limit.
-type partialOnlyModel struct {
-	responses []*model.LLMResponse
-}
-
-func (m *partialOnlyModel) Name() string { return "partial-mock" }
-
-func (m *partialOnlyModel) GenerateContent(_ context.Context, _ *model.LLMRequest, _ bool) iter.Seq2[*model.LLMResponse, error] {
-	return func(yield func(*model.LLMResponse, error) bool) {
-		for _, r := range m.responses {
-			if !yield(r, nil) {
-				return
-			}
-		}
-	}
-}
+// mockLLM is a mock LLM that returns a canned sequence of responses.
+// This simulates the scenario where the model's last streaming chunk has Partial=true,
+// for example when reaching a max token limit.
+type mockLLM struct {
+	responses []*model.LLMResponse
+}
+
+func (m *mockLLM) Name() string { return "mock-llm" }
+
+func (m *mockLLM) GenerateContent(_ context.Context, _ *model.LLMRequest, _ bool) iter.Seq2[*model.LLMResponse, error] {
+	return func(yield func(*model.LLMResponse, error) bool) {
+		for _, r := range m.responses {
+			if !yield(r, nil) {
+				return
+			}
+		}
+	}
+}
```
Address code review feedback: the mock type name was misleading since it's also used in a test case with non-partial responses. Renamed to mockStreamingLLM to avoid both the naming confusion and a conflict with the existing mockLLM in the same package.
Thanks for the review! Good catch, the name was indeed misleading. I renamed it to mockStreamingLLM instead of mockLLM because mockLLM already exists in outputschema_processor_test.go in the same package and would cause a compilation error. Updated and pushed.