feat(go): added common middleware (e.g. tool approval, retry, fallback) #4719

Merged
apascal07 merged 69 commits into main from ap/go-middleware-impls
Apr 27, 2026
@apascal07 apascal07 commented Feb 17, 2026

Adds a middleware plugin that bundles five production-ready implementations of the ai.Middleware interface introduced in #4464: Retry with exponential backoff, Fallback across models, ToolApproval for human-in-the-loop gating, Filesystem for scoped file access, and Skills for loadable SKILL.md personas.

Examples

Composing Retry and Fallback

Register the middleware plugin during Init to expose the built-ins to the Dev UI, then attach them with ai.WithUse. Middleware composes outer-to-inner, so ai.WithUse(&Retry{...}, &Fallback{...}) expands to Retry { Fallback { model } }: each model in the fallback list is retried with exponential backoff before the next fallback is tried.

import "github.com/firebase/genkit/go/plugins/middleware"

g := genkit.Init(ctx, genkit.WithPlugins(
    &googlegenai.GoogleAI{},
    &middleware.Middleware{},
))

response, _ := genkit.Generate(ctx, g,
    ai.WithModelName("googleai/gemini-flash-latest"),
    ai.WithPrompt("Explain quantum computing."),
    ai.WithUse(
        &middleware.Retry{MaxRetries: 3},
        &middleware.Fallback{Models: []ai.ModelRef{
            googlegenai.ModelRef("googleai/gemini-3.1-flash", nil),
        }},
    ),
)
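The outer-to-inner ordering can be illustrated without Genkit at all. The following is a minimal sketch using only the standard library; `ModelFunc`, `Wrapper`, `Chain`, and `Tag` are hypothetical names invented for this illustration and are not part of the `ai` package. It only shows which layer is entered first, not the actual retry or fallback logic.

```go
package main

import "fmt"

// ModelFunc is a hypothetical stand-in for a model call (not a Genkit type).
type ModelFunc func(model string) (string, error)

// Wrapper is a hypothetical stand-in for a WrapModel-style hook.
type Wrapper func(next ModelFunc) ModelFunc

// Chain composes wrappers outer-to-inner: Chain(a, b)(m) behaves like a(b(m)).
func Chain(ws ...Wrapper) Wrapper {
	return func(next ModelFunc) ModelFunc {
		// Wrap from the last wrapper backwards so the first one ends up outermost.
		for i := len(ws) - 1; i >= 0; i-- {
			next = ws[i](next)
		}
		return next
	}
}

// Tag returns a wrapper that records its name when it is entered.
func Tag(name string, trace *[]string) Wrapper {
	return func(next ModelFunc) ModelFunc {
		return func(model string) (string, error) {
			*trace = append(*trace, name)
			return next(model)
		}
	}
}

// Order runs a two-layer chain once and returns the order in which layers ran.
func Order() []string {
	var trace []string
	model := func(m string) (string, error) {
		trace = append(trace, "model")
		return "ok", nil
	}
	call := Chain(Tag("retry", &trace), Tag("fallback", &trace))(model)
	call("primary")
	return trace
}

func main() {
	fmt.Println(Order()) // [retry fallback model]
}
```

The first wrapper listed is entered first and sees the result last, which is the shape described above for `ai.WithUse`.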

Tool Approval

ToolApproval interrupts any tool call outside its AllowedTools list. The caller approves (or rejects) and resumes with ai.WithToolRestarts, reusing the existing interrupt/restart machinery:

response, _ := genkit.Generate(ctx, g,
    ai.WithModelName("googleai/gemini-flash-latest"),
    ai.WithPrompt("Transfer $500 to Alice."),
    ai.WithTools(transferTool, lookupTool),
    ai.WithUse(&middleware.ToolApproval{
        AllowedTools: []string{lookupTool.Name()}, // transfer triggers an interrupt
    }),
)
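The allow-list check itself is simple. Below is a rough standard-library sketch of the idea, assuming an interrupt is surfaced as an error and an approved call is simply re-run; `ErrInterrupted`, `ToolFunc`, `Gate`, and `Interrupted` are hypothetical names, not the plugin's API.

```go
package main

import (
	"errors"
	"fmt"
	"slices"
)

// ErrInterrupted is a hypothetical sentinel standing in for a tool interrupt.
var ErrInterrupted = errors.New("tool call interrupted: approval required")

// ToolFunc is a hypothetical stand-in for a tool invocation.
type ToolFunc func(name, input string) (string, error)

// Gate wraps next so that any tool outside allowed is interrupted
// unless the caller has already approved the call.
func Gate(allowed []string, next ToolFunc) func(name, input string, approved bool) (string, error) {
	return func(name, input string, approved bool) (string, error) {
		if !approved && !slices.Contains(allowed, name) {
			return "", fmt.Errorf("%s: %w", name, ErrInterrupted)
		}
		return next(name, input)
	}
}

// Interrupted reports whether err represents an approval interrupt.
func Interrupted(err error) bool { return errors.Is(err, ErrInterrupted) }

func main() {
	run := func(name, input string) (string, error) { return name + " done", nil }
	gate := Gate([]string{"lookup"}, run)

	_, err := gate("transfer", "$500 to Alice", false)
	fmt.Println(Interrupted(err)) // true: transfer is not in the allow list

	out, _ := gate("transfer", "$500 to Alice", true) // approved, so the call runs
	fmt.Println(out)                                  // transfer done
}
```

With an empty allow list every call is interrupted, which matches the documented behaviour of an empty `AllowedTools`.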

Filesystem

Filesystem registers list_files, read_file, and optionally write_file and search_and_replace, all confined to RootDir. Path safety is enforced by os.Root (added to the standard library in Go 1.24; this plugin requires Go 1.25+), which rejects any path that resolves outside the root, including via .., absolute paths, or symlinks:

response, _ := genkit.Generate(ctx, g,
    ai.WithModelName("googleai/gemini-flash-latest"),
    ai.WithPrompt("Summarise docs/ and save the summary to summary.md"),
    ai.WithUse(&middleware.Filesystem{
        RootDir:          "./workspace",
        AllowWriteAccess: true,
    }),
)
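To see the `os.Root` confinement on its own, here is a small standard-library sketch. `ReadScoped` and `Demo` are hypothetical helpers written for this illustration; they are not how the plugin's tools are implemented.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// ReadScoped reads name relative to rootDir through os.Root.
// Paths that resolve outside rootDir produce an error.
func ReadScoped(rootDir, name string) (string, error) {
	root, err := os.OpenRoot(rootDir)
	if err != nil {
		return "", err
	}
	defer root.Close()

	f, err := root.Open(name)
	if err != nil {
		return "", err
	}
	defer f.Close()

	data, err := io.ReadAll(f)
	return string(data), err
}

// Demo builds a throwaway workspace with one file inside and one outside,
// then reports what a scoped read returns and whether the escape was blocked.
func Demo() (string, bool, error) {
	base, err := os.MkdirTemp("", "fs-sketch")
	if err != nil {
		return "", false, err
	}
	defer os.RemoveAll(base)

	workspace := filepath.Join(base, "workspace")
	if err := os.Mkdir(workspace, 0o755); err != nil {
		return "", false, err
	}
	if err := os.WriteFile(filepath.Join(workspace, "notes.txt"), []byte("inside"), 0o644); err != nil {
		return "", false, err
	}
	if err := os.WriteFile(filepath.Join(base, "secret.txt"), []byte("outside"), 0o644); err != nil {
		return "", false, err
	}

	inside, err := ReadScoped(workspace, "notes.txt")
	if err != nil {
		return "", false, err
	}
	_, escapeErr := ReadScoped(workspace, "../secret.txt")
	return inside, escapeErr != nil, nil
}

func main() {
	inside, blocked, err := Demo()
	fmt.Println(inside, blocked, err)
}
```

The sketch uses only `os.OpenRoot` and `Root.Open`, so it does not depend on the additional `Root` methods introduced in later Go releases.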

Skills

Skills exposes a local library of loadable instructions. Each skill is a directory with a SKILL.md file; the middleware advertises available skills (name plus optional description) in the system prompt and registers a use_skill tool that loads the chosen skill's full body into the conversation on demand:

response, _ := genkit.Generate(ctx, g,
    ai.WithModelName("googleai/gemini-flash-latest"),
    ai.WithPrompt("Explain how rainbows form, in the voice of a pirate."),
    ai.WithUse(&middleware.Skills{SkillPaths: []string{"skills"}}),
)
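The directory layout described above (one directory per skill, each holding a SKILL.md) can be scanned with a few standard-library calls. This is only a sketch under assumptions: it treats the directory name as the skill name and ignores any description metadata, and `FindSkills`, `LoadSkill`, and `Demo` are hypothetical helpers rather than the middleware's own functions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// FindSkills returns the names of immediate subdirectories of each path
// that contain a SKILL.md file. The directory name is assumed to be the skill name.
func FindSkills(paths []string) ([]string, error) {
	var names []string
	for _, p := range paths {
		entries, err := os.ReadDir(p)
		if err != nil {
			return nil, err
		}
		for _, e := range entries {
			if !e.IsDir() {
				continue
			}
			if _, err := os.Stat(filepath.Join(p, e.Name(), "SKILL.md")); err == nil {
				names = append(names, e.Name())
			}
		}
	}
	sort.Strings(names)
	return names, nil
}

// LoadSkill returns the full SKILL.md body for a skill found under path.
func LoadSkill(path, name string) (string, error) {
	b, err := os.ReadFile(filepath.Join(path, name, "SKILL.md"))
	return string(b), err
}

// Demo creates a temporary skills directory with one valid skill and one
// directory without SKILL.md, then scans it and loads the valid skill.
func Demo() ([]string, string, error) {
	dir, err := os.MkdirTemp("", "skills-sketch")
	if err != nil {
		return nil, "", err
	}
	defer os.RemoveAll(dir)

	if err := os.MkdirAll(filepath.Join(dir, "pirate"), 0o755); err != nil {
		return nil, "", err
	}
	if err := os.MkdirAll(filepath.Join(dir, "drafts"), 0o755); err != nil {
		return nil, "", err
	}
	skillFile := filepath.Join(dir, "pirate", "SKILL.md")
	if err := os.WriteFile(skillFile, []byte("Talk like a pirate."), 0o644); err != nil {
		return nil, "", err
	}

	names, err := FindSkills([]string{dir})
	if err != nil {
		return nil, "", err
	}
	body, err := LoadSkill(dir, "pirate")
	return names, body, err
}

func main() {
	names, body, err := Demo()
	fmt.Println(names, body, err)
}
```

Listing names up front and loading the body only on request mirrors the advertise-then-load split between the system prompt and the use_skill tool.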

API Reference

Built-in middleware (plugins/middleware)

// Middleware is the plugin that registers the built-ins with the Dev UI.
type Middleware struct{}

// Retry retries failed model calls with exponential backoff and jitter.
// Non-GenkitError errors are always retried; GenkitErrors only if their status is in Statuses.
type Retry struct {
    MaxRetries     int               `json:"maxRetries,omitempty"`      // default 3
    Statuses       []core.StatusName `json:"statuses,omitempty"`        // default: UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED, ABORTED, INTERNAL
    InitialDelayMs int               `json:"initialDelayMs,omitempty"`  // default 1000
    MaxDelayMs     int               `json:"maxDelayMs,omitempty"`      // default 60000
    BackoffFactor  float64           `json:"backoffFactor,omitempty"`   // default 2
    NoJitter       bool              `json:"noJitter,omitempty"`
}

// Fallback tries alternative models when the primary fails with a retryable status.
// Each model's config comes from its ModelRef; the original request config is not inherited.
type Fallback struct {
    Models   []ai.ModelRef     `json:"models,omitempty"`
    Statuses []core.StatusName `json:"statuses,omitempty"` // default also includes NOT_FOUND, UNIMPLEMENTED
}

// ToolApproval interrupts any tool call not present in AllowedTools.
// An empty list interrupts all tools.
type ToolApproval struct {
    AllowedTools []string `json:"allowedTools,omitempty"`
}

// Filesystem grants the model scoped file access under RootDir.
// Registers list_files and read_file; adds write_file and search_and_replace
// when AllowWriteAccess is true.
type Filesystem struct {
    RootDir          string `json:"rootDirectory,omitempty"`
    AllowWriteAccess bool   `json:"allowWriteAccess,omitempty"`
    ToolNamePrefix   string `json:"toolNamePrefix,omitempty"` // prefix tool names when attaching multiple Filesystem middlewares
}

// Skills exposes a local library of SKILL.md files as loadable system instructions.
// Injects a system prompt listing available skills and registers a use_skill tool.
type Skills struct {
    SkillPaths []string `json:"skillPaths,omitempty"` // default []string{"skills"}
}
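For a feel of how the Retry defaults interact, the following sketch computes an un-jittered schedule. It assumes the common formula delay = InitialDelayMs × BackoffFactor^attempt, capped at MaxDelayMs; the exact formula and the jitter strategy used by the plugin are not stated here, and `Delay` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// Delay returns the un-jittered wait before retry number attempt (0-based),
// assuming delay = initialMs * factor^attempt, capped at maxMs.
func Delay(attempt, initialMs, maxMs int, factor float64) time.Duration {
	ms := float64(initialMs) * math.Pow(factor, float64(attempt))
	if ms > float64(maxMs) {
		ms = float64(maxMs)
	}
	return time.Duration(ms) * time.Millisecond
}

func main() {
	// Documented defaults: InitialDelayMs 1000, MaxDelayMs 60000, BackoffFactor 2.
	for attempt := 0; attempt < 7; attempt++ {
		fmt.Println(attempt, Delay(attempt, 1000, 60000, 2))
	}
	// Prints 1s, 2s, 4s, 8s, 16s, 32s, then the cap of 1m0s.
}
```

Under this assumption the cap only matters from the seventh retry onward, so with the default MaxRetries of 3 the waits would be roughly 1s, 2s, and 4s before jitter.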

Each type satisfies the simplified ai.Middleware contract from #4464 by implementing Name() string and New(ctx) (*ai.Hooks, error). New returns a per-call hook bundle (Tools, WrapGenerate, WrapModel, WrapTool); per-call state — like Filesystem's message queue and resolved os.Root, or Skills' scanned skill set — lives in closures captured by New so concurrent calls stay isolated.
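The closure-captured state can be sketched with trimmed-down stand-in types. Everything below is hypothetical and simplified for illustration: `Hooks` has a single hook with a string-based signature, and `CallCounter` and `Run` are invented names. The real `ai.Hooks` and `ai.Middleware` types have richer signatures.

```go
package main

import (
	"context"
	"fmt"
)

// Hooks is a hypothetical, trimmed-down stand-in for ai.Hooks with one hook.
type Hooks struct {
	WrapModel func(next func(prompt string) string) func(prompt string) string
}

// Middleware mirrors the two-method shape: Name plus New.
type Middleware interface {
	Name() string
	New(ctx context.Context) (*Hooks, error)
}

// CallCounter is a hypothetical config struct that is itself the middleware.
type CallCounter struct {
	Label string
}

func (c *CallCounter) Name() string { return "callCounter" }

// New returns a fresh hook bundle; count lives in the closure, so every
// call to New starts from zero and concurrent calls never share it.
func (c *CallCounter) New(ctx context.Context) (*Hooks, error) {
	count := 0
	return &Hooks{
		WrapModel: func(next func(string) string) func(string) string {
			return func(prompt string) string {
				count++
				return fmt.Sprintf("%s#%d: %s", c.Label, count, next(prompt))
			}
		},
	}, nil
}

// Run simulates one generate call: it builds hooks once, then runs each prompt.
func Run(m Middleware, prompts ...string) ([]string, error) {
	hooks, err := m.New(context.Background())
	if err != nil {
		return nil, err
	}
	model := func(p string) string { return "echo " + p }
	if hooks.WrapModel != nil { // a nil hook passes through
		model = hooks.WrapModel(model)
	}
	var out []string
	for _, p := range prompts {
		out = append(out, model(p))
	}
	return out, nil
}

func main() {
	mw := &CallCounter{Label: "turn"}
	a, _ := Run(mw, "one", "two")
	b, _ := Run(mw, "three")
	fmt.Println(a) // [turn#1: echo one turn#2: echo two]
	fmt.Println(b) // [turn#1: echo three]
}
```

The second `Run` starts counting from 1 again even though it reuses the same config value, which is the isolation property the paragraph above describes.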


Notes

  • Builds on the simplified Middleware interface from #4464 (feat(go/ai): added DefineMiddleware (Middleware V2)): a config struct plus New(ctx) (*ai.Hooks, error), with tools surfaced via Hooks.Tools.
  • Requires Go 1.25+ (Filesystem depends on os.Root).
  • Samples added under go/samples/basic-middleware/: retry-fallback, filesystem, and skills.
  • Also included in this branch: tighter error wrapping in plugins/googlegenai, a streaming ordering fix that preserves generate > model > tool turn order when resuming restarted tools, a refactor that splits Fallback and Retry into separate middlewares, and a streaming-format-handler fix so chunks emitted from WrapGenerate accumulate alongside model-emitted chunks.

@apascal07 apascal07 changed the base branch from main to ap/go-middleware February 17, 2026 20:43
@github-actions github-actions Bot added the docs, js, go, and tooling labels Feb 17, 2026
@apascal07 apascal07 marked this pull request as ready for review April 16, 2026 16:52

Rework the Go middleware primitives introduced in PR #4464 to collapse
configuration and behavior into a single "config struct is the middleware"
model and remove the descriptor/factory/prototype scaffolding.

- Drop the Middleware interface (Name/New/WrapGenerate/WrapModel/WrapTool/Tools)
  and the BaseMiddleware embedding helper. Introduce Hooks as a plain struct
  of optional hook func fields (WrapGenerate, WrapModel, WrapTool, Tools);
  nil hooks pass through.
- Repurpose Middleware as an interface with just Name() + New(ctx), which a
  user-facing config struct implements directly. Passing a config value to
  WithUse runs its New on the local fast path with no registry lookup, so
  pure-Go code works without plugin registration.
- NewMiddleware[M](description, prototype) captures the typed prototype in a
  closure stored on MiddlewareDesc.buildFromJSON, preserving unexported
  plugin-level state across JSON-dispatched calls via value-copy.
- MiddlewareDesc returns to being the shared schemas.config-generated type
  with the private factory added via the existing `field` directive.
- Rename MiddlewarePlugin.ListMiddleware to Middlewares to align with the
  upcoming V2 naming conventions.
- Replace Inline with MiddlewareFunc, a canonical Go adapter type that
  satisfies Middleware for ad-hoc closure-based middleware.
- Add genkit.DefineMiddleware and genkit.LookupMiddleware wrappers with
  complete godoc matching the DefineTool/LookupTool style.

Fixes carried over from the initial review:
- Preserve MultipartToolResponse.Content through the resume path in
  handleResumedToolRequest (previously dropped).
- Change WrapTool return type to *MultipartToolResponse so metadata and
  content flow through without an out-of-band capture hack.
- Reject duplicate middleware-contributed tool names explicitly in
  GenerateWithRequest instead of panicking at registry registration.
- Build the WrapGenerate, WrapModel, and WrapTool hook chains once per
  GenerateWithRequest rather than rebuilding them on every tool-loop turn.
- Export NewToolInterruptError so WrapTool hooks can interrupt tools
  without constructing a ToolContext.

Tests rewritten against the new shape and expanded to cover: plugin-state
value-copy, call-level state isolation, MiddlewareFunc adapter, nil hooks,
stream chunk accumulation, tool contribution, duplicate-tool rejection,
factory error propagation, WrapTool interrupts, per-iteration WrapGenerate,
and metadata round-trip through WrapTool. All green under -race.

Adopt the new Hooks-based middleware architecture and update the
built-in middleware implementations (Retry, Fallback, ToolApproval,
Filesystem, Skills) to match.

- Replace ai.BaseMiddleware embedding + WrapX methods with
  New(ctx) (*ai.Hooks, error) returning a per-call hooks bundle
- Switch WrapTool to return *ai.MultipartToolResponse (preserves
  Content/Metadata through the resume path)
- Move Filesystem's per-call queue + os.Root open into the closure
  returned from New(); tools are now exposed via Hooks.Tools
- Skills now scans paths inside New() and returns Tools via Hooks.Tools
- Rename plugin's ListMiddleware -> Middlewares to match the new
  ai.MiddlewarePlugin interface

Also re-applies the streaming format handler fix from this branch on
top of the new generate.go pipeline so middleware-emitted chunks share
the model's accumulator. Tests pass under -race.

Base automatically changed from ap/go-middleware to main April 20, 2026 17:27
@apascal07 apascal07 requested a review from a team as a code owner April 20, 2026 17:27
@apascal07 apascal07 removed the request for review from huangjeff5 April 20, 2026 17:34
# Conflicts:
#	go/ai/generate.go
#	go/ai/middleware_test.go
Comment thread go/plugins/middleware/tool_approval.go Outdated
Comment thread go/plugins/middleware/retry.go Outdated
@apascal07 apascal07 requested a review from pavelgj April 24, 2026 03:21
@apascal07 apascal07 merged commit 69e903b into main Apr 27, 2026
10 checks passed
@apascal07 apascal07 deleted the ap/go-middleware-impls branch April 27, 2026 16:55
Labels

config, docs, go, js, python, tooling

4 participants