perf(inference): freeze-time BatchNorm folding (Conv/Dense → BN) — Phase 6 by ooples · Pull Request #1473 · ooples/AiDotNet

ooples · 2026-05-30T21:58:57Z

Phase 6 — freeze-time super-folding (compiled-inference plan)

Folds inference-time BatchNorm into a preceding identity-activation linear op and removes the BN layer — the canonical ResNet/VGG/EfficientNet inference optimization, extended to Dense→BN.

What

At inference, BatchNorm is a fixed per-channel affine map y = γ·(x−μ)/√(σ²+ε) + β, where μ and σ² are the running mean and variance. Directly after a linear op z = W·x + b (identity activation), BN(z) = W'·x + b' with s = γ/√(σ²+ε), W' = W·s, b' = (b−μ)·s + β. The fold is exact up to floating-point rounding, and it eliminates a per-element pass plus an intermediate tensor per block.
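The substitution behind W' and b' can be written out per output channel. The index notation here (c for the output channel or feature, i for the flattened input index) is an assumption added for illustration; the PR text itself uses the unindexed form.

```latex
\begin{aligned}
s_c &= \frac{\gamma_c}{\sqrt{\sigma_c^2 + \varepsilon}} \\
\mathrm{BN}(z)_c &= s_c\,(z_c - \mu_c) + \beta_c \\
&= s_c\Big(\sum_i W_{c,i}\,x_i + b_c - \mu_c\Big) + \beta_c \\
&= \sum_i \underbrace{(s_c\,W_{c,i})}_{W'_{c,i}}\,x_i
   + \underbrace{(b_c - \mu_c)\,s_c + \beta_c}_{b'_c}
\end{aligned}
```

Because s_c depends only on the output channel, every weight feeding channel c is scaled by the same factor, which is why the fold is per output channel for Conv2D and per output feature for Dense.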

NeuralNetworkBase.FoldBatchNormForInference() — folds Conv2D→BN (per output channel) and Dense→BN (per output feature) in place, then removes the BN layer. Guards: only folds across identity/no activation, and on matching channel counts.
InferenceOptimizer.OptimizeForInference gains an ApplyLayerFusion step (before attention rewrites / quantization), gated by InferenceOptimizationConfig.EnableLayerFusion (default true — lossless). Clone-before-mutate now also triggers for foldable Conv/Dense→BN.
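As a rough illustration of the Dense→BN case, the following standalone sketch applies the fold to plain Python lists and compares the result against running the two ops separately. It is not the AiDotNet implementation: the function names, the `eps` default of 1e-5, and the sample values are assumptions, and the `[inputSize][outputSize]` weight layout follows the one named in the commit message.

```python
import math

def dense(x, weights, bias):
    """Linear op with identity activation; weights is [in][out]."""
    return [sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
            for j in range(len(bias))]

def batch_norm(z, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNorm using fixed running statistics."""
    return [gamma[j] * (z[j] - mean[j]) / math.sqrt(var[j] + eps) + beta[j]
            for j in range(len(z))]

def fold_dense_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Return (W', b') so that dense(x, W', b') equals BN(dense(x, W, b))."""
    n_out = len(bias)
    # s = gamma / sqrt(var + eps), one scale per output feature
    scale = [gamma[j] / math.sqrt(var[j] + eps) for j in range(n_out)]
    # W' = W * s, applied column-wise because columns are output features
    folded_w = [[row[j] * scale[j] for j in range(n_out)] for row in weights]
    # b' = (b - mean) * s + beta
    folded_b = [(bias[j] - mean[j]) * scale[j] + beta[j] for j in range(n_out)]
    return folded_w, folded_b

# Illustrative values only (3 inputs, 2 output features).
W = [[0.5, -1.0], [2.0, 0.25], [-0.75, 1.5]]
b = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.3, -0.4]
mean, var = [0.2, -0.1], [0.9, 1.7]
x = [1.0, -2.0, 0.5]

reference = batch_norm(dense(x, W, b), gamma, beta, mean, var)
Wf, bf = fold_dense_bn(W, b, gamma, beta, mean, var)
folded = dense(x, Wf, bf)
max_diff = max(abs(r - f) for r, f in zip(reference, folded))
print("max |reference - folded| =", max_diff)
```

The difference should be at the level of floating-point rounding, consistent with the parity tolerance the tests use.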

Verification

ConvBatchNormFoldTests (4 cases, all green):

Conv→BN and Dense→BN folds reproduce the pre-fold output within 1e-4 (non-trivial γ/β + running μ/σ² warmed by training-mode forwards), and the BN layer is removed.
Fusion-off retains BN; Conv(ReLU)→BN is correctly left unfolded.
All 30 existing InferenceOptimizer tests pass. net10.0 + net471 build clean.
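To see why the Conv(ReLU)→BN case must stay unfolded, a one-channel scalar sketch is enough. The values below are made up for illustration, with `eps = 0` and unit variance chosen so the scale is easy to read; this is not taken from the test suite.

```python
import math

def relu(v):
    return max(0.0, v)

def batch_norm(v, gamma, beta, mean, var, eps):
    return gamma * (v - mean) / math.sqrt(var + eps) + beta

# Hypothetical one-channel parameters.
w, b = 2.0, -1.0
gamma, beta, mean, var, eps = 0.5, 0.25, 0.4, 1.0, 0.0

# Folded parameters: s = gamma / sqrt(var + eps), W' = W*s, b' = (b - mean)*s + beta
s = gamma / math.sqrt(var + eps)
w_f, b_f = w * s, (b - mean) * s + beta

x = -1.0
z = w * x + b  # pre-activation, negative for this input

# Identity activation: BN(z) matches the folded linear op.
bn_of_linear = batch_norm(z, gamma, beta, mean, var, eps)
folded_linear = w_f * x + b_f

# ReLU between the ops: BN(relu(z)) differs from relu(W'x + b').
bn_of_relu = batch_norm(relu(z), gamma, beta, mean, var, eps)
relu_of_folded = relu(w_f * x + b_f)

print("identity:", bn_of_linear, "vs", folded_linear)
print("relu:    ", bn_of_relu, "vs", relu_of_folded)
```

With ReLU in between, BatchNorm sees the clipped value rather than the linear output, so scaling the weights beforehand no longer reproduces it. This matches the guard that only folds across identity or no activation.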

Note: triggers only on models built for it (linear op with no activation immediately followed by BatchNorm) — targets BN-heavy backbones (ResNet/EfficientNet/CSPDarknet); the parity-benchmark MLP/CNN have no BN.

🤖 Generated with Claude Code

…nferenceOptimizer (Phase 6)

At inference a BatchNorm layer is a fixed per-channel affine y = γ·(x − μ)/√(σ² + ε) + β. When it directly follows a linear op with identity activation (the canonical Conv→BN block in ResNet/VGG/EfficientNet, and Dense→BN in BN-MLPs), it folds into that op's weights and bias with no change in output: s = γ/√(σ² + ε), W' = W·s, b' = (b − μ)·s + β. The BatchNorm layer is then removed, eliminating a full per-element pass and an intermediate tensor per block.

New: NeuralNetworkBase.FoldBatchNormForInference() walks the layer list and folds every BatchNorm whose predecessor is an identity-activation ConvolutionalLayer ([outC,inC,kH,kW], per-output-channel) or DenseLayer ([inputSize,outputSize], per-output-feature), mutating the live kernel/weight and bias tensors in place via GetFilters/GetWeights/GetBiases, then removing the BN via RemoveLayerFromCollection. It refuses to fold across a real nonlinearity (only IdentityActivation / no activation qualifies) and on any channel-count mismatch.

Wired into InferenceOptimizer.OptimizeForInference as a new ApplyLayerFusion step (runs before attention rewrites / quantization so folded weights flow into them), gated by InferenceOptimizationConfig.EnableLayerFusion (default true, since folding is lossless). The clone-before-mutate guard now also triggers when foldable Conv/Dense→BN is present.

Verified (ConvBatchNormFoldTests): Conv→BN and Dense→BN folds reproduce the pre-fold inference output within 1e-4 (non-trivial γ/β plus running μ/σ² warmed by training-mode forwards), the BN layer is removed, fusion-off retains BN, and a Conv(ReLU)→BN block is correctly left unfolded. All 30 existing InferenceOptimizer tests still pass; net10.0 + net471 build clean.

Note: only triggers on models built for it (a Conv/Dense with no activation immediately followed by BatchNorm); the parity-benchmark MLP/CNN have no BN, so this targets BN-heavy backbones (ResNet/EfficientNet/CSPDarknet).
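The layer walk and its guards can be sketched in a few lines. This is a hedged, language-neutral illustration in Python, not the C# code in the PR: the `Linear` and `BatchNorm` classes, the `fold_batch_norm` function, and the string-valued `activation` field are hypothetical stand-ins. Only the tensor layouts ([outC,inC,kH,kW] for conv, [inputSize,outputSize] for dense) and the two guards come from the commit message.

```python
import math
from dataclasses import dataclass

@dataclass
class Linear:                 # hypothetical stand-in for a Conv or Dense layer
    kind: str                 # "conv" or "dense"
    weights: list             # conv: [outC][inC][kH][kW]; dense: [in][out]
    bias: list
    activation: str = "identity"

@dataclass
class BatchNorm:              # hypothetical stand-in holding running statistics
    gamma: list
    beta: list
    mean: list
    var: list
    eps: float = 1e-5         # assumed default

def fold_pair(lin, bn):
    """Fold bn into lin in place: W' = W*s, b' = (b - mean)*s + beta."""
    scale = [g / math.sqrt(v + bn.eps) for g, v in zip(bn.gamma, bn.var)]
    if lin.kind == "conv":
        # The output channel is the leading axis, so scale each filter as a whole.
        lin.weights = [[[[w * s for w in row] for row in plane] for plane in filt]
                       for filt, s in zip(lin.weights, scale)]
    else:
        # The output feature is the trailing axis, so scale column-wise.
        lin.weights = [[w * s for w, s in zip(row, scale)] for row in lin.weights]
    lin.bias = [(b - m) * s + be
                for b, m, s, be in zip(lin.bias, bn.mean, scale, bn.beta)]

def fold_batch_norm(layers):
    """Return (new layer list, number of BN layers folded away)."""
    kept, folded = [], 0
    for layer in layers:
        prev = kept[-1] if kept else None
        if (isinstance(layer, BatchNorm) and isinstance(prev, Linear)
                and prev.activation == "identity"          # guard: no real nonlinearity
                and len(prev.bias) == len(layer.gamma)):   # guard: channel counts match
            fold_pair(prev, layer)
            folded += 1
            continue                                       # drop the BN layer
        kept.append(layer)
    return kept, folded

# Conv with outC=2, inC=1, kH=1, kW=2, followed by a foldable BN.
conv = Linear("conv", [[[[1.0, 2.0]]], [[[3.0, 4.0]]]], [0.5, -0.5])
bn = BatchNorm([2.0, 0.5], [0.0, 1.0], [0.5, 0.5], [1.0, 4.0], eps=0.0)
# Conv(ReLU) followed by BN: must be left alone.
relu_conv = Linear("conv", [[[[1.0]]]], [0.0], activation="relu")
relu_bn = BatchNorm([1.0], [0.0], [0.0], [1.0])

layers, n_folded = fold_batch_norm([conv, bn, relu_conv, relu_bn])
print(n_folded, len(layers), conv.weights, conv.bias)
```

In this sketch the first BatchNorm is absorbed into `conv` (scales 2.0 and 0.25 per output channel), while the BatchNorm after the ReLU convolution stays in the list.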
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vercel · 2026-05-30T21:58:58Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
aidotnet_website	Ready	Preview, Comment	May 30, 2026 9:59pm
aidotnet-playground-api	Ready	Preview, Comment	May 30, 2026 9:59pm

coderabbitai · 2026-05-30T21:59:03Z

Warning

Review limit reached

@ooples, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 3 minutes and 47 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 20120ee4-7142-4d2f-9453-1440529f4b21

📥 Commits

Reviewing files that changed from the base of the PR and between 0db695d and ac4c56f.

📒 Files selected for processing (4)

src/Configuration/InferenceOptimizationConfig.cs
src/Inference/InferenceOptimizer.cs
src/NeuralNetworks/NeuralNetworkBase.cs
tests/AiDotNet.Tests/UnitTests/Inference/ConvBatchNormFoldTests.cs


vercel Bot deployed to Preview – aidotnet_website May 30, 2026 21:58 View deployment

vercel Bot deployed to Preview – aidotnet-playground-api May 30, 2026 21:59 View deployment

ooples wants to merge 1 commit into master from perf/1462-conv-bn-fold
