[release/8.0] Port dump collection perf improvements #128022
Merged
hoyosjs merged 3 commits on May 13, 2026
Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag
Pull request overview
This PR backports three CoreCLR DAC-focused changes intended to reduce minidump/heap-dump collection time for large workloads (many threads, deep stacks) by improving DAC caching behavior and optionally enabling a faster heap enumeration mode.
Changes:
- Add an opt-in knob (`DOTNET_EnableFastHeapDumps`) that promotes heap dump enumeration from `CLRDATA_ENUM_MEM_HEAP` to `CLRDATA_ENUM_MEM_HEAP2`.
- Introduce a DAC-side cache of debugger breakpoint patches to avoid re-scanning the patch table during x64 unwinding.
- Replace the non-scaling DAC instance hash implementation with an `SHash`-based implementation (intended).
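To make the "non-scaling" point in the last change concrete, here is a minimal sketch of why a chained table with a fixed bucket count slows down as entries accumulate, compared with a table that grows. Both classes (`FixedBucketTable`, `GrowingTable`) are hypothetical illustrations written for this note; they are not the `DacInstanceManager` or `SHash` code, and the assumption that the benefit comes from resizing is inferred from the PR text rather than taken from the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustration only: neither class is the DacInstanceManager or SHash code.

// Chained table whose bucket count is fixed at construction. Once entries far
// outnumber buckets, every lookup has to walk a long chain.
class FixedBucketTable {
public:
    explicit FixedBucketTable(std::size_t bucketCount) : buckets_(bucketCount) {}

    void Insert(std::uint64_t addr) {
        buckets_[addr % buckets_.size()].push_back(addr);
    }

    // Number of entries examined before the lookup resolves.
    std::size_t ProbesToFind(std::uint64_t addr) const {
        std::size_t probes = 0;
        for (std::uint64_t candidate : buckets_[addr % buckets_.size()]) {
            ++probes;
            if (candidate == addr) break;
        }
        return probes;
    }

private:
    std::vector<std::vector<std::uint64_t>> buckets_;
};

// Same chaining, but the bucket array doubles and rehashes whenever the entry
// count would exceed the bucket count, which keeps chains short.
class GrowingTable {
public:
    GrowingTable() : buckets_(8) {}

    void Insert(std::uint64_t addr) {
        if (count_ + 1 > buckets_.size()) Grow();
        buckets_[addr % buckets_.size()].push_back(addr);
        ++count_;
    }

    std::size_t ProbesToFind(std::uint64_t addr) const {
        std::size_t probes = 0;
        for (std::uint64_t candidate : buckets_[addr % buckets_.size()]) {
            ++probes;
            if (candidate == addr) break;
        }
        return probes;
    }

private:
    void Grow() {
        std::vector<std::vector<std::uint64_t>> bigger(buckets_.size() * 2);
        for (const auto& chain : buckets_)
            for (std::uint64_t addr : chain)
                bigger[addr % bigger.size()].push_back(addr);
        buckets_.swap(bigger);
    }

    std::vector<std::vector<std::uint64_t>> buckets_;
    std::size_t count_ = 0;
};

// Usage: load both tables with the same 10,000 keys and compare lookup cost.
inline void CompareTables() {
    FixedBucketTable fixed(16);
    GrowingTable growing;
    for (std::uint64_t addr = 0; addr < 10000; ++addr) {
        fixed.Insert(addr);
        growing.Insert(addr);
    }
    assert(growing.ProbesToFind(9999) < fixed.ProbesToFind(9999));
}
```

In this toy setup the fixed table's chains grow linearly with the number of entries, so total work for N inserts plus N lookups trends toward quadratic, which is consistent with the degradation the PR describes for deep stacks across many threads.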
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/coreclr/vm/vars.hpp | Declares g_EnableFastHeapDumps global. |
| src/coreclr/vm/vars.cpp | Defines/initializes g_EnableFastHeapDumps. |
| src/coreclr/vm/ceemain.cpp | Reads EXTERNAL_EnableFastHeapDumps at startup into g_EnableFastHeapDumps. |
| src/coreclr/inc/dacvars.h | Exposes g_EnableFastHeapDumps to the DAC via DEFINE_DACVAR. |
| src/coreclr/inc/clrconfigvalues.h | Adds EXTERNAL_EnableFastHeapDumps config entry. |
| src/coreclr/debug/daccess/enummem.cpp | Promotes heap dump enumeration to HEAP2 when enabled. |
| src/coreclr/debug/daccess/dacimpl.h | Updates instance-hash implementation (intended SHash) and adds DacPatchCache to ClrDataAccess. |
| src/coreclr/debug/daccess/dacfn.cpp | Implements patch cache population and uses cached patches during host-memory patch replacement. |
| src/coreclr/debug/daccess/daccess.cpp | Updates DacInstanceManager map operations for the SHash-based implementation and flushes the patch cache on ClrDataAccess::Flush(). |
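The `dacfn.cpp` and `daccess.cpp` rows describe a populate-once cache that is dropped on `ClrDataAccess::Flush()`. The following is a minimal sketch of that pattern only. `Patch`, `PatchCache`, and the loader callback are hypothetical names invented for this example; the actual `DacPatchCache` layout and the way it reads the target's patch table are not shown in this page and are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one breakpoint patch: where it was written and
// the byte it replaced.
struct Patch {
    std::uint64_t address;
    std::uint8_t originalByte;
};

// Populate-on-first-use cache. The loader stands in for the expensive scan of
// the target's patch table; it runs only when the cache is empty.
class PatchCache {
public:
    using Loader = std::function<std::vector<Patch>()>;

    explicit PatchCache(Loader loader) : loader_(std::move(loader)) {}

    // Returns the cached patches, scanning the table only on first access
    // or on the first access after Flush().
    const std::vector<Patch>& Get() {
        if (!populated_) {
            patches_ = loader_();
            populated_ = true;
        }
        return patches_;
    }

    // Invalidate, so the next Get() re-scans. This models the cache being
    // cleared when the DAC's view of target memory is flushed.
    void Flush() {
        patches_.clear();
        populated_ = false;
    }

private:
    Loader loader_;
    std::vector<Patch> patches_;
    bool populated_ = false;
};

// Usage: many "frames" ask for the patch list, but the table is scanned once.
inline int ScansForFrames(int frameCount) {
    int scans = 0;
    PatchCache cache([&scans] {
        ++scans;
        return std::vector<Patch>{{0x1000, 0x55}};
    });
    for (int frame = 0; frame < frameCount; ++frame) cache.Get();
    return scans;
}
```

The design choice worth noting is that invalidation is tied to the flush rather than to a timer or a per-frame check, on the assumption that target state cannot change between flushes while the DAC is walking stacks.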
/ba-g the failures are known on a release branch or unrelated timeouts
noahfalk approved these changes on May 13, 2026
Fixes #122459
main PRs:
Description
Backports three DAC performance improvements for minidump collection:
- Use SHash as DAC instance hash (#125631): Replaces the hand-rolled hash table in `DacInstanceManager` with an `SHash`-based implementation. The previous fixed-bucket hash degraded quickly for Find and insertion operations under high load. Measured ~9.5x speedup for minidump collection against a repro app with 2.5k-frame deep stacks over 50 threads.
- Cache debugger patches (Cache debugger patches to speed up x64 stackwalk epilogue/prologue scanning, #125459): Caches the list of debugger breakpoint patches so that x64 stack unwinding doesn't re-scan the 1,000-bucket patch hash table on every frame. The cache is populated once on first access and invalidated on `Flush()`. Measured reduction from 55s to ~7s for minidump collection (10,000 iterations across 10 threads).
- Enable CLRDATA_ENUM_MEM_HEAP2 via environment variable: When the target process has `DOTNET_EnableFastHeapDumps` set, the DAC promotes `CLRDATA_ENUM_MEM_HEAP` to `CLRDATA_ENUM_MEM_HEAP2`, which dumps loader heap pages in bulk instead of walking individual runtime structures.
Customer Impact
Customers collecting minidumps of large .NET applications (many threads, deep stacks) experience extremely slow dump collection times - on the order of minutes for what should take seconds. This directly impacts incident response time in production environments. Without these fixes, dump collection through Watson/dotnet-dump/createdump remains unacceptably slow for large workloads.
Regression
Yes, relative to .NET Framework. Customers migrating from .NET Framework have noticed the regressions; .NET Framework used non-portable variants of the MSVC library.
Testing
`DOTNET_EnableFastHeapDumps`), so no change in default behavior. This is the riskier change since it makes heap dump match our expectations but might yield `unknown!unknown` if the modules aren't indexed properly.
Risk
Low.
`daccess.cpp`, `dacfn.cpp`, `dacimpl.h`) that execute only during diagnostic operations (dump collection, debugging). They do not affect runtime execution.
`DOTNET_EnableFastHeapDumps` env var is opt-in and does not change default behavior.
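The opt-in behavior can be pictured as a small pure function over the requested enumeration mode and the knob value read from the target. This is a sketch under stated assumptions: `EnumMemMode` and `EffectiveMode` are placeholder names standing in for the `CLRDATA_ENUM_MEM_*` flags and the check in `enummem.cpp`, and the rule that only a plain heap request is promoted is inferred from the PR description, not copied from the change.

```cpp
#include <cassert>

// Placeholder stand-ins for the CLRDATA_ENUM_MEM_* flags; these are not the
// real declarations and carry no meaningful numeric values.
enum class EnumMemMode { Mini, Heap, Triage, Heap2 };

// fastHeapDumpsEnabled models the value of g_EnableFastHeapDumps as read from
// the target process. Only a plain heap request is promoted; every other
// request, and every request with the knob off, passes through unchanged.
inline EnumMemMode EffectiveMode(EnumMemMode requested, bool fastHeapDumpsEnabled) {
    if (fastHeapDumpsEnabled && requested == EnumMemMode::Heap) {
        return EnumMemMode::Heap2;
    }
    return requested;
}

// Usage: with the knob off, behavior is identical to before the change.
inline void CheckDefaultBehaviorUnchanged() {
    assert(EffectiveMode(EnumMemMode::Mini, false) == EnumMemMode::Mini);
    assert(EffectiveMode(EnumMemMode::Heap, false) == EnumMemMode::Heap);
    assert(EffectiveMode(EnumMemMode::Triage, false) == EnumMemMode::Triage);
}
```

Keeping the promotion in one place makes the "no change in default behavior" claim easy to check: the only path that returns a different mode requires both the knob and a heap request.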