Fix table cache entry leak in VersionBuilder::UnrefFile on failed LogAndApply (#14720)

Open. mszeszko-meta wants to merge 1 commit into facebook:main from mszeszko-meta:export-D99759696

mszeszko-meta (Contributor) commented May 7, 2026

Summary:

Context

VersionSet::LogAndApply loads table handlers before the MANIFEST update is durable. If that MANIFEST update later fails, VersionBuilder can discard newly added files that were never installed in a Version. VersionBuilder::UnrefFile released the FileMetaData pinned_reader handle, but that only dropped the reference; it did not erase the cache key, so an orphaned table-cache entry could survive for a file that is neither live nor quarantined. When metadata read fault injection prevents the obsolete-file scan from cleaning up the orphan, DB close can trip TEST_VerifyNoObsoleteFilesCached with "File N is not live nor quarantined".

Fix

After releasing a pinned_reader for a FileMetaData whose refcount reaches zero, explicitly evict that file number from the table cache. The new VersionBuilderDBTest.FailedLogAndApplyEvictsTableCacheEntry regression test exercises the realistic production path: it creates a real SST, calls LogAndApply through the DB's own VersionSet, injects a MANIFEST sync failure via the AfterSyncManifest sync point, and asserts that the table cache no longer contains the file after ProcessManifestWrites destroys the VersionBuilder. With the eviction removed, the test fails deterministically with leaked_cache_entry=true.
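
The difference between releasing a handle and evicting a key can be sketched with a toy reference-counted cache. This is only an illustration under stated assumptions: `ToyTableCache`, `Pin`, and `LeaksAfterUnref` are hypothetical names and do not correspond to the RocksDB `Cache` or `TableCache` APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Toy stand-in for a ref-counted cache keyed by file number.
// All names are hypothetical; this is not the RocksDB Cache API.
class ToyTableCache {
 public:
  // Creates the entry if needed and adds one reference (a "pin").
  void Pin(uint64_t file_number) { ++refs_[file_number]; }

  // Drops one reference. The key stays in the table even at zero
  // references, mirroring "releasing only dropped the reference".
  void Release(uint64_t file_number) {
    auto it = refs_.find(file_number);
    assert(it != refs_.end() && it->second > 0);
    --it->second;
  }

  // Removes the key outright; a no-op if the key is absent.
  void Evict(uint64_t file_number) { refs_.erase(file_number); }

  bool Contains(uint64_t file_number) const {
    return refs_.count(file_number) != 0;
  }

 private:
  std::unordered_map<uint64_t, int> refs_;
};

// Models the cleanup of a file whose refcount reached zero.
// Returns true if the cache entry is still present (leaked) afterwards.
bool LeaksAfterUnref(bool evict_after_release) {
  ToyTableCache cache;
  const uint64_t kFileNumber = 42;  // arbitrary illustrative file number
  cache.Pin(kFileNumber);           // table handler loaded and pinned
  cache.Release(kFileNumber);       // pin released during cleanup
  if (evict_after_release) {
    cache.Evict(kFileNumber);       // the fix: also erase the key
  }
  return cache.Contains(kFileNumber);
}
```

In this model, `LeaksAfterUnref(false)` reports a leftover entry and `LeaksAfterUnref(true)` does not, which is the behavior the regression test is described as checking.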

Differential Revision: D99759696

@meta-cla meta-cla Bot added the CLA Signed label May 7, 2026
meta-codesync Bot commented May 7, 2026

@mszeszko-meta has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99759696.

github-actions Bot commented May 7, 2026

✅ clang-tidy: No findings on changed lines

Completed in 72.4s.

meta-codesync Bot changed the title from "fix table cache entry leak in VersionBuilder::UnrefFile on failed LogAndApply" to "Fix table cache entry leak in VersionBuilder::UnrefFile on failed LogAndApply" May 7, 2026
meta-codesync Bot changed the title from "Fix table cache entry leak in VersionBuilder::UnrefFile on failed LogAndApply" to "Fix table cache entry leak in VersionBuilder::UnrefFile on failed LogAndApply (#14720)" May 7, 2026
mszeszko-meta added a commit to mszeszko-meta/rocksdb that referenced this pull request May 7, 2026
…AndApply (facebook#14720)

Summary:

### Context

`VersionSet::LogAndApply` loads table handlers before the `MANIFEST` update is durable. If that `MANIFEST` update later fails, `VersionBuilder` can discard newly added files that were never installed in a `Version`. `VersionBuilder::UnrefFile` released the `FileMetaData` `pinned_reader` handle, but that only dropped the reference; it did not erase the cache key, so an orphaned table-cache entry could survive for a file that is neither live nor quarantined. When metadata read fault injection prevents the obsolete-file scan from cleaning up the orphan, DB close can trip `TEST_VerifyNoObsoleteFilesCached` with "File N is not live nor quarantined".

### Fix
After releasing a `pinned_reader` for a `FileMetaData` whose refcount reaches zero, explicitly evict that file number from the table cache. The new `VersionBuilderDBTest.UnrefFileEvictsPinnedTableReader` regression test exercises this directly by loading a real SST through `VersionBuilder::LoadTableHandlers`, destroying the builder, and asserting that the table cache no longer contains the file. With the eviction removed, the test fails deterministically with `leaked_cache_entry=true`.

Differential Revision: D99759696
github-actions Bot commented May 7, 2026

Claude Code Review - OBSOLETE

Superseded by a newer AI review. The original review follows.

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit d089ff2


Summary

Clean, well-targeted bug fix that prevents table cache entry leaks when a VersionBuilder is destroyed without installing files into a Version (e.g., failed LogAndApply). The fix is safe because refs=0 guarantees the file was never installed, making cache eviction always correct. The regression test is thorough and deterministic.

High-severity findings (0):

No high-severity findings.

Full review

Findings

🔴 HIGH

None.

🟡 MEDIUM

None.

🟢 LOW / NIT

L1. Narrow race window between Release() and Evict() — version_builder.cc:439-447
  • Issue: Between pinned_reader.Release() (line 439) and TableCache::Evict() (line 447), another thread could theoretically acquire a handle to the same cache entry via FindTable().
  • Root cause: The two operations are not atomic.
  • Impact: Extremely low. The race window is narrow, the code is on a cold path (VersionBuilder destruction), and Cache::Erase() safely handles entries with active references — it removes the entry from the hash table but preserves it for existing handle holders until they release.
  • Suggested fix: No action needed. The existing cache implementation handles this safely.
L2. Unconditional eviction even without pinned_reader — version_builder.cc:444-447
  • Issue: TableCache::Evict() is called even when pinned_reader is null, meaning the file may not have a cache entry at all. This results in a no-op cache lookup.
  • Root cause: Simplicity-driven design choice — checking for pinned_reader before evicting would miss entries loaded by other paths (e.g., FindTable() without pinning).
  • Impact: Negligible. Evict() is an O(1) hash lookup that's a no-op when the key doesn't exist. Only occurs on cold path.
  • Suggested fix: No action needed. The unconditional approach is actually more robust.
L3. Test uses file number offset heuristic — version_builder_test.cc:241
  • Issue: file_number = dbfull()->TEST_Current_Next_FileNo() + 1000 uses a hardcoded offset to avoid collisions with real file numbers. While practically safe, it's a brittle pattern.
  • Root cause: Need to create an SST file with a known number outside the DB's allocation range.
  • Impact: Very low. The +1000 offset provides ample margin for any reasonable test scenario.
  • Suggested fix: Consider using TEST_Current_Next_FileNo() directly and then advancing the counter, though this is minor.
L4. Missing test coverage for secondary eviction paths — version_builder_test.cc
  • Issue: The test covers the primary scenario (VersionBuilder destroyed after LoadTableHandlers) but does not test:
    • The ApplyFileDeletion path (file added then deleted in same builder)
    • The CreateOrReplaceSavePoint path (copy constructor + destruction)
    • The case where table_cache_ is null
  • Root cause: Test focuses on the specific bug scenario from the PR description.
  • Impact: Low. The code paths share the same UnrefFile() cleanup, so the primary test provides good coverage of the fix itself.
  • Suggested fix: Consider adding tests for the deletion and savepoint paths in a follow-up.
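
The claim in finding L1, that erasing a key is safe while another holder still has a handle, can be illustrated with a toy cache whose handles are `std::shared_ptr`s. This is a sketch under assumptions: `ToyCache` and `SimulateReleaseThenLookupThenEvict` are hypothetical, and the real cache implementation uses its own handle and reference-count machinery rather than `shared_ptr`.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical toy cache: an erased entry lives on until the last
// outstanding handle is dropped. Not the RocksDB Cache API.
class ToyCache {
 public:
  using Handle = std::shared_ptr<const std::string>;

  void Insert(uint64_t key, std::string value) {
    table_[key] = std::make_shared<const std::string>(std::move(value));
  }

  // Returns nullptr when the key is not in the table.
  Handle Lookup(uint64_t key) const {
    auto it = table_.find(key);
    if (it == table_.end()) return nullptr;
    return it->second;
  }

  // Removes the key from the table; existing handles stay valid.
  void Erase(uint64_t key) { table_.erase(key); }

 private:
  std::unordered_map<uint64_t, Handle> table_;
};

struct RaceOutcome {
  bool late_handle_still_readable;
  bool key_visible_after_erase;
};

// Replays the interleaving from finding L1 on a single thread:
// a second reader looks the entry up after the release but before the erase.
RaceOutcome SimulateReleaseThenLookupThenEvict() {
  ToyCache cache;
  cache.Insert(7, "table reader for file 7");
  ToyCache::Handle pinned = cache.Lookup(7);
  pinned.reset();                           // stands in for Release()
  ToyCache::Handle late = cache.Lookup(7);  // the racing lookup
  cache.Erase(7);                           // stands in for Evict()
  RaceOutcome out;
  out.late_handle_still_readable =
      late != nullptr && *late == "table reader for file 7";
  out.key_visible_after_erase = cache.Lookup(7) != nullptr;
  return out;
}
```

In this model the late reader keeps a usable value while new lookups miss, which matches the behavior the review attributes to `Cache::Erase()`.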

Cross-Component Analysis

| Context | Executes? | Safe? | Reasoning |
| --- | --- | --- | --- |
| Normal SaveTo() path | No (refs > 0) | N/A | Eviction not reached |
| Failed LogAndApply | Yes (refs = 0) | Yes | File never installed in Version |
| CreateOrReplaceSavePoint | Eventual (last copy) | Yes | Correct ref counting prevents premature eviction |
| table_cache_ = nullptr | Guarded | Yes | Null check prevents crash |
| Concurrent LogAndApply | No collision | Yes | File numbers are unique (atomic counter) |
| WritePreparedTxnDB | Standard path | Yes | No special VersionBuilder usage |
| Secondary/ReadOnly DB | Standard path | Yes | Same VersionBuilder patterns |

Assumption stress-test results:

  • "refs=0 means never installed" — HOLDS: SaveTo() → AddFile() increments refs; if SaveTo fails, files were never installed. Version destruction in LogAndApply failure happens while VersionBuilder still holds refs.
  • "Eviction won't remove needed entries" — HOLDS: File numbers are unique (atomic allocation), and refs=0 files are by definition not referenced by any live Version.
  • "Cache::Erase is safe with active handles" — HOLDS: LRU cache Erase removes from hash table but preserves entries for existing handle holders.

Positive Observations

  • Minimal, surgical fix: Only 9 lines of production code, directly addressing the root cause.
  • Correct operation ordering: Release pinned handle before evicting from cache prevents use-after-free.
  • Follows established patterns: TableCache::Evict() is used in similar cleanup scenarios elsewhere (db_impl_files.cc).
  • Thorough regression test: Creates real SST, exercises actual cache pinning, and verifies eviction deterministically. Proper cleanup on failure path (manual eviction + file deletion before assertion).
  • Good defensive null check: if (table_cache_ != nullptr) is more defensive than the existing assert-only pattern.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

@github-actions

🟡 Codex Code Review

Auto-triggered after CI passed — reviewing commit 3314ad1


Codex review failed before producing findings.

WARNING: proceeding, even though we could not update PATH: Refusing to create helper binaries under temporary dir "/tmp" (codex_home: AbsolutePathBuf("/tmp/codex-home"))
error: the argument '--base <BRANCH>' cannot be used with '[PROMPT]'

Usage: codex exec review --commit <SHA> --base <BRANCH> --title <TITLE> --model <MODEL> --config <key=value> --dangerously-bypass-approvals-and-sandbox --output-last-message <FILE> [PROMPT]

For more information, try '--help'.

ℹ️ About this response

Generated by Codex CLI.
Review methodology: claude_md/code_review.md

Limitations:

  • Codex may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /codex-review [context] — Request a code review
  • /codex-query <question> — Ask about the PR or codebase

@github-actions

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 3314ad1


Summary

Solid, low-risk fix for a real table cache entry leak on failed LogAndApply. The unconditional eviction is safe and correct. One medium-severity suggestion to consider scoping the eviction more narrowly.

High-severity findings (0):
No high-severity findings.

Full review

Findings

🔴 HIGH

None.

🟡 MEDIUM

M1. Unconditional eviction is broader than necessary — version_builder.cc:442
  • Issue: The TableCache::Evict() call runs for every file whose refcount drops to zero, regardless of whether it had a pinned reader (i.e., was actually loaded into the table cache). This means it also fires in the ApplyFileDeletion path when a file is added then deleted in the same VersionBuilder session, even if LoadTableHandlers was never called.
  • Root cause: The eviction is placed outside the if (f->fd.pinned_reader.Get() != nullptr) block.
  • Suggested fix: Consider moving the eviction inside the pinned_reader guard:
    if (f->fd.pinned_reader.Get() != nullptr) {
      assert(table_cache_ != nullptr);
      f->fd.pinned_reader.Release(table_cache_->get_cache().get());
      TableCache::Evict(table_cache_->get_cache().get(), f->fd.GetNumber());
    }
    This is sufficient because LoadTableHandlersHelper always pins the reader when FindTable succeeds (version_util.cc:71), so a file in the cache from this path will always have a pinned reader. If FindTable fails, no cache entry is created. The unconditional approach works correctly too (Cache::Erase is a no-op for missing keys), but the scoped version is more precise and avoids unnecessary cache hash-table lookups.
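
The mechanical difference between the unconditional eviction in the PR and the scoped variant suggested in M1 can be sketched with a toy model. Everything here is hypothetical (`ToyCache`, `ToyFile`, `UnrefUnconditional`, `UnrefScoped`, `Run`). In particular, the case of a cached but unpinned entry is included only to show where the two variants diverge; whether that case can arise in practice is the point the reviews discuss, and this sketch does not settle it.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical toy model; names do not correspond to RocksDB internals.
struct ToyCache {
  std::unordered_set<uint64_t> keys;
  int erase_calls = 0;
  void Erase(uint64_t k) {
    ++erase_calls;  // count every call, including no-ops on missing keys
    keys.erase(k);
  }
};

struct ToyFile {
  uint64_t number = 0;
  bool pinned = false;  // stands in for pinned_reader.Get() != nullptr
};

// Variant in the PR: evict whenever the refcount reaches zero.
void UnrefUnconditional(ToyCache& cache, ToyFile& f) {
  if (f.pinned) f.pinned = false;  // release the pin
  cache.Erase(f.number);
}

// Variant suggested in M1: evict only if a reader was pinned.
void UnrefScoped(ToyCache& cache, ToyFile& f) {
  if (f.pinned) {
    f.pinned = false;
    cache.Erase(f.number);
  }
}

struct Outcome {
  bool entry_left;
  int erase_calls;
};

// Runs one cleanup for a file with the given starting state.
Outcome Run(bool scoped, bool pinned, bool cached) {
  ToyCache cache;
  ToyFile f;
  f.number = 5;  // arbitrary illustrative file number
  f.pinned = pinned;
  if (cached) cache.keys.insert(f.number);
  if (scoped) {
    UnrefScoped(cache, f);
  } else {
    UnrefUnconditional(cache, f);
  }
  Outcome out;
  out.entry_left = cache.keys.count(f.number) != 0;
  out.erase_calls = cache.erase_calls;
  return out;
}
```

For a pinned, cached file both variants erase the entry with one call. For a file that was never loaded, the scoped variant skips the no-op erase. For a cached but unpinned entry, only the unconditional variant removes it.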

🟢 LOW / NIT

L1. Test file number selection could be more robust — version_builder_test.cc:241
  • Issue: dbfull()->TEST_Current_Next_FileNo() + 1000 could theoretically overlap with auto-allocated file numbers if the DB performs internal operations (though unlikely with disable_auto_compactions = true).
  • Suggested fix: Use a very large offset or allocate via NewFileNumber() under the mutex if possible.
L2. Test only covers the destructor path — version_builder_test.cc:229
  • Issue: The test exercises the primary scenario (failed LogAndApply → VersionBuilder destructor evicts), but does not test the ApplyFileDeletion path where a file is added then deleted in the same builder.
  • Suggested fix: Consider adding a test for the add-then-delete-in-same-builder scenario if the unconditional eviction is retained.

Cross-Component Analysis

| Concern | Verdict | Reasoning |
| --- | --- | --- |
| Copy constructor double-evict | Safe | Rep(const Rep&) calls RefFile (bumping refs to 2). When the first copy is destroyed, refs goes 2→1 (no eviction). When the second copy is destroyed, refs goes 1→0 (eviction fires correctly; no Rep holds the file). |
| Base version cache entry conflict | Safe | File numbers are strictly monotonic (NewFileNumber uses fetch_add(1) in version_set.h). A newly added file will always have a unique number. ApplyFileAddition rejects duplicate file numbers. The VersionBuilder's FileMetaData is separate from the base version's. |
| ApplyFileDeletion premature eviction | Safe (no-op) | For files added then deleted in the same builder, if LoadTableHandlers was never called for them (no pinned reader), Cache::Erase is a no-op. If it was called, the file was only referenced by this builder, so eviction is correct. |
| Recovery path (VersionEditHandler) | Safe | During recovery, UnrefFile in the destructor won't drop refs to 0 for files installed in the Version (refs > 1). For files that are added then deleted during replay, eviction of orphaned cache entries is correct. |
| ReadOnly / SecondaryInstance | N/A | These contexts don't trigger the write path that creates VersionBuilders for LogAndApply. |
| WritePreparedTxn / WriteUnpreparedTxn | Safe | Same VersionBuilder lifecycle; no additional interaction. |
| Performance: unnecessary Erase calls | Negligible | Cache::Erase for a non-existent key is O(1) hash lookup + shard lock. This runs only when refs reaches 0, which is rare in normal operation (files are installed in Versions). |
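
The "copy constructor double-evict" reasoning can be traced with a toy reference-counted holder. This is a sketch under assumptions: `ToyRep`, `ToyFileMeta`, and `EvictionCountsAfterEachDestruction` are hypothetical stand-ins, and the real `Rep` tracks many files and more state than shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the real Rep and FileMetaData are more involved.
struct ToyFileMeta {
  uint64_t number = 0;
  int refs = 0;
};

class ToyRep {
 public:
  ToyRep(ToyFileMeta* f, std::vector<uint64_t>* evicted)
      : f_(f), evicted_(evicted) {
    ++f_->refs;
  }
  // Copying takes another reference, as the review describes for Rep(const Rep&).
  ToyRep(const ToyRep& other) : f_(other.f_), evicted_(other.evicted_) {
    ++f_->refs;
  }
  ToyRep& operator=(const ToyRep&) = delete;
  // Destruction drops a reference and records an eviction only for the last one.
  ~ToyRep() {
    if (--f_->refs == 0) evicted_->push_back(f_->number);
  }

 private:
  ToyFileMeta* f_;
  std::vector<uint64_t>* evicted_;
};

// Returns how many evictions had been recorded after each destruction.
std::vector<std::size_t> EvictionCountsAfterEachDestruction() {
  ToyFileMeta file;
  file.number = 11;  // arbitrary illustrative file number
  std::vector<uint64_t> evicted;
  std::vector<std::size_t> counts;
  {
    ToyRep first(&file, &evicted);  // refs: 1
    {
      ToyRep second(first);         // refs: 2
    }                               // refs: 2 -> 1, no eviction
    counts.push_back(evicted.size());
  }                                 // refs: 1 -> 0, eviction recorded
  counts.push_back(evicted.size());
  return counts;
}
```

In this model the first destruction records no eviction and the second records exactly one, so the file is evicted once and only after no holder remains.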

Positive Observations

  • The fix correctly addresses a real resource leak that could cause TEST_VerifyNoObsoleteFilesCached assertions during DB close under fault injection.
  • The table_cache_ != nullptr guard correctly handles the many existing unit tests that pass nullptr for the table cache.
  • The test uses sync points rather than sleep, avoiding flakiness.
  • The test follows established patterns from similar tests (e.g., LeakedTableCacheEntryOnFlushInstallFailure in db_flush_test.cc).
  • Cache::Erase is idempotent and safe for concurrent access, so even if PurgeObsoleteFiles later tries to erase the same entry, there's no issue.

hx235 (Contributor) left a comment


left comment internally
