Skip to content

Fix dead lock when purging transaction logs#512

Merged
cb1kenobi merged 4 commits intomainfrom
deadlock-fix
Apr 16, 2026
Merged

Fix dead lock when purging transaction logs#512
cb1kenobi merged 4 commits intomainfrom
deadlock-fix

Conversation

@cb1kenobi
Copy link
Copy Markdown
Contributor

@cb1kenobi cb1kenobi commented Apr 16, 2026

Fixes two related issues. One of them was a deadlock that caused the transaction log worker test to sometimes timeout.

Issue 1: DBDescriptor::close() deadlock

When closing a descriptor, it would hold transactionLogMutex while calling store->close() which acquires writeMutex. This deadlock occurred when the main thread was closing the descriptor while another thread holds writeMutex and RocksDB flush callback is blocked waiting for transactionLogMutex to free.

To fix it, update RocksDB flush listeners to hold the transactionLogMutex only while we collect the stores to close, release the lock, then signal the database is flushed.

Also, when closing the DB descriptor, hold the transactionLogMutex only while we collect the stores to close, then release the lock, then close the stores.

Issue 2: resolveTransactionLogStore() returns stores that are closing/closed

purgeTransactionLogs() calls tryClose(), isClosing is set to true, but the store is still in the stores map. resolveTransactionLogStore() sees the store still in the map and doesn't care if it's closing/closed, so it happily returns it.

The fix is to simply check if the store is closing, and if so, return a new store.

Disclaimer, Claude burned a lot of tokens to find and fix the deadlock.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 16, 2026

📊 Benchmark Results

get-sync.bench.ts

getSync() > random keys - small key size (100 records)

Implementation Rank Operations/sec Mean (ms) Min (ms) Max (ms) RME (%) Samples
🥇 lmdb 1 23.64K ops/sec 42.29 41.00 652.925 0.117 118,221
🥈 rocksdb 2 11.78K ops/sec 84.89 80.01 22,296.675 0.881 58,899

getSync() > sequential keys - small key size (100 records)

Implementation Rank Operations/sec Mean (ms) Min (ms) Max (ms) RME (%) Samples
🥇 lmdb 1 27.50K ops/sec 36.37 35.14 2,487.645 0.143 137,491
🥈 rocksdb 2 12.94K ops/sec 77.30 76.10 503.364 0.053 64,682

ranges.bench.ts

getRange() > small range (100 records, 50 range)

Implementation Rank Operations/sec Mean (ms) Min (ms) Max (ms) RME (%) Samples
🥇 lmdb 1 25.79K ops/sec 38.78 37.23 491.772 0.137 128,942
🥈 rocksdb 2 3.39K ops/sec 295.096 261.208 2,503.718 0.510 16,944

realistic-load.bench.ts

Realistic write load with workers > write variable records with transaction log

Implementation Rank Operations/sec Mean (ms) Min (ms) Max (ms) RME (%) Samples
🥇 rocksdb 1 188.61 ops/sec 5,301.928 57.51 160,911.571 33.72 384
🥈 lmdb 2 27.62 ops/sec 36,199.162 42.42 1,180,584.03 137.007 64.00

transaction-log.bench.ts

Transaction log > read 100 iterators while write log with 100 byte records

Implementation Rank Operations/sec Mean (ms) Min (ms) Max (ms) RME (%) Samples
🥇 rocksdb 1 34.52K ops/sec 28.97 13.20 14,283.744 0.603 172,605
🥈 lmdb 2 438.47 ops/sec 2,280.655 166.636 12,318.405 1.27 2,194

Transaction log > read one entry from random position from log with 1000 100 byte records

Implementation Rank Operations/sec Mean (ms) Min (ms) Max (ms) RME (%) Samples
🥇 rocksdb 1 754.98K ops/sec 1.32 1.15 4,365.038 0.182 3,774,916
🥈 lmdb 2 388.90K ops/sec 2.57 1.19 8,684.723 1.87 1,944,487

worker-put-sync.bench.ts

putSync() > random keys - small key size (100 records, 10 workers)

Implementation Rank Operations/sec Mean (ms) Min (ms) Max (ms) RME (%) Samples
🥇 rocksdb 1 845.18 ops/sec 1,183.181 993.248 3,000.43 0.406 1,691
🥈 lmdb 2 1.17 ops/sec 857,585.693 788,617.894 903,373.116 2.58 10.00

worker-transaction-log.bench.ts

Transaction log with workers > write log with 100 byte records

Implementation Rank Operations/sec Mean (ms) Min (ms) Max (ms) RME (%) Samples
🥇 rocksdb 1 18.71K ops/sec 53.46 30.00 569.958 0.531 37,412
🥈 lmdb 2 809.10 ops/sec 1,235.943 278.344 14,756.811 5.32 1,620

Results from commit b5bc06d

@cb1kenobi cb1kenobi changed the title Fix dead lock when purging transaction logs WIP: Fix dead lock when purging transaction logs Apr 16, 2026
@cb1kenobi cb1kenobi changed the title WIP: Fix dead lock when purging transaction logs Fix dead lock when purging transaction logs Apr 16, 2026
@cb1kenobi cb1kenobi marked this pull request as ready for review April 16, 2026 22:04
@cb1kenobi cb1kenobi requested a review from kriszyp April 16, 2026 22:04
Copy link
Copy Markdown
Member

@kriszyp kriszyp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@cb1kenobi cb1kenobi merged commit ef15c3f into main Apr 16, 2026
20 checks passed
@cb1kenobi cb1kenobi deleted the deadlock-fix branch April 16, 2026 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants