Python client 19.2.1: implicit shm_key makes Client.close() fail to clean up SHM/shared-connection state
Summary
When SHM is enabled and use_shared_connection=True, omitting shm_key causes the Python wrapper to register shared state under a key built before the implicit shm_key is assigned. Client.close() later looks up that state using a key that includes the actual shm_key, so the lookup misses and the SHM/shared entry is left behind.
This leaves stale SHM behind even when the application calls close(), and later processes can unexpectedly attach to that stale segment.
Environment
- Aerospike Python client 19.2.1
- Reproduced on Debian Bookworm with Python 3.11.2
- Reproduced on Debian Trixie with Python 3.13.5
- SHM enabled with shm={}
- use_shared_connection=True
- No explicit shm_key
Minimal Reproduction
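from aerospike import Client

# SHM enabled with an empty dict, so no shm_key is supplied by the caller.
conf = {
    "hosts": [("seed1.example", 3000), ("seed2.example", 3000)],
    "shm": {},
    "use_shared_connection": True,
}
client = Client(conf)
# Expected to release the shared entry; with an implicit shm_key it does not.
client.close()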
Expected Behavior
client.close() should detach and clean up the SHM/shared-connection state created for that client, or at least use the same lookup key at connect time and close time.
Actual Behavior
With implicit shm_key, the close-time lookup key does not match the connect-time registration key, so the shared entry is not found and cleanup does not happen.
In tests, a fresh process reused the same implicit key and the stale SHM segment remained visible and reusable after close().
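A leftover segment can also be checked from outside the client. The sketch below is only an illustration under stated assumptions: a Linux host with the util-linux ipcs tool on PATH, SHM backed by System V segments, and keys printed in hex in the first column. The helper names parse_shm_keys and segment_exists are hypothetical, and the key to look for has to be supplied by the reader.

```python
import subprocess


def parse_shm_keys(ipcs_output: str) -> set[int]:
    """Collect System V SHM keys from `ipcs -m` text (first column, hex)."""
    keys = set()
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows start with a hex key such as 0xa8000000; headers do not.
        if fields and fields[0].lower().startswith("0x"):
            try:
                keys.add(int(fields[0], 16))
            except ValueError:
                continue
    return keys


def segment_exists(shm_key: int) -> bool:
    """Return True if a segment with this key is currently listed by ipcs."""
    result = subprocess.run(
        ["ipcs", "-m"], capture_output=True, text=True, check=True
    )
    return shm_key in parse_shm_keys(result.stdout)
```

Calling segment_exists(key) in a fresh process after the reproduction has exited would show whether the segment survived close().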
Impact
- SHM state persists across processes even when the application calls close().
- Later processes can attach to obsolete cluster state.
- This creates the precondition for follow-on bugs in the SHM follower path.
Technical Analysis
The issue appears to be in the Python wrapper's shared-connection alias handling:
- The shared-connection alias or search string is created before the implicit shm_key is assigned.
- close() later rebuilds the lookup string including the real shm_key.
- When shm_key was implicit, those two strings differ.
- The lookup misses, so the entry is not cleaned up.
From source inspection in 19.2.1:
- The implicit SHM key is assigned during connect.
- The shared alias is created too early.
- Close-time lookup uses the actual SHM key, so it does not match the original registration.
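The mismatch can be modeled in a few lines of pure Python. This is a simplified sketch, not the wrapper's code: search_string, connect, close, and the alias format are hypothetical stand-ins, and treating the counter seed 0xA8000000 as the assigned implicit key is an assumption.

```python
IMPLICIT_KEY = 0xA8000000  # counter seed reported in this issue; assumed to be the assigned key


def search_string(hosts, shm_key=None):
    """Hypothetical stand-in for return_search_string(): hosts plus optional SHM key."""
    alias = ";".join(f"{host}:{port}" for host, port in hosts)
    if shm_key is not None:
        alias += f";shm={shm_key:#x}"
    return alias


def connect(registry, hosts, explicit_key=None):
    # Behavior being modeled: the alias is computed before the implicit key exists.
    alias = search_string(hosts, explicit_key)
    shm_key = explicit_key if explicit_key is not None else IMPLICIT_KEY
    registry[alias] = shm_key
    return shm_key


def close(registry, hosts, shm_key):
    # Close rebuilds the alias with the real key and removes the entry if found.
    alias = search_string(hosts, shm_key)
    return registry.pop(alias, None) is not None


hosts = [("seed1.example", 3000)]

implicit = {}
key = connect(implicit, hosts)
print(close(implicit, hosts, key), len(implicit))  # False 1: entry left behind

explicit = {}
key = connect(explicit, hosts, explicit_key=0xA9000001)
print(close(explicit, hosts, key), len(explicit))  # True 0: entry removed
```

In this model, only the implicit case leaves an entry in the registry, which matches the behavior described above.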
Relevant Source Locations
Verified against the extracted 19.2.1 source tree.
- src/main/client/connect.c:53-54 builds alias_to_search before SHM key generation.
- src/main/client/connect.c:93-119 assigns the implicit shm_key only after the alias has already been computed.
- src/main/client/connect.c:127-129 registers the shared/global entry under the pre-key alias.
- src/main/client/close.c:64-66 rebuilds alias_to_search during close().
- src/main/client/close.c:138-141 appends shm_key inside return_search_string() when SHM is enabled.
- src/main/client/type.c:873-876 sets user_shm_key = true only when the caller provided an explicit shm_key.
- src/main/aerospike.c:47-49 defines the global implicit-key state: counter = 0xA8000000 and user_shm_key = false.
Workaround
Always set an explicit shm_key.
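A configuration along these lines applies the workaround to the reproduction above. It assumes that shm_key is accepted inside the shm dict, as the client's configuration documentation describes. The key value 0xA9000001 is an arbitrary placeholder, not a recommended value, and the client calls are left commented out so the fragment stands on its own.

```python
# Same settings as the reproduction, plus an explicit SHM key (placeholder value).
conf = {
    "hosts": [("seed1.example", 3000), ("seed2.example", 3000)],
    "shm": {"shm_key": 0xA9000001},
    "use_shared_connection": True,
}

# from aerospike import Client
# client = Client(conf)
# client.close()
```

Every process that should share the segment would need to use the same key value.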
Likely Fix Scope
- Primary fix surface is the Python wrapper, not the C client.
- The most likely code changes are in src/main/client/connect.c, where alias_to_search is built before the implicit SHM key is assigned.
- A minimal fix would assign the final shm_key before computing the alias, or recompute the alias after shm_key is finalized and before registering the shared/global entry.
- src/main/client/close.c should be reviewed at the same time to confirm the lookup and cleanup path uses the same alias contract.
- Risk looks low to medium because the behavior is scoped to shared-connection bookkeeping, but it affects lifecycle semantics across processes.
- The most important regression tests would cover use_shared_connection=True with both implicit and explicit shm_key, followed by close() and verification that the shared entry and SHM cleanup behavior are consistent.
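The proposed ordering and the regression check can be sketched against a pure-Python model. This is not the wrapper's implementation: FixedRegistry, make_alias, roundtrip_is_clean, the alias format, and the counter-based key allocation are all hypothetical and stand in for the C code paths named above.

```python
import itertools

HOSTS = [("seed1.example", 3000)]


def make_alias(hosts, shm_key):
    """Hypothetical alias format: host list followed by the final SHM key."""
    return ";".join(f"{h}:{p}" for h, p in hosts) + f";shm={shm_key:#x}"


class FixedRegistry:
    """Model of shared-connection bookkeeping with the key finalized first."""

    def __init__(self):
        self.entries = {}
        # Seed taken from the issue text; the allocation scheme is an assumption.
        self._implicit_keys = itertools.count(0xA8000000)

    def connect(self, hosts, shm_key=None):
        if shm_key is None:
            shm_key = next(self._implicit_keys)  # finalize the key before the alias
        self.entries[make_alias(hosts, shm_key)] = shm_key
        return shm_key

    def close(self, hosts, shm_key):
        return self.entries.pop(make_alias(hosts, shm_key), None) is not None


def roundtrip_is_clean(explicit_key):
    """Connect then close; True only if the entry was found and removed."""
    registry = FixedRegistry()
    key = registry.connect(HOSTS, explicit_key)
    return registry.close(HOSTS, key) and not registry.entries
```

A regression test against the real client would run the same round trip for both the implicit and the explicit case and additionally verify the SHM segment state.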
Notes
This is separate from the follower-startup latency bug. This issue is about stale SHM/shared state surviving close(). The startup latency bug is a downstream effect that becomes visible once stale SHM exists; I will create another issue for that. #1059