Skip to content

Fix async RedisCluster lifecycle after aclose#4069

Open
seplease wants to merge 3 commits into
redis:masterfrom
seplease:fix-3846-async-cluster-aclose
Open

Fix async RedisCluster lifecycle after aclose#4069
seplease wants to merge 3 commits into
redis:masterfrom
seplease:fix-3846-async-cluster-aclose

Conversation

@seplease
Copy link
Copy Markdown

@seplease seplease commented May 16, 2026

Summary

Fix async RedisCluster reopening itself after aclose().

Previously, calling await client.aclose() did not fully terminate the cluster client lifecycle. A later command could trigger lazy reinitialization and recreate cluster connections.

This change separates "not initialized yet" from "explicitly closed" and prevents closed async cluster clients from being reused.

Fixes #3846.

Background

RedisCluster uses lazy initialization internally.

The problem was that _initialize represented two different states at once:

  • the client has not been initialized yet
  • the client has already been explicitly closed

aclose() disconnected node connections, but also reset _initialize=True.

As a result, a later command path treated the client as uninitialized and called initialize() again, reopening cluster connections after close.

That behavior made aclose() act more like a soft reset than a terminal lifecycle operation.

There was also a race window between initialize() and aclose() where connections created during initialization could survive the close path.

Changes

Introduced an explicit _closed state for the async cluster client lifecycle.

Main changes:

  • added _closed lifecycle tracking
  • added _ensure_open() checks before initialization and command execution
  • changed public aclose() to mark the client closed and prevent reuse
  • made aclose() wait for in-flight initialization before disconnecting nodes
  • separated internal node reset from terminal client close

Internal cleanup paths are now split into:

  • aclose()
    • marks the client closed
    • prevents future reuse
  • _close_nodes()
    • resets node state without closing the client lifecycle
    • still allows reinitialization
  • _disconnect_nodes()
    • performs the actual disconnect cleanup

Cluster recovery paths (MOVED, ClusterDownError, SlotNotCoveredError, and pipeline retry) now use _close_nodes() so topology refresh behavior remains unchanged.

Tests

Added regression coverage for:

  • commands executed after aclose()
  • calling aclose() before initialization
  • concurrent initialize() / aclose() execution
  • pipeline execution after close
  • initialization failure cleanup
  • node cleanup paths that still allow reinitialization
  • transaction pipeline execution after close
  • context manager re-entry after exit

Also added an integration-style regression test against a real Docker Redis Cluster.

Ran:

.venv/bin/python -m pytest tests/test_asyncio/test_cluster.py -q \
  --redis-url=redis://localhost:16379/0 \
  --redis-ssl-url=rediss://localhost:27379/0 \
  -m 'not onlynoncluster and not redismod and not fixed_client'

Result:

238 passed, 195 skipped, 33 deselected

Also validated with:

.venv/bin/ruff format --check redis/asyncio/cluster.py tests/test_asyncio/test_cluster.py tests/test_asyncio/conftest.py
.venv/bin/ruff check redis/asyncio/cluster.py tests/test_asyncio/test_cluster.py tests/test_asyncio/conftest.py
.venv/bin/python -m py_compile redis/asyncio/cluster.py tests/test_asyncio/test_cluster.py tests/test_asyncio/conftest.py

Notes

This PR only changes the async cluster client lifecycle behavior.

The sync RedisCluster implementation is unchanged.


Note

Medium Risk
Changes async RedisCluster lifecycle semantics by making aclose() terminal and adding new close/retry paths; this could break callers/tests that previously relied on implicit reinitialization after close or on specific retry behavior under cluster errors.

Overview
Fixes async RedisCluster reopening itself after aclose() by introducing an explicit _closed state and enforcing it via _ensure_open() checks in initialize(), execute_command(), and pipeline/transaction execution paths.

Splits connection teardown into _disconnect_nodes() (actual disconnect), _close_nodes() (soft reset to allow topology refresh retries), and aclose() (terminal close that waits on the client lock), and updates cluster recovery/retry paths (e.g. MOVED, ClusterDownError, pipeline retries) to use _close_nodes() instead of fully closing the client.

Updates async test fixtures to handle flushing a closed cluster client by creating a temporary cleanup client, and adds regression tests covering closed-client command rejection, concurrent initialize()/aclose() behavior, init-failure cleanup, and pipeline execution after close.

Reviewed by Cursor Bugbot for commit b922022. Bugbot is set up for automated code reviews on this repo. Configure here.

@jit-ci
Copy link
Copy Markdown

jit-ci Bot commented May 16, 2026

Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

Comment thread redis/asyncio/cluster.py
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit b922022. Configure here.

Comment thread redis/asyncio/cluster.py
@petyaslavova
Copy link
Copy Markdown
Collaborator

Hi @seplease, thank you for your contribution! I will have a look at it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

async cluster does not close

2 participants