submariner: make start hook idempotent with ensure model #2502

raghavendra-talur wants to merge 1 commit into RamenDR:main
Conversation
Refactor the submariner start hook to check whether the broker and the cluster joins are already healthy before re-running them. This avoids the "existing joined cluster with the same ID" error when re-running start on a partially deployed environment.

- Split deploy_broker/join_cluster into is_*/do_*/ensure_* functions
- Add are_deployments_available() to check deployment health
- Add clean_broker_registration() to remove stale broker-side state (clusters.submariner.io and endpoints.submariner.io) before re-joining
- Add subctl.uninstall() wrapper in drenv/subctl.py
- Fix typo: "deployuments" -> "deployments"

Assisted-by: Claude Code/claude-opus-4-6
Signed-off-by: Raghavendra Talur <raghavendra.talur@gmail.com>
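For readers skimming the conversation, here is a minimal sketch of the ensure model the description refers to. The is_*/do_*/ensure_* names come from the commit message; the bodies below are illustrative placeholders, not the actual diff.

```python
import os


def is_broker_deployed(broker, broker_info_file):
    """Check only, never mutate: True if broker state looks healthy.

    Placeholder check; the real function would also verify broker health.
    """
    return os.path.exists(broker_info_file)


def do_deploy_broker(broker):
    """Perform the deployment unconditionally (placeholder body)."""
    print(f"Deploying broker on cluster '{broker}'")


def ensure_broker(broker, broker_info_file):
    """Idempotent entry point: check first, act only when needed."""
    if is_broker_deployed(broker, broker_info_file):
        print(f"Broker on '{broker}' already deployed, skipping")
        return
    do_deploy_broker(broker)
```

The same check/do/ensure split applies to join_cluster, with clean_broker_registration() run before the join to clear the stale registration that triggers the "existing joined cluster with the same ID" error.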
```python
def deploy_broker(broker):
    print(f"Waiting until broker '{broker}' is ready")
    drenv_cluster.wait_until_ready(broker)


def is_broker_deployed(broker, broker_info):
```
I'm thinking this should be a bit more robust: it could validate broker health and the validity of the broker info file. A corrupted or stale broker info file could cause the function to skip deployment when it shouldn't.
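One way to harden the check, as a sketch: treat a missing, empty, or unreadable broker info file as "not deployed". The function name and the validity criteria here are assumptions, not part of the PR.

```python
import os


def is_broker_info_usable(broker_info_file):
    """Reject missing, empty, or unreadable broker info files.

    Assumption: non-empty and readable counts as plausibly valid; a
    stricter check would parse the file's actual format.
    """
    try:
        with open(broker_info_file, "rb") as f:
            return len(f.read(1)) == 1
    except OSError:
        return False
```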
```python
BROKER_NAMESPACE = "submariner-k8s-broker"
```
Are we sure this is the best place for this constant?
```python
    pass  # Not found is fine.
```
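For context, a hypothetical sketch of the clean_broker_registration() mentioned in the description, based on its stated job of deleting stale clusters.submariner.io and endpoints.submariner.io entries. The kubectl.delete() wrapper and deleting by cluster ID are assumptions; the snippet above suggests the real code simply ignores not-found errors.

```python
from drenv import kubectl  # assumed: drenv also wraps kubectl delete


def clean_broker_registration(broker, cluster_id):
    """Remove stale broker-side state for a cluster before re-joining.

    Hypothetical: assumes a kubectl.delete() wrapper with the same
    calling convention as the kubectl.get() used elsewhere in drenv.
    """
    for resource in ["clusters.submariner.io", "endpoints.submariner.io"]:
        kubectl.delete(
            resource,
            cluster_id,
            f"--namespace={BROKER_NAMESPACE}",
            "--ignore-not-found",
            context=broker,
        )
```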
```python
def are_deployments_available(cluster, names, namespace):
```
Looks like this doesn't distinguish between a deployment that doesn't exist and one that exists but isn't available.
```python
    for name in names:
        try:
            out = kubectl.get(
                f"deploy/{name}",
                f"--namespace={namespace}",
                "--output=jsonpath={.status.conditions[?(@.type=='Available')].status}",
                context=cluster,
            )
            if out.strip() != "True":
                return False
        except Exception:
            return False
```
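Addressing the earlier comment about distinguishing "missing" from "unavailable", here is a sketch that separates the two cases. It assumes kubectl.get() raises an exception whose message contains kubectl's NotFound text; the real drenv error type may differ.

```python
def deployment_state(cluster, name, namespace):
    """Return "missing", "unavailable", or "available" for one deployment."""
    try:
        out = kubectl.get(
            f"deploy/{name}",
            f"--namespace={namespace}",
            "--output=jsonpath={.status.conditions[?(@.type=='Available')].status}",
            context=cluster,
        )
    except Exception as e:
        if "NotFound" in str(e):
            return "missing"
        raise  # Unexpected errors should surface, not be swallowed.
    return "available" if out.strip() == "True" else "unavailable"
```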
Maybe this can be optimized to minimize kubectl.get calls by fetching all deployments in advance and then checking against each one?
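A sketch of that optimization: one kubectl call listing every deployment in the namespace, then a membership check per name. The JSON parsing below assumes standard kubectl get -o json output; it is not drenv code.

```python
import json


def are_deployments_available(cluster, names, namespace):
    """Check all deployments with a single kubectl call."""
    out = kubectl.get(
        "deploy",
        f"--namespace={namespace}",
        "--output=json",
        context=cluster,
    )
    available = {
        item["metadata"]["name"]
        for item in json.loads(out)["items"]
        if any(
            c["type"] == "Available" and c["status"] == "True"
            for c in item["status"].get("conditions", [])
        )
    }
    return all(name in available for name in names)
```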
I have never seen such an issue - how do you reproduce it? The start script should already be idempotent; deploying submariner twice works.

@raghavendra-talur I just tried, and submariner is idempotent:

```
% drenv start envs/submariner.yaml
2026-05-04 16:57:59,926 INFO [submariner] Starting environment
2026-05-04 16:57:59,974 INFO [hub] Starting minikube cluster
2026-05-04 16:57:59,978 INFO [dr1] Starting minikube cluster
2026-05-04 16:57:59,984 INFO [dr2] Starting minikube cluster
2026-05-04 16:58:15,424 INFO [dr2] Cluster started in 15.44 seconds
2026-05-04 16:58:15,764 INFO [dr2] Configuring containerd
2026-05-04 16:58:18,643 INFO [hub] Cluster started in 18.67 seconds
2026-05-04 16:58:18,978 INFO [hub] Configuring containerd
2026-05-04 16:58:20,070 INFO [hub/0] Running addons/submariner/start
2026-05-04 16:58:21,701 INFO [dr1] Cluster started in 21.72 seconds
2026-05-04 16:58:22,044 INFO [dr1] Configuring containerd
2026-05-04 16:59:19,887 INFO [hub/0] addons/submariner/start completed in 59.82 seconds
2026-05-04 16:59:19,887 INFO [hub/0] Running addons/submariner/test
2026-05-04 16:59:39,860 INFO [hub/0] addons/submariner/test completed in 19.97 seconds
2026-05-04 16:59:39,861 INFO [submariner] Environment started in 99.93 seconds
% drenv start envs/submariner.yaml
2026-05-04 17:02:09,311 INFO [submariner] Starting environment
2026-05-04 17:02:09,629 INFO [dr1] Starting minikube cluster
2026-05-04 17:02:09,634 INFO [dr2] Starting minikube cluster
2026-05-04 17:02:09,649 INFO [hub] Starting minikube cluster
2026-05-04 17:02:32,565 INFO [dr1] Cluster started in 22.94 seconds
2026-05-04 17:02:32,678 INFO [dr1] Waiting for fresh status
2026-05-04 17:02:39,585 INFO [hub] Cluster started in 29.94 seconds
2026-05-04 17:02:39,664 INFO [hub] Waiting for fresh status
2026-05-04 17:02:40,713 INFO [dr2] Cluster started in 31.08 seconds
2026-05-04 17:02:40,788 INFO [dr2] Waiting for fresh status
2026-05-04 17:03:02,671 INFO [dr1] Looking up failed deployments
2026-05-04 17:03:09,664 INFO [hub] Looking up failed deployments
2026-05-04 17:03:10,002 INFO [hub/0] Running addons/submariner/start
2026-05-04 17:03:10,780 INFO [dr2] Looking up failed deployments
2026-05-04 17:03:57,613 INFO [hub/0] addons/submariner/start completed in 47.61 seconds
2026-05-04 17:03:57,613 INFO [hub/0] Running addons/submariner/test
2026-05-04 17:04:16,690 INFO [hub/0] addons/submariner/test completed in 19.08 seconds
2026-05-04 17:04:16,690 INFO [submariner] Environment started in 127.39 seconds
```

Can you explain how to reproduce the issue you are trying to fix?