Skip to content

Fix known issues with dryRun (test failover) implementation#2526

Open
am-agrawa wants to merge 3 commits intoRamenDR:mainfrom
am-agrawa:dryRun-feedback-fixes
Open

Fix known issues with dryRun (test failover) implementation#2526
am-agrawa wants to merge 3 commits intoRamenDR:mainfrom
am-agrawa:dryRun-feedback-fixes

Conversation

@am-agrawa
Copy link
Copy Markdown
Member

No description provided.

…havior

Signed-off-by: Aman Agrawal <aman_31dec@yahoo.in>
}
// Use the DRPC Protected condition to check if it is true and then allow failover
if !d.isProtected() {
if d.instance.Spec.DryRun && !d.isProtected() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that test failovers are blocked if not Protected and actual failovers proceed even if not Protected. I believe real failovers should be more strict about protection, not less.

if !d.instance.Spec.DryRun && !d.isProtected() {
    return !done, nil
}

This would block real failovers if not Protected which is safe, and allow test failovers to proceed without protection which is flexible.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we intent to change how real failover works, but failover can be performed in any or all scenarios.
Test failover follows the same path.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Real failover cannot be strict - this is the emergency case when the cluster is broken and we cannot access it. The only thing we can do is try to fail over to the other cluster knowing that we are losing some data.

Test failover must be strict, if the system is not is good state this is not the time to test if failover works - it should work like relocate, blocked if the system is not in good state.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BenamarMk Thoughts?

Copy link
Copy Markdown
Member

@BenamarMk BenamarMk May 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nirs is correct.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we going to make this change as part of this PR?

am-agrawa and others added 2 commits April 30, 2026 01:06
Add annotation constants to track test failover mode:
- DRPCTestFailoverDryRunAnnotation in controllers package
- DRPCTestFailoverDryRunAnnotationValueTrue in API types

Update autoResync() to handle test failover explicitly using the
annotation while maintaining same behavior as regular failover.

Signed-off-by: Aman Agrawal <aman_31dec@yahoo.in>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Aman Agrawal <aman_31dec@yahoo.in>
Introduce handleTestFailoverTransition() to manage test failover
lifecycle - adds annotation on entry, triggers cleanup on exit,
and removes annotation after cleanup completes.

Signed-off-by: Aman Agrawal <aman_31dec@yahoo.in>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Aman Agrawal <aman_31dec@yahoo.in>
@am-agrawa am-agrawa force-pushed the dryRun-feedback-fixes branch from 1c4b9fe to e94c6e4 Compare April 29, 2026 19:41
@am-agrawa am-agrawa changed the title Add Protected condition check for dryRun without changing existing behaviour Fix known issues with dryRun (test failover) implementation Apr 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants