Fix known issues with dryRun (test failover) implementation#2526
am-agrawa wants to merge 3 commits into RamenDR:main
…havior Signed-off-by: Aman Agrawal <aman_31dec@yahoo.in>
  }
  // Use the DRPC Protected condition to check if it is true and then allow failover
- if !d.isProtected() {
+ if d.instance.Spec.DryRun && !d.isProtected() {
This means that test failovers are blocked if not Protected and actual failovers proceed even if not Protected. I believe real failovers should be more strict about protection, not less.
if !d.instance.Spec.DryRun && !d.isProtected() {
return !done, nil
}
This would block real failovers when the DRPC is not Protected, which is safe, and allow test failovers to proceed without protection, which is flexible.
I don't think we intend to change how real failover works, but failover can be performed in any or all scenarios.
Test failover follows the same path.
Real failover cannot be strict - this is the emergency case when the cluster is broken and we cannot access it. The only thing we can do is try to fail over to the other cluster knowing that we are losing some data.
Test failover must be strict: if the system is not in a good state, this is not the time to test whether failover works. It should work like relocate, blocked if the system is not in a good state.
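To make the policy described above concrete, here is a minimal sketch of the gate as a pure function. It assumes the only inputs are the DRPC's `Spec.DryRun` flag and the result of `isProtected()`. The `failoverGate` type and its `blocked` method are hypothetical names for illustration, not code from this PR.

```go
package main

import "fmt"

// failoverGate is a hypothetical stand-in for the two inputs the check reads:
// Spec.DryRun on the DRPC and the result of isProtected().
type failoverGate struct {
	DryRun    bool
	Protected bool
}

// blocked reports whether failover should wait for protection.
// Real failover is the emergency path and is never blocked here.
// Test failover (DryRun) is blocked until the workload is Protected.
func (g failoverGate) blocked() bool {
	return g.DryRun && !g.Protected
}

func main() {
	cases := []failoverGate{
		{DryRun: false, Protected: false},
		{DryRun: false, Protected: true},
		{DryRun: true, Protected: false},
		{DryRun: true, Protected: true},
	}

	for _, c := range cases {
		fmt.Printf("dryRun=%t protected=%t blocked=%t\n", c.DryRun, c.Protected, c.blocked())
	}
}
```

Only the combination of a test failover and an unprotected workload is blocked, which matches the `d.instance.Spec.DryRun && !d.isProtected()` condition in the diff.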
So are we going to make this change as part of this PR?
Add annotation constants to track test failover mode:
- DRPCTestFailoverDryRunAnnotation in controllers package
- DRPCTestFailoverDryRunAnnotationValueTrue in API types
Update autoResync() to handle test failover explicitly using the annotation while maintaining same behavior as regular failover.
Signed-off-by: Aman Agrawal <aman_31dec@yahoo.in>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Introduce handleTestFailoverTransition() to manage test failover lifecycle: adds annotation on entry, triggers cleanup on exit, and removes annotation after cleanup completes.
Signed-off-by: Aman Agrawal <aman_31dec@yahoo.in>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
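The lifecycle in this commit message (annotate on entry, clean up on exit, remove the annotation once cleanup completes) can be sketched as a small state function over an annotations map. This is an illustration only: the annotation key and value strings, the `transition` type, and `handleTransition` are assumptions, and the real `handleTestFailoverTransition()` may be structured quite differently.

```go
package main

import "fmt"

// Hypothetical key and value. The PR names the constants
// DRPCTestFailoverDryRunAnnotation and DRPCTestFailoverDryRunAnnotationValueTrue,
// but their string values are not shown in this conversation.
const (
	testFailoverAnnotation      = "example.invalid/test-failover-dry-run"
	testFailoverAnnotationValue = "true"
)

// transition names what one reconcile pass decided to do (hypothetical type).
type transition string

const (
	noChange       transition = "no-change"
	enteredTest    transition = "entered-test-failover"
	cleanupPending transition = "cleanup-pending"
	exitedTest     transition = "exited-test-failover"
)

// handleTransition mutates annotations according to the lifecycle:
//   - dryRun set and no annotation yet: add the annotation (entry)
//   - dryRun cleared, annotation present, cleanup not done: keep it and wait
//   - dryRun cleared, annotation present, cleanup done: remove it (exit)
func handleTransition(annotations map[string]string, dryRun, cleanupDone bool) transition {
	_, marked := annotations[testFailoverAnnotation]

	switch {
	case dryRun && !marked:
		annotations[testFailoverAnnotation] = testFailoverAnnotationValue
		return enteredTest
	case !dryRun && marked && !cleanupDone:
		return cleanupPending
	case !dryRun && marked && cleanupDone:
		delete(annotations, testFailoverAnnotation)
		return exitedTest
	default:
		return noChange
	}
}

func main() {
	annotations := map[string]string{}

	// Simulate successive reconcile passes over the same object.
	fmt.Println(handleTransition(annotations, true, false))  // test failover requested
	fmt.Println(handleTransition(annotations, true, false))  // still in test failover
	fmt.Println(handleTransition(annotations, false, false)) // dryRun cleared, cleanup running
	fmt.Println(handleTransition(annotations, false, true))  // cleanup finished
}
```

Keeping the annotation until cleanup finishes means a reconcile that runs between the user clearing dryRun and cleanup completing can still tell that the object is leaving test failover rather than starting a regular one.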
1c4b9fe to e94c6e4
No description provided.