migration: Add data integrity check#6859
Conversation
WalkthroughMigrationBase now records an MD5 checksum of a VM-side file and verifies it after migration. During setup (when the VM is started by the test), it creates a random file at the configured path on the source VM, captures its Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Add a data integrity check during migration to ensure that the data remains consistent between source and destination. Signed-off-by: lcheng <lcheng@redhat.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
provider/migration/base_steps.py (1)
92-103:⚠️ Potential issue | 🟠 MajorKeep checksum setup and verification in sync.
Line 103 only captures the baseline when
setup_default()had to start the guest itself, but Line 330 always verifies it. An already-running guest, or an offline case withstart_vm=no, can reach the verify path without ever creating the file/baseline and fail for the wrong reason.Possible fix
- if start_vm == "yes" and not self.vm.is_alive(): + if start_vm == "yes" and not self.vm.is_alive(): self.vm.start() if use_console: self.vm.cleanup_serial_console() self.vm.create_serial_console() self.vm.wait_for_serial_login().close() else: self.vm.wait_for_login().close() - self.prepare_data_integrity_check() + if start_vm == "yes" and self.vm.is_alive(): + self.prepare_data_integrity_check()- self.verify_data_integrity_check() + if self.file_md5sum_in_vm is not None: + self.verify_data_integrity_check()Also applies to: 326-330
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@provider/migration/base_steps.py` around lines 92 - 103, The checksum baseline is only created when this code path starts the VM (the block that calls self.prepare_data_integrity_check()), but the verification path (around the verify call near the later block using setup_default()/verify) can run for already-running guests or when start_vm=="no", causing false failures; ensure the baseline is created whenever verification may later run by invoking the same setup routine (e.g., call self.prepare_data_integrity_check() or the equivalent setup_default() baseline-creation logic) for cases where the VM is already alive or when start_vm=="no" and the VM exists, and mirror this change for the similar code around the verify path (the block that uses setup_default()/verify).
🧹 Nitpick comments (1)
provider/migration/base_steps.py (1)
495-500: Make the checksum payload size configurable.Hardcoding
count=600here adds 600 MiB of guest writes to every covered case and assumes/varcan absorb it. A smaller default plus a size parameter would keep the integrity check useful without turning it into a filesystem-capacity test.Possible refactor
file_md5sum_path = self.params.get("file_md5sum_path", "/var/check_file_md5sum") + file_md5sum_size_mb = int(self.params.get("file_md5sum_size_mb", "64")) self.vm.cleanup_serial_console() self.vm.create_serial_console() vm_session = self.vm.wait_for_serial_login(timeout=120) - vm_session.cmd(f"dd if=/dev/urandom of={file_md5sum_path} bs=1M count=600") + vm_session.cmd( + f"dd if=/dev/urandom of={file_md5sum_path} bs=1M count={file_md5sum_size_mb}" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@provider/migration/base_steps.py` around lines 495 - 500, The dd payload size is hardcoded (count=600) causing large guest writes; add a configurable parameter (e.g., "file_md5sum_size_mb") in self.params with a sensible default (e.g., 10), validate/normalize it to an integer and optional max cap, then use it to build the dd command instead of the fixed 600 (modify the code around file_md5sum_path, self.params and the vm_session.cmd call so the command becomes f"dd if=/dev/urandom of={file_md5sum_path} bs=1M count={size_mb}" with size_mb coming from the new param).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@provider/migration/base_steps.py`:
- Around line 524-526: Remove the unnecessary f-string prefix from the log call
self.test.log.info(f"Data integrity check passed.") so it becomes a plain
string, and add one additional blank line before the top-level function
definition setup_network_data_transport(params) to satisfy PEP8 spacing (E302).
- Around line 497-499: prepare_data_integrity_check and
verify_data_integrity_check unconditionally force serial-console login, ignoring
the use_console option passed to setup_default; update setup_default to save the
passed use_console flag on the instance (e.g., self.use_console) and change
prepare_data_integrity_check and verify_data_integrity_check to consult
self.use_console before calling vm.cleanup_serial_console(),
vm.create_serial_console(), and vm.wait_for_serial_login(); if self.use_console
is False, use the normal guest login path (e.g., vm.wait_for_login /
vm.wait_for_ssh or the existing initial login code) or make the checks optional
based on that flag.
---
Outside diff comments:
In `@provider/migration/base_steps.py`:
- Around line 92-103: The checksum baseline is only created when this code path
starts the VM (the block that calls self.prepare_data_integrity_check()), but
the verification path (around the verify call near the later block using
setup_default()/verify) can run for already-running guests or when
start_vm=="no", causing false failures; ensure the baseline is created whenever
verification may later run by invoking the same setup routine (e.g., call
self.prepare_data_integrity_check() or the equivalent setup_default()
baseline-creation logic) for cases where the VM is already alive or when
start_vm=="no" and the VM exists, and mirror this change for the similar code
around the verify path (the block that uses setup_default()/verify).
---
Nitpick comments:
In `@provider/migration/base_steps.py`:
- Around line 495-500: The dd payload size is hardcoded (count=600) causing
large guest writes; add a configurable parameter (e.g., "file_md5sum_size_mb")
in self.params with a sensible default (e.g., 10), validate/normalize it to an
integer and optional max cap, then use it to build the dd command instead of the
fixed 600 (modify the code around file_md5sum_path, self.params and the
vm_session.cmd call so the command becomes f"dd if=/dev/urandom
of={file_md5sum_path} bs=1M count={size_mb}" with size_mb coming from the new
param).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 300cb407-39fb-4b8a-a678-c6014a3b2e15
📒 Files selected for processing (1)
provider/migration/base_steps.py
| self.vm.cleanup_serial_console() | ||
| self.vm.create_serial_console() | ||
| vm_session = self.vm.wait_for_serial_login(timeout=120) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -name "base_steps.py" -type f | grep migrationRepository: autotest/tp-libvirt
Length of output: 97
🏁 Script executed:
wc -l provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 99
🏁 Script executed:
sed -n '490,525p' provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 1598
🏁 Script executed:
# Get more context - find method definitions
sed -n '470,550p' provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 3254
🏁 Script executed:
# Find the class definition
grep -n "class.*MigrationBase" provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 94
🏁 Script executed:
# Search for setup_default method
grep -n "def setup_default" provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 113
🏁 Script executed:
# Search for use_console references
grep -n "use_console" provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 255
🏁 Script executed:
# Get context around setup_default and how it handles use_console
sed -n '1,100p' provider/migration/base_steps.py | head -50Repository: autotest/tp-libvirt
Length of output: 1395
🏁 Script executed:
# Get the full setup_default method
sed -n '67,150p' provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 4121
🏁 Script executed:
# Check if prepare_data_integrity_check and verify_data_integrity_check are called from setup_default
grep -n "prepare_data_integrity_check\|verify_data_integrity_check" provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 260
🏁 Script executed:
# Check class attributes or initialization to see if use_console is stored
sed -n '28,100p' provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 2791
🏁 Script executed:
grep -n "prepare_data_integrity_check\|verify_data_integrity_check" provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 260
🏁 Script executed:
# Get context around the verify_data_integrity_check call at line 330
sed -n '320,340p' provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 930
Both data integrity check helpers unconditionally force serial-console login, ignoring the use_console parameter passed to setup_default().
The initial guest login respects use_console, but prepare_data_integrity_check() (lines 497–499) and verify_data_integrity_check() (lines 514–516) override that choice and require serial console. This breaks the contract for guests that support normal login but lack serial console. Either reuse the selected login method by storing use_console as a class attribute, accept it as a method parameter, or make these checks optional.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@provider/migration/base_steps.py` around lines 497 - 499,
prepare_data_integrity_check and verify_data_integrity_check unconditionally
force serial-console login, ignoring the use_console option passed to
setup_default; update setup_default to save the passed use_console flag on the
instance (e.g., self.use_console) and change prepare_data_integrity_check and
verify_data_integrity_check to consult self.use_console before calling
vm.cleanup_serial_console(), vm.create_serial_console(), and
vm.wait_for_serial_login(); if self.use_console is False, use the normal guest
login path (e.g., vm.wait_for_login / vm.wait_for_ssh or the existing initial
login code) or make the checks optional based on that flag.
| self.test.log.info(f"Data integrity check passed.") | ||
|
|
||
| def setup_network_data_transport(params): |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's find and read the relevant file
fd base_steps.py | head -5Repository: autotest/tp-libvirt
Length of output: 95
🏁 Script executed:
# Read the file around lines 524-526
wc -l provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 99
🏁 Script executed:
# Get context around the problematic lines
sed -n '520,530p' provider/migration/base_steps.pyRepository: autotest/tp-libvirt
Length of output: 403
Remove the unnecessary f-string prefix and add one more blank line before the top-level function.
Line 524 uses an f-string with no placeholders (Ruff F541), and the function definition on line 526 requires an additional blank line to meet PEP 8 spacing requirements (E302).
Fix
- self.test.log.info(f"Data integrity check passed.")
+ self.test.log.info("Data integrity check passed.")
+
def setup_network_data_transport(params):📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| self.test.log.info(f"Data integrity check passed.") | |
| def setup_network_data_transport(params): | |
| self.test.log.info("Data integrity check passed.") | |
| def setup_network_data_transport(params): |
🧰 Tools
🪛 GitHub Actions: CI
[error] 526-526: inspekt checkall style failure: E302 expected 2 blank lines, found 1
🪛 Ruff (0.15.9)
[error] 524-524: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@provider/migration/base_steps.py` around lines 524 - 526, Remove the
unnecessary f-string prefix from the log call self.test.log.info(f"Data
integrity check passed.") so it becomes a plain string, and add one additional
blank line before the top-level function definition
setup_network_data_transport(params) to satisfy PEP8 spacing (E302).
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
provider/migration/base_steps.py (1)
95-103:⚠️ Potential issue | 🔴 CriticalDon't run the integrity check when no baseline was prepared.
Line 103 only populates
self.file_md5sum_in_vmwhensetup_default()had to start the guest itself. For an already-running guest—or anystart_vm = "no"flow—Line 330 still callsverify_data_integrity_check(), so this path compares againstNoneor tries to log into a guest that was never prepared.Minimal safe guard
- self.verify_data_integrity_check() + if self.file_md5sum_in_vm is not None: + self.verify_data_integrity_check()Also applies to: 329-330
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@provider/migration/base_steps.py` around lines 95 - 103, The integrity check runs even when no baseline was prepared; modify the flow so verify_data_integrity_check() is only called if a baseline exists (i.e., self.file_md5sum_in_vm is populated). In practice, set a clear flag or check self.file_md5sum_in_vm before calling verify_data_integrity_check() in the code paths that handle start_vm == "no" and the post-setup flow (the block that calls verify_data_integrity_check() around lines 329-330), and ensure prepare_data_integrity_check() still sets self.file_md5sum_in_vm when invoked from setup_default()/start sequence so the check will run only when a baseline was created.
♻️ Duplicate comments (2)
provider/migration/base_steps.py (2)
524-524:⚠️ Potential issue | 🟡 MinorDrop the literal f-string on Line 524.
Ruff F541 is still triggered here, so the lint step will keep failing until this becomes a plain string.
Minimal fix
- self.test.log.info(f"Data integrity check passed.") + self.test.log.info("Data integrity check passed.")The grep below should return Line 524 until the literal f-string is removed.
#!/bin/bash rg -n 'self\.test\.log\.info\(f"Data integrity check passed\."\)' provider/migration/base_steps.py🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@provider/migration/base_steps.py` at line 524, The log call uses a literal f-string which triggers Ruff F541; change the call in the method containing self.test.log.info to use a plain string instead of an f-string (replace self.test.log.info(f"Data integrity check passed.") with self.test.log.info("Data integrity check passed.") so the string is no longer an f-string).
97-103:⚠️ Potential issue | 🔴 CriticalHonor
use_consolein these new helpers.Lines 97-102 explicitly support normal guest login, but Lines 497-499 and 513-516 always rebuild the serial console and call
wait_for_serial_login(). That breaksuse_console=Falsescenarios on guests without a serial console.Also applies to: 497-499, 513-516
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@provider/migration/base_steps.py` around lines 97 - 103, The new helper code must respect the use_console flag: replace the unconditional calls to self.vm.cleanup_serial_console(), self.vm.create_serial_console(), and self.vm.wait_for_serial_login() in the new helpers with logic that checks use_console and only rebuilds/uses the serial console and calls self.vm.wait_for_serial_login() when use_console is True; when False, call self.vm.wait_for_login() instead. Update both helper locations that currently always recreate the serial console (the blocks invoking vm.cleanup_serial_console, vm.create_serial_console, vm.wait_for_serial_login) so they mirror the existing branch in prepare_data_integrity_check() and honor use_console consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@provider/migration/base_steps.py`:
- Around line 513-520: The current swap of self.vm.connect_uri to dest_uri then
calling self.vm.wait_for_serial_login() and vm_session.cmd_output(...) is not
exception-safe; if wait_for_serial_login or cmd_output raises, the original
self.vm.connect_uri is never restored and vm_session may remain open. Fix by
wrapping the serial-login and md5sum work in a try/finally: after setting
backup_uri, set self.vm.connect_uri = dest_uri, then in a finally block always
restore self.vm.connect_uri = backup_uri and ensure vm_session.close() (only if
vm_session was created) and any necessary serial-console cleanup are executed;
use the existing methods wait_for_serial_login, vm_session.cmd_output,
vm_session.close, vm.cleanup_serial_console and vm.create_serial_console to
implement the safe cleanup.
---
Outside diff comments:
In `@provider/migration/base_steps.py`:
- Around line 95-103: The integrity check runs even when no baseline was
prepared; modify the flow so verify_data_integrity_check() is only called if a
baseline exists (i.e., self.file_md5sum_in_vm is populated). In practice, set a
clear flag or check self.file_md5sum_in_vm before calling
verify_data_integrity_check() in the code paths that handle start_vm == "no" and
the post-setup flow (the block that calls verify_data_integrity_check() around
lines 329-330), and ensure prepare_data_integrity_check() still sets
self.file_md5sum_in_vm when invoked from setup_default()/start sequence so the
check will run only when a baseline was created.
---
Duplicate comments:
In `@provider/migration/base_steps.py`:
- Line 524: The log call uses a literal f-string which triggers Ruff F541;
change the call in the method containing self.test.log.info to use a plain
string instead of an f-string (replace self.test.log.info(f"Data integrity check
passed.") with self.test.log.info("Data integrity check passed.") so the string
is no longer an f-string).
- Around line 97-103: The new helper code must respect the use_console flag:
replace the unconditional calls to self.vm.cleanup_serial_console(),
self.vm.create_serial_console(), and self.vm.wait_for_serial_login() in the new
helpers with logic that checks use_console and only rebuilds/uses the serial
console and calls self.vm.wait_for_serial_login() when use_console is True; when
False, call self.vm.wait_for_login() instead. Update both helper locations that
currently always recreate the serial console (the blocks invoking
vm.cleanup_serial_console, vm.create_serial_console, vm.wait_for_serial_login)
so they mirror the existing branch in prepare_data_integrity_check() and honor
use_console consistently.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cc6c8c8f-13ba-4e3e-9430-c106a7345c2e
📒 Files selected for processing (1)
provider/migration/base_steps.py
| backup_uri, self.vm.connect_uri = self.vm.connect_uri, dest_uri | ||
| self.vm.cleanup_serial_console() | ||
| self.vm.create_serial_console() | ||
| vm_session = self.vm.wait_for_serial_login(timeout=120) | ||
| o = vm_session.cmd_output(f"md5sum {file_md5sum_path}").strip() | ||
| self.test.log.debug(f"File md5sum: {o}") | ||
| vm_session.close() | ||
| self.vm.connect_uri = backup_uri |
There was a problem hiding this comment.
Make the destination URI swap exception-safe.
If wait_for_serial_login() or md5sum fails on Lines 513-520, self.vm.connect_uri is never restored and the session stays open. That can send later cleanup to the wrong host.
Safer cleanup
backup_uri, self.vm.connect_uri = self.vm.connect_uri, dest_uri
- self.vm.cleanup_serial_console()
- self.vm.create_serial_console()
- vm_session = self.vm.wait_for_serial_login(timeout=120)
- o = vm_session.cmd_output(f"md5sum {file_md5sum_path}").strip()
- self.test.log.debug(f"File md5sum: {o}")
- vm_session.close()
- self.vm.connect_uri = backup_uri
+ vm_session = None
+ try:
+ self.vm.cleanup_serial_console()
+ self.vm.create_serial_console()
+ vm_session = self.vm.wait_for_serial_login(timeout=120)
+ o = vm_session.cmd_output(f"md5sum {file_md5sum_path}").strip()
+ self.test.log.debug(f"File md5sum: {o}")
+ finally:
+ if vm_session:
+ vm_session.close()
+ self.vm.connect_uri = backup_uri🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@provider/migration/base_steps.py` around lines 513 - 520, The current swap of
self.vm.connect_uri to dest_uri then calling self.vm.wait_for_serial_login() and
vm_session.cmd_output(...) is not exception-safe; if wait_for_serial_login or
cmd_output raises, the original self.vm.connect_uri is never restored and
vm_session may remain open. Fix by wrapping the serial-login and md5sum work in
a try/finally: after setting backup_uri, set self.vm.connect_uri = dest_uri,
then in a finally block always restore self.vm.connect_uri = backup_uri and
ensure vm_session.close() (only if vm_session was created) and any necessary
serial-console cleanup are executed; use the existing methods
wait_for_serial_login, vm_session.cmd_output, vm_session.close,
vm.cleanup_serial_console and vm.create_serial_console to implement the safe
cleanup.
|
Closing this PR due to current team constraints. This is part of a broader effort to triage all in-flight work across our upstream repos. If this work is still needed, please feel free to reopen and it will be picked up. Apologies for any inconvenience. |
Add a data integrity check during migration to ensure that the data remains consistent between source and destination.
Summary by CodeRabbit