feat(vm): symlink indirection for dm-snapshot checkpoint/restore#209
feat(vm): symlink indirection for dm-snapshot checkpoint/restore#209lucas77778 wants to merge 3 commits intofeat/dm-snapshot-jailerfrom
Conversation
Direct mode: do_boot() now creates a stable symlink
{vm_dir}/rootfs.link → /dev/mapper/arcbox-snap-{id} and passes the
symlink path to Firecracker. The vmstate records this stable path,
so on restore a new dm-snapshot can be created and the symlink
retargeted without FC knowing the underlying device changed.
Restore (both modes): restore_sandbox() now sets up dm-snapshot CoW
for the restored sandbox instead of doing a full rootfs copy.
- Jailer: stage_kernel_for_jailer() + cow_manager.setup() + mknod
- Direct: cow_manager.setup() + symlink at original vm_dir path
cow_handle is stored on the restored SandboxInstance so teardown
runs correctly on remove.
Remove unused stage_files_for_jailer() — all call sites now use the
split stage_kernel_for_jailer() + stage_rootfs_*_for_jailer().
- cow_size: use sectors.checked_mul(512) to guard against overflow instead of unchecked multiplication. - CowManager::teardown(): change return type from Result<()> to () since the method is best-effort with internal logging and never returns Err. Update all call sites to remove dead error handling.
- stage_rootfs_device_for_jailer: remove stale rootfs.ext4 before mknod to avoid EEXIST from a previous crash. - do_boot: update comment — CowHandle is active in both direct and jailer modes, not direct only. - init.rs: mount /var with nodev (safe default), then mount only /var/lib/arcbox/jailer without nodev for jailer block device nodes.
|
@greptile review |
Greptile SummaryThis PR introduces a symlink indirection layer ( Issues found:
Confidence Score: 3/5Happy path works per the test plan, but direct-mode restore has a resource leak on symlink failure and a systematic directory leak on every restore-and-remove cycle. Two P1 logic bugs in the direct-mode restore path: a CowHandle resource leak when symlink() fails (leaks dm device + loop device + COW file), and an original_vm_dir that is recreated per restore but never cleaned up. The resource leak is an uncommon path, but the directory leak is triggered by the primary test scenario and accumulates silently. Both are straightforward to fix before merging. virt/arcbox-vm/src/sandbox.rs — specifically the direct-mode restore block (~lines 1120–1145) and the corresponding cleanup in remove_sandbox_impl.
|
| Filename | Overview |
|---|---|
| virt/arcbox-vm/src/sandbox.rs | Wires dm-snapshot CoW into restore for both jailer and direct modes; direct-mode path has a resource leak on symlink failure and a systematic directory leak on sandbox removal. |
| virt/arcbox-vm/src/snapshot_cow.rs | Changes teardown to return () (best-effort, logs internally) and adds a checked_mul overflow guard on sector count — both clean improvements. |
| guest/arcbox-agent/src/init.rs | Narrows the dev-capable tmpfs mount from all of /var to just /var/lib/arcbox/jailer, reducing the attack surface while preserving jailer mknod functionality. |
Sequence Diagram
sequenceDiagram
participant Host as Host (sandbox.rs)
participant DM as CowManager
participant FC as Firecracker
participant VM as Guest VM
Note over Host,VM: Boot (direct mode)
Host->>DM: setup(id, rootfs)
DM-->>Host: CowHandle { dm_device: /dev/mapper/arcbox-snap-{id} }
Host->>Host: symlink(vm_dir/rootfs.link to dm_device)
Host->>FC: drive path = vm_dir/rootfs.link
FC->>VM: boot with CoW rootfs
Note over Host,VM: Checkpoint
FC-->>Host: vmstate (records vm_dir/rootfs.link)
Note over Host,VM: Remove original sandbox
Host->>FC: SIGKILL
Host->>DM: teardown(handle)
Host->>Host: remove_dir_all(sandboxes/{id})
Note over Host,VM: Restore (direct mode)
Host->>DM: setup(new_id, rootfs)
DM-->>Host: CowHandle { dm_device: /dev/mapper/arcbox-snap-{new_id} }
Host->>Host: create_dir_all(sandboxes/{original_id})
Host->>Host: symlink(sandboxes/{original_id}/rootfs.link to new dm_device)
Host->>FC: load vmstate (opens rootfs.link to new dm device)
FC->>VM: resume with new CoW rootfs
Note over Host,VM: Remove restored sandbox
Host->>FC: SIGKILL
Host->>DM: teardown(handle)
Host->>Host: remove_dir_all(sandboxes/{new_id})
Note over Host: sandboxes/{original_id}/ NOT removed (dir leak)
Reviews (1): Last reviewed commit: "fix(vm): address PR 208 review feedback" | Re-trigger Greptile
| let cow = if let Some(r) = snap_meta.rootfs_path.as_deref() { | ||
| match self.cow_manager.setup(&new_id, r).await { | ||
| Ok(handle) => { | ||
| let original_vm_dir = PathBuf::from(&fc_cfg.data_dir) | ||
| .join("sandboxes") | ||
| .join(&snap_meta.vm_id); | ||
| let link_path = original_vm_dir.join("rootfs.link"); | ||
| let _ = std::fs::remove_file(&link_path); | ||
| std::os::unix::fs::symlink(&handle.dm_device, &link_path) | ||
| .map_err(VmmError::Io)?; | ||
| Some(handle) | ||
| } |
There was a problem hiding this comment.
CowHandle leaked on symlink failure
If std::os::unix::fs::symlink() fails (e.g. permission denied, stale entry from an unexpected crash), the ? propagates the error and handle is silently dropped — without calling cow_manager.teardown(). This leaks the dm device (/dev/mapper/arcbox-snap-{id}), its backing loop device, and the sparse COW file.
The fix is to call teardown before returning the error:
Ok(handle) => {
let original_vm_dir = PathBuf::from(&fc_cfg.data_dir)
.join("sandboxes")
.join(&snap_meta.vm_id);
let link_path = original_vm_dir.join("rootfs.link");
let _ = std::fs::remove_file(&link_path);
if let Err(e) = std::os::unix::fs::symlink(&handle.dm_device, &link_path) {
self.cow_manager.teardown(&handle).await;
return Err(VmmError::Io(e));
}
Some(handle)
}| } else { | ||
| (vmstate_str, mem_file) | ||
| // Direct mode: setup dm-snapshot for the restored sandbox. | ||
| // The vmstate references {original_vm_dir}/rootfs.link — create | ||
| // a new dm-snapshot and point that symlink to the new device. | ||
| let cow = if let Some(r) = snap_meta.rootfs_path.as_deref() { | ||
| match self.cow_manager.setup(&new_id, r).await { | ||
| Ok(handle) => { | ||
| let original_vm_dir = PathBuf::from(&fc_cfg.data_dir) | ||
| .join("sandboxes") | ||
| .join(&snap_meta.vm_id); | ||
| let link_path = original_vm_dir.join("rootfs.link"); | ||
| let _ = std::fs::remove_file(&link_path); | ||
| std::os::unix::fs::symlink(&handle.dm_device, &link_path) | ||
| .map_err(VmmError::Io)?; | ||
| Some(handle) | ||
| } | ||
| Err(e) => { | ||
| debug!( | ||
| sandbox_id = %new_id, | ||
| error = %e, | ||
| "dm-snapshot unavailable for restore" | ||
| ); | ||
| None | ||
| } | ||
| } | ||
| } else { | ||
| None | ||
| }; | ||
|
|
||
| (vmstate_str, mem_file, cow) |
There was a problem hiding this comment.
original_vm_dir is recreated during restore but never removed
At line 1022-1026 (earlier in the function), create_dir_all(&original_vm_dir) ensures .../sandboxes/{snap_meta.vm_id}/ exists so the symlink can be written into it. However, remove_sandbox_impl only removes .../sandboxes/{new_id}:
// remove_sandbox_impl (line ~1753):
let vm_dir = PathBuf::from(&config.firecracker.data_dir)
.join("sandboxes")
.join(id); // id == new_id, NOT snap_meta.vm_id
tokio::fs::remove_dir_all(&vm_dir).await ...On every "create → checkpoint → remove → restore → remove" cycle:
original_vm_diris recreated during restore- it is never deleted when the restored sandbox is torn down
- it accumulates a dangling
rootfs.linksymlink after dm teardown
Consider also unlinking original_vm_dir (or at minimum rootfs.link) from inside remove_sandbox_impl using the instance's snap_meta.vm_id, or storing the link path on SandboxInstance for explicit cleanup.
| let (rootfs, cow) = match cow_manager.setup(id, &spec.rootfs).await { | ||
| Ok(handle) => { | ||
| let path = handle.dm_device.clone(); | ||
| (path, Some(handle)) | ||
| let link_path = vm_dir.join("rootfs.link"); | ||
| std::os::unix::fs::symlink(&handle.dm_device, &link_path) | ||
| .map_err(VmmError::Io)?; | ||
| (link_path.to_str().unwrap().to_owned(), Some(handle)) |
There was a problem hiding this comment.
Symlink not removed on boot failure
When builder.start() fails (a few lines below at line ~1668), the cleanup block calls cow_manager.teardown(handle) which removes the dm device, leaving vm_dir/rootfs.link as a dangling symlink. The vm_dir is not cleaned up on boot failure either, so the stale symlink persists until the next attempt.
This mirrors the situation in restore; for consistency, consider deleting link_path in the boot-failure cleanup block alongside the teardown call.
Summary
{vm_dir}/rootfs.link) instead of the ephemeral/dev/mapper/arcbox-snap-{id}device path. On restore, a new dm-snapshot is created and the symlink is retargeted — FC reopens the same path transparently.restore_sandbox()for both jailer and direct modes, eliminating full rootfs copies on restore.cow_handleon restoredSandboxInstancesoremove_sandbox_impl()properly tears down the dm-snapshot.stage_files_for_jailer()— all call sites now use the splitstage_kernel_for_jailer()+stage_rootfs_*_for_jailer().How it works
Boot (direct mode):
Restore (direct mode):
Restore (jailer mode):
Test plan
readystatedm-snapshot createdlog for both original and restored sandboxdm-snapshot teardown completeon remove of bothcargo clippyandcargo fmtpass with zero warnings