feat(vm): symlink indirection for dm-snapshot checkpoint/restore#209

Open
lucas77778 wants to merge 3 commits into feat/dm-snapshot-jailer from feat/dm-snapshot-restore

@lucas77778
Member

Summary

  • Add symlink indirection layer between Firecracker and dm-snapshot devices, inspired by E2B's architecture. FC's vmstate records a stable symlink path ({vm_dir}/rootfs.link) instead of the ephemeral /dev/mapper/arcbox-snap-{id} device path. On restore, a new dm-snapshot is created and the symlink is retargeted — FC reopens the same path transparently.
  • Wire dm-snapshot CoW into restore_sandbox() for both jailer and direct modes, eliminating full rootfs copies on restore.
  • Store cow_handle on restored SandboxInstance so remove_sandbox_impl() properly tears down the dm-snapshot.
  • Remove unused stage_files_for_jailer() — all call sites now use the split stage_kernel_for_jailer() + stage_rootfs_*_for_jailer().

How it works

Boot (direct mode):

cow_manager.setup() → /dev/mapper/arcbox-snap-{id}
symlink: {vm_dir}/rootfs.link → /dev/mapper/arcbox-snap-{id}
FC receives: {vm_dir}/rootfs.link (stable path, saved in vmstate)

Restore (direct mode):

cow_manager.setup() → /dev/mapper/arcbox-snap-{new_id}
symlink: {original_vm_dir}/rootfs.link → /dev/mapper/arcbox-snap-{new_id}
FC loads vmstate → opens {original_vm_dir}/rootfs.link → new device
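
The retargeting step above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: the helper name `retarget_symlink` and the temporary file name are hypothetical, and where the PR removes the old link and recreates it, this sketch swaps in a create-then-rename so the stable path never disappears during the switch.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

/// Point `link` at `target`, replacing whatever entry is currently at `link`.
///
/// The new symlink is created under a temporary name and renamed over `link`,
/// so a reader of the stable path never observes it missing.
/// (Hypothetical helper; the PR uses remove_file + symlink instead.)
pub fn retarget_symlink(target: &Path, link: &Path) -> io::Result<()> {
    // Hypothetical temp name next to the link, e.g. rootfs.link.tmp.
    let tmp = link.with_extension("link.tmp");

    // Clear a leftover temp entry from an interrupted earlier attempt.
    match fs::remove_file(&tmp) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }

    // The target does not need to exist; a symlink may dangle.
    symlink(target, &tmp)?;

    // rename(2) replaces the destination atomically within one filesystem.
    fs::rename(&tmp, link)
}
```

Calling it once at boot with `/dev/mapper/arcbox-snap-{id}` and again at restore with `/dev/mapper/arcbox-snap-{new_id}` leaves the path recorded in the vmstate unchanged while the device behind it changes.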

Restore (jailer mode):

cow_manager.setup() → /dev/mapper/arcbox-snap-{new_id}
mknod: {chroot}/rootfs.ext4 → (major, minor) of new dm device
FC loads vmstate → opens /rootfs.ext4 (chroot-relative) → new device
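
For the jailer path, the mknod step needs the (major, minor) pair of the new dm device. The sketch below shows one way to read it on Linux using only the standard library. The function names are hypothetical, and the bit layout follows glibc's `gnu_dev_major` / `gnu_dev_minor`; the mknod call itself is omitted because it needs a crate such as `libc` or `nix` and root privileges.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::{FileTypeExt, MetadataExt};
use std::path::Path;

/// Split a Linux `dev_t` (the value of `st_rdev`) into (major, minor),
/// following the glibc `gnu_dev_major` / `gnu_dev_minor` bit layout.
pub fn split_dev(rdev: u64) -> (u32, u32) {
    let major = ((rdev >> 32) & 0xffff_f000) | ((rdev >> 8) & 0x0000_0fff);
    let minor = ((rdev >> 12) & 0xffff_ff00) | (rdev & 0x0000_00ff);
    (major as u32, minor as u32)
}

/// Inverse of `split_dev`, matching glibc's `gnu_dev_makedev`.
pub fn make_dev(major: u32, minor: u32) -> u64 {
    let (major, minor) = (u64::from(major), u64::from(minor));
    ((major & 0xffff_f000) << 32)
        | ((major & 0x0000_0fff) << 8)
        | ((minor & 0xffff_ff00) << 12)
        | (minor & 0x0000_00ff)
}

/// Return (major, minor) for the block device at `path`.
/// `fs::metadata` follows symlinks, so a /dev/mapper symlink resolves
/// to the underlying dm node.
pub fn block_device_numbers(path: &Path) -> io::Result<(u32, u32)> {
    let meta = fs::metadata(path)?;
    if !meta.file_type().is_block_device() {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "path is not a block device",
        ));
    }
    Ok(split_dev(meta.rdev()))
}
```

The pair returned by `block_device_numbers` is what a mknod call inside the chroot would be given, so the chroot-relative `/rootfs.ext4` node refers to the same device as the host-side mapper entry.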

Test plan

  • Create sandbox → checkpoint → remove → restore → verify ready state
  • Verified the dm-snapshot created log is emitted for both the original and the restored sandbox
  • Verified the dm-snapshot teardown complete log is emitted on removal of both
  • cargo clippy and cargo fmt pass with zero warnings
  • CI

Direct mode: do_boot() now creates a stable symlink
{vm_dir}/rootfs.link → /dev/mapper/arcbox-snap-{id} and passes the
symlink path to Firecracker. The vmstate records this stable path,
so on restore a new dm-snapshot can be created and the symlink
retargeted without FC knowing the underlying device changed.

Restore (both modes): restore_sandbox() now sets up dm-snapshot CoW
for the restored sandbox instead of doing a full rootfs copy.
- Jailer: stage_kernel_for_jailer() + cow_manager.setup() + mknod
- Direct: cow_manager.setup() + symlink at original vm_dir path

cow_handle is stored on the restored SandboxInstance so teardown
runs correctly on remove.

Remove unused stage_files_for_jailer() — all call sites now use the
split stage_kernel_for_jailer() + stage_rootfs_*_for_jailer().
@lucas77778 lucas77778 requested a review from AprilNEA April 8, 2026 06:21
- cow_size: use sectors.checked_mul(512) to guard against overflow
  instead of unchecked multiplication.
- CowManager::teardown(): change return type from Result<()> to ()
  since the method is best-effort with internal logging and never
  returns Err. Update all call sites to remove dead error handling.
- stage_rootfs_device_for_jailer: remove stale rootfs.ext4 before
  mknod to avoid EEXIST from a previous crash.
- do_boot: update comment — CowHandle is active in both direct and
  jailer modes, not direct only.
- init.rs: mount /var with nodev (safe default), then mount only
  /var/lib/arcbox/jailer without nodev for jailer block device nodes.
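
The overflow guard in the first item can be illustrated as follows. This is a sketch only: the function name `cow_size_bytes` and the error type `SizeOverflow` are hypothetical, and the real code in snapshot_cow.rs may report the failure differently.

```rust
/// Size in bytes of one device-mapper sector.
const SECTOR_SIZE: u64 = 512;

/// Hypothetical error type for a sector count too large to express in bytes.
#[derive(Debug, PartialEq, Eq)]
pub struct SizeOverflow {
    pub sectors: u64,
}

/// Convert a sector count to bytes, returning an error instead of wrapping
/// (release builds) or panicking (debug builds) when the product exceeds u64.
pub fn cow_size_bytes(sectors: u64) -> Result<u64, SizeOverflow> {
    sectors
        .checked_mul(SECTOR_SIZE)
        .ok_or(SizeOverflow { sectors })
}
```

With a plain `sectors * 512`, an oversized count would silently wrap in a release build and produce a CoW file far smaller than intended; `checked_mul` turns that case into an explicit error.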
@lucas77778 lucas77778 requested a review from PeronGH April 8, 2026 13:04
@AprilNEA
Member

AprilNEA commented Apr 9, 2026

@greptile review

@greptile-apps

greptile-apps bot commented Apr 9, 2026

Greptile Summary

This PR introduces a symlink indirection layer (rootfs.link) between Firecracker and dm-snapshot devices so that the vmstate can record a stable path that is retargeted on restore rather than embedding the ephemeral /dev/mapper/arcbox-snap-{id} device path. It also wires dm-snapshot CoW into restore_sandbox() for both jailer and direct modes, stores the cow_handle on restored SandboxInstance for proper teardown, removes the now-dead stage_files_for_jailer() helper, and makes teardown best-effort (returns (), logs errors internally). Two supporting fixes land alongside: an overflow guard on the sectors * 512 CoW file size calculation, and a narrower tmpfs+dev mount in the guest init (/var/lib/arcbox/jailer only instead of all of /var).

Issues found:

  • P1 — Resource leak on symlink failure (direct-mode restore): If std::os::unix::fs::symlink() fails inside the cow_manager.setup() success arm, the ? operator propagates the error and drops the CowHandle without calling cow_manager.teardown(), leaking the dm device, loop device, and sparse COW file.

  • P1 — original_vm_dir leaked after restored sandbox removal: During direct-mode restore, create_dir_all(&original_vm_dir) recreates .../sandboxes/{snap_meta.vm_id}/ so the symlink can be placed there. When the restored sandbox is subsequently removed, remove_sandbox_impl only deletes .../sandboxes/{new_id} — the original directory (and its dangling rootfs.link symlink) is never cleaned up. Each restore-and-remove cycle accumulates one orphaned directory.

  • P2 — Dangling rootfs.link on direct-mode boot failure: If builder.start() fails after the rootfs.link symlink is created in do_boot, the cleanup block tears down the dm device but leaves the symlink behind.

Confidence Score: 3/5

Happy path works per the test plan, but direct-mode restore has a resource leak on symlink failure and a systematic directory leak on every restore-and-remove cycle.

Two P1 logic bugs in the direct-mode restore path: a CowHandle resource leak when symlink() fails (leaks dm device + loop device + COW file), and an original_vm_dir that is recreated per restore but never cleaned up. The resource leak occurs only on an uncommon error path, but the directory leak is triggered by the primary test scenario and accumulates silently. Both are straightforward to fix before merging.

virt/arcbox-vm/src/sandbox.rs — specifically the direct-mode restore block (~lines 1120–1145) and the corresponding cleanup in remove_sandbox_impl.

Vulnerabilities

No security concerns identified. The narrowing of the dev-capable tmpfs mount in init.rs from all of /var to only /var/lib/arcbox/jailer is a positive security improvement — it limits the filesystem surface where device nodes can be created inside the guest VM.

Important Files Changed

| Filename | Overview |
| --- | --- |
| virt/arcbox-vm/src/sandbox.rs | Wires dm-snapshot CoW into restore for both jailer and direct modes; direct-mode path has a resource leak on symlink failure and a systematic directory leak on sandbox removal. |
| virt/arcbox-vm/src/snapshot_cow.rs | Changes teardown to return () (best-effort, logs internally) and adds a checked_mul overflow guard on sector count; both are clean improvements. |
| guest/arcbox-agent/src/init.rs | Narrows the dev-capable tmpfs mount from all of /var to just /var/lib/arcbox/jailer, reducing the attack surface while preserving jailer mknod functionality. |

Sequence Diagram

sequenceDiagram
    participant Host as Host (sandbox.rs)
    participant DM as CowManager
    participant FC as Firecracker
    participant VM as Guest VM

    Note over Host,VM: Boot (direct mode)
    Host->>DM: setup(id, rootfs)
    DM-->>Host: CowHandle { dm_device: /dev/mapper/arcbox-snap-{id} }
    Host->>Host: symlink(vm_dir/rootfs.link to dm_device)
    Host->>FC: drive path = vm_dir/rootfs.link
    FC->>VM: boot with CoW rootfs

    Note over Host,VM: Checkpoint
    FC-->>Host: vmstate (records vm_dir/rootfs.link)

    Note over Host,VM: Remove original sandbox
    Host->>FC: SIGKILL
    Host->>DM: teardown(handle)
    Host->>Host: remove_dir_all(sandboxes/{id})

    Note over Host,VM: Restore (direct mode)
    Host->>DM: setup(new_id, rootfs)
    DM-->>Host: CowHandle { dm_device: /dev/mapper/arcbox-snap-{new_id} }
    Host->>Host: create_dir_all(sandboxes/{original_id})
    Host->>Host: symlink(sandboxes/{original_id}/rootfs.link to new dm_device)
    Host->>FC: load vmstate (opens rootfs.link to new dm device)
    FC->>VM: resume with new CoW rootfs

    Note over Host,VM: Remove restored sandbox
    Host->>FC: SIGKILL
    Host->>DM: teardown(handle)
    Host->>Host: remove_dir_all(sandboxes/{new_id})
    Note over Host: sandboxes/{original_id}/ NOT removed (dir leak)

Reviews (1): Last reviewed commit: "fix(vm): address PR 208 review feedback"

Comment on lines +1120 to +1131
let cow = if let Some(r) = snap_meta.rootfs_path.as_deref() {
    match self.cow_manager.setup(&new_id, r).await {
        Ok(handle) => {
            let original_vm_dir = PathBuf::from(&fc_cfg.data_dir)
                .join("sandboxes")
                .join(&snap_meta.vm_id);
            let link_path = original_vm_dir.join("rootfs.link");
            let _ = std::fs::remove_file(&link_path);
            std::os::unix::fs::symlink(&handle.dm_device, &link_path)
                .map_err(VmmError::Io)?;
            Some(handle)
        }

P1 CowHandle leaked on symlink failure

If std::os::unix::fs::symlink() fails (e.g. permission denied, stale entry from an unexpected crash), the ? propagates the error and handle is silently dropped — without calling cow_manager.teardown(). This leaks the dm device (/dev/mapper/arcbox-snap-{id}), its backing loop device, and the sparse COW file.

The fix is to call teardown before returning the error:

Ok(handle) => {
    let original_vm_dir = PathBuf::from(&fc_cfg.data_dir)
        .join("sandboxes")
        .join(&snap_meta.vm_id);
    let link_path = original_vm_dir.join("rootfs.link");
    let _ = std::fs::remove_file(&link_path);
    if let Err(e) = std::os::unix::fs::symlink(&handle.dm_device, &link_path) {
        self.cow_manager.teardown(&handle).await;
        return Err(VmmError::Io(e));
    }
    Some(handle)
}
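
The same teardown-before-return pattern can be exercised in isolation. The sketch below is synchronous and uses stand-in types: `MockCowManager` only counts teardown calls, `CowHandle` carries just the device path, and `setup_with_link` is a hypothetical helper. None of these reproduce the real async API in sandbox.rs or snapshot_cow.rs.

```rust
use std::cell::Cell;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

/// Stand-in for the PR's `CowHandle`; only the device path is modelled.
pub struct CowHandle {
    pub dm_device: PathBuf,
}

/// Stand-in for `CowManager` that counts teardowns instead of touching
/// device-mapper. Purely illustrative.
#[derive(Default)]
pub struct MockCowManager {
    pub teardowns: Cell<u32>,
}

impl MockCowManager {
    pub fn setup(&self, id: &str) -> io::Result<CowHandle> {
        Ok(CowHandle {
            dm_device: PathBuf::from(format!("/dev/mapper/arcbox-snap-{id}")),
        })
    }

    /// Best-effort, returns (), mirroring the signature change in this PR.
    pub fn teardown(&self, _handle: &CowHandle) {
        self.teardowns.set(self.teardowns.get() + 1);
    }
}

/// Set up a snapshot and link it; release the snapshot if linking fails.
pub fn setup_with_link(
    mgr: &MockCowManager,
    id: &str,
    link_path: &Path,
) -> io::Result<CowHandle> {
    let handle = mgr.setup(id)?;
    // Remove any stale link; a missing file is not an error here.
    let _ = std::fs::remove_file(link_path);
    if let Err(e) = symlink(&handle.dm_device, link_path) {
        // Without this call the handle would be dropped and the device leaked.
        mgr.teardown(&handle);
        return Err(e);
    }
    Ok(handle)
}
```

Pointing `link_path` into a directory that does not exist forces the symlink call to fail, which makes it easy to check that exactly one teardown runs on the error path and none on success.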

Comment on lines 1116 to +1145
  } else {
-     (vmstate_str, mem_file)
+     // Direct mode: setup dm-snapshot for the restored sandbox.
+     // The vmstate references {original_vm_dir}/rootfs.link — create
+     // a new dm-snapshot and point that symlink to the new device.
+     let cow = if let Some(r) = snap_meta.rootfs_path.as_deref() {
+         match self.cow_manager.setup(&new_id, r).await {
+             Ok(handle) => {
+                 let original_vm_dir = PathBuf::from(&fc_cfg.data_dir)
+                     .join("sandboxes")
+                     .join(&snap_meta.vm_id);
+                 let link_path = original_vm_dir.join("rootfs.link");
+                 let _ = std::fs::remove_file(&link_path);
+                 std::os::unix::fs::symlink(&handle.dm_device, &link_path)
+                     .map_err(VmmError::Io)?;
+                 Some(handle)
+             }
+             Err(e) => {
+                 debug!(
+                     sandbox_id = %new_id,
+                     error = %e,
+                     "dm-snapshot unavailable for restore"
+                 );
+                 None
+             }
+         }
+     } else {
+         None
+     };
+
+     (vmstate_str, mem_file, cow)

P1 original_vm_dir is recreated during restore but never removed

At lines 1022-1026 (earlier in the function), create_dir_all(&original_vm_dir) ensures .../sandboxes/{snap_meta.vm_id}/ exists so the symlink can be written into it. However, remove_sandbox_impl only removes .../sandboxes/{new_id}:

// remove_sandbox_impl (line ~1753):
let vm_dir = PathBuf::from(&config.firecracker.data_dir)
    .join("sandboxes")
    .join(id);  // id == new_id, NOT snap_meta.vm_id
tokio::fs::remove_dir_all(&vm_dir).await ...

On every "create → checkpoint → remove → restore → remove" cycle:

  • original_vm_dir is recreated during restore
  • it is never deleted when the restored sandbox is torn down
  • it accumulates a dangling rootfs.link symlink after dm teardown

Consider also unlinking original_vm_dir (or at minimum rootfs.link) from inside remove_sandbox_impl using the instance's snap_meta.vm_id, or storing the link path on SandboxInstance for explicit cleanup.
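
One possible shape for the second option (storing the link path for explicit cleanup) is sketched below. `RestoredLink` is a hypothetical stand-in for the relevant field on SandboxInstance, and the policy of removing the parent directory only when it is empty is an assumption made here so that unrelated data in that directory is never deleted.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical slice of `SandboxInstance`: the symlink created at restore
/// time, remembered so that removal can clean it up.
pub struct RestoredLink {
    pub link_path: PathBuf,
}

/// Treat "already gone" as success so cleanup can be repeated safely.
fn ignore_not_found(result: io::Result<()>) -> io::Result<()> {
    match result {
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
        other => other,
    }
}

impl RestoredLink {
    /// Remove the symlink, then its parent directory if nothing else is in it.
    pub fn cleanup(&self) -> io::Result<()> {
        // remove_file unlinks the symlink itself, not the device it points to.
        ignore_not_found(fs::remove_file(&self.link_path))?;

        if let Some(dir) = self.link_path.parent() {
            let is_empty = match fs::read_dir(dir) {
                Ok(mut entries) => entries.next().is_none(),
                Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(()),
                Err(e) => return Err(e),
            };
            // Assumed policy: leave the directory alone if it holds other files.
            if is_empty {
                ignore_not_found(fs::remove_dir(dir))?;
            }
        }
        Ok(())
    }
}
```

Calling `cleanup` from the removal path after the dm teardown would clear both the dangling rootfs.link and the recreated sandboxes/{snap_meta.vm_id} directory in the common case where the link is its only entry.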

Comment on lines 1570 to +1575
  let (rootfs, cow) = match cow_manager.setup(id, &spec.rootfs).await {
      Ok(handle) => {
-         let path = handle.dm_device.clone();
-         (path, Some(handle))
+         let link_path = vm_dir.join("rootfs.link");
+         std::os::unix::fs::symlink(&handle.dm_device, &link_path)
+             .map_err(VmmError::Io)?;
+         (link_path.to_str().unwrap().to_owned(), Some(handle))

P2 Symlink not removed on boot failure

When builder.start() fails (a few lines below at line ~1668), the cleanup block calls cow_manager.teardown(handle) which removes the dm device, leaving vm_dir/rootfs.link as a dangling symlink. The vm_dir is not cleaned up on boot failure either, so the stale symlink persists until the next attempt.

This mirrors the situation in restore; for consistency, consider deleting link_path in the boot-failure cleanup block alongside the teardown call.
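
An alternative to an explicit delete in the cleanup block is a small drop guard, since removing a symlink is synchronous. This is only a sketch of that idea: `LinkGuard` is a hypothetical type that does not appear in the PR, and it covers the symlink alone, not the dm teardown (which is async and cannot run from `Drop` in the same way).

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

/// Removes the symlink when dropped, unless `disarm` was called first.
/// Hypothetical helper; not part of the PR.
pub struct LinkGuard {
    path: PathBuf,
    armed: bool,
}

impl LinkGuard {
    /// Create `link` pointing at `target` and start guarding it.
    pub fn create(target: &Path, link: &Path) -> io::Result<Self> {
        symlink(target, link)?;
        Ok(Self {
            path: link.to_path_buf(),
            armed: true,
        })
    }

    /// Path to hand to Firecracker as the drive path.
    pub fn path(&self) -> &Path {
        &self.path
    }

    /// Call once boot has succeeded; the link then outlives the guard.
    pub fn disarm(mut self) {
        self.armed = false;
    }
}

impl Drop for LinkGuard {
    fn drop(&mut self) {
        if self.armed {
            // Best-effort: a failure to unlink is ignored, like teardown.
            let _ = fs::remove_file(&self.path);
        }
    }
}
```

With this shape, any early return between creating the link and a successful builder.start() removes the link automatically, and the success path calls `disarm()` to keep it.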
