[Offload] Add asserts when removing and modifying files#666
[Offload] Add asserts when removing and modifying files#666
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughDiskCache filenames now use a rank-aware Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compressed_tensors/offload/cache/disk.py`:
- Around line 190-192: Update _is_created_file_path to accept the DiskCache
instance (e.g., def _is_created_file_path(file_path: str, disk_cache: DiskCache)
-> bool) and use the instance's prefix (disk_cache._new_file_prefix) rather than
the class constant so subclasses can override it; resolve both file_path and the
cache directory to real paths and verify the file is physically inside the disk
cache directory (use os.path.realpath + os.path.commonpath to compare against
disk_cache's directory property, e.g., disk_cache.cache_dir or
disk_cache.base_dir) and only then check that
os.path.basename(file_path).startswith(disk_cache._new_file_prefix).
- Around line 105-106: Replace the assert-based safety checks around
writing/reading cache files by enhancing _is_created_file_path to validate that
the resolved absolute path is inside offload_dir (not just basename prefix) and
then raise explicit exceptions (e.g., ValueError or PermissionError) instead of
using assert in the places that call _is_created_file_path and save_file; also
make the conditional usage of _is_created_file_path consistent by raising when
it fails (rather than silently skipping) so all checks for file creation use the
strengthened helper and explicit exceptions (update all call sites that
currently use assert or conditional checks with _is_created_file_path such as
where save_file is invoked and the other checks in the same class).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 42a0c560-a30c-482a-983b-2fd645a93de3
📒 Files selected for processing (1)
src/compressed_tensors/offload/cache/disk.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_offload/convert/test_from_accelerate.py`:
- Line 154: In DiskCache._is_ct_file_path, replace the incorrect reference to
cls._created_file_prefix with the correctly defined attribute
cls._ct_file_prefix so the method matches _get_ct_file_path and avoids the
AttributeError at runtime; update the check inside _is_ct_file_path to use
cls._ct_file_prefix and ensure any related comparisons use the same prefix
symbol.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 108746c9-cee5-47d1-a3e8-88da7dbbafd6
📒 Files selected for processing (2)
src/compressed_tensors/offload/cache/disk.pytests/test_offload/convert/test_from_accelerate.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/compressed_tensors/offload/cache/disk.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compressed_tensors/offload/cache/disk.py`:
- Around line 169-177: The helper constants and method decorators were changed
but not updated everywhere: update _is_ct_file_path to reference the correct
constant name (use cls._ct_file_prefix instead of cls._created_file_prefix) and
add the missing `@classmethod` decorator to _get_ct_file_path so it can be called
as cls._get_ct_file_path(...) (also ensure its signature accepts cls,
offload_dir, offloaded). Verify create_checkpoint_symlink (and callers at lines
noted) now call the corrected classmethod and no AttributeError/TypeError
remains.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1bb913a2-1937-461b-9d73-9336ca0765c8
📒 Files selected for processing (1)
src/compressed_tensors/offload/cache/disk.py
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/compressed_tensors/offload/cache/disk.py (2)
162-162:⚠️ Potential issue | 🔴 CriticalAdd
@classmethodback to_get_ct_file_path().Line 162 calls
cls._get_ct_file_path(...), but without the decorator Python does not bindclsthere, socreate_checkpoint_symlink()raisesTypeErrorbefore the symlink is created.offload()only appears fine because the instance call bindsselfinto the first parameter.🔧 Proposed fix
`@classmethod` def _is_ct_file_path(cls, file_path: str) -> bool: """Only write and delete files that DiskCache has created""" return os.path.basename(file_path).startswith(cls._ct_file_prefix) + `@classmethod` def _get_ct_file_path(cls, offload_dir: str, offloaded: torch.Tensor) -> str: """Create file path with a prefix marking it as modifiable""" file_name = f"{cls._ct_file_prefix}{_get_rank()}{id(offloaded)}.safetensors" return os.path.join(offload_dir, file_name)Also applies to: 175-178
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/compressed_tensors/offload/cache/disk.py` at line 162, The call sites like create_checkpoint_symlink use cls._get_ct_file_path(...) but _get_ct_file_path was changed to a plain function, causing an unbound first-argument TypeError; restore the method binding by re-adding the `@classmethod` decorator to _get_ct_file_path so it accepts cls as its first parameter (this will also fix the other calls in the same area such as the calls around lines 175-178), ensure the method signature expects cls and not self, and run tests for create_checkpoint_symlink and offload to verify behavior.
170-173:⚠️ Potential issue | 🟠 MajorPrefix-only ownership checks can still target checkpoint files.
_is_ct_file_path()only checksstartswith(cls._ct_file_prefix). That means a real checkpoint likect_disk_cache_model.safetensorswould still pass the guards on Lines 120, 141, and 145, so__delitem__()/update_offload()can still unlink or overwrite a non-cache-owned file. This helper needs to prove ownership, not just match a basename prefix.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/compressed_tensors/offload/cache/disk.py` around lines 170 - 173, _is_ct_file_path currently only checks startswith(cls._ct_file_prefix), which allows unrelated checkpoint files like "ct_disk_cache_model.safetensors" to be treated as cache-owned; update _is_ct_file_path in class DiskCache to prove ownership by validating the full expected filename pattern (not just prefix) and/or location: check that file_path is inside the cache directory and that os.path.basename matches the exact naming scheme the cache generates (e.g., prefix + separator + nonce/hex and expected suffix) or match a strict regex for generated names; this will ensure __delitem__ and update_offload only operate on true cache-created files and not arbitrary checkpoint files.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/compressed_tensors/offload/cache/disk.py`:
- Around line 175-178: The filename generation in _get_ct_file_path currently
concatenates _get_rank() and id(offloaded) without a separator causing
collisions; change the formatting used to build file_name (referencing
_ct_file_prefix, _get_rank(), and id(offloaded)) to include a clear separator or
labeled segments (e.g., "r{rank}_id{obj_id}" or similar) between rank and object
id so filenames are unambiguous across ranks in offload_dir and cannot collide.
---
Duplicate comments:
In `@src/compressed_tensors/offload/cache/disk.py`:
- Line 162: The call sites like create_checkpoint_symlink use
cls._get_ct_file_path(...) but _get_ct_file_path was changed to a plain
function, causing an unbound first-argument TypeError; restore the method
binding by re-adding the `@classmethod` decorator to _get_ct_file_path so it
accepts cls as its first parameter (this will also fix the other calls in the
same area such as the calls around lines 175-178), ensure the method signature
expects cls and not self, and run tests for create_checkpoint_symlink and
offload to verify behavior.
- Around line 170-173: _is_ct_file_path currently only checks
startswith(cls._ct_file_prefix), which allows unrelated checkpoint files like
"ct_disk_cache_model.safetensors" to be treated as cache-owned; update
_is_ct_file_path in class DiskCache to prove ownership by validating the full
expected filename pattern (not just prefix) and/or location: check that
file_path is inside the cache directory and that os.path.basename matches the
exact naming scheme the cache generates (e.g., prefix + separator + nonce/hex
and expected suffix) or match a strict regex for generated names; this will
ensure __delitem__ and update_offload only operate on true cache-created files
and not arbitrary checkpoint files.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b4028389-e9d3-4d2b-a7e7-809bc827ea44
📒 Files selected for processing (1)
src/compressed_tensors/offload/cache/disk.py
|
This pull request has merge conflicts that must be resolved before it can be |
3a2207b to
e23f7e0
Compare
|
This pull request has merge conflicts that must be resolved before it can be |
|
@CodeRabbit resolve merge conflict |
|
✅ Merge conflicts resolved successfully! Resolved 1 conflict file(s). Commit: 3 file operation(s)
View agent analysis |
|
The quality checks have failed. Please run |
|
This pull request has merge conflicts that must be resolved before it can be |
|
@CodeRabbit resolve merge conflict |
|
The quality checks have failed. Please run |
Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Kyle Sayers <[email protected]>
2d14d78 to
77ad959
Compare
Signed-off-by: Kyle Sayers <[email protected]>
Purpose
Summary by CodeRabbit
Refactor
Tests