Feats: Quantize/save/evaluate the Wan-AI/WAN2.2 models in w4a16 format #1678
lvliang-intel wants to merge 27 commits into main
Signed-off-by: lvliang-intel <[email protected]>
for more information, see https://pre-commit.ci
Pull request overview
Adds support to quantize/save/evaluate Wan-AI/Wan2.2 diffusion models in W4A16 by improving diffusion pipeline loading, calibration, and multi-device handling within AutoRound’s diffusion compressor.
Changes:
- Add a fallback diffusion pipeline loader when `AutoPipelineForText2Image` cannot resolve a linked pipeline.
- Extend `DiffusionCompressor` to better handle Wan-specific block I/O, multi-device dispatch before caching, and calibration inputs (including the required `image`).
- Make `tie_weights()` calls conditional to support models that don't implement it; document Wan2.2 models in the diffusion README.
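
The conditional `tie_weights()` call described above can be illustrated with a minimal sketch. The helper name `tie_weights_if_supported` and the two toy classes are hypothetical and are not taken from this PR; the sketch only shows the general `getattr` plus `callable()` guard pattern.

```python
def tie_weights_if_supported(model):
    """Call model.tie_weights() only if the model defines a callable one.

    Returns True when the method was called, False when it was skipped.
    """
    tie = getattr(model, "tie_weights", None)
    if callable(tie):
        tie()
        return True
    return False


class ModelWithTie:
    """Toy stand-in for a model that implements tie_weights()."""

    def __init__(self):
        self.tied = False

    def tie_weights(self):
        self.tied = True


class ModelWithoutTie:
    """Toy stand-in for a model with no tie_weights() method."""


with_tie = ModelWithTie()
called = tie_weights_if_supported(with_tie)            # method exists, so it runs
skipped = tie_weights_if_supported(ModelWithoutTie())  # no AttributeError raised
```

The guard avoids an `AttributeError` on models that lack the method while leaving behavior unchanged for models that have it.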
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| auto_round/utils/model.py | Fallback from AutoPipeline to DiffusionPipeline for unsupported/unknown pipeline links. |
| auto_round/utils/device.py | Guard tie_weights() in block-wise dispatch to avoid attribute errors. |
| auto_round/compressors/diffusion/compressor.py | Add Wan block output config, calibration image handling, multi-device dispatch before caching, and config saving tweak. |
| auto_round/compressors/diffusion/README.md | Document Wan2.2 models and calibration dataset. |
| auto_round/compressors/base.py | Skip update_module() for diffusion; guard tie_weights(); adjust multi-device auto-offload logic for diffusion. |
| auto_round/auto_scheme/utils.py | Guard tie_weights() in device dispatch utility. |

    pipe_kwargs["image"] = self._get_calibration_image(len(prompts))
    try:
        self.pipe(
            prompt=prompts,
            guidance_scale=self.guidance_scale,
            num_inference_steps=self.num_inference_steps,
            generator=(
                None
                if self.generator_seed is None
                else torch.Generator(device=self.pipe.device).manual_seed(self.generator_seed)
            ),
        )
        self.pipe(**pipe_kwargs)
    except NotImplementedError:
        pass
    except Exception as error:

    val.save_pretrained(sub_module_path)
    self.pipe.config.save_pretrained(output_dir)
    if hasattr(self.pipe, "save_config"):
        self.pipe.save_config(output_dir)

    if "AutoPipeline can't find a pipeline linked" not in str(exc):
        raise
    pipe = pipelines.pipeline_utils.DiffusionPipeline.from_pretrained(
        pretrained_model_name_or_path, torch_dtype=torch_dtype
    )
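
The last snippet re-raises any error that is not the "can't find a pipeline linked" message and otherwise falls back to a generic loader. A rough, dependency-free sketch of that control flow follows. The function `load_with_fallback` and the two fake loaders are hypothetical, and catching `ValueError` is an assumption, since the snippet above does not show the `except` clause.

```python
FALLBACK_MARKER = "AutoPipeline can't find a pipeline linked"


def load_with_fallback(primary_loader, fallback_loader, model_id, **kwargs):
    """Try the primary loader; fall back only for the 'no linked pipeline' error."""
    try:
        return primary_loader(model_id, **kwargs)
    except ValueError as exc:  # assumption: the primary loader raises ValueError
        if FALLBACK_MARKER not in str(exc):
            raise  # unrelated failure: propagate unchanged
        return fallback_loader(model_id, **kwargs)


# Hypothetical stand-ins for the real loaders, used only to exercise the flow.
def fake_auto_loader(model_id, **kwargs):
    raise ValueError(f"{FALLBACK_MARKER} to SomePipeline for {model_id}")


def fake_generic_loader(model_id, **kwargs):
    return {"model_id": model_id, "loader": "generic", **kwargs}


def fake_broken_loader(model_id, **kwargs):
    raise ValueError("checkpoint is corrupted")


pipe = load_with_fallback(fake_auto_loader, fake_generic_loader, "some/model", torch_dtype="bf16")
```

In real use the two callables would presumably be `AutoPipelineForText2Image.from_pretrained` and `DiffusionPipeline.from_pretrained`. Matching on the message text keeps unrelated load failures visible instead of silently retrying them.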
Co-authored-by: Copilot Autofix powered by AI <[email protected]>
/azp run Unit-Test-CUDA-AutoRound
Azure Pipelines successfully started running 1 pipeline(s).
@copilot resolve the merge conflicts in this pull request
- Merge f81e20d (Enable NextStepDiffusion and support multi-device tuning for diffusion)
- Resolve import conflicts: use dispatch_model_by_all_available_devices
- Resolve base.py: use getattr(self, 'is_diffusion', False) and callable() check
- Resolve README: combine FLUX, WAN2.2, Z-Image, NextStep model tables
- Resolve compressor.py: use the _run_pipeline/_align_device_and_dtype approach from main; keep the WAN2.2-specific output_configs, _uses_single_hidden_state_input, _requires_calibration_image, and _get_calibration_image methods
- Fix inspect._empty -> inspect.Parameter.empty
- Fix 'raise error' -> 'raise' to preserve traceback

Co-authored-by: lvliang-intel <[email protected]>
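
The `inspect.Parameter.empty` fix listed above uses the public sentinel for "no default value". As a hedged illustration, here is one way a check for a required `image` argument could be written. The function `required_parameters` and both fake call signatures are hypothetical, and the actual `_requires_calibration_image` implementation is not shown in this conversation.

```python
import inspect


def required_parameters(func):
    """Return the names of parameters that have no default value."""
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # *args / **kwargs never need to be supplied explicitly.
        if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
            continue
        # inspect.Parameter.empty is the public sentinel for "no default".
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return required


# Hypothetical pipeline call signatures, for illustration only.
def fake_i2v_call(image, prompt=None, num_inference_steps=50, **kwargs):
    return None


def fake_t2v_call(prompt=None, num_inference_steps=50, **kwargs):
    return None


i2v_required = required_parameters(fake_i2v_call)
t2v_required = required_parameters(fake_t2v_call)
```

A caller could then add a calibration image to the pipeline keyword arguments only when `"image"` appears in the required list.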
Agent-Logs-Url: https://github.com/intel/auto-round/sessions/993bffda-50a4-414a-bbef-3754099c7c21 Co-authored-by: lvliang-intel <[email protected]>
Resolved merge conflicts with Key resolutions:
into lvl/support_wan2.2 Signed-off-by: lvliang-intel <[email protected]>
Azure Pipelines: Successfully started running 6 pipeline(s). 1 pipeline(s) require an authorized user to comment /azp run to run.
/azp run Unit-Test-CUDA-AutoRound
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.
…upport_wan2.2 Signed-off-by: lvliang-intel <[email protected]>
Description
Quantize, save, and evaluate the Wan-AI/Wan2.2 models in W4A16 format.
Models:
https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B-Diffusers
https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B-Diffusers
https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B-Diffusers
Target dtypes: w4a16
Quantized Models:
https://huggingface.co/Intel/Wan2.2-T2V-A14B-Diffusers-int4-AutoRound
https://huggingface.co/Intel/Wan2.2-I2V-A14B-Diffusers-int4-AutoRound
https://huggingface.co/Intel/Wan2.2-TI2V-5B-Diffusers-int4-AutoRound
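
For readers unfamiliar with the term, W4A16 means weights stored in 4 bits while activations stay in 16-bit floating point. The sketch below shows only the weight side, using plain symmetric round-to-nearest on a single group. This is a simplification chosen for illustration: AutoRound tunes the rounding rather than using round-to-nearest, and the function names and sample values here are hypothetical.

```python
def quantize_group_int4(weights):
    """Symmetric round-to-nearest quantization of one weight group to signed 4-bit codes."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude to code 7; fall back to 1.0 for an all-zero group.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    # Signed 4-bit range is [-8, 7]; clamp defensively.
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_group(codes, scale):
    """Recover approximate weights; at inference these would be cast to 16-bit floats."""
    return [c * scale for c in codes]


group = [-0.82, -0.11, 0.0, 0.36, 0.70]  # made-up weights for one group
codes, scale = quantize_group_int4(group)
restored = dequantize_group(codes, scale)
max_error = max(abs(w - r) for w, r in zip(group, restored))
```

With this scheme the reconstruction error per weight is bounded by about half of `scale`, which is the gap that tuned rounding methods try to use more effectively.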
Related Issues
#1672