add warnings for lm_head activation scale fallback #1728
Conversation
Signed-off-by: n1ck-guo <[email protected]>
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds user-facing warnings to clarify behavior when lm_head static activation quantization lacks calibration statistics/inputs, especially when activation scale falls back to a default.
Changes:
- Emit a warning during quantization when lm_head calibration inputs are missing under static activation quantization.
- Emit a warning during unwrap when lm_head static activation stats are missing and a scale fallback occurs or likely occurs.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| auto_round/wrapper.py | Adds a one-time warning during unwrap for lm_head when static activation stats are missing and a scale fallback is implied. |
| auto_round/compressors/base.py | Adds a one-time warning when lm_head activation calibration inputs are missing for static activation quantization. |
| ): | ||
| logger.warning_once( | ||
| "Static activation quantization for lm_head is not fully supported yet. " | ||
| "lm_head activation statistics are missing, so activation scale falls back to unit scale." |
The warning message states that activation scale “falls back to unit scale”, but this conditional only infers missing stats (act_max is None) and the presence of act_scale, without verifying that a unit-scale fallback actually happened. Consider softening the wording (e.g., “may fall back”) or tightening the condition to detect the actual fallback value/state so the warning is always accurate.
Suggested change:
-  "lm_head activation statistics are missing, so activation scale falls back to unit scale."
+  "lm_head activation statistics are missing, so activation scale may fall back to unit scale."
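The condition-tightening alternative the reviewer mentions could look like the minimal sketch below: only produce the warning when the unit-scale fallback is actually observed, not merely when stats are absent. The function name and the scalar-valued act_scale are hypothetical simplifications; the real code operates on tensors inside unwrapper_layer.

```python
# Hypothetical helper sketching the reviewer's suggestion: detect the actual
# unit-scale fallback state before warning, so the message is always accurate.
def lm_head_fallback_warning(act_max, act_scale):
    """Return the warning text if a unit-scale fallback is detected, else None."""
    if act_max is None and act_scale is not None and act_scale == 1.0:
        return (
            "Static activation quantization for lm_head is not fully supported yet. "
            "lm_head activation statistics are missing, so activation scale "
            "falls back to unit scale."
        )
    return None
```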
        if "lm_head" in layer_name:
            logger.warning_once(
                "Static activation quantization for lm_head is not fully supported yet. "
                "If lm_head calibration inputs are missing, activation scale may fall back to unit scale "
                "or quantization may be skipped."
            )
There are now two separate lm_head static-activation warnings (here and in unwrapper_layer), with similar but not identical text. Because the message strings differ, warning_once will still emit both, which can be noisy and confusing. Consider centralizing this into a shared helper/message constant (or aligning the exact string) and/or choosing a single place to warn (calibration-time vs unwrap-time) so users see one consistent warning.
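The centralization the reviewer suggests could be sketched as below: a single shared message constant plus a warn-once helper keyed on the message string, so the calibration-time and unwrap-time call sites deduplicate to one warning. The constant name and helper are assumptions for illustration, not part of auto_round's API.

```python
import logging

logger = logging.getLogger("auto_round")

# Hypothetical shared constant: both call sites would reference this exact
# string so deduplication-by-message emits it only once.
LM_HEAD_STATIC_ACT_WARNING = (
    "Static activation quantization for lm_head is not fully supported yet. "
    "If calibration inputs or activation statistics are missing, the activation "
    "scale may fall back to unit scale or quantization may be skipped."
)

_seen_warnings = set()

def warning_once(message):
    """Log `message` only the first time it is seen; return True if logged."""
    if message in _seen_warnings:
        return False
    _seen_warnings.add(message)
    logger.warning(message)
    return True
```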
/azp run Unit-Test-CUDA-AutoRound
Azure Pipelines successfully started running 1 pipeline(s).
Description
Please briefly describe your main changes, the motivation.
Type of Change
Related Issues
Fixes or relates to #
Checklist Before Submitting