[python-package] fix misleading feature name warning on sklearn 1.6+ predict() by FranciscoRMendes · Pull Request #7232 · lightgbm-org/LightGBM

FranciscoRMendes · 2026-04-19T10:29:31Z

Summary

Starting with scikit-learn 1.6, LGBMClassifier/LGBMRegressor/etc. emit a spurious warning during predict() when fitted on data without feature names (e.g. a numpy array):

UserWarning: X does not have valid feature names, but LGBMClassifier was fitted with feature names

This happens because LightGBM auto-generates feature names (Column_0, Column_1, ...) for such inputs and previously exposed them via feature_names_in_, causing sklearn to believe the model was fitted with named features.
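The mechanism can be illustrated with a simplified stand-in for the comparison between fitted and input feature names. This is a sketch for illustration only, not scikit-learn's actual code: `check_feature_names` and `OldBehaviour` are hypothetical names, and the only assumption carried over from the description above is that the warning fires whenever the estimator exposes `feature_names_in_` but the input has no names.

```python
import warnings


def check_feature_names(estimator, x_feature_names):
    """Simplified, hypothetical stand-in for the fitted-vs-input name check.

    `x_feature_names` is None when the input (e.g. a numpy array) has no names.
    """
    fitted_names = getattr(estimator, "feature_names_in_", None)
    if x_feature_names is None and fitted_names is not None:
        warnings.warn(
            "X does not have valid feature names, but "
            f"{type(estimator).__name__} was fitted with feature names",
            UserWarning,
        )


class OldBehaviour:
    # Hypothetical estimator that always exposes auto-generated names,
    # even though it was fitted on unnamed data.
    feature_names_in_ = ["Column_0", "Column_1"]


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_feature_names(OldBehaviour(), None)  # unnamed input at predict time

print(len(caught))
```

Because the auto-generated names are indistinguishable from user-supplied ones at this point, the check has no way to know the model was fitted on unnamed data.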

Fix: feature_names_in_ now raises AttributeError when training data had no feature names, matching sklearn's own convention. Auto-generated names remain accessible via the LightGBM-specific feature_name_ property.
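A minimal sketch of the gating idea, assuming a toy estimator rather than the real `LGBMModel` code (which is not reproduced here). `ToyEstimator` and its `feature_names` argument are hypothetical; only the `_fitted_with_feature_names` flag and the two attribute names come from this PR.

```python
class ToyEstimator:
    """Hypothetical stand-in, not the actual LightGBM implementation."""

    def fit(self, X, feature_names=None):
        # Record whether the caller supplied real feature names.
        self._fitted_with_feature_names = feature_names is not None
        if feature_names is None:
            # Auto-generate names in the Column_<i> style.
            feature_names = [f"Column_{i}" for i in range(len(X[0]))]
        self._feature_names = list(feature_names)
        return self

    @property
    def feature_name_(self):
        # Always available, including auto-generated names.
        return self._feature_names

    @property
    def feature_names_in_(self):
        # Only exposed when the training data carried feature names.
        if not self._fitted_with_feature_names:
            raise AttributeError(
                "feature_names_in_ is not defined: training data had no feature names"
            )
        return self._feature_names


unnamed = ToyEstimator().fit([[1.0, 2.0]])
named = ToyEstimator().fit([[1.0, 2.0]], feature_names=["a", "b"])
```

Raising `AttributeError` from the property makes `hasattr(unnamed, "feature_names_in_")` evaluate to `False`, so code that probes for the attribute treats the model as fitted without names.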

Changes

python-package/lightgbm/sklearn.py: track _fitted_with_feature_names in fit(); gate feature_names_in_ on that flag
tests/python_package_test/test_sklearn.py: update test_getting_feature_names_in_np_input to assert feature_names_in_ is absent for numpy input; add regression test test_no_spurious_feature_name_warning_on_np_predict

Test plan

test_getting_feature_names_in_np_input — asserts feature_names_in_ is not set after numpy fit, feature_name_ still works
test_no_spurious_feature_name_warning_on_np_predict — asserts no warnings raised during predict() on numpy input
test_getting_feature_names_in_pd_input — unchanged; DataFrame input still exposes feature_names_in_
Full test_sklearn.py suite: 483 passed, 0 failures
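The body of the new regression test is not shown on this page. One common way to assert that a call emits no warnings is to promote warnings to errors for the duration of the call, sketched below with standard-library tools only. `call_without_warnings`, `quiet_predict`, and `noisy_predict` are hypothetical helpers, not names from the PR.

```python
import warnings


def call_without_warnings(func, *args, **kwargs):
    """Run func, turning any warning it emits into a raised exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        return func(*args, **kwargs)


def quiet_predict(rows):
    # Stand-in for a predict() that emits no warnings.
    return [sum(row) for row in rows]


def noisy_predict(rows):
    # Stand-in for a predict() that emits the spurious warning.
    warnings.warn("X does not have valid feature names", UserWarning)
    return [sum(row) for row in rows]


result = call_without_warnings(quiet_predict, [[1, 2]])

try:
    call_without_warnings(noisy_predict, [[1, 2]])
    noisy_raised = False
except UserWarning:
    noisy_raised = True
```

In a pytest suite the same effect could be had by wrapping the `predict()` call in the `catch_warnings()` block directly inside the test function.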

…predict()

When fitting on data without feature names (e.g. numpy arrays), LightGBM auto-generates names like Column_0, Column_1, etc. Previously these were exposed via feature_names_in_, causing sklearn 1.6+ to emit a spurious UserWarning during predict() ("X does not have valid feature names, but ... was fitted with feature names").

feature_names_in_ now raises AttributeError when the training data had no feature names, matching sklearn's own convention. Auto-generated names remain accessible via the LightGBM-specific feature_name_ property.

Fixes lightgbm-org#6798


FranciscoRMendes requested review from StrikerRUS, borchero, guolinke, jameslamb, jmoralez and shiyu1994 as code owners April 19, 2026 10:29

jameslamb added the "awaiting review" and "fix" labels Apr 20, 2026

FranciscoRMendes force-pushed the fix/issue-6798 branch from beaa0b2 to c4e1c39 April 20, 2026 22:40

FranciscoRMendes added 2 commits April 20, 2026 18:42

Merge branch 'master' into fix/issue-6798

f0c47c4

fix lint: move import warnings to module level, add type: ignore for pyarrow.compute stubs

e6e69bd

FranciscoRMendes wants to merge 3 commits into lightgbm-org:master from FranciscoRMendes:fix/issue-6798
