
[python-package] standardize variable naming for evaluation results #7163

Merged
jameslamb merged 5 commits into master from python/eval-result-naming on Apr 1, 2026
Conversation

@jameslamb (Member)

Contributes to #6748

In the Python package, eval_name is used in two different ways. Sometimes it means "name of a Dataset" and sometimes "name of an evaluation metric".

This PR standardizes that naming everywhere, along with some other related variables and documentation:

  • eval_name (dataset) -> dataset_name
  • eval_name (metric) -> metric_name
  • eval_result -> metric_value
  • is_higher_better -> maximize

These renames only affect internal variables and docs, so this change shouldn't affect user code or the ability to load old model files with a newer lightgbm.
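To make the tuple contract concrete, here is a minimal sketch of a custom evaluation function using the standardized names (the metric itself, custom_adjusted_mse, is a made-up example, not anything in the library):

import numpy as np
import lightgbm as lgb

# Hypothetical custom metric illustrating the standardized names:
# feval callables return a (metric_name, metric_value, maximize) tuple.
def custom_adjusted_mse(preds: np.ndarray, eval_data: lgb.Dataset):
    y_true = eval_data.get_label()
    metric_value = float(np.mean((preds - y_true) ** 2))
    # MSE: lower is better, so the final element is False
    return "custom_adjusted_mse", metric_value, False

Such a function is passed through the standard feval parameter, e.g. lgb.train(params, train_set, valid_sets=[valid_set], feval=custom_adjusted_mse).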

Notes for Reviewers

Splitting this off from #7161, to make the diff there a bit smaller.

is_higher_better : bool
    Is eval result higher better, e.g. AUC is ``is_higher_better``.
metric_name : str
    Unique identifier for the metric (e.g. "custom_adjusted_mse").
@jameslamb (Member Author)

I think "name of the evaluation function" is a little confusing. It was meant to mean mathematical function (like "RMSE"), but I think it could also be misunderstood to refer to Python function objects that implement a metric calculation.

I think this new wording makes it clearer that this value has nothing to do with the name of any Python object.
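As a (hypothetical) illustration of that decoupling, the Python object's name and the metric_name string it returns are completely independent:

import numpy as np
import lightgbm as lgb

# Hypothetical example: the Python function is named my_eval_fn, but the
# metric_name it reports is "mean_abs_err" -- the two have no connection.
def my_eval_fn(preds: np.ndarray, eval_data: lgb.Dataset):
    metric_value = float(np.mean(np.abs(preds - eval_data.get_label())))
    return "mean_abs_err", metric_value, False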

metric_name : str
    Unique identifier for the metric (e.g. "custom_adjusted_mse").
metric_value : float
    Value of the evaluation metric.
@jameslamb (Member Author)

"eval result" is sometimes used in this code base to refer to the entire tuple, sometimes to the list of tuples, and here for just the score.

Hoping this wording clarifies that this element of the tuple should be a single floating-point value: the value of the metric.
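To spell out the three usages with a hypothetical sketch -- assuming the documented behavior that a feval callable may return either one tuple or a list of tuples -- metric_value is always the single float inside each tuple:

import numpy as np
import lightgbm as lgb

# Hypothetical multi-metric feval: returns a list of
# (metric_name, metric_value, maximize) tuples; each metric_value
# is one float score.
def multi_metric(preds: np.ndarray, eval_data: lgb.Dataset):
    residuals = preds - eval_data.get_label()
    mse = float(np.mean(residuals ** 2))
    mae = float(np.mean(np.abs(residuals)))
    return [("mse", mse, False), ("mae", mae, False)]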

    Unique identifier for the metric (e.g. "custom_adjusted_mse").
metric_value : float
    Value of the evaluation metric.
maximize : bool
@jameslamb (Member Author)

I wanted a clearer name than is_higher_better, and thought it'd make sense to copy the naming from xgboost here:

https://github.com/dmlc/xgboost/blob/4efdd7623222f6e3475218a390c41d507964d83c/python-package/xgboost/training.py#L84

@jameslamb (Member Author)

If we don't like maximize, my next proposal is higher_is_better. I think boolean names should be a statement, not a question.
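For what it's worth, here's a hypothetical helper (not lightgbm's actual code) showing how the name reads as a statement at a call site:

# Hypothetical helper, not lightgbm's implementation: "maximize" states
# what to do with the score, so the comparison reads naturally.
def improved(metric_value: float, best_so_far: float, maximize: bool) -> bool:
    if maximize:
        return metric_value > best_so_far
    return metric_value < best_so_far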

@jameslamb changed the title from "WIP: [python-package] standardize variable naming for evaluation results" to "[python-package] standardize variable naming for evaluation results" on Feb 23, 2026
@jameslamb marked this pull request as ready for review on February 23, 2026 at 02:57
@jameslamb (Member Author)

@borchero or @jmoralez could one of you take a look at this when you have time?

I'd love to try to finish #6748 in this release; I think it'll be really helpful in addressing other feature requests around early stopping and model evaluation. This will make the PR that does that (#7161, not quite done yet) easier to review.

@borchero (Collaborator) left a comment

Agree with all your choices @jameslamb, I like it much better than before :)

@jameslamb (Member Author)

Thanks so much @borchero !

I think (hope?) you'll REALLY like #7161 once it's ready. From my view, the complexity of the positional subsetting and tuple unpacking was making it harder to implement other forms of early stopping control (like fine-grained control of which datasets and metrics to use).

@jameslamb merged commit a7d00a9 into master on Apr 1, 2026
68 checks passed
@jameslamb deleted the python/eval-result-naming branch on April 1, 2026 at 16:38
