[python-package] standardize variable naming for evaluation results #7163
Changes from all commits: fd01f0b, 6ff2762, c8244ac, db9dfd9, f3c939b
```diff
@@ -4345,7 +4345,7 @@ def eval(
         feval : callable, list of callable, or None, optional (default=None)
             Customized evaluation function.
             Each evaluation function should accept two parameters: preds, eval_data,
-            and return (eval_name, eval_result, is_higher_better) or list of such tuples.
+            and return (metric_name, metric_value, maximize) or list of such tuples.

                 preds : numpy 1-D array or numpy 2-D array (for multi-class task)
                     The predicted values.
```
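For context, a custom ``feval`` satisfying this documented contract might look like the sketch below. The metric and its name are illustrative, not part of this PR:

```python
import numpy as np

def adjusted_mse(preds, eval_data):
    # Toy custom metric following the (metric_name, metric_value, maximize) contract.
    y_true = eval_data.get_label()
    metric_value = float(np.mean((y_true - preds) ** 2))
    # Lower MSE is better, so maximize is False.
    return "custom_adjusted_mse", metric_value, False
```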
```diff
@@ -4354,17 +4354,17 @@ def eval(
                     e.g. they are raw margin instead of probability of positive class for binary task in this case.
                 eval_data : Dataset
                     A ``Dataset`` to evaluate.
-                eval_name : str
-                    The name of evaluation function (without whitespace).
-                eval_result : float
-                    The eval result.
-                is_higher_better : bool
-                    Is eval result higher better, e.g. AUC is ``is_higher_better``.
+                metric_name : str
+                    Unique identifier for the metric (e.g. "custom_adjusted_mse").
+                metric_value : float
+                    Value of the evaluation metric.
```
Member (Author): "eval result" is sometimes used in this code base to refer to the entire tuple, sometimes to the list of tuples, and here to just the score. Hoping this wording clarifies that this element of the tuple should be a single floating-point value holding the value of the metric.
```diff
+                maximize : bool
```
Member (Author): I wanted a clearer name than ``is_higher_better``.
Member (Author): If we don't like ``maximize``…
```diff
+                    Are higher values better? e.g. ``True`` for AUC and ``False`` for binary error.

         Returns
         -------
         result : list
-            List with (dataset_name, eval_name, eval_result, is_higher_better) tuples.
+            List with (dataset_name, metric_name, metric_value, maximize) tuples.
         """
         if not isinstance(data, Dataset):
             raise TypeError("Can only eval for Dataset instance")
```
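Putting the pieces together, here is a hedged end-to-end sketch of how the renamed tuple elements surface from ``eval``. The tiny regression setup is illustrative, not from the diff, and reuses ``adjusted_mse`` from the sketch above:

```python
import lightgbm as lgb
import numpy as np

# Illustrative setup: a tiny regression model.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 5)), rng.normal(size=200)
train_data = lgb.Dataset(X[:150], label=y[:150])
valid_data = lgb.Dataset(X[150:], label=y[150:], reference=train_data)
booster = lgb.train(
    {"objective": "regression", "verbosity": -1},
    train_data,
    num_boost_round=5,
    valid_sets=[valid_data],
    valid_names=["holdout"],
)

# eval() returns a list of (dataset_name, metric_name, metric_value, maximize) tuples.
for dataset_name, metric_name, metric_value, maximize in booster.eval(valid_data, "holdout", feval=adjusted_mse):
    print(f"{dataset_name} {metric_name}: {metric_value:.4f} (maximize={maximize})")
```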
```diff
@@ -4394,7 +4394,7 @@ def eval_train(
         feval : callable, list of callable, or None, optional (default=None)
             Customized evaluation function.
             Each evaluation function should accept two parameters: preds, eval_data,
-            and return (eval_name, eval_result, is_higher_better) or list of such tuples.
+            and return (metric_name, metric_value, maximize) or list of such tuples.

                 preds : numpy 1-D array or numpy 2-D array (for multi-class task)
                     The predicted values.
```
```diff
@@ -4403,17 +4403,17 @@ def eval_train(
                     e.g. they are raw margin instead of probability of positive class for binary task in this case.
                 eval_data : Dataset
                     The training dataset.
-                eval_name : str
-                    The name of evaluation function (without whitespace).
-                eval_result : float
-                    The eval result.
-                is_higher_better : bool
-                    Is eval result higher better, e.g. AUC is ``is_higher_better``.
+                metric_name : str
+                    Unique identifier for the metric (e.g. "custom_adjusted_mse").
+                metric_value : float
+                    Value of the evaluation metric.
+                maximize : bool
+                    Are higher values better? e.g. ``True`` for AUC and ``False`` for binary error.

         Returns
         -------
         result : list
-            List with (train_dataset_name, eval_name, eval_result, is_higher_better) tuples.
+            List with (train_dataset_name, metric_name, metric_value, maximize) tuples.
         """
         return self.__inner_eval(data_name=self._train_data_name, data_idx=0, feval=feval)
```
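Continuing the illustrative setup above, ``eval_train`` follows the same contract for the training data:

```python
# Re-evaluate the custom metric on the training data; each tuple leads with
# the training dataset's name (assumes booster and adjusted_mse from above).
for train_dataset_name, metric_name, metric_value, maximize in booster.eval_train(feval=adjusted_mse):
    print(train_dataset_name, metric_name, metric_value, maximize)
```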
```diff
@@ -4428,7 +4428,7 @@ def eval_valid(
         feval : callable, list of callable, or None, optional (default=None)
             Customized evaluation function.
             Each evaluation function should accept two parameters: preds, eval_data,
-            and return (eval_name, eval_result, is_higher_better) or list of such tuples.
+            and return (metric_name, metric_value, maximize) or list of such tuples.

                 preds : numpy 1-D array or numpy 2-D array (for multi-class task)
                     The predicted values.
```
```diff
@@ -4437,17 +4437,17 @@ def eval_valid(
                     e.g. they are raw margin instead of probability of positive class for binary task in this case.
                 eval_data : Dataset
                     The validation dataset.
-                eval_name : str
-                    The name of evaluation function (without whitespace).
-                eval_result : float
-                    The eval result.
-                is_higher_better : bool
-                    Is eval result higher better, e.g. AUC is ``is_higher_better``.
+                metric_name : str
+                    Unique identifier for the metric (e.g. "custom_adjusted_mse").
+                metric_value : float
+                    Value of the evaluation metric.
+                maximize : bool
+                    Are higher values better? e.g. ``True`` for AUC and ``False`` for binary error.

         Returns
         -------
         result : list
-            List with (validation_dataset_name, eval_name, eval_result, is_higher_better) tuples.
+            List with (validation_dataset_name, metric_name, metric_value, maximize) tuples.
         """
         return [
             item
```
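And ``eval_valid`` reports the same tuples for every attached validation set ("holdout" in the sketch above):

```python
# Each tuple leads with the validation dataset's registered name
# (assumes booster and adjusted_mse from the sketches above).
for validation_dataset_name, metric_name, metric_value, maximize in booster.eval_valid(feval=adjusted_mse):
    print(validation_dataset_name, metric_name, metric_value, maximize)
```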
|
@@ -5215,11 +5215,11 @@ def __inner_eval( | |
| continue | ||
| feval_ret = eval_function(self.__inner_predict(data_idx=data_idx), cur_data) | ||
| if isinstance(feval_ret, list): | ||
| for eval_name, val, is_higher_better in feval_ret: | ||
| ret.append((data_name, eval_name, val, is_higher_better)) | ||
| for metric_name, metric_value, maximize in feval_ret: | ||
| ret.append((data_name, metric_name, metric_value, maximize)) | ||
| else: | ||
| eval_name, val, is_higher_better = feval_ret | ||
| ret.append((data_name, eval_name, val, is_higher_better)) | ||
| metric_name, metric_value, maximize = feval_ret | ||
| ret.append((data_name, metric_name, metric_value, maximize)) | ||
| return ret | ||
|
|
||
| def __inner_predict(self, *, data_idx: int) -> np.ndarray: | ||
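The ``isinstance(feval_ret, list)`` branch above means one ``feval`` may report several metrics at once. A hypothetical multi-metric function could look like this:

```python
import numpy as np

def multi_metric(preds, eval_data):
    # Hypothetical feval returning a list of (metric_name, metric_value, maximize)
    # tuples; __inner_eval appends one result row per tuple.
    errors = eval_data.get_label() - preds
    return [
        ("custom_mae", float(np.mean(np.abs(errors))), False),
        ("custom_mse", float(np.mean(errors ** 2)), False),
    ]
```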
Member (Author): I think "name of the evaluation function" is a little confusing. It was meant to refer to the mathematical function (like "RMSE"), but it could also be misunderstood as the name of a Python function object that implements a metric calculation. I think this new wording makes it clearer that this value has nothing to do with the name of any Python object.