[c++] add path_smooth_hessian parameter for hessian-based path smoothing#7242
Open
amqadmiakur8 wants to merge 1 commit into lightgbm-org:master
cf17946 to 35c4808
jameslamb (Member) reviewed on Apr 23, 2026 and left a comment:
Thanks for your interest in LightGBM. I personally don't understand this submission, but I hope that maybe @shiyu1994 or @guolinke will be able to comment when they have time.
If they think this is a useful addition to LightGBM, I'd be happy to help with the tactical parts of getting it merge-ready.
Summary
Adds a new `path_smooth_hessian` parameter, a hessian-based variant of `path_smooth` (implemented in #2950) that is more appropriate when samples have different weights.
Motivation
The existing `path_smooth` uses sample counts as the smoothing weight. This works well for unweighted data, but when samples have different weights (e.g. via the `weight` column), sample count does not reflect the actual importance of the data in each leaf. A leaf with 10 high-weight samples and a leaf with 10 low-weight samples get the same smoothing, even though they carry very different amounts of information.
`path_smooth_hessian` uses the sum of Hessians instead, which naturally incorporates sample weights (since `h_i = w_i * h_i_unweighted`). This makes the smoothing weight-aware: leaves with more weighted evidence are trusted more, leaves with less are pulled harder toward the parent.
Theory
The current `path_smooth` looks like an implementation of Bühlmann's credibility, which assumes all rows have the same exposure/weight.
An extension of it is the Bühlmann-Straub credibility, which uses weights rather than raw counts when observations have different weights. However, we implement it using the Hessian instead, to stay consistent with the existing `min_data_in_leaf` vs `min_sum_hessian_in_leaf` distinction, and because using the weights directly would require more changes to the code (weights aren't available when smoothing is applied).
Smoothing formula
Same structure as `path_smooth`, with `h` (sum of hessians) replacing `n` (sample count):
`w_L = w*_L * (h / α) / (h / α + 1) + w_parent / (h / α + 1)`
where `α` = `path_smooth_hessian`, `h` = sum of hessians in the leaf, `w*_L` = unsmoothed leaf output, `w_parent` = parent's smoothed output.
Note: the current implementation of `path_smooth` actually uses a hessian-based approximation of `n_samples` (via `RoundInt(bin_hessian * num_data / leaf_sum_hessian)`), not the true sample count. This means `path_smooth` already sits somewhere between its stated definition and `path_smooth_hessian`.
Results
I tested this change on some of my datasets with Poisson, Gamma, and logistic losses, and it seems to perform either better than or as well as `path_smooth`, depending on the dataset.
Here is an example of the performance on a Poisson dataset with heterogeneous weights. The smoothing ranges are centered around the optimal one (found empirically), and results are cross-validated to reduce the noise.
I also compared it with `min_data_in_leaf` and `min_sum_hessian_in_leaf` (which are called `mcs` and `mcw` on the graph). It seems to improve on `path_smooth` in the same way that `min_sum_hessian_in_leaf` improves on `min_data_in_leaf`.
Also, out of 8 business cases I tested, soft smoothing methods (`path_smooth` and `path_smooth_hessian`) outperformed hard smoothing methods (`min_data_in_leaf` and `min_sum_hessian_in_leaf`) on 6 of them, while hard methods had a slight edge on 2 of them (the two smallest datasets I had, 4k and 30k rows). This is one of the reasons I wanted to use soft smoothing methods rather than hard ones.
Summary of changes
- … (`path_smooth_hessian`, double, default 0) rather than a boolean flag on `path_smooth`, following the `min_data_in_leaf` / `min_sum_hessian_in_leaf` pattern.
- … `min_sum_hessian_in_leaf`.
- … `path_smooth`: if both are set, `path_smooth` is ignored with a warning.
- The `min_data_in_leaf >= 2` guard only applies to count-based `path_smooth`. The hessian-based path uses `sum_hessians` directly from the histogram, so the rounding issue that motivated the guard does not apply.
- … `Config` (`effective_path_smooth()`, `use_hessian_smoothing()`) to avoid repeated branching logic across call sites.
- … `test_path_smoothing_hessian`
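For readers who want to see the behaviour numerically, here is a minimal Python sketch of the smoothing formula and of the parameter interaction described in this PR. It is not the C++ code from the PR: the function names `smooth_leaf_output` and `resolve_smoothing` are hypothetical, treating a non-positive strength as "smoothing disabled" is an assumption based on the stated default of 0, and the sample weights and outputs are made-up numbers.

```python
import warnings


def smooth_leaf_output(raw_output, parent_output, evidence, alpha):
    """Blend a leaf's unsmoothed output with its parent's smoothed output.

    `evidence` is the sample count n for count-based smoothing, or the sum of
    hessians h for the hessian-based variant. `alpha` is the smoothing strength.
    """
    if alpha <= 0.0:
        # Assumption: a non-positive strength means smoothing is disabled.
        return raw_output
    ratio = evidence / alpha
    return raw_output * ratio / (ratio + 1.0) + parent_output / (ratio + 1.0)


def resolve_smoothing(path_smooth=0.0, path_smooth_hessian=0.0):
    """Hypothetical helper returning (strength, use_hessian).

    Mirrors the rule stated in the PR: if both parameters are set,
    path_smooth is ignored with a warning.
    """
    if path_smooth_hessian > 0.0:
        if path_smooth > 0.0:
            warnings.warn("path_smooth is ignored because path_smooth_hessian is set")
        return path_smooth_hessian, True
    return path_smooth, False


# Two leaves with 10 samples each and a unit unweighted hessian per sample
# (made-up numbers): one leaf has weight 5.0 per sample, the other 0.1.
n_samples = 10
sum_hessian_heavy = n_samples * 5.0 * 1.0
sum_hessian_light = n_samples * 0.1 * 1.0

raw, parent, strength = 1.0, 0.0, 10.0
count_based = smooth_leaf_output(raw, parent, n_samples, strength)
heavy = smooth_leaf_output(raw, parent, sum_hessian_heavy, strength)
light = smooth_leaf_output(raw, parent, sum_hessian_light, strength)
print(count_based, heavy, light)
```

With these numbers, count-based smoothing gives both leaves the same output (0.5), while the hessian-based version keeps the heavily weighted leaf close to its raw output (about 0.83) and pulls the lightly weighted leaf toward the parent (about 0.09).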