Fix TP reparameterized prior#8295
Conversation
|
Thank you for submitting this @rx18-eng, this is a real problem. One thing if you don't mind double checking, it looks like from [the paper](https://proceedings.mlr.press/v33/shah14.pdf that the logp for the TP is a bit different than for pm.MvStudentT by a scale factor, # Pass (nu-2)/nu * K_y as the scale matrix so that
# the actual covariance is K_y, matching the TP derivation
y_obs = pm.MvStudentT(
"y_obs",
nu=nu,
mu=pt.zeros(n),
cov=(nu - 2) / nu * K_y,
observed=y_train
)Would you mind double checking? I'm not sure I see that scale factor in your tests, but I could have missed it. Here's the implementation of the logp for the TP that I get, def tp_logp(value, m, K_y, nu):
"""
logp of a Student-t Process marginal likelihood.
'value' is y (observed), shape (n,).
K_y = K(X,X) + sigma^2 * I already passed in as a parameter.
"""
n = value.shape[0]
r = value - m
L = pt.linalg.cholesky(K_y + 1e-6 * pt.eye(n))
alpha = pt.linalg.solve_triangular(L, r, lower=True)
beta = pt.dot(alpha, alpha)
log_det_Ky = 2.0 * pt.sum(pt.log(pt.diag(L)))
return (
pt.gammaln((nu + n) / 2.0)
- pt.gammaln(nu / 2.0)
- (n / 2.0) * pt.log(pt.pi * (nu - 2.0))
- 0.5 * log_det_Ky
- ((nu + n) / 2.0) * pt.log(1.0 + beta / (nu - 2.0))
)This is only a question about the tests though, you're fix the the pm.gp.TP code looks right. |
|
Thanks @bwengals , I double checked this. The (nu - 2) / nu factor is for the paper version where K is treated asthe covariance. PyMC’s MvStudentT uses scale, and the current TP non-reparameterized path already passes scale_func(X) as scale. So I kept the existing PyMC behavior and only made reparameterize=True match reparameterize=False. Changing to the paper covariance version would be a separate behavior/API change for both paths. I also pushed a small test rename to make this clearer. |
|
@bwengals shall i modify anything else on this pr ? |
|
After thinking about it some more, I think people using this in their models will expect the behavior from the paper, so with the scaling factor. Updating this could subtly break peoples models if they are using this. So there should be a warning along with the current behavior, and then an option added to use the corrected scaling factor to match the paper. The warning should say that there was a bug, and that the next release will match the result from the paper. Then in a later release we just use the correct version. Thanks again for your patience and for tackling this. |
Signed-off-by: rx18-eng <remopanda78@gmail.com>
|
@bwengals is there any update on this ? |
bwengals
left a comment
There was a problem hiding this comment.
Looks good. Thanks for making the changes
| f"got {parameterization!r}" | ||
| ) | ||
| self.nu = nu | ||
| self.parameterization = parameterization |
There was a problem hiding this comment.
since it's a two value possibility could it be a boolean flag?
| else: | ||
| size = np.shape(X)[0] | ||
| v = pm.Normal(name + "_rotated_", mu=0.0, sigma=1.0, size=size, **kwargs) | ||
| mixing_scale = pm.Gamma(name + "_scale", alpha=self.nu / 2, beta=self.nu / 2) |
There was a problem hiding this comment.
If GPs support batch dims (no idea if they do), this should have batch dim shape/dims, not always be a scalar
closes #8290
Summary
I fixed
pm.gp.TP.prior(..., reparameterize=True)so it uses a shared Gamma scale with a standard Normal rotated variable.This makes the reparameterized TP prior match the direct
MvStudentTprior.I also added regression tests for the shared scale behavior and checked the marginal density against
MvStudentT.Tests
ruff check pymc/gp/gp.py tests/gp/test_gp.pypytest tests/gp/test_gp.py::TestTP -qpytest tests/gp/test_gp.py -qcc: @ferrine @ricardoV94