The problem with the Gaussian distribution is that the normalization constant is too complicated.
I admit it really isn’t particularly complicated, but in its many forms (multivariate, conditional, CDF, etc.) these things continue to cause annoyance. In particular, I frequently find that I introduce bugs when I write code using Gaussians.
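Now, can this be simplified? It can. Notice that
$$\int_{-\infty}^{\infty} a^{-x^2/\sigma^2} \, dx = \sigma \sqrt{\frac{\pi}{\ln a}}.$$
So, choosing $a = e^{\pi}$, and defining
$$\operatorname{axp}(x) = a^x,$$
we can instead write a Gaussian in the form
$$p(x) = \frac{1}{\sigma} \operatorname{axp}\left(-\frac{x^2}{\sigma^2}\right).$$
By changing $\sigma$, this represents any normal Gaussian.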
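As a sanity check, here is a minimal numerical sketch of the one-dimensional idea. It assumes the base $a = e^{\pi}$, the function $\operatorname{axp}(x) = a^x$, and the zero-mean form $\frac{1}{s}\operatorname{axp}(-x^2/s^2)$. The names `gauss_axp`, `gauss_std` and `integrate` are made up for this illustration. Under these assumptions the scale `s` is not the standard deviation; the matching standard deviation is `s / sqrt(2*pi)`.

```python
import math

A = math.exp(math.pi)  # assumed base a = e^pi


def axp(x):
    """a**x with a = e^pi (the same as exp(pi * x))."""
    return A ** x


def gauss_axp(x, s):
    """Constant-free form (1/s) * axp(-x^2 / s^2), zero mean."""
    return axp(-(x / s) ** 2) / s


def gauss_std(x, sigma):
    """Textbook zero-mean normal density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def integrate(f, lo, hi, n=20000):
    """Composite trapezoid rule on [lo, hi] with n intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


s = 1.7  # arbitrary scale chosen for the check
total_mass = integrate(lambda x: gauss_axp(x, s), -10 * s, 10 * s)

# The same density written the usual way has standard deviation s / sqrt(2*pi).
sigma = s / math.sqrt(2 * math.pi)
max_gap = max(abs(gauss_axp(x, s) - gauss_std(x, sigma))
              for x in (-2.0, -0.3, 0.0, 0.5, 1.9))

print("total mass:", total_mass)
print("largest pointwise gap to the textbook density:", max_gap)
```

The total mass should come out very close to 1, and the pointwise gap should be at floating-point noise level.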
Now, that’s slightly nicer than a regular Gaussian, but can it extend to higher dimensions? (I admit I have to look up the normalization constant for a multivariate Gaussian every time I use one.) Unfortunately, it doesn’t seem so. The trouble is that (here $x$ is now a vector)
$$\int \exp\left(-\frac{1}{2} x^T \Sigma^{-1} x\right) dx = (2\pi)^{d/2} \, |\Sigma|^{1/2},$$
where $d$ is the number of dimensions (Matrix Cookbook). This means that if we are again going to define the constant $a$ to try to make the normalization constant disappear, $a$ would have to depend on the dimensions of the problem. That seems odd.
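For reference, the multivariate identity $\int \exp(-\frac{1}{2} x^T \Sigma^{-1} x)\,dx = (2\pi)^{d/2}|\Sigma|^{1/2}$ can be checked numerically for $d = 2$. This is only a sketch: the covariance matrix, the grid size and the helper `integrate_2d` are arbitrary choices made for this illustration.

```python
import math

# Hypothetical 2x2 covariance matrix [[s11, s12], [s12, s22]] for the check.
s11, s12, s22 = 2.0, 0.6, 1.0
det_sigma = s11 * s22 - s12 * s12

# Entries of the inverse of a symmetric 2x2 matrix.
p11, p12, p22 = s22 / det_sigma, -s12 / det_sigma, s11 / det_sigma


def unnormalized(x, y):
    """exp(-1/2 * v^T inv(Sigma) v) for v = (x, y)."""
    q = p11 * x * x + 2.0 * p12 * x * y + p22 * y * y
    return math.exp(-0.5 * q)


def integrate_2d(f, half_width, step):
    """Midpoint rule on the square [-half_width, half_width]^2."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * step
        for j in range(n):
            y = -half_width + (j + 0.5) * step
            total += f(x, y)
    return total * step * step


d = 2
numeric = integrate_2d(unnormalized, half_width=12.0, step=0.1)
closed_form = (2 * math.pi) ** (d / 2) * math.sqrt(det_sigma)

print("numeric integral:", numeric)
print("closed form:     ", closed_form)
```

The two printed values should agree to many digits, since the integrand is smooth and has decayed to essentially zero at the edge of the grid.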
December 22, 2008 at 12:58 am
Hi,
Not totally sure, but I think if you let Sigma_2 = (Sigma / (2 * pi)) or something like that, it might work. When you compute the determinant of Sigma_2 on the right hand side, it’ll be something like (1/(2*pi))^(d/2) * det(Sigma), which will cancel out. Then, inside the exp, you’ll find inv(Sigma_2) = 2 pi * inv(Sigma)?
Does that work?
http://en.wikipedia.org/wiki/Matrix_inverse#Properties_of_invertible_matrices
http://en.wikipedia.org/wiki/Determinant#Properties
Personally, I rely more on the computer doing the math than myself, so I would probably write out the whole thing. lol…
regards,
Joao.
December 26, 2008 at 3:50 pm
Hey Joao, I think that doesn’t quite work. Changing Sigma like that will successfully annihilate the normalization constant (as you describe), but it will also end up changing the covariance, meaning we aren’t representing the same Gaussian any more.
December 26, 2008 at 6:33 pm
So, I think there’s a step missing in what I wrote…
I guess we would let a = e^{2*pi}, and axp(x) = a^x, just as you did in the 1D case (except there’s an extra 2 that could be fixed later maybe?). So we have the integral
int axp(-1/2 xt inv(Sigma) x)
that will give us the normalizing term, just as you did previously.
Replacing the axp(x) with exp(2*pi*x), the integral is
int exp(-1/2 xt 2*pi*inv(Sigma) x)
and 2*pi*inv(Sigma) is inv(Sigma_2), with Sigma_2 = Sigma / (2*pi), so the integral is
int exp(-1/2 xt inv(Sigma_2) x)
and we know that turns out to be det(Sigma)^(1/2) as I wrote previously? I forgot the ^(1/2) in the previous comment I think.
Anyway, I would definitely go over it a couple of times.
best regards,
Joao.
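The claim in the comment above can be checked numerically. The sketch below assumes the base a = e^{2*pi} and axp(x) = a^x as in the comment, and evaluates int axp(-1/2 xt inv(Sigma) x) on a grid for d = 1 and d = 2, comparing against det(Sigma)^(1/2). The covariance values and the grid settings are arbitrary choices made for this illustration.

```python
import math


def axp(x):
    """a**x with a = e^(2*pi), written as exp(2*pi*x)."""
    return math.exp(2 * math.pi * x)


# d = 1: Sigma is a single variance.
var = 1.64
step = 0.01
n = int(round(12.0 / step))  # grid covers [-6, 6]
mass_1d = 0.0
for i in range(n):
    x = -6.0 + (i + 0.5) * step
    mass_1d += axp(-0.5 * x * x / var)
mass_1d *= step

# d = 2: hypothetical covariance [[s11, s12], [s12, s22]].
s11, s12, s22 = 2.0, 0.6, 1.0
det_sigma = s11 * s22 - s12 * s12
p11, p12, p22 = s22 / det_sigma, -s12 / det_sigma, s11 / det_sigma

step2 = 0.05
n2 = int(round(12.0 / step2))  # grid covers [-6, 6]^2
mass_2d = 0.0
for i in range(n2):
    x = -6.0 + (i + 0.5) * step2
    for j in range(n2):
        y = -6.0 + (j + 0.5) * step2
        q = p11 * x * x + 2.0 * p12 * x * y + p22 * y * y
        mass_2d += axp(-0.5 * q)
mass_2d *= step2 * step2

print("d=1:", mass_1d, "vs", math.sqrt(var))
print("d=2:", mass_2d, "vs", math.sqrt(det_sigma))
```

In both cases the integral should match det(Sigma)^(1/2) with the same base a, independent of d. Note that in this form Sigma is a scale parameter rather than the covariance of the resulting density, which is Sigma / (2*pi).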