Fast Gaussian blurs and picking random unit vectors

Suppose you have an n-dimensional image that you want to Gaussian blur. This can be achieved by convolving each pixel with a Gaussian kernel. If the kernel has side-length k, then each pixel requires k^n multiplications and additions to convolve (it's just a weighted sum). Can we do it faster?

Now consider another problem: how do you pick a unit vector uniformly at random from the surface of the n-dimensional unit sphere? That is, how do you pick a random direction in n-d space? Give that a few moments' thought.

It turns out that both problems can be solved by noticing that an n-d Gaussian is the product of n 1-d Gaussians.

For the first problem, consider what happens if instead of doing an n-d convolution, we do n 1-d convolutions sequentially (once along each dimension). Let's say I'm the pixel at (0, 0, ..., 0). Let's compute how much the pixel p at (x1, x2, ..., xn) contributes to me this way.

After the first convolution, it will have contributed to pixel (0, x2, ..., xn) in the amount of f(x1) (where f is the Gaussian). On the second convolution, that pixel will contribute to (0, 0, x3, ..., xn) in the amount of f(x2), so the total contribution from p is now f(x1) * f(x2). After doing all n dimensions, the contribution will be f(x1) * f(x2) * ... * f(xn). Since $f(x) = e^{-x^2}$ (ignoring the normalization constant, which doesn't affect the argument):

$$\begin{eqnarray}
f(x_1) \cdot f(x_2) \cdot \ldots \cdot f(x_n) &=& e^{-x_1^2} \cdot e^{-x_2^2} \cdot \ldots \cdot e^{-x_n^2} \\
&=& e^{-(x_1^2 + x_2^2 + \ldots + x_n^2)}
\end{eqnarray}$$

What if we had done the whole n-d convolution at once? Then the contribution from p would have been given by the n-dimensional Gaussian, defined by $f(x_1, x_2, \ldots, x_n) = e^{-(x_1^2 + x_2^2 + \ldots + x_n^2)}$. This matches what we got with the sequential technique, because an n-d Gaussian is equivalent to the product of n 1-d Gaussians. The same trick works for any function $f(x)$ such that $f(x) \cdot f(y) = f(\sqrt{x^2 + y^2})$.
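Here's a minimal sketch of the separable blur in Python with NumPy and SciPy (the function name `gaussian_blur` and the `radius` parameter are illustrative choices, not from any particular library):

```python
import numpy as np
from scipy.ndimage import convolve1d

def gaussian_blur(img, sigma=1.0, radius=3):
    """Blur an n-d image with n sequential 1-d Gaussian convolutions."""
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-((x / sigma) ** 2))  # the 1-d Gaussian f(x) = e^(-x^2), with a width parameter
    kernel /= kernel.sum()                # normalize so overall brightness is preserved
    out = img.astype(float)
    for axis in range(img.ndim):          # one 1-d pass along each dimension
        out = convolve1d(out, kernel, axis=axis)
    return out
```

With a kernel of side-length k = 2 * radius + 1, this costs about n * k multiply-adds per pixel instead of k^n. SciPy's own `scipy.ndimage.gaussian_filter` uses the same sequence of 1-d passes internally.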

Now, how do we pick a vector uniformly at random from the surface of the n-d unit sphere? We pick each coordinate independently from a Gaussian and normalize the result. Why does this work? Consider the probability density of picking the point $(x_1, x_2, \ldots, x_n)$. The density for coordinate $k$ is $e^{-x_k^2}$. Because we picked the coordinates independently, the joint density is the product:

$$\begin{eqnarray}
e^{-x_1^2} \cdot e^{-x_2^2} \cdot \ldots \cdot e^{-x_n^2} &=& e^{-(x_1^2 + x_2^2 + \ldots + x_n^2)}
\end{eqnarray}$$

That should look familiar. Notice that this is a function only of the length of the vector $(x_1, x_2, \ldots, x_n)$. Because the density depends on the length and not on the direction, the distribution is rotationally symmetric, and so the normalized vector is uniform on the sphere.
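A sketch in Python (the name `random_unit_vector` is mine):

```python
import numpy as np

def random_unit_vector(n, rng=None):
    """Pick a direction uniformly at random on the surface of the n-d unit sphere."""
    rng = np.random.default_rng() if rng is None else rng
    v = rng.standard_normal(n)    # each coordinate drawn independently from a Gaussian
    return v / np.linalg.norm(v)  # normalizing projects the point onto the unit sphere
```

Note that the naive alternative, drawing each coordinate uniformly from [-1, 1] and normalizing, does not work: it biases directions toward the corners of the cube. The Gaussian draw works precisely because its joint density is rotationally symmetric.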

Maximum Likelihood Estimation for dummies

What is Maximum Likelihood Estimation (MLE)? It's simple, but there are some gotchas. First, let's recall what likelihood is. ...