Fast Gaussian blurs and picking random unit vectors

Suppose you have an n-dimensional image that you want to Gaussian blur. This can be achieved by convolving each pixel with a Gaussian kernel. If the kernel has side length k, then convolving each pixel requires k^n multiplications and additions (it's just a weighted sum). Can we do it faster?

Now consider another problem: how do you pick a unit vector uniformly at random from the surface of the n-dimensional unit sphere? That is, how do you pick a random direction in n-d space? Give that a few moments' thought.

It turns out that both problems can be solved by noticing that an n-d Gaussian is the product of n 1-d Gaussians.

For the first problem, consider what happens if instead of doing an n-d convolution, we do n 1-d convolutions sequentially (once along each dimension). Let's say I'm the pixel at (0, 0, ..., 0). Let's compute how much the pixel p at (x1, x2, ..., xn) contributes to me this way.

After the first convolution, p will have contributed to the pixel at (0, x2, ..., xn) with weight f(x1) (where f is the 1-d Gaussian). On the second convolution, that pixel contributes to (0, 0, x3, ..., xn) with weight f(x2), so p's accumulated contribution is now f(x1) * f(x2). After doing all n dimensions, the contribution will be f(x1) * f(x2) * ... * f(xn). Since $f(x) = e^{-x^2}$ (ignoring constant factors):

$$\begin{aligned}
f(x_1) \cdot f(x_2) \cdot \ldots \cdot f(x_n) &= e^{-x_1^2} \cdot e^{-x_2^2} \cdot \ldots \cdot e^{-x_n^2} \\
&= e^{-(x_1^2 + x_2^2 + \ldots + x_n^2)}
\end{aligned}$$

What if we had done the whole n-d convolution at once? Then the contribution from p would have been given by the n-dimensional Gaussian, defined by $f(x_1, x_2, \ldots, x_n) = e^{-(x_1^2 + x_2^2 + \ldots + x_n^2)}$. This matches what we got from the sequence of 1-d convolutions, because an n-d Gaussian is the product of n 1-d Gaussians. And instead of k^n operations per pixel, the sequential approach needs only n·k. The same trick works for any function $f(x)$ such that $f(x) \cdot f(y) = f(\sqrt{x^2 + y^2})$.
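Here's a minimal Python sketch of the idea (the image, sigma, and radius are placeholder values of my choosing, and I use the conventional $e^{-x^2/(2\sigma^2)}$ kernel): it checks that n 1-d convolutions give the same result as one full n-d convolution.

```python
import numpy as np
from scipy.ndimage import convolve, convolve1d

# Placeholder parameters: a 2-d test image and a small Gaussian kernel.
sigma, radius = 2.0, 5
x = np.arange(-radius, radius + 1)
g = np.exp(-x**2 / (2 * sigma**2))
g /= g.sum()  # normalize so the blur preserves total brightness

rng = np.random.default_rng(0)
image = rng.random((64, 64))

# Separable blur: one 1-d convolution per axis -> n*k work per pixel.
separable = image
for axis in range(image.ndim):
    separable = convolve1d(separable, g, axis=axis)

# Full n-d blur: convolve once with the outer product of the 1-d
# kernels -> k^n work per pixel.
full = convolve(image, np.outer(g, g))

print(np.allclose(separable, full))  # True (up to floating point)
```

This separability is also why `scipy.ndimage.gaussian_filter` is implemented as a sequence of 1-d passes, one per axis.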

Now, how do we pick a vector uniformly at random from the surface of the n-d unit sphere? We pick each coordinate independently from a Gaussian and normalize the result. Why does this work? Consider the density of picking the point $(x_1, x_2, \ldots, x_n)$. The density of drawing value $x_k$ for coordinate $k$ is proportional to $e^{-x_k^2}$. Because the coordinates are drawn independently, the joint density is the product:

$$e^{-x_1^2} \cdot e^{-x_2^2} \cdot \ldots \cdot e^{-x_n^2} = e^{-(x_1^2 + x_2^2 + \ldots + x_n^2)}$$

That should look familiar. Notice that it is a function only of the length of the vector $(x_1, x_2, \ldots, x_n)$: the density is the same in every direction. Normalizing the vector therefore lands us at a uniformly distributed point on the sphere.
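A minimal sketch of that recipe in Python (the function name is mine):

```python
import numpy as np

def random_unit_vector(n, rng=None):
    """Pick a direction uniformly at random on the n-d unit sphere."""
    if rng is None:
        rng = np.random.default_rng()
    v = rng.standard_normal(n)    # each coordinate independently Gaussian
    return v / np.linalg.norm(v)  # normalize back onto the sphere

print(random_unit_vector(3))  # a random 3-d direction of length 1
```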
