Numpy-Notes-Part1-How to Generate Multivariates Gaussian Distribution in A Decent Way?

Mar 30 2020 Notes

Numpy is the most powerful scientific computation tools in Python. I decide to start a new set of blogs to elaborate the amazing tricks in its source codes. This blog introduced how Numpy generates multivariates Guassian distribution. You can read the source code via Link.

1. Definitions and Concepts

1.1 Positve-semidefinite

Definition 1: Hermitian Matrix

We call a square matrix $A$ Hermitian if it is self-adjoint. i.e.

$A = \bar{A^T}$

Definition 2: Positive-Semidefinite

A positive semidefinite matrix is a Hermitian matrix if all of whose eigenvalues are nonnegative.

1.2 SVD decomposition

SVD decomposition generalizes the eigendecomposition of a square matrix to any $m\times n$ matrix.

Matrix $M$ can be written as $U\Sigma V^{*}$, where $U$ is a $m \times m$ unitary matrix, $V$ is $n \times n$ unitary matrix, and $\Sigma$ is a rectangular diagonal matrix with non-negative real numbers. Note that if $M$ is real, then matrix $U$ and $V$ must be orthonormal matrices.

2. How to generate r.v. following multivariate Gaussian distribution?

The most common idea is to use the Cholesky decomposition. However, the Cholesky–Banachiewicz and Cholesky–Crout algorithms require a positive definite matrix (Click here for details), which implies that the matrix must have full rank.

In most situiation, we have to handle positive semidefinite matrix. For example, generate a covariance matrix with kernel trick. In this situation, Numpy provides us with a brilliant idea via SVD decomposition.

Note that if we can find a Matrix A, such that

$A^TA = M$

Then we can conclude that $M$ is positive-semidefinite.

Moreover, recall that for real matrix M:

$M = U\Sigma V^T = U \Sigma^{\frac{1}{2}} \Sigma^{\frac{1}{2}} V^T = U \Sigma^{\frac{1}{2}} (V\Sigma^{\frac{1}{2}})^T$

If $ U \Sigma^{\frac{1}{2}} = V\Sigma^{\frac{1}{2}}$, then we find an available A for matrix M.

$ U \Sigma^{\frac{1}{2}} = V\Sigma^{\frac{1}{2}} \leftrightarrow{} U\Sigma U^T = V\Sigma V^T=M$ $(*)$

If $(*)$ holds, then we can make sure $M$ is a positive-semidefinite matrix.

Therefore, we can easily generate random vector following multivariates Gaussian distribution via:

$rv = AR + \mu$

where $R$ is random vector following standard Gaussian distribution with $\Sigma = I$. (Just use $n$ independent r.v. following standard Gaussian distribution)

#Numpy #Programming #Linear Algebra