Keywords: linear operator, compact linear operator, Mercer’s theorem, eigenfunction, foundation of the kernel trick
1. Eigenfunctions
An eigenfunction of a linear operator $D$ defined on some function space is any non-zero function $f$ in that space that, when acted upon by $D$, is simply multiplied by some scaling factor called an eigenvalue, i.e.
$Df = \lambda f$
Note that boundary conditions may greatly restrict the allowed values of $\lambda$.
The set of all possible eigenvalues of $D$ is sometimes called its spectrum.
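As a minimal numerical sketch of the definition (the choice $D = d/dt$, $f(t) = e^{2t}$, and the grid are illustrative assumptions, not from the text): the exponential is an eigenfunction of differentiation with eigenvalue $2$, since $Df = 2f$.

```python
import numpy as np

# Check numerically that f(t) = exp(2t) is an eigenfunction of D = d/dt
# with eigenvalue lambda = 2, i.e. D f = 2 f.
t = np.linspace(0.0, 1.0, 1001)
f = np.exp(2.0 * t)

df = np.gradient(f, t)   # numerical derivative, approximates D f
ratio = df / f           # should be close to the eigenvalue 2 everywhere

print(np.abs(ratio[1:-1] - 2.0).max())  # small: ratio is ~2 at interior points
```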
Moreover, consider an inner-product function space whose inner product is defined as:
$< f, g > = \int_{\Omega} f^{*}(t)g(t)dt$ (where $*$ denotes the complex conjugate)
Suppose the function space has an orthonormal basis given by the set of functions $\{u_1(t), u_2(t), u_3(t), \dots\}$ with
$< u_i, u_j > = \delta_{i, j}$ (Kronecker delta), such that any function can be written as a linear combination of the basis functions:
$f(t) = \sum_{j = 1}^n b_j u_j(t)$ (in the infinite-dimensional case the sum becomes an infinite series, as in a Fourier series)
Define a matrix representation of the linear operator $D$ with elements
$A_{i, j} = < u_i, Du_j > = \int_{\Omega} u^{*}_i(t) D u_{j}(t) dt$
We can expand the function $Df(t)$ either in the basis $\{u_j\}$ or in $\{Du_j\}$, i.e.
$Df(t) = \sum_{j = 1}^n c_j u_j(t) = \sum_{j = 1}^n b_j Du_j(t)$
Taking the inner product of each side of the above equation with an arbitrary basis function $u_i$:
$< u_i, Df > = \sum_{j = 1}^n c_{j} \int_{\Omega} u^{*}_i(t)u_j(t)dt = \sum_{j = 1}^n c_{j} \delta_{i, j} = c_i$, and likewise $< u_i, Df > = \sum_{j = 1}^n b_{j} \int_{\Omega} u^{*}_i(t)Du_j(t)dt = \sum_{j = 1}^n b_j A_{i, j}$, so $c_i = \sum_{j = 1}^n A_{i, j} b_j$.
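The relation $c_i = \sum_j A_{i,j} b_j$ can be checked numerically. The setup below (operator $D = d/dt$, Fourier basis on $[0, 2\pi)$, random coefficients) is an illustrative assumption: $A = \mathrm{diag}(ij)$ for this basis, and the coefficients of $Df$ recovered by inner products should equal $A b$.

```python
import numpy as np

# Verify c = A b: expand f in the Fourier basis, differentiate numerically,
# project Df back onto the basis, and compare with A @ b.
t = np.linspace(0.0, 2.0 * np.pi, 4000, endpoint=False)
dt = t[1] - t[0]
js = np.arange(-3, 4)
rng = np.random.default_rng(0)
b = rng.standard_normal(len(js))                          # coefficients of f

U = np.exp(1j * np.outer(js, t)) / np.sqrt(2.0 * np.pi)   # row j is u_j(t)
f = b @ U                                                 # f = sum_j b_j u_j
Df = np.gradient(f, t)                                    # numerical D f

c = (np.conj(U) @ Df) * dt          # c_i = <u_i, Df> by a Riemann sum
A = np.diag(1j * js)                # A_ij = <u_i, D u_j> = i*j*delta_ij
print(np.allclose(c, A @ b, atol=1e-3))   # True
```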
2. Mercer’s Theorem
Suppose $K$ is a continuous symmetric non-negative definite kernel. Then there is an orthonormal basis $\{e_j\}$ of $L^2[a, b]$ consisting of eigenfunctions of $T_K$ such that the corresponding sequence of eigenvalues $\{\lambda_j\}$ is nonnegative. The eigenfunctions corresponding to non-zero eigenvalues are continuous on $[a, b]$, and $K$ has the representation $K(t, s) = \sum_{j = 1}^{\infty} \lambda_j e_j(s) e_j(t)$
where the convergence is absolute and uniform.
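A discrete analogue of this expansion can be sketched numerically (the RBF kernel and the grid are illustrative assumptions): eigendecomposing the kernel matrix on a grid gives a nonnegative spectrum, and summing $\lambda_j e_j(s) e_j(t)$ over all eigenpairs recovers the kernel.

```python
import numpy as np

# Discrete analogue of Mercer's theorem: eigendecompose the kernel matrix
# of an RBF kernel on a grid; all eigenvalues are nonnegative (up to
# roundoff) and K is recovered from sum_j lambda_j e_j e_j^T.
x = np.linspace(-1.0, 1.0, 50)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)   # RBF kernel matrix

lam, E = np.linalg.eigh(K)          # eigendecomposition (K is symmetric)
print(lam.min() >= -1e-10)          # True: nonnegative spectrum

K_rec = (E * lam) @ E.T             # sum_j lam_j * e_j e_j^T
print(np.allclose(K, K_rec))        # True: the expansion reproduces K
```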
Here $T_{K}$ is the integral operator defined by $[T_{K}\phi](x) = \int_{a}^b K(x, s)\phi(s) ds$; it is linear, since
$[T_{K}(\phi_1 + \phi_2)](x) = \int_{a}^b K(x, s)(\phi_1(s) + \phi_2(s))ds = \int_{a}^b K(x, s) \phi_1(s) ds + \int_{a}^b K(x, s) \phi_2(s) ds$
In other words, the kernel function can be written as an inner product of feature vectors:
$K_{i, j} = \phi(x_i)^T \phi(x_j)$
$\phi$ is also called a basis function (or feature map); its range can be a finite-dimensional space $\mathbb{R}^n$ or an infinite-dimensional function space.
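As a concrete sketch of such a feature map (the polynomial kernel on $\mathbb{R}^2$ is an illustrative assumption): for $\kappa(x, y) = (x^T y)^2$, an explicit map is $\phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2) \in \mathbb{R}^3$, so that $\kappa(x, y) = \phi(x)^T \phi(y)$.

```python
import numpy as np

# Explicit feature map for the polynomial kernel k(x, y) = (x . y)^2 on R^2:
# phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2), so k(x, y) = phi(x) . phi(y).
def phi(x):
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
print((x @ y) ** 2, phi(x) @ phi(y))   # both equal 1.0
```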
2.1 Mercer Kernels
Let $X = \{x_1, x_2, \dots, x_n\}$ be a finite set of $n$ samples from $\mathcal{X}$. The Gram matrix of $X$ is defined as $\mathbf{K}(X; \kappa) \in \mathbb{R}^{n \times n}$, or $\mathbf{K}$ for short, such that $(\mathbf{K})_{ij} = \kappa(x_i, x_j)$. If $\mathbf{K}$ is positive definite for every finite subset $X \subset \mathcal{X}$, then $\kappa$ is called a Mercer kernel, or a positive definite kernel. (Note that a Mercer kernel is symmetric by definition.)
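This condition can be checked empirically for a given kernel (the RBF kernel, sample sizes, and random seed below are illustrative assumptions; passing such spot checks does not prove the kernel is Mercer, it only fails to falsify it):

```python
import numpy as np

# Empirical check of the Mercer condition for the RBF kernel
# kappa(x, x') = exp(-||x - x'||^2 / 2): Gram matrices of random sample
# sets should have no negative eigenvalues (up to roundoff).
rng = np.random.default_rng(42)

def gram(X):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    return np.exp(-0.5 * sq)

min_eig = np.inf
for _ in range(5):
    X = rng.standard_normal((20, 3))          # 20 random samples from R^3
    eigvals = np.linalg.eigvalsh(gram(X))     # Gram matrix is symmetric
    min_eig = min(min_eig, eigvals.min())

print(min_eig > -1e-10)   # True: every Gram matrix is positive semidefinite
```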
According to Mercer’s theorem, such a kernel can be written as $\kappa(x, x^\prime) = \phi(x)^T \phi(x^\prime)$. The basis function can be read off from the eigen-expansion of the kernel: $\phi(x) = (\sqrt{\lambda_1}\,e_1(x), \sqrt{\lambda_2}\,e_2(x), \dots)$.
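A finite-sample sketch of recovering $\phi$ from the kernel alone (the RBF kernel and grid are illustrative assumptions; extending the recovered map to new points is what the Nyström method does): take $\phi_j(x_i) = \sqrt{\lambda_j}\,e_j(x_i)$ from the Gram-matrix eigendecomposition, so that $\Phi \Phi^T$ reproduces $\mathbf{K}$.

```python
import numpy as np

# Recover an (approximate) feature map from the Gram matrix alone:
# phi_j(x_i) = sqrt(lambda_j) * e_j(x_i), so Phi @ Phi.T == K.
x = np.linspace(0.0, 1.0, 30)
K = np.exp(-10.0 * (x[:, None] - x[None, :]) ** 2)   # RBF Gram matrix

lam, E = np.linalg.eigh(K)
lam = np.clip(lam, 0.0, None)       # clip tiny negative roundoff to zero
Phi = E * np.sqrt(lam)              # row i is the feature vector phi(x_i)

print(np.allclose(Phi @ Phi.T, K))  # True: inner products reproduce K
```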