Metric Embedding-Notes-Part 1-An Overview and Basic Theorems

Metric embedding plays an important role in unsupervised machine learning and algorithm analysis. Embedding methods are also considered one of the most important tools in the design of approximation algorithms.

1. Motivations

We sometimes have to deal with data that has no vector space representation. Metric embeddings help us measure the relationships between samples in a more convenient way. Moreover, one of the most important applications of metric embedding is dimensionality reduction, which greatly increases computational efficiency.

2. Metric spaces and embedding

2.1 The concept of Metric space

A metric space is a pair $(X, \rho)$, where $X$ is a set of points (possibly uncountable) and $\rho: X \times X \rightarrow \mathbb{R}^{\ge 0}$ is a function such that:

(a) $\rho(x, y) = 0 \leftrightarrow x=y$

(b) $\rho(x,y) = \rho(y,x)$ (symmetry)

(c) $\rho(x,y) + \rho(y,z) \ge \rho(x,z)$ (triangle inequality)

Note the difference between a metric space and a normed space. Every normed space is a metric space, but a metric space need not be a normed space. Moreover, properties (a)-(c) imply that $\rho(x,y) \ge 0$ for any two points $x, y \in X$, even if non-negativity is not assumed up front. This can be quickly proved as follows:

$\rho(x,y) + \rho(y,x) \ge \rho(x,x) = 0$ (by (c) and (a))

$\rho(x,y) + \rho(x,y) \ge 0$ (by (b), symmetry)

$\rho(x,y) \ge 0$ $\therefore$ Q.E.D.

A mapping $\rho$ is called a distance if it satisfies (a) and (b). If a distance function also satisfies (c), it is called a metric. Moreover, if $\rho(x, y) = 0$ is allowed for distinct points $x \neq y$, the function is called a semi-metric or pseudometric.
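As a small illustration of these definitions, here is a hypothetical helper (not from the original notes; the name `classify_distance_matrix` and the tolerance are my own choices) that checks properties (a)-(c) for a finite point set given as a matrix of pairwise values:

```python
import numpy as np

def classify_distance_matrix(D, tol=1e-12):
    """Classify an n x n array of pairwise values D[i, j] = rho(x_i, x_j)
    according to properties (a)-(c) above."""
    n = D.shape[0]
    symmetric = np.allclose(D, D.T, atol=tol)                       # (b)
    zero_diag = np.abs(np.diag(D)).max() <= tol                     # rho(x, x) = 0
    distinct_positive = all(D[i, j] > tol                           # (a): rho = 0 only if x = y
                            for i in range(n) for j in range(n) if i != j)
    triangle = all(D[i, k] <= D[i, j] + D[j, k] + tol               # (c) triangle inequality
                   for i in range(n) for j in range(n) for k in range(n))
    if not (symmetric and zero_diag):
        return "not a distance"                  # symmetry or rho(x, x) = 0 fails
    if not triangle:
        return "distance, but not a metric"      # (a), (b) hold, (c) fails
    return "metric" if distinct_positive else "pseudometric"
```

For example, the shortest-path distance matrix of any connected weighted graph should be classified as a metric.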

It is natural to ask whether we can find a mapping from a metric space into Euclidean space (a normed space whose norm is induced by the inner product) that preserves the distances defined in the original space. Unfortunately, this is not always possible.

2.2 Embedding

We want to find simpler metrics with nice properties. For example, we may reduce the dimension through metric embedding techniques, or target a metric generated by a simpler structure such as a tree or a cycle. The task is to find a function that maps the elements of the original metric space into the preferred one.

Definition:

A finite metric space $(X, \rho)$ is realized in $\mathcal{L}_p^k$ if there is a function $f: X \rightarrow \mathbb{R}^k$ such that $\rho(x,y)=||f(x) - f(y)||_p$ for all $x, y \in X$. A metric space is also called an $\mathcal{L}_p$-metric if it can be realized in $\mathcal{L}_p^k$ for some $k$.
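As a concrete example of this definition (my own illustration, not from the notes): the shortest-path metric of the 4-cycle $C_4$ with unit edge lengths is realized in $\mathcal{L}_1^2$ by the corners of the unit square, so it is an $\mathcal{L}_1$-metric. A quick numerical check:

```python
import numpy as np
from itertools import combinations

# Shortest-path metric of the 4-cycle C_4 with unit edge lengths.
rho = np.array([[0, 1, 2, 1],
                [1, 0, 1, 2],
                [2, 1, 0, 1],
                [1, 2, 1, 0]], dtype=float)

# Candidate realization f: vertices of C_4 -> corners of the unit square.
f = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

for i, j in combinations(range(4), 2):
    assert np.abs(f[i] - f[j]).sum() == rho[i, j]   # ||f(x) - f(y)||_1 = rho(x, y)
print("C_4 is an L_1 metric: it is realized in L_1^2")
```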

We sometimes cannot find a nice embedding. Therefore, we need a way to measure how well an embedding preserves the original metric, so that we can find a good enough one. Let $f$ be an embedding from the finite metric space $(X, \rho)$ into another finite metric space $(Y, \mu)$. We define:

$expansion(f) = \max_{x \neq y \in X} \frac{ \mu (f(x), f(y)) }{ \rho (x,y)}$

$contraction(f) = \max_{x \neq y \in X} \frac{\rho (x,y)}{\mu (f(x), f(y))}$

The distortion of an embedding is defined as the product of $expansion(f)$ and $contraction(f)$. An embedding $f$ with $distortion(f) = 1$ is called $isometric$.
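Here is a minimal sketch of how these quantities can be computed numerically, assuming both spaces are given as pairwise distance matrices (`D_orig` for $\rho$ and `D_embed` for $\mu$ restricted to the image of $f$); the helper name is hypothetical:

```python
import numpy as np

def distortion(D_orig, D_embed):
    """expansion(f), contraction(f), and distortion(f), computed from the
    two n x n pairwise distance matrices by maximising over pairs x != y."""
    iu = np.triu_indices_from(D_orig, k=1)   # each unordered pair exactly once
    ratio = D_embed[iu] / D_orig[iu]         # mu(f(x), f(y)) / rho(x, y)
    expansion = ratio.max()
    contraction = (1.0 / ratio).max()
    return expansion, contraction, expansion * contraction
```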

An alternative brief definition is:

Definition:

Given two metric spaces $(X, \rho)$ and $(Y, \sigma)$, a mapping $f: X \rightarrow Y$ is called a $D$-embedding of $X$ into $Y$ (where $D \ge 1$) if there exists some $r > 0$ such that $\forall x, x' \in X$,

$r \cdot \rho (x, x') \le \sigma (f(x), f(x')) \le D \cdot r \cdot \rho(x, x')$

The smallest such $D$ is the aforementioned distortion of the embedding.
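One way to see that the two definitions agree (a short derivation added for completeness): write the target metric as $\mu$ and set $r = 1/contraction(f)$. Then for all $x \neq x'$,

$r \cdot \rho(x, x') = \frac{\rho(x, x')}{contraction(f)} \le \mu(f(x), f(x')) \le expansion(f) \cdot \rho(x, x') = expansion(f) \cdot contraction(f) \cdot r \cdot \rho(x, x')$

so $f$ is a $D$-embedding with $D = expansion(f) \cdot contraction(f) = distortion(f)$.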

We can easily prove that every finite metric space on points $x_1, \dots, x_n$ embeds isometrically into $\mathcal{L}_{\infty}^n$ (Fréchet): map each point $x$ to the vector $(\rho(x, x_1), \dots, \rho(x, x_n))$. The triangle inequality gives $|\rho(x, x_i) - \rho(y, x_i)| \le \rho(x, y)$ for every coordinate, with equality at the coordinate $x_i = y$, so the $\mathcal{L}_\infty$ distance equals $\rho(x, y)$.
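A quick numerical check of the Fréchet embedding (a sketch assuming the metric is given as a distance matrix; the helper names are my own):

```python
import numpy as np

def frechet_embedding(D):
    """Fréchet embedding of a finite metric space into L_infty^n:
    the i-th point is mapped to the i-th row of the distance matrix,
    i.e. f(x) = (rho(x, x_1), ..., rho(x, x_n))."""
    return D.copy()

def check_isometry(D):
    """Verify ||f(x_i) - f(x_j)||_infty = rho(x_i, x_j) for all pairs."""
    F = frechet_embedding(D)
    n = D.shape[0]
    for i in range(n):
        for j in range(n):
            linf = np.abs(F[i] - F[j]).max()
            assert np.isclose(linf, D[i, j])
    return True
```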

3. Basic theorems

3.1 Incompressibility of general metric spaces

If $\mathbf{Z}$ is a normed space into which every $n$-point metric space $D$-embeds, then:

$dim(\mathbf{Z}) = \Omega(n)$ for $D<3$

$dim(\mathbf{Z}) = \Omega(n^{\frac{1}{2}})$ for $D<5$

$dim(\mathbf{Z}) = \Omega(n^{\frac{1}{3}})$ for $D<7$

where $\Omega(f(n))$ means at least $c \cdot f(n)$ for some constant $c > 0$.

This theorem hints that an embedding with smaller distortion requires more dimensions in the representation.

3.2 Bourgain embedding

Theorem (Bourgain):

Every metric space on $n$ points embeds into $\mathcal{L}_p^{O(\log^2 n)}$ with distortion $O(\log{n})$, i.e. with dimension $O(\log^2 n)$.
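The proof is constructive and randomized: each coordinate of the embedding is the distance from a point to a random subset of $X$, with subsets sampled at $O(\log n)$ different densities and $O(\log n)$ repetitions per density. The sketch below follows that scheme but omits the normalization constants and the failure-probability analysis, so treat it as an illustration rather than an implementation of the theorem's exact guarantees:

```python
import numpy as np

def bourgain_embedding(D, seed=0):
    """Randomized Bourgain-style embedding sketch.

    D is the n x n distance matrix of a finite metric space. The output has
    O(log^2 n) coordinates, each of the form x -> d(x, S) for a random subset
    S, so every coordinate is 1-Lipschitz with respect to the original metric."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    scales = max(1, int(np.ceil(np.log2(n))))     # inclusion probabilities 2^-1, ..., 2^-scales
    reps = max(1, int(np.ceil(np.log(n))))        # repetitions per density (constants omitted)
    coords = []
    for t in range(1, scales + 1):
        for _ in range(reps):
            S = rng.random(n) < 2.0 ** (-t)       # include each point w.p. 2^-t
            if not S.any():                       # avoid empty subsets in this sketch
                S[rng.integers(n)] = True
            coords.append(D[:, S].min(axis=1))    # coordinate: distance from each x to S
    return np.stack(coords, axis=1)               # shape (n, scales * reps)
```

By the triangle inequality, each coordinate satisfies $|d(x, S) - d(y, S)| \le \rho(x, y)$, which gives the non-expanding half of the analysis; bounding the contraction by $O(\log n)$ is the harder part of Bourgain's proof.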
