Before we dig deeper into the Kalman filter, I would like to share one of its applications with you. The N-factor Gaussian model is a direct application of what we discussed in Kalman Filter Part 1. This blog includes a brief introduction to the method and some tricks used in the algorithm's implementation.
1. The financial basis of the N-factor Gaussian model
If you have heard of the Barra model for stocks or other factor analysis models, you may find this N-factor Gaussian model easy to understand. Factor analysis is a cross-sectional methodology that depends entirely on the information available at the current time slot. To be more specific, it can detect the sources of correlations if we assume the correlations between different observable states are caused by common factors (communalities). Of course, we can also use the factor analysis framework for prediction with some extra assumptions (for example, that the factors' returns stay constant between time slots).
At time $t$, we observe the returns of $n$ assets. We assume that the correlations between assets are caused by $m$ (risk) factors. The matrix $X$ ($m \times n$) denotes the exposures of the assets to the risk factors, while the vector $f$ denotes the 'returns' of the risk factors. Writing this model in a linear regression scheme gives
$r_t = X_t^Tf + \epsilon$
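To make this concrete, here is a minimal sketch of estimating the factor returns $f$ by a cross-sectional OLS regression at a single time slot. The exposures `X`, the factor returns `f_true`, and the noise scale are synthetic placeholders, not values from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 3, 50                                   # m risk factors, n assets
X = rng.normal(size=(m, n))                    # factor exposures (m x n), synthetic
f_true = np.array([0.02, -0.01, 0.005])        # "true" factor returns, synthetic
r = X.T @ f_true + 0.001 * rng.normal(size=n)  # observed returns r_t = X^T f + eps

# Cross-sectional OLS estimate of the factor returns f at this time slot
f_hat, *_ = np.linalg.lstsq(X.T, r, rcond=None)
print(f_hat)  # should be close to f_true
```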
Intuitively, we decompose the observed returns according to the risk factors, and the coefficient of each factor describes that factor's average return. In this framework, we treat the returns of the factors as constants. What if we want to treat the coefficients as a distribution, or as something that carries uncertainty? Let's review the risk-neutral framework from financial engineering.
We assume the market (efficient and complete) has $n$ risk factors that determine the returns of the observable assets. Let $\Theta(t)$ be an $n$-dimensional process adapted to $W(t)$, an $m$-dimensional Brownian motion. In this framework, the returns of the risk factors are driven by the $m$-dimensional Brownian motion:
$d\Theta(t) = \mu(\Theta)dt + \sigma(\Theta)dW(t)$
$\sigma(\Theta)$ plays the same role as the risk-factor exposures in this framework. From this perspective, the multi-factor model for stocks is just a special case.
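As an illustration, the following is a minimal Euler-Maruyama simulation of the factor dynamics above, assuming constant drift `mu` and diffusion `sigma` for simplicity; all dimensions and values are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)

n, m = 4, 2                            # n risk factors, m Brownian motions
mu = np.zeros(n)                       # drift mu(Theta), held constant here
sigma = 0.1 * rng.normal(size=(n, m))  # diffusion sigma(Theta): factor exposures
dt, steps = 1 / 252, 252               # daily steps over one year

theta = np.zeros((steps + 1, n))
for t in range(steps):
    dW = np.sqrt(dt) * rng.normal(size=m)           # m-dim Brownian increment
    theta[t + 1] = theta[t] + mu * dt + sigma @ dW  # Euler-Maruyama step
```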
2. The setup of the N-factor Gaussian model
We assume the spot prices of the assets in a market ($\mathbf{S}_t$ is an $N \times 1$ vector) can be described with $n$ risk factors as:
$\log \mathbf{S}_t = \mathbf{L}_t^T \mathbf{x}_t + \boldsymbol{\mu} t$
$d\mathbf{x}_t = -\mathbf{K}\mathbf{x}_t\,dt + \mathbf{\Sigma}\, d\mathbf{W}_t$
where $\mathbf{K} = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ 0 & k_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & k_n \end{bmatrix}$ and $\mathbf{\Sigma} = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n \end{bmatrix}$.
$\mathbf{L}_t$ is an $n \times N$ matrix. This multi-factor setup assumes the common factors are independent. Of course, we can still handle correlations between the risk factors: for example, if we assume the risk factors are linear combinations of $m$ independent factors ($m < n$), then the observed returns of the market are an affine transformation of the $m$-factor Gaussian model shown above (see Affine N-factor Gaussian Model).
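To connect this continuous-time specification to the discrete-time filter of the next section, note that the OU dynamics above admit an exact one-step discretization $\mathbf{x}_t = e^{-\mathbf{K}\Delta t}\mathbf{x}_{t-1} + \boldsymbol{\epsilon}_t$, with a closed-form noise covariance when $\mathbf{K}$ and $\mathbf{\Sigma}$ are diagonal. Below is a minimal sketch under that assumption; the $k_i + k_j \to 0$ entries reduce to the plain Brownian variance $\sigma_i \sigma_j \Delta t$, and the optional `rho` argument is a hypothetical hook for correlated Brownian motions (identity, i.e. independent, by default).

```python
import numpy as np

def ou_discretization(k, sigma, dt, rho=None):
    """Exact one-step transition matrix A and noise covariance Q for
    dx = -K x dt + Sigma dW, with K = diag(k), Sigma = diag(sigma) and
    dW having instantaneous correlation matrix rho (identity if None)."""
    k = np.asarray(k, float)
    sigma = np.asarray(sigma, float)
    rho = np.eye(k.size) if rho is None else np.asarray(rho, float)
    A = np.diag(np.exp(-k * dt))
    ksum = k[:, None] + k[None, :]
    ss = np.outer(sigma, sigma) * rho
    # Q_ij = sigma_i sigma_j rho_ij (1 - exp(-(k_i + k_j) dt)) / (k_i + k_j),
    # with the limit sigma_i sigma_j rho_ij * dt as k_i + k_j -> 0.
    safe = np.where(ksum > 1e-12, ksum, 1.0)
    Q = np.where(ksum > 1e-12, ss * -np.expm1(-ksum * dt) / safe, ss * dt)
    return A, Q

A, Q = ou_discretization(k=[0.0, 1.5], sigma=[0.2, 0.3], dt=1 / 252)
```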
3. Parameter estimation
Parameter estimation is a direct application of the Kalman filter (see here for more about the Kalman filter). As we discussed in Part I of the KF notes, we can maximize the likelihood function based on the innovations ($z_t - \hat{z}_t^-$) to estimate the transition matrix of the unobservable states.
$\mathbf{x}_t = A \mathbf{x}_{t-1} + \mathbf{c}_t + \boldsymbol{\epsilon}_t$, where $\boldsymbol{\epsilon}_t \sim N(0, Q_t)$
$\mathbf{z}_t = H \mathbf{x}_t + \mathbf{d}_t + \mathbf{v}_t$, where $\mathbf{v}_t \sim N(0, R_t)$
$\hat{\mathbf{x}}_t^- = A\hat{\mathbf{x}}_{t-1} + \mathbf{c}_t$
$P_t^- = A P_{t-1} A^T + Q_t$
$\hat{\mathbf{z}}_t^- = H \hat{\mathbf{x}}_t^- + \mathbf{d}_t$
Let $F_t^-$ be the a priori covariance matrix of the observable states:
$F_t^- = H P_t^- H^T + R_t$
We need to parameterize the covariance matrix $R_t$. Some papers assume that the measurement errors have a homoscedastic diagonal covariance matrix, but we can try more complex assumptions; this is another typical bias-variance trade-off. We then maximize the log-likelihood function of $z_t$:
$\mathcal{L}(\Theta) = -\frac{1}{2} \sum_t \log|F_t^-| - \frac{1}{2} \sum_t [z_t - \hat{z}_t^-(\Theta)]^T (F_t^-)^{-1}[z_t - \hat{z}_t^-(\Theta)]$
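As a sketch of how the prediction step and this likelihood fit together, here is a minimal NumPy implementation. The arguments are the matrices from the state-space equations above, taken as time-invariant for simplicity, and the constant $-\frac{N}{2}\log 2\pi$ term per observation is dropped, as in the formula above.

```python
import numpy as np

def kalman_loglik(zs, A, c, Q, H, d, R, x0, P0):
    """Innovation-based log-likelihood of the observations zs (T x N),
    for time-invariant A, c, Q, H, d, R."""
    x, P = x0, P0
    loglik = 0.0
    for z in zs:
        # Prediction step: a priori state, covariance, and observation
        x = A @ x + c
        P = A @ P @ A.T + Q
        nu = z - (H @ x + d)            # innovation z_t - z_hat_t^-
        F = H @ P @ H.T + R             # a priori covariance F_t^-
        _, logdet = np.linalg.slogdet(F)
        loglik += -0.5 * (logdet + nu @ np.linalg.solve(F, nu))
        # Update step: fold the innovation back in before the next prediction
        K = P @ H.T @ np.linalg.inv(F)  # Kalman gain
        x = x + K @ nu
        P = P - K @ H @ P
    return loglik
```

In practice, one would wrap this in a function that maps a parameter vector $\Theta$ to $(A, \mathbf{c}, Q, H, \mathbf{d}, R)$ and hand its negative to a numerical optimizer such as `scipy.optimize.minimize`.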
If $H_t$ also needs to be parameterized, we can use the EM algorithm to estimate the parameters.
Remark: We used the a priori estimate of the covariance matrix of $\hat{\mathbf{z}}_t^-$ in the likelihood function. Why not the posterior covariance matrix? The answer is quite simple once you notice that the posterior estimate is a weighted average of the a priori estimate and the innovation (measurement residual), which means we would have to observe the innovation before forming the posterior estimate.
In the maximum-likelihood methodology, we use the Fisher information matrix to estimate the variances of the parameter estimates.
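For instance, here is a minimal sketch of this step, using a central finite-difference Hessian of the negative log-likelihood as the observed Fisher information; `negloglik` and `theta_hat` are hypothetical names for the wrapper and the MLE from the previous step.

```python
import numpy as np

def observed_information(negloglik, theta_hat, h=1e-4):
    """Central finite-difference Hessian of the negative log-likelihood,
    i.e. the observed Fisher information at the MLE theta_hat."""
    p = theta_hat.size
    I = np.eye(p)
    hess = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            ei, ej = h * I[i], h * I[j]
            hess[i, j] = (negloglik(theta_hat + ei + ej)
                          - negloglik(theta_hat + ei - ej)
                          - negloglik(theta_hat - ei + ej)
                          + negloglik(theta_hat - ei - ej)) / (4 * h * h)
    return hess

# Standard errors: square roots of the diagonal of the inverse information, e.g.
# se = np.sqrt(np.diag(np.linalg.inv(observed_information(negloglik, theta_hat))))
```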