Data assimilation method

1. Introduction

Data assimilation is a method that combines observations with the output of a model to improve a prediction. The main idea is to combine information from our model and from observations, in order to have a more reliable analysis. Sometimes what you are trying to improve in our model may not be in the same space as what our observe this is something that we need to consider while doing our assimilation. It is also important to control our output, it is up to us to decide which input is responsible for the error in our output. This means that there will be uncertainties coming from the model or from the observations, these uncertainties coming from our input will also translate into uncertainties in our output. At the end of the data assimilation we will obtain an output which will be an estimate of the unknown quantity process the state variables. The best estimate is searched for as a linear combination of the background estimate and the observations:

\[x^a=Lx^b+Ky^0\]

2. Kalman Filter

The Kalman filter method consists in looking for $x^a$ an analysis, this analysis will be a linear combination of what we already know, our model and our observations. Now let’s try to see the formula of the kalman filter method in a multi-dimensional case.

\[\left\{\begin{aligned} &x^a=(I-KH)x^b+Ky^0=x^b+K(y^0-H(x^b)) \\ &K=BH^T(HBH^T+R)^{-1} \\ \end{aligned}\right.\]

With \(K\) the gain or weight matrix, \((y^0-H(x^b))\) the innovation and $H$ the linear model of the observations. This formulation is called the Best Linear Unbiased Estimator (BLUE) or least squares analysis. The principle of the Kalman filter is based on this formulation. Here is a small figure which illustrates the Kalman filter.

Kalman Filter illustration

The general idea consists in estimating the state at time $k$ from an estimate at time \(k-1\) and measurements at time \(k\). We do the estimation in two steps:

  • Prediction of the state from the evolution model.

  • Correction of the prediction from the measurements.

3. Ensemble Kalman Filter

We have seen the Kalman Filter methods to do data assimilation, this method is valid only for linear systems, but the Lorenz system is non-linear, that’s why we will introduce the Ensemble Kalman Filter method which works well for non-linear systems. The ENKF method consists in using the Kalman filter method in high dimension and compare P by a set of states \(x_1,x_2,..,x_{m}\). So we can approximate the moments of the error by the moments of the sample. The we have:

\[x_i^a=x_i^f+K[y-h(x_i^f)]\]

with \(h(x_i^f)\) the observation operator. We can also define the Kalman gains:

\[K=P^f H^T(HP^f H^T+R)^{-1}\]

To begin with we can estimate the forecast error covariance matrix as:

\[P^f=\frac{1}{m-1}\sum_{i=1}^{m}(x_i^f-\bar{x}^f)(x_i^f-\bar{x}^f)^T~~with~~\bar{x}^f=\frac{1}{m}\sum_{i=1}^{m}x_i^f\]

\noindent We can factorized the forecast error covariance matrix by:

\[P^f=X_f X_f^T\]

where \(X_f\) is an $n \times m$ matrix whose columns are the normalized anomalies or normalized perturbations,

\[[X_f]_i=\frac{x_i^f-\bar{x}^f}{\sqrt{m-1}}\]

In addition, we have:

\[\bar{x}^a=\frac{1}{m}\sum_{i=1}^mx_i^a~~,~~~~[X_a]_i=\frac{x_i^a-\bar{x}^a}{\sqrt{m-1}}\]