ECE5550: Applied Kalman Filtering
	KALMAN FILTER GENERALIZATIONS
	
		5.1: Maintaining symmetry of covariance matrices
	
	
		■  The Kalman filter as described so far is theoretically correct, but has known vulnerabilities and limitations in practical implementations.
	
	
		■  In this unit of notes, we consider the following issues:
	
	
		1. Improving numeric robustness;
	
	
		2. Sequential measurement processing and square-root filtering;
	
	
		3. Dealing with auto- and cross-correlated sensor or process noise;
	
	
		4. Extending the filter to prediction and smoothing;
	
	
		5. Reduced-order filtering;
	
	
		6. Using residue analysis to detect sensor faults.
	
	
		Improving numeric robustness 
	
	
		■  Within the filter, the covariance matrices Σ,k  and Σ,k  must remain
	
	
		1. Symmetric, and
	
	
		2. Positive definite (all eigenvalues strictly positive).
	
	
		■  It is possible for both conditions to be violated due to round-off errors in a computer implementation.
	
	
		■  We wish to find ways to limit or eliminate these problems.
	
	
		Dealing with loss of symmetry
	
	
		■  The cause of covariance matrices becoming asymmetric or    non-positive definite must be due to either the time-update or measurement-update equations of the filter.
	
	
		■  Consider first the time-update equation:
	
	
		 
	
	
		• Because we are adding two positive-definite quantities together, the result must be positive definite.
	
	
		• A “suitable implementation” of the products of the matrices will avoid loss of symmetry in the final result.
	
	
		■  Consider next the measurement-update equation:
	
	
		Σ,k  = Σ,k  − Lk Ck Σ,k.
	
	
		■   Theoretically, the result is positive definite, but due to the subtraction operation it is possible for round-off errors in an implementation to
	
	
		result in a non-positive-definite solution.
	
	
		■  The problem may be mitigated in part by computing instead
	
	
		Σ,k  = Σ,k  − Lk Σz(˜) ,kL k(T) .
	
	
		• This may be proven correct via
	
	
		 
	
	
		 
	
	
		 
	
	
		 
	
	
		■  With a “suitable implementation” of the products in the Lk Σz(˜) ,kL k(T) term,
	
	
		symmetry can be guaranteed. However, the subtraction may still give a non-positive definite result if there is round-off error.
	
	
		■  A better solution is the Joseph form covariance update.    
	
	
		• This may be proven correct via
	
	
		 
	
	
		 
	
	
		 
	
	
		 
	
	
		 
	
	
		■  Because the subtraction occurs in the “squared” term, this form. “guarantees” a positive definite result.
	
	
		■  If we end up with a negative definite matrix (numerics), we can    replace it by the nearest symmetric positive semidefinite matrix.
	
	
		■  Omitting the details, the procedure is:
	
	
		• Calculate singular-value decomposition: Σ = USVT .
	
	
		• Compute H = VSVT .
	
	
		• Replace Σ with (Σ + Σ T  + H + HT )/4.
	
	
		5.2: Sequential processing of measurements
	
	
		■  There are still improvements that may be made. We can:
	
	
		• Reduce the computational requirements of the Joseph form,
	
	
		• Increase the precision of the numeric accuracy.
	
	
		■  One of the computationally intensive operations in the Kalman filter is
	
	
		the matrix inverse operation in Lk  = 
	
	
		■  Using matrix inversion via Gaussian elimination (the most
	
	
		straightforward approach), is an O(m3) operation, where m is the dimension of the measurement vector.
	
	
		■  If there is a single sensor, this matrix inverse becomes a scalar division, which is an O(1) operation.
	
	
		■  Therefore, if we can break them measurements intom single-sensor measurements and update the Kalman filter that way, there is
	
	
		opportunity for significant computational savings.
	
	
		Sequentially processing independent measurements
	
	
		■  We start by assuming that the sensor measurements are independent. That is, that
	
	
		 
	
	
		■  We will use colon “:” notation to refer to the measurement number. For example, z k :1  is the measurement from sensor 1 at time k.
	
	
		■  Then, the measurement is
	
	
		 
	
	
		where Ck(T):1  is the first row of Ck  (for example), and vk :1  is the sensor
	
	
		noise of the first sensor at time k, for example.
	
	
		■  We will consider this a sequence of scalar measurements z k :1 ... z k:m , and update the state estimate and covariance estimates in m steps.
	
	
		■  We initialize the measurement update process with^(x)k:0(十) =^(x)k— and Σ,k :0  = —,k.
	
	
		■  Consider the measurement update for the ith measurement, z k:i
	
	
		 
	
	
		= E[xk  | Zk — 1 , z k :1 ... z k:i — 1] 十 Lk:i (z k:i  — E[z k   | Zk — 1 , z k :1 ... z k:i — 1])
	
	
		 
	
	
		■  Generalizing from before
	
	
		Lk:i  = E k(十):i — 1z k(~ T):i ] Σz k(~—):i(1) .
	
	
		■  Next, we recognize that the variance of the innovation corresponding to measurement z k:i  is
	
	
		 
	
	
		■  The corresponding gain is Lk:i  =  and the updated state is
	
	
		 
	
	
		with covariance
	
	
		 
	
	
		■  The covariance update can be implemented as
	
	
		 
	
	
		■  An alternative update is the Joseph form,
	
	
		Σ,k:i  = [I — Lk:iCk(T):i] Σ,k:i  [I — Lk:iCk(T):i]T  十 Lk:iσ L k(T):i .
	
	
		■  The final measurement update gives ^(x)k(十) =^(x)k(十):m  and Σ,k  = Σ,k:m.
	
	
		Sequentially processing correlated measurements
	
	
		■  The above process must be modified to accommodate the situation where sensor noise is correlated among the measurements.
	
	
		■  Assume that we can factor the matrix  = SvS , where Sv  is a
	
	
		lower-triangular matrix (for symmetric positive-definite  .
	
	
		• The factor Sv  is a kind of a matrix square root, and will be important in a number of places in this course.
	
	
		• It is known as the “Cholesky” factor of the original matrix.
	
	
		
	
	
		• Be careful: MATLAB’s default answer (without specifying “lower”) is an upper-triangular matrix, which is not what we’re after.
	
	
		■  The Cholesky factor has strictly positive elements on its diagonal (positive eigenvalues), so is guaranteed to be invertible.
	
	
		■  Consider a modification to the output equation of a system having correlated measurements
	
	
		z k  = Cxk  十 vk
	
	
		 
	
	
		 
	
	
		• Note that we will use the “bar” decoration (-) frequently in this chapter of notes.
	
	
		• It rarely (if ever) indicates the mean of that quantity.
	
	
		• Rather, it refers to a definition having similar meaning to the original symbol.
	
	
		• For example, z-k  is a (computed) output value, similar in interpretation to the measured output value zk.
	
	
		■  Consider now the covariance of the modified noise input k  = Sv—1vk
	
	
		 
	
	
		 
	
	
		 
	
	
		■  Therefore, we have identified a transformation that de-correlates (and normalizes) measurement noise.
	
	
		■  Using this revised output equation, we use the prior method.
	
	
		■  We start the measurement update process with^(x)k:0(十) =^(x)k— and
	
	
		 
	
	
		■  The Kalman gain is k:i  =  and the updated state is
	
	
		 
	
	
		 
	
	
		with covariance
	
	
		Σ,k:i  = Σ,k:i — 1  — L-k:iC- k(T):i  Σ,k:i — 1
	
	
		(which may also be computed with a Joseph form. update, for example).
	
	
		■  The final measurement update gives ^(x)k(十) =^(x)k(十):m  and Σ:k  = Σ,k:m.
	
	
		LDL updates for correlated measurements
	
	
		■  An alternative to the Cholesky decomposition for factoring the covariance matrix is the LDL decomposition
	
	
		 = Lv DvLv(T) ,
	
	
		where Lv  is lower-triangular and Dv  is diagonal (with positive entries).
	
	
		
	
	
		■  The Cholesky decomposition is related to the LDL decomposition via
	
	
		 
	
	
		■  Both texts show how to use the LDL decomposition to perform. a sequential measurement update.
	
	
		■  A computational advantage of LDL over Cholesky is that no
	
	
		square-root operations need betaken. (We can avoid finding Dv(1)/2.)
	
	
		■  A pedagogical advantage of introducing the Colesky decomposition is that we use it later on.
	
	
		5.3: Square-root filtering
	
	
		■  The modifications to the basic Kalman filter that we have described so far are able to
	
	
		• Ensure symmetric, positive-definite covariance matrices;
	
	
		• Speed up the operation of a multiple-measurement Kalman filter.
	
	
		■  The filter is still sensitive to finite word length: no longer in the sense of causing divergence, but in the sense of not converging to as good a solution as possible.
	
	
		■  Consider the set of numbers: 1,000,000; 100; 1. There are six orders of magnitude in the spread between the largest and smallest.
	
	
		■  Now consider a second set of numbers: 1,000; 10; 1. There are only three orders of magnitude in spread.
	
	
		■  But, the second set is the square root of the first set: We can reduce dynamic range (number of bits required to implement a given
	
	
		precision of solution) by using square roots of numbers.
	
	
		■  For example, we can get away with single-precision math instead of double-precision math.
	
	
		■  The place this really shows up is in the eigenvalue spread of
	
	
		covariance matrices. If we can use square-root matrices instead, that would be better.
	
	
		■  Consider the Cholesky factorization from before. Define
	
	
		T  and  .
	
	
		■  We would like to be able to compute the covariance time updates and measurement updates in terms of S,k  instead of Σ,k. Let’s take the steps in order.
	
	
		SR-KF step 1a: State estimate time update.
	
	
		■  We compute
	
	
		ˆ(x) = Ak−1ˆ(x)1 + Bk−1uk−1 .
	
	
		■  No change in this step from standard KF.
	
	
		SR-KF step 1b: Error covariance time update.
	
	
		■  We start with standard step
	
	
		 
	
	
		■  We would like to write this in terms of Cholesky factors
	
	
		 
	
	
		■  One option is to compute the right side, then take the Cholesky decomposition to compute the factors on the left side. This is     computationally too intensive.
	
	
		■  Instead, start by noticing that we can write the equation as
	
	
		 
	
	
		= MM T .
	
	
		■  This might at first appear to be exactly what we desire, but the
	
	
		problem is that Sk  is and n × n matrix, whereas M is an n × 2nmatrix.
	
	
		■  But, it is at least a step in the right direction. Enter the QR decomposition.
	
	
	QR decomposition : The QR decomposition algorithm computes two factors Q ∈ Rn×n  and R ∈ Rn×m  for a matrix Z ∈ Rn×m  such that
	Z = QR , Q is orthogonal, R is upper-triangular, and m ≥ n.
	■  The property of the QR factorization that is important here is that R is related to the Cholesky factor we wish to find.
	■  Specifically, if R(˜) ∈ Rn×n  is the upper-triangular portion of R , then R(˜)T  is
	the Cholesky factor of Σ = MT M.
	■  That is, if R(˜) = qr(MT )T , where qr(·) performs the QR decomposition and returns the upper-triangular portion of R only, then R(˜) is the
	lower-triangular Cholesky factor of MM T .
	■  Continuing with our derivation, notice that if we form. M as above, then computeR(˜), we have our desired result.
	 
	■  The computational complexity of the QR decomposition is O(mn2),
	whereas the complexity of the Cholesky factor is O(n3 /6) plus O(mn2) to first compute MM T .
	■  In MATLAB:
	
	SR-KF step 1c: Estimate system output.
	■  As before, we estimate the system output as
	z(ˆ)k  = Ck ˆ(x) Dk uk.
	SR-KF step 2a: Estimator (Kalman) gain matrix.
	
	
	
		■  In this step, we must compute Lk  = 
	
	
		■  Recall that k  = Ck 
	
	
		■  We may find Sz(˜) ,k  using the QR decomposition, as before. And, we
	
	
		already know S,k.
	
	
		■  So, we can now write Lk 
	
	
		■  If zk  is not a scalar, this equation may often be computed most efficiently via back-substitution in two steps.
	
	
		• First, k  is found, and
	
	
		• Then Lk Sz(˜) ,k  = M is solved.
	
	
		• Back-substitution has complexity O(n2 /2).
	
	
		• Since Sz(˜) ,k  is already triangular, no matrix inversion need be done.
	
	
		■  Note that multiplying out T  in the computation of Σ ,k  may drop some precision in Lk.
	
	
		■  However, this is not the critical issue.
	
	
		■  The critical issue is keeping Sk  accurate for whatever Lk  is used, which is something that we do manage to accomplish.
	
	
		■  In MATLAB:
	
	
		
	
	
		SR-KF step 2b: State estimate measurement update.
	
	
		■  This is done just as in the standard Kalman filter,
	
	
		ˆ(x) = ˆ(x)k (z k  − z(ˆ)k ).
	
	
		SR-KF step 2c: Error covariance measurement update.
	
	
		■  Finally, we update the error covariance matrix.
	
	
		 
	
	
		which can be written as,
	
	
		 
	
	
		■  Note that the “−” sign prohibits us using the QR decomposition to solve this problem as we did before.
	
	
		■  Instead, we rely on the “Cholesky downdating” procedure.
	
	
		■  In MATLAB,
	
	
		
	
	
		■  If you need to implement this kind of filter in a language other than
	
	
		MATLAB, a really excellent discussion of finding Cholesky factors, QR factorizations, and both Cholesky updating and downdating may be
	
	
		found in: G.W. Stewart, Matrix Algorithms, Volume I: Basic Decompositions, Siam, 1998. Pseudo-code is included.