Kalman & Extended Kalman Filters
Linear KF, linearization via Jacobians, the EKF prediction/update cycle.
Kalman filter
The Kalman filter is a Bayes filter specifically designed for linear-Gaussian estimation problems. Under those assumptions it provides the optimal solution:
- Linear motion and observation models.
- Zero-mean Gaussian noise in both motion and sensor measurements.
Everything in the Kalman filter remains Gaussian, which is essential to its closed-form updates.
Gaussian distributions
A Gaussian (normal) distribution is:
Two crucial properties of Gaussians are marginalization and conditioning.
Marginalization. Given the joint:
the marginal of is:
Conditioning. The conditional of given is Gaussian:
with:
Note. Inverting can be computationally expensive. When we know very little about , the second term tends toward zero.
Linear models
The Kalman filter assumes linear models:
Linear motion model:
Linear observation model:
These represent the mean models, with uncertainty injected through Gaussian noise terms having covariances and .
Components:
- — state transition matrix (), describing state evolution without controls or noise.
- — control input matrix (), describing how controls influence the state.
- — observation matrix (), mapping the state space to observations.
- — Gaussian noise for process and measurement.
The motion and observation distributions are:
Kalman filter algorithm
The Kalman filter recursively computes the belief in two steps: prediction and correction.
- μ̄_t = A_t μ_{t-1} + B_t u_t# Prediction: mean
- Σ̄_t = A_t Σ_{t-1} A_tᵀ + R_t# Prediction: covariance
- K_t = Σ̄_t C_tᵀ (C_t Σ̄_t C_tᵀ + Q_t)⁻¹# Kalman gain
- μ_t = μ̄_t + K_t (z_t − C_t μ̄_t)# Update mean
- Σ_t = (I − K_t C_t) Σ̄_t# Update covariance
- return μ_t, Σ_t# Updated belief
Interpreting the Kalman gain ()
The Kalman gain trades off how confident we are in observations versus prediction:
- No measurement uncertainty (): — the update directly maps observations into state space.
- Infinite measurement uncertainty (): — no correction is performed.
Further reading. Mathematical proofs and detailed derivations of the Kalman filter are in Probabilistic Robotics, §3.2.4.
Python example — 1D Kalman filter
A simple 1-dimensional walk through prediction, measurement, and correction.
1. Prediction. Using :
import numpy as np
# Prior belief
mu_0, sigma_0 = 0.0, 1.0
# Motion model parameters
A, B, u, R = 1.0, 1.0, 5.0, 2.0
# Prediction
mu_pred = A * mu_0 + B * u
sigma_pred = A**2 * sigma_0 + R2. Measurement. Simulate a noisy reading with :
C, Q, z = 1.0, 4.0, 8.03. Correction. The 1-D Kalman gain is :
K = sigma_pred * C / (C**2 * sigma_pred + Q)
mu_corr = mu_pred + K * (z - C * mu_pred)
sigma_corr = (1 - K * C) * sigma_predPlotting the three densities (prediction, measurement, correction) shows the core idea of Kalman filtering: recursive fusion of prediction and observation under Gaussian assumptions. In higher-dimensional SLAM problems, the same principles apply — only the matrices grow.
Drag the sliders to feel the trade-offs. Increasing motion noise widens the prediction so the correction trusts the measurement more. Increasing measurement noise does the opposite — the posterior stays close to the prediction. The Kalman gain formalizes that intuition as a closed-form ratio.
But how often do we see linear systems in the real world, especially in robotics where rotations are involved?
Extended Kalman filter (EKF)
While the Kalman filter is optimal and efficient for linear-Gaussian systems, most realistic robotic problems involve nonlinear dynamics. Rotations in the plane are inherently nonlinear because of trig functions. The previously linear equations:
no longer describe such systems adequately. Instead we use general nonlinear functions:
where might represent the robot's motion with angles, and a range-bearing sensor.
These nonlinearities break the Gaussian assumption. We resort to local approximations — and that's the Extended Kalman Filter.
EKF linearization (first-order Taylor)
The EKF approximates the nonlinear models around the current estimate.
Prediction linearization:
Jacobian:
Correction linearization:
Jacobian:
Linearized motion model
Linearized sensor model
EKF algorithm
- μ̄_t = g(u_t, μ_{t-1})# Predicted mean
- Σ̄_t = G_t Σ_{t-1} G_tᵀ + R_t# Predicted covariance
- K_t = Σ̄_t H_tᵀ (H_t Σ̄_t H_tᵀ + Q_t)⁻¹# Kalman gain
- μ_t = μ̄_t + K_t (z_t − h(μ̄_t))# Correct the mean
- Σ_t = (I − K_t H_t) Σ̄_t# Correct the covariance
- return μ_t, Σ_t# Updated belief
The quality of estimation now heavily depends on the accuracy of the local linear approximations.
Note. In the EKF, and are replaced by the Jacobians and , respectively.
Python example — linear vs nonlinear mapping
1. Linear map. Push a Gaussian prior through . The result is exactly Gaussian, with closed-form mean and variance.
import numpy as np
from scipy.stats import norm
mu, sigma = 0.0, 0.5
a, b = -0.5, 1.0
mu_lin = a * mu + b
sigma_lin = abs(a) * sigma2. Nonlinear map. Push the same prior through . The Gaussian shape is destroyed by local curvature and high-frequency variation, so we estimate the result via Monte Carlo:
def f_nl(x):
return np.sin(x) + 0.1 * np.sin(5 * x)
N = 100_000
ys = f_nl(np.random.normal(mu, sigma, N))3. Local linearization. The EKF performs a first-order Taylor expansion of around a point :
Pushing the prior through this linear approximation yields a Gaussian centred at with variance scaled by the local slope:
x0 = 0.4
G0 = np.cos(x0) + 0.1 * 5 * np.cos(5 * x0) # f'(x0)
y0 = f_nl(x0)
sigma_ekf = abs(G0) * sigmaToggle between the three modes. Linear keeps the output Gaussian (left and right look identical up to a flip and shift). Nonlinear runs Monte Carlo through — the output histogram becomes skewed and even multi-modal. EKF picks a point and approximates the function by its tangent line; sweep along the curve and watch the approximation be good in some places and badly wrong in others — exactly where the EKF is well- or ill-behaved.
Why local linearization matters
- A linear transformation of a Gaussian remains Gaussian — the core strength of the Kalman filter.
- Real-world (especially robotic) systems involve nonlinear models; the output distribution becomes non-Gaussian — potentially skewed or multimodal.
- The EKF is a compromise: it locally linearizes the nonlinear model around the current mean, maintaining a Gaussian belief while capturing local curvature.
Note. The EKF does not produce exact results for nonlinear systems. But when the system is "locally close to linear" it gives a good approximation while remaining efficient.
This lays the foundation for applying the EKF to nonlinear motion and sensor models in SLAM — which we tackle in the next lesson.
Reading material
Kalman filter and EKF
- Thrun, Burgard, Fox. Probabilistic Robotics, Chapter 3.