
kalman script

늘근이 2018. 6. 9. 13:41

A filter that recursively processes input data that contains noise.

Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe.


Predictor: predicts the next data point from the current observations.

Filter: removes noise by processing the last two observations.

Smoother: computes the trend from past observation data.


In this video, we’ll discuss why you’d use Kalman filters. If you’re not familiar with the topic, you may be asking yourself, “What is a Kalman filter? Is it a new brand of coffee filter that brews the smoothest tasting coffee?” No, it’s not. A Kalman filter is an optimal estimation algorithm. Today we’ll discuss two examples that demonstrate common uses of Kalman filters. In the first example, we’ll see how a Kalman filter can be used to estimate a system state when it cannot be measured directly.

To illustrate this, let’s go to Mars before anyone else does. If your spacecraft’s engine can burn fuel at a high enough temperature, it can create thrust that will let you fly to Mars. By the way, according to NASA, liquid hydrogen is a light and powerful rocket propellant that burns with extreme intensity at 5500 degrees Fahrenheit. But be careful, because too high a temperature can put the mechanical components of the engine at risk, and this can lead to the failure of some of the mechanical parts. If that happens, you might be stuck in your small spacecraft where you’ve got to eat from tubes.

To prevent such a situation, you should closely monitor the internal temperature of the combustion chamber. This is not an easy task, since a sensor placed inside the chamber would melt. Instead, it needs to be placed on a cooler surface close to the chamber. The problem you’re facing here is that you want to measure the internal temperature of the chamber but you can’t. Instead, you have to measure the external temperature. In this situation, you can use a Kalman filter to find the best estimate of the internal temperature from an indirect measurement. This way, you’re extracting information about what you can’t measure from what you can.

Now that you know the solution to your problem, you can continue your journey to Mars. But are you scared of traveling in space? Let me tell you this: On Mars, you’re going to weigh 62% less than what you weigh on Earth. Still not convinced? Ok, then let’s go back and take a look at another scenario that takes place on Earth. In this example, we’re going to see how a Kalman filter can be used to estimate a state of the system by combining measurements from different sources that may be subject to noise.

You have guests visiting from overseas and you need to pick them up from the airport. You’re using your car’s navigation system. Let’s look at the sensors you have onboard, which help you find your position and navigate you to the airport. The inertial measurement unit (IMU) uses accelerometers and gyroscopes to measure the car’s acceleration and angular velocity. The odometer measures the relative distance traveled by the car. The GPS receiver receives signals from satellites to locate the car on earth’s surface. If you live in Boston as I do, you’ve got to travel through the Big Dig, a very very long tunnel. And in the tunnel, it gets harder to estimate your position through GPS, since the receiver’s line of sight to satellites is blocked and GPS signal is weak. In this case, you may want to trust the IMU readings, which give you the acceleration. However, acceleration itself doesn’t tell you much about the car’s position. For that, you need to take the double integral of the acceleration. Unfortunately, this operation is prone to drift due to small errors accumulating over time. To get better position estimates, you can use IMU measurements along with odometer readings. But note that odometer measurements may be affected by the tire pressure and road conditions. 

To summarize, your sensors measuring the relative position of your car give you fast updates, but they are prone to drift. The GPS receiver provides your absolute location, but it gets updated less frequently and it may be noisy. In this scenario, a Kalman filter can be used to fuse these three measurements to find the optimal estimate of the exact position of the car.

Let’s look at some facts about Kalman filters. The Kalman filter is named after Rudolf Kalman, who is the primary developer of its theory. It is an optimal estimation algorithm that predicts parameters of interest, such as location, speed, and direction, in the presence of noisy measurements. Common applications of Kalman filters include guidance, navigation, and control systems; computer vision systems; and signal processing. One of the first applications of Kalman filters was in the 1960s. Do you have any guesses as to what it helped with? Engineers used it in the Apollo project, where the Kalman filter was used to estimate trajectories of the manned spacecraft to the Moon and back.

Let’s summarize what we’ve seen in this video. Kalman filters are used to optimally estimate the variables of interest when they can’t be measured directly, but an indirect measurement is available. They’re also used to find the best estimate of states by combining measurements from various sensors in the presence of noise.

In the next videos, we’ll cover what Kalman filters are and how they work. We’ll start by discussing state observers, since this will help us understand Kalman filtering, which is a method for designing optimal state observers. Then we’ll continue our discussion with optimal state estimators.



In this video, we’ll discuss state observers. This concept will help explain what Kalman filters are and how they work. Let’s start with an example. This is little Timmy, and you want to know about his mood and how he’s feeling right now. However, there is no direct way of measuring his mood. So, what you do is give him a cookie and start observing his facial expressions. This observation helps you estimate his real mood. State observation helps you estimate something that you can’t see or measure directly.

Throughout the rest of the discussion, if you see something with a hat on it, this means that it is estimated through a state observer. So, if the state x is shown with a hat, then it is an estimated state.

Next, we’ll look at a more concrete example. You’re traveling in space to discover new planets. In order to get to the planets safely, you need to monitor the internal temperature of your jet engine. If it gets too hot, it could damage your spacecraft. However, there isn’t any feasible way of measuring the internal temperature, since a sensor placed inside the engine would melt. What you can do is place the sensor on a colder surface and measure the temperature there. Let’s call this temperature T_external, and the one you can’t measure but need to estimate, T_internal.

This is your rocket. You’re wondering how high the internal temperature of the engine is, since this will tell you how you should regulate the fuel flow to your rocket. However, you don’t have access to the state T_internal. Instead, you can measure T_external. The signals that are available to you are the fuel flow and your measurement. How do you estimate the internal temperature? Actually, you have access to more information. Since your math is pretty good, you can derive the equations that will give you the mathematical model of your real system. You already know how much fuel you’re adding to your rocket, so if you now input this fuel flow to your mathematical model, it will give you an estimate of your output. And also, notice that since you have all the governing equations, you can even calculate the internal state of this system.

Ok, does this solve your problem now? Unfortunately, it doesn’t. No doubt that you’re good at math, but in reality, the mathematical model you found is only an approximation of your real system. It is subject to uncertainties. If you had a perfect model without any uncertainties and if your real system and your model had the same initial conditions, then your measurement and estimated output values would match each other, and therefore the estimated internal temperature would match the true internal temperature as well. But in real life, this is an unlikely scenario and therefore the estimated external temperature won’t match the measured temperature. That’s why you need to use a state estimator to estimate your internal states. Let’s see how a state estimator works:

Here, our goal is to match the estimated external temperature with the measured external temperature. We know that if these two are equal, then the model will converge to the real system, so the estimated internal temperature will converge to its true value. What we’re trying to do is minimize the difference between the estimated and measured external temperature. Does this sound familiar? Actually, we’re talking about a feedback control system, where we try to drive the error between the measured and estimated external temperature to zero using a controller K.

If we now update our diagram on the left hand side based on what we’ve discussed here, it will look like this. This part represents the state observer. By closing the loop with a controller K around the observer, we try to eliminate the error between the estimated and measured external temperature such that the estimated internal temperature is driven to its true value.

In summary, you can’t directly measure the internal engine temperature. But you know how much fuel you’re supplying to your rocket, so you can run this through your mathematical model and estimate the output, which you can then use along with your real measurement to estimate the internal state of the system.

The question is how to choose the controller gain K such that the error between the measured and estimated external temperature is minimized optimally. We’ll provide more insights into this in the next videos where we’ll discuss how Kalman filters work.

Next, let’s look at how we can explain the state observer mathematically:

We will generalize the problem and show the input as u, the output as y, and any states we want to estimate as x. Our goal is to drive x̂ to x, so we can define the difference between these values as an error. Next, let’s write down the equations for the system and the observer. If we subtract these equations from each other, this will give us the error dynamics. By rearranging terms, we see that the error dynamics can be shown by this equation. The solution to this equation is an exponential function. What this means is that if this term is negative, we’re good, because we know that our error will vanish over time and x̂ will converge to x.
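
The equations here are shown on screen in the video; a reconstruction in standard state-observer notation:

$$
\dot{x} = Ax + Bu, \quad y = Cx \qquad \text{(system)}
$$

$$
\dot{\hat{x}} = A\hat{x} + Bu + K(y - C\hat{x}) \qquad \text{(observer)}
$$

Subtracting and defining the error $e = x - \hat{x}$ gives the error dynamics

$$
\dot{e} = (A - KC)\,e \quad\Rightarrow\quad e(t) = e^{(A-KC)t}\,e(0)
$$

so the error vanishes whenever $A - KC$ is negative (in the scalar case) or, more generally, whenever the eigenvalues of $A - KC$ have negative real parts.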

At this point, you may be asking whether we really need the KC term in this equation, since even without the feedback loop that adds the KC term, we would still have a decaying exponential function for the error. The significance of having a feedback loop around the observer is that we can control the decay rate of the error function by selecting the controller gain K accordingly. Without the feedback loop, the decay rate depends solely on the matrix A. And if there are some uncertainties in the mathematical model, this means you don’t know A exactly, so you can’t control how quickly the error will vanish. Having the feedback controller gives you more control over this equation and guarantees a faster elimination of the error. And the faster the error vanishes, the faster the estimated states x̂ converge to the true states x.

An optimal way of choosing the gain K is performed through the use of Kalman filters. In the next video, we’ll get insights into how Kalman filters work.



In this video, we’ll discuss the working principles of the Kalman filter algorithm. Let’s start with an example. While you’re desperately staring at your bills, an ad in a magazine catches your eye. You can earn $1 million by joining a competition where you design a self-driving car, which uses a GPS sensor to measure its position. Your car is supposed to drive 1 km on 100 different terrains. Each time, it must stop as close as possible to the finish line. At the end of the competition, the average final position is computed for each team, and the owner of the car with the smallest error variance and an average final position closest to 1 km gets the big prize. Here’s an example. Let these points represent the final positions, and the red ones the average final position, for different teams. Based on these results, team 1 would lose due to its biased average final position, although it has small variance. Team 2 would lose as well: its average final position is on the finish line, but it has high variance. The winner would be team 3, since it has the smallest variance and its average final position is on the finish line. If you want to be a millionaire, you don’t want to rely purely on GPS readings, since they can be noisy. In order to meet the required criteria to win the competition, you can estimate the car’s position using a Kalman filter.


Let’s look at this system to understand how the Kalman filter works. The input to the car is the throttle. The output that we’re interested in is the car’s position. For such a system, we would have multiple states. But here to give you intuition, we’ll assume an overly simplistic system where the input to the car is the velocity. This system will have a single state: the car’s position. And we’re measuring this state, so matrix C is equal to 1.
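
In discrete time, this simplistic model can be written as follows; the video leaves the time base implicit, so the sampling interval Δt here is an assumption:

$$
x_k = x_{k-1} + u_k\,\Delta t + w_k, \qquad y_k = C x_k + v_k, \qquad C = 1
$$

where $w_k$ and $v_k$ are the process and measurement noise discussed next.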


It’s important to know y as accurately as possible, since we want the car to finish as close as possible to the finish line. But the GPS readings will be noisy. We’ll show this measurement noise with v, which is a random variable. Similarly, there’s process noise, which is also random and can represent the effects of wind or changes in the car’s velocity. Although these random variables don’t follow a pattern, using probability theory we can tell something about their average properties. v, for example, is assumed to be drawn from a Gaussian distribution with zero mean and covariance R. This means that if we measured the position of the car, let’s say a hundred times at the same location, the noise in these readings would take on values with most of them located near the zero mean and fewer located further away from it. This results in the Gaussian distribution, which is described by the covariance R. Since we have a single-output system, the covariance R is scalar and is equal to the variance of the measurement noise. Similarly, the process noise is also random and assumes a Gaussian distribution with covariance Q.
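
As a quick sanity check of that "measure a hundred times" thought experiment, here is a minimal Python sketch; the noise level and the parked position are made-up values for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
R = 0.25                 # measurement noise variance (the scalar covariance R)
true_position = 100.0    # the car sits still at one location

# "Measure the position a hundred times at the same location":
v = rng.normal(0.0, np.sqrt(R), size=100)   # v ~ N(0, R)
readings = true_position + v

# The sample mean is near the true position and the sample variance near R.
print(readings.mean(), readings.var())
```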


Now we know that the measurement is noisy, and therefore what we measure doesn’t quite reflect the true position of the car. If we know the car model, we can run the input through it to estimate the position. But this estimate also won’t be perfect, because now we’re estimating x, which is uncertain due to the process noise. This is where the Kalman filter comes into play; it combines these two pieces of information to come up with the best estimate of the car’s position in the presence of process and measurement noise.


We’ll discuss the working principle of the Kalman filter visually with the help of probability density functions. At the initial time step k-1, the actual car position can be anywhere around the estimate x̂(k-1), and this uncertainty is described by this probability density function. What this plot also tells us is that the car is most likely going to be around the mean of this distribution. At the next time step, the uncertainty in the estimate has increased, which is shown with the larger variance. This is because between time steps k-1 and k, the car might have run over a pothole, or maybe the wheels slipped a little bit. Therefore, it may have traveled a different distance than what we’ve predicted by the model. As we discussed before, another source of information on the car’s position comes from the measurement. Here, the variance represents the uncertainty in the noisy measurement. Again, the true position can be anywhere around the mean.

 


Now that we have the prediction and measurement, the question is what is the best estimate of the car’s position? It turns out that the optimal way to estimate the car’s position is by combining these two pieces of information: the prediction and the measurement. And this is done by multiplying these two probability functions together. The resulting product is also a Gaussian function. This estimate has a smaller variance than either of the previous estimates. And the mean of this probability density function gives us the optimal estimate of the car’s position.
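
The reason this works is that the product of two Gaussian densities is itself Gaussian up to a normalizing constant. For $\mathcal{N}(\mu_1, \sigma_1^2)$ and $\mathcal{N}(\mu_2, \sigma_2^2)$, the product has

$$
\mu = \frac{\mu_1\sigma_2^2 + \mu_2\sigma_1^2}{\sigma_1^2 + \sigma_2^2}, \qquad \sigma^2 = \frac{\sigma_1^2\sigma_2^2}{\sigma_1^2 + \sigma_2^2}
$$

and $\sigma^2$ is smaller than both $\sigma_1^2$ and $\sigma_2^2$, which is exactly the "smaller variance" claim above.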


This is the basic idea behind Kalman filters. But to win the competition, you need to be able to implement the algorithm. We’re going to discuss this in the next video.




Part 4, Optimal State Estimator Algorithm

In this video, we’ll discuss the set of equations that you need to implement the Kalman filter algorithm. Let’s revisit the example that we introduced in the previous video. You join a competition to win the big prize. You’re asked to design a self-driving car that needs to drive 1 km on 100 different terrains. In each trial, the car must stop as close as possible to the finish line. At the end of the competition, the average final position is computed for each team, and the owner of the car with the smallest error variance and an average final position closest to 1 km gets the big prize.

In that example, we also showed the car dynamics and the car model for our single-state system, and we discussed process and measurement noises along with their covariances. Finally, we said that you could win the competition by using a Kalman filter, which computes an optimal unbiased estimate of the car’s position with minimum variance. This optimal estimate is found by multiplying the prediction and measurement probability functions together, scaling the result, and computing the mean of the resulting probability density function.

Computationally, the multiplication of these two probability density functions relates to the discrete Kalman filter equation shown here. Does this ring a bell? Doesn’t it look similar to the state observer equation that we discussed in previous videos? Actually, a Kalman filter is a type of state observer, but it is designed for stochastic systems. Here’s how the Kalman filter equation relates to what we’ve discussed with the probability density functions. The first part predicts the current state by using the state estimate from the previous time step and the current input. Note that these two state estimates are different from each other. We’ll show the predicted state estimate with this notation. This is also called the a priori estimate, since it is calculated before the current measurement is taken. We can now rewrite the equation like this. The second part of the equation uses the measurement and incorporates it into the prediction to update the a priori estimate. And we’ll call the result the a posteriori estimate.
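
The on-screen equation isn’t reproduced in this transcript; the standard discrete Kalman filter form it describes is

$$
\hat{x}_k = \underbrace{A\hat{x}_{k-1} + Bu_k}_{\text{prediction}} + K_k\big(y_k - C(A\hat{x}_{k-1} + Bu_k)\big)
$$

which, with the a priori estimate $\hat{x}_k^- = A\hat{x}_{k-1} + Bu_k$, rewrites as $\hat{x}_k = \hat{x}_k^- + K_k(y_k - C\hat{x}_k^-)$.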

You want to win the big prize, right? Then these are the equations you need to run on your car’s ECU. Looks a little scary? What if we turn everything upside down? Doesn’t change much, does it? Ok, we’ll go over the algorithm equations step by step. The Kalman filter is a two-step process. Let’s first start with the prediction part.

Here, the system model is used to calculate the a priori state estimate and the error covariance P. For our single-state system, P is the variance of the a priori estimate. The error covariance P can be thought of as a measure of uncertainty in the estimated state. This variance comes from the process noise and propagation of the uncertain x̂(k-1). At the very start of the algorithm, the k-1 values for x̂ and P come from their initial estimates.
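
In equation form, the prediction step is (standard discrete-time notation, consistent with the equation above):

$$
\hat{x}_k^- = A\hat{x}_{k-1} + Bu_k, \qquad P_k^- = A P_{k-1} A^\top + Q
$$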

The second step of the algorithm uses the a priori estimates calculated in the prediction step and updates them to find the a posteriori estimates of the state and error covariance. The Kalman gain is calculated such that it minimizes the a posteriori error covariance. Let this bar represent the calculation of x̂k. By weighing the correction term, the Kalman gain determines how heavily the measurement and the a priori estimate contribute to the calculation of x̂k. If the measurement noise is small, the measurement is trusted more and contributes to the calculation of x̂k more than the a priori state estimate does. In the opposite case, where the error in the a priori estimate is small, the a priori estimate is trusted more and the computation of x̂k mostly comes from this estimate. We can also show this mathematically by looking at two extreme cases.
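
And the update step, in the standard form the transcript is walking through:

$$
K_k = P_k^- C^\top \big(C P_k^- C^\top + R\big)^{-1}, \qquad \hat{x}_k = \hat{x}_k^- + K_k\big(y_k - C\hat{x}_k^-\big), \qquad P_k = (I - K_k C)\,P_k^-
$$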

Assume that in the first case the measurement covariance is zero. To calculate the Kalman gain, we take its limit as R goes to zero. We plug in 0 for R and see that these two terms cancel each other out. As R goes to zero, the Kalman gain approaches the inverse of C, which is equal to 1 in our system. Plugging K = C⁻¹ into the a posteriori state estimate shows that x̂k is equal to yk, so the calculation comes from the measurement only, as expected. Now if we update our plot, we can show the measurement with an impulse function, shown here as an orange vertical line. Note that the variance in the measurement is zero, since R goes to zero. We found that the a posteriori estimate is equal to the measurement, so we can show it with the same impulse function. On the other hand, if the a priori error covariance is close to zero, then the Kalman gain is found to be zero. Therefore, the contribution of this term to x̂k is ignored, and the computation of x̂k comes from the a priori state estimate. On the plot, we’ll show the a priori state estimate with an impulse function which has zero variance. And since the a posteriori estimate is equal to the a priori estimate, we’ll show it with the same impulse function.
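
In the scalar case, those two limits work out as follows, matching the plot description above:

$$
\lim_{R\to 0} K_k = \lim_{R\to 0}\frac{P_k^- C}{C^2 P_k^- + R} = \frac{1}{C} = 1, \qquad \lim_{P_k^-\to 0} K_k = 0
$$

so $\hat{x}_k = y_k$ in the first case, and $\hat{x}_k = \hat{x}_k^-$ in the second.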

Once we’ve calculated the update equations, in the next time step the a posteriori estimates are used to predict the new a priori estimates and the algorithm repeats itself. Notice that to estimate the current state, the algorithm doesn’t need all the past information. It only needs the estimated state and error covariance matrix from the previous time step and the current measurement. This is what makes the Kalman filter recursive.
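
Putting the two steps together: since the post’s title promises a script, here is a minimal scalar Kalman filter sketch in Python for the car example (A = B = C = 1 as above; the noise levels, step count, and initial values are illustrative assumptions, not values from the video):

```python
import numpy as np

def kalman_1d(y, u, A=1.0, B=1.0, C=1.0, Q=1e-4, R=0.25, x0=0.0, P0=1.0):
    """Scalar Kalman filter: one state (position), velocity input, noisy GPS."""
    x_hat, P = x0, P0          # initial estimates seed the k-1 values
    estimates = []
    for y_k, u_k in zip(y, u):
        # Prediction: a priori state estimate and error covariance
        x_pri = A * x_hat + B * u_k
        P_pri = A * P * A + Q
        # Update: Kalman gain, then the a posteriori estimates
        K = P_pri * C / (C * P_pri * C + R)
        x_hat = x_pri + K * (y_k - C * x_pri)
        P = (1.0 - K * C) * P_pri
        estimates.append(x_hat)
    return np.array(estimates)

# Example run: constant commanded velocity toward the finish line
dt, n = 0.1, 1000
u = np.full(n, 1.0 * dt)                 # distance commanded per step
true_x = np.cumsum(u)                    # noise-free position
rng = np.random.default_rng(0)
y = true_x + rng.normal(0.0, 0.5, n)     # noisy GPS readings, R = 0.5**2
x_est = kalman_1d(y, u, R=0.25)
print("final estimate:", x_est[-1], "truth:", true_x[-1])
```

Note how the loop uses only the previous x_hat and P plus the current measurement, which is the recursive property described above.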

Now you know the set of equations needed to implement the Kalman filter algorithm. What are you going to do with the big prize when you win the competition? If you can’t decide, here’s a suggestion. Note that the Kalman filter is also referred to as a sensor fusion algorithm. So, you can buy an additional sensor, such as an IMU, and experiment to see whether using multiple sensors would improve your self-driving car’s estimated position. If you have two measurements, the dimensions of the y, C, and K matrices would change as shown here. But basically, you would still follow the same logic to compute the optimal state estimate. On the plot, we’ll have one more probability density function for the measurement from the IMU, and this time we’ll be multiplying three probability density functions together to find the optimal estimate of the car’s position.
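
The "as shown here" matrices aren’t reproduced in this transcript. One plausible reconstruction for the single-state car with a second position-type measurement (an assumption, since an IMU alone doesn’t directly measure position):

$$
y_k = \begin{bmatrix} y_{\text{GPS}} \\ y_{\text{IMU}} \end{bmatrix} \in \mathbb{R}^2, \qquad C = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad K_k \in \mathbb{R}^{1\times 2}, \qquad R \in \mathbb{R}^{2\times 2}
$$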

So far, we had a linear system. But what if you have a nonlinear system and want to use a Kalman filter? In the next video, we’ll discuss nonlinear state estimators.



In general, we want our lives to be linear, as shown on this graph. This might be in terms of success, income, or happiness. But in reality, life is not linear. It is full of ups and downs, and sometimes it gets even more complicated. If you’re an engineer, you will often need to deal with nonlinear systems. To help you, we’re going to discuss nonlinear state estimators.

Previously, we used a simplified linear car model to discuss state estimation through Kalman filters. However, if this system is modeled such that it takes into account nonlinearities due to road friction, then the state transition function becomes nonlinear. Here, the noise enters the system linearly, but there may be systems where the noise is not additive. In a general system, the state transition function, the measurement function, or both may be nonlinear. For all these cases, we need to use a nonlinear state estimator instead of a Kalman filter, as Kalman filters are only defined for linear systems. Here’s an example that shows the problem with using a Kalman filter for state estimation of a nonlinear system. The Kalman filter assumes a Gaussian distribution. If the state transition function is linear, then after undergoing the linear transformation, the distribution maintains its Gaussian property. Although it’s not shown here, the same is true for the measurement function g(x). However, if f(x) is nonlinear, then the resulting state distribution may not be Gaussian. And therefore, the Kalman filter algorithm may not converge. In this case, you can implement an extended Kalman filter (EKF), which linearizes the nonlinear function around the mean of the current state estimate. At each time step, the linearization is performed locally and the resulting Jacobian matrices are then used in the prediction and update steps of the Kalman filter algorithm.
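
In equation form, the EKF’s local linearization replaces A and C with Jacobians of the transcript’s f and g, evaluated at the current estimate (standard EKF notation):

$$
F_k = \left.\frac{\partial f}{\partial x}\right|_{\hat{x}_{k-1}}, \qquad G_k = \left.\frac{\partial g}{\partial x}\right|_{\hat{x}_k^-}
$$

These then stand in for A and C in the algorithm, e.g. $P_k^- = F_k P_{k-1} F_k^\top + Q$ in the prediction step.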

When the system is nonlinear and can be well approximated by linearization, then the extended Kalman filter is a good option for state estimation. However, it has the following drawbacks: 1. It may be difficult to calculate the Jacobians analytically due to complicated derivatives; 2. There might be a high computational cost to calculating them numerically; 3. You cannot apply an extended Kalman filter to systems with a discontinuous model, since the system is not differentiable and the Jacobians wouldn’t exist; and 4. Linearization doesn’t provide a good approximation for highly nonlinear systems. In the last case, linearization becomes invalid, since the nonlinear function cannot be approximated well enough by a linear function, and the linear model no longer describes the system dynamics.

To address the issues with extended Kalman filters, you can instead use another estimation technique called the unscented Kalman filter (UKF). Did you know that the creator of the filter came up with this name after noticing the deodorant on his co-worker’s desk? Now back to the filter: instead of approximating a nonlinear function as an extended Kalman filter does, unscented Kalman filters approximate the probability distribution. What we mean by that is the following: This is the probability distribution. An unscented Kalman filter selects a minimal set of sample points such that their mean and covariance are the same as those of this distribution. These are referred to as sigma points, and they are symmetrically distributed around the mean. Each sigma point is then propagated through the nonlinear system model. The mean and covariance of the nonlinearly transformed points are calculated, and an empirical Gaussian distribution is computed, which is then used to calculate the new state estimate. Note that in the linear Kalman filter algorithm, the error covariance P is calculated using the state transition function in the prediction step, and then it is updated using the measurement. However, in the unscented Kalman filter, we don’t calculate it in the same way, because we get it empirically instead.
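
For reference, the commonly used sigma-point construction for an n-dimensional state x̂ with covariance P (λ is a scaling parameter; this is the usual textbook formulation, not something spelled out in the video):

$$
\chi_0 = \hat{x}, \qquad \chi_i = \hat{x} + \big(\sqrt{(n+\lambda)P}\big)_i, \qquad \chi_{n+i} = \hat{x} - \big(\sqrt{(n+\lambda)P}\big)_i, \quad i = 1,\dots,n
$$

Each $\chi_i$ is passed through f, and the weighted sample mean and covariance of the transformed points give the new state estimate and the empirical error covariance.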

Another nonlinear state estimator based on a very similar principle is the particle filter (PF). It also uses sample points, referred to as particles. The significant difference from an unscented Kalman filter is that a particle filter can approximate any arbitrary distribution, so it’s not limited to a Gaussian assumption. And to represent an arbitrary distribution that is not known explicitly, the number of particles that a particle filter needs is much larger than what you’d need for an unscented Kalman filter.

For comparison, here are the properties of the filters that we’ve discussed so far. A Kalman filter only works on linear systems. For state estimation of nonlinear systems, you can use an EKF, UKF, or PF. Note that for an EKF to precisely estimate states, it needs a good linearization of the nonlinear system model. Otherwise, the filter provides poor estimation. A particle filter is the only one that works for any arbitrary distribution. And we see that the computational cost grows as we move down the column. The particle filter is computationally the most expensive, since it requires a large number of particles to approximate an arbitrary distribution.

In this video, we discussed the basic concepts behind different nonlinear state estimators. Now, if you need to deal with nonlinearities such as the road friction in the car example, you know how to estimate the states of interest of your nonlinear system. For more information on EKFs, UKFs, and PFs, explore the resources in the description of this video.