Introduction to numerical methods. Sources of error and error quantification.
Announcement: Next week's tutorial will be essentially MATLAB 101, so if you already know how to program you probably don't need to attend.
1 Introduction to numerical methods
Don't use numerical methods unless you need to! An analytical solution is always better, if possible.
For example, let's say you're given
$$\frac{dy}{dt} = -\alpha y$$
If we solve this analytically, we get $y = y_0 e^{-\alpha t}$, a solution valid for every value of $\alpha$ and every initial value $y_0$. If we attempt to solve it numerically, however, we would need to specify the value of $\alpha$ as well as the initial value, and we would need to discretise $t$, resulting in only an approximate solution.
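To see the contrast concretely, here's a minimal sketch in Python using the simplest possible scheme (forward Euler); the values of $\alpha$, the initial condition, and the step size are arbitrary choices for illustration:

```python
import math

# Forward Euler for dy/dt = -alpha * y. Unlike the analytical solution
# y0 * exp(-alpha * t), the numerical one requires concrete values for
# alpha, y0, and a discretisation step dt (all chosen arbitrarily here).
alpha, y0, dt, t_end = 2.0, 1.0, 0.01, 1.0

y, t = y0, 0.0
while t < t_end:
    y += dt * (-alpha * y)  # one Euler step: y_{n+1} = y_n + dt * f(y_n)
    t += dt

exact = y0 * math.exp(-alpha * t)
print(f"Euler: {y:.6f}  exact: {exact:.6f}  error: {abs(y - exact):.2e}")
```

Note how the numerical answer is tied to one specific $\alpha$ and $y_0$, and carries a discretisation error that shrinks as $dt$ does.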
However, if we can't get the solution analytically, then numerical methods are the next best thing. Before we attempt to find an approximate solution, however, we should learn as much as we can about it. Are there asymptotes, and if so, where? Does the function model a physical phenomenon that we know something about (which could give insight as to the upper and lower bounds, for instance)? Do we know any statistics (e.g., average value)?
One thing to keep in mind is that when you discretise $t$, you are, by definition, taking a continuous variable and sampling it at discrete intervals. A poor choice of interval can be misleading - if you have a sine wave and you choose to plot the values at $t = n\pi$ for $n \in \mathbb N$, the resulting plot will be a straight line along the $t$-axis. On the other hand, if you plot the values at $t = 0.01n$ for $n \in \mathbb N$, you'll get a pretty accurate picture, but the high resolution is superfluous, since you would have gotten just as good a picture at $t = 0.1n$ (with fewer computational resources). So there's necessarily a trade-off between precision and resources: if our resolution is too low, we lose sight of the structure of the function; too high, and we waste computing power (and thus money).
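As a quick illustration (a sketch; the interval $[0, 10]$ and the three step sizes are arbitrary choices), sampling $\sin t$ at $t = n\pi$ makes every sample (numerically) zero, while $t = 0.1n$ and $t = 0.01n$ both capture the wave, the latter at ten times the cost:

```python
import math

# Sample sin(t) on [0, 10] at three resolutions.
for dt, label in [(math.pi, "t = n*pi"), (0.1, "t = 0.1n"), (0.01, "t = 0.01n")]:
    samples = [math.sin(n * dt) for n in range(int(10 / dt) + 1)]
    print(f"{label:10s}  {len(samples):5d} samples  "
          f"max |sin| seen = {max(abs(s) for s in samples):.3f}")
```

The first line reports a maximum of (essentially) zero - the flat-line picture - while the last two report the same maximum of 1.000, with the finest grid doing ten times the work for no extra insight.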
2 Sources of error
- Errors in input data (due to finite precision of measurements)
- Rounding errors (due to the finite precision of floating-point numbers when stored in a computer - for example, to store $\pi$ in a computer, we'd only be able to store a finite number of digits; see the sketch after this list)
- Truncation errors (due to finiteness again - for example, if we use a Taylor series to approximate a function, we can only keep a finite number of terms, even though the exact value requires the infinite series)
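Here's a small sketch of the last two error sources; the five-term cutoff is an arbitrary choice for illustration:

```python
import math

# Rounding error: a double keeps only ~16 significant digits of pi.
print(f"{math.pi:.20f}")    # digits past roughly the 16th place are not pi's
print(0.1 + 0.2 == 0.3)     # False -- 0.1 and 0.2 aren't exact in binary

# Truncation error: e = sum(1/k!) for k = 0, 1, 2, ..., cut off at N terms.
N = 5
approx = sum(1 / math.factorial(k) for k in range(N))
print(f"{N} terms: {approx:.10f}  e: {math.e:.10f}  "
      f"truncation error: {abs(math.e - approx):.2e}")
```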
3 Error quantification
Suppose that $a$ is the exact value of something, and $\overline{a}$ is the numerical value that we have magically obtained (our estimate of $a$). The absolute error is given by $|\overline{a} - a|$, and the relative error is, of course, $|(\overline{a}-a)/a|$.
For example, if $a = \sqrt{2}$ and $\overline{a} = 1.414$, then the absolute error is (roughly) $2.1 \times 10^{-4}$. Rounding the error bound up, we can write $a = 1.414 \pm 2.2 \times 10^{-4}$, or, equivalently
$$1.41378 \leq a \leq 1.41422$$
For this course, both of the above are acceptable ways of stating the answer.
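In code, the $\sqrt 2$ example looks like this (treating the double-precision value of $\sqrt 2$ as "exact"):

```python
import math

a = math.sqrt(2)   # "exact" value (to machine precision)
a_bar = 1.414      # our estimate
print(f"absolute error: {abs(a_bar - a):.2e}")        # ~2.1e-04
print(f"relative error: {abs((a_bar - a) / a):.2e}")  # ~1.5e-04
```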
3.1 Error propagation
Given a certain error (uncertainty) on the input $x$, what is the expected error on the output?
Let $\Delta f$ be the error in the resulting output (after putting $x$ through a function), and let $\overline{x}$ be our approximation of $x$, warts and all. Then we have
$$\Delta f = \left|f(\overline{x}) - f(x)\right| \leq \max\left(\left|f(\overline x + \epsilon) - f(\overline x)\right|, \left|f(\overline x - \epsilon) - f(\overline x)\right|\right)$$
where $\epsilon$ is the estimated error (and since we don't know the sign we have to look at both $+\epsilon$ and $-\epsilon$). So $x = \overline x \pm \epsilon$.
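A direct translation of the bound into code might look like this (a sketch; the function $f(x) = x^2$, the point $\overline x = 2$, and $\epsilon = 0.01$ are arbitrary illustrative choices):

```python
def propagated_error(f, x_bar, eps):
    """Bound the output error by checking both +eps and -eps,
    per the max(...) formula above."""
    return max(abs(f(x_bar + eps) - f(x_bar)),
               abs(f(x_bar - eps) - f(x_bar)))

# Example: f(x) = x^2 near x_bar = 2 with input error eps = 0.01.
delta_f = propagated_error(lambda x: x ** 2, 2.0, 0.01)
print(f"estimated output error: {delta_f:.4f}")  # 0.0401 = 2*x_bar*eps + eps^2
```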
3.2 Mean value theorem
Now we come to the mean value theorem. It may not seem immediately relevant, but it ties directly into error propagation (and it will come up again later on). If $f$ is differentiable, then there is a point $\xi$ between $\overline x$ and $x$ such that $f(\overline x) - f(x) = f'(\xi)(\overline x - x)$, and so $\Delta f = \left|f(\overline x) - f(x)\right| = |f'(\xi)|\,|\overline x - x|$.
Proof: Suppose $f$ is continuous on $[a, b]$ and differentiable on $(a, b)$. We want to show there is a point where the derivative is equal to the slope of the straight line between $(a, f(a))$ and $(b, f(b))$. We define $g$ as follows:
$$g(x) = f(x) - \left [ \frac{f(b) - f(a)}{b-a} (x-a) + f(a) \right ]$$
Then this function $g$ has the property that $g(a) = g(b) = 0$! Now use Rolle's theorem: $g$ is continuous on $[a, b]$ and differentiable on $(a, b)$, and since $g(a) = g(b)$, there is at least one point between $a$ and $b$ where the slope is 0. That is, there is a point $x = c$ with $g'(c) = 0$, which means
$$0 = f'(c) - \frac{f(b) - f(a)}{b-a}$$
Thus we conclude that $f'(c)(b-a) = f(b) - f(a)$, which is exactly what we wanted to prove. If we now let $b = \overline x$ and $a = x$ (so that $\xi = c$), we recover the statement above: the output error $f(\overline x) - f(x)$ is the input error $\overline x - x$ scaled by the derivative at some intermediate point, which is precisely why $f'$ controls how errors propagate through $f$.
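To see the theorem at work numerically, here's a sketch comparing the true output error with the mean-value estimate, using $f'(\overline x)$ in place of the unknown $\xi$; the choice $f = \sin$, $x = 1$, $\epsilon = 10^{-3}$ is arbitrary:

```python
import math

f, f_prime = math.sin, math.cos
x, eps = 1.0, 1e-3
x_bar = x + eps

true_err = abs(f(x_bar) - f(x))
estimate = abs(f_prime(x_bar)) * abs(x_bar - x)
print(f"true error: {true_err:.6e}  MVT estimate: {estimate:.6e}")
# The two agree to about eps^2, since xi lies within eps of x_bar.
```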
3.3 Elementary operations and error propagation
Forget about the previous section for a moment. Let's talk about how various elementary operations affect uncertainty.
Addition/subtraction: add the absolute errors, so $|\Delta y| \leq |\Delta x_1| + |\Delta x_2|$ (the derivation uses the triangle inequality, but it's pretty obvious so I'm just going to omit it)
Multiplication: if $y = x_1x_2$, then $\overline y= \overline{x_1}\overline{x_2}$, and so
$$\begin{align} \Delta y & = \overline{x_1}\,\overline{x_2} - x_1x_2 \\ & = (x_1 \pm \Delta x_1)(x_2 \pm \Delta x_2) - x_1x_2 \\ & = \pm x_1\Delta x_2 \pm x_2\Delta x_1 \pm \Delta x_1\Delta x_2\end{align}$$
If the relative error on the inputs is small, with $\displaystyle \frac{|\Delta x_i |}{|x_i|} \ll 1$, then we can neglect the last term, and divide by $y$ to get the relative error in the output:
$$\frac{\Delta y}{y} = \pm \frac{\Delta x_2}{x_2} \pm \frac{\Delta x_1}{x_1}$$
Then, taking absolute values, the relative error is bounded by
$$\left | \frac{\Delta y}{y} \right | \leq \left | \frac{\Delta x_1}{x_1} \right | + \left | \frac{\Delta x_2}{x_2} \right |$$
The derivation for division is similar.
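Here's a quick numerical check of the multiplication bound (the inputs and their errors are arbitrary illustrative values):

```python
x1, dx1 = 3.0, 0.03   # 1% relative error
x2, dx2 = 5.0, 0.10   # 2% relative error

y = x1 * x2
y_bar = (x1 + dx1) * (x2 + dx2)  # worst case: both errors push the same way
rel_err = abs((y_bar - y) / y)
bound = abs(dx1 / x1) + abs(dx2 / x2)
print(f"relative error: {rel_err:.4f}  first-order bound: {bound:.4f}")
# 0.0302 vs 0.0300 -- the small excess is the neglected dx1*dx2 term.
```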