The experimental data come in the form of a set of measurements, say of reactor temperature T, which has changed as a result of adjustments or random changes in, say, feed rate F. These come in pairs (T1, F1), (T2, F2), ..., (Ti, Fi), ..., (Tn, Fn).
They are to be fitted to an equation which gives T in terms of F and one or more parameters, e.g. a and b:
T = f(F, a, b)
The problem is to find a and b to give an optimal fit to the above equation for the measured data. One way of defining this optimum is in terms of the minimum square deviation between experimental and theoretical temperature points. The squared deviation for a single point will be:
[Ti - f(Fi, a, b)]²
We require to minimise the sum of these squared deviations over all n points.
The reason why this must be an optimisation problem rather than an equation solving one is that if we were to treat it as an equation solving problem there would be too many equations for the number of unknowns. There are two unknowns, a and b. However, for each pair of measurements we have effectively one equation, which we could write as:
T1 = f(F1, a, b)
for the first pair of measurements
T2 = f(F2, a, b)
for the second pair of measurements
... and so on.
Unless we have only two measurements, we have more equations than unknowns: what is called an overdetermined system of equations. Data fitting is thus a particular case of 'solving' such a set of equations, which involves finding the solution that most nearly satisfies all the equations.
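To make the idea concrete, here is a minimal Python sketch. The straight-line model T = a + b·F and the data values are assumptions for illustration only: each measurement pair contributes one residual, and the objective to be minimised is the sum of their squares.

```python
def f(F, a, b):
    # hypothetical straight-line model T = a + b*F (not from the text)
    return a + b * F

def sum_sq(params, data):
    # sum of squared residuals over all (F_i, T_i) pairs:
    # each pair is one equation T_i = f(F_i, a, b) that cannot all
    # be satisfied exactly when there are more pairs than parameters
    a, b = params
    return sum((T - f(F, a, b)) ** 2 for F, T in data)

# four made-up measurements: four equations, only two unknowns
data = [(0.0, 1.0), (1.0, 3.1), (2.0, 4.9), (3.0, 7.2)]

print(sum_sq((1.0, 2.0), data))  # residual sum for the guess a=1, b=2 (about 0.06)
```

Any optimiser can then search over (a, b) to drive this sum down; the least-squares fit is the pair that makes it smallest.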
Consider fitting data to the equation below to estimate the parameter a:

y = a x

We have a set of paired data points (xi, yi). Each gives an experimental value yi and, with the fitted equation, a corresponding calculated value a xi. The difference between experimental and calculated values, or residual, for each point is thus:

ri = yi - a xi
The aim of the fitting procedure is to minimise this overall by minimising the sum of the squares of these residuals over all data points. This is an optimisation problem: find a so as to minimise the objective function P(a):

P(a) = Σ [yi - a xi]²

Expanding:

P(a) = Σ yi² - 2a Σ xi yi + a² Σ xi²

Differentiating w.r.t. a:

P'(a) = -2 Σ xi yi + 2a Σ xi²

P' must be zero at the minimum, so we get:

a = Σ xi yi / Σ xi²
Consider the following data:
| x | y (measured) | x² | x·y |
|---|---|---|---|
| 0 | 0.5 | 0 | 0 |
| 1 | 1.5 | 1 | 1.5 |
| 2 | 4.5 | 4 | 9.0 |
| 3 | 5.5 | 9 | 16.5 |
| 4 | 8.5 | 16 | 34.0 |
| 5 | 9.5 | 25 | 47.5 |
| Σ | | 55 | 108.5 |
Hence:

a = 108.5 / 55 = 1.97 (to 3 s.f.)
In fact the data were generated from y = 2 x, with errors of ±0.5 introduced at alternate readings.
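The worked example above can be checked in a few lines of Python, computing the least-squares slope a = Σ(xi·yi) / Σ(xi²) for the tabulated data:

```python
# data from the table: x and measured y
xs = [0, 1, 2, 3, 4, 5]
ys = [0.5, 1.5, 4.5, 5.5, 8.5, 9.5]

sum_xy = sum(x * y for x, y in zip(xs, ys))  # column total 108.5
sum_xx = sum(x * x for x in xs)              # column total 55

# least-squares slope for the model y = a x (line through the origin)
a = sum_xy / sum_xx

print(a)  # about 1.97, close to the underlying slope of 2
```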
It is also possible to fit linear parameters directly using a general procedure, such as the Excel Solver. A link to a spreadsheet which does this is here.
In some cases it is possible to transform
the problem so that it is linear in its parameters. For example:
y = exp( a x )
is not linear in the parameter a. However, if
we take logs of both sides then we have:
ln y = a x
This is linear in a, which can be determined by fitting ln(y) as a function of x.
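As a sketch, the transformed fit can reuse the same one-parameter formula from the worked example, applied to ln y instead of y. The data here are synthetic, generated noise-free with an assumed value a = 0.5, purely to show that the transformation recovers the parameter:

```python
import math

a_true = 0.5  # assumed value used to generate the synthetic data
xs = [1.0, 2.0, 3.0, 4.0]
ys = [math.exp(a_true * x) for x in xs]  # exact y = exp(a x), no noise

# transform: ln y = a x, then fit a = sum(x * ln y) / sum(x^2)
lny = [math.log(y) for y in ys]
a = sum(x * v for x, v in zip(xs, lny)) / sum(x * x for x in xs)

print(a)  # recovers 0.5 on noise-free data
```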
Statisticians will warn you that this procedure is not precisely equivalent to fitting the nonlinear parameters directly, which is what must be done if no suitable transformation is available.
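A small illustration of this point, using made-up noisy readings and a crude grid search for the direct fit (an illustration only, not a recommended optimisation method): the two estimates of a come out close but not identical, because the transformed fit minimises squared errors in ln y while the direct fit minimises them in y.

```python
import math

xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.8, 2.6, 4.9, 7.1]  # hypothetical noisy readings of y = exp(a x)

# transformed fit: ln y = a x, so a = sum(x * ln y) / sum(x^2)
a_log = sum(x * math.log(y) for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# direct nonlinear fit: minimise sum of (y - exp(a x))^2 by brute force
def sse(a):
    return sum((y - math.exp(a * x)) ** 2 for x, y in zip(xs, ys))

a_direct = min((i / 10000 for i in range(3000, 7001)), key=sse)

print(round(a_log, 4), round(a_direct, 4))  # similar but not equal
```

The discrepancy grows with the noise level, and with how strongly the log transformation distorts the error distribution; hence the statisticians' warning.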