The experimental data comes in the form of a set of
*measurements*, say of reactor temperature
*T*, which has changed as a result of
adjustments or random changes in, say, feed rate *F*.
These come in pairs:

(*T*_{1}, *F*_{1}),
(*T*_{2}, *F*_{2}),
..., (*T*_{i}, *F*_{i}),
...,
(*T*_{n}, *F*_{n}).

They are to be fitted to an equation which will give
*T* in terms of *F* and one or more
*parameters*, e.g. *a* and *b*:

*T* = *f*(*F*, *a*, *b*)

The problem is to find *a* and *b* to give an optimal
fit to the above equation for the measured data. One way of defining this
optimum is in terms of the *minimum square deviation* between
experimental and theoretical temperature points. The
squared deviation for a single point is:

[*T*_{i} - *f*(*F*_{i}, *a*, *b*)]²

We need to minimise the sum of these squared deviations over all *n*
points.
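This minimisation can be sketched numerically. The sketch below assumes a hypothetical linear model *f*(*F*, *a*, *b*) = *a* + *b F* and made-up (*T*, *F*) pairs, and uses `scipy.optimize.minimize` to search for the parameters (the text does not prescribe any particular model or tool):

```python
from scipy.optimize import minimize

# Hypothetical model f(F, a, b) = a + b*F with made-up (T, F) pairs,
# used only to illustrate minimising the sum of squared deviations.
F = [1.0, 2.0, 3.0, 4.0]
T = [2.1, 3.9, 6.2, 7.8]

def f(Fi, a, b):
    return a + b * Fi

def sum_sq(params):
    # Sum over all n points of [T_i - f(F_i, a, b)]^2
    a, b = params
    return sum((Ti - f(Fi, a, b)) ** 2 for Ti, Fi in zip(T, F))

res = minimize(sum_sq, x0=[0.0, 1.0])  # start from a rough guess
a, b = res.x
```

For this linear model the minimiser converges to the ordinary least-squares values; for a genuinely nonlinear *f* the same objective function applies unchanged.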

The reason why this must be an optimisation problem rather than an
equation solving one is that if we were to treat it as an equation
solving problem there would be too many equations for the number
of unknowns.
There are **two** unknowns, *a* and *b*.
However, for each pair of measurements we have effectively one equation,
which we could write as:

*T*_{1} = *f*(*F*_{1}, *a*, *b*)
for the first pair of measurements,

*T*_{2} = *f*(*F*_{2}, *a*, *b*)
for the second pair of measurements,

... and so on.

Unless we had only two measurements, we have more equations than unknowns:
what is called an *overdetermined* system of equations.
Data fitting is thus a particular case of 'solving' such a set
of equations, which involves finding the solution which most nearly
satisfies all the equations.
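An overdetermined system of this kind can be written as a matrix equation and its least-squares solution found directly. The sketch below again assumes a hypothetical linear form *f*(*F*, *a*, *b*) = *a* + *b F* (the text leaves *f* general) and uses `numpy.linalg.lstsq`:

```python
import numpy as np

# The overdetermined system T_i = a + b*F_i, one equation per
# measurement, written as A p = T with p = (a, b).
F = np.array([1.0, 2.0, 3.0, 4.0])
T = np.array([2.1, 3.9, 6.2, 7.8])

A = np.column_stack([np.ones_like(F), F])   # one row per measurement
p, residuals, rank, sv = np.linalg.lstsq(A, T, rcond=None)
a, b = p   # the (a, b) that most nearly satisfies all four equations
```

With four equations and two unknowns no exact solution exists in general; `lstsq` returns the parameters minimising the sum of squared residuals.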

Consider fitting data to the
equation

*y* = *a x*

to estimate the parameter *a*.

We have a set of paired data points
(*y*_{i}, *x*_{i}).
These give an experimental value *y*_{i} and, with the
fitted equation, enable a corresponding *calculated value*
to be determined as *a x*_{i}.

The difference between experimental and calculated values,
or *residual*, for each point is thus:

*y*_{i} - *a x*_{i}

The aim of the fitting procedure is to minimise this overall by
minimising the sum of the squares of these residuals over all data points.
This is an optimisation problem to find *a* so as to minimise the objective
function *P*(*a*):

*P*(*a*) = Σ_{i} [*y*_{i} - *a x*_{i}]²

Expanding:

*P*(*a*) = Σ *y*_{i}² - 2 *a* Σ *x*_{i} *y*_{i} + *a*² Σ *x*_{i}²

Differentiating w.r.t. *a*:

*P*'(*a*) = -2 Σ *x*_{i} *y*_{i} + 2 *a* Σ *x*_{i}²

*P*' must be zero at the minimum, so we get:

*a* = Σ *x*_{i} *y*_{i} / Σ *x*_{i}²
Consider the following data:

| *x* | *y* (measured) | *x*² | *x y* |
|-----|----------------|------|-------|
| 0   | 0.5            | 0    | 0     |
| 1   | 1.5            | 1    | 1.5   |
| 2   | 4.5            | 4    | 9.0   |
| 3   | 5.5            | 9    | 16.5  |
| 4   | 8.5            | 16   | 34.0  |
| 5   | 9.5            | 25   | 47.5  |
| Sum |                | 55   | 108.5 |

Hence:

*a* = 108.5 / 55 = 1.97

In fact the data were generated from *y* = 2 *x* with errors of
±0.5 introduced at alternate readings.
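The worked example can be checked in a few lines of code, using the table's data and the formula *a* = Σ *x*_{i} *y*_{i} / Σ *x*_{i}² derived above:

```python
# Least-squares estimate of a for y = a*x, using the data from the
# worked example's table.
x = [0, 1, 2, 3, 4, 5]
y = [0.5, 1.5, 4.5, 5.5, 8.5, 9.5]

sum_x2 = sum(xi * xi for xi in x)               # 55
sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # 108.5

a = sum_xy / sum_x2
print(a)  # ~1.97, close to the true slope of 2
```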

It is also possible to fit linear parameters directly using a general optimisation procedure, such as the one available in the Excel Solver. A link to a spreadsheet which does this is here.

In some cases it is possible to transform
the problem so that it is linear in its parameters. For example:

*y* = exp(*a x*)

is not linear in the parameter *a*. However, if
we take logs of both sides then we have:

ln *y* = *a x*

This is linear in *a*, which can be determined by fitting
ln(*y*) as a function of *x*.
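The log-transform fit can be sketched as follows. The data here are generated from *y* = exp(0.5 *x*), an illustrative choice, and the linear formula *a* = Σ *x*_{i} ln *y*_{i} / Σ *x*_{i}² (the no-intercept fit from the earlier derivation, applied to ln *y*) recovers the exponent:

```python
import math

# Fit y = exp(a*x) via the log transform: ln(y) = a*x is linear in a,
# so a = sum(x * ln y) / sum(x^2), as in the earlier derivation.
# Data generated from y = exp(0.5*x) (an illustrative choice).
x = [0.5, 1.0, 1.5, 2.0]
y = [math.exp(0.5 * xi) for xi in x]

ln_y = [math.log(yi) for yi in y]
a = sum(xi * li for xi, li in zip(x, ln_y)) / sum(xi * xi for xi in x)
print(a)  # recovers 0.5 exactly for noise-free data
```

With noisy data the transformed fit weights the errors differently from a direct nonlinear fit, which is the statisticians' caveat noted below.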

Statisticians will warn you that this procedure is not precisely equivalent to fitting the nonlinear parameters directly, which is what must be done if no suitable transformation is available.
