# Section 5.2.1: Data Fitting as Optimisation

Consider the problem of fitting an equation to a set of experimental data points.

The experimental data come in the form of a set of measurements, say of reactor temperature T, which has changed as a result of adjustments to, or random changes in, say, the feed rate F. These come in pairs (T1, F1), (T2, F2), ... (Ti, Fi), ... (Tn, Fn).

They are to be fitted to an equation which will give T in terms of F and one or more parameters e.g. a and b:

T = f(F, a, b)

The problem is to find a and b to give an optimal fit to the above equation for the measured data. One way of defining this optimum is in terms of the minimum square deviation between experimental and theoretical temperature points. The squared deviation for a single point will be:

[Ti - f(Fi, a, b)]²

We need to minimise the sum of these squared deviations over all n points.
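Written out in full, the quantity to be minimised is the following. The symbol S is introduced here only as a label for the sum and is not part of the original notation:

$$ S(a, b) = \sum_{i=1}^{n} \left[ T_i - f(F_i, a, b) \right]^2 $$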

This must be treated as an optimisation problem rather than an equation-solving one because, as an equation-solving problem, it would have too many equations for the number of unknowns. There are two unknowns, a and b. However, each pair of measurements effectively gives one equation, which we could write as:

T1 = f(F1, a, b) for the first pair of measurements
T2 = f(F2, a, b) for the second pair of measurements
... and so on.

Unless we have only two measurements, we have more equations than unknowns, which is called an overdetermined system of equations. Data fitting is thus a particular case of 'solving' such a set of equations, which involves finding the solution that most nearly satisfies all the equations.
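A small numerical sketch may make the point concrete. It assumes, purely for illustration, a straight-line model T = a + b F and three invented measurements; neither the model form nor the numbers come from a real experiment. Solving the first two equations exactly fixes a and b, and the third equation is then not satisfied.

```python
# Hypothetical (F, T) measurements, invented for illustration only.
data = [(1.0, 52.0), (2.0, 61.0), (3.0, 68.0)]


def solve_two_points(p, q):
    """Solve T = a + b*F exactly for the two points p and q."""
    (f1, t1), (f2, t2) = p, q
    b = (t2 - t1) / (f2 - f1)
    a = t1 - b * f1
    return a, b


# Two equations, two unknowns: an exact solution exists.
a, b = solve_two_points(data[0], data[1])

# The third equation is left over and is generally not satisfied.
f3, t3 = data[2]
leftover = t3 - (a + b * f3)

print(f"a = {a}, b = {b}, residual at third point = {leftover}")
```

With these numbers the line through the first two points misses the third by 2 degrees, so no single pair (a, b) satisfies all three equations at once.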

## Linear Parameter Estimation

In the particular case where the model equation, e.g. T = f(F, a, b), is linear in the unknown parameters, here a and b, the problem can be reduced to solving a set of linear equations.
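As a sketch of what this reduction looks like, assume the model is the straight line T = a + b F (an assumed form, chosen only because it is the simplest model linear in both parameters). Setting the derivatives of the sum of squares with respect to a and b to zero gives two linear equations, often called the normal equations, which can be solved directly. The function name and the data below are hypothetical.

```python
def fit_straight_line(points):
    """Least-squares a and b for T = a + b*F from a list of (F, T) pairs."""
    n = len(points)
    s_f = sum(f for f, _ in points)
    s_t = sum(t for _, t in points)
    s_ff = sum(f * f for f, _ in points)
    s_ft = sum(f * t for f, t in points)

    # Normal equations (two linear equations in a and b):
    #   n   * a + s_f  * b = s_t
    #   s_f * a + s_ff * b = s_ft
    det = n * s_ff - s_f * s_f
    if det == 0:
        raise ValueError("need at least two distinct F values")

    # Solve the 2x2 system by Cramer's rule.
    a = (s_t * s_ff - s_f * s_ft) / det
    b = (n * s_ft - s_f * s_t) / det
    return a, b


# Hypothetical (F, T) measurements, invented for illustration only.
a, b = fit_straight_line([(1.0, 52.0), (2.0, 61.0), (3.0, 68.0)])
print(f"a = {a:.4f}, b = {b:.4f}")
```

For models with more parameters the same idea gives a larger linear system, for which a matrix solver would normally be used instead of Cramer's rule.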

##### Example

Consider fitting data to the equation below to estimate the parameter a.

y = a x

We have a set of paired data points (yi, xi). Each gives an experimental value yi and, with the fitted equation, allows a corresponding calculated value a xi to be determined.

The difference between the experimental and calculated values, the residual, for each point is thus:

yi - a xi

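The aim of the fitting procedure is to make these residuals small overall by minimising the sum of their squares over all data points. This is an optimisation problem: find a so as to minimise the objective function P(a):

$$ P(a) = \sum_i (y_i - a x_i)^2 $$

Expanding:

$$ P(a) = \sum_i y_i^2 - 2a \sum_i x_i y_i + a^2 \sum_i x_i^2 $$

Differentiating w.r.t. a:

$$ P'(a) = -2 \sum_i x_i y_i + 2a \sum_i x_i^2 $$

P' must be zero at the minimum, so we get:

$$ a = \frac{\sum_i x_i y_i}{\sum_i x_i^2} $$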

Consider the following data:

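| x | y (measured) | x² | xy |
|---|---|---|---|
| 0 | 0.5 | 0 | 0 |
| 1 | 1.5 | 1 | 1.5 |
| 2 | 4.5 | 4 | 9.0 |
| 3 | 5.5 | 9 | 16.5 |
| 4 | 8.5 | 16 | 34.0 |
| 5 | 9.5 | 25 | 47.5 |
| Sum | | 55 | 108.5 |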

Hence:

$$ a = \frac{\sum_i x_i y_i}{\sum_i x_i^2} = \frac{108.5}{55} \approx 1.973 $$

In fact the data were generated from y = 2 x with errors of +0.5 and -0.5 introduced at alternate readings.
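The arithmetic can be checked with a few lines of Python. The sketch below uses only the tabulated x and y values and the closed-form estimate for the one-parameter model y = a x, namely the sum of xy divided by the sum of x²; the variable names are illustrative.

```python
# Data from the table above.
x = [0, 1, 2, 3, 4, 5]
y = [0.5, 1.5, 4.5, 5.5, 8.5, 9.5]

# Column totals used in the closed-form estimate.
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Least-squares estimate of a for the model y = a*x.
a = sum_xy / sum_xx

print(sum_xx, sum_xy, round(a, 4))  # prints: 55 108.5 1.9727
```

The estimate is close to, but not exactly, the value 2 used to generate the data, because the fit absorbs the errors in the measurements.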

It is also possible to fit linear parameters directly using a general procedure, such as is available in the Excel solver. A link to a spreadsheet which does this is here.
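To illustrate what "a general procedure" means here, the sketch below minimises P(a) numerically for the same data, without using the closed-form result. It uses a golden-section search, which is a stand-in chosen for simplicity and is not claimed to be the method the Excel solver uses. The search bracket of 0 to 5 is an assumption, and the method relies on P(a) having a single minimum inside it.

```python
import math

# Data from the worked example (model y = a*x).
x = [0, 1, 2, 3, 4, 5]
y = [0.5, 1.5, 4.5, 5.5, 8.5, 9.5]


def P(a):
    """Objective function: sum of squared residuals for y = a*x."""
    return sum((yi - a * xi) ** 2 for xi, yi in zip(x, y))


def golden_section_minimise(func, lo, hi, tol=1e-7):
    """Minimise a single-minimum function of one variable on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2  # about 0.618
    while hi - lo > tol:
        # Two interior trial points, placed by the golden ratio.
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if func(c) < func(d):
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    return (lo + hi) / 2


a_numeric = golden_section_minimise(P, 0.0, 5.0)
print(f"a = {a_numeric:.4f}")
```

The numerical search should agree with the value 108.5 / 55 obtained analytically, to within the search tolerance. The advantage of this route is that it needs only the ability to evaluate P(a), so the same code works when no closed-form solution exists.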

## Nonlinear Parameter Estimation

There are two approaches.

In some cases it is possible to transform the problem so that it is linear in its parameters. For example:

y = exp(a x)

is not linear in the parameter a. However, if we take logs of both sides then we have:

ln y = a x

This is linear in a which can be determined by fitting ln(y) as a function of x.
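A minimal sketch of this transformation follows. It fits ln(y) = a x by the same one-parameter least-squares formula used for y = a x, with ln(y) in place of y. The function name and the data values are hypothetical, and all y values must be positive for the logarithm to exist.

```python
import math


def fit_exponent_by_logs(xs, ys):
    """Estimate a in y = exp(a*x) by least squares on ln(y) = a*x."""
    if any(yi <= 0 for yi in ys):
        raise ValueError("all y values must be positive to take logs")
    # Same formula as for y = a*x, applied to the pairs (x, ln y).
    sum_xz = sum(xi * math.log(yi) for xi, yi in zip(xs, ys))
    sum_xx = sum(xi * xi for xi in xs)
    return sum_xz / sum_xx


# Hypothetical measurements, invented for illustration only.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.5, 2.0, 2.8, 4.2]

a_log = fit_exponent_by_logs(xs, ys)
print(f"a from log-transformed fit = {a_log:.4f}")
```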

Statisticians will warn you that this procedure is not precisely equivalent to fitting the nonlinear parameters directly, which is what must be done if no suitable transformation is available.
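The difference can be seen numerically. The sketch below fits y = exp(a x) to a small set of hypothetical data in both ways: through the log transformation, and directly by minimising the sum of squared residuals in y. The direct fit uses a Gauss-Newton iteration, which is one common choice for nonlinear least squares and is an assumption here, not a method named in these notes. The log-transformed estimate serves as the starting guess.

```python
import math

# Hypothetical measurements, invented for illustration only.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [1.5, 2.0, 2.8, 4.2]


def sum_sq(a):
    """Sum of squared residuals in y for the model y = exp(a*x)."""
    return sum((yi - math.exp(a * xi)) ** 2 for xi, yi in zip(xs, ys))


# Estimate from the log-transformed problem ln(y) = a*x.
a_log = sum(xi * math.log(yi) for xi, yi in zip(xs, ys)) / sum(
    xi * xi for xi in xs
)


def gauss_newton(a, max_iter=50, tol=1e-12):
    """Refine a by Gauss-Newton steps on the untransformed residuals."""
    for _ in range(max_iter):
        resid = [yi - math.exp(a * xi) for xi, yi in zip(xs, ys)]
        # Derivative of the model exp(a*x) with respect to a.
        jac = [xi * math.exp(a * xi) for xi in xs]
        step = sum(j * r for j, r in zip(jac, resid)) / sum(j * j for j in jac)
        a += step
        if abs(step) < tol:
            break
    return a


a_direct = gauss_newton(a_log)

print(f"log-transformed fit: a = {a_log:.4f}, sum of squares = {sum_sq(a_log):.5f}")
print(f"direct fit:          a = {a_direct:.4f}, sum of squares = {sum_sq(a_direct):.5f}")
```

The two estimates differ slightly because taking logs changes the relative weight given to each point: the transformed fit minimises errors in ln(y), whereas the direct fit minimises errors in y itself.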