# Linear Calibration Curve

Leo de Galan, Independent consultant, Netherlands

**Abstract** Quantitative analysis is based on least squares calibration curves. Calculation of a least squares curve assumes that the standard deviation is constant over the calibration range and that the uncertainty of the response is greater than the uncertainty in the reference concentrations. The uncertainty in calculated concentrations stems both from uncertainty in the measured response for the unknown and from the uncertainty inherent in the construction of the calibration curve. If a method's calibration curve is linear and stable, recalibration can be simplified greatly. Note that correlation coefficients do not give sufficient information to characterize a calibration curve.

**Keywords** Least squares, Calibration curve, Uncertainty, Slope, Intercept, Standard deviation, Confidence interval, Correlation coefficient (R_{c}), Linearity, Random error, Linear dynamic range

**Level** Advanced

**The significance of a linear calibration curve**

Like almost all instrumental methods of analysis, chromatography is not an *absolute* but a *relative* technique. The response for a sample component must be compared with the response for a suitable reference, i.e. a substance that closely resembles the sample and contains a known amount of the analyte. Usually, the comparison is based on a *calibration curve* extending over a concentration range corresponding to the variation of the analyte content in the samples.

## Least squares calibration curve

**The least squares linear calibration curve**

Most chromatographic detectors (as indeed most other branches of analytical chemistry) show a linear response over a certain range. This is a desirable property, because it requires fewer data points for the construction and subsequent use of the calibration curve. It is also more tolerant of a slight extrapolation beyond the calibrated range. In the common calculation of the *least squares* calibration curve (linear or any other polynomial), the curve is positioned such that the sum of squares of the *vertical distances* (parallel to the response axis) to the curve is minimized.

Least squares straight line: *y* = *G*·*c* + *d*

It should be realized that this calculation rests on two assumptions:

- The uncertainty of the *independent variable* (in chemical analysis usually the concentrations of a series of references) is very much smaller than the uncertainty in the *dependent variable* (usually the measured responses). In medical and biological applications (where, e.g., the response of a test group is plotted against the response of a control group) this assumption is often not true; the least squares calculation must then be based on the distances of the data points measured *perpendicular* to the curve, which is also known as *orthogonal regression*.
- The standard deviation in the measured response is constant over the range of the calibration curve. For calibrations extending over more than one or two decades, this assumption is generally not true. Such a situation can be accommodated by entering into the calculation response data *weighted* by the inverse of the standard deviation or, to a good approximation, by the inverse of the concentration.

It might be a good idea to check the calibration software of your instrument on this point.
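As an illustration of the weighting described above, here is a minimal Python sketch of a least squares straight-line fit with optional per-point weights (all function names and data are illustrative, not from the article; 1/c weights approximate weighting by the inverse standard deviation when *s*_{y} grows with concentration):

```python
import numpy as np

def fit_calibration(c, y, weights=None):
    """Least squares straight line y = G*c + d.

    weights: optional per-point weights; pass e.g. 1/c when the standard
    deviation of the response is not constant over the calibration range.
    """
    c, y = np.asarray(c, float), np.asarray(y, float)
    w = np.ones_like(c) if weights is None else np.asarray(weights, float)
    # Weighted means of concentration and response
    c_bar = np.average(c, weights=w)
    y_bar = np.average(y, weights=w)
    # Slope G and intercept d from the weighted normal equations
    G = np.sum(w * (c - c_bar) * (y - y_bar)) / np.sum(w * (c - c_bar) ** 2)
    d = y_bar - G * c_bar
    return G, d

# References from a perfectly linear detector: y = 2*c + 0.5 (invented data)
conc = [1.0, 2.0, 4.0, 8.0]
resp = [2.5, 4.5, 8.5, 16.5]
G, d = fit_calibration(conc, resp)
print(G, d)   # slope 2.0, intercept 0.5
```

With exactly linear data the weighted and unweighted fits coincide; with real, heteroscedastic data the 1/c weights keep the low-concentration points from being swamped by the large absolute residuals at the top of the range.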

## Calibration curve and analysis error

**The contribution of the calibration curve to the total uncertainty**

Once the calibration curve has been constructed, the concentration of an unknown sample can easily be derived from its response: *c*_{sample} = (*y*_{sample} − *d*)/*G*

**Random error of sample measurement** The uncertainty in the derived sample concentration stems from two sources:

1. The random error associated with the measurement of *y*_{sample}, characterized by the standard deviation *s*_{y}, leads to a corresponding error *s*_{c} in the sample concentration.

2. The uncertainty in the position of the least squares calibration curve yields another contribution, since the slope, *G*, and the intercept, *d*, are also the result of experimental data that are subject to a measurement uncertainty with the same standard deviation, *s*_{y}. In the figure below the green hyperbolas illustrate the 95% confidence interval of the straight line (in red).

Random error of calibration curve

When the curve is recalibrated, it should fall within the green borderlines in 95% of the cases. Apparently, the calibration curve is more narrowly defined (i.e. more precise) in the central part than at either extreme.

Together the two causes of uncertainty lead to the following combined expression for the uncertainty in *c*_{sample}:

*s*_{c} = (*s*_{y}/*G*) · √[ 1/*K* + 1/*N* + (*y*_{sample} − *ȳ*)² / (*G*² Σ(*c*_{i} − *c̄*)²) ]

K = number of measurements of *y*_{sample}
N = number of data points for the calibration curve
y_{i} = response for reference i; ȳ and c̄ are the means of the reference responses and concentrations

**Total random error** The first term under the square root is the familiar 'averaging out' with more sample measurements (*K* > 1). The two other terms result from the uncertainty in the calibration curve and their influence clearly diminishes when more reference samples are used. This is illustrated in the next figure for the situation that *K* = 1 and *N* = 10.

For *K* = 1 and *N *>> 10 this expression simplifies to:

*s*_{c} = *s*_{y}/*G* or *s*_{c}/*c* = *s*_{y}/*y*

**Accurate calibration curve** When the calibration curve is accurately determined from many reference data, the relative uncertainty in the sample response leads to an equal relative uncertainty in the sample concentration. This is also known as the *coefficient of variation*.
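The combined expression for the total random error can be sketched in code; the following is a minimal Python illustration (the function name and the example data are assumptions, not from the article):

```python
import math

def concentration_with_uncertainty(y_sample, G, d, s_y, c_refs, y_refs, K=1):
    """c_sample = (y_sample - d)/G plus its standard deviation s_c.

    s_c = (s_y/G) * sqrt(1/K + 1/N
                         + (y_sample - y_bar)**2 / (G**2 * sum((c_i - c_bar)**2)))
    K = replicate measurements of the sample, N = calibration points.
    """
    N = len(c_refs)
    c_bar = sum(c_refs) / N
    y_bar = sum(y_refs) / N
    s_cc = sum((c - c_bar) ** 2 for c in c_refs)   # spread of the references
    c_sample = (y_sample - d) / G
    s_c = (s_y / G) * math.sqrt(
        1.0 / K + 1.0 / N + (y_sample - y_bar) ** 2 / (G ** 2 * s_cc))
    return c_sample, s_c

# Five references on y = 2*c + 0.5; sample response at the centre of the curve
c_refs = [1, 2, 3, 4, 5]
y_refs = [2.5, 4.5, 6.5, 8.5, 10.5]
c_s, s_c = concentration_with_uncertainty(6.5, G=2.0, d=0.5, s_y=0.1,
                                          c_refs=c_refs, y_refs=y_refs)
print(c_s, s_c)   # s_c is smallest here, at the centre of the calibration
```

Evaluating the same function for a sample response near either end of the range gives a larger *s*_{c}, reproducing the hyperbolic widening of the confidence interval described above.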

## Verifying linearity

**Recommended practice in the use of calibration curves**

- Always use references that resemble the daily samples as closely as possible to minimize systematic deviations.
- In the initial set-up, measure 10 references evenly spread over the concentration range *in duplicate*, then calculate the least squares line. In the four figures below this is always the straight line in red.
- Plot the data to verify linearity, considering that:

  • when all duplicates fall on either side of the line, the response is indeed nicely linear, as in the upper left figure below;

  • when the duplicates at the extreme ends fall below or above the line, there is a strong indication of nonlinearity, as in the other three figures.

- Convex calibration curves (upper right figure) are quite common. Obviously, the curve may still be linear over a more restricted range. The upper end of linearity is reached when the reference response falls 5% below the straight line.
- Once linearity is confirmed, *recalibration* is simplified to the measurement of a blank and a single reference at the extreme end of the calibration range. If, moreover, the calibration curve passes through the origin (i.e. the intercept is zero), a single high reference suffices.
- The frequency of recalibration depends on the stability of the analytical procedure, which can be read from a control chart.
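The 5% criterion for the upper end of linearity can be sketched as follows (a hypothetical helper, not from the article; the data are invented to mimic a convex, flattening detector response):

```python
def upper_end_linear(c_refs, y_refs, G, d, tolerance=0.05):
    """Highest reference concentration still within the linear range,
    i.e. whose measured response does not fall more than `tolerance`
    (5% by default) below the fitted straight line y = G*c + d."""
    linear = [c for c, y in zip(c_refs, y_refs)
              if y >= (1.0 - tolerance) * (G * c + d)]
    return max(linear) if linear else None

# Convex response: the top reference reads ~8% low (linear would give 16.0)
conc = [1, 2, 4, 8]
resp = [2.0, 4.0, 8.0, 14.7]
print(upper_end_linear(conc, resp, G=2.0, d=0.0))   # -> 4
```

In practice the check would be run on the duplicate measurements from the initial 10-point calibration, restricting routine work to the range the helper returns.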

Detecting nonlinearity

## The correlation coefficient

**Can (non)linearity be judged from the correlation coefficient?**

The correlation coefficient, R_{c}, of a perfectly straight line is equal to one. In the upper left figure above this value is closely approximated: R_{c} = 0.995. The numbers entered in the other three figures demonstrate that curvature does lead to values less than one, albeit only slightly. Unfortunately, the indication is far from unequivocal. A wider variation of data points around a truly linear response will also reduce the correlation coefficient to values below one, as illustrated below. Therefore, the correlation coefficient is an inappropriate judge of linearity.

Straight lines with different correlation coefficients

Judging from the correlation coefficient alone, one might be tempted to conclude that the calibration curve in the right-hand figure is nonlinear. Actually it is perfectly linear, but the underlying measurements are rather imprecise, and this is reflected in a lower value of the correlation coefficient.
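This limitation is easy to demonstrate numerically. The sketch below (with invented data, not the article's figures) compares R_{c} for noisy-but-linear data against slightly curved-but-precise data:

```python
import math

def correlation_coefficient(x, y):
    """Pearson correlation coefficient R_c of paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

conc = [1, 2, 3, 4, 5]
# Truly linear but imprecise: fixed scatter around y = 2*c
noisy_linear = [2.3, 3.6, 6.4, 7.7, 10.2]
# Slightly convex but precise: y = 2*c - 0.05*c**2
curved = [1.95, 3.8, 5.55, 7.2, 8.75]
print(correlation_coefficient(conc, noisy_linear))  # below 1
print(correlation_coefficient(conc, curved))        # even closer to 1
```

The imprecise-but-linear set here scores a *lower* R_{c} than the genuinely curved set, which is exactly why the correlation coefficient cannot be used to judge linearity.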