When interpolating we start from reasonably exact tabulated values and
require that the interpolating function pass exactly through them. In
curve fitting we generally start with a table of experimental data in
which the values are imperfectly known, and so come with a confidence
interval, and we require that the fitting function come only reasonably
close to the data. The table of experimental data consists of three
columns, $(x_i, y_i, \sigma_i)$, with the last value specifying the
confidence range (commonly one standard deviation).
In cubic spline fitting we relax the requirement that the spline pass
exactly through the points and demand only that the spline $g(x)$ and its
first and second derivatives be continuous across the interior points.
To quantify the curvature we integrate the square of the second
derivative, giving
\[
C[g] \;=\; \int_{x_1}^{x_N} \left[g''(x)\right]^2 dx .
\]
These two demands are contradictory. Notice that to make the curvature
$C[g]$ zero, its smallest possible value, the spline would have to be a
straight line. But that would probably give a high value of the weighted
sum of squared deviations,
\[
\chi^2 \;=\; \sum_{i=1}^{N} \left[\frac{y_i - g(x_i)}{\sigma_i}\right]^2 .
\]
On the other hand, we can make $\chi^2$ equal to zero by having the
spline interpolate the points exactly, but that would give a high
curvature. Putting these two constraints together, we require that the
cubic spline minimize the combination
\[
C[g] + \lambda\,\chi^2 \;=\;
\int_{x_1}^{x_N} \left[g''(x)\right]^2 dx
+ \lambda \sum_{i=1}^{N} \left[\frac{y_i - g(x_i)}{\sigma_i}\right]^2 ,
\]
where $\lambda$ is an adjustable smoothing parameter.
Minimizing this combined objective subject to the continuity requirements again leads to a tridiagonal system that is easily solved for the coefficients of the cubics.
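As a concrete sketch (not the text's own code), SciPy's FITPACK-based `UnivariateSpline` carries out just this kind of weighted smoothing-spline fit. There the smoothing condition is expressed through a parameter `s` that bounds the weighted sum of squared residuals; if we weight each point by $1/\sigma_i$, that sum is exactly $\chi^2$, so `s = N` is a natural first guess. The data here are synthetic, invented for illustration:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)

# Synthetic "experimental" table: columns x_i, y_i, sigma_i
N = 50
x = np.linspace(0.0, 10.0, N)
sigma = 0.1 * np.ones(N)                  # one-standard-deviation errors
y = np.sin(x) + rng.normal(0.0, sigma)    # noisy measurements of sin(x)

# Weighting by 1/sigma_i makes FITPACK's bounded residual sum equal chi^2;
# s = N asks for chi^2 of about N, one unit per data point.
spl = UnivariateSpline(x, y, w=1.0 / sigma, s=float(N))

chi2 = float(np.sum(((y - spl(x)) / sigma) ** 2))
print(f"chi^2 = {chi2:.1f}  (target ~ N = {N})")
```

Note that `UnivariateSpline` also chooses its own knots, so it is a convenient stand-in rather than a literal transcription of the tridiagonal solve described above.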
Making $\lambda$ too small allows a high value of $\chi^2$. Making it
too large forces an unreasonably small value of $\chi^2$. Usually we
have to experiment with the choice. The best value of $\lambda$ results
in a $\chi^2$ in the range
\[
N - \sqrt{2N} \;\le\; \chi^2 \;\le\; N + \sqrt{2N} ,
\]
as expected for $N$ data points, since a $\chi^2$ distribution with $N$
degrees of freedom has mean $N$ and standard deviation $\sqrt{2N}$.
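The experimentation can be automated. The sketch below (again using SciPy's `UnivariateSpline` with $1/\sigma_i$ weights, on invented data) sweeps the smoothing knob and reports both $\chi^2$ and a numerical estimate of the curvature integral $\int [g''(x)]^2\,dx$, accepting only fits whose $\chi^2$ lands inside $N \pm \sqrt{2N}$:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)
N = 50
x = np.linspace(0.0, 10.0, N)
sigma = 0.1 * np.ones(N)
y = np.sin(x) + rng.normal(0.0, sigma)   # synthetic noisy data

# Acceptance band for chi^2: N +/- sqrt(2N)
lo, hi = N - np.sqrt(2 * N), N + np.sqrt(2 * N)

for s in [0.1 * N, 1.0 * N, 10.0 * N]:   # under-, well-, and over-smoothed
    spl = UnivariateSpline(x, y, w=1.0 / sigma, s=s)
    chi2 = float(np.sum(((y - spl(x)) / sigma) ** 2))
    # Numerical curvature integral of [g''(x)]^2 on a fine grid
    xf = np.linspace(x[0], x[-1], 2000)
    curvature = trapezoid(spl.derivative(2)(xf) ** 2, xf)
    verdict = "accept" if lo <= chi2 <= hi else "reject"
    print(f"s = {s:6.1f}: chi^2 = {chi2:8.2f}, "
          f"curvature = {curvature:10.2f}  [{verdict}]")
```

Small `s` drives $\chi^2$ down at the cost of a large curvature integral; large `s` does the opposite, which is the trade-off described above.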
We should consider ``smoothing'' to be a poor substitute for fitting data to a decent model function that has a fundamental basis. When we fit experimental data to a model function we are actually testing our understanding of nature. When we smooth by fitting an arbitrary cubic spline, we are merely parameterizing an observation without learning anything more fundamental.