Least Squares Curve Fitting¶
This notebook demonstrates polynomial curve fitting via least squares,
highlighting the symbolic-numeric bridge: we build design matrices
numerically with ndarray operations, solve the least-squares problem with
np_lstsq (wrapping LAPACK's SVD), and then reconstruct the fitted polynomial
as a symbolic Maxima expression for plotting and analysis.
Key functions: np_linspace, np_lstsq, np_hstack, np_reshape, np_matmul, np_pow.
load("numerics")$
load("ax-plots")$
Generating Noisy Data¶
We start from a known quadratic function $f(x) = 0.5 x^2 - 2x + 1$ and add Gaussian noise. This gives us a ground truth to compare against.
/* True function: f(x) = 0.5*x^2 - 2*x + 1 */
n : 20$
x : np_linspace(0, 5, n)$
/* Evaluate f at each x and add noise */
y : np_add(
np_add(
np_scale(0.5, np_pow(x, 2)),
np_scale(-2.0, x)
),
np_add(
np_ones([n]),
np_scale(0.3, np_randn([n]))
)
)$
print("x (first 5):", np_to_list(np_slice(x, [0, 5])))$
print("y (first 5):", np_to_list(np_slice(y, [0, 5])))$
x (first 5):
[0.0,0.2631578947368421,0.5263157894736842,0.7894736842105263,
1.0526315789473684]
y (first 5):
[1.144367961890701,0.3465674007798698,-0.1013859777412156,
-0.2302246196332074,-0.6000505351780027]
/* Plot the noisy data */
ax_draw2d(
color=red, marker_size=6,
name="Noisy data",
points(np_to_list(x), np_to_list(y)),
color=black, dash="dash", name="True: 0.5x^2 - 2x + 1",
explicit(0.5*t^2 - 2*t + 1, t, 0, 5),
title="Noisy Observations from a Quadratic",
xlabel="x", ylabel="y",
grid=true
)$
The Vandermonde Matrix [S+N]¶
To fit a degree-$d$ polynomial $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_d x^d$ to data $(x_i, y_i)$, we form the Vandermonde matrix (or design matrix) $V$ with entries $V_{ij} = x_i^{\,j}$:
$$ V = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^d \\ 1 & x_2 & x_2^2 & \cdots & x_2^d \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_m & x_m^2 & \cdots & x_m^d \end{pmatrix} $$
The least-squares fit then solves $V \mathbf{a} \approx \mathbf{y}$ for the coefficient vector $\mathbf{a} = (a_0, a_1, \ldots, a_d)^T$.
We can inspect the symbolic structure first, then build it numerically.
/* Symbolic Vandermonde structure for degree 2 */
print("Symbolic Vandermonde (degree 2) for points x1..x4:")$
genmatrix(lambda([i, j], x[i]^(j-1)), 4, 3);
Symbolic Vandermonde (degree 2) for points x1..x4:
/* Build the Vandermonde matrix numerically for degree 2 */
/* Each column is x^j for j = 0, 1, 2 */
vander(xv, deg) := block([cols],
cols : np_reshape(np_ones([np_size(xv)]), [np_size(xv), 1]),
for j : 1 thru deg do
cols : np_hstack(cols, np_reshape(np_pow(xv, j), [np_size(xv), 1])),
cols
)$
V : vander(x, 2)$
print("Vandermonde shape:", np_shape(V))$
print("First 3 rows:")$
for i : 0 thru 2 do
print(" ", np_to_list(np_row(V, i)))$
Vandermonde shape: [20,3]
First 3 rows:
  [1.0,0.0,0.0]
  [1.0,0.2631578947368421,0.06925207756232686]
  [1.0,0.5263157894736842,0.27700831024930744]
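For comparison, NumPy builds the same design matrix in a single call. This is an illustrative sketch in plain NumPy, not the notebook's np_* API (the correspondence is suggested by the naming but assumed here):

```python
import numpy as np

# np.vander with increasing=True puts the constant column first,
# so V[i, j] = x_i**j, matching the convention above.
x = np.linspace(0, 5, 20)
V = np.vander(x, N=3, increasing=True)
print(V.shape)   # (20, 3)
print(V[:3])     # matches the first three rows printed above
```
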
Solving with np_lstsq¶
The function np_lstsq(A, b) minimises $\|A x - b\|_2$ using SVD.
It returns [x, residuals, rank, singular_values].
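The interface parallels NumPy's np.linalg.lstsq, which returns the same four values. A minimal NumPy sketch of the same solve, using noise-free quadratic data so the recovered coefficients are exact (illustrative only; not the notebook's np_* API):

```python
import numpy as np

# SVD-based least squares: minimise ||V a - y||_2.
x = np.linspace(0, 5, 20)
y = 0.5 * x**2 - 2.0 * x + 1.0           # noise-free quadratic
V = np.vander(x, N=3, increasing=True)   # columns: 1, x, x^2

coeffs, residuals, rank, sv = np.linalg.lstsq(V, y, rcond=None)
print(coeffs)   # recovers [1.0, -2.0, 0.5] up to rounding
print(rank)     # 3
```
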
/* Solve the least-squares problem V * a = y */
y_col : np_reshape(y, [n, 1])$
[coeffs_nd, residuals, rank_V, sv] : np_lstsq(V, y_col)$
coeffs_list : np_to_list(coeffs_nd)$
a0 : coeffs_list[1]$
a1 : coeffs_list[2]$
a2 : coeffs_list[3]$
print("Fitted coefficients:")$
print(" a0 =", a0, " (true: 1.0)")$
print(" a1 =", a1, " (true: -2.0)")$
print(" a2 =", a2, " (true: 0.5)")$
print(" rank =", rank_V)$
/* Reconstruct the fitted polynomial symbolically */
p_fit : a0 + a1*t + a2*t^2;
print("Fitted polynomial: p(t) =", p_fit)$
Fitted coefficients:
a0 = 1.0837053490879573 (true: 1.0)
a1 = -2.1517272135665984 (true: -2.0)
a2 = 0.5273594310358118 (true: 0.5)
rank = 3
0.5273594310358118*t^2-2.1517272135665984*t+1.0837053490879573
Fitted polynomial: p(t) = 0.5273594310358118*t^2-2.1517272135665984*t+1.0837053490879573
Plotting the Fit¶
We overlay the fitted curve (a symbolic expression) onto the noisy data.
ax_draw2d(
color=red, marker_size=6,
name="Data",
points(np_to_list(x), np_to_list(y)),
color=blue, line_width=2, name="Least-squares fit (deg 2)",
explicit(p_fit, t, 0, 5),
color=black, dash="dash", name="True curve",
explicit(0.5*t^2 - 2*t + 1, t, 0, 5),
title="Quadratic Least-Squares Fit",
xlabel="x", ylabel="y",
grid=true
)$
Residual Analysis¶
The residual vector $r = y - V\hat{a}$ shows the unexplained variation. We compute the coefficient of determination $R^2$:
$$R^2 = 1 - \frac{\sum r_i^2}{\sum (y_i - \bar{y})^2}$$
An $R^2$ close to 1 indicates the model captures most of the variance.
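The same computation in NumPy, as a self-contained sketch (the seed and noise level are illustrative, so the exact R^2 differs from the notebook's value):

```python
import numpy as np

# R^2 = 1 - SS_res / SS_tot for a quadratic fit to noisy quadratic data.
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 20)
y = 0.5 * x**2 - 2.0 * x + 1.0 + 0.3 * rng.standard_normal(20)

V = np.vander(x, N=3, increasing=True)
coeffs, *_ = np.linalg.lstsq(V, y, rcond=None)

resid = y - V @ coeffs                   # residual vector r = y - V a_hat
ss_res = resid @ resid                   # sum of squared residuals
ss_tot = ((y - y.mean()) ** 2).sum()     # total sum of squares
r2 = 1 - ss_res / ss_tot
print(r2)   # close to 1: the model explains most of the variance
```
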
/* Predicted values and residuals */
y_pred : np_matmul(V, coeffs_nd)$
resid : np_sub(y_col, y_pred)$
/* Sum of squared residuals */
ss_res : np_dot(np_flatten(resid), np_flatten(resid))$
/* Total sum of squares */
y_mean : np_mean(y)$
y_centered : np_sub(y, y_mean)$
ss_tot : np_dot(y_centered, y_centered)$
R2 : 1 - ss_res / ss_tot;
print("R^2 =", R2)$
print("SS_res =", ss_res)$
print("SS_tot =", ss_tot)$
0.9726930639043225
R^2 = 0.9726930639043225
SS_res = 0.9615585185138484
SS_tot = 35.2129772137291
/* Plot residuals vs x */
ax_draw2d(
color=blue, marker_size=6,
name="Residuals",
points(np_to_list(x), np_to_list(np_flatten(resid))),
color=gray, dash="dash",
explicit(0, t, 0, 5),
title="Residuals vs x",
xlabel="x", ylabel="Residual",
grid=true
)$
Underfitting and Overfitting¶
The choice of polynomial degree $d$ controls the bias-variance tradeoff:
- Degree 1 (linear): too simple to capture the curvature -- underfit
- Degree 5: captures the shape with some flexibility -- good fit
- Degree 15: has enough freedom to interpolate noise -- overfit
We fit all three and compare.
/* Helper: fit a polynomial of given degree, return symbolic expression */
polyfit(xv, yv, deg) := block(
[Vm, yc, result, c, poly],
Vm : vander(xv, deg),
yc : np_reshape(yv, [np_size(yv), 1]),
result : np_lstsq(Vm, yc),
c : np_to_list(first(result)),
poly : sum(c[k+1] * t^k, k, 0, deg),
poly
)$
p1 : polyfit(x, y, 1)$
p5 : polyfit(x, y, 5)$
p15 : polyfit(x, y, 15)$
print("Degree 1:", p1)$
print("Degree 5:", expand(p5))$
Degree 1: 0.4850699416124584*t-0.997976615527083
Degree 5: -(0.01778366101067462*t^5)+0.22458881666238428*t^4-0.9729345742000228*t^3+2.167521279704431*t^2-2.980417057972085*t+1.0793971228343548
/* Plot all fits */
ax_draw2d(
color=red, marker_size=6,
name="Data",
points(np_to_list(x), np_to_list(y)),
color="#e41a1c", dash="dash", name="Degree 1 (underfit)",
explicit(p1, t, 0, 5),
color="#377eb8", line_width=2, name="Degree 5",
explicit(p5, t, 0, 5),
color="#4daf4a", name="Degree 15 (overfit)",
explicit(p15, t, 0, 5),
color=black, dash="dash", name="True curve",
explicit(0.5*t^2 - 2*t + 1, t, 0, 5),
title="Underfitting vs Overfitting",
xlabel="x", ylabel="y",
yrange=[-3, 6],
grid=true
)$
The linear fit misses the curvature entirely. The degree-5 fit tracks the true curve closely. The degree-15 fit oscillates wildly between data points -- a classic sign of overfitting, where the model fits the noise rather than the underlying relationship.
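Note that training error alone cannot expose this: the residual sum of squares is non-increasing in the degree, because each polynomial space contains the ones below it. A NumPy sketch of that effect (illustrative seed and data, not the notebook's np_* API):

```python
import numpy as np

# Training RSS shrinks as the degree grows, even past the true degree 2 --
# which is exactly why a small training error does not rule out overfitting.
rng = np.random.default_rng(1)
x = np.linspace(0, 5, 20)
y = 0.5 * x**2 - 2.0 * x + 1.0 + 0.3 * rng.standard_normal(20)

def rss(deg):
    V = np.vander(x, N=deg + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(V, y, rcond=None)
    r = y - V @ coeffs
    return r @ r

errs = [rss(d) for d in (1, 5, 15)]
print(errs)   # decreasing, even though the degree-15 fit generalises worst
```

Held-out data (or a criterion that penalises model complexity) is needed to detect the overfit that the plot makes visible.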
Comparing with the Symbolic Solution [S+N]¶
For the linear case (degree 1), the least-squares solution satisfies the normal equations:
$$A^T A \, \hat{x} = A^T b$$
We can derive these symbolically in Maxima, solve them exactly, and verify
that the symbolic solution matches the numerical np_lstsq result.
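The same cross-check in plain NumPy, as a hedged sketch (noise-free data here, so both solvers agree to rounding; this is not the notebook's np_* API):

```python
import numpy as np

# Solve the linear fit two ways: normal equations vs SVD-based lstsq.
x = np.linspace(0, 5, 20)
y = 0.5 * x**2 - 2.0 * x + 1.0           # deterministic data for the check
V = np.vander(x, N=2, increasing=True)   # linear model: columns 1, x

a_normal = np.linalg.solve(V.T @ V, V.T @ y)   # (V^T V) a = V^T y
a_lstsq, *_ = np.linalg.lstsq(V, y, rcond=None)

print(np.linalg.norm(a_normal - a_lstsq))  # tiny: rounding error only
```

The agreement holds here because the problem is well conditioned; the normal equations square the condition number of V, which is why SVD-based solvers are preferred for ill-conditioned design matrices.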
/* Build the normal equations symbolically for a linear fit */
/* Design matrix for degree 1: columns are [1, x_i] */
V1 : vander(x, 1)$
y1 : np_reshape(y, [n, 1])$
/* Normal equations: (V^T V) c = V^T y */
VtV : np_matmul(np_transpose(V1), V1)$
Vty : np_matmul(np_transpose(V1), y1)$
print("V^T V (2x2):")$
np_to_matrix(VtV);
print("V^T y (2x1):")$
np_to_matrix(Vty);
V^T V (2x2):
matrix([20.0,50.0],[50.0,171.05263157894737])
V^T y (2x1):
/* Solve the normal equations numerically */
coeffs_normal : np_solve(VtV, Vty)$
/* Compare with np_lstsq */
[coeffs_lstsq, res1, rank1, sv1] : np_lstsq(V1, y1)$
print("Normal equations solution:", np_to_list(coeffs_normal))$
print("np_lstsq solution: ", np_to_list(coeffs_lstsq))$
diff_norm : np_norm(np_sub(coeffs_normal, coeffs_lstsq))$
print("Difference norm:", diff_norm)$
Normal equations solution: [-0.9979766155270834,0.4850699416124584]
np_lstsq solution:         [-0.997976615527083,0.4850699416124584]
Difference norm: 3.3306690738754696e-16
/* Derive the normal equations fully symbolically */
/* For a linear model y = c0 + c1*x with m data points: */
/* V^T V = [[m, sum(x)], [sum(x), sum(x^2)]] */
/* V^T y = [sum(y), sum(x*y)] */
print("Symbolic normal equations for linear regression:")$
print(" [ m sum(x_i) ] [c0] [ sum(y_i) ]")$
print(" [ sum(x_i) sum(x_i^2)] [c1] = [ sum(x_i*y_i) ]")$
/* Solve symbolically */
sx : np_sum(x)$
sx2 : np_dot(x, x)$
sy : np_sum(y)$
sxy : np_dot(x, y)$
/* Cramer's rule */
det_sys : n * sx2 - sx^2$
c0_sym : (sy * sx2 - sx * sxy) / det_sys$
c1_sym : (n * sxy - sx * sy) / det_sys$
print("Symbolic solution:")$
print(" c0 =", c0_sym)$
print(" c1 =", c1_sym)$
print("Matches np_lstsq:",
is(abs(c0_sym - np_to_list(coeffs_lstsq)[1]) < 1e-10 and
abs(c1_sym - np_to_list(coeffs_lstsq)[2]) < 1e-10))$
Symbolic normal equations for linear regression:
  [ m          sum(x_i)  ] [c0]   [ sum(y_i)     ]
  [ sum(x_i)   sum(x_i^2)] [c1] = [ sum(x_i*y_i) ]
Symbolic solution:
  c0 = -0.9979766155270838
  c1 = 0.48506994161245853
Matches np_lstsq: true