Optimization
Numerical optimization using the L-BFGS (Limited-memory Broyden-Fletcher-Goldfarb-Shanno) algorithm. Wraps Maxima’s built-in L-BFGS implementation to work directly with ndarrays.
The optimization functions are in a separate sub-module (loaded separately from the core numerics package):
(%i1) load("numerics")$
(%i2) load("numerics-optimize")$
np_minimize (f, grad, x0) — Function
np_minimize (f, grad, x0, tolerance) — Function
np_minimize (f, grad, x0, tolerance, max_iter) — Function
Minimize a scalar-valued function using gradient-based L-BFGS optimization.
Arguments:
f— objective function taking an ndarray, returning a scalargrad— gradient function taking an ndarray, returning an ndarray of the same shapex0— initial point (real ndarray, any shape)tolerance— convergence tolerance (default:1e-8)max_iter— maximum number of iterations (default:200)
Returns: a list [x_opt, f_opt, converged] where:
x_opt— the optimized ndarray (same shape asx0)f_opt— the final objective valueconverged—trueif L-BFGS converged,falseotherwise
Both f and grad can be named functions or lambda expressions. The shape of x0 is preserved in the output — if you pass a [p, 1] column vector, you get a [p, 1] result.
Examples
Simple quadratic:
(%i1) f(x) := np_sum(np_pow(x, 2))$
(%i2) grad(x) := np_scale(2.0, x)$
(%i3) [x_opt, f_opt, ok] : np_minimize(f, grad, ndarray([3.0, 4.0]));
(%o3) [ndarray([2], DOUBLE-FLOAT), 1.97e-30, true]
(%i4) np_to_list(x_opt);
(%o4) [2.80e-16, 3.55e-16]
Rosenbrock function (classic optimization benchmark):
(%i1) f_rosen(x) := block([x1, x2],
x1 : np_ref(x, 0), x2 : np_ref(x, 1),
(1 - x1)^2 + 100 * (x2 - x1^2)^2)$
(%i2) g_rosen(x) := block([x1, x2],
x1 : np_ref(x, 0), x2 : np_ref(x, 1),
ndarray([-2*(1 - x1) - 400*x1*(x2 - x1^2),
200*(x2 - x1^2)]))$
(%i3) [x_opt, f_opt, ok] : np_minimize(f_rosen, g_rosen,
ndarray([-1.0, 1.0]), 1e-10, 500);
(%o3) [ndarray([2], DOUBLE-FLOAT), 0.0, true]
(%i4) np_to_list(x_opt);
(%o4) [1.0, 1.0]
Linear regression (symbolic-numeric bridge):
(%i1) /* Derive gradient symbolically */
L_i(a, b, x_i, y_i) := (1/2) * (y_i - a - b*x_i)^2$
(%i2) diff(L_i(a, b, x[i], y[i]), a);
(%o2) -(y_i - b*x_i - a)
(%i3) /* Implement as ndarray functions */
cost(w) := block([pred, res],
pred : np_matmul(X, w),
res : np_sub(pred, y),
np_sum(np_pow(res, 2)) / (2*n))$
(%i4) grad(w) := np_scale(1.0/n,
np_matmul(np_transpose(X), np_sub(np_matmul(X, w), y)))$
(%i5) [w_opt, loss, ok] : np_minimize(cost, grad, np_zeros([2, 1]))$
With lambda expressions:
(%i1) [x_opt, _, ok] : np_minimize(
lambda([x], np_sum(np_pow(x, 2))),
lambda([x], np_scale(2.0, x)),
ndarray([5.0, 5.0]));
(%o1) [ndarray([2], DOUBLE-FLOAT), 1.42e-29, true]
See also: np_lstsq (for linear least-squares problems)