Introduction

Purpose

The dgo package uses a deterministic partition-and-bound trust-region method to find an approximation to the global minimizer of a differentiable objective function $f(x)$ of $n$ variables $x$, subject to a finite set of simple bounds $x^l \leq x \leq x^u$ on the variables. The method offers the choice of direct and iterative solution of the key trust-region subproblems, and is suitable for large problems. First derivatives are required, and if second derivatives can be calculated, they will be exploited. If the product of second derivatives with a vector may be found, but not the derivatives themselves, that may also be exploited.

Although there are theoretical guarantees, these may require a large number of evaluations as the dimension and nonconvexity increase. The alternative GALAHAD package bgo may sometimes be preferred.

Authors

J. Fowkes and N. I. M. Gould, STFC-Rutherford Appleton Laboratory, England.

Julia interface, additionally A. Montoison and D. Orban, Polytechnique Montréal.

Originally released

July 2021, C interface August 2021.

Terminology

The gradient $\nabla_x f(x)$ of $f(x)$ is the vector whose $i$-th component is $\partial f(x)/\partial x_i$. The Hessian $\nabla_{xx} f(x)$ of $f(x)$ is the symmetric matrix whose $i,j$-th entry is $\partial^2 f(x)/\partial x_i \partial x_j$. The Hessian is sparse if a significant and useful proportion of the entries are universally zero.
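As a small illustration of these definitions, the following Python sketch writes out the gradient, the Hessian and a Hessian-vector product for a sample two-variable function. The function $f(x) = x_0^2 x_1 + \sin(x_1)$ and all of the names used are chosen purely for illustration and are not part of the dgo interface.

```python
import math


def f(x):
    """Sample objective (an assumption for illustration): x0^2 * x1 + sin(x1)."""
    return x[0] ** 2 * x[1] + math.sin(x[1])


def gradient(x):
    """Vector whose i-th component is the partial derivative of f w.r.t. x_i."""
    return [2.0 * x[0] * x[1], x[0] ** 2 + math.cos(x[1])]


def hessian(x):
    """Symmetric matrix whose (i, j) entry is the second partial derivative."""
    return [
        [2.0 * x[1], 2.0 * x[0]],
        [2.0 * x[0], -math.sin(x[1])],
    ]


def hessian_vector_product(x, v):
    """Product of the Hessian at x with the vector v."""
    h = hessian(x)
    n = len(v)
    return [sum(h[i][j] * v[j] for j in range(n)) for i in range(n)]


x = [1.0, 2.0]
g = gradient(x)
hv = hessian_vector_product(x, [1.0, 0.0])  # first column of the Hessian at x
```

Here the Hessian-vector product is formed from the full matrix for clarity; when only products are available, a user-supplied routine would return the product directly without forming the Hessian.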

Method

Starting with the initial box $x^l \leq x \leq x^u$, a sequence of boxes is generated by considering the current set, and partitioning a promising candidate into three equally-sized sub-boxes by splitting along one of the box dimensions. Each partition requires only a pair of new function and derivative evaluations, and these values, together with estimates of Lipschitz constants, make it possible to remove other boxes from further consideration as soon as they cannot contain a global minimizer. Efficient control of the dictionary of vertices of the sub-boxes is handled using a suitable hashing procedure provided by the GALAHAD package hash; each sub-box is indexed by the concatenated coordinates of a pair of opposite vertices. At various stages, local minimization in a promising sub-box, using the GALAHAD package trb, may be used to improve the best-known upper bound on the global minimum. If $n=1$, the specialised GALAHAD univariate global minimization package ugo is called directly.
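The partition-and-discard idea can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the algorithm implemented in dgo: it assumes a known Lipschitz constant for $f$ itself, bounds $f$ from the value at the box centre rather than from function and derivative values at a pair of opposite vertices, and always splits along the longest edge. All function names are hypothetical.

```python
import math


def trisect(lower, upper):
    """Split the box [lower, upper] into three equal sub-boxes along its longest edge."""
    widths = [u - l for l, u in zip(lower, upper)]
    k = widths.index(max(widths))  # first longest edge
    step = widths[k] / 3.0
    boxes = []
    for part in range(3):
        lo, hi = list(lower), list(upper)
        lo[k] = lower[k] + part * step
        # Use the exact upper bound for the last piece to avoid rounding drift.
        hi[k] = upper[k] if part == 2 else lower[k] + (part + 1) * step
        boxes.append((lo, hi))
    return boxes


def lower_bound(f, lo, hi, lipschitz):
    """Lower bound on f over the box: centre value minus Lipschitz constant times half-diagonal."""
    centre = [(l + u) / 2.0 for l, u in zip(lo, hi)]
    radius = 0.5 * math.sqrt(sum((u - l) ** 2 for l, u in zip(lo, hi)))
    return f(centre) - lipschitz * radius


def prune(f, boxes, best_value, lipschitz):
    """Keep only boxes whose lower bound does not exceed the best value found so far."""
    return [b for b in boxes if lower_bound(f, b[0], b[1], lipschitz) <= best_value]


# Toy usage: f(x) = x0 on [0, 3] has Lipschitz constant 1.
toy = lambda x: x[0]
sub_boxes = trisect([0.0], [3.0])
kept = prune(toy, sub_boxes, best_value=0.5, lipschitz=1.0)
```

In the toy usage only the sub-box $[0, 1]$ survives, since the other two have lower bounds above the best value already seen. An underestimated Lipschitz constant could discard the box containing the global minimizer, which is one reason the estimates matter in practice.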

We reiterate that although there are theoretical guarantees, these may require a large number of evaluations as the dimension and nonconvexity increase. Thus the method is best viewed as a heuristic for finding a reasonable approximation of the global minimum.

References

The global minimization method employed is an extension of that due to

Ya. D. Sergeyev and D. E. Kvasov (2015), “A deterministic global optimization using smooth diagonal auxiliary functions”, Communications in Nonlinear Science and Numerical Simulation, Vol 21, Nos 1-3, pp. 99-111.

but adapted to use second derivatives, while in the special case when $n=1$, a simplification based on the ideas in

D. Lera and Ya. D. Sergeyev (2013), “Acceleration of univariate global optimization algorithms working with Lipschitz functions and Lipschitz first derivatives”, SIAM J. Optimization, Vol 23, No 1, pp. 508-529

is used instead. The generic bound-constrained trust-region method used for local minimization is described in detail in

A. R. Conn, N. I. M. Gould and Ph. L. Toint (2000), Trust-region methods. SIAM/MPS Series on Optimization.

Call order

To solve a given problem, functions from the dgo package must be called in the following order:

dgo_initialize - provide default control parameters and set up initial data structures
dgo_read_specfile (optional) - override control values by reading replacement values from a file
dgo_import - set up problem data structures and fixed values
dgo_reset_control (optional) - possibly change control parameters if a sequence of problems is being solved
solve the problem by calling one of
dgo_solve_with_mat - solve using function calls to evaluate function, gradient and Hessian values
dgo_solve_without_mat - solve using function calls to evaluate function and gradient values and Hessian-vector products
dgo_solve_reverse_with_mat - solve returning to the calling program to obtain function, gradient and Hessian values, or
dgo_solve_reverse_without_mat - solve returning to the calling program to obtain function and gradient values and Hessian-vector products
dgo_information (optional) - recover information about the solution and solution process
dgo_terminate - deallocate data structures

Symmetric matrix storage formats

The symmetric $n$ by $n$ matrix $H = \nabla_{xx}f$ may be presented and stored in a variety of formats. Crucially, symmetry is exploited by storing only the values from the lower triangular part (i.e., those entries that lie on or below the leading diagonal).

Both C-style (0-based) and Fortran-style (1-based) indexing are allowed. Choose control.f_indexing as false for C style and true for Fortran style; the discussion below presumes C style, but add 1 to indices for the corresponding Fortran version.

Wrappers will automatically convert between 0-based (C) and 1-based (Fortran) array indexing, so may be used transparently from C. This conversion involves both time and memory overheads that may be avoided by supplying data that is already stored using 1-based indexing.

Dense storage format

The matrix $H$ is stored as a compact dense matrix by rows, that is, the values of the entries of each row in turn are stored in order within an appropriate real one-dimensional array. Since $H$ is symmetric, only the lower triangular part (that is, the part $H_{ij}$ for $0 \leq j \leq i \leq n-1$) need be held. In this case the lower triangle should be stored by rows, that is, component $i \ast (i+1) / 2 + j$ of the storage array Hval will hold the value $H_{ij}$ (and, by symmetry, $H_{ji}$) for $0 \leq j \leq i \leq n-1$.
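A minimal Python sketch of this packing, assuming 0-based indexing, is given below. Row $i$ of the lower triangle is preceded by $1 + 2 + \cdots + i = i(i+1)/2$ stored entries, so entry $(i,j)$ lands at position $i(i+1)/2 + j$. The helper name and the sample matrix are illustrative only.

```python
def pack_dense_lower(h):
    """Pack the lower triangle of a symmetric matrix by rows into a flat list."""
    n = len(h)
    hval = [0.0] * (n * (n + 1) // 2)
    for i in range(n):
        for j in range(i + 1):
            # Rows 0..i-1 contribute i*(i+1)/2 entries before row i starts.
            hval[i * (i + 1) // 2 + j] = h[i][j]
    return hval


# Sample symmetric matrix (an assumption for illustration).
H = [
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 2.0],
    [0.0, 2.0, 5.0],
]
Hval = pack_dense_lower(H)  # [4.0, 1.0, 3.0, 0.0, 2.0, 5.0]
```

Note that the zero entry $H_{20}$ is stored explicitly, since the dense format holds every entry of the lower triangle.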

Sparse co-ordinate storage format

Only the nonzero entries of the matrix are stored. For the $l$-th entry, $0 \leq l \leq ne-1$, of $H$, its row index i, column index j and value $H_{ij}$, $0 \leq j \leq i \leq n-1$, are stored as the $l$-th components of the integer arrays Hrow and Hcol and the real array Hval, respectively, while the number of nonzeros is recorded as Hne = $ne$. Note that only the entries in the lower triangle should be stored.
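The following Python sketch builds the three coordinate arrays from a small symmetric matrix, assuming 0-based indexing. The helper name and the sample matrix are illustrative; the entries may be listed in any order, and row order is used here only for convenience.

```python
def to_coordinate_lower(h):
    """Collect (row, column, value) triples for the nonzeros of the lower triangle."""
    hrow, hcol, hval = [], [], []
    for i, row in enumerate(h):
        for j in range(i + 1):  # only entries on or below the diagonal
            if row[j] != 0.0:
                hrow.append(i)
                hcol.append(j)
                hval.append(row[j])
    return hrow, hcol, hval


# Sample symmetric matrix (an assumption for illustration).
H = [
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 2.0],
    [0.0, 2.0, 5.0],
]
Hrow, Hcol, Hval = to_coordinate_lower(H)
Hne = len(Hval)
# Hrow = [0, 1, 1, 2, 2], Hcol = [0, 0, 1, 1, 2], Hval = [4.0, 1.0, 3.0, 2.0, 5.0]
```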

Sparse row-wise storage format

Again only the nonzero entries are stored, but this time they are ordered so that those in row i appear directly before those in row i+1. For the i-th row of $H$ the i-th component of the integer array Hptr holds the position of the first entry in this row, while Hptr(n) holds the total number of entries (with C-style indexing; under Fortran-style indexing, Hptr(n+1) holds this total plus one). The column indices j, $0 \leq j \leq i$, and values $H_{ij}$ of the entries in the i-th row are stored in components l = Hptr(i), $\ldots$, Hptr(i+1)-1 of the integer array Hcol and the real array Hval, respectively. Note that as before only the entries in the lower triangle should be stored. For sparse matrices, this scheme almost always requires less storage than its predecessor.
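A Python sketch of the row-wise scheme, again assuming 0-based indexing, is shown below. In this 0-based sketch the final pointer equals the number of stored entries, so that the entries of row i occupy positions Hptr[i] up to Hptr[i+1]-1. The helper name and the sample matrix are illustrative only.

```python
def to_rowwise_lower(h):
    """Build row pointers, column indices and values for the lower triangle."""
    hptr, hcol, hval = [0], [], []
    for i, row in enumerate(h):
        for j in range(i + 1):  # only entries on or below the diagonal
            if row[j] != 0.0:
                hcol.append(j)
                hval.append(row[j])
        hptr.append(len(hval))  # start of the next row
    return hptr, hcol, hval


# Sample symmetric matrix (an assumption for illustration).
H = [
    [4.0, 1.0, 0.0],
    [1.0, 3.0, 2.0],
    [0.0, 2.0, 5.0],
]
Hptr, Hcol, Hval = to_rowwise_lower(H)
# Hptr = [0, 1, 3, 5], Hcol = [0, 0, 1, 1, 2], Hval = [4.0, 1.0, 3.0, 2.0, 5.0]

# Recover the stored entries of row 2 from the pointers.
row2 = [(Hcol[l], Hval[l]) for l in range(Hptr[2], Hptr[3])]
```

The saving over the coordinate format comes from replacing the $ne$ row indices by $n+1$ pointers, which is smaller whenever the matrix has more stored entries than rows plus one.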