CONOPT
Defining the second derivative

Data Structures

class  jconopt.SDEvaluationType
 

Functions

void jconopt.ModelData.evaluateSDLagrangian (double[] x, double[] u, int[] hessianrow, int[] hessiancol, double[] hessianval)
 Computes and returns the numerical values of the Hessian of the Lagrangian.
 
void jconopt.ModelData.evaluateSDDirLagrangian (double[] x, double[] dx, double[] u, double[] d2g, boolean newpoint)
 Computes the directional second derivative of the Lagrangian.
 
void jconopt.ModelData.evaluateDirectionalSD (double[] x, double[] dx, double[] d2g, int rowno, int[] jacnum, int thread)
 Computes the directional second derivative for a single constraint.
 
void jconopt.ModelData.initDirectionalSDEval (double[] x, double[] dx, int[] rowlist, int numthread, boolean newpoint)
 A callback for the initialisation of the second derivative evaluation.
 
void jconopt.ModelData.endDirectionalSDEval ()
 Called by CONOPT after a sequence of evaluateDirectionalSD() calls, i.e. each time either the point or the direction changes.
 
void jconopt.ModelData.setSDEvaluationType (SDEvaluationType sdevaltype)
 Informs CONOPT of the method for evaluating the second derivative.
 
void jconopt.ModelData.setSDLagrangianStructure (int[] rownum, int[] colnum)
 Sets the structure of the second derivatives of the Lagrangian.
 

Detailed Description

Virtual functions that can be optionally implemented by the user to define the second order derivatives.

CONOPT is able to take advantage of second derivatives. Only some of the potential ways of defining second derivatives are currently implemented, and this chapter describes the interface defined so far.

Second order information is used to formulate and solve an inner Quadratic Programming sub-model in a Sequential Quadratic Programming (SQP) like algorithm. If second order information is not readily available, it will be approximated from gradient values using finite differences. Please note that these perturbations may require very many gradient calls, so this option assumes that gradient calls are cheap. In addition, perturbations require accurate gradients, and even with accurate gradients the approximations are not as accurate as properly coded second derivatives. It is therefore highly recommended to provide some form of second order information.

Second order information can be available in two forms: (1) as a sparse matrix of second derivatives, the Hessian matrix, or (2) as a vector of directional second derivatives. Second order information can be defined based on individual constraints or on the Lagrangian function, giving a total of four combinations, of which three are supported at the moment. The notation and the mathematical relationship between these concepts are as follows:

The Hessian of a single constraint \(k\), \(H^k\), is the matrix of second derivatives defined as

\[ H_{ij}^k = \frac{\partial^{2}g_k}{\partial x_{i}\partial x_{j}} \]

The Hessian is by definition a symmetric matrix and we will usually only work with the lower triangular part of the matrix, including the diagonal. \(H^k\) will only have non-zero elements in rows and columns corresponding to variables that appear nonlinearly in constraint \(k\), so if constraint \(k\) is sparse, the corresponding Hessian will usually also be sparse. But notice that if constraint \(k\) depends on \(p\) variables, the Hessian may have up to \(p^2\) elements in total or \(p \cdot (p+1)/2\) elements in the lower triangular part of the matrix. There are models where the constraints on average are sparse, but one or a few of the constraints depend on many variables, and their Hessians have a huge number of nonzero elements.

The Lagrangian function is defined as

\[ L(x,u)=\sum_{k}u_{k} \cdot g_{k}(x) \]

where we have ignored some terms that do not contribute to second derivatives. \(u\) is a vector of Lagrange multipliers defined by CONOPT. There is one Lagrange multiplier for each constraint.

The Hessian of the Lagrangian, \(HL\), is defined as:

\[ HL = \sum_{k}u_{k} \cdot H^{k} \quad \text{or} \quad HL_{ij} = \sum_{k}u_{k} \cdot \frac{\partial^{2}g_{k}}{\partial x_{i}\partial x_{j}} \]

The sparsity pattern for the Hessian of the Lagrangian is the union of the sparsity patterns for the Hessians of all constraints in the model. This means that if one of the constraints has a dense Hessian, then the Hessian of the Lagrangian will also be dense, and CONOPT cannot use a Hessian efficiently with these models.

The directional first derivative of constraint \(k\) in point \(x\) and in the direction \(v\) is defined as

\[ d1_{k}(x,v) = \frac{\partial g_{k}(x+\theta\cdot v)}{\partial \theta} = \frac{\partial g_{k}}{\partial x} \cdot v \]

or the scalar product of the first derivative of \(g_k\) with the direction \(v\). The directional second derivative is defined as

\[ d2_{k}(x,v) = \frac{\partial d1_{k}(x,v)}{\partial x} = \frac{\partial^{2}g_{k}}{\partial x^{2}} \cdot v = H^{k} \cdot v \]

or the product of the Hessian with the direction v. The directional second derivative of the Lagrangian is similarly defined as

\[ HL \cdot v = \sum_{k}u_{k} \cdot H^{k} \cdot v \]

where the order of the multiplications can be selected by the implementer. Notice that even in cases where \(H^k\) has too many elements to be of practical direct use, the directional second derivative is still a vector with only one element per variable, and it may be cheap to compute.
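As a small worked example (the constraint is invented here for illustration and is not part of the CONOPT documentation), consider \(g_k(x) = x_1^2 x_2\). Then

\[ H^{k} = \begin{pmatrix} 2x_{2} & 2x_{1} \\ 2x_{1} & 0 \end{pmatrix}, \quad d2_{k}(x,v) = H^{k} \cdot v = \begin{pmatrix} 2x_{2}v_{1} + 2x_{1}v_{2} \\ 2x_{1}v_{1} \end{pmatrix} \]

so the directional second derivative can be coded directly from these two expressions without ever storing \(H^k\).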

Currently, CONOPT accepts second derivative information in three forms:

1. The Hessian of the Lagrangian,

2. Directional second derivatives of individual constraints, and

3. Directional second derivatives of the Lagrangian.

The remaining combination, the Hessians of individual constraints, can be implemented if it turns out to be more convenient for some users.

CONOPT uses the second order information in a Sequential Quadratic Programming (SQP) algorithm to find good directions for a line search. The SQP model is solved approximately using a conjugate gradient (CG) algorithm. The SQP tolerances are loose at the beginning of the optimization and are tightened as we approach the optimal solution. The gradients used in the CG algorithm are computed either by repeated calls to the directional second derivative routines or by a single call to the Hessian routine followed by several multiplications of the Hessian with a direction vector inside CONOPT. If both a Hessian and directional second derivatives are available, CONOPT will initially use the directional routines (evaluateDirectionalSD() or evaluateSDDirLagrangian()) when few CG iterations are expected, and switch to the Hessian of the Lagrangian (evaluateSDLagrangian()) when the accuracy requirements are tighter and a larger number of iterations is expected.

The modeler can tell CONOPT that second order information is available as the Hessian of the Lagrangian, assumed to be a sparse matrix, by calling jconopt.ModelData.setSDLagrangianStructure(), which defines the number of nonzero elements and the sparsity pattern of the Hessian of the Lagrangian. The numerical values of the Hessian must be returned by the callback jconopt.ModelData.evaluateSDLagrangian(). The numerical calls will be scattered throughout the optimization, usually with no more than one call per iteration.
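As a sketch of how the structure and value arrays fit together, consider a hypothetical two-variable model with constraints \(g_0(x) = x_0^2 + x_0 x_1\) and \(g_1(x) = x_1^2\) (the model, class name, and standalone helper below are invented for illustration and are not part of the jconopt API). The lower triangular Hessian of the Lagrangian has nonzeros at (0,0), (1,0) and (1,1), and the value array is filled in the same order as the structure arrays passed to setSDLagrangianStructure():

```java
// Hypothetical model: g0(x) = x0^2 + x0*x1, g1(x) = x1^2.
// Constraint Hessians: H^0 = [2 1; 1 0], H^1 = [0 0; 0 2].
// Lower triangular structure of HL = u0*H^0 + u1*H^1 (row >= col): (0,0), (1,0), (1,1).
public class SDLagrangianSketch {
    // Row/column numbers as they would be passed to setSDLagrangianStructure().
    static final int[] ROWNUM = {0, 1, 1};
    static final int[] COLNUM = {0, 0, 1};

    // Fills hessianval in the same order as the structure arrays,
    // mimicking the body of evaluateSDLagrangian(x, u, ...).
    static void evaluateSDLagrangian(double[] x, double[] u, double[] hessianval) {
        hessianval[0] = 2.0 * u[0]; // HL(0,0) = u0 * 2
        hessianval[1] = 1.0 * u[0]; // HL(1,0) = u0 * 1
        hessianval[2] = 2.0 * u[1]; // HL(1,1) = u1 * 2
    }

    public static void main(String[] args) {
        double[] val = new double[3];
        evaluateSDLagrangian(new double[]{1.0, 2.0}, new double[]{0.5, 3.0}, val);
        System.out.println(val[0] + " " + val[1] + " " + val[2]); // 1.0 0.5 6.0
    }
}
```

Note that this model's Hessians happen to be constant; in general the values depend on x and must be recomputed at every call.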

If the user wants to compute directional second derivatives instead, this is registered by calling jconopt.ModelData.setSDEvaluationType() and supplying either jconopt.SDEvaluationType.Constraint or jconopt.SDEvaluationType.Lagrangian. If jconopt.SDEvaluationType.Constraint is set, then the user must implement jconopt.ModelData.evaluateDirectionalSD(). Alternatively, if jconopt.SDEvaluationType.Lagrangian is set, then the user must implement jconopt.ModelData.evaluateSDDirLagrangian(). For the directional derivatives, the user may also implement jconopt.ModelData.initDirectionalSDEval() and jconopt.ModelData.endDirectionalSDEval() for the setup and teardown of the second derivative evaluation.
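A minimal sketch of a per-constraint directional evaluation, using a made-up single constraint \(g(x) = x_0^2 x_1\) (the class and standalone method below are illustrative only, not part of the jconopt API):

```java
// Hypothetical constraint g(x) = x0^2 * x1 with Hessian
//   H = [2*x1  2*x0;
//        2*x0  0   ]
// The directional second derivative d2g = H * dx is written out
// directly, without ever forming H, as an evaluateDirectionalSD()
// implementation would do.
public class DirectionalSDSketch {
    static void evaluateDirectionalSD(double[] x, double[] dx, double[] d2g) {
        d2g[0] = 2.0 * x[1] * dx[0] + 2.0 * x[0] * dx[1];
        d2g[1] = 2.0 * x[0] * dx[0];
    }

    public static void main(String[] args) {
        double[] d2g = new double[2];
        evaluateDirectionalSD(new double[]{3.0, 4.0}, new double[]{1.0, 0.0}, d2g);
        System.out.println(d2g[0] + " " + d2g[1]); // 8.0 6.0
    }
}
```

In an actual model class this logic would live in the evaluateDirectionalSD() override, after registering it with setSDEvaluationType(jconopt.SDEvaluationType.Constraint).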

Function Documentation

◆ evaluateSDLagrangian()

void jconopt.ModelData.evaluateSDLagrangian ( double[] x,
double[] u,
int[] hessianrow,
int[] hessiancol,
double[] hessianval )
inline

Computes and returns the numerical values of the Hessian of the Lagrangian.

Parameters
x: the solution vector that needs to be evaluated.
u: the vector of weights on the individual constraints (the Lagrange multipliers).
hessianrow: vector of row numbers of the lower triangular part of the Hessian.
hessiancol: vector of column numbers of the lower triangular part of the Hessian.
hessianval: an array to store the values of the Hessian computed at the given point. This is of length numHessianNonzeros().

Note: an error in the evaluation is reported by calling errorInEvaluation()

Definition at line 171 of file ModelData.java.

◆ evaluateSDDirLagrangian()

void jconopt.ModelData.evaluateSDDirLagrangian ( double[] x,
double[] dx,
double[] u,
double[] d2g,
boolean newpoint )
inline

Computes the directional second derivative of the Lagrangian.

Parameters
x: the solution vector that needs to be evaluated.
dx: vector with the direction in which the directional second derivatives should be computed.
u: the vector of weights on the individual constraints.
d2g: an array to store the directional second derivative of the Lagrangian at the given point. This is of length numVar().
newpoint: a boolean indicating whether x has changed. If newpoint is false, only the values of dx may have changed.

Note: an error in the evaluation is reported by calling errorInEvaluation()

Definition at line 190 of file ModelData.java.

◆ evaluateDirectionalSD()

void jconopt.ModelData.evaluateDirectionalSD ( double[] x,
double[] dx,
double[] d2g,
int rowno,
int[] jacnum,
int thread )
inline

Computes the directional second derivative for a single constraint.

Parameters
x: the solution vector that needs to be evaluated.
dx: vector with the direction in which the directional second derivatives should be computed.
d2g: an array to store the directional second derivative of the single constraint at the given point and row. This is of length numVar().
rowno: the number of the row in which the nonlinear term exists.
jacnum: vector with a list of column numbers for the nonlinear nonzero Jacobian elements in the row.
thread: the index of the thread from which this method is being called.

Note: an error in the evaluation is reported by calling errorInEvaluation()

Definition at line 209 of file ModelData.java.

◆ initDirectionalSDEval()

void jconopt.ModelData.initDirectionalSDEval ( double[] x,
double[] dx,
int[] rowlist,
int numthread,
boolean newpoint )
inline

A callback for the initialisation of the second derivative evaluation.

This method is called before a sequence of calls to evaluateDirectionalSD(). The user can utilise this method for evaluating common terms and storing important information.

Parameters
x: the solution vector that needs to be evaluated.
dx: vector with the direction in which the directional second derivatives should be computed.
rowlist: the list of rows that will be evaluated for the given point.
numthread: the number of threads that will be used for the following evaluateDirectionalSD() calls.
newpoint: a boolean indicating whether x has changed. If newpoint is false, only the values of dx may have changed.

Note: an error in the evaluation is reported by calling errorInEvaluation()

Definition at line 230 of file ModelData.java.

◆ endDirectionalSDEval()

void jconopt.ModelData.endDirectionalSDEval ( )
inline

Called by CONOPT after a sequence of evaluateDirectionalSD() calls, i.e. each time either the point or the direction changes.

It can be used to perform cleanup tasks, including to report if anything went wrong.

Note: an error in the evaluation is reported by calling errorInEvaluation()

Definition at line 244 of file ModelData.java.
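Since initDirectionalSDEval() and endDirectionalSDEval() bracket a batch of per-row evaluations, common subexpressions can be computed once per point and reused for every row. A standalone sketch (the two-constraint model and caching scheme are invented for illustration, not part of the jconopt API):

```java
// Two hypothetical constraints sharing the common term s = x0 + x1:
//   g0(x) = s^2,  g1(x) = s^3
// Both Hessians depend only on s: H^0 has every entry equal to 2,
// and H^1 has every entry equal to 6*s. The shared term s is computed
// once in the init callback and reused for every row in the batch.
public class BatchedSDSketch {
    private double s; // cached common term, set in initDirectionalSDEval()

    void initDirectionalSDEval(double[] x) {
        s = x[0] + x[1]; // evaluate the shared subexpression once per point
    }

    // d2g = H^rowno * dx, using the cached term instead of recomputing it.
    void evaluateDirectionalSD(double[] dx, double[] d2g, int rowno) {
        double c = (rowno == 0) ? 2.0 : 6.0 * s;
        d2g[0] = c * (dx[0] + dx[1]);
        d2g[1] = c * (dx[0] + dx[1]);
    }

    void endDirectionalSDEval() {
        // nothing to clean up in this sketch
    }

    public static void main(String[] args) {
        BatchedSDSketch m = new BatchedSDSketch();
        m.initDirectionalSDEval(new double[]{1.0, 2.0}); // s = 3
        double[] d2g = new double[2];
        m.evaluateDirectionalSD(new double[]{1.0, 1.0}, d2g, 1);
        System.out.println(d2g[0] + " " + d2g[1]); // 36.0 36.0
        m.endDirectionalSDEval();
    }
}
```

The signatures above are simplified relative to the actual callbacks (rowlist, numthread and newpoint are omitted); the point is only the init/evaluate/end caching pattern.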

◆ setSDEvaluationType()

void jconopt.ModelData.setSDEvaluationType ( SDEvaluationType sdevaltype)
inline

Informs CONOPT of the method for evaluating the second derivative.

Reimplemented from jconopt.ConoptModelData.

Definition at line 548 of file ModelData.java.

◆ setSDLagrangianStructure()

void jconopt.ModelData.setSDLagrangianStructure ( int[] rownum,
int[] colnum )
inline

Sets the structure of the second derivatives of the Lagrangian.

Parameters
rownum: vector of row numbers of the lower triangular part of the Hessian.
colnum: vector of column numbers of the lower triangular part of the Hessian.

Reimplemented from jconopt.ConoptModelData.

Definition at line 561 of file ModelData.java.