CONOPT
Loading...
Searching...
No Matches
Defining the second derivative

Enumerations

enum class  ConoptSDEvaluationType { ConoptSDEvaluationType::None = 0 , ConoptSDEvaluationType::Constraint , ConoptSDEvaluationType::Lagrangian }
 the evaluation type for the directional second derivatives More...
 

Functions

void ConoptModelData::setSDEvaluationType (ConoptSDEvaluationType sdevaltype)
 informs CONOPT of the method for evaluating the second derivative
 
void ConoptModelData::setSDLagrangianStructure (const std::vector< int > &rownum, const std::vector< int > &colnum)
 sets the structure of the second derivatives of the Lagrangian
 
virtual int ConoptModelData::SDLagrVal (const double x[], const double u[], const int hsrw[], const int hscl[], double hsvl[], int *nodrv, int numvar, int numcon, int nhess)
 Computes and returns the numerical values of the Hessian.
 
virtual int ConoptModelData::SDDirLagr (const double x[], const double dx[], const double u[], double d2g[], int newpt, int *nodrv, int numvar, int numcon)
 computes the directional second derivative for the Lagrangian
 
virtual int ConoptModelData::SDDir (const double x[], const double dx[], double d2g[], int rowno, const int jacnum[], int *nodrv, int numvar, int numjac, int thread)
 computes the directional second derivative for a single constraint
 
virtual int ConoptModelData::SDDirIni (const double x[], const double dx[], const int rowlist[], int listsize, int numthread, int newpt, int *nodrv, int numvar)
 called by CONOPT before a sequence of 2DDir calls each time either the point or the direction changes.
 
virtual int ConoptModelData::SDDirEnd (int *nodrv)
 called by CONOPT after a sequence of 2DDir calls each time either the point or the direction changes.
 

Detailed Description

Virtual functions that can be optionally implemented by the user to define the second order derivatives.

CONOPT is able to take advantage of second derivatives. Only some of the potential ways of defining second derivatives is currently being implemented and this chapter describes the interface defined so far.

Second order information is used to formulate and solve an inner Quadratic Programming sub-model in a Sequential Quadratic Programming (SQP) like algorithm. If 2nd order information is not readily available it will be approximated from gradient values using finite differences. Please note that perturbations may require very many gradient calls so this option assumes that gradient calls are cheap. In addition, perturbations require accurate gradients. Even given accurate derivatives they are not as accurate as properly coded 2nd derivatives. It is therefore highly recommended to provide some form or second order information.

Second order information can be available in two forms: (1) as a sparse matrix of second derivatives, the Hessian matrix, and (2) or as a vector of directional second derivatives. Second-order information can be defined based on individual constraints or on the Lagrangian function, giving a total of four combinations of which three are supported at the moment. The notation and mathematical relationship between these concepts is as follows:

The Hessian of a single constraint \(k\), \(H_k\), is the matrix of second derivatives defined as

\[ H_{ij}^k = \frac{\partial^{2}g_k}{\partial x_{i}\partial x_{j}} \]

The Hessian is by definition a symmetric matrix and we will usually only work with the lower triangular part of the matrix, including the diagonal. \(H^k\) will only have non-zero elements in rows and columns corresponding to variables that appear nonlinearly in constraint \(k\), so if constraint \(k\) is sparse, the corresponding Hessian will usually also be sparse. But notice that if constraint \(k\) depends on \(p\) variables, the Hessian may have up to \(p^2\) elements in total or \(p \cdot (p+1)/2\) elements in the lower triangular part of the matrix. There are models where the constraints on average are sparse, but one or a few of the constraints depend on many variables, and their Hessians have a huge number of nonzero elements.

The Lagrangian function is defined as

\[ L(x,u)=\sum_{k}u_{k} \cdot g_{k}(x) \]

where we have ignored some terms that do not contribute to second derivatives. \(u\) is a vector of Lagrange multipliers defined by CONOPT. There is one Lagrange multiplier for each constraint.

The Hessian of the Lagrangian, \(HL\), is defined as:

\[ HL = \sum_{k}u_{k} \cdot H^{k} \quad \text{or} \quad HL_{ij} = \sum_{k}u_{k} \cdot \frac{\partial^{2}g_{k}}{\partial x_{i}\partial x_{j}} \]

The sparsety pattern for the Hessian of the Lagrangian is the union of the sparsety patterns for the Hessians of all constraints in the model. This means that if one of the constraints has a dense Hessian then the Hessian of the Lagrangian will also be dense and CONOPT cannot use a Hessian efficiently with these models.

The directional first derivative of constraint \(k\) in point \(x\) and in the direction \(v\) is defined as

\[ d1_{k}(x,v) = \frac{\partial g_{k}(x+\theta\cdot v)}{\partial \theta} = \frac{\partial g_{k}}{\partial x} \cdot v \]

or the scalar product of the first derivative of \(g_k\) with the direction \(v\). The directional second derivative is defined as

\[ d2_{k}(x,v) = \frac{\partial d1_{k}(x,v)}{\partial x} = \frac{\partial^{2}g_{k}}{\partial x^{2}} \cdot v = H^{k} \cdot v \]

or the product of the Hessian with the direction v. The directional second derivative of the Lagrangian is similarly defined as

\[ HL \cdot v = \sum_{k}u_{k} \cdot H_{k} \cdot v \]

where the order of the multiplications can be selected by the implementer. Notice that in cases where \(H_k\) has too many elements to be of practical direct use, the directional second derivative is still a vector with only one element per variable, and it may be cheap to compute.

Currently, CONOPT accept second derivative information in three forms:

1. The Hessian of the Lagrangian,

2. Directional second derivatives of individual constraints, and

3. Directional second derivatives of the Lagrangian.

The remaining combination, the Hessians of individual constraints, can be implemented if they are more convenient for some users.

CONOPT uses the second order information in a Sequential Quadratic Programming (SQP) algorithm to find good directions for a line search. The SQP model is solved approximately using a conjugate gradient (CG) algorithm. The SQP tolerances are large on the beginning of the optimization and are reduced as we approach the optimal solution. The gradients used in the CG algorithm are either computed by repeated calls to the directional second derivative routine or by a single call to the Hessian routine followed by several multiplications of the Hessian with a direction vector inside CONOPT. If both a Hessian and directional derivatives are available, CONOPT will initially use 2DDir or 2DDirLagr when few CG iterations are expected, and switch to 2DLagrVal when the accuracy is tighter and a larger number of iterations are expected.

The modeler can tell CONOPT that second order information is available as the Hessian of the Lagrangian, assumed to be a sparse matrix, by calling ConoptModelData::setSDLagrangianStructure(). The number of nonzero elements and the sparsety pattern of the Hessian of the Lagrangian are defined in two setup calls to 2DLagrStr. The numerical values of the Hessian must be returned by calls to ConoptModelData::SDLagrVal(). The numerical calls will be scattered throughout the optimization, usually with no more than one call per iteration.

If the user wants to compute the directional second derivative, this is registered by calling ConoptModelData::setSDEvaluationType() and supplying either ConoptSDEvaluationType::Constraint or ConoptSDEvaluationType::Lagrangian. If ConoptSDEvaluationType::Constraint is set, then the user must implement ConoptModelData::SDDir(). Alternatively, if ConoptSDEvaluationType::Lagrangian is set, then the user must implement ConoptModelData::SDDirLagr(). For the directional derivatives, the user may also implement ConoptModelData::SDDirIni() and ConoptModelData::SDDirEnd() for the setup and teardown of the second derivative evaluation.

Enumeration Type Documentation

◆ ConoptSDEvaluationType

enum class ConoptSDEvaluationType
strong

the evaluation type for the directional second derivatives

Enumerator
None 
Constraint 
Lagrangian 

Definition at line 85 of file defines.h.

Function Documentation

◆ setSDEvaluationType()

void ConoptModelData::setSDEvaluationType ( ConoptSDEvaluationType sdevaltype)

informs CONOPT of the method for evaluating the second derivative

◆ setSDLagrangianStructure()

void ConoptModelData::setSDLagrangianStructure ( const std::vector< int > & rownum,
const std::vector< int > & colnum )

sets the structure of the second derivatives of the Lagrangian

Parameters
rownumVector of row numbers of the lower triangular part of the Hessian.
colnumVector of column numbers of the lower triangular part of the Hessian.

◆ SDLagrVal()

virtual int ConoptModelData::SDLagrVal ( const double x[],
const double u[],
const int hsrw[],
const int hscl[],
double hsvl[],
int * nodrv,
int numvar,
int numcon,
int nhess )
inlinevirtual

Computes and returns the numerical values of the Hessian.

The structure of the Lagrangian of the Hessian is provided using setSDLagrangianStructure()

where:

  • MODE: Distinguishes between the three modes described above.
  • X: Vector with the point in which the Hessian of the Lagrangian should be computed. Defined by CONOPT when MODE = 3.
  • U: The vector of weights on the individual constraints. The Lagrangian is defined as L = SUM(r in rows)U(r) . function(r). U is defined by CONOPT when MODE = 3.
  • ROWNO: Vector of row numbers of the lower triangular part of the Hessian. Must be defined by the modeler when MODE = 2 and is provided as a help to the modeler when MODE = 3.
  • COLNO: Vector of column numbers of the lower triangular part of the Hessian. Must be defined by the modeler when MODE = 2 and is provided as a help to the modeler when MODE = 3. The elements of the Hessian must be sorted column wise, i.e. COLNO must be non-decreasing. Within each column the elements must be sorted row-wise, i.e. the elements of ROWNO must be increasing for sequences for which COLNO is constant. The row and column numbers are interpreted according to Base, i.e. they are between 1 and N when Base = 1 (Fortran conventions) and between 0 and N-1 when Base = 0 (C conventions).
  • VALUE: Vector returning the values of the second derivatives when MODE = 3. The individual elements must be defined in the order used for ROWNO and COLNO. VALUE will be initialized to zero when 2DLagrVal is called and it must be defined by the modeler.
  • NODRV: Can be set to 1 if the derivatives for some reason could not be computed, for example because some of them were not defined. Is initialized to 0 by CONOPT. CONOPT will not use second order methods in the current point if NODRV is 1.
  • N: The number of variables in the model as defined in ConoptModelData::setProblemDimension. Provided by CONOPT.
  • M: The number of constraints in the model as defined in ConoptModelData::setProblemDimension. Provided by CONOPT.
  • NHESS: The number of nonzero elements in the Hessian. When MODE = 1 CONOPT will define the maximum number of elements it will accept in NHESS, and 2DLagrVal must return the actual number of element. If the actual number is larger than the input value of NHESS then CONOPT will not use the Hessian of the Lagrangian and 2DLagrVal will not be called again. When MODE > 1 NHESS is defined by CONOPT as the value provided by the modeler in the first call.

2DLagrVal will only be called in points in which the constraint values have been computed successfully before with FDEval. Checks for function evaluation errors can therefore usually be limited. Note that 2DLagrVal in some cases can be called in a point that is different from the last point in which FDEval was called.

The maximum number of Hessian nonzero elements accepted by CONOPT is computed as the option value RVHESS multiplied by the number of nonlinear Jacobian elements. The default value of RVHESS is 10. If the Hessian has more than 10 times as many elements as the Jacobian it is expected that it is too expensive to compute and that directional 2nd derivatives in some form are more efficient to use. The modeler may allow CONOPT to use a denser Hessian by increasing RVHESS.

There is a rudimentary debugger of the values returned by 2DLagrVal. It can be turned on by setting Debug2D in coidef_debug2d() or with the option LKDEB2. Information on the debugger including messages and error return codes can be found in Error Return Codes.

Reimplemented in Tut2_ModelData.

Definition at line 432 of file modeldata.h.

◆ SDDirLagr()

virtual int ConoptModelData::SDDirLagr ( const double x[],
const double dx[],
const double u[],
double d2g[],
int newpt,
int * nodrv,
int numvar,
int numcon )
inlinevirtual

computes the directional second derivative for the Lagrangian

where:

  • X: Vector with the point in which the directional second derivatives should be computed. Defined by CONOPT.
  • DX: Vector with the direction in which the directional second derivatives should be computed. Defined by CONOPT.
  • U: The vector of weights on the individual constraints. The Lagrangian is defined as L = SUM(r in rows)U(row) \cdot function(row). U is defined by CONOPT when MODE = 3.
  • D2G: Vector returning the directional second derivative. The relevant positions (see JCNM below) will be zero on entry to 2DDir.
  • NEWPT: Indicator that describes the pair (X,DX). Is provided by CONOPT:
    - <b>1:</b> This is a new point.
    
    - <b>2:</b> This is the same point as in last call, i.e. the variables in
    `X` have not changed but the direction `DX` has changed.
    
    `NEWPT` is provided by CONOPT as a service to the modeler. It may be used to
    calculate and reuse terms that depend on `X` but do not depend on the
    direction. 
    
  • NODRV: Must be set to 1 if the directional second derivatives for some reason could not be computed, for example because some of them were not defined. CONOPT will not use any output from a point that returned a nonzero NODRV.
  • N: The number of variables in the model as defined in ConoptModelData::setProblemDimension. Provided by CONOPT.
  • M: The number of constraints in the model as defined in ConoptModelData::setProblemDimension. Provided by CONOPT.

Definition at line 445 of file modeldata.h.

◆ SDDir()

virtual int ConoptModelData::SDDir ( const double x[],
const double dx[],
double d2g[],
int rowno,
const int jacnum[],
int * nodrv,
int numvar,
int numjac,
int thread )
inlinevirtual

computes the directional second derivative for a single constraint

where:

  • X: Vector with the point in which the directional second derivatives should be computed. Defined by CONOPT.
  • DX: Vector with the direction in which the directional second derivatives should be computed. Defined by CONOPT.
  • D2G: Vector returning the directional second derivative. The relevant positions (see JCNM below) will be zero on entry to 2DDir.
  • ROWNO: The constraint for which the directional second derivatives should be computed. Defined by CONOPT. 2DDir will be called for one constraint at a time and only for nonlinear constraints.
  • JCNM: List of column numbers for the variables that appear nonlinearly in the current constraint. JCNM and NJ (see below) are provided as a service to modelers that may find this information useful; it can be ignored by others.
  • NODRV: Must be set to 1 if the directional second derivatives for some reason could not be computed, for example because some of them were not defined. CONOPT will not use any output from a point that returned a nonzero NODRV.
  • N: The number of variables in the model as defined in ConoptModelData::setProblemDimension. Provided by CONOPT.
  • NJ: The number of variables appearing nonlinearly in the current constraint. Defined by CONOPT.
  • THREAD: In some multi-threading environments multiple copies of FDEval can be called at the same time, (see Multi Threading). THREAD will hold the number of the thread for the current call of FDEval. THREAD is in the interval from 0 to the maximum number of threads allowed for FDEval.

Definition at line 458 of file modeldata.h.

◆ SDDirIni()

virtual int ConoptModelData::SDDirIni ( const double x[],
const double dx[],
const int rowlist[],
int listsize,
int numthread,
int newpt,
int * nodrv,
int numvar )
inlinevirtual

called by CONOPT before a sequence of 2DDir calls each time either the point or the direction changes.

This method can be used to initialize the 2DDir computations, for example to compute common terms.

where:

  • X: Vector with the point in which the directional second derivatives should be computed in future 2DDir calls. Defined by CONOPT.
  • DX: Vector with the direction in which the directional second derivatives should be computed in future 2DDir calls. Defined by CONOPT.
  • ROWLIST: List of the constraints that will be evaluated in the future 2DDir calls.
  • LISTSIZE: The number of elements in ROWLIST, defined by CONOPT.
  • NUMTHREAD: The number of threads to be used for the following 2DDir calls.
  • NEWPT: Indicator that describes the pair (X,DX). Is provided by CONOPT:
    - <b>1:</b> This is a new point.
    
    - <b>2:</b> This is the same point as in last call, i.e. the variables in
    `X` have not changed but the direction `DX` has changed.
    
    `NEWPT` is provided by CONOPT as a service to the modeler. It may be used to
    calculate and reuse terms that depend on `X` but do not depend on the
    direction. 
    
  • NODRV: Must be set to 1 if the directional second derivatives for some reason could not be computed, for example because some of them were not defined. CONOPT will not use any output from a point that returned a nonzero NODRV and 2DDir and the optional 2DDirEnd will not be called in this point.
  • N: The number of variables in the model as defined in ConoptModelData::setProblemDimension. Provided by CONOPT.

Definition at line 473 of file modeldata.h.

◆ SDDirEnd()

virtual int ConoptModelData::SDDirEnd ( int * nodrv)
inlinevirtual

called by CONOPT after a sequence of 2DDir calls each time either the point or the direction changes.

This method can be used to perform cleanup tasks, including to report if anything went wrong.

where:

  • NODRV: Can be set to 1 if the directional second derivatives for some reason could not be computed, for example because some of them were not defined. this is an alternative to the use of NODRV in 2DDir. CONOPT will not use any output from a point and direction that returned a nonzero NODRV.

Definition at line 488 of file modeldata.h.