CONOPT
Functions

| Function | Description |
| --- | --- |
| pyconopt.ModelData.evaluateSDLagrangian(self, x, u, hessianrow, hessiancol) | Computes and returns the numerical values of the Hessian of the Lagrangian. |
| pyconopt.ModelData.evaluateSDDirLagrangian(self, x, dx, u, newpoint) | Computes the directional second derivative for the Lagrangian. |
| pyconopt.ModelData.evaluateDirectionalSD(self, x, dx, rowno, jacnum, thread) | Computes the directional second derivative for a single constraint. |
| pyconopt.ModelData.endDirectionalSDEval(self) | Called by CONOPT after a sequence of evaluateDirectionalSD() calls, each time either the point or the direction changes. |
| pyconopt.ModelData.setSDEvaluationType(self, sdevaltype) | Informs CONOPT of the method for evaluating the second derivative. |
| pyconopt.ModelData.setSDLagrangianStructure(self, rownum, colnum) | Sets the structure of the second derivatives of the Lagrangian. |
Virtual functions that can optionally be implemented by the user to define the second-order derivatives.
CONOPT is able to take advantage of second derivatives. Only some of the potential ways of defining second derivatives are currently implemented, and this chapter describes the interface defined so far.
Second-order information is used to formulate and solve an inner Quadratic Programming sub-model in a Sequential Quadratic Programming (SQP) like algorithm. If second-order information is not readily available, it will be approximated from gradient values using finite differences. Note that these perturbations may require very many gradient calls, so this option assumes that gradient calls are cheap. In addition, perturbations require accurate gradients, and even then the resulting approximations are not as accurate as properly coded second derivatives. It is therefore highly recommended to provide some form of second-order information.
Second-order information can be available in two forms: (1) as a sparse matrix of second derivatives, the Hessian matrix, or (2) as a vector of directional second derivatives. Second-order information can be defined based on individual constraints or on the Lagrangian function, giving a total of four combinations, of which three are supported at the moment. The notation and mathematical relationship between these concepts is as follows:
The Hessian of a single constraint \(k\), \(H^k\), is the matrix of second derivatives defined as
\[ H_{ij}^k = \frac{\partial^{2}g_k}{\partial x_{i}\partial x_{j}} \]
The Hessian is by definition a symmetric matrix and we will usually only work with the lower triangular part of the matrix, including the diagonal. \(H^k\) will only have non-zero elements in rows and columns corresponding to variables that appear nonlinearly in constraint \(k\), so if constraint \(k\) is sparse, the corresponding Hessian will usually also be sparse. But notice that if constraint \(k\) depends on \(p\) variables, the Hessian may have up to \(p^2\) elements in total or \(p \cdot (p+1)/2\) elements in the lower triangular part of the matrix. There are models where the constraints on average are sparse, but one or a few of the constraints depend on many variables, and their Hessians have a huge number of nonzero elements.
The Lagrangian function is defined as
\[ L(x,u)=\sum_{k}u_{k} \cdot g_{k}(x) \]
where we have ignored some terms that do not contribute to second derivatives. \(u\) is a vector of Lagrange multipliers defined by CONOPT. There is one Lagrange multiplier for each constraint.
The Hessian of the Lagrangian, \(HL\), is defined as:
\[ HL = \sum_{k}u_{k} \cdot H^{k} \quad \text{or} \quad HL_{ij} = \sum_{k}u_{k} \cdot \frac{\partial^{2}g_{k}}{\partial x_{i}\partial x_{j}} \]
The sparsity pattern for the Hessian of the Lagrangian is the union of the sparsity patterns for the Hessians of all constraints in the model. This means that if one of the constraints has a dense Hessian, then the Hessian of the Lagrangian will also be dense and CONOPT cannot use a Hessian efficiently with these models.
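As a small numerical illustration of these definitions (plain NumPy, independent of the pyconopt API), the sketch below assembles the Hessian of the Lagrangian for a made-up model with two variables and two constraints and extracts its lower triangular nonzeros:

```python
import numpy as np

# Hypothetical per-constraint Hessians H^k for a model with two variables:
# g_1 = x_1^2 and g_2 = x_1*x_2 (made up for the illustration).
H1 = np.array([[2.0, 0.0],
               [0.0, 0.0]])
H2 = np.array([[0.0, 1.0],
               [1.0, 0.0]])

# Lagrange multipliers u_k as they would be supplied by CONOPT.
u = np.array([0.5, 2.0])

# Hessian of the Lagrangian: HL = sum_k u_k * H^k.
HL = u[0] * H1 + u[1] * H2           # [[1., 2.], [2., 0.]]

# Lower triangular part (including the diagonal) as (row, col, value)
# triplets, which is the form the sparse interface works with.
rows, cols = np.tril_indices_from(HL)
triplets = [(int(r), int(c), float(HL[r, c]))
            for r, c in zip(rows, cols) if HL[r, c] != 0.0]
print(triplets)                      # [(0, 0, 1.0), (1, 0, 2.0)]
```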
The directional first derivative of constraint \(k\) at the point \(x\) in the direction \(v\) is defined as
\[ d1_{k}(x,v) = \frac{\partial g_{k}(x+\theta\cdot v)}{\partial \theta} = \frac{\partial g_{k}}{\partial x} \cdot v \]
or the scalar product of the first derivative of \(g_k\) with the direction \(v\). The directional second derivative is defined as
\[ d2_{k}(x,v) = \frac{\partial d1_{k}(x,v)}{\partial x} = \frac{\partial^{2}g_{k}}{\partial x^{2}} \cdot v = H^{k} \cdot v \]
or the product of the Hessian with the direction \(v\). The directional second derivative of the Lagrangian is similarly defined as
\[ HL \cdot v = \sum_{k}u_{k} \cdot H^{k} \cdot v \]
where the order of the multiplications can be selected by the implementer. Notice that in cases where \(H^k\) has too many elements to be of practical direct use, the directional second derivative is still a vector with only one element per variable, and it may be cheap to compute.
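Continuing the same toy example, the directional second derivatives can be formed without ever building \(HL\) explicitly; the sketch below is again plain NumPy, not part of the pyconopt API:

```python
import numpy as np

# Same hypothetical toy model as above: two variables, two constraints.
H1 = np.array([[2.0, 0.0], [0.0, 0.0]])
H2 = np.array([[0.0, 1.0], [1.0, 0.0]])
u = np.array([0.5, 2.0])             # Lagrange multipliers from CONOPT
v = np.array([1.0, -1.0])            # direction vector

# Directional second derivative of each constraint: d2_k = H^k @ v.
d2_1 = H1 @ v                        # [ 2.,  0.]
d2_2 = H2 @ v                        # [-1.,  1.]

# Directional second derivative of the Lagrangian, HL @ v, accumulated as
# sum_k u_k * (H^k @ v) without ever forming HL explicitly.
hl_v = u[0] * d2_1 + u[1] * d2_2     # [-1.,  2.]
print(d2_1, d2_2, hl_v)
```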
Currently, CONOPT accepts second derivative information in three forms:
1. The Hessian of the Lagrangian,
2. Directional second derivatives of individual constraints, and
3. Directional second derivatives of the Lagrangian.
The remaining combination, the Hessians of individual constraints, can be implemented if they are more convenient for some users.
CONOPT uses the second-order information in a Sequential Quadratic Programming (SQP) algorithm to find good directions for a line search. The SQP model is solved approximately using a conjugate gradient (CG) algorithm. The SQP tolerances are large at the beginning of the optimization and are reduced as we approach the optimal solution. The gradients used in the CG algorithm are either computed by repeated calls to the directional second derivative routine, or by a single call to the Hessian routine followed by several multiplications of the Hessian with a direction vector inside CONOPT. If both a Hessian and directional derivatives are available, CONOPT will initially use evaluateDirectionalSD() or evaluateSDDirLagrangian() when few CG iterations are expected, and switch to evaluateSDLagrangian() when the accuracy is tighter and a larger number of iterations is expected.
The modeler can tell CONOPT that second-order information is available as the Hessian of the Lagrangian, assumed to be a sparse matrix, by calling pyconopt.ModelData.setSDLagrangianStructure(), which defines the number of nonzero elements and the sparsity pattern of the Hessian of the Lagrangian. The numerical values of the Hessian must be returned by calls to pyconopt.ModelData.evaluateSDLagrangian(). The numerical calls will be scattered throughout the optimization, usually with no more than one call per iteration.
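A minimal sketch of this workflow is shown below for a model whose only nonlinear row is the quadratic objective \(x_1^2 + x_1 x_2\). The class name, the setup hook registerSecondDerivatives(), and the objective row index OBJ_ROW are assumptions made for the illustration; setSDLagrangianStructure() and evaluateSDLagrangian() are the documented pyconopt calls.

```python
import pyconopt

class HessModelData(pyconopt.ModelData):      # hypothetical user model class
    OBJ_ROW = 0                               # assumed index of the objective row

    def registerSecondDerivatives(self):      # hypothetical setup hook
        # Lower triangular structure of the Hessian of the Lagrangian for
        # f(x) = x1^2 + x1*x2: entries (0,0) and (1,0).
        self.setSDLagrangianStructure([0, 1], [0, 0])

    def evaluateSDLagrangian(self, x, u, hessianrow, hessiancol):
        # Constant Hessian [[2, 1], [1, 0]], scaled by the objective multiplier.
        w = u[self.OBJ_ROW]
        return [2.0 * w,                      # element (0, 0)
                1.0 * w]                      # element (1, 0)
```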
If the user instead wants to compute directional second derivatives, this is registered by calling pyconopt.ModelData.setSDEvaluationType() and supplying either ConoptSDEvaluationType::Constraint or ConoptSDEvaluationType::Lagrangian. If SDEvaluationType_Constraint is set, then the user must implement pyconopt.ModelData.evaluateDirectionalSD(). Alternatively, if SDEvaluationType_Lagrangian is set, then the user must implement pyconopt.ModelData.evaluateSDDirLagrangian(). For the directional derivatives, the user may also implement pyconopt.ModelData.SDDirIni() and pyconopt.ModelData.endDirectionalSDEval() for the setup and teardown of the second derivative evaluation.
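A sketch of the registration step is shown below; the setup hook name and the exact spelling of the enum constant are assumptions (the text above quotes both ConoptSDEvaluationType::Constraint and SDEvaluationType_Constraint), so check the enum exposed by your pyconopt build.

```python
import pyconopt

class DirSDModelData(pyconopt.ModelData):     # hypothetical user model class
    def registerSecondDerivatives(self):      # hypothetical setup hook
        # Ask CONOPT for per-constraint directional second derivatives;
        # the constant's exact spelling is assumed here.
        self.setSDEvaluationType(pyconopt.SDEvaluationType_Constraint)
        # Alternatively, for directional second derivatives of the Lagrangian:
        # self.setSDEvaluationType(pyconopt.SDEvaluationType_Lagrangian)
```

With the constraint option, evaluateDirectionalSD() must then be implemented (see the example further below); with the Lagrangian option, evaluateSDDirLagrangian().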
pyconopt.ModelData.evaluateSDLagrangian(self, x, u, hessianrow, hessiancol)
Computes and returns the numerical values of the Hessian of the Lagrangian.
| Parameter | Description |
| --- | --- |
| x | the solution vector that needs to be evaluated. |
| u | the vector of weights on the individual constraints. |
| hessianrow | vector of row numbers of the lower triangular part of the Hessian. |
| hessiancol | vector of column numbers of the lower triangular part of the Hessian. |
returns a vector with the values of the Hessian of the Lagrangian. The length of the vector is numHessianNonzeros().
Notes: an error in the evaluation is reported by calling errorInEvaluation()
Reimplemented in qp3.QPModelData, qp4.QPModelData, and tutorial2.TutModelData.
Definition at line 2490 of file pyconopt.py.
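As an illustration, the sketch below implements the override for a purely quadratic model. The attributes q_matrix (a dense symmetric Hessian) and obj_row (the index of the single nonlinear row) are hypothetical and assumed to be set up elsewhere by the user; errorInEvaluation() and numHessianNonzeros() are the documented pyconopt calls.

```python
import numpy as np
import pyconopt

class QPLikeModelData(pyconopt.ModelData):    # hypothetical subclass
    def evaluateSDLagrangian(self, x, u, hessianrow, hessiancol):
        # Hessian of the Lagrangian for a model whose only nonlinear row is
        # the quadratic objective: u[obj_row] * Q, returned element by
        # element in the order given by (hessianrow, hessiancol).
        w = u[self.obj_row]
        values = [w * self.q_matrix[r][c]
                  for r, c in zip(hessianrow, hessiancol)]
        if not all(np.isfinite(v) for v in values):
            self.errorInEvaluation()          # report the failed evaluation
        return values                         # length equals numHessianNonzeros()
```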
pyconopt.ModelData.evaluateSDDirLagrangian(self, x, dx, u, newpoint)
computes the directional second derivative for the Lagrangian
| Parameter | Description |
| --- | --- |
| x | the solution vector that needs to be evaluated. |
| dx | vector with the direction in which the directional second derivatives should be computed. |
| u | the vector of weights on the individual constraints. |
| newpoint | a boolean to indicate whether x has changed. If newpoint is false, the values of dx may have changed. |
returns a vector for the directional second derivative. The length of the vector is the number of variables.
Notes: an error in the evaluation is reported by calling errorInEvaluation()
Definition at line 2507 of file pyconopt.py.
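A sketch for the same kind of quadratic-objective model as above; q_matrix and obj_row are again hypothetical attributes, and the newpoint flag is only used to show where point-dependent work could be refreshed.

```python
import numpy as np
import pyconopt

class QPLikeModelData(pyconopt.ModelData):    # hypothetical subclass
    def evaluateSDDirLagrangian(self, x, dx, u, newpoint):
        if newpoint:
            # A constant Hessian needs no refresh; a model with
            # point-dependent second derivatives would recompute any
            # cached quantities for the new x here.
            pass
        # HL @ dx when the objective row is the only nonlinear row:
        # u[obj_row] * (Q @ dx), one element per variable.
        q = np.asarray(self.q_matrix)
        return list(u[self.obj_row] * (q @ np.asarray(dx)))
```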
pyconopt.ModelData.evaluateDirectionalSD(self, x, dx, rowno, jacnum, thread)
computes the directional second derivative for a single constraint
| Parameter | Description |
| --- | --- |
| x | the solution vector that needs to be evaluated. |
| dx | vector with the direction in which the directional second derivatives should be computed. |
| rowno | the number of the row in which the nonlinear term exists. |
| jacnum | vector with a list of column numbers for the nonlinear nonzero Jacobian elements in the row. |
| thread | the index of the thread from which this method is being called. |
returns a vector for the directional second derivative. The length of the vector is the number of variables.
Notes: an error in the evaluation is reported by calling errorInEvaluation()
Reimplemented in qp2.QPModelData, and qp4.QPModelData.
Definition at line 2525 of file pyconopt.py.
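The sketch below implements the override for a model whose only nonlinear constraint is \(g(x) = x_1 \cdot x_2\); the attribute nl_row holding that constraint's row index is hypothetical. For this constraint, jacnum would list columns 0 and 1, and thread is unused because the evaluation keeps no shared state.

```python
import pyconopt

class BilinearModelData(pyconopt.ModelData):  # hypothetical subclass
    def evaluateDirectionalSD(self, x, dx, rowno, jacnum, thread):
        d2 = [0.0] * len(x)                   # one element per variable
        if rowno == self.nl_row:
            # H^k @ dx for g(x) = x1*x2, i.e. H^k = [[0, 1], [1, 0]];
            # only the nonlinear columns (0 and 1 here) are touched.
            d2[0] = dx[1]
            d2[1] = dx[0]
        else:
            self.errorInEvaluation()          # unexpected row: report it
        return d2
```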
pyconopt.ModelData.endDirectionalSDEval(self)
called by CONOPT after a sequence of evaluateDirectionalSD() calls, each time either the point or the direction changes.
It can be used to perform cleanup tasks, including to report if anything went wrong.
Notes: an error in the evaluation is reported by calling errorInEvaluation()
Definition at line 2565 of file pyconopt.py.
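A sketch of a cleanup override; the cached attributes are hypothetical and simply illustrate the kind of per-point or per-direction work that can be released here.

```python
import pyconopt

class CachingModelData(pyconopt.ModelData):   # hypothetical subclass
    def endDirectionalSDEval(self):
        # The point or the direction is about to change, so anything cached
        # for the current (x, dx) pair can be dropped; _sd_cache and
        # _sd_cache_failed are hypothetical attributes.
        if getattr(self, "_sd_cache_failed", False):
            self.errorInEvaluation()          # report a failure seen earlier
        self._sd_cache = None
        self._sd_cache_failed = False
```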
pyconopt.ModelData.setSDEvaluationType(self, sdevaltype)
informs CONOPT of the method for evaluating the second derivative
Reimplemented from pyconopt.ConoptModelData.
Definition at line 2848 of file pyconopt.py.
pyconopt.ModelData.setSDLagrangianStructure(self, rownum, colnum)
sets the structure of the second derivatives of the Lagrangian
| Parameter | Description |
| --- | --- |
| rownum | Vector of row numbers of the lower triangular part of the Hessian. |
| colnum | Vector of column numbers of the lower triangular part of the Hessian. |
Reimplemented from pyconopt.ConoptModelData.
Definition at line 2857 of file pyconopt.py.
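A sketch of building the lower triangular structure from a dense symmetric Hessian before registering it; the setup hook name and the q_matrix attribute are assumptions for the illustration.

```python
import numpy as np
import pyconopt

class QPLikeModelData(pyconopt.ModelData):    # hypothetical subclass
    def registerHessianStructure(self):       # hypothetical setup hook
        q = np.asarray(self.q_matrix)         # dense symmetric Hessian (assumed)
        rownum, colnum = [], []
        for r in range(q.shape[0]):
            for c in range(r + 1):            # lower triangle, incl. the diagonal
                if q[r, c] != 0.0:
                    rownum.append(r)
                    colnum.append(c)
        # Row/column pairs follow the row >= column convention of the
        # lower triangular storage described above.
        self.setSDLagrangianStructure(rownum, colnum)
```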