In this chapter we discuss the problem classes that MOSEK can solve and how they are formulated: linear optimization, network flow problems, convex quadratic optimization, conic optimization, and general convex nonlinear optimization, followed by advice on problem formulation and a portfolio optimization case study.
A linear optimization problem can be written as

$$
\begin{array}{lccccl}
\mbox{minimize} & & & c^T x + c^f & & \\
\mbox{subject to} & l^c & \leq & A x & \leq & u^c, \\
 & l^x & \leq & x & \leq & u^x,
\end{array}
\tag{9.1.1}
$$

where

- $m$ is the number of constraints and $n$ is the number of decision variables,
- $x \in \mathbb{R}^n$ is the vector of decision variables,
- $c \in \mathbb{R}^n$ is the linear objective and $c^f$ is a constant objective offset,
- $A \in \mathbb{R}^{m \times n}$ is the constraint matrix,
- $l^c, u^c \in \mathbb{R}^m$ are the lower and upper bounds on the constraints, and
- $l^x, u^x \in \mathbb{R}^n$ are the lower and upper bounds on the variables.
A primal solution (x) is (primal) feasible if it satisfies all constraints in (9.1.1). If (9.1.1) has at least one primal feasible solution, then (9.1.1) is said to be (primal) feasible.
In case (9.1.1) does not have a feasible solution, the problem is said to be (primal) infeasible.
Corresponding to the primal problem (9.1.1), there is a dual problem
$$
\begin{array}{lccl}
\mbox{maximize} & (l^c)^T s_l^c - (u^c)^T s_u^c + (l^x)^T s_l^x - (u^x)^T s_u^x + c^f & & \\
\mbox{subject to} & A^T y + s_l^x - s_u^x & = & c, \\
 & -y + s_l^c - s_u^c & = & 0, \\
 & s_l^c, s_u^c, s_l^x, s_u^x & \geq & 0.
\end{array}
\tag{9.1.2}
$$
If a bound in the primal problem is plus or minus infinity, the corresponding dual variable is fixed at 0, and we use the convention that the product of the bound value and the corresponding dual variable is 0. For example,

$$
l_j^x = -\infty \quad \Rightarrow \quad (s_l^x)_j = 0 \ \mbox{ and } \ l_j^x \cdot (s_l^x)_j = 0.
$$

This is equivalent to removing the variable $(s_l^x)_j$ from the dual problem.
A solution

$$
(y, s_l^c, s_u^c, s_l^x, s_u^x)
$$

to the dual problem is feasible if it satisfies all the constraints in (9.1.2). If (9.1.2) has at least one feasible solution, then (9.1.2) is (dual) feasible, otherwise the problem is (dual) infeasible.
We will denote a solution

$$
(x, y, s_l^c, s_u^c, s_l^x, s_u^x)
$$

so that $x$ is a solution to the primal problem (9.1.1) and

$$
(y, s_l^c, s_u^c, s_l^x, s_u^x)
$$

is a solution to the corresponding dual problem (9.1.2). A solution which is both primal and dual feasible is called a primal-dual feasible solution.
Let

$$
(x^*, y^*, (s_l^c)^*, (s_u^c)^*, (s_l^x)^*, (s_u^x)^*)
$$

be a primal-dual feasible solution, and let

$$
(x^c)^* := A x^*.
$$

For a primal-dual feasible solution the difference between the primal and the dual objective value (the optimality gap) satisfies

$$
\begin{array}{rcl}
c^T x^* + c^f - \left( (l^c)^T (s_l^c)^* - (u^c)^T (s_u^c)^* + (l^x)^T (s_l^x)^* - (u^x)^T (s_u^x)^* + c^f \right)
& = & \displaystyle\sum_{i=1}^m \left[ (s_l^c)_i^* ((x_i^c)^* - l_i^c) + (s_u^c)_i^* (u_i^c - (x_i^c)^*) \right] \\
& & {} + \displaystyle\sum_{j=1}^n \left[ (s_l^x)_j^* (x_j^* - l_j^x) + (s_u^x)_j^* (u_j^x - x_j^*) \right] \\
& \geq & 0,
\end{array}
$$

where the first relation can be obtained by multiplying the dual constraints in (9.1.2) by $x^*$ and $(x^c)^*$ respectively, and the second relation comes from the fact that each term in each sum is nonnegative. It follows that the primal objective value is always greater than or equal to the dual objective value.
We then define the duality gap as the difference between the primal objective value and the dual objective value, i.e.

$$
c^T x^* + c^f - \left( (l^c)^T (s_l^c)^* - (u^c)^T (s_u^c)^* + (l^x)^T (s_l^x)^* - (u^x)^T (s_u^x)^* + c^f \right).
$$

Please note that for a primal-dual feasible solution the duality gap is always nonnegative.
It is well-known that a linear optimization problem has an optimal solution if and only if there exist feasible primal and dual solutions such that the duality gap is zero, or, equivalently, that the complementarity conditions

$$
\begin{array}{rcll}
(s_l^c)_i^* \left( (x_i^c)^* - l_i^c \right) & = & 0, & i = 1, \ldots, m, \\
(s_u^c)_i^* \left( u_i^c - (x_i^c)^* \right) & = & 0, & i = 1, \ldots, m, \\
(s_l^x)_j^* \left( x_j^* - l_j^x \right) & = & 0, & j = 1, \ldots, n, \\
(s_u^x)_j^* \left( u_j^x - x_j^* \right) & = & 0, & j = 1, \ldots, n,
\end{array}
$$

are satisfied.
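As a small numerical illustration of these optimality conditions, the sketch below builds a tiny instance of (9.1.1), plugs in a hand-picked primal-dual pair, and checks the duality gap and the complementarity conditions with numpy. The data and the chosen solutions are made up for illustration only; MOSEK would report such a pair automatically.

```python
import numpy as np

# A tiny instance of (9.1.1): minimize c^T x
#   subject to 1 <= x1 + x2 (upper bound +inf),  x >= 0.
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
lc = np.array([1.0])            # constraint lower bound
lx = np.zeros(2)                # variable lower bounds (upper bounds are +inf)

# Hand-picked optimal primal and dual solutions.
x = np.array([1.0, 0.0])
y = np.array([1.0])             # dual variable of the constraint
slc = y.copy()                  # from -y + s_l^c = 0
slx = c - A.T @ y               # from A^T y + s_l^x = c (s_u terms vanish)

xc = A @ x
primal_obj = c @ x
dual_obj = lc @ slc + lx @ slx  # terms belonging to infinite bounds are dropped

print("duality gap:", primal_obj - dual_obj)                   # 0.0
print("complementarity:", slc * (xc - lc), slx * (x - lx))     # all zeros
```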
If (9.1.1) has an optimal solution and MOSEK solves the problem successfully, both the primal and dual solution are reported, including a status indicating the exact state of the solution.
If the problem (9.1.1) is infeasible (has no feasible solution), MOSEK will report a certificate of the primal infeasibility: the dual solution reported is the certificate of infeasibility, and the primal solution is undefined.
A primal certificate (certificate of primal infeasibility) is a feasible solution to the modified dual problem

$$
\begin{array}{lccl}
\mbox{maximize} & (l^c)^T s_l^c - (u^c)^T s_u^c + (l^x)^T s_l^x - (u^x)^T s_u^x & & \\
\mbox{subject to} & A^T y + s_l^x - s_u^x & = & 0, \\
 & -y + s_l^c - s_u^c & = & 0, \\
 & s_l^c, s_u^c, s_l^x, s_u^x & \geq & 0,
\end{array}
\tag{9.1.3}
$$

so that the objective is strictly positive, i.e. a solution

$$
(y^*, (s_l^c)^*, (s_u^c)^*, (s_l^x)^*, (s_u^x)^*)
$$

to (9.1.3) so that

$$
(l^c)^T (s_l^c)^* - (u^c)^T (s_u^c)^* + (l^x)^T (s_l^x)^* - (u^x)^T (s_u^x)^* > 0.
$$
Such a solution implies that (9.1.3) is unbounded, and that its dual is infeasible.
We note that the dual of (9.1.3) is a problem whose constraints are identical to the constraints of the original primal problem (9.1.1): If the dual of (9.1.3) is infeasible, so is the original primal problem.
If the problem (9.1.2) is infeasible (has no feasible solution), MOSEK will report a certificate of the dual infeasibility: the primal solution reported is the certificate of infeasibility, and the dual solution is undefined.
A certificate of dual infeasibility is a feasible solution to the problem

$$
\begin{array}{lccccl}
\mbox{minimize} & & & c^T x & & \\
\mbox{subject to} & \hat l^c & \leq & A x & \leq & \hat u^c, \\
 & \hat l^x & \leq & x & \leq & \hat u^x,
\end{array}
\tag{9.1.4}
$$

where

$$
\hat l_i^c = \left\{ \begin{array}{ll} 0, & l_i^c > -\infty, \\ -\infty, & \mbox{otherwise,} \end{array} \right.
\qquad
\hat u_i^c = \left\{ \begin{array}{ll} 0, & u_i^c < \infty, \\ \infty, & \mbox{otherwise,} \end{array} \right.
$$

and

$$
\hat l_j^x = \left\{ \begin{array}{ll} 0, & l_j^x > -\infty, \\ -\infty, & \mbox{otherwise,} \end{array} \right.
\qquad
\hat u_j^x = \left\{ \begin{array}{ll} 0, & u_j^x < \infty, \\ \infty, & \mbox{otherwise,} \end{array} \right.
$$

so that the objective value is strictly negative. Such a solution implies that (9.1.4) is unbounded, and that the dual of (9.1.4) is infeasible.
We note that the dual of (9.1.4) is a problem whose constraints are identical to the constraints of the original dual problem (9.1.2): If the dual of (9.1.4) is infeasible, so is the original dual problem.
In case both the primal problem (9.1.1) and the dual problem (9.1.2) are infeasible, MOSEK will report only one of the two possible certificates; which one is reported is not defined (MOSEK returns the first certificate found).
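To make the notion of a certificate concrete, the sketch below takes an obviously infeasible instance of (9.1.1), namely a constraint $x \leq -1$ combined with the bound $x \geq 0$, and verifies by hand that a particular point is feasible for (9.1.3) with a strictly positive objective. The data and the certificate are constructed manually for illustration; in practice MOSEK reports the certificate.

```python
import numpy as np

# Infeasible instance of (9.1.1):  x <= -1  (constraint) together with  x >= 0  (bound).
A = np.array([[1.0]])
uc = np.array([-1.0])   # constraint upper bound (lower bound is -inf)
lx = np.array([0.0])    # variable lower bound (upper bound is +inf)

# Candidate certificate of primal infeasibility, a feasible point of (9.1.3).
y = np.array([-1.0])
suc = np.array([1.0])   # multiplier of the finite constraint upper bound
slx = np.array([1.0])   # multiplier of the finite variable lower bound

# Feasibility of (9.1.3): A^T y + s_l^x = 0 and -y - s_u^c = 0, multipliers >= 0.
assert np.allclose(A.T @ y + slx, 0) and np.allclose(-y - suc, 0)
assert (suc >= 0).all() and (slx >= 0).all()

# A strictly positive objective proves that no x can satisfy both constraints.
print("certificate objective:", lx @ slx - uc @ suc)   # 1.0 > 0
```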
Network flow problems are a special class of linear optimization problems with many applications. The class of network flow problems can be specified as follows. Let $G = (N, A)$ be a directed network (graph) with node set $N$ and arc set $A$. Associated with every arc
$(i,j) \in A$ is a cost
$c_{ij}$ and a capacity
$u_{ij}$. Moreover, associated with each node
$i \in N$ in the network is a lower limit
$l_i$ and an upper limit
$u_i$ on the demand (supply) of the node. Now the minimum cost network flow problem can be stated as follows:

$$
\begin{array}{lccccl}
\mbox{minimize} & & & \displaystyle\sum_{(i,j) \in A} c_{ij} x_{ij} & & \\
\mbox{subject to} & l_i & \leq & \displaystyle\sum_{j:(i,j) \in A} x_{ij} - \sum_{j:(j,i) \in A} x_{ji} & \leq & u_i, \quad i \in N, \\
 & 0 & \leq & x_{ij} & \leq & u_{ij}, \quad (i,j) \in A,
\end{array}
\tag{9.2.1}
$$

where the variable $x_{ij}$ denotes the flow on arc $(i,j)$.
A classical example of a network flow problem is the transportation problem, where the objective is to distribute goods from warehouses to customers at the lowest possible total cost; see [8] for a detailed discussion of this application.
It is well-known that problems with network flow structure can be solved efficiently with a specialized version of the simplex method. MOSEK includes a highly tuned network simplex implementation, see Section 10.3.1 for further details on how to invoke the network optimizer.
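As a small illustration of the structure of such problems, the sketch below states a tiny transportation problem as an ordinary linear program and solves it with scipy. All data is made up, and the code only illustrates the model (9.2.1); invoking MOSEK's network optimizer is described in Section 10.3.1.

```python
import numpy as np
from scipy.optimize import linprog

# A tiny transportation problem (a special case of (9.2.1)); all data is made up.
supply = np.array([20.0, 30.0])          # warehouse capacities
demand = np.array([10.0, 25.0, 15.0])    # customer demands
cost = np.array([[2.0, 4.0, 5.0],        # cost[i, j]: warehouse i -> customer j
                 [3.0, 1.0, 7.0]])

m, n = cost.shape
# Supply constraints: sum_j x[i, j] <= supply[i]
A_ub = np.kron(np.eye(m), np.ones(n))
# Demand constraints: sum_i x[i, j] == demand[j]
A_eq = np.kron(np.ones(m), np.eye(n))

res = linprog(cost.ravel(), A_ub=A_ub, b_ub=supply,
              A_eq=A_eq, b_eq=demand, bounds=(0, None))
print("optimal cost:", res.fun)
print("shipment plan:\n", res.x.reshape(m, n))
```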
A convex quadratic optimization problem is an optimization problem of the form

$$
\begin{array}{lccccl}
\mbox{minimize} & & & \frac{1}{2} x^T Q^o x + c^T x + c^f & & \\
\mbox{subject to} & l_k^c & \leq & \frac{1}{2} x^T Q^k x + \displaystyle\sum_{j=1}^n a_{kj} x_j & \leq & u_k^c, \quad k = 1, \ldots, m, \\
 & l_j^x & \leq & x_j & \leq & u_j^x, \quad j = 1, \ldots, n,
\end{array}
\tag{9.3.1}
$$

where $Q^o$ and all $Q^k$ are symmetric matrices. The convexity requirement implies that $Q^o$ must be positive semi-definite and that each constraint must define a convex set; in particular, $Q^k$ must be positive semi-definite if $u_k^c < \infty$ and negative semi-definite if $l_k^c > -\infty$.
The convexity requirement is very important and it is strongly recommended that MOSEK is applied to convex problems only.
Any convex quadratic optimization problem can be reformulated as a conic optimization problem. It is our experience that for the majority of practical applications it is better to cast them as conic problems, because the conic form makes the convexity explicit (no numerical verification of positive semi-definiteness is required) and because the conic form is usually solved faster and more accurately. See Section 9.4.4.1 for further details.
The simplest quadratic optimization problem is

$$
\begin{array}{ll}
\mbox{minimize} & \frac{1}{2} x^T Q x + c^T x \\
\mbox{subject to} & A x \geq b.
\end{array}
\tag{9.3.2}
$$

The problem (9.3.2) is said to be a separable problem if $Q$ is a diagonal matrix or, in other words, if the quadratic terms in the objective all have the form

$$
\frac{1}{2} \sum_{j=1}^n Q_{jj} x_j^2
$$

instead of the general form

$$
\frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n Q_{ij} x_i x_j.
$$
The separable form has the advantage that $Q$ is represented by its diagonal only, so the problem description typically contains far fewer non-zeros and the work performed by the optimizer per iteration is reduced.
It is well-known that a positive semi-definite matrix $Q$ can always be factorized, i.e. a matrix $F$ exists so that

$$
Q = F^T F.
\tag{9.3.3}
$$
In many practical applications of quadratic optimization F is known explicitly; for example if Q is a covariance matrix, F would be the set of observations producing it.
Using (9.3.3), the problem (9.3.2) can be reformulated as

$$
\begin{array}{ll}
\mbox{minimize} & \frac{1}{2} y^T y + c^T x \\
\mbox{subject to} & A x \geq b, \\
 & F x - y = 0.
\end{array}
\tag{9.3.4}
$$

The problem (9.3.4) is also a quadratic optimization problem and has more constraints and variables than (9.3.2). However, the problem is separable. Normally, if $F$ has fewer rows than columns, it is worthwhile to reformulate the problem in the separable form. Indeed, consider the extreme case where $F$ has a single dense row: then $Q = F^T F$ will be a completely dense matrix, whereas the separable formulation (9.3.4) only stores the non-zeros of $F$ plus one quadratic term per element of $y$.
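The effect on the problem size can be seen with a few lines of numpy; the dimensions and data below are arbitrary and only serve to illustrate the non-zero counts discussed above.

```python
import numpy as np

# Illustration of the separable reformulation (9.3.4): a single dense row in F
# makes Q = F^T F completely dense, while the separable form only stores F itself.
n = 1000
rng = np.random.default_rng(0)
F = rng.standard_normal((1, n))        # one dense row
Q = F.T @ F                            # n x n and completely dense

nnz_Q = np.count_nonzero(Q)            # ~ n^2 non-zeros when using (9.3.2)
nnz_separable = np.count_nonzero(F) + F.shape[0]   # F plus one y_i^2 term per row of F
print(nnz_Q, "vs", nnz_separable)      # 1000000 vs 1001
```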
The idea presented above is applicable to quadratic constraints too. Now consider the constraint

$$
\frac{1}{2} x^T F^T F x \leq b,
\tag{9.3.5}
$$

where $F$ is a matrix and $b$ is a scalar. (9.3.5) can be reformulated as

$$
\begin{array}{l}
\frac{1}{2} y^T y \leq b, \\
F x - y = 0.
\end{array}
$$

It should be obvious how to generalize this idea to make any convex quadratic problem separable.
Next, consider the constraint

$$
\frac{1}{2} x^T (D + F^T F) x \leq b,
$$

where $D$ is a positive semi-definite matrix, $F$ is a matrix, and $b$ is a scalar. We assume that $D$ has a simple structure, i.e. $D$ is for instance a diagonal or a block diagonal matrix. If this is the case, then it may be worthwhile performing the reformulation

$$
\begin{array}{l}
\frac{1}{2} \left( x^T D x + y^T y \right) \leq b, \\
F x - y = 0.
\end{array}
$$
Now the question may arise: When should a quadratic problem be reformulated to make it separable or near separable? The simplest rule of thumb is that it should be reformulated if the number of non-zeros used to represent the problem decreases when reformulating the problem.
Conic optimization can be seen as a generalization of linear optimization. Indeed, a conic optimization problem is a linear optimization problem plus a constraint of the form

$$
x \in \mathcal{C},
$$

where $\mathcal{C}$ is a convex cone. A complete conic problem has the form

$$
\begin{array}{lccccl}
\mbox{minimize} & & & c^T x + c^f & & \\
\mbox{subject to} & l^c & \leq & A x & \leq & u^c, \\
 & l^x & \leq & x & \leq & u^x, \\
 & & & x \in \mathcal{C}. & &
\end{array}
\tag{9.4.1}
$$

The cone $\mathcal{C}$ can be a Cartesian product of $p$ convex cones, i.e.

$$
\mathcal{C} = \mathcal{C}_1 \times \cdots \times \mathcal{C}_p,
$$

in which case $x \in \mathcal{C}$ can be written as

$$
x = (x^1, \ldots, x^p), \qquad x^1 \in \mathcal{C}_1, \ \ldots, \ x^p \in \mathcal{C}_p,
$$

where each $x^t \in \mathbb{R}^{n_t}$. Please note that the $n$-dimensional Euclidean space
$\mathbb{R}^n$ is a cone itself, so simple linear variables are still allowed.
MOSEK supports only a limited number of cones, specifically cones of the form

$$
\mathcal{C} = \mathcal{C}_1 \times \cdots \times \mathcal{C}_p,
$$

where each $\mathcal{C}_t$ has one of the following forms.

$\mathbb{R}$ set (unrestricted):

$$
\mathcal{C}_t = \left\{ x \in \mathbb{R}^{n_t} \right\}.
$$

Quadratic cone:

$$
\mathcal{C}_t = \left\{ x \in \mathbb{R}^{n_t} : x_1 \geq \sqrt{\textstyle\sum_{j=2}^{n_t} x_j^2} \right\}.
$$

Rotated quadratic cone:

$$
\mathcal{C}_t = \left\{ x \in \mathbb{R}^{n_t} : 2 x_1 x_2 \geq \textstyle\sum_{j=3}^{n_t} x_j^2, \ x_1, x_2 \geq 0 \right\}.
$$
Although these cones may seem to provide only limited expressive power, they can be used to model a large range of problems, as demonstrated in Section 9.4.4.
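The two cone definitions are easy to check numerically. The helper functions below (the names are made up for this illustration) test membership of a vector in the quadratic and rotated quadratic cones, with a small tolerance for rounding errors.

```python
import numpy as np

# Membership tests for the quadratic and rotated quadratic cones defined above.
def in_quadratic_cone(x, tol=1e-9):
    x = np.asarray(x, dtype=float)
    return x[0] >= np.linalg.norm(x[1:]) - tol

def in_rotated_quadratic_cone(x, tol=1e-9):
    x = np.asarray(x, dtype=float)
    return x[0] >= -tol and x[1] >= -tol and 2 * x[0] * x[1] >= np.dot(x[2:], x[2:]) - tol

print(in_quadratic_cone([5.0, 3.0, 4.0]))           # True:  5 >= sqrt(9 + 16)
print(in_rotated_quadratic_cone([1.0, 2.0, 2.0]))   # True:  2*1*2 >= 4
print(in_quadratic_cone([1.0, 1.0, 1.0]))           # False: 1 <  sqrt(2)
```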
The dual problem corresponding to the conic optimization problem (9.4.1) is given by

$$
\begin{array}{lccl}
\mbox{maximize} & (l^c)^T s_l^c - (u^c)^T s_u^c + (l^x)^T s_l^x - (u^x)^T s_u^x + c^f & & \\
\mbox{subject to} & A^T y + s_l^x - s_u^x + s_n^x & = & c, \\
 & -y + s_l^c - s_u^c & = & 0, \\
 & s_l^c, s_u^c, s_l^x, s_u^x & \geq & 0, \\
 & s_n^x \in \mathcal{C}^*, & &
\end{array}
\tag{9.4.2}
$$

where the dual cone $\mathcal{C}^*$ is a product of cones

$$
\mathcal{C}^* = \mathcal{C}_1^* \times \cdots \times \mathcal{C}_p^*,
$$

where each $\mathcal{C}_t^*$ is the dual cone of
$\mathcal{C}_t$. For the cone types MOSEK can handle, the relation between the primal and the dual cone is given as follows.

$\mathbb{R}$ set (unrestricted):

$$
\mathcal{C}_t^* = \left\{ s \in \mathbb{R}^{n_t} : s = 0 \right\}.
$$

Quadratic cone (self-dual):

$$
\mathcal{C}_t^* = \mathcal{C}_t = \left\{ s \in \mathbb{R}^{n_t} : s_1 \geq \sqrt{\textstyle\sum_{j=2}^{n_t} s_j^2} \right\}.
$$

Rotated quadratic cone (self-dual):

$$
\mathcal{C}_t^* = \mathcal{C}_t = \left\{ s \in \mathbb{R}^{n_t} : 2 s_1 s_2 \geq \textstyle\sum_{j=3}^{n_t} s_j^2, \ s_1, s_2 \geq 0 \right\}.
$$
The dual of the dual problem is again the primal problem.
In case MOSEK finds a problem to be infeasible it will report a certificate of the infeasibility. This works exactly as for linear problems (see sections 9.1.1.3 and 9.1.1.4).
This section contains several examples of inequalities and problems that can be cast as conic optimization problems.
From Section 9.3.2 we know that any convex quadratic problem can be stated in the form

$$
\begin{array}{ll}
\mbox{minimize} & \frac{1}{2} \| F x \|^2 + c^T x \\
\mbox{subject to} & \frac{1}{2} \| G x \|^2 + a^T x \leq b,
\end{array}
\tag{9.4.3}
$$

where $F$ and $G$ are matrices and $c$ and $a$ are vectors. For simplicity we assume that there is only one constraint, but it should be obvious how to generalize the method to an arbitrary number of constraints.

Problem (9.4.3) can be reformulated as

$$
\begin{array}{ll}
\mbox{minimize} & t + c^T x \\
\mbox{subject to} & z + a^T x \leq b, \\
 & \frac{1}{2} \| F x \|^2 \leq t, \\
 & \frac{1}{2} \| G x \|^2 \leq z
\end{array}
\tag{9.4.4}
$$

after the introduction of the new variables $t$ and $z$. It is easy to convert this problem to a conic quadratic optimization problem, i.e.

$$
\begin{array}{ll}
\mbox{minimize} & t + c^T x \\
\mbox{subject to} & z + a^T x \leq b, \\
 & f - F x = 0, \\
 & g - G x = 0, \\
 & s = 1, \ w = 1, \\
 & 2 s t \geq \| f \|^2, \\
 & 2 w z \geq \| g \|^2.
\end{array}
\tag{9.4.5}
$$
In this case we can model the last two inequalities using rotated quadratic cones.
If we assume that $F$ is a non-singular matrix (for instance a diagonal matrix), then

$$
x = F^{-1} f,
$$

and hence we can eliminate $x$ from the problem to obtain:

$$
\begin{array}{ll}
\mbox{minimize} & t + c^T F^{-1} f \\
\mbox{subject to} & z + a^T F^{-1} f \leq b, \\
 & g - G F^{-1} f = 0, \\
 & s = 1, \ w = 1, \\
 & 2 s t \geq \| f \|^2, \\
 & 2 w z \geq \| g \|^2.
\end{array}
\tag{9.4.6}
$$
In most cases MOSEK will perform this reduction automatically during the presolve phase before the optimization is performed.
The next example is the problem of minimizing a sum of norms, i.e. the problem

$$
\mbox{minimize} \quad \sum_{i=1}^k \| A_i x - b_i \|,
\tag{9.4.7}
$$

where

$$
\| y \| := \sqrt{y^T y}
$$

denotes the Euclidean norm. This problem is equivalent to

$$
\begin{array}{ll}
\mbox{minimize} & \displaystyle\sum_{i=1}^k t_i \\
\mbox{subject to} & \| A_i x - b_i \| \leq t_i, \quad i = 1, \ldots, k,
\end{array}
\tag{9.4.8}
$$

which in turn is equivalent to

$$
\begin{array}{ll}
\mbox{minimize} & \displaystyle\sum_{i=1}^k t_i \\
\mbox{subject to} & z^i = A_i x - b_i, \quad i = 1, \ldots, k, \\
 & (t_i, z^i) \in \mathcal{C}_i, \quad i = 1, \ldots, k,
\end{array}
\tag{9.4.9}
$$

where all the cones $\mathcal{C}_i$ are of the quadratic type, i.e.

$$
\mathcal{C}_i = \left\{ (t_i, z^i) : t_i \geq \| z^i \| \right\}.
$$
The dual problem corresponding to (9.4.9) is

$$
\begin{array}{ll}
\mbox{maximize} & \displaystyle\sum_{i=1}^k b_i^T y^i \\
\mbox{subject to} & \displaystyle\sum_{i=1}^k A_i^T y^i = 0, \\
 & (1, y^i) \in \mathcal{C}_i^*, \quad i = 1, \ldots, k,
\end{array}
\tag{9.4.10}
$$

where

$$
\mathcal{C}_i^* = \left\{ (u_i, y^i) : u_i \geq \| y^i \| \right\}.
$$

This problem is equivalent to

$$
\begin{array}{ll}
\mbox{maximize} & \displaystyle\sum_{i=1}^k b_i^T y^i \\
\mbox{subject to} & \displaystyle\sum_{i=1}^k A_i^T y^i = 0, \\
 & \| y^i \|^2 \leq 1, \quad i = 1, \ldots, k.
\end{array}
\tag{9.4.11}
$$

Please note that the dual problem can be reduced to an "ordinary" convex quadratically constrained optimization problem in this case due to the special structure of the primal problem. In some cases it turns out that it is much better to solve the dual problem (9.4.10) rather than the primal problem (9.4.9).
Generally an arbitrary polynomial term of the form

$$
c x^g,
$$

where $g$ is an arbitrary real exponent, cannot be represented with conic quadratic constraints; however, in the following we will demonstrate some special cases where it is possible.

A particularly simple polynomial term is the reciprocal, i.e.

$$
\frac{1}{x}.
$$

Now, a constraint of the form

$$
\frac{1}{x} \leq t,
$$

where it is required that $x > 0$, is equivalent to

$$
1 \leq x t, \quad x \geq 0,
$$

which in turn is equivalent to

$$
\begin{array}{l}
z = \sqrt{2}, \\
z^2 \leq 2 x t, \quad x, t \geq 0.
\end{array}
$$

The last formulation is a (rotated quadratic) conic constraint plus a simple linear equality.
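The equivalence is easy to verify numerically. The sketch below (helper names and sample points are made up) compares the direct constraint $1/x \leq t$ with membership of $(x, t, \sqrt{2})$ in the rotated quadratic cone for a few sample points.

```python
import numpy as np

# Check of the reciprocal reformulation: for x > 0, the constraint 1/x <= t holds
# exactly when (x, t, sqrt(2)) lies in the rotated quadratic cone 2*x*t >= z^2.
def reciprocal_ok(x, t):
    return x > 0 and 1.0 / x <= t

def rotated_cone_ok(x, t, z=np.sqrt(2.0)):
    return x >= 0 and t >= 0 and 2 * x * t >= z ** 2

samples = [(0.5, 3.0), (2.0, 0.6), (2.0, 0.4), (0.5, 1.0)]
for x, t in samples:
    print((x, t), reciprocal_ok(x, t), rotated_cone_ok(x, t))
```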
For example, consider the problem

$$
\begin{array}{ll}
\mbox{minimize} & \displaystyle\sum_{j=1}^n \frac{c_j}{x_j} \\
\mbox{subject to} & a^T x \leq b, \\
 & x \geq 0,
\end{array}
$$

where it is assumed that $c \geq 0$ and $b > 0$. This problem is equivalent to

$$
\begin{array}{ll}
\mbox{minimize} & \displaystyle\sum_{j=1}^n c_j t_j \\
\mbox{subject to} & a^T x \leq b, \\
 & z_j = \sqrt{2}, \quad j = 1, \ldots, n, \\
 & z_j^2 \leq 2 x_j t_j, \quad x_j, t_j \geq 0, \quad j = 1, \ldots, n,
\end{array}
\tag{9.4.12}
$$

because

$$
z_j^2 \leq 2 x_j t_j, \quad x_j, t_j \geq 0, \quad z_j = \sqrt{2}
$$

implies that

$$
t_j \geq \frac{1}{x_j}.
$$

The problem (9.4.12) is a conic quadratic optimization problem having $n$ three-dimensional rotated quadratic cones.
The next example is the constraint

$$
t \leq \sqrt{x}, \quad x \geq 0,
$$

where both $t$ and $x$ are variables. This set is identical to the set

$$
\left\{ (t, x) : t \leq y, \ y^2 \leq 2 x z, \ z = \tfrac{1}{2} \right\},
\tag{9.4.13}
$$

where the constraint $y^2 \leq 2 x z$ is a three-dimensional rotated quadratic cone (which also enforces $x \geq 0$).
Occasionally, when modelling the market impact term in portfolio optimization, the polynomial term $x^{3/2}$, $x \geq 0$, occurs. Therefore, consider the set defined by the inequalities

$$
t \geq x^{3/2}, \quad x \geq 0.
\tag{9.4.14}
$$

We will exploit that $x^{3/2} = x^2 / \sqrt{x}$ for $x > 0$. First define the set

$$
\left\{ (t, x, s) : x^2 \leq 2 s t, \ s, t \geq 0 \right\}.
\tag{9.4.15}
$$

Now, if we can make sure that

$$
2 s \leq \sqrt{x},
$$

then we have the desired result, since this implies that

$$
t \geq \frac{x^2}{2 s} \geq \frac{x^2}{\sqrt{x}} = x^{3/2}.
$$

Please note that $s$ can be chosen freely and that $s = \sqrt{x}/2$ is a valid choice.

Let

$$
\begin{array}{l}
z^2 \leq 2 s t, \quad s, t \geq 0, \\
s^2 \leq 2 v w, \quad v, w \geq 0, \\
z = x, \quad v = x, \quad w = \tfrac{1}{8};
\end{array}
\tag{9.4.16}
$$

then

$$
t \geq \frac{z^2}{2 s} = \frac{x^2}{2 s}.
$$

Moreover,

$$
s^2 \leq 2 v w = \frac{x}{4}, \quad \mbox{i.e.} \quad 2 s \leq \sqrt{x},
$$

leading to the conclusion

$$
t \geq \frac{x^2}{2 s} \geq \frac{x^2}{\sqrt{x}} = x^{3/2}.
$$

(9.4.16) is a conic reformulation which is equivalent to (9.4.14). Please note that the $x \geq 0$ constraint does not appear explicitly in (9.4.15) and (9.4.16), but implicitly, since $x = v \geq 0$.
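The construction (9.4.16) can be checked numerically. The sketch below picks a few values of $x$, uses the assignment $s = \sqrt{x}/2$, $z = v = x$, $w = 1/8$, $t = x^{3/2}$ suggested above, and verifies that both rotated cone inequalities hold (up to a small rounding tolerance; the sample values are arbitrary).

```python
import numpy as np

# Numerical check of (9.4.16): for x >= 0 and t = x**1.5, the assignment
# s = sqrt(x)/2, z = v = x, w = 1/8 satisfies both rotated cone inequalities.
rng = np.random.default_rng(2)
for x in rng.uniform(0.0, 5.0, size=5):
    t = x ** 1.5
    s, z, v, w = np.sqrt(x) / 2, x, x, 1.0 / 8.0
    cone1 = z ** 2 <= 2 * s * t + 1e-12    # z^2 <= 2 s t
    cone2 = s ** 2 <= 2 * v * w + 1e-12    # s^2 <= 2 v w
    print(f"x = {x:.3f}: cone1 = {cone1}, cone2 = {cone2}")
```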
Finally, it should be mentioned that any polynomial term of the form $x^g$, where $g$ is a positive rational number, can be represented using conic quadratic constraints [11, pp. 12-13].
If you want to know more about what can be modelled as a conic optimization problem we recommend the references [19, 5, 11].
While a linear optimization problem either has a bounded optimal solution or is infeasible, the conic case is not as simple as that.
Consider the example

$$
\begin{array}{ll}
\mbox{minimize} & x_1 \\
\mbox{subject to} & x_3 = \sqrt{2}, \\
 & 2 x_1 x_2 \geq x_3^2, \quad x_1, x_2 \geq 0,
\end{array}
\tag{9.4.17}
$$

which, identifying $y$ with $x_2$ and minimizing over $x_1$, corresponds to the problem

$$
\begin{array}{ll}
\mbox{minimize} & \dfrac{1}{y} \\
\mbox{subject to} & y > 0.
\end{array}
\tag{9.4.18}
$$

Clearly, the optimal objective value is zero, but it is never attained, because implicitly we assume that the optimal $y$ should be finite.
Next, consider the example

$$
\begin{array}{ll}
\mbox{minimize} & x_2 + x_3 \\
\mbox{subject to} & x_1 = 0, \\
 & x_2 = 1, \\
 & 2 x_1 x_2 \geq x_3^2, \quad x_1, x_2 \geq 0,
\end{array}
\tag{9.4.19}
$$

which has the optimal solution

$$
x^* = (0, 1, 0),
$$

implying that the optimal primal objective value is 1.

Now, the dual problem corresponding to (9.4.19) can be written as

$$
\begin{array}{ll}
\mbox{maximize} & y_2 \\
\mbox{subject to} & (s_1, s_2, s_3) = (-y_1, 1 - y_2, 1), \\
 & 2 s_1 s_2 \geq s_3^2, \quad s_1, s_2 \geq 0.
\end{array}
\tag{9.4.20}
$$

Therefore,

$$
s_1 = -y_1 \geq 0
$$

and

$$
s_2 = 1 - y_2 \geq 0.
$$

This implies that

$$
2 (-y_1)(1 - y_2) \geq 1,
$$

and hence $-y_1 > 0$ and $1 - y_2 > 0$. Given this fact we can conclude that

$$
y_2 \leq 1 - \frac{1}{2(-y_1)} < 1 \qquad \mbox{and} \qquad \sup_y y_2 = 1,
$$

implying that the optimal dual objective value is 1; however, it is never attained. Hence, no bounded primal-dual optimal solution with zero duality gap exists. Of course, it is possible to find a primal and dual feasible solution such that the duality gap is close to zero, but then $-y_1$ will be very large (unless a large duality gap is allowed). This is likely to make the problem (9.4.19) hard to solve.
An inspection of problem (9.4.19) reveals the constraint $x_1 = 0$, which together with the cone constraint implies that
$x_3 = 0$. If we either add the redundant constraint

$$
x_3 = 0
$$

to the problem (9.4.19) or eliminate $x_1$ and
$x_3$ from the problem, it becomes easy to solve.
MOSEK is capable of solving smooth convex nonlinear optimization problems of the form

$$
\begin{array}{lccccl}
\mbox{minimize} & & & f(x) + c^T x & & \\
\mbox{subject to} & l^c & \leq & g(x) + A x & \leq & u^c, \\
 & l^x & \leq & x & \leq & u^x,
\end{array}
\tag{9.5.1}
$$

where

- $f : \mathbb{R}^n \rightarrow \mathbb{R}$ is a nonlinear function appearing in the objective,
- $g : \mathbb{R}^n \rightarrow \mathbb{R}^m$ is a vector of nonlinear functions appearing in the constraints, and
- $A$, $c$ and the bounds are as defined for the linear problem (9.1.1).
This means that the $i$-th constraint essentially has the form

$$
l_i^c \leq g_i(x) + \sum_{j=1}^n a_{ij} x_j \leq u_i^c.
$$
The linear term Ax is not included in g(x) since it can be handled much more efficiently as a separate entity when optimizing.
The nonlinear functions $f$ and $g$ must be smooth (twice differentiable) in all $x \in [l^x; u^x]$. Moreover, $f(x)$ must be a convex function and each
$g_i(x)$ must satisfy

$$
\begin{array}{ll}
g_i(x) \mbox{ is concave} & \mbox{if } l_i^c > -\infty \mbox{ and } u_i^c = \infty, \\
g_i(x) \mbox{ is convex} & \mbox{if } l_i^c = -\infty \mbox{ and } u_i^c < \infty, \\
g_i(x) \mbox{ is linear} & \mbox{if } -\infty < l_i^c \leq u_i^c < \infty,
\end{array}
$$

so that the feasible set defined by the constraints is convex.
So far, we have not discussed what happens when MOSEK is used to solve a primal or dual infeasible problem. In the subsequent section these issues are addressed.
Similar to the linear case, MOSEK reports dual information in the general nonlinear case. Indeed, in this case the Lagrange function is defined by

$$
\begin{array}{rl}
L(x, s_l^c, s_u^c, s_l^x, s_u^x) := & f(x) + c^T x \\
 & {} - (s_l^c)^T \left( g(x) + A x - l^c \right) - (s_u^c)^T \left( u^c - g(x) - A x \right) \\
 & {} - (s_l^x)^T \left( x - l^x \right) - (s_u^x)^T \left( u^x - x \right),
\end{array}
$$

and the dual problem is given by

$$
\mathop{\mbox{maximize}}_{s_l^c, s_u^c, s_l^x, s_u^x \geq 0} \ \ \mathop{\mbox{minimize}}_{x} \ L(x, s_l^c, s_u^c, s_l^x, s_u^x),
$$

which is equivalent to

$$
\begin{array}{ll}
\mbox{maximize} & L(x, s_l^c, s_u^c, s_l^x, s_u^x) \\
\mbox{subject to} & \nabla_x L(x, s_l^c, s_u^c, s_l^x, s_u^x) = 0, \\
 & s_l^c, s_u^c, s_l^x, s_u^x \geq 0.
\end{array}
\tag{9.5.2}
$$
Often an optimization problem can be formulated in several different ways, and the exact formulation used may have a significant impact on the solution time and the quality of the solution. In some cases the difference between a “good” and a “bad” formulation means the ability to solve the problem or not.
Below is a list of several issues that you should be aware of when developing a good formulation.
Avoid large bounds, as these can introduce all sorts of numerical problems. Assume that a variable is given the bounds

$$
0 \leq x_j \leq 10^{16}.
$$

The number $10^{16}$ is large, and it is very likely that the constraint $x_j \leq 10^{16}$ is non-binding at the optimum, so one might assume that the bound will not cause problems. Unfortunately, this is a naive assumption, because the bound $10^{16}$ may actually affect the presolve, the scaling, the computation of the dual objective value, etc. In this case the constraint

$$
0 \leq x_j
$$

is likely to be sufficient, i.e. $10^{16}$ is just a (harmful) way of representing infinity; an infinite bound should be declared as infinite rather than as a huge finite number.
On a computer all computations are performed in finite precision, which implies that

$$
\mathrm{fl}(x \ \mathrm{op} \ y) = (x \ \mathrm{op} \ y)(1 + \varepsilon), \qquad |\varepsilon| \leq \varepsilon_{\mathrm{machine}},
$$

where $\varepsilon_{\mathrm{machine}}$ is about
$10^{-16}$ for standard double precision. This means that the results of all computations are truncated, leading to the introduction of rounding errors. The upshot is that very small numbers and very large numbers should be avoided, e.g. it is recommended that all elements in $A$ are either zero or belong (in absolute value) to the interval
$[10^{-6}, 10^{6}]$. The same holds for the bounds and the linear objective.
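The following few lines of Python illustrate both points; the particular numbers are arbitrary.

```python
# A quick demonstration of the finite-precision effects discussed above.
eps = 1.0
while 1.0 + eps != 1.0:
    eps /= 2.0
print("machine epsilon is roughly", 2 * eps)   # about 2.2e-16

# Mixing very large and very small coefficients loses the small ones entirely:
print(1.0e16 + 1.0 - 1.0e16)                   # 0.0, the 1.0 is lost
```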
Consider the linear optimization problem

$$
\begin{array}{ll}
\mbox{minimize} & x_1 \\
\mbox{subject to} & x_1 \geq 1, \\
 & x_1 \leq 1 + \varepsilon.
\end{array}
\tag{9.6.1}
$$

Clearly, the problem is feasible for $\varepsilon \geq 0$. However, for
$\varepsilon < 0$ the problem is infeasible, no matter how small $|\varepsilon|$ is. This implies that an insignificant change in the right-hand side of the constraints makes the problem status switch from feasible to infeasible. Such a model should be avoided.
Assume we have a constraint of the form

$$
| a^T x | \leq b,
\tag{9.7.1}
$$

where $x$ is a vector of variables, and
$a$ and
$b$ are constants.
It is easy to verify that the constraint (9.7.1) is equivalent to

$$
-b \leq a^T x \leq b,
\tag{9.7.2}
$$

which is a set of ordinary linear inequality constraints.
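As a small sketch of the reformulation (9.7.2) in practice, the code below imposes $|a^T x| \leq b$ as two linear inequalities and solves a tiny LP with scipy; the objective, data and bounds are made up for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Enforce |a^T x| <= b via the two inequalities in (9.7.2).
a = np.array([1.0, -2.0, 3.0])
b = 4.0
c = np.array([-1.0, -1.0, -1.0])     # linprog minimizes, so this maximizes x1 + x2 + x3

A_ub = np.vstack([a, -a])            # a^T x <= b  and  -a^T x <= b
b_ub = np.array([b, b])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1))
print("optimal x:", res.x)
print("a^T x =", a @ res.x, "which satisfies |a^T x| <=", b)
```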
Please note that an equality involving an absolute value, such as

$$
| a^T x | = b,
$$

cannot be formulated as a linear or even a convex optimization problem. It requires integer optimization.
In this section we will show how to model several versions of the Markowitz portfolio model using conic optimization.
The Markowitz portfolio model deals with the problem of selecting a portfolio of assets (e.g. stocks, bonds, etc.). The goal is to find a portfolio such that, for a given expected return, the risk is minimized. The returns of the assets are modelled as random variables with known expected values and covariances, and the portfolio is selected for a single period of time.
The variable $x_j$ denotes the amount of asset $j$ traded in the given period of time: $x_j > 0$ means that an amount of asset $j$ is purchased, and $x_j < 0$ means that an amount is sold.
The model deals with two central quantities. Let $r_j$ denote the (random) return of asset $j$, let $\mu := \mathbf{E}[r]$, and let $\Sigma$ denote the covariance matrix of $r$. Then:

Expected return:

$$
\mathbf{E}\left[ \sum_j r_j x_j \right] = \mu^T x.
$$

Variance (risk):

$$
\mathbf{Var}\left[ \sum_j r_j x_j \right] = x^T \Sigma x.
$$
By definition $\Sigma$ is positive semi-definite, and therefore

$$
x^T \Sigma x = \| L^T x \|^2,
$$

where $L$ is any matrix such that

$$
\Sigma = L L^T.
$$

A low rank of $\Sigma$, and hence an $L$ with few columns, is advantageous from a computational point of view. A valid $L$ can always be computed as the Cholesky factorization of
$\Sigma$.
In our first model we want to minimize the variance while selecting a portfolio with a specified expected target return $t$. Additionally, the portfolio must satisfy the budget (self-financing) constraint asserting that the total amount of assets sold must equal the total amount of assets purchased. This is expressed in the model

$$
\begin{array}{ll}
\mbox{minimize} & \mathbf{Var}\left[ \sum_j r_j x_j \right] \\
\mbox{subject to} & \mathbf{E}\left[ \sum_j r_j x_j \right] = t, \\
 & e^T x = 0,
\end{array}
\tag{9.7.3}
$$

where $e = (1, \ldots, 1)^T$. Using the definitions above this may be formulated as a quadratic optimization problem:

$$
\begin{array}{ll}
\mbox{minimize} & x^T \Sigma x \\
\mbox{subject to} & \mu^T x = t, \\
 & e^T x = 0.
\end{array}
\tag{9.7.4}
$$
An equivalent conic quadratic reformulation is given by:

$$
\begin{array}{ll}
\mbox{minimize} & f \\
\mbox{subject to} & \mu^T x = t, \\
 & e^T x = 0, \\
 & g = L^T x, \\
 & f \geq \| g \|.
\end{array}
\tag{9.7.5}
$$

Here we minimize the standard deviation instead of the variance. Please note that $L$ can be replaced by any matrix $H$ satisfying
$\Sigma = H H^T$. A low-rank $H$ is computationally advantageous.
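As a small numerical sketch of the basic model, the code below solves the equality-constrained quadratic problem (9.7.4) by assembling its KKT system with numpy. All data (expected returns, covariance matrix, target return) is made up for illustration; it is not a substitute for solving (9.7.5) with MOSEK.

```python
import numpy as np

# Solve (9.7.4) for a tiny, made-up instance via the KKT system of the QP.
mu = np.array([0.05, 0.10, 0.07])                 # expected returns
Sigma = np.array([[0.10, 0.02, 0.01],             # covariance matrix (PSD)
                  [0.02, 0.08, 0.03],
                  [0.01, 0.03, 0.09]])
t = 0.02                                          # target expected return
e = np.ones(3)

# KKT conditions: 2*Sigma x + A^T lambda = 0,  A x = (t, 0) with A = [mu; e].
A = np.vstack([mu, e])
KKT = np.block([[2 * Sigma, A.T],
                [A, np.zeros((2, 2))]])
rhs = np.concatenate([np.zeros(3), [t, 0.0]])

sol = np.linalg.solve(KKT, rhs)
x = sol[:3]
print("portfolio x:", x)
print("expected return:", mu @ x, " variance:", x @ Sigma @ x)
```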
We will now expand our model to include transaction costs, expressed as a fraction of the traded volume. [15, pp. 445-475] argues that the transaction cost for trading an amount $x_j$ of asset $j$ can be modelled as a sum of terms,

$$
T_j(x_j) = \mbox{(commission)}_j + \mbox{(bid-ask spread cost)}_j + \mbox{(market impact cost)}_j,
\tag{9.7.6}
$$

and that these are important to incorporate into the model.
In the following we deal with the last of these terms, denoted the market impact term. If you sell (buy) a large amount of an asset, the price is likely to go down (up). This can be captured in the market impact term

$$
m_j \sqrt{\frac{|x_j|}{v_j}},
$$

where $v_j$ is the daily volume of asset $j$. The constant $m_j$ and the "daily volume" $v_j$ have to be estimated in some way, i.e. the quantity

$$
\hat m_j := \frac{m_j}{\sqrt{v_j}}
$$

has to be estimated. The market impact term gives the cost as a function of the traded fraction of the daily volume ($|x_j|/v_j$). Therefore, the total cost when trading an amount
$x_j$ of asset $j$ is given by

$$
|x_j| \cdot m_j \sqrt{\frac{|x_j|}{v_j}} = \hat m_j |x_j|^{3/2}.
$$
This leads us to the model:

$$
\begin{array}{ll}
\mbox{minimize} & f \\
\mbox{subject to} & \mu^T x = t, \\
 & e^T x + \displaystyle\sum_j \hat m_j |x_j|^{3/2} = 0, \\
 & g = L^T x, \\
 & f \geq \| g \|.
\end{array}
\tag{9.7.7}
$$
Now, introducing new variables $z_j$ and $c_j$ via the variable transformation

$$
z_j \geq x_j, \quad z_j \geq -x_j, \quad c_j \geq z_j^{3/2},
$$

so that $z_j \geq |x_j|$ and $c_j$ bounds the market impact cost, we obtain

$$
\begin{array}{ll}
\mbox{minimize} & f \\
\mbox{subject to} & \mu^T x = t, \\
 & e^T x + \hat m^T c = 0, \\
 & z_j \geq x_j, \ z_j \geq -x_j, \quad j = 1, \ldots, n, \\
 & c_j \geq z_j^{3/2}, \quad j = 1, \ldots, n, \\
 & g = L^T x, \\
 & f \geq \| g \|.
\end{array}
\tag{9.7.8}
$$
As shown in Section 9.4.4.3, the set

$$
\left\{ (c_j, z_j) : c_j \geq z_j^{3/2}, \ z_j \geq 0 \right\}
$$

can be modelled by the conic constraints

$$
\begin{array}{l}
q_j^2 \leq 2 s_j c_j, \quad s_j, c_j \geq 0, \\
s_j^2 \leq 2 r_j w_j, \quad r_j, w_j \geq 0, \\
q_j = z_j, \quad r_j = z_j, \quad w_j = \tfrac{1}{8}.
\end{array}
\tag{9.7.9}
$$
For further reading we refer the reader to [13] in particular; [3] and [15] also contain relevant material.