8. Case studies


8.1. Robust linear optimization

In most linear optimization examples discussed in this manual it is implicitly assumed that the problem data, such as c and A, are known with certainty. However, in practice this is seldom the case. For example, the data may be roughly estimated, affected by measurement errors, or affected by random events.

In this section a robust linear optimization methodology is presented which removes the assumption that the problem data is known exactly. Rather, it is assumed that the data belongs to some set, e.g., a box or an ellipsoid.

The computations are performed using the MOSEK optimization toolbox for MATLAB but could equally well have been implemented using the MOSEK API.

This section is co-authored with A. Ben-Tal and A. Nemirovski.

8.1.1. Introductory example

Consider the following toy linear optimization problem:

A company produces two kinds of drugs, DrugI and DrugII, containing a specific active agent A, which is extracted from raw materials that must be purchased on the market. The drug production data are as follows:

Drug     Selling price,     Content of agent A,    Production expenses per 1000 packs
         $ per 1000 packs   g per 1000 packs       manpower,   equipment,   operational
                                                   hours       hours        costs, $
DrugI    6,200              0.500                  90.0        40.0         700
DrugII   6,900              0.600                  100.0       50.0         800

There are two kinds of raw materials, RawI and RawII, which can be used as sources of the active agent. The related data are as follows:

Raw material   Purchasing price,   Content of agent A,
               $ per kg            g per kg
RawI           100.00              0.01
RawII          199.90              0.02

Finally, the per month resources dedicated to producing the drugs are as follows:

Budget, $   Manpower, hours   Equipment, hours   Capacity of raw materials storage, kg
100,000     2,000             800                1,000

The problem is to find the production plan which maximizes the profit of the company.

The problem can be immediately posed as the following linear program:

\begin{displaymath}\nonumber{}\begin{array}{rcll}\nonumber{}\hbox{(Drug)}: &  &  & \\\nonumber{}\hbox{maximize} &  &  & \\\nonumber{}-\overbrace{\left[100\cdot \hbox{RawI}+199.90\cdot \hbox{RawII}+700\cdot \hbox{DrugI}+800\cdot \hbox{DrugII}\right]}^{\hbox{purchasing and operational costs}} &  &  & \\\nonumber{}+\underbrace{\left[6200\cdot \hbox{DrugI}+6900\cdot \hbox{DrugII}\right]}_{\hbox{ income from selling the drugs}} &  &  & \hbox{[total profit]}\\\nonumber{}\hbox{subject to} &  &  & \\\nonumber{}0.01\cdot \hbox{RawI}+0.02\cdot \hbox{RawII}-0.500\cdot \hbox{DrugI}-0.600\cdot \hbox{DrugII} & \geq{} & 0 & \hbox{[balance of active agent]}\\\nonumber{}\hbox{RawI}+\hbox{RawII} & \leq{} & 1000 & \hbox{[storage restriction]}\\\nonumber{}90.0\cdot \hbox{DrugI}+100.0\cdot \hbox{DrugII} & \leq{} & 2000 & \hbox{[manpower restriction]}\\\nonumber{}40.0\cdot \hbox{DrugI}+50.0\cdot \hbox{DrugII} & \leq{} & 800 & \hbox{[equipment restriction]}\\\nonumber{}100.0\cdot \hbox{RawI}+199.90\cdot \hbox{RawII}+700\cdot \hbox{DrugI}+800\cdot \hbox{DrugII} & \leq{} & 100000 & \hbox{[budget restriction]}\\\nonumber{}\hbox{RawI},\hbox{RawII},\hbox{DrugI},\hbox{DrugII} & \geq{} & 0 &\end{array}\end{displaymath}

where the variables are the amounts RawI, RawII (in kg) of raw materials to be purchased and the amounts DrugI, DrugII (in 1000 of packs) of drugs to be produced.

Here is the MATLAB script which specifies the problem and solves it using the MOSEK optimization toolbox:

% rlo1.m

clear prob;

prob.c   = [-100;-199.9;6200-700;6900-800];
prob.a   = sparse([0.01,0.02,-0.500,-0.600;1,1,0,0;
                   0,0,90.0,100.0;0,0,40.0,50.0;100.0,199.9,700,800]);
prob.blc = [0;-inf;-inf;-inf;-inf];
prob.buc = [inf;1000;2000;800;100000];
prob.blx = [0;0;0;0];
prob.bux = [inf;inf;inf;inf];
[r,res]  = mosekopt('maximize',prob);
xx       = res.sol.itr.xx;
RawI     = xx(1);
RawII    = xx(2);
DrugI    = xx(3);
DrugII   = xx(4);

disp(sprintf('*** Optimal value: %8.3f',prob.c'*xx));
disp('*** Optimal solution:');
disp(sprintf('RawI:   %8.3f',RawI));
disp(sprintf('RawII:  %8.3f',RawII));
disp(sprintf('DrugI:  %8.3f',DrugI));
disp(sprintf('DrugII: %8.3f',DrugII));

When executing this script, the following is displayed:

*** Optimal value: 8819.658
*** Optimal solution:
RawI:     0.000
RawII:  438.789
DrugI:   17.552
DrugII:   0.000

We see that the optimal solution promises the company a modest but quite respectable profit of 8.8% of the budget. Note that at the optimal solution, as could be guessed in advance, the balance constraint is active: the production process utilizes the full amount of active agent contained in the raw materials.

8.1.2. Data uncertainty and its consequences.

Now note that not all data of the problem can be regarded as “absolutely reliable”; e.g., one can hardly believe that the contents of the active agent in the raw materials are exactly the “nominal data” 0.01 g/kg for RawI and 0.02 g/kg for RawII. In reality, these contents definitely vary around the indicated values. A natural assumption here is that the actual contents of active agent $a_{\mathrm{I}}$ in RawI and $a_{\mathrm{II}}$ in RawII are realizations of random variables somehow distributed around the “nominal contents” $a_{\mathrm{I}}^{{n}}=0.01$ and $a_{\mathrm{II}}^{{n}}=0.02$. To be more specific, assume that $a_{\mathrm{I}}$ drifts in the 0.5% margin of $a_{\mathrm{I}}^{{n}}$, specifically, takes with probabilities 0.5 the values $a_{\mathrm{I}}^{{n}}(1\pm 0.005)$. Similarly, assume that $a_{\mathrm{II}}$ drifts in the 2% margin of $a_{\mathrm{II}}^{{n}}$, taking with probabilities 0.5 the values $a_{\mathrm{II}}^{{n}}(1\pm 0.02)$. How do the perturbations of the contents of the active agent affect the production process?

The optimal solution prescribes to purchase 438.8 kg of RawII and to produce 17,552 packs of DrugI. With the above random fluctuations in the content of the active agent in RawII, this production plan, with probability 0.5, will be infeasible: with this probability, the actual content of active agent in the raw materials will be less than the amount required to produce the planned quantity of DrugI. For the sake of simplicity, assume that this difficulty is resolved in the simplest way: when the actual content of active agent in the raw materials is insufficient, the output of the drug is reduced accordingly. With this policy, the actual production of DrugI becomes a random variable which takes, with probabilities 0.5, the nominal value of 17,552 packs or the 2% smaller value of 17,201 packs. These 2% fluctuations in production affect the profit as well; the latter becomes a random variable taking, with probabilities 0.5, the nominal value 8,820 or the 21% (!) smaller value 6,929. The expected profit is 7,843, which is 11% less than the nominal profit 8,820 promised by the optimal solution of the problem.
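
This reasoning is easy to check numerically. Here is a small sketch (not part of the original example set; it assumes rlo1.m has just been run, so that RawII and DrugI still hold the nominal plan) which recomputes the profit under the two realizations of the content and the adjustment policy described above:

% rlo1b.m (a hypothetical companion to rlo1.m; run rlo1.m first)
aII    = 0.02*[1.02;0.98];        % equally likely contents of agent A in RawII, g/kg
agent  = aII*RawII;               % grams of active agent actually available
prod   = min(DrugI,agent/0.500);  % DrugI output adjusted to the available agent
profit = (6200-700)*prod-199.9*RawII;
disp(sprintf('Profit: high %8.1f, low %8.1f, expected %8.1f',...
             profit(1),profit(2),mean(profit)));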

We see that in our toy example even pretty small (and in reality unavoidable) perturbations of the data may make the optimal solution infeasible, and that a straightforward adjustment of the solution to the actual data may heavily affect the solution quality.

It turns out that the outlined phenomenon occurs in many linear programs of practical origin. Usually, at least part of the data in such programs is not known exactly and can vary around its nominal values, and these data perturbations can make the nominal optimal solution – the one corresponding to the nominal data – infeasible. The consequences of data uncertainty can be much more severe than in our toy example. The analysis of linear optimization problems from the NETLIB collection reported in [4] demonstrates that for 13 of 94 NETLIB problems, already 0.01%-perturbations of “clearly uncertain” data can make the nominal optimal solution severely infeasible: with these perturbations, the solution, with a non-negligible probability, violates some of the constraints by 50% or more. It should be added that in the general case, in contrast to the toy example we have considered, there is no evident way to adjust the optimal solution by a small modification to the actual values of the data, and there are cases when such an adjustment is impossible: in order to become feasible for the perturbed data, the nominal optimal solution must be “completely reshaped”.

8.1.3. Robust linear optimization methodology

A natural approach to handling data uncertainty in optimization is offered by the robust optimization methodology which, as applied to linear optimization, works as follows.

8.1.3.1. Uncertain linear programs and their robust counterparts.

Consider a linear optimization problem

\begin{math}\nonumber{}\begin{array}{lccccc}\nonumber{}\mbox{minimize} &  &  & c^{T}x &  & \\\nonumber{}\mbox{subject to} & l_{c} & \leq{} & Ax & \leq{} & u_{c},\\\nonumber{} & l_{x} & \leq{} & x & \leq{} & u_{x},\end{array}\end{math} (8.1.1)

with the data $(c,A,l_{c},u_{c},l_{x},u_{x})$, and assume that this data is not known exactly; all we know is that the data varies in a given uncertainty set $\mathcal{U}$. The simplest example is that of interval uncertainty, where every data entry can run through a given interval:

\begin{math}\nonumber{}\begin{array}{rcl}\nonumber{}\mathcal{U} & = & \bigg\lbrace{}(c,A,l_{c},u_{c},l_{x},u_{x}):\\\nonumber{} &  & (c^{{n}}-dc,A^{{n}}-dA,l_{c}^{{n}}-dl_{c},u_{c}^{{n}}-du_{c},l_{x}^{{n}}-dl_{x},u_{x}^{{n}}-du_{x})\leq{}(c,A,l_{c},u_{c},l_{x},u_{x})\\\nonumber{} &  & \leq{}(c^{{n}}+dc,A^{{n}}+dA,l_{c}^{{n}}+dl_{c},u_{c}^{{n}}+du_{c},l_{x}^{{n}}+dl_{x},u_{x}^{{n}}+du_{x})\bigg\rbrace{}.\end{array}\end{math} (8.1.2)

Here $(c^{{n}},A^{{n}},l_{c}^{{n}},u_{c}^{{n}},l_{x}^{{n}},u_{x}^{{n}})$ is the nominal data, $(dc,dA,dl_{c},du_{c},dl_{x},du_{x})$ are the data perturbation bounds, and the inequalities are understood entry-wise. Note that some of the entries in the data perturbation bounds can be zero, which means that the corresponding data entries are known exactly – they are certain.

  • The family of instances (8.1.1) with data running through a given uncertainty set $\mathcal{U}$ is called an uncertain linear optimization problem.
  • A vector x is called a robust feasible solution of an uncertain linear optimization problem if it remains feasible for all realizations of the data from the uncertainty set, i.e., if

    \begin{math}\nonumber{}l_{c}\leq{}Ax\leq{}u_{c},\,l_{x}\leq{}x\leq{}u_{x}\hbox{ for all }(c,A,l_{c},u_{c},l_{x},u_{x})\in{}\mathcal{U}.\end{math} (8.1.3)
  • We say that the robust value of the objective at x does not exceed a real t if $c^{T}x\leq t$ for all realizations of the objective from the uncertainty set.
  • The Robust Optimization methodology proposes to associate with an uncertain linear program its Robust Counterpart (RC), which is the problem of minimizing the robust value of the objective over the set of all robust feasible solutions, i.e., the problem

    \begin{math}\nonumber{}\min \limits _{{t,x}}\left\lbrace{}t:c^{T}x\leq{}t,\,l_{c}\leq{}Ax\leq{}u_{c},\,l_{x}\leq{}x\leq{}u_{x}\hbox{ for all }(c,A,l_{c},u_{c},l_{x},u_{x})\in{}\mathcal{U}\right\rbrace{}.\end{math} (8.1.4)

    The optimal solution of (8.1.4) is treated as the “uncertainty-immunized” solution of the original uncertain linear program.

8.1.3.2. Robust counterpart of an uncertain linear optimization problem with interval uncertainty.

In general, the RC (8.1.4) of an uncertain linear optimization problem is not itself a linear optimization problem, since (8.1.4) has infinitely many linear constraints. There are, however, cases when (8.1.4) can be rewritten equivalently as a linear program; in particular, this is the case for the interval uncertainty (8.1.2). Specifically, in the case of (8.1.2), the Robust Counterpart of an uncertain linear program is equivalent to the following linear program in the variables x, y, t:

\begin{math}\nonumber{}\begin{array}{lcccccl}\nonumber{}\mbox{minimize} &  &  & t &  &  & \\\nonumber{}\mbox{subject to} &  &  & (c^{{n}})^{T}x+(dc)^{T}y-t & \leq{} & 0, & (a)\\\nonumber{} & l_{c}^{{n}}+dl_{c} & \leq{} & (A^{{n}})x-(dA)y, &  &  & (b)\\\nonumber{} &  &  & (A^{{n}})x+(dA)y & \leq{} & u_{c}^{{n}}-du_{c}, & (c)\\\nonumber{} & 0 & \leq{} & x+y, &  &  & (d)\\\nonumber{} & 0 & \leq{} & -x+y, &  &  & (e)\\\nonumber{} & l_{x}^{{n}}+dl_{x} & \leq{} & x & \leq{} & u_{x}^{{n}}-du_{x}, & (f)\end{array}\end{math} (8.1.5)

The origin of (8.1.5) is quite transparent: the constraints (8.1.5.d-e) linking x and y merely say that $y_{j}\geq |x_{j}|$ for all j. With this in mind, it is evident that at every feasible solution to (8.1.5) the entries of the vector $(A^{{n}})x-(dA)y$ are lower bounds on the entries of Ax with A from the uncertainty set (8.1.2), so that (8.1.5.b) ensures that $l_{c}\leq Ax$ for all data from the uncertainty set. Similarly, (8.1.5.c) and (8.1.5.a), (8.1.5.f) ensure, for all data from the uncertainty set, that $Ax\leq u_{c}$, $c^{T}x\leq t$, and that the entries of x satisfy the required lower and upper bounds, respectively.

Note that at the optimal solution to (8.1.5), one clearly has $y_{j}=|x_{j}|$. It follows that when the bounds on the entries of x impose nonnegativity (nonpositivity) of an entry $x_{j}$, then there is no need to introduce the corresponding additional variable $y_{j}$ – it can be replaced from the very beginning with $x_{j}$, if $x_{j}$ is nonnegative, or with $-x_{j}$, if $x_{j}$ is nonpositive.
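
To make this concrete, here is a minimal MATLAB sketch (our own illustration, not part of the toolbox examples) of how (8.1.5) could be assembled and solved with mosekopt; the function name rc_interval and the variable ordering [x; y; t] are arbitrary choices:

% rc_interval.m: a sketch building the robust counterpart (8.1.5) of
%   minimize c'*x subject to lc <= A*x <= uc, lx <= x <= ux
% under interval uncertainty with perturbation bounds dc,dA,dlc,duc,dlx,dux.
% Variable ordering: [x; y; t].
function [x,t] = rc_interval(c,dc,A,dA,lc,dlc,uc,duc,lx,dlx,ux,dux)
[m,n]    = size(A);
prob.c   = [zeros(2*n,1);1];                          % minimize t
prob.a   = sparse([c',dc',-1;                         % (a) c'x + dc'y - t <= 0
                   A,-dA,zeros(m,1);                  % (b) lc+dlc <= A*x - dA*y
                   A,dA,zeros(m,1);                   % (c) A*x + dA*y <= uc-duc
                   speye(n),speye(n),sparse(n,1);     % (d) 0 <= x + y
                   -speye(n),speye(n),sparse(n,1)]);  % (e) 0 <= -x + y
prob.blc = [-inf;lc+dlc;-inf*ones(m,1);zeros(2*n,1)];
prob.buc = [0;inf*ones(m,1);uc-duc;inf*ones(2*n,1)];
prob.blx = [lx+dlx;zeros(n,1);-inf];                  % (f); y >= 0; t free
prob.bux = [ux-dux;inf*ones(n,1);inf];
[r,res]  = mosekopt('minimize',prob);
x        = res.sol.itr.xx(1:n);
t        = res.sol.itr.xx(2*n+1);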

Another possible formulation of problem (8.1.5) is as follows. Let

\begin{displaymath}\nonumber{}l_{c}^{{n}}+dl_{c}=(A^{{n}})x-(dA)y-f,\quad{}f\geq{}0\end{displaymath}

then this equation is equivalent to (8.1.5.b). If an entry of $l_{c}^{{n}}$ is $-\infty$, then the corresponding equation should simply be dropped from the computations. Similarly,

\begin{displaymath}\nonumber{}-x+y=g\geq{}0\end{displaymath}

is equivalent to (8.1.5.e). This implies

\begin{displaymath}\nonumber{}l_{c}^{{n}}+dl_{c}-(A^{{n}})x+f=-(dA)y\end{displaymath}

and

\begin{displaymath}\nonumber{}y=g+x\end{displaymath}

Substituting these values into (8.1.5) gives

\begin{math}\nonumber{}\begin{array}{lcccccl}\nonumber{}\mbox{minimize} &  &  & t &  &  & \\\nonumber{}\mbox{subject to} &  &  & (c^{{n}})^{T}x+(dc)^{T}(g+x)-t & \leq{} & 0, & \\\nonumber{} & 0 & \leq{} & f, &  &  & \\\nonumber{} &  &  & 2(A^{{n}})x+(dA)(g+x)+f+l_{c}^{{n}}+dl_{c} & \leq{} & u_{c}^{{n}}-du_{c}, & \\\nonumber{} & 0 & \leq{} & g, &  &  & \\\nonumber{} & 0 & \leq{} & 2x+g, &  &  & \\\nonumber{} & l_{x}^{{n}}+dl_{x} & \leq{} & x & \leq{} & u_{x}^{{n}}-du_{x}, &\end{array}\end{math} (8.1.6)

which after some simplifications leads to

\begin{math}\nonumber{}\begin{array}{lcccccl}\nonumber{}\mbox{minimize} &  &  & t &  &  & \\\nonumber{}\mbox{subject to} &  &  & (c^{{n}}+dc)^{T}x+(dc)^{T}g-t & \leq{} & 0, & (a)\\\nonumber{} & 0 & \leq{} & f, &  &  & (b)\\\nonumber{} &  &  & 2(A^{{n}}+dA)x+(dA)g+f-(l_{c}^{{n}}+dl_{c}) & \leq{} & u_{c}^{{n}}-du_{c}, & (c)\\\nonumber{} & 0 & \leq{} & g, &  &  & (d)\\\nonumber{} & 0 & \leq{} & 2x+g, &  &  & (e)\\\nonumber{} & l_{x}^{{n}}+dl_{x} & \leq{} & x & \leq{} & u_{x}^{{n}}-du_{x}, & (f)\end{array}\end{math} (8.1.7)

and

\begin{math}\nonumber{}\begin{array}{lcccccl}\nonumber{}\mbox{minimize} &  &  & t &  &  & \\\nonumber{}\mbox{subject to} &  &  & (c^{{n}}+dc)^{T}x+(dc)^{T}g-t & \leq{} & 0, & (a)\\\nonumber{} &  &  & 2(A^{{n}}+dA)x+(dA)g+f & \leq{} & u_{c}^{{n}}-du_{c}+l_{c}^{{n}}+dl_{c}, & (b)\\\nonumber{} & 0 & \leq{} & 2x+g, &  &  & (c)\\\nonumber{} & 0 & \leq{} & f, &  &  & (d)\\\nonumber{} & 0 & \leq{} & g, &  &  & (e)\\\nonumber{} & l_{x}^{{n}}+dl_{x} & \leq{} & x & \leq{} & u_{x}^{{n}}-du_{x}. & (f)\end{array}\end{math} (8.1.8)

Observe that this problem has more variables but far fewer constraints than (8.1.5). Therefore, (8.1.8) is likely to solve faster than (8.1.5). Also note that (8.1.8.b) is trivially redundant if $u_{c}^{{n}}=\infty$.

Introductory example (continued).

Let us apply the robust optimization methodology to our drug production example presented in Section 8.1.1. Assume that the only uncertain data in (Drug) are the contents of the active agent in the raw materials, and that these contents vary in 0.5%- and 2%-neighborhoods of the respective nominal values 0.01 and 0.02. With this assumption, (Drug) becomes an uncertain LP affected by interval uncertainty; the Robust Counterpart (8.1.5) of this uncertain LP is the linear program

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}\mbox{(Drug\_RC)}: &  & \\\nonumber{}\mbox{maximize} &  & \\\nonumber{}t &  & \\\nonumber{}\mbox{subject to} &  & \\\nonumber{}t\leq{}-100\cdot \mbox{\texttt{RawI}}-199.9\cdot \mbox{\texttt{RawII}}+5500\cdot \mbox{\texttt{DrugI}}+6100\cdot \mbox{\texttt{DrugII}} &  & \\\nonumber{}0.01\cdot 0.995\cdot \mbox{\texttt{RawI}}+0.02\cdot 0.98\cdot \mbox{\texttt{RawII}}-0.500\cdot \mbox{\texttt{DrugI}}-0.600\cdot \mbox{\texttt{DrugII}} & \geq{} & 0\\\nonumber{}\mbox{\texttt{RawI}}+\mbox{\texttt{RawII}} & \leq{} & 1000\\\nonumber{}90.0\cdot \mbox{\texttt{DrugI}}+100.0\cdot \mbox{\texttt{DrugII}} & \leq{} & 2000\\\nonumber{}40.0\cdot \mbox{\texttt{DrugI}}+50.0\cdot \mbox{\texttt{DrugII}} & \leq{} & 800\\\nonumber{}100.0\cdot \mbox{\texttt{RawI}}+199.90\cdot \mbox{\texttt{RawII}}+700\cdot \mbox{\texttt{DrugI}}+800\cdot \mbox{\texttt{DrugII}} & \leq{} & 100000\\\nonumber{}\mbox{\texttt{RawI}},\mbox{\texttt{RawII}},\mbox{\texttt{DrugI}},\mbox{\texttt{DrugII}} & \geq{} & 0\end{array}\end{displaymath}
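
The original manual does not list a script for (Drug_RC), but it can be set up along the lines of rlo1.m; here is a hedged sketch (the variable ordering [t; RawI; RawII; DrugI; DrugII] is our own choice):

% rlo_drugrc.m: a sketch of (Drug_RC) for mosekopt
clear prob;
prob.c   = [1;0;0;0;0];                                   % maximize t
prob.a   = sparse([-1,-100,-199.9,5500,6100;              % t <= total profit
                    0,0.01*0.995,0.02*0.98,-0.500,-0.600; % worst-case balance
                    0,1,1,0,0;                            % storage
                    0,0,0,90.0,100.0;                     % manpower
                    0,0,0,40.0,50.0;                      % equipment
                    0,100.0,199.9,700,800]);              % budget
prob.blc = [0;0;-inf;-inf;-inf;-inf];
prob.buc = [inf;inf;1000;2000;800;100000];
prob.blx = [-inf;0;0;0;0];
prob.bux = inf*ones(5,1);
[r,res]  = mosekopt('maximize',prob);
disp(sprintf('*** Optimal value: %8.3f',res.sol.itr.xx(1)));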

Solving this problem with MOSEK we get the following output:

*** Optimal value: 8294.567
*** Optimal solution:
RawI:    877.732
RawII:     0.000
DrugI:    17.467
DrugII:    0.000

We see that the robust optimal solution we have built “costs money” – it promises a profit of just $ 8,295 (cf. the profit of $ 8,820 promised by the nominal optimal solution). Note, however, that the robust optimal solution remains feasible whatever the realization of the uncertain data from the uncertainty set, while the nominal optimal solution requires adjustment to the actual data and, with this adjustment, results in an average profit of $ 7,843, which is 5.4% less than the profit of $ 8,295 guaranteed by the robust optimal solution. Note also that the robust optimal solution is significantly different from the nominal one: both solutions prescribe to produce the same drug, DrugI (in the amounts 17,467 and 17,552 packs, respectively), but from different raw materials: RawI in the case of the robust solution and RawII in the case of the nominal one. The reason is that although the price per unit of the active agent is slightly lower for RawII than for RawI, the content of the agent in RawI is more stable, so that when possible fluctuations of the contents are taken into account, RawI turns out to be more profitable than RawII.

8.1.4. Random uncertainty and Ellipsoidal Robust Counterpart

In some cases it is natural to assume that the perturbations affecting different uncertain data entries are random and independent of each other. In these cases, the Robust Counterpart based on the interval model of uncertainty seems “too conservative”: why should we expect that all the data will be simultaneously driven to their “most unfavorable” values, and why should we immunize the solution against this highly unlikely situation? A less conservative approach is offered by the ellipsoidal model of uncertainty. To motivate this model, let us look at what happens to a particular linear constraint

\begin{math}\nonumber{}a^{T}x\leq{}b\end{math} (8.1.9)

at a given candidate solution x when the vector a of coefficients of the constraint is affected by random perturbations:

\begin{math}\nonumber{}a=a^{{n}}+\zeta ,\end{math} (8.1.10)

where $a^{{n}}$ is the vector of nominal coefficients and $\zeta$ is a random perturbation vector with zero mean and covariance matrix $V_{a}$. In this case, the value of the left-hand side of (8.1.9), evaluated at a given x, becomes a random variable with expected value $(a^{{n}})^{T}x$ and standard deviation $\sqrt{x^{T}V_{a}x}$. Now let us act as an engineer who believes that the value of a random variable never exceeds its mean plus 3 times its standard deviation; we do not intend to be that specific, and replace the “3” in this rule by a safety parameter $\Omega$ which will be in our control. Believing that the value of a random variable “never” exceeds its mean plus $\Omega$ times its standard deviation, we conclude that a “safe” version of (8.1.9) is the inequality

\begin{math}\nonumber{}(a^{{n}})^{T}x+\Omega \sqrt{x^{T}V_{a}x}\leq{}b.\end{math} (8.1.11)

The word “safe” above admits a quantitative interpretation: if x satisfies (8.1.11), then one can bound from above the probability that the random perturbations (8.1.10) result in violation of the constraint (8.1.9) at x. The bound in question depends on what we know about the distribution of $\zeta$. For example,

  1. We always have the bound given by the Tschebyshev inequality:

    \begin{math}\nonumber{}x\hbox{ satisfies (8.1.11) }\Rightarrow \mbox{Prob}\left\lbrace{}a^{T}x>b\right\rbrace{}\leq{}\frac{1}{\Omega ^{2}}.\end{math} (8.1.12)
  2. When $\zeta$ is Gaussian, the Tschebyshev bound can be improved to

    \begin{math}\nonumber{}x\hbox{ satisfies (8.1.11) }\Rightarrow \mbox{Prob}\left\lbrace{}a^{T}x>b\right\rbrace{}\leq{}\frac{1}{\sqrt{2\pi }}\int \limits _{\Omega }^{\infty }\exp{}\lbrace{}-t^{2}/2\rbrace{}dt\leq{}0.5\exp{}\lbrace{}-\Omega ^{2}/2\rbrace{}.\end{math} (8.1.13)
  3. Assume that $\zeta =Q\xi$, where $Q$ is a certain $n\times N$ matrix and $\xi =(\xi _{1},...,\xi _{N})^{T}$ is a random vector with independent coordinates $\xi _{i}$ symmetrically distributed in the segment [-1,1]. Setting $V_{a}=QQ^{T}$ ($V_{a}$ is then a natural “upper bound” on the covariance matrix of $\zeta$), one has

    \begin{math}\nonumber{}x\hbox{ satisfies (8.1.11) }\Rightarrow \mbox{Prob}\left\lbrace{}a^{T}x>b\right\rbrace{}\leq{}0.5\exp{}\lbrace{}-\Omega ^{2}/2\rbrace{}.\end{math} (8.1.14)

Note that in order to ensure that the bounds in (8.1.13) and (8.1.14) are at most a given $\epsilon$, it suffices to set $\Omega =\sqrt{2\ln (1/(2\epsilon ))}$.
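
For example, the safety parameter delivering a prescribed bound $\epsilon$ can be computed in one line (a quick check; the value $\epsilon =10^{{-9}}$ is chosen purely for illustration):

% choose Omega so that 0.5*exp(-Omega^2/2) <= epsilon
epsilon = 1.0e-9;
Omega   = sqrt(2*log(1/(2*epsilon)));   % approx. 6.33
disp(sprintf('Omega = %5.2f',Omega));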

Now assume that we are given a linear program affected by random perturbations:

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{minimize} &  &  & [c^{{n}}+dc]^{T}x &  & \\\nonumber{}\mbox{subject to} & (l_{c})_{i} & \leq{} & [a_{i}^{{n}}+da_{i}]^{T}x & \leq{} & (u_{c})_{i},\,\,i=1,...,m,\\\nonumber{} & l_{x} & \leq{} & x & \leq{} & u_{x},\end{array}\end{math} (8.1.15)

where $c^{{n}}$, $a_{i}^{{n}}$ are the nominal data, and $dc$, $da_{i}$ are random perturbations with zero means. Assume, for the sake of definiteness, that each of the random perturbations $dc,da_{1},...,da_{m}$ satisfies either the assumption of item 2 or the assumption of item 3, and let $V_{c},V_{{a_{1}}},...,V_{{a_{m}}}$ be the corresponding (upper bounds on the) covariance matrices of the perturbations. Choosing a safety parameter $\Omega$ and replacing the objective and the bodies of all the constraints by their safe bounds as explained above, we arrive at the following optimization problem:

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{minimize} &  &  & t &  & \\\nonumber{}\mbox{subject to} &  &  & [c^{{n}}]^{T}x+\Omega \sqrt{x^{T}V_{c}x} & \leq{} & t,\\\nonumber{} & (l_{c})_{i} & \leq{} & [a_{i}^{{n}}]^{T}x-\Omega \sqrt{x^{T}V_{{a_{i}}}x}, &  & \\\nonumber{} &  &  & [a_{i}^{{n}}]^{T}x+\Omega \sqrt{x^{T}V_{{a_{i}}}x} & \leq{} & (u_{c})_{i},\,\,i=1,...,m,\\\nonumber{} & l_{x} & \leq{} & x & \leq{} & u_{x}.\end{array}\end{math} (8.1.16)

The relation between problems (8.1.16) and (8.1.15) is as follows:

If (x,t) is a feasible solution of (8.1.16), then with probability at least

\begin{displaymath}\nonumber{}p=1-(m+1)\exp{}\lbrace{}-\Omega ^{2}/2\rbrace{}\end{displaymath}

x is feasible for the randomly perturbed problem (8.1.15), and t is an upper bound on the objective of (8.1.15) evaluated at x.

We see that if $\Omega$ is not too small, (8.1.16) can be treated as a “safe version” of (8.1.15).

On the other hand, it is easily seen that (8.1.16) is nothing but the Robust Counterpart of the uncertain linear optimization problem with the nominal data $c^{{n}}$, $a_{i}^{{n}}$ and row-wise ellipsoidal uncertainty given by the matrices $V_{c},V_{{a_{1}}},...,V_{{a_{m}}}$. In the corresponding uncertainty set, the uncertainty affects the coefficients of the objective and the constraint matrix only, and the perturbation vectors affecting the objective and the vectors of coefficients of the linear constraints run, independently of each other, through the respective ellipsoids

\begin{displaymath}\nonumber{}\begin{array}{rcl}\nonumber{}E_{c} & = & \left\lbrace{}dc=\Omega V_{c}^{{1/2}}u:u^{T}u\leq{}1\right\rbrace{},\\\nonumber{}E_{{a_{i}}} & = & \left\lbrace{}da_{i}=\Omega V_{{a_{i}}}^{{1/2}}u:u^{T}u\leq{}1\right\rbrace{},\,i=1,...,m.\end{array}\end{displaymath}

It turns out that in many cases the ellipsoidal model of uncertainty is significantly less conservative, and thus better suited for practice, than the interval model of uncertainty.

Last but not least, it should be mentioned that problem (8.1.16) is equivalent to a conic quadratic program, specifically the program

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{minimize} &  &  & t &  & \\\nonumber{}\mbox{subject to} &  &  & [c^{{n}}]^{T}x+\Omega z & \leq{} & t,\\\nonumber{} & (l_{c})_{i} & \leq{} & [a_{i}^{{n}}]^{T}x-\Omega z_{i}, &  & \\\nonumber{} &  &  & [a_{i}^{{n}}]^{T}x+\Omega z_{i} & \leq{} & (u_{c})_{i},\,\,i=1,...,m,\\\nonumber{} & 0 & = & w-D_{c}x &  & \\\nonumber{} & 0 & = & w^{i}-D_{{a_{i}}}x, &  & i=1,...,m,\\\nonumber{} & 0 & \leq{} & z-\sqrt{w^{T}w}, &  & \\\nonumber{} & 0 & \leq{} & z_{i}-\sqrt{(w^{i})^{T}w^{i}}, &  & i=1,...,m,\\\nonumber{} & l_{x} & \leq{} & x & \leq{} & u_{x}.\end{array}\end{math} (8.1.17)

where $D_{c}$ and $D_{{a_{i}}}$ are matrices satisfying the relations

\begin{displaymath}\nonumber{}V_{c}=D_{c}^{T}D_{c},\,V_{{a_{i}}}=D_{{a_{i}}}^{T}D_{{a_{i}}},\,i=1,...,m.\end{displaymath}
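
In MATLAB, suitable factors can be obtained, e.g., by a Cholesky factorization when the covariance matrices are positive definite, or from an eigenvalue decomposition in the positive semidefinite case (a sketch):

% factor a covariance matrix V as V = D'*D
D = chol(V);               % V positive definite: D is upper triangular
% alternative for positive semidefinite V:
[U,S] = eig(V);
D     = sqrt(max(S,0))*U'; % then D'*D = U*S*U' = V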

8.1.4.1. Example: Interval and Ellipsoidal Robust Counterparts of an uncertain linear constraint with independent random perturbations of the coefficients.

Consider a linear constraint

\begin{math}\nonumber{}l\leq{}\sum \limits _{{j=1}}^{n}a_{j}x_{j}\leq{}u\end{math} (8.1.18)

and assume that the coefficients $a_{j}$ of the body of the constraint are uncertain and vary in intervals $[a_{j}^{{n}}-\sigma _{j},a_{j}^{{n}}+\sigma _{j}]$. The worst-case-oriented model of uncertainty here is the interval one, and the corresponding robust counterpart of the constraint is given by the system of linear inequalities

\begin{math}\nonumber{}\begin{array}{rcccl}\nonumber{}l & \leq{} & \sum \limits _{{j=1}}^{n}a^{{n}}_{j}x_{j}-\sum \limits _{{j=1}}^{n}\sigma _{j}y_{j}, &  & \\\nonumber{} &  & \sum \limits _{{j=1}}^{n}a^{{n}}_{j}x_{j}+\sum \limits _{{j=1}}^{n}\sigma _{j}y_{j} & \leq{} & u,\\\nonumber{}0 & \leq{} & x_{j}+y_{j}, &  & \\\nonumber{}0 & \leq{} & -x_{j}+y_{j}, &  & j=1,...,n.\end{array}\end{math} (8.1.19)

Now assume that we have reason to believe that the true values of the coefficients $a_{j}$ are obtained from their nominal values $a_{j}^{{n}}$ by random perturbations, independent for different j and symmetrically distributed in the segments $[-\sigma _{j},\sigma _{j}]$. With this assumption, we are in the situation of item 3 and can replace the uncertain constraint (8.1.18) with its ellipsoidal robust counterpart

\begin{math}\nonumber{}\begin{array}{rcccl}\nonumber{}l & \leq{} & \sum \limits _{{j=1}}^{n}a^{{n}}_{j}x_{j}-\Omega z, &  & \\\nonumber{} &  & \sum \limits _{{j=1}}^{n}a^{{n}}_{j}x_{j}+\Omega z & \leq{} & u,\\\nonumber{}0 & \leq{} & z-\sqrt{\sum \limits _{{j=1}}^{n}\sigma _{j}^{2}x_{j}^{2}}. &  &\end{array}\end{math} (8.1.20)

Note that with this model of random perturbations, a vector x satisfying (8.1.20) satisfies a realization of (8.1.18) with probability at least $1-\exp \lbrace -\Omega ^{2}/2\rbrace$; for $\Omega =6$, this probability is at least $1-1.5\cdot 10^{{-8}}$, which, for all practical purposes, is the same as saying that x satisfies all realizations of (8.1.18). On the other hand, the uncertainty set associated with (8.1.19) is the box

\begin{displaymath}\nonumber{}B=\left\lbrace{}a=(a_{1},...,a_{n})^{T}:a_{j}^{{n}}-\sigma _{j}\leq{}a_{j}\leq{}a_{j}^{{n}}+\sigma _{j},\,j=1,...,n\right\rbrace{},\end{displaymath}

while the uncertainty set associated with (8.1.20) is the ellipsoid

\begin{displaymath}\nonumber{}E(\Omega )=\left\lbrace{}a=(a_{1},...,a_{n})^{T}:\sum \limits _{{j=1}}^{n}\frac{(a_{j}-a_{j}^{{n}})^{2}}{\sigma _{j}^{2}}\leq{}\Omega ^{2}\right\rbrace{}.\end{displaymath}

For a moderate value of $\Omega$, say $\Omega =6$, and $n\geq 40$, the ellipsoid $E(\Omega )$ in its diameter, typical linear sizes, volume, etc., is incomparably smaller than the box B, the difference becoming the more dramatic the larger the dimension n of the box and the ellipsoid. It follows that the ellipsoidal robust counterpart (8.1.20) of the randomly perturbed uncertain constraint (8.1.18) is much less conservative than the interval robust counterpart (8.1.19), while ensuring basically the same “robustness guarantees”. To illustrate this important point, consider the following numerical example:

There are n different assets on the market. The return on $ 1 invested in asset j is a random variable distributed symmetrically in the segment $[\delta _{j}-\sigma _{j},\delta _{j}+\sigma _{j}]$, and the returns of different assets are independent of each other. The problem is to distribute $ 1 among the assets in order to get the largest possible total return on the resulting portfolio.

A natural model of the problem is an uncertain linear optimization problem

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{maximize} &  &  & \sum \limits _{{j=1}}^{n}a_{j}x_{j} &  & \\\nonumber{}\mbox{subject to} &  &  & \sum \limits _{{j=1}}^{n}x_{j} & = & 1,\\\nonumber{} & 0 & \leq{} & x_{j}, &  & j=1,...,n.\end{array}\end{math} (8.1.21)

where $a_{j}$ are the uncertain returns of the assets. Both the nominal optimal solution (set all returns $a_{j}$ equal to their nominal values $\delta _{j}$) and the risk-neutral Stochastic Programming approach (maximize the expected total return) result in the same solution: our $ 1 should be invested in the most promising asset(s), the one(s) with the maximal nominal return. This solution, however, can be very “unreliable” if, as is typically the case in reality, the most promising asset has the largest volatility $\sigma _{j}$ and is in this sense the most risky. To reduce the risk, one can use the Robust Counterpart approach, which results in the optimization problems

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{maximize} &  & t &  &  & \\\nonumber{}\mbox{subject to} & 0 & \leq{} & -t+\sum \limits _{{j=1}}^{n}(\delta _{j}-\sigma _{j})x_{j}, &  & \\\nonumber{} &  &  & \sum \limits _{{j=1}}^{n}x_{j} & = & 1,\\\nonumber{} & 0 & \leq{} & x_{j}, &  & j=1,...,n.\end{array}\end{math} (8.1.22)

(the interval model of uncertainty) and

\begin{math}\nonumber{}\begin{array}{lccccl}\nonumber{}\mbox{maximize} &  &  & t &  & \\\nonumber{}\mbox{subject to} & 0 & \leq{} & -t+\sum \limits _{{j=1}}^{n}(\delta _{j})x_{j}-\Omega z, &  & \\\nonumber{} & 0 & \leq{} & z-\sqrt{\sum \limits _{{j=1}}^{n}\sigma _{j}^{2}x_{j}^{2}}, &  & \\\nonumber{} &  &  & \sum \limits _{{j=1}}^{n}x_{j} & = & 1,\\\nonumber{} & 0 & \leq{} & x_{j}, &  & j=1,...,n.\end{array}\end{math} (8.1.23)

(the ellipsoidal model of uncertainty; note that (8.1.23) is, essentially, the risk-averse portfolio model proposed in the mid-1950s by Markowitz).

The solution of (8.1.22) is evident: our $ 1 should be invested in the asset(s) with the largest possible guaranteed return $\delta _{j}-\sigma _{j}$. In contrast to this very conservative policy (which in reality prescribes keeping the initial capital in a bank or in the most reliable, and thus low-profit, assets), the optimal solution to (8.1.23) prescribes a quite reasonable diversification of investments which allows one to get a much better total return than (8.1.22) with, basically, zero risk. As an illustration, assume that there are n=300 assets with nominal returns (per year) varying from 1.04 (bank savings) to 2.00:

\begin{displaymath}\nonumber{}\delta _{j}=1.04+0.96\frac{j-1}{n-1},\,j=1,2,...,n=300\end{displaymath}

and volatilities varying from 0 for the bank savings to 1.152 for the most promising asset:

\begin{displaymath}\nonumber{}\sigma _{j}=1.152\frac{j-1}{n-1},\,j=1,...,n=300.\end{displaymath}

Here is a MATLAB script which builds the associated problem (8.1.23), solves it via the MOSEK optimization toolbox, displays the resulting robust optimal value of the total return and the distribution of investments, and, finally, runs 10,000 simulations to get the distribution of the total return for the resulting portfolio (in these simulations, the returns of all assets are uniformly distributed in the corresponding intervals):

% File: rlo2.m

% Problem:
%
% maximize t subject to
% t <= sum(delta(j)*x(j)) -Omega*z,
% y(j) = sigma(j)*x(j), j=1,...,n,
% sum(x(j)) = 1,
% norm(y) <= z,
% 0 <= x.

clear prob;
n     = 300;
Omega = 6;

% setting nominal returns and volatilities
delta = (0.96/(n-1))*[0:1:n-1]+1.04;
sigma = (1.152/(n-1))*[0:1:n-1];

% setting mosekopt description of the problem

prob.c = -[1;zeros(2*n+1,1)];
A      = [-1,delta,-Omega,zeros(1,n);zeros(n+1,2*n+2)];
for j=1:n,
    % body of the constraint y(j) - sigma(j)*x(j) = 0:
    A(j+1,j+1)   = -sigma(j);
    A(j+1,2+n+j) = 1;
end;
A(n+2,2:n+1)        = ones(1,n);
prob.a             = sparse(A);
prob.blc           = [zeros(n+1,1);1];
prob.buc           = [inf;zeros(n,1);1];
prob.blx           = [-inf;zeros(n,1);0;zeros(n,1)];
prob.bux           = inf*ones(2*n+2,1);
prob.cones         = cell(1,1);
prob.cones{1}.type = 'MSK_CT_QUAD';
prob.cones{1}.sub  = [n+2;[n+3:1:2*n+2]'];

% running mosekopt
[r,res]=mosekopt('minimize echo(1)',prob);

% displaying the solution
xx = res.sol.itr.xx;
t  = xx(1);

disp(sprintf('Robust optimal value: %5.4f',t));
x = max(xx(2:1+n),zeros(n,1));
plot([1:1:n],x,'-m');
grid on;

disp('Press <Enter> to run simulations');
pause

% running simulations

Nsim = 10000;
out  = zeros(Nsim,1);
for i=1:Nsim,
    returns  = delta+(2*rand(1,n)-1).*sigma;
    out(i)   = returns*x;
end;
disp(sprintf('Actual returns over %d simulations:',Nsim));
disp(sprintf('Min=%5.4f Mean=%5.4f Max=%5.4f StD=%5.2f',...
    min(out),mean(out),max(out),std(out)));
hist(out);

Here are the results displayed by the script:

Robust optimal value: 1.3428
Actual returns over 10000 simulations:
Min=1.5724 Mean=1.6965 Max=1.8245 StD= 0.03
Figure 8.1: Distribution of investments among the assets in the optimal solution of (8.1.23).

Note that with our setup, the largest guaranteed return $\delta _{j}-\sigma _{j}$ is the one of asset # 1 (bank savings, return 1.04, zero volatility). Consequently, the interval robust counterpart (8.1.22) prescribes to put our $ 1 in the bank, thus getting a guaranteed 4% profit. In contrast to this, the diversified portfolio given by the optimal solution of (8.1.23) never yields a profit less than 57.2%, and yields on average a 69.65% profit with a pretty low (0.03) standard deviation. We see that in favorable circumstances the ellipsoidal robust counterpart of an uncertain linear program is indeed less conservative than, although basically as reliable as, the interval robust counterpart.

Finally, let us compare our results with those given by the nominal optimal solution. The latter prescribes investing everything we have in the most promising asset (in our example, asset # 300, with nominal return 2.00 and volatility 1.152). Assuming that the actual return is uniformly distributed in the corresponding interval and running 10,000 simulations, we get the following results:

Nominal optimal value: 2.0000
Actual returns over 10000 simulations:
Min=0.8483 Mean=1.9918 Max=3.1519 StD= 0.66
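
These figures can be reproduced with a few extra lines (a sketch; it assumes that delta, sigma and n are still in the workspace after running rlo2.m):

% simulating the nominal solution: everything invested in asset n = 300
Nsim = 10000;
ret  = delta(n)+(2*rand(Nsim,1)-1)*sigma(n);
disp(sprintf('Min=%5.4f Mean=%5.4f Max=%5.4f StD=%5.2f',...
    min(ret),mean(ret),max(ret),std(ret)));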

We see that the nominal solution results in a portfolio which is much more risky, although better on average, than the portfolio given by the robust solution.

8.1.4.1.1. Combined Interval-Ellipsoidal Robust Counterpart.

We have considered the case when the coefficients $a_{j}$ of the uncertain linear constraint (8.1.18) are affected by random perturbations, independent of each other and symmetrically distributed in given intervals $[a_{j}^{{n}}-\sigma _{j},a_{j}^{{n}}+\sigma _{j}]$, and have discussed two ways to model the uncertainty:

  • the interval uncertainty model (the uncertainty set $\mathcal{U}$ is the box B), where we ignore the stochastic nature of the perturbations and their independence; this model yields the interval RC (8.1.19);
  • the ellipsoidal uncertainty model ($\mathcal{U}$ is the ellipsoid $E(\Omega )$), which does take into account the stochastic nature of the data perturbations and yields the ellipsoidal RC (8.1.20).

Note that although for large n the ellipsoid $E(\Omega )$ in its diameter, volume and average linear sizes is incomparably smaller than the box B, in the case of $\Omega >1$ the ellipsoid $E(\Omega )$ in certain directions goes beyond the box. For example, the ellipsoid E(6), although much more “narrow” than B in most directions, is 6 times “thicker” than B in the directions of the coordinate axes. Intuition says that it hardly makes sense to keep in the uncertainty set realizations of the data which are outside of B and thus are forbidden by our model of perturbations, so that in the situation under consideration the intersection of $E(\Omega )$ and B is a better model of the uncertainty set than the ellipsoid $E(\Omega )$ itself. What happens when the model of the uncertainty set is the “combined interval-ellipsoidal” uncertainty $\mathcal{U}=E(\Omega )\cap B$?

First, it turns out that the RC of (8.1.18) corresponding to the uncertainty set $E(\Omega )\cap B$ is still given by a system of linear and conic quadratic inequalities, specifically the system

\begin{math}\nonumber{}\begin{array}{rcccl}\nonumber{}l & \leq{} & \sum \limits _{{j=1}}^{n}a_{j}^{{n}}x_{j}-\sum \limits _{{j=1}}^{n}\sigma _{j}y_{j}-\Omega \sqrt{\sum \limits _{{j=1}}^{n}\sigma _{j}^{2}u_{j}^{2}}, &  & \\\nonumber{} &  & \sum \limits _{{j=1}}^{n}a_{j}^{{n}}x_{j}+\sum \limits _{{j=1}}^{n}\sigma _{j}z_{j}+\Omega \sqrt{\sum \limits _{{j=1}}^{n}\sigma _{j}^{2}v_{j}^{2}} & \leq{} & u,\\\nonumber{}-y_{j} & \leq{} & x_{j}-u_{j} & \leq{} & y_{j},\,j=1,...,n,\\\nonumber{}-z_{j} & \leq{} & x_{j}-v_{j} & \leq{} & z_{j},\,j=1,...,n.\end{array}\end{math} (8.1.24)

Second, it turns out that our intuition is correct: as a model of uncertainty, $E(\Omega )\cap B$ is as reliable as the ellipsoid $E(\Omega )$. Specifically, if x can be extended to a feasible solution of (8.1.24), then the probability for x to satisfy a realization of (8.1.18) is at least $1-\exp \lbrace -\Omega ^{2}/2\rbrace$.

The conclusion is that if we have reason to assume that the perturbations of the uncertain coefficients in a constraint of an uncertain linear optimization problem are (a) random, (b) independent of each other, and (c) symmetrically distributed in given intervals, then it makes sense to associate with this constraint the interval-ellipsoidal model of uncertainty and to use, as the robust version of the constraint, the system of linear and conic quadratic inequalities (8.1.24). Note that when building the Robust Counterpart of an uncertain linear optimization problem, one can use different models of uncertainty (e.g., interval, ellipsoidal, combined interval-ellipsoidal) for different uncertain constraints and, consequently, use in the RC different “robust versions” of the different original uncertain constraints.

8.1.5. Further references

For further information about robust linear optimization consult [4, 6].

8.2. Geometric (posynomial) optimization

8.2.1. The problem

A geometric optimization problem can be stated as follows

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & \sum \limits _{{k\in{}J_{0}}}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}} &  &  & \\\nonumber{}\mbox{subject to} & \sum \limits _{{k\in{}J_{i}}}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}} & \leq{} & 1, & i=1,\ldots ,m,\\\nonumber{} & t>0, &  &  &\end{array}\end{math} (8.2.1)

where it is assumed that

\begin{displaymath}\nonumber{}\cup _{{k=0}}^{m}J_{k}=\lbrace{}1,\ldots ,T\rbrace{}\end{displaymath}

and if $i\neq j$, then

\begin{displaymath}\nonumber{}J_{i}\cap J_{j}=\emptyset .\end{displaymath}

Hence, A is a $T\times n$ matrix and c is a vector of length T. Given $c_{k}>0$, the expression

\begin{displaymath}\nonumber{}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}}\end{displaymath}

is called a monomial. A sum of monomials, i.e.,

\begin{displaymath}\nonumber{}\sum \limits _{{k\in{}J_{i}}}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}}\end{displaymath}

is called a posynomial. In general, the problem (8.2.1) is very hard to solve. However, the posynomial case, where it is required that

\begin{displaymath}\nonumber{}c>0\end{displaymath}

is relatively easy. The reason is that a simple variable transformation yields a convex optimization problem. Indeed, using the variable transformation

\begin{math}\nonumber{}t_{j}=e^{{x_{j}}}\end{math} (8.2.2)

we obtain the problem

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & \sum \limits _{{k\in{}J_{0}}}c_{k}e^{{\sum \limits _{{j=0}}^{{n-1}}a_{{kj}}x_{j}}} &  &  & \\\nonumber{}\mbox{subject to} & \sum \limits _{{k\in{}J_{i}}}c_{k}e^{{\sum \limits _{{j=0}}^{{n-1}}a_{{kj}}x_{j}}} & \leq{} & 1, & i=1,\ldots ,m,\end{array}\end{math} (8.2.3)

which is a convex optimization problem that can be solved using MOSEK. We will call

\begin{displaymath}\nonumber{}c_{t}e^{{\left(\sum \limits _{{j=0}}^{{n-1}}a_{{tj}}x_{j}\right)}}=e^{{\left(\log (c_{t})+\sum \limits _{{j=0}}^{{n-1}}a_{{tj}}x_{j}\right)}}\end{displaymath}

a term, and hence the number of terms is T.

As stated, the problem (8.2.3) is non-separable. However, introducing

\begin{displaymath}\nonumber{}v_{t}=\log (c_{t})+\sum \limits _{{j=0}}^{{n-1}}a_{{tj}}x_{j}\end{displaymath}

we obtain the separable problem

\begin{math}\nonumber{}\begin{array}{lccll}\nonumber{}\mbox{minimize} & \sum \limits _{{t\in{}J_{0}}}e^{{v_{t}}} &  &  & \\\nonumber{}\mbox{subject to} & \sum \limits _{{t\in{}J_{i}}}e^{{v_{t}}} & \leq{} & 1, & i=1,\ldots ,m,\\\nonumber{} & \sum \limits _{{j=0}}^{{n-1}}a_{{tj}}x_{j}-v_{t} & = & -\log (c_{t}), & t=1,\ldots ,T,\end{array}\end{math} (8.2.4)

which is a separable convex optimization problem.

One warning about this approach is that the function

\begin{displaymath}\nonumber{}e^{x}\end{displaymath}

is numerically well-behaved only for moderate values of x. Indeed, $e^{x}$ grows very rapidly as x becomes larger and quickly exceeds the largest representable floating-point number. Therefore numerical problems may arise when solving the problem in this form.
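
A quick MATLAB check illustrates the point (in double precision the overflow threshold is log(realmax), about 709.78):

% exp overflows in double precision just above x = 709
disp(exp(709));        % about 8.2e307, still representable
disp(exp(710));        % Inf: overflow
disp(log(realmax));    % about 709.78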

8.2.2. Applications

A large number of practical applications, particularly in electrical circuit design, can be cast as geometric optimization problems. We will not review these applications here, but rather refer the reader to [15] and the references therein.

8.2.3. Modelling tricks

Many tricks that can be used for modelling posynomial optimization problems are described in [15]. Therefore, in this section we cover only one important case.

8.2.3.1. Equalities

In general equalities are not allowed in (8.2.1), i.e.

\begin{displaymath}\nonumber{}\sum \limits _{{k\in{}J_{i}}}c_{k}\prod \limits _{{j=0}}^{{n-1}}t_{j}^{{a_{{kj}}}}=1\end{displaymath}

is not allowed. However, a monomial equality is not a problem. Indeed consider the example

\begin{displaymath}\nonumber{}xyz^{{-1}}=1\end{displaymath}

of a monomial equality. The equality is identical to

\begin{displaymath}\nonumber{}1\leq{}xyz^{{-1}}\leq{}1\end{displaymath}

which in turn is identical to the two inequalities

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}xyz^{{-1}} &  &  & \leq{} & 1,\\\nonumber{}\frac{1}{xyz^{{-1}}} & = & x^{{-1}}y^{{-1}}z & \leq{} & 1.\end{array}\end{displaymath}

Hence, it is possible to model a monomial equality using two inequalities.
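
In the c, a, map data format accepted by mskgpopt (see Section 8.2.6 below), the equality $xyz^{{-1}}=1$ would thus contribute two single-term inequality constraints, for example (a hypothetical fragment):

% the monomial equality x*y*z^-1 = 1 as two posynomial inequalities
c_eq   = [1;1];
a_eq   = [ 1  1 -1;    % x*y*z^-1    <= 1
          -1 -1  1];   % x^-1*y^-1*z <= 1
map_eq = [1;2];        % each inequality forms its own constraint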

8.2.4. Problematic formulations

Certain formulations of geometric optimization problems may cause problems for the algorithms implemented in MOSEK. Basically there are two kinds of problems that may occur:

  • The solution vector is finite, but the optimal objective value can only be approximated.
  • The optimal objective value is finite, but it implies that a variable in the solution is infinite.

8.2.4.1. Finite unattainable solution

The following problem illustrates an unattainable solution:

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & x^{2}y &  & \\\nonumber{}\mbox{subject to} & xy & \leq{} & 1,\\\nonumber{} & x,y>0. &  &\end{array}\end{displaymath}

Clearly, the optimal objective value is 0, but due to the constraint x,y>0 this value can never be attained. To see why this is a problem, remember that MOSEK substitutes $x=e^{{t_{x}}}$ and $y=e^{{t_{y}}}$ and solves the problem as

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & e^{{2t_{x}}}e^{{t_{y}}} &  & \\\nonumber{}\mbox{subject to} & e^{{t_{x}}}e^{{t_{y}}} & \leq{} & 1,\\\nonumber{} & t_{x},t_{y}\in{}{R}. &  &\end{array}\end{displaymath}

We now see that the optimal solution requires $t_{x}=-\infty$ or $t_{y}=-\infty$, which is unattainable.

It should now be clear what the issue is: if a variable x appears only with nonnegative exponents, then fixing x=0 will minimize all terms in which it appears – but such a solution can never be attained.

8.2.4.2. Infinite solution

A similar problem occurs if a finite optimal objective value requires a variable to be infinite. This can be illustrated by the following example:

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & x^{{-2}} &  & \\\nonumber{}\mbox{subject to} & x^{{-1}} & \leq{} & 1,\\\nonumber{} & x>0, &  &\end{array}\end{displaymath}

which is a valid geometric programming problem. In this case the optimal objective value is 0, but this requires $x=\infty$, which is unattainable.

Again, this specific case will appear if a variable x appears only with negative exponents in the problem, implying that each term in which it appears can be minimized by letting $x\rightarrow \infty$.
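
Both problematic cases can be screened for by inspecting the exponent matrix before calling mskgpopt (a sketch, using the c, a, map format of Section 8.2.6):

% flag variables occurring only with nonnegative, or only with
% nonpositive, exponents across all terms; by the discussion above
% these may lead to unattainable solutions
lo = full(all(a>=0,1) & any(a>0,1));  % all terms minimized by x_j -> 0
hi = full(all(a<=0,1) & any(a<0,1));  % all terms minimized by x_j -> inf
disp(find(lo|hi));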

8.2.5. An example

Consider the example

\begin{displaymath}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & x^{{-1}}y &  & \\\nonumber{}\mbox{subject to} & x^{2}y^{{-\frac{1}{2}}}+3y^{{\frac{1}{2}}}z^{{-1}} & \leq{} & 1,\\\nonumber{} & xy^{{-1}} & = & z^{2},\\\nonumber{} & -x & \leq{} & -\frac{1}{10},\\\nonumber{} & x & \leq{} & 3,\\\nonumber{} & x,y,z>0, &  &\end{array}\end{displaymath}

which is not a geometric optimization problem. However, using the obvious transformations we obtain the problem

\begin{math}\nonumber{}\begin{array}{lclcl}\nonumber{}\mbox{minimize} & x^{{-1}}y &  & \\\nonumber{}\mbox{subject to} & x^{2}y^{{-\frac{1}{2}}}+3y^{{\frac{1}{2}}}z^{{-1}} & \leq{} & 1,\\\nonumber{} & xy^{{-1}}z^{{-2}} & \leq{} & 1,\\\nonumber{} & x^{{-1}}yz^{2} & \leq{} & 1,\\\nonumber{} & \frac{1}{10}x^{{-1}} & \leq{} & 1,\\\nonumber{} & \frac{1}{3}x & \leq{} & 1,\\\nonumber{} & x,y,z>0, &  &\end{array}\end{math} (8.2.5)

which is a geometric optimization problem.

8.2.6. Solving the example

Problem (8.2.5) can be defined and solved in the MOSEK toolbox as shown below.

% go2.m

c     = [1 1 3 1 1 0.1 1/3]';
a     = sparse([[-1  1  0];
                [2 -0.5 0];
                [0 0.5 -1];
                [1 -1 -2]; 
                [-1 1 2]; 
                [-1 0 0]; 
                [1 0 0]]);

map   = [0 1 1 2 3 4 5]';
[res] = mskgpopt(c,a,map);

fprintf('\nPrimal optimal solution to original gp:');
fprintf(' %e',exp(res.sol.itr.xx));
fprintf('\n\n');

% Compute the optimal objective value and
% the constraint activities.
v = c.*exp(a*res.sol.itr.xx);

% Add appropriate terms together.
f = sparse(map+1,1:7,ones(size(map)))*v;

% First objective value. Then constraint values.
fprintf('Objective value: %e\n',log(f(1)));
fprintf('Constraint values:');
fprintf(' %e',log(f(2:end)));
fprintf('\n\n');

% Dual multipliers (should be negative)
fprintf('Dual variables (should be negative):');
fprintf(' %e',res.sol.itr.y);
fprintf('\n\n');

8.2.7. Exporting to a file

It's possible to write a geometric optimization problem to a file with the command:

 mskgpwri(c,a,map,filename)

This file format is compatible with the command line tool mskexpopt; see the MOSEK Tools User's Manual for details on mskexpopt. The file format can be useful, e.g., for sending debug information to MOSEK or for testing. It is also possible to read the above format with the command:

 [c,a,map] = mskgpread(filename)
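
For instance, the problem data from go2.m can be written to a file and read back as follows (the file name is arbitrary):

mskgpwri(c,a,map,'go2.gp');
[c2,a2,map2] = mskgpread('go2.gp');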

8.2.8. Further information

More information about geometric optimization problems is located in [18, 14, 15].
