In optimisation a variety of acceptable answers to a problem are available, and the aim is to determine which one is best in terms of some specified criterion.
Note that:
`Given a set of variables xi, i = 1, 2, ..., n, and an objective function P(x1, x2, x3, ...), find values for the xi such that P is a minimum (or a maximum).'
This is called an unconstrained optimisation problem.
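The unconstrained definition can be illustrated with a short sketch. The objective P(x) = (x - 3)² + 1, the starting point and the step size below are all assumptions for illustration, and simple gradient descent stands in for whatever method might actually be used:

```python
# Sketch of unconstrained one-variable minimisation by gradient
# descent.  The objective P, start point and step size are assumptions.

def P(x):
    return (x - 3.0) ** 2 + 1.0      # assumed objective: minimum at x = 3

def dP(x):
    return 2.0 * (x - 3.0)           # its analytic first derivative

def minimise(x0, step=0.1, tol=1e-8, max_iter=10000):
    x = x0
    for _ in range(max_iter):
        g = dP(x)
        if abs(g) < tol:             # gradient ~ 0: stationary point
            break
        x -= step * g                # move downhill
    return x

x_star = minimise(0.0)
print(x_star, P(x_star))
```

The iteration walks downhill until the first derivative vanishes, which for this assumed P happens at x = 3.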
The next most complicated class of problem conceptually is an optimisation with inequality constraints. We may write the definition of this as:
`Given a set of variables xi, i = 1, 2, ..., n, and an objective function P(x1, x2, x3, ...), find values for the xi such that P is a minimum, but such that a set of inequalities:

g1(x1, x2, x3, ...) < 0
g2(x1, x2, x3, ...) < 0
...
gm(x1, x2, x3, ...) < 0

are not violated.'
As noted above we can maximise the o.f. by changing its sign and minimising,
and so can drop the explicit reference to maximisation. Similarly,
we can express any kind of inequality in the above form by appropriate
manipulation of the expression.
The above definitions are of an optimisation involving n
decision variables. Note that the number of inequality constraints, given
above as m, is in no way connected to the number of variables
n; there may be either more or fewer inequality constraints than variables.
This may be seen most easily with a simple example.
Minimise P = x²
1. The minimum here is clearly at x=0, where the o.f. is zero.
2. For case 1, the optimum is the same as in the unconstrained case,
because its value, x=0, satisfies both of the inequalities.
The constraints have no effect and so are said to be inactive.
For case 2, the unconstrained optimum does not satisfy the new third
inequality. The lowest value of the o.f. which satisfies it lies
at x=2, i.e. actually on the constraint, with an
o.f. value of 4. The constraint is said to be active, and is in effect
an equality constraint since it forces x to be 2.
3. If the equality constraint is satisfied, then x must equal 2.
This is no longer an optimisation problem, as its solution is
found by solving an equation, which corresponds to the equality constraint.
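Case 2 can be checked numerically. Below is a minimal sketch using a quadratic penalty to enforce x >= 2; the penalty weight schedule and the crude grid search are assumptions for illustration:

```python
# Quadratic-penalty check of case 2: minimise P = x**2 subject to
# x >= 2.  Penalty weights and the crude grid search are assumptions.

def penalised(x, mu):
    violation = max(0.0, 2.0 - x)        # how badly x >= 2 is broken
    return x * x + mu * violation ** 2

def minimise_1d(f, lo=-10.0, hi=10.0, n=20001):
    # crude grid search: adequate for a one-variable illustration
    pts = (lo + i * (hi - lo) / (n - 1) for i in range(n))
    return min(pts, key=f)

x = 0.0
for mu in (1.0, 10.0, 100.0, 1000.0):    # tighten the penalty gradually
    x = minimise_1d(lambda t: penalised(t, mu))
print(x, x * x)
```

As the penalty weight grows, the minimiser approaches the constraint at x = 2 with an o.f. value approaching 4, matching the discussion above.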
In general every equality constraint reduces the effective number of
`free' or design variables in an optimisation problem by one. Thus
if there are n variables and e equality constraints, only n - e variables
remain free to be chosen by the optimiser.
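The reduction can be sketched concretely. Assuming the example P = x1² + x2² with the single equality constraint x1 + x2 = 2 (both invented for illustration), substituting the constraint eliminates x2 and leaves a one-variable problem:

```python
# Eliminating x2 via the equality constraint x1 + x2 = 2 (an assumed
# example): the two-variable problem becomes a one-variable one.

def P_reduced(x1):
    x2 = 2.0 - x1                    # x2 is no longer a free variable
    return x1 * x1 + x2 * x2         # assumed objective x1**2 + x2**2

# minimise the reduced one-variable function by a coarse grid search
candidates = [i / 1000.0 for i in range(-2000, 4001)]
x1 = min(candidates, key=P_reduced)
x2 = 2.0 - x1
print(x1, x2, P_reduced(x1))         # minimum at x1 = x2 = 1, P = 2
```

Two variables and one equality constraint leave one free variable, exactly as the counting argument says.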
Linear programming (LP) techniques are based on those used to solve sets of linear equations.
The principles of linear programming are very simple. If the objective
function is linear, then it cannot have an unconstrained minimum
at any finite values of the variables. The unconstrained minimum must be
minus infinity and lie at values of the variables which are either plus or
minus infinity.
This is clear if you consider a single variable example, e.g. minimise P = x + 4.
Clearly the unconstrained minimum must lie at x = −∞.
The optimum of an LP must lie at a point which satisfies all the equality
constraints, with the active inequality constraints treated as equalities.
LP can thus be thought of as
a trial and error procedure which involves repeatedly solving sets of
linear equations in order to find which constraints are
active.
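The `repeatedly solving sets of linear equations' idea can be sketched for a two-variable LP. All the problem data below are assumptions for illustration; each pair of constraints is tried as the active set, the resulting 2×2 system is solved by Cramer's rule, and the best feasible vertex wins:

```python
# Naive LP by active-set enumeration: try each pair of constraints as
# equalities, solve the 2x2 linear system, keep the best feasible vertex.
from itertools import combinations

c = (-2.0, -1.0)                                  # minimise c . x
A = [(1, 0), (0, 1), (1, 1), (-1, 0), (0, -1)]    # rows of  a . x <= b
b = [4, 3, 5, 0, 0]

best = None
for i, j in combinations(range(len(A)), 2):
    (a1, a2), (a3, a4) = A[i], A[j]
    det = a1 * a4 - a2 * a3
    if det == 0:
        continue                       # parallel constraints: no vertex
    # Cramer's rule: treat constraints i and j as active (equalities)
    x = (b[i] * a4 - a2 * b[j]) / det
    y = (a1 * b[j] - b[i] * a3) / det
    if all(ai * x + aj * y <= bi + 1e-9 for (ai, aj), bi in zip(A, b)):
        val = c[0] * x + c[1] * y
        if best is None or val < best[0]:
            best = (val, x, y, (i, j))

val, x, y, active = best
print(val, x, y, active)
```

Real LP codes (the simplex method) move between such vertices far more cleverly, but the underlying computation is the same: solving linear systems to find which constraints are active.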
Many highly developed LP packages are available commercially and are quite user
friendly. We will give one example here, but in general will not
say any more about this technique.
Integer programming deals with situations which involve discrete valued decisions, e.g. whether to
have one type of equipment, another type, or none at all.
Example: In a particular process we may install 1, 2 or
3 CSTRs of the same size in series. This is an IP problem in
a single variable to which the answer may be 1, 2 or 3.
Since there are always a finite number of solutions to an
IP problem (for a problem with continuous variables there will be an infinite number)
it is possible in principle to solve by complete enumeration,
i.e. simply generating and evaluating all solutions. For small problems
such as this one, the approach is quite feasible.
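Complete enumeration for the CSTR example might look like the following sketch. The cost model (a capital cost rising with the number of reactors and a conversion penalty falling with it) is purely an assumption for illustration:

```python
# Complete enumeration for the CSTR example: try n = 1, 2 or 3 reactors
# in series and keep the cheapest.  The cost model is an assumption.

def total_cost(n):
    capital = 100.0 * n              # assumed cost per reactor
    off_spec = 240.0 / n             # assumed penalty for poor conversion
    return capital + off_spec

costs = {n: total_cost(n) for n in (1, 2, 3)}
best_n = min(costs, key=costs.get)
print(best_n, costs)
```

With only three candidate solutions, generating and evaluating them all is trivial; the difficulty of IP comes when the number of alternatives explodes.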
Consider what would happen if instead of having only a single size of
reactor we had a choice of two and could install either a large or
a small unit in each of the positions in series. The number of alternatives is now much
larger, and can be illustrated as in this diagram.
This tree diagram is one of the classical ways of representing IP problems. The tree has its root at the square box labelled
`Reactor 1',
and its leaves are the ellipses marked `Evaluate'.
The square boxes represent decision nodes at which
it is decided what size of reactor to install, and the branches
leaving these nodes are labelled L, S or N depending on whether a
large or small reactor, or none at all, has been chosen.
At each of the leaves or terminal nodes one complete process alternative
has been specified and may be evaluated.
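Generating the leaves of this tree is straightforward; the evaluate function below is a placeholder assumption, since the notes do not specify how a complete alternative is costed:

```python
# Enumerating the leaves of the tree: each of the three positions in
# series gets a large (L) or small (S) reactor, or none (N), giving
# 3**3 = 27 complete process alternatives.
from itertools import product

SIZES = ('L', 'S', 'N')
alternatives = list(product(SIZES, repeat=3))
print(len(alternatives))             # number of leaves on the tree

def evaluate(design):
    # placeholder evaluation: count installed reactors (an assumption)
    return sum(1 for s in design if s != 'N')

evaluated = {d: evaluate(d) for d in alternatives}
```

Doubling the number of choices per node from the earlier example multiplies the number of leaves combinatorially, which is why `smart' search methods matter.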
Different IP methods involve different ways of generating and searching the
tree of alternatives. `Smart' methods involve two main ideas:
One way of approaching IP problems such as this which represent process
structural alternatives is to generate a `superflowsheet'
or superstructure which contains within it all possible
process structures.
It can be seen that this process involves 9 decision variables, one for each `valve'. In modelling this process for optimisation, if no feed
reaches a reactor then it is deemed not to be needed and so has no cost.
We could have used a smaller number of `three-way valves' but most
of the classical IP techniques actually allow variables to have
only two values, i.e. they are binary numbers. The superstructure
has thus been set up to use only binary valves.
A MILP problem must involve only linear
constraints (i.e. linear equations and inequalities) and a linear
objective function, together with some integer (usually binary) variables.
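A toy MILP can be solved by enumerating the binary variable and solving the remaining one-variable LP by inspection. All the numbers below are assumptions for illustration:

```python
# Tiny MILP sketch: one binary variable y and one continuous variable x,
# with a linear objective and linear constraints (all data assumed).

def solve_lp_given_y(y):
    # minimise 5*y + 3*x  subject to  x + 2*y >= 4  and  0 <= x <= 5
    x_low = max(0.0, 4.0 - 2.0 * y)      # constraint pushes x down to here
    if x_low > 5.0:
        return None                      # this y leaves no feasible x
    return 5.0 * y + 3.0 * x_low, x_low

results = {y: solve_lp_given_y(y) for y in (0, 1)}
feasible = {y: r for y, r in results.items() if r is not None}
y = min(feasible, key=lambda k: feasible[k][0])
obj, x = feasible[y]
print(y, x, obj)
```

Real MILP codes (branch and bound) avoid enumerating every integer combination, but each subproblem they solve is an ordinary LP, just as here.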
The examples described in section 5.3.1 are all one variable NLP problems.
For general multivariable nonlinear optimisation, extensions of the
methods described for single variables can be used. However, for
a large and complicated problem the effort involved in differentiation
for the first derivatives (which for the constraints form a matrix
called the Jacobian) and then the second derivatives (which for the
o.f. form the Hessian matrix) can be considerable.
Non-gradient methods are available, but are much less efficient
for many variables than they are for a single variable.
We describe briefly
below two NLP methods which seek to get round these difficulties.
A new linear approximation can now be made, and solved, and then the process
repeated. The approach is thus similar conceptually to equation
solving procedures which use linear approximations.
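Here is a one-variable sketch of the idea, with an assumed objective f(x) = (x - 3)² and assumed bounds -5 <= x <= 1. The linearised subproblem is an LP over an interval, so its solution always sits at one of the bounds:

```python
# One-variable SLP sketch: linearise the (assumed) objective at the
# current point and solve the resulting LP over the box -5 <= x <= 1.

def f(x):
    return (x - 3.0) ** 2            # assumed nonlinear objective

def df(x):
    return 2.0 * (x - 3.0)

lo, hi = -5.0, 1.0                   # assumed bounds (the constraints)
x = -4.0                             # assumed starting point
for _ in range(10):
    slope = df(x)
    x_new = lo if slope > 0 else hi  # a linear o.f. is minimised at a bound
    if x_new == x:
        break                        # converged on an active constraint
    x = x_new
print(x, f(x))
```

Because f is decreasing everywhere in the box, the LP drives x to the upper bound x = 1, which is the constrained optimum; this also shows why SLP only works well when the true optimum lies on a constraint.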
Unfortunately, because a linear program always finds its optimum on a
constraint, if the optimum for the NLP is not in fact constrained,
this method will not find it! Although it has some disadvantages, SLP
may be an effective method if the optimum is known to lie on a constraint.
There are even ways of making it work when the optimum is unconstrained, but
this uses a concept known as a `trust region' and is not a commonly
available technique.
SLP is not much used for process engineering problems; SQP, described
below, is more usual.
Special and efficient methods exist for solving this type of problem.
SQP (successive quadratic programming) works like SLP, except that
the o.f. is approximated by a quadratic rather than a linear function. This
means that the optimum may lie away from a constraint. The constraints
are given linear approximations.
Section 5.3.3 describes
quadratic approximation as applied to a one variable situation.
SQP is one of the most effective NLP techniques and is now the
preferred method for most large scale optimisation.
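The one-variable analogue of the quadratic-model idea is a Newton step: fit a quadratic at the current point and jump to its minimum. The objective f(x) = x**4 - 8*x and the starting point below are assumptions for illustration:

```python
# One-variable analogue of SQP: replace the (assumed) objective
# f(x) = x**4 - 8*x by its local quadratic model and jump to that
# quadratic's minimum, i.e. take a Newton step.  No constraints here,
# so the optimum can sit away from any bound.

def df(x):
    return 4.0 * x ** 3 - 8.0        # first derivative of f

def d2f(x):
    return 12.0 * x ** 2             # second derivative of f

x = 2.0                              # assumed starting point
for _ in range(50):
    step = df(x) / d2f(x)            # minimiser of the local quadratic
    x -= step
    if abs(step) < 1e-12:
        break
print(x)                             # stationary point where 4*x**3 = 8
```

Unlike the SLP sketch, the quadratic model has an interior minimum, so the iteration converges to an unconstrained optimum (here the cube root of 2).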
As might be expected, this class of problem is hard to solve.
The most usual approach is to decompose the problem
into an MILP and an NLP and solve these alternately, feeding
the results of one stage into the next.
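For a toy MINLP the structure such a decomposition exploits can be seen directly: for each value of the integer variable an inner continuous (NLP) subproblem is solved, and the best combination wins. The cost model below is an assumption, and complete enumeration of the integer choices stands in for a real MILP master problem:

```python
# Toy MINLP: an integer number of reactors n in {1, 2, 3} plus a
# continuous operating variable x.  The cost model is an assumption.

def inner_nlp(n):
    # inner NLP: minimise 2*n + 6/n + (x - 4/n)**2 over continuous x;
    # the quadratic term is minimised exactly at x = 4/n
    x_opt = 4.0 / n
    cost = 2.0 * n + 6.0 / n
    return cost, x_opt

best = min(inner_nlp(n) + (n,) for n in (1, 2, 3))
cost, x, n = best
print(n, x, cost)
```

Here the inner NLP has a closed-form answer; in a real decomposition each discrete candidate proposed by the MILP stage would be evaluated by a full NLP solve.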
One Variable Optimisation
When n = 1 it is particularly easy to visualise optimisation,
and a number of special methods are available for the above two types of
problem.
Optimisation with Equality Constraints
If, instead of or as well as inequality constraints,
constraints which are equalities (i.e. equations) are specified, then
the dimension of the problem (i.e. the effective number of variables)
changes.
Alphabet Soup
In mathematical formulation, multivariable optimisation problems are classified into categories depending on
whether the variables are continuous or integer valued, and whether the objective function and constraints are linear or nonlinear.
These different classes of problem all have different solution methods,
and most confusingly, tend to be identified by the initials of the name
given to the class of problem! One of the first prerequisites for
`bluffing your way in optimisation' is to know
these names and, especially, the initials....
Mathematical Programming
Sometimes called just MP, this is a generic term for any of
the methods or algorithms used in optimisation.
LP - Linear Programming
When the objective function and all the constraints,
equality and inequality, are linear in the problem variables, then
a range of special techniques called Linear Programming, are applicable.
Example: minimise P = x + 4
IP - Integer Programming
Formally an integer programming problem is one where the
problem variables can take only integer values. In process engineering
this type of problem is typically associated with process synthesis
or the development of flowsheets.
Here is such a superstructure.
MILP - Mixed Integer Linear Programming
In the above IP example only discrete or integer variables were
allowed. There can be similarities in the way in which IP
searches through its discrete choices and how LP (linear programming)
searches to find its `active' constraints. A range of techniques
is available which combine the two as mixed integer linear
programming.
NLP - Nonlinear Programming
NLP is a general term for optimisation when all variables are continuous,
but some of the constraints and/or the objective function
are nonlinear in the problem variables.
SLP - Successive Linear Programming
It is relatively easy to make a linear approximation to a nonlinear objective
function and any nonlinear constraints, starting at some estimate
of the optimal solution. This produces a linear program, which is solved to
get an optimum and corresponding values of the variables.
QP and SQP - Quadratic Programming
A quadratic program or QP problem is one where all the constraints
are linear but the objective function can contain quadratic
terms. This means that the first derivatives of the
o.f. are linear (and those of the constraints are constant),
the second derivatives are constant, and all derivatives are
relatively easily computed.
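This property is easy to check numerically: for an assumed quadratic f(x, y) = x² + xy + 2y², a finite-difference second derivative comes out the same at every point.

```python
# Numerical check that a quadratic o.f. has a constant Hessian entry:
# the finite-difference second derivative of the assumed quadratic
# below is the same wherever it is evaluated.

def f(x, y):
    return x * x + x * y + 2.0 * y * y   # assumed quadratic objective

def hess_xx(x, y, h=1e-3):
    # central finite difference for the second derivative in x
    return (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / (h * h)

print(hess_xx(0.0, 0.0), hess_xx(5.0, -3.0))   # same value at both points
```

Constant second derivatives are exactly what makes QP subproblems cheap to solve, and why SQP builds one at every iteration.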
MINLP - Mixed Integer Nonlinear Programming
If we combine discrete and continuous variables and allow general
nonlinear constraints and objective function, we get the most
general class of optimisation problem, called a mixed integer
nonlinear program or MINLP.