Background
Suppose we have a vector of $n$ real variables $\vec{x} = (x_1,\dots,x_n)$. We may then specify a linear objective function on $\vec{x}$ by $f(\vec{x}) = \vec{c}^T\vec{x}$, where $\vec{c} \in \mathbb{R}^n$ gives the coefficients of the linear function. Further, we may specify a set of linear constraints on $\vec{x}$ by a matrix $A \in \mathbb{R}^{m \times n}$ and a vector $\vec{b} \in \mathbb{R}^m$, where each row of $A$ gives the coefficients of a linear constraint and $\vec{b}$ gives the right-hand side of each constraint.
A linear program (LP) is then the problem of:
- maximizing $\vec{c}^T\vec{x}$, subject to
- $A\vec{x} \leq \vec{b}$.
Typically, we have the additional constraint that $\vec{x} \succeq 0$. A linear program with this constraint is said to be in standard form. In the remainder, we assume all linear programs are in standard form.
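To make the standard form concrete, here is a minimal sketch (with made-up data) of checking feasibility and evaluating the objective for a tiny standard-form LP:

```python
# Minimal sketch: feasibility check and objective evaluation for a
# standard-form LP (maximize c^T x s.t. A x <= b, x >= 0).
# All data below is a made-up example.

def is_feasible(A, b, x):
    """Check A x <= b (componentwise) and x >= 0."""
    if any(xi < 0 for xi in x):
        return False
    return all(
        sum(aij * xj for aij, xj in zip(row, x)) <= bi
        for row, bi in zip(A, b)
    )

def objective(c, x):
    """Evaluate c^T x."""
    return sum(ci * xi for ci, xi in zip(c, x))

# maximize 3x1 + 2x2  s.t.  x1 + x2 <= 4,  x1 <= 2,  x >= 0
A = [[1.0, 1.0], [1.0, 0.0]]
b = [4.0, 2.0]
c = [3.0, 2.0]

print(is_feasible(A, b, [2.0, 2.0]))  # True: on the boundary
print(is_feasible(A, b, [3.0, 2.0]))  # False: violates both constraints
print(objective(c, [2.0, 2.0]))       # 10.0
```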
Slack variables
We can convert the inequality $A\vec{x}\leq\vec{b}$ into an equality by introducing slack variables $\vec{s}\in\mathbb{R}^m$ with $\vec{s}\succeq 0$. Specifically, we replace each constraint $\vec{a}_j^T\vec{x} \leq b_j$ (where $\vec{a}_j^T$ is the $j$-th row of $A$) with $\vec{a}_j^T\vec{x} + s_j = b_j$. We can apply the same trick to the objective function by introducing a new variable $z$ such that $z - \vec{c}^T\vec{x} = 0$.
Altogether, this lets us write the LP concisely in block-matrix form: \[ \text{maximize}\quad z\quad\text{s.t.}\quad \begin{bmatrix} 1 & -\vec{c}^T & 0\\ 0 & A & I \end{bmatrix} \begin{bmatrix} z\\ \vec{x}\\ \vec{s} \end{bmatrix} = \begin{bmatrix} 0\\ \vec{b} \end{bmatrix},\quad\vec{x},\vec{s}\succeq 0. \] This linear system is called the initial system for the LP.
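As a sanity check on the slack construction, here is a small sketch (with made-up data) verifying that setting $\vec{s} = \vec{b} - A\vec{x}$ and $z = \vec{c}^T\vec{x}$ satisfies the initial system exactly:

```python
# Sketch: computing slacks s = b - A x and z = c^T x for a feasible x,
# then checking the two block equations of the initial system.
# All data below is a made-up example.

def slacks(A, b, x):
    """s_j = b_j - a_j^T x, so that A x + s = b."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x))
            for row, bi in zip(A, b)]

A = [[1.0, 1.0], [1.0, 0.0]]
b = [4.0, 2.0]
c = [3.0, 2.0]
x = [1.0, 2.0]

s = slacks(A, b, x)                       # [1.0, 1.0]
z = sum(ci * xi for ci, xi in zip(c, x))  # 7.0

# First block row: z - c^T x = 0.
assert z - sum(ci * xi for ci, xi in zip(c, x)) == 0
# Remaining block rows: A x + s = b.
assert all(
    abs(sum(aij * xj for aij, xj in zip(row, x)) + sj - bi) < 1e-9
    for row, sj, bi in zip(A, s, b)
)
```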
The Simplex Algorithm
The simplex algorithm begins with a few observations about linear programs:
- if $\vec{c}^T\vec{x}$ has a maximum within the bounding convex polytope, then it attains the maximum at one or more vertices of the polytope,[^1]
- given a feasible solution at some vertex of the polytope, we can "walk" along the edges of the polytope in the direction of increasing objective value until we reach a maximizing vertex.
The crux of the algorithm is finding a starting vertex as needed by the second observation. This motivates the following definitions:
Definition. (Basic Solution and Variables)
Given a linear program $P$, a basic solution to $P$ is a solution to its corresponding initial system such that at least $n$ entries of the vector $[\vec{x}~\vec{s}]^T$ are zero. A basic feasible solution is a basic solution that also satisfies all constraints of $P$. Each zero variable in the solution is called non-basic, and the remaining variables are said to be basic.
As it turns out, every basic feasible solution corresponds to a vertex of the LP's convex polytope. So how do we compute an initial basic feasible solution?
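One standard answer (an assumption worth double-checking against the lecture notes): when $\vec{b} \succeq 0$, taking $\vec{x} = 0$ and $\vec{s} = \vec{b}$ gives an immediate basic feasible solution, since $\vec{x}$ contributes $n$ zero entries. A quick check of the definition, with made-up data:

```python
# Sketch: verifying the definition of a basic feasible solution.
# (x, s) must satisfy A x + s = b, x >= 0, s >= 0, and have at least
# n zero entries among [x s]. Data below is a made-up example.

def is_basic_feasible(A, b, x, s, tol=1e-9):
    n = len(x)
    # Equality constraints A x + s = b.
    for row, si, bi in zip(A, s, b):
        if abs(sum(aij * xj for aij, xj in zip(row, x)) + si - bi) > tol:
            return False
    # Non-negativity of x and s.
    if any(v < -tol for v in list(x) + list(s)):
        return False
    # At least n non-basic (zero) variables.
    zeros = sum(1 for v in list(x) + list(s) if abs(v) <= tol)
    return zeros >= n

A = [[1.0, 1.0], [1.0, 0.0]]
b = [4.0, 2.0]
# The all-slack choice x = 0, s = b is basic feasible when b >= 0.
print(is_basic_feasible(A, b, [0.0, 0.0], [4.0, 2.0]))  # True
# An interior feasible point has no zero entries, so it is not basic.
print(is_basic_feasible(A, b, [1.0, 1.0], [2.0, 1.0]))  # False
```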
Todo
Get some intuition for how this works. How do the slack variables interact here?
Simplex tableau
The simplex tableau is simply the augmented matrix corresponding to the initial system: \[ \left[\begin{array}{c c c | c} 1 & -\vec{c}^T & 0 & 0\\ \hline 0 & A & I & \vec{b} \end{array}\right]. \]
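A hedged sketch of how pivoting on this tableau might look, assuming $\vec{b} \succeq 0$ so the all-slack basis is an initial basic feasible solution; it picks the most negative reduced cost as the entering variable and uses no anti-cycling rule, which suffices for small non-degenerate examples but is not a production implementation:

```python
# Sketch: the simplex method operating directly on the augmented
# tableau [[1, -c^T, 0 | 0], [0, A, I | b]]. Columns are ordered
# (z, x_1..x_n, s_1..s_m), plus the right-hand side.

def simplex(A, b, c, max_iters=100):
    """Maximize c^T x s.t. A x <= b, x >= 0, assuming b >= 0."""
    m, n = len(A), len(c)
    # Build the tableau: objective row, then constraint rows.
    T = [[1.0] + [-ci for ci in c] + [0.0] * m + [0.0]]
    for i, row in enumerate(A):
        T.append([0.0] + list(row)
                 + [1.0 if j == i else 0.0 for j in range(m)]
                 + [b[i]])
    basis = [n + 1 + i for i in range(m)]  # slack columns start basic
    for _ in range(max_iters):
        # Entering variable: most negative coefficient in objective row.
        pivot_col = min(range(1, n + m + 1), key=lambda j: T[0][j])
        if T[0][pivot_col] >= -1e-9:
            break  # no improving direction: optimal
        # Leaving variable: ratio test over rows with positive pivot entry.
        ratios = [(T[i][-1] / T[i][pivot_col], i)
                  for i in range(1, m + 1) if T[i][pivot_col] > 1e-9]
        if not ratios:
            raise ValueError("LP is unbounded")
        _, pivot_row = min(ratios)
        basis[pivot_row - 1] = pivot_col
        # Row-reduce so the pivot column becomes a unit vector.
        p = T[pivot_row][pivot_col]
        T[pivot_row] = [v / p for v in T[pivot_row]]
        for i in range(len(T)):
            if i != pivot_row and abs(T[i][pivot_col]) > 1e-12:
                f = T[i][pivot_col]
                T[i] = [vi - f * vp for vi, vp in zip(T[i], T[pivot_row])]
    # Read x off the basic rows; non-basic variables are zero.
    x = [0.0] * n
    for i, col in enumerate(basis):
        if 1 <= col <= n:
            x[col - 1] = T[i + 1][-1]
    return T[0][-1], x

value, x = simplex([[1.0, 1.0], [1.0, 0.0]], [4.0, 2.0], [3.0, 2.0])
print(value, x)  # 10.0 [2.0, 2.0]
```

Each pivot moves the all-slack basic feasible solution to an adjacent vertex with no worse objective value, which is exactly the "walk along edges" from the observations above.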
Todo
Jelani's lecture on the simplex method has a very different flavor than the "worked example"-style explanations found elsewhere online. It seems much clearer to me how to implement the method following his explanation, but I am not confident a faithful implementation would be efficient in practice.
But I should figure out whether it really is a canonical way to implement the method.
References
- Wikipedia page on LP
- On the simplex algorithm:
- Wikipedia
- Notes from WSU
- CS270 lecture notes. The lecture recording is viewable by Berkeley students.
- DPV Ch. 7
- Interior point method
- Cornell University Computational Optimization Open Textbook
[^1]: Suppose the maximum is attained at some point $p$. Since the feasible region is a convex polytope, $p$ is a convex combination of its vertices, and since the objective is linear, $f(p)$ is the same convex combination of the objective values at those vertices; hence at least one vertex attains a value at least $f(p)$, i.e., the maximum.