$\newcommand{\br}{\\}$ $\newcommand{\R}{\mathbb{R}}$ $\newcommand{\Q}{\mathbb{Q}}$ $\newcommand{\Z}{\mathbb{Z}}$ $\newcommand{\N}{\mathbb{N}}$ $\newcommand{\C}{\mathbb{C}}$ $\newcommand{\P}{\mathbb{P}}$ $\newcommand{\F}{\mathbb{F}}$ $\newcommand{\L}{\mathcal{L}}$ $\newcommand{\spa}[1]{\text{span}(#1)}$ $\newcommand{\dist}[1]{\text{dist}(#1)}$ $\newcommand{\max}[1]{\text{max}(#1)}$ $\newcommand{\min}[1]{\text{min}(#1)}$ $\newcommand{\supr}[1]{\text{sup}(#1)}$ $\newcommand{\infi}[1]{\text{inf}(#1)}$ $\newcommand{\argmax}[1]{\underset{#1}{\text{argmax }}}$ $\newcommand{\argmin}[1]{\underset{#1}{\text{argmin }}}$ $\newcommand{\set}[1]{\left\{#1\right\}}$ $\newcommand{\emptyset}{\varnothing}$ $\newcommand{\otherwise}{\text{ otherwise }}$ $\newcommand{\if}{\text{ if }}$ $\newcommand{\proj}{\text{proj}}$ $\newcommand{\union}{\cup}$ $\newcommand{\intercept}{\cap}$ $\newcommand{\abs}[1]{\left| #1 \right|}$ $\newcommand{\norm}[1]{\left\lVert#1\right\rVert}$ $\newcommand{\pare}[1]{\left(#1\right)}$ $\newcommand{\t}[1]{\text{ #1 }}$ $\newcommand{\head}{\text H}$ $\newcommand{\tail}{\text T}$ $\newcommand{\d}{\text d}$ $\newcommand{\limu}[2]{\underset{#1 \to #2}\lim}$ $\newcommand{\der}[2]{\frac{\d #1}{\d #2}}$ $\newcommand{\derw}[2]{\frac{\d #1^2}{\d^2 #2}}$ $\newcommand{\pder}[2]{\frac{\partial #1}{\partial #2}}$ $\newcommand{\pderw}[2]{\frac{\partial^2 #1}{\partial #2^2}}$ $\newcommand{\pderws}[3]{\frac{\partial^2 #1}{\partial #2 \partial #3}}$ $\newcommand{\inv}[1]{{#1}^{-1}}$ $\newcommand{\inner}[2]{\langle #1, #2 \rangle}$ $\newcommand{\nullity}[1]{\text{nullity}(#1)}$ $\newcommand{\rank}[1]{\text{rank }#1}$ $\newcommand{\nullspace}[1]{\mathcal{N}\pare{#1}}$ $\newcommand{\range}[1]{\mathcal{R}\pare{#1}}$ $\newcommand{\var}[1]{\text{var}(#1)}$ $\newcommand{\tr}[1]{\text{tr}(#1)}$ $\newcommand{\oto}{\text{ one-to-one }}$ $\newcommand{\ot}{\text{ onto }}$ $\newcommand{\ceil}[1]{\lceil#1\rceil}$ $\newcommand{\floor}[1]{\lfloor#1\rfloor}$ $\newcommand{\Re}[1]{\text{Re}(#1)}$ $\newcommand{\Im}[1]{\text{Im}(#1)}$ $\newcommand{\dom}[1]{\text{dom}(#1)}$ $\newcommand{\fnext}[1]{\overset{\sim}{#1}}$ $\newcommand{\transpose}[1]{{#1}^{\text{T}}}$ $\newcommand{\b}[1]{\boldsymbol{#1}}$ $\newcommand{\None}[1]{}$ $\newcommand{\Vcw}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Vce}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Vcr}[4]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \end{bmatrix}}$ $\newcommand{\Vct}[5]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \end{bmatrix}}$ $\newcommand{\Vcy}[6]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \end{bmatrix}}$ $\newcommand{\Vcu}[7]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \br #7 \end{bmatrix}}$ $\newcommand{\vcw}[2]{\begin{matrix} #1 \br #2 \end{matrix}}$ $\newcommand{\vce}[3]{\begin{matrix} #1 \br #2 \br #3 \end{matrix}}$ $\newcommand{\vcr}[4]{\begin{matrix} #1 \br #2 \br #3 \br #4 \end{matrix}}$ $\newcommand{\vct}[5]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \end{matrix}}$ $\newcommand{\vcy}[6]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \end{matrix}}$ $\newcommand{\vcu}[7]{\begin{matrix} #1 \br #2 \br #3 \br #4 \br #5 \br #6 \br #7 \end{matrix}}$ $\newcommand{\Mqw}[2]{\begin{bmatrix} #1 & #2 \end{bmatrix}}$ $\newcommand{\Mqe}[3]{\begin{bmatrix} #1 & #2 & #3 \end{bmatrix}}$ $\newcommand{\Mqr}[4]{\begin{bmatrix} #1 & #2 & #3 & #4 \end{bmatrix}}$ $\newcommand{\Mqt}[5]{\begin{bmatrix} #1 & #2 & #3 & #4 & #5 \end{bmatrix}}$ $\newcommand{\Mwq}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Meq}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Mrq}[4]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \end{bmatrix}}$ $\newcommand{\Mtq}[5]{\begin{bmatrix} #1 \br #2 \br #3 \br #4 \br #5 \end{bmatrix}}$ $\newcommand{\Mqw}[2]{\begin{bmatrix} #1 & #2 \end{bmatrix}}$ $\newcommand{\Mwq}[2]{\begin{bmatrix} #1 \br #2 \end{bmatrix}}$ $\newcommand{\Mww}[4]{\begin{bmatrix} #1 & #2 \br #3 & #4 \end{bmatrix}}$ $\newcommand{\Mqe}[3]{\begin{bmatrix} #1 & #2 & #3 \end{bmatrix}}$ $\newcommand{\Meq}[3]{\begin{bmatrix} #1 \br #2 \br #3 \end{bmatrix}}$ $\newcommand{\Mwe}[6]{\begin{bmatrix} #1 & #2 & #3\br #4 & #5 & #6 \end{bmatrix}}$ $\newcommand{\Mew}[6]{\begin{bmatrix} #1 & #2 \br #3 & #4 \br #5 & #6 \end{bmatrix}}$ $\newcommand{\Mee}[9]{\begin{bmatrix} #1 & #2 & #3 \br #4 & #5 & #6 \br #7 & #8 & #9 \end{bmatrix}}$
Definition: Problems With Equality and Inequality Constraints

$$\begin{align*} &\text{minimize }& f(\b{x}) \br &\text{subject to }& \b{h}(\b{x}) = \b{0}\br && \b{g}(\b{x}) \leq \b{0} \end{align*}$$

where $f:\R^n \to \R$, $\b{h}:\R^n \to \R^m$, $m \leq n $, and $\b{g}: \R^n \to \R^p $.

Definition: Regular Point

Let $\b{x}^* $ satisfy $\b{h}(\b{x}^*) = \b{0} $, $\b{g}(\b{x}^*) \leq \b{0} $, and let $J(\b{x}^*) $ be the index set of active inequality constraints:

$$J(\b{x}^*) \triangleq \set{ j : g_j(\b{x}^*) = 0 } $$

Then, we say that $\b{x}^* $ is a regular point if the vectors

$$\nabla h_i(\b{x}^*), \nabla g_j(\b{x}^*), 1 \leq i \leq m, j \in J(\b{x}^*) $$

are linearly independent.

Definition: Active and Inactive Inequality Constraint

An inequality constaint $g_j(\b{x}) \leq 0$ is said to be active at $\b{x}^* $ if $g_j(\b{x}^*) = 0 $. It is inactive at $\b{x}^* $ if $g_j(\b{x}^*) \leq 0 $.

Theorem : Karush-Kuhn-Tucker (KKT) Theorem

Let $f, \b{h}, \b{g} \in \mathcal{ C }^1 $. Let $\b{x}^* $ be a regular point and a local minimizer for the problem of minimizing $f $ subject to $\b{h}(\b{x}) = \b{0} $, $\b{g}(\b{x}) \leq \b{0} $. Then, there exist $\b{\lambda}^* \in \R^m $ and $\b{\mu}^* \in \R^p $ such that:

  1. $\b{\mu}^* \geq \b{0} $
  2. $Df(\b{x}^*) + \transpose{ \b{\lambda}^* } D \b{h}(\b{x}^*) + \transpose{ \b{\mu}^* } D \b{g}(\b{x}^*) = \transpose{ \b{0}}$
  3. $\transpose{ \b{\mu}^* } \b{g}(\b{x}^*) = 0$

We call $\b{\lambda}^* $ the Lagrange multiplier vector and call $\b{\mu}^*$ the Karush-Kuhn-Tucker (KKT) multiplier vector. We refer to their components as the Lagrange multipliers and the Karush-Kuhn-Tucker (KKT) multipliers, respectively.

Note

Observe that $\b{\mu}_j^* $ and $g_j(\b{x}^*) \leq 0 $. Therefore, the condition

$$\transpose{ \b{\mu}^* } \b{g}(\b{x}^*) = \mu^*_1g_1(\b{x}^*) + \mu^*_2g_2(\b{x}^*) + … + \mu^*_{ p }g_{ p }(\b{x}^*) = 0$$

implies that if $g_j(\b{x}^*) < 0 $, then $\mu_j^* = 0 $; that is, for all $j \not \in J(\b{x}^*) $, we have $\mu_j^* = 0 $. In other words, the KKT multipliers $\mu_j^* $ corresponding to inactive constraints are zero.