Chaitanya's Random Pages

December 21, 2010

The Matrix Inversion Lemma

Filed under: mathematics — ckrao @ 4:32 am

I had thought the matrix inversion lemma was difficult to prove, but it is in fact not so tricky!

The lemma states that if A and C are square invertible matrices (and B, D are matrices so that A and BCD have the same dimensions), then

\displaystyle (A + BCD)^{-1} = A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1} \quad (*)

Thanks to [1], it is now easier for me to derive this formula than to remember it, the way much of mathematics should be. Other ways of arriving at the formula are by matrix blockwise elimination or inversion. See the Wikipedia entry on the Woodbury matrix identity (another name for the lemma) for more information.

Proof

1. Start with the equation (A + BC)x = b. We find x in terms of b either as x = (A + BC)^{-1} b or as follows.

2. Let y = Cx, giving us the two equations

\begin{array}{rclr}Ax + By &=& b &\quad \text{(1)}\\y &=& Cx &\quad \text{(2)}\end{array}

3. From (1) we obtain

\displaystyle x = A^{-1}(b-By). \quad \text{(3)}

4. Substituting (3) into (2) gives y = CA^{-1} (b-By) and rearranging this gives

\displaystyle y = (I + CA^{-1}B)^{-1}CA^{-1}b. \quad \text{(4)}

5. From (3) and (4) we end up with

\displaystyle x = A^{-1}b - A^{-1}By = \left[A^{-1} - A^{-1}B(I + CA^{-1}B)^{-1}CA^{-1}\right]b.

6. Since b was arbitrary, from step 1 we conclude that

\displaystyle (A + BC)^{-1} = A^{-1} - A^{-1}B(I + CA^{-1}B)^{-1}CA^{-1}.

7. To arrive at the slightly more complicated form (*) we replace C with CD and note that

\displaystyle(I + CDA^{-1}B)^{-1}CD = (I + CDA^{-1}B)^{-1}\left(C^{-1}\right)^{-1}D = (C^{-1} + DA^{-1}B)^{-1}D

(using the result (XY)^{-1} = Y^{-1}X^{-1}).

The matrix inversion lemma is especially useful when it is easy to invert A and C, e.g. if they are diagonal or have small dimension. The latter may occur in recursive formulas such as recursive least squares or the Kalman filter. The lemma is actually a special case of the Binomial inverse theorem, which applies when C is not invertible:

\displaystyle (A + BCD)^{-1} = A^{-1} - A^{-1}BC(C + CDA^{-1}BC)^{-1}CDA^{-1} \quad (*)

A couple more special cases of the matrix inversion lemma follow:

1. Sherman-Morrison formula (B and D replaced by column vectors u and v, C replaced by the identity):

\displaystyle \left(A + uv^T\right)^{-1}= A^{-1} - \frac{A^{-1}uv^TA^{-1}}{1 + v^TA^{-1}u}

2.

\displaystyle (A + B)^{-1} = A^{-1} - A^{-1}B\left(B + BA^{-1}B\right)^{-1}BA^{-1}

3.

\displaystyle (I + A)^{-1} = I - (I + A)^{-1}A

Reference:

[1] S. Boyd and L. Vandenberghe, Convex Optimization (Appendix C.4.3), Cambridge University Press, 2004

Advertisements

2 Comments »

  1. nice proof 🙂

    Comment by ekaveera — May 21, 2012 @ 2:54 pm | Reply

  2. Good proof

    Comment by jrsadow — September 14, 2016 @ 8:57 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: