Chaitanya's Random Pages

April 2, 2011

Some notes on the Schur Complement

Filed under: mathematics — ckrao @ 6:11 am

In matrix theory the Schur complement of a square submatrix A in the bigger square matrix

\displaystyle M = \left[\begin{array}{cc}A & B\\C&D \end{array}\right]

is defined as S = D-CA^{-1}B. Here we assume A and D are square matrices and that A is invertible. Similarly the Schur complement of D in M is {A - BD^{-1}C} (when D is invertible). The formulas are reasonably easy to recall: starting from the corner diagonally opposite the chosen submatrix, go clockwise around the block matrix, remembering to invert the chosen submatrix when it is reached.
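
To make the definition concrete, here is a small NumPy sketch that builds M from randomly chosen 2x2 blocks and forms both Schur complements (the block sizes, seed and variable names are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
C, D = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
M = np.block([[A, B], [C, D]])             # the 4x4 block matrix

S_A = D - C @ np.linalg.solve(A, B)        # Schur complement of A in M
S_D = A - B @ np.linalg.solve(D, C)        # Schur complement of D in M
print(S_A)
print(S_D)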

Here are some properties and uses of the Schur complement.

1. It comes up in Gaussian elimination in the solution to a system of equations:

To eliminate x in the system of equations Ax + By = u, Cx + Dy = v, we multiply the first equation by {CA^{-1}} and subtract it from the second equation to obtain (D-CA^{-1}B)y = Sy = v - CA^{-1}u.
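
As a quick numerical sanity check of this elimination step (random blocks, arbitrary sizes and names), solving the Schur complement system for y and back-substituting for x should match a direct solve of the full system:

import numpy as np

rng = np.random.default_rng(1)
n = 3
A, B, C, D = (rng.standard_normal((n, n)) for _ in range(4))
u, v = rng.standard_normal(n), rng.standard_normal(n)

M = np.block([[A, B], [C, D]])
xy = np.linalg.solve(M, np.concatenate([u, v]))        # direct solve of the full system

S = D - C @ np.linalg.solve(A, B)                      # Schur complement of A
y = np.linalg.solve(S, v - C @ np.linalg.solve(A, u))  # the eliminated system S y = v - C A^{-1} u
x = np.linalg.solve(A, u - B @ y)                      # back-substitute into A x + B y = u

print(np.allclose(np.concatenate([x, y]), xy))         # True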

2. The Schur determinant formula:

\displaystyle \det \left[\begin{array}{cc}A & B\\C&D \end{array}\right] = \det A \det S

This is proved by taking determinants of both sides of the matrix factorisation

\displaystyle \left[\begin{array}{cc}A & B\\C&D \end{array}\right] = \left[\begin{array}{cc}I & 0\\CA^{-1}&I \end{array}\right]\left[\begin{array}{cc}A & B\\ 0 &D-CA^{-1}B \end{array}\right].

The factorisation also tells us that when the top left block A (or, by symmetry, the bottom right block D) is invertible, M is invertible if and only if the corresponding Schur complement is invertible.
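
A short numerical check of the determinant formula on a random example (assuming A is invertible, which holds almost surely here):

import numpy as np

rng = np.random.default_rng(2)
A, B, C, D = (rng.standard_normal((3, 3)) for _ in range(4))
M = np.block([[A, B], [C, D]])
S = D - C @ np.linalg.solve(A, B)          # Schur complement of A

print(np.isclose(np.linalg.det(M), np.linalg.det(A) * np.linalg.det(S)))  # True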

3. The Guttman rank additivity formula, valid when A is invertible [1, p14]:

\displaystyle {\rm rank} \left[\begin{array}{cc}A & B\\C&D \end{array}\right] = {\rm rank} A + {\rm rank} S
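
For example, here is a random sketch in which D is constructed so that the Schur complement has rank 1 (A is almost surely invertible):

import numpy as np

rng = np.random.default_rng(3)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))
low_rank = np.outer(rng.standard_normal(3), rng.standard_normal(3))  # a rank-1 matrix
D = C @ np.linalg.solve(A, B) + low_rank   # chosen so that S = D - C A^{-1} B = low_rank

M = np.block([[A, B], [C, D]])
S = D - C @ np.linalg.solve(A, B)
print(np.linalg.matrix_rank(M))                              # 4
print(np.linalg.matrix_rank(A) + np.linalg.matrix_rank(S))   # 3 + 1 = 4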

4. It arises in block matrix inversion: if A and its Schur complement S = D - CA^{-1}B are invertible, then

\displaystyle \left[\begin{array}{cc}A & B\\C&D \end{array}\right]^{-1} = \left[\begin{array}{cc}A^{-1}+A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1}\\ -S^{-1}CA^{-1} & S^{-1} \end{array}\right].

In particular the (2,2) block of the inverse is the inverse of the Schur complement of A.
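
A random numerical check that the bottom right block of M^{-1} equals S^{-1} (block sizes and seed chosen arbitrarily):

import numpy as np

rng = np.random.default_rng(4)
A, B, C, D = (rng.standard_normal((3, 3)) for _ in range(4))
M = np.block([[A, B], [C, D]])
S = D - C @ np.linalg.solve(A, B)                    # Schur complement of A

Minv = np.linalg.inv(M)
print(np.allclose(Minv[3:, 3:], np.linalg.inv(S)))   # True: the (2,2) block of M^{-1} is S^{-1}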

5. The addition theorem for the Schur complement of Hermitian matrices ([1, p28]):

Firstly we define the inertia of a matrix. Let {\rm In}(M) be the triple (p,q,z) of non-negative integers counting respectively the positive, negative and zero eigenvalues of the Hermitian matrix M ({M = M^*}). Then, provided the leading block A is nonsingular,

\displaystyle {\rm In}(M) = {\rm In}(A) + {\rm In}(S).
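
Here is a numerical sketch for a randomly generated real symmetric M (so B sits opposite its transpose and A is almost surely nonsingular); the inertia is computed by counting eigenvalue signs, with a tolerance for the zero count:

import numpy as np

def inertia(H, tol=1e-9):
    # count positive, negative and (numerically) zero eigenvalues of Hermitian H
    w = np.linalg.eigvalsh(H)
    return (int(np.sum(w > tol)), int(np.sum(w < -tol)), int(np.sum(np.abs(w) <= tol)))

rng = np.random.default_rng(5)
X = rng.standard_normal((3, 3))
A = X + X.T                                 # random symmetric A
B = rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))
D = Y + Y.T                                 # random symmetric D
M = np.block([[A, B], [B.T, D]])            # Hermitian (here real symmetric) M
S = D - B.T @ np.linalg.solve(A, B)         # Schur complement of A

print(inertia(M))                           # equals the elementwise sum below
print(tuple(a + s for a, s in zip(inertia(A), inertia(S))))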

6. It comes up in the minimisation of quadratic forms [3, App A5.5]:

If A is positive definite, the quadratic form u^T Au + 2v^TB^Tu + v^T Cv in the variable u (v is a constant vector) has minimum value {v^T S v}, achieved when u = -A^{-1}Bv. Here S = C - B^TA^{-1}B is the Schur complement of A in the symmetric block matrix \left[\begin{array}{cc}A & B\\ B^T & C \end{array}\right]. This follows from completing the square:

\displaystyle u^T Au + 2v^TB^Tu + v^T Cv = ( u + A^{-1}Bv)^T A(u + A^{-1}Bv) + v^T S v.
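
A numerical sketch with a random positive definite A (the helper q and the other names are purely illustrative): the form evaluated at u = -A^{-1}Bv equals v^T S v, and nearby points do no better.

import numpy as np

rng = np.random.default_rng(6)
X = rng.standard_normal((3, 3))
A = X @ X.T + np.eye(3)                    # a random positive definite A
B = rng.standard_normal((3, 3))
Y = rng.standard_normal((3, 3))
Cmat = Y + Y.T                             # a random symmetric C
v = rng.standard_normal(3)

def q(u):                                  # the quadratic form of point 6
    return u @ A @ u + 2 * v @ B.T @ u + v @ Cmat @ v

S = Cmat - B.T @ np.linalg.solve(A, B)     # Schur complement of A in [[A, B], [B^T, C]]
u_star = -np.linalg.solve(A, B @ v)        # the claimed minimiser

print(np.isclose(q(u_star), v @ S @ v))                                               # True
print(all(q(u_star + 0.1 * rng.standard_normal(3)) >= q(u_star) for _ in range(5)))   # True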

7. A least squares application of 6 is the following:

Let x and y be two zero-mean random vectors. The covariance of the error in linearly estimating x from y as Ky (with K chosen to minimise the expectation E\|x-Ky\|^2), assuming the covariance R_y of y is positive definite, is the Schur complement of R_y in the joint covariance matrix

\displaystyle \Sigma = E \left[ \begin{array}{c} x\\y \end{array}\right] \left[\begin{array}{cc}x^T & y^T \end{array}\right] = \left[\begin{array}{cc}R_x & R_{xy}\\ R_{yx}& R_y \end{array}\right],

that is, {\rm var}[x | y] = R_x - R_{xy} R_y^{-1}R_{yx}.

Similarly the covariance of the error in estimating y from x is the Schur complement of R_x in \Sigma:

\displaystyle {\rm var}[y | x] = R_y - R_{yx} R_x^{-1}R_{xy}.
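
A Monte Carlo sketch of this: sample a joint Gaussian with a random positive definite covariance \Sigma, form the best linear estimate Ky of x with K = R_{xy}R_y^{-1}, and compare the sample error covariance with the Schur complement of R_y (agreement is only up to sampling error; the sizes and seed are arbitrary):

import numpy as np

rng = np.random.default_rng(7)
n = 2
G = rng.standard_normal((2 * n, 2 * n))
Sigma = G @ G.T + np.eye(2 * n)                 # a random positive definite joint covariance
Rx, Rxy = Sigma[:n, :n], Sigma[:n, n:]
Ryx, Ry = Sigma[n:, :n], Sigma[n:, n:]

K = Rxy @ np.linalg.inv(Ry)                     # best linear estimator: x_hat = K y
schur = Rx - Rxy @ np.linalg.solve(Ry, Ryx)     # Schur complement of R_y in Sigma

samples = rng.multivariate_normal(np.zeros(2 * n), Sigma, size=200_000)
x, y = samples[:, :n], samples[:, n:]
err = x - y @ K.T                               # estimation error for each sample
print(np.cov(err.T, bias=True))                 # close to the Schur complement...
print(schur)                                    # ...up to Monte Carlo error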

8. The above leads to a matrix version of the Cauchy-Schwarz inequality [4]:

Since the error covariance in 7 is, like any covariance matrix, non-negative definite, we have R_y - R_{yx} R_x^{-1}R_{xy} \succeq 0. With the matrix-valued inner product of random vectors in \mathbb{R}^n defined as \langle x, y \rangle := E xy^T and \|x\|^2 := \langle x, x \rangle, this becomes

\displaystyle \|y\|^2 - \langle y,x \rangle \|x\|^{-2} \langle x, y \rangle \succeq 0.
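
As a numerical sketch, for a random joint covariance matrix (partitioned as in 7) the quantity R_y - R_{yx}R_x^{-1}R_{xy} indeed has no negative eigenvalues:

import numpy as np

rng = np.random.default_rng(8)
n = 3
G = rng.standard_normal((2 * n, 2 * n + 2))
Sigma = G @ G.T                                   # a random covariance matrix
Rx, Rxy = Sigma[:n, :n], Sigma[:n, n:]
Ryx, Ry = Sigma[n:, :n], Sigma[n:, n:]

gap = Ry - Ryx @ np.linalg.solve(Rx, Rxy)         # the left hand side above
print(np.all(np.linalg.eigvalsh(gap) >= -1e-10))  # True: all eigenvalues are (numerically) non-negative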

References

[1] F. Zhang, "The Schur Complement and Its Applications", Springer, 2005.

[2] "Block matrix decompositions", https://ccrma.stanford.edu/~jos/lattice/Block_matrix_decompositions.html

[3] S. Boyd and L. Vandenberghe, "Convex Optimization", Cambridge University Press, 2004.

[4] T. Kailath, A. H. Sayed and B. Hassibi, "Linear Estimation", Prentice Hall, 2000.

[5] "Schur complement", Wikipedia.
