Chaitanya's Random Pages

June 10, 2012

Outline proof of the extreme value theorem in statistics

Filed under: mathematics — ckrao @ 9:15 am

Recently I read a proof in [1] of the main theorem of extreme value statistics: the Fisher–Tippett–Gnedenko theorem. In this post I give an outline.

Here we are interested in the maximum of many independent and identically distributed random variables X_i with distribution function F. Let x^* = \sup \{x: F(x) < 1\} which may be infinite. Then as n \rightarrow \infty,

{\rm Pr}(\max(X_1, X_2, \ldots, X_n) \leq x) = F^n(x) tends to 0 if x < x^* and equals 1 if x \geq x^*.

Therefore \displaystyle \max(X_1, X_2, \ldots, X_n) converges in probability to x^* as n \rightarrow \infty.
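The degeneracy is easy to see numerically. The following is a minimal sketch, assuming the X_i are uniform on (0, 1) so that x^* = 1; the function names are illustrative and not from [1].

```python
def uniform_cdf(x):
    """CDF of the uniform distribution on (0, 1); here x^* = 1."""
    return min(max(x, 0.0), 1.0)


def max_cdf(F, x, n):
    """Pr(max(X_1, ..., X_n) <= x) = F(x)**n for i.i.d. X_i with CDF F."""
    return F(x) ** n


# Just below x^* the probability collapses to 0; at x^* it stays at 1.
for n in (1, 10, 100, 1000, 10000):
    print(n, max_cdf(uniform_cdf, 0.99, n), max_cdf(uniform_cdf, 1.0, n))
```

Even at x = 0.99, very close to x^*, the probability that the maximum lies below x vanishes as n grows, so the unnormalised limit carries no information about the shape of the tail.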

To avoid this degenerate limit, which arises for every distribution F, we must normalise the maximum.

To this end, suppose there exist real numbers a_n >0, b_n such that

\displaystyle \frac{\max(X_1, X_2, \ldots, X_n) - b_n}{a_n} approaches a non-degenerate limiting distribution.

In other words, there exists a distribution function G(x) such that

\displaystyle \lim_{n \rightarrow \infty} F^n(a_n x + b_n) = G(x).
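A standard example, not taken from the proof itself, is the unit exponential distribution F(x) = 1 - e^{-x} with the assumed constants a_n = 1 and b_n = \log n, for which the limit is the Gumbel distribution G(x) = \exp(-e^{-x}). The sketch below compares F^n(a_n x + b_n) with G(x) at a few points.

```python
import math


def exp_cdf(x):
    """CDF of the unit exponential distribution."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


def gumbel_cdf(x):
    """Candidate limit G(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def normalised_max_cdf(x, n):
    """F^n(a_n x + b_n) with the assumed constants a_n = 1, b_n = log n."""
    return exp_cdf(x + math.log(n)) ** n


points = (-1.0, 0.0, 2.0)
for n in (10, 1000, 100000):
    print(n, [round(normalised_max_cdf(x, n), 6) for x in points])
print("G", [round(gumbel_cdf(x), 6) for x in points])
```

Here F^n(x + \log n) = (1 - e^{-x}/n)^n, so the convergence is the familiar limit (1 - c/n)^n \rightarrow e^{-c} with c = e^{-x}.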

Taking logarithms of both sides, this is equivalent to

\displaystyle \lim_{n \rightarrow \infty} n \log F(a_n x + b_n) = \log G(x)

This requires F(a_n x + b_n) \rightarrow 1 as n \rightarrow \infty. Using \log x \approx x -1 for x close to 1, the above is also equivalent to

\displaystyle \lim_{n \rightarrow \infty} \frac{1}{n(1 - F(a_n x + b_n))} = \frac{1}{- \log G(x)}. \quad \quad (1)
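As a concrete check of (1), consider the Pareto distribution F(x) = 1 - x^{-\alpha} for x \geq 1 with \alpha > 0, together with the constants a_n = n^{1/\alpha}, b_n = 0. This example and these constants are an illustration chosen here, not part of the proof in [1]. The limit is then G(x) = \exp(-x^{-\alpha}) for x > 0, and both sides of (1) can be computed in closed form:

```latex
\begin{aligned}
n\bigl(1 - F(n^{1/\alpha} x)\bigr) &= n \bigl(n^{1/\alpha} x\bigr)^{-\alpha} = x^{-\alpha} \quad (x \geq n^{-1/\alpha}),\\
\frac{1}{n\bigl(1 - F(a_n x + b_n)\bigr)} &= x^{\alpha},\\
\frac{1}{-\log G(x)} &= \frac{1}{x^{-\alpha}} = x^{\alpha}.
\end{aligned}
```

In this example the two sides of (1) agree exactly once n is large enough that x \geq n^{-1/\alpha}, so the limit is attained rather than merely approached.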

Next we use the following definition.

A non-decreasing function f has left-continuous inverse f^{\leftarrow} defined by

\displaystyle f^{\leftarrow}(x) := \inf\{y: f(y) \geq x\}.

One can use this definition to prove

Lemma 1: If f_n and g are non-decreasing functions with f_n(x) \rightarrow g(x) at each continuity point x of g, then f_n^{\leftarrow}(x) \rightarrow g^{\leftarrow}(x) at each continuity point x of g^{\leftarrow}.
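The left-continuous inverse can be approximated numerically by bisection. The sketch below uses the convention \inf\{y: f(y) \geq x\}, which is the one applied to U later in this post; the helper name `left_inverse` and the bracketing interval [lo, hi] are assumptions made for illustration.

```python
import math


def left_inverse(f, x, lo, hi, tol=1e-12):
    """Approximate inf{y in [lo, hi]: f(y) >= x} for a non-decreasing f.

    Bisection keeps the invariant f(lo) < x <= f(hi), so hi always lies
    in the set {y: f(y) >= x} and shrinks towards its infimum.
    """
    if not (f(lo) < x <= f(hi)):
        raise ValueError("need f(lo) < x <= f(hi)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) >= x:
            hi = mid
        else:
            lo = mid
    return hi


def exp_cdf(y):
    """CDF of the unit exponential distribution."""
    return 1.0 - math.exp(-y) if y > 0 else 0.0


# For a step function the inverse jumps: floor^{<-}(x) is the ceiling of x.
print(left_inverse(math.floor, 2.0, -10.0, 10.0))
print(left_inverse(math.floor, 2.5, -10.0, 10.0))
# For a continuous, strictly increasing f it is the ordinary inverse.
print(left_inverse(exp_cdf, 0.5, 0.0, 50.0), math.log(2.0))
```

The step-function case shows why the inverse is left-continuous: as x increases to 2 from below the value stays at 2, and it jumps to 3 only for x just above 2.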

Next we claim that (1) is equivalent to

\displaystyle \frac{U(nx) - b_n}{a_n} \rightarrow G^{\leftarrow}\left(e^{-1/x}\right) \quad \quad (2)

where U is the left-continuous inverse of \frac{1}{1-F}. To see this, let V(x) = \frac{1}{1 - F(x)}. Then by the definition of U, U(x) = \inf \{ y: V(y) \geq x \}. Then for any n \in \mathbb{N},

U(nx) = \inf \left\{ y: \frac{1}{n(1-F(y))} \geq x \right\}

and so

\begin{array}{lcl} \frac{U(nx) - b_n}{a_n} &=& \inf \left\{\frac{y - b_n}{a_n}: \frac{1}{n(1 - F(y))} \geq x \right\}\\&=& \inf \left\{z: \frac{1}{n(1 - F(a_n z + b_n))} \geq x \right\}\end{array}

By (1) and Lemma 1, as n \rightarrow \infty this tends to

\begin{array}{lcl} & & \inf \left\{ z: \frac{1}{-\log G(z)} \geq x \right\}\\& = & \inf \left\{ z: \log G(z) \geq \frac{-1}{x} \right\} \\ &=& \inf \left\{ z: G(z) \geq e^{-1/x} \right\}\\ &=& G^{\leftarrow} \left( e^{-1/x} \right), \end{array}

proving the claim. We can also write

\displaystyle \lim_{t \rightarrow \infty} \frac{U(tx) - b(t)}{a(t)} = G^{\leftarrow}\left(e^{-1/x}\right) =: D(x) \quad \quad (3)

where a(t) := a_{[t]}, b(t) := b_{[t]} and [t] denotes the integer part of t.
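To make (3) concrete, here is a small numerical sketch for two textbook cases, chosen here for illustration and not taken from [1]. For the unit exponential distribution, 1/(1-F(y)) = e^y, so U(t) = \log t, and with a_n = 1, b_n = \log n the limit is D(x) = \log x. For a Pareto distribution with index \alpha, 1/(1-F(y)) = y^{\alpha}, so U(t) = t^{1/\alpha}, and with a_n = n^{1/\alpha}, b_n = 0 the limit is D(x) = x^{1/\alpha}.

```python
import math

ALPHA = 2.0  # assumed Pareto index, for illustration only


def normalised_U(U, a, b, t, x):
    """(U(t x) - b(t)) / a(t) with a(t) = a_[t] and b(t) = b_[t]."""
    n = math.floor(t)
    return (U(t * x) - b(n)) / a(n)


# Unit exponential: U(t) = log t, a_n = 1, b_n = log n.
def exp_U(t):
    return math.log(t)


def exp_a(n):
    return 1.0


def exp_b(n):
    return math.log(n)


# Pareto with index ALPHA: U(t) = t^(1/ALPHA), a_n = n^(1/ALPHA), b_n = 0.
def par_U(t):
    return t ** (1.0 / ALPHA)


def par_a(n):
    return n ** (1.0 / ALPHA)


def par_b(n):
    return 0.0


x = 3.0
for t in (10.5, 1000.5, 100000.5):
    print(t, normalised_U(exp_U, exp_a, exp_b, t, x),
          normalised_U(par_U, par_a, par_b, t, x))
print("limits", math.log(x), x ** (1.0 / ALPHA))
```

Non-integer values of t are used deliberately, to show that replacing t by its integer part in a(t) and b(t) does not affect the limit.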

We are now ready to prove the main theorem of extreme value theory.

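Theorem (Fisher, Tippett, Gnedenko): The possible non-degenerate limit distributions G above are exactly those of the form G_{\gamma}(ax + b) with a > 0 and b real, where

\displaystyle G_{\gamma}(x) = \exp\left( -(1 + \gamma x)^{-1/\gamma} \right), \quad 1 + \gamma x > 0, \quad \gamma \in \mathbb{R},

and the right side is interpreted as its limiting value \exp \left( - e^{-x}\right) if \gamma = 0.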


This will involve numerous substitutions but the main idea is to arrive at a differential equation that can be solved to obtain the above. Suppose 1 is a continuity point of D. Then for any continuity point x > 0,

\displaystyle \lim_{t \rightarrow \infty} \frac{U(tx) - U(t)}{a(t)} = D(x) - D(1) =: E(x). \quad \quad (4)

We can write

\displaystyle \frac{ U(txy) - U(t)}{a(t)} = \frac{U(txy) - U(ty)}{a(ty)} \frac{a(ty)}{a(t)} + \frac{U(ty) - U(t)}{a(t)}.

The claim is that both \frac{a(ty)}{a(t)} and \frac{U(ty) - U(t)}{a(t)} have limits as t \rightarrow \infty. If they had more than one limit point, say A_1, A_2 and B_1, B_2 respectively, then for i = 1, 2, letting t \rightarrow \infty along the corresponding subsequences in the identity above and using (4) gives us

E(xy) = E(x) A_i + B_i.

Subtracting gives E(x) (A_1 - A_2) = B_2 - B_1 which implies A_1 = A_2, B_2 = B_1 as we know E(x) is non-constant (since we seek a non-degenerate solution).

We conclude that

\displaystyle E(xy) = E(x) A(y) + E(y). \quad \quad (5)

This is a functional equation that we wish to solve. We let s:= \log x, t := \log y, H(x) := E(e^x) to obtain

 \displaystyle H(t+ s) = H(s) A(e^t) + H(t),

which using H(0) = E(1) = 0 implies

\displaystyle \frac{H(t+s) - H(t)}{s} = \frac{H(s) - H(0)}{s} A(e^t). \quad \quad (6)

Since H is monotone (following from the monotonicity of D), it is differentiable at some t. By (6) it is differentiable at all t. Indeed from (6) we obtain

\displaystyle H'(t) = H'(0) A(e^t) \neq 0. \quad \quad (7)

Let Q(t) = H(t)/H'(0). Then Q(0) = 0, Q'(0) = 1.

From (6) and (7),

\displaystyle Q(t+s) - Q(t) = Q(s) A(e^t) = Q(s) H'(t)/H'(0) = Q(s) Q'(t). \quad \quad (8)

Similarly, Q(s+t) - Q(s) = Q(t) Q'(s) and upon subtraction from (8) we obtain Q(s) - Q(s)Q'(t) = Q(t) - Q(t)Q'(s) from which

\displaystyle \frac{Q(s)}{s} (Q'(t) - 1) = Q(t) \frac{Q'(s) - 1}{s}.

Taking the limit as s \rightarrow 0 and using Q'(0) = H'(0)/H'(0) = 1 gives the following differential equation for Q.

\displaystyle Q'(t) -1 = Q(t) Q''(0), Q(0) = 0, Q'(0) = 1 \quad \quad (9)

To solve (9), differentiate both sides with respect to t: from Q''(t) = Q'(t) Q''(0) we see that

\displaystyle (\log Q')' (t) = Q''(0) =: \gamma.

Hence Q'(t) = e^{\gamma t} and since Q(0) = 0, Q(t) = \int_0^t e^{\gamma s}\ ds.
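As a quick numerical sanity check, added here and not part of the argument in [1], one can confirm that Q(t) = \int_0^t e^{\gamma s}\ ds satisfies both the functional equation (8) and the differential equation (9). The function names below are illustrative.

```python
import math


def Q(t, gamma):
    """Q(t) = integral_0^t exp(gamma * s) ds; equals t in the gamma = 0 limit."""
    if gamma == 0.0:
        return t
    return math.expm1(gamma * t) / gamma


def Q_prime(t, gamma):
    """Q'(t) = exp(gamma * t)."""
    return math.exp(gamma * t)


def residual_8(s, t, gamma):
    """Left side minus right side of (8): Q(t+s) - Q(t) - Q(s) Q'(t)."""
    return Q(t + s, gamma) - Q(t, gamma) - Q(s, gamma) * Q_prime(t, gamma)


def residual_9(t, gamma):
    """Left side minus right side of (9), using Q''(0) = gamma."""
    return Q_prime(t, gamma) - 1.0 - gamma * Q(t, gamma)


for gamma in (-0.5, 0.0, 0.7):
    print(gamma, residual_8(0.4, 1.3, gamma), residual_9(1.3, gamma))
```

Both residuals should vanish up to floating-point rounding for any choice of s, t and \gamma.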

Recalling Q(t) = H(t)/H'(0), this leads to H(t) = H'(0) \frac{e^{\gamma t} - 1}{\gamma}.

Recalling H(x) := E(e^x) = D(e^x) - D(1) and writing t for e^x, this means

D(t) = D(1) + H'(0) \frac{t^{\gamma} - 1}{\gamma}.
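As a sanity check, the solution just obtained does satisfy the functional equation (5). From (7) and Q'(t) = e^{\gamma t} we have A(e^t) = H'(t)/H'(0) = e^{\gamma t}, that is A(y) = y^{\gamma}, while E(x) = D(x) - D(1) = H'(0) \frac{x^{\gamma} - 1}{\gamma}. The short computation below, written for \gamma \neq 0, is an added verification rather than a step of the proof in [1]:

```latex
\begin{aligned}
E(x)A(y) + E(y) &= H'(0)\,\frac{(x^{\gamma}-1)\,y^{\gamma}}{\gamma} + H'(0)\,\frac{y^{\gamma}-1}{\gamma}\\
&= H'(0)\,\frac{(xy)^{\gamma} - 1}{\gamma} = E(xy).
\end{aligned}
```

For \gamma = 0 the same check reduces to \log(xy) = \log x + \log y with A \equiv 1.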

Taking the left-continuous inverse of both sides,

D^{\leftarrow}(x) = \left(1 + \gamma \frac{x - D(1)}{H'(0)} \right)^{1/\gamma}.

Since D(x) := G^{\leftarrow}\left(e^{-1/x}\right), we have D^{\leftarrow}(x) = \frac{-1}{\log G(x)}, and hence

\displaystyle G(x) = e^{-1/D^{\leftarrow}(x) }.

In other words,

\displaystyle G(H'(0) y + D(1) ) = \exp \left(- (1 + \gamma y)^{-1/\gamma} \right),

where \gamma = Q''(0) = H''(0)/H'(0).

If 1 is not a continuity point, we repeat the above proof with U(t) replaced by U(tx_0) where x_0 is a continuity point. This completes the proof.

This generalised extreme value distribution encompasses three families, depending on the tail of the distribution F of the X_i:

  • Type I – \gamma = 0: Gumbel (double exponential) distribution (exponential tail – e.g. normal or exponential distribution)
  • Type II – \gamma > 0: Fréchet distribution (polynomial tail – e.g. power law distribution to model extreme flood levels or high incomes)
  • Type III – \gamma < 0: reverse-Weibull distribution (finite maximum – e.g. uniform distribution)
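The Type II and Type III cases can be checked in closed form. The sketch below compares F^n(a_n x + b_n) with G_{\gamma}(x) for a Pareto distribution with index \alpha = 2 (so \gamma = 1/2) and for the uniform distribution on (0, 1) (so \gamma = -1). The normalising constants are assumptions chosen here to match the G_{\gamma} parametrisation, not taken from [1].

```python
import math


def gev_cdf(x, gamma):
    """G_gamma(x) = exp(-(1 + gamma x)^(-1/gamma)), Gumbel limit at gamma = 0."""
    if gamma == 0.0:
        return math.exp(-math.exp(-x))
    z = 1.0 + gamma * x
    if z <= 0.0:
        # Outside the support: below it if gamma > 0, above it if gamma < 0.
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-z ** (-1.0 / gamma))


def pareto_cdf(x, alpha=2.0):
    """F(x) = 1 - x^(-alpha) for x > 1."""
    return 1.0 - x ** (-alpha) if x > 1.0 else 0.0


def uniform_cdf(x):
    """CDF of the uniform distribution on (0, 1)."""
    return min(max(x, 0.0), 1.0)


def pareto_max_cdf(x, n, alpha=2.0):
    """F^n(a_n x + b_n) with a_n = n^(1/alpha) / alpha, b_n = n^(1/alpha)."""
    scale = n ** (1.0 / alpha)
    return pareto_cdf(scale * x / alpha + scale, alpha) ** n


def uniform_max_cdf(x, n):
    """F^n(a_n x + b_n) with a_n = 1/n, b_n = 1 - 1/n."""
    return uniform_cdf(x / n + 1.0 - 1.0 / n) ** n


n = 10**6
for x in (-0.5, 0.0, 0.8):
    print(x, pareto_max_cdf(x, n), gev_cdf(x, 0.5),
          uniform_max_cdf(x, n), gev_cdf(x, -1.0))
```

In each row the second and third columns should nearly agree (Fréchet case), as should the fourth and fifth (reverse-Weibull case).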

Sample density functions are plotted below for specific values of \gamma.
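The densities can also be evaluated directly by differentiating G_{\gamma}. The following is a minimal sketch; the values of \gamma used are arbitrary illustrative choices and not necessarily those of the original figure.

```python
import math


def gev_pdf(x, gamma):
    """Density of G_gamma(x) = exp(-(1 + gamma x)^(-1/gamma)).

    Returns 0 outside the support 1 + gamma x > 0.
    """
    if gamma == 0.0:
        t = math.exp(-x)  # Gumbel limit: t = e^{-x}
        return t * math.exp(-t)
    z = 1.0 + gamma * x
    if z <= 0.0:
        return 0.0
    t = math.exp(-math.log(z) / gamma)  # t = (1 + gamma x)^(-1/gamma)
    return (t / z) * math.exp(-t)


# Tabulate the density on a coarse grid for three sample shapes.
grid = [-2.0 + 0.5 * k for k in range(13)]
for gamma in (-0.5, 0.0, 0.5):
    print(gamma, [round(gev_pdf(x, gamma), 4) for x in grid])
```

For \gamma < 0 the density vanishes above x = -1/\gamma (finite right endpoint), while for \gamma > 0 it vanishes below x = -1/\gamma and has a polynomially decaying right tail.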


[1] L. De Haan, A. Ferreira, Extreme Value Theory: An Introduction, Springer, 2006.

