Chaitanya's Random Pages

June 26, 2012

Exports and Imports by Country

Filed under: geography — ckrao @ 1:08 pm

I recently had a look at the Year Book Australia, 2012 and was interested in seeing the current state of Australian Industry. From this page on the Australian Bureau of Statistics website, in 2010-2011 Australia had a GDP of around 1.32 trillion Australian dollars ($58k per capita) with the following breakdown by sector.

One can see here how the Construction and Financial and Insurance sectors have increased in GDP share over the past decade while manufacturing has decreased. Interestingly Mining is only around 7% of GDP and this figure has remained steady from ten years ago.

In mining as of 2009 Australia was the world’s largest “producer” of iron ore, and fourth largest (behind China, USA, India) of coal. Interestingly four countries produce 75% of the world’s iron ore: Australia, Brazil, India and China.

In 2009-2010 Australia had a business expenditure on Research and Development of around 1.3% of its GDP (source: here).

Imports and exports were each of value around 20% of GDP (source: here). I stumbled onto the Observatory of Economic Complexity that helps one to visualise the breakdown further, over the last 15 years.

Compare the 1995 and 2010 breakdowns to see how China has become Australia’s largest export partner.

Countries Australia exports to in 1995 (left) and 2010 (right)

Next we see a breakdown of Australian exports and imports by sector. Note how dominant the mineral products have become.

Australian Exports in 1995 (left) and 2010 (right)

Finally we see machinery and transportation dominating Australian imports.

Australian Imports in 1995 (left) and 2010 (right)

For more such tree maps, see this list of countries by economic complexity.

June 23, 2012

The shortest day of the year

Filed under: climate and weather,mathematics — ckrao @ 1:14 pm

June 21 was the shortest day of this year here in Melbourne, Australia and I saw on timeanddate.com that the length of the day was 9h 32m 31s. I was curious to see how close this was to the rough estimate that one can derive for the length of day on the winter solstice based on the calculation I give below. We make the following simplifying assumptions.

  • The earth is spherical
  • On the winter solstice for the southern hemisphere the sun is directly over the tropic of cancer (i.e. directly overhead there at some time during the day)

We use the figure below to visualise the situation at this time. Assume the sun is a long way to the right of screen. It shows the sun directly overhead at the tropic of cancer (C) and Melbourne at solar noon. The left half of the image experiences night. Imagine the earth rotating about the axis joining the north and south poles (NS). Melbourne will then rotate around this axis and in this image it will always be somewhere on the line passing through D and M. When it is at the point D it corresponds to dawn or dusk. Hence the length of its daylight will be the time in which it is in the red region.

Note that the angle \theta is equal to the earth’s angle of tilt, which is known to be approximately 23.5 degrees.

Below is another look at the red region from a different angle of perspective. In this figure below the line DQD' corresponds to the single point D in the figure above. Let \alpha = \angle MPD. Then the fraction of the day in daylight will be \alpha/\pi where \alpha is measured in radians.

Using the notation below, the daylight time will be \alpha/\pi \times 24 hours.

We use trigonometry to find \alpha. Firstly in triangle OPM,

\displaystyle r = R \cos \phi \quad \quad (1),

where \phi represents the latitude of Melbourne (38 degrees).

Secondly OP = R \sin \phi. Hence

\displaystyle PQ = OP \tan \theta = R \sin \phi \tan \theta. \quad \quad (2)

Finally, in triangle PDQ, \cos \alpha = PQ/r so combining this with (1) and (2) gives

\displaystyle \cos \alpha = \frac{R \sin \phi \tan \theta}{ R \cos \phi} = \frac{\sin \phi \tan \theta}{\cos \phi}.

Hence \alpha = \arccos\left(\tan \phi \tan \theta\right) and for Melbourne the predicted length of daylight is

\displaystyle \alpha/\pi \times 24 = \frac{24 \arccos\left(\tan 37.783^{\circ} \tan 23.438^{\circ}\right)}{\pi} \approx 9h 22m 54s.

This is about 10 minutes (1.7%) shorter than the actual time, the difference being due to effects such as the sun not being a point object, refraction of light through the atmosphere and the approximations listed above. However the above calculation is a handy guide. A more precise calculation is shown here.
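The estimate is easy to reproduce numerically. Below is a minimal Python sketch of the formula \alpha = \arccos(\tan \phi \tan \theta) under the same two simplifying assumptions; the function names are illustrative only, and the latitude and tilt are the values used in the calculation above.

```python
import math

def solstice_daylight_hours(latitude_deg, tilt_deg=23.438):
    """Winter-solstice daylight in hours from alpha = arccos(tan(phi) tan(theta)).

    Assumes a spherical earth and a point-like sun (no refraction),
    as in the rough estimate above.
    """
    phi = math.radians(latitude_deg)
    theta = math.radians(tilt_deg)
    alpha = math.acos(math.tan(phi) * math.tan(theta))
    return 24.0 * alpha / math.pi

def to_hms(hours):
    """Convert a number of hours to (hours, minutes, seconds), rounded to the second."""
    total_seconds = round(hours * 3600)
    h, rem = divmod(total_seconds, 3600)
    m, s = divmod(rem, 60)
    return h, m, s

melbourne = solstice_daylight_hours(37.783)
print("%dh %02dm %02ds" % to_hms(melbourne))
```

For Melbourne this should land within a second or so of the hand calculation, and a latitude of 0 gives 12 hours as expected at the equator.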

Incidentally, the shortest day here this year was curious temperature-wise as the graph below shows (data from the Australian Bureau of Meteorology).

The temperature hovered between 9.5°C and 9.9°C (at the half-hour sampling instances) between 6:30am and 7:30pm, with higher temperatures on either side! The maximum (after 9am) of 10.5°C was only reached at around 10pm.

June 11, 2012

Curious high or low temperature spots

Filed under: climate and weather,geography — ckrao @ 12:17 am

Here I provide a list of some places that go against our intuition by being surprisingly hot or cold at a certain time of the year given their latitude. Most of the places chosen are close enough to sea level that elevation is not the cause (ocean or continental influence is more likely).

  1. The east coast of Australia up to Sydney and Newcastle has a mid summer average maximum temperature of only 25.5°C – highly unusual for the east coast of a continent. At a corresponding latitude (33° from equator) the mid summer maximum at the east coast of Asia (Shanghai at 31°N) is 31.8°C, of South America (Buenos Aires at 34.6°S) is 30.4°C and in North America (Charleston, SC, USA) is 32.8°C.
  2. Walvis Bay, Namibia is in the tropics (23°S) yet has a mid summer average maximum temperature of only 22°C! Western South America is similarly cool with a mid summer average maximum of 25°C (Antofagasta, Chile). The corresponding west coast place in Australia has an average maximum of 32.4°C (Carnarvon at 25°S), in Africa 27°C (Dakhla, Western Sahara at 23.7°N), and in North America 32.4°C (Mazatlán, Mexico).
  3. Eureka, CA (USA) at 40.8°N has an average maximum of only 17.7°C in its warmest month. Contrast this with Fairbanks, Alaska at 65°N which still reaches an average 23°C in its warmest month with an all-time high of 37°C!
  4. Kyzyl, the capital of the Tuva republic in Russia, has an average maximum of 27°C in July and average minimum of -35°C in January! London is at the same latitude.
  5. Iceland and Ireland have very low variation between summer and winter temperatures due to ocean currents of the East Atlantic. Reykjavik, Iceland averages 2°C maximum in its coldest month up to 13°C in its warmest.
  6. The Turpan depression in northwest China surely has the hottest summer temperatures for a place so far from the equator. For example Turpan at 43°N has an average maximum of 40°C in July! A similar hot spot is Ashgabat, Turkmenistan (38°N) with an average maximum of 38°C in July. It has the same distance from the equator as Melbourne, Australia which only has an average maximum of 26°C in mid summer.
  7. Tromsø, Norway is at 69.7°N (hence beyond the arctic circle) yet in its coldest month its average temperature range is -6.5 to -2.2°C.
  8. Some parts of southern China are surprisingly cool for their latitude in winter. For example Guilin in Guangxi province is at 25°N and only 150m in elevation, yet has a range of 5.4 to 11.5°C in January (cooler than Tromsø’s warmest month!)
  9. Lima, Peru is coastal, only 12° south of the equator, yet has a maximum temperature range from 18.4° to 26.5°C from coolest to warmest months. Contrast this with Chennai, India which is also coastal and whose corresponding range is 29° to 38°C.
  10. The Southern Ocean shows very little variation between summer and winter temperatures. For example Bird Island in South Georgia at 54°S ranges from an average minimum of -5.4°C in its coldest month to an average maximum of 5.6°C in its warmest month.

Below is a map showing all the places mentioned in this post.

June 10, 2012

Outline proof of the extreme value theorem in statistics

Filed under: mathematics — ckrao @ 9:15 am

Recently I read a proof in [1] of the main theorem of extreme value statistics: the Fisher–Tippett–Gnedenko theorem. In this post I give an outline.

Here we are interested in the maximum of many independent and identically distributed random variables X_i with distribution function F. Let x^* = \sup \{x: F(x) < 1\} which may be infinite. Then as n \rightarrow \infty,

{\rm Pr}(\max(X_1, X_2, \ldots X_n) \leq x) = F^n(x) tends to 0 if x < x^* and 1 if x > x^*.

Therefore \displaystyle \max(X_1, X_2, \ldots, X_n) converges in probability to x^* as n \rightarrow \infty.

To obtain a non-degenerate limiting distribution, it is necessary to normalise the maximum.

To this end, suppose there exist real numbers a_n >0, b_n such that

\displaystyle \frac{\max(X_1, X_2, \ldots, X_n) - b_n}{a_n} approaches a non-degenerate limiting distribution.

In other words, there exists a distribution function G(x) such that

\displaystyle \lim_{n \rightarrow \infty} F^n(a_n x + b_n) = G(x).

Taking logarithms of both sides, this is equivalent to

\displaystyle \lim_{n \rightarrow \infty} n \log F(a_n x + b_n) = \log G(x)

This requires F(a_n x + b_n) \rightarrow 1 as n \rightarrow \infty. Using \log x \approx x -1 for x close to 1, the above is also equivalent to

\displaystyle \lim_{n \rightarrow \infty} \frac{1}{n(1 - F(a_n x + b_n))} = \frac{1}{- \log G(x)}. \quad \quad (1)
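To make the normalisation concrete, here is a small Python sketch for one particular choice of F. The standard exponential distribution and the constants a_n = 1, b_n = \log n are assumptions chosen for illustration (they are not taken from [1]); with them F^n(a_n x + b_n) = (1 - e^{-x}/n)^n, which tends to \exp(-e^{-x}).

```python
import math

def exp_cdf(x):
    """Distribution function of the standard exponential (illustrative choice of F)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0

def normalised_max_cdf(n, x):
    """F^n(a_n x + b_n) with the assumed constants a_n = 1, b_n = log n."""
    return exp_cdf(x + math.log(n)) ** n

def gumbel_cdf(x):
    """Candidate limit G(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

# The two columns should agree more and more closely as n grows.
for n in (10, 1000, 100000):
    print(n, normalised_max_cdf(n, 1.0), gumbel_cdf(1.0))
```

Without the shift by \log n the same power F^n(x) would simply collapse to 0 for every fixed x, which is the degenerate behaviour described earlier.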

Next we use the following definition.

A non-decreasing function f has left-continuous inverse f^{\leftarrow} defined by

\displaystyle f^{\leftarrow}(x) := \inf\{y: f(y) \geq x\}.

One can use this definition to prove

Lemma 1: If f_n(x) \rightarrow g(x) at each continuity point of g for non-decreasing functions f_n and g, then for each continuity point x of g^{\leftarrow} we have f_n^{\leftarrow}(x) \rightarrow g^{\leftarrow}(x).
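For intuition about the inverse, the Python sketch below approximates \inf\{y: f(y) \geq x\} by scanning a finite grid, using the \geq form that is applied to U below. The helper name, the step function and the grid are all hypothetical, chosen only to show how jumps of f become flat stretches of the inverse.

```python
import math

def left_continuous_inverse(f, x, grid):
    """Approximate inf{y : f(y) >= x} by scanning an increasing grid of y values."""
    for y in grid:
        if f(y) >= x:
            return y
    return math.inf  # no grid point reaches the level x

def step(y):
    """A non-decreasing step function with jumps at y = 1 and y = 2."""
    if y < 1:
        return 0.0
    if y < 2:
        return 0.5
    return 1.0

grid = [0.5 * k for k in range(7)]  # 0.0, 0.5, ..., 3.0
for level in (0.5, 0.6, 1.0):
    print(level, left_continuous_inverse(step, level, grid))
```

Every level strictly between 0.5 and 1 is mapped to the same point y = 2, the location of the second jump.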

Next we claim that (1) is equivalent to

\displaystyle \frac{U(nx) - b_n}{a_n} \rightarrow G^{\leftarrow}\left(e^{-1/x}\right) \quad \quad (2)

where U is the left-continuous inverse of \frac{1}{1-F}. To see this, let V(x) = \frac{1}{1 - F(x)}. Then by the definition of U, U(x) = \inf \{ y: V(y) \geq x \}. Then for any n \in \mathbb{N},

U(nx) = \inf \left\{ y: \frac{1}{n(1-F(y))} \geq x \right\}

and so

\begin{array}{lcl} \frac{U(nx) - b_n}{a_n} &=& \inf \left\{\frac{y - b_n}{a_n}: \frac{1}{n(1 - F(y))} \geq x \right\}\\&=& \inf \left\{z: \frac{1}{n(1 - F(a_n z + b_n))} \geq x \right\}\end{array}

By (1) and the lemma as n \rightarrow \infty this tends to

\begin{array}{lcl} & & \inf \left\{ z: \frac{1}{-\log G(z)} \geq x \right\}\\& = & \inf \left\{ z: \log G(z) \geq \frac{-1}{x} \right\} \\ &=& \inf \left\{ z: G(z) \geq e^{-1/x} \right\}\\ &=& G^{\leftarrow} \left( e^{-1/x} \right), \end{array}

proving the claim. We can also write

\displaystyle \lim_{t \rightarrow \infty} \frac{U(tx) - b(t)}{a(t)} = G^{\leftarrow}\left(e^{-1/x}\right) =: D(x) \quad \quad (3)

where a(t) := a_{[t]}, b(t) := b_{[t]} and [t] denotes the integer part of t.

We are now ready to prove the main theorem of extreme value theory.

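Theorem (Fisher, Tippett, Gnedenko): Any non-degenerate limiting distribution G arising as in (1) is a member G_{\gamma} of the family satisfying, for some constants a > 0 and b \in \mathbb{R},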

\displaystyle G_{\gamma}(ax + b) = \exp\left( -(1 + \gamma x)^{-1/\gamma} \right), \quad 1 + \gamma x > 0, \quad \gamma \in \mathbb{R}

where the right side is equal to its limiting value \exp \left( - e^{-x}\right) if \gamma = 0.

Proof:

This will involve numerous substitutions but the main idea is to arrive at a differential equation that can be solved to obtain the above. Suppose 1 is a continuity point of D. Then for any continuity point x > 0,

\displaystyle \lim_{t \rightarrow \infty} \frac{U(tx) - U(t)}{a(t)} = D(x) - D(1) =: E(x). \quad \quad (4)

We can write

\displaystyle \frac{ U(txy) - U(t)}{a(t)} = \frac{U(txy) - U(ty)}{a(ty)} \frac{a(ty)}{a(t)} + \frac{U(ty) - U(t)}{a(t)}.

The claim is that both \frac{a(ty)}{a(t)} and \frac{U(ty) - U(t)}{a(t)} have limits as t \rightarrow \infty. If they had more than one limit point, say A_1, A_2 and B_1, B_2 respectively, then for i = 1, 2 (4) in the limit t \rightarrow \infty gives us

E(xy) = E(x) A_i + B_i.

Subtracting gives E(x) (A_1 - A_2) = B_2 - B_1 which implies A_1 = A_2, B_2 = B_1 as we know E(x) is non-constant (since we seek a non-degenerate solution).

We conclude that

\displaystyle E(xy) = E(x) A(y) + E(y). \quad \quad (5)

This is a functional equation that we wish to solve. We let s:= \log x, t := \log y, H(x) := E(e^x) to obtain

 \displaystyle H(t+ s) = H(s) A(e^t) + H(t),

which using H(0) = E(1) = 0 implies

\displaystyle \frac{H(t+s) - H(t)}{s} = \frac{H(s) - H(0)}{s} A(e^t). \quad \quad (6)

Since H is monotone (following from the monotonicity of D), it is differentiable at some t. By (6) it is differentiable at all t. Indeed from (6) we obtain

\displaystyle H'(t) = H'(0) A(e^t) \neq 0. \quad \quad (7)

Let Q(t) = H(t)/H'(0). Then Q(0) = 0, Q'(0) = 1.

From (6) and (7),

\displaystyle Q(t+s) - Q(t) = Q(s) A(e^t) = Q(s) H'(t)/H'(0) = Q(s) Q'(t). \quad \quad (8)

Similarly, Q(s+t) - Q(s) = Q(t) Q'(s) and upon subtraction from (8) we obtain Q(s) - Q(s)Q'(t) = Q(t) - Q(t)Q'(s) from which

\displaystyle \frac{Q(s)}{s} (Q'(t) - 1) = Q(t) \frac{Q'(s) - 1}{s}.

Taking the limit as s \rightarrow 0 and using Q'(0) = H'(0)/H'(0) = 1 gives the following differential equation for Q.

\displaystyle Q'(t) -1 = Q(t) Q''(0), Q(0) = 0, Q'(0) = 1 \quad \quad (9)

To solve (9), differentiate both sides with respect to t: from Q''(t) = Q'(t) Q''(0) we see that

\displaystyle (\log Q')' (t) = Q''(0) =: \gamma.

Hence Q'(t) = e^{\gamma t} and since Q(0) = 0, Q(t) = \int_0^t e^{\gamma s}\ ds.

Recalling Q(t) = H(t)/H'(0), this leads to H(t) = H'(0) \frac{e^{\gamma t} - 1}{\gamma}.
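As a sanity check, the solution can be substituted back into the functional equation (8). The Python sketch below does this numerically for a few values of \gamma; the function names are illustrative, and the case \gamma = 0 uses the limiting value Q(t) = t.

```python
import math

def Q(t, gamma):
    """Q(t) = (exp(gamma t) - 1) / gamma, with the limiting value t when gamma = 0."""
    if gamma == 0.0:
        return t
    return math.expm1(gamma * t) / gamma

def Q_prime(t, gamma):
    """Q'(t) = exp(gamma t)."""
    return math.exp(gamma * t)

def residual_8(t, s, gamma):
    """Left side minus right side of (8): Q(t+s) - Q(t) - Q(s) Q'(t)."""
    return Q(t + s, gamma) - Q(t, gamma) - Q(s, gamma) * Q_prime(t, gamma)

# Each residual should be zero up to floating-point rounding.
for gamma in (-0.5, 0.0, 0.7):
    print(gamma, residual_8(0.3, 1.2, gamma))
```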

Recalling H(x) := E(e^x) = D(e^x) - D(1) this means

D(t) = D(1) + H'(0) \frac{t^{\gamma} - 1}{\gamma}.

Taking the left-continuous inverse of both sides,

D^{\leftarrow}(x) = \left(1 + \gamma \frac{x - D(1)}{H'(0)} \right)^{1/\gamma}.

Since D(x) := G^{\leftarrow}\left(e^{-1/x}\right), D^{\leftarrow}(x) = \frac{-1}{\log G(x)}.

Hence

\displaystyle G(x) = e^{-1/D^{\leftarrow}(x) }.

In other words,

\displaystyle G(H'(0) y + D(1) ) = \exp \left(- (1 + \gamma y)^{-1/\gamma} \right),

where \gamma = Q''(0) = H''(0)/H'(0).

If 1 is not a continuity point, we repeat the above proof with U(t) replaced by U(tx_0) where x_0 is a continuity point. This completes the proof.

This generalised extreme value distribution encapsulates three distributions depending on the nature of the tail of the original distribution of the X_i:

  • Type I – \gamma = 0: Gumbel (double exponential) distribution (exponential tail – e.g. normal or exponential distribution)
  • Type II – \gamma > 0: Fréchet distribution (polynomial tail – e.g. power law distribution to model extreme flood levels or high incomes)
  • Type III – \gamma < 0: reverse-Weibull distribution (finite maximum – e.g. uniform distribution)

Sample density functions are plotted below for specific values of \gamma.
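For readers who want to reproduce such curves, here is a minimal Python sketch of G_{\gamma} and its density, obtained by differentiating the formula in the theorem. The function names are illustrative; note that some libraries use a different sign convention for the shape parameter, so this is not meant to mirror any particular API.

```python
import math

def gev_cdf(x, gamma):
    """G_gamma(x) = exp(-(1 + gamma x)^(-1/gamma)), with the Gumbel limit at gamma = 0."""
    if gamma == 0.0:
        return math.exp(-math.exp(-x))
    z = 1.0 + gamma * x
    if z <= 0.0:
        # Outside the support: below the lower endpoint when gamma > 0,
        # above the upper endpoint when gamma < 0.
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-z ** (-1.0 / gamma))

def gev_pdf(x, gamma):
    """Derivative of gev_cdf with respect to x."""
    if gamma == 0.0:
        return math.exp(-x - math.exp(-x))
    z = 1.0 + gamma * x
    if z <= 0.0:
        return 0.0
    return z ** (-1.0 / gamma - 1.0) * math.exp(-z ** (-1.0 / gamma))

# Densities at a few points for one value of gamma of each type.
for gamma in (-0.5, 0.0, 0.5):
    print(gamma, [round(gev_pdf(x, gamma), 4) for x in (-1.0, 0.0, 1.0, 2.0)])
```

Evaluating gev_pdf on a fine grid of x values and plotting the result gives curves of the three types listed above.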

Reference

[1] L. De Haan, A. Ferreira, Extreme Value Theory: An Introduction, Springer, 2006.

June 7, 2012

Chris Gayle in recent T20s

Filed under: cricket,sport — ckrao @ 12:26 pm

Chris Gayle has amassed an unbelievable record in Twenty20 cricket. In 105 innings he has scored 3970 runs at an average of 43.62 with 8 hundreds and 26 fifties. Such statistics would be respectable in test cricket and fantastic for one-day internationals. Now consider that he has done this with a strike rate of 155.93 when at most 20 overs per innings are available! His statistics in the last 7 tournaments are even more mind boggling. It’s only a shame he hasn’t played a T20 international since 2010.

ESPN Cricinfo links:

The Best Batsman in Twenty20 Cricket

Chris Gayle in Twenty20 matches since 2011: A phenomenon in the shortest form

Below are his series-by-series scores since Jan 2011 together with some points of note.

1) 2010-11 Big Bash in Australia (played for Western Australia)
5th highest run scorer: 5 innings, 182 runs, HS 92, Ave 36.4, SR 193.61, 2 50s, 18 4s, 13 6s

Date Opponent Runs Balls 4s 6s SR Notes
Dec 30 2010 Tasmania 22 19 3 0 115.78
Jan 9 2011 NSW 61 30 8 4 203.33 opening partnership of 101 with Marsh in 8.3 overs; 7th over: 6 6 6 4 4 6
Jan 13 2011 SA 6 2 0 1 300
Jan 18 2011 Vic 1 3 0 0 33.33
Jan 25 2011 Qld 92 40 7 8 230 opening stand 144 in 12.5 overs; 120m six hit; 50 reached off 20 balls

2) IPL 2011 in India (played for Royal Challengers Bangalore)
highest run scorer: 12 innings, 608 runs, Ave 67.55, SR 183.13, 2 100s, 3 50s, 56 4s, 44 6s

Date Opponent Runs Balls 4s 6s SR Notes
Apr 22 2011 Kochi Tuskers Kerala 102* 55 10 7 185.45
Apr 26 2011 Delhi Daredevils 26 14 4 1 185.71
Apr 29 2011 Pune Warriors 49 26 4 4 188.46
May 6 2011 Kings XI Punjab 107 49 10 9 218.36 he was 6 off 13 after 3 overs, then made 76 off the next 24; century off 46 balls
May 8 2011 Kochi Tuskers Kerala 44 16 3 5 275 opening stand 67 in 3.5 overs; 3rd over for 37 runs went 6, (nb)6, 4, 4, 6, 6, 4
May 11 2011 Rajasthan Royals 70* 44 6 4 159.09
May 14 2011 Kolkata Knightriders 38 12 6 2 316.66 His innings: 4 4 . 4 4 . 6 6 2 4 4 out
May 17 2011 Kings XI Punjab 0 7 0 0 0 faced maiden over to Praveen Kumar
May 22 2011 Chennai Superkings 75* 50 4 6 150
May 24 2011 Chennai Superkings 8 9 0 1 88.88
May 27 2011 Mumbai Indians 89 47 9 5 189.36 opening stand 113 in 10.4 overs
May 28 2011 Chennai Superkings 0 3 0 0 0

In this series, if Gayle scored more than 8, Bangalore won. 🙂

3) Champions League 2011-12 (played for Royal Challengers Bangalore)
second highest run scorer: 6 innings, 257 runs, Ave 42.83, SR 178.47, 2 50s, 15 4s, 24 6s

Date Opponent Runs Balls 4s 6s SR Notes
Sep 23 2011 Warriors 23 14 2 2 164.28
Sep 29 2011 Kolkata 25 16 0 3 156.25
Oct 3 2011 Somerset 86 46 4 8 186.95
Oct 5 2011 SA 26 15 0 3 173.33 team chased down 215 – won by 2 wickets with 6 off last ball
Oct 7 2011 NSW 92 41 8 8 224.39 team chased down 204 in 18.3, 50 scored off 20 balls
Oct 9 2011 Mumbai 5 12 1 0 41.66

4) Stanbic Bank 20 Series in Zimbabwe (played for Matabeleland Tuskers)
highest run scorer (next best 225): 6 innings, 293 runs, Ave 58.60, SR 151.03, 1 100, 2 50s, 19 4s, 19 6s

Date Opponent Runs Balls 4s 6s SR Notes
Nov 25 2011 Mashonaland Eagles 27 31 3 0 87.09
Nov 26 2011 Mountaineers 61 38 3 5 160.52
Nov 27 2011 Southern Rocks 0 1 0 0 0
Nov 29 2011 Mid West Rhinos 109* 59 7 8 184.74 lost by 7 wickets (3 balls remaining)
Dec 2 2011 Mountaineers 45 34 3 2 132.35
Dec 3 2011 Mashonaland Eagles 51 31 3 4 164.51

5) Big Bash League in Australia, 2011-12 (played for Sydney Thunder)
8th highest run scorer: 7 innings, 252 runs, Ave 42.00, SR 150.00,  1 100, 2 50s, 11 4s, 22 6s

Date Opponent Runs Balls 4s 6s SR Notes
Dec 17 2011 Melb Stars 4 4 1 0 100
Dec 23 2011 Adelaide Strikers 100* 54 3 11 185.18 sixes 7 to 10 were in consecutive deliveries faced; 100 off 53
Dec 30 2011 Melb Renegades 75 54 3 5 138.88
Jan 1 2012 Hobart Hurricanes 53 33 2 5 160.6
Jan 8 2012 Sydney Sixers 0 6 0 0 0
Jan 11 2012 Perth Scorchers 20 16 2 1 125
Jan 17 2012 Brisbane Heat 0 1 0 0 0

6) Bangladesh Premier League (played for Barisal Burners)
9th highest run scorer (when others played up to 12 innings): 5 innings, 288 runs, Ave 96.00, SR 187.01,  2 100s, 19 4s, 26 6s

Date Opponent Runs Balls 4s 6s SR Notes
Feb 10 2012 Sylhet Royals 101* 44 7 10 229.54
Feb 11 2012 Duronto Rajshahi 39 23 4 2 169.56
Feb 13 2012 Khulna Royal Bengals 2 10 0 0 20
Feb 14 2012 Dhaka Gladiators 116 61 6 11 190.16 lost by 21 runs, scored 116 out of 163 for team when dismissed
Feb 16 2012 Chittagong Kings 30* 16 2 3 187.5 retired hurt with groin injury

7) IPL 2012 in India (played for Royal Challengers Bangalore)
leading run scorer (next best: 590 runs): 14 innings, 733 runs, Ave 61.08, SR 160.74, 1 100, 7 50s, 46 4s, 59 6s

Date Opponent Runs Balls 4s 6s SR Notes
Apr 10 2012 Kolkata Knightriders 2 8 0 0 25
Apr 12 2012 Chennai Superkings 68 35 2 6 194.28
Apr 15 2012 Rajasthan Royals 8 9 0 1 88.88
Apr 17 2012 Pune Warriors 81 48 4 8 168.75
Apr 20 2012 Kings XI Punjab 87 56 8 4 155.35
Apr 23 2012 Rajasthan Royals 4 8 0 0 50
Apr 28 2012 Kolkata Knightriders 86 58 7 6 148.27 was 38 off 38 balls at one stage
May 2 2012 Kings XI Punjab 71 42 6 4 169.04
May 6 2012 Deccan Chargers 26 22 1 2 118.18
May 9 2012 Mumbai Indians 82* 59 5 6 138.98
May 11 2012 Pune Warriors 57 31 3 6 183.87 50 off 24 balls
May 14 2012 Mumbai Indians 6 8 0 1 75
May 17 2012 Delhi Daredevils 128* 62 7 13 206.45 second highest T20 partnership of 204 with Kohli; team score of 215, 129 off the last 8.2 overs; Gayle scored 90 off his last 28 balls faced
May 20 2012 Deccan Chargers 27 10 3 2 270

Aggregated, the seven series make stunning reading:

55 innings, 2613 runs, Ave 56.80, SR 169.46, 7 100s, 18 50s, 184 4s, 207 6s
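The aggregate can be rebuilt from the seven series summaries. The short Python sketch below sums the innings, runs, fours and sixes quoted above and also shows how a strike rate is computed (runs per 100 balls); the variable and function names are illustrative only.

```python
# (innings, runs, fours, sixes) for each series, copied from the summaries above
series = {
    "Big Bash 2010-11": (5, 182, 18, 13),
    "IPL 2011": (12, 608, 56, 44),
    "Champions League 2011-12": (6, 257, 15, 24),
    "Stanbic Bank 20": (6, 293, 19, 19),
    "Big Bash League 2011-12": (7, 252, 11, 22),
    "Bangladesh Premier League": (5, 288, 19, 26),
    "IPL 2012": (14, 733, 46, 59),
}

def strike_rate(runs, balls):
    """Runs scored per 100 balls faced."""
    return 100.0 * runs / balls

# Column-wise totals across the seven series.
innings, runs, fours, sixes = (sum(col) for col in zip(*series.values()))
print(innings, runs, fours, sixes)  # 55 2613 184 207

# Share of the runs that came from boundaries (fours and sixes).
boundary_runs = 4 * fours + 6 * sixes
print(round(100.0 * boundary_runs / runs, 1), "% of runs in boundaries")

# Example row: 92 off 40 balls against Qld on Jan 25 2011.
print(strike_rate(92, 40))  # 230.0
```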

Data courtesy of ESPN Cricinfo.
