# Chaitanya's Random Pages

## June 29, 2014

### Melbourne’s late start to cooler temperatures in 2014

Filed under: climate and weather — ckrao @ 11:21 am

It occurred to me that Melbourne has had unusually few cooler days in 2014 up to the middle of June. After some checking here, I found that from January 1 to June 17 this year, the Melbourne Regional Office has only twice recorded days with a maximum temperature less than 15°C (May 2 [14.3°C] and May 4 [14.2°C]). This beats 2003, which held the previous record low of 5 such days prior to June 18. The graph and table below show that in the past this number has been as high as 40, but in recent years there is a clear downward trend.

| Melbourne Regional Office | Average number of sub-15°C days before June 18 |
| --- | --- |
| Average 1856-2013 | 21.6 |
| Average 1971-2013 | 15.0 |
| Average 2004-2013 | 11.7 |
| 2014 | 2 |

If we do the same calculation for a station further away from the city centre, say the international airport just over 20 km away, there is not as much historical data and the effect is not as pronounced, but this year still equals the previous record of 2003 (11 times). In recent times the sub-15°C days prior to June 18 have been about twice as frequent here as at the Regional Office.

| Melbourne Airport | Average number of sub-15°C days before June 18 |
| --- | --- |
| Average 1971-2013 | 24.7 |
| Average 2004-2013 | 23.1 |
| 2014 | 11 |

The data is from Australia’s Bureau of Meteorology website.

## June 28, 2014

### A few sums involving inverses of binomial coefficients

Filed under: mathematics — ckrao @ 11:11 am

One of the problems from the 1990 Australian Mathematical Olympiad ([1]) asks one to prove the following result.

$\displaystyle \sum_{k=1}^{2n-1}\frac{(-1)^{k-1}}{\binom{2n}{k}} = \frac{1}{n+1}.\quad\quad (1)$

One solution in [1] makes use of the following identity:

\begin{aligned} \frac{1}{\binom{m}{k}} &= \frac{k!(m-k)!}{m!}\\ &= \frac{k!(m-k)!(m+1)(m+2)}{m!(m+1)(m+2)}\\ &= \left(\frac{m+1}{m+2}\right)\frac{k!(m-k)!(m+2)}{(m+1)!}\\ &= \left(\frac{m+1}{m+2}\right)\frac{k!(m-k)!((m-k+1) + (k+1))}{(m+1)!}\\ &=\left(\frac{m+1}{m+2}\right)\frac{k!(m-k+1)! + (k+1)!(m-k)!}{(m+1)!}\\ &= \frac{m+1}{m+2}\left(\frac{1}{\binom{m+1}{k}}+ \frac{1}{\binom{m+1}{k+1}}\right).\quad\quad(2) \end{aligned}

Then setting $m=2n$, (1) becomes a telescoping sum:

\begin{aligned} \sum_{k=1}^{2n-1}\frac{(-1)^{k-1}}{\binom{2n}{k}} &= \sum_{k=1}^{2n-1} \frac{2n+1}{2n+2}\left(\frac{(-1)^{k-1}}{\binom{2n+1}{k}}+ \frac{(-1)^{k-1}}{\binom{2n+1}{k+1}}\right)\\ &= \frac{2n+1}{2n+2}\left( \sum_{k=1}^{2n-1} \frac{(-1)^{k-1}}{\binom{2n+1}{k}} + \sum_{k=1}^{2n-1} \frac{(-1)^{k-1}}{\binom{2n+1}{k+1}}\right)\\ &= \frac{2n+1}{2n+2}\left( \frac{(-1)^0}{\binom{2n+1}{1}} + \sum_{k=2}^{2n-1} \frac{(-1)^{k-1}}{\binom{2n+1}{k}} + \frac{(-1)^{2n-1-1}}{\binom{2n+1}{2n-1+1}} + \sum_{k=1}^{2n-2} \frac{(-1)^{k-1}}{\binom{2n+1}{k+1}} \right)\\ &= \frac{2n+1}{2n+2}\left( \frac{1}{2n+1} + \frac{1}{2n+1} + \sum_{k=1}^{2n-2} \frac{(-1)^{k}}{\binom{2n+1}{k+1}} + \sum_{k=1}^{2n-2} \frac{(-1)^{k-1}}{\binom{2n+1}{k+1}} \right)\\ &= \frac{2n+1}{2n+2} \left(\frac{2}{2n+1} + 0\right)\\ &= \frac{1}{n+1}. \end{aligned}

Using a similar argument one can prove the more general identity

$\displaystyle \sum_{k=0}^n\frac{(-1)^k}{\binom{n}{k}} = \frac{n+1}{n+2}\left(1 + (-1)^n\right).\quad\quad (3)$
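As a sanity check on (1) and (3), here is a small Python sketch that evaluates both alternating sums exactly with rational arithmetic. The function names are my own for illustration; this only tests small cases and is not a substitute for the proof.

```python
from fractions import Fraction
from math import comb

def lhs_identity_1(n):
    """Left side of (1): sum over k = 1..2n-1 of (-1)^(k-1) / C(2n, k)."""
    return sum(Fraction((-1) ** (k - 1), comb(2 * n, k)) for k in range(1, 2 * n))

def lhs_identity_3(n):
    """Left side of (3): sum over k = 0..n of (-1)^k / C(n, k)."""
    return sum(Fraction((-1) ** k, comb(n, k)) for k in range(n + 1))

# Compare with the right-hand sides for small n (exact equality, no rounding).
for n in range(1, 8):
    assert lhs_identity_1(n) == Fraction(1, n + 1)
for n in range(0, 12):
    assert lhs_identity_3(n) == Fraction(n + 1, n + 2) * (1 + (-1) ** n)
```

Note that (3) vanishes for odd $n$, as the terms cancel in pairs by the symmetry $\binom{n}{k} = \binom{n}{n-k}$.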

How about the sum $\displaystyle S(n) := \sum_{k=0}^{n}\frac{1}{\binom{n}{k}}$? Using (2) we find the recursion

\begin{aligned} S(n) &= \frac{n+1}{n+2}\sum_{k=0}^n \left(\frac{1}{\binom{n+1}{k}} + \frac{1}{\binom{n+1}{k+1}}\right)\\ &= \frac{n+1}{n+2}\left(S(n+1)-1 + S(n+1)-1\right)\\ &= \frac{2(n+1)}{n+2}\left(S(n+1)-1 \right). \quad\quad (4)\\ \end{aligned}

Hence we have the recurrence relation

$\displaystyle S(n+1) = \frac{n+2}{2(n+1)}S(n) + 1.\quad\quad(5)$

$S(n)$ does not have a closed-form solution, but using (5) and induction on $n$ we can prove the following relations found in [2].

$\displaystyle S(n) := \sum_{k=0}^{n}\frac{1}{\binom{n}{k}} = \frac{n+1}{2^{n+1}}\sum_{k=1}^{n+1}\frac{2^k}{k} = (n+1)\sum_{k=0}^n \frac{1}{(n+1-k)2^k}.\quad\quad(6)$

Firstly note that when $n=0$, the three expressions in (6) become $\frac{1}{\binom{0}{0}}$, $\frac{1}{2^{1}}\frac{2^1}{1}$ and $\frac{1}{1}$, which are all equal to 1. Hence (6) holds for $n=0$. Assume it holds for $n =m$. That is,

$\displaystyle S(m) := \sum_{k=0}^{m}\frac{1}{\binom{m}{k}} = \frac{m+1}{2^{m+1}}\sum_{k=1}^{m+1}\frac{2^k}{k} = (m+1)\sum_{k=0}^m \frac{1}{(m+1-k)2^k}.\quad\quad(7)$

Then using (5), on the one hand,

\begin{aligned} S(m+1) &= 1 + \frac{m+2}{2(m+1)}S(m)\\ &= 1 + \frac{m+2}{2(m+1)} \frac{m+1}{2^{m+1}}\sum_{k=1}^{m+1}\frac{2^k}{k} \\ &= 1 + \frac{m+2}{2^{m+2}}\sum_{k=1}^{m+1}\frac{2^k}{k}\\ &= \frac{(m+2)2^{m+2}}{2^{m+2}(m+2)} + \frac{m+2}{2^{m+2}}\sum_{k=1}^{m+1}\frac{2^k}{k}\\ &= \frac{m+2}{2^{m+2}}\sum_{k=1}^{m+2}\frac{2^k}{k},\quad\quad(8) \end{aligned}

while on the other,

\begin{aligned}S(m+1) &= 1 + \frac{m+2}{2(m+1)}S(m)\\ &= 1 + \frac{m+2}{2(m+1)} (m+1)\sum_{k=0}^m \frac{1}{(m+1-k)2^k}\\ &= 1 + (m+2)\sum_{k=0}^m \frac{1}{(m+1-k)2^{k+1}}\\ &= 1 + (m+2)\sum_{k=0}^m \frac{1}{(m+2-(k+1))2^{k+1}}\\ &= 1 + (m+2)\sum_{k=1}^{m+1} \frac{1}{(m+2-k)2^{k}}\\ &= (m+2)\sum_{k=0}^{m+1} \frac{1}{(m+2-k)2^k}.\quad \quad(9) \end{aligned}

Equations (8) and (9) show that if (6) holds for $n=m$, then it does for $n=m+1$. By the principle of mathematical induction, (6) then is true for all integers $n\geq 0$.
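The recurrence (5) and the two expressions in (6) can also be checked numerically. The following is a minimal sketch using exact fractions; the helper names `S`, `closed_form_a` and `closed_form_b` are hypothetical labels for the three expressions in (6).

```python
from fractions import Fraction
from math import comb

def S(n):
    """Sum of reciprocals of the binomial coefficients C(n, 0), ..., C(n, n)."""
    return sum(Fraction(1, comb(n, k)) for k in range(n + 1))

def closed_form_a(n):
    """Middle expression of (6)."""
    return Fraction(n + 1, 2 ** (n + 1)) * sum(Fraction(2 ** k, k) for k in range(1, n + 2))

def closed_form_b(n):
    """Right-hand expression of (6)."""
    return (n + 1) * sum(Fraction(1, (n + 1 - k) * 2 ** k) for k in range(n + 1))

for n in range(0, 15):
    # Recurrence (5)
    assert S(n + 1) == Fraction(n + 2, 2 * (n + 1)) * S(n) + 1
    # Both forms in (6)
    assert S(n) == closed_form_a(n) == closed_form_b(n)
```

For example `S(2)` is $1 + \frac{1}{2} + 1 = \frac{5}{2}$, which agrees with $\frac{3}{4}\cdot 2 + 1$ from (5).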

Another related sum that can be expressed in terms of $S(n)$ is $\displaystyle \frac{1}{\binom{2n}{0}} + \frac{1}{\binom{2n}{2}} + \frac{1}{\binom{2n}{4}} + \ldots + \frac{1}{\binom{2n}{2n}}$. We have

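\begin{aligned}& \sum_{k=0}^{n} \frac{1}{\binom{2n}{2k}}\\&= \frac{2n+1}{2n+2}\sum_{k=0}^{n} \left(\frac{1}{\binom{2n+1}{2k}} + \frac{1}{\binom{2n+1}{2k+1}}\right)\quad\quad \text{(by (2))}\\ &= \frac{2n+1}{2n+2}\sum_{k=0}^{2n+1} \frac{1}{\binom{2n+1}{k}}\\ &= \frac{2n+1}{2n+2} S(2n+1).\quad\quad(10) \end{aligned}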
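A quick exact check of (10) for small $n$, where the left side is read as the sum of $1/\binom{2n}{j}$ over even $j$ from $0$ to $2n$. The names `S` and `even_index_sum` are illustrative only.

```python
from fractions import Fraction
from math import comb

def S(n):
    """Sum of reciprocals of C(n, k) for k = 0..n."""
    return sum(Fraction(1, comb(n, k)) for k in range(n + 1))

def even_index_sum(n):
    """1/C(2n,0) + 1/C(2n,2) + ... + 1/C(2n,2n)."""
    return sum(Fraction(1, comb(2 * n, 2 * k)) for k in range(n + 1))

for n in range(0, 10):
    assert even_index_sum(n) == Fraction(2 * n + 1, 2 * n + 2) * S(2 * n + 1)
```

For $n=1$ both sides equal 2, since $S(3) = 8/3$.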

Note that more identities with inverses of binomial coefficients are in [2] (and references therein), where the integral

$\displaystyle \frac{1}{\binom{n}{k}} = (n+1) \int_0^1 t^k (1-t)^{n-k}\ \text{d}t$

is utilised.
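The integral representation can be verified exactly for small cases by expanding $(1-t)^{n-k}$ with the binomial theorem and integrating term by term. This is only a sketch of that check (the function name `beta_integral` is my own), not the method used in [2].

```python
from fractions import Fraction
from math import comb

def beta_integral(k, n):
    """Exact value of the integral of t^k (1-t)^(n-k) over [0, 1].

    Uses (1-t)^(n-k) = sum_j C(n-k, j) (-t)^j and integrates each power of t.
    """
    return sum(Fraction((-1) ** j * comb(n - k, j), k + j + 1) for j in range(n - k + 1))

for n in range(0, 10):
    for k in range(n + 1):
        assert (n + 1) * beta_integral(k, n) == Fraction(1, comb(n, k))
```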

#### References

[1] A. W. Plank, Mathematical Olympiads: The 1990 Australian Scene, University of Canberra, 1990.

[2] T. Mansour, Combinatorial Identities and Inverse Binomial Coefficients, available at http://math.haifa.ac.il/toufik/toupap02/qap0204.pdf

## May 31, 2014

### The world’s fastest growing countries by population

Filed under: geography — ckrao @ 7:24 am

I was amazed to learn recently how rapidly the population of some African countries is increasing. The following table shows those countries that grew by at least half a million people in the mid 2012 to mid 2013 year (see references below). Note that Nigeria is at least 50% ahead of every country other than India and China. Also note the complete absence of European countries (Italy had the largest growth there of 416,000 which is only 47th in the world). Note that the figures are estimates only.

| Country | Continent | Annual Population Growth (mid 2012 – mid 2013) |
| --- | --- | --- |
| India | Asia | 20,290,000 |
| China | Asia | 6,688,000 |
| Nigeria | Africa | 5,551,000 |
| Pakistan | Asia | 3,696,000 |
| Indonesia | Asia | 3,553,000 |
| Democratic Republic of Congo | Africa | 2,334,000 |
| United States | North America | 2,281,000 |
| Ethiopia | Africa | 2,253,000 |
| Bangladesh | Asia | 2,081,000 |
| Mexico | North America | 2,026,000 |
| Egypt | Africa | 1,893,000 |
| Philippines | Asia | 1,825,000 |
| Brazil | South America | 1,685,000 |
| Kenya | Africa | 1,266,000 |
| Uganda | Africa | 1,232,000 |
| Tanzania | Africa | 1,204,000 |
| Myanmar | Asia | 1,160,000 |
| Iraq | Asia | 1,051,000 |
| Saudi Arabia | Asia | 997,000 |
| Iran | Asia | 976,000 |
| Vietnam | Asia | 922,000 |
| Sudan | Africa | 863,000 |
| Algeria | Africa | 792,000 |
| Mozambique | Africa | 790,000 |
| Malaysia | Asia | 734,000 |
| Yemen | Asia | 725,000 |
| Ivory Coast | Africa | 717,000 |
| South Africa | Africa | 704,000 |
| Ghana | Africa | 659,000 |
| Niger | Africa | 649,000 |
| Angola | Africa | 647,000 |
| Madagascar | Africa | 585,000 |
| Burkina Faso | Africa | 550,000 |
| Colombia | South America | 544,000 |
| Cameroon | Africa | 543,000 |
| Mali | Africa | 532,000 |
| Syria | Asia | 531,000 |
| Afghanistan | Asia | 517,000 |
| Thailand | Asia | 508,000 |

The next table shows that more than 6/7 of the world’s current population growth is from Asia or Africa.

| Continent | Annual population growth in millions from mid 2012 to mid 2013 (% of world total) |
| --- | --- |
| Asia | 51.3 (55.2%) |
| Africa | 29.3 (31.5%) |
| North America | 6.2 (6.6%) |
| South America | 4.4 (4.7%) |
| Europe | 1.1 (1.2%) |
| Oceania | 0.66 (0.7%) |
| World | 92.9 |

## May 17, 2014

### Forms of Stewart’s theorem

Filed under: mathematics — ckrao @ 10:41 am

Stewart’s theorem finds the length of a cevian $d=AD$ in terms of the side lengths $a, b, c$ of the triangle $ABC$ and the lengths $m = BD$, $n = DC$ into which point $D$ on $BC$ divides that side. Here are some forms of the same formula.

1. The most common form we see is

$\displaystyle b^2 m + c^2 n = a(d^2 + mn) \quad \Rightarrow \quad d^2 = \frac{b^2m + c^2n}{a} - mn.\quad\quad(1)$

An easy-to-remember form of this is obtained by rewriting the above as $man + dad = bmb + cnc$ (a man and his dad hid a bomb in the sink!). This can be proved by applying the cosine rule to triangles $ACD$ and then $ABC$:

\begin{aligned} d^2 &= AD^2\\ &= AC^2 + CD^2 - 2AC.CD\cos \angle DCA\\ &= AC^2 + CD^2 - 2AC.CD \cos \angle BCA\\ &= AC^2 + CD^2 - 2AC.CD \frac{CA^2 + CB^2 - AB^2}{2CA.CB}\\ &= b^2 + n^2 - 2bn \frac{b^2 + a^2 - c^2}{2ba}\\ &= b^2 + n^2 - n \frac{b^2 + a^2 - c^2}{a}\\ &= \frac{b^2(m+n) - n(b^2 + a^2 - c^2)}{a} + n^2\\ &= \frac{b^2 m + c^2n}{a} + n^2 - \frac{na^2}{a}\\ &= \frac{b^2 m + c^2n}{a} + n(n-a)\\ &= \frac{b^2 m + c^2n}{a} - mn.\\ \end{aligned}

2. If $D$ divides the side $BC$ in the ratio $BD:DC = r:s$,

$\displaystyle d^2 = \frac{rb^2 + sc^2}{r+s} - \frac{rsa^2}{(r+s)^2} = \frac{r^2b^2 + s^2c^2 + rs(b^2 + c^2 - a^2)}{(r+s)^2} .\quad\quad(2)$

3. Similar to (2) but substituting $\displaystyle \lambda = r/(r+s) = BD:BC$,

$\displaystyle d^2 = \lambda b^2 + (1-\lambda)c^2 - \lambda(1-\lambda)a^2.\quad\quad(3)$

This and the previous form are conveniently proved using vectors. Writing the vector $\mathbf{AD} = \lambda \mathbf{AC} + (1-\lambda) \mathbf{AB}$,

\begin{aligned} d^2 &= \mathbf{AD}. \mathbf{AD}\\ &= \left(\lambda \mathbf{AC} + (1-\lambda) \mathbf{AB}\right).\left(\lambda \mathbf{AC} + (1-\lambda) \mathbf{AB}\right)\\ &= \lambda^2 \mathbf{AC}.\mathbf{AC} + (1-\lambda)^2 \mathbf{AB}.\mathbf{AB} + 2\lambda(1-\lambda)\mathbf{AB}.\mathbf{AC}\\ &= \lambda^2 b^2 + (1-\lambda)^2 c^2 + \lambda(1-\lambda)\left(b^2 + c^2 - a^2\right)\\ &= b^2(\lambda^2 + \lambda(1-\lambda)) + c^2((1-\lambda)^2 + \lambda(1-\lambda)) - a^2\lambda(1-\lambda)\\ &= \lambda b^2 + (1-\lambda)c^2 - \lambda(1-\lambda)a^2. \end{aligned}

Note that this is valid for any real $\lambda$, so $D$ may lie beyond segment $BC$.
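To see form (3) at work for $\lambda$ both inside and outside $[0,1]$, here is a short Python sketch comparing it with a direct coordinate computation. The triangle coordinates and function names are arbitrary choices for illustration.

```python
import math

def cevian_length_sq(a, b, c, lam):
    """d^2 from form (3), where lam = BD:BC as a directed ratio."""
    return lam * b**2 + (1 - lam) * c**2 - lam * (1 - lam) * a**2

def cevian_length_sq_coords(A, B, C, lam):
    """d^2 computed directly from D = B + lam * (C - B)."""
    D = (B[0] + lam * (C[0] - B[0]), B[1] + lam * (C[1] - B[1]))
    return (A[0] - D[0]) ** 2 + (A[1] - D[1]) ** 2

# An arbitrary triangle: a = BC, b = CA, c = AB.
A, B, C = (1.0, 4.0), (0.0, 0.0), (6.0, 0.0)
a, b, c = math.dist(B, C), math.dist(C, A), math.dist(A, B)

for lam in (-0.5, 0.0, 0.25, 0.5, 1.0, 1.7):
    assert math.isclose(cevian_length_sq(a, b, c, lam),
                        cevian_length_sq_coords(A, B, C, lam))
```

The values $\lambda = -0.5$ and $\lambda = 1.7$ place $D$ on the line $BC$ outside the segment, and the formula still agrees.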

4. Writing (3) as a quadratic in $\lambda$:

$\displaystyle d^2 = \lambda^2 a^2 + \lambda(b^2-c^2 -a^2) + c^2.\quad\quad(4)$

5. A symmetric form [1], where the following distances are taken as directed segments ($CD = -DC$ etc.)

$\displaystyle \frac{BA^2}{BC.BD} + \frac{CA^2}{CB.CD} + \frac{DA^2}{DB.DC} = 1\quad\quad(5)$

Note that this is equivalent to $BA^2 . CD + CA^2 . DB = DA^2.CB + CD.DB.CB = CB(DA^2 + CD.DB)$ which is (1).

Here are a few special cases of this formula applying form (3).

• $D = C$ (i.e. $\lambda = 1)$: $d^2 = b^2$
• $D$ is the midpoint of $BC$ ($\lambda = 1/2$): $\displaystyle d^2 = b^2/2 + c^2/2 - a^2/4$ or $\displaystyle b^2 + c^2 = 2(d^2 + (a/2)^2)$ (Apollonius’ theorem)
• $D$ is a third of the way along $CB$ (closer to $C$) ($\lambda = 2/3$): $\displaystyle d^2 = 2b^2/3 + c^2/3 - 2a^2/9$
• $AD$ is the internal angle bisector of $\angle BAC$ ($\lambda = c/(b+c)$):

\begin{aligned} d^2 &= cb^2/(b+c) + bc^2/(b+c) - bca^2/(b+c)^2\\ &= bc\left[\frac{b}{b+c} + \frac{c}{b+c}- \left(\frac{a}{b+c}\right)^2\right]\\ &= bc\left[1 - \left(\frac{a}{b+c}\right)^2\right]\\ &= bc - mn. \end{aligned}

• $AD$ is the external angle bisector of $\angle BAC$ (assume $b > c$ so $\lambda = -c/(b-c)$):

\begin{aligned} d^2 &= \frac{-c}{b-c} b^2 + \frac{b}{b-c}c^2 + \frac{bc}{(b-c)^2}a^2\\ &= bc\left[\frac{-b}{b-c} + \frac{c}{b-c}+ \left(\frac{a}{b-c}\right)^2\right]\\ &= bc\left[ \left(\frac{a}{b-c}\right)^2 - 1\right]\\ &= mn - bc. \end{aligned}
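The special cases above can be checked with exact arithmetic on a concrete triangle. The sketch below uses the 13-14-15 triangle (my own choice) and compares the internal bisector against the standard half-angle expression $d^2 = 2b^2c^2(1+\cos A)/(b+c)^2$, which is an independent formula not derived in this post.

```python
from fractions import Fraction

def cevian_sq(a, b, c, lam):
    """Form (3): squared length of AD when BD:BC = lam."""
    return lam * b**2 + (1 - lam) * c**2 - lam * (1 - lam) * a**2

a, b, c = 14, 15, 13  # a = BC, b = CA, c = AB

# Median (Apollonius' theorem).
median_sq = cevian_sq(a, b, c, Fraction(1, 2))
assert median_sq == Fraction(2 * b**2 + 2 * c**2 - a**2, 4)

# Internal angle bisector: d^2 = bc - mn.
lam = Fraction(c, b + c)
m, n = lam * a, (1 - lam) * a
bisector_sq = cevian_sq(a, b, c, lam)
assert bisector_sq == b * c - m * n
cos_A = Fraction(b**2 + c**2 - a**2, 2 * b * c)
assert bisector_sq == 2 * b**2 * c**2 * (1 + cos_A) / (b + c) ** 2

# External angle bisector (b > c): d^2 = mn - bc, with m = |BD| and n = DC.
lam_ext = Fraction(-c, b - c)
m_ext, n_ext = abs(lam_ext) * a, (1 - lam_ext) * a
external_sq = cevian_sq(a, b, c, lam_ext)
assert external_sq == m_ext * n_ext - b * c
```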

## April 29, 2014

### Historical world population distribution

Filed under: geography — ckrao @ 2:31 pm

The following plots show the percentage distribution of the world’s population over a 2400 year period. To my surprise China had over a third of the world population during much of the 1800s, while the Indian subcontinent has historically had a higher share than now. It’s interesting to see the growth of the New World in the bottom right plot too. One expects Asia minus China and Africa to have significant proportional growth at least in the first half of this century.

#### Reference

http://www.worldhistorysite.com/population.html – using Colin McEvedy and Richard Jones, Atlas of World Population History (Penguin, 1978)

## April 25, 2014

### Three and four tangent circles

Filed under: mathematics — ckrao @ 12:14 pm

If we are given two tangent circles how do we find a third circle tangent to both of them? If the two circles are internally tangent, the third circle is also internally tangent to the larger circle.

If the two circles are externally tangent, the third circle is either internally or externally tangent to both.

Note that we disregard cases where the third circle has the same point of tangency as the other two such as shown below.

Locus of centre of third circle

If we let the centres of the three circles $C_1, C_2, C_3$ be $O_1, O_2, O_3$ and with respective radii $R_1, R_2, R_3$ (assume $R_1 > R_2$) then what is the locus of $O_3$ as $O_1, R_1, O_2, R_2$ are fixed while $R_3$ is allowed to vary?

We note in the internally tangent case,

$O_2 O_3 + O_3 O_1 = (R_2 + R_3) + (R_1 - R_3) = R_1 + R_2.\quad\quad (1)$

Since this distance is independent of $R_3$, $O_3$ is on an ellipse with foci at $O_1$ and $O_2$ shown in green below.

In the externally tangent case, there are two subcases depending on whether the third circle is internally or externally tangent to the other two:

$O_3 O_2 - O_3 O_1 = (R_3 - R_2) - (R_3 - R_1) = R_1 - R_2.\quad\quad (2)$

$O_3 O_1 - O_3 O_2 = (R_3 + R_1) - (R_3 + R_2) = R_1 - R_2.\quad\quad (3)$

This corresponds to the two branches of a hyperbola with foci at $O_1$ and $O_2$ shown in green below.

Notice that similar results hold when the original two circles are not necessarily tangent to each other.

Distances

Denoting the points of tangency between $C_2, C_3$ by $T_1$ and similarly defining $T_2, T_3$, we shall find some distances in the figure below.

By the cosine rule in $\triangle O_1O_2O_3$,

\begin{aligned} \cos \angle O_1 O_2 O_3 &= \frac{O_1 O_2^2 + O_2O_3^2 - O_1O_3^2}{2O_1 O_2. O_2O_3}\\ &= \frac{(R_1 - R_2)^2 + (R_2 + R_3)^2- (R_1 - R_3)^2 }{2(R_1 - R_2)(R_2 + R_3)}\\ &=\frac{2R_2^2 - 2R_1 R_2 + 2R_2 R_3 + 2R_1 R_3}{2(R_1 - R_2)(R_2 + R_3)}\\ &= \frac{-(R_1 - R_2)(R_2+R_3) + 2R_1R_3}{(R_1 - R_2)(R_2 + R_3)}\\ &= -1 + \frac{2R_1R_3}{(R_1 - R_2)(R_2 + R_3)}.\quad\quad (4) \end{aligned}

The perimeter of $\triangle O_1O_2O_3$ is $(R_1 - R_2) + (R_1 - R_3) + (R_2 + R_3) = 2R_1$, so by Heron’s formula its area is

$\displaystyle \sqrt{R_1(R_1 - (R_1 - R_2)) (R_1 - (R_1 - R_3)) (R_1 - (R_2 + R_3))} = \sqrt{R_1R_2 R_3(R_1 - R_2 - R_3)}.\quad\quad (5)$

If the points $T_3, O_1, O_2$ have coordinates $(0,0), (R_1,0), (R_2,0)$ respectively, then $O_3$ has coordinates $(R_2 + (R_2 + R_3)\cos \angle O_1 O_2 O_3, (R_2 + R_3)\sin \angle O_1 O_2 O_3)$. We have $\cos \angle O_1 O_2 O_3$ found above and $\sin \angle O_1 O_2 O_3$ can be found via

$\displaystyle \frac{1}{2}O_1O_2.O_2O_3 \sin \angle O_1 O_2 O_3 = |O_1O_2O_3| = \sqrt{R_1R_2 R_3(R_1 - R_2 - R_3)}.\quad\quad (6)$

Hence the coordinates of $O_3$ are

\begin{aligned} & (R_2 + (R_2 + R_3)\cos \angle O_1 O_2 O_3, (R_2 + R_3)\sin \angle O_1 O_2 O_3)\\ &= \left(R_2 + (R_2 + R_3)\left(-1 + \frac{2R_1R_3}{(R_1 - R_2)(R_2 + R_3)}\right), (R_2 + R_3)2\frac{\sqrt{R_1R_2 R_3(R_1 - R_2 - R_3)}}{O_1 O_2.O_2O_3}\right)\\ &= \left(R_2 - R_2 - R_3+ \frac{2R_1R_3}{(R_1 - R_2)}, (R_2 + R_3)2\frac{\sqrt{R_1R_2 R_3(R_1 - R_2 - R_3)}}{(R_1-R_2)(R_2+R_3)}\right)\\ &= \left(\frac{R_3(R_1 + R_2)}{R_1 - R_2}, \frac{2\sqrt{R_1R_2 R_3(R_1 - R_2 - R_3)}}{R_1-R_2}\right).\quad\quad (7) \end{aligned}

Note here that the first coordinate varies with $R_3$ linearly for $R_1, R_2$ fixed. The distance $T_3O_3$ is found from

\begin{aligned} T_3O_3^2 &= \frac{1}{(R_1 - R_2)^2}\left([R_3(R_1 + R_2)]^2 + 4R_1R_2R_3(R_1 - R_2 - R_3)\right)\\ &= \frac{1}{(R_1 - R_2)^2}\left(R_3^2 ((R_1 + R_2)^2 - 4R_1 R_2) + 4R_1 R_2 R_3(R_1 - R_2)\right)\\ &= \frac{1}{(R_1 - R_2)^2}\left(R_3^2 (R_1 - R_2)^2 + 4R_1 R_2 R_3(R_1 - R_2)\right)\\ &= R_3^2 + \frac{4R_1 R_2 R_3}{R_1 - R_2}.\quad\quad (8) \end{aligned}

Hence we have found that interestingly the power of the tangency point $T_3$ with respect to circle $C_3$ is the relatively simple form $\frac{4R_1R_2R_3}{R_1 - R_2}$.

We may also find the distance between points of tangency $T_1T_3$ as $2R_2 \cos (\angle O_1 O_2 O_3/2)$ using the identity $\cos^2 (A/2) = (1 + \cos A)/2$. From (4),

\begin{aligned} T_1 T_3 &= 2R_2 \cos (\angle O_1 O_2 O_3/2)\\ &= 2R_2 \sqrt{\frac{R_1R_3}{(R_1 - R_2)(R_2 + R_3)}}\\ &= \frac{2\sqrt{R_1R_2R_3} \sqrt{R_2}}{\sqrt{(R_1 - R_2)(R_2 + R_3)}}.\quad\quad(9) \end{aligned}
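Formulas (7), (8) and (9) are easy to test numerically. The sketch below places $T_3$ at the origin as in the text, computes $O_3$ from (7) and checks the tangency distances directly; the function names and sample radii are my own.

```python
import math

def third_centre_internal(R1, R2, R3):
    """Centre O3 from (7), with T3 at the origin and O1, O2 on the positive x-axis."""
    x = R3 * (R1 + R2) / (R1 - R2)
    y = 2 * math.sqrt(R1 * R2 * R3 * (R1 - R2 - R3)) / (R1 - R2)
    return (x, y)

def check_internal(R1, R2, R3):
    """Return True if (7), (8) and (9) agree with the geometry for these radii."""
    O1, O2 = (R1, 0.0), (R2, 0.0)
    O3 = third_centre_internal(R1, R2, R3)
    # C3 is internally tangent to C1 and externally tangent to C2.
    ok = math.isclose(math.dist(O3, O1), R1 - R3)
    ok = ok and math.isclose(math.dist(O3, O2), R2 + R3)
    # (8): squared distance from T3 (the origin) to O3.
    ok = ok and math.isclose(O3[0] ** 2 + O3[1] ** 2,
                             R3**2 + 4 * R1 * R2 * R3 / (R1 - R2))
    # (9): T1 lies on segment O2O3 at distance R2 from O2.
    t = R2 / (R2 + R3)
    T1 = (O2[0] + t * (O3[0] - O2[0]), O2[1] + t * (O3[1] - O2[1]))
    ok = ok and math.isclose(math.hypot(*T1),
                             2 * R2 * math.sqrt(R1 * R3 / ((R1 - R2) * (R2 + R3))))
    return ok

assert check_internal(3.0, 1.0, 1.0)
assert check_internal(5.0, 2.0, 1.5)
```

For radii 3, 1, 1 the centre is $(2, \sqrt{3})$ and the power of $T_3$ with respect to $C_3$ is 6.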

Next we perform the same calculations (4)-(9) with the configuration of three circles tangent externally.

By the cosine rule in $\triangle O_1O_2O_3$,

\begin{aligned} \cos \angle O_1 O_2 O_3 &= \frac{O_1 O_2^2 + O_2O_3^2 - O_1O_3^2}{2O_1 O_2. O_2O_3}\\ &= \frac{(R_1 + R_2)^2 + (R_2 + R_3)^2- (R_1 + R_3)^2 }{2(R_1 + R_2)(R_2 + R_3)}\\ &=\frac{2R_2^2 + 2R_1 R_2 + 2R_2 R_3 - 2R_1 R_3}{2(R_1 + R_2)(R_2 + R_3)}\\ &= \frac{(R_1 + R_2)(R_2+R_3) - 2R_1R_3}{(R_1 + R_2)(R_2 + R_3)}\\ &= 1 - \frac{2R_1R_3}{(R_1 + R_2)(R_2 + R_3)}. \quad\quad (10) \end{aligned}

The perimeter of $\triangle O_1O_2O_3$ is $(R_1 + R_2) + (R_1 + R_3) + (R_2 + R_3) = 2(R_1+R_2+R_3)$, so by Heron’s formula its area is the attractive form

$\displaystyle \sqrt{R_1R_2 R_3(R_1 + R_2 + R_3)}.\quad\quad (11)$

If the points $T_3, O_1, O_2$ have coordinates $(0,0), (-R_1,0), (R_2,0)$ respectively, then $O_3$ has coordinates $(R_2 - (R_2 + R_3)\cos \angle O_1 O_2 O_3, (R_2 + R_3)\sin \angle O_1 O_2 O_3)$. We have $\cos \angle O_1 O_2 O_3$ found above and $\sin \angle O_1 O_2 O_3$ can be found via

$\displaystyle \frac{1}{2}O_1O_2.O_2O_3 \sin \angle O_1 O_2 O_3 = |O_1O_2O_3| = \sqrt{R_1R_2 R_3(R_1 + R_2 + R_3)}.\quad\quad (12)$

Hence the coordinates of $O_3$ are

\begin{aligned} & (R_2 - (R_2 + R_3)\cos \angle O_1 O_2 O_3, (R_2 + R_3)\sin \angle O_1 O_2 O_3)\\ &= \left(R_2 - (R_2 + R_3)\left(1 - \frac{2R_1R_3}{(R_1 + R_2)(R_2 + R_3)}\right), (R_2 + R_3)2\frac{\sqrt{R_1R_2 R_3(R_1 + R_2 + R_3)}}{O_1 O_2.O_2O_3}\right)\\ &= \left(R_2 - R_2 - R_3+ \frac{2R_1R_3}{(R_1 + R_2)}, (R_2 + R_3)2\frac{\sqrt{R_1R_2 R_3(R_1 +R_2 + R_3)}}{(R_1+R_2)(R_2+R_3)}\right)\\ &= \left(\frac{R_3(R_1 - R_2)}{R_1 + R_2}, \frac{2\sqrt{R_1R_2 R_3(R_1 + R_2 + R_3)}}{R_1+R_2}\right).\quad\quad (13) \end{aligned}

Note again that the first coordinate varies with $R_3$ linearly for $R_1, R_2$ fixed. Formulas (7) and (13) were used to generate the diagrams in this post.

The distance $T_3O_3$ is found from

\begin{aligned} T_3O_3^2 &= \frac{1}{(R_1 + R_2)^2}\left([R_3(R_1 - R_2)]^2 + 4R_1R_2R_3(R_1 + R_2 + R_3)\right)\\ &= \frac{1}{(R_1 + R_2)^2}\left(R_3^2 ((R_1 - R_2)^2 + 4R_1 R_2) + 4R_1 R_2 R_3(R_1 + R_2)\right)\\ &= \frac{1}{(R_1 + R_2)^2}\left(R_3^2 (R_1 + R_2)^2 + 4R_1 R_2 R_3(R_1 + R_2)\right)\\ &= R_3^2 + \frac{4R_1 R_2 R_3}{R_1 + R_2}.\quad\quad (14) \end{aligned}

Hence we have found that the power of the tangency point $T_3$ with respect to circle $C_3$ is $\frac{4R_1R_2R_3}{R_1 + R_2}$.

The distance between points of tangency $T_1T_3$ is $2R_2 \sin (\angle O_1 O_2 O_3/2)$ and using the identity $\sin^2 (A/2) = (1 - \cos A)/2$ with (10),

\begin{aligned} T_1 T_3 &= 2R_2 \sin (\angle O_1 O_2 O_3/2)\\ &= 2R_2 \sqrt{\frac{R_1R_3}{(R_1 + R_2)(R_2 + R_3)}}\\ &= \frac{2\sqrt{R_1R_2R_3} \sqrt{R_2}}{\sqrt{(R_1 + R_2)(R_2 + R_3)}}.\quad\quad(15) \end{aligned}

Many of the formulas here are found from the first case by replacing $R_1$ with $-R_1$. Other distances between points in the figure can be found through similar means.
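The externally tangent formulas (13), (14) and (15) can be checked in the same way. Again this is only a numerical sketch with hypothetical function names and sample radii.

```python
import math

def third_centre_external(R1, R2, R3):
    """Centre O3 from (13), with T3 at the origin, O1 at (-R1, 0) and O2 at (R2, 0)."""
    x = R3 * (R1 - R2) / (R1 + R2)
    y = 2 * math.sqrt(R1 * R2 * R3 * (R1 + R2 + R3)) / (R1 + R2)
    return (x, y)

def check_external(R1, R2, R3):
    """Return True if (13), (14) and (15) agree with the geometry for these radii."""
    O1, O2 = (-R1, 0.0), (R2, 0.0)
    O3 = third_centre_external(R1, R2, R3)
    # All three circles are externally tangent.
    ok = math.isclose(math.dist(O3, O1), R1 + R3)
    ok = ok and math.isclose(math.dist(O3, O2), R2 + R3)
    # (14): squared distance from T3 (the origin) to O3.
    ok = ok and math.isclose(O3[0] ** 2 + O3[1] ** 2,
                             R3**2 + 4 * R1 * R2 * R3 / (R1 + R2))
    # (15): T1 lies on segment O2O3 at distance R2 from O2.
    t = R2 / (R2 + R3)
    T1 = (O2[0] + t * (O3[0] - O2[0]), O2[1] + t * (O3[1] - O2[1]))
    ok = ok and math.isclose(math.hypot(*T1),
                             2 * R2 * math.sqrt(R1 * R3 / ((R1 + R2) * (R2 + R3))))
    return ok

assert check_external(3.0, 1.0, 1.0)
assert check_external(2.0, 1.0, 0.5)
```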

Finally, we remark that the radical axes of each pair of circles intersect at the radical centre, which is the circumcentre of the triangle formed by the three points of tangency. Since the circumcircle of that triangle is orthogonal to the three existing circles, inversion in it preserves the figure.

Four circles

For the case of four mutually tangential circles (where there is more than one point of tangency), there is an amazing relationship involving the complex number coordinates $z_1, z_2, z_3, z_4$ of the centres of the circles and their curvatures (reciprocals of radii) $k_i= 1/R_i$:

$\displaystyle 2\left((k_1z_1)^2 + (k_2z_2)^2 + (k_3z_3)^2 + (k_4z_4)^2\right) = \left(k_1z_1 + k_2z_2 + k_3z_3 + k_4z_4\right)^2.\quad\quad (16)$

This is known as the complex Descartes theorem. Note that for this formula to be valid for internally tangent circles, we make the sign of the curvature of the larger circle negative.

Interestingly this was only proven in 2001 in [1] with a nice proof shown in [2]. That proof assigns a sphere to each of the six points of tangency with curvature equal to the sums of the curvatures of the two circles meeting there. It then turns out that three spheres corresponding to the tangency points on the same circle are mutually tangent. If one does the same construction of spheres for the dual configuration of circles shown in blue below (the original four circles are in black), the same set of six spheres is obtained!

Note that we have more results from (16). Since the relationships remain when the points are translated by the same complex number $z$ we have

$\displaystyle 2\left((k_1(z_1-z))^2 + (k_2(z_2-z))^2 + (k_3(z_3-z))^2 + (k_4(z_4-z))^2\right) = \left(k_1(z_1-z) + k_2(z_2-z) + k_3(z_3-z) + k_4(z_4-z)\right)^2.\quad\quad (17)$

Expanding this as a quadratic in $z$ and comparing coefficients of $z^2$ and $z$ [1] gives two further relationships:

\begin{aligned} 2\left(k_1^2 + k_2^2 + k_3^2 + k_4^2\right) &= \left(k_1 + k_2 + k_3 + k_4\right)^2, \quad\quad & (18)\\ 2\left(k_1^2 z_1 + k_2^2 z_2 + k_3^2 z_3 + k_4^2 z_4\right) &= \left(k_1 + k_2 + k_3 + k_4\right)\left(k_1 z_1 + k_2 z_2 + k_3 z_3 + k_4 z_4 \right).\quad\quad& (19) \end{aligned}

Equation (18) is known as Descartes’ Theorem, discovered in 1643.

To see (16) and (18) in action, let us set for example $z_1 = 3, R_1 = 3, z_2 = 1, R_2 = 1, R_3 = 1$. Then applying (7) we find

\begin{aligned} z_3 &= \frac{R_3(R_1 + R_2)}{R_1 - R_2} + \frac{2\sqrt{R_1R_2R_3(R_1 - R_2 - R_3)}}{R_1 - R_2}i\\ &= \frac{1(3+1)}{3-1} + \frac{2\sqrt{3.1.1(3-1-1)}}{3-1}i\\ &= 2 + i\sqrt{3}.\quad\quad(20) \end{aligned}

From (18) we may write

$k_4 = k_1 + k_2 + k_3 \pm 2\sqrt{k_1 k_2 + k_2 k_3 + k_1 k_3},\quad\quad(21)$

which for $k_1 = -1/3$ (negative as the larger circle is internally tangent), $k_2 = k_3 = 1$ leads to $k_4 = 5/3 \pm 2\sqrt{1/3}$, or $R_4 = 1/k_4 = 3(5 \mp 2\sqrt{3})/13$.

From (16) we may write

$z_4 = (k_1z_1 + k_2z_2 + k_3z_3 \pm 2\sqrt{k_1 k_2z_1z_2 + k_2 k_3z_2z_3 + k_1 k_3z_1z_3})/k_4,\quad\quad(22)$

which for $k_1 z_1 = -1, k_2 z_2 = 1, k_3 z_3 = 2 + i\sqrt{3}, k_4 = 5/3 \pm 2\sqrt{1/3}$ leads to

\begin{aligned} z_4 &= \left(-1 + 1 + 2 + \sqrt{3}i \pm 2\sqrt{-1}\right)/(5/3 \pm 2\sqrt{1/3})\\ &=\frac{2+(\sqrt{3}+ 2)i}{5/3 + 2/\sqrt{3}}\quad \text{or}\quad\frac{2+(\sqrt{3}- 2)i}{5/3 - 2/\sqrt{3}}.\quad\quad(23)\end{aligned}

This is used to generate the sketch below. As can be seen there are two possible circles shown in red.
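The worked example can be double-checked numerically. The sketch below takes the curvatures and centres from (20), (21) and (23) and confirms that each candidate fourth circle is tangent to the other three and satisfies (16) and (18). The helper names and tolerance are my own choices.

```python
import math

s3 = math.sqrt(3)
# Signed curvatures and centres of C1, C2, C3 (k1 < 0 because C1 encloses the others).
k = [-1 / 3, 1.0, 1.0]
z = [3 + 0j, 1 + 0j, complex(2, s3)]
# The two candidate fourth circles (k4, z4) as given in (21) and (23).
candidates = [
    (5 / 3 + 2 / s3, complex(2, s3 + 2) / (5 / 3 + 2 / s3)),
    (5 / 3 - 2 / s3, complex(2, s3 - 2) / (5 / 3 - 2 / s3)),
]

def tangent(za, ka, zb, kb, tol=1e-9):
    """With signed curvatures, tangent circles have centres |1/ka + 1/kb| apart."""
    return abs(abs(za - zb) - abs(1 / ka + 1 / kb)) < tol

def satisfies_descartes(ks, zs, tol=1e-9):
    """Check (18) and (16) for four signed curvatures and complex centres."""
    real_ok = abs(2 * sum(ki**2 for ki in ks) - sum(ks) ** 2) < tol
    kz = [ki * zi for ki, zi in zip(ks, zs)]
    complex_ok = abs(2 * sum(w**2 for w in kz) - sum(kz) ** 2) < tol
    return real_ok and complex_ok

for k4, z4 in candidates:
    assert all(tangent(z4, k4, zi, ki) for zi, ki in zip(z, k))
    assert satisfies_descartes(k + [k4], z + [z4])
```

The two fourth circles have radii of roughly 0.354 and 1.953.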

Here is how one may prove (18) without the insight of the dual configuration, as alluded to in [3] and [4]. Let us consider the externally tangent case with two possible fourth circles shown in red.

If we perform an inversion of the configuration in a circle of radius $R$ centred at the tangency point $T_3$ of $C_1$ and $C_2$, the circles $C_1$ and $C_2$ invert to parallel lines while the circles $C_3, C_4$ invert to equal-radii circles $C_3', C_4'$ tangent to the parallel lines as shown below. Note that in this new configuration it is easier to see that there are two possible choices for $C_4'$.

What distance separates the two parallel lines? We need only look at the image of the points diametrically opposite $T_3$ in $C_1$ and $C_2$. This leads to the distance $d = R^2/(2R_1) + R^2/(2R_2)$. Hence the radius of each of the images of $C_3$ and $C_4$ is half this, or $R^2(1/R_1 + 1/R_2)/4$.

If we know the distance of the centre of the image of $C_4$ from $T_3$, the following lemma, which tells us how to find the radius of a circle from quantities involving its inverse, will enable us to determine $R_4$.

Lemma

If a circle has radius $r$ and distance $x$ from the origin, its inverse in a circle of radius $R$ centred at the origin has radius $r' = rR^2/|x^2-r^2|$ and its centre has distance $x' = xR^2/|x^2 - r^2|$ from the origin. (Hence $x'/r ' = x/r$.)

Proof of lemma

In the above figure the dashed circle is the circle of inversion while the separate cases $x > r$ and $x < r$ are shown in blue and red respectively.

The points on the original circle collinear with the origin are at distance $x+r$ and $|x-r|$ from the origin. The inverses of these points have distance $R^2/(x+r)$ and $R^2/(x-r)$ from the origin (this quantity is negative if $x < r$). Hence the centre of the inverted circle has distance $x' = |R^2(1/(x+r) + 1/(x-r))/2| = xR^2/|x^2 - r^2|$ from the origin and radius $r'= |R^2(1/(x-r) - 1/(x+r))/2| = rR^2/|x^2-r^2|$ as required. Note that if $x=r$ the circle inverts to a line which can be thought of as a circle with centre at infinity with infinite radius.
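The lemma is easy to test by inverting sample points of a circle and checking that their images lie on the predicted circle. In this sketch the original circle is centred on the positive real axis and points are complex numbers; the function names are hypothetical.

```python
import cmath
import math

def inverse_circle(x, r, R):
    """Radius and centre distance of the inverted circle, as stated in the lemma."""
    scale = R**2 / abs(x**2 - r**2)
    return r * scale, x * scale

def max_deviation(x, r, R, samples=12):
    """Invert sample points of the circle with centre x > 0 on the real axis and
    return how far their images are from the circle predicted by the lemma."""
    r_new, x_new = inverse_circle(x, r, R)
    # The image centre lies on the negative real axis when the origin is inside (x < r).
    centre = math.copysign(x_new, x - r)
    worst = 0.0
    for i in range(samples):
        p = x + r * cmath.exp(2j * math.pi * i / samples)
        image = R**2 / p.conjugate()  # inversion in the circle |z| = R
        worst = max(worst, abs(abs(image - centre) - r_new))
    return worst

assert max_deviation(3.0, 1.0, 2.0) < 1e-12   # origin outside the circle (x > r)
assert max_deviation(1.0, 3.0, 2.0) < 1e-12   # origin inside the circle (x < r)
```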

From (13), if the origin is at $T_3$, $O_3$ has coordinates

$\displaystyle (x,y):= \left(\frac{R_3(R_1 - R_2)}{R_1 + R_2}, \frac{2\sqrt{R_1R_2 R_3(R_1 + R_2 + R_3)}}{R_1+R_2}\right).\quad\quad (24)$

Suppose $O_3'$ has coordinates $(x',y')$ (we note that while $C_3'$ is the image of $C_3$ under the inversion, the centre $O_3'$ is generally not the image of $O_3$). Then $O_4'$ is at distance

$\displaystyle d:= \frac{R^2}{2}\left(\frac{1}{R_1} + \frac{1}{R_2}\right)\quad\quad (25)$

above or below $O_3'$ (the distance between the parallel lines), so has coordinates $(x',y'\pm d)$. By the above lemma with $r = d/2$, the radius of $C_4$ with centre $O_4$ is $\displaystyle \frac{dR^2}{2\left(x'^2 + (y'\pm d)^2 - (d/2)^2\right)}$. Hence its curvature is

$\displaystyle \frac{1}{R_4} = \frac{2(x'^2 + y'^2 + 3d^2/4 \pm 2y'd)}{dR^2}.\quad\quad(26)$

Firstly from (14)

$x^2 + y^2 = T_3O_3^2 = R_3^2 + \frac{4R_1 R_2 R_3}{R_1 + R_2}$, so by the lemma, we find $x'^2 + y'^2$ via the ratio of radii $(d/2)$ to $R_3$:

$\displaystyle x'^2 + y'^2 = (d/2R_3)^2(x^2 + y^2),\quad\quad (27)$

so

\begin{aligned}\frac{2(x'^2 + y'^2 + 3d^2/4)}{dR^2} &= \frac{2(d/2R_3)^2(x^2 + y^2) + 3d^2/2}{dR^2}\\ &=\frac{d[(x^2 + y^2)/2R_3^2 + 3/2]}{R^2}\\ &=\frac{d}{R^2}\left[ \frac{R_3^2}{2R_3^2} + \frac{4R_1 R_2 R_3}{(R_1 + R_2)2R_3^2} + \frac{3}{2}\right]\\ &= \frac{d}{R^2}\left[ 2 + \frac{2R_1R_2}{(R_1 + R_2)R_3} \right]\\ &= \frac{2d}{R^2}\left[ \frac{R_3(R_1 + R_2) + R_1 R_2}{(R_1 + R_2)R_3} \right]\\ &= \left(\frac{1}{R_1} + \frac{1}{R_2}\right) \left[ \frac{R_3(R_1 + R_2) + R_1 R_2}{(R_1 + R_2)R_3} \right]\quad \text{(by (25))}\\ &= \frac{R_1 R_2 + R_2 R_3 + R_1 R_3}{R_1 R_2 R_3}\\ &= \frac{1}{R_1} + \frac{1}{R_2} + \frac{1}{R_3}.\quad\quad(28) \end{aligned}

Next,

\begin{aligned} \frac{4y'}{R^2} &= \frac{4 y (d/2R_3)}{R^2}\\ &= \frac{4d\sqrt{R_1R_2 R_3(R_1 + R_2 + R_3)}}{R^2(R_1+R_2)R_3}\quad \text{(by (24))}\\ &= \frac{4\frac{(R^2)(R_1 + R_2)}{2R_1R_2}\sqrt{R_1R_2 R_3(R_1 + R_2 + R_3)}}{R^2(R_1+R_2)R_3}\quad \text{(by (25))}\\ &= \frac{2\sqrt{R_1R_2 R_3(R_1 + R_2 + R_3)}}{R_1R_2R_3}\\ &= 2\sqrt{\frac{1}{R_1R_2} + \frac{1}{R_2R_3} + \frac{1}{R_1R_3}}.\quad\quad(29) \end{aligned}

Using (28) and (29) in (26) gives us

\begin{aligned} \frac{1}{R_4} &= \frac{2(x'^2 + y'^2 + 3d^2/4 \pm 2y'd)}{dR^2}\\ &= \frac{1}{R_1} + \frac{1}{R_2} + \frac{1}{R_3} \pm 2\sqrt{\frac{1}{R_1R_2} + \frac{1}{R_2R_3} + \frac{1}{R_1R_3}}.\quad\quad(30) \end{aligned}

With $k_i = 1/R_i$, this becomes $k_4 = k_1 + k_2 + k_3 \pm 2\sqrt{k_1 k_2 + k_2 k_3 + k_1 k_3},$ establishing (21).
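As an independent check of (21), consider three unit circles that are mutually externally tangent. Their centres form an equilateral triangle of side 2 with circumradius $2/\sqrt{3}$, so the two fourth circles must have radii $2/\sqrt{3} \mp 1$. The sketch below (with a hypothetical helper `fourth_curvatures`) confirms that (21) reproduces this.

```python
import math

def fourth_curvatures(k1, k2, k3):
    """Both solutions of (21) for the curvature of the fourth circle."""
    total = k1 + k2 + k3
    root = 2 * math.sqrt(k1 * k2 + k2 * k3 + k1 * k3)
    return total + root, total - root

k_inner, k_outer = fourth_curvatures(1.0, 1.0, 1.0)
circumradius = 2 / math.sqrt(3)

# Small circle in the gap between the three unit circles.
assert math.isclose(1 / k_inner, circumradius - 1)
# Large enclosing circle: its curvature is negative by the sign convention.
assert k_outer < 0
assert math.isclose(-1 / k_outer, circumradius + 1)
```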

For the case of $C_2$ and $C_3$ internally tangent to $C_1$ a similar computation to the above can be carried out and we obtain

$\displaystyle \frac{1}{R_4} = -\frac{1}{R_1} + \frac{1}{R_2} + \frac{1}{R_3} \pm 2\sqrt{-\frac{1}{R_1R_2} + \frac{1}{R_2R_3} - \frac{1}{R_1R_3}},\quad\quad(31)$

which is the same as (30) with $R_1$ replaced with $-R_1$.

Note that Descartes’ theorem also holds in the special cases of two parallel lines tangent to two equal circles or a single line tangent to three mutually tangent circles. In the latter case we may set $k_4 = 0$ and assuming $k_1 \geq k_2$ we obtain from (18) the nice formula

$\sqrt{k_3} = \sqrt{k_1} \pm \sqrt{k_2}.\quad\quad(32)$

We see these two solutions below in the right figure.
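Formula (32) can be checked by placing the circles on the line $y = 0$. Two circles of radii $R_a, R_b$ resting on a common line and tangent to each other have centres a horizontal distance $2\sqrt{R_aR_b}$ apart (by Pythagoras). The sketch below uses sample radii 1 and 4; the names are my own.

```python
import math

def line_tangent_radii(R1, R2):
    """Radii R3 from (32), written with curvatures k = 1/R; requires R1 < R2."""
    root_sum = 1 / math.sqrt(R1) + 1 / math.sqrt(R2)
    root_diff = 1 / math.sqrt(R1) - 1 / math.sqrt(R2)
    return 1 / root_sum**2, 1 / root_diff**2

def touches(c1, r1, c2, r2):
    """True if two circles are externally tangent."""
    return math.isclose(math.dist(c1, c2), r1 + r2)

R1, R2 = 1.0, 4.0
O1, O2 = (0.0, R1), (2 * math.sqrt(R1 * R2), R2)      # both resting on y = 0
R_small, R_large = line_tangent_radii(R1, R2)
O_small = (2 * math.sqrt(R1 * R_small), R_small)      # in the gap between C1 and C2
O_large = (-2 * math.sqrt(R1 * R_large), R_large)     # on the far side of C1

assert touches(O1, R1, O2, R2)
assert touches(O_small, R_small, O1, R1) and touches(O_small, R_small, O2, R2)
assert touches(O_large, R_large, O1, R1) and touches(O_large, R_large, O2, R2)
```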

Generalisations of Descartes’ theorem to higher dimensions are in [1], and more can be read about tangent circle packings in [5].

#### References

[1] J.C. Lagarias, C.L. Mallows, and A. Wilks, Beyond the Descartes Circle Theorem, American Mathematical Monthly, 109 (2002), 338-361. Available at http://www.arxiv.org/abs/math?papernum=0101066.

[2] S. Northshield, Complex Descartes Circle Theorem, to appear, American Mathematical Monthly.

[3] D. Pedoe, On a theorem in geometry, American Mathematical Monthly, 74 (1967), 627-640.

[4] P. Sarnak, Integral Apollonian Packings, American Mathematical Monthly, 118 (2011), 291-306.

[5] D. Austin, When Kissing Involves Trigonometry, Feature Column from the AMS, March 2006.

## March 28, 2014

### Catching Fire vs Frozen

Filed under: movies and TV — ckrao @ 5:35 am

The movies Hunger Games: Catching Fire and Frozen have lit up the box office recently – worldwide they are the fifth and second biggest releases of 2013. Interestingly they were both released on the same day in the US and Canada (though in only one theatre in the case of Frozen) and the graph below shows the different rates at which they accumulated their totals.

Catching Fire followed a trajectory largely typical of large blockbusters: an opening weekend of $158m (6th largest ever unadjusted) leading to a gross around $425m corresponds to a multiplier of 2.7. It still made significant money during the Christmas break, with a gross exceeding $10m in its sixth weekend, which is rare. The course of Frozen started off similarly in terms of second weekend drop, but it had a smaller opening weekend in wide release of $67m. Funnily they had an identical second weekend drop of 53.1%, which is impressive for both movies given the size of the first weekend for Catching Fire and the fact that it was the post-Thanksgiving weekend for Frozen’s second weekend. Following Thanksgiving weekend, Catching Fire in fact had the 3rd largest gross after 10 days, behind only Avengers and The Dark Knight.

After 15 days Catching Fire had made $317m, Frozen was on $109m and both movies were making comparable amounts of money on a daily basis at that point. It is remarkable then that Frozen has ended up within $30m of Catching Fire’s total. Frozen had unbelievable staying power during the holiday season as seen from the above graph, with only one weekend drop exceeding 30% from its 3rd weekend until its release on Blu-ray/DVD (when still in the top ten!). Furthermore during that weekend (1) the drop was still just 31.5%, (2) it was after a boost the previous weekend due to the new year holidays, and (3) it was back up to number one movie at the box office! This type of behaviour is hardly ever seen in this age of front-loadedness (where movies usually make the bulk of their money in early weeks). Frozen made more than half of its money in US + Canada after its fourth weekend of wide release and was in the top ten for an amazing 16 weeks. Its multiplier of close to 6 given that size of opening weekend is the best since Avatar and perhaps no other movie since Phantom Menace (released in 1999) has a comparable multiplier for a large opener. Nobody thought it would get close to $400m after being at $134m following a $31.6m second weekend! It has been fun following its box office journey. :)

## March 26, 2014

### Inverse variance weighting form of the conditional covariance of multivariate Gaussian vectors

Filed under: mathematics — ckrao @ 5:48 am

Let $X$ and $V$ be independent zero-mean real Gaussian vectors of respective length $m$ and $n$ with respective invertible covariance matrices $\Sigma_X$ and $\Sigma_V$. Let $C$ be a full-rank $n \times m$ matrix (so that $CX$ has the same length $n$ as $V$) and define $Y$ by

$\displaystyle Y = CX + V.\quad\quad(1)$

If we are given the vector $X$ we know that $Y$ will be Gaussian with mean $CX$ and covariance $\Sigma_V$.

However suppose we are given $Y$ and wish to find the conditional distribution of $X|Y$. Here we may think of $X$ as a hidden variable and $Y$ as the observed variable. In this case the result is a little more involved. If we recall the results of this earlier blog post, $(X^T,Y^T)$ is jointly Gaussian and so $X|Y$ is Gaussian with mean

$\displaystyle E[X|Y] = E[X] + \text{cov}(X,Y)(\text{cov}(Y))^{-1}(Y - E[Y])\quad\quad(2)$

and covariance

$\displaystyle \text{cov}(X|Y) = \text{cov}(X) - \text{cov}(X,Y)\text{cov}(Y)^{-1}\text{cov}(Y,X).\quad\quad(3)$

(Here $\text{cov}(A,B) := E[AB^T]$ is the cross-covariance of $A$ and $B$, while $\text{cov}(A):= \text{cov}(A,A)$.)

Using the fact that $E(X) = 0$, $E(Y) = 0$, $\text{cov}(X) = \Sigma_X$,

$\text{cov}(X,Y) = E[X(CX+V)^T] = E[XX^T]C^T = \Sigma_X C^T$

and $\text{cov}(Y) = C\Sigma_X C^T + \Sigma_V$, (2) and (3) become

\begin{aligned} E[X|Y] &= \Sigma_X C^T(C\Sigma_X C^T + \Sigma_V)^{-1}Y,\quad\quad&(4)\\ \text{cov}(X|Y) &= \Sigma_X - \Sigma_X C^T(C\Sigma_X C^T + \Sigma_V)^{-1}C\Sigma_X. \quad\quad&(5)\end{aligned}

In this post we also derive the following alternative expressions (also described here) in which the covariances appear as inverse matrices.

\boxed{ \begin{aligned} E[X|Y] &= \Sigma C^T \Sigma_V^{-1} Y\quad\quad&(6)\\ \text{cov}(X|Y) &= \Sigma,\quad\quad&(7)\\\text{where}&&\\ \Sigma &:= (\Sigma_X^{-1} + C^T \Sigma_V^{-1} C)^{-1}.\quad\quad&(8)\end{aligned} }

Note that in the scalar case $y = cx + v$ with variances $\sigma_x^2$ and $\sigma_v^2$ (5) and (7) become the identity

$\displaystyle \sigma_x^2 - \frac{\sigma_x^4c^2}{c^2 \sigma_x^2 + \sigma_v^2} = (\sigma_x^{-2} + c^2\sigma_v^{-2})^{-1}.\quad\quad(9)$
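As a quick check of (9), using only the scalar quantities already defined, the left side can be put over a common denominator:

\begin{aligned} \sigma_x^2 - \frac{\sigma_x^4c^2}{c^2 \sigma_x^2 + \sigma_v^2} &= \frac{\sigma_x^2(c^2 \sigma_x^2 + \sigma_v^2) - \sigma_x^4c^2}{c^2 \sigma_x^2 + \sigma_v^2}\\ &= \frac{\sigma_x^2\sigma_v^2}{c^2 \sigma_x^2 + \sigma_v^2}\\ &= \frac{1}{\sigma_x^{-2} + c^2\sigma_v^{-2}}, \end{aligned}

where the last step divides the numerator and denominator by $\sigma_x^2\sigma_v^2$.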

In the case where $y$ is a scalar and $C$ is a diagonal matrix, $y$ is a weighted sum of the elements of vector $X$ and (8) becomes the inverse of a sum of inverses of variances (inverse-variance weighting).

One can check algebraically that the expressions (4),(6) and (5),(7) are equivalent in the matrix case, or we may proceed as follows.

Let $V = \left[ \begin{array}{cc} V_{11} & V_{12}\\ V_{21} & V_{22} \end{array} \right] = \left[ \begin{array}{cc}E(XX^T) & E(XY^T)\\ E(YX^T) & E(YY^T) \end{array} \right]$ be the covariance matrix of the joint vector $\left[ \begin{array}{c} X\\ Y \end{array} \right]$. Then since $Y = CX + V$ we have $V_{11} = \Sigma_X$ and

\begin{aligned} V_{12} &= V_{21}^T\\ &= E(XY^T)\\ &= E(X(CX + V)^T)\\ &= E(XX^T) C^T + E(XV^T)\\ &= \Sigma_X C^T. \quad\quad(10) \end{aligned}

Then as the Gaussian vector $\left[ \begin{array}{c} X\\ Y \end{array} \right]$ is zero-mean with covariance $V$, the joint pdf of $X$ and $Y$ is proportional to $\exp \left(-\frac{1}{2} [X^T Y^T]V^{-1}\left[ \begin{array}{c} X\\ Y \end{array} \right] \right)$.

The key step now is to make use of the following identity (also see this explanation) based on completion of squares:

$\displaystyle \exp \left(-\frac{1}{2} [X^T Y^T]V^{-1}\left[ \begin{array}{c} X\\ Y \end{array} \right] \right) = \exp\left( -\frac{1}{2} (X^T - Y^TA^T) S_{22}^{-1}(X-AY)\right) \exp\left( -\frac{1}{2} Y^T V_{22}^{-1} Y\right),\quad\quad(11)$

where $A$ and $S_{22}$ are matrices, defined similarly to $A$ and $s$ in the scalar equation

$\displaystyle ax^2 + 2bxy + cy^2 = a(x-Ay)^2 + sy^2.$

We will show that $A = V_{12}V_{22}^{-1}$ and $S_{22} = V_{11} - V_{12}V_{22}^{-1}V_{21}$ ($S_{22}$ is the Schur complement of $V_{22}$ in $V$ also discussed in this previous blog post).

The second term in the right side of (11) is proportional to the pdf of $Y$ (being Gaussian) $p(Y)$, so the first term must be proportional to the conditional pdf $p(X|Y)$. We are left to find $S_{22} = \text{cov}(X|Y)$ and $AY = E(X|Y)$.

From (11), $S_{22}^{-1}$ is the top-left block of $V^{-1}$ while $-S_{22}^{-1}A$ is the top-right block of $V^{-1}$. To find these blocks, consider the block matrix equation

$\displaystyle V \left[ \begin{array}{c} r\\s \end{array} \right] = \left[ \begin{array}{c} a\\b \end{array} \right] \quad \Rightarrow \quad \left[ \begin{array}{c} r\\s \end{array} \right] = V^{-1} \left[ \begin{array}{c} a\\b \end{array} \right]. \quad\quad(12)$

This is the same as the system of equations

\displaystyle \begin{aligned} V_{11} r + V_{12} s &= a, \quad\quad&(13)\\ V_{21} r + V_{22} s &= b. \quad \quad&(14)\\ \end{aligned}

Multiplying (13) by $V_{21}V_{11}^{-1}$ gives

$\displaystyle V_{21}r + V_{21}V_{11}^{-1}V_{12}s = V_{21}V_{11}^{-1}a.$

Subtracting this from (14) gives

\begin{aligned} (V_{22} - V_{21}V_{11}^{-1}V_{12})s &= b - V_{21}V_{11}^{-1}a\\ \Rightarrow s &= (V_{22} - V_{21}V_{11}^{-1}V_{12})^{-1}b - (V_{22} - V_{21}V_{11}^{-1}V_{12})^{-1}V_{21}V_{11}^{-1}a.\quad \quad(15)\end{aligned}

Since $\left[ \begin{array}{c} r\\s \end{array} \right] = V^{-1} \left[ \begin{array}{c} a\\b \end{array} \right]$ the coefficients of $a$ and $b$ in (15) are the bottom-left and bottom-right blocks of $V^{-1}$ respectively. We may then write $S_{11} := V_{22} - V_{21}V_{11}^{-1}V_{12}$ and so from (15)

$\displaystyle S_{11} s = b - V_{21}V_{11}^{-1}a.\quad \quad(16)$

Also we have from (13)

$\displaystyle r = V_{11}^{-1} a - V_{11}^{-1} V_{12}s.\quad\quad (17)$

Using (16) this becomes

\begin{aligned} r &= V_{11}^{-1} a - V_{11}^{-1} V_{12} S_{11}^{-1}(b - V_{21}V_{11}^{-1}a)\\ &=(V_{11}^{-1} + V_{11}^{-1} V_{12} S_{11}^{-1} V_{21} V_{11}^{-1} )a - V_{11}^{-1} V_{12} S_{11}^{-1}b.\quad\quad(18) \end{aligned}

Note that analogous to (16) we could have written

$\displaystyle S_{22} r = a - V_{12}V_{22}^{-1} b \Rightarrow r = S_{22}^{-1} a -S_{22}^{-1}V_{12}V_{22}^{-1}b .\quad\quad(19)$

Comparing coefficients of $a$, $b$ of (18) and (19) gives

$\displaystyle S_{22}^{-1} = V_{11}^{-1} + V_{11}^{-1} V_{12} S_{11}^{-1} V_{21} V_{11}^{-1}\quad\quad(20)$

and

$\displaystyle V_{11}^{-1} V_{12} S_{11}^{-1} = S_{22}^{-1} V_{12}V_{22}^{-1}.\quad\quad(21)$
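As a sanity check of (20) and (21), consider the case where all blocks are scalars, say $V = \left[ \begin{array}{cc} v_{11} & v_{12}\\ v_{12} & v_{22} \end{array} \right]$ (the lowercase $v_{ij}$ are notation introduced only for this check). Then $S_{11} = (v_{11}v_{22} - v_{12}^2)/v_{11}$ and $S_{22} = (v_{11}v_{22} - v_{12}^2)/v_{22}$, so that

\begin{aligned} V_{11}^{-1} + V_{11}^{-1} V_{12} S_{11}^{-1} V_{21} V_{11}^{-1} &= \frac{1}{v_{11}} + \frac{v_{12}^2}{v_{11}(v_{11}v_{22} - v_{12}^2)}\\ &= \frac{v_{22}}{v_{11}v_{22} - v_{12}^2}\\ &= S_{22}^{-1},\\ V_{11}^{-1} V_{12} S_{11}^{-1} &= \frac{v_{12}}{v_{11}v_{22} - v_{12}^2}\\ &= S_{22}^{-1} V_{12}V_{22}^{-1}. \end{aligned}

Both agree with the familiar formula for the inverse of a $2 \times 2$ matrix.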

In the same way that $S_{22} = \text{cov}(X|Y)$, $S_{11} = \text{cov}(Y|X)$, which for $Y = CX + V$ is simply $\Sigma_V$ as we saw before.

Hence from (20),

\begin{aligned} \Sigma^{-1} = S_{22}^{-1} &= V_{11}^{-1} + V_{11}^{-1} V_{12} S_{11}^{-1} V_{21} V_{11}^{-1}\\ &= \Sigma_X^{-1} + \Sigma_X^{-1}\Sigma_X C^T \Sigma_V^{-1} C \Sigma_X^{-1} \Sigma_X\\ &= \Sigma_X^{-1} + C^T \Sigma_V^{-1} C \end{aligned}

and from the $b$-coefficient of (18),

\begin{aligned} S_{22}^{-1}A &= V_{11}^{-1} V_{12} S_{11}^{-1}\\ \Rightarrow A &= S_{22} V_{11}^{-1} V_{12} S_{11}^{-1}\\ &= S_{22} \Sigma_X^{-1} \Sigma_X C^T \Sigma_V^{-1}\\ &= \Sigma C^T \Sigma_V^{-1}. \end{aligned}

Hence $E[X|Y] = AY = \Sigma C^T \Sigma_V^{-1} Y$ as desired, and equations (6)-(8) have been verified.
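The equivalence of (4),(5) with (6)-(8) can also be checked numerically. The following is a minimal sketch in plain Python; the helper functions (`transpose`, `matmul`, `add`, `inverse`, `max_abs_diff`) and the particular matrices are illustrative assumptions made here so that the example is self-contained, and in practice a linear algebra library would normally be used instead. It takes $m = 2$ and $n = 3$, so that $C$ is $3 \times 2$.

```python
def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def add(A, B, sign=1.0):
    """Entrywise A + sign * B."""
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(A)
    M = [[float(x) for x in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Arbitrary illustrative values: symmetric positive definite covariances.
Sigma_X = [[2.0, 0.5], [0.5, 1.0]]
Sigma_V = [[1.0, 0.2, 0.0], [0.2, 1.5, 0.3], [0.0, 0.3, 2.0]]
C = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
Ct = transpose(C)

# Equations (4) and (5): E[X|Y] = gain_4 * Y, cov(X|Y) = cov_5.
cov_Y = add(matmul(matmul(C, Sigma_X), Ct), Sigma_V)
gain_4 = matmul(matmul(Sigma_X, Ct), inverse(cov_Y))
cov_5 = add(Sigma_X, matmul(matmul(gain_4, C), Sigma_X), sign=-1.0)

# Equations (6)-(8): E[X|Y] = gain_6 * Y, cov(X|Y) = Sigma.
Sigma_V_inv = inverse(Sigma_V)
Sigma = inverse(add(inverse(Sigma_X), matmul(matmul(Ct, Sigma_V_inv), C)))
gain_6 = matmul(matmul(Sigma, Ct), Sigma_V_inv)

print(max_abs_diff(gain_4, gain_6))  # difference at rounding-error level
print(max_abs_diff(cov_5, Sigma))    # difference at rounding-error level
```

Note that form (8) inverts $m \times m$ matrices while (4),(5) invert an $n \times n$ matrix, so which is cheaper depends on the relative sizes of $X$ and $Y$.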

## February 23, 2014

### Australian marsupial genera

Filed under: nature — ckrao @ 2:09 am

Native Australian mammals (those that predate human times) include monotremes (platypus and echidna), marsupials, bats, rodents (all mouse-like) and sea mammals (seals, whales and dolphins, dugongs). The marsupials of Australia extend far beyond the kangaroo/possum/koala/wombat (diprotodont) variety to the marsupial moles, carnivorous marsupials such as quolls, dunnarts, the numbat and the Tasmanian devil, and the omnivorous bandicoots and bilbies. (The remaining marsupial orders are confined to the Americas and comprise the opossums, shrew opossums and Monito del Monte.) Here is a list of extant marsupial genera from those orders that are native to Australia, made simply to learn more about their diversity.

| Order | Suborder/Family | Subfamily/Tribe | Genus | Genus meaning | # Australian species | # world species | Notes |
|---|---|---|---|---|---|---|---|
| Notoryctemorphia (marsupial moles) | Notoryctidae | | Notoryctes | southern digger | 2 | 2 | marsupial moles |
| Dasyuromorphia (marsupial carnivores) | Dasyuridae | Dasyurinae – Dasyurini | Dasycercus | hairy tail | 2 | 2 | mulgaras |
| | | | Dasykaluta | hairy kaluta | 1 | 1 | little red kaluta |
| | | | Dasyuroides | resembling Dasyurus | 1 | 1 | Kowari |
| | | | Dasyurus | hairy tail | 4 | 6 | quolls |
| | | | Myoictis | mouse weasel | 0 | 4 | dasyures – New Guinea |
| | | | Neophascogale | new Phascogale | 0 | 1 | Speckled dasyure – New Guinea |
| | | | Parantechinus | near Antechinus | 1 | 1 | Dibbler |
| | | | Phascolosorex | pouched shrew | 0 | 2 | Marsupial shrews – New Guinea |
| | | | Pseudantechinus | false Antechinus | 6 | 6 | False antechinuses |
| | | | Sarcophilus | flesh lover | 1 | 1 | Tasmanian devil |
| | | Dasyurinae – Phascogalini | Antechinus | hedgehog equivalent | 10 | 10 | |
| | | | Micromurexia | small Murexia | 0 | 1 | Habbema dasyure – rocky areas of New Guinea |
| | | | Murexechinus | hedgehog mouse | 0 | 1 | Black-tailed dasyure – tropical dry forests of New Guinea |
| | | | Murexia | marsupial mouse (significance unknown) | 0 | 1 | Short-furred dasyure – New Guinea |
| | | | Paramurexia | near Murexia | 0 | 1 | Broad-striped dasyure – South east Papua New Guinea |
| | | | Phascomurexia | pouched Murexia | 0 | 1 | Long-nosed dasyure – tropical dry forests of New Guinea |
| | | | Phascogale | pouched weasel | 2 | 2 | |
| | | Sminthopsinae – Sminthopsini | Antechinomys | antechinus-mouse | 1 | 1 | Kultarr |
| | | | Ningaui | Aboriginal mythical creature | 3 | 3 | |
| | | | Sminthopsis | mouse appearance | 21 | 21 | dunnarts |
| | | Sminthopsinae – Planigalini | Planigale | flat weasel | 4 | 5 | |
| | Myrmecobiidae | | Myrmecobius | ant-living | 1 | 1 | numbat |
| Peramelemorphia (bilbies and bandicoots) | Peramelidae | Echymiperinae | Echymipera | pouched hedgehog | 1 | 1 | long-nosed spiny bandicoot |
| | | | Microperoryctes | small Peroryctes | 0 | 5 | striped bandicoots – New Guinea |
| | | | Rhynchomeles | beaked badger | 0 | 1 | Seram bandicoot – existence only recorded in 1920 |
| | | Peramelinae | Isoodon | equal tooth | 3 | 3 | short-nosed bandicoots |
| | | | Perameles | pouched badger | 3 | 3 | long-nosed bandicoots |
| | Peroryctidae | | Peroryctes | pouched digger | 0 | 2 | New Guinean long-nosed bandicoots |
| | Thylacomyidae | | Macrotis | big-ear | 1 | 1 | bilby |
| Diprotodontia (“two front teeth” – kangaroos and relatives) | Vombatiformes – Phascolarctidae | | Phascolarctos | pouched bear | 1 | 1 | koala |
| | Vombatiformes – Vombatidae | | Vombatus | wombat | 1 | 1 | |
| | | | Lasiorhinus | hairy-nose | 2 | 2 | |
| | Phalangeriformes (possums and gliders) – Phalangeridae | | Ailurops | cat-like | 0 | 2 | bear cuscuses – NE Indonesia inc Sulawesi |
| | | | Phalanger | notable digits | 1 | 13 | cuscus |
| | | | Spilocuscus | spotted cuscus | 1 | 5 | |
| | | | Strigocuscus | thin cuscus | 0 | 2 | |
| | | | Trichosurus | hairy tail | 5 | 5 | brushtail possums |
| | | | Wyulda | brush-tail possum (mistakenly assigned) | 1 | 1 | scaly-tailed possum |
| | Phalangeriformes – Burramyidae (pygmy possums) | | Burramys | stony-place mouse | 1 | 1 | Mountain Pygmy Possum |
| | | | Cercartetus | possibly tail-in-air | 4 | 4 | |
| | Phalangeriformes – Tarsipedidae | | Tarsipes | tarsier-foot | 1 | 1 | Honey possum |
| | Phalangeriformes – Petauridae | | Dactylopsila | naked finger | 1 | 4 | striped possums |
| | | | Gymnobelideus | naked Belideus | 1 | 1 | Leadbeater’s possum |
| | | | Petaurus | rope dancer | 4 | 6 | gliders |
| | Phalangeriformes – Pseudocheiridae (ringtailed possums and relatives) | | Hemibelideus | half Belideus (fluffy-tailed glider) | 1 | 1 | Lemur-like ringtail possum |
| | | | Petauroides | Petaurus-like | 1 | 1 | Greater Glider |
| | | | Petropseudes | rock-Pseudocheirus | 1 | 1 | Rock-haunting ringtail possum |
| | | | Pseudocheirus | false hand | 2 | 1 | Common ringtail possum |
| | | | Pseudochirops | Pseudocheirus-like | 1 | 5 | |
| | | | Pseudochirulus | little Pseudocheirus | 2 | 8 | |
| | Phalangeriformes – Acrobatidae | | Acrobates | acrobat | 1 | 1 | Feathertail glider |
| | | | Distoechurus | tail in two rows | 0 | 1 | Feather-tailed possum – New Guinea |
| | Macropodiformes – Macropodidae | | Lagostrophus | turning hare | 1 | 1 | Banded hare-wallaby |
| | | | Dendrolagus | tree hare | 2 | 13 | tree-kangaroos |
| | | | Lagorchestes | dancing hare | 2 | 2 | |
| | | | Macropus | long foot | 13 | 13 | kangaroos and wallaroos |
| | | | Onychogalea | nailed weasel | 2 | 2 | nail-tail wallabies |
| | | | Petrogale | rock weasel | 16 | 16 | rock-wallabies |
| | | | Setonix | bristle-claw | 1 | 1 | Quokka |
| | | | Thylogale | pouched weasel | 3 | 7 | pademelons |
| | | | Wallabia | wallaby | 1 | 1 | Swamp wallaby |
| | Macropodiformes – Potoroidae | | Aepyprymnus | high rump | 1 | 1 | Rufous rat-kangaroo |
| | | | Bettongia | bettong | 4 | 4 | |
| | | | Potorous | potoroo (Aboriginal) | 3 | 3 | |
| | Macropodiformes – Hypsiprymnodontidae | | Hypsiprymnodon | high rump tooth | 1 | 1 | Musky rat-kangaroo |
| Totals | | | | | 151 | 224 | |

#### Reference

Dictionary of Australian and New Guinean Mammals, edited by Ronald Strahan & Pamela Conder, CSIRO Publishing, 2007.

## February 9, 2014

### The validity of the index laws for complex numbers

Filed under: mathematics — ckrao @ 11:47 am

In an earlier post we saw that some of the index laws fail when the base is negative or zero. Now we shall see what occurs when the allowed values of base and exponent are extended to be complex numbers.

For $z_1 \in \mathbb{C}\backslash \{ 0\}$ we can proceed analogously to the real case and define

$\displaystyle z_1^{z_2} = \exp(z_2 \log (z_1)).\quad \quad(1)$

However what are the exponential and logarithm of a complex number? Last time we defined the logarithm before the exponential and we can do the same here. For complex $z$ we can define $\log(z)$ as $\int_1^z \frac{1}{t}\ \text{d}t$ as for the real case, but note that this time it is a contour integral along any path from $1$ to $z$ not including the origin. By allowing the path to wind around the origin any number of times, we find the function is no longer single-valued. For example if we choose the contour to be the unit circle from +1 to itself in an anticlockwise direction, then the contour may be parametrised by $t = \cos \theta + i \sin \theta$ for $\theta \in [0, 2\pi]$ ($\text{d}t = (-\sin \theta + i \cos \theta)\text{d}\theta$) and we have

\begin{aligned} \log(1) &= \int_1^1 \frac{1}{t}\ \text{d}t\\ &= \int_0^{2\pi} (\cos \theta - i \sin \theta)(-\sin \theta + i \cos \theta)\text{d}\theta\\ &= \int_0^{2\pi} i( \cos^2 \theta + \sin^2 \theta )\ \text{d}\theta \\ &= 2\pi i. \end{aligned}
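The winding behaviour can be seen numerically as well. The sketch below (the function name `log_along_circle` and the midpoint-rule discretisation are choices made here for illustration, not anything from the reference) approximates the contour integral of $1/t$ along the unit circle starting at $1$, for a chosen number of anticlockwise windings.

```python
import cmath
import math

def log_along_circle(windings, steps=10_000):
    """Approximate the contour integral of 1/t along the unit circle,
    starting at 1 and winding `windings` times anticlockwise (midpoint rule).
    A negative value of `windings` traverses the circle clockwise."""
    total = 0j
    dtheta = 2 * math.pi * windings / steps
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        t = cmath.exp(1j * theta)   # point on the unit circle
        dt = 1j * t * dtheta        # dt = i exp(i theta) d(theta)
        total += dt / t
    return total

for n in (1, 2, -1):
    print(n, log_along_circle(n))   # approximately 2*pi*i*n in each case
```

Each extra loop around the origin adds another $2\pi i$ to the value obtained for $\log(1)$.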

More generally if $w$ is a logarithm of $z$, so is $w + 2\pi i n$ where $n$ is an integer. Defining $\exp(w)$ for complex $w$ to be $z$ where $w$ is a logarithm of $z$, we have

$z = \exp(w + 2\pi i n) = \exp(w),\quad n \in \mathbb{Z}$.

(Note that $\displaystyle \exp(\log (z)) = z$ while $\displaystyle \log(\exp (z)) = z + 2\pi i k$.)

Based on these definitions we can show that $\log(z_1z_2) = \log(z_1) + \log(z_2)$ and $\exp(z_1 + z_2) = \exp(z_1)\exp(z_2)$ in an analogous way to the real case, except that the first equation is to be interpreted as an equality of sets of values rather than individual values. Note that for this reason we have to be careful when adding or subtracting logarithms: for example for complex numbers,

$\log(z) - \log(z) = \log(1) = 2\pi i n \neq 0$

and

\begin{aligned} \log(z^2) &= \log (z) + \log (z)\\ &= (\log (z) + 2\pi i m) + (\log (z) + 2\pi i n)\\ &= 2\log (z) + 2\pi i k, \quad k \in \mathbb{Z}\\ \text{while }\quad 2 \log (z) &= 2 (\log (z) + 2\pi i n)\\ &= 2\log (z) + 4 \pi i n, \quad n \in \mathbb{Z}. \end{aligned}

Hence we cannot write $a\log (z) + b \log (z) = (a+b)\log (z)$ without paying special attention to the values of $a,b,z$. If we want to know when $\log (z^c) = c \log (z)$, we can verify that $\log (z^c) = c \log (z) + 2\pi i k, k \in \mathbb{Z}$ which is only equal to $c \log (z) = c (\log (z) + 2\pi i m)$ when $cm$ covers all integers, i.e. if $c = 1/n$ for some non-zero integer $n$:

$\displaystyle \log\left(z^{1/n}\right)= \frac{1}{n} \log(z), \quad\quad n \in \mathbb{Z}\backslash \{ 0\}.$

Now in general,

\begin{aligned} z_1^{z_2} &= \exp(z_2 \log (z_1))\\ &= \exp(z_2 (\log (z_1) + 2\pi i n) )\\ &= \exp(z_2 \log (z_1))\exp(2\pi i n z_2).\quad\quad (2)\end{aligned}

This will be multi-valued if $nz_2$ takes on non-integer values, as $n$ varies over the integers. It will only be single-valued if $z_2$ is an integer. For example treated as a complex power, $2^{1/2}$ will have two values: $\sqrt{2}$ and $-\sqrt{2}$ while $2^{1/3}$ will take three values. The number $2^{\sqrt 2}$ will have infinitely many complex values $\exp(\sqrt{2} \log (2))\exp(2\pi i n \sqrt{2}), n \in \mathbb{Z}$ although only one of them is real-valued. Note that through (2) we can work out quantities such as:

• $(-1)^{\sqrt{2}} = \exp(i \pi \sqrt{2})\exp(2\sqrt{2} \pi i n), n \in \mathbb{Z}$ (infinitely many non-real values!)
• $i^i = \exp(-\pi/2)\exp(-2\pi n), n\in \mathbb{Z}$ (infinitely many real values!).
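To see a few of these values concretely, here is a small sketch based on (2), using Python's standard `cmath` module, whose `cmath.log` returns one particular logarithm (the one with imaginary part in $[-\pi, \pi]$). The helper name `power_values` and the range of branches shown are arbitrary choices for illustration.

```python
import cmath
import math

def power_values(z1, z2, ns=range(-2, 3)):
    """Values exp(z2 * (w + 2*pi*i*n)) of z1**z2 for the branches n in `ns`,
    where w = cmath.log(z1) is one logarithm of z1."""
    w = cmath.log(z1)
    return [cmath.exp(z2 * (w + 2j * math.pi * n)) for n in ns]

# 2**(1/2): the branches alternate between sqrt(2) and -sqrt(2).
print(power_values(2, 0.5))

# i**i: every branch is real, equal to exp(-pi/2 - 2*pi*n).
print(power_values(1j, 1j))

# (-1)**sqrt(2): the listed branches are all non-real.
print(power_values(-1, math.sqrt(2)))
```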

Also note from (2) that

$\log(z_1^{z_2}) = z_2 \log (z_1) + 2\pi i k = z_2 \textrm{Log} (z_1) + z_2 2\pi i k_1 + 2\pi i k_2.$

One can define the principal value of the logarithm $\textrm{Log}(z)$ to be that with imaginary part in the interval $(-\pi, \pi]$. Similarly one can define the principal value of the power function as

$\displaystyle z_1^{z_2} := \exp(z_2 \textrm{Log} (z_1)).\quad \quad (3)$

This gives single-valued results but they may not be as expected. For example, since $\textrm{Log}(-1) = i \pi$, $(-1)^{1/3} = \exp(i \pi/3)$ rather than the real-valued root -1. However we can now say $a \textrm{Log}(z) + b \textrm{Log}(z) = (a+b)\textrm{Log}(z)$, being single-valued.
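As an illustration of (3), the following sketch evaluates the principal value of $(-1)^{1/3}$ with Python's `cmath` module. It also compares against Python's built-in power operator on a complex base, which in CPython follows the same principal-value convention for this input.

```python
import cmath

# Principal value via (3): exp(z2 * Log(z1)), with cmath.log playing the role of Log.
principal = cmath.exp(cmath.log(-1) / 3)
print(principal)   # approximately 0.5 + 0.866j, i.e. exp(i*pi/3)

# Built-in power with a complex base, for comparison.
builtin = complex(-1) ** (1 / 3)
print(abs(builtin - principal) < 1e-12)
```

Neither gives the real cube root $-1$, which corresponds to a different branch of the logarithm.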

We would like to know which of the index laws hold. In the remainder of the post we verify the identities summarised in the following table. The real number case was already treated in this post.

| | Real numbers $a,b$; positive real $x,y$ | Complex numbers $z,z_1,z_2,a,b$; multiple-valued power $z_1^{z_2} = \exp(z_2 \log z_1)$ | Complex numbers $z,z_1,z_2,a,b$; single-valued power $z_1^{z_2} = \exp(z_2 \textrm{Log} z_1)$ |
|---|---|---|---|
| (1) | $x^a x^b = x^{a+b}$ | $z^{a+b}$ a subset of $z^a z^b$ | $z^a z^b = z^{a+b}$ |
| (2) | $(x^a)^b = x^{ab}$ | $z^{ab}$ a subset of $(z^a)^b$ | $(z^a)^b = z^{ab}\exp(2\pi i b n_0)$ |
| (3) | $(xy)^a = x^a y^a$ | $(z_1 z_2)^a = z_1^a z_2^a$ | $(z_1 z_2)^a = z_1^a z_2^a \exp(2 \pi i a n_{+})$ |
| (4) | $x^0 = 1$ | $z^0 = 1$ | $z^0 = 1$ |
| (5) | $1^a = 1$ | $1^a = \exp(a 2\pi i k)$ | $1^a = 1$ |
| (6) | $x^{-a} = 1/x^a$ | $z^{-a} = 1/z^a$ but $z^{-a}z^a = \exp(2\pi i k)$ | $z^{-a} = 1/z^a$ and $z^{-a}z^a = 1$ |
| (7) | $x^a / x^b = x^{a-b}$ | $z^{a-b}$ a subset of $z^{a}/z^{b}$ | $z^a / z^b = z^{a-b}$ |
| (8) | $(x/y)^a = x^a/y^a$ | $(z_1/z_2)^a = z_1^a/z_2^a$ | $(z_1/z_2)^a = \frac{z_1^a}{z_2^a}\exp(2\pi i a n_{-})$ |

Note that in the table, $n_0, n_{+},n_{-}$ are particular integers chosen to enable equality.

For verifying identity (1) in the multi-valued power case we have

\begin{aligned} z^{a+b} &= \exp((a+b)\log (z))\\ &= \exp\left((a+b)(\textrm{Log} (z) + 2\pi i k)\right)\\ &= \exp\left((a+b)\textrm{Log} (z) \right) \exp\left( 2\pi i k (a+b)\right) \end{aligned}

while

\begin{aligned}z^a z^b &= \exp(a \log(z)) \exp(b \log(z))\\&= \exp\left(a (\textrm{Log} z + 2\pi ik) \right)\exp\left(b (\textrm{Log} z + 2\pi in) \right)\\ &= \exp\left((a+b)\textrm{Log} (z) \right) \exp\left(2\pi i (ka + nb) \right).\end{aligned}

This shows that the set of values of $z^{a+b}$ is a subset of the set of values of $z^a z^b$. In the single-valued case,

\begin{aligned} z^az^b &= \exp(a \textrm{Log} (z)) \exp(b \textrm{Log} (z))\\ &= \exp(a \textrm{Log} (z) + b \textrm{Log} (z))\\&= \exp((a+b) \textrm{Log} (z))\\ &= z^{a+b}.\end{aligned}

For identity (2) in the multi-valued power case we have

\begin{aligned} (z^a)^b &=(\exp(a \log (z)))^b\\ &= \exp(b \log(\exp(a \log( z))))\\ &= \exp(b(a \log (z)+2 \pi ik))\\ &= \exp(ba \log (z)) \exp(2 \pi ibk)\\ &= z^{ab}\exp(2\pi ibk). \end{aligned}

This shows that the set of values of $z^{ab}$ is a subset of the values of $(z^a)^b$. We have the equality $(z^a)^b = z^{ab}$ if $\exp(2\pi ibk) =1$, or $bk \in \mathbb{Z}$ for all $k$, which is true if $b \in \mathbb{Z}$. In the single-valued case, the integer $k$ (written $n_0$ in the table) is chosen so that $a \textrm{Log}(z)+2 \pi ik$ has imaginary part in the interval $(-\pi, \pi]$, giving $(z^a)^b = z^{ab}\exp(2\pi i b n_0)$.

For identity (3) in the multi-valued power case we have

\begin{aligned} (z_1 z_2)^a &= \exp(a \log(z_1 z_2))\\ &= \exp(a(\log (z_1)+\log (z_2)))\\ &= \exp(a \log (z_1))\exp(a \log (z_2))\\ &= z_1^a z_2^a.\end{aligned}

In the single-valued power case,

\begin{aligned} (z_1 z_2)^a &= \exp(a \textrm{Log}(z_1 z_2))\\ &= \exp(a(\textrm{Log} (z_1)+\textrm{Log} (z_2) + 2\pi i n_{+}))\\ &= \exp(a \textrm{Log} (z_1))\exp(a \textrm{Log} (z_2))\exp(2 \pi i a n_{+})\\ &= z_1^a z_2^a \exp(2 \pi i an_{+}). \end{aligned}

Here the integer $n_{+}$ is chosen so that $\textrm{Log} (z_1)+\textrm{Log} (z_2) + 2\pi i n_{+}$ has imaginary part in the interval $(-\pi, \pi]$.

Identity (4) comes from setting $b=0$ in identity (1). Identity (5) results from setting $y$ or $z_2$ to 1 in identity (3). Identity (6) results from setting $b = -a$ in identity (1) and using identity (4). Finally identities (7) and (8) follow from identities (1) and (3).
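The correction factors in the single-valued column of the table can be checked on a concrete example. The sketch below uses $z_1 = z_2 = z = -1 + i$ with $a = 1/2$ for identity (3), and $a = 2$, $b = 1/2$ for identity (2); the helper names `principal_power` and `branch_shift` are introduced here for illustration only.

```python
import cmath
import math

def principal_power(z1, z2):
    """Single-valued power exp(z2 * Log(z1)) using the principal logarithm."""
    return cmath.exp(z2 * cmath.log(z1))

def branch_shift(w):
    """Integer n such that w + 2*pi*i*n has imaginary part in (-pi, pi]."""
    return -math.ceil((w.imag - math.pi) / (2 * math.pi))

z = complex(-1, 1)   # Log(z) has imaginary part 3*pi/4

# Identity (3) with z1 = z2 = z and a = 1/2.
a = 0.5
n_plus = branch_shift(cmath.log(z) + cmath.log(z))
lhs3 = principal_power(z * z, a)
rhs3 = principal_power(z, a) ** 2 * cmath.exp(2j * math.pi * a * n_plus)
print(n_plus, lhs3, rhs3)

# Identity (2) with a = 2 and b = 1/2.
a, b = 2, 0.5
n_0 = branch_shift(a * cmath.log(z))
lhs2 = principal_power(principal_power(z, a), b)
rhs2 = principal_power(z, a * b) * cmath.exp(2j * math.pi * b * n_0)
print(n_0, lhs2, rhs2)
```

In both cases the naive law without the correction factor would give $-1 + i$ instead of the correct principal value $1 - i$.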

The moral of all this is that care is to be taken when applying the index laws to complex numbers (or indeed even when adding logarithms) by virtue of the multi-valued nature of the complex logarithm.

#### Reference

H. Haber, The complex logarithm, exponential and power functions, UC Santa Cruz Physics 116A notes (2011) available at scipp.ucsc.edu/~haber/ph116A/clog_11.pdf
