Chaitanya's Random Pages

March 28, 2014

Catching Fire vs Frozen

Filed under: movies and TV — ckrao @ 5:35 am

The movies Hunger Games: Catching Fire and Frozen have lit up the box office recently – worldwide they are the fifth and second biggest releases of 2013. Interestingly, they were both released on the same day in the US and Canada (though in only one theatre in the case of Frozen), and the graph below shows the different rates at which they accumulated their totals.

catching fire vs frozen

Catching Fire followed a trajectory largely typical of big blockbusters: an opening weekend of $158m (6th largest ever, unadjusted) leading to a final gross of around $425m corresponds to a multiplier of 2.7. It still made significant money during the Christmas break, with a gross exceeding $10m in its sixth weekend, which is rare. Frozen started off similarly in terms of its second weekend drop, but had a smaller opening weekend in wide release of $67m. Funnily they had an identical second weekend drop of 53.1%, which is impressive for both movies given the size of the first weekend for Catching Fire and the fact that Frozen's second weekend came straight after Thanksgiving. Following Thanksgiving weekend, Catching Fire in fact had the 3rd largest gross after 10 days, behind only Avengers and The Dark Knight.

After 15 days Catching Fire had made $317m while Frozen was on $109m, and both movies were making comparable amounts of money on a daily basis at that point. It is remarkable, then, that Frozen ended up within $30m of Catching Fire's total. Frozen had unbelievable staying power during the holiday season, as seen in the above graph, with only one weekend drop exceeding 30% from its 3rd weekend until its release on Blu-ray/DVD (when it was still in the top ten!). Furthermore, during that weekend (1) the drop was still just 31.5%, (2) it came after a boost the previous weekend due to the new year holidays, and (3) it was back up to the number one movie at the box office! This type of behaviour is hardly ever seen in this age of front-loadedness (where movies usually make the bulk of their money in the early weeks).

Frozen made more than half of its US + Canada money after its fourth weekend of wide release and was in the top ten for an amazing 16 weeks. Its multiplier of close to 6, given the size of its opening weekend, is the best since Avatar, and perhaps no other movie since Phantom Menace (released in 1999) has a comparable multiplier for a large opener. Nobody thought it would get close to $400m after it stood at $134m following a $31.6m second weekend! It has been fun following its box office journey. :)

March 26, 2014

Inverse variance weighting form of the conditional covariance of multivariate Gaussian vectors

Filed under: mathematics — ckrao @ 5:48 am

Let X and V be independent zero-mean real Gaussian vectors of respective lengths m and n with respective invertible covariance matrices \Sigma_X and \Sigma_V. Let C be a full-rank n \times m matrix (so that CX has the same length n as V) and define Y by

\displaystyle Y = CX + V.\quad\quad(1)

If we are given the vector X we know that Y will be Gaussian with mean CX and covariance \Sigma_V.

However suppose we are given Y and wish to find the conditional distribution of X|Y. Here we may think of X as a hidden variable and Y as the observed variable. In this case the result is a little more involved. Recalling the results of this earlier blog post, (X^T,Y^T) is jointly Gaussian and so X|Y is Gaussian with mean

\displaystyle E[X|Y] = E[X] + \text{cov}(X,Y)(\text{cov}(Y))^{-1}(Y - E[Y])\quad\quad(2)

and covariance

\displaystyle \text{cov}(X|Y) = \text{cov}(X) - \text{cov}(X,Y)\text{cov}(Y)^{-1}\text{cov}(Y,X).\quad\quad(3)

(Here \text{cov}(A,B) := E[AB^T] is the cross-covariance of A and B, while \text{cov}(A):= \text{cov}(A,A).)

Using the fact that E(X) = 0, E(Y) = 0, \text{cov}(X) = \Sigma_X,

\text{cov}(X,Y) = E[X(CX+V)^T] = E[XX^T]C^T = \Sigma_X C^T

and \text{cov}(Y) = C\Sigma_X C^T + \Sigma_V, (2) and (3) become

\begin{aligned} E[X|Y] &= \Sigma_X C^T(C\Sigma_X C^T + \Sigma_V)^{-1}Y,\quad\quad&(4)\\ \text{cov}(X|Y) &= \Sigma_X - \Sigma_X C^T(C\Sigma_X C^T + \Sigma_V)^{-1}C\Sigma_X. \quad\quad&(5)\end{aligned}

In this post we also derive the following alternative expressions (also described here) in which the covariances appear as inverse matrices.

\boxed{ \begin{aligned} E[X|Y] &= \Sigma C^T \Sigma_V^{-1} Y\quad\quad&(6)\\ \text{cov}(X|Y) &= \Sigma,\quad\quad&(7)\\\text{where}&&\\ \Sigma &:= (\Sigma_X^{-1} + C^T \Sigma_V^{-1} C)^{-1}.\quad\quad&(8)\end{aligned} }

Note that in the scalar case y = cx + v with variances \sigma_x^2 and \sigma_v^2 (5) and (7) become the identity

\displaystyle \sigma_x^2 - \frac{\sigma_x^4c^2}{c^2 \sigma_x^2 + \sigma_v^2} = (\sigma_x^{-2} + c^2\sigma_v^{-2})^{-1}.\quad\quad(9)

In the case where X is a scalar measured repeatedly (C a column of ones and \Sigma_V diagonal), each element of Y is a noisy observation of X and (8) becomes the inverse of a sum of inverses of variances (inverse-variance weighting).
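The inverse-variance weighting interpretation can be sketched numerically (assuming NumPy; the variances below are made up): a scalar X observed three times means C is a column of ones and \Sigma_V is diagonal, and (8) collapses to the inverse of a sum of inverse variances.

```python
import numpy as np

# Scalar hidden variable X with prior variance sigma_x^2, observed three
# times with independent noise variances sigma_v_i^2 (values made up).
sigma_x2 = 4.0
sigma_v2 = np.array([1.0, 2.0, 0.5])

C = np.ones((3, 1))                 # each observation is X plus noise
Sigma_X = np.array([[sigma_x2]])
Sigma_V = np.diag(sigma_v2)

# Equation (8): Sigma = (Sigma_X^{-1} + C^T Sigma_V^{-1} C)^{-1}
Sigma = np.linalg.inv(np.linalg.inv(Sigma_X) + C.T @ np.linalg.inv(Sigma_V) @ C)

# Classic inverse-variance weighting: 1 / (1/sigma_x^2 + sum_i 1/sigma_v_i^2)
direct = 1.0 / (1.0 / sigma_x2 + np.sum(1.0 / sigma_v2))

assert abs(Sigma[0, 0] - direct) < 1e-12
```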

One can check algebraically that the expressions (4),(6) and (5),(7) are equivalent in the matrix case, or we may proceed as follows.

Let V = \left[ \begin{array}{cc} V_{11} & V_{12}\\ V_{21} & V_{22} \end{array} \right] = \left[ \begin{array}{cc}E(XX^T) & E(XY^T)\\ E(YX^T) & E(YY^T) \end{array} \right] be the covariance matrix of the joint vector \left[ \begin{array}{c} X\\ Y \end{array} \right] (in a slight abuse of notation we reuse the letter V; a subscripted V_{ij} below always denotes a block of this covariance matrix). Then since Y = CX + V we have V_{11} = \Sigma_X and

\begin{aligned}  V_{12} &= V_{21}^T\\  &= E(XY^T)\\  &= E(X(CX + V)^T)\\  &= EXX^T C^T + EXV^T\\  &= \Sigma_X C^T. \quad\quad(10)  \end{aligned}

Then as the Gaussian vector \left[ \begin{array}{c} X\\ Y \end{array} \right] is zero-mean with covariance V, the joint pdf of X and Y is proportional to \exp \left(-\frac{1}{2} [X^T Y^T]V^{-1}\left[ \begin{array}{c} X\\ Y \end{array} \right] \right).

The key step now is to make use of the following identity (also see this explanation) based on completion of squares:

\displaystyle  \exp \left(-\frac{1}{2} [X^T Y^T]V^{-1}\left[ \begin{array}{c} X\\ Y \end{array} \right] \right)  = \exp\left( -\frac{1}{2} (X^T - Y^TA^T) S_{22}^{-1}(X-AY)\right) \exp\left( -\frac{1}{2} Y^T V_{22}^{-1} Y\right),\quad\quad(11)

where A and S_{22} are matrices, defined similarly to A and s in the scalar equation

\displaystyle ax^2 + 2bxy + cy^2 = a(x-Ay)^2 + sy^2.

We will show that A = V_{12}V_{22}^{-1} and S_{22} = V_{11} - V_{12}V_{22}^{-1}V_{21} (S_{22} is the Schur complement of V_{22} in V also discussed in this previous blog post).

The second term in the right side of (11) is proportional to the pdf of Y (being Gaussian) p(Y), so the first term must be proportional to the conditional pdf p(X|Y). We are left to find S_{22} = \text{cov}(X|Y) and AY = E(X|Y).

From (11), S_{22}^{-1} is the top-left block of V^{-1} while -S_{22}^{-1}A is the top-right block of V^{-1}. To find these blocks, consider the block matrix equation

\displaystyle V \left[ \begin{array}{c} r\\s \end{array} \right] = \left[ \begin{array}{c} a\\b \end{array} \right] \quad \Rightarrow \quad \left[ \begin{array}{c} r\\s \end{array} \right] = V^{-1} \left[ \begin{array}{c} a\\b \end{array} \right]. \quad\quad(12)

This is the same as the system of equations

\displaystyle \begin{aligned}  V_{11} r + V_{12} s &= a, \quad\quad&(13)\\  V_{21} r + V_{22} s &= b. \quad \quad&(14)\\  \end{aligned}

Multiplying (13) by V_{21}V_{11}^{-1} gives

\displaystyle V_{21}r + V_{21}V_{11}^{-1}V_{12}s = V_{21}V_{11}^{-1}a.

Subtracting this from (14) gives

\begin{aligned} (V_{22} - V_{21}V_{11}^{-1}V_{12})s &= b - V_{21}V_{11}^{-1}a\\ \Rightarrow s &= (V_{22} - V_{21}V_{11}^{-1}V_{12})^{-1}b - (V_{22} - V_{21}V_{11}^{-1}V_{12})^{-1}V_{21}V_{11}^{-1}a.\quad \quad(15)\end{aligned}

Since \left[ \begin{array}{c} r\\s \end{array} \right] = V^{-1} \left[ \begin{array}{c} a\\b \end{array} \right] the coefficients of a and b in (15) are the bottom-left and bottom-right blocks of V^{-1} respectively. We may then write S_{11} := V_{22} - V_{21}V_{11}^{-1}V_{12} and so from (15)

\displaystyle S_{11} s = b - V_{21}V_{11}^{-1}a.\quad \quad(16)

Also we have from (13)

\displaystyle r = V_{11}^{-1} a - V_{11}^{-1} V_{12}s.\quad\quad (17)

Using (16) this becomes

\begin{aligned}  r &= V_{11}^{-1} a - V_{11}^{-1} V_{12} S_{11}^{-1}(b - V_{21}V_{11}^{-1}a)\\  &=(V_{11}^{-1} + V_{11}^{-1} V_{12} S_{11}^{-1} V_{21} V_{11}^{-1} )a - V_{11}^{-1} V_{12} S_{11}^{-1}b.\quad\quad(18)  \end{aligned}

Note that analogous to (16) we could have written

\displaystyle S_{22} r = a - V_{12}V_{22}^{-1} b \Rightarrow r = S_{22}^{-1} a -S_{22}^{-1}V_{12}V_{22}^{-1}b .\quad\quad(19)

Comparing coefficients of a, b of (18) and (19) gives

\displaystyle S_{22}^{-1} = V_{11}^{-1} + V_{11}^{-1} V_{12} S_{11}^{-1} V_{21} V_{11}^{-1}\quad\quad(20)

and

\displaystyle V_{11}^{-1} V_{12} S_{11}^{-1} = S_{22}^{-1} V_{12}V_{22}^{-1}.\quad\quad(21)

In the same way that S_{22} = \text{cov}(X|Y), S_{11} = \text{cov}(Y|X), which for Y = CX + V is simply \Sigma_V as we saw before.

Hence from (20),

\begin{aligned}  \Sigma^{-1} = S_{22}^{-1} &= V_{11}^{-1} + V_{11}^{-1} V_{12} S_{11}^{-1} V_{21} V_{11}^{-1}\\  &= \Sigma_X^{-1} + \Sigma_X^{-1}\Sigma_X C^T \Sigma_V^{-1} C \Sigma_X \Sigma_X^{-1}\\  &= \Sigma_X^{-1} + C^T \Sigma_V^{-1} C  \end{aligned}

and from the b-coefficient of (18),

\begin{aligned}  S_{22}^{-1}A &= V_{11}^{-1} V_{12} S_{11}^{-1}\\  \Rightarrow A &= S_{22} V_{11}^{-1} V_{12} S_{11}^{-1}\\  &= S_{22} \Sigma_X^{-1} \Sigma_X C^T \Sigma_V^{-1}\\  &= \Sigma C^T \Sigma_V^{-1}.  \end{aligned}

Hence E[X|Y] = AY = \Sigma C^T \Sigma_V^{-1} Y as desired, and equations (6)-(8) have been verified.
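As a final sanity check, the equivalence of (4),(6) and (5),(7) can also be confirmed numerically (a sketch assuming NumPy; the dimensions, random seed and matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4                      # X has length m, Y has length n

# Random symmetric positive definite covariances, and a full-rank C
# mapping X-space to Y-space so that Y = C X + V is well-defined.
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
Sigma_X = A @ A.T + m * np.eye(m)
Sigma_V = B @ B.T + n * np.eye(n)
C = rng.standard_normal((n, m))
Y = rng.standard_normal(n)

S = C @ Sigma_X @ C.T + Sigma_V                                      # cov(Y)
mean_4 = Sigma_X @ C.T @ np.linalg.solve(S, Y)                       # equation (4)
cov_5 = Sigma_X - Sigma_X @ C.T @ np.linalg.solve(S, C @ Sigma_X)    # equation (5)

Sigma = np.linalg.inv(np.linalg.inv(Sigma_X)
                      + C.T @ np.linalg.solve(Sigma_V, C))           # equation (8)
mean_6 = Sigma @ C.T @ np.linalg.solve(Sigma_V, Y)                   # equation (6)

assert np.allclose(mean_4, mean_6)
assert np.allclose(cov_5, Sigma)
```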

February 23, 2014

Australian marsupial genera

Filed under: nature — ckrao @ 2:09 am

Native Australian mammals (those that predate human arrival) include monotremes (platypus and echidna), marsupials, bats, rodents (all mouse-like) and sea mammals (seals, whales and dolphins, dugongs). The marsupials of Australia extend far beyond the kangaroo/possum/koala/wombat (diprotodont) variety to the marsupial moles, carnivorous marsupials such as quolls, dunnarts, the numbat and the Tasmanian devil, and the omnivorous bandicoots and bilbies. (The remaining marsupial orders are confined to the Americas and comprise the opossums, shrew opossums and the Monito del Monte.) Here is a list of extant marsupial genera from those orders that are native to Australia, made simply to learn more about their diversity.

Order Suborder/Family Subfamily/Tribe Genus Genus meaning # Australian species # world species notes
Notoryctemorphia (Marsupial moles) Notoryctidae Notoryctes southern digger 2 2 marsupial moles
Dasyuromorphia   (marsupial carnivores) Dasyuridae Dasyurinae – Dasyurini Dasycercus hairy tail 2 2 mulgaras
Dasykaluta hairy kaluta 1 1 little red kaluta
Dasyuroides resembling Dasyurus 1 1 Kowari
Dasyurus hairy tail 4 6 quolls
Myoictis mouse weasel 0 4 dasyures – New Guinea
Neophascogale new Phascogale 0 1 Speckled dasyure  – New   Guinea
Parantechinus near Antechinus 1 1 Dibbler
Phascolosorex pouched shrew 0 2 Marsupial shrews – New Guinea
Pseudantechinus false Antechinus 6 6 False antechinuses
Sarcophilus flesh lover 1 1 Tasmanian devil
Dasyurinae –   Phascogalini Antechinus hedgehog equivalent 10 10
Micromurexia small Murexia 0 1 Habbema dasyure – rocky areas of New Guinea
Murexechinus hedgehog mouse 0 1 Black-tailed dasyure – tropical dry forests of New Guinea
Murexia marsupial mouse (significance unknown) 0 1 Short-furred dasyure – New Guinea
Paramurexia near Murexia 0 1 Broad-striped dasyure – South east Papua New Guinea
Phascomurexia pouched Murexia 0 1 Long-nosed dasyure – tropical dry forests of New Guinea
Phascogale pouched weasel 2 2
Sminthopsinae –   Sminthopsini Antechinomys antechinus-mouse 1 1 Kultarr
Ningaui Aboriginal mythical creature 3 3
Sminthopsis mouse appearance 21 21 dunnarts
Sminthopsinae – Planigalini Planigale flat weasel 4 5
Myrmecobiidae Myrmecobius ant-living 1 1 numbat
Peramelemorphia   (Bilbies and bandicoots) Peramelidae Echymiperinae Echymipera pouched hedgehog 1 1 long-nosed spiny bandicoot
Microperoryctes small Peroryctes 0 5 striped bandicoots – New Guinea
Rhynchomeles beaked badger 0 1 Seram bandicoot – existence only recorded in 1920
Peramelinae Isoodon equal tooth 3 3 short-nosed bandicoots
Perameles pouched badger 3 3 long-nosed bandicoots
Peroryctidae Peroryctes pouched digger 0 2 New Guinean long-nosed bandicoots
Thylacomyidae Macrotis big-ear 1 1 bilby
Diprotodontia   (“two front teeth” – kangaroos and relatives) Vombatiformes – Phascolarctidae Phascolarctos pouched bear 1 1 koala
Vombatiformes –   Vombatidae Vombatus wombat 1 1
Lasiorhinus hairy-nose 2 2
Phalangeriformes   (possums and gliders) – Phalangeridae Ailurops cat-like 0 2 bear cuscuses – NE Indonesia inc Sulawesi
Phalanger notable digits 1 13 cuscus
Spilocuscus spotted cuscus 1 5
Strigocuscus thin cuscus 0 2
Trichosurus hairy tail 5 5 bushtail possums
Wyulda brush-tail possum (mistakenly assigned) 1 1 scaly-tailed possum
Phalangeriformes –   Burramyidae (pygmy possums) Burramys stony-place mouse 1 1 Mountain Pygmy Possum
Cercartetus possibly tail-in-air 4 4
Phalangeriformes – Tarsipedidae Tarsipes tarsier-foot 1 1 Honey possum
Phalangeriformes –   Petauridae Dactylopsila naked finger 1 4 striped possums
Gymnobelideus naked Belideus 1 1 Leadbeater’s possum
Petaurus rope dancer 4 6 gliders
Phalangeriformes –   Pseudocheiridae (ringtailed possums and relatives) Hemibelideus half Belideus (fluffy-tailed glider) 1 1 Lemur-like ringtail possum
Petauroides Petaurus-like 1 1 Greater Glider
Petropseudes rock-Pseudocheirus 1 1 Rock-haunting ringtail possum
Pseudocheirus false hand 2 1 Common ringtail possum
Pseudochirops Pseudocheirus-like 1 5
Pseudochirulus little Pseudocheirus 2 8
Phalangeriformes –   Acrobatidae Acrobates acrobat 1 1 Feathertail glider
Distoechurus tail in two rows 0 1 Feather-tailed possum – New Guinea
Macropodiformes –   Macropodidae Lagostrophus turning hare 1 1 Banded hare-wallaby
Dendrolagus tree hare 2 13 tree-kangaroos
Lagorchestes dancing hare 2 2
Macropus long foot 13 13 kangaroos and wallaroos
Onychogalea nailed weasel 2 2 nail-tail wallabies
Petrogale rock weasel 16 16 rock-wallabies
Setonix bristle-claw 1 1 Quokka
Thylogale pouched weasel 3 7 pademelons
Wallabia wallaby 1 1 Swamp wallaby
Macropodiformes –   Potoroidae Aepyprymnus high rump 1 1 Rufous rat-kangaroo
Bettongia bettong 4 4
Potorous potoroo (Aboriginal) 3 3
Macropodiformes – Hypsiprymnodontidae Hypsiprymnodon high rump tooth 1 1 Musky rat-kangaroo
Totals 151 224


Reference: Dictionary of Australian and New Guinean Mammals, edited by Ronald Strahan & Pamela Conder, CSIRO Publishing, 2007.

February 9, 2014

The validity of the index laws for complex numbers

Filed under: mathematics — ckrao @ 11:47 am

In an earlier post we saw that some of the index laws fail when the base is negative or zero. Now we shall see what occurs when the allowed values of base and exponent are extended to be complex numbers.

For z_1 \in \mathbb{C}\backslash \{ 0\} we can proceed analogously to the real case and define

\displaystyle z_1^{z_2} = \exp(z_2 \log (z_1)).\quad \quad(1)

However what are the exponential and logarithm of a complex number? Last time we defined the logarithm before the exponential and we can do the same here. For complex z we can define \log(z) as \int_1^z \frac{1}{t}\ \text{d}t as for the real case, but note that this time it is a contour integral along any path from 1 to z not including the origin. By allowing the path to wind around the origin any number of times, we find the function is no longer single-valued. For example if we choose the contour to be the unit circle from +1 to itself in an anticlockwise direction, then the contour may be parametrised by t = \cos \theta + i \sin \theta for \theta \in [0, 2\pi] (\text{d}t = (-\sin \theta + i \cos \theta)\text{d}\theta) and we have

\begin{aligned} \log(1) &= \int_1^1 \frac{1}{t}\ \text{d}t\\ &= \int_0^{2\pi} (\cos \theta - i \sin \theta)(-\sin \theta + i \cos \theta)\text{d}\theta\\ &= \int_0^{2\pi} i( \cos^2 \theta + \sin^2 \theta )\ \text{d}\theta \\ &= 2\pi i. \end{aligned}

More generally if w is a logarithm of z, so is w + 2\pi i n where n is an integer. Defining \exp(w) for complex w to be z where w is a logarithm of z, we have

z = \exp(w + 2\pi i n) = \exp(w),\quad n \in \mathbb{Z}.

(Note that \displaystyle \exp(\log (z)) = z while \displaystyle \log(\exp (z)) = z + 2\pi i k, k \in \mathbb{Z}.)

Based on these definitions we can show that \log(z_1z_2) = \log(z_1) + \log(z_2) and \exp(z_1 + z_2) = \exp(z_1)\exp(z_2) in an analogous way to the real case, except that the first equation is to be interpreted as an equality of sets of values rather than individual values. Note that for this reason we have to be careful when adding or subtracting logarithms: for example for complex numbers,

\log(z) - \log(z) = \log(1) = 2\pi i n \neq 0

and

\begin{aligned} \log(z^2) &= \log (z) + \log (z)\\ &= (\log (z) + 2\pi i m) + (\log (z) + 2\pi i n)\\ &= 2\log (z) + 2\pi i k, \quad k \in \mathbb{Z}\\ \text{while }\quad 2 \log (z) &= 2 (\log (z) + 2\pi i n)\\ &= 2\log (z) + 4 \pi i n, \quad n \in \mathbb{Z}. \end{aligned}

Hence we cannot write a\log (z) + b \log (z) = (a+b)\log (z) without paying special attention to the values of a,b,z. If we want to know when \log (z^c) = c \log (z), we can verify that \log (z^c) = c \log (z) + 2\pi i k, k \in \mathbb{Z} which is only equal to c \log (z) = c (\log (z) + 2\pi i m) when cm covers all integers, i.e. if c = 1/n for some non-zero integer n:

\displaystyle \log\left(z^{1/n}\right)= \frac{1}{n} \log(z), \quad\quad n \in \mathbb{Z}\backslash\{0\}.

Now in general,

\begin{aligned} z_1^{z_2} &= \exp(z_2 \log (z_1))\\ &= \exp(z_2 (\log (z_1) + 2\pi i n) )\\ &= \exp(z_2 \log (z_1))\exp(2\pi i n z_2).\quad\quad (2)\end{aligned}

This will be multi-valued if nz_2 takes on non-integer values, as n varies over the integers. It will only be single-valued if z_2 is an integer. For example treated as a complex power, 2^{1/2} will have two values: \sqrt{2} and -\sqrt{2} while 2^{1/3} will take three values. The number 2^{\sqrt 2} will have infinitely many complex values \exp(\sqrt{2} \log (2))\exp(2\pi i n \sqrt{2}), n \in \mathbb{Z} although only one of them is real-valued. Note that through (2) we can work out quantities such as:

  • (-1)^{\sqrt{2}} = \exp(i \pi \sqrt{2})\exp(2\sqrt{2} \pi i n), n \in \mathbb{Z} (infinitely many non-real values!)
  • i^i = \exp(-\pi/2)\exp(-2\pi n), n\in \mathbb{Z} (infinitely many real values!).
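These multi-valued computations are easy to reproduce (a sketch with Python's cmath, whose log returns the principal value; power_values is a hypothetical helper that enumerates a few branches of (2)):

```python
import cmath

def power_values(z1, z2, ns=range(-2, 3)):
    """A few branches of the multi-valued power z1**z2 = exp(z2*(Log z1 + 2*pi*i*n))."""
    principal = cmath.exp(z2 * cmath.log(z1))
    return [principal * cmath.exp(2j * cmath.pi * n * z2) for n in ns]

# i**i: every branch is real, of the form exp(-pi/2) * exp(-2*pi*n).
vals = power_values(1j, 1j)
assert all(abs(v.imag) < 1e-12 for v in vals)
assert any(abs(v - cmath.exp(-cmath.pi / 2)) < 1e-12 for v in vals)

# 2**(1/2) as a complex power takes exactly two values: +sqrt(2) and -sqrt(2).
vals = power_values(2, 0.5)
assert any(abs(v - 2 ** 0.5) < 1e-9 for v in vals)
assert any(abs(v + 2 ** 0.5) < 1e-9 for v in vals)
```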

Also note from (2) that

\log(z_1^{z_2}) = z_2 \log (z_1) + 2\pi i k = z_2 \textrm{Log} (z_1) + z_2 2\pi i k_1 + 2\pi i k_2.

One can define the principal value of the logarithm \textrm{Log}(z) to be that with imaginary part in the interval (-\pi, \pi]. Similarly one can define the principal value of the power function as

\displaystyle z_1^{z_2} := \exp(z_2 \textrm{Log} (z_1)).\quad \quad (3)

This gives single-valued results but they may not be as expected. For example, since \textrm{Log}(-1) = i \pi, (-1)^{1/3} = \exp(i \pi/3) rather than the real-valued root -1. However we can now say a \textrm{Log}(z) + b \textrm{Log}(z) = (a+b)\textrm{Log}(z), being single-valued.
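For instance (using Python's cmath, whose log returns the principal value \textrm{Log}):

```python
import cmath

# Principal-branch power (3): Log(-1) = i*pi, so (-1)^(1/3) = exp(i*pi/3),
# not the real cube root -1.
principal = cmath.exp((1 / 3) * cmath.log(-1))
assert abs(principal - cmath.exp(1j * cmath.pi / 3)) < 1e-12
assert abs(principal - (-1)) > 1                  # far from the real root -1

# Python's built-in complex power uses the same principal branch.
assert abs(complex(-1) ** (1 / 3) - principal) < 1e-9
```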

We would like to know which of the index laws hold. In the remainder of the post we verify the identities summarised in the following table. The real number case was already treated in this post.

Law | Positive real x,y; real a,b | Multiple-valued power z_1^{z_2} = \exp(z_2 \log z_1) | Single-valued power z_1^{z_2} = \exp(z_2 \textrm{Log} z_1)
(1) | x^a x^b = x^{a+b} | z^{a+b} a subset of z^a z^b | z^a z^b = z^{a+b}
(2) | (x^a)^b = x^{ab} | z^{ab} a subset of (z^a)^b | (z^a)^b = z^{ab}\exp(2\pi i b n_0)
(3) | (xy)^a = x^a y^a | (z_1 z_2)^a = z_1^a z_2^a | (z_1 z_2)^a = z_1^a z_2^a \exp(2 \pi i a n_{+})
(4) | x^0 = 1 | z^0 = 1 | z^0 = 1
(5) | 1^a = 1 | 1^a = \exp(2 \pi i k a) | 1^a = 1
(6) | x^{-a} = 1/x^a | z^{-a} = 1/z^a but z^{-a}z^a = \exp(2\pi i k a) | z^{-a} = 1/z^a and z^{-a}z^a = 1
(7) | x^a / x^b = x^{a-b} | z^{a-b} a subset of z^{a}/z^{b} | z^a / z^b = z^{a-b}
(8) | (x/y)^a = x^a/y^a | (z_1/z_2)^a = z_1^a/z_2^a | (z_1/z_2)^a = \frac{z_1^a}{z_2^a}\exp(2\pi i a n_{-})

Note that in the table, n_0, n_{+},n_{-} are particular integers chosen to enable equality.

For verifying identity (1) in the multi-valued power case we have

\begin{aligned} z^{a+b} &= \exp((a+b)\log (z))\\ &= \exp\left((a+b)(\textrm{Log} (z) + 2\pi i k)\right)\\ &= \exp\left((a+b)\textrm{Log} (z) \right) \exp\left( 2\pi i k (a+b)\right) \end{aligned}

and

\begin{aligned}z^a z^b &= \exp(a \log(z)) \exp(b \log(z))\\&= \exp\left(a (\textrm{Log} z + 2\pi ik) \right)\exp\left(b (\textrm{Log} z + 2\pi in) \right)\\ &= \exp\left((a+b)\textrm{Log} (z) \right) \exp\left(2\pi i (ka + nb) \right).\end{aligned}

This shows that the set of values of z^{a+b} is a subset of the set of values of z^a z^b. In the single-valued case,

\begin{aligned} z^az^b &= \exp(a \textrm{Log} (z)) \exp(b \textrm{Log} (z))\\ &= \exp(a \textrm{Log} (z) + b \textrm{Log} (z))\\&= \exp((a+b) \textrm{Log} (z))\\ &= z^{a+b}.\end{aligned}

For identity (2) in the multi-valued power case we have

\begin{aligned} (z^a)^b &=(\exp(a \log (z)))^b\\ &= \exp(b \log(\exp(a \log( z))))\\ &= \exp(b(a \log (z)+2 \pi ik))\\ &= \exp(ba \log (z)) \exp(2 \pi ibk)\\ &= z^{ab}\exp(2\pi ibk). \end{aligned}

This shows that the set of values of z^{ab} is a subset of the values of (z^a)^b. We have the equality (z^a)^b = z^{ab} if \exp(2\pi ibk) =1, or bk \in \mathbb{Z} for all k, which is true if b \in \mathbb{Z}. In the single-valued case, the integer k is chosen so that a \textrm{Log}(z)+2 \pi ik has imaginary part in the interval (-\pi, \pi].
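The choice of k matters: taking z = -1, a = 2, b = 1/2 exhibits a non-trivial factor \exp(2\pi i b n_0) in the single-valued law (2). A sketch (Python's cmath; ppow is a hypothetical helper implementing the principal power (3)):

```python
import cmath

def ppow(z, a):
    """Principal-branch power (3): exp(a * Log(z)) (hypothetical helper)."""
    return cmath.exp(a * cmath.log(z))

z, a, b = complex(-1), 2, 0.5
lhs = ppow(ppow(z, a), b)      # ((-1)^2)^(1/2) = 1^(1/2) = 1
rhs = ppow(z, a * b)           # (-1)^1 = -1

assert abs(lhs - 1) < 1e-9
assert abs(rhs + 1) < 1e-9     # lhs = rhs * exp(2*pi*i*b*n_0) with n_0 = -1
```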

For identity (3) in the multi-valued power case we have

\begin{aligned} (z_1 z_2)^a &= \exp(a \log(z_1 z_2))\\ &= \exp(a(\log (z_1)+\log (z_2)))\\ &= \exp(a \log (z_1))\exp(a \log (z_2))\\ &= z_1^a z_2^a.\end{aligned}

In the single-valued power case,

\begin{aligned} (z_1 z_2)^a &= \exp(a \textrm{Log}(z_1 z_2))\\ &= \exp(a(\textrm{Log} (z_1)+\textrm{Log} (z_2) + 2\pi i n_{+}))\\ &= \exp(a \textrm{Log} (z_1))\exp(a \textrm{Log} (z_2))\exp(2 \pi i a n_{+})\\ &= z_1^a z_2^a \exp(2 \pi i a n_{+}). \end{aligned}

Here the value n_{+} is chosen so that \textrm{Log} (z_1)+\textrm{Log} (z_2) + 2\pi i n_{+} has imaginary part in the interval (-\pi, \pi] (i.e. so that it equals \textrm{Log}(z_1 z_2)).

Identity (4) comes from setting b=0 in identity (1). Identity (5) results from setting y or z_2 to 1 in identity (3). Identity (6) results from setting b = -a in identity (1) and using identity (4). Finally identities (7) and (8) follow from identities (1) and (3).

The moral of all this is that care is to be taken when applying the index laws to complex numbers (or indeed even when adding logarithms) by virtue of the multi-valued nature of the complex logarithm.


H. Haber, The complex logarithm, exponential and power functions, UC Santa Cruz Physics 116A notes (2011).

January 30, 2014

The frequency of 40+ days in Melbourne

Filed under: climate and weather — ckrao @ 10:14 am

In a recent post I mentioned that Melbourne (the regional office weather station) has had 203 days of a maximum temperature of 40°C or more in the 159 years from 1855 to 2013. January 2014 alone has had 5 more such days, tying the record for instances in a month. There were two other times when Melbourne had 5 days reaching at least 40°C in a month: January 1905 and January 1908. The former had four of the hot days within a string of five consecutive days, while the latter had all five on consecutive days. This year included a sequence of four consecutive days above 41°C, the first time that has happened since records began.

The following plot, generated via the geom_smooth feature of ggplot2 in R, shows a dip in the frequency of these days in the mid 20th century, followed by an increasing frequency since 1980. (The year 2014, being incomplete to date, is not included.)

Frequency of 40 degree days in Melbourne

Interestingly the 10 years 1969-1978 only had 3 instances of 40+ degree days (including none during the four years 1969-1972). The table below shows the frequency per decade since records began, with the current period having the most. The most recent year Melbourne did not have a 40+ day was 2002.

Decade # 40+ °C days
1855-1864 12
1865-1874 11
1875-1884 13
1885-1894 6
1895-1904 21
1905-1914 22
1915-1924 11
1925-1934 6
1935-1944 15
1945-1954 10
1955-1964 8
1965-1974 8
1975-1984 14
1985-1994 10
1995-2004 16
2005-2014 27+

Melbourne has so far never had 7 days of 40°C in a year (the mark of 6 was reached in 1898 and 1900), and 2014 has a chance of at least equalling that record (edit: 2014 has set a record with its 7th day of 40+°C temperatures, the mark being reached on Jan 14-17, 28 and Feb 8-9).

January 28, 2014

The validity of the index laws for real numbers

Filed under: mathematics — ckrao @ 12:51 pm

The fundamental index laws are given by

\begin{aligned} x^a x^b &= x^{a + b}\quad\quad &(1)\\(x^a)^b &= x^{ab}\quad \quad &(2)\\(xy)^a &= x^a y^a\quad\quad &(3)\end{aligned}

If these are true we can also deduce

\begin{aligned} x^0 &= 1 \ \text{ for }x \neq 0\quad \text{(setting } b = 0\text{ in (1))}\quad\quad\quad\quad&(4)\\1^a &= 1 \ \text{ for real numbers }a \quad \text{(setting } y = 1\text{ in (3))}\quad\quad\quad\quad&(5)\\x^{-a} &= 1/x^{a}\quad\text{(setting } b=-a \text{ in (1) and using (4))}\quad\quad&(6)\\x^{a}/x^{b} &= x^{a-b}\quad\text{(replacing } b \text{ with }-b \text{ in (1) and using (6))}\quad\quad&(7)\\(x/y)^a &= x^a/y^a \ \text{ where }y \neq 0\quad\text{(replacing }y \text{ with }1/y\text{ in (3))}\quad\quad&(8)\end{aligned}

In this post we discuss the following conditions under which these laws hold.

  • Case 1: The laws are true if x and y are positive real numbers and a, b are real numbers.
  • Case 2: If x or y is negative we require a and b to be rational with odd denominator for x^a or x^b to be defined (this includes a or b being integers as the denominator in this case is 1). All the laws except (2) hold in this case. For (2) to be true we require the denominators of a and ab in reduced form to be odd and that either (a) the numerators and denominators of a and b are all odd or (b) the numerator of ab in reduced form is even.
  • Case 3: The laws are mostly true if x = 0, except a,b need to be positive and laws (6), (7) do not apply.

Case 1: Positive bases

Let us seek to define x^a from first principles for arbitrary x positive and a real. We can be motivated by the continuity of the function x^t as a function of t (where x is fixed) and think of x^a as the limit of values x^r as rational numbers r approach a. To me it is easier to proceed via the logarithm function which may be defined as

\displaystyle \log(x) := \int_1^x \frac{1}{t}\ \text{d}t\quad x > 0.\quad\quad(9)

Being the area under a continuous positive-valued function, this function is continuous and monotonically increasing. It is also apparent that \log(1) = 0. From the definition the important identity \log (xy) = \log(x) + \log(y) (where x, y > 0) can be derived via a change of variable u=t/x as follows:

\begin{aligned} \log (xy) &= \int_1^{xy} \frac{1}{t}\ \text{d}t\\&=\int_1^{x} \frac{1}{t}\ \text{d}t + \int_x^{xy} \frac{1}{t}\ \text{d}t\\&= \int_1^{x} \frac{1}{t}\ \text{d}t + \int_1^{y} \frac{1}{ux}x\ \text{d}u\\&= \int_1^{x} \frac{1}{t}\ \text{d}t + \int_1^{y} \frac{1}{u}\ \text{d}u\\&=\log(x) + \log(y). \quad \quad (10)\end{aligned}
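Definition (9) and identity (10) lend themselves to a direct numerical check (a sketch in Python; log_via_integral is a hypothetical helper, and the midpoint rule with this step count is an arbitrary choice):

```python
import math

def log_via_integral(x, steps=100_000):
    """Approximate the integral of 1/t over [1, x] (definition (9)) by the midpoint rule."""
    h = (x - 1.0) / steps
    return sum(h / (1.0 + (i + 0.5) * h) for i in range(steps))

# The integral definition matches the library logarithm...
assert math.isclose(log_via_integral(2.0), math.log(2.0), rel_tol=1e-6)

# ...and satisfies log(xy) = log(x) + log(y), identity (10).
x, y = 2.0, 3.0
assert math.isclose(log_via_integral(x * y),
                    log_via_integral(x) + log_via_integral(y), rel_tol=1e-6)
```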

By repeated use of this rule, \log(x^n) = n \log(x) for positive integers n. As \log(2) > 0, \log(2^n) = n \log 2 grows without bound as n increases, which shows that the log function is unbounded above. It is also unbounded below due to the relationship \log(1/x) = -\log(x) (which follows from 0 = \log (1) = \log(x.(1/x)) = \log x + \log (1/x)). Hence the log function is monotonic, continuous and has range (-\infty, \infty). It thus has a continuous inverse which is how we may define the exponential function: \exp(x) is the unique positive value y satisfying

\displaystyle x = \int_1^y \frac{1}{t}\ \text{d}t.

From the relationship \log(uv) = \log(u) + \log(v) follows

\exp(x+y) = \exp(x) \exp(y) \quad\quad(11)

where x and y are real. Repeated use of this gives

\displaystyle \exp(nx) = \exp(x)^n \quad \quad (12)

for n a positive integer. Defining e:= \exp(1), we have the familiar form \exp(n) = e^n.

We can extend (12) to rational values of n. Let us recall what we mean by a rational power. For a > 0 and relatively prime positive integers p, q we define a^{p/q} to be the unique positive number y such that y^q = a^p. (That the qth root exists follows from the fact that the function x^q has an inverse for x > 0.)

From (12) we then have

\displaystyle (\exp(x))^p = \exp(px) = \exp((px/q).q) = \exp(px/q)^q.

Setting \exp(x) to a and \exp(px/q) to y we have thus shown a^p = y^q, so \exp(x)^{p/q} = \exp(px/q). Therefore (12) is satisfied when n is positive rational. Finally we use \exp(-x) = 1/\exp(x) (which can be proved using (11)) to extend (12) to negative rationals:

\displaystyle \exp(nx) = 1/\exp(-nx) = 1/\exp(x)^{-n} = \exp(x)^n.

We are finally ready to define x^a for arbitrary x positive and a real relying on continuity. We simply extend (12) replacing \exp(x) with x (also replacing x with \log(x)) and n with real values a:

\displaystyle x^a := \exp(a \log(x)), \quad x > 0, a \in \mathbb{R}.\quad\quad(13)

From this definition it is quick to verify laws (1) and (2):

\begin{aligned} x^a x^b &= \exp(a \log(x))\exp(b \log (x))\\ &= \exp(a \log x + b\log x)\\ &= \exp((a+b)\log x)\\ &= x^{a+b}\quad\quad(14)\\\text{and }\ (x^a)^b &= \exp(b \log(x^a))\\ &= \exp(b a \log(x))\\ &= x^{ab}\quad\quad(15)\end{aligned}

Law (3) is immediate if either x or y is equal to 1. Otherwise, we can find c so that y = x^c (choose c = \log(y)/\log(x)) and then we have from (1) and (2) the following:

\begin{aligned} (xy)^a &= (x.x^c)^a\\ &= (x^{1+c})^a\\ &= x^{a + ac}\\ &= x^a x^{ac}\\ &= x^a (x^c)^a\\ &= x^a y^a.\quad\quad (16) \end{aligned}
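As a numerical sketch of definition (13) and laws (1)-(3) for positive bases (plain Python; the sample values are arbitrary):

```python
import math

x, y, a, b = 2.0, 5.0, math.sqrt(2), -0.7

# Definition (13): x**a agrees with exp(a * log(x)) for x > 0.
assert math.isclose(x ** a, math.exp(a * math.log(x)))

# Laws (1), (2) and (3).
assert math.isclose(x ** a * x ** b, x ** (a + b))
assert math.isclose((x ** a) ** b, x ** (a * b))
assert math.isclose((x * y) ** a, x ** a * y ** a)
```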

Case 2: Negative bases

When the base x is negative we can define x^a by the real number (-1)^a(-x)^a provided (-1)^a exists. This will be the case if a is rational with odd-valued denominator (even roots of -1 are not real-valued, nor are irrational powers of -1). Writing a = p/q where p and q are relatively prime (q odd), we then have (-1)^{p/q} = (-1)^p. Since odd denominators are preserved under addition, law (1) can be shown to hold.
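A sketch of this definition in code (using Python's fractions to keep exponents in lowest terms; real_pow is a hypothetical helper, not a standard function):

```python
import math
from fractions import Fraction

def real_pow(x, a):
    """Real-valued x**a for rational a, with a negative base allowed only
    when a has an odd denominator in lowest terms (hypothetical helper)."""
    a = Fraction(a)                        # p/q, automatically in lowest terms
    if x >= 0:
        return x ** float(a)
    if a.denominator % 2 == 0:
        raise ValueError("no real value: even denominator with a negative base")
    sign = -1.0 if a.numerator % 2 else 1.0    # (-1)^(p/q) = (-1)^p for odd q
    return sign * (-x) ** float(a)

assert math.isclose(real_pow(-8, Fraction(1, 3)), -2.0)   # odd numerator: negative
assert math.isclose(real_pow(-8, Fraction(2, 3)), 4.0)    # even numerator: positive
```

Because Fraction always reduces, an exponent like 2/4 is treated as 1/2 and rejected for a negative base rather than silently giving a wrong answer.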

To check law (3) we consider the cases of both x,y being negative or only one (say y) being negative.

a) If x and y are both negative:

\begin{aligned} (xy)^{p/q} &= ((-x)(-y))^{p/q}\\&= (-x)^{p/q} (-y)^{p/q}\\ &= (-1)^{2p/q}(-x)^{p/q} (-y)^{p/q}\\&= (-1)^{p/q}(-x)^{p/q}(-1)^{p/q}(-y)^{p/q}\\&= x^{p/q}y^{p/q}\end{aligned}

b) If x is positive and y is negative:

\begin{aligned} (xy)^{p/q} &= (-1)^{p/q}(x(-y))^{p/q}\\&= (-1)^{p/q}(x)^{p/q} (-y)^{p/q}\\&= x^{p/q}y^{p/q}\end{aligned}

However rule (2) is a little more complicated. For example, (x^2)^{1/2} \neq x if x <0 so we cannot say

\displaystyle 1 = 1^{1/2} = ((-1)^2)^{1/2} = (-1)^{2/2} = (-1)^1 = -1.

Issues arise when even denominators in the exponents appear because positive square roots will be taken when negative numbers may be required. It is also possible that (x^a)^b = x^{ab} but for x^b not to be real-valued (e.g. (x^4)^{1/2} = x^2 but x^{1/2} is not real-valued).

For (x^a)^b = x^{ab} to make sense when x < 0 we require that the left and right sides are defined and are equal in the equality ((-1)^a)^b = (-1)^{ab}. Hence:

  • for the left side to exist, a must have odd denominator and if (-1)^a <0 then b must also have odd denominator
  • for the right side to exist, ab in reduced form must have odd denominator

Assuming both sides exist, their signs must also agree. Note that:

  • (a) the left side is negative iff a,b both have odd numerators and denominators and the same is true for the right side.
  • (b) the left side is positive iff either of a,b has even numerator and the right side is positive iff ab has even numerator and odd denominator in reduced form.

We conclude that (2) holds when either (a) the numerators and denominators of a and b are all odd (in which case the result is negative) or (b) the product ab has even numerator and odd denominator in reduced form (in which case the product is positive).

This allows the possibility of an even numerator cropping up such as (x^{2/3})^3 = x^2. It is interesting to see that (x^{4/3})^{1/2} = x^{2/3} is valid but (x^{2/3})^{1/2} = x^{1/3} is not for x < 0.

Also, reduced form (cancelling out even factors) is important to avoid erroneous calculations such as (-1)^{2/4} = ((-1)^2)^{1/4} = 1.

Finally we remark that for negative bases we lose continuity: for example (-1)^q switches between +1 and -1 depending on whether the numerator of q is odd or even.
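The case analysis above can be sketched in code. Below is a minimal Python sketch (the function name rational_power is my own, not from the post), following the convention that for negative x, x^{p/q} is defined only when q is odd once p/q is in lowest terms, and then equals (-1)^p (-x)^{p/q}.

```python
from fractions import Fraction

# A sketch of the convention above: for negative x, x**(p/q) is defined
# only when q is odd once p/q is in lowest terms, and then equals
# (-1)**p * (-x)**(p/q).
def rational_power(x, a):
    a = Fraction(a)                      # puts p/q in lowest terms
    p, q = a.numerator, a.denominator
    if x > 0:
        return x ** float(a)
    if x == 0:
        if a > 0:
            return 0.0
        raise ValueError("0 to a non-positive power is left undefined here")
    if q % 2 == 0:                       # negative base, even denominator: undefined
        raise ValueError("negative base requires an odd denominator in reduced form")
    sign = -1.0 if p % 2 else 1.0        # this is (-1)**p
    return sign * (-x) ** float(a)

print(rational_power(-8, Fraction(1, 3)))   # ≈ -2.0
print(rational_power(-1, Fraction(2, 3)))   # 1.0, since (-1)**2 = 1
```

Note that Fraction automatically reduces 2/4 to 1/2, so the erroneous calculation (-1)^{2/4} = 1 mentioned above is rejected rather than silently evaluated.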

Case 3: Base 0

Note that 0^a is 0 for positive a (again defined first for rationals then extended to positive reals by continuity). However it is undefined for negative a, which is why the laws only hold for positive indices and neither (5) nor (6) applies. In the discrete world it is common to define 0^0 to be 1 for convenience, but the function x^y fails to be continuous at (x,y) = (0,0) for any choice of value of 0^0.


I’ll end this post with a cool-looking exponential-of-logarithm identity that is not as well known as it perhaps ought to be. For x, y > 0 we have

\displaystyle x^{\log y} = y^{\log x}.

(Taking logarithms of both sides gives \log y \log x = \log x \log y!)

For example, 4^{\log_2 9} = 9^{\log_2 4} = 9^2 = 81.
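This is easy to confirm numerically; a quick Python check (any logarithm base works, since changing base scales both exponents by the same factor):

```python
import math

# Numerical confirmation of x**log(y) == y**log(x) for a few positive pairs.
for x, y in [(4.0, 9.0), (2.5, 7.1), (10.0, 0.3)]:
    assert math.isclose(x ** math.log(y), y ** math.log(x), rel_tol=1e-12)

# The base-2 example from the post:
print(4 ** math.log2(9), 9 ** math.log2(4))  # both ≈ 81
```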

December 31, 2013

Large day to day temperature increases for Melbourne

Filed under: climate and weather — ckrao @ 4:07 am

Melbourne has recently had a few unusually large maximum temperature increases from one day to the next, and I wanted to see how often this occurs. For example this year the city weather station had maximums of 19.8°C and 31°C on Nov 30 and Dec 1, 26.9°C and 39.9°C on Dec 18-19, 22.4°C and 36.5°C (Dec 27-28), and 25.5°C and 40.8°C (Jan 16-17). Similarly large decreases in temperature are more frequent due to cold fronts sweeping south-eastern Australia.

The following analysis was carried out with data from Australia’s Bureau of Meteorology.

Firstly, the following boxplot illustrates the distribution of maximum temperature differences from one day to the next. In the summer months large increases occur more frequently than I had expected. (To interpret a box plot, the thick black line represents the median, the red boxes span the quartiles and the dashed lines extend 1.5 times the interquartile range in both directions. Outliers beyond this range are plotted separately.)


One data point that immediately stands out is at the top left of the graph – an increase from a maximum of 15.1°C to 40.3°C on 15-16 January 1900! This appears too large to be plausible.

As expected the temperature differences are negatively skewed in each month (lower tail fatter than the upper tail), with skewness values tabulated below.

Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec
-0.37	-0.59	-0.46	-0.51	-0.37	-0.07	-0.09	-0.40	-0.52	-0.46	-0.43	-0.42

Zooming in on the warmer months December-March we have the following histograms showing the fatter lower tail. However the upper tail is larger than I had anticipated.


Here are answers to some questions I had posed regarding this data.

How often is the maximum temperature above 30°C after failing to reach 20°C the previous day?

This has happened on average 1.2 times per year in Melbourne with frequency-by-month shown below.

Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec	Total
35	16	20	2	0	0	0	0	0	13	59	53	198

There have in fact been 13 occasions (6 in January, 4 in November) where the maximum was above 35°C after being less than 20°C the previous day. This happened most recently in 1983 (35.0°C on 25/1 after 19.4°C on 24/1).

How often is the maximum temperature above 40°C after failing to reach 25°C the previous day?

This has happened 25 times (14 in January), with 24.4°C and 44.7°C on 9-10 Jan 1939 and 24.1°C and 43.3°C (23-24 Dec 1868) being two of the bigger increases. Most recently we had maximums of 24.2°C and 40.8°C on 15-16 Jan 2007. Melbourne has experienced 203 days of 40°C or above in its 159 years of records.

How often is there a day-to-day increase of at least 15°C?

This has happened on average 0.8 times per year in Melbourne, with about half of the occurrences in January as shown below.

Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec	Total
62	19	5	0	0	0	0	0	0	0	13	31	130
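The counting behind these tables is straightforward. Below is a minimal Python sketch, assuming the daily maxima are available as a plain list (count_rises is my own helper; the actual analysis used Bureau of Meteorology records).

```python
# Count day-to-day rises of at least `threshold` degrees in a list of daily maxima.
def count_rises(maxima, threshold):
    """Count days whose maximum exceeds the previous day's maximum by at least `threshold`."""
    return sum(1 for prev, cur in zip(maxima, maxima[1:]) if cur - prev >= threshold)

# Illustrative values drawn from the post (not actually consecutive days):
sample = [19.8, 31.0, 26.9, 39.9, 25.5, 40.8]
print(count_rises(sample, 15.0))  # 1  (only 25.5 -> 40.8 is a rise of at least 15)
```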


Finally, listed below are some notable events.

  • The days 8-13 January 1939 (the 13th was Black Friday) had maximum temperatures of 43.1, 24.4, 44.7, 33.5, 25.6 and 45.6°C respectively, hence containing two 20-degree increases! The only other 20-degree increase was the anomalous 15.1°C to 40.3°C jump from 15-16 January 1900 mentioned earlier.
  • 9-10 Jan 1877: 19.7°C and 38.1°C
  • 9-10 Jan 1882: 19.9°C and 37.1°C
  • 26-28 Feb 1865 had maximum temperatures of 20.3, 39.7 then back to 19.9°C.
  • 1-3 Mar 1893 had maximum temperatures of 23.0, 40.8 then back to 22.1°C.
  • 4-5 Apr 1888: 17.9°C and 30.1°C, the biggest increase in April
  • 29-30 Oct 1919: 20.1°C and 34.7°C, the biggest increase in October
  • 13-14 Nov 1878: 22.3°C and 39.4°C, the biggest increase in November
  • 23-24 Dec 1868: 24.1°C and 43.3°C (mentioned earlier)
  • 15-16 Dec 1897: 22.3°C and 41.7°C

December 25, 2013

The binomial theorem for non-positive-integer indices

Filed under: mathematics — ckrao @ 9:16 pm

If n is a positive integer, the expansion of (x+y)^n has each term being a product of n variables of the form x^k y^{n-k} where k ranges from 0 to n. The coefficient of x^k y^{n-k} is precisely the number of ways we can choose k of the n variables to be x, which is \binom{n}{k}. Hence we have the binomial theorem:

\displaystyle (x+y)^n = \sum_{k=0}^n \binom{n}{k} x^k y^{n-k}. \quad \quad(1)

But what if n is not a positive integer? This post is about how we extend this formula and why it still holds. Interestingly the same formula holds with minor modifications. Firstly we change the upper limit of the sum to infinity – the sum will then only converge in particular circumstances. Secondly we extend the definition of the binomial coefficient to complex values of n via \binom{n}{k} := \frac{n(n-1)\ldots (n-k+1)}{k!}.

Note that if x is a complex number then x^k is defined for any non-negative integer k – we simply do repeated multiplication, and define x^0 = 1 (even for x = 0). However a non-integer power of a complex number is not straightforward to define unless the number is a positive real. If x > 0 we can define x^n := \exp(n \alpha) where \alpha is the unique real solution to \exp(\alpha) = x. Extending the definition to other complex numbers leads to issues with multifunctions or discontinuities (e.g. (-1)^{1/2} could be either of the two values i or -i). Hence we are going to restrict ourselves to the case where x and y are real.

While the term x^k is fine, we need to take care with y^{n-k} if n is no longer an integer. Hence we shall restrict ourselves to non-negative y. The binomial theorem is valid when y = 0, so consider y > 0. When does the infinite series converge? By the ratio test, the sum \sum_{k=0}^{\infty}a_k converges if |a_{k+1}|/|a_k| converges to a limit less than 1 as k \rightarrow \infty. (The sum does not converge if the limit is greater than 1; if the limit does not exist or equals 1, the test is inconclusive.) Applying this test to our case of a_k = \binom{n}{k} x^k y^{n-k} gives us

\begin{aligned}\frac{|a_{k+1}|}{|a_k|} &= \left| \frac{n(n-1)\ldots (n-(k+1)+1) x^{k+1}y^{n-(k+1)}}{(k+1)!} \cdot \frac{k!}{n(n-1)\ldots (n-k+1) x^k y^{n-k}} \right| \\ &= \frac{|n-k|}{k+1}\frac{|x|}{y}. \end{aligned}

This has limit less than 1 as k \rightarrow \infty provided |x| < y. We can thus state the following.

If n is a complex number, x is real, y is positive  and |x| < y, then the sum \displaystyle\sum_{k=0}^{\infty} \binom{n}{k} x^k y^{n-k} converges.

The reason this sum is equal to (x+y)^n is a consequence of the Taylor series expansion of (1+x)^n about x = 0 and the fact that the identity \displaystyle (x+y)^n = y^n (x/y + 1)^n is valid when y > 0. Also note that the condition |x| < y implies x+y > 0 so (x+y)^n is well defined.
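The boxed claim can be checked numerically. Below is a minimal Python sketch (binom_series is my own helper); it sums the series using the term ratio a_{k+1}/a_k = \frac{n-k}{k+1}\cdot\frac{x}{y} derived above, which avoids computing large factorials directly.

```python
import math

# Sum the extended binomial series term by term, using the ratio
# a_{k+1}/a_k = (n - k)/(k + 1) * (x / y) between consecutive terms.
def binom_series(n, x, y, terms=200):
    total = 0.0
    term = y ** n                       # the k = 0 term: binom(n, 0) x^0 y^n
    for k in range(terms):
        total += term
        term *= (n - k) / (k + 1) * (x / y)
    return total

# With n = 1/2, x = 1, y = 2 (so |x| < y) the series should converge
# to (1 + 2)**0.5 = sqrt(3).
print(binom_series(0.5, 1.0, 2.0), math.sqrt(3))
```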

We can also use a differential equations approach to proving this. Fix y and consider \displaystyle f(x) =\sum_{k=0}^{\infty} \binom{n}{k} x^k y^{n-k} as a power series in x valid for |x| < y. The power series is differentiable term by term within this interval of convergence and so

\displaystyle f'(x) = \sum_{k=1}^{\infty} \binom{n}{k} k x^{k-1}y^{n-k}. \quad \quad (2)

We may also write this as

\begin{aligned} f'(x) &=\sum_{j=0}^{\infty} \binom{n}{j+1} (j+1) x^{j}y^{n-(j+1)}\\ &= \sum_{j=0}^{\infty} \frac{n(n-1)\ldots (n-j) (j+1) x^{j }y^{n-(j+1)}}{(j+1)!}\\ &= \sum_{k=0}^{\infty} \binom{n}{k} (n-k) x^{k}y^{n-k-1}.\quad \quad (3) \end{aligned}

Adding x times (2) to y times (3),

\begin{aligned} (x + y)f'(x) &= \sum_{k=0}^{\infty} \binom{n}{k} k x^{k}y^{n-k} + \sum_{k=0}^{\infty} \binom{n}{k} (n-k) x^{k}y^{n-k}\\ &= n \sum_{k=0}^{\infty}\binom{n}{k}x^{k}y^{n-k}\\ &= nf(x). \quad\quad (4)\end{aligned}

Hence by (4)

\begin{aligned} d/dx [(x+y)^{-n} f(x) ] &= (x+y)^{-n} f'(x) -n (x+y)^{-n-1} f(x)\\ &= (x+y)^{-n-1} [(x+y) f'(x) - nf(x)]\\ &= 0,\end{aligned}

so we deduce that (x+y)^{-n} f(x) is constant. For x=0 this is y^{-n} f(0) = y^{-n}y^n = 1, and we conclude that f(x) = (x+y)^n.  Summarising, we have the following result.

If n is a complex number, x is real, y is positive  and |x| < y, then

\displaystyle (x+y)^n = \sum_{k=0}^{\infty} \binom{n}{k} x^k y^{n-k}.

A particularly attractive special case of this formula is for y=1, n = -1/2 and x replaced with -x:

\begin{aligned} \frac{1}{\sqrt{1-x}} &= \sum_{k=0}^{\infty} \binom{-1/2}{k} (-x)^k \\&= \sum_{k=0}^{\infty} \frac{(-1/2)(-3/2)\ldots (-1/2 - k + 1)}{k!} (-x)^k\\ &= \sum_{k=0}^{\infty} (-1)^k \frac{(1)(3)(5)\ldots (2k-1)}{k!}\frac{(-x)^k}{2^k}\\ &= \sum_{k=0}^{\infty} \frac{(2k)!}{(2)(4)\ldots (2k) k!}\frac{x^k}{2^k}\\ &= \sum_{k=0}^{\infty} \frac{(2k)!}{k! k!} \frac{x^k}{4^k} \\ &= \sum_{k=0}^{\infty} \binom{2k}{k} \frac{x^k}{4^k}, \quad |x| < 1.\end{aligned}
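As a sanity check, this final series can be summed numerically and compared against 1/\sqrt{1-x}. A short Python sketch (inv_sqrt_series is my own name):

```python
import math

# Numerical check of the 1/sqrt(1 - x) series derived above, using the
# central binomial coefficients C(2k, k).
def inv_sqrt_series(x, terms=400):
    return sum(math.comb(2 * k, k) * x**k / 4**k for k in range(terms))

# For x = 1/2 the series should converge to 1/sqrt(1/2) = sqrt(2).
print(inv_sqrt_series(0.5), math.sqrt(2))
```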

References/Further reading

November 30, 2013

Tendulkar statistics and links

Filed under: cricket,sport — ckrao @ 11:54 pm

Recently retired Sachin Tendulkar is probably the sportsman who gave me the most total enjoyment over my lifetime. His peak batting period was probably from 1993 to the 2003 World Cup, but he came back well from injury to become ICC Player of the Year in 2010, at age 37 and after more than 20 years of playing internationally. He has been adored by millions and, through all the pressure and expectation over such a long period, managed to remain a likeable, humble figure. His solid, compact batting technique allowed him to flourish in both the longer and shorter formats of the game better than almost all players of his time (he played with or against 982 others in international matches!). He also chipped in handily with the ball, taking over 150 ODI wickets – Steve Waugh said Tendulkar could spin the ball more than regular spinners and could have taken 100 wickets had he put his mind to it [ref].

Here are his test and ODI careers summarised by two graphs. The last-30-innings averages are intended to indicate his “form” at the time.

Tendulkar test careerTendulkar ODI career

Below are some links collected about his career highlights and statistics; I may add to these over time.

HowSTAT! player profile for tests, ODIs (including career graphs for tests and ODIs)

CricInfo links:



November 29, 2013

Critical points of polynomials with respect to the roots

Filed under: mathematics — ckrao @ 1:28 pm

Recall that a critical point of a polynomial p:\mathbb{C} \rightarrow \mathbb{C} is a point w for which p'(w) = 0. The Gauss-Lucas theorem states that the critical points of p all lie within the convex hull of its set of zeros and we shall explain why this holds below. The real-number analogue of this is that the stationary points of a polynomial with real roots all lie between the smallest and largest root.

Denote the roots of the polynomial by z_1, z_2, \ldots, z_n (we’ll assume they are distinct) and the critical points by w_1, w_2, \ldots, w_{n-1}. We then may write our polynomial as p(z) = k \prod_{i=1}^n (z - z_i) for some constant k and its derivative by the product rule is

\displaystyle p'(z) = k \sum_{j=1}^n \prod_{i \neq j} (z-z_i) = p(z) \sum_{j=1}^n \frac{1}{z-z_j}.\quad \quad ...(1)

If p(w_i) \neq 0 then as w_i is a critical point of p, p'(w_i) = 0 and so

\displaystyle 0 = \sum_{j=1}^n \frac{1}{w_i-z_j} = \sum_{j=1}^n \frac{\bar{w_i} - \bar{z_j}}{|w_i - z_j|^2},

from which

\displaystyle w_i \sum_{j=1}^n \frac{1}{|w_i - z_j|^2} = \sum_{j=1}^n \frac{z_j}{|w_i - z_j|^2}.\quad\quad...(2)

In other words,

\displaystyle w_i = \sum_{k=1}^n \alpha_k z_k,\quad\quad...(3)

where \alpha_k = \frac{1/|w_i - z_k|^2}{\sum_{j=1}^n 1/|w_i - z_j|^2}.

Since \sum_{k=1}^n \alpha_k = \frac{\sum_{k=1}^n 1/|w_i - z_k|^2}{\sum_{j=1}^n 1/|w_i - z_j|^2} = 1, w_i is a convex combination of the roots z_k of p(z). In other words, w_i is in the convex hull of the set of roots z_k.

In the other case of p(w_i) = 0, w_i is equal to one of the roots and so is also in the convex hull of the set of roots. This completes the proof of the theorem.
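The argument above is easy to test numerically. The following Python sketch (using numpy's legacy polynomial helpers; the roots are an arbitrary choice of mine, not from the post) verifies the convex-combination formula (3) for each critical point of a cubic.

```python
import numpy as np

# Each critical point w of p should be a convex combination of the roots,
# with weights alpha_k proportional to 1/|w - z_k|^2, as in equation (3).
roots = np.array([0.0, 4.0, 1.0 + 3.0j])
p = np.poly(roots)                  # polynomial coefficients from the roots
crit = np.roots(np.polyder(p))      # critical points = zeros of p'

for w in crit:
    alpha = 1.0 / np.abs(w - roots) ** 2
    alpha /= alpha.sum()            # the alpha_k of (3); they sum to 1
    assert np.allclose(w, np.sum(alpha * roots))
print("critical points:", crit)
```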

However as mentioned in [1] an interpretation of (3) by Cesàro in 1885 is that if we fix unit masses at the roots z_i (if there are repeated roots place mass equal to the multiplicity), the critical points will be at precisely those points experiencing zero net force, assuming particles repel with a force proportional to the inverse of the distance between them (as opposed to an inverse square law). To see why this is so, if the force at a point w due to a mass at z is proportional to the vector \frac{1}{|z-w|^2}(z-w) (an inverse law, as this has magnitude 1/|z-w|), then the net force is the sum of these over the masses at all the roots. In terms of any critical point w_i, this is the following:

\begin{aligned} \sum_{k=1}^n \frac{1}{|z_k-w|^2}(z_k-w) &= \sum_{k=1}^n \frac{1}{|z_k-w|^2}z_k- w\sum_{k=1}^n \frac{1}{|z_k-w|^2}\\&= w_i\sum_{k=1}^n \frac{1}{|z_k-w|^2}- w\sum_{k=1}^n \frac{1}{|z_k-w|^2} \quad \text{by (2)}\\&=(w_i-w)\sum_{k=1}^n \frac{1}{|z_k-w|^2}, \end{aligned}

which is zero if and only if w=w_i, i.e. the point having zero net force is a critical point of p.

Bôcher’s theorem generalises this result, replacing polynomials with rational functions (ratios of polynomials) and simply applying negative unit masses at poles.

Another interesting result, mentioned in [2], is that the product of the distances from a root z_1 to the other roots (assuming all distinct roots) is precisely n times the product of the distances from z_1 to the critical points w_j. That is,

\prod_{i=2}^n |z_1 - z_i| = n \prod_{j=1}^{n-1} |z_1 - w_j|.\quad\quad ...(4)

To see this, if p(z) = k \prod_{i=1}^n (z-z_i), then p'(z) has the form kn \prod_{j=1}^{n-1} (z-w_j) (since it has leading term knz^{n-1}). Note that z_1 is not a critical point since we assume the roots are distinct. Hence we may consider the quotient

\begin{aligned} \frac{\prod_{i=2}^n (z_1 - z_i)}{\prod_{j=1}^{n-1} (z_1 - w_j)} &= \frac{\lim_{z \rightarrow z_1} p(z)/(z-z_1)}{p'(z_1)/n}\\&= \lim_{z \rightarrow z_1} \frac{np(z)}{(z-z_1)p'(z)}\\ &= \lim_{z \rightarrow z_1} \frac{np(z)}{(z-z_1)p(z) \sum_{j=1}^n 1/(z-z_j)} \quad \text{from (1)} \\&= \lim_{z \rightarrow z_1} \frac{n}{\sum_{j=1}^n (z-z_1)/(z-z_j)} \\ &= \lim_{z \rightarrow z_1} \frac{n}{1 + \sum_{j=2}^n (z-z_1)/(z-z_j)}\\ &= n.\end{aligned}

Taking magnitudes of both sides leads to the desired result (4). One also can obtain an angular relationship by taking the argument of both sides.
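Identity (4) is also easy to check numerically; the sketch below (with arbitrary distinct roots of my own choosing) compares both sides for a quartic.

```python
import numpy as np

# Numerical check of identity (4): the product of distances from root z1 to
# the other roots equals n times the product of distances from z1 to the
# critical points.
roots = np.array([1.0, 2.0 + 1.0j, -1.0j, 5.0])
n = len(roots)
crit = np.roots(np.polyder(np.poly(roots)))

z1 = roots[0]
lhs = np.prod(np.abs(z1 - roots[1:]))   # distances to the other roots
rhs = n * np.prod(np.abs(z1 - crit))    # n times distances to critical points
assert np.isclose(lhs, rhs)
print(lhs, rhs)
```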

The critical points for the case n = 3 have a beautiful geometric interpretation as described in [3]: from the triangle formed by the roots, the critical points are the foci of the largest ellipse that is inscribed in the triangle (this ellipse also happens to pass through the triangle’s midpoints) – this is now known as Marden’s theorem. A generalisation proved by Marden in 1945 and mentioned in [1] is that the critical points of an n-degree polynomial are the foci of a degree n-1 plane curve tangent to each of the n(n-1)/2 line segments formed by taking pairs of the n roots. The points of tangency divide each line segment in a ratio corresponding to the multiplicities of the roots at the endpoints. Also, if the polygon formed by the n roots is a linear (affine) transform of a regular polygon, it has an inscribed ellipse passing through its midpoints (called the Steiner inellipse), and the foci of that ellipse are critical points.

Below are some other nice results about the critical points of a polynomial, gathered from [4] (also see references therein):

  • The centre of mass of the roots of p coincides with the centre of mass of the critical points of p (a nice exercise to prove).
  • (Anderson) The polynomial root dragging theorem for polynomials with real distinct roots: as roots are dragged to the right by at most \epsilon (so as not to coincide), the critical points also move to the right, each by less than \epsilon.
  • (Boelkins et al., later Frayer) The polynomial root squeezing theorem for polynomials with real distinct roots: if two roots z_i, z_j are squeezed together by the same amount without passing other roots, the critical points move towards (z_i + z_j)/2 or stay fixed.
  • (Peyser) Critical points are not too close to the roots: if a polynomial has only real roots z_1 < z_2 < \ldots < z_n and critical points w_1 < w_2 < \ldots < w_{n-1}, the critical points satisfy
    \displaystyle z_k + \frac{z_{k+1} - z_k}{n - k + 1} \leq w_k \leq z_{k+1} - \frac{z_{k+1} - z_k}{k+1}, \quad k= 1, \ldots, n-1.
    Equality cases are attained by Bernstein polynomials. Replacing 1/(n-k+1) and 1/(k+1) by m_k/(n-k+m_k) and m_{k+1}/(k + m_{k+1}) respectively gives a corresponding result (due to Melman) taking into account multiplicities m_k for root z_k.
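The first of these results is simple to confirm numerically; a quick sketch with an arbitrary choice of roots (mine, not from the post):

```python
import numpy as np

# The centre of mass of the roots of p coincides with the centre of mass of
# its critical points (compare the z^{n-1} and z^{n-2} coefficients of p and p').
roots = np.array([2.0, -3.0 + 1.0j, 0.5j, 4.0, -1.0])
crit = np.roots(np.polyder(np.poly(roots)))
assert np.isclose(roots.mean(), crit.mean())
print(roots.mean(), crit.mean())
```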


[1] Qazi Ibadur Rahman, Gerhard Schmeisser, Analytic Theory of Polynomials, Oxford University Press, 2002.

[2] Maxime Bôcher, Some Propositions Concerning the Geometric Representation of Imaginaries, Annals of Mathematics, Vol. 7, No. 1/5 (1892–1893), pp. 70-72.

[3] Dan Kalman, The Most Marvelous Theorem in Mathematics,  The Journal of Online Mathematics and Its Applications, Vol. 8, March 2008, Article ID 1663.

[4] Neil Biegalle, Investigations in the Geometry of Polynomials, McNair Scholars Journal, Volume 13, Issue 1, Article 3. Available at:
