Chaitanya's Random Pages

January 9, 2012

Extremal points of a triangle

Filed under: mathematics — ckrao @ 11:41 am

In this post we summarise some of the points of a triangle that maximise or minimise various functions on the plane. This is on a similar theme to an earlier post on optimisation problems given a point inside a given angle. Many of the results are proved in [1].

Firstly we introduce some notation and terminology. Let ABC be a given triangle with side lengths a = BC,b = AC, c = AB and let P be a variable point in the plane. Let x, y, z be the distances from P to the sides AB, AC, BC respectively. The feet of the perpendiculars from P to the sides of ABC form a pedal triangle. If AP, BP, CP meet sides BC, CA, AB in points D, E, F, then \bigtriangleup DEF is known as the Cevian triangle of P.

Functions involving distances to the vertices

  • \displaystyle \min |PA| + |PB| + |PC|: P is the Fermat point of ABC. If the largest angle is 120 degrees or more, the Fermat point is at the vertex of this angle, otherwise we can construct it by drawing an equilateral triangle ABD external to ABC, then finding the intersection of CD and the circumcircle of ABD.
    In this figure one can show that PA + PB + PC = CD.
  • \displaystyle \max \min \{|PA|, |PB|, |PC|\}: if ABC is not obtuse, P is the circumcentre of ABC. Otherwise it is the intersection of the longest side of ABC with the perpendicular bisector of its middle-length side.
  • \displaystyle \min \{aPA + bPB + cPC\}: P is the orthocentre H of ABC, with minimum value given by four times the area of ABC.
  • \displaystyle \min PA^2 + PB^2 + PC^2: P is the centroid G of ABC with minimum value given by (a^2 + b^2 + c^2)/3.
  • \displaystyle \min aPA^2 + bPB^2 + cPC^2: P is the incentre I of ABC with minimum value given by abc.
  • \displaystyle \min xPA^2 + yPB^2 + zPC^2: This generalises the previous two expressions: P is the point described by the position vector (xA + yB + zC)/(x+y+z) with minimum value equal to (a^2yz + b^2xz + c^2 xy)/(x+y+z).
  • \displaystyle \min PA.PB.c+ PB.PC.a + PC.PA.b: P is the orthocentre H of ABC with minimum value given by abc.

Functions involving distances to the sides

  • \displaystyle \min x + y + z: P is the vertex of the largest angle of ABC.
  • \displaystyle \max xyz: P is the centroid G of ABC.
  • \displaystyle \max xy + yz + zx: P is the Mittenpunkt of ABC, given by the position vector [a(b+c-a)A + b(c + a - b)B + c(a + b - c)C ]/[2(ab + bc + ca) - (a^2 + b^2 + c^2) ]. This and the previous fact can be proved using Lagrange multipliers using the constraint that ax + by + cz is constant (equal to twice the area of ABC). The Mittenpunkt is the point of intersection of lines joining the triangle’s excentres (formed by exterior angle bisectors) and midpoints.
  • \displaystyle \max \frac{xyz}{PA.PB.PC}: P is the incentre I of ABC (IMO shortlist 2001)

Functions involving the pedal triangle

  • maximum area of pedal triangle: P is the circumcentre O of ABC.
  • minimum perimeter of pedal triangle: P is the orthocentre O of ABC.
  • minimum pedal radius (radius of circumcircle of pedal triangle): P is the incentre of ABC. See [2] for more details.

Functions involving the Cevian triangle

  • maximum area of Cevian triangle: P is the centroid G of ABC with maximum area given by a fourth of the area of ABC.

 

References

[1] T. Andreescu, O. Mushkarov, L. Stoyanov, Geometric Problems on Maxima and Minima, Birkhäuser, 2006.

[2] V. Naik, Optimization methods in planar geometry, available at www.cmi.ac.in/~vipul/olymp_resources/olympiadarticles/geometricoptimization.pdf

December 30, 2011

The distribution of Melbourne’s maximum temperatures

Filed under: climate and weather — ckrao @ 11:26 pm

Here I have generated histograms by month of the maximum temperatures of my home city Melbourne (Australia) for the forty-year period of 1971-2010. The temperatures are collected into temperature bins of size 2 degrees Celsius. The most frequent temperature range is 14-16°C in the cooler months (June-Sep) up to 20-22°C in the warmer months (Dec-Mar). The moderating influence of the Southern Ocean is the primary cause of this relatively small difference. However in the summer months hot winds from the interior of the continent skew the distribution and push the mean maximum closer to 25°C. Curiously in January the maximum has been as likely to be 36-38°C as 30-32°C.

One can see how bunched up the maximum temperatures are in the winter (never too cold) compared with the summer (sometimes too hot!). Since the data is collected over a forty year period, simply divide the y value by 40 to see how many days per month one achieves a particular range. For example, in the month of May between 8 and 9 days per month the maximum temperatures has been between 16 and 18°C.

Finally here is the above data aggregated.

Here we see the most common range is 14-16°C which has happened about 57 days per year (47 of these between May and Sep) and 81% of the time it has been between 12 and 26°C (and 17% of the time above 26°C). The overall range is 7.0 to 46.4°C. On average the maximum temperature has exceeded 40°C 1.5 times per year and been less than 10°C 0.8 times per year.

The data for this post comes from http://www.bom.gov.au/climate/data/.

The results of Fuss and Carlitz for bicentric quadrilaterals

Filed under: mathematics — ckrao @ 4:47 am

Any triangle has both an incircle (tangent to its sides) and circumcircle (through its vertices).

The distance d between the centres O, I of these circles is related to their radii r (for the incircle) and R (for the circumcircle) via Euler’s theorem in geometry:

\displaystyle d^2 = R^2 - 2Rr

or the equivalent formula

\displaystyle \frac{1}{R-d} + \frac{1}{R+d} = \frac{1}{r}.

A convex quadrilateral however need not have an incircle or circumcircle in general. For it to have an incircle (i.e. a tangential quadrilateral), its pairs of opposite sides must have the same sum. For it to have a circumcircle (i.e. a cyclic quadrilateral), its opposite angles must be supplementary. One that satisfies both of these properties is called a bicentric quadrilateral.

Remarkably, if we define d, r and R for a bicentric quadrilateral analogous to the triangular case, we have formulas similar to Euler’s due to Nicolaus Fuss and Leonard Carlitz:

Carlitz: \displaystyle d^2 = R^2 - 2Rr.\mu,

where \displaystyle \mu = \sqrt{\frac{(ab+cd)(ad + bc)}{(a+c)^2(ac+bd)}} and a,b,c,d are the quadrilateral’s side lengths.

Fuss: \displaystyle \frac{1}{(R-d)^2} + \frac{1}{(R+d)^2} = \frac{1}{r^2}

Isn’t it cool how similar these formulas look for the triangular and quadrilateral cases?! Both of these results (and indeed Euler’s theorem) follow from the intersecting chords theorem. Fuss’s theorem can be shown by proving that both sides of the equation are equal to \displaystyle \frac{1}{AI^2} + \frac{1}{CI^2}. Carlitz’s identity can be shown via the preliminary result that \mu = IE/BE, where E is the intersection of line AI with the circumcircle. A beautiful proof of Fuss’s theorem due to Salazar (using little more than a short angle chase and the theorems of Pythagoras and Apollonius) is found in [1] while that of Carlitz can be seen from p154 of [2].

By Poncelet’s Porism, it is true that if we start with two circles of radius r and R with distance d between their centres such that the Fuss equation \frac{1}{(R-d)^2} + \frac{1}{(R+d)^2} = \frac{1}{r^2} is satisfied, then a bicentric quadrilateral with those parameters may be constructed. Simply start at any point on the circumcircle and successively draw tangents to the incircle to generate the other three vertices of the quadrilateral. This was how I drew the above bicentric quadrilateral.

References

[1] A. Bogomolny, Fuss’ Theorem from Interactive Mathematics Miscellany and Puzzles http://www.cut-the-knot.org/Curriculum/Geometry/Fuss.shtml#S, Accessed 30 December 2011.

[2] O. Calin, Euclidean and Non-Euclidean Geometry: a metric approach, available at http://math.emich.edu/~ocalin/Math341/Newpdf/driver13GeomB.pdf

December 15, 2011

Australia in close test match finishes

Filed under: cricket,sport — ckrao @ 10:59 am

Following Australia’s recent narrow loss to New Zealand I had a look at this page and found that Australia features in 15 out of the 17 closest finishes (measured by runs) in a test match!

One can add to these the only two test match ties both involving Australia as well!

Here are the recent closest results (by runs) for Australia together with intermediate scores in their final innings. (By recent, I mean after 1929!)

lost by 1 run vs WI, 92/3 (7/74 chasing 184)
lost by 2 runs vs Eng, 2005 (7/137 chasing 282)
lost by 3 runs vs Eng, 82/3 (9/218 chasing 292)
lost by 5 runs vs SA, 93/4 (8/75 chasing 117)
lost by 7 runs vs NZ, 11/2 (9/199 chasing 241)
lost by 12 runs vs Eng, 98/9 (3/130 chasing 175)
lost by 12 runs vs Ind, 04/5 (7/58 chasing 107)

All were losses, but in all but one case Australia was way behind and did well to come as close as it did!

The first recent case of a close win to Australia was in 92/3 versus Sri Lanka (16 runs). That time it was Sri Lanka’s turn to collapse (2/127 chasing 181) after a first innings lead of 291!

On the plus side for Australia, its narrowest wins (after 1977) while chasing have all come against South Africa in South Africa:

won by 2 wickets vs SA, 96/7 (target 270)
won by 2 wickets vs SA, 05/6 (target 292)
won by 2 wickets vs SA, 11/2 (target 310)

Source: http://stats.espncricinfo.com/ci/content/records/255926.html

December 5, 2011

The MMSE estimator is orthogonal projection

Filed under: mathematics — ckrao @ 9:36 pm

Suppose we are given the value of the random vector Y from which we wish to estimate the value of another random vector X. In other words, given Y we want to find a function f such that f(Y) is a good approximation of X. How might we go about this? If our definition of “good approximation” is the MMSE (minimum mean squared error) E \|f(Y)-X\|^2 and the random vectors X and Y have finite second order moments (i.e. E \|X\|^2 = E \sum_i |X_i|^2 is finite), then the language of Hilbert spaces helps us greatly.

Let the dimensions of X and Y be m and n respectively (they need not be equal). The convenience of the metric \|f(Y) - X\|^2 is that we may minimise it by finding the component functions  f_1(Y), \ldots, f_m(Y) that minimise each term E|f_i(Y) - X_i|^2 of the sum separately, and then our required f is f(Y) = [f_1(Y), \ldots, f_m(Y)]^T.

Imagine a component X_i (now a random variable as opposed to a random vector) living in the space of random variables with finite second order moment (this space is denoted by L^2(\cal{F}) where \cal{F} is a collection of events whose probability can be measured). The space of measurable functions of Y from \mathbb{C}^n to \mathbb{C} forms a subspace of L^2(\cal{F}) which we denote by L^2(\sigma(Y)). (By measurable we mean that we can still compute the probability of events based on values of the function.) If X is in this subspace, it means that X_i can be entirely determined from knowledge of Y, and an error of zero is possible. If X is not in the subspace, we imagine X_i as an arrow from 0 sticking out from the subspace.

With inner product on L^2(\cal{F}) given by \langle U,V \rangle = E UV^* (noting that \langle U,U \rangle = 0 if and only if U = 0 with probability one), we see that E|f_i(Y) - X_i|^2 is minimised when the error e_i = f_i(Y) - X_i is orthogonal to any measurable function of Y. In our Hilbert space we think of f_i(Y) being the projection of X onto the subspace L^2(\sigma(Y)):

\displaystyle \langle g(Y), f_i(Y) - X_i \rangle = 0 for any random variable g(Y) \quad \quad (*)

Note that if this were not true, but instead there existed g(Y) \in L^2(\sigma(Y)) with \langle g(Y),g(Y) \rangle = E|g(Y)|^2 = 1 (i.e. normalised) and \langle f_i(Y) - X_i, g(Y) \rangle = \alpha \neq 0, we may write h(Y) = f_i(Y) - \alpha g(Y) \in L^2(\sigma(Y)) and so

\begin{array}{lcl} E|h(Y) - X_i|^2 &=& E|f_i(Y) - \alpha g(Y) - X_i|^2 \\ &=& E| f_i(Y) - X_i|^2 + |\alpha|^2 E |g(Y)|^2 - \alpha^* \langle f_i(Y) - X_i , g(Y) \rangle - \alpha \langle g(Y), f_i(Y) - X_i \rangle \\&=& E|f_i(Y) - X_i|^2 + |\alpha|^2 - |\alpha|^2 - |\alpha|^2 \\ &<& E| f_i(Y) - X_i|^2,\end{array}

contradicting the minimality of E|f_i(Y) - X_i|^2. Hence the orthogonality condition (*) must hold. Such a projection is unique in the almost-sure sense (i.e. any other random variable is equal to f_i(Y) with probability one).

By (*), f_i(Y) - X_i is also orthogonal to any constant random variable, or in other words, Ec(f_i(Y) - X_i) = 0 for any constant c showing that the error f_i(Y) - X_i has zero mean (worth remembering: zero-mean random variables are orthogonal to constants!). In vector notation, with e := f(Y) - X,

Ee = 0  or Ef(Y) = EX \quad \quad (1)

For each j, f_j(Y) is in L^2(\sigma(Y)) so by (*), \langle f_j(Y), f_i(Y) - X_i \rangle = 0. Switching to vector notation, this gives us

\displaystyle {\rm Cov}(f(Y), f(Y)-X) = Ef(Y)(f(Y) - X)^* = 0 \quad \quad (2),

where {\rm Cov}(U,V) :=: \Sigma_{UV} := E(U-E(U))(V-E(V))^* (we shall use the notation {\rm Cov}(U) = \Sigma_U to mean {\rm Cov}(U,U)).

This is the orthogonality condition in our vector case. By Pythagoras’s theorem, the minimum error is

\sum_i E|f_i(Y) - X_i|^2 = \sum_i \left(E|X_i|^2 - E|f_i(Y)|^2\right) = {\rm tr}({\rm Cov}(X) - {\rm Cov}(f(Y))),

where {\rm tr} represents the sum of the diagonal entries of a square matrix (the trace).

Linear Case

If we take the example of a linear estimator of the form f(Y) = AY + b (A a matrix, b a vector), we may use (1) and (2) to identify A and b that minimises E\|f(Y) - X\|^2. By (1), AEY + Eb = EX, from which b = Eb = EX - AEY and f(Y) = A(Y-EY) + EX.

Note that any component of Y is in itself a linear transform of Y. By (*), it must be orthogonal to any component of the error vector f(Y) - X, giving us

\displaystyle 0 = E(f(Y)-X)Y^* = E[(A(Y-EY) + EX - X)Y^*] = A\Sigma_Y - \Sigma_{XY}.

If {\rm Cov}(Y) is invertible, this gives A = \Sigma_{XY}\Sigma_Y^{-1}, leading to the following expressions:

\displaystyle f(Y) = EX + \Sigma_{XY}\Sigma_Y^{-1}(Y-EY) \quad \quad (3)

\begin{array} {lcl} {\rm Cov}(f(Y) - X) &=& {\rm Cov}(X) - {\rm Cov}(EX + {\rm Cov}(X,Y){\rm Cov}(Y)^{-1}(Y-EY))\\ &=& {\rm Cov}(X) - {\rm Cov}(X,Y){\rm Cov}(Y)^{-1}{\rm Cov}(Y-EY) {\rm Cov}(Y)^{-1}{\rm Cov}(Y,X)\\ &=& \Sigma_X - \Sigma_{XY}\Sigma_Y^{-1} \Sigma_{YX} \quad \quad (4) \end{array}

(here using the fact that {\rm Cov}(AX) = A{\rm Cov}(X)A^*),

E\|f(Y) - X\|^2 = {\rm tr}( {\rm Cov}(f(Y)) - X) = {\rm tr} (\Sigma_X - \Sigma_{XY}\Sigma_Y^{-1} \Sigma_{YX}) \quad \quad (5).

Note that this precise argument is made in least squares problems: if we wish to find a matrix W so that \|x - Wy\|^2 is minimised and Wy is only permitted in a space spanned by the columns of some matrix C, we find the projection of x onto the subspace whose columns are formed by C. This gives the orthogonality condition 0 = C^* (x - Wy) = C^* (x - Cz) (for some z, since Wy is in the column space of C), from which

\displaystyle Wy = Cz = C(C^*C)^{-1}C^*x.

In the special case when C =c is a single column vector, the projection operator has the attractive form (cc^*)/(c^*c) which is simply the outer product cc^* if c has unit length.

Conditional Expectation

The post so far has made no mention of the conditional expectation E[X|Y], which is what in fact the minimum mean squared estimator turns out to be. In brief E[X|Y] is defined as a measurable function of Y satisfying

\displaystyle E[g(Y)E[X|Y]] = E[g(Y)X] \quad \quad (+)

for any measurable function g(Y). This is a measure-theoretic definition that has the advantage of unifying the discrete and continuous cases, and avoids division-by-zero possibilities that can arise when dealing with probability densities. (Actually conditional expectation is first defined with respect to a set of events called a sigma algebra which may be considered as an information source for that random variable. The larger the set, the more information we have about that random variable.)

If expectation is thought of as an averaging process, then conditional expectation is an average with respect to information or uncertainty. Formula (+) says that this average should be equal to the unconditioned average on any measurable set (to see this take g(Y) to be 1 on that set and 0 otherwise). There are details missing here, but the interested reader is encouraged to see the references for more.

In the Hilbert space of random variables with finite second moment, this condition is equivalent to (X-E[X|Y]) being orthogonal to g(Y), so E[X|Y] is our best estimate f(Y) found earlier. Hence the conditional expectation E[X_i|Y] can be viewed as the vector projection of X_i onto L^2(\sigma(Y)), i.e. the best estimate of X_i in the expected least squares sense. Conditional probability may be defined in terms of conditional expectation via P(A|B) = E[1_A|B], where 1_A is the random variable equal to 1 if our event A occurs, and 0 otherwise.

Gaussian Case

In the particular case of X and Y being Gaussian vectors, it turns out that the best linear estimate is also the best overall estimate in the least squares sense. To see this, let f(Y) be the best linear estimate of X given Y. Since uncorrelated and independent are equivalent notions in the Gaussian world, it follows from the independence of (X-f(Y)) and f(Y) that given Y=y the conditional distribution of (X-f(Y)) does not depend on y. It is in fact zero-mean Gaussian with variance given in (4)). Hence the distribution of X (= (X-f(Y)) + f(Y)) given Y=y is Gaussian with mean f(Y) (the linear estimate of X) and the same covariance matrix. As a result the conditional expectation E[X|Y=y] is equal to the mean of that Gaussian distribution, which is f(y), and as this is true for all y, E[X|Y] = f(Y).

For example, if Y = HX + V, where X is zero-mean Gaussian, H is a deterministic n by m matrix and V is a zero-mean n-dimensional Gaussian noise vector uncorrelated with X, we have

\displaystyle \Sigma_{XY} ={\rm Cov}(X,HX +V) = {\rm Cov}(X)H^* + {\rm Cov}(X,V) = \Sigma_X H^*

\displaystyle \Sigma_Y = {\rm Cov}(HX+V) = H\Sigma_X H^*+ \Sigma_V,

and as X and Y are jointly Gaussian, our MMSE estimator is also the best linear estimator:

\displaystyle E[X|Y] = f(Y) =\Sigma_{XY}\Sigma_Y^{-1}Y = \Sigma_X H^*(H\Sigma_X H^* + \Sigma_V)^{-1} Y.

This solution is used in a wide variety of linear estimation applications, ranging from regression analysis in statistics to communication theory (estimating signals passing through a channel H and corrupted by noise V).

References

[1] Williams, Probability with Martingales, Cambridge University Press, 2001.

[2] Hajek, Notes for ECE 534: An Exploration of Random Processes for Engineers, July 2011, available here.

November 27, 2011

The popularity of the Pixar films

Filed under: movies and TV — ckrao @ 5:06 am

After an earlier post on the success of the Harry Potter films, I decided to do the same for the Pixar films. The Toy Story films is arguably the best reviewed trilogy, while many of Pixar’s other films have had extremely positive reception.

Movie Release Date Worldwide Gross (USD millions) Worldwide Gross Rank (Nov 2011) imdb rating (/10) Metascore (/100) rottentomatoes fresh reviews
Toy Story Nov-95          362.0 178 8.2 92 76/76 (100%)
A Bug’s Life Nov-98          363.4 174 7.2 77 75/82 (91%)
Toy Story 2 Nov-99          485.0 94 8 88 146/146 (100%)
Monsters, Inc. Nov-01          525.4 82 8 78 159/167 (95%)
Finding Nemo May-03          867.9 26 8.1 89 196/199 (98%)
The Incredibles Nov-04          631.4 54 8 90 222/229 (97%)
Cars Jun-06          462.0 109 7.4 73 142/193 (74%)
Ratatouille Jun-07          623.7 58 8.1 96 209/217 (96%)
WALL-E Jun-08          521.3 84 8.5 94 222/231 (96%)
Up May-09          731.3 43 8.3 88 266/271 (98%)
Toy Story 3 Jun-10       1,063.2 7 8.6 92 248/251 (99%)
Cars 2 Jun-11          551.8 74 6.5 57 73/192 (38%)

An amazing nine out of the twelve films released to date have had 95+% scores from rottentomatoes!

Here is a graph comparing the worldwide grosses, including those of the US and Canada. Note that the numbers are all in $US and do not adjust for inflation or fluctuating exchange rates.

References: boxofficemojo.com, rottentomatoes.com, imdb.com, metacritic.com

A few cute mathematical series

Filed under: mathematics — ckrao @ 3:11 am

From the geometric series \sum_{k=0}^{\infty} z^k = \frac{1}{1-z} we can arrive at a couple of attractive-looking series:

\displaystyle \sum_{n = 0}^{\infty} \frac{a^n}{(a+1)^{n+1}} = 1 \quad \quad (1)

\displaystyle \sum_{n=0}^{\infty} \left(1 - \frac{1}{a}\right)^n = a \quad \quad (2)

For example, \sum_{n=0}^{\infty} \frac{3^n}{4^{n+1}} = \sum_{n=0}^{\infty} \frac{4^n}{5^{n+1}} = 1 and \sum_{n=0}^{\infty} \left(\frac{7}{8}\right)^n = 8.

Furthermore, from the sums \sum_{k=1}^{\infty}k z^k = \frac{z}{(1-z)^2} and  \sum_{k=1}^{\infty} k^2z^k = \frac{z(1+z)}{(1-z)^3} we obtain (after setting z = a/(a+1))

\displaystyle \sum_{n = 1}^{\infty} \frac{na^{n-1}}{(a+1)^{n+1}} = 1 \quad \quad (3)

\displaystyle \sum_{n = 1}^{\infty} \frac{n^2a^n}{(a+1)^n} = a(a+1)(2a+1) \quad \quad (4)

This last sum reminds me of the identity \sum_{n = 1}^a n^2 = a(a+1)(2a+1)/6 for positive integers a, though a appear in the sums in different places!

So for what values of a are sums (1)-(4) valid (i.e. when do they converge)? For any positive integer k the sum \sum_{n=1}^{\infty} n^k z^n converges when |z| < 1, so this means sums (1), (3) and (4) converge when |a/(a+1)| < 1. In other words, |a| < |a+1|, or equivalently, a is closer to 0 than to -1. Hence the real part of a must be at least -1/2. Similarly, sum (2) converges when |(a-1)/a| < 1, which is equivalent to the real part of a being at least 1/2.

November 15, 2011

Crazy cricket from Cape Town

Filed under: cricket,sport — ckrao @ 10:52 am

The second day of the first cricket test between Australia and South Africa in Cape Town (November 10 2011) was as bizarre a day of cricket as I have heard about. Based on the references below here is a collection of some oddities and tidbits from this match.

  • In Australia’s first innings captain Michael Clarke produced one of his best test innings: 151 out of an Australian total of 284.
  • On day 1, Dale Steyn had figures of 4.1 overs for 22 runs versus Clarke and 9.5 overs, 4 wickets for 9 runs against the remaining players!
  • Boucher also took his 500th catch on the first day.
  • On day 2 Australia resumed from an overnight score of 8/214 and made it to 284, with Clarke the last batsman dismissed. South Africa made it to 1/49 just after lunch before Shane Watson came on and took two wickets in his first over (including Kallis for his first duck in 55 innings over almost four years). Then the true carnage began: both teams combined lost 16 wickets for 44 runs in 115 balls!! South Africa lost their last 7 wickets for 23, then Australia were reduced to 9/21 before the last wicket stand took them to 47 – past the lowest ever test score of 26 (by NZ). I went to bed after South Africa was bowled out for 96, thinking that even if Australia were bowled out for a similar total, they had enough of a first innings lead to win the match. The day ended with South Africa 1/81 comfortably on track for an unlikely victory. Never have I seen such a flurry of wickets with relatively free scoring on either side, on the same day!
  • Twenty-three wickets fell on the second day, the fourth most ever for a single day in test cricket and the most since 1902.
  • It was Australia’s third sub-100 score in 16 months (12 tests).  Only once in their previous 277 tests over 25 years had they done this! (This was back in Mumbai in 2003.)
  • Australia’s first innings was the sixth shortest completed innings in terms of balls (108). [ref]
  • It was only South Africa’s second sub-100 score since 1960.
  • Australia’s collapse from 1/11 to 9/21 was the third time in test cricket that 8 wickets fell for ten or fewer runs.
  • It was the eighth time that the number 11 batsman (Nathan Lyon with 14) top scored in a test innings. [ref] Interestingly the opening batsman (Shane Watson) was also the top wicket taker in South Africa’s first innings – how topsy-turvy!
  • The previous lowest score at the fall of the ninth wicket was 25.
  • Watson and Vernon Philander became the first pair of bowlers from opposite sides to take five-wicket hauls for fewer than 20 runs in the same Test. Watson’s spell was the second fastest 5-wicket haul (28 balls) from first coming on to bowl.
  • It was only the third time since WW1 that both teams had innings scores less than 100 in a test match.
  • It was the first time since 1912 that 22 batsmen were out in single figures in the first 3 innings of a Test. Seventeen batsmen in a row were dismissed for single figures!
  • South Africa became the 12th team to win a Test after being bowled out for under 100 (the third since 1907).
  • It was only the fourth time a completed innings had only the openers scoring in double figures. [ref]
  • It was the first time a batsman from each team was out twice on the same day (Clarke and Rudolph).
  • Australia’s first innings was only the second time a tenth wicket partnership accounted for more than 50% of a test innings. The first time was far more impressive: a partnership of 117 out of an innings total of 209 for England between Peter Willey and Bob Willis in 1980 against the West Indies. [ref]
  • It was the eighth time the fourth innings exceeded the sum of the second and third innings. [ref]
  • It was the ninth test in which one side scored more twice as many runs in the first innings as the opponent and still went on to lose the match. [ref]
  • South Africa won after conceding the ninth biggest first innings lead (188 runs) [ref]
  • On 11/11/11, at 11 minutes to 11, the score was 1/111! [ref]
  • On 11/11/11, at 11:11, South Africa needed 111 to win!

Other References

ESPNcricinfo on Twitter

A quiz on the capers in Cape Town | The Confectionery Stall | Cricket Blogs | ESPN Cricinfo

South Africa v Australia, 1st Test, Cape Town: The best bowling day in more than 100 years | Cricket Features | South Africa v Australia | ESPN Cricinfo

A list of amazing statistics from Day 2 of the Cape Town test: Indian Cricket Fans forum

SA-Aus 1st Test, Day 2: A day of numerous records – Rediff.com Sports

November 11, 2011

Counting rectangular lattice paths

Filed under: mathematics — ckrao @ 9:49 am

This post is inspired by an attempt at the following problem edited from the 1995 AIME.

Starting at (0,0) an object moves in the coordinate plane via a sequence of steps, each of length one. Each step is left, right, up, or down, all four equally likely. Find the probability that the object reaches (2,2) in six or fewer steps.

I wish to show my approach to the problem, which led me to find a formula for the number of ways the object reaches an arbitrary point (p,q) from (0,0) in n steps.

Before reading on, I encourage you to have a go at the above problem to get a feel for it!

The first thing to observe is that the object only can reach (2,2) in an even number of steps. The sum of the coordinates changes parity with each move, so it cannot reach (2,2) in say five steps. It can reach (2,2) in the minimum four steps only if each step moves towards that point, either to the right or upwards. Among four steps one chooses two to go up (and the other two to go right), so the number of ways of going from (0,0) to (2,2) in four steps is \binom{4}{2} = 6. Since in total there are 4^4 = 256 possible paths of four steps (a choice of four paths for each step), the probability that one arrives at (2,2) in exactly 4 steps is 6/256 = 3/128.

Next, we need to find the probability that one arrives at (2,2) for the first time in exactly six steps. To find this we will count the number of ways of arriving at (2,2) in exactly six steps (whether or not it arrived there in four steps) and subtract from that the number of ways in arriving at (2,2) in six steps having arrived there in four steps.

The second of these possibilities is easier to find, so let’s find that first. Above we saw that there are 6 ways of arriving at (2,2) in four steps. After two more steps it needs to return to (2,2), and there are 4 ways of that happening (the sixth step needs to be the reverse of the fifth). Hence in total there are 6 \times 4 = 24 ways for the object to arrive at (2,2) in six steps having been there two steps earlier.

Now we wish to find the number of ways of arriving at (2,2) in six steps without restriction.

We notice that to arrive at (2,2) in six steps, one of the steps needs to be in an “unhelpful” direction – either left or down. There are 6 choices of which of the six steps in which this occurs, and then 2 choices of direction. Then of the remaining five steps, the net position we wish to reach is (3,2) (if the unhelpful direction was left) or (2,3) (if it was down). In each case the number of ways to reach that position is \binom{5}{2} = 10. We conclude there are 6 \times 2 \times 10 = 120 ways of the object arriving at (2,2) in six steps without restriction.

The rest of the problem is easy: since there are 4^6 possible paths of six steps, the probability that one arrives at (2,2) for the first time in six steps is (120-24)/4^6 = 3/128. Combining this with the probability of arriving there in four steps, our final answer becomes 3/128 + 3/128 = 3/64.

After solving this I wondered about the question of how many ways there are of reaching an arbitrary point (p,q) in n steps. Let this number be f_n(p,q). To look for a pattern, I worked with small values of n and concentrating on the first quadrant progressively built up triangular arrays of numbers showing the number of ways of going to each point in n ways. The following recursion is used:

\displaystyle f_{n+1}(p,q) = f_n(p-1,q) + f_n(p+1,q) + f_n(p,q-1) + f_n(p,q+1)

The arrays are shown below for n = 2 to 6. Here the bottom left point corresponds to (0,0) while red dots indicate those lattice points that can not be reached.

Then it was time to stare at the numbers and look for a pattern. One thing I noticed was that the numbers corresponding to p = 0 or q = 0 (along the axes) are perfect squares. Then I noticed that for n = 6, the numbers along the diagonal p=q are multiples of \binom{6}{3} = 20. This enabled me to find the following representation for the numbers in terms of binomial coefficients.

From this the general formula (for any integers p, q) was conjectured to be

\displaystyle f_n(p,q) = \begin{cases} \binom{n}{(n + p + q)/2} \binom{n}{(n + p - q)/2}, & \mbox{if } p + q \equiv n \mod 2\\ 0, & \mbox{if } p + q \not\equiv n \mod 2. \end{cases}

This conjecture can be proven as follows. The above formula suggests that the number of ways to go to (p,q) can somehow be decoupled so that one can consider each of two directions independently. To do this we apply a rotation and replace the directions up, down, left and right with the directions north east, south west, north west, south east respectively:

\displaystyle (1,0) \rightarrow (1,-1) \quad (-1,0) \rightarrow (-1,1)
\displaystyle (0,1) \rightarrow (1,1) \quad (0,-1) \rightarrow (-1,-1)

This maps any point (p,q) to (p+q, q-p). We now consider the equivalent question of how many ways are there for the object to go to (p+q, q-p) from (0,0) using the four allowed steps (\pm 1, \pm 1).

This question is easier to solve since now we may treat each component separately:

  • for the first component we go to p+q in n steps (provided p + q \equiv n \mod 2) by choosing (n+p+q)/2 +1s and (n - (p + q))/2 -1s for each step’s first component. This can be done in \binom{n}{(n + p + q)/2}ways.
  • for the second component we go to q-p in n steps by choosing (n + q - p)/2 +1s and (n - (q-p))/2 -1s for each step’s second component. This can be done in \binom{n}{(n + p -q)/2} ways.

Hence we can go to (p+q,q-p) in a total of \displaystyle \binom{n}{(n + p + q)/2} \binom{n}{(n + p - q)/2} ways. Transforming back, this is equal to the number of ways of going to (p,q) from (0,0) using up, down, left and right moves of unit length provided p + q \equiv n \mod 2 (otherwise (p,q) cannot be reached). This proves the conjecture, an interesting result I had not seen before. The hard part was finding the pattern involving binomial coefficients by considering small cases for n.

It is interesting to see how this generalises the better known problem of reaching (p,q) (p, q non-negative) when using up or right moves only: it is equivalent to the case p + q = n, and using the above formula we find the number of ways to be

\displaystyle \binom{n}{(n + p + q)/2} \binom{n}{(n + p - q)/2} = \binom{n}{(n+n)/2} \binom{n}{(p + q + p - q)/2} = \binom{n}{p},

which we saw earlier (choose p of the n moves to be “right”).

November 6, 2011

The popularity of the Harry Potter films

Filed under: movies and TV — ckrao @ 4:52 am

Never before has a series of as many as eight films had such sustained success both at the box office and while maintaining mostly positive reception among film critics and the general public. During the past decade the Harry Potter films have amassed some US$7.7 billion worldwide at the box office, more than any franchise (second is the James Bond franchise of 24 movies, followed by Star Wars – a list of highest grossing films is here), and with an average close to $1 billion worldwide per movie (second best is the Pirates of the Caribbean franchise).

Movie Release Date Worldwide Gross (USD millions) Worldwide Gross Rank (Nov 2011) imdb rating (/10) Metascore (/100) rottentomatoes  fresh reviews
Philosopher’s/Sorcerer’s Stone Nov-01  974.76 11 7.2 64 148/186 (80%)
Chamber of Secrets Nov-02 878.98 24 7.2 63 170/205 (83%)
Prisoner of Azkaban Jun-04 796.69 33 7.7 82 210/232 (91%)
Goblet of Fire Nov-05 896.91 21 7.5 81 194/222 (87%)
Order of the Phoenix Jul-07 939.89 14 7.3 71 183/236 (78%)
Half Blood Prince Jul-09 934.42 15 7.3 78 212/254 (83%)
Deathly Hallows: Part 1 Nov-10 955.42 13 7.6 65 193/244 (79%)
Deathly Hallows: Part 2 Jul-11 1,327.84 3 8.2 87 255/265 (96%)

That all the movies have a rottentomatoes score at least 78% is most impressive and shows large appraisal among critics.

Here is a graph comparing the worldwide grosses, including those of the three biggest markets: the US and Canada, the UK and Japan. The movie franchise is arguably more popular in the UK and Japan than the US. Note that the numbers are all in $US and do not adjust for inflation or fluctuating exchange rates.

The series sure ended well, in more ways than one!

References: boxofficemojo.com, rottentomatoes.com, imdb.com, metacritic.com

Next Page »

Theme: Rubric. Blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.