Chaitanya's Random Pages

July 12, 2020

Charting Test and ODI cricket performances over consecutive innings

Filed under: cricket,sport — ckrao @ 2:23 am

Adding to my earlier blog post about the highest proportion of test scores above m after n innings, I have created some new interactive charts for best streaks early and mid-career in both bowling and batting in tests and ODIs. The data is mostly from

1. Test batting

2. Test bowling

3. ODI batting

4. ODI bowling

Below are shown a few sample charts. Clicking on the chart will take you to a new page where you can interact further.

Test batting:

Two players had a batting average over 100 after 10 test innings
A player’s stats can be viewed by opposition.

Test bowling:

Jason Holder has had an all-time great low bowling average over 20 consecutive innings; Steve Waugh also features here.
Early players dominate the list of fastest to five 6-wicket hauls in tests.

ODI batting:

Kohli averaged almost 70 per innings over 50 consecutive innings; Warner has also done very well from 2016-2019.
Shahid Afridi maintained this high a batting strike rate over 50 innings between 2004 and 2007.

ODI bowling:

Rashid Khan took 11 more wickets than the next best over 30 consecutive innings.
Over 100 ODI innings, Muralitharan averaged more than 2 better than the next best.
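
The streak statistics charted above can be computed with a sliding window over a player's innings list. Below is a minimal sketch, assuming each innings is stored as a (runs, dismissed) pair; the function name and the sample scores are made up for illustration and are not real player data.

```python
from typing import List, Tuple

def best_batting_streak(innings: List[Tuple[int, bool]], window: int) -> Tuple[float, int]:
    """Return (best average, start index) over all runs of `window` consecutive innings.

    Each innings is (runs, was_dismissed). Batting average = runs / dismissals,
    so windows without a dismissal are skipped (the average is undefined there).
    """
    best_avg, best_start = float("-inf"), -1
    runs = sum(r for r, _ in innings[:window])
    outs = sum(1 for _, dismissed in innings[:window] if dismissed)
    for start in range(len(innings) - window + 1):
        if start > 0:
            # Slide the window: drop the innings that left, add the one that entered.
            old_runs, old_out = innings[start - 1]
            new_runs, new_out = innings[start + window - 1]
            runs += new_runs - old_runs
            outs += int(new_out) - int(old_out)
        if outs > 0 and runs / outs > best_avg:
            best_avg, best_start = runs / outs, start
    return best_avg, best_start

# Hypothetical scores: (runs, dismissed)
sample = [(10, True), (50, False), (100, True), (0, True), (75, True), (30, False)]
print(best_batting_streak(sample, 3))  # (80.0, 0)
```

The same window idea applies to bowling averages (runs conceded per wicket) by swapping the numerator and denominator columns and minimising instead of maximising.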

July 8, 2020

The largest parallelogram in a triangle

Filed under: mathematics — ckrao @ 11:59 pm

In this post we find the largest parallelogram, rhombus, rectangle and square that can be contained in a given triangle. We will see that in the first three cases we can achieve half the area of the triangle but no more, while it is generally less than this for a square.

Inscribing a parallelogram, rhombus, rectangle and square of maximum area in a triangle

1. The largest parallelogram inside a triangle

It can be readily seen that one can obtain a parallelogram having half the area of a triangle by connecting a vertex with the three midpoints of the sides. (This has half the base length of the triangle and half its height.)

Is it possible to obtain a larger parallelogram? As outlined in [1], if two or fewer vertices of the parallelogram are on sides of the triangle, a smaller similar triangle can be created by drawing a line parallel to the triangle’s side through a vertex of the parallelogram that is interior to the triangle. (This is done three times in the figure below.) This reduces the problem to the next case.

A smaller similar triangle containing the parallelogram can be created by lines through its vertices parallel to the sides of the triangle

We are left to consider the case where three or more vertices of the parallelogram are on the sides of the triangle. We can draw a line from a vertex to the opposite side parallel to a pair of sides (in the figure below AH is drawn parallel to DG), thus dissecting the triangle into two. Each of the smaller triangles then has an inscribed parallelogram where two of the vertices are on a side of each triangle. Then by drawing lines parallel to sides if required, we create two sub-problems each having four vertices of the parallelogram on the sides of the triangle.

Dissecting a triangle so that each smaller triangle has a parallelogram with two vertices on a side. The right parallelogram is contained in a larger one HIEJ formed by lines parallel to the sides.

Finally, if all four vertices of the parallelogram are on sides of the triangle, by the pigeonhole principle, two of them are on a side (say BC as shown in the figure below). In this figure, if we let AD/AB = k and the height of ABC from BC be h_a, then by the similarity of triangles ADE and ABC, DE = k.BC and \triangle ADE has height kh_a. Then the area of the parallelogram DEFG is DE \times (1-k)h_a = k(1-k) BC.h_a which is 2k(1-k) times the area of \triangle ABC. This quantity has maximum value 1/2 when k=1/2 so we conclude that the parallelogram does not exceed half the triangle’s area.
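
The maximum follows from the identity 2k(1-k) = 1/2 - 2(k - 1/2)^2. As a quick numerical sanity check, the sketch below scans a grid of k values; the function name and the step size of 0.001 are arbitrary choices of mine.

```python
def parallelogram_fraction(k: float) -> float:
    """Area of parallelogram DEFG as a fraction of triangle ABC when AD/AB = k."""
    return 2 * k * (1 - k)

# Scan k = 0, 0.001, ..., 1 and pick the value giving the largest fraction.
ks = [i / 1000 for i in range(1001)]
best_k = max(ks, key=parallelogram_fraction)
print(best_k, parallelogram_fraction(best_k))  # 0.5 0.5
```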

A parallelogram with all four vertices on sides of a triangle must have two of them on one side.

2. The largest rhombus in a triangle

Constraining sides of the parallelogram to be equal (forming a rhombus), we claim that the largest rhombus that can be inscribed in a triangle is also half its area. This can be formed with two of the vertices on the second longest side of the triangle. Suppose a \geq b \geq c are the sides of the triangle with b=AC the second longest side length. Then let the segment MN joining the midpoints of AB and BC form one side of the rhombus of length b/2. It remains to be shown that there exist parallel segments of this same length from M, N to AC. The longest possible such segment has length NC = a/2 and the shortest has length h_b/2, half the length of the altitude of \triangle ABC from B. We wish to show that h_b/2 \leq b/2 \leq a/2. This follows from

\displaystyle h_b = c \sin A \leq c \leq b \leq a.

Inscribing a rhombus of maximal area in a triangle

The area of this rhombus is clearly half the area of the triangle as it has half the length of its base and half the height.

3. The largest rectangle in a triangle

Here if BC is the longest side of the triangle we form the rectangle from midpoints D, E of AB and AC respectively, dropping perpendiculars onto BC forming rectangle DEFG:

The largest rectangle inscribed in a triangle, where D and E are midpoints

The area of this rectangle is half the area of the triangle as it has half the length of its base and half the height.

Interestingly the reflections of the vertices of the triangle in the sides of the rectangle coincide, showing a paper folding interpretation of this result [2].

Reflecting A, B and C in DE, DG and EF respectively yields the same image, yielding a paper folding interpretation

Since rhombuses and rectangles are special cases of parallelograms and we found that inscribed parallelograms in a triangle occupy no more than half its area, the rhombus and rectangle constructions here are optimal.

4. The largest square in a triangle

Here we shall see that the best we can do may not be half the area of the triangle. As before, if two or fewer vertices of the square are on the sides of the triangle, it is possible to scale up the square (or scale down the triangle) so that three of the square’s vertices are on the sides. We claim that the largest square must have two of its vertices on a side of the triangle. Suppose this is not the case and we have the figure below.

Squares with a vertex on each side of \triangle ABC pivot about the point P

Consider squares LMNO inscribed in \triangle ABC so that one vertex L is on AB, M is on BC and O is on CA. We claim that the largest such square is either DEFG (two vertices on BC) or HIJK (two vertices on AC). Suppose on the contrary that neither of these squares is the largest. Then we make use of the fact that all 90-45-45 triangles LMO inscribed in \triangle ABC have a common pivot point P. This is the point at the intersection of the circumcircles of triangles OAL, LBM and MCO. To show these circles intersect at a single point, we can prove that if the circumcircles of triangles OAL and LBM intersect at P then the points O,P,M,C are cyclic by the following equality:

\displaystyle \begin{aligned} \angle MPO &= 360^{\circ} - \angle LPM - \angle OPL\\ &= (180^{\circ} - \angle LPM) + (180^{\circ} - \angle OPL)\\ &= \angle B + \angle A \\ &= 180^{\circ} - \angle C,\end{aligned}

where the second last equality makes use of quadrilaterals BMPL, ALPO being cyclic.

Additionally we have

\displaystyle \begin{aligned}\angle BPC &= \angle BPM + \angle MPC\\ &= \angle BLM + \angle MOC\\ &= (\angle LAM + \angle AML) + (\angle MAO + \angle OMA)\\&=(\angle LAM + \angle MAO ) + (\angle AML + \angle OMA )\\&= \angle BAC + \angle OML\\&=\angle A + 45^{\circ}, \end{aligned}

with similar relations for \angle APB and \angle CPA. Hence P is the unique point satisfying

\displaystyle \begin{aligned}\angle APB &= \angle ACB + 90^{\circ}\\\angle BPC &= \angle BAC + 45^{\circ}\\\angle CPA &= \angle CBA +45^{\circ}.\end{aligned}

(Each equation defines a circular arc, and these arcs intersect at a single point. Note that P may be outside triangle ABC.) This point is the centre of spiral similarity of 90-45-45 triangles LMO with L, M, O respectively on the sides AB, BC, CA of the triangle. Consider the locus of the points of the square as L, M, O vary on straight line segments pivoting about P. It follows that the fourth point of the square N also traces a line segment, between the points F and J so as to be contained within the triangle.

As the side length of the square is proportional to the distance of a vertex to its pivot point, the largest square will be where NP is maximised. We have seen that the point N varies along a line segment, so NP will be maximised at one of the extreme points – either when N=F or N=J. We therefore conclude that the largest square inside a triangle will have two points on a side.

If the triangle is acute-angled, by calculating double the area of the triangle in two ways, the side length x of a square on the side of length a with altitude h_a is derived as

\displaystyle \begin{aligned}ah_a &= 2x^2 + (a-x)x + x(h_a-x)\\&=2x^2 + ax - x^2 + xh_a - x^2\\&= ax + xh_a\\ \Rightarrow \quad x &=  \frac{ah_a}{a + h_a}.\end{aligned}

If the triangle is obtuse-angled, the square erected on a side may not touch both of the other two sides. In the figure below the side length of square BDEF is the same as if A were moved to G, where \triangle GBC is right-angled. In this case the square’s side length is BC.BG/(BC + BG).

The square erected on side AC when \angle ABC is obtuse

The largest square erected on a side may be constructed using the following beautiful construction [2]: simply erect a square CBDE external to the side BC and find the intersection points F = AD \cap BC, G = AE \cap BC.

These points define the base of the square to be inscribed since by similar triangles

\displaystyle \frac{IF}{BD} = \frac{AF}{AD} = \frac{FG}{DE} = h_a/(a + h_a)

so that

\displaystyle IF = FG = \frac{ah_a}{a + h_a}.

One can use this interactive demo to view the largest square in any given triangle. One needs to find the largest of the three possibilities of the largest square erected on each side. In the acute-angled-triangle case, the largest square is on the side that minimises the sum of that side length and its corresponding perpendicular height – as their product is fixed as twice the triangle’s area, this will occur when the side and height have minimal difference. For a right-angled triangle with legs a, b and hypotenuse c, we wish to compare the quantities (a+b) and (c+h), the two possible sums of the base and height of the triangle. We always have a + b - c = 2(s-c) = 2r < h because the diameter of the incircle of the triangle is shorter than the altitude from the hypotenuse (i.e. the incircle is inside the triangle). We conclude that the largest square in a right-angled triangle is constructed on its two legs rather than its hypotenuse.
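
The comparison of the three candidate squares can be sketched in code. This is a minimal sketch of my own, assuming the triangle is given by its three side lengths: it uses x = a h_a/(a + h_a) when both base angles are at most 90 degrees, and in the obtuse case replaces h_a by the length BG described above. The function names are hypothetical.

```python
import math

def square_on_side(base: float, left: float, right: float) -> float:
    """Side of the largest square with two vertices on `base`.

    `left` and `right` are the sides joining the left and right ends of the
    base to the apex.
    """
    s = (base + left + right) / 2
    area = math.sqrt(s * (s - base) * (s - left) * (s - right))  # Heron's formula
    h = 2 * area / base
    # Position of the foot of the altitude, measured from the left end of the base.
    p = (base**2 + left**2 - right**2) / (2 * base)
    if p < 0:
        # Obtuse angle at the left end: use the height of the opposite side above that end.
        h = h * base / (base - p)
    elif p > base:
        # Obtuse angle at the right end.
        h = h * base / p
    return base * h / (base + h)

def largest_square(a: float, b: float, c: float) -> float:
    """Largest square over the three choices of base."""
    return max(square_on_side(a, b, c), square_on_side(b, c, a), square_on_side(c, a, b))

print(largest_square(3, 4, 5))  # about 1.714 = 12/7, on the legs
print(square_on_side(5, 3, 4))  # about 1.622, on the hypotenuse
```

For the 3-4-5 triangle the square on the legs has side 12/7 while the square on the hypotenuse has side 12/7.4, in line with the conclusion for right-angled triangles.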


[1] I. Niven, Maxima and Minima without Calculus, The Mathematical Association of America, 1981.

[2] M. Gardner, Some Surprising Theorems About Rectangles in Triangles, Math Horizons, Vol. 5, No. 1 (September 1997), pp. 18-22.

[3] Jaime Rangel-Mondragon “Largest Square inside a Triangle” Wolfram Demonstrations Project Published: March 7 2011

April 16, 2020

Highest proportion of test scores above m after n innings

Filed under: cricket,sport — ckrao @ 6:02 am

I created an interactive workbook with Tableau to determine the test batsmen who have had the highest proportion of innings scoring at least m runs after having played n test innings. Below are some screenshots for particular choices of m runs. The data is available from [1]. As expected Bradman comes up on top in many scenarios but it is interesting to see other names that appear up there.


(Click on the above image to go to the Tableau page if you wish to change the parameters. You can also select the “Innings by innings” tab to look up a player’s list of innings.)
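
The underlying calculation is simple. Here is a minimal sketch, assuming a player's scores are available as a chronological list; the function name and the sample scores are invented for illustration and do not come from the workbook.

```python
from typing import Sequence

def proportion_at_least(scores: Sequence[int], m: int, n: int) -> float:
    """Proportion of a player's first n innings with a score of at least m."""
    if len(scores) < n:
        raise ValueError("player has not yet played n innings")
    return sum(1 for s in scores[:n] if s >= m) / n

# Hypothetical scores, not a real player's record.
sample_scores = [0, 12, 55, 101, 7, 64]
print(proportion_at_least(sample_scores, 50, 6))  # 0.5
print(proportion_at_least(sample_scores, 1, 4))   # 0.75
```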

Below are some examples for different m (full-career stats among players who have played at least 20 innings).

m=1: A total of 22 players have had an entire career of 20+ innings getting off the mark each time, with RA Duff (Australia, 1902-1905) the only one to play 40 innings (note that JW Burke played 44 innings without a duck, but made 0 not out in one innings).

Angelo Mathews (SL) has managed just 2 ducks in his 154 innings to date.

1 or more

m=10: Hobbs, Hutton, Kanhai and Sobers stand out here, having played over 100 innings and reaching double figures at least 80% of the time (Hobbs over 86%). Labuschagne, Hetmyer, Handscomb and Head are recent players to feature highly here.
10 or more

m=25: Bradman starts to distance himself from the rest. Hammond, Smith, Sobers and de Villiers also impress here.
25 or more

m=50: Smith has matched Barrington’s career figures of 50+ starts. Sutcliffe had 33 50+ scores in his first 64 innings, the same as Bradman.
50 or more

m=75: Barrington’s numbers are amazing here and Katich is ahead of Kohli, Tendulkar and Lara.
75 or more

m=100: Smith and Kohli are currently higher than Sangakkara, the highest among recent retirees.
100 or more

m=125: Only Bradman (6) had more 140+ scores than Labuschagne after playing 23 test innings, equal with Graeme Smith (who had 4 150+ scores in his first 17 innings!).
125 or more

m=150: Bradman is so far ahead of the rest here. Lara and Sangakkara are just one behind Tendulkar with the most 150+ scores despite almost 100 fewer innings.
150 or more

m=175: Again Lara and Sangakkara have the same number of 175+ scores (15).
175 or more

m=200: Kohli is similar to Hammond’s career figures at this stage, with 6 of his 7 double centuries coming within 33 innings between July 2016 and December 2017.
200 or more

Please leave any other interesting observations in the comments.



December 7, 2019

Coordinates of special points of the 3-4-5 triangle

Filed under: mathematics — ckrao @ 3:40 am

One thing I observed is that the 3-4-5 triangle is rather attractive in solving problems using coordinates. If the vertices are placed at (0,3), (0,0) and (4,0) the following are the coordinates of points and equations of some lines of interest.

Line AC: x/4 + y/3 = 1

Incentre: (1, 1)

Centroid: (4/3, 1)

Circumcentre: (2, 1.5)

Orthocentre: (0, 0)

Nine-point centre: (1, 3/4) (midpoint of the midpoints of AB and BC)

Angle bisectors: y = x, y=-2x +3, y=4/3-x/3

Ex-centres (intersection of internal and external bisectors): (3,- 3), (6, 6), (-2, 2)


Lines joining the excentres (in red above): y=-x, y=x/2 +3, y = 3(x-4)

Altitude to the hypotenuse: y = 4x/3

Euler line: y=3x/4

Foot of altitude to the hypotenuse: (36/25, 48/25) (where x/4 + y/3 = 1 intersects y=4x/3)

Symmedian point (midpoint of the altitude to the hypotenuse [1]): (18/25, 24/25)

Contact points of incircle and triangle: (1,0), (0,1), (8/5, 9/5)

Gergonne point (intersection of Cevians that pass through the contact points of the incircle and triangle = the intersection of y=3-3x and y=1-x/4): (8/11, 9/11)

Nagel point (intersection of Cevians that pass through the contact points of the ex-circles and triangle = the intersection of y=3-x and y=2-x/2): (2,1)
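
Several of the points above can be double-checked with exact arithmetic. The sketch below uses the standard barycentric weights of each centre (these weights are well-known results, not taken from this post), with A = (0,3), B = (0,0), C = (4,0) and side lengths a = BC = 4, b = CA = 5, c = AB = 3. The helper name `weighted` is my own.

```python
from fractions import Fraction as F

A, B, C = (F(0), F(3)), (F(0), F(0)), (F(4), F(0))
a, b, c = 4, 5, 3  # side lengths opposite A, B, C

def weighted(wa, wb, wc):
    """Point with barycentric weights (wa, wb, wc) relative to A, B, C."""
    total = wa + wb + wc
    return tuple((wa * A[i] + wb * B[i] + wc * C[i]) / total for i in range(2))

s = F(a + b + c, 2)  # semi-perimeter

centroid = weighted(1, 1, 1)
incentre = weighted(a, b, c)
excentres = [weighted(-a, b, c), weighted(a, -b, c), weighted(a, b, -c)]
circumcentre = weighted(a**2 * (b**2 + c**2 - a**2),
                        b**2 * (c**2 + a**2 - b**2),
                        c**2 * (a**2 + b**2 - c**2))
symmedian = weighted(a**2, b**2, c**2)
gergonne = weighted(1 / (s - a), 1 / (s - b), 1 / (s - c))
nagel = weighted(s - a, s - b, s - c)

for name, point in [("centroid", centroid), ("incentre", incentre),
                    ("circumcentre", circumcentre), ("symmedian", symmedian),
                    ("Gergonne", gergonne), ("Nagel", nagel)]:
    print(name, tuple(str(x) for x in point))
```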


[1] Weisstein, Eric W. “Symmedian Point.” From MathWorld–A Wolfram Web Resource.

January 26, 2019

49+ °C temperatures in Australia

Filed under: climate and weather — ckrao @ 12:32 pm

Below is a list of recorded instances of maximum temperatures of 49 degrees Celsius or more in Australia, based on [1] and [2] from Australia’s Bureau of Meteorology. Out of the 52 occasions, 26 have occurred in this decade including 8 (so far) during this summer alone! I believe all the stations have been recording temperatures for at least 20 years except Port Augusta and Keith West (which both started in 2001). Edited: 20 Dec 2019

Temperature (°C) Date Station Name State
50.7 2-Jan-60 Oodnadatta Airport SA
50.5 19-Feb-98 Mardie WA
50.3 3-Jan-60 Oodnadatta Airport SA
49.9 19-Dec-19 Nullarbor SA
49.8 19-Dec-19 Eucla WA
49.8 21-Feb-98 Emu Creek Station WA
49.8 13-Jan-79 Forrest Aero WA
49.8 3-Jan-79 Mundrabilla Station WA
49.7 10-Jan-39 Menindee Post Office NSW
49.6 12-Jan-13 Moomba Airport SA
49.5 19-Dec-19 Forrest WA
49.5 24-Jan-19 Port Augusta Aero SA
49.5 24-Dec-72 Birdsville Police Station QLD
49.4 21-Dec-11 Roebourne WA
49.4 16-Feb-98 Emu Creek Station WA
49.4 7-Jan-71 Madura Station WA
49.4 2-Jan-60 Marree Comparison SA
49.4 2-Jan-60 Whyalla (Norrie) SA
49.3 27-Dec-18 Marble Bar WA
49.3 2-Jan-14 Moomba Airport SA
49.3 9-Jan-39 Kyancutta SA
49.2 20-Dec-19 Keith West SA
49.2 24-Jan-19 Kyancutta SA
49.2 21-Feb-15 Roebourne Aero WA
49.2 10-Jan-14 Emu Creek Station WA
49.2 22-Dec-11 Onslow Airport WA
49.2 1-Jan-10 Onslow WA
49.2 11-Jan-08 Onslow WA
49.2 9-Feb-77 Mardie WA
49.2 1-Jan-60 Oodnadatta Airport SA
49.2 3-Jan-22 Marble Bar Comparison WA
49.2 11-Jan-05 Marble Bar Comparison WA
49.1 24-Jan-19 Tarcoola Aero SA
49.1 23-Jan-19 Red Rocks Point WA
49.1 13-Jan-19 Marble Bar WA
49.1 27-Dec-18 Onslow Airport WA
49.1 3-Jan-14 Walgett Airport AWS NSW
49.1 2-Jan-10 Emu Creek Station WA
49.1 18-Feb-98 Roebourne WA
49.1 23-Dec-72 Moomba SA
49 15-Jan-19 Tarcoola Aero SA
49 23-Jan-15 Marble Bar WA
49 13-Jan-13 Birdsville Airport QLD
49 9-Jan-13 Leonora WA
49 21-Dec-11 Roebourne Aero WA
49 1-Jan-10 Mardie WA
49 10-Jan-09 Emu Creek Station WA
49 11-Jan-08 Port Hedland Airport WA
49 11-Jan-08 Roebourne WA
49 12-Jan-88 Marla Police Station SA
49 6-Dec-81 Birdsville Police Station QLD
49 22-Dec-72 Marree SA




December 30, 2018

A collection of energy formulas

Filed under: science — ckrao @ 10:58 am

Energy is a quantity that is conserved as a consequence of the time translation invariance of the laws of physics. Below are some formulas calculating energy of different forms.

Kinetic energy is that associated with motion and is defined as K = \frac{1}{2} mv^2 = \frac{p^2}{2m} for a particle with mass m, velocity v and momentum p. If the mass is a fluid in motion (e.g. wind) with density \rho and volume A v t through cross-sectional area A, then K = \frac{1}{2} At\rho v^3.
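
As a small worked example of the fluid formula, the sketch below evaluates K = \frac{1}{2} At\rho v^3 and checks it against \frac{1}{2}mv^2 with m = \rho A v t. The default air density of 1.225 kg/m^3 is an assumed sea-level value and the function name is hypothetical.

```python
def wind_kinetic_energy(area_m2: float, speed_ms: float, duration_s: float,
                        density: float = 1.225) -> float:
    """Kinetic energy (J) of air of the given density (kg/m^3) passing through
    a cross-section of area_m2 at speed_ms for duration_s seconds."""
    return 0.5 * area_m2 * duration_s * density * speed_ms**3

# One square metre of 10 m/s wind for one second, with density rounded to 1.2 kg/m^3.
energy = wind_kinetic_energy(1.0, 10.0, 1.0, density=1.2)
mass = 1.2 * 1.0 * 10.0 * 1.0  # rho * A * v * t
print(energy, 0.5 * mass * 10.0**2)  # both about 600 J
```

The cubic dependence on v is why doubling the wind speed gives eight times the energy through the same area.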

Work is the result of a force F applied over a displacement \mathbf{s} and is given by the line integral

\displaystyle W = \int_C \mathbf{F} . \mathrm{d}\mathbf{s} = \int_{t_1}^{t_2} \mathbf{F} . \frac{\mathrm{d}\mathbf{s}}{\mathrm{d}t} \ \mathrm{d}t= \int_{t_1}^{t_2} \mathbf{F}.\mathbf{v}\ \mathrm{d}t .

This has the simple form W = Fs \cos \theta when force is constant and displacement is linear where \theta is the angle between the force and displacement vectors.

Using Newton’s 2nd law and the relation \frac{\mathrm{d}}{\mathrm{d}t} (\mathbf{v}^2) = 2\mathbf{v}.\frac{\mathrm{d}\mathbf{v}}{\mathrm{d}t} this can be written as

\displaystyle W = m\int_{t_1}^{t_2} \frac{d\mathbf{v}}{dt} . \mathbf{v} \mathrm{d}t = \frac{1}{2}m\int_{t_1}^{t_2} \frac{\mathrm{d}}{\mathrm{d}t} (\mathbf{v}^2) \mathrm{d}t = \frac{1}{2}m\int_{v_1^2}^{v_2^2} \mathrm{d}(\mathbf{v}^2) = \frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2.

This is the work-energy theorem, which says that the work done by the net force equals the change in kinetic energy. It can also be written as W = \int_{v_1}^{v_2} m \mathbf{v}.\mathrm{d}\mathbf{v} = \int_{p_1}^{p_2} \mathbf{v}.d\mathbf{p} where \mathbf{p} = m\mathbf{v} is momentum.

The above has the rotational analogue K = \frac{1}{2} I \omega^2 where I is moment of inertia and \omega is angular velocity and the equation for work becomes

\displaystyle W = \int_{t_1}^{t_2} \mathbf{T} . \mathbf{\omega}\ \mathrm{d}t,

where \mathbf{T} is a torque vector.

This has the simple form W = Fr \theta = \tau \theta in the special case of a constant magnitude tangential force, where \theta is the angle turned through and \tau = Fr is the torque resulting from force F applied at distance r from the centre of rotation.

Note that the time derivative of work is defined as power, so work can also be expressed as the time integral of power:

W = \int P(t)\  \mathrm{d}t = \int_{t_1}^{t_2} \mathbf{F}.\mathbf{v}\ \mathrm{d}t.

If the work done by a force field \mathbf{F} depends only on a particle’s end points and not on its trajectory (i.e. conservative forces), one may define a potential function of position, known as potential energy U satisfying \mathbf{F} = -\nabla U. By convention positive work is a reduction in potential, hence the minus sign. It then follows that in such force fields the sum of kinetic and potential energy is conserved.

Some types of potential energy:

  • due to a gravitational field: \mathbf{F} = -(GMm/r^2) \hat{r}, U = -GMm/r, where M, m are the masses of two bodies, r the distance between their centres of mass and G is Newton’s gravitation constant.
  • due to earth’s gravity at the surface: \mathbf{F} = -mg, U = mgh where g \approx 9.8 ms^{-2} and h is the object’s height above ground (small compared with the size of the earth).
  • due to a spring obeying Hooke’s law: \mathbf{F} = -kx, U = kx^2/2 where k is the spring constant and x the displacement from an equilibrium position.
  • due to an electrostatic field: \mathbf{F} = q\mathbf{E} = (k_e qQ/r^2) \hat{r}, U = k_e qQ/r where k_e is Coulomb’s constant 1/(4\pi \epsilon_0) and q, Q are charges. This can be written as U = qV where V is a potential function measured in volts.
  • for a system of point charges: \displaystyle U =k_e \sum_{1 \leq i < j \leq n} \frac{q_i q_j}{r_{ij}}.
  • for a system of conductors: U = \frac{1}{2} \sum_{i=1}^n Q_i V_i where the charge on conductor i is Q_i and its potential is V_i.
  • for a charged dielectric: the above may be generalised to the volume integral U = \frac{1}{2} \int_V \rho \Phi \ \mathrm{d}v where \rho is charge density and \Phi is the potential corresponding to the electric field.
  • for an electric dipole in an electric field: U = -\mathbf{p}.\mathbf{E} where \mathbf{p} is directed from the negative to positive charge and has magnitude equal to the product of the positive charge and charge separation distance.
  • for a current loop in a magnetic field: U = -\mathbf{\mu}.\mathbf{B} where \mathbf{\mu} is directed normal to the loop and has magnitude equal to the product of the current through the loop and its area.

In electric circuits the voltage drop across an inductance L is v = L di/dt and the current through a capacitance C is i = C dv/dt. These inserted into the relationship E = \int i(t)v(t) \ \mathrm{d}t lead to the formulas E = \frac{1}{2}L(\Delta I)^2 and E = \frac{1}{2}C(\Delta V)^2 for the energy stored in an inductor and a capacitor respectively.

Also in electromagnetism the energy flux (flow per unit area per unit time) is the Poynting vector \mathbf{S} = \mathbf{E} \times \mathbf{H}, the cross product of the electric and magnetising field vectors. The electromagnetic energy in a volume V is given by ([1])

\displaystyle \frac{1}{2}\int_V \mathbf{B}.\mathbf{H} + \mathbf{E}.\mathbf{D}  \ \mathrm{d}v,

where \mathbf{D} is the electric displacement field and \mathbf{B} is the magnetic field. This is more commonly written as \displaystyle \frac{1}{2} \int_V \epsilon_0 |E|^2 + |B|^2/\mu_0 \ \mathrm{d}v when the relationships \mathbf{D} = \epsilon_0\mathbf{E}, \mathbf{B} = \mu_0 \mathbf{H} hold.

In special relativity energy is the time component of the momentum 4-vector. That is, energy and momentum are mixed in a similar way to how space and time are mixed at high velocities. Computing the norm of the momentum four-vector gives the energy-momentum relation

E^2 = (pc)^2 + (m_0c^2)^2.

This leads to E = pc for massless particles (such as photons) and more generally E = \gamma m_0 c^2 , the mass-energy equivalence relation (here \gamma = (1 - (v/c)^2)^{-1/2} and m_0 is rest mass).
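
The two relations are consistent once the relativistic momentum p = \gamma m_0 v is used (a standard result that is not stated above). The sketch below checks this numerically for an arbitrary test case of a 1 kg mass at 0.6c; the names are my own.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def lorentz_factor(v: float) -> float:
    """gamma = (1 - (v/c)^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (v / C_LIGHT) ** 2)

m0 = 1.0            # rest mass in kg (arbitrary test value)
v = 0.6 * C_LIGHT   # speed chosen so that gamma is close to 1.25

gamma = lorentz_factor(v)
energy = gamma * m0 * C_LIGHT**2   # E = gamma m_0 c^2
momentum = gamma * m0 * v          # relativistic momentum, assumed p = gamma m_0 v

lhs = energy**2
rhs = (momentum * C_LIGHT) ** 2 + (m0 * C_LIGHT**2) ** 2
print(gamma, math.isclose(lhs, rhs))
```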

In quantum mechanics the energy of a photon is also written as E = hf = hc/\lambda (Planck-Einstein relation) where h is Planck’s constant and f, \lambda are frequency and wavelength respectively. Energies of quantum systems are based on the eigenstates of the Hamiltonian operator, an example of which is \displaystyle {\hat {H}}=-{\frac {\hbar ^{2}}{2m}}\nabla^2+V(x).

Force is also equal to pressure times area, so another formula for work (e.g. done by an expanding gas) is the volume integral W = \int p \mathrm{d}V. In thermodynamics heat is energy transferred through the random motion of particles. The fundamental equation of thermodynamics quantifies the internal energy U which disregards kinetic or potential energy of a system as a whole (only considering microscopic kinetic and potential energy):

\displaystyle U = \int  \left(T \text{d}S - p \mathrm{d}V + \sum_i \mu_i \mathrm{d}N_i \right)

where T is temperature, S is entropy, N_i is the number of particles and \mu_i the chemical potential of species i. Similar formulas exist for other thermodynamic potentials such as Gibbs energy, enthalpy and Helmholtz energy.

The mean translational kinetic energy of a bulk substance is related to its temperature by \bar{E} = \frac{3}{2}k_B T where k_B is Boltzmann’s constant.

In thermal transfer the change in internal energy is given by \Delta U = m C \Delta T where m is mass and C is the specific heat capacity, which may apply to constant volume or constant pressure.

The power emitted by a body with surface area A at temperature T is given by the Stefan-Boltzmann law P = A \epsilon \sigma T^4 where \epsilon is the emissivity (=1 for black body radiation) and \sigma is the Stefan–Boltzmann constant. This equation may be used to determine the energy emitted by stars using their emission spectrum.
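
As an illustration for stars, the sketch below evaluates P = A \epsilon \sigma T^4 for a sphere the size of the Sun. The radius (6.957e8 m) and effective temperature (5772 K) are nominal solar values that I am assuming as inputs; they are not from this post.

```python
import math

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def radiated_power(radius_m: float, temperature_k: float, emissivity: float = 1.0) -> float:
    """Total power (W) radiated by a sphere: P = A * emissivity * sigma * T^4."""
    area = 4.0 * math.pi * radius_m**2
    return area * emissivity * SIGMA * temperature_k**4

solar_power = radiated_power(6.957e8, 5772.0)
print(f"{solar_power:.3e} W")  # roughly 3.8e26 W
```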

The latent heat (thermal energy change during a phase transition) of mass m of a substance with specific latent heat constant L is given by Q = mL.

Finally, the energy of a single wavelength of a mechanical wave is \displaystyle \frac{1}{2} m\omega^2 A^2 where m is the mass of a wavelength, A is the amplitude and \omega is the angular frequency [2]. This can be applied to finding the energy density of ocean waves for example [3].


[1] Poynting Vector. Retrieved 22:24, December 28, 2018, from

[2] Power of a Wave. Retrieved 21:23, December 30, 2018, from

[3] Wikipedia contributors, “Wave power,” Wikipedia, The Free Encyclopedia, (accessed December 30, 2018).

[4] Wikipedia contributors, “Work (physics),” Wikipedia, The Free Encyclopedia, (accessed December 30, 2018).

[5] Wikipedia contributors, “Potential energy,” Wikipedia, The Free Encyclopedia, (accessed December 30, 2018).

[6] Wikipedia contributors, “Electric potential energy,” Wikipedia, The Free Encyclopedia, (accessed December 30, 2018).

[7] Wikipedia contributors, “Thermodynamic equations,” Wikipedia, The Free Encyclopedia, (accessed December 30, 2018).

[8] H. Ohanian, Physics, 2nd edition, Norton & Company, 1989.

June 12, 2018

Rafael Nadal in best of five set matches on clay

Filed under: sport — ckrao @ 1:28 pm

Following Rafael Nadal‘s 11th French Open win, it’s worth looking at just how amazing his best-of-five set record on clay is. He now has a 111-2 win-loss record, with his only two losses against Söderling and Djokovic.

Win-loss breakdown by tournament (Masters tournament finals changed to best of 3 sets from 2007):

  • French Open: 86-2
  • Davis Cup: 18-0
  • Barcelona Open: 2-0
  • Monte Carlo Masters: 2-0
  • Rome Masters: 2-0
  • Stuttgart: 1-0

Win-loss breakdown by number of sets (overall he has won 331 and lost 36 completed sets so even winning a set against him is a big deal!):

  • 5 sets: 4-0 (Coria, Federer, Isner, Djokovic – 5th set scores 7-6 (6), 7-6 (5), 6-4, 9-7 respectively)
  • 4 sets: 22-1 (loss to Söderling)
  • 3 sets: 83-1

Most common opponents (2 or more matches):

  • Djokovic: 7-1 (lost 7 sets)
  • Federer: 7-0 (lost 7 sets)
  • Ferrer: 4-0 (lost 1 set)
  • Almagro: 4-0
  • Hewitt: 4-0 (lost 1 set)
  • Söderling: 3-1 (lost 3 sets)
  • Thiem: 3-0
  • del Potro: 3-0 (lost 1 set)
  • Gasquet: 3-0
  • Seppi: 2-0 (lost 1 set)
  • Murray: 2-0
  • Roddick: 2-0 (lost 1 set)
  • Coria: 2-0 (lost 3 sets)
  • Ljubicic: 2-0
  • Monaco: 2-0
  • Bolelli: 2-0
  • Wawrinka: 2-0 (same score of 6-2 6-3 6-1 both times)
  • Bellucci: 2-0

Breakdown by set score (almost the same likelihood of winning a set 6-2, 6-3 or 6-4):

  • 6-0: 26
  • 6-1: 61
  • 6-2: 68
  • 6-3: 66
  • 6-4: 67
  • 7-5: 18
  • 7-6: 24
  • 9-7: 1
  • 6-7: 9
  • 5-7: 6
  • 4-6: 8
  • 3-6: 6
  • 2-6: 3 (Federer in 2006 Rome, Söderling in 2009 French Open, Djokovic in 2012 French Open)
  • 1-6: 3 (Federer in 2006 French Open, del Potro in 2011 Davis Cup, Djokovic in 2015 French Open)
  • 0-6: 1 (Coria in 2005 Monte Carlo Masters)

(only one incomplete set 2-0 after which Pablo Carreno Busta retired)


(1) Tennis Abstract: Rafael Nadal ATP Match Results, Splits, and Analysis

(2) Ultimate Tennis Statistics – Rafael Nadal

April 1, 2018

A collection of binary grid counting problems

Filed under: mathematics — ckrao @ 3:52 am

The number of ways of colouring an m by n grid one of two colours without restriction is 2^{mn}. The following examples show what happens when varying restrictions are placed on the colouring.

Example 1: The number of ways of colouring an m by n grid black or white so that there is an even number of black squares in each row and column is

\displaystyle 2^{(m-1)(n-1)}.

Proof: The first m-1 rows and n-1 columns may be coloured arbitrarily. This then uniquely determines how the bottom row and rightmost column are coloured (restoring even parity). The bottom right square will be black if and only if the number of black squares in the remainder of the grid is odd, hence this is also uniquely determined by the first m-1 rows and n-1 columns. Details are also given here.
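
The count can be confirmed by brute force for small grids. This is a minimal sketch that simply enumerates all 2^{mn} colourings (1 = black, 0 = white), so it is only practical for small m and n; the function name is hypothetical.

```python
from itertools import product

def count_even_parity(m: int, n: int) -> int:
    """Count m-by-n 0/1 grids with an even number of 1s in every row and column."""
    count = 0
    for cells in product((0, 1), repeat=m * n):
        grid = [cells[r * n:(r + 1) * n] for r in range(m)]
        rows_ok = all(sum(row) % 2 == 0 for row in grid)
        cols_ok = all(sum(col) % 2 == 0 for col in zip(*grid))
        if rows_ok and cols_ok:
            count += 1
    return count

for m, n in [(2, 2), (2, 3), (3, 3), (3, 4)]:
    print(m, n, count_even_parity(m, n), 2 ** ((m - 1) * (n - 1)))
```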

Example 2: The number of ways of colouring an m by n grid black or white so that every 2 by 2 square has an odd number (1 or 3) of black squares is

\displaystyle 2^{m+n-1}.

Proof: First colour the first row and first column arbitrarily (there are m+n-1 such squares each with 2 possibilities). This uniquely determines how the rest of the grid must be coloured by considering the colouring of adjacent squares above and to the left.

By the same argument, the above is the same as the number of ways of colouring an m by n grid black or white so that every 2 by 2 square has an even number (0, 2 or 4) of black squares.
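
Both the odd and even versions can be checked by exhaustive enumeration on small grids. In this sketch the `parity` argument (1 for odd, 0 for even) and the function name are my own.

```python
from itertools import product

def count_2x2_parity(m: int, n: int, parity: int) -> int:
    """Count m-by-n 0/1 grids in which every 2-by-2 block has the given parity of 1s."""
    count = 0
    for cells in product((0, 1), repeat=m * n):
        grid = [cells[r * n:(r + 1) * n] for r in range(m)]
        ok = all(
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) % 2 == parity
            for r in range(m - 1)
            for c in range(n - 1)
        )
        if ok:
            count += 1
    return count

for m, n in [(2, 2), (2, 4), (3, 3)]:
    print(m, n, count_2x2_parity(m, n, 1), count_2x2_parity(m, n, 0), 2 ** (m + n - 1))
```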

Example 3: The number of ways of colouring an m by n grid black or white so that every 2 by 2 square has two of each type is

\displaystyle 2^m + 2^n - 2.

Proof: If there are two adjacent squares of the same colour with one above the other, the remaining squares of the corresponding two rows are uniquely determined as being the same alternating between black and white. The remainder of the grid is then determined by the colouring of the first column (2^m - 2 possibilities where we omit the two cases of alternating colours down the first column). Such a grid cannot have two horizontally adjacent squares of the same colour. By a similar argument a colouring that has two adjacent squares of the same colour with one to the left of the other can be done in 2^n-2 ways. Finally we have the two additional configurations where there are no adjacent squares of the same colour, which is uniquely determined by the colour of the top left square. Hence in total we have (2^m-2) + (2^n-2) + 2 = 2^m + 2^n - 2 possible colourings.

This question for m = n = 8 was in the 2017 Australian Mathematics Competition and the general solution is also discussed here.
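
A brute-force check on small grids agrees with the formula, which gives 2^8 + 2^8 - 2 = 510 for the 8 by 8 case (too large to enumerate directly this way). The function name below is hypothetical.

```python
from itertools import product

def count_balanced_2x2(m: int, n: int) -> int:
    """Count m-by-n 0/1 grids in which every 2-by-2 block contains exactly two 1s."""
    count = 0
    for cells in product((0, 1), repeat=m * n):
        grid = [cells[r * n:(r + 1) * n] for r in range(m)]
        ok = all(
            grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1] == 2
            for r in range(m - 1)
            for c in range(n - 1)
        )
        if ok:
            count += 1
    return count

for m, n in [(2, 2), (2, 3), (3, 3), (3, 4)]:
    print(m, n, count_balanced_2x2(m, n), 2**m + 2**n - 2)
```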

Example 4: The number of ways of colouring an m by n grid black or white so that each row and each column contain at least one black square is (OEIS A183109)

\displaystyle \sum_{j=0}^m (-1)^j \binom{m}{j} (2^{m-j}-1)^n.

Proof: First we count the number of colourings where a fixed subset of j columns is entirely white and each row has at least one black square. The remaining m-j columns and n rows can be coloured in (2^{m-j}-1)^n ways. To count colourings where each column has at least one black square we apply the principle of inclusion-exclusion and arrive at the above result.

Another inclusion-exclusion example shown here counts the number of 3 by 3 black/white grids in which there is no 2 by 2 black square. The answer is 417 with more terms for n by n grids in OEIS A139810.

Example 5: Suppose we wish to count the number of colourings of an m by n grid in which row i has k_i black squares and column j has l_j black squares (i = 1, 2, \ldots m, j = 1, 2, \ldots, n). Following [1], the number of ways this can be done is the coefficient of x_1^{k_1}x_2^{k_2} \ldots x_m^{k_m}y_1^{l_1}y_2^{l_2}\ldots y_n^{l_n} in the polynomial

\displaystyle \prod_{i=1}^m \prod_{j=1}^n (1 + x_i y_j).

To see this, note that expanding the product gives a sum of products of terms of the form x_i y_j, where including such a term corresponds to colouring the square in row i and column j black. Hence the coefficient of x_1^{k_1}x_2^{k_2} \ldots x_m^{k_m}y_1^{l_1}y_2^{l_2}\ldots y_n^{l_n} is the number of 0-1 solutions of the system \sum_{j=1}^n a_{ij} = k_i, \sum_{i=1}^m a_{ij} = l_j (i = 1, 2, \ldots m, j = 1, 2, \ldots, n), where a_{ij} is equal to 1 if and only if the square in row i and column j is coloured black and 0 otherwise.

Let us evaluate this in the special case of 2 black squares in every row and every column for an n by n square grid (i.e. k_i = l_j = 2 and m = n). Viewing the expansion as a polynomial in y_1, \ldots, y_n, picking two squares in each column to colour black means that the coefficient of y_1^2y_2^2\ldots y_n^2 is a sum of products of n terms of the form x_ix_j with i < j. Then using [] notation to denote the coefficient of an expression, we have

\begin{aligned} \left[x_1^2x_2^2 \ldots x_n^2y_1^2y_2^2\ldots y_n^2 \right]  \prod_{i=1}^n \prod_{j=1}^n (1 + x_i y_j) &= \left[x_1^2x_2^2 \ldots x_n^2 \right] \left( \sum_{i=1}^n\sum_{j=i+1}^n x_i x_j \right)^n\\&= \left[x_1^2x_2^2 \ldots x_n^2 \right] 2^{-n} \left( \left( \sum_{i=1}^n x_i\right)^2 - \sum_{i=1}^n x_i^2 \right)^n\\ &= \left[x_1^2x_2^2 \ldots x_n^2 \right] 2^{-n} \sum_{k=0}^n (-1)^k \binom{n}{k} \left( \sum_{i=1}^n x_i^2 \right)^k\left(\sum_{i=1}^n x_i\right)^{2(n-k)}\\ &=  2^{-n}  \sum_{k=0}^n (-1)^k \binom{n}{k} \frac{n!}{(n-k)!} \frac{(2n-2k)!}{2^{n-k}}\\ &= 4^{-n}  \sum_{k=0}^n (-1)^k \binom{n}{k}^2 k! \, 2^k  (2n-2k)!. \end{aligned}

Here the second last line follows from considering the number of ways that products of k terms of the form x_i^2 arise in the product \left( \sum_{i=1}^n x_i^2 \right)^k (which is \frac{n!}{(n-k)!}) and the number of ways that products of the remaining (n-k) terms of the form x_i^2 can be formed in the product \left(\sum_{i=1}^n x_i\right)^{2(n-k)} (which is \frac{(2n-2k)!}{2^{n-k}}). The last line uses \frac{n!}{(n-k)!} = \binom{n}{k} k!.

For example, when n=4 this is equivalent to finding the coefficient of a^2b^2c^2d^2 in (ab + ac + ad + bc + bd + cd)^4. Products are either paired up in complementary ways such as (ab)(cd)(ab)(cd) (3 \times \binom{4}{2} = 18 ways) or consist of four distinct products forming a cycle, namely (ab)(bc)(cd)(ad), (ab)(bd)(cd)(ac) or (ac)(bc)(bd)(ad) (3 \times 4! = 72 ways). This gives us a total of 90 (this question appeared in the 1992 Australian Mathematics Competition). More terms of the sequence are found in OEIS A001499 and the 6 by 4 case (colouring two shaded squares in each row and three in each column in 1860 ways) appeared in the 2007 AIME I (see Solution 7 here).
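The sum in the second last line of the derivation can be evaluated exactly with integer arithmetic and compared with a direct count for small n. The sketch below is an illustration only (the function names are made up); the brute force chooses two columns per row and checks the column totals.

```python
from itertools import combinations, product
from math import comb, factorial

def two_per_line_formula(n):
    """Evaluate 2^-n * sum_k (-1)^k C(n,k) n!/(n-k)! (2n-2k)!/2^(n-k)."""
    total = 0
    for k in range(n + 1):
        ordered_squares = factorial(n) // factorial(n - k)
        remaining = factorial(2 * n - 2 * k) // 2 ** (n - k)
        total += (-1) ** k * comb(n, k) * ordered_squares * remaining
    assert total % 2 ** n == 0  # the sum must be divisible by 2^n
    return total // 2 ** n

def two_per_line_brute_force(n):
    """Count n-by-n grids with exactly two black squares in every row
    and every column by choosing two columns for each row."""
    row_choices = list(combinations(range(n), 2))
    count = 0
    for rows in product(row_choices, repeat=n):
        column_totals = [0] * n
        for a, b in rows:
            column_totals[a] += 1
            column_totals[b] += 1
        if all(total == 2 for total in column_totals):
            count += 1
    return count

for n in range(1, 6):
    print(n, two_per_line_formula(n), two_per_line_brute_force(n))
```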

Example 6: If we wish to count the number of grid configurations in which reflections or rotations are considered equivalent, we may make use of Burnside’s lemma that the number of orbits of a group action is the average number of points fixed by an element of the group. For example, to find the number of configurations of 4 by 4 grids up to rotational symmetry, we consider the cyclic group C_4. The identity fixes all 2^{16} configurations. For each of the two quarter turns there are 2^4 configurations fixed (a quadrant determines the colouring of the remainder of the grid) while for the half turn there are 2^8 configurations as one half determines the colouring of the other half. This gives us an answer of

\displaystyle \frac{2^{16} + 2 \cdot 2^4 + 2^8}{4} = 16456,

which is part of OEIS A047937. If reflections are also considered equivalent we need to consider the dihedral group D_4 and we arrive at the sequence in OEIS A054247.
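The Burnside count can be reproduced by counting, for each rotation, the cycles of cells it induces (a fixed colouring is constant on each cycle). The following sketch does this for n by n grids and cross-checks against an explicit orbit enumeration for small n; the helper names are hypothetical.

```python
from itertools import product

def rotate(cell, n):
    """Rotate a cell (row, column) of an n-by-n grid by a quarter turn."""
    r, c = cell
    return (c, n - 1 - r)

def cycles_of_rotation(n, quarter_turns):
    """Number of cycles of cells under a rotation by the given number
    of quarter turns."""
    seen = set()
    cycles = 0
    for start in product(range(n), repeat=2):
        if start in seen:
            continue
        cycles += 1
        cell = start
        while cell not in seen:
            seen.add(cell)
            for _ in range(quarter_turns):
                cell = rotate(cell, n)
    return cycles

def burnside_rotations(n):
    """Average number of colourings fixed by the four rotations."""
    return sum(2 ** cycles_of_rotation(n, q) for q in range(4)) // 4

def brute_force_rotations(n):
    """Count orbits directly via a canonical representative per orbit."""
    cells = list(product(range(n), repeat=2))
    representatives = set()
    for bits in product((0, 1), repeat=n * n):
        current = dict(zip(cells, bits))
        variants = []
        for _ in range(4):
            variants.append(tuple(current[cell] for cell in cells))
            current = {rotate(cell, n): colour
                       for cell, colour in current.items()}
        representatives.add(min(variants))
    return len(representatives)

for n in range(1, 5):
    print(n, burnside_rotations(n))
```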

If we want to count the number of 3 by 3 grids with four black squares up to equivalence, this is equivalent to the number of full noughts and crosses configurations. A nice video by James Grime explaining this is here (the answer is 23).

Example 7: The number of ways of dividing an m by n grid into 2 by 1 dominoes (i.e. the number of domino tilings of the grid) has the amazing form

\displaystyle \prod_{j=1}^{\lceil m/2 \rceil} \prod_{k=1}^{\lceil n/2 \rceil} \left(4 \cos^2 \frac{\pi j}{m+1} + 4 \cos^2 \frac{\pi k}{n+1}\right).

For example, the 36 ways of tiling a 4 by 4 grid are given here. A proof of the above formula using the Pfaffian of the adjacency matrix of the corresponding grid graph is given in chapter 10 of [2].
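The double product of cosine terms can be checked numerically against a direct count of domino tilings on small grids. The sketch below is illustrative only: it evaluates the product in floating point and rounds, and uses a simple backtracking counter (the function names are made up).

```python
from math import cos, pi

def tilings_product(m, n):
    """Evaluate the double product of cosine terms and round to the
    nearest integer (floating point, so only trustworthy for small grids)."""
    value = 1.0
    for j in range(1, (m + 1) // 2 + 1):
        for k in range(1, (n + 1) // 2 + 1):
            value *= (4 * cos(pi * j / (m + 1)) ** 2
                      + 4 * cos(pi * k / (n + 1)) ** 2)
    return round(value)

def tilings_brute_force(m, n):
    """Count domino tilings of an m-by-n grid by backtracking: always
    cover the first empty cell (row-major) with a horizontal or a
    vertical domino."""
    filled = [[False] * n for _ in range(m)]

    def place(position):
        while position < m * n and filled[position // n][position % n]:
            position += 1
        if position == m * n:
            return 1
        r, c = divmod(position, n)
        count = 0
        filled[r][c] = True
        if c + 1 < n and not filled[r][c + 1]:
            filled[r][c + 1] = True
            count += place(position + 1)
            filled[r][c + 1] = False
        if r + 1 < m and not filled[r + 1][c]:
            filled[r + 1][c] = True
            count += place(position + 1)
            filled[r + 1][c] = False
        filled[r][c] = False
        return count

    return place(0)

for m, n in [(2, 2), (2, 3), (3, 3), (3, 4), (4, 4)]:
    print(m, n, tilings_product(m, n), tilings_brute_force(m, n))
```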



[1] L. Comtet, Advanced Combinatorics: The Art of Finite and Infinite Expansions (pp 235-6), D. Reidel Publishing Company, 1974.

[2] M. Aigner, A Course in Enumeration, Springer, 2007.

December 26, 2017

The evolution of ODI team totals

Filed under: cricket,sport — ckrao @ 11:46 am

Over the years one day international cricket scores have been on the rise and this post intends to look into this in some detail. We shall restrict ourselves to first innings scores where the team batting first lasted exactly 50 overs. Hence games of more than 50 overs per side, or where the team batting first was bowled out early, are omitted. There are 2349 (out of 3945) such matches according to this query on Cricinfo Statsguru and on average 7 wickets fall over the 50 overs. The scatter plot below shows the scores over time. The red curve shows that mean scores were steady around 225 during the 1980s and have been on the rise since 1990 so that now the mean score is approaching 300.


Note that the first data point in 1974 corresponds to a game that was reduced to 50 overs per side after originally being intended as a 55 over game.

If we slice the data into eras marked by calendar years of roughly equal numbers of games, the mean score had a slight slow-down in the rate of increase from 2008-2012, then accelerated again in the past five years.

Era | Number of matches | Mean score batting first
1974-1994 | 427 | 229
1995-1999 | 383 | 247
2000-2003 | 368 | 257
2004-2007 | 380 | 267
2008-2012 | 393 | 272
2013-2017 | 398 | 288
1974-2017 | 2349 | 260
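As a quick consistency check on the table, the overall row can be approximately recovered as a match-weighted average of the era means. This is only a sketch using the rounded values above, so the result agrees with the overall mean only up to rounding.

```python
# (era, number of matches, mean score batting first), copied from the table
eras = [
    ("1974-1994", 427, 229),
    ("1995-1999", 383, 247),
    ("2000-2003", 368, 257),
    ("2004-2007", 380, 267),
    ("2008-2012", 393, 272),
    ("2013-2017", 398, 288),
]

total_matches = sum(matches for _, matches, _ in eras)
# Weight each era's (rounded) mean by its number of matches.
weighted_mean = sum(matches * mean for _, matches, mean in eras) / total_matches

print(total_matches, round(weighted_mean, 1))
```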

The histograms below show how rarely teams score less than 200 runs in recent times when using the full quota of 50 overs. In fact these days a team is more likely to score over 400 than below 200 if using the full quota of 50 overs!


Comparing the distribution of first innings winning versus losing scores we find that the mean scores are 275 vs 236 respectively with sample sizes 1392 vs 901 (34 games had no result and 22 were tied). Restricting to the past five years, the median score batting first for the full 50 overs in winning matches is exactly 300.


Interestingly, if we break down the runs scatter plot by team, the trends are not the same across the board. In particular England and South Africa have had more dramatic increases in recent times than the other teams, especially compared with India, Pakistan, Sri Lanka and West Indies.


Restricting to the last five years (2013-2017), here are the mean first innings scores for each team based on the match result (assuming they bat the full 50 overs).

Team | Result | Mean score | # matches
Afghanistan | lost | 249 | 6
Afghanistan | won | 260 | 12
Australia | lost | 295 | 13
Australia | n/r | 253 | 3
Australia | won | 310 | 31
Bangladesh | lost | 263 | 16
Bangladesh | won | 275 | 15
Canada | lost | 230 | 3
England | lost | 282 | 13
England | won | 329 | 22
Hong Kong | won | 283 | 4
India | lost | 282 | 12
India | won | 310 | 27
Ireland | lost | 244 | 6
Ireland | tied | 268 | 1
Ireland | won | 289 | 3
Kenya | lost | 260 | 1
Netherlands | lost | 265 | 1
New Zealand | lost | 277 | 12
New Zealand | tied | 314 | 1
New Zealand | won | 308 | 27
P.N.G. | lost | 218 | 2
P.N.G. | won | 232 | 1
Pakistan | lost | 266 | 9
Pakistan | n/r | 296 | 1
Pakistan | tied | 229 | 1
Pakistan | won | 290 | 20
Scotland | lost | 238 | 6
Scotland | won | 284 | 8
South Africa | lost | 258 | 7
South Africa | n/r | 301 | 1
South Africa | won | 321 | 36
Sri Lanka | lost | 249 | 15
Sri Lanka | n/r | 268 | 2
Sri Lanka | tied | 286 | 1
Sri Lanka | won | 305 | 22
U.A.E. | lost | 279 | 3
U.A.E. | won | 267 | 3
West Indies | lost | 265 | 10
West Indies | won | 298 | 10
Zimbabwe | lost | 247 | 9
Zimbabwe | tied | 257 | 1
Zimbabwe | won | 276 | 1

The England and South Africa numbers stand out the most here in winning causes. Also Australia has a particularly high average score of 295 in losing causes. Going by the table, South Africa has the largest difference (63 runs) between average winning and losing scores, followed by Sri Lanka (56 runs).

Edit: The following shows the mean scores in the 100 matches prior to and after key rule changes (still focusing on first innings 50-over scores). Note that in two of the three cases, the average scores reduced.

  1. Restriction to 2 fielders outside the 30-yard circle in the first 15 overs (’92 World Cup)
    03 Jan 88 to 20 Jan 92: 231
    12 Feb 92 to 16 Feb 94: 222
  2. Introduction of Powerplay overs
    13 Mar 04 to 30 Jun 05: 267
    07 Jul 05 to 08 Sep 06: 267
  3. Removal of powerplay, fifth fielder allowed outside the circle in the last ten overs
    17 Aug 14 to 24 Jun 15: 301
    10 Jul 15 to 19 Jan 17: 289

September 8, 2017

Notes on von Neumann’s algebra formulation of Quantum Mechanics

Filed under: mathematics,science — ckrao @ 9:49 pm

The Hilbert space formulation of (non-relativistic) quantum mechanics is one of the great achievements of mathematical physics. Typically in undergraduate physics courses it is introduced as a set of postulates (e.g. the Dirac-von Neumann axioms) and is hard to motivate without some knowledge of functional analysis or at least probability theory. Some of that motivation and the connection with probability theory is summarised in the notes here – in fact it can be said that quantum mechanics is essentially non-commutative probability theory [2]. Furthermore having an algebraic point of view seems to provide a unified picture of classical and quantum mechanics.

The important difference between classical and quantum mechanics is that in the latter, the order in which measurements are taken sometimes matters. This is because obtaining the value of one measurement can disturb the system of interest to the extent that a consistently precise value of the other cannot be found. A famous example is position and momentum of a quantum particle – the Heisenberg uncertainty relation states that the product of their uncertainties (standard deviations) in measurement is bounded below by a positive constant (\hbar/2).

If measurements are treated as real-valued functions on the state space of the system, we will not be able to capture the fact that the measurements do not commute. Since linear operators (e.g. matrices) do not commute in general, we use algebras of operators instead. We make use of the spectral theory leading from a special class of algebras with norm and adjoint known as von Neumann algebras which in turn are a special case of C*-algebras. The spectrum of an operator A is the set of numbers \lambda for which (A-\lambda I) does not have an inverse. Self-adjoint operators have a real spectrum, which will represent the set of values that an observable (a physical variable that can be measured) can take. Hence we have this correspondence between self-adjoint operators and observables.

By the Gelfand-Naimark theorem C*-algebras can be represented as bounded operators on a Hilbert space {\cal H}. See Section II.6.4 of [3] for proof details. If the C*-algebra is commutative the representation is as continuous functions on a locally compact Hausdorff space that vanish at infinity. Furthermore we assume the C*-algebra and corresponding Hilbert space are separable, meaning the space contains a countable dense subset (analogous to how the subset of rationals is dense in the set of real numbers). This ensures that the Stone-von Neumann theorem holds, which was used to show that the Heisenberg and Schrödinger pictures of quantum physics are equivalent [see pp7-8 here].

The link between C*-algebras and Hilbert spaces is made via the notion of a state which is a positive linear functional on the algebra of norm 1. A state evaluated on a self-adjoint operator outputs a real number that will represent the expected value of the observable corresponding to that operator. Note that it is impossible to have two different states that have the same expected values across all observables. A state \omega is called pure if it is an extreme point on the boundary of the (convex) space of states. In other words, we cannot write a pure state \omega as \omega = \lambda \omega_1 + (1-\lambda) \omega_2 where \omega_1 \neq \omega_2 are states and 0 < \lambda < 1. A state that is not pure is called mixed.

Now referring to a Hilbert space {\cal H}, for any mapping \Phi of bounded operators B({\cal H}) to expectation values such that

  1. \Phi(I) = 1 (it makes sense that the identity should have expectation value 1),
  2. self-adjoint operators are mapped to real numbers with positive operators (those with positive spectrum) mapped to positive numbers and
  3. \Phi is continuous with respect to the strong convergence in B({\cal H}) – i.e. if \lVert A_n \psi - A \psi \rVert \rightarrow 0 for all \psi \in {\cal H}, then \Phi (A_n) \rightarrow \Phi (A),

then there is a unique self-adjoint non-negative trace-one operator \rho (known as a density matrix) such that \Phi (A) = \text{trace}(\rho A) for all A \in B({\cal H}) (see [1] Proposition 19.9). (The trace of an operator A is defined as \sum_k \langle e_k, Ae_k \rangle where \{e_k \} is an orthonormal basis in the separable Hilbert space – in the finite dimensional case it is the sum of the operator’s eigenvalues.) Hence states are represented by positive self-adjoint operators with trace 1. Such operators are compact and so have a countable orthonormal basis of eigenvectors.

When \rho corresponds to a projection operator onto a one-dimensional subspace it has the form \rho = vv^* where v \in {\cal H} and \lVert v \rVert = 1. In this case we can show \text{trace}(\rho A) = \langle v, Av \rangle = v^*Av, which recovers the alternative view that unit vectors of {\cal H} correspond to states (known as vector states) so that the expected value of an observable corresponding to the operator A is \langle v, Av \rangle. This is done by choosing the orthonormal basis \{e_k \} where e_1 = v and computing

\begin{aligned} \text{trace}(\rho A) &= \sum_k \langle e_k, vv^*Ae_k \rangle\\ &= \sum_k e_k^* v v^* Ae_k\\ &= e_1^* e_1 e_1^*Ae_1 \quad \text{ (as }e_k^*v = \langle e_k, v \rangle = 0\text{ for } k > 1\text{)}\\ &= e_1^*Ae_1\\ &= \langle v, Av \rangle. \end{aligned}
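The identity \text{trace}(\rho A) = \langle v, Av \rangle can be illustrated numerically in two dimensions. The sketch below uses plain Python complex numbers; the particular vector v and self-adjoint matrix A are arbitrary choices made for illustration and do not come from the references.

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size))
             for j in range(size)] for i in range(size)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

# An arbitrary unit vector in C^2: |0.6|^2 + |0.8i|^2 = 1.
v = [0.6 + 0j, 0.8j]

# rho = v v^* is the rank-one projection onto the span of v.
rho = [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

# An arbitrary self-adjoint matrix (equal to its conjugate transpose).
A = [[2.0 + 0j, 1 - 1j],
     [1 + 1j, -1.0 + 0j]]

# Expected value computed via the trace formula ...
expectation_trace = trace(matmul(rho, A))

# ... and via the inner product <v, Av> = v^* A v.
Av = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
expectation_vector = sum(v[i].conjugate() * Av[i] for i in range(2))

print(expectation_trace, expectation_vector)
```

Both quantities agree and are real up to rounding, as expected for a self-adjoint A.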

Density operators \rho (positive with trace one) can be written as a convex combination of rank one projection operators: \rho = \sum \lambda_k v_k v_k^*. From this it can be shown that those density operators which cannot be written as a convex combination of other states (called pure states) are precisely those of the form \rho = vv^*. Hence vector states and pure states are equivalent notions. Mixed states can be interpreted as a probabilistic mixture (convex combination) of pure states.

Let us now look at the similarity with probability theory. A measure space is a triple (X, {\cal S}, \mu) where X is a set, {\cal S} is a collection of measurable subsets of X called a \sigma-algebra and \mu:{\cal S} \rightarrow \mathbb{R} \cup \infty is a \sigma-additive measure. If g is a non-negative integrable function with \int g \ d\mu = 1 it is called a density function and then we can define a probability measure p_g:{\cal S} \rightarrow [0,1] by

\displaystyle p_g(S) = \int_S  g\ d\mu \in [0,1], S \in {\cal S}.

A random variable f:X\rightarrow \mathbb{R} maps elements of a set to real numbers in such a way that f^{-1}(B) \in {\cal S} for any Borel subset B of \mathbb{R}. This enables us to compute its expectation with respect to the density function g as

\displaystyle \int_X f \ dp_g = \int_X fg\ d\mu.

This is like the quantum formula \text{Tr}(\rho A) with our density operator \rho playing the role of g and operator A playing the role of random variable f. Hence a probability density function is the commutative probability analogue of a quantum state (density operator).

While Borel sets are the events from which we define simple functions and then random variables, in the non-commutative case we define operators in terms of projections (equivalently closed subspaces) of a Hilbert space {\cal H}. A projection operator P is self-adjoint, satisfies P^2 = P and has the discrete spectrum \{0,1\}. Hence projections are analogous to 0-1 indicator random variables, the answers to yes/no events. For any unit vector v \in {\cal H} the expected value

\displaystyle \langle v, Pv \rangle = \langle v, P^2v \rangle = \langle Pv, Pv \rangle = \lVert Pv \rVert^2

is interpreted as the probability that the observable corresponding to P will have value 1 when measured in the state corresponding to v. In particular this probability will be 1 if and only if v is in the range of P (the subspace onto which P projects). We define meet and join operations \wedge, \vee on these closed subspaces to create a Hilbert lattice ({\cal P}({\cal H}), \vee, \wedge, \perp):

  • A \wedge B = A \cap B
  • A \vee B = \text{closure of } A + B
  • A^{\perp} = \{u: \langle u,v \rangle = 0\ \forall v \in A\}

Borel sets form a \sigma-algebra in which the distributive law A \cap (B \cup C) = (A \cap B) \cup (A \cap C) holds for any elements of {\cal S}. However in the Hilbert lattice the corresponding rule A \wedge (B \vee C) = (A \wedge B) \vee (A \wedge C) (where A, B, C are projection operators) only holds some of the time (see here for an example). This failure of the distributive law is equivalent to the general non-commutativity of projections.
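A standard small illustration of this failure (not necessarily the linked example) takes three distinct lines through the origin in \mathbb{R}^2, identified with the projections onto them. The particular spanning vectors below are just a convenient choice.

```latex
\begin{aligned}
A &= \text{span}\{(1,0)\}, \quad B = \text{span}\{(0,1)\}, \quad C = \text{span}\{(1,1)\} \subset \mathbb{R}^2,\\
A \wedge (B \vee C) &= A \wedge \mathbb{R}^2 = A,\\
(A \wedge B) \vee (A \wedge C) &= \{0\} \vee \{0\} = \{0\} \neq A.
\end{aligned}
```

Here B \vee C is the whole plane because B and C are distinct lines, while A meets each of B and C only in the zero vector.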

A quantum probability measure \phi:{\cal P} \rightarrow [0,1] can be defined by combining projections in a \sigma-additive way, namely \phi(0) = 0, \phi(I) = 1 and \phi(\vee_i P_i) = \sum_i \phi(P_i) where P_i are mutually orthogonal projections (P_i \leq P_j^{\perp}, i \neq j). Gleason’s theorem says that for Hilbert space dimension at least 3 a state is uniquely determined by the values it takes on the orthogonal projections – a quantum probability measure can be extended from projections to bounded operators to obtain \phi(A) = \text{Tr}(\rho_{\phi} A), similar to how characteristic functions are extended to integrable functions. Hence this is a key result for non-commutative integration (note: the continuity conditions defining \Phi in 1-3 above are stronger). We choose von Neumann algebras over C*-algebras since the former contain all spectral projections of their self-adjoint elements while the latter may not [ref].

So far we have seen that expected values of observables A are derived via the formula \text{Tr}(\rho A). To derive the distribution itself, we make use of the spectral theorem and for self-adjoint operators with continuous spectrum this requires projection valued measures. A self-adjoint operator A has a corresponding function E_A:{\cal S} \rightarrow {\cal P}({\cal H}) mapping Borel sets to projections so that E_A(S) represents the event that the outcome of measuring observable A is in the set S: we require that E_A(X) = I and S \mapsto \langle u,E_A(S)v \rangle is a complex additive function (measure) for all u, v \in {\cal H}. We use E_A(\lambda) as shorthand for E_A(\{x:x\leq \lambda\}). Similar to the way a finite dimensional self-adjoint matrix M may be eigen-decomposed in terms of its eigenvalues \lambda_i and normalised eigenvectors u_i as

\begin{aligned} M &= \sum_i \lambda_i u_i u_i^T \\ &= \sum_i \lambda_i P_i \quad \text{(where }P_i := u_i u_i^T \text{ is a projection)}\\ &= \sum_i \lambda_i (E_i - E_{i-1}), \quad \text{(where } E_i := \sum_{k \leq i} P_k\text{ ),} \end{aligned}

the spectral theorem for more general self-adjoint operators allows us to write

A = \int_{\sigma(A)} \lambda dE_A(\lambda)

which means that for every u, v \in {\cal H},

\langle u, Av \rangle = \int_{\sigma(A)} \lambda d\langle u,E_A(\lambda) v \rangle.

Here, the integrals are over the spectrum of A. Through this formula we can work with functions of operators and in particular the distribution of the random variable X corresponding to operator A in state \rho will be

\text{Pr}(X \leq x) = E\left[ 1_{\{X \leq x\} }\right] = \text{Tr} \left( \rho\int_{-\infty}^x dE_A(\lambda) \right) = \text{Tr} \left( \rho E_A(x) \right).
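In finite dimensions these formulas reduce to sums over eigenvalues, which can be illustrated with a 2 by 2 real symmetric matrix. The sketch below uses a hand-picked matrix and state (both arbitrary choices for illustration) whose eigenvalues 1 and 3 and eigenvectors are known in closed form.

```python
from math import isclose, sqrt

# A real symmetric matrix with eigenvalues 1 and 3.
M = [[2.0, 1.0],
     [1.0, 2.0]]
eigenpairs = [
    (1.0, [1 / sqrt(2), -1 / sqrt(2)]),
    (3.0, [1 / sqrt(2), 1 / sqrt(2)]),
]

def outer(u):
    """Rank-one projection u u^T onto the span of the unit vector u."""
    return [[u[i] * u[j] for j in range(2)] for i in range(2)]

projections = [(lam, outer(u)) for lam, u in eigenpairs]

# Spectral decomposition: M = sum_i lambda_i P_i.
reconstructed = [[sum(lam * P[i][j] for lam, P in projections)
                  for j in range(2)] for i in range(2)]

def spectral_projection(x):
    """E_M(x): sum of the eigenprojections with eigenvalue <= x."""
    return [[sum(P[i][j] for lam, P in projections if lam <= x)
             for j in range(2)] for i in range(2)]

def cdf(rho, x):
    """Pr(X <= x) = Tr(rho E_M(x)) for the observable M in state rho."""
    E = spectral_projection(x)
    return sum(rho[i][k] * E[k][i] for i in range(2) for k in range(2))

# Pure state rho = v v^T with v = (1, 0).
rho = outer([1.0, 0.0])

for x in (0.0, 1.0, 2.0, 3.0):
    print(x, cdf(rho, x))
```

In this state each eigenvalue is observed with probability 1/2, so the distribution function jumps by 1/2 at x = 1 and again at x = 3.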

The similarities we have seen here between classical probability and quantum mechanics are summarised in the table below, largely taken from [2] which greatly aided my understanding. Note how the pairing between trace class and bounded operators is analogous to the duality of L^1 and L^{\infty} functions.

Classical Probability | Quantum Mechanics (non-commutative probability)
(X,{\cal S}, \mu) – measure space | ({\cal H}, {\cal P}({\cal H}), \text{Tr}) – Hilbert space model of QM
X – set | {\cal H} – Hilbert space
{\cal S} – Boolean algebra of Borel subsets of X called events | {\cal P}({\cal H}) – orthomodular lattice of projections (equivalently closed subspaces) of {\cal H}
disjoint events | orthogonal projections
\mu:{\cal S} \rightarrow {\mathbb R}^{+} \cup \infty – \sigma-additive positive measure | \text{Tr} – functional
g \in L^1(X,\mu), g \geq 0, \int g \ d\mu = 1 – integrable functions (probability density functions) | \rho \in {\cal T}({\cal H}), \rho \geq 0, \text{Tr}(\rho) = 1 – trace class operators (density operators)
p_g(S) = \int \chi_S g\ d\mu \in [0,1], S \in {\cal S} – probability measure mapping Borel sets to numbers in [0,1] in a sigma-additive way | \phi(S) = \text{Tr}(\rho_{\phi } S) \in [0,1], \rho_{\phi } \in {\cal T}({\cal H}), S \in {\cal P}({\cal H}) – quantum state mapping projections to numbers in [0,1] in a sigma-additive way
f \in L^{\infty}(X,\mu) – essentially bounded measurable functions (bounded random variables) | A \in {\cal B}({\cal H}) – von Neumann algebra of bounded operators (bounded observables)
\int fg\ d\mu, g \in L^1(X,\mu) – expectation value of f \in L^{\infty}(X,\mu) with respect to p_g | \text{Tr}(\rho A), \rho \in {\cal T}({\cal H}) – expectation value of A \in {\cal B}({\cal H}) in state \rho

In summary, the fact that measurements don’t always commute leads us to consider non-commutative operator algebras. This leads us to the Hilbert space representation of quantum mechanics where a quantum state is a trace-one density operator and an observable is a bounded linear operator. We also saw that projections can be viewed as 0-1 events. The spectral theorem is used to decompose operators into a sum or integral of projections.

The richer mathematical setting for quantum mechanics allows us to model non-classical phenomena such as quantum interference and entanglement. We have not mentioned the time evolution of states, but in short, state vectors evolve unitarily according to the Schrödinger equation, generated by an operator known as the Hamiltonian.

References and Further Reading

[1] Hall, B.C., Quantum Theory for Mathematicians, Springer, Graduate Texts in Mathematics #267, June 2013 (relevant section)

[2] Redei, M., Von Neumann’s work on Hilbert space quantum mechanics

[3] Blackadar, B., Operator Algebras: Theory of C*-Algebras and von Neumann Algebras

[4] Wilce, Alexander, “Quantum Logic and Probability Theory“, The Stanford Encyclopedia of Philosophy (Spring 2017 Edition), Edward N. Zalta (ed.).

[5] Wikipedia – Quantum logic

[6] – Lattice of Projections

[7] – Spectral Measure

[8] quantum mechanics – Intuitive meaning of Hilbert Space formalism – Physics Stack Exchange

[9] This answer to: mathematical physics – Quantum mechanics in a metric space rather than in a vector space, possible? – Physics Stack Exchange

[10] functional analysis – Resolution of the identity (basic questions) – Mathematics Stack Exchange
