Chaitanya's Random Pages

September 8, 2017

Notes on von Neumann’s algebra formulation of Quantum Mechanics

Filed under: mathematics,science — ckrao @ 9:49 pm

The Hilbert space formulation of (non-relativistic) quantum mechanics is one of the great achievements of mathematical physics. Typically in undergraduate physics courses it is introduced as a set of postulates (e.g. the Dirac-von Neumann axioms) and is hard to motivate without some knowledge of functional analysis or at least probability theory. Some of that motivation and the connection with probability theory is summarised in the notes here – in fact it can be said that quantum mechanics is essentially non-commutative probability theory [2]. Furthermore, an algebraic point of view seems to provide a unified picture of classical and quantum mechanics.

The important difference between classical and quantum mechanics is that in the latter, the order in which measurements are taken sometimes matters. This is because obtaining the value of one measurement can disturb the system of interest to the extent that a consistently precise value of the other cannot be found. A famous example is the position and momentum of a quantum particle – the Heisenberg uncertainty relation states that the product of their uncertainties (standard deviations) in measurement is at least \hbar/2, and hence strictly greater than zero.

If measurements are treated as real-valued functions on the state space of the system, we will not be able to capture the fact that the measurements do not commute. Since linear operators (e.g. matrices) do not commute in general, we use algebras of operators instead. We make use of the spectral theory arising from a special class of algebras with norm and adjoint known as von Neumann algebras, which in turn are a special case of C*-algebras. The spectrum of an operator A is the set of numbers \lambda for which (A-\lambda I) does not have an inverse. Self-adjoint operators have a real spectrum, and this spectrum will represent the set of values that an observable (a physical variable that can be measured) can take. Hence we have this correspondence between self-adjoint operators and observables.

By the Gelfand-Naimark theorem, C*-algebras can be represented as bounded operators on a Hilbert space {\cal H}. See Section II.6.4 of [3] for proof details. If the C*-algebra is commutative the representation is as continuous functions on a locally compact Hausdorff space that vanish at infinity. Furthermore we assume the C*-algebra and corresponding Hilbert space are separable, meaning the space contains a countable dense subset (analogous to how the rationals are dense in the set of real numbers). This ensures that the Stone-von Neumann theorem holds, which was used to show that the Heisenberg and Schrödinger pictures of quantum physics are equivalent [see pp7-8 here].

The link between C*-algebras and Hilbert spaces is made via the notion of a state, which is a positive linear functional on the algebra of norm 1. A state evaluated on a self-adjoint operator outputs a real number that will represent the expected value of the observable corresponding to that operator. Note that it is impossible to have two different states that have the same expected values across all observables. A state \omega is called pure if it is an extreme point of the (convex) space of states. In other words, we cannot write a pure state \omega as \omega = \lambda \omega_1 + (1-\lambda) \omega_2 where \omega_1 \neq \omega_2 are states and 0 < \lambda < 1. A state that is not pure is called mixed.

Now referring to a Hilbert space {\cal H}, for any mapping \Phi of bounded operators B({\cal H}) to expectation values such that

  1. \Phi(I) = 1 (it makes sense that the identity should have expectation value 1),
  2. self-adjoint operators are mapped to real numbers with positive operators (those with positive spectrum) mapped to positive numbers and
  3. \Phi is continuous with respect to the strong convergence in B({\cal H}) – i.e. if \lVert A_n \psi - A \psi \rVert \rightarrow 0 for all \psi \in H, then \Phi (A_n) \rightarrow \Phi (A),

then there is a unique self-adjoint non-negative trace-one operator \rho (known as a density matrix) such that \Phi (A) = \text{trace}(\rho A) for all A \in B({\cal H}) (see [1] Proposition 19.9). (The trace of an operator A is defined as \sum_k \langle e_k, Ae_k \rangle where \{e_k \} is an orthonormal basis in the separable Hilbert space – in the finite dimensional case it is the sum of the operator’s eigenvalues.) Hence states are represented by positive self-adjoint operators with trace 1. Such operators are compact and so have a countable orthonormal basis of eigenvectors.

When \rho corresponds to a projection operator onto a one-dimensional subspace it has the form \rho = vv^* where v \in {\cal H} and \lVert v \rVert = 1. In this case we can show \text{trace}(\rho A) = \langle v, Av \rangle = v^*Av, which recovers the alternative view that unit vectors of {\cal H} correspond to states (known as vector states) so that the expected value of an observable corresponding to the operator A is \langle v, Av \rangle. This is done by choosing the orthonormal basis \{e_k \} where e_1 = v and computing

\begin{aligned} \text{trace}(\rho A) &= \sum_k \langle e_k, vv^*Ae_k \rangle\\ &= \sum_k e_k^* v v^* Ae_k\\ &= e_1^* e_1 e_1^*Ae_1 \quad \text{ (as }e_k^*v = \langle e_k, v \rangle = 0\text{ for } k > 1\text{)}\\ &= e_1^*Ae_1\\ &= \langle v, Av \rangle. \end{aligned}
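The identity \text{trace}(\rho A) = \langle v, Av \rangle can be checked numerically in a small case. The following is a minimal sketch in plain Python; the particular unit vector v and self-adjoint matrix A are illustrative choices, not taken from the text.

```python
# Check trace(rho A) = <v, A v> for a vector state rho = v v^* (2x2 complex case).

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def pure_state(v):
    """Density matrix rho = v v^* for a unit vector v."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

def expectation(v, A):
    """<v, A v>, with the inner product conjugate-linear in its first argument."""
    Av = [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]
    return sum(vi.conjugate() * avi for vi, avi in zip(v, Av))

# Hypothetical example data: a unit vector and a self-adjoint matrix.
v = [complex(3, 0) / 5, complex(0, 4) / 5]
A = [[2 + 0j, 1 - 1j],
     [1 + 1j, -1 + 0j]]

rho = pure_state(v)
lhs = trace(matmul(rho, A))   # trace(rho A)
rhs = expectation(v, A)       # <v, A v>
```

Both quantities agree up to rounding and are real, as expected for a self-adjoint A.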

Trace-one operators \rho can be written as a convex combination of rank one projection operators: \rho = \sum \lambda_k v_k v_k^*. From this it can be shown that those density operators which cannot be written as a convex combination of other states (called pure states) are precisely those of the form \rho = vv^*. Hence vector states and pure states are equivalent notions. Mixed states can be interpreted as a probabilistic mixture (convex combination) of pure states.

Let us now look at the similarity with probability theory. A measure space is a triple (X, {\cal S}, \mu) where X is a set, {\cal S} is a collection of measurable subsets of X called a \sigma-algebra and \mu:{\cal S} \rightarrow \mathbb{R}^{+} \cup \{\infty\} is a \sigma-additive measure. If g is a non-negative integrable function with \int g \ d\mu = 1 it is called a density function and then we can define a probability measure p_g:{\cal S} \rightarrow [0,1] by

\displaystyle p_g(S) = \int_S  g\ d\mu \in [0,1], S \in {\cal S}.

A random variable f:X\rightarrow \mathbb{R} maps elements of a set to real numbers in such a way that f^{-1}(B) \in {\cal S} for any Borel subset B of \mathbb{R}. This enables us to compute its expectation with respect to the density function g as

\displaystyle \int_X f \ dp_g = \int_X fg\ d\mu.

This is like the quantum formula \text{Tr}(\rho A) with our density operator \rho playing the role of g and operator A playing the role of random variable f. Hence a probability density function is the commutative probability analogue of a quantum state (density operator).
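In the commutative special case the two formulas coincide exactly. The sketch below assumes a finite set X with counting measure, so that g and f are just lists of numbers, and embeds them as diagonal matrices; the specific values are made up for illustration.

```python
# Classical expectation sum(f*g) versus Tr(rho A) for diagonal rho and A.

def classical_expectation(f, g):
    """Integral of f*g with respect to counting measure on a finite set."""
    return sum(fi * gi for fi, gi in zip(f, g))

def diagonal(values):
    """Diagonal matrix with the given entries."""
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]

def trace_product(X, Y):
    """Tr(XY) for square matrices given as nested lists."""
    n = len(X)
    return sum(X[i][k] * Y[k][i] for i in range(n) for k in range(n))

g = [0.5, 0.25, 0.25]   # hypothetical probability density (sums to 1)
f = [1, 2, 4]           # hypothetical random variable

classical = classical_expectation(f, g)
quantum = trace_product(diagonal(g), diagonal(f))
```

Non-commutative behaviour only appears once \rho and A are no longer simultaneously diagonalisable.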

While Borel sets are the events from which we define simple functions and then random variables, in the non-commutative case we define operators in terms of projections (equivalently closed subspaces) of a Hilbert space {\cal H}. A projection operator P is self-adjoint, satisfies P^2 = P and has discrete spectrum contained in \{0,1\}. Hence projections are analogous to 0-1 indicator random variables, the answers to yes/no events. For any unit vector v \in {\cal H} the expected value

\displaystyle \langle v, Pv \rangle = \langle v, P^2v \rangle = \langle Pv, Pv \rangle = \lVert Pv \rVert^2

is interpreted as the probability that the observable corresponding to P will have value 1 when measured in the state corresponding to v. In particular this probability will be 1 if and only if v is in the range of P (the closed subspace onto which P projects). We define meet and join operations \wedge, \vee on these closed subspaces to create a Hilbert lattice ({\cal P}({\cal H}), \vee, \wedge, \perp):

  • A \wedge B = A \cap B
  • A \vee B = \text{closure of } A + B
  • A^{\perp} = \{u: \langle u,v \rangle = 0\ \forall v \in A\}

Borel sets form a \sigma-algebra in which the distributive law A \cap (B \cup C) = (A \cap B) \cup (A \cap C) holds for any elements of {\cal S}. However in the Hilbert lattice the corresponding rule A \wedge (B \vee C) = (A \wedge B) \vee (A \wedge C) (where A, B, C are projection operators) only holds some of the time (see here for an example). This failure of the distributive law is equivalent to the general non-commutativity of projections.
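The failure of distributivity can already be seen in the plane. The sketch below takes three distinct lines through the origin in \mathbb{R}^2 (an illustrative choice, not taken from the text) and compares the dimensions of A \wedge (B \vee C) and (A \wedge B) \vee (A \wedge C), using \dim(U \cap V) = \dim U + \dim V - \dim(U + V).

```python
from fractions import Fraction

def rank(vectors):
    """Dimension of the span of the given vectors (Gaussian elimination over the rationals)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def dim_meet(U, V):
    """dim(U meet V), where U and V are lists of spanning vectors."""
    return rank(U) + rank(V) - rank(U + V)

# Three distinct lines through the origin, each given by a spanning vector.
A = [(1, 1)]
B = [(1, 0)]
C = [(0, 1)]

# B join C is spanned by the union of the spanning sets (here the whole plane).
lhs_dim = dim_meet(A, B + C)               # dim of A meet (B join C)

# A meet B and A meet C are both the zero subspace, so their join is zero too;
# the sum of their dimensions bounds the dimension of the join from above.
rhs_dim = dim_meet(A, B) + dim_meet(A, C)
```

Here the left-hand side is the line A itself while the right-hand side is the zero subspace, so the two sides of the distributive law differ.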

A quantum probability measure \phi:{\cal P} \rightarrow [0,1] can be defined by combining projections in a \sigma-additive way, namely \phi(0) = 0, \phi(I) = 1 and \phi(\vee_i P_i) = \sum_i \phi(P_i) where P_i are mutually orthogonal projections (P_i \leq P_j^{\perp}, i \neq j). Gleason’s theorem says that for Hilbert space dimension at least 3 a state is uniquely determined by the values it takes on the orthogonal projections – a quantum probability measure can be extended from projections to bounded operators to obtain \phi(A) = \text{Tr}(\rho_{\phi} A), similar to how characteristic functions are extended to integrable functions. Hence this is a key result for non-commutative integration (note: the continuity conditions defining \Phi in 1-3 above are stronger). We choose von Neumann algebras over C*-algebras since the former contain all spectral projections of their self-adjoint elements while the latter may not [ref].

So far we have seen that expected values of observables A are derived via the formula \text{Tr}(\rho A). To derive the distribution itself, we make use of the spectral theorem, and for self-adjoint operators with continuous spectrum this requires projection-valued measures. A self-adjoint operator A has a corresponding function E_A:{\cal S} \rightarrow {\cal P}({\cal H}) mapping Borel sets to projections so that E_A(S) represents the event that the outcome of measuring observable A is in the set S: we require that E_A(\mathbb{R}) = I and that S \mapsto \langle u,E_A(S)v \rangle is a complex additive function (measure) for all u, v \in {\cal H}. We use E_A(\lambda) as shorthand for E_A(\{x:x\leq \lambda\}). Similar to the way a finite dimensional self-adjoint matrix M may be eigen-decomposed in terms of its eigenvalues \lambda_i and normalised eigenvectors u_i as

\begin{aligned} M &= \sum_i \lambda_i u_i u_i^T \\ &= \sum_i \lambda_i P_i \quad \text{(where }P_i := u_i u_i^T \text{ is a projection)}\\ &= \sum_i \lambda_i (E_i - E_{i-1}), \quad \text{(where } E_i := \sum_{k \leq i} P_k\text{ ),} \end{aligned}

the spectral theorem for more general self-adjoint operators allows us to write

A = \int_{\sigma(A)} \lambda dE_A(\lambda)

which means that for every u, v \in {\cal H},

\langle u, Av \rangle = \int_{\sigma(A)} \lambda d\langle u,E_A v \rangle.

Here, the integrals are over the spectrum of A. Through this formula we can work with functions of operators and in particular the distribution of the random variable X corresponding to operator A in state \rho will be

\text{Pr}(X \leq x) = E\left[ 1_{\{X \leq x\} }\right] = \text{Tr} \left( \rho\int_{-\infty}^x dE_A(\lambda) \right) = \text{Tr} \left( \rho E_A(x) \right).
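In finite dimensions the integral reduces to a sum over eigenvalues, and the distribution formula can be evaluated directly. The following is a small sketch under assumed example data: the observable A, its hand-written spectral projections and the density matrix \rho are illustrative choices, not taken from the text.

```python
from fractions import Fraction as F

def trace_product(X, Y):
    """Tr(XY) for square matrices given as nested lists."""
    n = len(X)
    return sum(X[i][k] * Y[k][i] for i in range(n) for k in range(n))

# Observable A = [[0, 1], [1, 0]] has eigenvalues -1 and +1.
A = [[F(0), F(1)], [F(1), F(0)]]
spectral = [
    (-1, [[F(1, 2), F(-1, 2)], [F(-1, 2), F(1, 2)]]),  # projection onto (1, -1)/sqrt(2)
    (+1, [[F(1, 2), F(1, 2)], [F(1, 2), F(1, 2)]]),    # projection onto (1, 1)/sqrt(2)
]

# A density matrix: self-adjoint, non-negative, trace one.
rho = [[F(1, 2), F(1, 4)], [F(1, 4), F(1, 2)]]

def cdf(x):
    """Pr(X <= x) = Tr(rho E_A(x)), where E_A(x) sums the projections with eigenvalue <= x."""
    return sum(trace_product(rho, P) for lam, P in spectral if lam <= x)

mean_from_trace = trace_product(rho, A)
mean_from_spectrum = sum(lam * trace_product(rho, P) for lam, P in spectral)
```

The resulting distribution is a step function with jumps at the eigenvalues, and its mean agrees with \text{Tr}(\rho A).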

The similarities we have seen here between classical probability and quantum mechanics are summarised in the table below, largely taken from [2] which greatly aided my understanding. Note how the pairing between trace class and bounded operators is analogous to the duality of L^1 and L^{\infty} functions.

| Classical Probability | Quantum Mechanics (non-commutative probability) |
| --- | --- |
| (X,{\cal S}, \mu) – measure space | ({\cal H}, {\cal P}({\cal H}), \text{Tr}) – Hilbert space model of QM |
| X – set | {\cal H} – Hilbert space |
| {\cal S} – Boolean algebra of Borel subsets of X called events | {\cal P}({\cal H}) – orthomodular lattice of projections (equivalently closed subspaces) of {\cal H} |
| disjoint events | orthogonal projections |
| \mu:{\cal S} \rightarrow {\mathbb R}^{+} \cup \infty – \sigma-additive positive measure | \text{Tr} – functional |
| g \in L^1(X,\mu), g \geq 0, \int g \ d\mu = 1 – integrable functions (probability density functions) | \rho \in {\cal T}({\cal H}), \rho \geq 0, \text{Tr}(\rho) = 1 – trace class operators (density operators) |
| p_g(S) = \int \chi_S g\ d\mu \in [0,1], S \in {\cal S} – probability measure mapping Borel sets to numbers in [0,1] in a sigma-additive way | \phi(S) = \text{Tr}(\rho_{\phi } S) \in [0,1], \rho_{\phi } \in {\cal T}({\cal H}), S \in {\cal P}({\cal H}) – quantum state mapping projections to numbers in [0,1] in a sigma-additive way |
| f \in L^{\infty}(X,\mu) – essentially bounded measurable functions (bounded random variables) | A \in {\cal B}({\cal H}) – von Neumann algebra of bounded operators (bounded observables) |
| \int fg\ d\mu, g \in L^1(X,\mu) – expectation value of f \in L^{\infty}(X,\mu) with respect to p_g | \text{Tr}(\rho A), \rho \in {\cal T}({\cal H}) – expectation value of A \in {\cal B}({\cal H}) in state \rho |

In summary, the fact that measurements don’t always commute leads us to consider non-commutative operator algebras. This leads us to the Hilbert space representation of quantum mechanics where a quantum state is a trace-one density operator and an observable is a bounded linear operator. We also saw that projections can be viewed as 0-1 events. The spectral theorem is used to decompose operators into a sum or integral of projections.

The richer mathematical setting for quantum mechanics allows us to model non-classical phenomena such as quantum interference and entanglement. We have not mentioned the time evolution of states, but in short, state vectors evolve unitarily according to the Schrödinger equation, generated by an operator known as the Hamiltonian.

References and Further Reading

[1] Hall, B.C., Quantum Theory for Mathematicians, Springer, Graduate Texts in Mathematics #267, June 2013 (relevant section)

[2] Redei, M., Von Neumann’s work on Hilbert space quantum mechanics

[3] Blackadar, B., Operator Algebras: Theory of C*-Algebras and von Neumann Algebras

[4] Wilce, Alexander, “Quantum Logic and Probability Theory“, The Stanford Encyclopedia of Philosophy (Spring 2017 Edition), Edward N. Zalta (ed.).

[5] Wikipedia – Quantum logic

[6] Planetmath.org – Lattice of Projections

[7] Planetmath.org – Spectral Measure

[8] quantum mechanics – Intuitive meaning of Hilbert Space formalism – Physics Stack Exchange

[9] This answer to: mathematical physics – Quantum mechanics in a metric space rather than in a vector space, possible? – Physics Stack Exchange

[10] functional analysis – Resolution of the identity (basic questions) – Mathematics Stack Exchange


March 19, 2017

Two similar geometry problems based on perpendiculars to cevians

Filed under: mathematics — ckrao @ 7:18 am

In this post I wanted to show a couple of similar problems that can be proved using some ideas from projective geometry.

The first problem I found via the Romantics of Geometry Facebook group: let M be the point of tangency of the incircle of \triangle ABC with BC and let E be the foot of the perpendicular from the incentre X of \triangle ABC to AM. Then show that EM bisects \angle BEC.

 

[Figure: perpendicularcevian1]

The second problem is motivated by the above and problem 2 of the 2008 USAMO: this time let AM be a symmedian of ABC and E be the foot of the perpendicular from the circumcentre X of \triangle ABC to AM. Then show that EM bisects \angle BEC.

[Figure: perpendicularcevian2]

Here is a solution to the first problem inspired by that of Vaggelis Stamatiadis. Let the line through the other two points of tangency P, Q of the incircle with ABC intersect line BC at the point N as shown below. Note that since AP and AQ are tangents to the circle, line NPQ is the polar of A with respect to the incircle.

[Figure: perpendicularcevian1a]

Since N is on the polar of A, by La Hire’s theorem, A is on the polar of N. The polar of N also passes through M (as NM is a tangent to the circle at M). We conclude that the polar of N is the line through A and M.

Next, let AM intersect PQ at R. By theorem 5(a) at this link, the points (N, R, P, Q) form a harmonic range. Since the cross ratio of collinear points does not change under central projection, considering the projection from A onto line BC, (N,M,B,C) also form a harmonic range. (Alternatively, this follows from the theorems of Ceva and Menelaus using the cevians intersecting at the Gergonne point and the transversal NPQ.) Also, NE \perp EM as both NX and XE are perpendicular to the polar AM of N.

Considering a central projection from E of line NMBC to a line N'MB'C' through M parallel to NE, we see that (N', M, B', C') form a harmonic range. Since N' is a point at infinity, this implies M is the midpoint of B'C' and so triangles B'EM and C'EM are congruent (two pairs of equal sides with the included angle equal to 90^{\circ}). Hence EM bisects \angle BEC as was to be shown.

[Figure: perpendicularcevian1b]

For the second problem, we use the following characterisation of a symmedian: AM extended is concurrent with the tangents to the circumcircle at B and C. (For three proofs of this see here.)

[Figure: perpendicularcevian2a]

Define N as the intersection of XE with BC and D as the intersection of AM with the tangents at B, C. Note that line NBMC is the polar of D with respect to the circumcircle. By La Hire’s theorem, D must be on the polar of N. This polar is perpendicular to NX (the line joining N to the centre of the circle) and as ED \perp EX by construction of E, it follows that line AEMD is the polar of N. Again by theorem 5(a) in reference (2), (N, M, B, C) form a harmonic range. Following the same argument as in the previous proof, this together with NE \perp EM implies that EM bisects \angle BEC as required.

By similar arguments, one can prove the following, left to the interested reader. If X is the A-excentre of \triangle ABC, M the ex-circle’s point of tangency with BC, and E the foot of the perpendicular from X to line AM, then EM bisects \angle BEC.

[Figure: perpendicularcevian3]

References

(1) Alexander Bogomolny, Poles and Polars from Interactive Mathematics Miscellany and Puzzles http://www.cut-the-knot.org/Curriculum/Geometry/PolePolar.shtml, Accessed 19 March 2017

(2) Poles and Polars – Another Useful Tool! | The Problem Solver’s Paradise

(3) Yufei Zhao, Lemmas in Euclidean Geometry

December 19, 2016

Some special functions and their applications

Filed under: mathematics — ckrao @ 9:55 am

Here are some notes on special functions and where they may arise. We consider functions in applied mathematics beyond those obtained by applying field operations (the four arithmetic operations), composition and inversion to the power and exponential functions.

1. Bessel and related functions

Bessel functions of the first (J_{\alpha}(x)) and second (Y_{\alpha}(x)) kind of order \alpha satisfy:

\displaystyle x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \alpha^2)y = 0.

Solutions for integer \alpha arise in solving Laplace’s equation in cylindrical coordinates while solutions for half-integer \alpha arise in solving the Helmholtz equation in spherical coordinates. Hence they come about in wave propagation, heat diffusion and electrostatic potential problems. The functions oscillate roughly periodically with amplitude decaying proportional to 1/\sqrt{x}. Note that Y_{\alpha}(x) is the second linearly independent solution when \alpha is an integer (for integer n, J_{-n}(x) = (-1)^n J_n(x)). Also, for integer n, J_n has the generating function

\displaystyle  \sum_{n=-\infty}^\infty J_n(x) t^n = e^{(\frac{x}{2})(t-1/t)},

the integral representations

\displaystyle J_n(x) = \frac{1}{\pi} \int_0^\pi \cos (n \tau - x \sin(\tau)) \,d\tau = \frac{1}{2 \pi} \int_{-\pi}^\pi e^{i(n \tau - x \sin(\tau))} \,d\tau

and satisfies the orthogonality relation

\displaystyle \int_0^1 x J_\alpha(x u_{\alpha,m}) J_\alpha(x u_{\alpha,n}) \,dx = \frac{\delta_{m,n}}{2} [J_{\alpha+1}(u_{\alpha,m})]^2 = \frac{\delta_{m,n}}{2} [J_{\alpha}'(u_{\alpha,m})]^2,

where \alpha > -1, \delta_{m,n} is the Kronecker delta, and u_{\alpha, m} is the m-th zero of J_{\alpha}(x).

Modified Bessel functions of the first (I_{\alpha}(x)) and second (K_{\alpha}(x)) kind of order \alpha satisfy:

\displaystyle x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} - (x^2 + \alpha^2)y = 0

(replacing x with ix in the previous equation).

The four functions may be expressed as follows.

\displaystyle J_{\alpha}(x) = \sum_{m=0}^\infty \frac{(-1)^m}{m! \, \Gamma(m+\alpha+1)} {\left(\frac{x}{2}\right)}^{2m+\alpha}

\displaystyle I_\alpha(x) = \sum_{m=0}^\infty \frac{1}{m! \, \Gamma(m+\alpha+1)} {\left(\frac{x}{2}\right)}^{2m+\alpha}

\displaystyle Y_\alpha(x) = \frac{J_\alpha(x) \cos(\alpha\pi) - J_{-\alpha}(x)}{\sin(\alpha\pi)}

\displaystyle K_\alpha(x) = \frac{\pi}{2} \frac{I_{-\alpha} (x) - I_\alpha (x)}{\sin (\alpha \pi)}

(In the last two formulas we need to take a limit when \alpha is an integer.)

Note that K and Y are singular at zero.

The Hankel functions H_\alpha^{(1)}(x) = J_\alpha(x)+iY_\alpha(x) and H_\alpha^{(2)}(x) = J_\alpha(x)-iY_\alpha(x) are also known as Bessel functions of the third kind.

The functions J_\alpha, Y_\alpha, H_\alpha^{(1)}, and H_\alpha^{(2)} all satisfy the recurrence relations (using Z in place of any of these four functions)

\displaystyle \frac{2\alpha}{x} Z_\alpha(x) = Z_{\alpha-1}(x) + Z_{\alpha+1}(x),
\displaystyle 2\frac{dZ_\alpha}{dx} = Z_{\alpha-1}(x) - Z_{\alpha+1}(x).

Bessel functions of higher orders/derivatives can be calculated from lower ones via:

\displaystyle \left( \frac{1}{x} \frac{d}{dx} \right)^m \left[ x^\alpha Z_{\alpha} (x) \right] = x^{\alpha - m} Z_{\alpha - m} (x),
\displaystyle \left( \frac{1}{x} \frac{d}{dx} \right)^m \left[ \frac{Z_\alpha (x)}{x^\alpha} \right] = (-1)^m \frac{Z_{\alpha + m} (x)}{x^{\alpha + m}}.

In particular, note that -J_1(x) is the derivative of J_0(x).
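Several of the identities above can be sanity-checked numerically. The sketch below evaluates J_\alpha from its power series, compares it with the integral representation for integer order, and tests the first recurrence relation and the derivative of J_0. The function names, the evaluation point and the truncation parameters are arbitrary choices for illustration, suitable only for moderate x.

```python
import math

def bessel_j(alpha, x, terms=40):
    """J_alpha(x) from its power series (moderate x > 0, alpha > -1)."""
    return sum((-1) ** m / (math.factorial(m) * math.gamma(m + alpha + 1))
               * (x / 2) ** (2 * m + alpha) for m in range(terms))

def bessel_j_integral(n, x, steps=2000):
    """J_n(x) for integer n via (1/pi) * integral_0^pi cos(n t - x sin t) dt (midpoint rule)."""
    h = math.pi / steps
    return sum(math.cos(n * (k + 0.5) * h - x * math.sin((k + 0.5) * h))
               for k in range(steps)) * h / math.pi

x = 2.5

# Series versus integral representation for n = 1.
series_vs_integral = abs(bessel_j(1, x) - bessel_j_integral(1, x))

# Recurrence (2 alpha / x) J_alpha = J_{alpha-1} + J_{alpha+1} for a half-integer order.
alpha = 1.5
recurrence_gap = abs(2 * alpha / x * bessel_j(alpha, x)
                     - (bessel_j(alpha - 1, x) + bessel_j(alpha + 1, x)))

# J_0'(x) = -J_1(x), using a central difference for the derivative.
h = 1e-5
derivative_gap = abs((bessel_j(0, x + h) - bessel_j(0, x - h)) / (2 * h) + bessel_j(1, x))
```

All three gaps should be at the level of rounding or discretisation error.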

The Airy functions of the first (Ai(x)) and second (Bi(x)) kind satisfy

\displaystyle \frac{d^2y}{dx^2} - xy = 0.

These arise as solutions to Schrödinger’s equation for a particle in a triangular potential well and also describe interference and refraction patterns.

2. Orthogonal polynomials

Hermite polynomials (the probabilists’ definition) can be defined by:

\displaystyle \mathit{He}_n(x)=(-1)^n e^{\frac{x^2}{2}}\frac{d^n}{dx^n}e^{-\frac{x^2}{2}}=\left (x-\frac{d}{dx} \right )^n \cdot 1,

and are orthogonal with respect to weighting function w(x) = e^{-\frac{x^2}{2}} on (-\infty, \infty).

They satisfy the differential equation

\displaystyle \left(e^{-\frac{x^2}{2}}u'\right)' + \lambda e^{-\frac{1}{2}x^2}u = 0

(where \lambda is forced to be an integer if we insist u be polynomially bounded at \infty)

and the recurrence relation

\displaystyle {\mathit{He}}_{n+1}(x)=x{\mathit{He}}_n(x)-{\mathit{He}}_n'(x).

The first few such polynomials are 1, x, x^2-1, x^3-3x, \ldots. The physicists’ Hermite polynomials H_n(x) are related by H_n(x)=2^{\tfrac{n}{2}}{\mathit{He}}_n(\sqrt{2} \,x) and appear for example in the eigenstates of the quantum harmonic oscillator.
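The recurrence {\mathit{He}}_{n+1}(x)=x{\mathit{He}}_n(x)-{\mathit{He}}_n'(x) gives a direct way to generate the coefficients. Here is a minimal sketch in which a polynomial is stored as a list of coefficients [c_0, c_1, \ldots] (a representation chosen here for convenience).

```python
def poly_derivative(p):
    """Derivative of a polynomial given as coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def hermite_prob(n):
    """Coefficients of He_n from He_{k+1}(x) = x He_k(x) - He_k'(x), starting at He_0 = 1."""
    p = [1]
    for _ in range(n):
        shifted = [0] + p                      # coefficients of x * He_k
        d = poly_derivative(p)                 # coefficients of He_k'
        d = d + [0] * (len(shifted) - len(d))  # pad to equal length
        p = [a - b for a, b in zip(shifted, d)]
    return p

def poly_eval(p, x):
    """Evaluate a coefficient list at x by Horner's rule."""
    result = 0
    for c in reversed(p):
        result = result * x + c
    return result

def hermite_phys(n, x):
    """Physicists' H_n(x) = 2^(n/2) He_n(sqrt(2) x)."""
    return 2 ** (n / 2) * poly_eval(hermite_prob(n), 2 ** 0.5 * x)
```

For example, hermite_prob(3) returns [0, -3, 0, 1], the coefficients of x^3-3x, and hermite_phys(2, x) agrees with 4x^2-2 up to rounding.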

Laguerre polynomials are defined by

\displaystyle L_n(x)=\frac{e^x}{n!}\frac{d^n}{dx^n}\left(e^{-x} x^n\right) =\frac{1}{n!} \left( \frac{d}{dx} -1 \right) ^n x^n = \sum_{k=0}^n \binom{n}{k}\frac{(-1)^k}{k!} x^k,

and are orthogonal with respect to e^{-x} on (0,\infty).

They satisfy the differential equation

\displaystyle  xy'' + (1 - x)y' + ny = 0,

recurrence relation

\displaystyle L_{k + 1}(x) = \frac{(2k + 1 - x)L_k(x) - k L_{k - 1}(x)}{k + 1},

and have generating function

\displaystyle \sum_{n=0}^\infty  t^n L_n(x)=  \frac{1}{1-t} e^{-\frac{tx}{1-t}}.

The first few values are 1, 1-x, (x^2-4x+2)/2. Note also that L_{-n}(x)=e^xL_{n-1}(-x).

The functions come up as the radial part of solution to Schrödinger’s equation for a one-electron atom.
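As a consistency check, the three-term recurrence for L_n can be compared with the explicit sum formula. The sketch below uses exact rational arithmetic and again stores polynomials as coefficient lists [c_0, c_1, \ldots]; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def laguerre_explicit(n):
    """Coefficients of L_n from the sum of binom(n, k) (-1)^k / k! x^k."""
    return [Fraction(comb(n, k) * (-1) ** k, factorial(k)) for k in range(n + 1)]

def laguerre_recurrence(n):
    """Coefficients of L_n from (k+1) L_{k+1} = (2k+1-x) L_k - k L_{k-1}."""
    prev, cur = [Fraction(1)], [Fraction(1), Fraction(-1)]  # L_0 and L_1
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [Fraction(0)] * (k + 2)
        for i, c in enumerate(cur):
            nxt[i] += (2 * k + 1) * c   # (2k+1) L_k
            nxt[i + 1] -= c             # -x L_k
        for i, c in enumerate(prev):
            nxt[i] -= k * c             # -k L_{k-1}
        prev, cur = cur, [c / (k + 1) for c in nxt]
    return cur
```

For n = 2 both functions return the coefficients 1, -2, 1/2, matching (x^2-4x+2)/2.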

Legendre polynomials can be defined by

\displaystyle P_n(x) = {1 \over 2^n n!} {d^n \over dx^n } \left[ (x^2 -1)^n \right]

and are orthogonal with respect to the L^2 inner product on (-1,1).

They satisfy the differential equation

\displaystyle {d \over dx} \left[ (1-x^2) {d \over dx} P_n(x) \right] + n(n+1)P_n(x) = 0,

recurrence relation

\displaystyle (n+1)P_{n+1}(x) = (2n+1)xP_n(x) - nP_{n-1}(x),

and have generating function

\sum_{n=0}^\infty P_n(x) t^n = \displaystyle \frac{1}{\sqrt{1-2xt+t^2}}.

The first few values are 1, x, (3x^2-1)/2, (5x^3-3x)/2.

They arise in the expansion of the Newtonian potential 1/|x-x'| (multipole expansions) and Laplace’s equation where there is axial symmetry (spherical harmonics are expressed in terms of these).
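The generating function can be checked numerically by evaluating P_n with Bonnet's three-term recurrence (k+1)P_{k+1}(x) = (2k+1)xP_k(x) - kP_{k-1}(x), a standard identity assumed here. The sample point (x, t) is an arbitrary choice with |t| < 1.

```python
import math

def legendre(n, x):
    """P_n(x) via Bonnet's recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x          # P_0 and P_1
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

# Compare a truncated generating series with its closed form.
x, t = 0.3, 0.5
series = sum(legendre(n, x) * t ** n for n in range(80))
closed_form = 1 / math.sqrt(1 - 2 * x * t + t * t)
```

Since |P_n(x)| \leq 1 on [-1,1], the truncated series converges geometrically for |t| < 1.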

Chebyshev polynomials of the 1st kind T_n(x) can be defined by

T_n(x) =\begin{cases} \cos(n\arccos(x)) & \ |x| \le 1 \\ \frac12 \left[ \left (x-\sqrt{x^2-1} \right )^n + \left (x+\sqrt{x^2-1} \right )^n \right] & \ |x| \ge 1 \\ \end{cases}

and are orthogonal with respect to weighting function w(x) = 1/\sqrt{1-x^2} in (-1,1).

They satisfy the differential equation

\displaystyle (1-x^2)\,y'' - x\,y' + n^2\,y = 0,

the relations

\displaystyle T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)

\displaystyle (1 - x^2)T_n'(x) = -nx T_n(x) + n T_{n-1}(x)

and have generating function

\displaystyle \sum_{n=0}^{\infty}T_n(x) t^n = \frac{1-tx}{1-2tx+t^2}.

The first few values are 1, x, 2x^2-1, 4x^3-3x, \ldots. These polynomials arise in approximation theory: their roots (the Chebyshev nodes) are used as nodes in polynomial interpolation. For n \geq 1 the function f(x) = \frac1{2^{n-1}}T_n(x) is the polynomial of leading coefficient 1 and degree n whose maximal absolute value on [-1,1] is minimal.
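The recurrence and the trigonometric definition of T_n can be compared directly. The following sketch also checks on a grid that |T_n| does not exceed 1 on [-1,1]; the grid size and test points are arbitrary.

```python
import math

def chebyshev_t(n, x):
    """T_n(x) from T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x), with T_0 = 1 and T_1 = x."""
    t_prev, t = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t = t, 2 * x * t - t_prev
    return t

# Agreement with cos(n arccos x) inside [-1, 1].
gap = abs(chebyshev_t(5, 0.3) - math.cos(5 * math.acos(0.3)))

# Largest |T_6| on an evenly spaced grid of [-1, 1], endpoints included.
grid = [-1 + 2 * k / 1000 for k in range(1001)]
max_abs = max(abs(chebyshev_t(6, x)) for x in grid)
```

The recurrence also works outside [-1,1]; for instance T_3(2) = 4 \cdot 8 - 3 \cdot 2 = 26.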

Chebyshev polynomials of the 2nd kind U_n(x) are defined by

\displaystyle  U_n(x)  = \frac{\left (x+\sqrt{x^2-1} \right )^{n+1} - \left (x-\sqrt{x^2-1} \right )^{n+1}}{2\sqrt{x^2-1}}

and are orthogonal with respect to weighting function w(x) = \sqrt{1-x^2} in (-1,1).

They satisfy the differential equation

\displaystyle  (1-x^2)\,y'' - 3x\,y' + n(n+2)\,y = 0,

the recurrence relation

\displaystyle U_{n+1}(x) = 2xU_n(x) - U_{n-1}(x)

and have generating function

\displaystyle \sum_{n=0}^{\infty}U_n(x) t^n = \frac{1}{1-2 t x+t^2}.

The first few values are 1, 2x, 4x^2-1, 8x^3-4x, \ldots. (There are also less well known Chebyshev  polynomials of the third and fourth kind.)

Bessel polynomials y_n(x) may be defined from Bessel functions via

\displaystyle y_n(x)=\sqrt{\frac{2}{\pi x}}\,e^{1/x}K_{n+\frac 1 2}(1/x)  = \sum_{k=0}^n\frac{(n+k)!}{(n-k)!k!}\,\left(\frac{x}{2}\right)^k.

They satisfy the differential equation

\displaystyle x^2\frac{d^2y_n(x)}{dx^2}+2(x\!+\!1)\frac{dy_n(x)}{dx}-n(n+1)y_n(x)=0.

The first few values are 1, x+1, 3x^2+3x+1,\ldots.

3. Integrals

The error function has the form

\displaystyle \rm{erf}(x) = \frac{2}{\sqrt\pi}\int_0^x e^{-t^2}\,\mathrm dt.

This can be interpreted as the probability a normally distributed random variable with zero mean and variance 1/2 is in the interval (-x,x).

The cdf of the standard normal distribution \Phi(x) is related to this via \Phi(x) = (1 + {\rm erf}(x/\sqrt{2}))/2. Hence the tail probability of the standard normal distribution Q(x) is Q(x) = (1 - {\rm erf}(x/\sqrt{2}))/2.
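These relations translate directly into code using the error function from Python's standard library. The function names below are illustrative.

```python
import math

def std_normal_cdf(x):
    """Phi(x) = (1 + erf(x / sqrt(2))) / 2."""
    return (1 + math.erf(x / math.sqrt(2))) / 2

def q_function(x):
    """Tail probability Q(x) = 1 - Phi(x) = (1 - erf(x / sqrt(2))) / 2."""
    return (1 - math.erf(x / math.sqrt(2))) / 2
```

For example, q_function(1.96) is close to 0.025, the familiar two-sided 5% critical value.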

Fresnel integrals are defined by

\displaystyle S(x) =\int_0^x \sin(t^2)\,\mathrm{d}t=\sum_{n=0}^{\infty}(-1)^n\frac{x^{4n+3}}{(2n+1)!(4n+3)}
\displaystyle C(x) =\int_0^x \cos(t^2)\,\mathrm{d}t=\sum_{n=0}^{\infty}(-1)^n\frac{x^{4n+1}}{(2n)!(4n+1)}

They have applications in optics.

The exponential integral {\rm Ei}(x) (used in heat transfer applications) is defined by

\displaystyle {\rm Ei}(x)=-\int_{-x}^{\infty}\frac{e^{-t}}t\,dt.

It is related to the logarithmic integral

\displaystyle {\rm li} (x) =   \int_0^x \frac{dt}{\ln t}

by \mathrm{li}(x) = \mathrm{Ei}(\ln x) (for real x).

The incomplete elliptic integrals of the first, second and third kinds are defined by

\displaystyle F(\varphi,k) = \int_0^\varphi \frac {d\theta}{\sqrt{1 - k^2 \sin^2 \theta}}

\displaystyle E(\varphi,k) =  \int_0^\varphi \sqrt{1-k^2 \sin^2\theta}\, d\theta

 \displaystyle \Pi(n ; \varphi \setminus \alpha) = \int_0^\varphi  \frac{1}{1-n\sin^2 \theta} \frac {d\theta}{\sqrt{1-(\sin\theta\sin \alpha)^2}}

Setting \varphi = \pi/2 gives the complete elliptic integrals.

Any integral of the form \int_{c}^{x} R \left(t, \sqrt{P(t)} \right) \, dt, where c is a constant, R is a rational function of its arguments and P(t) is a polynomial of 3rd or 4th degree with no repeated roots, may be expressed in terms of the elliptic integrals. The circumference of an ellipse of semi-major axis a, semi-minor axis b and eccentricity e = \sqrt{1-b^2/a^2} is given by 4aE(e), where E(k) is the complete integral of the second kind.
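The circumference formula 4aE(e) can be evaluated with simple numerical quadrature. The sketch below uses the midpoint rule for the complete elliptic integral of the second kind; the step count is an arbitrary choice.

```python
import math

def complete_elliptic_e(k, steps=2000):
    """E(k) = integral_0^{pi/2} sqrt(1 - k^2 sin^2(theta)) d(theta), midpoint rule."""
    h = (math.pi / 2) / steps
    return h * sum(math.sqrt(1 - (k * math.sin((i + 0.5) * h)) ** 2)
                   for i in range(steps))

def ellipse_circumference(a, b):
    """Circumference 4 a E(e) of an ellipse with semi-axes a >= b >= 0."""
    e = math.sqrt(1 - (b / a) ** 2)   # eccentricity
    return 4 * a * complete_elliptic_e(e)
```

Two limiting cases give a quick check: a circle (a = b) has circumference 2\pi a, and the degenerate ellipse with b = 0 has "circumference" 4a.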

(Some elliptic functions arise as inverses of elliptic integrals, hence their name.)

The (upper) incomplete Gamma function is defined by

\displaystyle \Gamma(s,x) = \int_x^{\infty} t^{s-1}\,e^{-t}\,{\rm d}t.

It satisfies the recurrence relation \Gamma(s+1,x)= s\Gamma(s,x) + x^{s} e^{-x}. Setting x = 0 gives the Gamma function \Gamma(s), which interpolates the factorial function.

The digamma function is the logarithmic derivative of the gamma function:

\displaystyle \psi(x)=\frac{d}{dx}\ln\Big(\Gamma(x)\Big)=\frac{\Gamma'(x)}{\Gamma(x)}.

Due to the relation \psi(x+1) = \psi(x) + 1/x, this function appears in the regularisation of divergent sums, e.g.

\sum_{n=0}^{\infty} \frac{1}{n+a}= - \psi (a).

The incomplete Beta function is defined by

\displaystyle B(x;\,a,b) = \int_0^x t^{a-1}\,(1-t)^{b-1}\,\mathrm{d}t.

When setting x=1 this becomes the Beta function which is related to the gamma function via

\displaystyle B(x,y)=\frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x+y)}.

This can be extended to the multivariate Beta function, used in defining the Dirichlet distribution:

\displaystyle B(\alpha_1,\ldots,\alpha_K) = \frac{\Gamma(\alpha_1) \cdots \Gamma(\alpha_K)}{\Gamma(\alpha_1 + \ldots + \alpha_K)}.

The polylogarithm, appearing as integrals of the Fermi–Dirac and Bose–Einstein distributions, is defined by

\displaystyle {\rm Li}_s(z) = \sum_{k=1}^\infty \frac{z^k}{k^s} = z + \frac{z^2}{2^s} + \frac{z^3}{3^s} + \cdots

Note the special case {\rm Li}_1(z) = -\ln (1-z) and the case s=2 is known as the dilogarithm. We also have the recursive formula

\displaystyle {\rm Li}_{s+1}(z) = \int_0^z \frac {{\rm Li}_s(t)}{t}\,\mathrm{d}t.
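For |z| < 1 the defining series converges quickly and can be summed directly. The sketch below checks the special case {\rm Li}_1(z) = -\ln(1-z) and, as an additional well-known value not stated above, {\rm Li}_2(1/2) = \pi^2/12 - (\ln 2)^2/2. The truncation length is an arbitrary choice.

```python
import math

def polylog(s, z, terms=200):
    """Partial sum of Li_s(z) = sum_{k>=1} z^k / k^s, adequate for |z| well below 1."""
    return sum(z ** k / k ** s for k in range(1, terms + 1))

li1_gap = abs(polylog(1, 0.5) - (-math.log(1 - 0.5)))
li2_gap = abs(polylog(2, 0.5) - (math.pi ** 2 / 12 - math.log(2) ** 2 / 2))
```

Near |z| = 1 the series converges slowly and a different method would be needed.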

4. Generalised Hypergeometric functions

Many of the above functions can be written in terms of generalised hypergeometric functions

\displaystyle {}_pF_q(a_1,\ldots,a_p;b_1,\ldots,b_q;z) = \sum_{n=0}^\infty \frac{(a_1)_n\dots(a_p)_n}{(b_1)_n\dots(b_q)_n} \, \frac {z^n} {n!}

where (a)_n = \Gamma(a+n)/\Gamma(a) = a(a+1)(a+2)...(a+n-1) for n > 0 and (a)_0 = 1.

The special case p=q=1 is called a confluent hypergeometric function of the first kind, also written M(a;b;z).

This satisfies the differential equation (Kummer’s equation)

\displaystyle \left (z\frac{d}{dz}+a \right )w = \left (z\frac{d}{dz}+b \right )\frac{dw}{dz}.

The Bessel, Hankel, Airy, Laguerre, error, exponential and logarithmic integral functions can be expressed in terms of this.

The case p=2, q=1 is sometimes called Gauss’s hypergeometric functions, or simply hypergeometric functions. This satisfies the differential equation

\displaystyle \left (z\frac{d}{dz}+a \right ) \left (z\frac{d}{dz}+b \right )w =\left  (z\frac{d}{dz}+c \right )\frac{dw}{dz}.

The Legendre, Hermite and Chebyshev, Beta, Gamma functions can be expressed in terms of this.
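The defining series is straightforward to sum term by term, since consecutive terms differ by the factor \frac{(a_1+n)\cdots(a_p+n)}{(b_1+n)\cdots(b_q+n)}\frac{z}{n+1}. The sketch below is a naive truncated sum, usable only where the series converges quickly. It is checked against three standard identities, assumed here rather than taken from the text: {}_1F_1(a;a;z) = e^z, {}_2F_1(1,1;2;z) = -\ln(1-z)/z and {\rm erf}(x) = \frac{2x}{\sqrt{\pi}}\,{}_1F_1(\tfrac12;\tfrac32;-x^2).

```python
import math

def hyp_pfq(a_list, b_list, z, terms=200):
    """Truncated series for pFq(a_1..a_p; b_1..b_q; z)."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        num = math.prod(a + n for a in a_list)
        den = math.prod(b + n for b in b_list)
        term *= num / den * z / (n + 1)   # ratio of term n+1 to term n
    return total

exp_gap = abs(hyp_pfq([0.7], [0.7], 1.2) - math.exp(1.2))
log_gap = abs(hyp_pfq([1, 1], [2], 0.5) - (-math.log(1 - 0.5) / 0.5))
x = 0.7
erf_gap = abs(2 * x / math.sqrt(math.pi) * hyp_pfq([0.5], [1.5], -x * x) - math.erf(x))
```

A production implementation would need analytic continuation and care with cancellation; library routines are preferable outside this simple regime.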

Further reading

The Wolfram Functions Site

Wikipedia: List of mathematical functions

Wikipedia: List of special functions and eponyms

Wikipedia: List of q-analogs

Wikipedia Category: Orthogonal polynomials

Weisstein, Eric W. “Laplace’s Equation.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/LaplacesEquation.html

June 26, 2016

2016 has many factors

Filed under: mathematics — ckrao @ 11:16 am

The number 2016 has at least as many factors (36) as any positive integer below it except 1680 = 2^4\times 3\times 5\times 7 (which has 40 factors). The next time a year will have more factors is 2160 = 2^4\times 3^3\times 5, also with 40 factors.

Here are the numbers below 2160 also with 36 factors:

  • 1260 = 2^2 \times 3^2 \times 5 \times 7
  • 1440 = 2^5 \times 3^2 \times 5
  • 1800 = 2^3 \times 3^2 \times 5^2
  • 1980 = 2^2 \times 3^2 \times 5 \times 11
  • 2016 = 2^5 \times 3^2 \times 7
  • 2100 = 2^2 \times 3 \times 5^2 \times 7

The first integer with more than 40 factors is 2520 = 2^3 \times 3^2 \times 5 \times 7 (48 factors).
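These counts are easy to confirm by brute force. The following is a small Python sketch using trial division; the helper name num_divisors is just an illustrative choice.

```python
def num_divisors(n):
    """Count the positive divisors of n by trial division up to sqrt(n)."""
    count = 0
    d = 1
    while d * d <= n:
        if n % d == 0:
            # d and n // d are both divisors (the same one if d*d == n)
            count += 1 if d * d == n else 2
        d += 1
    return count

counts = {n: num_divisors(n) for n in range(1, 2521)}
print(counts[2016], counts[1680], counts[2160], counts[2520])
# All numbers below 2160 with exactly 36 divisors
print([n for n in range(1, 2160) if counts[n] == 36])
```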

References

[1] N. J. A. Sloane and Daniel Forgues, Table of n, a(n) for n = 1..100000 (first 10000 terms from N. J. A. Sloane), A000005 – OEIS.

[2] Highly composite number – Wikipedia, the free encyclopedia

March 26, 2016

Applying AM-GM in the denominator after flipping the sign

Filed under: mathematics — ckrao @ 8:44 pm

There are times when solving inequalities that one has a sum of fractions in which applying the AM-GM inequality to each denominator results in the wrong sign for the resulting expression.

For example (from [1], p18), if we wish to show that for positive real numbers x_1, x_2, \ldots, x_n with sum n

\displaystyle \sum_{i = 1}^n \frac{1}{x_i^2 + 1}\geq \frac{n}{2},

we may write x_i^2 + 1 \geq 2x_i (equivalent to (x_i-1)^2 \geq 0), but this implies \frac{1}{x_i^2 + 1} \leq \frac{1}{2x_i} and so the sign goes the wrong way.

A way around this is to write

\begin{aligned}  \frac{1}{x_i^2 + 1} &= 1 - \frac{x_i^2}{x_i^2 + 1}\\  &\geq 1 - \frac{x_i^2}{2x_i}\\  &= 1 - \frac{x_i}{2}.  \end{aligned}

Summing this over i then gives \sum_{i=1}^n \frac{1}{x_i^2 + 1} \geq n - \sum_{i=1}^n (x_i/2) = n/2 as desired.
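A quick numerical spot check of this argument can be sketched in Python. It assumes the x_i are positive (needed when dividing by 2x_i); the helper names and the small floating-point tolerance are illustrative choices, and random testing is of course no substitute for the proof.

```python
import random

def termwise_bound_holds(x):
    """Check 1/(x^2 + 1) >= 1 - x/2 for a single positive x (with a tiny tolerance)."""
    return 1 / (x * x + 1) >= 1 - x / 2 - 1e-12

def random_positive_with_sum_n(n, rng):
    """Random positive numbers rescaled so that they sum to n."""
    raw = [rng.random() + 1e-9 for _ in range(n)]
    scale = n / sum(raw)
    return [r * scale for r in raw]

rng = random.Random(0)
for _ in range(1000):
    xs = random_positive_with_sum_n(5, rng)
    assert all(termwise_bound_holds(x) for x in xs)
    assert sum(1 / (x * x + 1) for x in xs) >= 5 / 2 - 1e-12
print("no counterexample found")
```

The termwise bound holds because 1/(x^2+1) - (1 - x/2) = x(x-1)^2 / (2(x^2+1)), which is non-negative for x > 0.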

Here are a few more examples demonstrating this technique.

2. (p9 of [2]) If a,b,c are positive real numbers with a + b + c = 3, then

\dfrac{a}{1 +b^2} + \dfrac{b}{1 +c^2} + \dfrac{c}{1 +a^2} \geq \dfrac{3}{2}.

To prove this we write

\begin{aligned}  \frac{a}{1 + b^2} &= a\left(1 - \frac{b^2}{1 + b^2}\right)\\  &\geq a\left(1 - \frac{b}{2}\right) \quad \text{(using the same argument as before)}\\  &=a - \frac{ab}{2}.  \end{aligned}

Next we have 3(ab + bc + ca) \leq (a + b + c)^2 = 9 as this is equivalent to (a-b)^2 + (b-c)^2 + (c-a)^2 \geq 0. This means ab + bc + ca \leq 3. Putting everything together,

\begin{aligned}  \frac{a}{1 + b^2} + \frac{b}{1 + c^2} + \frac{c}{1 + a^2}&\geq \left( a - \frac{ab}{2} \right) + \left( b - \frac{bc}{2} \right) + \left( c - \frac{ca}{2} \right)\\  &= (a + b + c) - (ab + bc + ca)/2\\  &\geq 3 - 3/2\\  &=\frac{3}{2},  \end{aligned}

as required.

3. (based on p8 of [2]) If x_i > 0 for i= 1, 2, \ldots, n and \sum_{i = 1}^n x_i^2 = n then

\displaystyle \sum_{i=1}^n \frac{1}{x_i^3 + 2} \geq \frac{n}{3}.

By the AM-GM inequality, x_i^3 + 2 = x_i^3 + 1 + 1 \geq 3x_i, so

\begin{aligned}  \frac{1}{x_i^3 + 2} &= \frac{1}{2}\left( 1 - \frac{x_i^3}{x_i^3 + 2} \right)\\  &\geq \frac{1}{2}\left( 1 - \frac{x_i^3}{3x_i} \right)\\  &= \frac{1}{2}\left( 1 - \frac{x_i^2}{3} \right).  \end{aligned}

Summing this over i gives

\begin{aligned}  \sum_{i=1}^n \frac{1}{x_i^3 + 2} &\geq \frac{1}{2} \sum_{i=1}^n \left( 1 - \frac{x_i^2}{3} \right)\\  &= \frac{1}{2}\left( n - \frac{n}{3} \right)\\  &= \frac{n}{3}.  \end{aligned}

4. (from [3]) If x, y, z are positive, then

\dfrac {x ^ 3}{x ^ 2 + y ^ 2} + \dfrac {y ^ 3}{y ^ 2 + z ^ 2} + \dfrac {z ^ 3}{z ^ 2 + x ^ 2 } \geq \dfrac {x + y + z}{2}.

Once again, focusing on the denominator,

\begin{aligned}  \dfrac {x ^ 3}{x ^ 2 + y ^ 2} &= x\left(1 - \dfrac {y ^ 2} {x ^ 2 + y ^ 2} \right)\\  &\geq x \left(1 -\dfrac{y^2}{2xy} \right)\\  &= x-\dfrac{y}{2}.  \end{aligned}

Hence,

\begin{aligned}  \dfrac {x ^ 3}{x ^ 2 + y ^ 2} + \dfrac {y ^ 3}{y ^ 2 + z ^ 2} + \dfrac {z ^ 3}{z ^ 2 + x ^ 2 } &\geq x-\dfrac{y}{2} + y-\dfrac{z}{2} + z-\dfrac{x}{2}\\  &= \dfrac {x + y + z}{2},  \end{aligned}

as desired.

5. (from the 1991 Asian Pacific Maths Olympiad, see [4] for other solutions) Let a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_n be positive numbers with \sum_{i = 1}^n a_i = \sum_{i = 1}^n b_i. Then

\displaystyle\sum_{i=1}^n\frac{a_i^2}{a_i + b_i} \geq \frac{1}{2}\sum_{i=1}^n a_i.

Here we write

\begin{aligned}  \sum_{i=1}^n\frac{a_i^2}{a_i + b_i} &= \sum_{i=1}^n a_i \left(1 - \frac{b_i}{a_i + b_i} \right)\\  &\geq \sum_{i=1}^n a_i \left(1 - \frac{b_i}{2\sqrt{a_i b_i}} \right) \\  &= \frac{1}{2} \sum_{i=1}^n \left( 2a_i - \sqrt{a_i b_i} \right) \\  &= \frac{1}{4} \sum_{i=1}^n \left( 4a_i - 2\sqrt{a_i b_i} \right)\\  &= \frac{1}{4} \sum_{i=1}^n \left( 2a_i +a_i - 2\sqrt{a_i b_i} + b_i \right) \quad \text{(as } \sum_{i=1}^n a_i = \sum_{i=1}^n b_i\text{)}\\  &= \frac{1}{4} \sum_{i=1}^n \left( 2a_i + \left(\sqrt{a_i} - \sqrt{b_i}\right)^2 \right)\\  &\geq \frac{1}{4} \sum_{i=1}^n 2a_i\\  &= \frac{1}{2} \sum_{i=1}^n a_i,  \end{aligned}

as required.

References

[1] Zdravko Cvetkovski, Inequalities: Theorems, Techniques and Selected Problems, Springer, 2012.

[2] Wang and Kadaveru, Advanced Topics in Inequalities, available from http://www.artofproblemsolving.com/community/q1h1060665p4590952

[3] Cauchy Reverse Technique: https://translate.google.com.au/translate?hl=en&sl=ja&u=http://mathtrain.jp/crt&prev=search

[4] algebra precalculus – Prove that \frac{a_1^2}{a_1+b_1}+\cdots+\frac{a_n^2}{a_n+b_n} \geq \frac{1}{2}(a_1+\cdots+a_n). – Mathematics StackExchange

February 27, 2016

Cutting a triangle in half

Filed under: mathematics — ckrao @ 9:40 pm

Here is a cute triangle result that I’m surprised I had not known previously. If we are given a point on one of the sides of a triangle, how do we find a line through the triangle that cuts its area in half?

Clearly if that point is either a midpoint or one of the vertices, the answer is a median of the triangle. A median cuts a triangle in half since the two pieces have equal bases and the same height.

[Figure: median]

So what if the point is not a midpoint or a vertex? Referring to the diagram below, if P is our desired point closer to A than B, the end point Q of the area-bisecting segment would need to be on side BC so that area(BPQ) = area(ABC)/2.

[Figure: area bisector]

In other words, with D the midpoint of AB, we require area(BPQ) = area(BDC), or, subtracting the area of triangle BDQ from both sides,

\displaystyle |DPQ| = |DCQ|.

Since these two triangles share the common base DQ, this tells us that we require them to have the same height. In other words, we require CP to be parallel to DQ. This tells us how to construct the point Q given P on AB:

  1. Construct the midpoint D of AB.
  2. Draw DQ parallel to PC, with Q on BC.

See [1] for an animation of this construction.
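The construction can also be checked with coordinates. The Python sketch below is an illustration under stated assumptions: the parametrisation P = A + t(B - A) with 0 <= t <= 1/2 and the function names are mine. Because DQ is parallel to PC, triangles BDQ and BPC are similar, so BQ/BC = BD/BP.

```python
def area(p, q, r):
    """Unsigned area of triangle pqr via the cross product."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2

def bisecting_endpoint(A, B, C, t):
    """For P = A + t(B - A), 0 <= t <= 1/2, return (P, Q) with Q on BC and DQ parallel to PC."""
    P = (A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1]))
    # D is the midpoint of AB, so BD/BP = (1/2) / (1 - t)
    ratio = 0.5 / (1 - t)
    Q = (B[0] + ratio * (C[0] - B[0]), B[1] + ratio * (C[1] - B[1]))
    return P, Q

A, B, C = (0.0, 0.0), (6.0, 0.0), (2.0, 5.0)
P, Q = bisecting_endpoint(A, B, C, 0.2)
print(area(B, P, Q), area(A, B, C) / 2)
```

With t = 0 the point P is the vertex A and Q becomes the midpoint of BC, recovering the median.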

It turns out that the area-bisecting lines are tangent to three hyperbolas and enclose a deltoid whose area is (3/4)\ln(2) - 1/2 \approx 0.01986 times that of the original triangle. [2,3,4]

References

[1] Jaime Rangel-Mondragon, “Bisecting a Triangle” http://demonstrations.wolfram.com/BisectingATriangle/ from the Wolfram Demonstrations Project Published: July 10, 2013

[2] Ismail Hammoudeh, “Triangle Area Bisectors”  http://demonstrations.wolfram.com/TriangleAreaBisectors/ from the Wolfram Demonstrations Project

[4] Henry Bottomley, Area bisectors of a triangle, January 2002

January 30, 2016

Patterns early in the digits of pi

Filed under: mathematics — ckrao @ 8:59 pm

Recently when taking a look at the early decimal digits of \pi I made the following observations:

 3.141592653589793238462643383279502884197169399375105820974944592307816406286…

  • The first run of seven distinct digits (8327950) appears in the 26th to 32nd decimal place. Curiously a later such run (5923078) in decimal places 61 to 67 contains the same seven digits. (There is also a run of seven distinct digits in places 51 to 57 with 5820974.)
  • Decimal digits 60 to 69 are distinct (i.e. all ten digits are represented once in this streak). The same is true for digits 61 to 70 as both digits 60 and 70 are ‘4’.
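These observations can be reproduced from the digits quoted above. Here is a minimal Python sketch; the constant PI_DECIMALS is simply the 75 decimal digits shown in this post, and the function name distinct_runs is an illustrative choice.

```python
# The 75 decimal digits of pi quoted above (after the "3.")
PI_DECIMALS = "141592653589793238462643383279502884197169399375105820974944592307816406286"

def distinct_runs(digits, length):
    """1-based starting places of windows of `length` digits that are all different."""
    return [
        k + 1
        for k in range(len(digits) - length + 1)
        if len(set(digits[k:k + length])) == length
    ]

seven = distinct_runs(PI_DECIMALS, 7)
ten = distinct_runs(PI_DECIMALS, 10)
print(seven)
print(ten)
# Do places 26-32 and 61-67 use the same seven digits?
print(sorted(PI_DECIMALS[25:32]) == sorted(PI_DECIMALS[60:67]))
```

Note that every window of ten distinct digits automatically contains several overlapping windows of seven distinct digits, so the runs starting around place 60 come in a cluster.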

Assume the digits of \pi are generated independently from a uniform distribution. Firstly, how often would we expect to see a run of 7 distinct digits? Places k to k+6 are distinct with probability

\displaystyle \frac{9}{10} \times \frac{8}{10} \times \frac{7}{10} \times \frac{6}{10} \times \frac{5}{10} \times \frac{4}{10} = \frac{9!}{3!\cdot 10^6} = \frac{189}{3125}.

Hence we expect runs of 7 distinct digits to appear 3125/189 \approx 16.5 places apart. This includes the possibility of runs such as 12345678 which contain two runs of 7 distinct digits that are only 1 place apart.

How often would we expect to see the same 7 digits appearing in a run as we did in places 26-32 and 61-67? Furthermore let’s assume the two runs have no overlap, so we discount possibilities such as 12345678 which have a six-digit overlap. We expect a given sequence (e.g. 1234567, in that order) to appear 1/10^7 of the time. There are 7! = 5040 permutations of such a sequence, but of these 1 has overlap 6 (2345671), 2! has overlap 5 (3456712 or 3456721), 3! has overlap 4, …, 6! has overlap 1 with the original sequence. This leaves us with 5040 - (1 + 2 + 6 + 24 + 120 + 720) = 4167 possible choices of the next run to have the same 7 digits but non-overlapping (or to appear in precisely the same order – overlap 7). Hence we expect the same 7 digits to recur (with no overlap with the original run) roughly every 10^7/4167 \approx 2400 places, so what we saw in the first 100 places was remarkable.

Now let us turn to runs of all ten distinct digits. Repeating the argument above, such runs occur every 10^{10}/10! \approx 2756 places. According to [1] the next time we see ten distinct digits is in decimal places 5470 to 5479.

To answer the question of when we would expect to see the first occurrence of ten distinct digits, we adopt an argument from renewal-reward theory based on Sec 7.9.2 of [2] (there also exist approaches based on setting up recurrence relations, or martingale theory (modelling a fair casino), see [3]-[4]). Firstly we let T be the first time we get 10 consecutive distinct values – we wish to find E[T] where E denotes the expected value operator. Note that this will be more than the 2756 answer we obtained above since we make no assumption of starting with a run of ten distinct digits – there is no way T could be 1 for example, but we could have two runs of ten distinct digits that are 1 apart.

From a sequence of digits we first define a renewal process in which after we get 10 consecutive distinct values (at time T) we start over and wait for the next run of 10 consecutive distinct values without using any of the values  up to time T. Such a process will then have an expected length of cycle of E[T].

Next, suppose we earn a reward of $1 every time the last 10 digits of the sequence are distinct (so we would have obtained $1 at each of decimal places 69 and 70 in the \pi example). By an important result in renewal-reward theory, the long run average reward is equal to the expected reward in a cycle divided by the expected length of a cycle.

In a cycle we will obtain

  • $1 at the end
  • $1 at time 1 in the cycle with probability 1/10 (if that digit is the same as the one ten places before it)
  • $1 at time 2 in the cycle with probability 2!/10^2 = 2/100 (if the last two digits match, in some order, the two digits ten places before them)
  • …
  • $1 at time 9 in the cycle with probability 9!/10^9 (if the last nine digits match, in some order, the nine digits ten places before them)

Hence the expected reward in a cycle is given by

\displaystyle 1 + \sum_{i=1}^9 \frac{i!}{10^i} = \sum_{i=0}^9 \frac{i!}{10^i}.

We have already seen that the long run average reward is 10!/10^{10} at each decimal place. Hence the expected length of a cycle E[T] (i.e. the expected number of digits before we see the first run of ten consecutive distinct digits) is given by

\frac{10^{10}}{10!}\sum_{i=0}^9 \frac{i!}{10^i} \approx 3118.
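The arithmetic can be reproduced exactly with rational numbers. This is a small Python sketch of the renewal-reward calculation above; the variable names are illustrative.

```python
from fractions import Fraction
from math import factorial

# Long-run fraction of places that end a run of ten distinct digits
rate = Fraction(factorial(10), 10**10)

# Expected reward per cycle: sum of i!/10^i for i = 0..9
reward_per_cycle = sum(Fraction(factorial(i), 10**i) for i in range(10))

# Renewal-reward: rate = (reward per cycle) / E[T]
expected_wait = reward_per_cycle / rate
print(float(1 / rate))       # mean spacing between rewarded places
print(float(expected_wait))  # E[T]
```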

Hence it is pretty cool that we see it so early in the decimal digits of \pi. 🙂

References

[1] A258157 – Online Encyclopedia of Integer Sequences

[2] S. Ross, Introduction to Probability Models, Academic Press, 2014.

[3] Combinatorics Problem on Expected Value – Problem Solving: Dice Rolls – Daniel Wang | Brilliant

[4] A Collection of Dice Problems – madandmoonly.com

December 23, 2015

Areas of sections of a triangle from distances to its sides

Filed under: mathematics — ckrao @ 12:35 pm

If a point P in the interior of triangle ABC is at distances x, y and z from the sides BC, CA and AB respectively, with X, Y and Z the corresponding feet of the perpendiculars from P, what is the ratio of the area of quadrilateral BXPZ to that of ABC?

[Figure: distances to sides]

One way of determining this is to draw parallels to the sides of the triangle through P. Let X_1 and X_2 be where these parallels meet side BC as shown below.

[Figure: distances to sides, construction]

Let the sides of the triangle have lengths a, b, c with corresponding altitudes h_a, h_b, h_c.

Then as \triangle PX_1 X_2 and \triangle ACB are similar,

\begin{aligned}|PX_1X| &= |PX_1X_2| \frac{X_1X}{X_1X_2}\\ &= |PX_1X_2| \frac{b\cos C}{a}\\ &= |PX_1X_2| \frac{b (a^2 + b^2 - c^2)}{2a^2b} \quad \text{ (cosine rule)}\\ &= \left(\frac{x}{h_a} \right)^2|ABC|\frac{(a^2 + b^2 - c^2)}{2a^2}\\ &= |ABC|\left(\frac{ax}{ax+by+cz}\right)^2\frac{(a^2 + b^2 - c^2)}{2a^2}\\&= \frac{|ABC|x^2(a^2 + b^2 - c^2)}{2(ax+by+cz)^2},\quad\quad (1) \end{aligned}

where the second last line follows from twice the area of ABC being ah_a = ax + by + cz.

Similarly,

\displaystyle |PY_1Z| = \frac{|ABC|z^2(b^2 + c^2 - a^2)}{2(ax+by+cz)^2}.\quad \quad (2)

Finally,

\begin{aligned}|X_1Y_1B| &= \left(\frac{h_b-y}{h_b}\right)^2|ABC|\\ &= \left(1-\frac{by}{bh_b}\right)^2 |ABC|\\ &= \left(1-\frac{by}{2|ABC|}\right)^2|ABC|\\ &= \left(1-\frac{by}{ax+by+cz}\right)^2|ABC|\\ &= \left(\frac{ax +cz}{ax+by+cz}\right)^2|ABC|. \quad\quad(3)\end{aligned}

Combining (1), (2) and (3), we obtain our desired answer as

\begin{aligned} \frac{|BXPZ|}{|ABC|} &= \frac{|X_1Y_1B|-|PX_1X|-|PY_1Z|}{|ABC|}\\&= \left(\frac{ax +cz}{ax+by+cz}\right)^2-\frac{x^2(a^2 + b^2 - c^2)}{2(ax+by+cz)^2}-\frac{z^2(b^2 + c^2 - a^2)}{2(ax+by+cz)^2}\\&=\frac{(ax+cz)^2 - x^2(a^2 +b^2-c^2)/2 - z^2(b^2+c^2-a^2)/2}{(ax+by+cz)^2}\\ &= \frac{2axcz + x^2(a^2 - b^2 + c^2)/2 + z^2(c^2 +a^2-b^2)/2}{(ax+by+cz)^2}\\&= \frac{4axcz + (x^2 + z^2)(a^2 - b^2 + c^2)}{2(ax+by+cz)^2}.\quad\quad(4)\end{aligned}

Similar formulas can be found for quadrilaterals XPYC and YPZA by permuting variables. Note that if P is outside the triangle or if the triangle is obtuse-angled, care must be taken in the signs of the areas (the quadrilaterals may not be convex) and variables x, y, z.

Note that (4) may also be written as

\displaystyle \frac{|BXPZ|}{|ABC|} = \frac{ac(2xz + (x^2 + z^2)\cos B)}{(ax+by+cz)^2}.\quad\quad(5)
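Formula (4) can be sanity-checked numerically against a direct area computation. The Python sketch below uses one hypothetical acute triangle and interior point (my own choice of coordinates) and compares the shoelace area of BXPZ with the formula; it is a spot check, not a proof.

```python
from math import dist

def foot(p, u, v):
    """Foot of the perpendicular from p to the line through u and v."""
    dx, dy = v[0] - u[0], v[1] - u[1]
    t = ((p[0] - u[0]) * dx + (p[1] - u[1]) * dy) / (dx * dx + dy * dy)
    return (u[0] + t * dx, u[1] + t * dy)

def polygon_area(points):
    """Unsigned shoelace area of a simple polygon given as a list of vertices."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

A, B, C = (2.0, 4.0), (0.0, 0.0), (5.0, 0.0)  # an acute triangle (assumed example)
P = (2.5, 1.5)                                # an interior point (assumed example)

X, Z = foot(P, B, C), foot(P, A, B)
a, b, c = dist(B, C), dist(C, A), dist(A, B)
x, y, z = dist(P, X), dist(P, foot(P, C, A)), dist(P, Z)

direct = polygon_area([B, X, P, Z]) / polygon_area([A, B, C])
formula = (4*a*x*c*z + (x*x + z*z) * (a*a - b*b + c*c)) / (2 * (a*x + b*y + c*z) ** 2)
print(direct, formula)
```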

Special cases

1) If \triangle ABC is equilateral, a=b=c and from (4) we obtain

\begin{aligned} \frac{|BXPZ|}{|ABC|} &= \frac{4axcz + (x^2 + z^2)(a^2 - b^2 + c^2)}{2(ax+by+cz)^2}\\ &= \frac{4a^2xz + (x^2 + z^2)(a^2)}{2a^2(x+y+z)^2}\\ &= \frac{4xz + x^2 + z^2}{2(x+y+z)^2}.\quad\quad(6)\end{aligned}

2) If P is at the incentre of \triangle ABC, then x = y = z = r (the inradius) and from (4) we have

\begin{aligned} \frac{|BXPZ|}{|ABC|} &= \frac{4axcz + (x^2 + z^2)(a^2 - b^2 + c^2)}{2(ax+by+cz)^2}\\ &= \frac{4acr^2 + 2r^2(a^2 - b^2 + c^2)}{2r^2(a+b+c)^2}\\ &= \frac{(a+c)^2 - b^2}{(a+b+c)^2}\\ &= \frac{a+c-b}{a+b+c}.\quad\quad(7)\end{aligned}

3) If \triangle ABC is right-angled at B, then quadrilateral BXPZ is a rectangle with area xz and \triangle ABC has area ac/2, so that ax + by + cz = ac. From (5) with \cos B = 0,

\begin{aligned} \frac{|BXPZ|}{|ABC|} &= \frac{2acxz}{(ax+by+cz)^2}\\ &= \frac{2acxz}{(ac)^2}\\ &= \frac{2xz}{ac},\quad \quad (8)\end{aligned}

as expected.

4) If a=c and x=z (symmetric isosceles triangle case) then from (4),

\begin{aligned} \frac{|BXPZ|}{|ABC|} &= \frac{4axcz + (x^2 + z^2)(a^2 - b^2 + c^2)}{2(ax+by+cz)^2}\\ &= \frac{4a^2x^2 + 2x^2(2a^2-b^2)}{2(2ax+by)^2}\\ &= \frac{x^2(4a^2 -b^2)}{(2ax+by)^2}.\quad\quad(9)\end{aligned}

November 25, 2015

An identity based on three numbers summing to zero

Filed under: mathematics — ckrao @ 11:06 am

Here is a nice identity which according to [1] appeared in a 1957 Chinese mathematics competition.

If x + y + z = 0 then

\displaystyle \left(\frac{x^2 + y^2 + z^2}{2} \right)\left(\frac{x^5 + y^5 + z^5}{5} \right) = \left(\frac{x^7 + y^7 + z^7}{7} \right).\quad \quad (1)

An elegant proof of this avoids any lengthy expansions. Let x, y and z be roots of the cubic polynomial

\begin{aligned} (X-x)(X-y)(X-z) &= X^3 - (x+y+z)X^2 + (xy + yz + zx)X - xyz\\ &:= X^3 + aX + b.\quad \quad (2)\end{aligned}

Then

\begin{aligned} x^2 + y^2 + z^2 &= (x+y+z)^2 - 2(xy + yz + xz)\\ &= 0 -2a\\ &= -2a\quad\quad (3)\end{aligned}

and summing the relation X^3 = -aX - b for each of X=x, X=y and X=z, gives

\begin{aligned}x^3 + y^3 + z^3 &= -a(x + y + z) - 3b\\ &= -3b.\quad\quad (4)\end{aligned}

In a similar manner, X^4 = -aX^2 - bX and so

\begin{aligned} x^4 + y^4 + z^4 &= -a(x^2 + y^2 + z^2) - b(x + y + z)\\ &= -a(-2a)\\ &= 2a^2.\quad \quad (5)\end{aligned}

Next, X^5 = -aX^3 - bX^2 and so

\begin{aligned} x^5 + y^5 + z^5 &= -a(x^3 + y^3 + z^3) - b(x^2 + y^2 + z^2)\\ &= -a(-3b) -b(-2a)\\ &= 5ab.\quad \quad (6)\end{aligned}

Finally, X^7 = -aX^5 - bX^4 and so

\begin{aligned} x^7 + y^7 + z^7 &= -a(x^5 + y^5 + z^5) - b(x^4 + y^4 + z^4)\\ &= -a(5ab) -b(2a^2)\\ &= -7a^2b.\quad \quad (7)\end{aligned}

We then combine (3), (6) and (7) to obtain (1). It seems that x^n + y^n + z^n for higher values of n are more complicated expressions in a and b, so we don’t get as pretty a relation elsewhere.
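Identity (1) and the intermediate power sums (3), (6) and (7) can be confirmed with exact integer arithmetic. The Python sketch below loops over a small grid of integers with z = -x - y; the grid size and helper name are arbitrary choices for illustration.

```python
def power_sum(values, n):
    """Sum of the n-th powers of the given values."""
    return sum(v ** n for v in values)

for x in range(-6, 7):
    for y in range(-6, 7):
        z = -x - y                      # forces x + y + z = 0
        triple = (x, y, z)
        p2, p5, p7 = (power_sum(triple, n) for n in (2, 5, 7))
        # (p2/2)(p5/5) = p7/7, cleared of denominators
        assert 7 * p2 * p5 == 10 * p7
        # Coefficients of the cubic X^3 + aX + b with roots x, y, z
        a = x * y + y * z + z * x
        b = -x * y * z
        assert (p2, p5, p7) == (-2 * a, 5 * a * b, -7 * a * a * b)
print("identity verified on the grid")
```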

Reference

[1] Răzvan Gelca and Titu Andreescu, Putnam and Beyond, Springer, 2007.

October 28, 2015

The product of distances to a point from vertices of a regular polygon

Filed under: mathematics — ckrao @ 11:12 am

Here is a cool trigonometric identity I recently encountered:

\displaystyle \prod_{k=1}^n \sin \frac{(2k-1)\pi}{4n} = \prod_{k=1}^n \cos \frac{(2k-1)\pi}{4n} = \frac{\sqrt{2}}{2^n}.

For example, for n = 9:

\displaystyle \sin 5^{\circ} \sin 15^{\circ} \sin 25^{\circ} \ldots \sin 85^{\circ} = \frac{\sqrt{2}}{2^9}.
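Before looking at why this holds, the identity is easy to test numerically. Here is a brief Python sketch (math.prod needs Python 3.8 or later); the function names are illustrative.

```python
from math import sin, cos, pi, sqrt, prod

def sine_product(n):
    """Product of sin((2k-1)pi/(4n)) for k = 1..n."""
    return prod(sin((2 * k - 1) * pi / (4 * n)) for k in range(1, n + 1))

def cosine_product(n):
    """Product of cos((2k-1)pi/(4n)) for k = 1..n."""
    return prod(cos((2 * k - 1) * pi / (4 * n)) for k in range(1, n + 1))

for n in (1, 2, 9, 20):
    print(n, sine_product(n), cosine_product(n), sqrt(2) / 2 ** n)
```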

After thinking about it for some time I realised that the terms on the left side can each be seen as half the lengths of chords of a unit circle with 4n evenly spaced points that can then be rearranged to be distances from a point on the unit circle to half of the points of a regular (2n)-gon, as shown in the figure below.

[Figure: equal chords]

With the insight of this figure we then write

\begin{aligned} \prod_{k=1}^n \sin \frac{(2k-1)\pi}{4n} &= \left(\prod_{k=1}^{2n} \left| \sin \frac{(2k-1)\pi}{4n} \right|\right)^{1/2} \quad \text{ (all terms are positive)}\\ &= \prod_{k=1}^{2n} \left| \frac{\exp(i\frac{(2k-1)\pi}{4n}) - \exp(-i\frac{(2k-1)\pi}{4n})}{2i} \right|^{1/2} \\ &= \frac{1}{2^n} \prod_{k=1}^{2n} \left| \exp\left(-i \frac{(2k+1)\pi}{4n}\right) \left(\exp\left(i\frac{4k\pi}{4n}\right) - \exp\left(i\frac{2\pi}{4n}\right)\right) \right|^{1/2}\\ &= \frac{1}{2^n} \prod_{k=1}^{2n} \left| \exp\left(-i \frac{(2k+1)\pi}{4n}\right) \right|^{1/2} \left| \left(\exp\left(i\frac{4k\pi}{4n}\right) - \exp\left(i\frac{2\pi}{4n}\right)\right) \right|^{1/2}\\ &= \frac{1}{2^n} \prod_{k=1}^{2n} \left| \left(\exp\left(i\frac{2k\pi}{2n}\right) - \exp\left(i\frac{\pi}{2n}\right)\right) \right|^{1/2}\\ &= \frac{1}{2^n} \left|\prod_{k=1}^{2n} \left(z-\exp\left(i\frac{2k\pi}{2n}\right)\right) \right|^{1/2} \quad \text{where }z = \exp\left(i\frac{\pi}{2n}\right)\\ &= \frac{1}{2^n} |(z^{2n}-1)|^{1/2}\\ &= \frac{1}{2^n} |-1-1|^{1/2}\\ &= \frac{\sqrt{2}}{2^n}. \end{aligned}

(The cosine formula can be derived in a similar manner.)

In general, the product of the distances of any point z in the complex plane to the n-th roots of unity \omega_n^k (k = 0, 1, \ldots, n-1, where \omega_n = \exp(2\pi i/n)) is

\displaystyle \prod_{k=0}^{n-1} |z-\omega_n^k| = |z^n - 1|.

The above case was where z^n = -1. Two more cases are illustrated below, this time for n = 10. In the left example the product of distances is

\displaystyle \prod_{k=0}^9 |(1+i)-\exp(2\pi i k/10)| = |(1+i)^{10}-1| = 5\sqrt{41}

while for the right example it is

\displaystyle \prod_{k=0}^9 |1/2-\exp(2\pi i k/10)| = |(1/2)^{10}-1| = 1023/1024.

[Figure: equal chords, two further examples]
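Both examples can be checked directly with complex arithmetic. The following Python sketch multiplies the ten distances explicitly; the function name distance_product is an illustrative choice.

```python
import cmath
from math import prod, sqrt

def distance_product(z, n):
    """Product of |z - w| over the n-th roots of unity w."""
    return prod(abs(z - cmath.exp(2j * cmath.pi * k / n)) for k in range(n))

print(distance_product(1 + 1j, 10), 5 * sqrt(41))
print(distance_product(0.5, 10), 1023 / 1024)
```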

Note that earlier in the year I posted on the distances to a line from vertices of a regular polygon.
