Chaitanya's Random Pages

January 30, 2011

Two interesting proofs of the Cauchy-Schwarz inequality (real case)

Filed under: mathematics — ckrao @ 5:09 am

The Cauchy-Schwarz inequality is one of the most celebrated mathematical results, known for its wide range of applications, as it holds in general inner product spaces:

\displaystyle \langle x, y \rangle \leq \left\|x\right\| \left\|y\right\|

In \mathbb{R}^n with the usual dot product, it takes the familiar form

\displaystyle \left(\sum_{i=1}^n x_i y_i\right)^2 \leq \left(\sum_{i=1}^n x_i^2\right)\left(\sum_{i=1}^n y_i^2\right)

Most notably, it is used in proving the triangle inequality for an inner product space (e.g. Euclidean space with the dot product), where the norm (e.g. length) is derived from the inner product.

In the real case, the most common proofs I have seen are either by looking at the discriminant of a non-negative quadratic form, or from Lagrange’s identity. Here are two more interesting proofs from [1] and [2] respectively.

1. (finite discrete case) We start with the following elementary inequality (assuming that x and y are positive):

\displaystyle \frac{(a+b)^2}{x+y} \leq \frac{a^2}{x} + \frac{b^2}{y}.

This can be shown to be equivalent to (ay-bx)^2 \geq 0, after multiplying both sides by xy(x+y) and rearranging.

By an easy induction (applying the two-variable inequality repeatedly) this can be extended to

\displaystyle \frac{(a_1 +a_2 + \ldots + a_n)^2}{b_1+b_2 + \ldots + b_n} \leq \frac{a_1^2}{b_1} + \frac{a_2^2}{b_2} + \ldots + \frac{a_n^2}{b_n}.

Now let a_i = x_i y_i and b_i = y_i^2 for i= 1, 2, \ldots, n (assuming each y_i is non-zero; terms with y_i = 0 can simply be dropped). This magically leads to

\displaystyle \left(\sum_{i=1}^n x_i y_i\right)^2 \leq \left(\sum_{i=1}^n x_i^2\right)\left(\sum_{i=1}^n y_i^2\right),

as required.
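As a sanity check, the two-step route above (the n-term form of the elementary inequality, then the substitution a_i = x_i y_i, b_i = y_i^2) can be verified numerically. A minimal Python sketch over random vectors:

```python
import random

def engel_lhs_rhs(a, b):
    """Return both sides of (sum a_i)^2 / (sum b_i) <= sum a_i^2 / b_i (b_i > 0)."""
    lhs = sum(a) ** 2 / sum(b)
    rhs = sum(ai ** 2 / bi for ai, bi in zip(a, b))
    return lhs, rhs

random.seed(0)
for _ in range(1000):
    n = random.randint(2, 8)
    x = [random.uniform(-5, 5) for _ in range(n)]
    y = [random.uniform(0.1, 5) for _ in range(n)]  # keep each y_i non-zero

    # the substitution from the post: a_i = x_i y_i, b_i = y_i^2
    lhs, rhs = engel_lhs_rhs([xi * yi for xi, yi in zip(x, y)],
                             [yi ** 2 for yi in y])
    assert lhs <= rhs + 1e-9

    # ...which is exactly Cauchy-Schwarz for x and y
    cs_lhs = sum(xi * yi for xi, yi in zip(x, y)) ** 2
    cs_rhs = sum(xi ** 2 for xi in x) * sum(yi ** 2 for yi in y)
    assert cs_lhs <= cs_rhs + 1e-9
```

The left side of the n-term inequality becomes (\sum x_i y_i)^2 / \sum y_i^2 and the right side collapses to \sum x_i^2, which is why the substitution works so neatly.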

2. (for any real inner product space) We simply use the non-negativity of the squared length of the difference of two normalised non-zero vectors x and y:

\displaystyle 0 \leq \left\|\frac{x}{||x||}- \frac{y}{||y||}\right\|^2 = 2 - 2\frac{\langle x, y \rangle}{\left\|x\right\| \left\|y\right\|}.

From this we obtain \displaystyle \langle x, y \rangle \leq \left\|x\right\| \left\|y\right\| as desired.


[1] “Cauchy-Schwarz Inequality: Yet Another Proof”, available at

[2] J. M. Steele, The Cauchy-Schwarz Master Class: An Introduction to the Art of Mathematical Inequalities, MAA Problem Books Series, 2004.

January 24, 2011

Long flights and fun with near-antipodes

Filed under: geography — ckrao @ 11:49 pm

I recently heard about the existence of a flight between Los Angeles and Dubai, which prompted me to look up the longest flights in terms of distance.

The Los Angeles-Dubai route is the fourth longest (13,420 km in 16.5 hrs). At number one is the 18-hour, 50-minute beast from Newark (New Jersey) to Singapore, at 15,345 km (9,535 mi), or 38% of the way around the globe! You might be wondering whether this flies via Europe or the Pacific Ocean, and the answer is… neither! As seen below, it goes north of Greenland, across the Arctic to within 150 km of the north pole, then south through Russia, Mongolia, China and South East Asia.

Shortest path between Newark and Singapore

It should be pointed out that the flight is not always so polar, since it often takes advantage of favourable winds over the Atlantic. This unexpected path (unexpected to those of us used to seeing flat 2-D maps) made me check out a few more counter-intuitive paths around the globe, courtesy of Wolfram Alpha.

Los Angeles to Dubai is another one that flies north of Greenland and then over the boundary between Europe and Asia.

Los Angeles to Dubai

From Melbourne to Dakar, Senegal, the shortest path is to go way south first (to more than 60 degrees south latitude) and the path is almost entirely oceanic.

Melbourne to Dakar

To go from Melbourne to Recife in northeastern Brazil, you would pretty much traverse the south pole.

Melbourne to Recife

Finally, let us look at a few near-antipodal paths. Antipodes are points on the globe that are diametrically opposite. It is intriguing that less than 4% of the earth’s land is antipodal to land, with Australia antipodal to the Atlantic, North America antipodal to the Indian Ocean, and Africa and most of Eurasia antipodal to the Pacific (see this image). There are infinitely many shortest paths between antipodes (I am assuming the world is spherical now), and slightly perturbing one or both of these endpoints can change the shortest path drastically.

As an example, we take the paths between Buenos Aires and near-antipodal points of east Asia: Shanghai, Seoul and Beijing (distances from Buenos Aires between 19250 and 19650km). The true antipode is somewhere in the ocean within the triangle formed by this trio of big cities.

Buenos Aires to Shanghai
Buenos Aires to Seoul
Buenos Aires to Beijing

As you can see, three very different directions!
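The great-circle figures in this post can be reproduced with the standard haversine formula under the same spherical-earth assumption. A short Python sketch; the airport coordinates below are my own approximate values, not from the original:

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical-earth assumption

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def antipode(lat, lon):
    """Diametrically opposite point: negate latitude, shift longitude by 180."""
    return -lat, (lon - 180.0) if lon > 0 else (lon + 180.0)

# Approximate coordinates (assumed for illustration)
newark = (40.69, -74.17)
singapore = (1.36, 103.99)

d = great_circle_km(*newark, *singapore)
# The post quotes 15,345 km for Newark-Singapore; the spherical estimate lands close by.
assert 15000 < d < 15700
```

Applying `antipode` to Buenos Aires (about 34.6°S, 58.4°W) gives roughly 34.6°N, 121.6°E, in the sea between the Shanghai-Seoul-Beijing trio, consistent with the pictures above.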

January 15, 2011

The parabola as an analogue multiplier

Filed under: mathematics — ckrao @ 2:10 am

Suppose you had the parabola y = x^2 at your disposal. Here is a neat way of using it to multiply two positive numbers a and b. I saw this at the Wild About Math blog.

Locate the points (-a,a^2) and (b,b^2) on the parabola y = x^2. Then the line joining these two points has y-intercept ab. 🙂

Why is this true? Try to figure it out for yourself before reading on.

One way of seeing it is to observe that the y-intercept is a weighted average of the heights a^2 and b^2. Since the points’ distances from the y-axis are a and b, the weights are in the ratio b:a, and since they sum to 1, they must be b/(a+b) and a/(a+b). Hence the y-intercept is

\displaystyle a^2\cdot\frac{b}{a+b} + b^2\cdot\frac{a}{a+b} = \frac {ab(a+b)}{a+b} = ab.
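The construction is easy to check in code; a minimal sketch (the function name is mine):

```python
def parabola_product(a, b):
    """Multiply positive a and b via the chord of y = x^2 through
    (-a, a^2) and (b, b^2); the chord's y-intercept is a*b."""
    x1, y1 = -a, a * a
    x2, y2 = b, b * b
    slope = (y2 - y1) / (x2 - x1)  # simplifies to b - a
    return y1 - slope * x1         # y-intercept of the chord

assert abs(parabola_product(3, 7) - 21) < 1e-9
assert abs(parabola_product(2.5, 4.0) - 10.0) < 1e-9
```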

January 8, 2011

Recap of 2010-11 Ashes and SA vs Ind series

Filed under: cricket,sport — ckrao @ 9:30 am

Since my previous post about the Ashes, Australia bounced back to win the third test in Perth, but then it was all one-way traffic as England dominated the final two tests, winning by an innings in each, to secure their first away Ashes series win since the 1986-7 season. Overall Australia were outplayed in all departments. They did not have nearly the same bowling penetration, thanks to England's mastery of swinging the ball, and the Aussies were mostly unable to grind out long innings with the bat, unlike the English.

The only time the Australians were on top was in Perth, and there it was largely thanks to Mitchell Johnson, brought back into the side. Firstly he top-scored in the Australian first innings as they recovered from 4/36 to reach 268. At the time that did not seem like a competitive total, and things seemed worse for the home side as England cruised to 0/78 in reply. Then a remarkable spell of late-swinging bowling reduced England to 5/98 and they never recovered. They proceeded to make 187, with Johnson ending with 6 wickets, then Australia pressed home the advantage with a second innings of 309 (Hussey 116, Watson 95). Five quick wickets by Australia late on day 3 (I was there!) snuffed out any chance England had, and they made only 123 in their second dig (Harris taking 6 wickets). Australia passed 300 in a test innings for the 27th consecutive test, the last of that streak (second all-time, see here for a full list).

Then onto the Boxing Day test in Melbourne, where a first day crowd of 84,345 saw one of the most one-sided days in test cricket history (I was there!). Australia were bundled out for 98 and then England were 0/157 at stumps! It was the biggest lead a team has ever enjoyed after day 1 of a test without losing a wicket. The crowds for the next two days were also amazing (67,149 and 68,727) as England made it to 513 (Trott 168*) and Australia could only manage 258 in their second innings (Bresnan, coming in for Steve Finn, took 4 wickets). By taking a 2-1 series lead, England safely retained the Ashes.

The final test in Sydney saw Michael Clarke in charge for the first time in test cricket after Ricky Ponting’s broken finger ruled him out (Ponting had captained 73 consecutive tests dating back to 2004). Australia batted first and for a change had a good start, reaching 1/105 at one point, but none of their batsmen could go on to make a big score, and they were all out for 280 (4/66 for Anderson). England then proceeded to put on a brilliant batting display to register their record fourth score of 500+, ending up with 644 (Cook 189, Prior 118, Bell 115). With a lead of 364, Australia had no chance and could only make 281 in their second innings, again with nobody making a big score. It was as comprehensive a way to finish a series as a team could hope for.

Here are some statistical tidbits.

  • Remarkably, this was the 20th consecutive Ashes series which had a definite result (i.e. not a drawn series). The last time an Ashes series was drawn was way back in 1972.
  • Alastair Cook was man of the series with 766 runs in 7 innings @ 127.67, with 3 centuries and 2 50s. It was the second highest aggregate in a series by an Englishman against Australia (behind Wally Hammond’s 905 in 1928-9). The last time somebody scored more in a series was 1993/4 when Brian Lara scored 798 runs against England (including a then world record score of 375). Cook’s ICC cricket rating went up from 628 points and a ranking of 30th to 803 and a ranking of 5th.
  • England had 5 batsmen averaging over 50 (Cook, Trott, Pietersen, Bell, Prior), while only Hussey did so for Australia (see here). Trott also shot up the batting rankings from 16th to 4th.
  • England’s pace trio of Anderson, Tremlett and Bresnan were the best bowlers of the series with Anderson’s haul of 24 wickets (@26.04) the second best ever by an Englishman in Australia (Frank Tyson had 28 wickets in the 1954/5 series).
  • In Sydney Ian Bell scored his first century against Australia after 30 innings and 11 50+ scores.
  • It was the first time in test cricket a team had century partnerships for the 6th, 7th and 8th wickets in the same innings. England “recovered” from 5/226 to 644.
  • It was the first time that a home side had three innings defeats in a series since South Africa did in 1936.

Meanwhile, the top two ranked teams South Africa and India played a most enthralling series, ending level at 1-1. The third test saw some astonishing batting by stalwarts Sachin Tendulkar and, in particular, Jacques Kallis, who surely had his best ever test with the bat, following up on his first double century in the first test. In the first innings he negotiated difficult conditions, coming in at 2/34 and then holding the tail together for 161 out of an innings total of 362. India were at a similar 2/28 when Tendulkar came in, and he saw off two devastating spells by Dale Steyn (5/75) to reach 146 out of a team total of 364. Then Kallis, battling a side injury and team positions of 4/64 and later 6/130, scored a series-saving 109 not out in the second innings to take South Africa to the safe total of 341. India had to score 340 on the final day to win, but arguably could only have done so if Sehwag had a field day, which was unlikely against bowling of this quality. The remaining batsmen negotiated the day safely to secure a draw.

The third and fourth days provided some of the most intense competition seen in recent times.

Steyn’s dismissal of Pujara (day 3) was fast bowling at its best!

More statistical tidbits:

  • Kallis and Tendulkar are both in purple patches in their careers, despite both being aged over 35. As mentioned in this recent Cricinfo article by S Rajesh, Kallis has averaged 62.44 since January 1999 – 12 years! He has had no fewer than 4 purple patches, though one of them was a small one with 5 centuries in 4 tests. It is rare for any batsman to have even one such purple patch.

Jacques Kallis – best purple patches with the bat

Time frame           | Tests | Innings | Not out | Runs | Average | 100s | 50s
29 Mar 01 – end 02   | 18    | 31      | 10      | 1637 | 77.95   | 4    | 10
24 Oct 03 – 6 Jan 06 | 26    | 46      | 10      | 2892 | 80.33   | 12   | 12
1 Oct 07 – 20 Nov 07 | 4     | 7       | 2       | 767  | 153.40  | 5    | 1
19 Mar 09 – 6 Jan 11 | 15    | 26      | 5       | 1772 | 84.38   | 10   | 3

Tendulkar has arguably had three purple patches including the current one. Here I am simply tracking the players’ cumulative averages and looking for a noticeably large increase in career average from a local minimum to a local maximum (including additional tests when they contain centuries).

Sachin Tendulkar – best purple patches with the bat

Time frame            | Tests | Innings | Not out | Runs | Average | 100s | 50s
start 93 – 8 Jul 96   | 21    | 29      | 5       | 1826 | 76.08   | 6    | 10
17 Apr 97 – 23 Apr 02 | 41    | 70      | 7       | 4335 | 68.81   | 18   | 14
6 Nov 08 – 6 Jan 11   | 24    | 40      | 6       | 2540 | 74.71   | 12   | 8
  • Sehwag’s 11 off 40 balls in the second innings of the third test was his slowest double-digit innings by strike rate. In fact, only on one other occasion has he scored double digits with a strike rate below 50 runs per hundred balls (13 off 29 balls in Nov 2001 in Port Elizabeth) – see his innings sorted by strike rate here.
  • Steyn had his most prolific series with 21 wickets in the 3 tests, confirming his place as the premier bowler of our time. He has taken 206 wickets in his last 37 tests at an average of 21.24 and a strike rate of 37.9 balls/wicket.
  • Kallis has 22 man of the match awards in test cricket, more than any other player.
  • Tendulkar has now had four years in a row in which his first test innings of the calendar year is a century.

Finally, I produced a graph showing the cumulative run totals of the top test run-scorers. Kallis has for the first time overtaken Tendulkar in terms of career runs by a certain age. I have also included Alastair Cook in the graph to show that he is second to Tendulkar in runs at such a young age.

We see how similar Kallis, Tendulkar and Ponting are at age 35, and also how close Lara and Dravid are late in their careers due to a late surge by the former.

Here are some of their records for similar run totals to Kallis, and we see Lara achieved his runs a little quicker (in number of tests and innings) than the other three. The century count for Kallis, Ponting and Tendulkar is remarkably similar. Tendulkar was at a local minimum in his average at this point and proceeded to enjoy the 12-century purple patch noted above.

Cumulative Totals of Top 5 Test Run scorers after roughly 11950 runs

Player    | Age at start of test when reaching ~11950 runs | Tests | Innings | Not out | Runs  | Average | 100s | 50s
Lara      | 37 years, 209 days                             | 131   | 232     | 6       | 11953 | 52.89   | 34   | 48
Kallis    | 35 years, 78 days                              | 145   | 246     | 38      | 11947 | 57.44   | 40   | 54
Tendulkar | 35 years, 176 days                             | 151   | 246     | 25      | 11939 | 54.02   | 39   | 49
Ponting   | 35 years, 206 days                             | 145   | 245     | 27      | 11954 | 54.83   | 39   | 51
Dravid    | 37 years, 313 days                             | 147   | 253     | 29      | 11943 | 53.32   | 31   | 59

Kallis has a clearly higher average than the rest at the same stage and has a break from test cricket before his next surge at the records. And to think that he has 270 test wickets as well!

January 3, 2011

Solving linear recurrence relations with constant coefficients

Filed under: mathematics — ckrao @ 11:16 am

Here is a method (algorithm) to solve recurrence relations of the following form, without the use of generating functions or z-transforms.

\displaystyle x_{n+k} + c_{k-1}x_{n+k-1} + c_{k-2}x_{n+k-2} + \ldots + c_0 x_n = f(n), \medskip n = 0, 1, \ldots \quad (*)

We assume we are given k initial conditions (to enable us to solve uniquely the k’th order equation), and that f(n) is of the form

\displaystyle f(n) = \sum_{j=1}^J \alpha_j^n q_j(n), where q_j(n) is a polynomial of degree d_j.

1. Solve the equation with f(n) = 0 first (the homogeneous form of the equation).

a) Write down, then factorise the characteristic equation (which arises from setting x_n = \lambda^n).

\displaystyle \lambda^k + c_{k-1}\lambda^{k-1} + \ldots + c_1 \lambda + c_0 = (\lambda - r_1)^{s_1} (\lambda - r_2)^{s_2} \ldots (\lambda - r_m)^{s_m},

where r_1, r_2, \ldots, r_m are distinct.

b) The homogeneous solution (i.e. that corresponding to f(n) = 0) is

\displaystyle x_n^{(h)} = r_1^n p_1(n) + r_2^np_2(n) + \ldots + r_m^np_m(n),

where p_i(n) is a polynomial of degree s_i-1.

2. Incorporate the term f(n). For j = 1 to J,

  • If \alpha_j \neq r_i for all i, x_n^{(p,j)} = \alpha_j^n\left(t_{j,0} + t_{j,1} n + \ldots + t_{j,d_j} n^{d_j}\right)
  • If \alpha_j = r_i for some i, x_n^{(p,j)} = \alpha_j^n n^{s_i}\left(t_{j,0} + t_{j,1} n + \ldots + t_{j,d_j} n^{d_j}\right)

3. Find t_{j,k} for all j and k by substituting x_n^{(p,j)} into (*).

4. Combine the homogeneous and particular solutions to find the general solution x_n = x_n^{(h)} + \sum_{j=1}^J x_n^{(p,j)}.

5. Use the k initial conditions to find the coefficients of the polynomials p_1(n), p_2(n), \ldots, p_m(n) from step 1b.

Let us try this out with the following example.

Solve x_{n+2} + 2x_{n+1} - 3x_n = 4, \quad x_0 = 6, x_1 = -1.

1a) The characteristic equation is found by substituting x_n = \lambda^n and setting the right side to 0:

\displaystyle \lambda^2 + 2\lambda - 3 = (\lambda + 3)(\lambda - 1)

b) The homogeneous solution is then given by x_n^{(h)} = C_1(1)^n + C_2(-3)^n.

2. We have f(n) = 4 = 1^n \cdot 4. Since 1 is a root of the characteristic polynomial with multiplicity 1, we try a particular solution of the form x_n^{(p)} = n^1 t_0 = nt_0.

3. Substituting x_n^{(p)} into the original equation:

\displaystyle (n+2)t_0 + 2(n+1)t_0 - 3nt_0 = 4

The n terms cancel, leaving 4t_0 = 4, so t_0 = 1.

4. Combining homogeneous and particular solutions, x_n = x_n^{(h)} + x_n^{(p)} = C_1 + C_2(-3)^n + n.

5. Use the initial values to find C_1, C_2:

  • n = 0:\quad \quad x_0 = C_1 + C_2(-3)^0 + 0 = C_1 + C_2 = 6
  • n = 1:\quad \quad x_1 = C_1 + C_2(-3)^1 + 1 = C_1 - 3C_2 + 1= -1

Solving gives C_1 = 4, C_2 = 2. Hence x_n = 2\cdot(-3)^n + n + 4, and it can be verified by substitution that this indeed satisfies the recurrence with the given initial conditions.
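That verification can also be done mechanically. A small Python check that iterates the recurrence and compares against the closed form (function names are mine):

```python
def iterate_example(n_terms):
    """Iterate x_{n+2} = 4 - 2 x_{n+1} + 3 x_n from x_0 = 6, x_1 = -1."""
    x = [6, -1]
    while len(x) < n_terms:
        x.append(4 - 2 * x[-1] + 3 * x[-2])
    return x

def closed_form(n):
    # the solution derived above: x_n = 2*(-3)^n + n + 4
    return 2 * (-3) ** n + n + 4

xs = iterate_example(12)
assert all(xs[n] == closed_form(n) for n in range(12))
```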

So why does this procedure work? The remainder of this post attempts to provide some insight. It is not intended to be a complete proof, but should provide enough examples for one to be able to generalise and complete the details.

Part A. We start by considering the first order equation x_{n+1} - \beta x_n = \sum_{j = 1}^J q_j(n) \alpha_j^n. \quad (+)

1. By setting x_n = y_n \beta^n this can be converted to an equation of the form y_{n+1} - y_n = \frac{1}{\beta}\sum_{j = 1}^J q_j(n) (\alpha_j/\beta)^n, so we are reduced to solving an equation of the original form (+) with \beta = 1.

2. If we can find a solution to x_{n+1} - x_n = q_j(n) \alpha_j^n for each j, then summing the J solutions will give a solution for x_{n+1} - x_n = \sum_{j=1}^J q_j(n) \alpha_j^n. We are reduced further to solving an equation of the form x_{n+1} - x_n = q(n) \alpha^n.

3. If we let x_n = p(n)\alpha^n, x_{n+1} - x_n = q(n) \alpha^n becomes

\displaystyle \alpha^{n+1}p(n+1) - \alpha^n p(n) = q(n)\alpha^n,

which reduces to \alpha p(n+1) - p(n) = q(n). To solve this we have the following lemma which is interesting in its own right.

Lemma 1: Let q(n) be a given polynomial of degree d, and \alpha a constant. Then there exists a polynomial p(n) such that the equation \alpha p(n+1) - p(n) = q(n) is satisfied.

  • If \alpha \neq 1, p has degree d.
  • If \alpha = 1, p has degree d+1.

For example, 2p(n+1) - p(n) = n has the solution p(n) = n-2 (linear), while p(n+1) - p(n) = n has the solution p(n) = \frac{1}{2}n(n-1) (quadratic).

To prove the lemma, we find the coefficients of p(n) by solving a system of linear equations. The invertibility of the equations comes from the fact that the system is lower triangular with non-zero diagonal entries. This means back-substitution can be used to find the coefficients.

Case 1: \alpha \neq 1

Let p(n) = \sum_{k = 0}^d c_k n^k. Then

\begin{array}{lcl} p(n+1) &=& \sum_{k = 0}^d c_k (n+1)^k\\&=&\sum_{k = 0}^d c_k \sum_{l = 0}^k n^l \binom{k}{l}\\& = & \sum_{m=0}^d n^m \left[\sum_{k=m}^d\binom{k}{m}c_k\right]\\&=&\sum_{k=0}^d n^k \left[\sum_{j = k}^d \binom{j}{k} c_j\right].\end{array}

Hence \alpha p(n+1) - p(n) = \sum_{k=0}^d n^k \left[\left(\alpha\sum_{j = k}^d \binom{j}{k} c_j\right) - c_k\right].

If q(n) = q_d n^d + q_{d-1}n^{d-1} + \ldots + q_0, matching coefficients of n^k leads to the following system of equations.

\left[\begin{array}{ccccc}\alpha \binom{d}{d}-1&0&0&\cdots&0\\ \alpha \binom{d}{d-1} &\alpha \binom{d-1}{d-1} - 1 & 0 & \cdots&0\\ \alpha \binom{d}{d-2}&\alpha \binom{d-1}{d-2}&\alpha \binom{d-2}{d-2}-1&0&\vdots \\ \vdots & \vdots & & \ddots & \\ \alpha \binom{d}{0} & \alpha \binom{d-1}{0} & \cdots & \alpha \binom{1}{0} & \alpha \binom{0}{0}-1 \end{array}\right]\left[\begin{array}{c}c_d\\c_{d-1}\\c_{d-2}\\ \vdots \\ c_0\end{array}\right] = \left[\begin{array}{c} q_d\\q_{d-1}\\q_{d-2}\\ \vdots \\ q_0\end{array}\right]

For \alpha \neq 1 the diagonal entries are non-zero and hence the system of equations may be inverted to solve for the coefficients c_k of p(n).

Case 2: \alpha = 1

This time we let p(n) = \sum_{k = 1}^{d+1} c_k n^k (degree d+1) and proceed as in case 1 with \alpha set to 1, leading to the following system.

\left[\begin{array}{ccccc}\binom{d+1}{d}& 0 & 0 & \cdots & 0 \\ \binom{d+1}{d-1} & \binom{d}{d-1} & 0 & \cdots & 0\\ \binom{d+1}{d-2} & \binom{d}{d-2} & \binom{d-1}{d-2}&0&\vdots\\ \vdots & \vdots & & \ddots & 0\\ \binom{d+1}{0}&\binom{d}{0}&\cdots&\binom{2}{0}&\binom{1}{0} \end{array}\right]\left[\begin{array}{c}c_{d+1}\\c_d\\c_{d-1}\\ \vdots \\ c_1\end{array}\right] = \left[\begin{array}{c} q_d\\q_{d-1}\\q_{d-2}\\ \vdots \\ q_0 \end{array}\right]

Again the diagonal entries are non-zero, so the system may be solved by back-substitution starting with c_{d+1}. This completes the proof of Lemma 1. Note that in this \alpha = 1 case any constant term may be added to p(n) and the result still satisfies p(n+1) - p(n) = q(n).
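The back-substitution in the proof is easy to mechanise. A Python sketch of the \alpha \neq 1 case (coefficient lists are highest degree first; the function names are mine, not from the post):

```python
from math import comb

def solve_p(alpha, q):
    """Find p with alpha*p(n+1) - p(n) = q(n), for alpha != 1.
    q is [q_d, ..., q_0] (highest degree first); returns p the same way."""
    d = len(q) - 1
    c = [0.0] * (d + 1)  # c[row] holds the coefficient of n^(d-row)
    for row in range(d + 1):
        k = d - row
        # matching n^k: (alpha - 1) c_k = q_k - alpha * sum_{j > k} C(j, k) c_j
        s = sum(comb(d - i, k) * c[i] for i in range(row))  # already-known c_j
        c[row] = (q[row] - alpha * s) / (alpha - 1)
    return c

def eval_poly(c, n):
    return sum(coef * n ** (len(c) - 1 - i) for i, coef in enumerate(c))

# Example from the text: 2 p(n+1) - p(n) = n should give p(n) = n - 2.
p = solve_p(2, [1, 0])
assert all(abs(2 * eval_poly(p, n + 1) - eval_poly(p, n) - n) < 1e-9
           for n in range(10))
```

The loop is exactly the back-substitution down the lower-triangular system displayed above: each new coefficient depends only on the higher-degree ones already found.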

To tie this with the original 5 steps at the top of the post, the homogeneous solution to x_{n+1} - \beta x_n = 0 is seen to be x_n^{(h)} = \beta^nx_0 (easily linked to the characteristic equation \lambda - \beta). The two cases in step 2 are equivalent to verifying those of Lemma 1, since we applied the substitution y_n = x_n/\beta^n. Then the homogeneous and particular solutions may be added and initial conditions used to determine any remaining coefficients.

The main conclusion of Part A after following the above steps is as follows:

Proposition 1 : The solution of x_{n+1} - \beta x_n = q(n) \alpha^n (where q(n) has degree d), is of the form A \beta^n + q_1(n)\alpha^n, where q_1(n) is a polynomial whose degree is d if \alpha \neq \beta and d+1 if \alpha = \beta.

Part B: Now we seek to solve the original equation (*). This can be done by successively reducing the order of the left side to reach eventually first order, and then apply the result of Proposition 1.

Let S be the shift operator on sequences, so that S applied to the sequence \{x_0, x_1, x_2, \ldots\} yields \{x_1, x_2, x_3, \ldots\}. This is a linear operator in that S(\alpha x_n + \beta y_n) = \alpha S x_n + \beta S y_n where x_n and y_n are sequences. We can multiply S by scalars, so that for example 2S(x_n) = 2x_{n+1} = S(2x_n), indicating also that S commutes with scalar multiplication. Furthermore, we can also take powers of S, corresponding to successive application of the operator (e.g. S^2 x_n = x_{n+2}). Finally, we can form linear combinations of these powers, hence polynomials in S. Let I denote the identity operator. Then equation (*)  can be written as

\displaystyle (S^k + c_{k-1}S^{k-1} + c_{k-2}S^{k-2} + \ldots + c_0 I)x_n = f(n).

The polynomial in S can factorise, as the algebra of polynomials carries over to the shift operator (i.e. the shift operator generates a commutative ring of linear operators), so we may write something like

x_{n+2} - 5x_{n+1} + 6x_n = (S^2 - 5S + 6I)x_n = (S-2I)(S-3I)x_n = (S-2I)(x_{n+1} - 3x_n) = (S-3I)(S-2I)x_n = (S-3I)(x_{n+1} - 2x_n).

We can start to see how the roots of the characteristic polynomial in 1a) above play a role.

Suppose we wish to find the general solution to the second order equation x_{n+2} - 5x_{n+1} + 6x_n = 0. Letting y_n = x_{n+1} - 3x_n, we find from the above factorisation that (S-2I)y_n = 0. The solution to this is y_n = 2^n y_0. Then x_{n+1} - 3x_n = 2^n y_0, and from Proposition 1, this has solution x_n = 2^nC_1 + 3^n C_2.

Consider now the equation x_{n+2} - 4x_{n+1} + 4x_n = 0. This is equivalent to (S-2I)(S-2I)x_n = 0. Letting y_n = x_{n+1} - 2x_n, we find that (S-2I)y_n = 0. The solution to this is y_n = 2^n y_0. Then x_{n+1} - 2x_n = 2^n y_0, and from Proposition 1 this has solution x_n = 2^n (C_1 + C_2 n) (i.e. a different form when there are repeated roots).

This generalises to higher orders and to the case f(n) \neq 0. We factorise the operator polynomial, writing (*) as (S - r_1I)^{s_1} (S- r_2I)^{s_2} \ldots (S- r_mI)^{s_m}x_n = f(n), and make first order substitutions with the aim of reducing the order of the recurrence by one each time. Each first order equation is of the form given in Part A and hence may easily be solved by applying Proposition 1.

We will give a final example to illustrate this.

Solve (S-3I)(S-3I) (S-2I) x_n = 2^n n.

Let y_n = (S-2I)x_n, z_n = (S-3I)y_n. Then (S-3I)z_n = 2^n n, which has solution of the form z_n = 2^n (An + B) + 3^n C. Then we have

(S-3I)y_n = z_n = 2^n (An + B) + 3^nC.

This has solution of the form y_n = 2^n (A_1n + B_1) + 3^n(C_1n + D_1). Finally, (S-2I)x_n = 2^n (A_1n + B_1) + 3^n(C_1n + D_1), which by Proposition 1 has solution of the form

x_n = 2^n (A_2n^2 + B_2n + C_2) + 3^n(D_2n + E_2).

Note that C_2, D_2 and E_2 are part of the homogeneous solution meaning that they are found only from the initial values and they contribute nothing to the right side of the equation upon substitution. The coefficients A_2 and B_2 are found by substituting x_n = (A_2n^2 + B_2n)2^n into the original equation and comparing coefficients.
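As a numerical check of this final example: expanding (S-3I)^2(S-2I) gives the recurrence x_{n+3} - 8x_{n+2} + 21x_{n+1} - 18x_n = 2^n n. A Python sketch (my own, with arbitrary assumed initial values) iterates it, fits the five template constants from the first five terms, and confirms the template reproduces all later terms:

```python
def iterate_rec(x0, x1, x2, n_terms):
    """Iterate x_{n+3} = 8 x_{n+2} - 21 x_{n+1} + 18 x_n + n*2^n."""
    x = [x0, x1, x2]
    for n in range(n_terms - 3):
        x.append(8 * x[-1] - 21 * x[-2] + 18 * x[-3] + n * 2 ** n)
    return x

def basis(n):
    # template x_n = A2 n^2 2^n + B2 n 2^n + C2 2^n + D2 n 3^n + E2 3^n
    return [n * n * 2 ** n, n * 2 ** n, 2 ** n, n * 3 ** n, 3 ** n]

xs = iterate_rec(1.0, 0.0, 5.0, 20)

# Solve basis(n) . coeffs = x_n for n = 0..4 by Gauss-Jordan elimination.
A = [basis(n) + [xs[n]] for n in range(5)]
for col in range(5):
    piv = max(range(col, 5), key=lambda r: abs(A[r][col]))
    A[col], A[piv] = A[piv], A[col]
    for r in range(5):
        if r != col:
            f = A[r][col] / A[col][col]
            A[r] = [a - f * b for a, b in zip(A[r], A[col])]
coeffs = [A[i][5] / A[i][i] for i in range(5)]

# The fitted template should now reproduce every later term as well.
for n in range(5, 20):
    fit = sum(c * b for c, b in zip(coeffs, basis(n)))
    assert abs(fit - xs[n]) < 1e-6 * max(1.0, abs(xs[n]))
```

Since the general solution is exactly a span of these five basis sequences, fitting five consecutive terms pins the constants down and the template must then match the whole sequence.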

This entire explanation (i.e. Parts A and B) may make better sense after trying more examples. It is nice to have a procedure as outlined in steps 1-5, but more rewarding to see why it works.
