# Chaitanya's Random Pages

## May 27, 2012

### The distribution of X/(X+Y) is uniform when X and Y are iid exponential

Filed under: mathematics — ckrao @ 12:31 pm

Suppose $X$ and $Y$ are independently distributed exponential random variables with mean 1. This means that for $x > 0$ the density and the tail probability coincide: $f_X(x) = e^{-x} = {\rm Pr} (X >x)$. What is the distribution of $\displaystyle Z = \frac{X}{X+Y}$? Here is how one might proceed. Firstly note that $Z$ must be between 0 and 1.

For $t \in (0,1)$,

$\begin{array}{lcl} {\rm Pr}\left(\frac{X}{X+Y} \leq t\right) &=& {\rm Pr}\left(\frac{X+Y}{X} \geq \frac{1}{t}\right)\\&=& {\rm Pr}\left(Y \geq X\left(\frac{1}{t}-1\right)\right)\\ & = & \int_0^{\infty} f_X(x){\rm Pr}\left(Y \geq x\left(\frac{1}{t}-1\right)\right)\ dx \\&=& \int_0^{\infty} e^{-x} e^{-x\left(\frac{1}{t}-1\right)}\ dx\\&=&\int_0^{\infty} e^{-\frac{x}{t}}\ dx\\&=& t.\end{array}$

This shows that $\frac{X}{X+Y}$ is uniformly distributed between 0 and 1, a beautiful result!
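As a sanity check of this result (a quick Monte Carlo sketch, not part of the derivation), we can draw many pairs of iid Exp(1) variables and compare the empirical CDF of $Z = X/(X+Y)$ with the uniform CDF $t$ at a few points:

```python
import random

# Draw iid Exp(1) pairs and form Z = X / (X + Y).
random.seed(0)
n_samples = 200_000
zs = []
for _ in range(n_samples):
    x = random.expovariate(1.0)
    y = random.expovariate(1.0)
    zs.append(x / (x + y))

# If Z is Uniform(0,1), the empirical CDF at t should be close to t.
for t in (0.1, 0.25, 0.5, 0.75, 0.9):
    empirical = sum(z <= t for z in zs) / n_samples
    print(f"Pr(Z <= {t}) ~ {empirical:.3f} (uniform predicts {t})")
```

With 200,000 samples the empirical CDF agrees with $t$ to within about a hundredth at each test point.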

More generally,

if $X_1, X_2, \ldots X_n$ are iid exponential, then for $k < n$ the ratio $\displaystyle \frac{\sum_{i=1}^k X_i}{\sum_{i=1}^n X_i}$ has the same distribution as the $k$‘th smallest number out of $(n-1)$ numbers chosen independently and uniformly between 0 and 1.

This ratio follows the Beta distribution with parameters $k$ and $n-k$. To interpret it, consider a Poisson process in which the times between arrivals are exponentially distributed: the ratio is the fraction of the waiting time until the $n$'th arrival that has elapsed by the $k$'th arrival.
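Before proving this, here is a small simulation (my addition, with $n = 5$ and $k = 2$ chosen arbitrarily) comparing the ratio $\sum_{i=1}^k X_i / \sum_{i=1}^n X_i$ with the $k$'th smallest of $n-1$ iid uniforms; both should be Beta$(k, n-k)$, so their sample means should be near $k/n$ and their sample variances near $k(n-k)/(n^2(n+1))$:

```python
import random

random.seed(1)
n, k = 5, 2
trials = 100_000

ratios, order_stats = [], []
for _ in range(trials):
    # Ratio of partial sums of iid Exp(1) variables.
    xs = [random.expovariate(1.0) for _ in range(n)]
    ratios.append(sum(xs[:k]) / sum(xs))
    # k'th smallest of n-1 iid Uniform(0,1) draws.
    us = sorted(random.random() for _ in range(n - 1))
    order_stats.append(us[k - 1])

def mean(a):
    return sum(a) / len(a)

m_r, m_u = mean(ratios), mean(order_stats)
var_r = mean([(r - m_r) ** 2 for r in ratios])
# Beta(2, 3): mean k/n = 0.4, variance k(n-k)/(n^2(n+1)) = 0.04.
print(m_r, m_u, var_r)
```

Both sample means come out near 0.4 and the sample variance near 0.04, matching Beta$(2,3)$.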

The proof of this result can be seen for example in [1] or [2], but we will use another approach that I learnt as a student – one using counting processes.

Denote the partial sum $\sum_{i=1}^k X_i$ by $T_k$, the time to the k'th arrival. We wish to show that given $T_n = t$ the joint distribution of $(T_1, T_2, \ldots, T_{n-1})$ is the same as the joint distribution of $(U_{(1)}, U_{(2)}, \ldots, U_{(n-1)})$, where $U_{(j)}$ is the $j$'th smallest sample of $U_1, U_2, \ldots, U_{n-1}$ chosen iid uniformly in $(0,t)$. Dividing by $t$ and then taking expectations over the distribution of $T_n$ would then lead to the result.

Define the counting process $N_s$ (the number of arrivals in time s) by

$\displaystyle N_s = \max \{i: T_i \leq s\}.$

By the definition of $X_i$ and $T_k$ this is a Poisson process. For $0 < t_1 < t_2 < \cdots < t_n = t$ we can decompose $N_{t_n}$ into the sum of independent Poisson random variables $N_{t_1}, N_{t_2}- N_{t_1}, \ldots, N_{t_n}- N_{t_{n-1}}$. Given $N_{t_n} = N_t = k$, these increments have a multinomial distribution with $k$ trials and probabilities $\frac{t_1}{t}, \frac{t_2 - t_1}{t}, \ldots, \frac{t_n - t_{n-1}}{t}$.
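This conditional-multinomial fact can itself be checked by simulation (an added sketch with arbitrary values $t = 2$, $t_1 = 0.5$, $k = 4$): conditioning a rate-1 Poisson process on $N_t = k$ by rejection, the count in $(0, t_1)$ should be Binomial$(k, t_1/t)$, the two-interval case of the statement above, with mean $k t_1 / t$:

```python
import random

random.seed(2)
t, t1, k = 2.0, 0.5, 4
accepted = []
while len(accepted) < 10_000:
    # Arrival times of a rate-1 Poisson process on (0, t):
    # cumulative sums of iid Exp(1) inter-arrival gaps.
    arrivals, s = [], random.expovariate(1.0)
    while s <= t:
        arrivals.append(s)
        s += random.expovariate(1.0)
    if len(arrivals) == k:  # condition on N_t = k by rejection
        accepted.append(sum(a <= t1 for a in arrivals))

avg = sum(accepted) / len(accepted)
print(avg)  # should be near k * t1 / t = 1.0
```

The average count in $(0, 0.5)$ comes out near $4 \times 0.5 / 2 = 1$, as the multinomial (here binomial) distribution predicts.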

Next define the corresponding process $M_s$ determined by $k$ uniform random variables $U_1, U_2, \ldots, U_k$ chosen iid in $(0,t)$:

$\displaystyle M_s = \max \{i: U_{(i)} \leq s\}.$

This can also be written as the sum of $k$ indicator variables $\sum_{j=1}^k {\bf 1}\{U_j \leq s\}$ where

$\displaystyle {\bf 1}\{U_j \leq s\} = \begin{cases} 1 & {\rm if}\ U_j \leq s\\ 0 & {\rm otherwise}\end{cases}$

Each indicator variable is a Bernoulli random variable with probability ${\rm Pr}(U_j \leq s) = s/t$. Decomposing as before, the increments $M_{t_1}, M_{t_2}- M_{t_1}, \ldots, M_{t_{n}}- M_{t_{n-1}}$ have a multinomial distribution with $k$ trials and probabilities $\frac{t_1}{t}, \frac{t_2 - t_1}{t}, \ldots, \frac{t_n - t_{n-1}}{t}$.

This is the same distribution as that found above for $\left( N_{t_1}, N_{t_2} - N_{t_1}, \ldots, N_{t_n} - N_{t_{n-1}} \right)$ given $N_t = k$. We conclude that, conditioned on the number of arrivals in $(0,t)$, the counting processes $\{N_s\}$ and $\{M_s\}$ have the same finite-dimensional distributions, so the arrival times $(T_1, T_2, \ldots, T_{n-1})$ and the order statistics $(U_{(1)}, U_{(2)}, \ldots, U_{(n-1)})$ that define them have the same joint distribution. This completes the proof.

By the way, if $X$ and $Y$ were independent and uniformly distributed between 0 and 1, then what is the distribution of $Z = X/(X+Y)$ in this case? We have $f_X(x) =1$ for $x \in (0,1)$ and ${\rm Pr}(Y \geq y) = 1-y$ for $y \in (0,1)$ giving us for $t \in (0,1)$,

$\begin{array}{lcl} {\rm Pr} \left(\frac{X}{X+Y} \leq t\right)&=& \int_0^{\infty} f_X(x){\rm Pr}\left(Y \geq x\left(\frac{1}{t}-1\right)\right)\ dx\\ &=& \int_0^{\infty} {\bf 1} \{x \in (0,1)\} \left(1 - x \left(\frac{1}{t}-1\right) \right) {\bf 1}\left\{x \left(\frac{1}{t}-1 \right) \in (0,1)\right\}\ dx \end{array}$

For $t \in (0,1/2]$ this becomes

$\displaystyle \int_0^{t/(1-t)} \left(1 - x \frac{(1-t)}{t} \right) \ dx = \frac{t}{1-t} - \frac{1-t}{t} \frac{1}{2} \left( \frac{t}{1-t}\right)^2 = \frac{t}{2(1-t)}.$

For $t \in (1/2, 1)$ this is

$\displaystyle \int_0^1 \left(1 - x \frac{(1-t)}{t} \right) \ dx = 1- \frac{1-t}{t} \frac{1}{2} = \frac{3}{2} - \frac{1}{2t}.$

By taking derivatives we find that the pdf (probability density function) of $Z$ is given by

$\displaystyle f_Z(t) = \begin{cases} \frac{1}{2(1-t)^2} & {\rm if} \ t \in (0, 1/2)\\ \frac{1}{2t^2} & {\rm if} \ t \in (1/2, 1) \end{cases}$
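The piecewise CDF derived above can be spot-checked by simulation (an added sketch): draw many pairs of iid Uniform$(0,1)$ variables and compare the empirical CDF of $Z = X/(X+Y)$ with the two formulas:

```python
import random

random.seed(3)
n_samples = 200_000
zs = []
for _ in range(n_samples):
    x, y = random.random(), random.random()
    zs.append(x / (x + y))

def cdf(t):
    # Piecewise CDF derived above: t/(2(1-t)) on (0, 1/2], 3/2 - 1/(2t) on (1/2, 1).
    return t / (2 * (1 - t)) if t <= 0.5 else 1.5 - 1 / (2 * t)

for t in (0.2, 0.5, 0.8):
    empirical = sum(z <= t for z in zs) / n_samples
    print(f"t={t}: empirical {empirical:.3f}, formula {cdf(t):.3f}")
```

At each test point the empirical and analytic values agree to within about a hundredth.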

The cdf and pdf of this random variable are plotted below.

Hence we see that when two numbers are chosen uniformly between 0 and 1, the ratio of one of them to their sum is more likely to be close to 0.5 than to any other value. When the two numbers are chosen according to an exponential distribution, the same ratio is uniform between 0 and 1.