Abstract
In this chapter, we discuss sequences of random variables and their convergence. We also describe the central limit theorem, one of the most important and widely used results in applications of random variables.
Notes
- 1.
Here, \({\mathop {=}\limits ^{d}}\) means ‘equal in distribution’ as introduced in Example 3.5.18.
- 2.
- 3.
- 4.
- 5.
- 6.
The inequality (6.A.15) holds true also when X is replaced by \(|X|^r\) for \(r >0\).
- 7.
A function h is called convex (or concave up) when \(h(tx+(1-t)y) \le th(x) + (1-t)h(y)\) for every pair of points x and y and for every choice of \(t \in [0, 1]\). A convex function is continuous, has a non-decreasing derivative, and is differentiable except at a countable number of points. In addition, the second-order derivative of a convex function, if it exists, is non-negative.
References
E.F. Beckenbach, R. Bellman, Inequalities (Springer, Berlin, 1965)
W.B. Davenport Jr., Probability and Random Processes (McGraw-Hill, New York, 1970)
J.L. Doob, Heuristic approach to the Kolmogorov-Smirnov theorems. Ann. Math. Stat. 20(3), 393–403 (1949)
W. Feller, An Introduction to Probability Theory and Its Applications, 3rd edn. revised printing (Wiley, New York, 1970)
W.A. Gardner, Introduction to Random Processes with Applications to Signals and Systems, 2nd edn. (McGraw-Hill, New York, 1990)
R.M. Gray, L.D. Davisson, An Introduction to Statistical Signal Processing (Cambridge University Press, Cambridge, 2010)
G. Grimmett, D. Stirzaker, Probability and Random Processes (Oxford University Press, Oxford, 1982)
A. Leon-Garcia, Probability, Statistics, and Random Processes for Electrical Engineering, 3rd edn. (Prentice Hall, New York, 2008)
G.A. Mihram, A cautionary note regarding invocation of the central limit theorem. Am. Stat. 23(5), 38 (1969)
V.K. Rohatgi, A.K.Md.E. Saleh, An Introduction to Probability and Statistics, 2nd edn. (Wiley, New York, 2001)
J.M. Stoyanov, Counterexamples in Probability, 3rd edn. (Dover, New York, 2013)
J.B. Thomas, Introduction to Probability (Springer, New York, 1986)
R.D. Yates, D.J. Goodman, Probability and Stochastic Processes (Wiley, New York, 1999)
Appendices
6.1.1 Appendix 6.1 Convergence of Probability Functions
For a sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\), let \(F_n\) and \(M_n\) be the cdf and mgf, respectively, of \(X_n\). We first note that, when \(n\rightarrow \infty \), the sequence of cdf’s does not always converge and, even when it does, the limit is not always a cdf.
Example 6.A.1
Consider the cdf
of \(X_n\). Then, the limit of the sequence \(\left\{ F_n \right\} _{n=1}^{\infty }\) is
which is a cdf. \(\diamondsuit \)
Example 6.A.2
(Rohatgi and Saleh 2001) Consider the cdf \(F_n(x) = u(x-n)\) of \(X_n\). The limit of the sequence \(\left\{ F_n \right\} _{n=1}^{\infty }\) is \(\lim \limits _{n\rightarrow \infty }F_n(x) = 0\), which is not a cdf.
Example 6.A.3
(Rohatgi and Saleh 2001) Assume the pmf \( \mathsf {P}\left( X_n=-n \right) =1\) for a sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\). Then, the mgf is \(M_n(t) = e^{-nt}\) and its limit is \(\lim \limits _{n\rightarrow \infty }M_n(t) = M(t)\), where
The function M(t) is not an mgf. In other words, the limit of a sequence of mgf’s is not necessarily an mgf. \(\diamondsuit \)
Example 6.A.4
Assume the pdf \(f_n(x) = \frac{n}{\pi } \frac{1}{1+n^2x^2}\) of \(X_n\). Then, the cdf is \(F_n(x) = \frac{n}{\pi }\int _{-\infty }^{x}\frac{dt}{1+n^2t^2}\). We also have \(\lim \limits _{n\rightarrow \infty }f_n(x) = \delta (x)\) and \(\lim \limits _{n\rightarrow \infty }F_n(x) = u(x)\). These limits imply \(\lim \limits _{n\rightarrow \infty } \mathsf {P}\left( \left| X_n-0\right| > \varepsilon \right) = \int _{-\infty }^{-\varepsilon }\delta (x) dx + \int _{\varepsilon }^{\infty }\delta (x)dx = 0\) and, consequently, \(\left\{ X_n \right\} _{n=1}^{\infty }\) converges to 0 in probability. \(\diamondsuit \)
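The convergence in Example 6.A.4 can also be checked numerically. The sketch below (a Python illustration, not part of the original text) uses the fact that the pdf above is that of a Cauchy random variable with scale \(\frac{1}{n}\), so the cdf is \(F_n(x) = \frac{1}{2} + \frac{1}{\pi }\arctan (nx)\) and \( \mathsf {P}\left( \left| X_n \right| > \varepsilon \right) = 1 - \frac{2}{\pi }\arctan (n\varepsilon )\), which shrinks to 0 as n grows.

```python
import math

# Numerical check of Example 6.A.4: the pdf of X_n is that of a Cauchy
# random variable with scale 1/n, so the cdf is
# F_n(x) = 1/2 + arctan(n x)/pi, and the exact tail probability is
# P(|X_n| > eps) = 1 - (2/pi) * arctan(n * eps),
# which shrinks to 0 as n grows (convergence in probability to 0).
def tail_prob(n, eps):
    return 1.0 - (2.0 / math.pi) * math.atan(n * eps)

eps = 0.1
tails = [tail_prob(n, eps) for n in (1, 10, 100, 1000)]
print(tails)  # strictly decreasing toward 0
```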
6.1.2 Appendix 6.2 The Lindeberg Central Limit Theorem
The central limit theorem can be stated in a variety of forms. Among these, the Lindeberg central limit theorem is one of the most general, as it does not require the random variables to be identically distributed.
Theorem 6.A.1
(Rohatgi and Saleh 2001) For an independent sequence \(\left\{ X_i \right\} _{i=1}^{\infty }\), let the mean, variance, and cdf of \(X_i\) be \(m_i\), \(\sigma _i^2\), and \(F_i\), respectively. Let
When the cdf \(F_i\) is absolutely continuous, assume that the pdf \(f_i (x) = \frac{d}{dx} F_i (x)\) satisfies
for every value of \(\varepsilon >0\). When \(\left\{ X_i \right\} _{i=1}^{\infty }\) are discrete random variables, assume the pmf \(p_i (x) = \mathsf {P}\left( X_i =x\right) \) satisfies (see Footnote 5)
for every value of \(\varepsilon >0\), where \(\left\{ x_{il} \right\} _{l=1}^{L_i}\) are the jump points of \(F_i\) with \(L_i\) the number of jumps of \(F_i\). Then, the distribution of
converges to \(\mathcal {N}(0,1)\) as \(n \rightarrow \infty \).
Example 6.A.5
(Rohatgi and Saleh 2001) Assume an independent sequence \(\left\{ X_k \sim U\left( -a_k, a_k \right) \right\} _{k=1}^{\infty } \). Then, \(\mathsf {E}\left\{ X_k \right\} =0\) and \(\mathsf {Var}\left\{ X_k \right\} = \frac{1}{3}a_k^2\). Let \(\left| a_k \right| <A\) and \(s_n^2 = \sum \limits _{k=1}^n \mathsf {Var}\left\{ X_k \right\} = \frac{1}{3}\sum \limits _{k=1}^n a_k^2 \rightarrow \infty \) when \(n \rightarrow \infty \). Then, from the Chebyshev inequality \( \mathsf {P}(|Y-\mathsf {E}\{Y\}|\ge \varepsilon ) \le \frac{\mathsf {Var}\{Y\}}{\varepsilon ^2}\) discussed in (6.A.16), we get
as \(n \rightarrow \infty \) because \(\left| a_k \right| < A\) and \(s_n \rightarrow \infty \).
Meanwhile, assume \(\sum \limits _{k=1}^{\infty } a_k^2 < \infty \), and let \(s_n^2 \uparrow B^2\) as \(n \rightarrow \infty \). Then, for a fixed index k, we can find \(\varepsilon _k\) such that \( \varepsilon _k B < a_k\), and we have \(\varepsilon _k s_n < \varepsilon _k B\). Thus, \( \mathsf {P}\left( \left| X_k \right|> \varepsilon _k s_n \right) \ge \mathsf {P}\left( \left| X_k \right|> \varepsilon _k B \right) > 0\). Based on this result, for \(n \ge k\), we get
implying that the Lindeberg condition is not satisfied. In essence, for a sequence of uniformly bounded independent random variables, a necessary and sufficient condition for the central limit theorem to hold true is that \(\sum \limits _{k=1}^{\infty } \mathsf {Var}\left\{ X_k \right\} \) diverges. \(\diamondsuit \)
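The positive case of Example 6.A.5 can be illustrated by Monte Carlo simulation. In the Python sketch below (the bounded sequence \(a_k = 1 + 0.5\sin k\) is an illustrative assumption, not from the text), the normalized sum \(\sum \limits _{k=1}^{n} X_k / s_n\) is compared against \(\mathcal {N}(0,1)\).

```python
import numpy as np

# Monte Carlo sketch of Example 6.A.5: for independent X_k ~ U(-a_k, a_k)
# with bounded a_k and s_n^2 = (1/3) * sum(a_k^2) growing with n, the
# normalized sum sum(X_k)/s_n should be approximately N(0, 1) for large n.
rng = np.random.default_rng(0)
n, trials = 2000, 5000
a = 1.0 + 0.5 * np.sin(np.arange(1, n + 1))  # bounded, non-identical a_k
s_n = np.sqrt(np.sum(a**2) / 3.0)
X = rng.uniform(-a, a, size=(trials, n))     # row t holds X_1, ..., X_n
Z = X.sum(axis=1) / s_n
print(Z.mean(), Z.var())          # close to 0 and 1
print(np.mean(np.abs(Z) < 1.96))  # close to 0.95 for a standard normal
```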
Example 6.A.6
(Rohatgi and Saleh 2001) Assume an independent sequence \(\left\{ X_k \right\} _{k=1}^{\infty }\). Let \(\delta >0\), \(\alpha _k = \mathsf {E}\left\{ \left| X_k \right| ^{2+\delta } \right\} < \infty \), and \(\sum \limits _{j=1}^n \alpha _j = o \left( s_n^{2+\delta } \right) \). Then, the Lindeberg condition is satisfied and the central limit theorem holds true. This can be shown easily as
because \(x^2 < \frac{|x|^{2+\delta }}{\varepsilon ^\delta s_n^\delta }\) from \( |x|^{\delta } x^2 > \left| \varepsilon s_n\right| ^\delta x^2\) when \(|x|> \varepsilon s_n\). We can similarly show that the central limit theorem holds true for discrete random variables. \(\diamondsuit \)
The conditions (6.A.5) and (6.A.6) are also necessary in the following sense: for a sequence \(\left\{ X_i \right\} _{i=1}^{\infty }\) of independent random variables, assume the variances \(\left\{ \sigma _i^2 \right\} _{i=1}^{\infty }\) of \(\left\{ X_i \right\} _{i=1}^{\infty }\) are finite. If the pdf of \(X_i\) satisfies (6.A.5) or the pmf of \(X_i\) satisfies (6.A.6) for every value of \(\varepsilon >0\), then
and
and the converse also holds true, where \(\overline{X}_n = \frac{1}{n} \sum \limits _{i=1}^{n} X_i\) is the sample mean of \(\left\{ X_i \right\} _{i=1}^{n}\) defined in (5.4.1).
6.1.3 Appendix 6.3 Properties of Convergence
(A) Continuity of Expected Values
When the sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) converges to X, the sequence \(\left\{ \mathsf {E}\left\{ X_n \right\} \right\} _{n=1}^{\infty }\) of expected values will also converge to the expected value \(\mathsf {E}\{X\}\), which is called the continuity of expected values. The continuity of expected values (Gray and Davisson 2010) is a consequence of the continuity of probability discussed in Appendix 2.1.
-
(1)
Monotonic convergence. If \(0 \le X_n \le X_{n+1}\) for every integer n, then \(\mathsf {E}\left\{ X_n \right\} \rightarrow \mathsf {E}\{X\}\) as \(n \rightarrow \infty \). In other words, \(\mathsf {E}\left\{ \underset{n \rightarrow \infty }{\lim } X_n \right\} = \underset{n \rightarrow \infty }{\lim } \mathsf {E}\left\{ X_n \right\} \).
-
(2)
Dominated convergence. If \(\left| X_n\right| < Y\) for every integer n and \(\mathsf {E}\{Y\} < \infty \), then \(\mathsf {E}\left\{ X_n \right\} \rightarrow \mathsf {E}\{X\}\) as \(n \rightarrow \infty \).
-
(3)
Bounded convergence. If there exists a constant c such that \(\left| X_n\right| \le c\) for every integer n, then \(\mathsf {E}\left\{ X_n \right\} \rightarrow \mathsf {E}\{X\}\) as \(n \rightarrow \infty \).
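A concrete instance of monotonic convergence (a sketch with an assumed distribution, not from the text): for \(X \sim \) Exp(1), the truncations \(X_n = \min (X, n)\) satisfy \(0 \le X_n \le X_{n+1}\) and \(X_n \rightarrow X\), and direct integration gives \(\mathsf {E}\left\{ X_n \right\} = 1 - e^{-n}\), which increases to \(\mathsf {E}\{X\} = 1\).

```python
import math

# Monotonic convergence for X ~ Exp(1) with X_n = min(X, n):
# integrating x e^{-x} over [0, n] plus the mass n e^{-n} above n
# gives E{min(X, n)} = 1 - exp(-n), increasing to E{X} = 1.
def truncated_mean(n):
    return 1.0 - math.exp(-n)

means = [truncated_mean(n) for n in (1, 2, 5, 10)]
print(means)  # increasing toward 1
```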
(B) Properties of Convergence
We now list some properties of, and relations among, the various types of convergence. Here, a and b are constants.
-
(1)
If \(X_n \overset{p}{\rightarrow } X\), then \(X_n-X \overset{p}{\rightarrow } 0\), \(a X_n \overset{p}{\rightarrow } a X\), and \(X_n-X_m \overset{p}{\rightarrow } 0\) for \(n, m \rightarrow \infty \).
-
(2)
If \(X_n \overset{p}{\rightarrow } X\) and \(X_n \overset{p}{\rightarrow } Y\), then \( \mathsf {P}(X=Y) =1\).
-
(3)
If \(X_n \overset{p}{\rightarrow } a\), then \(X_n^2 \overset{p}{\rightarrow } a^2\).
-
(4)
If \(X_n \overset{p}{\rightarrow } 1\), then \(\frac{1}{X_n}\overset{p}{\rightarrow } 1\).
-
(5)
If \(X_n \overset{p}{\rightarrow } X\) and Y is a random variable, then \(X_n Y \overset{p}{\rightarrow } XY\).
-
(6)
If \(X_n \overset{p}{\rightarrow } X\) and \(Y_n \overset{p}{\rightarrow } Y\), then \(X_n \pm Y_n \overset{p}{\rightarrow } X \pm Y\) and \(X_n Y_n \overset{p}{\rightarrow } X Y\).
-
(7)
If \(X_n \overset{p}{\rightarrow } a\) and \(Y_n \overset{p}{\rightarrow } b \ne 0\), then \(\frac{X_n}{Y_n} \overset{p}{\rightarrow } \frac{a}{b}\).
-
(8)
If \(X_n \overset{d}{\rightarrow } X\), then \(X_n+a \overset{d}{\rightarrow } X+a\) and \(b X_n \overset{d}{\rightarrow } b X\) for \(b \ne 0\).
-
(9)
If \(X_n \overset{d}{\rightarrow } a\), then \(X_n \overset{p}{\rightarrow } a\). Therefore, \(X_n \overset{d}{\rightarrow } a \rightleftarrows X_n \overset{p}{\rightarrow } a\).
-
(10)
If \(\left| X_n-Y_n\right| \overset{p}{\rightarrow } 0\) and \(Y_n \overset{d}{\rightarrow } Y\), then \(X_n \overset{d}{\rightarrow } Y\). Based on this, it can be shown that \(X_n \overset{d}{\rightarrow } X\) when \(X_n \overset{p}{\rightarrow } X\).
-
(11)
If \(X_n \overset{d}{\rightarrow } X\) and \(Y_n \overset{p}{\rightarrow } a\), then \(X_n \pm Y_n \overset{d}{\rightarrow } X \pm a\), \(X_n Y_n \overset{d}{\rightarrow } aX\) for \(a \ne 0\), \(X_n Y_n \overset{p}{\rightarrow } 0\) for \(a = 0\), and \(\frac{X_n}{Y_n} \overset{d}{\rightarrow } \frac{X}{a}\) for \(a \ne 0\).
-
(12)
If \(X_n \overset{r=2}{\longrightarrow } X\), then \(\lim \limits _{n \rightarrow \infty } \mathsf {E}\left\{ X_n\right\} = \mathsf {E}\{X\}\) and \(\lim \limits _{n \rightarrow \infty } \mathsf {E}\left\{ X_n^2\right\} = \mathsf {E}\left\{ X^2 \right\} \).
-
(13)
If \(X_n \overset{L^r}{\rightarrow } X\), then \(\lim \limits _{n \rightarrow \infty } \mathsf {E}\left\{ \left| X_n\right| ^r\right\} = \mathsf {E}\left\{ |X|^r\right\} \).
-
(14)
If \(X_1> X_2> \cdots > 0\) and \(X_n \overset{p}{\rightarrow } 0\), then \(X_n {\mathop {\longrightarrow }\limits ^{a.s.}} 0\).
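Properties (3) and (7) above can be illustrated numerically. In the Python sketch below, the construction \(X_n = a + Z/\sqrt{n}\), \(Y_n = b + W/\sqrt{n}\) with standard normal Z and W is an assumed example of sequences converging in probability to a and b.

```python
import numpy as np

# Numerical sketch of properties (3) and (7): X_n ->p a and Y_n ->p b
# imply X_n^2 ->p a^2 and X_n/Y_n ->p a/b, so the exceedance
# frequencies below should shrink as n grows.
rng = np.random.default_rng(4)
a, b, trials = 2.0, 4.0, 50000
freqs = {}
for n in (10, 1000):
    Xn = a + rng.normal(size=trials) / np.sqrt(n)
    Yn = b + rng.normal(size=trials) / np.sqrt(n)
    freqs[n] = (np.mean(np.abs(Xn**2 - a**2) > 0.5),     # P(|X_n^2 - a^2| > 0.5)
                np.mean(np.abs(Xn / Yn - a / b) > 0.1))  # P(|X_n/Y_n - a/b| > 0.1)
print(freqs)  # both frequencies shrink toward 0 as n grows
```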
(C) Convergence and Limits of Products
Consider the partial product \(A_n = \prod \limits _{k=1}^{n} a_k\) of the real numbers \(\left\{ a_k \right\} _{k=1}^{\infty }\).
The infinite product \(\prod \limits _{k=1}^{\infty }a_k\) is called convergent to the limit A when \(A_n \rightarrow A\) and \(A \ne 0\) for \(n\rightarrow \infty \); divergent to 0 when \(A_n \rightarrow 0\); and divergent when \(A_n\) is not convergent to a non-zero value. The convergence of products is often related to the convergence of sums as shown below.
-
(1)
When all the real numbers \(\left\{ a_k \right\} _{k=1}^{\infty }\) are positive, \(\prod \limits _{k=1}^{\infty }a_k\) converges if and only if \(\sum \limits _{k=1}^{\infty } \ln a_k\) converges.
-
(2)
When all the real numbers \(\left\{ a_k \right\} _{k=1}^{\infty }\) are positive, \(\prod \limits _{k=1}^{\infty }(1+a_k)\) converges if and only if \(\sum \limits _{k=1}^{\infty } a_k\) converges.
-
(3)
When all the real numbers \(\left\{ a_k \right\} _{k=1}^{\infty }\) are non-negative, \(\prod \limits _{k=1}^{\infty }(1-a_k)\) converges if and only if \(\sum \limits _{k=1}^{\infty } a_k\) converges.
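Property (2) can be checked numerically. In the sketch below (a Python illustration; the sequences \(a_k = 1/k^2\) and \(a_k = 1/k\) are standard summable and non-summable examples), the partial products of \(1+a_k\) settle to a finite limit in the first case and grow without bound in the second.

```python
import math

# Numerical sketch of property (2): for positive a_k, prod(1 + a_k)
# converges iff sum(a_k) converges.
def partial_product(terms):
    p = 1.0
    for a in terms:
        p *= 1.0 + a
    return p

summable = partial_product(1.0 / k**2 for k in range(1, 10001))  # -> sinh(pi)/pi
divergent = partial_product(1.0 / k for k in range(1, 10001))    # telescopes to 10001
print(summable, divergent)
```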
6.1.4 Appendix 6.4 Inequalities
In this appendix, we introduce some useful inequalities (Beckenbach and Bellman 1965) in probability spaces.
(A) Inequalities for Random Variables
Theorem 6.A.2
(Rohatgi and Saleh 2001) If a measurable function h is non-negative and \(\mathsf {E}\{h(X)\}\) exists for a random variable X, then
for \(\varepsilon > 0\), which is called the tail probability inequality.
Proof
Assume X is a discrete random variable. Letting \( \mathsf {P}\left( X = x_k \right) = p_k\), we have \(\mathsf {E}\{ h(X) \} = \left( \sum \limits _A + \sum \limits _{A^c} \right) h\left( x_k \right) p_k \ge \sum \limits _A h\left( x_k \right) p_k\), where \(A = \left\{ k: \, h\left( x_k \right) \ge \varepsilon \right\} \): this yields \(\mathsf {E}\{ h(X) \} \ge \varepsilon \sum \limits _A p_k = \varepsilon \mathsf {P}( h(X) \ge \varepsilon )\) and, subsequently, (6.A.14). \(\spadesuit \)
Theorem 6.A.3
If X is a non-negative random variable, then (see Footnote 6)
for \(\alpha >0\), which is called the Markov inequality.
The Markov inequality can be proved easily from (6.A.14) by letting \(h(X) = |X|\) and \(\varepsilon =\alpha \). We can also show the Markov inequality directly from \(\mathsf {E}\{ |X| \} = \int _{-\infty }^{\infty } |x| f_X(x) dx \ge \int _{|x| \ge \alpha } |x| f_X(x) dx \ge \alpha \int _{|x| \ge \alpha } f_X(x) dx = \alpha \mathsf {P}( |X| \ge \alpha )\) by recollecting that a pdf is non-negative.
Theorem 6.A.4
  The mean \(\mathsf {E}\{Y\}\) and variance \(\mathsf {Var}\{Y\}\) of any random variable Y satisfy
for any \(\varepsilon >0\), which is called the Chebyshev inequality.
Proof
The random variable \(X = (Y-\mathsf {E}\{Y\})^2\) is non-negative. Thus, if we use (6.A.15), we get \( \mathsf {P}\left( [Y-\mathsf {E}\{Y\}]^2 \ge \varepsilon ^2 \right) \le \frac{ 1}{\varepsilon ^2} \mathsf {E}\left\{ [Y-\mathsf {E}\{Y\} ]^2 \right\} = \frac{\mathsf {Var}\{Y\}}{\varepsilon ^2}\). Now, noting that \( \mathsf {P}\left( [Y-\mathsf {E}\{Y\}]^2\ge \varepsilon ^2 \right) = \mathsf {P}(|Y-\mathsf {E}\{Y\}|\ge \varepsilon )\), we get (6.A.16). \(\spadesuit \)
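The Markov inequality (6.A.15) and the Chebyshev inequality (6.A.16) can both be checked by simulation. The sketch below is a Python illustration; the exponential distribution with \(\mathsf {E}\{Y\} = 1\) and \(\mathsf {Var}\{Y\} = 1\) is an assumed test case.

```python
import numpy as np

# Monte Carlo check of the Markov and Chebyshev inequalities for
# Y ~ Exp(1), using alpha = eps = 2.
rng = np.random.default_rng(1)
Y = rng.exponential(1.0, size=100000)
alpha = eps = 2.0
markov_lhs = np.mean(Y >= alpha)                 # P(Y >= alpha)
markov_rhs = Y.mean() / alpha                    # E{|Y|}/alpha
cheb_lhs = np.mean(np.abs(Y - Y.mean()) >= eps)  # P(|Y - E{Y}| >= eps)
cheb_rhs = Y.var() / eps**2                      # Var{Y}/eps^2
print(markov_lhs, markov_rhs)  # left side below right side
print(cheb_lhs, cheb_rhs)      # left side below right side
```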
Theorem 6.A.5
(Rohatgi and Saleh 2001)Â The absolute mean \(\mathsf {E}\{|X|\}\) of any random variable X satisfies
which is called the absolute mean inequality.
Proof
Let the pdf of a continuous random variable X be \(f_X\). Then, because \( \mathsf {E}\{ |X| \} = \int _{-\infty }^{\infty } |x|f_X(x)dx = \sum \limits _{k=0}^{\infty } \int _{k \le |x| < k+1} |x|f_X(x)dx\), we have
Now, employing \(\sum \limits _{k=0}^{\infty } k \mathsf {P}(k \le |X|< k+1 ) = \sum \limits _{n=1}^{\infty } \sum \limits _{k=n}^{\infty } \mathsf {P}(k \le |X| < k+1 ) = \sum \limits _{n=1}^{\infty } \mathsf {P}(|X| \ge n )\) and \(\sum \limits _{k=0}^{\infty } (k+1) \mathsf {P}(k \le |X|< k+1 ) = 1+\sum \limits _{k=0}^{\infty } k \mathsf {P}(k \le |X| < k+1 ) = 1+ \sum \limits _{n=1}^{\infty } \mathsf {P}(|X| \ge n )\) in (6.A.18), we get (6.A.17). A similar procedure will show the result for discrete random variables. \(\spadesuit \)
Theorem 6.A.6
If h is a convex function (see Footnote 7), then
which is called the Jensen inequality.
Proof
Let \(m = \mathsf {E}\{ X \}\). Then, from the Taylor theorem with the remainder in mean-value form, we have \(h(X) = h(m) + (X-m) h^{\prime } (m) + \frac{1}{2} (X-m)^2 h^{\prime \prime } ( \alpha )\)
for \(-\infty< \alpha < \infty \). Taking the expectation of the above equation, we get \(\mathsf {E}\{ h(X) \} = h(m) + \frac{1}{2} h^{\prime \prime } ( \alpha ) \sigma _X^2\). Recollecting that \(h^{\prime \prime } ( \alpha ) \ge 0\) and \(\sigma _X^2 \ge 0\), we get \(\mathsf {E}\{h(X)\} \ge h(m) = h (\mathsf {E}\{X\})\). \(\spadesuit \)
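As a closed-form check of the Jensen inequality (an assumed example, not from the text): for the convex function \(h(x) = e^x\) and \(X \sim \mathcal {N}(0,1)\), the normal mgf gives \(\mathsf {E}\left\{ e^X \right\} = e^{1/2}\), which indeed exceeds \(h(\mathsf {E}\{X\}) = e^0 = 1\).

```python
import math

# Jensen inequality E{h(X)} >= h(E{X}) for h(x) = exp(x), X ~ N(0, 1):
# the left-hand side is the normal mgf at t = 1, i.e. exp(1/2).
lhs = math.exp(0.5)  # E{h(X)}
rhs = math.exp(0.0)  # h(E{X})
print(lhs, rhs)      # 1.6487... >= 1.0
```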
Theorem 6.A.7
(Rohatgi and Saleh 2001)Â Â If the n-th absolute moment \(\mathsf {E}\left\{ |X|^n \right\} \) is finite, then
for \(1 \le s < r \le n\), which is called the Lyapunov inequality.
Proof
Consider the quadratic form
$$\begin{aligned} Q(u,v) \ = \ \int _{-\infty }^{\infty } \left( u |x|^{\frac{k-1}{2}} + v |x|^{\frac{k+1}{2}} \right) ^2 f (x) dx , \end{aligned}$$(6.A.22)
where f is the pdf of X. Letting \(\beta _n = \mathsf {E}\left\{ |X|^n \right\} \), (6.A.22) can be written as \(Q(u,v) = (u \, v) \left( \begin{array}{cc} \beta _{k-1} &{} \beta _k \\ \beta _k &{} \beta _{k+1} \end{array} \right) (u \, v)^T\). Now, because \(Q \ge 0\) for every choice of u and v, we have \(\left| \begin{array}{cc} \beta _{k-1} &{} \beta _k \\ \beta _k &{} \beta _{k+1} \end{array} \right| \ge 0\), i.e., \(\beta _k^{2} \le \beta _{k-1} \beta _{k+1}\) and consequently \(\beta _k^{2k} \le \beta _{k-1}^k \beta _{k+1}^k\). Therefore, we have
with \(\beta _0 =1\). If we multiply the first \(k-1\) consecutive inequalities in (6.A.23), then we have \(\beta _{k-1}^k \le \beta _k^{k-1}\) for \(k=2, 3, \ldots , n\), from which we can easily get \(\beta _1 \le \beta _2^{ \frac{1}{2} } \le \beta _3^{\frac{1}{3}} \le \cdots \le \beta _n^{\frac{1}{n}}\). \(\spadesuit \)
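The Lyapunov chain \(\beta _1 \le \beta _2^{1/2} \le \beta _3^{1/3} \le \cdots \) can be verified for a case with known moments (an assumed test distribution): for \(X \sim \) Exp(1), the absolute moments are \(\beta _n = \mathsf {E}\left\{ |X|^n \right\} = n!\).

```python
import math

# Lyapunov inequality check for X ~ Exp(1), where beta_n = n!:
# the sequence beta_n^(1/n) should be non-decreasing in n.
roots = [math.factorial(n) ** (1.0 / n) for n in range(1, 8)]
print(roots)  # non-decreasing, starting at 1.0
```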
Theorem 6.A.8
  Let g(x) be a non-decreasing and non-negative function for \(x \in (0, \infty )\). If \(\frac{\mathsf {E}\{g(|X|)\}}{g (\varepsilon )}\) is defined, then
for \(\varepsilon >0\), which is called the generalized Bienaymé-Chebyshev inequality.
Proof
Let the cdf of X be F(x). Then, because g is non-negative and non-decreasing, we get \(\mathsf {E}\{g(|X|)\} = \int _{-\infty }^{\infty } g(|x|) dF(x) \ge \int _{|x| \ge \varepsilon } g(|x|) dF(x) \ge g(\varepsilon ) \int _{|x| \ge \varepsilon } dF(x) = g(\varepsilon ) \mathsf {P}( |X| \ge \varepsilon )\). \(\spadesuit \)
Letting \(g(x) = x^r\) in the generalized Bienaymé-Chebyshev inequality, we can easily get the Bienaymé-Chebyshev inequality discussed below. In addition, the Chebyshev inequality discussed in Theorem 6.A.4 is a special case of the generalized Bienaymé-Chebyshev inequality and of the Bienaymé-Chebyshev inequality.
Theorem 6.A.9
  When the r-th absolute moment \(\mathsf {E}\left\{ |X|^r \right\} \) of X is finite, where \(r > 0\), we have
for \(\varepsilon >0\), which is called the Bienaymé-Chebyshev inequality.
(B) Inequalities of Random Vectors
Theorem 6.A.10
(Rohatgi and Saleh 2001)Â Â For two random variables X and Y, we have
which is called the Cauchy-Schwarz inequality.
Proof
First, note that \(\mathsf {E}\{ |XY| \}\) exists when \(\mathsf {E}\left\{ X^2 \right\} < \infty \) and \(\mathsf {E}\left\{ Y^2\right\} < \infty \) because \(|ab| \le \frac{a^2+b^2}{2}\) for real numbers a and b. Now, if \(\mathsf {E}\left\{ X^2 \right\} = 0\), then \( \mathsf {P}(X=0) = 1\) and thus \(\mathsf {E}\{ XY \} = 0\), implying that (6.A.26) holds true. Next when \(\mathsf {E}\left\{ X^2 \right\} > 0\), recollecting that \(\mathsf {E}\left\{ (\alpha X + Y)^2 \right\} = \alpha ^2 \mathsf {E}\left\{ X^2 \right\} + 2 \alpha \mathsf {E}\{ XY \} + \mathsf {E}\left\{ Y^2 \right\} \ge 0\) for any real number \(\alpha \), we have \(\frac{ \mathsf {E}^2 \{ XY \}}{\mathsf {E}\left\{ X^2 \right\} } - 2 \frac{ \mathsf {E}^2 \{ XY \}}{\mathsf {E}\left\{ X^2 \right\} } + \mathsf {E}\left\{ Y^2 \right\} \ge 0\) by letting \(\alpha = - \frac{\mathsf {E}\{ XY \}}{\mathsf {E}\left\{ X^2 \right\} }\). This inequality is equivalent to (6.A.26). \(\spadesuit \)
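A Monte Carlo check of (6.A.26) is sketched below in Python; the dependence \(Y = X + W\) with independent noise W is an arbitrary assumed choice.

```python
import numpy as np

# Monte Carlo check of the Cauchy-Schwarz inequality
# E^2{XY} <= E{X^2} E{Y^2} for correlated X and Y.
rng = np.random.default_rng(2)
X = rng.normal(size=100000)
Y = X + rng.normal(size=100000)
lhs = np.mean(X * Y) ** 2            # E^2{XY}, about 1 here
rhs = np.mean(X**2) * np.mean(Y**2)  # E{X^2} E{Y^2}, about 2 here
print(lhs, rhs)  # lhs <= rhs
```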
Theorem 6.A.11
(Rohatgi and Saleh 2001) For zero-mean independent random variables \(\left\{ X_i \right\} _{i=1}^{n}\) with variances \(\left\{ \sigma _i^2 \right\} _{i=1}^{n}\), let \(S_k = \sum \limits _{j=1}^k X_j\). Then,
for \(\varepsilon > 0\), which is called the Kolmogorov inequality.
Proof
Let \(A_0 = \Omega \), \(A_k = \left\{ {\underset{1 \le j \le k}{\max }} \left| S_j \right| \le \varepsilon \right\} \) for \(k=1, 2, \ldots , n\), and \(B_k = A_{k-1} \cap A_k^c = \left\{ \left| S_1 \right| \le \varepsilon , \left| S_2 \right| \le \varepsilon , \ldots , \left| S_{k-1} \right| \le \varepsilon \right\} \cap \left\{ \text {at least one of } \left| S_1 \right| , \left| S_2 \right| , \ldots , \left| S_k \right| \text { is larger than } \varepsilon \right\} \), i.e.,
Then, \(A_n^c = \overset{n}{\underset{k=1}{\cup }} B_k\) and \(B_k \subseteq \left\{ \left| S_{k-1} \right| \le \varepsilon , \left| S_k \right| > \varepsilon \right\} \). Recollecting the indicator function \(K_A (x)\) defined in (2.A.27), we get \(S_n^2 \ge \sum \limits _{k=1}^n S_n^2 K_{B_k} \left( S_k \right) \) because the events \(\left\{ B_k \right\} _{k=1}^{n}\) are disjoint, i.e.,
Noting that \(S_n-S_k=X_{k+1}+X_{k+2}+\cdots +X_n\) and \(S_k K_{B_k} \left( S_k \right) \) are independent of each other, that \(\mathsf {E}\left\{ X_k \right\} =0\), that \(\mathsf {E}\left\{ K_{B_k} \left( S_k \right) \right\} = \mathsf {P}\left( B_k \right) \), and that \(\left| S_k \right| \ge \varepsilon \) under \(B_k\), we have \(\mathsf {E}\left[ \left\{ S_n K_{B_k} \left( S_k \right) \right\} ^2 \right] = \mathsf {E}\left\{ \left( S_n-S_k \right) ^2 K_{B_k} \left( S_k \right) \right\} + \mathsf {E}\left[ \left\{ S_k K_{B_k} \left( S_k \right) \right\} ^2\right] \ge \mathsf {E}\left[ \left\{ S_k K_{B_k} \left( S_k \right) \right\} ^2\right] \), i.e.,
from (6.A.29). Subsequently, using \(\mathsf {E}\left\{ S_n^2 \right\} = \sum \limits _{k=1}^n \sigma _k^2\) and (6.A.30), we get \(\sum \limits _{k=1}^n \sigma _k^2 \ge \varepsilon ^2 \sum \limits _{k=1}^n \mathsf {P}\left( B_k \right) = \varepsilon ^2 \mathsf {P}\left( A_n^c \right) \), which is the same as (6.A.27). \(\spadesuit \)
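The Kolmogorov inequality can also be checked by simulating the partial sums directly. In the Python sketch below, the normal summands and the variance profile are assumptions for illustration.

```python
import numpy as np

# Monte Carlo sketch of the Kolmogorov inequality (6.A.27):
# P(max_{1<=k<=n} |S_k| > eps) <= (sum of the variances)/eps^2.
rng = np.random.default_rng(3)
n, trials, eps = 50, 20000, 2.5
sigma = rng.uniform(0.1, 0.5, size=n)          # assumed standard deviations
X = rng.normal(0.0, sigma, size=(trials, n))   # row t holds X_1, ..., X_n
S = np.cumsum(X, axis=1)                       # partial sums S_1, ..., S_n
lhs = np.mean(np.abs(S).max(axis=1) > eps)     # P(max_k |S_k| > eps)
rhs = np.sum(sigma**2) / eps**2
print(lhs, rhs)  # lhs <= rhs
```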
Example 6.A.7
(Rohatgi and Saleh 2001) The Chebyshev inequality (6.A.16) with \(\mathsf {E}\{ Y \} = 0\), i.e.,
is the same as the Kolmogorov inequality (6.A.27) with \(n=1\). \(\diamondsuit \)
Theorem 6.A.12
 Consider i.i.d. random variables \(\left\{ X_i \right\} _{i=1}^{n}\) with marginal mgf \(M(t) = \mathsf {E}\left\{ e^{ tX_i } \right\} \). Let \(Y_n = \sum \limits _{i=1}^{n} X_i\) and \(g(t) = \ln M(t)\). If we let the solution to \(\alpha = n g'(t)\) be \(t_r\) for a real number \(\alpha \), then
and
The inequalities (6.A.32) and (6.A.33) are called the Chernoff bounds.
When \(t_r = 0\), the right-hand sides of the two inequalities (6.A.32) and (6.A.33) are both 1 from \(g(0) = \ln M(0) = 0\): in other words, the Chernoff bounds simply say that the probability is no larger than 1 when \(t_r = 0\), and thus the Chernoff bounds are more useful when \(t_r \ne 0\).
Example 6.A.8
(Thomas 1986) Let \(X \sim \mathcal {N}(0,1)\), \(n=1\), and \(Y_1=X\). From the mgf \(M(t)= \exp \left( \frac{t^2}{2} \right) \), we get \(g(t) = \ln M(t) = \frac{t^2}{2}\) and \(g'(t)=t\). Thus, the solution to \(\alpha = n g'(t) = t\) is \(t_r=\alpha \). In other words, the Chernoff bounds can be written as
and
for \(X \sim \mathcal {N}(0,1)\).
\(\diamondsuit \)
Example 6.A.9
For \(X \sim P (\lambda )\), assume \(n=1\) and \(Y_1 = X\). From the mgf \(M(t)=\exp \{\lambda (e^t-1)\}\), we get \(g(t) = \ln M(t) = \lambda (e^t-1)\) and \(g'(t) = \lambda e^t\). Solving \(\alpha = n g'(t) = \lambda e^t\), we get \(t_r = \ln \left( \frac{\alpha }{\lambda }\right) \). Thus, \(t_r > 0\) when \(\alpha > \lambda \), \(t_r = 0\) when \(\alpha = \lambda \), and \(t_r < 0\) when \(\alpha < \lambda \). Therefore, we have
and
because \(n\left\{ t_r g'\left( t_r \right) - g\left( t_r \right) \right\} = \ln \left( \frac{\alpha }{\lambda }\right) ^{\alpha } - \alpha +\lambda \) from \(g\left( t_r \right) = \lambda \left( e^{t_r}-1 \right) = \alpha - \lambda \) and \(t_r g'\left( t_r \right) = \alpha \ln \left( \frac{\alpha }{\lambda }\right) \). \(\diamondsuit \)
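For the Poisson case above, the Chernoff bound can be compared with the exact tail probability; the values \(\lambda = 2\) and \(\alpha = 6\) in the sketch below are assumed for illustration.

```python
import math

# Chernoff bound vs. exact tail for X ~ Poisson(lambda): for
# alpha > lambda, the bound from the example reads
#   P(X >= alpha) <= exp{-(alpha*ln(alpha/lambda) - alpha + lambda)}.
lam, alpha = 2.0, 6
bound = math.exp(-(alpha * math.log(alpha / lam) - alpha + lam))
exact = 1.0 - sum(math.exp(-lam) * lam**k / math.factorial(k)
                  for k in range(alpha))
print(exact, bound)  # exact tail lies below the Chernoff bound
```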
Theorem 6.A.13
If p and q are both larger than 1 and \(\frac{1}{p} +\frac{1}{q}=1\), then
which is called the Hölder inequality.
Theorem 6.A.14
If \(p >1\), then
which is called the Minkowski inequality.
It is easy to see that the Minkowski inequality is a generalization of the triangle inequality \(|a-b|\le |a-c|+|c-b|\).
Exercises
Exercise 6.1
For the sample space [0, 1], consider a sequence of random variables defined by
and let \(X(\omega ) = 0\) for \(\omega \in [0,1]\). Assume the probability measure \( \mathsf {P}(a\le \omega \le b ) = b-a\), the Lebesgue measure mentioned following (2.5.24), for \(0\le a\le b\le 1\). Discuss if \(\left\{ X_n(\omega )\right\} _{n=1}^{\infty }\) converges to \(X(\omega )\) surely or almost surely.
Exercise 6.2
For the sample space [0, 1], consider the sequence
and let \(X(\omega ) = 5\) for \(\omega \in [0,1]\). Assuming the probability measure \( \mathsf {P}(a\le \omega \le b ) = b-a\) for \(0\le a\le b\le 1\), discuss if \(\left\{ X_n(\omega )\right\} _{n=1}^{\infty }\) converges to \(X(\omega )\) surely or almost surely.
Exercise 6.3
When \(\left\{ X_i\right\} _{i=1}^{n}\) are independent random variables, obtain the distribution of \(S_n = \sum \limits _{i=1}^{n} X_i\) in each of the following five cases of the distribution of \(X_i\).
-
(1)
geometric distribution with parameter \(\alpha \),
-
(2)
\({ NB}\left( r_i, \alpha \right) \),
-
(3)
\( P\left( \lambda _i \right) \),
-
(4)
\(G \left( \alpha _i,\beta \right) \), and
-
(5)
\(C\left( \mu _i,\theta _i\right) \).
Exercise 6.4
To what does \(\frac{ S_n }{n}\) converge in Example 6.2.10?
Exercise 6.5
Let \(Y =\frac{X-\lambda }{\sqrt{\lambda }}\) for a Poisson random variable \(X \sim P (\lambda )\). Noting that the mgf of X is \(M_X(t) = \exp \left\{ \lambda \left( e^{t}-1 \right) \right\} \), show that Y converges to a standard normal random variable as \(\lambda \rightarrow \infty \).
Exercise 6.6
For a sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) with the pmf
show that \(X_n \overset{l}{\rightarrow } X\), where X has the distribution \( \mathsf {P}(X=0)=1\).
Exercise 6.7
Discuss if the weak law of large numbers holds true for a sequence of i.i.d. random variables with marginal pdf \(f(x) = \frac{1+ \alpha }{x^{2+\alpha }} u (x-1)\), where \(\alpha > 0\).
Exercise 6.8
Show that \(S_n = \sum \limits _{k=1}^{n}X_k\) converges to a Poisson random variable with distribution P(np) when \(n\rightarrow \infty \) for an i.i.d. sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) with marginal distribution b(1, p).
Exercise 6.9
Discuss the central limit theorem for an i.i.d. sequence \(\left\{ X_i \right\} _{i=1}^{\infty }\) with marginal distribution \(B(\alpha , \beta )\).
Exercise 6.10
An i.i.d. sequence \(\left\{ X_i \right\} _{i=1}^{n}\) has marginal distribution \(P (\lambda )\). When n is large enough, we can approximate \(S_n=\sum \limits _{k=1}^{n}X_k\) as \(\mathcal {N}(n\lambda ,n\lambda )\). Using the continuity correction, obtain the probability \( \mathsf {P}\left( 50 < S_n \le 80 \right) \).
Exercise 6.11
Consider an i.i.d. Bernoulli sequence \(\left\{ X_i \right\} _{i=1}^{n}\) with \(\mathsf {P}\left( X_i = 1 \right) =p\), a binomial random variable \(M \sim b(n,p)\) which is independent of \(\left\{ X_i \right\} _{i=1}^{n}\), and \(K=\sum \limits _{i=1}^{n} X_i\). Note that K is the number of successes in n i.i.d. Bernoulli trials. Obtain the expected values of \(U=\sum \limits _{i=1}^{K} X_i\) and \(V=\sum \limits _{i=1}^{M} X_i\).
Exercise 6.12
The results of different games are independent, and the probabilities of winning and losing a game are each \( \frac{1}{2} \). Assume there is no tie. When the person wins a game, the person gets 2 points and continues to the next game; when the person loses, the person gets 0 points and stops. Obtain the mgf, expected value, and variance of the total score Y that the person may get from the games.
Exercise 6.13
Let \(P_n\) be the probability of obtaining more heads than tails in a toss of n fair coins.
-
(1)
Obtain \(P_{3}\), \(P_{4}\), and \(P_{5}\).
-
(2)
Obtain the limit \(\lim \limits _{n \rightarrow \infty } P_n\).
Exercise 6.14
For an i.i.d. sequence \(\left\{ X_n \sim \mathcal {N}(0,1) \right\} _{n=1}^{\infty }\), let the cdf of \( \overline{X}_n = \frac{1}{n} \sum \limits _{i=1}^n X_i\) be \(F_n\). Obtain \(\underset{n \rightarrow \infty }{\lim } F_n (x)\) and discuss whether the limit is a cdf or not.
Exercise 6.15
Consider \(X_{[1]} = \min \left( X_1 , X_2, \ldots , X_n \right) \) for an i.i.d. sequence \(\left\{ X_n \sim U(0, \theta ) \right\} _{n=1}^{\infty }\). Does \(Y_n = n X_{[1]}\) converge in distribution? If yes, obtain the limit cdf.
Exercise 6.16
The marginal cdf F of an i.i.d. sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) is absolutely continuous. For the sequence \(\left\{ Y_n \right\} _{n=1}^{\infty } = \left\{ n \left\{ 1-F \left( M_n \right) \right\} \right\} _{n=1}^{\infty }\), obtain the limit \(\underset{n \rightarrow \infty }{\lim } F_{Y_n}(y)\) of the cdf \(F_{Y_n}\) of \(Y_n\), where \(M_n = \max \left( X_1 , X_2 , \ldots , X_n \right) \).
Exercise 6.17
Is the sequence of cdf’s
convergent? If yes, obtain the limit.
Exercise 6.18
In the sequence \(\left\{ Y_i=X+W_i \right\} _{i=1}^{n}\), X and \(\left\{ W_i \right\} _{i=1}^{n}\) are independent of each other, and \(\left\{ W_i \sim \mathcal {N}\left( 0, \sigma _i^2\right) \right\} _{i=1}^{n}\) is an i.i.d. sequence, where \(\sigma _i^2\le \sigma _{\max }^2 < \infty \). We estimate X via
and let the error be \(\varepsilon _n = \hat{X}_n- X\).
-
(1)
Express the cf, mean, and variance of \(\hat{X}_n\) in terms of those of X and \(\left\{ W_i \right\} _{i=1}^{n}\).
-
(2)
Obtain the covariance \(\mathsf {Cov}\left( Y_i, Y_j \right) \).
-
(3)
Obtain the pdf \(f_{\varepsilon _n}(\alpha )\) and the conditional pdf \(f_{\hat{X}|X}(\alpha |\beta )\).
-
(4)
Does \(\hat{X}_n\) converge to X? If yes, what is the type of the convergence? If not, what is the reason?
Exercise 6.19
Assume an i.i.d. sequence \(\left\{ X_i \right\} _{i=1}^{\infty }\) with marginal pdf \(f(x) = e^{-x+\theta }u(x- \theta )\). Show that
and that
for \(Y = \frac{1}{n} \sum \limits ^n_{i=1} X_i\).
Exercise 6.20
Show that \(\max \left( X_1 , X_2 , \ldots , X_n \right) {\mathop {\longrightarrow }\limits ^{p}} \theta \) for an i.i.d. sequence \(\left\{ X_i \right\} _{i=1}^{\infty }\) with marginal distribution \(U(0, \theta )\).
Exercise 6.21
Assume an i.i.d. sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) with marginal cdf
Let \(\left\{ Y_n \right\} _{n=1}^{\infty }\) and \(\left\{ Z_n \right\} _{n=1}^{\infty }\) be defined by \(Y_n= \max \left( X_1 , X_2 , \ldots , X_n \right) \) and \(Z_n=n\left( 1-Y_n \right) \). Show that the sequence \(\left\{ Z_n \right\} _{n=1}^{\infty }\) converges in distribution to a random variable Z with cdf \(F(z) = \left( 1-e^{-z} \right) u(z)\).
Exercise 6.22
For the sample space \(\Omega = \{1,2,\ldots \}\) and probability measure \( \mathsf {P}(n) =\frac{\alpha }{n^2}\), assume a sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) such that
Show that, as \(n\rightarrow \infty \), \(\left\{ X_n \right\} _{n=1}^{\infty }\) converges to \(X=0\) almost surely, but does not converge to \(X=0\) in the mean square, i.e., \(\mathsf {E}\left\{ \left( X_n-0 \right) ^2 \right\} \nrightarrow 0\).
Exercise 6.23
The second moment of an i.i.d. sequence \(\left\{ X_i \right\} _{i=1}^{\infty }\) is finite. Show that \(Y_n \overset{p}{\rightarrow } \mathsf {E}\left\{ X_1 \right\} \) for \(Y_n = \frac{2}{n(n+1)}\sum \limits ^n_{i=1}iX_i\).
Exercise 6.24
For a sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) with
we have \(X_n \overset{r=2}{\longrightarrow } 0\) because \(\lim \limits _{n\rightarrow \infty }\mathsf {E}\left\{ X_n^2 \right\} = \lim \limits _{n\rightarrow \infty }\frac{1}{n} = 0\). Show that the sequence \(\left\{ X_n \right\} _{n=1}^{\infty }\) does not converge almost surely.
Exercise 6.25
Consider a sequence \(\left\{ X_i \right\} _{i=1}^{n}\) with a finite common variance \(\sigma ^2\). When the correlation coefficient between \(X_i\) and \(X_j\) is negative for every \(i \ne j\), show that the sequence \(\left\{ X_i \right\} _{i=1}^{n}\) follows the weak law of large numbers. (Hint. Let \(Y_n = \frac{1}{n}\sum \limits _{k=1}^{n} \left( X_k - m_k\right) \) for a sequence \(\left\{ X_i\right\} _{i=1}^{\infty }\) with means \(\mathsf {E}\left\{ X_i\right\} =m_i\). Then, it is known that a necessary and sufficient condition for \(\left\{ X_i\right\} _{i=1}^{\infty }\) to satisfy the weak law of large numbers is that \(\mathsf {E}\left\{ \frac{Y_n^2}{1+Y_n^2} \right\} \rightarrow 0\) as \(n \rightarrow \infty \).)
Exercise 6.26
For an i.i.d. sequence \(\left\{ X_i \right\} _{i=1}^{n}\), let \(\mathsf {E}\left\{ X_i\right\} =\mu \), \(\mathsf {Var}\left\{ X_i \right\} =\sigma ^2\), and \(\mathsf {E}\left\{ X_i^4 \right\} < \infty \). Find the constants \(a_n\) and \(b_n\) such that \(\frac{V_n- a_n}{b_n }\overset{l}{\rightarrow }Z\) for \(V_n =\sum \limits ^n _{k=1} \left( X_k - \mu \right) ^2\), where \(Z \sim \mathcal {N}(0,1)\).
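An exploration sketch for this exercise, assuming for concreteness that \(X_i \sim \mathcal {N}(0,1)\) (so \(\mu = 0\), \(\sigma ^2 = 1\); the sample sizes are illustrative): the empirical mean and standard deviation of \(V_n\) give numerical targets against which the derived centering \(a_n\) and scaling \(b_n\) can be checked.

```python
import math
import random
import statistics

random.seed(3)

# Empirical moments of V_n = sum((X_k - mu)^2) for X_i assumed N(0, 1).
n, trials = 400, 4000
v_samples = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))
             for _ in range(trials)]

print("empirical mean of V_n:", statistics.fmean(v_samples))  # compare with a_n
print("empirical std of V_n:", statistics.stdev(v_samples))   # compare with b_n
print("sqrt(2n):", math.sqrt(2 * n))  # since E{X^4} - sigma^4 = 2 for N(0, 1)
```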
Exercise 6.27
For the sequence \(\left\{ X_k \right\} _{k=1}^{\infty }\) with \( \mathsf {P}\left( X_k =\pm k^{\alpha } \right) = \frac{1}{2} \), obtain the range of \(\alpha \) for which the sequence satisfies the strong law of large numbers.
Exercise 6.28
Assume a Cauchy random variable X with pdf \(f_X (x) = \frac{a}{\pi (x^2+a^2)}\).
(1)
Show that the cf is
$$\begin{aligned} \varphi _X (w) \ = \ e^{-a|w|}. \end{aligned}$$(6.E.12)
(2)
Show that the sample mean of n i.i.d. Cauchy random variables is a Cauchy random variable.
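A sanity-check sketch for part (2) with \(a = 1\) (standard Cauchy; sample sizes and seed are illustrative): if the sample mean of n i.i.d. Cauchy variables is again standard Cauchy, its empirical cdf at \(x = 1\) should stay near \(\frac{1}{2} + \frac{1}{\pi }\arctan 1 = \frac{3}{4}\) for every n, i.e., the mean does not concentrate as a law of large numbers would suggest.

```python
import math
import random

random.seed(4)

def cauchy():
    # inverse-cdf sampling: tan(pi (U - 1/2)) is standard Cauchy
    return math.tan(math.pi * (random.random() - 0.5))

# Empirical cdf at x = 1 of the sample mean of n standard Cauchy draws.
n, trials = 100, 20000
means = [sum(cauchy() for _ in range(n)) / n for _ in range(trials)]
frac = sum(m <= 1.0 for m in means) / trials
print(frac)  # should remain near 0.75 even though n = 100
```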
Exercise 6.29
Assume an i.i.d. sequence \(\left\{ X_i \sim P (0.02) \right\} _{i=1}^{100}\). For \(S = \sum \limits ^{100}_{i=1} X_i\), obtain the value \( \mathsf {P}(S\ge 3)\) using the central limit theorem and compare it with the exact value.
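A deterministic comparison sketch: \(S\) is Poisson with mean \(100 \times 0.02 = 2\), so the exact tail has a closed form, while the central limit theorem approximates \(S\) by \(\mathcal {N}(2,2)\); the continuity-corrected variant is included for comparison.

```python
import math

# Exact tail of S ~ Poisson(2) versus its normal (CLT) approximation.
mu = 100 * 0.02  # mean (and variance) of S
exact = 1.0 - math.exp(-mu) * (1.0 + mu + mu ** 2 / 2.0)  # 1 - P(S <= 2)

def Q(x):
    # standard normal tail probability P(Z > x)
    return 0.5 * math.erfc(x / math.sqrt(2.0))

clt = Q((3.0 - mu) / math.sqrt(mu))
clt_cc = Q((2.5 - mu) / math.sqrt(mu))  # with continuity correction

print("exact:", exact)              # about 0.3233
print("CLT:", clt)                  # about 0.2398
print("CLT + correction:", clt_cc)  # about 0.3618
```

The mean 2 here is small, so the plain CLT value is noticeably off; the continuity correction narrows the gap considerably.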
Exercise 6.30
Consider the sequence of cdf’s
among which four are shown in Fig. 6.3. Obtain \(\lim \limits _{n\rightarrow \infty }F_n (x)\) and discuss whether \(\frac{d}{dx} \left\{ \lim \limits _{n\rightarrow \infty } F_n (x) \right\} \) is the same as \(\lim \limits _{n\rightarrow \infty } \left\{ \frac{d}{dx}F_n(x)\right\} \).
Exercise 6.31
Assume an i.i.d. sequence \(\left\{ X_i \sim \chi ^2 (1) \right\} _{i=1}^{\infty }\) and let \(S_n = \sum \limits _{i=1}^{n} X_i\). Then \( S_n \sim \chi ^2 (n)\), \(\mathsf {E}\left\{ S_n \right\} =n\), and \(\mathsf {Var}\left\{ S_n \right\} = 2n\). Thus, letting \(Z_n = \frac{1}{\sqrt{2n}} \left( S_n -n \right) = \sqrt{\frac{n}{2}} \left( \frac{ S_n }{n} -1 \right) \), the mgf of \(Z_n\) is
$$\begin{aligned} M_n (t) \ = \ \mathsf {E}\left\{ e^{t Z_n} \right\} \ = \ \exp \left( - t \sqrt{ \frac{n}{2} } \right) \left( 1- \frac{2t}{\sqrt{2n}} \right) ^{-\frac{n}{2}} \end{aligned}$$
for \(t<\sqrt{\frac{n}{2}}\). In addition, a Taylor approximation yields
for \(0< \theta _n < t \sqrt{\frac{2}{n}}\). Show that \(Z_n \overset{l}{\rightarrow } Z \sim \mathcal {N}(0,1)\).
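A quick numerical sketch of this convergence: evaluating the mgf \(M_n(t)\) of \(Z_n\) stated above for growing n shows it approaching \(e^{t^2/2}\), the mgf of \(\mathcal {N}(0,1)\).

```python
import math

# M_n(t) = exp(-t sqrt(n/2)) (1 - 2t/sqrt(2n))^{-n/2}, valid for t < sqrt(n/2).
def M(n, t):
    assert t < math.sqrt(n / 2.0)
    return (math.exp(-t * math.sqrt(n / 2.0))
            * (1.0 - 2.0 * t / math.sqrt(2.0 * n)) ** (-n / 2.0))

t = 1.0
for n in (10, 1000, 10 ** 6):
    print(n, M(n, t), math.exp(t * t / 2.0))  # M_n(t) -> e^{1/2}
```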
Exercise 6.32
In a soccer game, the number N of shots taken by a player is a Poisson random variable with mean \(\mu =12\). Each shot results in a goal with probability \(\frac{1}{8}\), independently of N. Obtain the distribution, mean, and variance of the number of goals.
Exercise 6.33
For a facsimile (fax), the number W of pages sent is a geometric random variable with pmf \(p_W (k) = \frac{3^{k-1}}{4^k}\) for \(k \in \{ 1, 2, \ldots \}\) and mean \(\frac{1}{\beta }=4\). The amount \(B_i\) of information contained in the i-th page is a geometric random variable with pmf \(p_B (k) = \frac{ 1 }{10^5} \left( 1- 10^{-5} \right) ^{k-1}\) for \(k \in \{ 1, 2, \ldots \}\) with expected value \(\frac{1}{\alpha }=10^5\). Assuming that \(\left\{ B_i \right\} _{i=1}^{\infty }\) is an i.i.d. sequence and that W and \(\left\{ B_i \right\} _{i=1}^{\infty }\) are independent of each other, obtain the distribution of the total amount of information sent via this fax.
Exercise 6.34
Consider a sequence \(\left\{ X_i\right\} _{i=1}^{\infty }\) of i.i.d. exponential random variables with mean \(\frac{1}{\lambda }\), and let N be a geometric random variable with mean \(\frac{1}{p}\), independent of \(\left\{ X_i\right\} _{i=1}^{\infty }\). Obtain the expected value and variance of the random sum \(S_N = \sum \limits _{i=1}^{N} X_i\).
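A simulation sketch with illustrative parameters \(\lambda = 2\), \(p = \frac{1}{2}\): the empirical mean can be checked against Wald's identity \(\mathsf {E}\left\{ S_N \right\} = \mathsf {E}\{N\}\mathsf {E}\left\{ X_1 \right\} = \frac{1}{p\lambda }\), and the empirical variance gives a target for the derived variance formula.

```python
import random
import statistics

random.seed(6)

lam, p = 2.0, 0.5
trials = 50000

def s_n():
    # S_N for one realization: N ~ Geometric(p) on {1, 2, ...},
    # then N i.i.d. Exponential(lam) summands.
    n = 1
    while random.random() > p:
        n += 1
    return sum(random.expovariate(lam) for _ in range(n))

samples = [s_n() for _ in range(trials)]
print("empirical mean:", statistics.fmean(samples), "Wald:", 1.0 / (p * lam))
print("empirical variance:", statistics.variance(samples))
```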
Exercise 6.35
Depending on the weather, the number N of icicles has the pmf \(p_N (n) = \frac{1}{10} 2^{2- |3-n|}\) for \(n=1, 2, \ldots , 5\) and the lengths \(\left\{ L_i \right\} _{i=1}^{\infty }\) of icicles are i.i.d. with marginal pdf \(f_L (v) = \lambda e^{-\lambda v} u(v)\). In addition, N and \(\left\{ L_i \right\} _{i=1}^{\infty }\) are independent of each other. Obtain the expected value of the sum T of the lengths of the icicles.
Exercise 6.36
Check whether the following sequences of cdf’s converge and, if so, obtain the limit:
(1)
sequence \(\left\{ F_n(x) \right\} _{n=1}^{\infty }\) with cdf
$$\begin{aligned} F_n(x)= & {} \left\{ \begin{array}{ll} 0, \quad &{} \quad x< -n , \\ \frac{1}{2n}(x+n), &{} \quad -n \le x < n , \\ 1, &{} \quad x \ge n. \end{array} \right. \end{aligned}$$(6.E.16)
(2)
sequence \(\left\{ F_n(x) \right\} _{n=1}^{\infty }\) such that \(F_n (x) = F(x+n)\) for a continuous cdf F(x).
(3)
sequence \(\left\{ G_n(x) \right\} _{n=1}^{\infty }\) such that \(G_n (x) = F \left( x+(-1)^n n \right) \) for a continuous cdf F(x).
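For part (1), a deterministic sketch evaluating \(F_n\) at a few fixed points x as n grows: the values at different x approach a common constant, which hints at the pointwise limit and at whether that limit is itself a valid cdf.

```python
# F_n from (6.E.16): piecewise-linear cdf spreading mass over [-n, n].
def F(n, x):
    if x < -n:
        return 0.0
    if x < n:
        return (x + n) / (2.0 * n)
    return 1.0

for x in (-5.0, 0.0, 5.0):
    print(x, [round(F(n, x), 4) for n in (10, 100, 10000)])
```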
Exercise 6.37
For the sequence \(\left\{ X_n \sim G(n,\beta ) \right\} _{n=1}^{\infty }\), obtain the limit distribution of \(Y_n=\frac{X_n}{n^2}\).
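A simulation sketch, assuming \(G(n,\beta )\) denotes a gamma law with shape n and scale \(\beta \) (the value \(\beta = 2\) and sample counts are illustrative): sampling \(Y_n = \frac{X_n}{n^2}\) for growing n shows where the samples concentrate, as a check on the derived limit distribution.

```python
import random
import statistics

random.seed(7)

# Sample Y_n = X_n / n^2 with X_n ~ Gamma(shape=n, scale=beta).
beta = 2.0
for n in (10, 100, 1000):
    ys = [random.gammavariate(n, beta) / n ** 2 for _ in range(2000)]
    print(n, round(statistics.fmean(ys), 4), round(statistics.stdev(ys), 4))
```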
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Song, I., Park, S.R., Yoon, S. (2022). Convergence of Random Variables. In: Probability and Random Variables: Theory and Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-97679-8_6
Print ISBN: 978-3-030-97678-1
Online ISBN: 978-3-030-97679-8