Discrete Random Variables and Probability Distributions

Modern Mathematical Statistics with Applications

Part of the book series: Springer Texts in Statistics (STS)

Abstract

Suppose a city’s traffic engineering department monitors a certain intersection during a one-hour period in the middle of the day. Many characteristics might be of interest: the number of vehicles that enter the intersection, the largest number of vehicles in the left turn lane during a signal cycle, the speed of the fastest vehicle going through the intersection, and the average speed \( \bar{x} \) of all vehicles entering the intersection.

Notes

  1. 1.

    For those unfamiliar with the term, a countably infinite set is one for which the elements can be enumerated: a first element, a second element, and so on. The set of all positive integers and the set of all integers are both countably infinite, but an interval like [2, 5] on the number line is not.

  2. 2.

    P(X = x) is read “the probability that the rv X assumes the value x.” For example, P(X = 2) denotes the probability that the resulting X value is 2.

  3. 3.

    “Between a and b, inclusive” is equivalent to (a ≤ X ≤ b).

  4. 4.

    A quantity is o(Δt) (read “little o of delta t”) if, as Δt approaches 0, so does o(Δt)/Δt. That is, o(Δt) is even more negligible than Δt itself. The quantity (Δt)² has this property, but sin(Δt) does not.

  5. 5.

    If we define \( \binom{a}{b} = 0 \) for a < b, then h(x; n, M, N) may be applied for all integers 0 ≤ x ≤ n.
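
The convention in Note 5 can be tried out numerically. The sketch below assumes the usual hypergeometric pmf, h(x; n, M, N) = C(M, x)·C(N − M, n − x)/C(N, n), which is not restated in this excerpt, and relies on Python's `math.comb` returning 0 when the lower index exceeds the upper one. The function name and the numbers are illustrative only.

```python
from math import comb

def hypergeom_pmf(x: int, n: int, M: int, N: int) -> float:
    """h(x; n, M, N), assuming the usual definition: n draws without
    replacement from N items, of which M are successes."""
    # math.comb(a, b) returns 0 when b > a >= 0, which matches Note 5.
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Hypothetical numbers: 3 successes among 10 items, sample of 5.
n, M, N = 5, 3, 10
pmf = [hypergeom_pmf(x, n, M, N) for x in range(n + 1)]
total = sum(pmf)                                  # should be 1
mean = sum(x * p for x, p in enumerate(pmf))      # compare with n*M/N
```

With these values, x = 4 and x = 5 are impossible (only 3 successes exist), and the convention makes the formula return 0 for them without a special case.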

Author information

Correspondence to Jay L. Devore.

Supplementary Exercises: (138–169)

  1. 138.

    Consider a deck consisting of seven cards, marked 1, 2, …, 7. Three of these cards are selected at random. Define a rv W by W = the sum of the resulting numbers, and compute the pmf of W. Then compute μ and σ². [Hint: Consider outcomes as unordered, so that (1, 3, 7) and (3, 1, 7) are not different outcomes. Then there are 35 outcomes, and they can be listed.] (This type of rv actually arises in connection with Wilcoxon’s rank-sum test, in which there is an x sample and a y sample and W is the sum of the ranks of the x’s in the combined sample.)
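
The listing suggested in the hint can also be done by machine. The following is one possible sketch, not the book's method: it enumerates the unordered three-card selections and tabulates the sums with exact fractions. All variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Every unordered selection of 3 cards from the deck 1..7.
hands = list(combinations(range(1, 8), 3))

# Tally how many selections give each sum w, then convert to probabilities.
counts = Counter(sum(hand) for hand in hands)
pmf = {w: Fraction(c, len(hands)) for w, c in sorted(counts.items())}

# Mean and variance computed directly from the pmf.
mean = sum(w * p for w, p in pmf.items())
var = sum((w - mean) ** 2 * p for w, p in pmf.items())
```

Using `Fraction` keeps every probability exact, so the results can be compared with a hand computation without rounding concerns.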

  2. 139.

    After shuffling a deck of 52 cards, a dealer deals out 5. Let X = the number of suits represented in the five-card hand.

    1. a.

      Show that the pmf of X is

      x       1      2      3      4
      p(x)   .002   .146   .588   .264

      [Hint: p(1) = 4P(all spades), p(2) = 6P(only spades and hearts with at least one of each), and p(4) = 4P(2 spades ∩ one of each other suit).]

    2. b.

      Compute μ, σ², and σ.

  3. 140.

    Let X be a rv with mean µ. Show that E(X²) ≥ µ², and that E(X²) > µ² unless X is a constant. [Hint: Consider variance.]

  4. 141.

    Of all customers purchasing automatic garage-door openers, 75% purchase a chain-driven model. Let X = the number among the next 15 purchasers who select the chain-driven model.

    1. a.

      What is the pmf of X?

    2. b.

      Compute P(X > 10).

    3. c.

      Compute P(6 ≤ X ≤ 10).

    4. d.

      Compute μ and σ².

    5. e.

      If the store currently has in stock 10 chain-driven models and 8 shaft-driven models, what is the probability that the requests of these 15 customers can all be met from existing stock?

  5. 142.

    A friend recently planned a camping trip. He had two flashlights, one that required a single 6-V battery and another that used two size-D batteries. He had previously packed two 6-V and four size-D batteries in his camper. Suppose the probability that any particular battery works is p and that batteries work or fail independently of one another. Our friend wants to take just one flashlight. For what values of p should he take the 6-V flashlight?

  6. 143.

    Binary data is transmitted over a noisy communication channel. The probability that a received binary digit is in error due to channel noise is 0.05. Assume that such errors occur independently within the bit stream.

    1. a.

      What is the probability that the 3rd error occurs on the 50th transmitted bit?

    2. b.

      On average, how many bits will be transmitted correctly before the first error?

    3. c.

      Consider a 32-bit “word.” What is the probability of exactly 2 errors in this word?

    4. d.

      Consider the next 10,000 bits. What approximating model could we use for X = the number of errors in these 10,000 bits? Give the name of the model and the value(s) of the parameter(s).

  7. 144.

    A manufacturer of flashlight batteries wishes to control the quality of its product by rejecting any lot in which the proportion of batteries having unacceptable voltage appears to be too high. To this end, out of each large lot (10,000 batteries), 25 will be selected and tested. If at least 5 of these generate an unacceptable voltage, the entire lot will be rejected. What is the probability that a lot will be rejected if

    1. a.

      Five percent of the batteries in the lot have unacceptable voltages?

    2. b.

      Ten percent of the batteries in the lot have unacceptable voltages?

    3. c.

      Twenty percent of the batteries in the lot have unacceptable voltages?

    4. d.

      What would happen to the probabilities in parts (a)–(c) if the critical rejection number were increased from 5 to 6?

  8. 145.

    Of the people passing through an airport metal detector, .5% activate it; let X = the number among a randomly selected group of 500 who activate the detector.

    1. a.

      What is the (approximate) pmf of X?

    2. b.

      Compute P(X = 5).

    3. c.

      Compute P(5 ≤ X).
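
The word "approximate" in part (a) presumably points to the Poisson approximation to the binomial with µ = np. As a rough check of how close the two models are for n = 500 and p = .005, the sketch below compares the two pmfs over the first few values of x. The helper names are illustrative.

```python
from math import comb, exp, factorial

def binom_pmf(x: int, n: int, p: float) -> float:
    """Exact binomial pmf b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int, mu: float) -> float:
    """Poisson pmf with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

n, p = 500, 0.005
mu = n * p  # matching the Poisson mean to the binomial mean

# Largest pointwise disagreement between the two pmfs for x = 0..20.
max_gap = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, mu)) for x in range(21))
```

A small `max_gap` suggests the approximation is adequate here; the same comparison can be rerun with other n and p to see when it degrades.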

  9. 146.

    An educational consulting firm is trying to decide whether high school students who have never before used a hand-held calculator can solve a certain type of problem more easily with a calculator that uses reverse Polish logic or one that does not use this logic. A sample of 25 students is selected and allowed to practice on both calculators. Then each student is asked to work one problem on the reverse Polish calculator and a similar problem on the other. Let p = P(S), where S indicates that a student worked the problem more quickly using reverse Polish logic than without, and let X = number of S’s.

    1. a.

      If p = .5, what is P(7 ≤ X ≤ 18)?

    2. b.

      If p = .8, what is P(7 ≤ X ≤ 18)?

    3. c.

      If the claim that p = .5 is to be rejected when either X ≤ 7 or X ≥ 18, what is the probability of rejecting the claim when it is actually correct?

    4. d.

      If the decision to reject the claim p = .5 is made as in part (c), what is the probability that the claim is not rejected when p = .6? When p = .8?

    5. e.

      What decision rule would you choose for rejecting the claim p = .5 if you wanted the probability in part (c) to be at most .01?

  10. 147.

    Consider a disease whose presence can be identified by carrying out a blood test. Let p denote the probability that a randomly selected individual has the disease. Suppose n individuals are independently selected for testing. One way to proceed is to carry out a separate test on each of the n blood samples. A potentially more economical approach, group testing, was introduced during World War II to identify syphilitic men among army inductees. First, take a part of each blood sample, combine these specimens, and carry out a single test. If no one has the disease, the result will be negative, and only the one test is required. If at least one individual is diseased, the test on the combined sample will yield a positive result, in which case the n individual tests are then carried out. If p = .1 and n = 3, what is the expected number of tests using this procedure? What is the expected number when n = 5? [The article “Random Multiple-Access Communication and Group Testing” (IEEE Trans. Commun. 1984: 769–774) applied these ideas to a communication system in which the dichotomy was active/idle user rather than diseased/nondiseased.]
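
The two-stage procedure described above can be turned into a short calculation. This sketch assumes, as the exercise states, that individuals are selected independently, so the pooled test is negative with probability (1 − p)ⁿ. The function name and the parameter values are hypothetical and deliberately differ from those in the exercise.

```python
def expected_tests(n: int, p: float) -> float:
    """Expected number of tests under two-stage pooling: one pooled test,
    plus n individual tests whenever the pooled sample is positive."""
    p_pool_negative = (1 - p) ** n          # nobody in the pool is diseased
    return 1 * p_pool_negative + (n + 1) * (1 - p_pool_negative)

# Hypothetical settings: compare pooling with the n separate tests it replaces.
results = {n: expected_tests(n, p=0.05) for n in (2, 4, 10)}
```

Pooling is worthwhile only when the value returned is smaller than n, the cost of testing everyone separately.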

  11. 148.

    Let p1 denote the probability that any particular code symbol is erroneously transmitted through a communication system. Assume that on different symbols, errors occur independently of one another. Suppose also that with probability p2 an erroneous symbol is corrected upon receipt. Let X denote the number of correct symbols in a message block consisting of n symbols (after the correction process has ended). What is the probability distribution of X?

  12. 149.

    The purchaser of a power-generating unit requires c consecutive successful start-ups before the unit will be accepted. Assume that the outcomes of individual start-ups are independent of one another. Let p denote the probability that any particular start-up is successful. The random variable of interest is X = the number of start-ups that must be made prior to acceptance. Give the pmf of X for the case c = 2. If p = .9, what is P(X ≤ 8)? [Hint: For x ≥ 5, express p(x) “recursively” in terms of the pmf evaluated at the smaller values x − 3, x − 4, …, 2.] (This problem was suggested by the article “Evaluation of a Start-Up Demonstration Test,” J. Qual. Tech. 1983: 103–106.)

  13. 150.

    A plan for an executive travelers’ club has been developed by an airline on the premise that 10% of its current customers would qualify for membership.

    1. a.

      Assuming the validity of this premise, among 25 randomly selected current customers, what is the probability that between 2 and 6 (inclusive) qualify for membership?

    2. b.

      Again assuming the validity of the premise, what are the expected number of customers who qualify and the standard deviation of the number who qualify in a random sample of 100 current customers?

    3. c.

      Let X denote the number in a random sample of 25 current customers who qualify for membership. Consider rejecting the company’s premise in favor of the claim that p > .10 if x ≥ 7. What is the probability that the company’s premise is rejected when it is actually valid?

    4. d.

      Refer to the decision rule introduced in part (c). What is the probability that the company’s premise is not rejected even though p = .20 (i.e., 20% qualify)?

  14. 151.

    Forty percent of seeds from maize (modern-day corn) ears carry single spikelets, and the other 60% carry paired spikelets. A seed with single spikelets will produce an ear with single spikelets 29% of the time, whereas a seed with paired spikelets will produce an ear with single spikelets 26% of the time. Consider randomly selecting ten seeds.

    1. a.

      What is the probability that exactly five of these seeds carry a single spikelet and produce an ear with a single spikelet?

    2. b.

      What is the probability that exactly five of the ears produced by these seeds have single spikelets? What is the probability that at most five ears have single spikelets?

  15. 152.

    A trial has just resulted in a hung jury because eight members of the jury were in favor of a guilty verdict and the other four were for acquittal. If the jurors leave the jury room in random order and each of the first four leaving the room is accosted by a reporter in quest of an interview, what is the pmf of X = the number of jurors favoring acquittal among those interviewed? How many of those favoring acquittal do you expect to be interviewed?

  16. 153.

    A reservation service employs five information operators who receive requests for information independently of one another, each according to a Poisson process with rate λ = 2/min.

    1. a.

      What is the probability that during a given 1-min period, the first operator receives no requests?

    2. b.

      What is the probability that during a given 1-min period, exactly four of the five operators receive no requests?

    3. c.

      Write an expression for the probability that during a given 1-min period, all of the operators receive exactly the same number of requests.

  17. 154.

    Grasshoppers are distributed at random in a large field according to a Poisson distribution with parameter λ = 2 per square yard. How large should the radius R of a circular sampling region be taken so that the probability of finding at least one in the region equals .99?

  18. 155.

    A newsstand has ordered five copies of a certain issue of a photography magazine. Let X = the number of individuals who come in to purchase this magazine. If X has a Poisson distribution with parameter µ = 4, what is the expected number of copies that are sold?

  19. 156.

    Individuals A and B begin to play a sequence of chess games. Let S = {A wins a game}, and suppose that outcomes of successive games are independent with P(S) = p and P(F) = 1 − p (they never draw). They will play until one of them wins ten games. Let X = the number of games played (with possible values 10, 11, …, 19).

    1. a.

      For x = 10, 11, …, 19, obtain an expression for p(x) = P(X = x).

    2. b.

      If a draw is possible, with p = P(S), q = P(F), 1 − p − q = P(draw), what are the possible values of X? What is P(20 ≤ X)? [Hint: P(20 ≤ X) = 1 − P(X < 20).]

  20. 157.

    A test for the presence of a disease has probability .20 of giving a false-positive reading (indicating that an individual has the disease when this is not the case) and probability .10 of giving a false-negative result. Suppose that ten individuals are tested, five of whom have the disease and five of whom do not. Let X = the number of positive readings that result.

    1. a.

      Does X have a binomial distribution? Explain your reasoning.

    2. b.

      What is the probability that exactly three of the ten test results are positive?

  21. 158.

    The generalized negative binomial pmf, in which r is not necessarily an integer, is

    $$ nb\left( {x;r,p} \right) = k(r,x) \times p^{r} (1 - p)^{x} \quad x = 0,1,2, \ldots $$

    where

    $$ k(r,x) = \begin{cases} \dfrac{(x + r - 1)(x + r - 2) \cdots (x + r - x)}{x!} & x = 1, 2, \ldots \\ 1 & x = 0 \end{cases} $$

    Let X, the number of plants of a certain species found in a particular region, have this distribution with p = .3 and r = 2.5. What is P(X = 4)? What is the probability that at least one plant is found?
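
The coefficient k(r, x) is a product of x factors divided by x!, which is straightforward to compute in a loop even when r is not an integer. The sketch below is one way to code the pmf as written above. The parameter values are hypothetical, chosen to differ from those in the exercise.

```python
def k(r: float, x: int) -> float:
    """k(r, x) = (x+r-1)(x+r-2)...(r) / x!, with k(r, 0) = 1."""
    coef = 1.0
    for j in range(1, x + 1):
        coef *= (r + j - 1) / j      # one numerator factor and one factor of x!
    return coef

def nb(x: int, r: float, p: float) -> float:
    """Generalized negative binomial pmf nb(x; r, p)."""
    return k(r, x) * p**r * (1 - p) ** x

# Hypothetical parameters; the truncated sum should be very close to 1.
r, p = 1.5, 0.4
total = sum(nb(x, r, p) for x in range(400))
```

When r is a positive integer, k(r, x) reduces to the ordinary binomial coefficient C(x + r − 1, x), which gives a quick sanity check on the loop.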

  22. 159.

    A small publisher employs two typesetters. The number of errors (in one book) made by the first typesetter has a Poisson distribution with mean µ1, the number of errors made by the second typesetter has a Poisson distribution with mean µ2, and each typesetter works on the same number of books. Then if one such book is randomly selected, the function

    $$ p(x{;}\;\mu_{1} ,\mu_{2} ) = .5e^{{ - \mu_{1} }} \frac{{\mu_{1}^{x} }}{x!} + .5e^{{ - \mu_{2} }} \frac{{\mu_{2}^{x} }}{x!}\quad x = 0,1,2, \ldots $$

    gives the pmf of X = the number of errors in the selected book.

    1. a.

      Verify that p(x; µ1, µ2) is a legitimate pmf (≥ 0 and sums to 1).

    2. b.

      What is the expected number of errors in the selected book?

    3. c.

      What is the standard deviation of the number of errors in the selected book?

    4. d.

      How does the pmf change if the first typesetter works on 60% of all such books and the second typesetter works on the other 40%?
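
A numerical check can complement the algebra in part (a). The sketch below codes the mixture pmf with a weight parameter `w1` (an added assumption for illustration; the pmf displayed above corresponds to `w1 = 0.5`) and sums it over a long initial range of x. The means used are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """Poisson pmf with mean mu."""
    return exp(-mu) * mu**x / factorial(x)

def mixture_pmf(x: int, mu1: float, mu2: float, w1: float = 0.5) -> float:
    """Weighted average of two Poisson pmfs; w1 is the weight on the first."""
    return w1 * poisson_pmf(x, mu1) + (1 - w1) * poisson_pmf(x, mu2)

# Hypothetical means; the partial sum over x = 0..99 should be essentially 1.
mu1, mu2 = 1.0, 3.0
total = sum(mixture_pmf(x, mu1, mu2) for x in range(100))
mean = sum(x * mixture_pmf(x, mu1, mu2) for x in range(100))
```

The numerical mean can then be compared with whatever closed form is derived in part (b).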

  23. 160.

    The mode of a discrete random variable X with pmf p(x) is that value x* for which p(x) is largest (the most probable x value).

    1. a.

      Let X ~ Bin(n, p). By considering the ratio b(x + 1; n, p)/b(x; n, p), show that b(x; n, p) increases with x as long as x < np − (1 − p). Conclude that the mode x* is the integer satisfying (n + 1)p − 1 ≤ x* ≤ (n + 1)p.

    2. b.

      Show that if X has a Poisson distribution with parameter µ, the mode is the largest integer less than µ. If µ is an integer, show that both µ − 1 and µ are modes.
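
Before proving the bounds in part (a), it can help to locate the mode by brute force and compare it with (n + 1)p. The following sketch does that for hypothetical values of n and p; the function names are illustrative.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial pmf b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_modes(n: int, p: float) -> list[int]:
    """All x values at which b(x; n, p) attains its maximum (ties included)."""
    pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
    peak = max(pmf)
    # A small relative tolerance treats floating-point ties as equal.
    return [x for x, v in enumerate(pmf) if abs(v - peak) <= 1e-12 * peak]

# Hypothetical cases: (n + 1)p = 6.3 (not an integer) and (n + 1)p = 5 (an integer).
single_mode = binom_modes(20, 0.3)
tied_modes = binom_modes(9, 0.5)
```

The second case illustrates that two adjacent values can share the maximum when (n + 1)p is an integer, in line with the two-sided inequality for x*.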

  24. 161.

    For a particular insurance policy the number of claims by a policy holder in 5 years is Poisson distributed. If the filing of one claim is four times as likely as the filing of two claims, find the expected number of claims.

  25. 162.

    If X is a hypergeometric rv, show directly from the definition that E(X) = nM/N (consider only the case n < M). [Hint: Factor nM/N out of the sum for E(X), and show that the terms inside the sum are of the form h(y; n − 1, M − 1, N − 1), where y = x − 1.]

  26. 163.

    Use the fact that

    $$ \sum_{\text{all}\;x} (x - \mu)^{2} p(x) \ge \sum_{x:\,|x - \mu| \ge k\sigma} (x - \mu)^{2} p(x) $$

    to prove Chebyshev’s inequality, given in Exercise 45 of this chapter.
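
Exercise 45 is not reproduced in this excerpt; the sketch below assumes the inequality has its usual form, P(|X − µ| ≥ kσ) ≤ 1/k². It compares the exact tail probability with that bound for a hypothetical binomial distribution, which is an illustration and not a proof.

```python
from math import comb, sqrt

# Hypothetical distribution: X ~ Bin(20, 0.3).
n, p = 20, 0.3
pmf = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

mu = sum(x * q for x, q in pmf.items())
sigma = sqrt(sum((x - mu) ** 2 * q for x, q in pmf.items()))

def tail(k: float) -> float:
    """Exact P(|X - mu| >= k * sigma) computed from the pmf."""
    return sum(q for x, q in pmf.items() if abs(x - mu) >= k * sigma)

# Pair each exact tail probability with the assumed Chebyshev bound 1/k^2.
checks = {k: (tail(k), 1 / k**2) for k in (1.5, 2, 3)}
```

In examples like this the exact tail is typically far below the bound, which reflects that the inequality holds for every distribution with a finite variance.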

  27. 164.

    The simple Poisson process of Section 3.6 is characterized by a constant rate λ at which events occur per unit time. A generalization is to suppose that the probability of exactly one event occurring in the interval (t, t + Δt) is λ(t) · Δt + o(Δt) for some function λ(t). It can then be shown that the number of events occurring during an interval [t1, t2] has a Poisson distribution with parameter

    $$ \mu = \int\limits_{{t_{1} }}^{{t_{2} }} {\lambda (t)dt} $$

    The occurrence of events over time in this situation is called a nonhomogeneous Poisson process. The article “Inference Based on Retrospective Ascertainment,” J. Amer. Statist. Assoc. 1989: 360–372, considers the intensity function

    $$ \lambda (t) = e^{a + bt} $$

    as appropriate for events involving transmission of HIV via blood transfusions. Suppose that a = 2 and b = .6 (close to values suggested in the paper), with time in years.

    1. a.

      What is the expected number of events in the interval [0, 4]? In [2, 6]?

    2. b.

      What is the probability that at most 15 events occur in the interval [0, .9907]?
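
For the exponential intensity function above, the integral defining µ has a closed form. As a worked step (assuming b ≠ 0), integrating directly gives

$$ \mu = \int_{t_{1}}^{t_{2}} e^{a + bt}\,dt = \frac{e^{a}}{b}\left( e^{bt_{2}} - e^{bt_{1}} \right) $$

so each part reduces to substituting the endpoints of the interval along with the given a and b.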

  28. 165.

    Suppose a store sells two different coffee makers of a particular brand, a basic model selling for $30 and a fancy one selling for $50. Let X denote the number of people among the next 25 purchasing this brand who choose the more expensive model. Then h(X) = revenue = 50X + 30(25 − X) = 20X + 750, a linear function. If the choices are independent and have the same probability, then how is X distributed? Find the mean and standard deviation of h(X). Explain why the choices might not be independent with the same probability.

  29. 166.

    Let X be a discrete rv with possible values 0, 1, 2, … or some subset of these. The function \( \psi(s) = E(s^{X}) = \sum\nolimits_{x = 0}^{\infty } {s^{x} \cdot p(x)} \) is called the probability generating function (pgf) of X.

    1. a.

      Suppose X is the number of children born to a family, and p(0) = .2, p(1) = .5, and p(2) = .3. Determine the pgf of X.

    2. b.

      Determine the pgf when X has a Poisson distribution with parameter µ.

    3. c.

      Show that ψ(1) = 1.

    4. d.

      Show that \( \psi'(0) = p(1) \). (You’ll need to assume that the derivative can be brought inside the summation, which is justified.) What results from taking the second derivative with respect to s and evaluating at s = 0? The third derivative? Explain how successive differentiation of ψ(s) and evaluation at s = 0 “generates the probabilities in the distribution.” Use this to recapture the probabilities of (a) from the pgf. [Note: This shows that the pgf contains all the information about the distribution: knowing ψ(s) is equivalent to knowing p(x).]
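
When X has finitely many possible values, ψ(s) is a polynomial whose coefficients are the probabilities, so the differentiation in part (d) can be mimicked on a coefficient list. The sketch below uses a hypothetical pmf (not the one in part (a)) and divides the k-th derivative at 0 by k!, which is the scaling that repeated differentiation of sᵏ introduces.

```python
from math import factorial

def derivative(coeffs: list[float]) -> list[float]:
    """Differentiate the polynomial c0 + c1*s + c2*s^2 + ... term by term."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def recover_pmf(coeffs: list[float]) -> list[float]:
    """Recover p(0), p(1), ... from a polynomial pgf by repeated
    differentiation and evaluation at s = 0."""
    probs, current = [], list(coeffs)
    for k in range(len(coeffs)):
        probs.append(current[0] / factorial(k))   # k-th derivative at 0, over k!
        current = derivative(current) or [0.0]
    return probs

# Hypothetical pmf: p(0) = .25, p(1) = .5, p(2) = .25.
pgf_coeffs = [0.25, 0.5, 0.25]
psi_at_1 = sum(pgf_coeffs)            # psi(1) is the sum of the probabilities
recovered = recover_pmf(pgf_coeffs)
```

Evaluating a polynomial at s = 0 just reads off its constant term, which is why `current[0]` is all that is needed at each step.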

  30. 167.

    Three couples and two single individuals have been invited to a dinner party. Assume independence of arrivals to the party, and suppose that the probability of any particular individual or any particular couple arriving late is .4 (the two members of a couple arrive together). Let X = the number of people who show up late for the party. Determine the pmf of X.

  31. 168.

    Consider a sequence of identical and independent trials, each of which will be a success S or failure F. Let p = P(S) and q = P(F).

    1. a.

      Define a random variable X as the number of trials necessary to obtain the first S, a geometric random variable. Here is an alternative approach to determining E(X). Just as P(B) = P(B|A)P(A) + P(B|A′)P(A′), it can be shown that

      $$ E\left( X \right) = E\left( {X|A} \right)P\left( A \right) + E\left( {X|A^{\prime} } \right)P\left( {A^{\prime} } \right) $$

      where \(E(X|A) \) denotes the expected value of X given that the event A has occurred. Now let A = {S on 1st trial}. Show again that E(X) = 1/p. [Hint: Denote E(X) by μ. Then given that the first trial is a failure, one trial has been performed and, starting from the second trial, we are still looking for the first S. This implies that \(E(X|A^{\prime}) = E(X|F) = 1 + \mu\).]

    2. b.

      The expected value property in (a) can be extended to any partition A1, A2, …, Ak of the sample space:

      $$ E(X) = E(X|A_{1} ) \cdot P(A_{1} ) + E(X|A_{2} ) \cdot P(A_{2} ) + \cdots + E(X|A_{k} ) \cdot P(A_{k} ) $$

      Now let Y = the number of trials necessary to obtain two consecutive S’s. It is not possible to determine E(Y) directly from the definition of expected value, because there is no formula for the pmf of Y; the complication is the word consecutive. Use the weighted average formula to determine E(Y). [Hint: Consider the partition with k = 3 and A1 = {F}, A2 = {SS}, A3 = {SF}.]
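
An answer obtained from the weighted average formula can be checked by simulation. The sketch below estimates E(Y) by repeatedly running trials until two consecutive successes occur. The success probability, seed, and number of runs are hypothetical choices, and the estimate carries sampling error.

```python
import random

def trials_until_two_successes(p: float, rng: random.Random) -> int:
    """Run independent trials until the first pair of consecutive successes;
    return the total number of trials used."""
    trials, streak = 0, 0
    while streak < 2:
        trials += 1
        # A success extends the current streak; a failure resets it to zero.
        streak = streak + 1 if rng.random() < p else 0
    return trials

rng = random.Random(12345)      # fixed seed so the run is repeatable
p = 0.5                         # hypothetical success probability
n_runs = 100_000
estimate = sum(trials_until_two_successes(p, rng) for _ in range(n_runs)) / n_runs
```

Comparing `estimate` with the formula for E(Y) evaluated at the same p gives a quick plausibility check on the derivation.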

  32. 169.

    For a discrete rv X taking values in {0, 1, 2, 3, …}, we shall derive the following alternative formula for the mean:

    $$ \mu_{X} = \sum\limits_{x = 0}^{\infty } {[1 - F(x)]} $$

    1. a.

      Suppose for now the range of X is {0, 1, …, N} for some positive integer N. By regrouping terms, show that

      $$ \begin{array}{r} \sum\limits_{x = 0}^{N} [x \cdot p(x)] = p(1) + p(2) + p(3) + \cdots + p(N) \\ {} + p(2) + p(3) + \cdots + p(N) \\ {} + p(3) + \cdots + p(N) \\ \vdots \\ {} + p(N) \end{array} $$

    2. b.

      Re-write each row in the above expression in terms of the cdf of X, and use this to establish that

      $$ \sum\limits_{x = 0}^{N} {[x \cdot p(x)]} = \sum\limits_{x = 0}^{N - 1} {[1 - F(x)]} $$

    3. c.

      Let N → ∞ in part (b) to establish the desired result, and explain why the resulting formula works even if the maximum value of X is finite. [Hint: If the largest possible value of X is N, what does 1 − F(x) equal for x ≥ N?] (This derivation also implies that a discrete rv X has a finite mean iff the series \( \sum {[1 - F(x)]} \) converges.)

    4. d.

      Let X have a geometric distribution with parameter p. Use the cdf of X and the alternative mean formula just derived to determine µX.
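
The identity in part (b) can be confirmed on a small example before attempting the general argument. The sketch below uses a hypothetical pmf on {0, 1, 2, 3} with exact fractions and computes the mean both ways.

```python
from fractions import Fraction

# Hypothetical pmf on {0, 1, 2, 3}; the probabilities sum to 1.
pmf = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}
N = max(pmf)

def cdf(x: int) -> Fraction:
    """F(x) = P(X <= x), accumulated from the pmf."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

# Mean from the definition, and from the tail-sum form over x = 0, ..., N - 1.
mean_direct = sum(x * p for x, p in pmf.items())
mean_tail = sum(1 - cdf(x) for x in range(N))
```

Extending the tail sum past N − 1 would only add zeros here, since 1 − F(x) = 0 for x ≥ N.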

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this chapter

Devore, J.L., Berk, K.N., Carlton, M.A. (2021). Discrete Random Variables and Probability Distributions. In: Modern Mathematical Statistics with Applications. Springer Texts in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-030-55156-8_3
