Chapter 4. Discrete Distributions

Chapter Objectives


In this chapter, readers will learn to do the following:

• Determine the mean and standard deviation for discrete random variables

• Evaluate the probability of obtaining k successes in n trials using the formula and the cumulative binomial probability table

• Evaluate the probability of an event happening x times within a given interval of time or space using the formula and the cumulative Poisson probability table

• Evaluate the probability of getting the first success after a number of consecutive failures (geometric distribution)

 

4.1. The Mean and Standard Deviation for a Discrete Random Variable

In this chapter, we will analyze three of the most useful probability distributions of discrete variables: binomial, Poisson, and geometric. As we saw in previous chapters, regardless of the shape of the distribution, a data set can be described by measures of the centre (mean) and dispersion (standard deviation). Therefore, it is reasonable to derive formulae that work for all types of distributions of discrete random variables. Let’s start with the following example.

Example 4.1

A die was rolled four times. Let X be the number of observations of odd-numbered faces. Construct a table of probability distribution and the probability histogram for the probability distribution. Determine the following probabilities using the probability histogram/distribution:

(a) \(P(x<3)\)

(b) \(P(x>2.5)\)

(c) \(P(0<x<3)\)

(d) \(P(0<x\le3)\)

Solution:

A die has six faces: 1, 2, 3, 4, 5, 6. Three of them are odd: 1, 3, 5. The probability of observing an odd face in each attempt is 3/6 = 1/2. After rolling the die 4 times, an odd number can be observed 0, 1, 2, 3, or 4 times. These are the values of X. The corresponding probabilities can be easily evaluated using the following probability tree (fig. 4.1).

image

Figure 4.1. Probability tree constructed for example 4.1

As we can see from the diagram, the total number of equally likely outcomes is 16. For example, exactly two odd numbers in four rolls are observed in six of these outcomes. Therefore, the probability P(x=2) = 6/16 or 0.375. Other probabilities can be evaluated similarly.

Table 4.1

X P(x)
0 1/16 = 0.0625
1 4/16 = 0.25
2 6/16 = 0.375
3 4/16 = 0.25
4 1/16 = 0.0625
Total 1

Based on table 4.1, we can construct the probability histogram, which, in fact, represents the distribution of probability.

Figure 4.2. Probability histogram constructed for example 4.1

As we could see in the relative frequency histograms in previous chapters, these types of graphs are very useful for determining the probabilities of events.

\[
\begin{aligned}
(a)\;& P(x<3)=P(0)+P(1)+P(2)=0.0625+0.25+0.375=0.6875\\
(b)\;& P(x>2.5)=P(3)+P(4)=0.25+0.0625=0.3125\\
(c)\;& P(0<x<3)=P(1)+P(2)=0.25+0.375=0.625\\
(d)\;& P(0<x\le3)=P(1)+P(2)+P(3)=0.25+0.375+0.25=0.875
\end{aligned}
\]
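The table and the four probabilities above can be double-checked by brute-force enumeration. The following Python sketch is not part of the original solution; it assumes that each roll only needs to be recorded as odd or even (both with probability 1/2), and the names `pmf` and `prob` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Record each roll only as odd (1) or even (0); both are equally likely.
outcomes = list(product([0, 1], repeat=4))   # 2**4 = 16 equally likely outcomes

# Build the probability distribution of X = number of odd faces in 4 rolls.
pmf = {x: Fraction(0) for x in range(5)}
for outcome in outcomes:
    pmf[sum(outcome)] += Fraction(1, len(outcomes))

def prob(condition):
    """Add up P(x) over all values of x that satisfy the condition."""
    return sum(p for x, p in pmf.items() if condition(x))

p_a = prob(lambda x: x < 3)        # (a) P(x < 3)
p_b = prob(lambda x: x > 2.5)      # (b) P(x > 2.5)
p_c = prob(lambda x: 0 < x < 3)    # (c) P(0 < x < 3)
p_d = prob(lambda x: 0 < x <= 3)   # (d) P(0 < x <= 3)

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"({label}) {float(p)}")   # 0.6875, 0.3125, 0.625, 0.875
```

Exact fractions are used so that the results match the table without rounding error.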

Let’s analyze this example using common sense and draw some conclusions. Although the odd numbers can be observed 0, 1, 2, 3, or 4 times, the most likely count is 2, and because the distribution is symmetric about this value, the mean of the probability distribution also equals 2. Numerically, the mean of the probability distribution of a discrete variable X, with the values xi and probabilities p(xi) = P(X = xi), where i = 1, 2, . . . , N, can be determined by the following formula:

[latex]\displaystyle \mu=\sum_{i=1}^{N} x_i\,p(x_i)\qquad\text{(F4.1)}[/latex]

This formula can be regarded as the numerical definition of the mean. It says that if we repeated the experiment many times, we would expect the values of X to occur in proportion to their assigned probabilities, so the mean is the long-run average of X.

In physics, there exists an analogy of this concept, called a centre of mass distributed along a line, which is determined as

[latex]\displaystyle m_c=\sum_{i=1}^{N} x_i\,\frac{m_i}{M}\qquad\text{(F4.2)}[/latex]

where mi is the mass of particle i situated at the distance xi from the origin and [latex]\displaystyle M=\sum_{i=1}^{N} m_i[/latex] is the total mass of the particles. The detailed explanation of the concept of the centre of mass is beyond the scope of this book. However, the similarity between formulae (F4.1) and (F4.2) is obvious. Figure 4.3 can help readers to visualize the centre of mass.

image

Figure 4.3. Centre of mass

As noted earlier in this book, the mean of a variable does not tell us enough about the variable itself; we also need a measure of its dispersion, which is the variance. In fact, the variance can be considered as the mean of the squared deviations of the values from the mean of the variable and can be evaluated by applying the logic used for (F4.1). Therefore,

[latex]\displaystyle \sigma^2=\sum_{i=1}^{N}(x_i-\mu)^2\,p(x_i)\qquad\text{(F4.2)}[/latex]

It can be shown that the formula (F4.2) is equivalent to

[latex]\displaystyle \sigma^2=\sum_{i=1}^{N} x_i^2\,p(x_i)-\mu^2\qquad\text{(F4.3)}[/latex]

The standard deviation is determined by definition as [latex]\sigma=\sqrt{\sigma^2}[/latex].

The formula (F4.2) is considered a definition of the variance and called the definition formula. The formula (F4.3) is more convenient for conducting calculations and hence is called the computation formula.

Now, we can evaluate the mean and variance and eventually the standard deviation of the discrete variable X in example 4.1 using formulae (F4.1) and (F4.2).     

[latex]\displaystyle \mu=\sum_{i=1}^{5}{x_ip(x_i)=(0\bullet0.0625)+(1\bullet0.25)+(2\bullet0.375)+(3\bullet0.25)+(4\bullet0.0625)=2}[/latex]

[latex]\displaystyle \sigma^2=\sum_{i=1}^{5}(x_i-\mu)^2p(x_i)[/latex]

[latex]\displaystyle =(0-2)^2\bullet0.0625+(1-2)^2\bullet0.25+(2-2)^2\bullet0.375+(3-2)^2\bullet0.25+(4-2)^2\bullet0.0625=1[/latex]

Consequently, [latex]\sigma=\sqrt{\sigma^2}=\sqrt1=1[/latex]
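The same numbers can be reproduced in a few lines of Python. This is a minimal sketch rather than part of the original solution; it evaluates the mean with (F4.1) and the variance with both the definition formula and the computation formula (F4.3), so the two can be compared. The variable names are illustrative.

```python
import math

# Probability distribution of X from table 4.1.
values = [0, 1, 2, 3, 4]
probs = [0.0625, 0.25, 0.375, 0.25, 0.0625]

# Mean, formula (F4.1): sum of x * p(x).
mean = sum(x * p for x, p in zip(values, probs))

# Variance, definition formula: sum of (x - mean)^2 * p(x).
var_definition = sum((x - mean) ** 2 * p for x, p in zip(values, probs))

# Variance, computation formula (F4.3): sum of x^2 * p(x), minus mean^2.
var_computation = sum(x ** 2 * p for x, p in zip(values, probs)) - mean ** 2

std_dev = math.sqrt(var_definition)

print(mean, var_definition, var_computation, std_dev)   # 2.0 1.0 1.0 1.0
```

Both variance formulae give the same value, as expected.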

4.2. Binomial Distribution

In the previous chapter, we analyzed examples with a spinner with red and blue sectors.

image

Figure 4.4. Spinner with two equal sectors

We had determined the probabilities of spinning the red and blue sectors as equal to 1/2 or 0.5, since the areas of sectors are equal.

Let’s consider a spinner whose one quarter is red (fig. 4.5).

Figure 4.5. Spinner with one-quarter and three-quarter sectors

Based on the classic approach, one can determine that the probability of spinning the red sector is equal to 1/4. This result can be confirmed if we repeat the experiment many times. We can note two important observations.

 

In other words, the variable has only two outcomes. This type of variable is called binomial. If we consider the observation of red as a success, the observation of no-red can be classified as a failure. Using the probability symbols, we can write that P(red) = P(success) = 1/4 = 0.25 and P(no-red) = P(failure) = 3/4 = 0.75. In statistics, the probabilities of success and failure are denoted as p and q, respectively. In the previous chapter, we stated that the sum of all probabilities equals 1. Therefore, p + q = 1 or q = 1 – p. The same result for q can be obtained by considering that failure is the complement of success. Note that instead of experimenting many times on one spinner, we can carry out one experiment on many identical spinners.

Assume that we repeat our experiments (trials) n times, and X counts the number of successes in the n experiments. If we conduct another series of n experiments, we might get a different number for successes. This means that X is a random variable, and we call X a binomial random variable. Let the function p(x) represent the probability distribution of binomial random variable X. The shapes of probability distributions are determined by the probability of success. Below, we provide some examples of binomial distributions with various probabilities of success and numbers of trials (fig. 4.6):

image

Figure 4.6. Binomial probability distributions with various probabilities of success and numbers of trials

The probability of obtaining x successes in n trials with a probability of success p, [latex]0\le p\le1[/latex], in each trial can be calculated using the following formula:

[latex]\displaystyle P(X=x)=C(n,x)\,p^x q^{\,n-x}\qquad\text{(F4.4)}[/latex]

where [latex]q=1-p\quad\text{and}\quad x=0,1,2,\ldots,n[/latex]
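Formula (F4.4) translates directly into code. The sketch below is illustrative only: the function name `binomial_pmf` is an assumption, and the choice of n = 4 spins of the one-quarter red spinner (p = 0.25) is a hypothetical example, not one worked in the text. It uses `math.comb` from the Python standard library for C(n, x).

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n trials, following formula (F4.4)."""
    q = 1 - p                              # probability of failure
    return comb(n, x) * p**x * q**(n - x)

# Hypothetical illustration: the one-quarter red spinner, spun n = 4 times.
n, p = 4, 0.25
distribution = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, probability in enumerate(distribution):
    print(f"P(X = {x}) = {probability:.4f}")

# The probabilities over x = 0, 1, ..., n must add up to 1.
print("total:", sum(distribution))
```

Checking that the probabilities add up to 1 is a quick way to catch mistakes in p, q, or the exponents.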

Example 4.2

Joseph hits a target at 50 m 60% of the time. He shoots five times. Determine the following probabilities:

[latex](a)\ \text{Joseph hits the target exactly three times}[/latex]
[latex](b)\ \text{Joseph hits the target at least three times}[/latex]
[latex](c)\ \text{Joseph hits the target at most three times}[/latex]
[latex](d)\ \text{Joseph hits the target more than three times}[/latex]

Solution:

The probability of success is p = 0.6, hence q = 1 – 0.6 = 0.4. The number of experiments n = 5.

(a) Using the formula (F4.4), we find [latex]P(X=3)=C(5,3)\bullet0.6^3\bullet0.4^{5-3}=10\bullet0.216\bullet0.16=0.3456[/latex]

(b) Hitting the target at least 3 times means that Joseph might hit the target either 3, 4, or 5 times. Therefore, the probability of this event will be equal to the sum of the probabilities of 3 simple events—that is, hitting the target exactly 3 times, exactly 4 times, and exactly 5 times.

[latex]\displaystyle P(X\ge3)=P(X=3)+P(X=4)+P(X=5)[/latex]

[latex]\displaystyle =C(5,3)\bullet0.6^3\bullet0.4^{5-3}+C(5,4)\bullet0.6^4\bullet0.4^{5-4}+C(5,5)\bullet0.6^5\bullet0.4^{5-5}[/latex]

[latex]\displaystyle =0.3456+0.2592+0.0778[/latex]

[latex]\displaystyle =0.6826[/latex]

(c) Hitting the target at most 3 times means that Joseph might hit the target either 3 times, 2 times, 1 time, or never. Therefore, the probability of hitting the target at most 3 times equals the sum of the probabilities of hitting the target 3 times, 2 times, 1 time, or never.

[latex]\displaystyle P(X\le3)=P(X=3)+P(X=2)+P(X=1)+P(X=0)[/latex]

[latex]\displaystyle =C(5,3)\bullet0.6^3\bullet0.4^{5-3}+C(5,2)\bullet0.6^2\bullet0.4^{5-2}+C(5,1)\bullet0.6^1\bullet0.4^{5-1}+C(5,0)\bullet0.6^0\bullet0.4^{5-0}[/latex]

[latex]\displaystyle =0.3456+0.2304+0.0768+0.0102[/latex]

[latex]\displaystyle =0.6630[/latex]

 

(d) Hitting the target more than 3 times means that Joseph might hit the target either 4 or 5 times. Therefore,

[latex]\displaystyle P(X>3)=P(X=4)+P(X=5)[/latex]

[latex]\displaystyle =C(5,4)\bullet0.6^4\bullet0.4^{5-4}+C(5,5)\bullet0.6^5\bullet0.4^{5-5}[/latex]

[latex]\displaystyle =0.2592+0.0778[/latex]

[latex]\displaystyle =0.3370[/latex]

Let’s note that one can reduce the calculations using the complement of events. For instance, in part (d), considering that the event “hitting the target more than 3 times” is the complement of the event “hitting the target at most 3 times,” we can evaluate the required probability as

[latex]\displaystyle P(X>3)=1-P(X\le3)=1-0.6630=0.3370[/latex]
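All four parts of example 4.2, including the complement shortcut, can be checked with a short Python sketch. It is offered as a cross-check under the example's stated values (n = 5, p = 0.6); the helper name `binomial_pmf` is an assumption of this sketch.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) from formula (F4.4)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, 0.6
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]   # P(X = 0), ..., P(X = 5)

exactly_3 = pmf[3]               # (a) P(X = 3)
at_least_3 = sum(pmf[3:])        # (b) P(X >= 3) = P(3) + P(4) + P(5)
at_most_3 = sum(pmf[:4])         # (c) P(X <= 3) = P(0) + P(1) + P(2) + P(3)
more_than_3 = 1 - at_most_3      # (d) complement of "at most 3"

for label, value in zip("abcd", (exactly_3, at_least_3, at_most_3, more_than_3)):
    print(f"({label}) {value:.4f}")   # 0.3456, 0.6826, 0.6630, 0.3370
```

Part (d) uses the complement, so it needs no new binomial terms once part (c) is known.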

We strongly recommend that readers practise performing the calculations for all examples presented in this book. If you follow our recommendation and try to complete the calculations for example 4.2, you will have noticed that this is a time-consuming task that leaves plenty of room for human error. An alternative and much simpler way to determine the probabilities is to use binomial tables (table A1, Appendix). This table includes sub-tables for various numbers of trials, n, usually up to 20 or 25.

Later, we will present similar tables developed for other probability distributions. Before providing instructions on the use of these tables, we would like to emphasize that all represent cumulative probabilities.

Evaluation of a binomial probability, [latex]\displaystyle P(X = k)[/latex], using the cumulative binomial probability table (table A1, Appendix) can be carried out in the following steps:

1. Find the sub-table for the given [latex]n[/latex] (number of trials).

2. Find the column for the given value of [latex]p[/latex] (probability of success).

3. Find the cumulative probability [latex]\displaystyle P(X\le k)=P(X=0)+\cdots+P(X=k)[/latex] presented at the intersection of the row marked [latex]k[/latex] and column [latex]p[/latex].

4. Find the cumulative probability [latex]\displaystyle P(X\le k-1)=P(X=0)+\cdots+P(X=k-1)[/latex] presented at the intersection of the row marked [latex]k-1[/latex] and column [latex]p[/latex].

5. Evaluate the probability as [latex]\displaystyle P(X=k)=P(X\le k)-P(X\le k-1)[/latex]
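The five steps above can be sketched in Python. Instead of reading a printed table, the sketch builds one column of cumulative probabilities for a given n and p and then applies step 5. The function names are illustrative assumptions, and n = 5, p = 0.6 are borrowed from example 4.2.

```python
from math import comb

def binomial_cdf_column(n, p):
    """Cumulative probabilities P(X <= k) for k = 0..n, like one table column."""
    column, running_total = [], 0.0
    for k in range(n + 1):
        running_total += comb(n, k) * p**k * (1 - p)**(n - k)
        column.append(running_total)
    return column

def pmf_from_cdf(column, k):
    """Steps 3 to 5: P(X = k) = P(X <= k) - P(X <= k - 1)."""
    previous = column[k - 1] if k > 0 else 0.0   # for k = 0 there is no row k - 1
    return column[k] - previous

column = binomial_cdf_column(5, 0.6)             # steps 1 and 2: n = 5, p = 0.6
print([round(c, 3) for c in column])             # [0.01, 0.087, 0.317, 0.663, 0.922, 1.0]
print(round(pmf_from_cdf(column, 3), 4))         # 0.3456
```

Printed tables round each entry to three decimals, so a difference of two table entries can deviate from the exact value in the last digit.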

Now, let’s evaluate the desired probabilities of example 4.2 using the cumulative probability table (table A1, Appendix). Below we provide the required subtable for n = 5.

 

x p = 0.05 p = 0.1 p = 0.2 p = 0.3 p = 0.4 p = 0.5 p = 0.6 p = 0.7 p = 0.8 p = 0.9 p = 0.95
0 0.774 0.590 0.328 0.168 0.078 0.031 0.010 0.002 0.000 0.000 0.000
1 0.977 0.919 0.737 0.528 0.337 0.188 0.087 0.031 0.007 0.000 0.000
2 0.999 0.991 0.942 0.837 0.683 0.500 0.317 0.163 0.058 0.009 0.001
3 1.000 1.000 0.993 0.969 0.913 0.813 0.663 0.472 0.263 0.081 0.023
4 1.000 1.000 1.000 0.998 0.990 0.969 0.922 0.832 0.672 0.410 0.226
5 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000

(a) [latex]\displaystyle P(X=3)=P(X\le3)-P(X\le2)=0.663-0.317=0.346[/latex]

(b) [latex]\displaystyle P(X\ge3)=1-P(X\le2)=1-0.317=0.683[/latex]

(c) [latex]\displaystyle P(X\le3)=0.663[/latex]

(d) [latex]\displaystyle P(X>3)=1-P(X\le3)=1-0.663=0.337[/latex]

4.3. Poisson Distribution

The binomial random variable is the number of successes in a fixed number of trials of some random experiment. Many processes, however, generate only a single type of outcome, and not within a confined number of trials, but rather along a continuum of time or space.

Let’s consider the following examples:

(1) Arrivals at a car wash in one hour
(2) Repairs needed in 10 kilometres of highway
(3) Telephone calls received by a switchboard in a specified time period
(4) Number of times when a trap was full in a trapping season

In these examples, the variables are assigned either to a specific space or time period. In the first example, the number of car arrivals varies from hour to hour. In the second example, the number of repairs is assigned to a 10-kilometre stretch of highway, and, of course, it also varies over time. In the third example, the number of telephone calls in a specified time period is a variable, and it changes from day to day. In the fourth example, the number of successful trappings is a variable, and it changes within a trapping season. Usually, the length of this season is regulated by governments.

In statistics, these types of variables are called Poisson random variables. The Poisson random variable is the number of occurrences of a specified event within a specified time or space. It is named after French mathematician Siméon Denis Poisson (1781–1840). The Poisson distribution is widely used in many areas of science, including physics, chemistry, and biology.

The Poisson variable must satisfy three criteria:

  • The number of successes that occur in any interval is independent of the number of successes that occur in any other interval.
  • The probability that success will occur in an interval is the same for all intervals of equal size and is proportional to the size of the interval.
  • The probability that two or more successes will occur in an interval approaches zero as the interval becomes smaller.

Note that the third criterion means that, in general, Poisson distributions describe events that have a relatively low probability of occurring in any short interval.

Let X be the number of events that occur in a period of time or space, and μ be the mean number of such events in that period. Then, the probability of x occurrences of this event is

[latex]\displaystyle P(X=x)=\frac{\mu^x e^{-\mu}}{x!}\quad (F4.5)[/latex]

for values of x = 0, 1, 2, and so on. This formula can be derived using calculus, and its detailed explanation is beyond the objectives of this book. Let’s just note that [latex]e^{-\mu}[/latex] is called a natural exponential function, where e is an irrational constant approximately equal to 2.71828. This function can be evaluated using ordinary scientific calculators, so readers can easily conduct the required computations. We will use the formula (F4.5) to solve the following example.
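For readers who prefer a computer to a calculator, formula (F4.5) can be sketched in Python with the standard `math` module. The function name `poisson_pmf` and the mean μ = 2 are illustrative choices for this sketch. As a sanity check, the probabilities should add up to 1, and applying (F4.1) to them should return the mean μ.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson variable with mean mu, following formula (F4.5)."""
    return mu**x * exp(-mu) / factorial(x)

mu = 2.0   # illustrative mean, chosen only for this sketch

# x has no upper limit, but for mu = 2 the terms beyond x = 59 are negligible.
probabilities = [poisson_pmf(x, mu) for x in range(60)]

total = sum(probabilities)                                # should be very close to 1
mean = sum(x * p for x, p in enumerate(probabilities))    # formula (F4.1), close to mu

print(round(total, 6), round(mean, 6))
```

The truncation at x = 59 is a practical shortcut of this sketch; a larger mean would need more terms.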

Example 4.3

An office in the Public Relations Department receives an average of 90 calls per 45 minutes.

(a) What is the probability of receiving exactly two calls in a minute?

(b) What is the probability of receiving two or fewer calls in a minute?

(c) What is the probability of receiving more than two calls in a minute?

(d) What is the probability of receiving more than two calls and fewer than five calls in a minute?

Solution:

First, we need to find the mean value of received calls per minute.

[latex]\displaystyle \mu=\frac{90}{45}=2\ \text{calls/minute}[/latex]

(a) Given [latex]\mu=2[/latex] and [latex]x=2[/latex], and using the formula (F4.5),

 

[latex]\displaystyle P(X=2)=\frac{2^2 e^{-2}}{2!}=\frac{4\bullet0.13534}{2}=0.27067[/latex]

(b) Receiving 2 or fewer calls in a minute means that the number of calls could be 0, 1, or 2.

Therefore,

[latex]\displaystyle P(X\le2)=P(X=0)+P(X=1)+P(X=2)[/latex]

[latex]\displaystyle =\frac{2^0 e^{-2}}{0!}+\frac{2^1 e^{-2}}{1!}+\frac{2^2 e^{-2}}{2!}[/latex]

[latex]\displaystyle =0.13534+0.27067+0.27067[/latex]

[latex]\displaystyle =0.67668[/latex]

 

(c) One can find this probability as [latex]P(X>2)=P(X=3)+P(X=4)+P(X=5)+\cdots[/latex]. However, we can simplify this operation by considering that the event [latex]X>2[/latex] is the complement of [latex]X\le2[/latex], whose probability was determined in part (b) of this solution. Therefore,

[latex]\displaystyle P(X>2)=1-P(X\le2)=1-0.67668=0.32332[/latex]

(d) [latex]\displaystyle P(2<X<5)=P(X=3)+P(X=4)=\frac{2^3 e^{-2}}{3!}+\frac{2^4 e^{-2}}{4!}=0.18045+0.09022=0.27067[/latex]

As we noted in the section about the binomial variables, finding probabilities using formulae is fairly straightforward. However, the computation process is time-consuming and prone to human error. After this note, the reader is probably already looking for a “table” technique similar to the one used for binomial variables. Indeed, there exist tables for Poisson variables that list cumulative probabilities. They are provided as table A2 in the Appendix of this book. To use the Poisson cumulative probability tables, we first select the column for the given value of the mean µ. Then we select the row marked x. The number at the intersection of the selected column and row is the cumulative probability

[latex]\displaystyle P(X\le x)=P(X=0)+\cdots+P(X=x)[/latex]

Now, let’s solve example 4.3 using table A2 (Appendix). Below, we provide the required segment of the table for µ = 2.

x μ = 2.0 μ = 2.5 μ = 3.0 μ = 3.5 μ = 4.0 μ = 4.5
0 0.135 0.082 0.050 0.030 0.018 0.011
1 0.406 0.287 0.199 0.136 0.092 0.061
2 0.677 0.544 0.423 0.321 0.238 0.174
3 0.857 0.758 0.647 0.537 0.433 0.342
4 0.947 0.891 0.815 0.725 0.629 0.532
5 0.983 0.958 0.916 0.858 0.785 0.703
6 0.995 0.986 0.966 0.935 0.889 0.831
7 0.999 0.996 0.988 0.973 0.949 0.913
8 1.000 0.999 0.996 0.990 0.979 0.960
9 1.000 1.000 0.999 0.997 0.992 0.983
10 1.000 1.000 1.000 0.999 0.997 0.993
11 1.000 1.000 1.000 1.000 0.999 0.998
12 1.000 1.000 1.000 1.000 1.000 0.999
13 1.000 1.000 1.000 1.000 1.000 1.000

$$
\begin{array}{l}
(a)\; P(X=2)=P(X\le2)-P(X\le1)=0.677-0.406=0.271\\
(b)\; P(X\le2)=0.677\\
(c)\; P(X>2)=1-P(X\le2)=1-0.677=0.323\\
(d)\; P(2<X<5)=P(X\le4)-P(X\le2)=0.947-0.677=0.270
\end{array}
$$
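The table-based answers can be compared with unrounded values computed directly from (F4.5). The following Python sketch is a cross-check, not the book's method; `poisson_pmf` and `poisson_cdf` are names assumed for this illustration.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) from formula (F4.5)."""
    return mu**x * exp(-mu) / factorial(x)

def poisson_cdf(x, mu):
    """Cumulative probability P(X <= x) = P(X = 0) + ... + P(X = x)."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

mu = 90 / 45   # 90 calls per 45 minutes = 2 calls per minute

part_a = poisson_cdf(2, mu) - poisson_cdf(1, mu)   # P(X = 2)
part_b = poisson_cdf(2, mu)                        # P(X <= 2)
part_c = 1 - poisson_cdf(2, mu)                    # P(X > 2)
part_d = poisson_cdf(4, mu) - poisson_cdf(2, mu)   # P(2 < X < 5)

for label, value in zip("abcd", (part_a, part_b, part_c, part_d)):
    print(f"({label}) {value:.5f}")   # 0.27067, 0.67668, 0.32332, 0.27067
```

Part (d) comes out as 0.27067 here, while the table gives 0.270. The small difference arises because the table entries are rounded to three decimals before they are subtracted.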

4.4. Geometric Distribution

Talking about the binomial variables, we analyzed several examples in which each trial had only two outcomes: success or failure. This type of experiment is called a Bernoulli trial, named after Jacob Bernoulli (1655–1705), a Swiss mathematician known for his significant contributions to calculus. Suppose that one repeats Bernoulli trials until a success is obtained. The distribution of probabilities of the number of consecutive trials repeated until the first success is called the geometric distribution. In other words, the geometric distribution is a discrete probability distribution that gives the probability that the first success is obtained on a given trial, after an unbroken run of failures.

Since the geometric distribution represents the probabilities of consecutive Bernoulli trials, the required criteria are similar to the criteria of the binomial distribution.

  • All trials are independent.
  • Each trial can have only two outcomes, classified as success and failure.
  • The probability of success (usually denoted by p) is the same for each trial. Consequently, the probability of failure is also the same for each trial, since its probability is [latex]\displaystyle q=1-p .[/latex]

Assume that the probability of hitting a target for a shooter is p. Let’s derive a formula for the probability that she/he would first succeed in hitting the target on the x-th attempt after (x – 1) failures. Since all trials are independent, we can consider this event as an intersection of x simple events: (x – 1) consecutive failures and one success. Then using the formula (F3.17), we can obtain the following result:

[latex]\displaystyle P(X=x)=(1-p)\bullet(1-p)\cdots(1-p)\bullet p[/latex]

[latex]\displaystyle P(X=x)=(1-p)^{x-1}p\quad (F4.6)[/latex]

where x = 1, 2, . . .

The classical definition of the geometric distribution differs slightly from the one provided in most textbooks. More detailed analysis of this distribution is beyond the scope of the present book.

Sometimes we are also required to determine the probability that the number of trials to observe the first success would not exceed x. This probability can be determined by the formula

[latex]\displaystyle P(X\le x)=1-(1-p)^x\quad (F4.7)[/latex]

where x = 1, 2, . . .
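Formula (F4.7) is stated here without derivation. One way to see where it comes from, sketched as a supplementary step rather than as the book's own argument, is to use the complement: the first success takes more than x trials exactly when the first x trials are all failures.

[latex]\displaystyle P(X>x)=(1-p)^x\quad\Rightarrow\quad P(X\le x)=1-P(X>x)=1-(1-p)^x[/latex]

The same result follows from adding the probabilities given by (F4.6) as a finite geometric series, assuming p > 0:

[latex]\displaystyle P(X\le x)=\sum_{k=1}^{x}(1-p)^{k-1}p=p\bullet\frac{1-(1-p)^x}{1-(1-p)}=1-(1-p)^x[/latex]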

The geometric distribution is used in life sciences, economics, law, and many other areas where it is required to evaluate the probability that a decision succeeds after a number of attempts. Below, we will analyze an example of an Indigenous hand game, where a certain stage of the game is continued until one of the players makes a successful decision.

Example 4.4

Hand games are popular among Indigenous Peoples in North America. There exist various versions of these games (https://youtu.be/nruGeuMKIwo). In all of them, two teams play against each other, accompanied by traditional drumming and music. Very often, each team includes five players. The team members sit face to face. A player of one team hides an object (mainly a decorated stick) in her/his hand, while her/his opponent guesses in which hand the object is hidden. Assume that a player makes correct guesses 70% of the time. Determine the following probabilities:

(a) The player makes the first correct guess on the third attempt

(b) The player makes the first correct guess within at most three attempts

Solution:

The probability of success is p = 0.7. To determine the required probabilities, we will use the formula (F4.6) for part (a) and formula (F4.7) for part (b).

(a) [latex]\displaystyle P(X=3)=(1-0.7)^{3-1}\bullet0.7=0.3^2\bullet0.7=0.063[/latex]

(b) [latex]\displaystyle P(X\le3)=1-(1-0.7)^3=1-0.3^3=0.973[/latex]
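Both parts of example 4.4 can be verified with a short Python sketch. The function names are assumptions made for illustration. The last line also confirms that (F4.7) agrees with adding up the individual probabilities from (F4.6).

```python
def geometric_pmf(x, p):
    """P(X = x): first success on trial x, following formula (F4.6)."""
    return (1 - p) ** (x - 1) * p

def geometric_cdf(x, p):
    """P(X <= x): first success within x trials, following formula (F4.7)."""
    return 1 - (1 - p) ** x

p = 0.7   # probability of a correct guess

part_a = geometric_pmf(3, p)    # (a) first correct guess on the third attempt
part_b = geometric_cdf(3, p)    # (b) first correct guess within three attempts

# Cross-check of (b): add P(X = 1) + P(X = 2) + P(X = 3).
check_b = sum(geometric_pmf(k, p) for k in (1, 2, 3))

print(round(part_a, 3), round(part_b, 3), round(check_b, 3))   # 0.063 0.973 0.973
```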

Other Important Discrete Probability Distributions

One has to note that, depending on the area of application, probability distributions can be defined in various ways. In this chapter, we already analyzed some discrete probability distributions. Below, we will briefly describe three more discrete probability distributions. The detailed studies of them and other distributions are beyond the objectives of this book.

Negative Binomial Distributions

Above, we analyzed examples when, during each random trial, the probability of success was the same. Let’s define rolling face 5 as a success on a fair die and rolling 1, 2, 3, 4, and 6 as a failure. The probability of rolling 5 in each experiment is equal to 1/6 and stays unchanged during randomly selected experiments. Consider that we need to continue experiments until rolling face 5 two times. Until we achieve this result, we may observe failures. The probability distribution of the number of these failures (faces 1, 2, 3, 4, and 6) is called a negative binomial distribution. In general, the negative binomial distribution is a discrete probability distribution that describes the number of failures until a certain number of successes is observed in a sequence of Bernoulli trials. The negative binomial distribution is often used in fields such as ecology, epidemiology, and insurance to model overdispersed count data, where the variance exceeds the mean.
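The chapter does not state a formula for this distribution. The sketch below assumes the standard one: the probability of exactly k failures before the r-th success is C(k + r − 1, k) p^r (1 − p)^k. It is applied to the die example above (success = rolling a 5, so p = 1/6, with r = 2 successes required). The function name is illustrative.

```python
from fractions import Fraction
from math import comb

def negative_binomial_pmf(k, r, p):
    """P(exactly k failures occur before the r-th success); standard formula, assumed here."""
    # The last trial must be the r-th success; the k failures can fall
    # anywhere among the first k + r - 1 trials.
    return comb(k + r - 1, k) * p**r * (1 - p)**k

p = Fraction(1, 6)   # probability of rolling a 5 on a fair die
r = 2                # stop after the second 5

for k in range(4):
    print(k, negative_binomial_pmf(k, r, p))   # 1/36, 5/108, 25/432, 125/1944
```

With r = 1, the same formula gives the probability of k failures before the first success, which links it to the geometric distribution of section 4.4.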

Hypergeometric Distributions

Example 4.5

Assume that there are three red and four blue marbles in a jar. A child, closing her eyes, will select a marble two consecutive times.

(a) What is the probability that the child will select two blue marbles if she returns the marble to the jar after the first attempt? This manner of selection is called with replacement.

(b) What is the probability that the child will select two blue marbles if she does not return the marble to the jar after the first attempt? This manner of selection is called without replacement.

Solution:

(a) If the girl returns the selected marble to the jar after each attempt, her chances to select a blue marble in each attempt remain the same, and are determined as

[latex]\displaystyle P(B)=\frac{4}{7}[/latex]

Then, the probability that she selects blue marbles in both attempts under the given condition (with replacement) can be found using the formula (F3.17).

[latex]\displaystyle P(BB)=P(B)P(B)=\frac{4}{7}\bullet\frac{4}{7}=\frac{16}{49}[/latex]

(b) The probability that the girl will get a blue marble in the first attempt is

 

[latex]\displaystyle P(B)=\frac{4}{7}[/latex]

Since she does not return the selected marble to the jar, the probability that she will get a blue marble on the second attempt can be determined as

 

$$
P(B|B)=\frac{3}{3+3}=\frac{3}{6}=\frac{1}{2}
$$

Therefore, the probability that the girl selects blue marbles in both attempts under the given condition (without replacement) can be found using the multiplication rule (formula F3.16).

$$
P(BB)=P(B)P(B|B)=\frac{4}{7}\times\frac{1}{2}=\frac{4}{14}=\frac{2}{7}
$$
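Both answers of example 4.5 can be reproduced with exact fractions. The sketch below also evaluates part (b) by counting combinations, which is the usual form of a hypergeometric probability. That counting formula is not given in the chapter and is included here as an assumption for comparison.

```python
from fractions import Fraction
from math import comb

red, blue = 3, 4
total = red + blue

# (a) With replacement: the chance of blue is 4/7 on both draws.
with_replacement = Fraction(blue, total) * Fraction(blue, total)

# (b) Without replacement: after one blue is removed, 3 blue remain out of 6.
without_replacement = Fraction(blue, total) * Fraction(blue - 1, total - 1)

# Same event by counting: choose 2 of the 4 blue and 0 of the 3 red marbles,
# out of all ways to choose 2 of the 7 marbles.
by_counting = Fraction(comb(blue, 2) * comb(red, 0), comb(total, 2))

print(with_replacement, without_replacement, by_counting)   # 16/49 2/7 2/7
```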

The hypergeometric distribution is valuable for modelling situations similar to the one provided in example 4.5, where the items are not put back after being selected, making it essential in various fields such as statistics, biology, and quality assurance.

Multinomial Distributions

Above, we provided an explanation of the binomial distribution, which models independent experiments with only two outcomes. The multinomial distribution is a generalization of the binomial distribution. Assume that we roll a 6-sided die n = 10 times. Each side of the die represents a different category: 1, 2, 3, 4, 5, and 6. If we want to find the probability of rolling face 1 three times, face 2 two times, and face 3 five times, we would use the multinomial distribution to calculate the likelihood of this specific outcome based on the individual probabilities of each face. More generally, the multinomial distribution models the number of times each side appears when a k-sided die is rolled n times. Consider n independent trials, each of which results in success for precisely one of the k categories, with each category having a given fixed probability of success. The multinomial distribution gives the probability of any particular combination of numbers of successes for the different categories.
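The die example can be worked out numerically. The chapter does not give the multinomial formula, so the sketch below assumes the standard one, n!/(x1! x2! ⋯ xk!) multiplied by p1^x1 p2^x2 ⋯ pk^xk. It also assumes that the die is fair, so each face has probability 1/6. The function name is illustrative.

```python
from fractions import Fraction
from math import factorial

def multinomial_pmf(counts, probs):
    """Probability of observing exactly these counts in sum(counts) trials."""
    n = sum(counts)
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)        # n! / (x1! x2! ... xk!), exact at every step
    probability = Fraction(coefficient)
    for c, p in zip(counts, probs):
        probability *= p ** c               # multiply by p_i ** x_i
    return probability

# Ten rolls: face 1 three times, face 2 twice, face 3 five times, faces 4 to 6 never.
counts = [3, 2, 5, 0, 0, 0]
probs = [Fraction(1, 6)] * 6                # fair die (assumption)

result = multinomial_pmf(counts, probs)
print(result, float(result))                # 35/839808, about 0.0000417
```

With only two categories, the same function reduces to the binomial formula (F4.4).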

Chapter 4 Summary

• The mean and standard deviation for a discrete random variable
• Binomial distribution
• Poisson distribution
• Geometric distribution


EXERCISES

Chapter 4. Discrete Distributions

The Mean and Standard Deviation for a Discrete Random Variable

  1.  The average age of death in industrialized countries is 76 years. The average age of death in countries that are partially industrialized is 69 years. The average age of death in non-industrialized countries is 55 years.

30% of the world’s population lives in industrialized countries

50% lives in partially industrialized countries

20% lives in non-industrialized countries.

What is the expected age of death for any individual?

    2.  An airline company charges $600 for a flight from Regina to Toronto. If the flight is overbooked, the airline pays $300 to the passenger and puts them on a later flight. From past experience, the airline knows that 3% of the passengers will be overbooked. What is the expected revenue the airline will get from each passenger?

 

3.  An insurance company is attempting to estimate the revenue it will make on each accident insurance policy it sells. Each policy costs $25. If an individual has a minor injury, they receive $500 from the insurance company. If they have a major injury, they will receive $1,200. If they die, their beneficiary will receive $10,000. The insurance company has calculated the following probabilities for an accident occurring:

no accident .984
minor accident .012
major accident .003
death .001

Calculate the expected revenue the insurance company will make on each policy.

 

4.  A roulette table has 18 red slots, 18 black slots, and 2 green slots. If a player bets $1 on red and wins, the player wins $2.

a.  If a player bets $1 many times on red, how much can the player expect to win on average?

b.  If the player bets $10 each time on red, how much can the player expect to win?

 

5.  As assistant to the marketing manager of Sludge Chemical Company, you must find the expected net profit of two possible new products. You have calculated the following probability distributions for the net profits of each product:

Product A

Net profit P(net profit)
-$5,000 0.1
$0 0.2
$3,000 0.4
$6,000 0.3

Product B

Net profit P(net profit)
$0 0.2
$1,000 0.3
$3,000 0.3
$5,000 0.2

Find the expected net profit for each product.

6.  Three coins are tossed. If 3 heads appear, the player wins $30. If 2 heads appear, the player wins $5. If 1 head appears, the player loses $15. If 0 heads appear, the player loses $20. What are the expected winnings in a game?

7.  In the “Quick Draw” casino card game, a player chooses a single card from a deck of 52 well-shuffled cards. If the card the player selects is the king of hearts, the player wins $104; if the card is an ace, the player wins $78; if the card selected is anything else, the player loses $13. Considering a negative amount won to be a loss, how much should the player expect to win in one play of “Quick Draw”?


Binomial Distribution (Indigenous Games)

      1.   A telemarketer makes five phone calls per hour and is able to make a sale on 30% of these contacts. Use the binomial distribution (p = 0.3, because a sale is made on 30% of the calls) to answer the following questions:

(a) What is the expected number of sales?

 (b) What is the probability that there will be exactly four sales? Show the calculations and express the answer to four places after the decimal point. Do not use tables for this part.

 (c) What is the probability that there will be exactly four sales? Use the binomial table (table A1, Appendix) to answer the question.

 (d) What is the probability that there will be at least two contracts? Use the tables provided.

      2.  Among individuals in a certain country, 40% have blood type A. If six individuals are randomly selected, what is the probability that

(a) All the individuals have type A?

(b) None of the individuals have type A?

(c) Fewer than two individuals have type A?

(d) Use the binomial table (table A1, Appendix) to evaluate the probability that more than two individuals have type A.

3.  A town in southern Saskatchewan consists of 40% retired people. In a random sample of 10 people, what is the probability of finding at most three retired people?

4.  The unemployment rate in Regina is 8.3%. If a sample of 10 Regina residents is taken, what is the probability that

        (a) None of the individuals is unemployed?

        (b) Only one individual is unemployed?

(c) Use the binomial table (table A1, Appendix) to evaluate the probability that fewer than two individuals are unemployed.

       5.  The failure rate for a certain brand of condoms is 20%. If 15 condoms are used, what is the probability that

(a) No condoms fail?
(b) One condom fails?
(c) Two condoms fail?
(d) At least one condom fails?
(e) All the condoms fail?

6.  The effects of radiation due to ingesting radium have been analyzed for a large number of people. It has been shown that if a certain amount of radium is ingested, 22% of the individuals will develop bone cancer. If eight people ingest the same amount of radium, what is the probability that

             (a) All eight develop bone cancer?

             (b) None develop bone cancer?

             (c) At least one individual develops bone cancer?

(d) Use the appropriate table to find the probability that more than two develop bone cancer.

7.  The probability is 0.95 that an Internet site is online at any time during the day. If 12 individuals try to connect to the Internet site, what is the probability that

(a) All will be able to connect to the site?

(b) 11 will be able to connect to the site?

(c) Fewer than 11 will be able to connect to the site?
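Table look-ups for the binomial exercises above can be cross-checked with a short calculation. The following is a minimal Python sketch under stated assumptions: the helper names `binomial_pmf` and `binomial_cdf` are illustrative and do not come from the text. It evaluates the standard binomial formula, the number of ways to choose k successes from n trials multiplied by p^k (1 - p)^(n - k), for the telemarketer of exercise 1 (n = 5, p = 0.3).

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """Probability of at most k successes in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


# Exercise 1: five calls per hour, sale probability 0.3 per call.
n_calls, p_sale = 5, 0.3

expected_sales = n_calls * p_sale                 # mean of a binomial: n * p
p_exactly_four = binomial_pmf(4, n_calls, p_sale)
p_at_least_two = 1 - binomial_cdf(1, n_calls, p_sale)  # complement of "at most one"

print(expected_sales, p_exactly_four, p_at_least_two)
```

The "at least two" probability is computed through the complement, which mirrors how a cumulative binomial table is normally used.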

Poisson Distribution

  1. Textbook authors and publishers work very hard to minimize the number of errors in a text. However, some errors are unavoidable. A statistical editor reports that the number of errors per chapter has a Poisson distribution with a mean of 3.0. What is the probability that

(a) The chapter has exactly three errors? Express your answer to six places after the decimal and do not use tables.

(b) The chapter has two or more errors? Use the Poisson table (table A2, Appendix).

    2.  X has a Poisson distribution with λ = 20 units/week. In order to find P(X>1) when X is the number of occurrences per two weeks, we use λ = 10 units/two weeks. Use the Poisson table (table A2, Appendix) to find [latex]\displaystyle P(X>2)[/latex].

    3.  (Introduction to Statistics, 2nd Ed, Test Bank, Anderson, D.R., Sweeney, D.J., Williams, T.A., 1991) Given a Poisson random variable x, where the average number of times an event occurs in a certain period of time is 2.5, then what is P(x = 0)?

    4.  (Introductory Business Statistics, Holmes, A., Illowsky, B., Dean, S., OpenStax, 2017) According to a survey, a university professor gets, on average, seven emails per day. Let X = the number of emails a professor receives per day. The discrete random variable X takes on the values x = 0, 1, 2, and so on. The random variable X has a Poisson distribution, [latex]\displaystyle X\sim P(7)[/latex]. The mean is seven emails. Use the Poisson tables (table A2, Appendix) to answer the following questions:

 

(a) What is the probability that an email user receives exactly two emails per day?

(b) What is the probability that an email user receives at most two emails per day?

(c) What is the standard deviation?

    5.  (Introductory Business Statistics, Holmes, A., Illowsky, B., Dean, S., OpenStax, 2017) Text message users receive or send an average of 41.5 text messages per day. Use the Poisson table (table A2, Appendix) to evaluate the following:

 

(a) How many text messages does a text message user receive or send per hour?

(b) What is the probability that a text message user receives or sends two messages per hour?

(c) What is the probability that a text message user receives or sends more than two messages per hour?
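Poisson table values can be verified in the same spirit. The following is a minimal Python sketch, with the helper names `poisson_pmf` and `poisson_cdf` chosen here for illustration only. It evaluates the standard Poisson formula, λ^x e^(-λ) / x!, for the textbook-errors setting of exercise 1 (λ = 3.0).

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean count is lam."""
    return lam**x * exp(-lam) / factorial(x)


def poisson_cdf(x, lam):
    """Probability of at most x occurrences when the mean count is lam."""
    return sum(poisson_pmf(i, lam) for i in range(x + 1))


# Exercise 1: errors per chapter follow a Poisson distribution with mean 3.0.
mean_errors = 3.0

p_exactly_three = poisson_pmf(3, mean_errors)
p_two_or_more = 1 - poisson_cdf(1, mean_errors)  # complement of "at most one"

print(f"{p_exactly_three:.6f} {p_two_or_more:.6f}")
```

When an exercise states the rate for one interval and asks about another, as in exercise 5, the mean should be rescaled to the interval of interest before it is passed as `lam`.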


License


Introductory Statistics Copyright © 2026 by Arzu Sardarli and Andrei Volodin is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.