As a private tutor for statistics, I have found that distributions and probability are topics that cause a lot of confusion among students. Below are several example questions with solutions to help you understand probability.
Question 1
Three coins are in a box. One is a two-headed coin, another is a two-tailed coin and the third is a fair coin. One of the coins is selected at random and flipped, and heads comes up. What is the probability that it is the two-headed coin?
Solution
Let event A be selecting the two-headed coin, and event B be flipping heads.
We want to find the probability of event A given event B, which can be expressed as P(A|B).
Using Bayes' Theorem, we have:
P(A|B) = P(B|A) * P(A) / P(B)
P(B|A) = 1
P(A) = 1/3
To calculate P(B), there are two possibilities: either we selected the two-headed coin and flipped heads, or we selected the fair coin and flipped heads. The probability of the first scenario is (1/3) * 1 = 1/3, and the probability of the second scenario is (1/3) * 1/2 = 1/6 (since the fair coin has a 1/2 probability of landing on heads). Therefore, P(B) = 1/3 + 1/6 = 1/2.
P(A|B) = P(B|A) * P(A) / P(B)
P(A|B) = 1 * (1/3) / (1/2) = 2/3
The probability that the selected coin is the two-headed coin, given that we flipped a heads, is 2/3.
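The same calculation can be checked with exact fractions. This is only a small sketch: the dictionary names `priors` and `p_heads` are illustrative, and each coin is assumed to be equally likely to be drawn, as in the solution above.

```python
from fractions import Fraction

# Prior probability of drawing each coin (assumed equal) and P(heads | coin).
priors = {
    "two-headed": Fraction(1, 3),
    "two-tailed": Fraction(1, 3),
    "fair": Fraction(1, 3),
}
p_heads = {
    "two-headed": Fraction(1),
    "two-tailed": Fraction(0),
    "fair": Fraction(1, 2),
}

# Law of total probability: P(B) = sum over coins of P(B | coin) * P(coin).
p_b = sum(p_heads[coin] * priors[coin] for coin in priors)

# Bayes' theorem applied to every coin at once.
posterior = {coin: p_heads[coin] * priors[coin] / p_b for coin in priors}

print(p_b)                      # 1/2
print(posterior["two-headed"])  # 2/3
```

The posterior for the fair coin comes out as 1/3 and for the two-tailed coin as 0, so the three posterior probabilities sum to 1.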
Question 2
A large shipment of parts is received, out of which five are tested for defects. Past experience with the supplier leads you to place a beta(1,9) prior on the part defect rate. Suppose no defects are observed in the five tested from the shipment. Give the posterior distribution on the defect rate, the MAP estimate of the defect rate, the Bayes estimate (LMS) of the defect rate, and the 95% HDI.
Solution
i. posterior ∝ prior * likelihood
posterior ∝ Beta(1,9) * Binomial(0; 5, p)
posterior ∝ p^0 * (1 - p)^5 * p^(1-1) * (1 - p)^(9-1)
Combining the exponents, we get:
posterior ∝ p^0 * (1 - p)^13 = p^(1-1) * (1 - p)^(14-1)
This is the kernel of a Beta(1, 14) distribution. Therefore the posterior distribution is Beta(1, 14).
ii. MAP estimate
For a Beta(a, b) distribution with a ≥ 1 and b > 1, the mode is given by (a-1)/(a+b-2), where a and b are the shape parameters of the Beta distribution. Here a = 1 (the first prior shape parameter) + 0 (no defects observed in the sample) = 1, and b = 9 (the second prior shape parameter) + 5 (the number of non-defective parts in the sample) = 14.
The MAP estimate of the defect rate is:
MAP = (1-1)/(1+14-2) = 0/13 = 0
iii. The Bayes estimate of the defect rate
The Bayes (LMS) estimate of the defect rate is the posterior mean, which for a Beta(a, b) distribution is a/(a+b):
Estimate = a/(a+b) = 1/(1+14) = 0.0667
iv. The 95% HDI
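The Beta(1, 14) density, 14 * (1 - p)^13, is strictly decreasing on [0, 1], so the highest-density interval starts at 0. Its upper limit solves 1 - (1 - p)^14 = 0.95, which gives p = 1 - 0.05^(1/14) = 0.1926. The 95% HDI is therefore [0, 0.1926]: given the prior and the data, there is a 95% posterior probability that the defect rate falls within this range. The interval (0.0018, 0.2316) is the equal-tailed 95% credible interval, which is not the HDI for a skewed posterior like this one.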
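The four answers to Question 2 can be reproduced with a few lines of standard-library Python. This is a sketch that relies on one special feature of this problem: when the first shape parameter is 1, the Beta(1, b) CDF has the closed form 1 - (1 - p)^b, so no statistics package is needed. The helper name `quantile` is an illustrative choice.

```python
# Conjugate update: Beta(1, 9) prior, 0 defects among 5 tested parts.
prior_a, prior_b = 1, 9
tested, defects = 5, 0
a = prior_a + defects              # 1
b = prior_b + (tested - defects)   # 14

map_estimate = (a - 1) / (a + b - 2)   # posterior mode
bayes_estimate = a / (a + b)           # posterior mean (LMS estimate)


def quantile(q, b):
    """Inverse CDF of Beta(1, b): solves 1 - (1 - p)**b = q for p."""
    return 1 - (1 - q) ** (1 / b)


# Closed form below is only valid for a == 1 (strictly decreasing density).
assert a == 1

# Decreasing density: the 95% HDI starts at 0 and ends at the 0.95 quantile.
hdi = (0.0, quantile(0.95, b))

# Equal-tailed interval for comparison: cut 2.5% from each tail.
equal_tailed = (quantile(0.025, b), quantile(0.975, b))

print(map_estimate, round(bayes_estimate, 4))
print(tuple(round(x, 4) for x in hdi))
print(tuple(round(x, 4) for x in equal_tailed))
```

The HDI is shorter than the equal-tailed interval because it trades the low-density upper tail for the high-density region next to 0.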
Question 3
Out of a production lot of electronic components, six are to be tested to estimate the mean component lifetime, θ. The six components to be tested are independently selected from the production line and their lifetimes X1, ..., X6 are observed. It is known that component lifetime is an exponential random variable with mean (expected value) θ. Suppose the six observed lifetimes are 15, 12, 14, 10, 12, 11.
(a) Find the likelihood function and maximum likelihood estimate for θ. Calculate a 95% confidence interval using classical methods. Discuss why this confidence interval might be suspect, e.g., what assumptions may not hold.
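The likelihood function is given by:
L(θ; X1,...,X6) = f(X1;θ) x f(X2;θ) x f(X3;θ) x f(X4;θ) x f(X5;θ) x f(X6;θ)
= (1/θ)e^(-15/θ) x (1/θ)e^(-12/θ) x (1/θ)e^(-14/θ) x (1/θ)e^(-10/θ) x (1/θ)e^(-12/θ) x (1/θ)e^(-11/θ)
= (1/θ)^6 e^(-74/θ)
Taking logs, log L(θ) = -6 log θ - 74/θ. Setting the derivative -6/θ + 74/θ^2 equal to zero gives the maximum likelihood estimate θ = 74/6 = 12.33, which is the sample mean.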
A classical 95% confidence interval based on the t distribution is:
x̄ ± t(0.025, 5) * s/√6
x̄ = (15+12+14+10+12+11)/6 = 12.33
The sample standard deviation is:
s = √[((15-12.33)^2 + (12-12.33)^2 + (14-12.33)^2 + (10-12.33)^2 + (12-12.33)^2 + (11-12.33)^2)/5] = √(17.33/5) = 1.862
t(0.025, 5) = 2.571
12.33 ± 2.571 * 1.862/√6 = 12.33 ± 1.95
The 95% confidence interval is [10.38, 14.29].
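This interval is suspect for several reasons. The sample size of 6 is small, and the t interval assumes that the sample mean is approximately normally distributed, which is doubtful for six draws from a skewed exponential distribution. For an exponential distribution the standard deviation equals the mean, yet the sample standard deviation here is far smaller than the sample mean of 12.33, so either the interval is too narrow or the exponential assumption itself is questionable. Finally, the analysis assumes that the observed lifetimes are independent and identically distributed exponential random variables with mean θ. If there are variations in the production line, this assumption may not hold.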
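The arithmetic for part (a) is easy to get wrong by hand, so here is a short sketch that recomputes it with the standard library. The critical value 2.571 is taken from a t table (5 degrees of freedom) rather than computed, and `log_likelihood` is an illustrative helper name.

```python
import math
import statistics

lifetimes = [15, 12, 14, 10, 12, 11]
n = len(lifetimes)


def log_likelihood(theta):
    """Exponential log-likelihood with mean theta: -n*log(theta) - sum(x)/theta."""
    return -n * math.log(theta) - sum(lifetimes) / theta


# For the exponential model the MLE of the mean is the sample mean.
theta_hat = statistics.mean(lifetimes)

# Classical t interval: x_bar +/- t * s / sqrt(n), with s using n - 1.
s = statistics.stdev(lifetimes)
t_crit = 2.571  # t(0.025, 5), looked up in a table (assumption: not computed here)
half_width = t_crit * s / math.sqrt(n)
ci = (theta_hat - half_width, theta_hat + half_width)

print(round(theta_hat, 2), round(s, 3))
print(tuple(round(x, 2) for x in ci))
```

Evaluating `log_likelihood` at values on either side of `theta_hat` is a quick way to confirm that the sample mean really is the maximizer.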
(b) From past experience, it is known that, among production lots, θ is distributed according to an inverse gamma distribution with a = 10, b = 100. Give the MAP estimate of mean component lifetime, the Bayes estimate (LMS) of the mean component lifetime, and the 90% HDI of mean component lifetime.
a. MAP estimate of the mean
We know that the component lifetime X follows an exponential distribution with unknown mean θ. The prior on θ is an inverse gamma distribution with parameters a = 10 and b = 100.
The likelihood is given below:
L(θ | x1, ..., x6) = ∏(1/θ) exp(-xi/θ) = (1/θ)^6 exp(-∑xi/θ)
By Bayes' theorem, the posterior is:
P(θ | x1, ..., x6) ∝ L(θ | x1, ..., x6) P(θ)
= (1/θ)^6 exp(-∑xi/θ) * (b^a / Γ(a)) θ^(-a-1) exp(-b/θ)
log P(θ | x1, ..., x6) = -(a + 6 + 1) log θ - (b + ∑xi)/θ + constant
Taking the derivative of the log posterior with respect to θ and setting it to zero gives -(a + 7)/θ + (b + ∑xi)/θ^2 = 0. Solving for θ, we get:
θ = (b + ∑xi) / (a + 6 + 1)
θ = (100 + 15 + 12 + 14 + 10 + 12 + 11) / (10 + 6 + 1) = 174/17 = 10.24
The MAP estimate of the mean component lifetime is 10.24 units.
b. The Bayes estimate (LMS) of the mean component lifetime
The posterior is given by:
P(θ | x1, ..., x6) ∝ (1/θ)^6 exp(-∑xi/θ) * (b^a / Γ(a)) θ^(-a-1) exp(-b/θ) ∝ θ^(-(a+6)-1) exp(-(b + ∑xi)/θ)
The posterior distribution is the inverse gamma distribution with parameters a' = a + n = 16 and b' = b + ∑xi = 100 + 74 = 174, where n is the sample size (in this case, n = 6).
The Bayes (LMS) estimate is the posterior mean:
Estimate = b' / (a' - 1) = 174 / 15 = 11.60
The 90% HDI is computed from the posterior density, which is inverse gamma with parameters a' = a + n = 16 and b' = b + ∑xi = 174:
f(x; a', b') = (b'^a' / Γ(a')) x^(-a'-1) exp(-b'/x)
where x > 0 is the parameter of interest, and Γ is the gamma function.
The HDI is the shortest interval [l, u] that contains 90% of the posterior probability, which for this unimodal density means f(l) = f(u). There is no closed form, so the limits are found numerically, giving approximately l = 6.9 and u = 16.2. Therefore, the 90% HDI for the mean lifetime of a component is approximately [6.9, 16.2]. This range contains the 90% of the posterior probability mass with the highest density, including the posterior mode.
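Since the HDI has to be found numerically, a sketch of one way to do it may help. It assumes the conjugate posterior Inverse-Gamma(16, 174) (shape 10 + 6, scale 100 + 74) and uses the fact that, for an integer shape, the inverse gamma CDF reduces to a finite Poisson-type sum, so only the standard library is needed. The function names `cdf`, `quantile` and `hdi` are illustrative, and the HDI is located by a simple scan over the lower tail probability rather than by a dedicated optimizer.

```python
import math

# Posterior Inverse-Gamma(shape, scale): shape = 10 + 6, scale = 100 + 74.
shape, scale = 16, 174

map_estimate = scale / (shape + 1)    # posterior mode
bayes_estimate = scale / (shape - 1)  # posterior mean (LMS estimate)


def cdf(t):
    """P(theta <= t); a finite sum because the shape is an integer."""
    lam = scale / t
    term, total = math.exp(-lam), 0.0
    for i in range(shape):
        total += term            # adds exp(-lam) * lam**i / i!
        term *= lam / (i + 1)
    return total


def quantile(q, lo=1e-6, hi=1e3):
    """Invert the CDF by bisection on [lo, hi]."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def hdi(mass=0.90, steps=400):
    """Shortest interval holding `mass` probability, scanning the lower tail."""
    best = None
    for k in range(1, steps):
        lower_tail = (1 - mass) * k / steps
        lo, hi = quantile(lower_tail), quantile(lower_tail + mass)
        if best is None or hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best


hdi_low, hdi_high = hdi()
print(round(map_estimate, 2), round(bayes_estimate, 2))
print(round(hdi_low, 1), round(hdi_high, 1))
```

For a unimodal density the shortest interval with a given probability content is the HDI, which is why minimizing the width works here. The scan is coarse, so the limits are only reliable to about one decimal place.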
c. Comparison of the two approaches
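The Bayesian 90% HDI for the mean component lifetime is approximately [6.9, 16.2], while the classical 95% confidence interval is [10.38, 14.29]. The intervals overlap, but the Bayesian interval is wider even though its level is lower. The t interval is driven by the small sample standard deviation, whereas the Bayesian interval is built on the exponential model and the prior, under which six observations still leave considerable uncertainty about the mean. The two intervals also have different interpretations: the classical technique treats the parameter as fixed but unknown, so the confidence level describes the procedure, while the Bayesian approach makes a direct probability statement about the parameter given the data and the prior.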
When the population distribution is known or can be assumed to be normal and the sample size is large, the classical technique may be selected. When the sample size is small, the population distribution is uncertain or non-normal, or there is prior knowledge that can be integrated into the analysis, the Bayesian method may be selected. Because the sample size is small and we have prior information about the distribution of the parameter, the Bayesian approach may be preferable here.
Question 4
An oncologist believes that 90% of cancer patients will positively respond to a new interferon treatment and that it is unlikely that this proportion will be below 80%.
a. Determine a beta prior on the (positive) response proportion, θ, which models the oncologist's beliefs. (Hint: Find the beta which has mean µ = 0.9 and µ - 2σ = 0.8, where σ is the standard deviation of the desired beta. Relate these conditions to the parameters defining a beta distribution.)
Let the positive response proportion be denoted by p. The oncologist believes that the expected proportion of positive responses is 0.9, so we have:
µ = E[p] = 0.9
The oncologist also believes that it is unlikely that the proportion will be below 0.8. Following the hint, we place 0.8 two standard deviations below the mean:
µ - 2σ = 0.8, so σ = (0.9 - 0.8)/2 = 0.05 and σ^2 = 0.0025, where σ is the standard deviation of the beta distribution.
We can use these two conditions to solve for the shape parameters of the beta distribution. Let a and b be the shape parameters. Then we have:
a / (a + b) = µ = 0.9
a * b / ((a + b)^2 * (a + b + 1)) = σ^2 = 0.0025
Since a * b / (a + b)^2 = µ * (1 - µ), the second condition gives a + b = µ * (1 - µ) / σ^2 - 1 = 0.09/0.0025 - 1 = 35. Solving for a and b, we get:
a = µ * (µ * (1 - µ) / σ^2 - 1) = 0.9 * 35 = 31.5
b = a * (1 / µ - 1) = 31.5 * (1/9) = 3.5
Therefore, the beta prior that models the oncologist's beliefs is Beta(31.5, 3.5).
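Matching a beta distribution to a target mean and standard deviation comes up often enough that it is worth wrapping in a small function. The sketch below uses the standard moment formulas for the beta distribution; the function name `beta_from_mean_sd` is an illustrative choice, not a library call.

```python
def beta_from_mean_sd(mean, sd):
    """Return (a, b) of the Beta distribution with the given mean and sd."""
    # nu = a + b follows from var = mean * (1 - mean) / (nu + 1).
    nu = mean * (1 - mean) / sd**2 - 1
    if nu <= 0:
        raise ValueError("sd is too large for a beta distribution with this mean")
    return mean * nu, (1 - mean) * nu


# Oncologist's beliefs: mean 0.9 and mean - 2*sd = 0.8, so sd = 0.05.
a, b = beta_from_mean_sd(0.9, 0.05)

# Check by plugging the result back into the beta mean and variance formulas.
check_mean = a / (a + b)
check_sd = (a * b / ((a + b) ** 2 * (a + b + 1))) ** 0.5

print(round(a, 4), round(b, 4))
print(round(check_mean, 4), round(check_sd, 4))
```

The guard on `nu` matters because not every mean and standard deviation pair is attainable: the variance of a beta distribution is always below mean * (1 - mean).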
b. Give the posterior distribution on the proportion.
Let n be the sample size and y be the number of patients who positively responded to the interferon treatment. Then the likelihood function is:
L(p|y,n) = p^y * (1-p)^(n-y)
Using Bayes' theorem, the posterior distribution is proportional to the product of the likelihood and the prior:
f(p|y,n) ∝ p^y * (1-p)^(n-y) * p^(a-1) * (1-p)^(b-1)
We can simplify this expression by combining the powers of p and (1-p):
f(p|y,n) ∝ p^(y+a-1) * (1-p)^(n-y+b-1)
This is the kernel of a beta distribution with parameters (y + a, n - y + b). With a = 31.5 and b = 3.5, the shape parameters that give the prior mean 0.9 and standard deviation 0.05, the posterior distribution is:
f(p|y,n) = Beta(y+31.5, n-y+3.5)
Therefore, the posterior distribution on the proportion is also a beta distribution with updated shape parameters (y+31.5, n-y+3.5).
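To see the update in action, here is a minimal sketch of the conjugate rule as a function. The question gives no trial data, so the values n = 20 and y = 17 below are purely hypothetical, and the prior parameters 31.5 and 3.5 are the ones implied by a prior mean of 0.9 and standard deviation of 0.05. The name `update_beta` is illustrative.

```python
def update_beta(a, b, successes, trials):
    """Beta(a, b) prior plus binomial data gives Beta(a + y, b + n - y)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return a + successes, b + (trials - successes)


# Prior implied by mean 0.9 and sd 0.05 (assumption stated in the lead-in).
prior_a, prior_b = 31.5, 3.5

# Hypothetical trial: 17 of 20 patients respond positively.
n, y = 20, 17
post_a, post_b = update_beta(prior_a, prior_b, y, n)
post_mean = post_a / (post_a + post_b)

print(post_a, post_b)        # 48.5 6.5
print(round(post_mean, 4))   # 0.8818
```

The posterior mean sits between the prior mean of 0.9 and the observed proportion 17/20 = 0.85, weighted by the prior "sample size" a + b = 35 and the actual sample size n = 20.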
If you need further assistance and would like to have private lessons, I'm a statistics tutor who can help you. Feel free to reach out to me for further information about classes.