In this section students will:
A random variable describes the outcomes of a statistical experiment in words. The values of a random variable can vary with each repetition of an experiment.
Upper case letters such as \(X\) or \(Y\) denote a random variable. Lower case letters like \(x\) or \(y\) denote the value of a random variable.
If \(X\) is a random variable, then \(X\) is written in words, and \(x\) is given as a number.
For example, let \(X\) = the number of heads you get when you toss three fair coins. The sample space for the toss of three fair coins is TTT, TTH, THT, HTT, HHT, HTH, THH, HHH. Then, \(x=0,1,2,3\). \(X\) is in words and \(x\) is a number.
Notice that for this example, the \(x\) values are countable outcomes. Because you can count the possible values that \(X\) can take on (the \(x\) values 0, 1, 2, 3) and the outcomes are random, \(X\) is a discrete random variable.
Calculations of probabilities for flipping a coin three times (or flipping 3 coins):
\(P(0~heads)=P(TTT)=(0.5)^3=0.125\)
\(P(1~head)=P(TTH~or~HTT~or~THT)=3(0.5)^3=0.375\)
\(P(2~heads)=P(THH~or~HTH~or~HHT)=3(0.5)^3=0.375\)
\(P(3~heads)=P(HHH)=(0.5)^3=0.125\)
\(X\) is the number of heads
\(x\) are the outcomes: \(x=0,1,2,3\)
All probabilities are valid: \(0\le p(x) \le 1\)
All probabilities sum to \(1\): \(\sum p(x)=1\)
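The three-coin probabilities above can be checked by listing the sample space and counting heads. The following is a minimal sketch, not a prescribed method; the names `outcomes`, `counts`, and `pmf` are illustrative choices, and it assumes each of the eight outcomes is equally likely (fair coins).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 2^3 equally likely outcomes of three fair coin tosses.
outcomes = ["".join(toss) for toss in product("HT", repeat=3)]

# Count how many outcomes give each number of heads x.
counts = Counter(outcome.count("H") for outcome in outcomes)

# Each outcome has probability (1/2)^3 = 1/8, so P(X = x) = count / 8.
pmf = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, p in pmf.items():
    print(f"P(X = {x}) = {p} = {float(p):.3f}")

# Both conditions for a valid discrete distribution hold.
print("each p(x) in [0, 1]:", all(0 <= p <= 1 for p in pmf.values()))
print("sum of p(x):", sum(pmf.values()))
```

Exact fractions are used so that the sum of the probabilities comes out to exactly 1, with no rounding error.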
A probability distribution is a mathematical description of the probabilities of events, which are subsets of the sample space. The sample space, often denoted by \(S\), is the set of all possible outcomes of a random phenomenon being observed; it may be any set: a set of real numbers, a set of descriptive labels, a set of vectors, a set of arbitrary non-numerical values, etc.
A child psychologist is interested in the number of times a newborn baby’s crying wakes its parental unit after midnight. For a random sample of 50 parents, the following information was obtained. Let \(X\) = the number of times per week a newborn baby’s crying wakes its parent after midnight. For this example, \(x = 0, 1, 2, 3, 4, 5\). \(P(X=x)=P(x)\) = probability that \(X\) takes on a value \(x\).
\(X\) is the number of times baby wakes parent
\(x\) takes on the values 0, 1, 2, 3, 4, 5.
This is a discrete PDF because:
Each \(P(X=x)\) is between zero and one, inclusive
The sum of the probabilities is one
Since not all values of a random variable have the same probability, we have to calculate the mean in a slightly different way. The mean is called the expected value, \(E(X)\). It is a weighted mean (weighted average): each value is weighted by its probability, because some values have a greater chance of happening than others.
\[E(X)=\sum_{i=1}^{n} x_ip(x_i)=x_1p(x_1)+x_2p(x_2)+\cdots+x_np(x_n)\]
The expected value is often referred to as the “long-term” average or mean.
This means that over the long term of doing an experiment over and over, you would expect this average.
The Law of Large Numbers states that, as the number of trials in a probability experiment increases, the difference between the theoretical probability of an event and the relative frequency approaches zero (the theoretical probability and the relative frequency get closer and closer together).
When evaluating the long-term results of statistical experiments, we often want to know the “average” outcome.
This “long-term average” is known as the mean or expected value of the experiment and is denoted by the Greek letter \(\mu\).
In other words, after conducting many trials of an experiment, you would expect this average value.
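The Law of Large Numbers and the idea of a long-term average can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses a simulated fair coin (theoretical probability of heads 0.5), the function name `relative_frequency_of_heads` is made up for this illustration, and the seed value is an arbitrary choice made so the run can be repeated.

```python
import random

random.seed(0)  # arbitrary fixed seed so the run is repeatable

def relative_frequency_of_heads(n_flips):
    """Flip a simulated fair coin n_flips times; return the fraction of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# As the number of trials grows, the relative frequency should tend to
# settle near the theoretical probability 0.5.
for n in (10, 100, 1_000, 10_000, 100_000):
    freq = relative_frequency_of_heads(n)
    print(f"n = {n:>6}: relative frequency = {freq:.4f}, "
          f"difference from 0.5 = {abs(freq - 0.5):.4f}")
```

The difference does not have to shrink at every single step, since each run is random, but for large numbers of flips it is typically very small.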
A men’s soccer team plays soccer zero, one, or two days a week. The probability that they play zero days is 0.2, the probability that they play one day is 0.5, and the probability that they play two days is 0.3. How many days per week, on average, would we expect the men’s soccer team to play soccer?
To do the problem, first let the random variable \(X\) = the number of days the men’s soccer team plays soccer per week. \(X\) takes on the values \(x=0, 1, 2\). Construct a PDF table adding a column x*P(x). In this column, you will multiply each x value by its probability.
Add the last column x*P(x) to find the long term average or expected value: \[EX=(0)(0.2)+(1)(0.5)+(2)(0.3)=0+0.5+0.6=1.1\]
The expected value is 1.1.
The men’s soccer team would, on the average, expect to play soccer 1.1 days per week. The number 1.1 is the long-term average or expected value if the men’s soccer team plays soccer week after week after week.
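The soccer calculation can be reproduced in a few lines. This is a minimal sketch: the dictionary `pdf` and the function `expected_value` are illustrative names, and the probabilities are the ones given in the example.

```python
# PDF table for the soccer example: value x -> probability P(x).
pdf = {0: 0.2, 1: 0.5, 2: 0.3}

def expected_value(pdf):
    """Weighted mean: add up x * P(x) over all values x."""
    return sum(x * p for x, p in pdf.items())

# Show the x * P(x) column, then its total.
for x, p in pdf.items():
    print(f"x = {x}, P(x) = {p}, x*P(x) = {x * p:.1f}")

print(f"E(X) = {expected_value(pdf):.1f} days per week")
```

Each printed row corresponds to one row of the PDF table with the added x*P(x) column.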
Suppose you play a game with a biased coin. You play each game by tossing the coin once. \(P(heads)=\frac23\) and \(P(tails)=\frac13\). If you toss a head, you pay $6. If you toss a tail, you win $10. If you play this game many times, will you come out ahead?
Define random variable \(X\) (and \(x\))
\(X=\) the amount of money you win in one game (one coin flip); a negative amount means you pay
\(x\in\{-\$6,\$10\}\)
\(E(X)=\sum x_ip(x_i)=-6(2/3)+10(1/3)=-2/3\approx-0.67\); you do not come out ahead (on average, you are behind by about $0.67 per game)
Variance: \[VX=\sum(x-EX)^2p(x)\]
Standard deviation: \[SDX=\sqrt{VX}\]
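For the biased coin game above, with \(EX=-2/3\):
\[VX=\left(-6-\left(-\tfrac23\right)\right)^2\left(\tfrac23\right)+\left(10-\left(-\tfrac23\right)\right)^2\left(\tfrac13\right)=56.8888889\]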
\[SDX=\sqrt{VX}=\sqrt{56.8888889}=7.5424723\]
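The variance and standard deviation of the biased coin game can be checked numerically. The sketch below follows the formulas above; the names `game`, `expected_value`, and `variance` are illustrative, and winnings are recorded in dollars with a payment of $6 stored as -6.

```python
from math import sqrt

# Winnings in dollars -> probability (heads: pay $6, tails: win $10).
game = {-6: 2 / 3, 10: 1 / 3}

def expected_value(pdf):
    """E(X): sum of x * p(x)."""
    return sum(x * p for x, p in pdf.items())

def variance(pdf):
    """V(X): sum of (x - E(X))^2 * p(x)."""
    mu = expected_value(pdf)
    return sum((x - mu) ** 2 * p for x, p in pdf.items())

ev = expected_value(game)
var = variance(game)
sd = sqrt(var)

print(f"E(X)  = {ev:.4f}")
print(f"V(X)  = {var:.7f}")
print(f"SD(X) = {sd:.7f}")
```

The standard deviation is large compared with the expected value, which reflects how far both possible results (-$6 and $10) sit from the long-term average.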
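There are four characteristics of a binomial experiment: there is a fixed number of trials, \(n\); each trial has only two possible outcomes, a success or a failure; the trials are independent; and the probability of a success, \(p\), is the same on each trial.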
The outcomes of a binomial experiment fit a binomial probability distribution.
The random variable \(X=\) the number of successes obtained in the \(n\) independent trials.
The parameters are \(n\) and \(p\): \(n=\) number of trials, \(p=\) probability of a success on each trial.
\[\text{Shorthand notation: }X\sim bin(n,p) \text{ or }X\sim B(n,p)\]
Read this as “\(X\) is a random variable with a binomial distribution with parameters \(n\) and \(p\).”
Binomial probabilities are found by using the binomial distribution function. Stating the probability question mathematically is the start.
Formula: \[P(X=x)=\binom{n}{x}p^xq^{n-x}~~~~~x=0,1,\cdots,n\]
\[\binom{n}{x}=\frac{n!}{x!(n-x)!}\]
is a combination and reads as “n choose x.” It is the number of ways that \(x\) successes can be placed among \(n\) trials (as an example, \(\binom{3}{2}=3\) is the number of ways to get 2 heads when flipping three coins: THH, HTH, HHT).
Note: \(q=1-p\) is the probability of a failure on each trial, so \(p+q=1\), AND the exponents of \(p\) and \(q\) MUST sum to \(n\) (\(x+(n-x)=n\))
Expected value, variance, standard deviation
\[EX=np~~~VX=npq~~~SDX=\sqrt{npq}\]
In the formula \(\binom{n}{x}=\frac{n!}{x!(n-x)!}\), \(n!=n(n-1)(n-2)(n-3)\cdots(2)(1)\) is “n factorial,” with \(0!=1\). The combination function is found on most calculators and websites (one is linked in this Module in Canvas).
Some hints with combinations:
\[\binom{n}{n}=1,~\binom{n}{1}=n,~ \binom{n}{n-1}=n,~ \binom{n}{0}=1\]
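Combinations can also be computed in software. The sketch below compares the factorial formula with Python's built-in `math.comb` (available in Python 3.8 and later) and checks the hints above; the helper name `n_choose_x` is an illustrative choice, not a standard function.

```python
from math import comb, factorial

def n_choose_x(n, x):
    """Combination from the factorial formula n! / (x! (n - x)!)."""
    return factorial(n) // (factorial(x) * factorial(n - x))

n = 15

# The factorial formula agrees with math.comb for every x from 0 to n.
print(all(n_choose_x(n, x) == comb(n, x) for x in range(n + 1)))

# Two coefficients that appear when working with 15 trials.
print(comb(15, 10))  # 3003
print(comb(15, 13))  # 105

# The hints: C(n, n) = 1, C(n, 1) = n, C(n, n-1) = n, C(n, 0) = 1.
print(comb(n, n), comb(n, 1), comb(n, n - 1), comb(n, 0))
```

Integer division (`//`) keeps the result an exact whole number, which matters once the factorials become large.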
A fair coin is flipped 15 times. Each flip is independent. What is the probability of getting exactly 10 heads? What is the probability of getting more than 12 heads?
Let \(X=\) the number of heads in 15 flips of the fair coin. \(X\) takes on the values \(x=0, 1, 2, 3, \cdots, 15\). Since the coin is fair, \(p=0.5\) and \(q=1-p=0.5\). The number of trials is \(n=15\). State the probability question mathematically and calculate it:
\[P(X=10)=\binom{n}{x}p^xq^{n-x}=\binom{15}{10}(0.5)^{10}(0.5)^{15-10}=(3003)(0.5)^{10}(0.5)^5=0.0916\]
\[P(X>12)=P(X\ge13)=P(X=13)+P(X=14)+P(X=15)\]
\[P(X=13)=\binom{15}{13}(0.5)^{13}(0.5)^{15-13}=(105)(0.5)^{13}(0.5)^2=0.0032\]
\[P(X=14)=\binom{15}{14}(0.5)^{14}(0.5)^{15-14}=(15)(0.5)^{14}(0.5)^1=5\times 10^{-4}\]
\[P(X=15)=\binom{15}{15}(0.5)^{15}(0.5)^{15-15}=(1)(0.5)^{15}(0.5)^0=3\times 10^{-5}\approx 0\]
\[P(X>12)=0.0032+5\times 10^{-4}+3\times 10^{-5}=0.0037\]
[Note: you could calculate these probabilities by listing the outcomes, the way we initially did for flipping a coin three times, but using the binomial distribution for flipping a coin \(15\) times is easier and less time-consuming. Either way is correct, but the binomial is easier whenever there are more than 3 flips.]
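The 15-flip calculation can be sketched in code directly from the binomial formula. The function name `binomial_pmf` and the variable names are illustrative assumptions for this example.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ bin(n, p): C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)

n, p = 15, 0.5  # 15 independent flips of a fair coin

# Probability of exactly 10 heads.
p_exactly_10 = binomial_pmf(10, n, p)

# Probability of more than 12 heads: P(X = 13) + P(X = 14) + P(X = 15).
p_more_than_12 = sum(binomial_pmf(x, n, p) for x in range(13, n + 1))

print(f"P(X = 10) = {p_exactly_10:.4f}")
print(f"P(X > 12) = {p_more_than_12:.4f}")
```

Because the code keeps full precision for each term, \(P(X=15)\) is counted as its small positive value rather than being rounded to 0 before the terms are added.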
About 32% of students participate in a community volunteer program outside of school. If 15 students are selected at random, find the probability that at least 2 of them participate in a community volunteer program outside of school. How many students, on average, participate in a community volunteer program outside of school? Also calculate the variance and standard deviation.
\[n=15,~p=0.32,~q=1-p=1-0.32=0.68\] and thus \[X\sim bin(15, 0.32)\]
\[P(X\ge 2)=P(2)+P(3)+\cdots+P(15)\] or, with far less work, using the complement: \[P(X\ge 2)=1-P(X<2)=1-P(X\le 1)=1-[P(0)+P(1)]\]
\[P(X=0)=\binom{15}{0}(0.32)^{0}(0.68)^{15-0}=(1)(0.32)^{0}(0.68)^{15}=0.0031\]
\[P(X=1)=\binom{15}{1}(0.32)^{1}(0.68)^{15-1}=(15)(0.32)^{1}(0.68)^{14}=0.0217\]
\[P(X\ge 2)=1-[P(0)+P(1)]=1-(0.0031+0.0217)=1-0.0248=0.9752\]
\[EX=np=15(0.32)=4.8\]
\[VX=npq=15(0.32)(0.68)=3.26\]
\[SDX=\sqrt{VX}=\sqrt{3.26}=1.81\]
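The volunteer example can be checked the same way. This sketch computes \(P(X\ge 2)\) both by the complement rule and by the direct sum, to show that the two routes agree; `binomial_pmf` and the other names are illustrative assumptions.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 15, 0.32
q = 1 - p

# Complement rule: P(X >= 2) = 1 - [P(0) + P(1)].
p_at_least_2 = 1 - (binomial_pmf(0, n, p) + binomial_pmf(1, n, p))

# Direct sum P(2) + P(3) + ... + P(15), as a cross-check.
p_direct = sum(binomial_pmf(x, n, p) for x in range(2, n + 1))

mean = n * p           # EX = np
variance = n * p * q   # VX = npq
std_dev = sqrt(variance)

print(f"P(X >= 2) by complement = {p_at_least_2:.4f}")
print(f"P(X >= 2) by direct sum = {p_direct:.4f}")
print(f"EX = {mean:.1f}, VX = {variance:.2f}, SDX = {std_dev:.2f}")
```

The complement rule needs only two terms, while the direct sum needs fourteen, which is why the worked solution uses the complement.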
There are three assumptions for an experiment to have a Poisson distribution, which is used in modelling rare events.
One of the things this distribution was used for was to model the number of horse kicks Prussian soldiers received (seriously, go look it up!)
Poisson is pronounced “pwa-so(n)”. If you have seen The Little Mermaid, recall the song that the chef sings when Sebastian is running from him in the kitchen: “Les Poissons.”
\[\text{Shorthand notation: }X\sim pois(\mu)\text{ or }X\sim P(\mu)\]
\(\mu\) is the average or a rate (when the problem starts from a binomial setting, use \(\mu=np\), the \(EX\) from the binomial)
You are allowed to use your calculator to do as much of the calculation as you want. If you have a TI graphing calculator, there are instructions on p. 267 of the book on how to use the command.
\[P(X=x)=\frac{e^{-\mu} \mu^x}{x!}~~~x=0,1,2,\ldots\]
\[EX=\mu\]
\[VX=\mu\]
\[SDX=\sqrt{\mu}\]
Consider an experiment that consists of counting the number of \(\alpha\)-particles given off in a 1-second time interval by 1 gram of radioactive material. If the average number of \(\alpha\)-particles given off is 3.2, what is the probability of exactly 2 \(\alpha\)-particles given off in the next 1-second interval? What is the probability that no \(\alpha\)-particles are given off in the next 1-second interval? More than 2 \(\alpha\)-particles?
\[X\sim pois(3.2)\text{ or }X\sim P(3.2)\]
\[P(X=x)=\frac{e^{-\mu} \mu^x}{x!}\]
\[P(X=2)=\frac{e^{-3.2}3.2^2}{2!}=0.2087025\]
\[P(X=0)=\frac{e^{-3.2}3.2^0}{0!}=0.0407622\]
\[P(X>2)=1-P(X\le2)=1-[P(2)+P(1)+P(0)]\]
\[=1-\left(\frac{e^{-3.2}3.2^2}{2!}+\frac{e^{-3.2}3.2^1}{1!}+\frac{e^{-3.2}3.2^0}{0!}\right)\]
\[=1-0.3799037=0.6200963\]
\[EX=\mu=3.2\] \[VX=\mu=3.2\] \[SDX=\sqrt{VX}=\sqrt{\mu}=\sqrt{3.2}=1.7888544\]
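The \(\alpha\)-particle calculations can be reproduced from the Poisson formula. This is a minimal sketch; the function name `poisson_pmf` and the variable names are illustrative assumptions.

```python
from math import exp, factorial, sqrt

def poisson_pmf(x, mu):
    """P(X = x) for X ~ pois(mu): e^(-mu) * mu^x / x!."""
    return exp(-mu) * mu**x / factorial(x)

mu = 3.2  # average number of alpha-particles per 1-second interval

p_two = poisson_pmf(2, mu)
p_zero = poisson_pmf(0, mu)

# P(X > 2) = 1 - [P(0) + P(1) + P(2)], using the complement.
p_more_than_two = 1 - sum(poisson_pmf(x, mu) for x in range(3))

print(f"P(X = 2) = {p_two:.7f}")
print(f"P(X = 0) = {p_zero:.7f}")
print(f"P(X > 2) = {p_more_than_two:.7f}")
print(f"EX = {mu}, VX = {mu}, SDX = {sqrt(mu):.7f}")
```

The complement is essential here: \(X\) has no largest value, so \(P(X>2)\) cannot be found by adding up every term above 2.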