In this section students will:
The graph of a continuous probability distribution is a curve. The probability of interest is represented by the area under the curve.
The curve is called the probability density function (pdf) and is written in function notation as \(f(x)\). We use the density function \(f(x)\) to draw the graph of a probability distribution.
Area under the curve is given by a different (related) function called the cumulative distribution function (cdf). The cumulative distribution function is used to find probabilities as areas under the curve.
The normal distribution, which is continuous, is the most important of all the distributions. It is widely used and even more widely abused.
Its graph is bell-shaped. You see the bell curve in almost all disciplines. Two important points about this curve are:
The normal distribution has two parameters (two numerical descriptive measures), the mean \(\mu\) and the standard deviation \(\sigma\).
If \(X\) is a quantity to be measured that has a normal distribution with mean, \(\mu\), and standard deviation, \(\sigma\), we designate this by writing \(X\sim N(\mu,\sigma)\).
The cumulative distribution function is \(P(X<x)\). It is calculated either by a calculator or a computer, or it is looked up in a table.
The normal distribution has two parameters: mean (\(\mu\)) and standard deviation (\(\sigma\)).
\[\text{Shorthand notation: }X \sim N(\mu,\sigma)\]
The probability density function (pdf) for a normal distribution is:
\[f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\]
Computing probabilities directly from this function requires calculus (integration). We will not need to do that, because there is another, easier way to calculate probabilities.
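To make the link between the density formula and areas concrete, here is a minimal sketch that evaluates \(f(x)\) from the formula above, approximates the area under it numerically, and compares the result with the cumulative distribution function from Python's standard `statistics.NormalDist`. The helper names `normal_pdf` and `area_under_pdf`, and the values \(\mu=10\), \(\sigma=2\), are illustrative assumptions, not part of the course material.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Evaluate the normal density f(x) using the pdf formula."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma**2)
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma**2))


def area_under_pdf(a, b, mu, sigma, steps=10_000):
    """Approximate the area under f(x) between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * width, mu, sigma)
    return total * width


# Hypothetical parameters chosen only for illustration.
mu, sigma = 10.0, 2.0
dist = NormalDist(mu, sigma)

# Area between 8 and 12, computed two ways.
area_by_integration = area_under_pdf(8.0, 12.0, mu, sigma)
area_by_cdf = dist.cdf(12.0) - dist.cdf(8.0)

print(round(area_by_integration, 4), round(area_by_cdf, 4))
```

Both numbers should agree to several decimal places, which is why the cdf (from software or a table) can stand in for the integration.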
The standard normal distribution is a normal distribution of standardized values, called \(z\)-scores. The \(z\)-scores are measured in units of the standard deviation. It is always centered at 0 and always has a standard deviation of 1, thus
\[Z \sim N(0,1)\]
\(Z\)-score logistics
A \(z\)-score is the number of standard deviations a value lies from the mean. If \(z=-1\), the value is one standard deviation below the mean; if \(z=1\), it is one standard deviation above the mean. Values of \(x\) smaller than the mean have negative \(z\)-scores, and values of \(x\) larger than the mean have positive \(z\)-scores. If \(x\) equals the mean, its \(z\)-score is 0. Because they are unit-free, \(z\)-scores allow comparison of data that originate on different scales.
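As a small illustration of comparing values from different scales, the sketch below standardizes two scores from two made-up exams. The function name `z_score` and all of the numbers are hypothetical and chosen only to show the idea.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Hypothetical exam A: mean 75, standard deviation 5, student scored 82.
z_a = z_score(82, 75, 5)

# Hypothetical exam B: mean 500, standard deviation 100, student scored 610.
z_b = z_score(610, 500, 100)

# The larger z-score is the better result relative to its own exam.
better = "A" if z_a > z_b else "B"
print(z_a, z_b, better)
```

Although 610 is the larger raw score, the score of 82 sits further above its own mean in standard deviation units.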
The Empirical Rule is derived from the normal distribution:
About 68% of observations are within the interval \(\overline{X}\pm 1s\)
About 95% of observations are within the interval \(\overline{X}\pm 2s\)
About 99.7% of observations are within the interval \(\overline{X}\pm 3s\)
One of the standard normal examples will show the area between \(-1\) and \(1\).
The standard normal distribution has mean \(\mu=0\) and standard deviation \(\sigma=1\), thus
\[Z \sim N(0,1)\]
If \(X\sim N(\mu,\sigma)\) then
\[z=\frac{X-\mu}{\sigma}\]
and, solving for \(X\), \[X=z\sigma+\mu\]
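The two conversions are inverses of each other, which the following sketch checks with a round trip. The function names `to_z` and `from_z` and the values \(\mu=50\), \(\sigma=4\) are assumptions made for illustration.

```python
def to_z(x, mu, sigma):
    """Standardize: convert a raw value x to a z-score."""
    return (x - mu) / sigma


def from_z(z, mu, sigma):
    """Unstandardize: convert a z-score back to the original scale."""
    return z * sigma + mu


# Hypothetical normal distribution with mean 50 and standard deviation 4.
mu, sigma = 50.0, 4.0

z = to_z(58.0, mu, sigma)      # 58 is two standard deviations above the mean
x_back = from_z(z, mu, sigma)  # converting back recovers the raw value

print(z, x_back)
```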
To find a \(z\)-score associated with a probability (percent)
With \(z\)-scores:
\(P(Z<1)\)
\(P(Z>1)\)
\(P(Z<-1)\)
\(P(Z>-1)\)
\(P(-1<Z<1)\)
\(z\)-score for top 1%: \(P(Z>z_0)=0.01\)
\(z\)-score for \(Q1\): \(P(Z<z_0)=0.25\)
\(z\)-score for \(Q3\): \(P(Z<z_0)=0.75\)
With \(z\)-scores:
\(P(Z<1)=0.8413447\)
\(P(Z>1)=1-0.8413447=0.1586553\)
\(P(Z<-1)=0.1586553\)
\(P(Z>-1)=1-0.1586553=0.8413447\)
\(P(-1<Z<1)=0.8413447-0.1586553=0.6826895\)
\(z\)-score for top 1%: \(P(Z>z_0)=0.01 \Rightarrow z_0=2.326\)
\(z\)-score for \(Q1\): \(P(Z<z_0)=0.25 \Rightarrow z_0=-0.674\)
\(z\)-score for \(Q3\): \(P(Z<z_0)=0.75 \Rightarrow z_0=0.674\)
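These probabilities and \(z\)-scores can be reproduced with software. The sketch below uses `statistics.NormalDist` from the Python standard library as one possible tool (the course may use a different calculator or program): `cdf` gives the area to the left of a value, and `inv_cdf` gives the value that has a stated area to its left. The variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal distribution

# Probabilities: cdf(z) is the area to the left of z.
p_less_1 = Z.cdf(1)                   # P(Z < 1)
p_greater_1 = 1 - Z.cdf(1)            # P(Z > 1)
p_less_neg1 = Z.cdf(-1)               # P(Z < -1)
p_greater_neg1 = 1 - Z.cdf(-1)        # P(Z > -1)
p_between = Z.cdf(1) - Z.cdf(-1)      # P(-1 < Z < 1)

# z-scores from probabilities: inv_cdf(p) returns z0 with P(Z < z0) = p.
z_top_1_percent = Z.inv_cdf(1 - 0.01)  # top 1% means 99% lies to the left
z_q1 = Z.inv_cdf(0.25)
z_q3 = Z.inv_cdf(0.75)

for name, value in [
    ("P(Z<1)", p_less_1),
    ("P(Z>1)", p_greater_1),
    ("P(Z<-1)", p_less_neg1),
    ("P(Z>-1)", p_greater_neg1),
    ("P(-1<Z<1)", p_between),
    ("top 1%", z_top_1_percent),
    ("Q1", z_q1),
    ("Q3", z_q3),
]:
    print(f"{name}: {value:.7f}")
```

Note that a "greater than" probability is obtained by subtracting the left-hand area from 1, and that the top 1% is looked up as the 99th percentile.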
Empirical Rule derivation
\(P(-1<Z<1)=0.8413447-0.1586553=0.6826895\)
\(P(-2<Z<2)=0.9772499-0.0227501=0.9544997\)
\(P(-3<Z<3)=0.9986501-0.0013499=0.9973002\)
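The same three areas can be checked in a short loop. This is a sketch using Python's standard `statistics.NormalDist`; the variable name `within` is an illustrative choice.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)  # standard normal distribution

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"P(-{k} < Z < {k}) = {area:.4f}")
```

Rounded to the nearest tenth of a percent, these areas give the 68%, 95%, and 99.7% figures of the Empirical Rule.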
Suppose a pizza company has a rule that your pizza must be at least 16” or you will receive it for free. Assume pizza sizes follow a normal distribution. At one restaurant, the mean pizza size is 16.3” with a standard deviation of 0.2”. Find the following:
Shorthand notation: \(X\sim N(16.3, 0.2)\)
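The specific questions for this example are not listed here. As one illustrative calculation (an assumption about what might be asked, not the original exercise), the sketch below finds the probability that a pizza is smaller than 16” and is therefore free, first by standardizing and then directly from \(N(16.3, 0.2)\). The variable names are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 16.3, 0.2   # parameters given in the example
cutoff = 16.0           # pizzas below this size are free

# Route 1: standardize the cutoff, then use the standard normal cdf.
z_cutoff = (cutoff - mu) / sigma
p_free_via_z = NormalDist(0, 1).cdf(z_cutoff)

# Route 2: use the normal distribution with the given mean and sd directly.
p_free_direct = NormalDist(mu, sigma).cdf(cutoff)

print(f"z = {z_cutoff:.2f}")
print(f"P(X < 16) via z-score: {p_free_via_z:.4f}")
print(f"P(X < 16) directly:    {p_free_direct:.4f}")
```

The cutoff sits 1.5 standard deviations below the mean, so under these assumptions roughly 6.7% of pizzas would be free.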
There is no solution for a normal Spongebob…he will be crazy forever…forever…forever…