The Normal Distribution

Normal Distributions

In this section students will:

Graph a normal curve
Identify the properties of a normal distribution
Apply the empirical rule to solve real-world applications
Given mean and standard deviation, convert raw data to z-scores
Given mean and standard deviation, convert z-scores to raw data
Compute probabilities of the normal distribution to solve real-world applications

Properties of Continuous Probability Distributions

The graph of a continuous probability distribution is a curve. The probability of interest is represented by the area under the curve.

The curve is called the probability density function (pdf) and is written in function notation as \(f(x)\). We use the density function \(f(x)\) to draw the graph of a probability distribution.

Area under the curve is given by a different (related) function called the cumulative distribution function (cdf). The cumulative distribution function is used to find probabilities as areas under the curve.

Outcomes are measured, not counted
The entire area under the curve and above the \(x\)-axis is equal to 1
Probability is found for intervals of \(x\) values rather than for individual \(x\) values
\(P(a<X<b)\) is the probability that the random variable \(X\) is in the interval between the values \(a\) and \(b\)
\(P(X=c)=0\). The probability that \(X\) takes on any single individual value is zero. The region below the curve, above the \(x\)-axis, and between \(x=c\) and \(x=c\) has no width, and therefore no area (area = 0). Since probabilities are areas under the curve, the probability is also zero (tl;dr: the area under the curve at a single point is zero)
\(P(a<X<b)=P(a\le X\le b)\); they are the same because including the endpoints adds no area

Introduction to Normal

The normal, a continuous distribution, is the most important of all the distributions. It is widely used and even more widely abused.

Its graph is bell-shaped. You see the bell curve in almost all disciplines. Two important points about this curve are:

The curve is symmetrical about a vertical line drawn through the mean, \(\mu\)
The area under the curve must equal one

The normal distribution has two parameters (two numerical descriptive measures), the mean \(\mu\) and the standard deviation \(\sigma\).

If \(X\) is a quantity to be measured that has a normal distribution with mean, \(\mu\), and standard deviation, \(\sigma\), we designate this by writing \(X\sim N(\mu,\sigma)\).

The cumulative distribution function is \(P(X<x)\). It is calculated either by a calculator or a computer, or it is looked up in a table.

Terms

The normal distribution has two parameters: mean (\(\mu\)) and standard deviation (\(\sigma\)).

\[\text{Shorthand notation: }X \sim N(\mu,\sigma)\]

The probability density function (pdf) for a normal distribution is:

\[f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\]

Finding probabilities directly from this function requires calculus, but we will not need it; there is another (easier) way to calculate probabilities.

Normal distribution with mean 10, stdev 3
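As a rough illustration of what the density formula does, the sketch below codes \(f(x)\) term by term and approximates an area under it with a simple Riemann sum, then compares the result with a cdf. It assumes Python's standard-library `statistics.NormalDist` as the reference; the helper names `normal_pdf` and `area_under_pdf` and the interval from 7 to 13 are illustrative choices, not part of the notes.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density f(x), written term by term from the pdf formula."""
    coefficient = 1 / math.sqrt(2 * math.pi * sigma**2)
    exponent = -((x - mu) ** 2) / (2 * sigma**2)
    return coefficient * math.exp(exponent)

def area_under_pdf(a, b, mu, sigma, steps=20_000):
    """Approximate the area under f(x) from a to b with a midpoint Riemann sum."""
    width = (b - a) / steps
    heights = (normal_pdf(a + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return width * sum(heights)

mu, sigma = 10, 3                  # mean 10, standard deviation 3
reference = NormalDist(mu, sigma)  # standard-library implementation, used as a check

area_by_sum = area_under_pdf(7, 13, mu, sigma)      # P(7 < X < 13) from the density
area_by_cdf = reference.cdf(13) - reference.cdf(7)  # same probability from the cdf
# Essentially all of the area lies within 10 standard deviations of the mean.
total_area = area_under_pdf(mu - 10 * sigma, mu + 10 * sigma, mu, sigma)

print(f"P(7 < X < 13): {area_by_sum:.6f} (sum) vs {area_by_cdf:.6f} (cdf)")
print(f"total area: {total_area:.6f}")
```

The two areas should agree to several decimal places, which is why a cdf (or a table built from it) can stand in for the calculus.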

Standard Normal Distribution

The standard normal distribution is a normal distribution of standardized values, called \(z\)-scores. The \(z\)-scores are measured in units of the standard deviation. It is always centered at 0 and always has a standard deviation of 1, thus

\[Z \sim N(0,1)\]

\(Z\)-score logistics

A \(z\)-score is the number of standard deviations a value lies above or below the mean. If \(z=-1\), the value is one standard deviation below the mean; if \(z=1\), it is one standard deviation above the mean. Values of \(x\) that are smaller than the mean have negative \(z\)-scores, and values of \(x\) that are larger than the mean have positive \(z\)-scores. If \(x\) is equal to the mean, then it has a \(z\)-score of 0. \(Z\)-scores allow comparison of data that originate on different scales.

The Empirical Rule is derived from the normal distribution:

About 68% of observations are within the interval \(\overline{X}\pm 1s\)
About 95% of observations are within the interval \(\overline{X}\pm 2s\)
About 99.7% of observations are within the interval \(\overline{X}\pm 3s\)

One of the Standard Normal examples below shows the area between \(-1\) and \(1\).

Standard Normal Graph

The standard normal distribution has mean \(\mu=0\) and standard deviation \(\sigma=1\), thus

\[Z \sim N(0,1)\]

Standard normal distribution with mean 0, stdev 1

Standard normal distribution formulas and probabilities

If \(X\sim N(\mu,\sigma)\) then

\[z=\frac{X-\mu}{\sigma}\]
and \[X=z\sigma+\mu\]
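The two conversion formulas translate directly into a pair of small functions. This is only a sketch: the names `to_z` and `to_x` are hypothetical, and the numbers (mean 10, standard deviation 3) are illustrative values rather than data from an example.

```python
def to_z(x, mu, sigma):
    """Standardize a raw value: z = (x - mu) / sigma."""
    return (x - mu) / sigma

def to_x(z, mu, sigma):
    """Convert a z-score back to the raw scale: x = z * sigma + mu."""
    return z * sigma + mu

# Illustrative values only: a distribution with mean 10 and standard deviation 3.
z = to_z(16, mu=10, sigma=3)  # 16 is two standard deviations above the mean
x = to_x(-1, mu=10, sigma=3)  # the value one standard deviation below the mean

print(z, x)
```

Applying `to_x` to the output of `to_z` with the same mean and standard deviation returns the original raw value, which is a quick way to check either conversion.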

Calculating (finding) probabilities with the table

Calculate the Z-score: Use the formula \(z=\frac{x-\mu}{\sigma}\) (where \(x\) is the value of interest, \(\mu\) is the mean, and \(\sigma\) is the standard deviation).
Locate the Z-score: Find the first two digits (example: 1.2) in the left-hand column and the second decimal digit (example: 0.04) in the top row.
Find the Intersection: The value where the row and column intersect is the probability, which represents the area to the left of that Z-score.
Interpret the Result
1. Less than (\(P(Z<z)\)): Use the table value directly
2. Greater than (\(P(Z>z)\)): Subtract the table value from 1 (\(P(Z>z)=1-P(Z<z)\)).
3. Between two scores: Find the area for both, then subtract the smaller area from the larger (\(P(Z<z_{larger})-P(Z<z_{smaller})\))
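The three interpretation cases can be sketched in code, with a cdf call standing in for the table lookup. This assumes Python's standard-library `statistics.NormalDist`; the function names `p_less`, `p_greater`, and `p_between` are illustrative, and \(z=1.24\) is simply the row 1.2, column 0.04 lookup used as the example in the steps above.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def p_less(z):
    """P(Z < z): the left-tail area a table lookup would give."""
    return Z.cdf(z)

def p_greater(z):
    """P(Z > z): one minus the left-tail area."""
    return 1 - Z.cdf(z)

def p_between(z_smaller, z_larger):
    """P(z_smaller < Z < z_larger): larger left-tail area minus the smaller one."""
    return Z.cdf(z_larger) - Z.cdf(z_smaller)

print(f"P(Z < 1.24)         = {p_less(1.24):.4f}")
print(f"P(Z > 1.24)         = {p_greater(1.24):.4f}")
print(f"P(-1.24 < Z < 1.24) = {p_between(-1.24, 1.24):.4f}")
```

A printed table rounds the \(z\)-score to two decimals and the area to four, so values from code may differ from table values in the last digit.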

To find a z-score associated with a probability (percent)

Determine the left-tail area (the probability to the left of the unknown \(z\)-score) and locate the closest value in the body of the table
Read the row and column headings at that intersection to get the \(z\)-score
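Going from a probability back to a \(z\)-score is the inverse of the table lookup. A minimal sketch, assuming Python's standard-library `NormalDist.inv_cdf` as the inverse lookup; the helper names `z_for_left_area` and `z_for_right_area` are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def z_for_left_area(p):
    """z-score with area p to its left, i.e. P(Z < z) = p."""
    return Z.inv_cdf(p)

def z_for_right_area(p):
    """z-score with area p to its right; convert to a left area first."""
    return Z.inv_cdf(1 - p)

top_1_percent = z_for_right_area(0.01)  # cutoff for the top 1%
q1 = z_for_left_area(0.25)              # 25th percentile
q3 = z_for_left_area(0.75)              # 75th percentile

print(f"top 1%: {top_1_percent:.3f}, Q1: {q1:.3f}, Q3: {q3:.3f}")
```

Note that a right-tail probability has to be converted to a left-tail area before the lookup, exactly as with the table.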

Standard Normal Example

With \(z\)-scores:

\(P(Z<1)\)
\(P(Z>1)\)
\(P(Z<-1)\)
\(P(Z>-1)\)
\(P(-1<Z<1)\)
\(z\)-score for top 1%: \(P(Z>z_0)=0.01\)
\(z\)-score for \(Q1\): \(P(Z<z_0)=0.25\)
\(z\)-score for \(Q3\): \(P(Z<z_0)=0.75\)

Standard Normal Solutions

With \(z\)-scores:

\(P(Z<1)=0.8413447\)

\(P(Z>1)=1-0.8413447=0.1586553\)

\(P(Z<-1)=0.1586553\)

\(P(Z>-1)=1-0.1586553=0.8413447\)

\(P(-1<Z<1)=0.8413447-0.1586553=0.6826895\)

\(z\)-score for top 1%: \(P(Z>z_0)=0.01 \Rightarrow z_0=2.326\)

\(z\)-score for \(Q1\): \(P(Z<z_0)=0.25 \Rightarrow z_0=-0.674\)

\(z\)-score for \(Q3\): \(P(Z<z_0)=0.75 \Rightarrow z_0=0.674\)

Empirical Rule derivation

\(P(-1<Z<1)=0.8413447-0.1586553=0.6826895\)

\(P(-2<Z<2)=0.9772499-0.0227501=0.9544997\)

\(P(-3<Z<3)=0.9986501-0.0013499=0.9973002\)
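The three areas in the derivation can be reproduced with a short loop. This is a sketch that assumes Python's standard-library `statistics.NormalDist` in place of the table; the name `coverage` is illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area within k standard deviations of the mean: P(-k < Z < k) = P(Z < k) - P(Z < -k)
coverage = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in coverage.items():
    print(f"P(-{k} < Z < {k}) = {area:.7f}")
```

Rounding these areas to 68%, 95%, and 99.7% gives the Empirical Rule.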

Normal Example I

Suppose a pizza company has a rule that your pizza must be at least 16” or you will receive it for free. Assume pizza sizes follow a normal distribution. At a restaurant, the mean pizza size is 16.3” with a standard deviation of 0.2”. Find the following:

What is the probability that your pizza will be free?
What is the probability your pizza will be over 16.5”?
What is the probability a pizza will be between 15.95” and 16.63”?
What is the size of pizza that represents the biggest 7% of pizzas?
What is the size of pizza that represents Q1 (25th percentile)?
What is the size of pizza that represents Q3 (75th percentile)?
What is the IQR for pizza size?

Pizza solutions

Shorthand notation: \(X\sim N(16.3, 0.2)\)

Free pizza: \(P(X<16)=P(Z<\frac{16-16.3}{0.2})=P(Z<-1.5)=0.0668072\)
Pizza bigger than 16.5”: \(P(X>16.5)=P(Z>\frac{16.5-16.3}{0.2})=P(Z>1)=1-0.8413447=0.1586553\)
Pizza between 15.95” and 16.63”: \(P(15.95<X<16.63)=P(\frac{15.95-16.3}{0.2}<Z<\frac{16.63-16.3}{0.2})=P(-1.75<Z<1.65)=P(Z<1.65)-P(Z<-1.75)=0.9505285-0.0400592=0.9104693\)
Biggest 7%: Find \(z\) for the top 7% and solve for \(x\) (\(x=z\sigma+\mu\)): \(z_{top~7\%}=z_{bottom~93\%}=1.475791\) and \(x=(1.48)(0.2)+16.3=16.596\approx 16.6\)”, so pizzas of about 16.6” or larger are in the biggest 7%
Q1 for pizza: Find \(z\) for the 25th percentile and solve for \(x\): \(z_{25th~percentile}=-0.6744898\) and \(x=(-0.674)(0.2)+16.3\approx 16.165\)”, so 25% of pizzas are about 16.165” or smaller
Q3 for pizza: Find \(z\) for the 75th percentile and solve for \(x\): \(z_{75th~percentile}=0.6744898\) and \(x=(0.674)(0.2)+16.3\approx 16.435\)”, so 75% of pizzas are about 16.435” or smaller
IQR for pizza: \(IQR=Q3-Q1=16.435-16.165=0.27\) (the middle 50% of pizza sizes span a range of about 0.27”)
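The pizza questions can also be checked without standardizing by hand. The sketch below assumes Python's standard-library `statistics.NormalDist` with the mean and standard deviation from the problem; the variable names are illustrative. It carries full precision throughout rather than rounding the \(z\)-scores or quartiles along the way.

```python
from statistics import NormalDist

pizza = NormalDist(mu=16.3, sigma=0.2)  # X ~ N(16.3, 0.2), sizes in inches

p_free = pizza.cdf(16)                               # P(X < 16)
p_over_16_5 = 1 - pizza.cdf(16.5)                    # P(X > 16.5)
p_between = pizza.cdf(16.63) - pizza.cdf(15.95)      # P(15.95 < X < 16.63)
top_7_cutoff = pizza.inv_cdf(0.93)                   # size with 7% of pizzas above it
q1 = pizza.inv_cdf(0.25)                             # 25th percentile
q3 = pizza.inv_cdf(0.75)                             # 75th percentile
iqr = q3 - q1                                        # spread of the middle 50%

print(f"free pizza:       {p_free:.7f}")
print(f"over 16.5 in:     {p_over_16_5:.7f}")
print(f"15.95 to 16.63:   {p_between:.7f}")
print(f"top 7% cutoff:    {top_7_cutoff:.3f}")
print(f"Q1, Q3, IQR:      {q1:.3f}, {q3:.3f}, {iqr:.3f}")
```

Because the quartiles are not rounded before subtracting, the IQR computed this way can differ slightly from one worked out with rounded values.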

What is normal? (bahaha)

I’m normal

Not normal

Spongebob

There is no solution for a normal Spongebob…he will be crazy forever…forever…forever…