\(^*\)The independent-means procedure has two versions: pooled (equal variances) and unpooled (unequal variances). We will use only the unpooled method, which is appropriate whether or not the two variances are equal. (In practice, a variance test is often run first to choose between pooled and unpooled; we will learn only the unpooled method and skip the variance test.)
This compares the means of two distinct (separate) groups of units or subjects. The wording used is the difference of two (2) means.
Degrees of freedom for (unpooled) independent means are usually calculated with the following formula rather than \(n-1\) or something similar:
\[df=\frac{\left(\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2}\right)^2}{\frac{\left(s^2_1/n_1\right)^2}{n_1-1}+\frac{\left(s^2_2/n_2\right)^2}{n_2-1}}\]
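For hand calculations, we will instead use the conservative shortcut of the smaller of the two sample sizes minus one (software such as R uses the formula above, so its df is usually larger and not a whole number): \[df=\min(n_1-1,n_2-1)\]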
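To see how far apart the two df choices can be, here is a minimal Python sketch (the course itself uses R) that evaluates both for the summary statistics of the Egyptian skull example in this section. The function names `welch_df` and `conservative_df` are illustrative, not part of any library.

```python
# Summary statistics from the Egyptian skull example in this section.
s1, n1 = 4.038, 30   # 200 BCE: sample SD, sample size
s2, n2 = 5.129, 30   # 4000 BCE: sample SD, sample size


def welch_df(s1, n1, s2, n2):
    """Degrees of freedom from the unpooled (calculated) formula."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def conservative_df(n1, n2):
    """Hand-calculation shortcut: smaller sample size minus one."""
    return min(n1 - 1, n2 - 1)


df_welch = welch_df(s1, n1, s2, n2)
df_hand = conservative_df(n1, n2)
print(round(df_welch, 2), df_hand)
```

The calculated df comes out near 55, in line with the `df = 54.973` that R reports for this example, while the shortcut gives 29. The smaller df produces a slightly larger \(t^{\star}\), which is why the hand-calculated interval is a little wider than R's.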
CI for the difference of two (independent) means:
\[\overline{X}_1-\overline{X}_2 \pm t^{\star}(se)\text{ with }se=\sqrt{\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2}}\]
\(t^{\star}\) is found the same way as in the one-sample methods, using the df above.
Interpretation: “With ___% confidence, the true difference of (independent) means of <insert context> is between (lower) and (upper) <units of measurement>.”
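The CI formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `two_mean_ci` is hypothetical, and \(t^{\star}=2.045\) is supplied by hand (the value used in the skull example for 95% confidence with \(df=29\)) because the Python standard library has no t distribution.

```python
import math


def two_mean_ci(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """CI for mu1 - mu2 using the unpooled standard error."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    margin = t_star * se
    return diff - margin, diff + margin


# Skull example: group 1 = 200 BCE, group 2 = 4000 BCE.
# t_star is an assumed table value (95% confidence, df = 29).
lower, upper = two_mean_ci(135.633, 4.038, 30, 131.367, 5.129, 30, t_star=2.045)
print(round(lower, 2), round(upper, 2))
```

The result should agree with the hand calculation in the skull example, roughly 1.83 to 6.70 mm.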
Hypotheses for Difference of Two Independent Means
For the difference of two (independent) means:
\[H_0: \mu_1=\mu_2~~H_a: \mu_1\left(\begin{array} {lll} \neq \\ > \\ < \\ \end{array}\right)\mu_2\]
\[\text{or}\]
\[H_0: \mu_1-\mu_2=0~~H_a: \mu_1-\mu_2\left(\begin{array} {lll} \neq \\ > \\ < \\ \end{array}\right)0\]
Assumptions
Formula: Test Statistic
\[t=\frac{\overline{X}_1-\overline{X}_2}{se}\text{ where }se=\sqrt{\frac{s^2_1}{n_1}+\frac{s^2_2}{n_2}}\]
With
\[df=\min(n_1-1,n_2-1)\]
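As a quick check of the test-statistic formula, the following Python sketch computes \(t\) from summary statistics. The function name `two_mean_t` is illustrative, and the inputs are the rounded summary statistics from the skull example, so the result can differ from software output in the last decimal places.

```python
import math


def two_mean_t(xbar1, s1, n1, xbar2, s2, n2):
    """Test statistic for H0: mu1 = mu2 with the unpooled standard error."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2) / se


# Skull example: group 1 = 200 BCE, group 2 = 4000 BCE.
t_calc = two_mean_t(135.633, 4.038, 30, 131.367, 5.129, 30)
print(round(t_calc, 2))
```

The value is close to the `t = 3.5797` in the R output for this example. The p-value still has to come from a t table or from software, since it depends on the t distribution with the chosen df.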
This compares the mean of the differences between two measurements on the same unit or subject. The wording used is the mean difference. The typical design measures each subject/unit twice, once before a treatment and once after, to detect whether the treatment made a difference.
Examples: weight loss programs, Coke vs. Pepsi, and comparing the GDP of countries at 2 different dates (time is the treatment).
Formula: CI
\(d_i\): individual differences between measurements
\(\overline{X}_d=\frac{\sum{d_i}}{n}\): sample mean difference (mean of the differences)
\(s_d=\sqrt{\frac{\sum{(d_i-\overline{X}_d)^2}}{n-1}}\): sample standard deviation of the differences
\[\overline{X}_d \pm t^{\star}(se)\text{ where }se=\frac{s_d}{\sqrt{n}}\text{ and }df=n-1\]
\(t^{\star}\) is found the same way as in one-sample methods.
Interpretation: “With ___% confidence, the true mean difference of <insert context> is between (lower) and (upper) <units of measurement>.”
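The paired CI is a one-sample interval applied to the differences. The Python sketch below works from summary statistics only, using the values from the car dealer example in this section (\(n=12\), \(\overline{X}_d=15.667\), \(s_d=10.924\)). The function name `paired_ci` is hypothetical, and \(t^{\star}=3.106\) is an assumed table value for 99% confidence with \(df=11\).

```python
import math


def paired_ci(xbar_d, s_d, n, t_star):
    """CI for the mean difference mu_d from summary statistics."""
    se = s_d / math.sqrt(n)
    margin = t_star * se
    return xbar_d - margin, xbar_d + margin


# Car dealer example (sales in thousands of dollars).
# t_star is an assumed table value (99% confidence, df = 11).
lower, upper = paired_ci(xbar_d=15.667, s_d=10.924, n=12, t_star=3.106)
print(round(lower, 2), round(upper, 2))
```

The endpoints should land near 5.87 and 25.46, in line with the paired t-test output R gives for this example.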
Hypotheses
For the mean difference:
\[H_0: \mu_d=0~~H_a: \mu_d\left(\begin{array} {lll} \neq \\ > \\ < \\ \end{array}\right)0\]
Assumptions
Formula: Test Statistic
\[t=\frac{\overline{X}_d-0}{se}\text{ with }se=\frac{s_d}{\sqrt{n}}\]
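A short Python sketch of the paired test statistic, again using the rounded summary statistics from the car dealer example. The function name `paired_t` is illustrative only.

```python
import math


def paired_t(xbar_d, s_d, n):
    """Test statistic for H0: mu_d = 0."""
    se = s_d / math.sqrt(n)
    return (xbar_d - 0) / se


# Car dealer example: n = 12 months, differences in thousands of dollars.
t_calc = paired_t(xbar_d=15.667, s_d=10.924, n=12)
print(round(t_calc, 3))
```

This is close to the `t = 4.9681` that R reports with `df = 11`; the small gap, if any, comes from rounding the summary statistics.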
This compares the proportions of two distinct (separate) groups of units or subjects. The wording used is the difference of two (2) proportions.
Formula: CI
CI for the difference of two (independent) proportions:
\[\hat p_1-\hat p_2 \pm z^{\star}(se)\text{ with }se=\sqrt{\frac{\hat p_1\hat q_1}{n_1}+\frac{\hat p_2\hat q_2}{n_2}}\]
where \(\hat q=1-\hat p\), and \(z^{\star}\) is found the same way as in one-sample methods.
Interpretation: “With ___% confidence, the true difference of (independent) proportions of <insert context> is between (lower) and (upper) <units of measurement>.”
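The interval for two proportions can be sketched in Python with only the standard library, since `statistics.NormalDist` provides the normal quantile for \(z^{\star}\). The function name `two_prop_ci` is illustrative; the counts are those of the robot beacon example in this section, with group 1 taken as the right side to match the worked calculation.

```python
import math
from statistics import NormalDist


def two_prop_ci(x1, n1, x2, n2, conf_level):
    """CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_star * se
    diff = p1 - p2
    return diff - margin, diff + margin


# Robot example: 87 of 100 correct (right), 85 of 100 correct (left).
lower, upper = two_prop_ci(87, 100, 85, 100, conf_level=0.90)
print(round(lower, 4), round(upper, 4))
```

The endpoints should be close to the \((-0.0607, 0.1007)\) found in the robot example.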
Hypotheses
For the difference of two (independent) proportions:
\[H_0: p_1=p_2~~H_a: p_1\left(\begin{array} {lll} \neq \\ > \\ < \\ \end{array}\right)p_2\]
\[\text{or}\]
\[H_0: p_1-p_2=0~~H_a: p_1-p_2\left(\begin{array} {lll} \neq \\ > \\ < \\ \end{array}\right)0\]
Assumptions
Formula: Test Statistic
\[z=\frac{\hat p_1-\hat p_2}{se}\text{ with }se=\sqrt{\frac{\hat p_1\hat q_1}{n_1}+\frac{\hat p_2\hat q_2}{n_2}}\]
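Here is a minimal Python sketch of the \(z\) test for two proportions, using the same unpooled standard error as the formula above. The function name `two_prop_z` is hypothetical, and the two-sided p-value shown is only one of the three possible alternatives; the counts come from the robot beacon example.

```python
import math
from statistics import NormalDist


def two_prop_z(x1, n1, x2, n2):
    """z statistic and two-sided p-value for H0: p1 = p2 (unpooled se)."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = (p1 - p2) / se
    # Two-sided p-value: area in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Robot example: 87 of 100 correct (right), 85 of 100 correct (left).
z_calc, p_value = two_prop_z(87, 100, 85, 100)
print(round(z_calc, 4), round(p_value, 4))
```

Both numbers should be close to the `zcalc` and `pvalue` rows of the robot analysis output (about 0.41 and 0.68).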
The basic process is the same as for 1-sample methods. Make sure to follow the 5 steps to hypothesis testing:
Some archaeologists theorize that ancient Egyptians interbred with several different immigrant populations over thousands of years. To see whether there is any indication of changes in body structure that might have resulted, they measured the breadths (in mm) of a random sample of 30 skulls of male Egyptians dated from 4000 BCE and 30 others dated from 200 BCE.
Egypt setup
\[H_0: \mu_1=\mu_2~~H_a: \mu_1>\mu_2\]
(or \(H_0: \mu_1-\mu_2=0~~H_a: \mu_1-\mu_2>0\)), where group 1 is 200 BCE and group 2 is 4000 BCE
Assumptions:
(1) Random: yes
(2) Independence: yes because random
(3) Normality: \(n_1=n_2=30\geq30\) so yes
Organization of information:
\(n_1=30\)
\(n_2=30\)
\(H_a:~>\) (upper tail test)
\(\alpha=0.05\) (assumed because not specifically stated otherwise)
       200BCE  4000BCE
xbari 135.633  131.367
sdi     4.038    5.129
ni     30.000   30.000
CI
\[\overline{x}_1-\overline{x}_2\pm t^{\star}(se)\]
\(t^{\star}=t_{CL=95\%,df=29}=2.0452296\approx2.045\)
\[135.633-131.367\pm(2.045)(1.192)=4.266\pm 2.438=(1.83,\ 6.70)\]
With 95% confidence, the true difference in mean skull breadths (200 BCE minus 4000 BCE) is between 1.83 and 6.70 mm.
A tracking device that enables robots to home in on a beacon producing an audio signal is said to be fine-tuned if the probability of correctly identifying the direction of the beacon is the same for each side (left and right) of the device. In a random sample of 100 signals from the left, the device identifies the direction correctly 85 times. Of 100 signals from the right, it identifies the direction correctly 87 times.
       Correct id  Incorrect id  Total
Left           85            15    100
Right          87            13    100
Total         172            28    200
CI
\[\hat p_1-\hat p_2\pm z^{\star}(se)\text{ with }se=\sqrt{\frac{\hat p_1\hat q_1}{n_1}+\frac{\hat p_2\hat q_2}{n_2}}\]
\[0.87-0.85\pm (1.645)(0.0491)=0.02\pm 0.0807=(-0.0607,\ 0.1007)\]
The CI: \((-0.0607,0.1007)=(-6.07\%,10.07\%)\)
With 90% confidence, the true difference in the proportions of correctly identified signal directions (right minus left) is between -6.07 and 10.07 percentage points. Since the CI includes 0, there is no significant difference between the two sides.
A car dealer decided to compare the mean monthly sales of two salespersons, A and B. Because the strength of sales varies with the season and with people’s opinions about the economy, the dealer compared the two salespersons month by month, using a random sample of months. The data give the monthly sales (to the nearest thousand dollars) for each salesperson.
[,1]
n 12.000
xbard 15.667
sD 10.924
CI
\[\overline{x}_d\pm t^{\star}(se)\text{ with }se=\frac{s_d}{\sqrt n}\]
\[15.667\pm (3.106)(3.153)=15.667\pm 9.793=(5.874,\ 25.460)\]
The CI: \((5.87,25.46)\), in thousands of dollars
With 99% confidence, the true mean difference in monthly sales between Salespersons A and B (A minus B) is between about $6,000 and $25,000.
Alternative way to interpret: With 99% confidence, Salesperson A's mean monthly sales are between about $6,000 and $25,000 higher than Salesperson B's.
Each example will be done again with R output.
When doing one-tailed tests with software, the reported CI is one-sided (one bound is infinite), which is not the interval we want. A separate two-sided analysis must be run to get the proper CI whenever the test is upper or lower tail.
Welch Two Sample t-test
data: breadth by era
t = 3.5797, df = 54.973, p-value = 0.000364
alternative hypothesis: true difference in means between group 200BCE and group 4000BCE is greater than 0
95 percent confidence interval:
2.27257 Inf
sample estimates:
mean in group 200BCE mean in group 4000BCE
135.6333 131.3667
Test statistic \(t=3.5797\), \(df=54.973\), \(pvalue=0.000364\)
Results: \(pvalue=0.000364\leq\alpha(0.05) \therefore\) (therefore) \(H_0\) is rejected
Conclusion: since the null is rejected, that means that there is evidence that the skull breadths have significantly increased over the period from 4000 BCE to 200 BCE
Error: since \(H_0\) was rejected, a Type \(I\) error (reject null when null is true) could have been made; we would conclude that the skull breadths have increased when they actually have not
CI
Welch Two Sample t-test
data: breadth by era
t = 3.5797, df = 54.973, p-value = 0.000728
alternative hypothesis: true difference in means between group 200BCE and group 4000BCE is not equal to 0
95 percent confidence interval:
1.878030 6.655303
sample estimates:
mean in group 200BCE mean in group 4000BCE
135.6333 131.3667
The CI: \((1.878030,6.655303)\approx(1.88,6.66)\)
With 95% confidence, the true difference in mean skull breadths of Egyptian males (200 BCE minus 4000 BCE) is between 1.88 and 6.66 mm.
Alternative way to interpret: With 95% confidence, mean skull breadths of Egyptian males increased from 4000 BCE to 200 BCE; the 200 BCE skulls are on average 1.88 to 6.66 mm broader than the 4000 BCE skulls. This is consistent with the theory that immigrating populations interbred with the native Egyptians, although the test alone cannot establish the cause.
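The two R outputs for the skull example report different intervals because the one-tailed run uses a one-sided critical value and only a lower bound. The Python sketch below reproduces both from the rounded summary statistics. The two \(t\) quantiles are assumptions: approximate values for \(df\approx55\) that would normally be read from a t table or software.

```python
import math

# Rounded summary statistics from the skull example.
xbar1, s1, n1 = 135.633, 4.038, 30   # 200 BCE
xbar2, s2, n2 = 131.367, 5.129, 30   # 4000 BCE

diff = xbar1 - xbar2
se = math.sqrt(s1**2 / n1 + s2**2 / n2)

# ASSUMPTION: approximate t quantiles for df of about 55.
T_ONE_SIDED = 1.673   # leaves 5% in the upper tail only
T_TWO_SIDED = 2.004   # leaves 2.5% in each tail

# One-tailed output: a lower bound only, the upper bound is infinite.
one_sided_lower = diff - T_ONE_SIDED * se

# Two-tailed output: the usual 95% CI.
two_sided = (diff - T_TWO_SIDED * se, diff + T_TWO_SIDED * se)

print(round(one_sided_lower, 2), [round(b, 2) for b in two_sided])
```

The one-sided bound lands near R's `2.27257`, and the two-sided interval near `(1.878030, 6.655303)`, which is the interval used in the interpretation.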
       Correct id  Incorrect id  Total
Left           85            15    100
Right          87            13    100
Total         172            28    200
\[H_0: \pi_1=\pi_2~~H_a: \pi_1\ne\pi_2\]
(or \(H_0: \pi_1-\pi_2=0~~H_a: \pi_1-\pi_2\ne0\))
Assumptions:
(1) Independence: random so yes
(2) Randomization: yes
(3) Normality: \(n_1=n_2=100\geq60\)
Organization of information:
\(n_1=100\)
\(n_2=100\)
\(H_a:~\ne\) (two tail test)
\(\alpha=0.10\) (because specifically stated as 10%)
Robot Analysis Output
side
Left Right
85 87
[1] 100 100
[1] 0.85 0.87
[,1]
se 0.04905099
zcalc 0.40773893
pvalue 0.68346535
[,1]
diff.pihat 0.02000000
zstar 1.64485363
bound 0.08068171
lower -0.06068171
upper 0.10068171
Test statistic \(z=0.41\), \(pvalue=0.6835\)
Results: \(pvalue=0.6835\nleq\alpha(0.10) \therefore\) (therefore) \(H_0\) cannot be rejected
Conclusion: since the null is not rejected, there is not a significant difference in correct identifications of the direction of the beacon between the right and left sides of the tracking device. We have no evidence that the sides differ.
Error: since \(H_0\) was not rejected, a Type \(II\) error (not rejecting null when null is false) could have been made; we think there is no difference between the left and right sides of the tracking device when there is a difference
CI
The CI: \((-0.0607,0.1007)=(-6.07\%,10.07\%)\)
With 90% confidence, the true difference in the proportions of correctly identified signal directions (right minus left) is between -6.07 and 10.07 percentage points. Since the CI includes 0, there is no significant difference between the two sides.
A car dealer decided to compare the mean monthly sales of two salespersons, A and B. Because the strength of sales varies with the season and with people’s opinions about the economy, the dealer compared the two salespersons month by month, using a random sample of months. The data give the monthly sales (to the nearest thousand dollars) for each salesperson.
[,1]
xbar.d 15.66667
s.d 10.92398
n 12.00000
Organization of information:
\(n=12\) (12 months)
\(H_a:~\ne\) (two-tail test)
\(\alpha=0.01\) (specifically stated)
CI:
\[\overline{X}_d \pm t^{\star}(se)\]
Find \(t^{\star}\): \(df=11\), confidence level\(=99\%\) so \(t^{\star}=3.106\)
\[15.67\pm(3.106)(3.15)=15.67\pm 9.78=(5.89,\ 25.45)\]
Paired t-test
data: A and B
t = 4.9681, df = 11, p-value = 0.0004234
alternative hypothesis: true mean difference is not equal to 0
99 percent confidence interval:
5.872564 25.460770
sample estimates:
mean difference
15.66667
\[H_0: \mu_d=0~~H_a: \mu_d\ne0\]
Assumptions:
(1) Randomization: yes
(2) Independence (of units/subjects): random met so yes
(3) Differences have an approximately normal distribution (boxplots are ok)
(4) Two measurements per unit/subject: yes
Test statistic \(t=4.9681\), \(df=11\), \(pvalue=0.0004234\)
Results: \(pvalue=0.0004234\leq\alpha(0.01) \therefore\) (therefore) \(H_0\) is rejected
Conclusion: since the null is rejected, there is evidence that the true mean difference in monthly sales between A and B is not 0; the two salespersons' mean sales differ significantly.
Error: since \(H_0\) was rejected, a Type I error (reject null when null is true) could have been made; we think the mean difference in sales significantly differs between salespersons when it does not
The CI: \((5.87,25.46)\), in thousands of dollars
With 99% confidence, the true mean difference in monthly sales between Salespersons A and B (A minus B) is between about $6,000 and $25,000.
Alternative way to interpret: With 99% confidence, Salesperson A's mean monthly sales are between about $6,000 and $25,000 higher than Salesperson B's.