Types of Estimators

In this section students will:

  1. Learn about estimators
  2. Learn about point estimation and interval estimation
  3. Learn the four properties of point estimators
  4. Determine whether a point estimator is biased
  5. Learn how to calculate relative efficiency (RE) of estimators
  6. Understand what the RE calculation means
  7. Learn how to compare estimators

Estimation and Estimators

Statistical Inference: using information obtained from a proper sample to make an educated judgment about a population

An estimator is a rule, generally expressed as a formula, that dictates how to calculate an estimate from the measurements contained in the sample. For example, the sample mean \[\bar{x}=\frac{\sum x_i}{n}\] is an estimator of the population mean \(\mu\): it specifies exactly how the numerical value of the estimate is obtained once the sample values \(x_1,x_2,\ldots,x_n\) are known. The sample mean can then be used to estimate the true value of \(\mu\)

Description          Sample Statistic     Population Parameter
Mean                 \(\overline{X}\)     \(\mu\)
Proportion           \(\hat p\)           \(p\)
Total                \(\hat{\tau}\)       \(\tau\)
Variance             \(s^2\)              \(\sigma^2\)
Standard deviation   \(s\)                \(\sigma\)

Types of Estimators

The preceding modules set the stage for one of the main objectives of this course: an understanding of statistical inference and how it can be applied to the solution of practical problems.

Three main types of inference

  1. Point estimation
  2. Interval estimation
  3. Statistical tests

Point Estimation and Interval Estimation

Let \(\theta\) denote a parameter and \(\hat{\theta}\) a statistic that serves as an estimator of \(\theta\)

A point estimator of a population parameter is a rule that determines how to calculate a single number based on sample data. The resulting number is called a point estimate. The information from a sample is used to estimate the target population parameter. \(\hat{\theta}\rightarrow \theta\)

An interval estimator of a population parameter is a rule that determines how to calculate two numbers based on the sample data, forming an interval within which the parameter is expected to lie. This pair of numbers is called an interval estimate or confidence interval. \(\hat{\theta}_L <\theta < \hat{\theta}_U\)
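For illustration, here is a minimal sketch of both ideas in Python. All specifics (a population mean of 50, standard deviation of 10, sample size 100, and the 1.96 normal critical value for an approximate 95% interval) are illustrative assumptions, not values from the text:

```python
import math
import random
import statistics

random.seed(1)

# Hypothetical sample from a population with true mean mu = 50, sd = 10
sample = [random.gauss(50, 10) for _ in range(100)]

n = len(sample)
xbar = statistics.mean(sample)   # point estimate of mu
s = statistics.stdev(sample)     # sample standard deviation

# Approximate 95% interval estimate: xbar +/- 1.96 * s / sqrt(n)
margin = 1.96 * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"point estimate: {xbar:.2f}")
print(f"interval estimate: ({lower:.2f}, {upper:.2f})")
```

The single number `xbar` is the point estimate; the pair `(lower, upper)` is the interval estimate \(\hat{\theta}_L <\theta < \hat{\theta}_U\).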

4 Properties of Estimators

  1. Unbiasedness
  2. Efficiency
  3. Consistency
  4. Sufficiency

Unbiasedness

An estimator of a parameter is said to be unbiased if the mean of its distribution is equal to the true value of the parameter. Otherwise, the estimator is said to be biased

Let \(\hat{\theta}\) be a point estimator of a parameter \(\theta\). Then \(\hat{\theta}\) is an unbiased estimator \(iff\) (if and only if) the expected value of \(\hat{\theta}\) equals \(\theta\). That is, \[E(\hat{\theta})=\theta\]

The bias of an estimator is the difference between its expected value and the parameter, that is \[B(\hat{\theta},\theta)=E(\hat{\theta})-\theta\]

Bias

Is \(\bar{x}\) Unbiased?

Claim: \(E(\overline{X})=\mu\), using the fact that \(E(X_i)=\mu\) for each observation

\[E(\overline{X})=E\left(\frac{X_1+X_2+\cdots+X_n}{n}\right)\]
\[=\frac 1n E(X_1+X_2+\cdots+X_n)\]
\[=\frac 1n \left(E(X_1)+E(X_2)+\cdots+E(X_n)\right)\]
\[=\frac 1n (\mu+\mu+\cdots+\mu)\]
\[=\frac 1n (n\mu)\]
\[=\mu\]
Therefore, the sample mean is an unbiased estimator for the population mean
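This can also be checked empirically: averaging the sample mean over many repeated samples should land very close to \(\mu\). A simulation sketch (the values \(\mu=10\), \(\sigma=2\), \(n=25\), and the number of repetitions are illustrative):

```python
import random
import statistics

random.seed(42)

mu, sigma, n = 10.0, 2.0, 25

# Draw many samples and average the sample means; if xbar is unbiased,
# this grand average should be close to mu.
sample_means = [
    statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(20_000)
]

print(round(statistics.mean(sample_means), 2))  # close to 10.0
```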

Is \(\hat p\) Unbiased?

The sample proportion is denoted by \(\hat p\)

Here \(\hat p=\frac Xn\), where \(X\) is the number of successes in \(n\) trials, and for a binomial \(X\) we know \(E(X)=np\)

\[E(\hat p)=E\left(\frac{X}{n}\right)\]
\[=\frac 1n E(X)\]
\[=\frac 1n (np)\]
\[=p\]
Therefore, the sample proportion is an unbiased estimator for the population proportion
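The same empirical check works for \(\hat p\): averaging \(\hat p = X/n\) over many repeated samples should sit very close to \(p\). A simulation sketch (the values \(p=0.3\), \(n=50\), and the repetition count are illustrative):

```python
import random
import statistics

random.seed(0)

p, n = 0.3, 50

# Each trial: count successes X in n Bernoulli(p) draws, form p_hat = X / n.
p_hats = [
    sum(random.random() < p for _ in range(n)) / n
    for _ in range(20_000)
]

print(round(statistics.mean(p_hats), 3))  # near 0.3
```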

\(E(s^2)\) and \(E(s)\)

The sample variance is also unbiased; is the standard deviation unbiased as well?

One thing to note is that for any random variable \(X\), the variance of \(X\) is \(V(X)=E(X^2)-(E(X))^2\) and thus \(E(X^2)=V(X)+(E(X))^2\), which is the same as \(E(X^2)=\sigma^2+\mu^2\)

\[E(s^2)=E\left[\frac{1}{n-1}\left(\sum X_i^2-\frac 1n \left(\sum X_i\right)^2\right)\right]\]
\[=\frac{1}{n-1}\left(\sum E(X^2_i)-\frac 1n E\left[\left(\sum X_i\right)^2\right]\right)\]
\[=\frac{1}{n-1}\left(\sum (\sigma^2+\mu^2)-\frac 1n \left(V\left(\sum X_i\right)+\left[E\left(\sum X_i\right)\right]^2\right)\right)\]
\[=\frac{1}{n-1}\left(n\sigma^2+n\mu^2-\frac 1n n\sigma^2-\frac 1n (n\mu)^2\right)\]
\[=\frac{1}{n-1}(n\sigma^2-\sigma^2)=\sigma^2\]
Therefore, the sample variance (with its \(n-1\) divisor) is an unbiased estimator of \(\sigma^2\)

Standard deviation \(s\): we know \(s=\sqrt{s^2}\), but the expected value of a square root is not the square root of the expected value (by Jensen's inequality, \(E(\sqrt{X})\le\sqrt{E(X)}\)), so that \[E(s)=E\left(\sqrt{s^2}\right)\le\sqrt{E(s^2)}=\sigma\]
Thus, the sample standard deviation is slightly biased, even though the \(n-1\) divisor makes the sample variance unbiased
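The slight bias of \(s\) is easy to see by simulation: the average of many sample variances sits near \(\sigma^2\), while the average of many sample standard deviations sits a bit below \(\sigma\). A sketch (the values \(\sigma=3\), \(n=5\), and the repetition count are illustrative; a small \(n\) makes the bias visible):

```python
import random
import statistics

random.seed(7)

sigma, n = 3.0, 5  # small n makes the bias of s visible

vars_, sds = [], []
for _ in range(50_000):
    sample = [random.gauss(0, sigma) for _ in range(n)]
    vars_.append(statistics.variance(sample))  # divides by n - 1
    sds.append(statistics.stdev(sample))

print(round(statistics.mean(vars_), 2))  # near sigma^2 = 9 (unbiased)
print(round(statistics.mean(sds), 2))    # below sigma = 3 (biased low)
```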

Efficiency

An efficient estimator is one that has the smallest possible variance among all unbiased estimators; that is, the variance of its sampling distribution is as small as possible. This ensures that, with high probability, an individual estimate will fall close to the true value of the parameter.


Relative Efficiency

To compare the efficiency of estimators, we can take the ratio of their variances, called the relative efficiency. Given two unbiased estimators, \(\hat{\theta}_1\) and \(\hat{\theta}_2\), of a parameter \(\theta\), the relative efficiency (\(RE\)) of \(\hat{\theta}_1\) to \(\hat{\theta}_2\) is defined to be:

\[RE(\hat{\theta}_1,\hat{\theta}_2)=\frac{V(\hat{\theta}_2)}{V(\hat{\theta}_1)}\]

Example: \(E(\overline{X})=\mu\), \(V(\overline{X})=\frac{\sigma^2}{n}\), and for large samples from a normal population, \(V(Median)=V(M)\approx 1.57\left(\frac{\sigma^2}{n}\right)\), where \(1.57\approx\pi/2\)

Is the sample mean \(\overline{X}\) more efficient than the median (\(M\))?

\[RE(M,\overline{X})=\frac{V(\overline{X})}{V(M)}=\frac{\frac{\sigma^2}{n}}{1.57\left(\frac{\sigma^2}{n}\right)}=\frac{1}{1.57}=0.637\]
This shows that the median is only 63.7% as efficient as the sample mean. Therefore \(\overline{X}\) is unbiased and more efficient than \(M\)
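The 0.637 figure can be reproduced by simulation: draw many normal samples, compute the variance of the sample means and of the sample medians across samples, and take the ratio. A sketch (the sample size, seed, and repetition count are illustrative; an odd \(n\) makes the median a single observation):

```python
import random
import statistics

random.seed(3)

sigma, n = 1.0, 101  # odd n so the median is a single observation

means, medians = [], []
for _ in range(20_000):
    sample = [random.gauss(0, sigma) for _ in range(n)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

re = statistics.variance(means) / statistics.variance(medians)
print(round(re, 2))  # close to 1 / 1.57, about 0.64
```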

Consistency

Both consistency and sufficiency are defined using limits from calculus (the calculations are not required for this course, but the concepts are within its scope).

A consistent estimator has the property that, as the number of data points used increases, the resulting estimate \(\hat{\theta}\) converges in probability to \(\theta\)

Definition: An estimator \(\hat{\theta}\) is a consistent estimator of \(\theta\) if \(\hat{\theta}\xrightarrow{_p} \theta\), i.e., if \(\hat{\theta}\) converges in probability to \(\theta\).
\[\lim_{n\to\infty}P\left(|\hat{\theta}-\theta|<\epsilon\right)=1\quad\text{for every }\epsilon>0\]
Theorem: An unbiased estimator \(\hat{\theta}\) for \(\theta\) is consistent if \[\lim_{n\to\infty}V(\hat{\theta})=0\] (Proof: omitted)

Example: Let \(X_1,X_2,\ldots,X_n\) be a random sample of size \(n\) from a population with mean \(\mu\) and variance \(\sigma^2<\infty\). Show that \[\overline{X}=\frac 1n \sum X_i\] is a consistent estimator of \(\mu\)
Solution: \(E(\overline{X})=\mu\), so \(\overline{X}\) is unbiased, and \[V(\overline{X})=\frac{\sigma^2}{n}\to 0\ \text{ as }\ n\to\infty\] Hence, by the theorem above, \(\overline{X}\) is consistent
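Consistency can also be seen numerically: the variance of \(\overline{X}\) across repeated samples shrinks roughly like \(\sigma^2/n\) as \(n\) grows. A sketch (the values \(\mu=4\), \(\sigma=2\), the sample sizes, and the repetition count are illustrative):

```python
import random
import statistics

random.seed(5)

mu, sigma = 4.0, 2.0

# As n grows, the spread of xbar around mu shrinks toward 0.
spread = {}
for n in (10, 100, 1000):
    xbars = [
        statistics.mean(random.gauss(mu, sigma) for _ in range(n))
        for _ in range(2_000)
    ]
    spread[n] = statistics.variance(xbars)
    print(n, round(spread[n], 4))  # roughly sigma^2 / n
```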

In short: as the sample size \(n\) increases, the difference between the estimate and the parameter shrinks toward 0 in probability (and thus becomes negligible)

Sufficiency

A statistic is sufficient with respect to a statistical model and its associated unknown parameter if “no other statistic that can be calculated from the same sample provides any additional information as to the value of the parameter”.

An estimator of a parameter \(\theta\) which gives as much information about \(\theta\) as is possible from the sample at hand is called a sufficient estimator. Sufficient estimators exist when one can reduce the dimensionality of the observed data without loss of information

\(P(X=x)=P(X=x\mid\theta)\) means that the probability of the event \(x\) is equal to the probability of the event \(x\) given \(\theta\); that is, the probability of \(x\) does not depend on \(\theta\)
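The idea that a sufficient statistic absorbs all the information about the parameter can be illustrated with Bernoulli data, where the total \(\sum X_i\) is sufficient for \(p\): conditional on the total, the behavior of the sample no longer depends on \(p\). A simulation sketch (the values of \(n\), \(t\), the two \(p\) values, and the repetition count are illustrative); by exchangeability, \(P(X_1=1\mid\sum X_i=t)=t/n\) regardless of \(p\):

```python
import random

random.seed(11)

n, t = 5, 2  # condition on observing t successes in n Bernoulli trials

def cond_first_success_rate(p, reps=100_000):
    """Estimate P(X1 = 1 | sum X_i = t); should be t/n = 0.4 for any p."""
    hits = total = 0
    for _ in range(reps):
        x = [random.random() < p for _ in range(n)]
        if sum(x) == t:  # keep only samples matching the sufficient statistic
            total += 1
            hits += x[0]
    return hits / total

r_low = cond_first_success_rate(0.2)
r_high = cond_first_success_rate(0.7)
print(round(r_low, 2), round(r_high, 2))  # both near 0.4
```

Both estimates agree even though the underlying \(p\) differs: once the total is fixed, the data carry no further information about \(p\).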

Properties summarized