Sample Mean
- An estimator of the population mean $\mu$ is represented as $\hat{\mu}$
- In most cases, our best estimate is our sample mean
- Additionally, the sample mean $\bar{X}$ is an unbiased estimator of the population mean: averaged over repeated samples, it equals $\mu$
- The sample mean is just what you'd expect: $\bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i$
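
As a rough illustration of the points above, the sketch below computes a sample mean and then averages many sample means to show that they center on the population mean. It assumes i.i.d. draws from a normal distribution; the function names and the parameter values are hypothetical choices made for this example only.

```python
import random


def sample_mean(xs):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(xs) / len(xs)


def average_of_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` samples of size `n` from Normal(mu, sigma) and
    return the average of their sample means.

    The normal distribution is an assumption made for illustration;
    unbiasedness of the sample mean does not depend on it.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(sample_mean(sample))
    return sample_mean(means)


data = [2.0, 4.0, 9.0]
print(sample_mean(data))  # 5.0

# With many repeated samples, the average sample mean should land
# close to the population mean (10.0 in this made-up setup).
print(average_of_sample_means(mu=10.0, sigma=3.0, n=5, trials=20000))
```

Any single sample mean can miss the population mean by a fair amount when the sample is small; unbiasedness only says that the misses average out to zero.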
Sample Variance
- An estimator of the population variance $\sigma^2$ is represented as $\hat{\sigma}^2$
- In most cases, we'd expect our best estimate to be our sample variance
- The sample variance is defined as the following: $s^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - \bar{X})^2$
- However, the sample variance is not a perfect estimate of the population variance
- Specifically, it's a biased estimator of the population variance
- Concretely, it's usually too small: on average it underestimates the population variance
- The population variance is best estimated as the following: $\hat{\sigma}^2 = \frac{1}{N-1} \sum_{i=1}^{N} (X_i - \bar{X})^2$
- This is the reason why we use the notation $\hat{\sigma}^2$ instead of $s^2$
- The story here, heuristically, is that we tend to lose variation under sampling
- So measures of variation in the sample need to be corrected upwards, and dividing by $N - 1$ instead of $N$ is the correction that removes the bias
- A more sophisticated story says that what we should really divide by is not the number of data points but the number of degrees of freedom
- To get the variance, we first need to estimate the mean, and doing so costs us one degree of freedom
- Essentially, we should use $\hat{\sigma}^2$ as our estimate of the population variance $\sigma^2$, but if the difference between that and $s^2$ is big enough to matter, we should probably think about getting more data points
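
The bias described above can be checked with a small simulation. The sketch below computes both the divide-by-$N$ and divide-by-$(N - 1)$ versions and averages each over many samples. It assumes i.i.d. normal draws, for which the divide-by-$N$ version has expected value $\frac{N-1}{N}\sigma^2$; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def biased_variance(xs):
    """Sample variance s^2: squared deviations divided by N."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


def unbiased_variance(xs):
    """Estimate of the population variance: divide by N - 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def average_estimates(sigma, n, trials, seed=0):
    """Average both estimators over `trials` samples of size `n`
    drawn from Normal(0, sigma) (an assumed setup for this sketch)."""
    rng = random.Random(seed)
    total_biased = 0.0
    total_unbiased = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        total_biased += biased_variance(sample)
        total_unbiased += unbiased_variance(sample)
    return total_biased / trials, total_unbiased / trials


data = [2.0, 4.0, 9.0]
print(biased_variance(data))    # 26 / 3, about 8.667
print(unbiased_variance(data))  # 13.0

# True variance is sigma**2 = 4.0. With n = 5, the divide-by-N version
# should average near (4 / 5) * 4.0 = 3.2, the corrected one near 4.0.
print(average_estimates(sigma=2.0, n=5, trials=20000))
```

The two estimators differ only by the factor $\frac{N}{N-1}$, so the gap shrinks quickly as $N$ grows: at $N = 5$ it is 25%, at $N = 100$ it is about 1%.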