Chapter 7 Sampling Distribution of Means

Earlier we described the behavior of sampling distributions for a sample proportion, in order to understand how we can use a sample proportion to make inferences about a population proportion. In this lecture, we will describe the behavior of sampling distributions for a sample mean, in order to understand how we can use a sample mean to make inferences about a population mean. Sampling distributions for sample means behave similarly to those for sample proportions, but there are some important differences.

Suppose we want to know the average body mass index (BMI) of US adults between 30 and 60 years old. Suppose that BMI measurements could somehow be collected from all US adults in that age range. This plot shows results that might be obtained from such a census*. A plot that shows the measurement results for an entire population is known as a population distribution. The BMI measurements in this census fall between 14 and 57 kg/m2. Half of the adults have BMI values less than the median BMI of 25.13 kg/m2, and the mean BMI is 25.58 kg/m2. The shape of the population distribution of BMI values in US adults appears to be more or less Normal but skewed to the right (or positively skewed), with a long right tail of very high BMI values.

References: *These BMI measurements were obtained from 5209 people as part of the Framingham Heart Study (http://www.framinghamheartstudy.org). This population may not be exactly like the entire US population in all respects.

It is not usually possible or feasible to obtain measurements on the entire population of interest. Rather, we usually obtain a sample of observations.

Suppose we conduct a study to investigate the average BMI of US adults, aged 30-60: we obtain a single sample, a simple random (and therefore representative) sample of n=100 adults from the population shown on the previous slide. A dot plot for the 100 BMI values in our sample is shown here, with one dot for each person in the sample. A plot that shows all of the measurement results from a sample is called a sample distribution. The mean BMI in our sample is 25.47 kg/m2, which is slightly lower than the true population mean of 25.58 kg/m2.

What would the sample means look like if we repeated the sampling process a few more times?

Suppose we obtain three more random samples of 100 adults from the same population. The sample distributions of BMI values for these three samples are shown here. Each of the sample distributions has a somewhat different shape, a somewhat different mean, and a somewhat different standard deviation, as we would expect due to sampling variability.

However, collecting only three samples does not give us a very complete understanding of the behavior of the sample means in repeated sampling. Let’s see what happens if we repeat the sampling process many, many more times. If we repeat the sampling process 1,000 times, collecting samples of size n=100 each time, and we calculate the mean BMI from each sample, a dot plot of all 1,000 sample means would look like this. Recall that true sampling distributions consist of the sample statistics from all possible samples. This distribution of only 1,000 sample means is an approximation of the true sampling distribution for samples of size n=100. We are using this approximate sampling distribution to help us better understand the concept of sampling distributions.
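The repeated-sampling experiment described above can be sketched in a few lines of Python. This is only an illustration: the Framingham measurements are not reproduced here, so the `population` list is a synthetic, right-skewed stand-in (a shifted gamma distribution whose parameters were picked to land near the mean and standard deviation quoted in this lecture), and the helper name `simulate_sample_means` is hypothetical.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# ASSUMPTION: synthetic right-skewed "population" standing in for the real
# BMI measurements, which are not available in this document.
population = [14 + rng.gammavariate(7.46, 1.55) for _ in range(5209)]
mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation


def simulate_sample_means(pop, n, reps, rng):
    """Return `reps` sample means, each from a simple random sample of size n."""
    return [statistics.fmean(rng.sample(pop, n)) for _ in range(reps)]


# 1,000 repeated samples of size n=100, as in the lecture
means_100 = simulate_sample_means(population, n=100, reps=1000, rng=rng)

print(f"population mean = {mu:.2f}, population SD = {sigma:.2f}")
print(f"mean of the 1,000 sample means = {statistics.fmean(means_100):.2f}")
print(f"SD of the 1,000 sample means   = {statistics.stdev(means_100):.2f}")
print(f"sigma / sqrt(100)              = {sigma / math.sqrt(100):.2f}")
```

Changing `n` in the call (for example to 20 or 500) while leaving `reps` at 1,000 is one way to explore how the sample size affects the spread of the sample means.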

What shape does this sampling distribution have? The shape appears very close to Normal (bell-shaped). Surprisingly, even though the underlying population of BMI values in US adults is right-skewed, the sampling distribution of sample means from repeated samples of size n=100 appears to be very close to symmetric.

Where is the center of this sampling distribution? The mean of this sampling distribution (25.59 kg/m2) is very close to the true population mean of 25.58 kg/m2. The small discrepancy arises because we only took 1,000 samples and not an infinite number of samples. If we were able to take an infinite number of samples, the mean of the sampling distribution would exactly equal the true population mean.

What about this sampling distribution’s spread? The sample means range from about 24 to 27 kg/m2. The standard deviation (or standard error) of this sampling distribution is 0.42 kg/m2.

What would happen if we repeated the sampling process, but this time we used a much larger sample size of n=500 instead? A dot plot of all 1,000 sample means would look like this. How has increasing the sample size from n=100 on the previous slide to n=500 on this slide affected the sampling distribution?

The shape of this sampling distribution has not changed a lot. It is still very close to Normal: bell-shaped, unimodal and symmetric.

The center of this sampling distribution, given by the mean of 25.58 kg/m2, equals the true population mean of 25.58 kg/m2.

The biggest difference is in the sampling distribution’s spread. Before, with a sample size of n=100, the sample means ranged from about 24 to 27 and the standard error of the sampling distribution was 0.42 kg/m2. Now, with a sample size of n=500, the sample means range from about 25 to 26 and the SE is 0.19 kg/m2. With a larger sample size, the sampling distribution is narrower. With a larger sample, the sample means from the various samples are all much more closely clustered around the true population mean.

Note that the number of samples is identical in both cases; it is only the sample size that differs. Increasing the number of samples that we take does not narrow the sampling distribution, although it does give us a clearer picture of its shape. Only increasing the size of each sample will narrow the sampling distribution.

What would happen if the sample size were very small? Suppose we repeated the sampling process, but this time we used a very small sample size of n=20 instead. A dot plot of all 1,000 sample means would look like this. How has decreasing the sample size from n=100 on a previous slide to n=20 on this slide affected the sampling distribution?

The shape of this sampling distribution has changed a bit. Unlike in previous slides, it has a slight right skew, but it is still fairly close to Normal: bell-shaped, unimodal and roughly symmetric.

The center of this sampling distribution, given by the mean of 25.61 kg/m2, is again very close to the true population mean of 25.58 kg/m2.

The biggest difference is in the sampling distribution’s spread (notice the limits on the x-axis changed from 24 to 27.5 in the two previous slides to 22.5 to 29.5 on this slide). Before, with a sample size of n=100, the sample means ranged from about 24 to 27 and the standard error of the sampling distribution was 0.42 kg/m2. Now, with a sample size of n=20, the sample means range from about 22.5 to 29.5 and the SE is 0.95 kg/m2. With a SMALLER sample size, the sampling distribution is WIDER. With a SMALLER sample, the sample means from the various samples are MORE WIDELY SPREAD AROUND the true population mean.

What general behaviors have we noticed as we examined the sampling distributions both for sample proportions and for sample means?

The first behavior we observe pertains to shape. As the sample size (n) increases, the sampling distribution both for sample proportions and for sample means looks more bell-shaped, unimodal and symmetric; that is, more Normal in the statistical sense.

The second pattern we observe is that the center of the sampling distribution is always located very close to the true population parameter, even for relatively small samples.

Lastly, we notice that as the sample size increases, the width or spread of the sampling distribution decreases. That is, as the sample size, n, increases, the sample statistics tend to be closer to the true population parameter value, thus making the variability of the sample statistics smaller.

What we have observed in these sampling distributions is actually a fundamental theorem in statistics, the Central Limit Theorem (frequently abbreviated as CLT).

The Central Limit Theorem tells us two things. First, the Central Limit Theorem states that if we choose a large enough sample size, n, then the sampling distribution of the sample means will be approximately Normal (unimodal, symmetric and bell-shaped) EVEN IF the underlying population is not Normal at all (for example, multimodal or severely skewed).
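As a rough illustration of this first claim, the sketch below draws sample means from a severely right-skewed population and measures how lopsided their distribution is. Everything here is an assumption made for illustration: the exponential population, the sample sizes, the number of repetitions, and the helper names `skewness` and `sample_mean_skewness` do not come from the lecture. A skewness near 0 indicates a roughly symmetric distribution.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is reproducible


def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def sample_mean_skewness(n, reps=5000):
    """Skewness of `reps` sample means, each computed from n exponential draws."""
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return skewness(means)


# ASSUMPTION: hypothetical sample sizes chosen only to show the trend
skew_by_n = {n: sample_mean_skewness(n) for n in (1, 5, 30, 100)}
for n, sk in skew_by_n.items():
    print(f"n = {n:>3}: skewness of the sample means = {sk:.2f}")
```

With n=1 the "sample means" are just individual observations, so their skewness reflects the population itself; the values for larger n should sit noticeably closer to 0.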

If the population distribution itself is Normal, then the sampling distribution of the sample means will be Normal for any sample size, even a sample size of n=1.

The next logical question is: how large a sample is ‘large enough’ for the CLT to hold for the distribution of sample means?

That depends on the shape of the population distribution.

If the shape of the population distribution is completely Normal (that is, if the measurement of interest is truly Normally distributed in the population), then a sample size of just 1 is enough for the distribution of sample means to be Normal. In other words, if the population is truly Normal, then the sampling distribution will be Normal for ANY size of sample.

If the shape of the population distribution is not Normal but is approximately symmetric, then one rule of thumb is that a minimum sample size of about 15 is needed in order for the sampling distribution of sample means to be roughly Normal.

If the shape of the population distribution is not Normal and not symmetric, but instead is skewed, then one rule of thumb is that a minimum sample size of about 30 is needed in order for the sampling distribution of sample means to be roughly Normal. Keep in mind, though, that the less Normal or the less symmetric the population distribution is, the larger the sample size that will be needed to ensure that the sampling distribution of sample means will be roughly Normal.

The second thing the Central Limit Theorem tells us is that the sampling distribution of sample means will be centered at the true population mean, \(\mu\) (mu), and will have a standard error, SE (sometimes called the standard error of the mean, SEM), equal to the population standard deviation, \(\sigma\) (sigma), divided by the square root of the sample size, n.

This applies to the true sampling distribution, which is for all possible samples of size n. As previously mentioned, our example sampling distribution contained only 1,000 samples of size n=100, not all possible samples, so it is a good but not quite perfect approximation of the true sampling distribution. Its mean is nearly but not quite equal to the true population mean, and its standard deviation is nearly but not quite equal to the population standard deviation divided by the square root of n.

In our BMI example, the mean of our approximate sampling distribution after obtaining 1,000 samples of size n=100 (25.59 kg/m2) is very close to the true population mean of 25.58 kg/m2. The true population standard deviation is 4.24 kg/m2 and the sample size is 100, so the standard error, SE, of the sampling distribution should be 0.424 kg/m2, which is quite close to the value we saw in our approximate sampling distribution plot (0.42 kg/m2).
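Written out as a formula, using the symbols introduced above and the BMI values just quoted, the standard error calculation is:

\[
\mathrm{SE} = \frac{\sigma}{\sqrt{n}} = \frac{4.24}{\sqrt{100}} = 0.424 \text{ kg/m}^2
\]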

An important thing to notice: the SE calculation shows us that as the sample size increases, the standard error decreases, and the sample means will get closer and closer to the population mean. This is consistent with what we observed earlier when we compared the sampling distributions for samples of size n=20, n=100 and n=500.
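The same calculation can be repeated for the three sample sizes compared earlier. The short sketch below uses the population standard deviation quoted in this lecture (4.24 kg/m2); the function name `standard_error` is just a label chosen for this illustration.

```python
import math

sigma = 4.24  # population SD of BMI in kg/m2, as quoted in the lecture


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma divided by sqrt(n)."""
    return sigma / math.sqrt(n)


# The three sample sizes compared in the lecture
se_by_n = {n: standard_error(sigma, n) for n in (20, 100, 500)}
for n, se in se_by_n.items():
    print(f"n = {n:>3}: SE = {se:.3f} kg/m2")
```

The results round to 0.95, 0.42 and 0.19 kg/m2, matching the spreads seen in the three simulated sampling distributions. Because n sits under a square root, the sample size has to be quadrupled to cut the standard error in half.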

Note that we do not use the Central Limit Theorem to get a better approximation of the population parameter of interest by using lots of samples. In most cases, it isn’t practical or even possible to re-sample from the same population thousands of times. In most cases, we only ever get one sample and we need to make our inferences about the population based on that one single sample. What we DO use the Central Limit Theorem for is to tell us how close our sample statistic is likely to be to the true population parameter, i.e. how well the sample statistic estimates the true population value.

The Central Limit Theorem also applies to sample proportions. In this case, the CLT states that if we choose a large enough sample size, the sampling distribution of sample proportions from random samples of size n will be approximately Normal (unimodal, symmetric, and bell-shaped) and will be centered around the true population proportion, p, with a standard error equal to \(\sqrt{p(1-p)/n}\).

Totally optional, but if you are curious: The CLT is a statement about means, but it also covers proportions because a proportion is in fact the mean of a Bernoulli variable that takes the values 1 (e.g. diabetic) and 0 (e.g. not diabetic). A Bernoulli distribution isn’t even remotely close to Normal in shape (see the earlier lecture on sampling distributions for a sample proportion), but the sampling distribution of sample proportions is nevertheless, per the CLT, approximately Normally distributed.
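For the curious, the idea that a proportion is just the mean of 1’s and 0’s can be checked with a small simulation. The values of `p`, `n` and `reps` below are hypothetical and chosen only for illustration; they are not taken from the diabetes example, and `sample_proportion` is a made-up helper name.

```python
import math
import random
import statistics

rng = random.Random(2)  # fixed seed so the sketch is reproducible

p = 0.10     # ASSUMPTION: hypothetical population proportion
n = 200      # ASSUMPTION: hypothetical sample size
reps = 2000  # number of repeated samples


def sample_proportion(n, p, rng):
    """Proportion of 1's in n Bernoulli(p) draws, computed as an ordinary mean."""
    draws = [1 if rng.random() < p else 0 for _ in range(n)]
    return statistics.fmean(draws)


p_hats = [sample_proportion(n, p, rng) for _ in range(reps)]

se_formula = math.sqrt(p * (1 - p) / n)  # SE given by the CLT for proportions
se_simulated = statistics.stdev(p_hats)  # SD of the simulated sample proportions

print(f"mean of the sample proportions = {statistics.fmean(p_hats):.3f} (p = {p})")
print(f"SE from the formula = {se_formula:.4f}")
print(f"SE from simulation  = {se_simulated:.4f}")
```

The two standard errors should agree closely, and the sample proportions should center near `p`, which is the behavior the CLT describes for any mean.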

Again, how large is ‘large enough’?

Recall that we find the proportion for an event or category of interest. In our example from an earlier lecture, the ‘event’ we were interested in was diagnosed diabetes in young adults. One common rule of thumb is that the sampling distribution of sample proportions will be approximately Normal if the number of events or ‘successes’ (n * p) and the number of non-events or ‘failures’ (n * (1-p)) are both at least 10. In our example, this assumption would be met if our sample contained at least 10 people with diabetes and at least 10 people without diabetes.
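This rule of thumb is simple enough to write as a one-line check. The sketch below is only illustrative: the function name `normal_approx_ok` and the example values of n and p are hypothetical, not taken from the diabetes example.

```python
def normal_approx_ok(n, p, minimum=10):
    """Rule of thumb: expected successes and failures are both at least `minimum`."""
    return n * p >= minimum and n * (1 - p) >= minimum


# ASSUMPTION: hypothetical sample sizes and proportion, for illustration only
print(normal_approx_ok(n=200, p=0.10))  # about 20 successes and 180 failures
print(normal_approx_ok(n=50, p=0.10))   # only about 5 successes
```

The first call satisfies the condition and the second does not, because a sample of 50 with p = 0.10 is expected to contain only about 5 events.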

If p, the population proportion, is unknown, then we use the sample proportion, p-hat, to estimate it.

As we did earlier for proportions, let’s review the different distribution types that have been presented thus far for means: population distributions, sample distributions, and sampling distributions.

The population distribution for our BMI example is shown in the upper left here. This describes the distribution of BMI values in the entire population of US adults. Typically, this distribution is unknown, and information about that distribution (e.g., population parameters such as the mean, mu (\(\mu\))) is often what we want to try to estimate.

However, what we most likely have access to is a single sample of data from the population. If we plot the data from our one sample, we create a sample distribution, which is a distribution of a characteristic in the sample. The sample distribution of BMI values for our one sample of size n=100 from US adults is shown in the upper right here. We can use the information from the known and observed sample to help us estimate the unknown population value/parameter of interest.

Unlike population and sample distributions, which are distributions of cases or observations, sampling distributions are distributions of statistics, where each ‘dot’ (a.k.a., sample statistic) is aggregate information from a sample of observations and many, many, many ‘dots’ (a.k.a., sample statistics) are obtained so we can understand the behavior of the ‘dots’. The approximate sampling distribution for mean BMI values for samples of size n=100 is shown in the lower center here. While we can simulate what the sampling distribution will look like (as we have in this lecture), we do not observe this distribution directly in real life. This distribution is an abstraction of what would happen if we could repeat the sampling behavior over and over again. It helps us understand the sampling variability of the statistic so that we can make an inference about the population from the sample.

While sample distributions are an important part of the data analysis process, sampling distributions are the foundation for statistical inference.

We have now explored the behavior of the sampling distributions for sample proportions and for sample means. Both are approximately Normal under certain conditions, as given by the Central Limit Theorem. One key difference is that when we are considering proportions (e.g. the proportion of young adults with diabetes), there are only two parameters needed to describe the behavior of the sampling distribution: the population proportion, p, and the sample size, n. In contrast, when we are considering means (e.g., the mean BMI in US adults), there are three parameters needed to describe the behavior of the sampling distribution: the population mean, mu, the population standard deviation, sigma, and the sample size, n.