Cluster Sampling
Cluster sampling is a sampling technique used when “natural” groupings that are relatively similar to one another are evident in a statistical population. In this technique, the total population is divided into these groups (or clusters) and a simple random sample of the groups is selected. The required information is then collected from the elements within each selected group: either from all of them (one-stage cluster sampling) or from a simple random sample of them (two-stage cluster sampling).
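The procedure described above (sample the groups, then sample elements within each selected group) can be sketched in base R. This is a minimal illustration, not part of the animation package: the population, the column names cluster and y, and the sizes (20 clusters of 50 elements, 5 clusters sampled, 10 elements per cluster) are all assumptions made up for the example.

```r
# Hypothetical population: 20 clusters of 50 elements each (illustrative values).
set.seed(1)
pop <- data.frame(
  cluster = rep(1:20, each = 50),
  y = rnorm(1000, mean = 10, sd = 2)
)

# Stage 1: simple random sample of 5 clusters, without replacement.
chosen <- sample(unique(pop$cluster), size = 5)

# Stage 2: simple random sample of 10 elements within each chosen cluster.
rows <- unlist(lapply(chosen, function(k) {
  idx <- which(pop$cluster == k)
  idx[sample.int(length(idx), size = 10)]
}))
smp <- pop[rows, ]

table(smp$cluster)  # number of sampled elements per chosen cluster
mean(smp$y)         # naive sample mean of the surveyed variable
```

Keeping every element of the chosen clusters instead of subsampling in stage 2 (that is, using pop[pop$cluster %in% chosen, ]) would turn this into one-stage cluster sampling.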
A common motivation for cluster sampling is to reduce the total number of interviews and costs given the desired accuracy. Assuming a fixed sample size, the technique gives more accurate results when most of the variation in the population is within the groups, not between them.
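The claim about within-group versus between-group variation can be checked with a small simulation. The sketch below is only an illustration under assumed settings: it builds two made-up populations with roughly the same total variance, one where the variation is mostly within clusters and one where it is mostly between clusters, and compares how much the cluster-sample mean fluctuates across repeated samples of the same size. The helper names make_pop and cluster_mean are hypothetical.

```r
set.seed(42)
n_clusters <- 50  # number of clusters in the population
m <- 20           # elements per cluster

# Build a population whose cluster centers have sd `sd_between` and whose
# elements vary around their cluster center with sd `sd_within`.
make_pop <- function(sd_between, sd_within) {
  centers <- rnorm(n_clusters, sd = sd_between)
  data.frame(
    cluster = rep(seq_len(n_clusters), each = m),
    y = rep(centers, each = m) + rnorm(n_clusters * m, sd = sd_within)
  )
}

# Mean of a one-stage cluster sample: pick k clusters, keep all their elements.
cluster_mean <- function(pop, k = 5) {
  chosen <- sample(n_clusters, k)
  mean(pop$y[pop$cluster %in% chosen])
}

pop_within  <- make_pop(sd_between = 0.2, sd_within = 2)    # variation mostly within clusters
pop_between <- make_pop(sd_between = 2,   sd_within = 0.2)  # variation mostly between clusters

# Repeat the sampling many times and compare the spread of the estimates.
sd_within_case  <- sd(replicate(2000, cluster_mean(pop_within)))
sd_between_case <- sd(replicate(2000, cluster_mean(pop_between)))
c(within = sd_within_case, between = sd_between_case)
```

With these settings the estimates should spread out much less in the "within" case, which is the situation where cluster sampling loses little accuracy relative to its cost savings.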
Most readers will be familiar with one version of cluster sampling, called area sampling or geographical cluster sampling. A geographically dispersed population can be expensive to survey, so instead we can select a few areas and survey everyone (say, a thousand people) in each selected area.
In the animation package, the function sample.cluster() is a demonstration of cluster sampling. Here is an example:
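library(animation)
ani.options(nmax = 30, interval = 1)  # at most 30 frames, 1 second apart
par(mar = rep(1, 4))  # narrow margins on all four sides
sample.cluster(col = c("bisque", "white"))  # alternating fill colors for the cluster rectangles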
Each rectangle stands for a cluster. Simple random sampling without replacement is performed on the clusters themselves, and all points in the sampled clusters are included in the sample.