Statistical Sampling Techniques — A Critique

Anoop B
4 min read · Oct 24, 2019


Statistical Sampling Definition: Sampling is the process of selecting a limited number of elements from a large group of elements (the population) so that the characteristics of the sample are representative of those of the population.

Sampling techniques can be broadly classified into 2 categories:

  1. Probabilistic Sampling
  2. Non-probabilistic Sampling

These can be further classified into their sub-categories:

Probabilistic Sampling

Simple Random Sampling: Every element has an equal chance of being selected as part of the sample. It is used when we have no prior information about the target population. For example, in a random selection of 20 students from a class of 50, each student has the same chance of being selected. The probability of being picked on any single draw is 1/50, and the probability of ending up in the sample of 20 is 20/50 = 0.4.
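As a minimal sketch of the classroom example, assuming Python's standard `random` module and a made-up list of student labels, drawing 20 of 50 students without replacement could look like this. The function name `simple_random_sample` and the seed are illustrative choices, not part of any library.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical class of 50 students (labels are made up for illustration).
students = [f"student_{i}" for i in range(1, 51)]
chosen = simple_random_sample(students, 20, seed=42)

# Chance that any given student ends up in the sample: 20 / 50 = 0.4.
inclusion_probability = len(chosen) / len(students)
```

Passing a seed makes the draw reproducible, which helps when a sample needs to be audited later.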

Stratified Sampling: This technique divides the population into subgroups (strata) based on similarity, so that elements within a stratum are homogeneous and the strata differ from one another. Elements are then randomly selected from each stratum. Prior information about the population is needed to form the strata. For example, suppose a research team is seeking opinions about religion among various age groups. Instead of collecting feedback from 326,044,985 U.S. citizens, the team can divide the population into strata by age (18–29, 30–39, 40–49, 50–59, and 60 and above) and draw random samples from each stratum, around 10,000 people in total. The strata do not overlap, and each can contain a different number of members.
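The age-group example can be sketched as follows. This is one possible approach, assuming proportional allocation (the same fraction drawn from every stratum) and a small made-up population. The helper names `age_group` and `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict


def age_group(age):
    """Map an age to one of the age strata used in the example."""
    if age < 30:
        return "18-29"
    if age < 40:
        return "30-39"
    if age < 50:
        return "40-49"
    if age < 60:
        return "50-59"
    return "60+"


def stratified_sample(people, fraction, seed=None):
    """Sample the same fraction at random from every age stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        strata[age_group(person["age"])].append(person)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 1,000 people with made-up ages from 18 to 89.
people = [{"id": i, "age": 18 + i % 72} for i in range(1000)]
sample = stratified_sample(people, fraction=0.1, seed=0)

# Count how many sampled people fall in each stratum.
per_stratum = defaultdict(int)
for person in sample:
    per_stratum[age_group(person["age"])] += 1
```

Grouping happens before any random draw, so every stratum is guaranteed to appear in the sample.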

Systematic Sampling: A probability sampling method in which elements are chosen from the target population by selecting a random starting point and then selecting every k-th member after it, where k is a fixed ‘sampling interval’. The sampling interval is calculated by dividing the population size by the desired sample size. For example, a researcher has a population of 100 individuals and needs 12 subjects. The interval is 100/12 ≈ 8.3, rounded down to 8. The researcher picks a random starting number within the first interval, say 5, so the sample consists of individuals 5, 13, 21, 29, 37, 45, 53, 61, 69, 77, 85, 93. Some descriptions call the quotient of population size and sample size the sampling fraction. Strictly, the sampling fraction is the sample size divided by the population size (here 12/100), and the sampling interval is its reciprocal.
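A short sketch of this procedure, reproducing the numbers from the example (population of 100, sample of 12, start at individual 5). The function name `systematic_sample` and the choice to round the interval down are assumptions made for illustration.

```python
import random


def systematic_sample(population, n, start=None, seed=None):
    """Take every k-th element, where k = population size // sample size."""
    k = len(population) // n  # sampling interval, rounded down
    if k < 1:
        raise ValueError("sample size exceeds population size")
    if start is None:
        # Random 0-based starting index inside the first interval.
        start = random.Random(seed).randrange(k)
    if not 0 <= start < k:
        raise ValueError("start must lie within the first interval")
    return [population[start + i * k] for i in range(n)]


individuals = list(range(1, 101))  # individuals numbered 1..100
# Individual number 5 sits at 0-based index 4.
subjects = systematic_sample(individuals, 12, start=4)
```

Restricting the start to the first interval keeps every selected index inside the population.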

Cluster Sampling: The entire population is divided into clusters or sections, and then whole clusters are randomly selected. All the elements of each selected cluster are included in the sample. Clusters are identified using details such as age, sex, location, etc. A common example is area sampling, or geographical cluster sampling, where each cluster is a geographical area. Because a geographically dispersed population can be expensive to survey, grouping several respondents within a local area into a cluster can be more economical than simple random sampling.
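A minimal sketch of geographical cluster sampling, assuming the clusters are already known and stored as a dictionary. The district names and respondent labels are made up, and `cluster_sample` is a hypothetical helper.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every element in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical geographical clusters and their respondents.
districts = {
    "north": ["n1", "n2", "n3"],
    "south": ["s1", "s2"],
    "east": ["e1", "e2", "e3", "e4"],
    "west": ["w1", "w2", "w3"],
}
surveyed = cluster_sample(districts, n_clusters=2, seed=1)
```

The random draw is over clusters, not individuals, which is what separates this from stratified sampling.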

Multistage Sampling: The population is divided into clusters, and these clusters are further divided and grouped into subgroups (strata) based on similarity. One or more clusters can be randomly selected from each stratum. This process continues until the clusters can’t be divided any further. For example, a country can be divided into states, cities, and urban and rural areas, and all the areas with similar characteristics can be merged to form a stratum. The Gallup poll uses multistage sampling: for example, they might randomly choose a certain number of area codes and then randomly sample a number of phone numbers from within each area code.
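The area-code example can be sketched as a two-stage draw. This is only an illustration of the idea, not a description of how any polling organization actually implements it. The area codes, phone numbers, and the helper `two_stage_sample` are all made up.

```python
import random


def two_stage_sample(groups, n_groups, n_per_group, seed=None):
    """Stage 1: pick groups at random. Stage 2: sample within each group."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(groups), n_groups)
    return {
        g: rng.sample(groups[g], min(n_per_group, len(groups[g])))
        for g in chosen
    }


# Hypothetical area codes, each with 50 placeholder phone numbers.
area_codes = {
    code: [f"{code}-555-{i:04d}" for i in range(50)]
    for code in ("201", "305", "415", "617", "808")
}
numbers = two_stage_sample(area_codes, n_groups=2, n_per_group=5, seed=7)
```

Unlike plain cluster sampling, only a random subset of each chosen group is kept.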

Non-Probabilistic Sampling

Convenience Sampling: This method is used when sample elements are hard or costly to reach, so elements are selected based on convenience. The most basic example is when companies stop people at a mall or on a crowded street to distribute promotional pamphlets and ask questions.

Purposive Sampling: This is based on the intention or purpose of the study. Only those elements that best suit the purpose of the study are selected from the population. For example, if we want to understand the thought process of people who are interested in pursuing a master’s degree, the selection criterion would be “Are you interested in a master’s in..?”

Quota Sampling: This type of sampling depends on a pre-set standard. It aims to select a sample that is representative of the population: the proportion of each characteristic or trait in the sample should be the same as in the population. Elements are selected until the exact proportions of certain types of data are obtained or enough data in the different categories is collected. For example, if our population is 45% female and 55% male, then our sample should reflect the same percentages.
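As a rough sketch of the 45%/55% example, assuming a sample size of 20 and a made-up stream of respondents taken in arrival order, quotas could be filled like this. The name `quota_sample` is hypothetical. Note that no random selection happens here, which is why the method is non-probabilistic.

```python
def quota_sample(stream, quotas, key):
    """Accept respondents in arrival order until every quota is full."""
    counts = {category: 0 for category in quotas}
    sample = []
    for item in stream:
        category = key(item)
        if counts.get(category, 0) < quotas.get(category, 0):
            sample.append(item)
            counts[category] += 1
        if counts == quotas:
            break  # every category has reached its quota
    return sample


sample_size = 20
# Population shares from the example: 45% female, 55% male.
quotas = {"female": round(0.45 * sample_size), "male": round(0.55 * sample_size)}

# Hypothetical respondents, in the order they happen to show up.
respondents = [
    {"id": i, "sex": "female" if i % 2 == 0 else "male"} for i in range(100)
]
picked = quota_sample(respondents, quotas, key=lambda r: r["sex"])
n_female = sum(1 for r in picked if r["sex"] == "female")
n_male = sum(1 for r in picked if r["sex"] == "male")
```

With these numbers the quotas work out to 9 females and 11 males.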

Referral/Snowball Sampling: This technique is used in situations where the population is unknown and rare, or where we do not have access to enough people with the characteristics we are seeking. We start by finding a first element from the population and ask that person to recommend others who fit the description of the sample needed. The referrals continue, increasing the size of the sample like a snowball. For example, it is used for highly sensitive topics such as HIV/AIDS, where people will not openly discuss their situation or take part in surveys. Not everyone affected will respond to the questions asked, so researchers can contact people they know, or volunteers, to get in touch with those affected and collect information.
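The referral chain can be sketched as a breadth-first walk over a referral network. This assumes the referrals are known up front as a dictionary, which is a simplification: in a real study they are discovered one interview at a time. The participant labels and the function `snowball_sample` are made up.

```python
from collections import deque


def snowball_sample(seeds, referrals, max_size):
    """Grow the sample by following referrals from the initial contacts."""
    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque(seeds)
    sample = []
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # recruit each person at most once
                seen.add(contact)
                queue.append(contact)
    return sample


# Hypothetical referral network: who recommends whom.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["A"],
}
participants = snowball_sample(["A"], referrals, max_size=4)
```

Because everyone is reached through the first contact, the sample depends heavily on who that first contact is.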

Sampling techniques: Advantages and disadvantages

Tabular representation of advantages and disadvantages of various sampling techniques
