Sample and Census in a Population

Definitions

Wilson Busaka
5 min read · Jan 8, 2019
Photo courtesy of Surbhi S

Observation/unit: the data recorded from an individual study subject or sampled unit.

Sampling unit: any single person, animal, plant, product or ‘thing’ being researched, e.g. an individual person in a university.

Population: the group of observations/units in a geographical location that is the subject of a study.

Sample: a group of observations obtained from a population through sampling.

Sampling: the process of selecting units from a population.

Census: the process of counting all the observations/units in a population.

A country wants to know how many citizens it has so that it can plan how to use its resources, such as revenue, to provide people with amenities such as schools, roads and hospitals.

It would be imprudent to go ahead and build all these without the numbers. The numbers tell you where to build more schools because the population is higher, and where to hire more doctors (and which kind of doctors) because an area faces a specific type of illness.

A census is expensive. What if you don’t have the resources to carry out such an operation? This is where sampling comes in. During the world war, ammunition was manufactured at a fast rate to meet wartime demand, so inspecting each and every bullet would have been difficult, and industries resorted to sampling. From every batch of, say, 100 bullets, 10 would be picked at random and tested. If at least 7 of the 10 (70%) passed the quality test, the whole batch was accepted; otherwise the whole batch was rejected. This saved a lot of time and resources. This is a form of quality control.
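
The batch rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `accept_batch`, the 85/15 split of good and defective bullets, and the representation of a bullet as a boolean are all made up for the example. Only the 100-bullet batch, the sample of 10 and the 70% threshold come from the text.

```python
import random

def accept_batch(batch, sample_size=10, min_pass_rate=0.7, rng=None):
    """Return True if enough randomly drawn items pass inspection.

    `batch` is a list of booleans (True = the item passes the quality test).
    """
    rng = rng or random.Random()
    inspected = rng.sample(batch, sample_size)  # draw without replacement
    pass_rate = sum(inspected) / sample_size    # share of inspected items that passed
    return pass_rate >= min_pass_rate

# Hypothetical batch: 100 bullets, 85 good and 15 defective.
batch = [True] * 85 + [False] * 15
print("Batch accepted:", accept_batch(batch))
```

Only 10 items are ever tested, so the decision can be wrong for a particular batch. That risk is the price paid for not inspecting all 100.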

If we wanted to find out how many Londoners had breakfast before leaving their houses, it would be time-consuming and expensive to ask everyone. Instead, we could ask a few people at random and, based on the findings, say as an estimate: ‘90% of people living in London had breakfast before stepping out.’
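
As a rough sketch of the breakfast example, the snippet below builds a simulated population and estimates the share from a random sample. Everything here is assumed for illustration: the population size of 100,000, the true rate of 90% used to simulate answers, the sample size of 500, and the function name `estimate_proportion`. None of it is real survey data.

```python
import random

def estimate_proportion(population, sample_size, rng=None):
    """Estimate the share of True answers from a simple random sample."""
    rng = rng or random.Random()
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size

# Simulated population: True means "had breakfast before leaving the house".
rng = random.Random(0)
population = [rng.random() < 0.9 for _ in range(100_000)]

estimate = estimate_proportion(population, 500, rng)
print(f"Estimated share who had breakfast: {estimate:.1%}")
```

The estimate will usually land close to the true share without matching it exactly, and a larger sample tends to narrow the gap.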

Sampling techniques

They can broadly be divided into:

  1. Probability (random): a sampling procedure that entails some form of random selection. It is mostly used by statisticians, mathematicians and professionals in related fields.
  2. Non-probability (non-random): a sampling procedure that does not use random selection. It is preferred by people who are not mathematicians or statisticians, as it doesn’t necessarily require any formulas.

1. Probability

i) Simple Random Sampling

In this technique, units are selected purely at random, so every unit has an equal chance of being picked. For instance, suppose a bowl holds 5 marbles (2 red, 1 green and 2 yellow) and you pick one marble at random. What is the probability of picking a red marble?

Red — 2

Green — 1

Yellow — 2

Total = 5

P(Red) = 2/5 = 0.4

P(Green) = 1/5 = 0.2

P(Yellow) = 2/5 = 0.4

When you sum up all the probabilities, they add up to 1: 0.4 + 0.2 + 0.4 = 1.
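
The marble arithmetic can be checked with a short Python sketch. The variable names are illustrative, and the final line goes one step beyond the text by drawing a simple random sample of 2 marbles.

```python
import random
from collections import Counter
from fractions import Fraction

# The bowl from the example: 2 red, 1 green, 2 yellow.
bowl = ["red", "red", "green", "yellow", "yellow"]

# Probability of each color = marbles of that color / total marbles.
counts = Counter(bowl)
probabilities = {color: Fraction(n, len(bowl)) for color, n in counts.items()}

for color, p in probabilities.items():
    print(f"P({color}) = {p} = {float(p)}")

print("Sum of probabilities:", sum(probabilities.values()))

# A simple random sample of 2 marbles, drawn without replacement.
print("Random draw:", random.sample(bowl, 2))
```

Using `Fraction` keeps the probabilities exact, so the sum comes out as exactly 1 rather than a rounded floating-point value.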

ii) Systematic sampling

This technique follows a fixed, systematic rule. For example, if k = 3, we select every 3rd element we come across. If you’re sampling people while walking through town, you would pick every third person you meet.
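
A minimal sketch of the every-k-th rule is shown below. The function name `systematic_sample` and the list of 12 people are made up for the example. The random starting point is an assumption added here (a common way to avoid always beginning with the same unit); the text itself only fixes the interval k.

```python
import random

def systematic_sample(units, k, start=None, rng=None):
    """Select every k-th unit, beginning at index `start` (0 <= start < k)."""
    if start is None:
        rng = rng or random.Random()
        start = rng.randrange(k)  # assumed: random starting point within the first k units
    return units[start::k]

people = [f"person_{i}" for i in range(1, 13)]

# Starting at index 2 picks the 3rd, 6th, 9th and 12th person.
print(systematic_sample(people, k=3, start=2))
# → ['person_3', 'person_6', 'person_9', 'person_12']
```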

iii) Stratified sampling

Here we divide the population into strata. The units within each stratum are homogeneous, meaning they share a common characteristic. We then sample from every stratum, taking into account the number of units it contains, so that every unit in the population has an equal probability of being selected.
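
One way to take stratum size into account is proportional allocation, sketched below. The strata (year groups of 500, 300 and 200 students), the overall sample size of 100 and the function name `stratified_sample` are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import random

def stratified_sample(strata, sample_size, rng=None):
    """Draw a simple random sample from each stratum, sized in proportion to the stratum."""
    rng = rng or random.Random()
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Proportional allocation; rounding may make the total differ slightly from sample_size.
        n = round(sample_size * len(units) / total)
        sample[name] = rng.sample(units, n)
    return sample

# Hypothetical strata: students grouped by year of study.
strata = {
    "first_year": [f"fy_{i}" for i in range(500)],
    "second_year": [f"sy_{i}" for i in range(300)],
    "third_year": [f"ty_{i}" for i in range(200)],
}

picked = stratified_sample(strata, sample_size=100)
print({name: len(units) for name, units in picked.items()})
# → {'first_year': 50, 'second_year': 30, 'third_year': 20}
```

Each stratum contributes 10% of its units here, so a student has the same chance of selection whichever year group they belong to.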

iv) Multi-stage sampling

Say you want to sample university students in Africa. You first select a country, say Ghana. Ghana has various regions, so you then settle on one, say Greater Accra. Inside Greater Accra you select the University of Ghana, and finally you sample its students using either systematic or simple random sampling.
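
The stages can be sketched as successive random choices over a nested sampling frame. The frame below is entirely made up: apart from the names taken from the example (Ghana, Greater Accra, University of Ghana), the entries are placeholders, the student lists are dummy identifiers, and picking each stage uniformly at random is an assumption of this sketch.

```python
import random

def multistage_sample(frame, n_students, rng=None):
    """Pick a country, then a region, then a university, then students within it."""
    rng = rng or random.Random()
    country = rng.choice(sorted(frame))                      # stage 1
    region = rng.choice(sorted(frame[country]))              # stage 2
    university = rng.choice(sorted(frame[country][region]))  # stage 3
    students = frame[country][region][university]
    # Final stage: simple random sampling of students.
    return (country, region, university), rng.sample(students, n_students)

# Hypothetical frame: country -> region -> university -> list of students.
frame = {
    "Ghana": {
        "Greater Accra": {"University of Ghana": [f"ug_{i}" for i in range(50)]},
        "Region B": {"University B": [f"ub_{i}" for i in range(40)]},
    },
    "Country C": {
        "Region D": {"University D": [f"ud_{i}" for i in range(30)]},
    },
}

path, students = multistage_sample(frame, n_students=5)
print(path, students)
```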

v) Cluster Sampling

The population is subdivided into clusters, preferably clusters that each contain all the desired attributes. You then select one cluster and apply either simple random sampling or systematic sampling within it.
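
A small sketch of the procedure described above: choose one cluster at random, then sample systematically inside it. The villages, their sizes, the interval k = 4 and the function name `cluster_sample` are all hypothetical.

```python
import random

def cluster_sample(clusters, k, rng=None):
    """Pick one cluster at random, then take every k-th unit within it."""
    rng = rng or random.Random()
    chosen = rng.choice(sorted(clusters))  # select a single cluster
    units = clusters[chosen]
    start = rng.randrange(k)               # assumed: random start for the systematic pass
    return chosen, units[start::k]

# Hypothetical clusters: households grouped by village.
clusters = {
    "village_a": [f"a_{i}" for i in range(20)],
    "village_b": [f"b_{i}" for i in range(20)],
    "village_c": [f"c_{i}" for i in range(20)],
}

chosen, sample = cluster_sample(clusters, k=4)
print(chosen, sample)
```

Unlike stratified sampling, which draws from every stratum, this draws only from the selected cluster. That is why each cluster should resemble the whole population.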

2. Non-Probability

i) Convenience sampling

Here the units are sampled at the researcher’s convenience.

ii) Judgment sampling

Units are selected according to the researcher’s judgment. He/she decides whom to select based on, for instance, appearance.

iii) Quota sampling

The population is subdivided into non-overlapping groups, each with a quota of units to be sampled, and then either judgment or convenience sampling is applied within each group until its quota is filled.
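
To make the idea concrete, here is a minimal sketch that fills quotas by convenience, taking people in the order they are encountered. The groups, the quota of 2 per group, the list of passers-by and the function name `quota_sample` are invented for illustration.

```python
def quota_sample(stream, quotas):
    """Take units in the order encountered until every group's quota is filled.

    `stream` yields (unit, group) pairs; `quotas` maps group -> number required.
    """
    remaining = dict(quotas)
    selected = []
    for unit, group in stream:
        if remaining.get(group, 0) > 0:
            selected.append((unit, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return selected

# Hypothetical passers-by, in the order the researcher meets them.
passers_by = [
    ("p1", "female"), ("p2", "male"), ("p3", "male"),
    ("p4", "male"), ("p5", "female"), ("p6", "female"),
]

print(quota_sample(passers_by, {"female": 2, "male": 2}))
# → [('p1', 'female'), ('p2', 'male'), ('p3', 'male'), ('p5', 'female')]
```

No random selection is involved: who ends up in the sample depends entirely on who happens to come along first.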

iv) Snowball sampling

Just like a snowball, the sample grows as it rolls from one unit to another: each unit you reach refers you to the next. It applies where the units are hard to find. For example, the oil industry is a closed market by nature, so you may need one oil marketing company to introduce you to another. Likewise, HIV/AIDS patients face a lot of stigmatization, so to reach them you’ll need to use this approach.
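
The referral chain can be pictured as a walk over a 'who introduces whom' map. The sketch below assumes such a map is known in advance, which is a simplification (in practice it is discovered as you go). The company names and the function name `snowball_sample` are placeholders.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Start from the seed units and follow referrals until max_size units are reached."""
    selected = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(selected) < max_size:
        unit = queue.popleft()
        selected.append(unit)
        # Each sampled unit introduces the researcher to its contacts.
        for contact in referrals.get(unit, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return selected

# Hypothetical referral map between oil marketing companies.
referrals = {
    "company_a": ["company_b", "company_c"],
    "company_b": ["company_d"],
    "company_c": ["company_a", "company_d"],
    "company_d": [],
}

print(snowball_sample(referrals, seeds=["company_a"], max_size=3))
# → ['company_a', 'company_b', 'company_c']
```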

