Environmental and Occupational Epidemiology, part 2: Study designs

Arindam Basu
Environment, Epidemiology, Climate
6 min read · Jul 23, 2018

--

Overview of Epidemiological Study Designs

Continuing from our previous article, where we reviewed the principles of epidemiology, the focus here is epidemiological study designs. Before we delve into these, let us spend some time thinking about how we would go about investigating whether X is an environmental hazard that threatens human health.

This article presents a quick and rough outline of the basic features of the different types of epidemiological study designs used in environmental and occupational health studies. In organising this information, we move from simpler studies that use secondary data to more complex studies that gather primary data. Some of these will be explained in detail in the individual study sections when we build them.

Ecological Studies. — Ecological studies in the context of environmental health test the association between an environmental exposure and a health outcome, with both measured at an aggregate level. For example, Adrian Barnett and colleagues took air pollutant data collected over three years in several Australian cities and in Auckland and Christchurch, and correlated these data with the total number of hospital admissions due to cardiovascular illnesses (see below):

Figure: Screenshot of the full text of the study on air pollution and cardiovascular disease related hospitalisations.

What makes this an ecological study is that the investigators collected data at the level of populations (the total number of cases from hospitals, and air quality measurements from the meteorological departments) and then correlated the two sets of measurements. A downside of this approach with respect to causal inference is that, because data were not collected at the individual level, one cannot make any inference about one person’s risk of hospitalisation on a “bad air day”, if you will. Any inference about individuals drawn from an ecological study is subject to the “ecological fallacy”, that is, inferring for individuals from data collected at the aggregate level.
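As a rough illustration of what "correlating the two measurements" can mean, here is a minimal Python sketch that computes a Pearson correlation between city-level pollutant averages and city-level admission rates. The numbers and variable names are invented for illustration only and are not taken from the Barnett study, which used more elaborate methods than a single correlation coefficient.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# HYPOTHETICAL aggregate data: one value per city, not per person.
mean_pm10 = [12.0, 15.0, 18.0, 21.0, 24.0]            # pollutant level
admissions_per_100k = [30.0, 34.0, 35.0, 41.0, 45.0]  # hospital admissions

r = pearson_r(mean_pm10, admissions_per_100k)
print(f"City-level correlation: r = {r:.2f}")
```

Note that each data point is a whole city. A strong correlation here says nothing about which individuals were exposed or admitted, which is exactly the ecological fallacy described above.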

Case Series. — These are study designs in which data are collected on individuals: the details of each individual case are noted, and possible exposures are tallied as well. However, as these studies do not have a valid comparison group, they are best suited for environmental health surveillance. Environmental health surveillance, or environmental health tracking, is a systematic, ongoing process of collection, analysis, interpretation, and dissemination of data on environmental exposures and health effects. These processes are vital for environmental health and environmental epidemiology because they help to identify clusters of disease and of environmental pollutants and toxins, and thus enable the framing of hypotheses. See, for instance, the environmental health surveillance in Western Australia to learn more about how these processes work.

Cross sectional surveys. — These provide snapshots of populations, whether large, small, or well defined, at a specific point in time or over a specific period of time. Cross sectional surveys are usually conducted using questionnaires. These studies are done to estimate the prevalence of a specific health outcome.
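To make the idea of prevalence concrete, the following Python sketch computes a point prevalence from survey counts, along with an approximate 95% confidence interval. The survey counts are hypothetical, and the simple normal-approximation (Wald) interval is an assumption chosen for brevity; it behaves poorly for very small samples or very rare outcomes.

```python
from math import sqrt


def prevalence(cases, surveyed):
    """Proportion of surveyed people who have the outcome at survey time."""
    if surveyed <= 0 or not 0 <= cases <= surveyed:
        raise ValueError("need 0 <= cases <= surveyed and surveyed > 0")
    return cases / surveyed


def wald_interval(cases, surveyed, z=1.96):
    """Approximate 95% CI for a proportion (normal approximation)."""
    p = prevalence(cases, surveyed)
    se = sqrt(p * (1 - p) / surveyed)
    # Clamp to the valid range of a proportion
    return max(0.0, p - z * se), min(1.0, p + z * se)


# HYPOTHETICAL survey: 50 of 1000 respondents report the outcome.
p = prevalence(50, 1000)
low, high = wald_interval(50, 1000)
print(f"Prevalence: {p:.1%} (95% CI {low:.1%} to {high:.1%})")
```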

Case Control Studies. — In this study design, investigators test specific hypotheses about the association between a specific exposure and a disease outcome. Investigators start by sampling individuals with and without the particular disease in question. Participants who have the disease outcome of interest are labelled as “cases”, and those without the disease outcome are labelled as “controls”. The investigators then ascertain the extent of exposure in both groups and compare the odds of exposure between them. Case control studies report their effect estimates using Odds Ratios (see above). The analytical method of choice is usually logistic regression, which estimates the Odds Ratio for specific levels of exposure. Case control studies are great for rare diseases such as cancers. The design also enables investigators to test more than one exposure for each health outcome of interest.
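The comparison of exposure odds between cases and controls can be sketched in a few lines of Python. This is a crude (unadjusted) Odds Ratio from a 2x2 table with a log-based (Woolf) 95% confidence interval, not the logistic regression mentioned above, which would additionally adjust for other variables. All counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Crude odds ratio: odds of exposure in cases / odds of exposure in controls."""
    if min(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls) <= 0:
        raise ValueError("all four cells must be positive for this simple estimate")
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Approximate 95% CI computed on the log odds ratio scale (Woolf method)."""
    or_ = odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    se_log = sqrt(1 / exposed_cases + 1 / unexposed_cases
                  + 1 / exposed_controls + 1 / unexposed_controls)
    return exp(log(or_) - z * se_log), exp(log(or_) + z * se_log)


# HYPOTHETICAL study: 100 cases (40 exposed) and 100 controls (20 exposed).
or_estimate = odds_ratio(40, 60, 20, 80)
ci_low, ci_high = odds_ratio_ci(40, 60, 20, 80)
print(f"OR = {or_estimate:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An Odds Ratio above 1 with an interval that excludes 1 would suggest that exposure is more common among cases than among controls, subject to the usual checks for bias and confounding.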

Retrospective and Prospective Cohort Studies. — Cohort studies are, in general, study designs in which “cohorts”, or groups of similar individuals who are initially free from the health outcome of interest, are assembled and followed through time to study the pattern of emergence of the health outcomes. In a prospective cohort study, the time at which the health outcome will emerge is unknown; individuals are classified as exposed or unexposed in the present and are then followed forward in time. In retrospective cohort studies, the cohorts are assembled using a principle known as the “historical cohort”. In assembling historical cohorts, we already know the exposure status of the individuals at some point in the past, and we also know their health outcomes, which were recorded at a later (more recent) point in the past. The analysis then proceeds in the same manner in both types of studies.

Retrospective cohort studies are well suited for occupational epidemiological studies, where cohorts are assembled on the basis of whether workers were exposed to specific environmental agents within the industry, and records are then examined to study the emergence of specific health outcomes. Cohort study data are used to estimate the incidence of specific health outcomes. The analysis usually includes a proportional hazards model, in which the emergence of health outcomes is studied over time.

Case control studies can also be nested within cohort studies. This happens when blood samples are collected from the disease free persons at the start, and specific biomarkers are preserved so that exposure status (in the form of dosage at that particular time) can be estimated later. When a sufficient number of health outcomes have accrued, these stored biomarker samples are used as the measure of exposure in a case control study. Individuals who show signs and symptoms of the disease under study are chosen as cases; those who are disease free at that stage are assigned the status of controls. Like regular case control studies, these nested case control studies are useful for studying multiple exposures.

Otherwise, cohort studies are good designs for studying multiple outcomes that result from a single exposure or a limited number of exposures. Cohort studies are also well suited for studying rare exposures (that is, exposures that are not very common, such as exposure to agents in specific industries). On the downside, they are time consuming and quite expensive to conduct. Of the study designs discussed here, they are the most robust for causal inference from an epidemiological perspective.
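To illustrate how incidence is estimated from cohort data, here is a minimal Python sketch that computes cumulative incidence, an incidence rate per person-years, and a relative risk comparing exposed and unexposed groups. The cohort counts are hypothetical, and this simple calculation is a stand-in for illustration; it is not the proportional hazards model mentioned above, which also accounts for the timing of events and for covariates.

```python
def cumulative_incidence(new_cases, persons_at_risk):
    """Proportion of initially disease-free people who develop the outcome."""
    if persons_at_risk <= 0:
        raise ValueError("persons_at_risk must be positive")
    return new_cases / persons_at_risk


def incidence_rate(new_cases, person_years):
    """New cases per unit of person-time at risk."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases / person_years


def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Ratio of cumulative incidence in the exposed to that in the unexposed."""
    risk_unexposed = cumulative_incidence(cases_unexposed, n_unexposed)
    if risk_unexposed == 0:
        raise ValueError("no cases among the unexposed; ratio is undefined")
    return cumulative_incidence(cases_exposed, n_exposed) / risk_unexposed


# HYPOTHETICAL occupational cohort followed over the same period:
# 1000 exposed workers (30 new cases), 1000 unexposed workers (10 new cases).
rr = relative_risk(30, 1000, 10, 1000)
# Assume the exposed group contributed 9500 person-years of follow-up.
rate_exposed = incidence_rate(30, 9500)
print(f"Relative risk: {rr:.1f}")
print(f"Incidence rate (exposed): {rate_exposed * 1000:.2f} per 1000 person-years")
```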

Conclusion

This was a quick introduction to the basic principles of epidemiology. We learned the key definition of epidemiology given by RJ Last in his textbook on epidemiology, and then proceeded from that definition to learn the different measures of disease distribution (prevalence, incidence, and standardised rates of disease) and measures of association (odds ratio, relative risk, and attributable risk). Then we learned that a notion of causality in epidemiological studies flows from the establishment of a valid association (that is, ruling out chance, eliminating biases, and controlling for confounding variables), and that causal inference can be established using both counterfactual approaches and the condition-based approach discussed by Hill back in 1965. Then, we learned about a few study designs that are used in epidemiology. While this is only a very brief snapshot of epidemiology and some of its applications in the special case of environmental epidemiology, it gives us a guide for doing more interesting work and research on environmental health. Next, we shall learn statistical data analysis and apply it to environmental health.

--

Arindam Basu
Environment, Epidemiology, Climate

Medical Doctor and an Associate Professor of Epidemiology and Environmental Health at the University of Canterbury. Founder of TwinMe,