How to read an Environmental Health Paper, Part I: basic steps
… Building a Users’ Guide to Environmental Health literature, baby steps
So, in this edition, I’d like to write about ten points that I have found useful, and that you may find useful too, for reading a research paper on Environmental Health (we will call these Environmental Epidemiology papers). These points are meant for reading a single paper in a practical way, and they apply not only to Environmental Health research but to health research in general. There are tools (see here and here) that help you critically appraise a body of scientific literature, and I will cover these tools in future editions of this series. For now, my goal is to develop intuitions about how to read and make sense of individual papers and articles. I believe the best way is to start with an example and link it to a set of questions, so that is what I have done in this post.
The paper that I will be “reading” and annotating is as follows:
Ferreccio C, Smith AH, Durán V, Barlaro T, Benítez H, Valdés R, et al. Case-Control Study of Arsenic in Drinking Water and Kidney Cancer in Uniquely Exposed Northern Chile. American Journal of Epidemiology. 2013 Jun 13;178(5):813–8.
Download the full text of the paper and follow along if you wish.
So, what questions are we asking, and what steps will we follow?
The absolute first step is to read the paper. Make notes as Soenke Ahrens advises in “How to Take Smart Notes”: always read a paper with a pencil in hand and mark the text.
The paper reports research that the authors conducted to test whether lifetime high consumption of inorganic arsenic through drinking water leads to, or causes, kidney cancer. There were reasons for the suspicion: an IARC report had concluded that ingested inorganic arsenic can cause cancers of the bladder, lung and skin, but that the evidence for kidney cancer was equivocal, because the only evidence came from studies that did not use “individual” level data. Individual level data matter in epidemiological studies for two reasons. First, studies based on aggregated data are open to the “ecological fallacy”, where an association seen at the level of groups need not hold for the individuals within them. Second, individual level data let you study “confounding” variables and test whether biases might explain the results. In this case, the authors argued that if inorganic arsenic causes bladder cancer, and urinary bladder cancers are essentially transitional cell cancers, then it is possible that there would be a higher risk of cancer of the renal pelvis as well.
Essentially as they say,
The ten questions
We will walk through this paper and answer the following ten questions as we read.
Summarise the main study question: what did the authors study and why?
Itemise the key assumptions of the authors of this study
Describe the features of the target population of the study
Describe the exposure they studied
Comment on the comparison groups in the study
Describe the outcome they studied
Describe the methods as described by the authors in the study
Comment whether the methods are appropriate for the goals of this study and what alternative approaches could be considered and why
Describe the results from the study
Choose from the results one expected and one unexpected result
Q1: Summarise the main study question, what did the authors study and why?
This is the absolute first question to answer to get the big picture of the study. Usually, you will be able to answer this question from the abstract of any study. So, let’s see what the abstract states:
If you read the abstract, you will see where the authors start from. The International Agency for Research on Cancer (IARC) had concluded that ingested arsenic can cause several other cancers, but it had not concluded that arsenic causes kidney cancer, because the available data did not come from individual level studies. Could this gap be filled with new research? The study question, then, was: does inorganic arsenic cause kidney cancer? The reasoning was that arsenic can cause bladder cancer, and the transitional cell cancers typical of the bladder occur in the kidneys as well, so it is reasonable to think that arsenic might cause kidney cancer in the same way.
This brings us to the second question in the series:
Q2: Itemise the key assumptions of the authors of this study
Where do we find this information? In most journal articles, you will find it in the introduction section. Let’s see what information we get in this article and where it can be found.
A couple of points worth understanding:
- First, they noted that IARC found evidence that arsenic causes cancers in other organs but not in the kidney, because of a “lack of individual level data”. So if individual level data were to show an association, that might be admissible as evidence.
- Second, to obtain individual level data, they needed a place and a population that could provide that kind of evidence unequivocally. Northern Chile was the most appropriate choice: it is very dry, its people drew their water from the public water supply alone, and that supply had shown different arsenic levels over different time periods. That way, if there was an association, you could relate it to what people had ingested in their drinking water, going back over the years for which sufficient records were available.
These two critical assumptions needed to be met for their research to stand. So they went looking for individual level data in a part of the world where it was possible to obtain that kind of evidence.
Q3: Describing the target population of the study
The next item in assessing an environmental epidemiological study is to check to whom the study applies. This relates both to the people who were studied and to the people who are likely to benefit from the results. Often these two populations are the same. In environmental epidemiological studies, however, an environmental exposure is usually ubiquitous, so the results apply to people all over the world even though the study population itself was confined to a specific geographical area. Let’s take a look at the population studied in the cited paper.
In this case, the authors make it clear that the data were obtained from people in Northern Chile, a relatively confined population living in several cities, of which three were particularly important: Arica, Iquique, and Antofagasta. More importantly, as the authors also make clear, unravelling the link between exposure to arsenic and the risk of kidney cancer, particularly the cancers that resemble bladder or transitional cell cancers, matters for everyone in the world who is exposed to inorganic arsenic. Therefore, the target population here is everyone who is exposed, or likely to be exposed, to high concentrations of inorganic arsenic.
Q4: Describe the exposure
In any environmental epidemiological study, the environmental variables are key. Here, “environment” means any entity that is external to humans: something that occurs in the natural environment, or something whose origin or consumption humans themselves are responsible for. In the context of this study, the exposure in question is inorganic arsenic. It was found in high concentrations in the 1950s in the river water, or source water, that came from the Andes mountains, and after the 1970s the concentration of inorganic arsenic in the drinking water was reduced. You can find this in the Methods section, as seen here:
Q5: Describe the comparison groups
This is really important for studies where the researchers examine the relationship between an exposure and an outcome. On the other hand, where the authors’ main objective is only to describe a phenomenon or its prevalence, and they do not set out to study any exposure-disease association, the question of a “comparison” group does not arise by the nature of the research. Keep this in mind as you start evaluating these types of epidemiological studies. More on study designs later in this series, but for now, comparison groups are relevant where the researchers’ aim is to unravel an “association” or “causation”. So, the first rule is to find out the purpose of the study: are the researchers reporting on the prevalence of an environmental exposure or an outcome? Or are they unravelling the relationship between an exposure and a disease? If the latter, this is where you should look for comparison groups.
Returning to the research we are discussing here, we see that the researchers set up a case control study. In a case control study, people with a health condition or disease (the cases) are compared with people who do not have it (the controls); the two groups may be matched on some characteristics, but they are not always matched. So, controls form the comparison group for the cases. If this is not explicit, look for whom the researchers compared the study participants with. Sometimes the comparison group is the entire population of a country (where they analyse standardised measurements), and sometimes the comparisons use aggregated measures. So where will we find this information in a study we want to appraise? Always look in the “Methods” section. The “Methods” section of this study tells us the following:
How do you make sense of this? In the first place, they identified individuals with kidney cancer, ascertained from different sources. These were the “cases” in this study. The comparison group was their controls and, as they write, “controls without cancer” is the key phrase you should look for. Other types of studies describe other comparisons, such as a “non-exposed” group in a cohort study. In yet other designs, you are not likely to find people compared with people; instead, time points of exposure are compared with control time points (as in “case-crossover” studies). In this case, the controls are people who did not have kidney cancer.
Q6: Describe the outcome under study
All environmental epidemiological studies will provide you with this information, regardless of whether they also study exposures. Outcomes in our context are the health effects. The more detailed and explicit they are, the easier it is to follow the logic of what is under investigation. You will find this information in the methods section of the study you appraise.
Returning to our case, the following passage from the “Methods” section tells us about the outcomes, or health conditions, or health effects, that they reported:
Q7: Describe the methods of the study
This is really crucial for appraising an environmental epidemiological study. Paraphrase or simplify the methods. In our own appraisal, a summary of the methods would be this: they identified people with and without kidney cancer, matched on sex and age (within five years), and from every individual, cases and controls alike, they obtained the following data: BMI (current and 20 years in the past), water intake history, past places of residence, employment history, smoking history, and possible exposure to other known risk factors for kidney cancer. These are really important, because some of them relate to measuring the true exposure and others relate to potential confounding variables. This is the important bit about reading the methods, particularly where you read about individual data points: think not only of the exposure and the outcome, but also of the confounding variables on which data were collected. See how meticulous the researchers were in collecting the data.
But it goes beyond collecting the data. They also described how they measured the exposure and how they obtained the exposure data. They further described that they conducted what is referred to as an “unconditional” logistic regression (a subsequent post will cover what logistic regression is, what “unconditional” means, and how to run one using Julia; for now we only mention it), and they clearly listed all the variables they used in the regression (sex, age group, smoking status, mining exposure, socioeconomic status). Note how carefully they collected data on socioeconomic status, with further explanation. This is the level of meticulousness that you want to look for in any paper.
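To build some intuition for why adjusting for confounders matters, here is a minimal sketch in Python. It does not reproduce the authors’ unconditional logistic regression; it swaps in a simpler technique, the Mantel-Haenszel pooled odds ratio, which adjusts for a single confounder by stratifying on it. All counts are made up for illustration (they are not from the paper), and the function names are my own.

```python
# Hypothetical counts, NOT from the paper: they only illustrate confounding.
# Each stratum is (exposed cases, unexposed cases, exposed controls, unexposed controls).
strata = {
    "smokers": (40, 10, 40, 10),
    "non-smokers": (10, 40, 20, 80),
}


def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table: (a * d) / (b * c)."""
    return (a * d) / (b * c)


def crude_odds_ratio(tables):
    """Collapse all strata into one 2x2 table, ignoring the confounder."""
    a, b, c, d = (sum(cells) for cells in zip(*tables))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_odds_ratio(tables):
    """Pooled odds ratio adjusted for the stratifying variable."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


tables = list(strata.values())
crude = crude_odds_ratio(tables)
adjusted = mantel_haenszel_odds_ratio(tables)

for name, table in strata.items():
    print(f"{name}: OR = {odds_ratio(*table):.2f}")
print(f"crude OR = {crude:.2f}, Mantel-Haenszel OR = {adjusted:.2f}")
```

In this made-up data set the exposure has no effect within either smoking stratum, yet the crude odds ratio is 1.5, because smokers are both more often exposed and more often cases. Stratifying removes that distortion. A multivariable logistic regression does the same job for several confounders at once.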
Q8: Comment (or rather think) whether the methods are appropriate for the goals of this study and what alternative approaches could be considered and why
This is a “counterfactual” that you must always consider. As you do so, think of the aims of the study: what did the authors want to study, and would the methods get them to that destination? This is important because it is the gist of “internal validity”. In other words, is there enough in the methods to answer the three things we must look for? First, did the authors do enough to rule out the play of chance? What do we know about how they planned the sample size and power of the study? Second, what did they do to eliminate possible biases? Think of what biases could have arisen, and what the authors did to remove them at the planning stage of the study. Third, how did the authors control for potential confounding variables? In particular, pay attention to how they identified the potential confounders and what they did about them.
Returning to the study, we notice that the authors do not tell us how they planned the sample size or power of this study. But we do know that kidney cancers are extremely rare, that the authors collected data on the number of residents with the cancer, and that they recruited a large number of “controls” for the cases (148 cases and 872 controls, totalling 1020 participants, a control:case ratio of nearly 6:1), so the study was not likely to be under-powered. What about biases? As this was an observational study, there was some likelihood of recall bias among the participants, but the cases were ascertained objectively, relying on official documentation and figures from official statistics, so biases in the exposure assessment and outcomes were less likely. Finally, the investigators controlled extensively for confounding variables: not only did they match the cases and controls on age and sex, they also fitted a multivariable logistic regression model that included the potential confounding variables. This is what you want to look for in the studies you appraise: what strategies did the investigators adopt, by design, to rule out the play of chance, control for confounding variables, and minimise or eliminate biases? When we discuss study designs in this series, we will take a closer look at these issues, but for now, these three entities are crucial for critically appraising studies.
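Since the authors do not report a power calculation, we can do a rough back-of-the-envelope check ourselves. The sketch below, in Python, uses the case and control counts quoted above and a textbook normal approximation for comparing two proportions. The exposure prevalence among controls (20%) and the odds ratio we would like to detect (2.0) are assumptions I picked for illustration; they are not from the paper.

```python
from math import sqrt
from statistics import NormalDist

N_CASES = 148      # counts as quoted in this post
N_CONTROLS = 872


def exposure_prevalence_in_cases(p_controls, odds_ratio):
    """Exposure prevalence among cases implied by a given odds ratio."""
    return odds_ratio * p_controls / (1 + p_controls * (odds_ratio - 1))


def approximate_power(n_cases, n_controls, p_controls, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided test comparing exposure proportions.

    Uses the normal approximation with unpooled variances, so treat the
    result as a rough guide rather than an exact figure.
    """
    p_cases = exposure_prevalence_in_cases(p_controls, odds_ratio)
    standard_error = sqrt(
        p_cases * (1 - p_cases) / n_cases
        + p_controls * (1 - p_controls) / n_controls
    )
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p_cases - p_controls) / standard_error - z_critical)


ratio = N_CONTROLS / N_CASES
# Assumed values, not from the paper: 20% of controls exposed, true OR of 2.0.
power = approximate_power(N_CASES, N_CONTROLS, p_controls=0.20, odds_ratio=2.0)

print(f"controls per case: {ratio:.2f}")
print(f"approximate power: {power:.2f}")
```

Under these assumptions the approximate power comes out at roughly 0.9, which fits the impression that the study as a whole was not under-powered for an effect of that size. Change the assumed prevalence or odds ratio and the answer changes, which is exactly why it helps when authors report their own calculation.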
Q9: Describe the results of the study or rather, what are the results?
This is a crucial point, as you want to know how well the plan worked. You will find this information in the results section of the paper. There, pay attention to all the tables and figures that the authors present. In particular, pay attention to the description of the population they studied, to the exposure variable, and to the exposure-outcome relationships. First, were all participants accounted for? In some papers, the authors include a graphic display of the participants that shows how many were planned for, how many ended up in the analysis, and why the others were lost. Second, look for descriptive statistics in the tables and pay attention to the differences between the groups being compared. Are the groups very different on some variables that were not adjusted for? Are they sufficiently similar? Third, check what analyses were conducted, what results are reported, and how. In particular, look for both the magnitude and the direction of the effect estimate. Have the authors reported p-values alone, or have they also reported 95% confidence intervals? What are the point estimates and the corresponding 95% confidence intervals? If the authors are studying cause and effect relationships, have they reported dose-response relationships? That is, as the dose of the exposure increases, is there a corresponding increase (or decrease) in the outcome?
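To make “point estimate with a 95% confidence interval” concrete, here is a small Python sketch that computes an odds ratio and its interval from a single 2×2 table using the standard log-based (Woolf) method. The counts are hypothetical and chosen for easy arithmetic; they are not taken from the paper, and the function name is my own.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, level=0.95):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table."""
    estimate = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = exp(log(estimate) - z * se_log_or)
    upper = exp(log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Hypothetical counts, not from the paper.
estimate, lower, upper = odds_ratio_with_ci(20, 10, 100, 200)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

With these invented counts the interval excludes 1.0, which suggests the association is unlikely to be due to chance alone; the width of the interval tells you how precisely the effect was estimated.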
Returning to the study we are appraising, we see that the authors have given us the number of people they planned for and the number who finally ended up in the study, and that they have accounted for the difference between these counts. They have provided the effect sizes (as odds ratios, ORs) and the necessary dose-response relationships. For the transitional cell cancers only, compared with the lowest dose of exposure, where the OR was fixed at 1.0 (this is the baseline), a distinct dose-response relationship is present:
So you can see that the hallmarks of a good, meticulous study of the association between arsenic and kidney cancer, particularly cancers of the ureter and the renal pelvis, are all there. You can see measures of the strength of association, each presented as a point estimate with a corresponding 95% confidence interval. Look for this level of meticulousness in the papers that you appraise.
We now turn to the final question for this part of the series, where we note whether the paper holds any surprises: what surprises and conclusions the authors have presented, or whether something leaps out at you.
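A dose-response check of this kind can be sketched in a few lines of Python. Each exposure category gets its own odds ratio against the lowest category, which is fixed at 1.0, and we then ask whether the odds ratios rise with the dose. The category labels and counts below are invented for illustration and are not the paper’s data.

```python
# Hypothetical (cases, controls) counts per exposure category, lowest dose first.
# These numbers are invented; they are NOT the paper's data.
categories = [
    ("low", 10, 200),
    ("medium", 15, 100),
    ("high", 20, 50),
]


def odds_ratios_vs_baseline(rows):
    """Odds ratio of each category relative to the first (baseline) category."""
    _, base_cases, base_controls = rows[0]
    baseline_odds = base_cases / base_controls
    return [
        (label, (cases / controls) / baseline_odds)
        for label, cases, controls in rows
    ]


def is_monotonic_increasing(values):
    """True when each value is at least as large as the one before it."""
    return all(later >= earlier for earlier, later in zip(values, values[1:]))


results = odds_ratios_vs_baseline(categories)
for label, ratio in results:
    print(f"{label}: OR = {ratio:.1f}")

has_gradient = is_monotonic_increasing([ratio for _, ratio in results])
print("ORs rise with dose:", has_gradient)
```

A simple monotonic check like this is only a first look; a formal test for trend would also take the uncertainty in each estimate into account.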
Q10: Choose from the results one expected and one unexpected result and justify your choice
This aspect of the work is up to you. Read the results carefully and see for yourself whether something that you, or the authors, expected has worked out as expected. While you are at it, look out for results that appear surprising to you. After all, these are the elements that will drive future research. If some results are surprising, then the hunt is on for what happened and why. You will either find this information in the discussion section, or you can rely on your own knowledge and understanding of the subject matter. This is where your understanding of the assumptions that the authors make, and that you make, about the exposure-disease association will start making sense.
Returning to the study we are appraising, recall the assumption the authors started from when they set out to study the association between kidney cancer and arsenic intake. The IARC had found an association between arsenic and cancers in other organs but was ambivalent about the role of arsenic in kidney cancer, because it could find “no evidence” from “individual” studies, only evidence from aggregated studies. So the authors expected that if arsenic was a culprit in kidney cancer, individual studies would show positive results as well. The results suggest that this was indeed the case. Was there anything unexpected? The authors do not report so, but you can argue that the lack of a statistically significant association for other kidney cancers (particularly renal cell cancer) was an “interesting” finding, and might be surprising.
Final Words and next steps ….
This was an introduction to the art of critical appraisal of environmental epidemiology literature. We barely scratched the surface of how to do an appraisal. Future episodes will cover the nitty gritty of reproducing results and appraisal of a “body of evidence” and “external validity” and “study designs”. Stay tuned.