A Bayesian approach to chest pain

Vasilis Konstantakos
Published in Analytics Vidhya
11 min read · Aug 11, 2021

1. Introduction

Chest pain is a common chief complaint with a wide range of possible causes, from illnesses with favorable prognoses to life-threatening conditions. However, translating a patient’s experience of chest pain into a specific pathology remains a persistent challenge for clinicians. The consequences of an incorrect diagnosis can be severe; misdiagnosing a heart attack as minor musculoskeletal pain could mean death. Thus, the challenge is to identify an acutely dangerous cause in a timely manner while, on the other hand, avoiding unnecessary testing and referrals.

Unfortunately, doctors often rely on pattern recognition to arrive at the appropriate diagnosis. This, however, is not always reliable, as many cases initially present in an uncommon way. Clinicians who have been confronted with an unexpected diagnosis of myocardial infarction in a seemingly innocuous presentation of chest pain can confirm this. A solution to such diagnostic problems is to adopt a more systematic Bayesian approach, incorporating multiple factors before arriving at the final decision.

1.1 Bayes’ Theorem

Bayes’ theorem is a formula that describes the probability of an event, based on prior knowledge of conditions that might be related to the event. It follows simply from the axioms of conditional probability but can be used to powerfully reason about a wide range of problems involving belief updates such as medical diagnosis. It is stated mathematically as follows:

P(A|B) = P(B|A) · P(A) / P(B)   (Eq. 1)

Bayes’ Theorem.

where A, B are events, P(A), P(B) ≠ 0 are the probabilities of observing A and B, while P(A|B) and P(B|A) are the conditional probabilities of A given B and B given A, respectively.
While this is an equation that applies to any two events A and B, it has a particularly nice interpretation when A represents one of several competing hypotheses Hᵢ and B represents some observed data D. In this case, the formula for n hypotheses can be written as:

P(Hᵢ|D) = P(D|Hᵢ) · P(Hᵢ) / [Σⱼ P(D|Hⱼ) · P(Hⱼ)]   (Eq. 2)

Alternative Form of Bayes’ Theorem.

where P(Hᵢ|D) is the posterior probability of hypothesis i given the data, P(D|Hᵢ) is the likelihood of observing the data under hypothesis i, and P(Hᵢ) is the prior probability of hypothesis i. The denominator sums the likelihood of the data over all hypotheses, each weighted by its prior, and functions as a normalizing constant.
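As a concrete illustration, the discrete form of Bayes’ theorem can be sketched in a few lines of Python. The hypotheses and numbers below are made up for illustration, not taken from our data:

```python
# Discrete Bayesian update: combine priors with likelihoods, then normalize.
def posterior(priors, likelihoods):
    """Return P(H_i | D) for each hypothesis, given the priors P(H_i)
    and the likelihoods P(D | H_i)."""
    unnormalized = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(unnormalized)   # the denominator: P(D)
    return [u / evidence for u in unnormalized]

# Three mutually exclusive hypotheses with equal (non-informative) priors:
priors = [1/3, 1/3, 1/3]
likelihoods = [0.9, 0.3, 0.3]      # P(D | H_i), illustrative values
post = posterior(priors, likelihoods)
assert abs(sum(post) - 1) < 1e-9   # posteriors always sum to 1
```

Note that, unlike the likelihoods, the posterior probabilities always sum to 1 because of the normalizing denominator.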

1.2 Problem formulation

The latter form of Bayes’ theorem is especially useful for testing multiple hypotheses simultaneously. We apply this to study the challenges that occur during the differential diagnosis of chest pain. Specifically, we will try to answer the following scenario: “A male patient presents with chest pain as the chief complaint. What is the probability that he has a myocardial infarction? In general, how likely is it to have a cardiovascular, respiratory, etc., disease? Furthermore, how do these diagnoses change for a female patient? What are the corresponding probabilities in that case?” This report will try to tackle these questions in a brief but comprehensible manner and explain the basic concepts underlying this approach.

2 Methods

2.1 Prior distributions

To begin with, we create 30 discrete hypotheses for a patient presenting with chest pain. We assume that these hypotheses are mutually exclusive (i.e., if one is true, the others must be false) and exhaustive (i.e., all hypotheses, taken together, describe all possible outcomes). For example, in our schema, a patient with chest pain will have one (and only one) diagnosis from those 30. We then construct a prior distribution, considering each hypothesis as equally likely. Thus, each diagnosis will have an equal (1/30) probability and the resulting distribution will be non-informative.

To better represent the existing knowledge, we also create 2 informative prior distributions, each one reflecting a different population. In particular, we construct a prior distribution for the patients presenting with chest pain in General Practice (GP) and one for the patients presenting in the Emergency Department (ED). Previous studies have shown that these populations are significantly different and, as such, they should be studied separately. To accomplish this, we create 1000 virtual visits - in GP and ED - based on our intuition and prior knowledge. Each visit has a final diagnosis, which is then used to calculate a corresponding probability. For instance, we observed 50 myocardial infarctions among the 1000 visits in GP, resulting in a probability of 0.05. Due to space limitations, we group the 30 diagnoses into 6 categories based on the biological system to which they belong (e.g., cardiovascular, respiratory, etc.). A summary of these distributions is shown in the Supplementary Excel file.

The differences between the three prior distributions are clearly demonstrated in Fig. 1. We observe that cardiovascular diseases are the most prevalent among the visits in the ED, while musculoskeletal causes are quite common in GP. Furthermore, psychiatric or other miscellaneous causes are not that common, in contrast to what a non-informative distribution would suggest. Thus, it is crucial to create and select a suitable prior distribution for the problem we are studying, as it will directly influence our results.
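The counting procedure above can be sketched as follows. Apart from the 50 myocardial infarctions mentioned in the text, the counts are hypothetical placeholders:

```python
# Turn simulated visit counts into a prior distribution over diagnoses.
# Only the 50 myocardial infarctions come from the text; the rest are
# hypothetical placeholders standing in for the remaining diagnoses.
gp_counts = {
    "myocardial infarction": 50,   # 50 of 1000 virtual GP visits
    "musculoskeletal": 400,        # hypothetical
    "gastrointestinal": 250,       # hypothetical
    "other": 300,                  # hypothetical
}
total = sum(gp_counts.values())    # 1000 virtual visits
gp_prior = {dx: n / total for dx, n in gp_counts.items()}
assert gp_prior["myocardial infarction"] == 0.05
```

The same procedure, applied to the ED visit counts, yields the second informative prior.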

Figure 1: Prior distributions for patients presenting with chest pain.

2.2 Likelihood calculation

After constructing the appropriate prior distributions, we need to calculate the likelihood of observing the data under each of our hypotheses, as Eq. 2 shows. This is not always an easy task, especially for numerous discrete hypotheses or, even worse, for continuous ones. In our case, we use data from 22,304 visits, assigning each to a unique diagnosis and identifying how many of those were male patients. We can then compute the likelihood of being male under each possible hypothesis [P(Male|Hᵢ)]; dividing the number of male patients by the total number of visits for each diagnosis gives the desired result. In addition, we group the diagnoses into 6 categories, as we did before, for a clearer illustration. An added benefit of this calculation is that we can also derive the likelihood of being female under each possible scenario, as those events are complementary (i.e., a patient can be either male or female) (Fig. 2). We use this fact as a final comparison in our analysis in the following section.

First, we focus our attention on the initial problem of a male patient presenting with chest pain. All the resulting likelihood calculations can be seen in the accompanying Excel file. We observe that the sum of all likelihoods isn’t equal to 1. Is this outcome expected? In addition, could we use the likelihood of each hypothesis divided by the sum of all likelihoods as the new likelihood? These are questions that commonly arise when dealing with such problems and should be considered. We will try to answer them after completing our analysis in the following section.
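A minimal sketch of this likelihood calculation, using hypothetical per-diagnosis counts rather than the actual 22,304-visit data:

```python
# For each diagnosis, P(Male | H) is the number of male patients divided
# by the total visits with that diagnosis. Counts are hypothetical.
visits = {                       # diagnosis: (male visits, total visits)
    "cardiovascular": (700, 1000),
    "musculoskeletal": (600, 1000),
    "gastrointestinal": (400, 1000),
}
lik_male = {dx: m / n for dx, (m, n) in visits.items()}   # P(Male | H)
lik_female = {dx: 1 - p for dx, p in lik_male.items()}    # P(Female | H)

# Within each diagnosis, the two likelihoods are complementary:
assert all(abs(lik_male[dx] + lik_female[dx] - 1) < 1e-9 for dx in visits)
```

The female likelihoods come for free as complements, which is exactly the fact used for the final comparison in Section 3.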

Figure 2: Visits presenting with chest pain per diagnostic category.

2.3 Posterior Distributions

We now proceed to calculate the posterior distributions based on the computed likelihoods and the prior distributions we have defined. Due to space limitations, we will present the results grouped into 6 representative categories for demonstration purposes. The complete results for each diagnosis and category can be seen in the corresponding Excel file.
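The posterior computation itself is a one-step application of Eq. 2. A sketch with hypothetical prior and likelihood values:

```python
# Multiply each prior by the corresponding likelihood, then normalize.
# The numbers below are hypothetical, not the article's computed values.
prior = {"cardiovascular": 0.5, "musculoskeletal": 0.3, "other": 0.2}
lik   = {"cardiovascular": 0.7, "musculoskeletal": 0.6, "other": 0.4}

unnorm = {dx: prior[dx] * lik[dx] for dx in prior}
evidence = sum(unnorm.values())                     # normalizing constant
posterior = {dx: u / evidence for dx, u in unnorm.items()}
assert abs(sum(posterior.values()) - 1) < 1e-9      # a proper distribution
```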

3 Results

In this section, we present the results of our analysis. To begin with, we illustrate the posterior distributions constructed by following the process we described. In particular, Fig. 3 shows the posterior distribution that arises from the non-informative prior we used. In this case, the data show that most male patients with chest pain had a cardiovascular, musculoskeletal, or respiratory disease, as these conditions had the highest likelihoods. After observing the data, the posterior probabilities for these conditions increased. Therefore, we observe that the posterior mainly follows the trend of the likelihood; thus, it is data-driven. However, even though there is a strong bias towards cardiovascular and musculoskeletal diseases for male patients, the resulting probabilities do not differ substantially from the initial ones.

Figure 3: Posterior distribution with a non-informative prior.

On the other hand, Figs. 4 and 5 illustrate a different posterior distribution for each setting. Specifically, the prior probability for GP was dominated by musculoskeletal diseases. Likewise, the data show that most male patients have cardiovascular, respiratory, or musculoskeletal diseases (i.e., their likelihoods are higher). After observing the data, the posterior probabilities for those three increased (most of all for the latter). It follows that male patients presenting with chest pain in GP have a vastly increased probability of suffering from a musculoskeletal disease. In contrast, gastrointestinal diseases are quite common in GP but affect female patients more frequently; thus, the probability of a male patient presenting with chest pain due to such conditions is generally low. The distribution for the patients in the emergency department (ED) also differs significantly. In particular, chest pain presentations in that healthcare setting are dominated by cardiovascular diseases (Fig. 5). For this reason, male patients - who are already at risk for cardiovascular conditions (i.e., high likelihood) - will very likely face such a diagnosis. In contrast, the increased likelihood of male patients having a musculoskeletal disease only slightly changes its posterior probability. In this case, the shape of the prior has a strong influence and largely drives the posterior distribution.

Figure 4: Posterior distribution with an informative prior (GP).
Figure 5: Posterior distribution with an informative prior (ED).

Figure 6 further visualizes the differences between the mentioned posterior distributions. We confirm the importance of both gender (male patient) and healthcare setting (GP, ED) when evaluating the posterior distribution. Cardiovascular diseases in the ED and musculoskeletal diseases in GP should be considered first when a male patient presents with chest pain. On the other hand, psychiatric and other miscellaneous causes of chest pain are fairly uncommon for male patients. Finally, we provide a direct comparison of the posterior distributions between both genders (Fig. 7). We chose to compare only the ones that involve a non-informative prior to better capture the concept of the likelihood shaping the posterior. We notice that female patients suffer more often from gastrointestinal, psychiatric, or other diseases when presenting with chest pain. We also confirm that the likelihood primarily drives each posterior distribution. In fact, comparing Fig. 2 and Fig. 7 makes this clear, as they demonstrate quite similar trends.

Figure 6: Posterior distributions for male patients presenting with chest pain.
Figure 7: Posterior distributions for male/female patients with a non-informative prior.

4 Discussion

To summarize our results, a non-informative prior gives equal probabilities for each hypothesis and allows the posterior to be shaped only by the likelihoods. We demonstrate this in two ways in our analysis. Regarding the two informative prior distributions, we also notice interesting results. In the first case (GP), we observe that male patients have increased probabilities for cardiovascular and musculoskeletal diseases. This is in agreement with both the increased prior and the increased likelihood for male patients. Thus, both the prior and the likelihood contribute to the outcome. On the other hand, the presentation in the emergency department (ED) provides a unique picture. Specifically, while the likelihood for the three mentioned diseases remains high, the resulting probabilities are significantly different only for the cardiovascular ones. The other two are just slightly increased. Therefore, the prior drives the posterior in a major manner in this healthcare setting. Furthermore, in Section 2.2 we asked the following questions:

  • We observe that the sum of all likelihoods isn’t equal to 1. Is this outcome expected?
  • Could we use the likelihood of each hypothesis divided by the sum of all likelihoods as the new likelihood? Does this ratio define something meaningful?

In fact, this outcome is expected. As we see in the relevant sheet of the Excel file (Fig. 8), the sum of all likelihoods (2.95 in our case) is not equal to 1, but the likelihoods for each condition separately add up to 1. These two observations actually answer our question. Specifically, when we sum the likelihoods across diagnoses, we are summing probabilities from different probability spaces; the result need not be 1 and could be less or greater. Formally, the likelihood P(D|Hᵢ) is not a distribution in Hᵢ, but only in D. In our case, where the data is the gender of the patient — male or not male (i.e., female) — summing across this axis gives the expected result of 1.
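A small numerical sketch of this point, with hypothetical likelihood values:

```python
# Summing P(Male | H_i) across diagnoses mixes different probability
# spaces, so the total need not be 1. Summing over the data axis
# (male + female) within one diagnosis always gives 1.
lik_male = {"cardiovascular": 0.7, "musculoskeletal": 0.6, "gastro": 0.4}
lik_female = {dx: 1 - p for dx, p in lik_male.items()}

print(sum(lik_male.values()))   # ≈ 1.7, not 1: a sum across hypotheses
for dx in lik_male:
    # Within each diagnosis, the likelihoods form a proper distribution:
    assert abs(lik_male[dx] + lik_female[dx] - 1) < 1e-9
```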

Figure 8: Likelihood examples taken from the Excel file.

Regarding the second question, there are two facts to note. First, if we divide the likelihood of each hypothesis by the sum of all likelihoods and use the result as the new likelihood, the posterior probabilities will be the same. The only difference is that the new likelihoods will add up to 1, as we performed a normalization. We can formally prove this as follows: let us denote by Lᵢ = P(D|Hᵢ) and Pᵢ = P(Hᵢ) the likelihood and prior probability under each hypothesis Hᵢ, respectively. In addition, we define S as the sum of all likelihoods and L’ᵢ = Lᵢ/S as the new likelihood. Then the new posterior probability for each hypothesis Hᵢ is equal to:

P(Hᵢ|D) = L’ᵢ · Pᵢ / [Σⱼ L’ⱼ · Pⱼ] = (Lᵢ/S) · Pᵢ / [Σⱼ (Lⱼ/S) · Pⱼ] = Lᵢ · Pᵢ / [Σⱼ Lⱼ · Pⱼ],

which is exactly the posterior obtained with the original likelihoods.

This answers the first part of the second question. Finally, the ratios L’ᵢ = Lᵢ/S also define something meaningful. In particular, they represent the posterior probability of each hypothesis when all the hypotheses have the same prior (i.e., when we use a non-informative prior). Indeed, if all the prior probabilities were equal to some constant c (Pᵢ = c), then the posterior for a hypothesis Hᵢ would be:

P(Hᵢ|D) = Lᵢ · c / [Σⱼ Lⱼ · c] = Lᵢ / [Σⱼ Lⱼ] = Lᵢ/S = L’ᵢ,

which is the new likelihood that we have defined. Actually, this further illustrates the fact that when we are using a non-informative prior distribution, the likelihood drives the posterior. If we are only making use of the data, this is precisely what we get.
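This invariance is easy to verify numerically; a sketch with hypothetical priors and likelihoods:

```python
# Scaling every likelihood by the same constant (here 1/S) leaves the
# posterior unchanged, since the constant cancels in the normalization.
priors = [0.5, 0.3, 0.2]
liks   = [0.7, 0.6, 0.4]   # hypothetical values

def posterior(priors, liks):
    unnorm = [p * l for p, l in zip(priors, liks)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

S = sum(liks)
scaled = [l / S for l in liks]   # the "new" likelihoods L'_i
for a, b in zip(posterior(priors, liks), posterior(priors, scaled)):
    assert abs(a - b) < 1e-12    # identical posteriors

# With equal (non-informative) priors, the posterior equals L'_i itself:
flat = posterior([1/3, 1/3, 1/3], liks)
for a, b in zip(flat, scaled):
    assert abs(a - b) < 1e-12
```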

5 Conclusion

In summary, this case study presents an overview of how Bayes’ theorem works and how it can be incorporated into medical diagnosis. Specifically, Bayes’ theorem provides an extended, systematic framework to approach complex problems which involve belief updates. Medicine is a domain that involves such problems and demands the integration of multiple factors before reaching the final decision. Therefore, understanding the concepts that underlie Bayesian inference, its benefits, and challenges will surely pave the way to its more accurate and widespread application.
Pattern recognition can perform well when patients fit the pattern. When they don’t, a more formal, systematic approach that is based on probability theory is needed. This is what Bayes’ theorem provides.

6 Supplementary Material

The PDF report and the accompanying Excel file are available on GitHub.
