Uncertain Knowledge and Reasoning

Heena Rijhwani · Published in The Startup · Jan 24, 2021

In real life, an agent cannot always determine the exact state of its environment. In partially observable or non-deterministic environments, agents must handle uncertainty, which comes in several forms:

Uncertain data: data that is missing, unreliable, inconsistent, or noisy.

Uncertain knowledge: knowledge in which multiple causes lead to multiple effects, or in which the causality of the domain is only incompletely known.

Uncertain knowledge representation: a representation that provides only a restricted model of the real system, or that has limited expressiveness.

Uncertain inference: with incomplete knowledge or default reasoning methods, the conclusions drawn may not be completely accurate. Let's understand this better with the help of an example rule:

IF the primary infection is bacteremia

AND the site of infection is sterile

AND the entry point is the gastrointestinal tract

THEN the organism is bacteroid (0.7).

The 0.7 here is a degree of belief in the conclusion, not a certainty. In such uncertain situations, the agent cannot guarantee a solution; it acts on its own assumptions and probabilities, and attaches some degree of belief to the conclusion it reaches.

For example, in medical diagnosis, consider the rule Toothache → Cavity. This rule is not correct, since not all patients with a toothache have cavities. We could write the more general rule Toothache → Cavity ∨ Gum problems ∨ Abscess ∨ …, but to make the rule complete we would have to list all possible causes of a toothache. That is not feasible, due to:

Laziness- It takes too much work to list the complete set of antecedents and consequents needed to make the rules exceptionless.

Theoretical ignorance- Medical science does not have a complete theory for the domain.

Practical ignorance- Even if all the rules were known, not all the necessary tests have been, or can be, conducted on a given patient.

Such uncertain situations can be dealt with using:

  • Probability theory
  • Truth maintenance systems
  • Fuzzy logic

Probability

Probability is the degree of likelihood that an event will occur. It provides a degree of belief to act on in uncertain situations. A probability function is defined over a set of events U and assigns each event e a value P(e) in the range [0, 1]. Each sentence is thus labeled with a real number between 0 and 1, where 0 means the sentence is certainly false and 1 means it is certainly true.

Conditional probability (or posterior probability) is the probability of event A given that event B has already occurred. It is defined in terms of the joint probability of the two events:

P(A|B) = P(A^B) / P(B)

For example, P(It will rain tomorrow | It is raining today) is the conditional probability that it will rain tomorrow, given that it is raining today.

P(A|B) + P(NOT(A)|B) = 1

Joint probability is the probability of two events occurring together, such as rolling two dice or tossing two coins. When the events are independent, it is simply the product of their individual probabilities: the probability of getting a 2 on the first die and a 6 on the second is 1/6 × 1/6 = 1/36. Joint probability is widely used in fields such as physics and astronomy. The full joint probability distribution specifies the probability of each complete assignment of values to the random variables.
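To make these definitions concrete, here is a minimal sketch in plain Python (no external libraries) that enumerates the full joint distribution of two fair dice and reads joint and conditional probabilities directly off it using the definitions above:

```python
from itertools import product

# Full joint distribution over two fair dice: each of the 36 ordered
# outcomes (d1, d2) is equally likely.
joint = {(d1, d2): 1 / 36 for d1, d2 in product(range(1, 7), repeat=2)}

# Joint probability of two independent events:
# P(first die = 2 AND second die = 6) = 1/6 * 1/6 = 1/36
print(joint[(2, 6)])                                   # 0.0277...

# Conditional probability read off the joint: P(A|B) = P(A^B) / P(B)
# A: the two dice sum to 8, B: the first die shows 2.
p_b = sum(p for (d1, _), p in joint.items() if d1 == 2)
p_a_and_b = sum(p for (d1, d2), p in joint.items()
                if d1 == 2 and d1 + d2 == 8)
print(p_a_and_b / p_b)                                 # 1/6 ≈ 0.167

# Sanity check of the identity above: P(A|B) + P(NOT(A)|B) = 1
p_nota_and_b = sum(p for (d1, d2), p in joint.items()
                   if d1 == 2 and d1 + d2 != 8)
print(p_a_and_b / p_b + p_nota_and_b / p_b)            # 1.0
```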

Bayes' Theorem

Bayes' theorem lets us compute a conditional probability from its inverse. When it is used for classification (as in the Naive Bayes classifier), every pair of features is additionally assumed to be independent of the others. It calculates the probability P(A|B), where A is a class of possible outcomes and B is the given instance to be classified.

P(A|B) = P(B|A) * P(A) / P(B)

P(A|B) = probability of class A given instance B (the posterior probability)

P(A) = prior probability of the class

P(B) = prior probability of the predictor (the evidence)

P(B|A) = likelihood of the instance given the class

Consider the following data. Depending on the weather (Sunny, Rainy, or Overcast), the children will play (Yes) or not play (No).

Here, the total number of observations = 14, of which 9 are Yes and 5 are No. Five of the days are Sunny, and on 3 of those the children play.

Probability that the children will play, given that the weather is sunny:

P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)

= (3/9 * 9/14) / (5/14)

≈ 0.33 * 0.64 / 0.36

= 0.60
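The same calculation can be reproduced in a few lines of Python. One caveat: the Sunny counts and the Yes/No totals come from the figures used above, but the Overcast/Rainy breakdown below is an assumption consistent with those totals, not taken from the original table:

```python
# A minimal sketch of the Bayes-rule calculation above, from raw counts.
# The Overcast/Rainy split (4/0 and 2/3) is an assumed breakdown.
counts = {
    ("Sunny", "Yes"): 3, ("Sunny", "No"): 2,
    ("Overcast", "Yes"): 4, ("Overcast", "No"): 0,
    ("Rainy", "Yes"): 2, ("Rainy", "No"): 3,
}
total = sum(counts.values())                       # 14 observations

def p_play_given_weather(weather, play="Yes"):
    """P(play | weather) = P(weather | play) * P(play) / P(weather)."""
    n_play = sum(c for (w, p), c in counts.items() if p == play)
    n_weather = sum(c for (w, p), c in counts.items() if w == weather)
    likelihood = counts[(weather, play)] / n_play  # P(weather | play)
    prior = n_play / total                         # P(play)
    evidence = n_weather / total                   # P(weather)
    return likelihood * prior / evidence

print(p_play_given_weather("Sunny"))               # 0.6
```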

Bayesian Belief Networks

A Bayesian belief network is a probabilistic graphical model for representing knowledge about an uncertain domain and reasoning under uncertainty. Its nodes represent random variables and its arcs represent direct causal connections between them. Conditional dependencies between the random variables are thus captured by a Directed Acyclic Graph (DAG). A belief network consists of:

1. A DAG with nodes labeled by variable names,

2. A domain for each random variable,

3. A set of conditional probability tables (CPTs), one for each variable given its parents; for a node with no parents, the table is simply its prior probability.

Let’s have a look at the steps followed to construct one.

1. Nodes- Identify the random variables and the possible values they can take. A node can be Boolean (True/False), or take ordered or integer values.

2. Structure- Represents the causal relationships between the variables. Two nodes are connected if one affects or causes the other, with the arc pointing towards the effect. For instance, if it is windy or cloudy, it rains: there is a direct arc from Windy and from Cloudy to Rains. Similarly, there are arcs from Rains to Wet grass and to Leave, i.e., if it rains, the grass gets wet and a leave is taken from work.

3. Probability- Quantifying the relationships between the nodes.

Conditional probability:

P(A^B) = P(A|B) * P(B)

P(A|B) = P(A^B) / P(B)

P(B|A) = P(A|B) * P(B) / P(A)

Joint probability: the full joint distribution of a belief network factorizes as the product of each variable conditioned on its parents:

P(X1, …, Xn) = P(X1 | Parents(X1)) * … * P(Xn | Parents(Xn))

4. Markov property- Bayesian belief networks require the assumption of the Markov property: all direct dependencies must be shown by arcs. Here there is no direct connection between it being Cloudy and a Leave being taken, but there is an indirect one via Rains. Belief networks with the Markov property are also called independence maps.
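As a sketch of how such a network can be written down in code: the structure lives in which variables each CPT conditions on, and the chain rule from step 3 gives the full joint distribution. All the CPT numbers below are made-up illustrative values, not taken from the article:

```python
# Illustrative network: Windy, Cloudy -> Rains -> Wet grass, Leave.
priors = {"Windy": 0.3, "Cloudy": 0.4}             # parentless nodes get priors

# CPT for Rains given its parents (Windy, Cloudy) -- assumed values
p_rains = {(True, True): 0.9, (True, False): 0.6,
           (False, True): 0.7, (False, False): 0.1}

# CPTs for the two effects of Rains -- assumed values
p_wet = {True: 0.9, False: 0.05}                   # P(WetGrass | Rains)
p_leave = {True: 0.4, False: 0.05}                 # P(Leave | Rains)

def joint(windy, cloudy, rains, wet, leave):
    """Chain rule for belief networks:
    P(W, C, R, G, L) = P(W) * P(C) * P(R|W,C) * P(G|R) * P(L|R)."""
    def p(prob, value):
        return prob if value else 1 - prob
    return (p(priors["Windy"], windy) *
            p(priors["Cloudy"], cloudy) *
            p(p_rains[(windy, cloudy)], rains) *
            p(p_wet[rains], wet) *
            p(p_leave[rains], leave))

# e.g. probability it is windy, not cloudy, raining, grass wet, no leave taken
print(joint(True, False, True, True, False))       # 0.3*0.6*0.6*0.9*0.6 = 0.05832
```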

Inference in Belief Networks

Bayesian networks provide a complete representation of the probability distribution over their variables. They can be conditioned on any subset of those variables, supporting reasoning in any direction.

For example, one can perform diagnostic reasoning, i.e., reasoning from effects to causes, which runs in the opposite direction to the network arcs. If the grass is found wet, one infers that it has probably rained, and in turn that it may have been cloudy or windy.

One can also perform predictive reasoning, i.e., reasoning from new information about causes to new beliefs about effects, following the direction of the arcs. If it rains, one updates one's belief that the grass is wet and that a leave is taken from work.

A third form involves reasoning about the mutual causes of a common effect; this is called intercausal reasoning. The two possible causes of an effect are represented in the form of a ‘V’: the common effect Rains can be caused by two reasons, Windy and Cloudy. Initially the two causes are independent of each other, but if it rains, the probability of both causes increases. Now assume we also learn that it was windy. This information explains the rainfall and therefore lowers the probability that it was cloudy; the windy evidence ‘explains away’ the cloudy cause.
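A small inference-by-enumeration sketch over the same made-up network shows all three directions of reasoning, including the explaining-away effect (again, every CPT value here is an illustrative assumption):

```python
from itertools import product

# Same illustrative network: Windy, Cloudy -> Rains -> WetGrass, Leave.
priors = {"W": 0.3, "C": 0.4}                      # P(Windy), P(Cloudy)
p_rains = {(True, True): 0.9, (True, False): 0.6,  # P(Rains | Windy, Cloudy)
           (False, True): 0.7, (False, False): 0.1}
p_wet = {True: 0.9, False: 0.05}                   # P(WetGrass | Rains)
p_leave = {True: 0.4, False: 0.05}                 # P(Leave | Rains)

def joint(w, c, r, g, l):
    """Chain rule: P(W,C,R,G,L) = P(W) P(C) P(R|W,C) P(G|R) P(L|R)."""
    p = lambda prob, v: prob if v else 1 - prob
    return (p(priors["W"], w) * p(priors["C"], c) *
            p(p_rains[(w, c)], r) * p(p_wet[r], g) * p(p_leave[r], l))

def query(target, evidence):
    """P(target = True | evidence), by summing the joint over all worlds."""
    num = den = 0.0
    for w, c, r, g, l in product([True, False], repeat=5):
        world = {"W": w, "C": c, "R": r, "G": g, "L": l}
        if any(world[k] != v for k, v in evidence.items()):
            continue
        pr = joint(w, c, r, g, l)
        den += pr
        if world[target]:
            num += pr
    return num / den

print(query("G", {"R": True}))             # predictive: P(WetGrass | Rains) = 0.9
print(query("R", {"G": True}))             # diagnostic: P(Rains | WetGrass) ≈ 0.94
print(query("C", {"R": True}))             # rain raises P(Cloudy) from 0.4 to ≈ 0.67
print(query("C", {"R": True, "W": True}))  # knowing it was windy lowers it to 0.5
```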
