Logic Study Note: propositional & FOL

The study note of Stanford CS221 course and some thoughts

Jesse Jing
Towards NeSy
9 min read · Sep 2, 2022


Introduction

The point of logic, from an AI perspective, is to represent and reason with knowledge about the world. Given the progress of large neural network models in the recent decade, we can teach models to generate pictures from vague instructions, write articles from a given prompt, or recognize objects. These achievements, though groundbreaking, are not trustworthy. We need now more than ever an explainable and deterministic answer (YES/NO/NOT SURE) if AI is to be massively deployed safely for the better productivity of the human race.

This is where logic kicks in and completes the puzzle of AI applications (in my opinion; some influential researchers also consider that logic will have a fundamental impact on future AI advances). This blog borrows heavily from the slides for Stanford's CS221 course taught by Percy Liang; check out the public video lectures here.

Logic as philosophical logic has a long history. Overall, only a small fraction of the discussion and study in that tradition has inspired logic in AI, and mostly at a very high level (rather than at the level of detailed implementation): for instance, as described in Kant and AI, repurposing Kant's a priori psychology as the architectural blueprint for a machine learning system. Beyond that, since researchers in the AI community emphasize theoretical analysis of algorithms and application-driven theories, we can focus merely on modeling knowledge using logical formulas and inference rules.

The discussion of this post is restricted to propositional logic and first-order logic only.

Ingredients of a Logic (for propositional logic)

Logic has its own language. As opposed to natural language (informal), it’s formal and thus reduces ambiguity.

Syntax

A set of valid formulas that defines which expressions are grammatical in this language.

From Percy’s slide

Propositional symbols, like (A, B, C, …), are the smallest units used in the logic paradigm. You can also use a word as a symbol, such as RAIN, to represent whether it is raining currently. Together with the logical connectives, they are by definition the only legal symbols. You can define any symbols for your own logic, such as 荆 (my surname) to replace ∧ as the conjunction. Without semantics, symbols cannot be interpreted in natural language and thus mean nothing.

Semantics

A set of models for each formula. Note that the "model" here refers to assignments or configurations (values) of the symbols in a formula, not to an inference structure with weights as in a neural network. We can say that semantics is about what the syntactically legal expressions mean.

If the value of each propositional symbol can be either 0 (meaning false) or 1 (meaning true), there are four possible assignments for the formula RAIN → WET, three of which are models. One of the models is 1 → 1, meaning that if it rains, then it's wet. Together with 0 → 0, these two models illustrate the intended relationship between RAIN and WET.

The interpretation function tells you the semantics without having to say it in natural language, which is super ambiguous. From Percy’s slide

You can understand the interpretation function this way: use a model w as a lookup table, replace the propositional symbols with their truth values, and answer the question: is the formula true?

We can collect all the satisfying models (configurations) into a set, and that set is called the World of the formula. Instead of having to enumerate and write down all the correct assignments of symbol values in a formula, we can just write down the formula itself to compactly represent all those possibilities. (Doesn't it already sound intelligent?)
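To make this concrete, here is a minimal Python sketch of enumerating the world of a formula by brute force. The `models` helper and the lambda encoding of formulas are my own, purely for illustration; a real system would parse formulas instead.

```python
from itertools import product

def models(symbols, formula):
    """Enumerate all assignments (models) that satisfy `formula`.

    `formula` is a Python function over a dict of truth values,
    standing in for a parsed logical formula.
    """
    world = []
    for values in product([False, True], repeat=len(symbols)):
        w = dict(zip(symbols, values))
        if formula(w):
            world.append(w)
    return world

# RAIN -> WET is equivalent to (not RAIN) or WET.
implies = lambda w: (not w["RAIN"]) or w["WET"]
world = models(["RAIN", "WET"], implies)
print(world)
# 3 of the 4 assignments satisfy the formula; only RAIN=1, WET=0 is excluded.
```

Writing down `RAIN → WET` is one line; its world is three configurations, and the gap only grows with more symbols.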

If we include more formulas in the set, we put more restrictions on the configurations that must satisfy all formulas, and thus narrow down the world. We call a set of formulas a Knowledge Base (KB). From Percy’s slide

Looking at two worlds (the satisfying configurations of a KB and of a formula f), we can derive three relationships: Entailment, Contradiction, and Contingency. If adding a formula f to a KB provides no new information, we say that the KB entails f: KB ⊨ f. Equivalently, the set of models that satisfy f is a superset of the set of models that satisfy KB. For contradiction, if the worlds represented by KB and formula f have no intersection, then KB ⊨ ¬f.
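These three relationships can be checked mechanically by comparing model sets. Below is a small illustrative sketch; the `satisfying` and `relation` helpers and the lambda encoding of formulas are my own, not from the course.

```python
from itertools import product

def satisfying(symbols, formulas):
    """Models (as frozensets of the true symbols) satisfying all formulas."""
    out = set()
    for values in product([False, True], repeat=len(symbols)):
        w = dict(zip(symbols, values))
        if all(f(w) for f in formulas):
            out.add(frozenset(s for s in symbols if w[s]))
    return out

def relation(symbols, kb, f):
    m_kb = satisfying(symbols, kb)
    m_f = satisfying(symbols, [f])
    if m_kb <= m_f:
        return "entailment"       # KB |= f: every KB model satisfies f
    if not (m_kb & m_f):
        return "contradiction"    # KB |= not f: no KB model satisfies f
    return "contingency"          # f adds genuinely new information

syms = ["RAIN", "WET"]
kb = [lambda w: w["RAIN"], lambda w: (not w["RAIN"]) or w["WET"]]
print(relation(syms, kb, lambda w: w["WET"]))      # entailment
print(relation(syms, kb, lambda w: not w["WET"]))  # contradiction
```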

Satisfiability ties these relationships together: asking whether a KB entails, contradicts, or is contingent with a formula f is the same as asking whether KB ∪ {f} and KB ∪ {¬f} are satisfiable.

From Percy’s slide

Inference Rules

Purely syntactic manipulations on formulas. Given a formula, what can we derive from it?

From Percy’s slide

Let’s take a breath and review what we have so far. We have defined the syntax (propositional symbols and logical connectives) to formally write down rigorous statements. And, instead of low-level configurations, we use formulas to compactly represent world knowledge. As the ancient Chinese philosopher LaoZi said,

The Tao produced One; One produced Two; Two produced Three; Three produced All things. All things leave behind them the Obscurity (out of which they have come), and go forward to embrace the Brightness (into which they have emerged), while they are harmonised by the Breath of Vacancy.

— TAO TE CHING translated by JAMES LEGGE

Just as from Tao we can derive all things in the world, we can derive more formulas, using inference rules, starting from a very compact Knowledge Base (though a KB could never be as powerful and abstract as Tao). Since we are merely deriving what is hidden inside the current KB, the KBs before and after derivation are equivalent.

A very famous inference rule is Modus Ponens: “P implies Q. P is true. Therefore Q must also be true.” In propositional logic syntax, the rule reads: P, P → Q ⊢ Q (⊢ reads as “proves”). Backed by this rule, the two KBs {P, P → Q} and {P, P → Q, Q} define the same world. Use RAIN and WET as an example again: since we know that when it rains the ground gets wet, “the ground is wet” goes without saying once we see rain falling.
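Repeatedly applying Modus Ponens to a KB is just forward chaining until a fixed point. A minimal sketch, with function names and the (premises, conclusion) rule encoding being my own:

```python
def modus_ponens_closure(facts, rules):
    """Forward chaining with Modus Ponens.

    `facts`: set of symbols known to be true.
    `rules`: list of (premises, conclusion) pairs encoding
             premise1 AND ... AND premiseN -> conclusion.
    Iterates until no new symbol can be derived.
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return derived

result = modus_ponens_closure({"RAIN"}, [(("RAIN",), "WET")])
print(result)  # derives WET: the closure is {'RAIN', 'WET'} (set order may vary)
```

Note that the closure equals the original KB plus everything hidden inside it, which is exactly why {P, P → Q} and {P, P → Q, Q} describe the same world.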

Desiderata for Inference Rules

The key thing about inference is that “entails” does NOT equal “proves” in most cases. If entailment exactly matches derivation, as in Percy Liang’s example, we get the truth, the whole truth, and nothing but the truth. In that case, we say the inference rule is sound and complete.

Soundness: nothing but the truth; all derived formulas can be entailed. From Percy’s slide
Completeness: whole truth; inference rules derive all entailed formulas. From Percy’s slide

Quick takeaway: Modus Ponens is sound but incomplete, as shown in the following example. When we enumerate all possible worlds and check whether the WET symbol must be true given that RAIN is true, we find satisfying configurations in every case (thus the entailment stands). But since by definition the Modus Ponens rule cannot reason over the disjunction connective ∨, WET cannot be derived syntactically.

If we restrict the usage of ∨ in syntax, then the rule is complete. From Percy’s slide
In this example, the syntax is restricted (can not use the disjunction connective in any formula), then the entailment equals derivation. Note that the Modus Ponens here is slightly generalized. From Percy’s slide

Review some of the concepts before moving on

Let’s review the previous concepts a little bit. Logic consists of three ingredients: syntax, semantics, and inference rules.

Syntax defines a set of valid formulas, including what symbols can be used in the formulas.

Semantics specifies a set of models for each formula, covering all the assignments/configurations of possible worlds. For each atom in a formula, the semantics gives a set of possible values (usually binary, referring to True or False), and M(f) is the set containing only the worlds that make the formula f True.

Inference rules allow you to derive new formulas given a Knowledge Base (KB), which is a set of formulas. We can iteratively apply inference rules to the KB until no new formulas can be derived. One of the inference rules mentioned above is Modus Ponens.

Entailment definition: given a formula f, if M(f) is a superset of M(KB), we say that KB entails f. Derivation definition: if iteratively applying inference rules to KB gives us f, then KB derives f.

Soundness: KB entails all the derived formulas. Completeness: KB derives all the entailed formulas. Note that to make Modus Ponens complete (it is already sound) on propositional formulas, we can restrict formulas to Horn clauses. If we don’t want to restrict the expressive power of propositional logic, we can instead apply a new inference rule: resolution. (Inference time is a concern, though: resolution takes exponential time while Modus Ponens takes linear time.)

Rewrite Modus Ponens using the disjunction connective

Resolution Inference Rule

Let’s see how we check whether an inference rule is sound. Remember that if the KB entails all derived formulas, the rule is sound.
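One way to check soundness mechanically is to enumerate every truth assignment and verify that whenever all premises hold, the conclusion holds too. A small sketch (the `sound` helper and lambda encoding are my own), applied to the resolution rule:

```python
from itertools import product

def sound(premises, conclusion, symbols):
    """A rule is sound iff every assignment satisfying all premises
    also satisfies the conclusion, i.e. the premises entail it."""
    for values in product([False, True], repeat=len(symbols)):
        w = dict(zip(symbols, values))
        if all(p(w) for p in premises) and not conclusion(w):
            return False  # counterexample: premises true, conclusion false
    return True

syms = ["A", "B", "C"]
# Resolution: from (A or B) and (not B or C), derive (A or C).
print(sound([lambda w: w["A"] or w["B"], lambda w: not w["B"] or w["C"]],
            lambda w: w["A"] or w["C"], syms))   # True -> sound
# An unsound rule for contrast: from (A or B), derive A.
print(sound([lambda w: w["A"] or w["B"]], lambda w: w["A"], syms))  # False
```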

We also need to generalize the resolution inference rule from clauses to all propositional formulas. To do that, we introduce conjunctive normal form (CNF), which is a conjunction of clauses. We can convert any propositional formula f into CNF with the six conversion rules shown below.
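As a rough illustration of those rewrite steps, here is a toy converter. The nested-tuple encoding of formulas is my own, and it only covers the core rules (eliminate →, push ¬ inward via De Morgan, distribute ∨ over ∧), without the clause simplifications a real converter would add.

```python
def to_cnf(f):
    """Convert a propositional formula to CNF.

    Formulas are nested tuples, e.g.
    ("implies", ("atom", "RAIN"), ("atom", "WET")).
    """
    op = f[0]
    if op == "atom":
        return f
    if op == "implies":                      # p -> q  becomes  ~p | q
        return to_cnf(("or", ("not", f[1]), f[2]))
    if op == "not":
        g = f[1]
        if g[0] == "atom":
            return f                          # literal: nothing to do
        if g[0] == "not":                     # double negation
            return to_cnf(g[1])
        if g[0] == "and":                     # De Morgan
            return to_cnf(("or", ("not", g[1]), ("not", g[2])))
        if g[0] == "or":
            return to_cnf(("and", ("not", g[1]), ("not", g[2])))
        if g[0] == "implies":                 # eliminate ->, then negate
            return to_cnf(("not", to_cnf(g)))
    if op == "and":
        return ("and", to_cnf(f[1]), to_cnf(f[2]))
    if op == "or":
        a, b = to_cnf(f[1]), to_cnf(f[2])
        if a[0] == "and":                     # distribute or over and
            return to_cnf(("and", ("or", a[1], b), ("or", a[2], b)))
        if b[0] == "and":
            return to_cnf(("and", ("or", a, b[1]), ("or", a, b[2])))
        return ("or", a, b)

print(to_cnf(("implies", ("atom", "RAIN"), ("atom", "WET"))))
# ('or', ('not', ('atom', 'RAIN')), ('atom', 'WET'))
```

The distribution step is where the exponential blow-up mentioned earlier comes from: each ∨ over ∧ doubles the number of clauses in the worst case.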

Resolution Algorithm

We will use the relationship between entailment and contradiction, AKA “proof by contradiction”: add the negation of f to the KB, convert all formulas into CNF, repeatedly apply the resolution rule, and see whether we can derive the empty clause (a contradiction).
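That loop can be sketched as follows. The clause encoding (a frozenset of (symbol, polarity) literals) is my own, and this is the naive version without the subsumption checks or indexing a practical prover would use.

```python
def resolve(ci, cj):
    """All resolvents of two clauses; a clause is a frozenset of
    (symbol, polarity) literals, polarity False meaning negated."""
    out = []
    for sym, pol in ci:
        if (sym, not pol) in cj:  # complementary pair: resolve on sym
            out.append((ci - {(sym, pol)}) | (cj - {(sym, not pol)}))
    return out

def kb_entails(kb_clauses, negated_query_clauses):
    """Proof by contradiction: add the negated query to the KB (all in
    CNF) and resolve until the empty clause appears or nothing is new."""
    clauses = set(map(frozenset, kb_clauses + negated_query_clauses))
    while True:
        new = set()
        for ci in clauses:
            for cj in clauses:
                if ci == cj:
                    continue
                for r in resolve(ci, cj):
                    if not r:
                        return True   # empty clause: contradiction found
                    new.add(r)
        if new <= clauses:
            return False              # fixed point, no contradiction
        clauses |= new

# KB: RAIN and RAIN -> WET (clause {~RAIN, WET}); query: WET.
kb = [{("RAIN", True)}, {("RAIN", False), ("WET", True)}]
print(kb_entails(kb, [{("WET", False)}]))  # True: KB |= WET
```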

First-Order Logic (FOL)

So what’s wrong with propositional logic? The major issue is that it is less expressive than we need it to be: it cannot capture the internal structure of a statement.

Recall that an atomic proposition is a statement that must be True or False. Consider two atomic propositions: AliceKnowsArithmetic and AllStudentsKnowArithmetic. Parsing these names with our natural language knowledge, we would expect the latter to entail the former, since “all students know arithmetic” is really a conjunction of many statements like (Alice/Bob/Calvin/Derick/…)KnowsArithmetic.

But in propositional logic this entailment does not hold. Symbols mean nothing to each other unless we enumerate all students and explicitly write down implications such as AllStudentsKnowArithmetic → AliceKnowsArithmetic.

In FOL, we can capture the internal structure of a natural sentence and write down a more structured logical statement with the help of Objects/Predicates/Quantifiers/Variables.

For example, in FOL, AllStudentsKnowArithmetic can be written as ∀x (Student(x) → Knows(x, arithmetic)).

Here, Knows and Student are predicates, x is a variable, and arithmetic is a constant symbol; both constant symbols and variables are called terms below.
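To preview propositionalization: over a finite, known domain, this universally quantified formula expands into one propositional implication per object. A toy sketch (the helper name and string output are purely illustrative):

```python
def propositionalize(domain):
    """Ground the FOL formula  forall x: Student(x) -> Knows(x, arithmetic)
    over a finite domain, producing one propositional implication per
    object. Strings stand in for propositional symbols."""
    return [f"Student({o}) -> Knows({o}, arithmetic)" for o in domain]

for clause in propositionalize(["alice", "bob", "calvin"]):
    print(clause)
# Student(alice) -> Knows(alice, arithmetic)
# Student(bob) -> Knows(bob, arithmetic)
# Student(calvin) -> Knows(calvin, arithmetic)
```

This is exactly the enumeration we had to write by hand in propositional logic; the quantifier lets FOL state it once, for any domain.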

Syntax of FOL

Models in FOL

To be continued: propositionalization, unification, …

Shout out to Percy Liang and LaoZi :)

