Verifying Binarized Neural Networks By Angluin-Style Learning
Use knowledge compilation for tractable reasoning
This blog introduces the paper: Verifying Binarized Neural Networks By Angluin-Style Learning. Feel free to check it out directly here.
Introduction
The one-sentence summary for this paper is: use an Angluin-style learning algorithm to compile a binarized neural network into a tractable representation, and use that representation for detecting adversarial attacks.
What can be achieved with this method?
- You can reason about counterexamples. With the proposed method, you can efficiently count the counterexamples, compute their probability, enumerate a subset of them, and identify characteristics they share in a particular region of an image.
- With the tractable language OBDD, you can answer the Prime-Implicant query in polytime. It returns a subset of inputs that, if fixed, guarantees that the neural network's output stays the same even if we vary the unfixed inputs.
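To see why counting counterexamples is tractable on an OBDD, here is a minimal model counter over a hypothetical node structure (the `Node` class and all names are my own, not from the paper). Counting is linear in the size of the diagram, and the probability of a counterexample under uniformly random inputs is just the count divided by 2^n:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    """Hypothetical minimal OBDD node; leaves are plain booleans.
    Variables are tested in increasing index order along every path."""
    var: int     # variable index tested at this node
    lo: object   # child followed when var = 0
    hi: object   # child followed when var = 1

def model_count(node, var, n_vars):
    """Count satisfying assignments over variables var..n_vars-1."""
    if isinstance(node, bool):
        return 2 ** (n_vars - var) if node else 0
    skipped = 2 ** (node.var - var)   # skipped variables are free
    return skipped * (model_count(node.lo, node.var + 1, n_vars)
                      + model_count(node.hi, node.var + 1, n_vars))

# OBDD for x0 AND x1: exactly one of the four assignments satisfies it.
f = Node(0, False, Node(1, False, True))
print(model_count(f, 0, 2))  # 1
```

The same single pass, with counts replaced by enumeration, yields the satisfying assignments themselves.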
Preliminary
The technique of translating one language into an equivalent but more tractable language is called Knowledge Compilation.
Knowledge compilation & OBDD
There are three aspects of any knowledge compilation approach:
- The succinctness of the target language into which the propositional theory is compiled;
- The class of queries that can be answered in polytime based on the compiled representation;
- The class of transformations that can be applied to the representation in polytime.
Through these three aspects, we can see that knowledge compilation is, in a sense, about "translating" your problem into a different form so that it can be handled efficiently. The source language is called the representation language, and the target language is called the target compilation language.
NNF (Negation Normal Form) language class: a class of languages that is richer and nested, based on representing propositional sentences using directed acyclic graphs. The formal definition is as follows,
There are several properties that an instance of NNF can have, shown in the table below,
The BDD language is a subset of NNF in which the root of each sentence is a decision node. The BDD language corresponds to binary decision diagrams (BDDs), which are well known in the formal verification literature. Check out the example below,
An FBDD is usually defined as a BDD that satisfies the read-once property: on each path from the root to a leaf, a variable can appear at most once. Imposing the read-once property on a BDD is equivalent to imposing the decomposability property on its corresponding NNF sentence.
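These structural properties are easy to check mechanically. A minimal sketch, with a hypothetical node representation of my own: internal nodes are `(var, lo, hi)` tuples and leaves are booleans.

```python
def is_ordered(node, last_var=-1):
    """Check that every root-to-leaf path tests variables in strictly
    increasing index order; this also implies the read-once property.
    Nodes are hypothetical (var, lo, hi) tuples; leaves are booleans."""
    if isinstance(node, bool):
        return True
    var, lo, hi = node
    return var > last_var and is_ordered(lo, var) and is_ordered(hi, var)

print(is_ordered((0, False, (1, False, True))))   # True: x0 before x1
print(is_ordered((1, False, (0, False, True))))   # False: order violated
```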
Finally, let's get to the definition of OBDD, shown in the following picture: an OBDD is an FBDD in which variables appear in the same fixed order on every path from the root.
An example of ProbLog inference
Taking the famous burglary & earthquake problem as an example, the DeepProbLog paper shows how and why a program is compiled into a target compilation language for efficient querying.
The SDD is just another such language, and every OBDD is an SDD. Quoting the abstract of the paper by Prof. Adnan Darwiche:
We identify a new representation of propositional knowledge bases, the Sentential Decision Diagram (SDD), which is interesting for a number of reasons. First, it is canonical in the presence of additional properties that resemble reduction rules of OBDDs. Second, SDDs can be combined using any Boolean operator in polytime. Third, CNFs with n variables and treewidth w have canonical SDDs of size O(n*2^w), which is tighter than the bound on OBDDs based on pathwidth. Finally, every OBDD is an SDD. Hence, working with the latter does not preclude the former.
Conjunctive Normal Form (CNF)
CNF is a language of purely propositional logic and the predominant input language of modern SAT solvers. It is a conjunction of one or more clauses, where a clause is a disjunction of literals.
Every propositional formula can be converted into CNF, as shown below,
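As a tiny self-check (my own example, not from the post): XOR is not in CNF as written, but it has an equivalent CNF form, and a brute-force truth-table comparison confirms the conversion.

```python
from itertools import product

def xor(a, b):
    # a XOR b, written directly
    return a != b

def xor_cnf(a, b):
    # Equivalent CNF form: (a OR b) AND (NOT a OR NOT b)
    return (a or b) and (not a or not b)

# Check the two agree on every assignment.
print(all(xor(a, b) == xor_cnf(a, b)
          for a, b in product([False, True], repeat=2)))  # True
```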
Binarized Neural Network
The most common BNN is described in the paper by Courbariaux, Hubara, Soudry, El-Yaniv, and Bengio. To summarize: they use a binarization function to convert weights and/or activations from regular-precision numbers into either +1 or -1.
The binarization function can be either deterministic or stochastic as shown below,
On top of this, they use the straight-through estimator to address the zero-gradient issue caused by the deterministic binarization function: the gradient is passed through unchanged where the pre-binarization input lies in [-1, +1] and cancelled elsewhere, i.e., the derivative of hard tanh.
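A minimal NumPy sketch of the deterministic binarization and the straight-through estimator (function names are mine; shapes are illustrative):

```python
import numpy as np

def binarize(x):
    """Deterministic binarization: sign of x, mapped to {-1, +1}."""
    return np.where(x >= 0, 1.0, -1.0)

def ste_backward(x, grad_out):
    """Straight-through estimator: pass the gradient through where
    |x| <= 1, cancel it elsewhere (the hard-tanh derivative)."""
    return grad_out * (np.abs(x) <= 1.0)

x = np.array([-2.0, -0.5, 0.3, 1.7])
print(binarize(x))                  # [-1. -1.  1.  1.]
print(ste_backward(x, np.ones(4)))  # [0. 1. 1. 0.]
```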
A BNN in general consists of internal blocks followed by an output block. Each internal block applies LIN (a regular linear transformation), BN (batch normalization), and BIN (the binarization function mentioned above). The output block consists of LIN and an argmax to select the highest-scoring class for a classification task.
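The two block types can be sketched in NumPy as follows (the batch-normalization parameters and the tiny weights here are illustrative placeholders of my own, not from the paper):

```python
import numpy as np

def internal_block(x, W, gamma, beta, mu, sigma):
    """One internal block: LIN -> BN -> BIN. Inputs and weights are
    in {-1, +1}; BN parameters are illustrative placeholders."""
    z = W @ x                              # LIN: linear transformation
    z = gamma * (z - mu) / sigma + beta    # BN: batch normalization
    return np.where(z >= 0, 1.0, -1.0)     # BIN: binarize to {-1, +1}

def output_block(h, W):
    """Output block: LIN followed by argmax over classes."""
    return int(np.argmax(W @ h))

x = np.array([1.0, -1.0, 1.0])
W1 = np.array([[1.0, -1.0, 1.0], [-1.0, 1.0, 1.0]])
h = internal_block(x, W1, gamma=1.0, beta=0.0, mu=0.0, sigma=1.0)
print(h)                                                      # [ 1. -1.]
print(output_block(h, np.array([[1.0, 1.0], [1.0, -1.0]])))   # 1
```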
Angluin-Style Learning
Angluin-style algorithms leverage a pair of oracles that answer two types of queries:
- Membership query: is string t a member of the unknown language?
- Conjecture (equivalence query): is the regular set S equal to the unknown language? If not, the answer comes with a counterexample string t.
It was originally used to learn Deterministic Finite Automata (DFAs). A subset of OBDDs, called complete OBDDs, are also DFAs, so we can learn OBDDs using the same queries.
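The oracle pair can be sketched as below, with a small Boolean function standing in for the BNN. All names are my own; in the paper, the equivalence oracle is implemented with a SAT solver rather than by brute force.

```python
from itertools import product

def make_oracles(f, n):
    """Build the two oracles for an unknown Boolean function f on n vars."""
    def membership(assignment):
        # MQ: does the unknown function accept this assignment?
        return f(assignment)

    def equivalence(hypothesis):
        # EQ: None if the hypothesis matches f everywhere,
        # otherwise a counterexample assignment.
        for bits in product([False, True], repeat=n):
            if hypothesis(bits) != f(bits):
                return bits
        return None

    return membership, equivalence

secret = lambda b: b[0] and not b[1]
member, equiv = make_oracles(secret, 2)
print(member((True, False)))   # True
print(equiv(lambda b: b[0]))   # (True, True) -- a counterexample
print(equiv(secret))           # None: hypothesis is correct
```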
Main Algorithm
BNN to CNF
Given a BNN, we encode it into CNF as shown below.
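The key observation behind such encodings is that a binarized neuron is a counting (cardinality) constraint: with {-1, +1} inputs and weights, w . x equals 2*agree - n, where agree counts the positions with x_i == w_i. A small self-check of that reduction (my own illustration; the CNF encoding of the resulting cardinality constraint, e.g. via sequential counters, is covered in the referenced papers):

```python
import math
from itertools import product
import numpy as np

def neuron_fires(w, b, x):
    """Binarized neuron: sign(w . x + b) >= 0, computed directly."""
    return float(np.dot(w, x)) + b >= 0

def as_cardinality(w, b, x):
    """Same neuron as a cardinality constraint on agreeing positions."""
    n = len(w)
    agree = int(np.sum(np.array(w) == np.array(x)))
    return agree >= math.ceil((n - b) / 2)

# The two formulations agree on every {-1, +1} input vector.
w, b = (1, -1, 1), 0
print(all(neuron_fires(w, b, x) == as_cardinality(w, b, x)
          for x in product([-1, 1], repeat=3)))  # True
```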
OBDD to CNF
Given an OBDD, we encode it into CNF as follows:
1. For each node in the OBDD, introduce an auxiliary variable R for the node's local function: R = (C0 ∧ ¬X) ∨ (C1 ∧ X);
2. Encode this equivalence with the following set of clauses: ¬R ∨ C0 ∨ X; ¬R ∨ C1 ∨ ¬X; ¬R ∨ C0 ∨ C1; R ∨ ¬C0 ∨ X; R ∨ ¬C1 ∨ ¬X.
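The per-node clauses can be emitted mechanically. A minimal sketch using DIMACS-style signed-integer literals (the concrete variable numbering in the example is my own):

```python
def node_clauses(r, c0, c1, x):
    """Tseitin-style clauses for R <-> (C0 AND NOT X) OR (C1 AND X).
    Literals are signed ints in DIMACS style: -v means NOT v."""
    return [
        [-r, c0, x],     # R AND NOT X  implies C0
        [-r, c1, -x],    # R AND X      implies C1
        [-r, c0, c1],    # R implies C0 OR C1
        [r, -c0, x],     # C0 AND NOT X implies R
        [r, -c1, -x],    # C1 AND X     implies R
    ]

# Example numbering: x=1, C0=2, C1=3, R=4.
print(node_clauses(r=4, c0=2, c1=3, x=1))
```

Emitting these five clauses for every node, plus a unit clause asserting the root variable, yields a CNF equisatisfiable with the OBDD's function.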
Everything Comes Together
In the picture below, line 4 constructs an inequivalence check in CNF. This CNF sentence is given to a SAT solver, which then serves as the equivalence oracle in the Angluin-style learning algorithm.
The end result of the learning algorithm is an OBDD that computes exactly the same function as the BNN. With the OBDD, we can efficiently answer Prime-Implicant queries, each returning a subset of inputs that, if fixed, guarantees that the network's output stays the same even if we vary the unfixed inputs.
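For intuition only, here is a brute-force (exponential) sketch of that query on a plain Boolean function; the point of compiling to an OBDD is that the same question is answered in polytime on the diagram. All names are my own:

```python
from itertools import product

def shrink_to_prime_implicant(f, assignment):
    """Shrink a satisfying assignment to a prime implicant by dropping
    fixed inputs whose value no longer matters (brute-force check)."""
    n = len(assignment)
    fixed = dict(enumerate(assignment))
    for v in list(fixed):
        trial = {k: b for k, b in fixed.items() if k != v}
        free = [i for i in range(n) if i not in trial]
        # Drop v only if f stays True under every completion.
        if all(f(tuple(trial[i] if i in trial else bits[free.index(i)]
                       for i in range(n)))
               for bits in product([False, True], repeat=len(free))):
            fixed = trial
    return fixed

f = lambda b: b[0] or (b[1] and b[2])
# Fixing inputs 1 and 2 to True forces f to True, whatever input 0 is.
print(shrink_to_prime_implicant(f, (True, True, True)))  # {1: True, 2: True}
```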
For example, in a binary classification task distinguishing the digits 0 and 8, we can detect counterexamples of the learned classifier using Prime-Implicant queries.
Reference
- A Knowledge Compilation Map https://arxiv.org/pdf/1106.1819.pdf
- SDD: A New Canonical Representation of Propositional Knowledge Bases https://www.ijcai.org/Proceedings/11/Papers/143.pdf
- DeepProbLog: Neural Probabilistic Logic Programming https://arxiv.org/pdf/1805.10872.pdf
- Verifying Binarized Neural Networks By Angluin-Style Learning https://cs.stanford.edu/~andyshih/assets/pdf/SDCsat19.pdf
- Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1 https://arxiv.org/pdf/1602.02830.pdf
- Towards an Optimal CNF Encoding of Boolean Cardinality Constraints https://www.carstensinz.de/papers/CP-2005.pdf
- Conjunctive Normal Form, Wikipedia https://en.wikipedia.org/wiki/Conjunctive_normal_form
- Michael J. Kearns, Umesh Vazirani, An Introduction to Computational Learning Theory, The MIT Press (1994)