Soundness and Completeness

Given a technique/system T that analyzes a program/input P and says either P has property X or P does not have property X,

  • T is sound: if T says P has property X, then P does indeed have property X. 
    In other words, T can be trusted when it says P has property X. T cannot be trusted when T says P does not have property X.
    Saying it another way, T is truthful when it says P has property X. T may be untruthful when it says P does not have property X.
  • T is complete: if P has property X, then T will say P has property X. 
    In other words, T cannot be trusted when it says P has property X. T can be trusted when T says P does not have property X.
    Saying it another way, T may be untruthful when it says P has property X. T is truthful when it says P does not have property X.

Let’s consider two examples to understand these definitions.

Example: Security Analysis Tool

Consider a security analysis tool T that tells if the given program P has leaks user information (vulnerability X). So, given a program P, T says either P has vulnerability X or P does not have vulnerability X. We will use T’s verdict to advice folks to use or not use P.

  1. If T was sound and said P has vulnerability X, then users should not use P cos’ T can be trusted with this verdict, and so, P is vulnerable with X.
  2. If T was sound and said P does not have vulnerability X, then users should further examine P before choosing to use it cos’ T cannot be trusted with this verdict, and so, P may be vulnerable with X.
  3. If T was complete and said P has vulnerability X, then users should further examine P before choosing to use it cos’ T cannot be trusted with this verdict, and so, P may not be vulnerable with X.
  4. If T was complete and said P does not have vulnerability X, then users should use P cos’ T can be trusted with this verdict, and so, P is not vulnerable with X.

Example: May-Alias Analysis

Consider an alias analysis tool T that tells which pair of variables in a program P may be aliases (property X) at some point during P’s execution.

Unlike in this example, T can be perceived as giving a set of verdicts — one for every pair of variables in P. However, we consider each verdict separately, then this example is not different than the previous example.

In this example, the verdict is a bit fuzzy: T will say variables v1 and v2 may be aliases (property X) or T will say variables v1 and v2 will/must not be aliases (not property X). While this fuzziness does not affect how we deal with soundness and completeness of T, it affects the the utility of the verdict when combined with the soundness and completeness of T.

  1. T was sound and it said a variable pair V may be aliases (property X). Since we can trust a sound T when it says property X, variables in pair V may be aliases.
  2. T was sound and it said a variable pair V’ will/must not be aliases (not property X). Since we cannot trust a sound T when it says not property X, it is possible that variables in V’ may be aliases.
  3. T was complete and it said a variable pair V may be aliases (property X). Since we cannot trust a complete T when it says property X, it is possible that variables in V’ will not be aliases.
  4. T was complete and it said a variable pair V’ will/must not be aliases (not property X). Since we can trust a complete T when it says not property X, variables in V’ will not be aliases.

Observe that 1 and 2 together say nothing more than every variable pair in the program may be aliases. So, we gain no new information and, in this case, the soundness of T proves to be useless.

In contrast, 3 and 4 together helps us partition the set of variable pairs into two sets:

  1. The set of variable pairs in which variables will not be aliases.
  2. The set of variable pairs in which variables may be aliases.

So, we gain new information and, in this case, soundness of T proves to be useful.

Depending on the analysis and any fuzziness inherent in the verdict, the above observations will change. To see such changes, reason about the must variant of the second example — an alias analysis tool T that tells which pair of variables in a program P will/must be aliases (property X) at some point during P’s execution. You could also try reasoning about must-not variant and may-not variant of the second example.

I hope this will help you not misuse soundness and completeness ;)