What is type safety?

Thiago Silva
11 min readDec 9, 2016

--

Often, we hear that writing softwares using “statically typed languages” makes them type safe. Some even claim that “type safety means programs are free of type errors”.

Even though there are rigorous definitions for a concept of type-error and what it means for a program to be type safe, these definitions seem incapable of capturing well the universe we intuitively understand to belong to these ideas.

In what follows, I discuss how this subject has been treated by two notorious researchers and reflect on limitations of the definitions offered. Though not a big requirement, some background on lambda calculus and semantics should help with better evaluating (pun intended) this post. In any case, the linked PDF to Cardelli below should be a good introduction to the subject.

Luca Cardelli and Type Safety

“The fundamental purpose of a type system is to prevent the occurrence of execution errors during the running of a program. This informal statement motivates the study of type systems, but requires clarification. Its accuracy depends, first of all, on the rather subtle issue of what constitutes an execution error”.

This is the opening of Luca Cardelli’s Type Systems. We will see the subtlety that he refers to surrounding what an error is indeed permeates this subject. In this work, Cardelli classifies “execution errors” as trapped and untrapped. The former are errors that “cause computation to stop immediately”, and the latter are errors that “go unnoticed (for a while) and later cause arbitrary behavior”.

Later, he defines a program is safe if it does not cause untrapped errors. He also defines a subset of all possible execution errors as forbidden errors: these are all untrapped errors plus a subset of the trapped ones. Then, a well-behaved program is one where no forbidden errors occur (thus, a well-behaved program is also safe). Finally, he says a program is strongly checked if all legal programs are well-behaved.

Now, a strongly checked program has the following properties:

  • No untrapped errors occur (safety)
  • No trapped errors belonging to forbidden errors occur.
  • Other trapped errors may occur

Finally, he explores the role of typing:

“Typed languages can enforce good behavior (including safety) by performing static (i.e., compile time) checks to prevent unsafe and ill behaved programs from ever running. These languages are statically checked; the checking process is called typechecking, and the algorithm that performs this checking is called the typechecker. A program that passes the typechecker is said to be well typed; otherwise, it is ill typed, which may mean that it is actually ill-behaved or simply that it could not be guaranteed to be well-behaved.”

As for so-called “untyped languages”, he points out that:

“They do not have types or, equivalently, have a single universal type that contains all values. […] Untyped languages can enforce good behavior (including safety) in a different way, by performing sufficiently detailed run time checks to rule out all forbidden errors. The checking process in these languages is called dynamic checking […] Most untyped languages are, by necessity, completely safe”.

Finally, he differentiates between “strongly checked” and “weakly checked” languages:

“In reality, certain statically checked languages do not ensure safety. That is, their set of forbidden errors does not include all untrapped errors. These languages can be euphemistically called weakly checked (or weakly typed, in the literature) meaning that some unsafe operations are detected statically and some are not detected.

Discussion

Cardelli’s definitions seem to capture most of the intuition about safety and and help classifying and comparing languages accordingly. From his point of view, we can differentiate languages like Python from C as untyped/dynamic checked and typed/statically checked, respectively. We can also differentiate statically checked languages like C from ML as weakly checked and strongly checked, respectively.

In the “untyped languages” group, he notes we can see them equivalently as “unityped” and, since the “universal type” type checks on all operations, these languages are also well-typed. In other words, in theory, there are no forbidden errors (i.e. type errors) on programs written in these languages. Which is correct but, at the same time, it raises questions. For example, how should we regard TypeError exceptions in Python?

Also, while the loose definitions of trapped and untrapped errors seem to be useful in allowing us to reason about different type systems, they are still quite informal. In particular, the notion of untrapped error. It depends, for instance, on what it means for a program to manifest “arbitrary behavior”. Does producing a wrong result fits the bill? If so, then even a well-typed program with no dangerous operations such as casts can be considered unsafe if the wrong data are used as input for procedures, which could be said to result in “arbitrary behavior” (consider an extreme example where the “wrong data” is a custom program just created by the program itself to be evaluated afterwards). Another example are concurrent programs that type check but have race conditions that could be said to manifest “arbitrary behavior”.

We could offer an alternate definition for untrapped errors that is more precise and able to represent better what seems the main intended meaning of “arbitrary behavior”: that is, the ability of C programs to unnoticeably corrupt it’s own runtime and/or suffer segmentation faults (ignoring the aforementioned concurrency question and other computing phenomena that hasn’t debuted as a type in a type theory yet).

But even with such a definition, we are left with the trapped errors to characterize what a type error is, which is contingent on the language. That means that, in the end, the theory still leave us with type-safe programs that raise type errors!

Benjamin C. Pierce and Type Safety

In his Types and Programming Languages (the primary textbook on the subject), Benjamin Pierce writes:

The most basic property of [a type system] is safety (also called soundness): well-typed terms do not “go wrong”: […] it means reaching a “stuck state” that is not designated as a final value but where the evaluation rules do not tell us what to do next.

In the context of lambda calculus, he offers a formalization of “stuck state” as the following:

A closed term is stuck if it is in normal form but not a value.

with the following definitions in mind:

closed term: terms containing no free variables.
values
: all possible terms that may be the final result of evaluation.
normal form: A term t is in normal form if no evaluation rule applies to it.

Then, he offers the definition of well-typed term through two theorems:

Progress: A well-typed term is not stuck.
Preservation: If a well-typed term takes a step of evaluation, then the resulting term is also well typed.

Therefore, a type-error is the occurrence of a term that fails these properties. Hence, a stuck term is also called an ill-typed term.

Discussion

To exercise these definitions, consider a simple boolean language (BOOL) below:

BOOL grammar.
BOOL evaluation rules.

In BOOL’s grammar (as BNF), we have defined what a term is (i.e. what constitutes a program; the “entry point” of parsing) and what a value is (as stated above, these are the terms that may be the final result of evaluation). With this language, we can write programs such as true, false, and(true, false), and(false, and(true, true)), etc. We evaluate programs using the evaluation rules STEP-T, STEP-F and STEP-A, where t stands for a variable ranging over terms. The following are examples of evaluations of programs (reductions are in normal order, i.e., from left to right):

Program 1.
Program 2.
Program 3.

We could extend our language BOOL with types using the following rules:

BOOL’s typing rules.

As for soundness, it should be clear that all values are well-typed as well as all terms . So, errors in syntactically correct programs are not possible. Now, consider the following (partial) extension to the language where integers are allowed. Let this language be called BOOL-I:

BOOL-I grammar.
BOOL-I: Typing rules for integers.

With this extension, although all previous programs are still valid plus programs such as -1 and 99, we can write programs that are stuck, like: and(2, false). This program is ill-typed according to Pierce’s definition, and rightly so.

Interestingly, there’s another way to model type-errors. Further on, Pierce suggests changing the language to reify the “stuck state” as a term, such as an error term, to indicate this “stuck state” has been reached. With that approach, we can rewrite BOOL-I language adding an error term and adding an evaluation rule to “bubble the error up” during evaluation. Let’s call this language BOOL-Ie.

BOOL-Ie grammar.
BOOL-Ie error rules.

Under BOOL-Ie, and(2, false) evaluates to error. In this way, we represent that “stuck state” (the inability of executing) as data back into the program, allowing them to proceed with evaluation.

However, it seems that by doing this we invited that old puzzle again.

if evaluation errors become values, are they still errors? If so, then in what sense?

Now, we can say both languages, BOOL-I and BOOL-Ie, are equivalent as long as we regard error a “special” term indicative of type-error. Currently this is the case, since error is not a value. If we regard error as a value, then and(2, false) is well-typed. Let’s consider these two alternatives in more detail:

“error” as a type-error

In the language BOOL-Ie, because error is not a value but it is in normal form (i.e. there are no evaluation steps to perform on it) then, by definition, it is an ill-typed term. It is simply acting as a placeholder denoting the occurrence of a type-error, a device used in our semantic apparatus. That is, error is a type-error from the “semantics perspective” and not any value worth of consideration from the “user’s perspective”. Then, we might reasonably say the function’s type is, still:

“error” as an ordinary value

The alternative is to regard error as a value just like any other. That is, a value that belongs to a type (e.g. Error), a value that can be computed with to yield other kinds of values. We do this by moving error in the grammar to the value production and defining it’s type. The result is that error is not a type-error according to our definitions. Thus, the program and(2, false) is well-typed and it evaluates to an ordinary value that happens to be error (quite the same as returning -1, NULL , IOException()… in popular programming languages). Consequently, and can be be evaluated with booleans, integers and error values and return booleans or error. So, and’s type would be something like¹:

…or perhaps as a polymorphic function, where T, U and V are type variables:

…or still, we could introduce sub-typing and say it’s a function “from anything to anything”: it’s “untyped” or “unityped”:

Sub-typing rules.

In any case, it’s important to observe that it would only take a small change in the language to make it possible for a program t to evaluate to a normal term x ≠ error even if t contains terms that, at some point, evaluate to error (in other words, a program could “handle” an error and proceed normally — thus, error could act as an exception value).

Consequently, it would be a contradiction to regard the occurrence of a certain typed value error as a type-error. First, because a type-error is defined as the “stuck state” in the evaluation steps, and if error is a value that can be operated within the program, even to a point it becomes “unstuck”, then it can’t be a type-error. Second, if error is a value like any other belonging to a type, then any function of type f: Error returning a value error of type Error is clearly well-typed. Then, what kind of error is an error, if not a type error?

Conclusion

In exploring well accepted² notions of type safety, we seem to encounter some puzzling corners. Even when we try to use rigorous definitions, these seem limited in their power to capture our expectations, as if they were too conservative.

In summary, we’ve seen that:

  • “Untyped languages” may exhibit “type errors”. And yet, they are made in such a way that their programs are inherently type-safe.
  • “Statically checked” programs are guaranteed not to manifest arbitrary behavior. And yet, arguably they still may behave arbitrarily.
  • A precise definition of type safety depends on a precise definition of type-error.
  • When type errors are introduced into the language as values, even though a certain type-error concept can be rigorously defined, it fails to satisfactorily capture phenomena we regard as “type errors”. Thus, the meaning of “type error” becomes diffused.
  • The popular notion that “type safety guarantees programs are free of type errors” can be misleading and may be put into question since programs with type errors can be rewritten to result in an value that represents that very error occurrence. Thus, one could say the error is still there and still happens, it just became itself computable and type safe.

¹A few extra typing rules might be necessary for completeness, but that is insubstantial to the discussion.

²While these two texts seem fairly popular (Pierce’s book in particular is the text book on “PLT-oriented” type theory), I did not conducted a much broader search. With the exception of Bob Harper’s Practical Foundations for Programming Languages, which seems to have a similar treatment as Pierce’s, these are the best treatments of the subject I know of.

--

--

Thiago Silva

I used to do programming language research. I still do, when nobody is looking…