Why is Free Pascal better than PHP?

A nonsensical ramble on type systems

Jakub Tapuć
The Startup
8 min read · Mar 20, 2020

Introduction

With modern codebases growing at an insanely fast pace, we have generally come to terms with the fact that certain traits of our beloved programming languages determine the quality of the end product in one way or another, be it support for seamless modular programming, a static type system, or concurrency and parallelism with robust ways of maintaining state in distributed systems.

Now and then

Yet in some ways (with numerous exceptions, of course) we still use the same programming languages as we did twenty or so years ago. Most of them have acquired interesting new features, usually inherited from functional languages rooted in the lambda calculus: lambda expressions (so-called arrow functions), immutability, referential transparency and list comprehensions. Oftentimes structures from type theory and category theory have been incorporated as well: algebraic data types (product types, also called tuples, and sum types, often called discriminated unions), functors, applicatives (applicative functors) and monads, to name a few. Another thing that cannot be neglected is the common effort to standardize languages and their tooling: formal specifications (even PHP got an a posteriori one), linters, and tools for auto-formatting code and building pleasant-looking documentation.

A spoon of tar in a barrel of honey

However, we have to be honest here. There are languages still in use today that somehow overslept this phase of accelerated growth. And when they woke up it was way too late to catch up. Let’s talk PHP.

In today’s world, where strong static typing is becoming more and more crucial to creating robust, performant and secure applications, it’s probably fair to say that PHP’s type juggling might not be the best fit. The recent addition of gradual typing is not sufficient, considering that the language still lacks generics. There are also many peculiarities within the type system itself; speaking of generics, the inability to express the boxing and unboxing of objects is a bit surprising. Now, let’s go back in time a bit, to the year 2000, when PHP 3 was still in the game and its successor, PHP 4, was released.

The legacy — type juggling

The old PHP 3 manual explicitly mentions type juggling as a language feature. The not-so-good thing about type juggling is the mental juggling you need to perform when your script goes astray without throwing any exceptions. Even more interesting is the fact that it’s still there in PHP 7.4. It is the easiest way to bypass PHP’s gradual type system, since operators are baked into the language and their semantics are exempt from the type system itself. Now let’s see if we can recreate some of the examples from almost 20 years ago.

// 1
$foo = "0"; // $foo is string (ASCII 48)
$foo++; // $foo is the string "1" (ASCII 49)
// 2
$foo += 1; // $foo is now an integer (2)
$foo = $foo + 1.3; // $foo is now a double (3.3)
// 3
$foo = 5 + "10 Little Piggies"; // $foo is integer (15)
$foo = 5 + "10 Small Pigs"; // $foo is integer (15)
  1. This partially holds true. You can still use the post-increment operator on strings but the variable will be cast into an integer implicitly.
  2. Implicit integer-float juggling is still pervasive in the language.
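  3. This still evaluates, with a caveat. In PHP 7 a leading-numeric string such as "10 Little Piggies" still yields 15, but the engine raises a notice about a non-well-formed numeric value; a string with no leading digits raises a warning and is treated as 0.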
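
To check how these operators behave on a current interpreter, something like the following sketch can be run from the command line. It assumes PHP 7.0 or later; the variable names and the error handler that collects diagnostics are illustrative additions and not part of the manual's example.

```php
<?php
// Collect notices and warnings instead of printing them, so the
// diagnostics raised by type juggling can be inspected afterwards.
$diagnostics = [];
set_error_handler(function (int $severity, string $message) use (&$diagnostics): bool {
    $diagnostics[] = $message;
    return true; // mark the diagnostic as handled
});

$counter = "0";
$counter++;                        // numeric string becomes an integer
$typeAfterIncrement = gettype($counter);

$counter += 1;                     // plain integer arithmetic
$counter = $counter + 1.3;         // integer is widened to a float
$typeAfterFloatAdd = gettype($counter);

$label = "10 Little Piggies";
$sum = 5 + $label;                 // leading digits are used, the rest is ignored

restore_error_handler();

var_dump($typeAfterIncrement, $typeAfterFloatAdd, $sum, $diagnostics);
```

The interesting part is the content of $diagnostics: the last addition is reported through the error handler (as a notice on PHP 7.x and, as far as I can tell, as a warning on PHP 8), yet the script keeps running and $sum still holds an integer.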

But all of this is so easy to write and natural for us humans to understand. It has a low cognitive load compared to, say, OCaml where arithmetic operators are in fact regular infix functions which aren’t polymorphic over their arguments.

(*
These are code samples from an OCaml REPL session.
The string 'utop #' is the prompt.
*)
utop # (+);;
- : int -> int -> int = <fun>
utop # 1 + 1;;
- : int = 2
utop # 1.0 + 1.0;;
Error: This expression has type float but an expression was expected of type
int
utop # (+.);;
- : float -> float -> float = <fun>
utop # 1.0 +. 1.0;;
- : float = 2.

Of course, numeric conversions such as widening and narrowing are prevalent in many strongly typed programming languages, but they do not escape the language’s type system as easily as in PHP. Narrowing usually has to be signaled explicitly: converting a long integer to a short integer requires an explicit cast. Even then, you cannot implicitly turn a string into an integer just by using an arithmetic operator.

Also, strong versus weak typing and static versus dynamic type resolution are two separate distinctions. It turns out you can get a very cohesive and reasonable strong, dynamic type system based on predictable coercion; Ruby is a good example of such a language.

Speaking of strongly typed languages, though: ever heard of Object Pascal?

P̶u̶n̶k̶ Pascal not dead

https://upload.wikimedia.org/wikipedia/commons/d/df/Turbo_Pascal_7.0_Scrren.png

The Pascal family of languages is quite large, and some of its descendants are still actively maintained. Object Pascal, for instance, was an implementation originally created by Apple. There are a couple of Object Pascal-flavored languages; one such example is Delphi. It has a mature open-source counterpart: Free Pascal. It’s a very modern implementation supporting modular programming, generics, delegates, function and operator overloading and much more. It has great tooling support and a package manager. Free Pascal implements the official ISO 7185:1990 Pascal standard.

Pascal has been a statically and strongly typed language since its inception in the 1970s. Its author, Niklaus Wirth, designed several programming languages, but not all of them were statically typed (vide Euler). This is in contrast to PHP, which was designed as a glue language for making static webpages less static.

What follows is a brief, nonsensical yet entertaining attempt at comparing the type systems of languages that probably should never be compared in any way. In particular, we’re going to take a look at why the lack of generics is such a pain in PHP.

What exactly are generics

Generics are a feature that lets you write code parametrized by types. The most obvious example of a useful parametrized structure is a collection.

Splish-splash

PHP has a dedicated collection-like datatype for storing values: the array. Once something is put inside an array, it can be retrieved by its index. There are times, though, when you’d like to use one of the SPL collections, such as SplFixedArray, which may prove to be more space-efficient than an ordinary array. As with arrays, you can put values of any type in it.
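
A minimal sketch of that permissiveness, assuming two hypothetical classes named Dog and Dishwasher (the names are borrowed from the error message quoted just after this listing; the rest is an illustration rather than a definitive listing). The try/catch is there only so the script can finish and report what happened.

```php
<?php
// Hypothetical classes used only for illustration.
class Dog
{
    public function bark(): string
    {
        return "Bark, bark!";
    }
}

class Dishwasher
{
}

// SplFixedArray has a fixed size but places no constraint on element types.
$dogs = new SplFixedArray(2);
$dogs[0] = new Dog();
$dogs[1] = new Dishwasher(); // accepted without complaint

$output = [];
try {
    foreach ($dogs as $dog) {
        $output[] = $dog->bark();
    }
} catch (Error $e) {
    // Since PHP 7, calling an undefined method throws an Error.
    $output[] = "PHP Error:  " . $e->getMessage();
}

echo implode(PHP_EOL, $output), PHP_EOL;
```

Nothing stops the dishwasher from entering the collection; the mistake only surfaces later, when the loop reaches it.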

Bark, bark!
PHP Error:  Call to undefined method Dishwasher::bark()

I have never seen a barking dishwasher but I am no expert in this. We would like to find a way of inserting elements into an array collection in a dog-aware fashion. One way would be to try something like the following.

PHP Fatal error:  Declaration of DogFixedArray::offsetSet(int $index, Dog $dog) must be compatible with SplFixedArray::offsetSet($index, $newval)

It won’t work. PHP simply forbids that. If a parameter in the base class is not typed, it effectively accepts any type, which PHP calls “mixed”. You can’t narrow it in a child class, because the overriding method would then accept less than the parent method does, which is justly prohibited. It’s a bit of a baffling concept, since there is no “mixed” type hint, so sometimes obvious variance rules can’t be expressed. There is an RFC for that, though.

One workaround is to use the instanceof operator inside offsetSet. It should be noted, however, that this operator inspects type information at run time, so in the end you sacrifice one thing for another: space efficiency versus runtime overhead.
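
The instanceof workaround might look roughly like the sketch below. The class name DogFixedArray comes from the error message quoted earlier; the Dog and Dishwasher classes, the exception type and the message text are assumptions made for the example. The attribute line is treated as a comment by PHP 7 and only silences a deprecation notice about the missing return type on newer versions.

```php
<?php
class Dog {}
class Dishwasher {}

// Keeps the parent's untyped signature and enforces the element
// type with a run-time check instead of a parameter type.
class DogFixedArray extends SplFixedArray
{
    #[\ReturnTypeWillChange]
    public function offsetSet($index, $newval)
    {
        if (!$newval instanceof Dog) {
            throw new InvalidArgumentException('DogFixedArray accepts only Dog instances');
        }
        parent::offsetSet($index, $newval);
    }
}

$dogs = new DogFixedArray(2);
$dogs[0] = new Dog();

$rejected = false;
try {
    $dogs[1] = new Dishwasher(); // refused by the instanceof check
} catch (InvalidArgumentException $e) {
    $rejected = true;
}

var_dump($dogs[0] instanceof Dog, $dogs[1], $rejected);
```

The check runs on every write, and the constraint is still invisible in the method signature, so callers learn about it only when the exception is thrown.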

Generic collections in Free Pascal

It’s time to see how Pascal treats generic collections and the obvious implications this treatment has on type safety in the language.

Free Pascal comes with a rich open-source standard library called RTL (Runtime Library) which contains a unit called FGL¹ (Free Pascal Generic Lists). The following is an example that uses this unit.

[Code listing: Generics in Free Pascal]

Points 1–3 are explained in the code.

4. The “specialize” keyword is used to create a new instance of a parametrized type. From then on TDogList refers to a collection of objects of type TDog.
5. Methods are defined outside of class declarations.
6. The “magic” variable Result works a bit like the return keyword found in many mainstream languages.
7. Create is the name of an actual constructor.
9. It turns out you can even have a foreach-style loop (for ... in) in Free Pascal.
10. Free Pascal doesn’t have a garbage collector, so objects need to be freed once you are done with them.

It’s not necessary to free the list’s items, because by default the list owns its data and frees the items when it is destroyed. This can be turned off by passing False as the first and only parameter to the TFPGObjectList constructor.

Finally, the eighth point proves that it’s impossible to add an object of class TDishwasher to a list of TDogs.

Verbosity

Declaring new instances² of generic types in Free Pascal is mandatory³, which is a bit verbose compared to, say, C#:

var words = new List<string>();

In C# type instantiation is ad hoc, while in Pascal you need to create type synonyms beforehand, in the “type” section of a unit file.

[Code listing: Mandatory specialization]

Under the hood

Internally, Free Pascal’s take on generics is somewhat different from that of C++ or Java. When a unit is compiled, a PPU file is generated. It contains meta-information about the unit. Most importantly for us, it maintains a token buffer used by the compiler to create specialized implementations based on the generic type definitions. The workflow looks as follows.

1. For each generic type:
generic TGeneric<T> = ...
create a token buffer with meta-information inside the PPU file.
2. For each occurrence of:
TSpecialized = specialize TGeneric<*>
parse the specialization and create the concrete type by substituting * for T, where * is a simple type.

At the conceptual level it’s not much different from other languages like Java, but the implementation differs significantly. Because the specialization happens at compile time, Free Pascal’s generics are not subject to type erasure, as is the case in Java.

Summary

This article summed up why the lack of generics in PHP is painful and why its lax type system sometimes just doesn’t help with writing good code. It presented Free Pascal’s take on generics and showed how it avoids many of the problems that exist even in modern versions of PHP. Type juggling was juxtaposed with coercion in dynamic languages like Ruby. Finally, a full working example in Free Pascal demonstrated how common typing issues can be alleviated with the use of generics.

[1] From what I know, FGL is not the most comprehensive generic collection library for Free Pascal; more suitable alternatives exist.
[2] By “instance of a generic type” I mean another type that is no longer polymorphic. Strictly speaking, it doesn’t contain any free type parameters.
[3] In fact, Free Pascal supports two main modes, {$MODE OBJFPC} and {$MODE DELPHI}. In the Delphi compatibility mode you can skip the ahead-of-time type declarations, and the “specialize” keyword must not be used at all.

Jakub Tapuć

Hi! I’m a developer based in Kraków, Poland. I’m passionate about programming, programming languages and their pragmatics.