Using Swift for Differentiable Programming

Published in Motius.de, Oct 28, 2020

I didn’t actually study Computer Science. I studied Aerospace Engineering in the Mechanical Engineering School of my university. In this field, a large amount of my life was consumed by differential equations. I learned about 12 million different methods to solve specific classes of differential equations, none of which I ever encountered in the real world, where they are mostly solved numerically instead of analytically.
Although studying all these methods for exams caused a lot of sleepless nights and I never really got good at it, I always loved differential equations.

A differential equation can describe a complex system in a single line of simple math. The solution derived from this equation can predict the system’s exact state at any time, past and future. Just one line, but it contains all the information you need.

If you compare this to programming, you might get depressed. Programming means that you have to spell every little thing out precisely to make a stupid computer understand.

If you define a function

𝑧(𝑥,𝑦) = 𝑥+𝑦

none of the normal languages can figure out that

𝑥(𝑧,𝑦) = 𝑧-𝑦

A kid can do that. Programming languages can’t [¹].

Similarly, they also can’t differentiate or solve differential equations. There are computer algebra systems that can do things like that, sometimes even in the form of libraries for well-known programming languages, but they work on their own symbolic expressions, not on functions written in the language itself.

Consider this example of a function defined in Python:

def z(x, y):
    return x + y

Python cannot figure out the function for 𝑥 or 𝑦 on its own. What it can do with additional libraries looks more like this:

solve("z = x + y", x)

… which would return a string “𝑧-𝑦”. And while this is a correct answer, I cannot use it as a function in Python. It is a string that contains the information I need, but it cannot compute a value for me if I give it numbers for the 𝑧 and 𝑦 variables.

So when I heard about Differentiable Programming, I got excited: it promises to take a function defined in the language itself, differentiate it, and thus create a usable function.

That means: given

def f(t):
    return t**2 + 1

I could do:

f’(t) = derive(f)

and I would get a real function, so that I could then do:

f’(1)

and I would get the correct answer 2, as 𝑓’(t) would now be defined as 2t.
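For comparison, here is a minimal sketch of what such a derive could look like in plain Swift if you settle for a numerical approximation. Note that this is a central difference, not the compiler-based automatic differentiation this article is about, and that the names derive, step and fPrime are made up for illustration.

```swift
// Hypothetical helper: returns a new function that approximates f'(t)
// with a central difference. This is numerical differentiation, not
// Swift's automatic differentiation.
func derive(_ f: @escaping (Double) -> Double, step h: Double = 1e-6) -> (Double) -> Double {
    return { t in (f(t + h) - f(t - h)) / (2 * h) }
}

// The example function from the text: f(t) = t^2 + 1
func f(_ t: Double) -> Double {
    return t * t + 1
}

// fPrime is a real, callable function, not a string.
let fPrime = derive(f)
print(fPrime(1.0)) // close to 2, up to floating-point error
```

The result behaves like a function, but it is only an approximation evaluated point by point. Differentiable Programming promises the exact derivative, generated by the compiler.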

Differentiable Programming, however, wasn’t invented for me and my naive dreams about programming languages being as useful as third-grade math. Instead, it is used to optimize Neural Networks in Machine Learning projects, a topic I am wholly unprepared to talk about.

However, the promise of Differentiable Programming is indeed closely related to my simple examples above: one of the key motivations for differentiation in mathematics is to find the minimum or maximum of a function, which lies where the derivative is zero. If your function describes, for example, fuel consumption, the minimum would give you the speed at which you can drive your car most efficiently (which is a bad example because that speed is obviously zero, but let’s ignore that for a second or two). In a program, that function could describe the energy consumption or the run-time of the program. Minimizing that would mean you optimize your program! That sounded exciting to me, and so I decided to play around with Differentiable Programming during the latest Motius Discovery round.
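To make the minimization idea concrete, here is a small sketch of gradient descent in plain Swift. It is built on assumptions: the cost function is invented for illustration (it has its minimum at x = 3), and the slope is approximated numerically, standing in for a derivative that a differentiable language would generate for you.

```swift
// Hypothetical cost function with its minimum at x = 3.
func cost(_ x: Double) -> Double {
    return (x - 3) * (x - 3) + 1
}

// Central-difference approximation of the slope of f at x
// (assumption: a stand-in for a compiler-generated derivative).
func slope(of f: (Double) -> Double, at x: Double, step h: Double = 1e-6) -> Double {
    return (f(x + h) - f(x - h)) / (2 * h)
}

// Plain gradient descent: repeatedly take a small step against the slope.
func minimize(_ f: (Double) -> Double,
              from start: Double,
              learningRate: Double = 0.1,
              iterations: Int = 200) -> Double {
    var x = start
    for _ in 0..<iterations {
        x -= learningRate * slope(of: f, at: x)
    }
    return x
}

let best = minimize(cost, from: 0.0)
print(best) // approaches 3
```

The learning rate and iteration count are arbitrary choices that happen to work for this simple function. Real optimizers are considerably more careful about both.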

A short introduction to Swift

Swift is the current pioneer in Differentiable Programming.

Development started at Apple around 2010, and the language was introduced in 2014 as a modern successor to Objective-C.
Google started using Swift in the “Google Brain” project after Apple released the language and compiler as Open Source Software. Google added the Differentiable Programming capabilities to Swift and integrated it with their machine learning libraries.

It is probably not a coincidence that this story also tracks the employment history of Chris Lattner, who is one of the designers of Swift and original authors of LLVM, the compiler toolbox that Swift is built with.

Automatic Reference Counting

At this point I got a little distracted. Swift uses Automatic Reference Counting (ARC) for memory management, and I’m always interested in programming language techniques like that. ARC is a pretty nice scheme to manage dynamic memory in a program. When you create an object, you also get a reference to it. A reference is what you need to actually access that object: a variable with a name or an entry in a list. Every object carries a counter. Every time a new reference to the object is created, the counter goes up. Every time a reference is deleted, the counter goes down. A reference is deleted when the variable is set to nil, goes out of scope, or is reassigned to point to a different object. The compiler inserts the code that updates the counter, and once the counter reaches zero at run time, the object’s deinitializer runs and the object is deleted, so that the memory it was stored in is free again.

The idea behind this is simple: without a reference, there is no way for you to use the object, which means you won’t need it anymore and thus it can be deleted.

Let’s look at a quick example.

First, we define a simple class with a deinit function, so we can see when it is called. The deinit function runs just before an object is deleted.


class Person {
    let name: String

    init(name: String) {
        self.name = name
        print("\(name) is being initialized\n")
    }

    deinit {
        print("\(name) is being deinitialized\n")
    }
}

Next we define three variables that can either be nil (nothing) or hold references to a Person object.

var ref1: Person?
var ref2: Person?
var ref3: Person?

Now we create an instance of Person and assign it to ref1.

ref1 = Person(name: "Captain Cool")

And we point ref2 and ref3 to the same instance as ref1.

ref2 = ref1
ref3 = ref1

There are now three references to the Captain Cool instance. Next, we will delete them one by one.

ref1 = nil
ref2 = nil
ref3 = nil

After the last reference is deleted, we will see this output:

>>> Captain Cool is being deinitialized

ARC automatically cleans up for us and we don’t have to manually free this memory. Nice!

It is however my duty as an engineer to immediately try to destroy any system that seems to be working nicely.

So let’s go again, but this time, our Person gets a Brain. The Person will have a reference to its Brain and vice versa.

class Person {
    let name: String
    var brain: Brain?

    init(name: String) {
        self.name = name
        print("\(name) is being initialized\n")
    }

    deinit {
        print("\(name) is being deinitialized\n")
    }
}

class Brain {
    let name: String
    var owner: Person?

    init(name: String) {
        self.name = name
        print("\(name) is being used\n")
    }

    deinit {
        print("\(name) is being used no more\n")
    }
}

Now we will create a Person and a Brain and assign them to each other.

var person: Person? = Person(name: "Captain Cool")
var brain: Brain? = Brain(name: "Captain Cools Brain")
person?.brain = brain
brain?.owner = person

Now we delete the reference to the Person and then the reference to the Brain:

person = nil
brain = nil

… and there will be no deinit message this time. The Captain Cool instance still has a reference to the Captain Cools Brain instance and vice versa, so neither reference count reached zero. However, both of those references are stored in instances that nothing else refers to, so the instances are unreachable. We have leaked memory!

To be fair, this is not a Swift-specific issue. In Rust, for instance, you can leak memory in just the same way with reference-counted pointers. In languages that use a tracing garbage collector, it is way harder to leak memory accidentally, but you can still make it hard for the garbage collector to do its job.

This kind of cyclic dependency is an edge case that ARC cannot resolve on its own, but the programmer can give it a hint to help out: the weak keyword. References marked that way don’t increase the reference count, so they cannot keep an object alive. When the object is deleted, Swift sets every weak reference to it to nil.
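Applied to the example above, a minimal sketch of the fix could look like this. The only functional change is the weak keyword on the Brain’s owner property. The constant ownerWasCleared is added purely for illustration.

```swift
class Person {
    let name: String
    var brain: Brain?          // strong reference: the person keeps its brain alive

    init(name: String) {
        self.name = name
        print("\(name) is being initialized")
    }

    deinit {
        print("\(name) is being deinitialized")
    }
}

class Brain {
    let name: String
    weak var owner: Person?    // weak reference: does not count towards the owner's reference count

    init(name: String) {
        self.name = name
        print("\(name) is being used")
    }

    deinit {
        print("\(name) is being used no more")
    }
}

var person: Person? = Person(name: "Captain Cool")
var brain: Brain? = Brain(name: "Captain Cools Brain")
person?.brain = brain
brain?.owner = person

// The only strong reference to the Person goes away, so its deinit runs.
person = nil

// The weak reference inside the Brain was set to nil automatically.
let ownerWasCleared = (brain?.owner == nil)

// The Person released its Brain when it was deleted, so this was the
// last strong reference and the Brain's deinit runs as well.
brain = nil
```

Which side of the cycle becomes weak is a design decision. Here the Person owns the Brain, so the back-reference from Brain to Person is the natural candidate.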

Swift: high potential, great on Apple, not so great on Linux

Swift is a great language for programming within the Apple ecosystem. It is very ergonomic, it is fast and it is reasonably safe. But as soon as you leave the Apple ecosystem, things get a little less nice.

Most Swift developers use macOS and target macOS and iOS, so many libraries are written with those platforms in mind and are not first-class citizens on Linux. The tooling in the Apple world lives in Xcode; outside of the Apple world, little effort is put into it.

Swift does run on Linux, and also Windows since a few weeks ago, and Apple is definitely not holding back Swift on these platforms, but the reality of the Apple ecosystem means that the official and community support is focused on … the Apple ecosystem. Swift also competes in mind-share with other languages, and especially in the Open Source world, Rust is the sweetheart of the masses at the moment.

It gets even more complicated when you actually want to use the Differentiable Programming features of Swift: these are not yet part of mainline Swift, so you have to use Google’s fork of the language. Google only provides pre-built packages for Ubuntu 18.04 (which is an LTS, but still … old). The other option is to use Colab, Google’s hosted Jupyter Notebook online IDE.

Oh, and about that Differentiable Programming

You might have noticed that I didn’t talk too much about Differentiable Programming anymore. Well, here is what I found:

I have no idea what to do with it.

I mentioned differential equations above but a Differentiable Programming language still cannot solve them. The output of the differentiation is almost a normal function, but not quite — Swift cannot differentiate it again. This means you are stuck with the first derivative.

I also tried to rebuild something I learned in my control theory classes: a PID controller. These consist of a proportional part (where there is a proportional relationship between an input and a calculated output), an integral part and a derivative part. If you look at a system over time, you will realize that the proportional part handles the current situation, the integral part makes sure you don’t forget about the history and the derivative part looks into the future. A controller like this helps you to make sure the measured output will match the desired value, without overshooting the goal.

For my test, I simplified the PID controller to a PD controller, so I don’t have to deal with integration, which again Swift does not provide a tool for (although it would be simple to add, as in discrete time the integral is just the sum of the past error values multiplied by the time step). This test actually worked quite nicely and created some great graphs.
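To illustrate the structure of such a controller, here is a minimal PD sketch in plain Swift. It is not the code from my experiment: the derivative part is approximated with a finite difference instead of Swift’s differentiation, and the gains, the time step and the simulated unit mass are all assumptions chosen for illustration.

```swift
// Hypothetical PD controller. The derivative part is approximated by the
// change of the error between two calls divided by the time step.
struct PDController {
    let kp: Double                      // proportional gain (assumed value)
    let kd: Double                      // derivative gain (assumed value)
    var previousError: Double? = nil    // no history before the first call

    mutating func control(desired: Double, measured: Double, dt: Double) -> Double {
        let error = desired - measured
        // On the first call there is no previous error, so the rate is taken as 0.
        let errorRate = previousError.map { (error - $0) / dt } ?? 0.0
        previousError = error
        return kp * error + kd * errorRate
    }
}

// Simulated system: a unit mass pushed by the control input (assumption).
var controller = PDController(kp: 4.0, kd: 4.0)
var position = 0.0
var velocity = 0.0
let dt = 0.01
let desired = 1.0

for _ in 0..<1000 {                     // 10 simulated seconds
    let input = controller.control(desired: desired, measured: position, dt: dt)
    velocity += input * dt              // acceleration equals the control input
    position += velocity * dt
}

print(position) // settles near the desired position 1.0
```

An integral part could be added the same way, by keeping a running sum of error * dt in the struct and multiplying it by a third gain.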

Figure: Differentiable programming PD controller

The blue line is the desired position, the red line is the measured position and the orange line is the difference between desired and measured position. The green line is the computed control input.

You can check out the code here.

However, in reality, it’s pointless. Swift can, obviously, only differentiate functions. A PID controller however does not use functions as its input. It uses sensor data and user input: a sensor that measures the actual state of the output you want to control, and a user who inputs what they want that output to be. These are discrete values, not functions. It works great for simulation though, since I just modeled the measured position as a function.

Conclusion

Despite my lack of a proper use case and the limitations, I have to admit, using Differentiable Programming felt a little bit like magic. Getting the actual derivative of a function, even if this function contains structures like if/else clauses, is impressive, and it feels very natural to use. The fact that you cannot get a second derivative is a pretty big downer, but there is both an explanation for why this is hard and also ideas on how to get it working.

From here on out, I think we just need to find the right problem to solve with Differentiable Programming, and I do believe that in the coming years we will see some very smart people do exactly that. And we will be very impressed by it.

[¹]: except of course when they can.