Swift for TensorFlow

7 min read · Apr 3, 2018

In 2015, Swift on the server was an experiment that many members of the Swift community wanted to see realized. We saw a lot of opportunities for where it could be used, but I personally had a pessimistic view of whether those opportunities would be realized. Apple wasn’t focused on the server, and moving from macOS to Linux was a gigantic effort.

From 2016, IBM Kitura, Perfect, and Vapor gained popularity as ways to run Swift on the server, and their communities developed a large number of community-supported packages to push server-side Swift forward, including templating, routing, and data binding libraries. A migration of sorts began from Node.js projects to Swift, and I was fully on board with that, even though it was still early and Swift support for Linux was limited to Ubuntu.

Alternative Languages on TensorFlow

At the time, I, along with a number of other Swift developers, was also using deep learning tools regularly, and many of us were not satisfied with the almost ubiquitous dependency on Python to build and execute models. Python was good at the job, but it wasn’t a good choice for every developer. Go and Rust developers were facing the same frustrations. If you were using TensorFlow as prolifically as we were, you may have noticed a few issues being raised to support alternative languages (I even recall seeing Google’s language, Dart, mentioned at some point on Stack Overflow).

The Go effort went as far as gRPC support to interact with TensorFlow, but this was limited, and further implicit support for TensorFlow was seemingly impossible due to Go’s limitations as a language. Apparently being expressive is a useful language trait in machine learning.

I recall one of the Google developers involved in the experiments around the last quarter of 2016 commenting that he’d abandoned Go for TensorFlow and had moved to experimenting with Rust, as he found it much more productive. Personally, as much as I enjoyed Rust, a machine learning language tends to allow rapid development of models, whereas Rust is slow to develop with, and verbose. This benefits Rust in a lot of areas, and the results keep me using Rust wherever I can justify it. For machine learning, I wanted type safety with rapid development syntax. Swift was still my favorite language for throwing together ideas and evolving them into production code.

Swift’s Natural Fit

On launch, Swift initially reminded me of JavaScript; then, as I actually began to use it, it became more akin to Java in all the good ways: type safety and object-oriented syntax, with functional expressions and higher-order functions. I began to see it as a logical next language to replace Java on the server. In 2015, this was still early, and I was still painfully moving some highly scalable Java services over to Go. So, although I experimented with Swift on the server, I was keen to avoid diverting effort onto yet another server language.

I was already darting between Rust and Swift communities, undecided on which language to focus on long-term, but keen to support both. Later, I joined the Swift Server work group, using early ideas, and feeding back into the discussions based on the experiences I’d had with the Java Community Process in previous years. Rust has a great community though, regardless.

As far as application development went, I used Swift almost daily, and replacing my Node.js and Java applications with Swift on the server was an ambitious personal goal, so I was keen to do anything I could to make it a justifiable server language. With TensorFlow support, it would also have replaced my reluctant use of Python. For infrastructure, I’d moved most of my efforts from Python over to Go, which was a natural step in that space.

The Ocular Experiment

So, at the beginning of 2016, I created Ocular, under the Apache License 2.0, with the express aim to produce a “Swift API for TensorFlow. Designed to run on both server and mobile systems...”

I started with gRPC, as most other developers did, and then I tried to add C-bindings to access the full features of TensorFlow. It was slow and painful to develop Ocular, but I was determined to use it in production. After a lot of effort, I made enough progress to begin using Ocular in small experiments, on the server. Almost a year passed, and in October 2016 I joined the community effort to make this work at scale.

I’d made little progress, but it was usable. Maintenance was difficult, and there was issue after issue. Keeping track of TensorFlow in GitHub was a challenge, but necessary to keep Ocular working.

Swift Community Embraces TensorFlow

It was at this point I met Richard Wei, on GitHub. He was working through the same challenges. As far as I knew, at the time, he was working on LLVM – as Chris Lattner had before moving to Apple to create Swift – and so I gained some confidence that Richard would be on the right track. I immediately offered to support his work, taking whatever I had and throwing it into the community effort. Everybody seemed to be working through similar challenges, in separate project streams.

Unfortunately my time was limited, as was Richard’s, so efforts continued, but progress was still slow. Ocular started to fall behind, after I spent a lot of effort moving from Swift 2 to Swift 3. I needed to focus on production models in Python, and opportunities for experiments were few, as I launched more deep learning models onto my networks. I never managed to find that cohesion with Richard or any of the other disparate efforts within the community.

The Perfect Swift Server community launched their own TensorFlow library, but it too was experimental and limited for my use, and wasn’t really a step forward. A month after I migrated my code, Richard responded to my frustrations with gRPC limitations and C-binding difficulties with a vague message indicating that he had another, more useful approach.

At this point, I remained involved in Ocular, but I’d shifted my focus to what Richard was doing elsewhere, keen to see what progress he’d make. I essentially began to give up, even though my stubbornness wouldn’t allow me to let go.

As time moved on, so did my efforts to extend my production networks. I moved some models to Facebook’s Caffe2 (still with Python) for better distribution and mobile support, and the rest I kept on TensorFlow in Python. My Swift models were dropped as it became a burden to develop them any further.

Swift became my primary language for the server, with Go as my infrastructure language. Rust became my squeeze-it-in-where-I-can language, always proving successful, but often proving frustrating.

DLVM Arrives… Almost

Eventually Richard announced another project called DLVM. DLVM was based on LLVM, with the aim to provide compiler infrastructure for deep learning systems. It was ideal for what I wanted to do, and I messaged Richard to offer my support when it was ready for experimentation and further development.

DLVM started as a research project at the University of Illinois, and syntactically it looked beautiful to me. I waited, with growing frustration, for NNKit, the DLVM project’s package for building neural networks with Swift, to arrive in the public sphere. It never did.

TensorFlow Dev Summit

On March 30, 2018, Richard took to the stage at TensorFlow Dev Summit after both he and Chris Lattner had teased their talk on Twitter. I didn’t expect much, and I began to doubt whether Swift would be mentioned, with TensorFlow being a Google project, and Google being notoriously steadfast in its support for specific languages. What they announced was more than I could have hoped for.

In the months preceding TensorFlow Dev Summit, Chris had been pushing for dynamic language support in Swift, specifically for Python, and these weren’t language changes I was personally keen on. However, I could see the use cases, so I stayed away from those discussions. Had I known what the goal was, I would have wholeheartedly supported it.

Swift for TensorFlow is an amazing step forward for both TensorFlow and Swift. It provides optimizations that Python can’t provide on TensorFlow, and enables Swift to be used in a space I think it can truly provide value.

Over 2 years after the original issue was raised, and over a year since Richard joined the community efforts to add TensorFlow support to Swift, Richard announced he was finally closing the issue.

Of course, I’ve already joined the community group for Swift for TensorFlow, and if you’re interested in deep learning and Swift, I recommend you do the same. I’m looking forward to diving into some rabbit holes and experiments this April, and for the next few months as the community grabs Swift for TensorFlow and begins to make it a first-class option for developing high-performance deep learning solutions.

We’ll be using it internally for experiments, with a roadmap in place to migrate our neural networks away from pure Python models between June and October, if Swift for TensorFlow achieves what we expect it to. This may also result in us dropping Caffe2 altogether during the same period.