FROM THE ARCHIVES OF PRAGPUB MAGAZINE, JANUARY 2019

From The Pragmatic Bookshelf, Hot off the Presses: Systems Programming in the Twenty-First Century

by Michael Swaine

PragPub
The Pragmatic Programmers

--

The Pragmatic Bookshelf is the publishing imprint of The Pragmatic Programmers, LLC. PragPub is associated with the company, which Andy Hunt and Dave Thomas founded in the early part of this century with a simple goal: to improve the lives of developers. We create timely, practical books, audio books, and videos on classic and cutting-edge topics to help you learn and practice your craft.

Here’s what’s up with the Prags authors. Let’s start with an excerpt from an upcoming Pragmatic Bookshelf book.

Systems Programming in the Twenty-First Century

Why learn systems programming in the twenty-first century? It’s a fair question. When I learned C at the turn of the century, low-level languages like C and C++ were already falling out of favor and being rapidly supplanted by high-level languages such as Ruby and Java. In the intervening years that trend has only accelerated, with functional programming languages such as Clojure, Elixir, Elm, Haskell, and Scala becoming more prominent, and C receding even further from day-to-day relevance.

And yet, C remains at the heart of modern computing: it’s in our operating systems, our network stack, our language implementations, our virtual machines, and our web browsers. When performance is critical and resources are constrained, we still fall back on the techniques of low-level programming. However, in this book, I’ll show you that you don’t have to choose between the ergonomics of modern languages and the performance of systems programming. With Scala Native, you get to have both.

So what is systems programming?

As a working definition, systems programming is the art of writing code while remaining aware of the properties and limitations of the machine that will run it. For all the complexity of modern software, computers are still surprisingly simple devices at the machine-code level.

We can identify five fundamental data structures for working on bare metal:

• Primitive data types — integers, floating-point numbers, and raw bytes, that can be directly represented in machine code

• Low-level byte strings — a way of representing textual data of variable length directly in a computer’s memory

• Structs — a compound data type that arranges named fields in memory in a fixed way

• Array layout — data of the same type that are arranged in a grid-like fashion, one after another

• Pointers — a numeric representation of the location in memory of some other piece of data

These five concepts are profoundly interrelated. In this book, I’ll introduce them gradually through a series of real-world examples. As you attain more proficiency, the deep connections between these concepts will let you write simple, powerful programs that vastly outperform what you can achieve with regular JVM Scala.
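To make the connections between these concepts concrete, here is a minimal sketch in plain Scala rather than Scala Native. It is an illustration under stated assumptions, not code from the book: a java.nio.ByteBuffer stands in for raw memory, an Int offset plays the role of a pointer, and the record layout and all names (MemoryLayoutSketch, writeRecord, elementPtr, and so on) are hypothetical.

```scala
import java.nio.{ByteBuffer, ByteOrder}
import java.nio.charset.StandardCharsets

// Plain-Scala illustration only: a ByteBuffer stands in for raw memory,
// and an Int offset into it plays the role of a pointer.
object MemoryLayoutSketch {
  // Hypothetical struct layout: a 4-byte Int id followed by an 8-byte
  // Double score, so each record occupies 12 bytes.
  val RecordSize = 12

  // Write one record at the given offset (our stand-in for a pointer).
  def writeRecord(mem: ByteBuffer, ptr: Int, id: Int, score: Double): Unit = {
    mem.putInt(ptr, id)           // first field lives at ptr + 0
    mem.putDouble(ptr + 4, score) // second field lives at ptr + 4
  }

  def readId(mem: ByteBuffer, ptr: Int): Int = mem.getInt(ptr)
  def readScore(mem: ByteBuffer, ptr: Int): Double = mem.getDouble(ptr + 4)

  // An array is just records placed one after another, so element i
  // starts at base + i * RecordSize.
  def elementPtr(base: Int, i: Int): Int = base + i * RecordSize

  // A C-style byte string: the bytes of the text followed by a zero byte.
  def writeCString(mem: ByteBuffer, ptr: Int, s: String): Unit = {
    val bytes = s.getBytes(StandardCharsets.US_ASCII)
    var i = 0
    while (i < bytes.length) { mem.put(ptr + i, bytes(i)); i += 1 }
    mem.put(ptr + bytes.length, 0.toByte)
  }

  // Read bytes until the terminating zero byte.
  def readCString(mem: ByteBuffer, ptr: Int): String = {
    val sb = new StringBuilder
    var p = ptr
    while (mem.get(p) != 0) { sb.append(mem.get(p).toChar); p += 1 }
    sb.toString
  }

  def main(args: Array[String]): Unit = {
    val mem = ByteBuffer.allocate(64).order(ByteOrder.LITTLE_ENDIAN)
    writeRecord(mem, elementPtr(0, 0), 1, 0.5) // occupies bytes 0 to 11
    writeRecord(mem, elementPtr(0, 1), 2, 1.5) // occupies bytes 12 to 23
    writeCString(mem, 24, "hello")             // stored after the array
    println(readId(mem, elementPtr(0, 1)))     // 2
    println(readScore(mem, elementPtr(0, 1)))  // 1.5
    println(readCString(mem, 24))              // hello
  }
}
```

Every operation here reduces to reading or writing primitive values at a computed offset, which is why the five concepts are so hard to separate in practice.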

Even if you rarely write low-level code, the knowledge and insight you gain from learning systems programming pay dividends: essential everyday tasks, like tuning and debugging systems, interpreting complex error messages, and predicting the performance characteristics of complex systems, become much easier and more accurate when you have a solid knowledge of the fundamental principles by which computers operate.

That said, this is not an eat-your-vegetables guide to systems programming. I am excited to write a systems programming book now, because of the new possibilities created by recent developments in cloud computing and distributed systems technology.

The enterprise IT world is prone to buzzwords, but the rapid adoption of Linux container technology in recent years has been genuinely transformative. By offering a simple format for packaging and deploying applications, containers, and Docker in particular, have radically altered the day-to-day workflow of working developers.

Best of all, the broad adoption of container technology has also eliminated one of the chief pain points of traditional systems programming: portability. Getting a typical C codebase to compile for the first time on a new development machine could often take days, and handling incompatibilities between different UNIX variants such as Mac OS X, Linux, and Solaris littered code with opaque macros and cryptic bugs.

In contrast, Docker containers provide a reliable, Linux-flavored execution environment that can run on any recent Windows, Mac OS, or Linux development machine, whatever programming language you use. By giving us access to reproducible builds and uniform deployments, containers truly put the “modern” in “modern systems programming.”

The other critical change that distinguishes new-style systems programming from what you’d find in a C textbook is the overwhelming emphasis on network programming in a modern cloud environment. Whereas classic systems programming books focus on file input and output (I/O), many programs written for cloud deployment will communicate over one of a few network protocols, and might never write to a file at all. That’s why this book puts practical network programming front and center. You’ll learn how TCP sockets work and how to write an HTTP client and server from scratch. By the end of the book, you’ll have designed and built a powerful, lightweight framework for RESTful microservices.
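As a small taste of what "from scratch" means here, the sketch below shows that an HTTP/1.1 request is ultimately just text sent over a TCP socket. It is plain Scala with no networking, and it is an assumption-laden illustration rather than the book's framework: the object and function names (HttpSketch, getRequest, statusCode) are hypothetical, and only the request text and status-line parsing are covered.

```scala
object HttpSketch {
  // Build the text of a minimal HTTP/1.1 GET request. Each line ends in
  // CRLF, and an empty line marks the end of the headers.
  def getRequest(host: String, path: String): String =
    s"GET $path HTTP/1.1\r\nHost: $host\r\nConnection: close\r\n\r\n"

  // Pull the numeric status code out of a status line such as
  // "HTTP/1.1 200 OK"; returns None if the line is malformed.
  def statusCode(statusLine: String): Option[Int] =
    statusLine.split(" ") match {
      case Array(version, code, _*) if version.startsWith("HTTP/") =>
        scala.util.Try(code.toInt).toOption
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    print(getRequest("example.com", "/"))
    println(statusCode("HTTP/1.1 200 OK")) // Some(200)
  }
}
```

A real client would write these bytes to a connected socket and read the response back; that part is deliberately left out of this sketch.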

But that’s enough hype from me. Before we dive into the foundations of systems programming, let’s roll up our sleeves, write some code, and take a look at Scala Native in action.

Scala Native is an ahead-of-time machine code compiler for the Scala programming language. It can take a Scala program, with traits, objects, garbage collection, and other advanced features of Scala, and translate it down to the same kind of executable machine code that a C compiler would output.

But that’s not all. On top of Scala’s support for object-oriented and functional programming, Scala Native adds powerful capabilities for working much closer to “bare metal.” In particular, it provides access to OS-level I/O and networking APIs, system-level shared libraries, and C-style memory management. With these techniques, we can often replace C code in performance-critical applications. And Scala’s capacity for clean abstraction means that we can make low-level programs more elegant and readable than ever before.

At its best, Scala Native can simultaneously exhibit both modern programming techniques and a close affinity for the underlying hardware. This expressive clarity also makes Scala Native a great way to learn systems programming for the first time.

To start, let’s set up a Scala Native project. We’ll do this much as we would for a simple Scala project: by creating a new folder with three files. The first file is a build.sbt file that describes our project:

InputAndOutput/hello/build.sbt 
name := "hello"
enablePlugins(ScalaNativePlugin)
scalaVersion := "2.11.8"
scalacOptions ++= Seq("-feature")
nativeMode := "debug"
nativeGC := "immix"

The second is a hello.scala file that contains our code:

InputAndOutput/hello/hello.scala
object main {
  // An object with a main method serves as the program's entry point.
  def main(args: Array[String]): Unit = {
    // Array[String] is a parameterized type: an array of strings.
    println("hello, world!")
  }
}

And the third is a project/plugins.sbt file that imports the actual Scala Native plugin:

InputAndOutput/hello/project/plugins.sbt 
addSbtPlugin("org.scala-native" % "sbt-scala-native" % "0.3.6")

Much as with a regular Scala program, when you enter the command sbt run, the Scala build tool (sbt) builds the project for you. After it is fully compiled, you should see the expected output of the program, like this:

/project-build> sbt run
[warn] Executing in batch mode.
[warn] For better performance, hit [ENTER] to switch to
[warn] interactive mode, or consider launching sbt without
[warn] any commands, or explicitly passing 'shell'
[info] Loading project definition from /root/project-build/project
[info] Set current project to sn-mem-hacks
[info] (build file:/root/project-build/)
[info] Compiling 1 Scala source to
[info] /root/project-build/target/scala-2.11/classes
[info] 'compiler-interface'
[info] not yet compiled for Scala 2.11.8. Compiling...
[info] Compilation completed in 13.051 s
[info] Linking (2352 ms)
[info] Discovered 1267 classes and 9344 methods
[info] Optimizing (5002 ms)
[info] Generating intermediate code (1015 ms)
[info] Produced 39 files
[info] Compiling to native code (2272 ms)
[info] Linking native code (153 ms)
hello, world
[success] Total time: 28 s, completed Mar 13, 2018 5:11:03 PM

This is exactly what we would expect from a regular Scala program: the code we used is identical to a “Hello, world” in standard Scala, and the build configuration has only added a single plugin to support Scala Native. This is good news. Scala Native is 100% Scala — it is not a variant or a new version. It is simply a plugin that gives the language some additional capabilities.

So far, you’ve seen that Scala Native can be easy to use. In many cases, it works as a drop-in replacement for mainstream JVM Scala. But what can it do for us that JVM Scala can’t?

Smaller Footprint

To see some of Scala Native’s more exceptional functionality, run sbt nativeLink. You should see output like this:

/project-build> sbt nativeLink
[info] Loading project definition from
[info] /Users/rwhaling/Documents/dev/sn-word-..
[info] Set current project to
[info] sn-word-sorter (in build file:/Users/rwhaling/D..
[info] Compiling 1 Scala source to
[info] /Users/rwhaling/Documents/dev/sn-word-sort..
[info] Discovered 1279 classes and 9445 methods
[info] Optimizing (4418 ms)
[info] Generating intermediate code (961 ms)
[info] Produced 37 files
[info] Compiling to native code (2072 ms)
[info] Linking native code (308 ms)
[success] Total time: 17 s, completed Jan 21, 2018 4:25:58 PM

If you look in your build directory at target/scala-2.11, you’ll see a 4.2MB executable file called hello-minimal-out. That’s our program! This file is a native binary: it consists of immediately executable CPU instructions, plus headers, symbol tables, and other metadata that allow your operating system to load and run it.

You can run it on its own just by typing ./target/scala-2.11/hello-minimal-out. You can copy it, move it around, and in many cases copy it to another computer intact. Because the file contains binary CPU instructions, your OS can load it into memory and run it without a virtual machine or interpreter.

In contrast, if you were to package up a standard JVM Scala “Hello, world” project, the output would be a 5.5MB file called hello-minimal-assembly-0.1-SNAPSHOT.jar. Unlike our native binary, a .jar file cannot be directly executed; instead, it must be executed by a Java Virtual Machine (JVM). Combined, the size of a JVM and an application JAR is often close to 100MB for a small app, and can rapidly increase for larger projects with complex dependencies.

Faster Startup

There’s another, even more important difference. If we time the execution of both versions of our program, we’ll see this:

~/hello-minimal/> time java -jar ./target/scala-2.11/hello-minimal-assembly.jar
hello, world!
real 0m0.350s
user 0m0.368s
sys 0m0.038s
~/hello-minimal/> time ./target/scala-2.11/hello-minimal-out
hello, world!
real 0m0.024s
user 0m0.021s
sys 0m0.003s

This is already an exciting result! Scala Native runs our “Hello, world” program in about 24 milliseconds, while our JVM program takes roughly 15 times longer, about a third of a second, to print a string out to the console. Here we’re seeing the impact of the JVM: a Java Virtual Machine is itself a large, complex program that takes time to set up and shut down, and we have to go through that process every time our tiny Scala program runs. In contrast, our native binary is a file containing machine code. Our OS can just load it into memory, point the CPU at the main method, and let it run.
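As a quick check on those timings, the ratio between the two runs can be read straight off the "real" (wall-clock) figures in the listing above. The sketch below is only a worked calculation; the object and method names are illustrative.

```scala
object StartupRatio {
  // Wall-clock ("real") times in seconds, copied from the timing listing.
  val jvmSeconds = 0.350
  val nativeSeconds = 0.024

  // How many times longer the slower run took than the faster one.
  def ratio(slow: Double, fast: Double): Double = slow / fast

  // Round to one decimal place for display.
  def rounded(x: Double): Double = math.round(x * 10) / 10.0

  def main(args: Array[String]): Unit = {
    println(rounded(ratio(jvmSeconds, nativeSeconds))) // 14.6
  }
}
```

A single run on one machine is a rough measurement, so the exact ratio matters less than its order of magnitude.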

Before we go any further, though, it’s worth taking a step back and asking,

“When does performance matter?”

For a command-line tool that a developer runs a few times an hour, a difference in startup time is a nice quality-of-life improvement. But when you’re dealing with big data, high-throughput networking, or heavy-duty I/O, efficiency is critical; improving performance or reducing resource usage can save serious amounts of money and make it possible to tackle new, harder problem domains. That level of performance isn’t always necessary; there are plenty of problems that are easily solved by higher-level programming languages. But throughout this book, we’re going to keep our focus on areas where this kind of performance can make a difference. As a result, we’re going to rapidly move from “Hello, World” to seriously big data.

Let’s dive in by exploring the foundations of systems programming, starting with input and output.

Systems Programming in the Twenty-First Century by Richard Whaling will be available as a beta book at The Pragmatic Bookshelf shortly.

And here’s what other Pragmatic Bookshelf authors are up to:

2019–01–01

Learn to build a daily writing habit so you can write articles, blog posts, whatever you need to enhance your business and reputation.

Johanna Rothman (author of Behind Closed Doors, Manage It!, Hiring Geeks That Fit, Manage Your Job Search, Predicting the Unpredictable, Manage Your Project Portfolio, Second Edition, Agile and Lean Program Management, and Create Your Successful Agile Project)

Non-Fiction Writing Workshop to Enhance Your Business (online workshop).

2019–01–02

Take your writing to the next level. Learn how to have several projects in progress, finish them and publish them.

Johanna Rothman

Secrets of Successful Non-Fiction Writers (Workshop 2).

2019–01–08

12 skills every rookie programmer should have, but often don’t.

Andy Lester (author of Land the Tech Job You Love)

Codemash 2019, Sandusky, Ohio.

2019–02–04

Diffuse your way out of a paper bag: I explain Monte Carlo models by demonstrating Brownian motion, showing various approaches to diffusing one’s way out of a paper bag. I will show animations in C++ using SFML, using std::random a fair bit.

Frances Buontempo

C++ On Sea, Folkestone, Kent, UK from 4th-6th February 2019.

2019–02–21

Code Your way out of a paper bag: A quick introduction to genetic algorithms

Frances Buontempo

nor(DEV):con 2019, Norfolk, UK.

2019–04–24

Is your project or organization’s approach to agile stuck? If so, join us for a simulation-based approach to learning what might work for you.

Johanna Rothman

Influential Agile Leader, Boston.

2019–05–11

This very special Scrum Patterns course will deepen your Scrum knowledge, and give you new insight on how to introduce Scrum piecemeal into an organization’s deepest foundations. And it will be a chance to meet the authors of the forthcoming book.

James O. Coplien

Scrum Patterns Course.

But to really be in the know about all things Pragmatic, you need to subscribe to the newsletter. It’ll keep you in the loop, it’s a fun read, and it’s free. All you need to do is create an account on pragprog.com (email address and password is all it takes) and select the checkbox to receive the weekly newsletter.

About the Author

Michael Swaine served as editor of PragPub Magazine and was Editor-in-chief of the legendary Dr. Dobb’s Journal. He is co-author of the seminal computer history book, Fire in the Valley, and an editor at Pragmatic Bookshelf.


The Pragmatic Programmers bring you archives from PragPub, a magazine on web and mobile development (by editor Michael Swaine, of Dr. Dobb’s Journal fame).