Understanding the Zig Programming Language

Have problems wrapping your head around the Zig programming language? Here is a guide.

Erik Engheim
Nov 9, 2020 · 17 min read

So you are curious about Zig but you don’t know exactly where to start. Maybe you read the introduction at the Zig web site, and maybe you even began reading ziglearn.org, but still feel a little lost.

A first stumbling block for people is the idea that Zig is a C-like language:
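A minimal sketch of a small Zig program, with a made-up sum helper used purely for illustration, looks roughly like this:

```zig
const std = @import("std");

// Hypothetical helper, only here to show what a function definition looks like.
fn sum(values: []const i32) i32 {
    var total: i32 = 0;
    for (values) |v| {
        total += v;
    }
    return total;
}

pub fn main() void {
    const numbers = [_]i32{ 1, 2, 3, 4 };
    std.debug.print("sum = {d}\n", .{sum(&numbers)});
}
```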

But many people looking at the code above will say this doesn’t look very C-like at all. What is up with the pub and fn keywords, for example? And what about var in front of variables? That looks like JavaScript!

This means we really have to define what specifically we mean when we say that Zig is a C-like language. A lot of the syntax in Zig is indeed different from C, but what we really mean is how the language works. The best way to give a sense of this is to begin by talking about what Zig doesn’t do.

Language Features Zig Doesn’t Have

Unlike most modern high level languages Zig does not have any of these features:

  • Classes and inheritance. There is no class keyword or syntax to express inheritance.
  • Interfaces or protocols. No syntax to define something akin to a Java or Go interface, or a Swift protocol.
  • Runtime polymorphism.
  • Exceptions. Zig uses error codes, but in a more clever way than Go.
  • Constructors and destructors, meaning there is no RAII.
  • Function overloading. You cannot write a function with the same name but with different arguments multiple times, like in e.g. C++.
  • Operator overloading. Operators such as +, -, / etc can never change meaning.
  • Garbage collection. You must manually allocate and deallocate memory in Zig.
  • Closures or lambdas. There is no way to define a function inline which captures some external state.
  • Generics.
  • String type.

This is a pretty long list of fairly normal features found in languages such as Java, C#, C++, Swift, Rust, Go and many others.

What does Zig do then?

Uncommon Features Zig Does Have

The list of things Zig doesn’t do coincides very well with what the C language also doesn’t do. What about C features uncommon in other languages? Which of these does Zig have?

  • Pointers and pointer arithmetic. Most languages today forbid this and often have dumbed down or heavily restricted versions such as Java references.
  • Manual memory allocation and deallocation.
  • Get address of arbitrary data structure.

These two lists may give you a better idea of why Zig can be called a C-like language. The feature set is very similar. In fact, let us do some things which are uncommon in other languages but common in C:
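A minimal sketch of such pointer use (the demo wrapper function is an assumption, added only so the snippet can be run and tested):

```zig
const std = @import("std");

// Hypothetical wrapper around the three statements of interest.
fn demo() i32 {
    var x: i32 = 4; // a 32-bit integer initialized to 4
    const ptr: *i32 = &x; // a pointer to x
    ptr.* = 15; // dereference the pointer and assign through it
    return x;
}

pub fn main() void {
    std.debug.print("x = {d}\n", .{demo()});
}
```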

This creates a variable x of type i32, which is a 32-bit integer (int) in Zig, and initializes it to 4. Next we create a pointer ptr to items of type i32 and initialize it to point to x. Later we dereference the pointer to assign 15 to x.

While the syntax is somewhat different, semantically speaking any C programmer would be familiar with these kinds of operations.

Why change the C syntax? The new syntax brings a number of advantages and consistencies when you want your language to support type inference. Remember that C was made before type inference was widely used.

The benefit of using a keyword such as fn for functions and var for variables in front of every identifier is that it makes creating tools and working with the language a lot easier. E.g. if I want to look up the definition of the readAll function in Zig, I can just search through the standard library for fn readAll. That text string cannot be mistaken for a call to that function or a variable named readAll.

But enough digressions. If Zig basically has all the same features as C, then what is the point? Why not stick with C? Thus far I have undersold Zig, because it does in fact come with some unique new features and equally important it removes certain features from C.

What Zig adds and removes from C

C has a very weak type system, and this is an area Zig drastically changes. Type checking is much stricter in Zig and it even bans null pointers. Or rather you can have null but you have to specifically allow it.

Optional Values

Here is an example from the linked list implementation in Zig:
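A simplified sketch in the spirit of the standard library's singly linked list (this is not its actual source; the Node and List types below are cut down for illustration):

```zig
const std = @import("std");

// Cut-down node type, assumed for this sketch.
const Node = struct {
    data: i32,
    next: ?*Node = null,
};

const List = struct {
    first: ?*Node = null,

    // The list may be empty, so the result is an optional pointer.
    pub fn popFirst(list: *List) ?*Node {
        const first = list.first orelse return null;
        list.first = first.next;
        return first;
    }
};

pub fn main() void {
    var node = Node{ .data = 7 };
    var list = List{ .first = &node };
    if (list.popFirst()) |n| std.debug.print("popped {d}\n", .{n.data});
    if (list.popFirst() == null) std.debug.print("list is empty\n", .{});
}
```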

When you pop the first node in a linked list, there may not be anything there, so we must allow null to be returned. That is why the return type is marked as ?*Node. The ? prefix indicates that the pointer is allowed to be null.

We can contrast this with prepend, which adds a node to the front:
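Again as a simplified sketch rather than the standard library source, with the same cut-down Node and List types assumed:

```zig
const std = @import("std");

const Node = struct {
    data: i32,
    next: ?*Node = null,
};

const List = struct {
    first: ?*Node = null,

    // new_node is *Node, not ?*Node, so a caller cannot pass null.
    pub fn prepend(list: *List, new_node: *Node) void {
        new_node.next = list.first;
        list.first = new_node;
    }
};

pub fn main() void {
    var a = Node{ .data = 1 };
    var b = Node{ .data = 2 };
    var list = List{};
    list.prepend(&a);
    list.prepend(&b);
    std.debug.print("first = {d}\n", .{list.first.?.data});
}
```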

In this case you see the type of new_node is *Node and that is because it makes no sense to be adding null pointers to a linked list.

If you are familiar with other languages that don’t have null pointers such as Rust, Swift or Kotlin, then you should be familiar with ways of dealing with optional values. It is the same in Zig. You cannot do anything with a pointer that is potentially null in Zig. You have to unwrap it before Zig lets you do anything with it. One way is using orelse:
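A small sketch of orelse, assuming a cut-down List type and two hypothetical helper functions:

```zig
const std = @import("std");

const Node = struct {
    data: i32,
    next: ?*Node = null,
};

const List = struct {
    first: ?*Node = null,
};

// Hypothetical helper: unwrap list.first, or leave the function early with null.
fn firstData(list: List) ?i32 {
    const first = list.first orelse return null;
    return first.data; // first is a plain *Node from here on
}

// Hypothetical helper: the same unwrapping, but with a fallback value.
fn firstDataOr(list: List, fallback: i32) i32 {
    const first = list.first orelse return fallback;
    return first.data;
}
```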

If list.first contains a null then return null will be executed in this case. You can read more about optional types in the Zig documentation. To give you a quick sense of how it typically works: a number of control flow statements in Zig have this form to deal with optionals:
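As a sketch, with a hypothetical Countdown iterator standing in for any type whose next() returns an optional:

```zig
const std = @import("std");

// Hypothetical iterator that yields remaining, remaining - 1, ..., 1 and then null.
const Countdown = struct {
    remaining: u32,

    pub fn next(self: *Countdown) ?u32 {
        if (self.remaining == 0) return null;
        const value = self.remaining;
        self.remaining -= 1;
        return value;
    }
};

fn sumCountdown(start: u32) u32 {
    var it = Countdown{ .remaining = start };
    var total: u32 = 0;
    // The body runs for as long as next() returns a non-null value.
    while (it.next()) |value| {
        total += value;
    }
    return total;
}

fn describe(maybe: ?u32) []const u8 {
    // The first branch only runs when maybe holds a value.
    if (maybe) |value| {
        return if (value == 0) "zero" else "non-zero";
    } else {
        return "nothing";
    }
}
```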

The |value| part unwraps an optional within the conditional. So e.g. this gives the opportunity to easily use while with an iterator. The code block is repeated only as long as it is possible to unwrap an optional. Thus as soon as the next() function in the while loop returns null, the iteration stops.

You can see some examples from a simple assembler, Zacktron33, which I am writing in Zig. Ignore all the weird syntax I have not introduced yet. Just focus on the while loop and the second if statement:

This code reads one line at a time from a file represented by the reader object. If there is no line, it will return null. As long as there is a line it will be captured and stored in |tmp_line|, and the loop will be executed.

Another use of this is the following line:
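The assembler source itself is not reproduced here. A sketch of the same idea, wrapped in a hypothetical labelOf helper:

```zig
const std = @import("std");

// Hypothetical helper: returns the text in front of ':' or null if the line has no ':'.
fn labelOf(line: []const u8) ?[]const u8 {
    if (std.mem.indexOf(u8, line, ":")) |i| {
        return line[0..i];
    }
    return null;
}

pub fn main() void {
    if (labelOf("loop: ADD x1")) |label| {
        std.debug.print("label = {s}\n", .{label});
    }
}
```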

Here I am looking for the : character inside the line I read. If it isn't there then indexOf will return null. If it is there I will get the index of the position, which will get captured in |i|.

Memory Management with Allocators and Defer

While Zig has manual memory allocation and deallocation, it makes them much easier to deal with than in C. First of all, by convention every Zig function which may allocate memory takes an allocator as an argument, unless it belongs to an object which was initialized with an allocator.

Still, the bottom line is that no function in Zig is supposed to decide by itself how to allocate memory straight from the heap. This gives you strong control over memory allocation. It means that if you e.g. write micro-controller code and there is no operating system, then you can simply turn a chunk of memory into a big buffer, point an allocator to this buffer and pass this allocator to every standard library call you make. That way, the standard library has no dependency on the existence of a heap.
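A minimal sketch of that idea, assuming a recent Zig release where std.mem.Allocator is passed by value. The repeat function is hypothetical, and a plain stack array plays the role of the chunk of memory:

```zig
const std = @import("std");

// Hypothetical function: it does not pick an allocator itself, the caller supplies one.
fn repeat(allocator: std.mem.Allocator, text: []const u8, times: usize) ![]u8 {
    const out = try allocator.alloc(u8, text.len * times);
    var i: usize = 0;
    while (i < out.len) : (i += 1) {
        out[i] = text[i % text.len];
    }
    return out;
}

pub fn main() !void {
    // A fixed chunk of memory; no heap is involved.
    var buffer: [64]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    const text = try repeat(allocator, "ab", 3);
    defer allocator.free(text);
    std.debug.print("{s}\n", .{text});
}
```

Swapping in another allocator, such as std.heap.page_allocator, requires no change to repeat.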

The second innovation is “stolen” from Go, because it was such a great idea. These are the defer and errdefer statements. They allow one or more statements to be executed upon exit from the current scope. defer runs regardless of how the scope is left. errdefer runs only if you return an error code, meaning there was an error. Here is an example:
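The assembler source is not reproduced here, so the following is only a sketch of the pattern. The body of readSymTable, its signature, the freeSymTable helper and the two label names are all assumptions; a real implementation would scan the source file for labels:

```zig
const std = @import("std");

// Frees every key string first, then the dictionary itself.
fn freeSymTable(allocator: std.mem.Allocator, labels: *std.StringHashMap(u16)) void {
    var keys = labels.keyIterator();
    while (keys.next()) |key| allocator.free(key.*);
    labels.deinit();
}

// Hypothetical stand-in: stores two made-up labels with consecutive addresses.
fn readSymTable(allocator: std.mem.Allocator) !std.StringHashMap(u16) {
    var labels = std.StringHashMap(u16).init(allocator);
    errdefer freeSymTable(allocator, &labels); // only runs if we return an error

    const names = [_][]const u8{ "loop", "done" };
    var address: u16 = 0;
    for (names) |name| {
        const key = try allocator.dupe(u8, name);
        errdefer allocator.free(key); // key is not yet owned by the map
        try labels.put(key, address);
        address += 1;
    }
    return labels;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    var labels = try readSymTable(allocator);
    // Runs when main exits, whether or not a later step fails.
    defer freeSymTable(allocator, &labels);

    std.debug.print("done is at address {d}\n", .{labels.get("done").?});
}
```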

In my Zacktron33 assembler I first read all the labels in the program and store them along with their addresses, before performing the actual assembly. I do this with the readSymTable function. It looks for labels in the file and puts them into a dictionary along with their addresses. This is returned as labels.

Since Zig functions, by convention, are not allowed to allocate anything without using an allocator provided to them, we know that the dictionary returned must have had entries allocated using allocator.

Thus the defer is put right after to make sure that when we exit the function we are in, the dictionary entries will be freed from memory. The keys are strings so I have to loop over them and release them, before releasing the dictionary itself.

In this case it may be worth arguing that keys should have been copied and the labels.deinit() should have taken care of releasing them. However that may also have gone against the Zig philosophy of being very transparent about what is being done. Personally I am on the fence here. I have not programmed enough Zig to internalize the Zig philosophy yet.

Smart Error Codes

Zig takes a leaf from Go’s book and handles errors using error return codes. But there is a twist and I think Zig’s solution is a lot better.

Go bases its solution for error codes on having multiple return values, while Zig utilizes the fact that it has tagged unions, so returned error codes cannot be ignored.

This works similarly to optional values. E.g. ?i32 means a 32-bit integer or null, but for errors we write !i32, which means a 32-bit integer or an error code.

We can be more specific and specify which error codes we expect should be returned.
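A minimal sketch, using a hypothetical DigitError set and parseDigit function:

```zig
const std = @import("std");

// Hypothetical error set for a tiny digit parser.
const DigitError = error{ Empty, NotADigit };

// The return type names exactly which errors can come back.
fn parseDigit(text: []const u8) DigitError!u8 {
    if (text.len == 0) return DigitError.Empty;
    const c = text[0];
    if (c < '0' or c > '9') return DigitError.NotADigit;
    return c - '0';
}

// With a bare ! the compiler infers the error set from the function body.
fn parseDigitInferred(text: []const u8) !u8 {
    return parseDigit(text);
}
```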

The error keyword works very similarly to an enum in Zig. The difference is that error codes form sets which can be mixed and merged, and they can be inferred, so you don't need to specify which error set a function returns. This is a bigger topic, so I don't want to go too much into detail on how this works. You can read more details on ziglearn.org.

Instead I want to clarify the advantages this system brings. Returned error codes cannot be ignored in Zig. To do so would produce a compilation error. However, dealing with them is easy.

Let us have a look at a function I wrote for splitting an integer into its decimal number digits. Basically if you give it 345 it will return an array [5, 4, 3] with the digits from least significant to most significant.
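The original listing is not reproduced here. As a sketch of such a function, the version below grows a plain slice with realloc instead of using ArrayList(u32). That substitution is an assumption made so the code does not depend on the ArrayList API, which differs between Zig releases. A small main shows the calling side:

```zig
const std = @import("std");

// Sketch: returns the decimal digits of n, least significant first.
fn decimals(allocator: std.mem.Allocator, n: u32) ![]u32 {
    var digits = try allocator.alloc(u32, 0);
    // Only runs if we leave with an error, e.g. when realloc fails below.
    errdefer allocator.free(digits);

    var rest = n;
    while (true) {
        digits = try allocator.realloc(digits, digits.len + 1);
        digits[digits.len - 1] = rest % 10;
        rest /= 10;
        if (rest == 0) break;
    }
    return digits;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const digits = try decimals(allocator, 345);
    defer allocator.free(digits); // cleanup when main exits

    for (digits) |d| std.debug.print("{d} ", .{d});
    std.debug.print("\n", .{});
}
```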

This function returns either an error code or an array of unsigned 32-bit integers (ArrayList(u32)) containing our digits. As you saw with the if, while and for statements, we have a similar construct with catch to grab an error code.

Not catching a potential returned error is a compilation error, which forces you to handle it. Because catching and returning errors is so common, Zig has a shortcut for it. These two statements are the same:
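A sketch of the equivalence, with a hypothetical mayFail function:

```zig
const std = @import("std");

// Hypothetical function that either fails or returns 42.
fn mayFail(ok: bool) !u32 {
    if (!ok) return error.Failed;
    return 42;
}

// Long form: catch the error and return it to our own caller.
fn withCatch(ok: bool) !u32 {
    const value = mayFail(ok) catch |err| return err;
    return value + 1;
}

// Short form: try does exactly the same thing.
fn withTry(ok: bool) !u32 {
    const value = try mayFail(ok);
    return value + 1;
}
```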

This is why you see try littered around Zig code. Every place there is a try, you know that the function being called can potentially return an error, which is passed on to be handled at the calling site.

Here is an example of how that looks at the calling site:

Notice in this case we don’t actually handle the error, we just return it from main which we indicate as returning nothing or an error code !void.

By putting try in front of decimals we just return the error code from main in case there was one.

Smart Cleanup with Defer and Errdefer

The previous example also shows how defer helps us with cleanup. Once the defer statement has been reached, digits.deinit() will be called no matter how we leave the enclosing scope, even if a later call fails and returns an error. Here that scope is main.

However this happens regardless of whether there was an error or not. To differentiate we have the errdefer. Look at the first part of the decimals function:

We don’t write defer digits.deinit(), because then the array would be released before we could return it to the caller. We don't want that. However, errdefer causes deinit() to be called only in the cases where an error code is returned. In that case we cannot use the digits array anyway, since it is incomplete and never returned, so we want to release the memory it uses.

Thus, with the combination of error codes which cannot be ignored, defer and errdefer, dealing with error situations and doing cleanup properly is dramatically improved relative to C programming.

Namespaces using Struct

A major challenge in writing modular C code is that C has no kind of namespace system. The solution has typically been to use prefixes. E.g. all the functions in the SDL library have names such as SDL_PollEvent, SDL_UpdateWindowSurface and SDL_BlitSurface, while in the GTK library they have names such as gtk_application_new and gtk_button_new_with_label.

In C++ this is solved with the namespace keyword which lets you enclose various code inside a namespace. Before namespaces people would often simply use classes as namespaces, by declaring static functions inside classes.

Zig basically follows a variant of the latter approach. The result gives Zig something that looks like classes, but which really aren’t. Here is an example of defining a 2D vector in the plane:
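A sketch of such a struct. The field names dx and dy follow the text further down, while the f32 field type and the small main are assumptions:

```zig
const std = @import("std");

const Vec2D = struct {
    dx: f32,
    dy: f32,

    // A plain function that happens to live in the struct's namespace.
    pub fn add(u: Vec2D, v: Vec2D) Vec2D {
        return Vec2D{ .dx = u.dx + v.dx, .dy = u.dy + v.dy };
    }
};

pub fn main() void {
    const u = Vec2D{ .dx = 1, .dy = 2 };
    const v = Vec2D{ .dx = 3, .dy = 4 };

    const a = Vec2D.add(u, v); // explicit namespace
    const b = u.add(v); // shorthand, resolved at compile time
    std.debug.print("{d} {d} {d} {d}\n", .{ a.dx, a.dy, b.dx, b.dy });
}
```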

In terms of memory layout and usage, this will be almost identical to a C struct. The function add does not literally exist inside the struct at runtime. The struct is used as a namespace, so I could add two vectors u and v by writing Vec2D.add(u, v).

However, since the compiler knows the type of u at compile time, it can insert this "namespace" automatically. This is done with the shorthand u.add(v).

There is no dynamic dispatch going on here. There is no lookup in a vtable at runtime for an add method for the type of the u variable. No, this is just a convenient alternative syntax for calling what is a plain function.

In Zig you can nest structs to create nested namespaces. This is used in the Zig standard library extensively.
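A small sketch. The std.mem.Allocator line mirrors what the text describes, while the geometry and plane namespaces are made up for illustration:

```zig
const std = @import("std");

// A type pulled out of a nested namespace: std, then mem, then Allocator.
const Allocator = std.mem.Allocator;

// Hypothetical nested namespaces built the same way.
const geometry = struct {
    pub const plane = struct {
        pub const Point = struct { x: i32, y: i32 };

        pub fn origin() Point {
            return Point{ .x = 0, .y = 0 };
        }
    };
};

// Giving the nested type a shorter name is just another constant.
const PlanePoint = geometry.plane.Point;

pub fn main() void {
    const p: PlanePoint = geometry.plane.origin();
    std.debug.print("{d}, {d}\n", .{ p.x, p.y });
}
```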

You can see that the Allocator type is defined inside a struct named mem, which provides a namespace for it.

You will also notice the odd-looking practice of assigning types to constants. The Allocator constant is assigned the value of std.mem.Allocator. This is a key feature of Zig. Structs don't really have names. They are anonymous.

You simply assign the struct to one or more constants to use it later by name. This helps explain the apparently odd syntax for importing libraries, const std = @import("std");

@import returns a struct containing all the functions and types defined in the standard library and assigns this struct to the constant named std. However in theory we could have assigned it to any other name.

Replacing C Macros with Compilation Time Code

This brings us to the other key innovation of Zig over plain C: code which can run at compilation time, rather than at runtime. This removes the need for C style preprocessor macros which cause a lot of problems for C programmers.

In Zig code can run at compilation time. Code that runs at compilation time can deal with objects which normally only exist at compilation time in a statically typed language such as types as first class objects.

Thus since a struct is a type, you can deal with it like any other object at compilation time in Zig. We can use this to emulate generics:
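A minimal sketch of such a generic vector, using the dx and dy field names from the text:

```zig
const std = @import("std");

// A function that takes a type and returns a new struct type.
fn Vec2D(comptime T: type) type {
    return struct {
        dx: T,
        dy: T,
    };
}

pub fn main() void {
    const v = Vec2D(i32){ .dx = 2, .dy = 3 };
    const w = Vec2D(f64){ .dx = 0.5, .dy = 1.5 };
    std.debug.print("{d} {d} {d} {d}\n", .{ v.dx, v.dy, w.dx, w.dy });
}
```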

With the comptime keyword we tell Zig that the parameter that follows must be known at compilation time. This is required for types, since they do not exist at runtime. But you could specify this for a parameter of any type. You could also have required e.g. that a string was known at compilation time. But since strings also exist at runtime, that would not strictly be necessary.

In this case Vec2D is actually not a type, but rather a function which takes a type as an argument and returns a type. Here you can see the advantages of anonymous structs which can be passed around just like any other object. We create a struct where the fields dx and dy are of some placeholder type T and return this type.

To explain how this works, let us look at this line:

What exactly happens here? Before compilation is completed we run compile time functions such as Vec2D which returns a struct, which means we end up compiling something that looks like this:

Meaning we define a struct and then instantiate it with the given values for each field. comptime has far reaching consequences and I cannot cover all of them here in a story just meant as an introduction.

But let me explain some of the things it can do. In C, the preprocessor is used to evaluate different code to compile for different platforms using things like #ifdef. There is no need for that in Zig. If an if-statement makes a decision based on a value known at compilation time, then the compiler knows what code paths will never be taken, and that code is not compiled at all. Thus it is easy to include platform specific code.
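A sketch of platform-specific code without a preprocessor. The pathSeparator function is hypothetical, and builtin.os.tag is the compile-time description of the target operating system:

```zig
const std = @import("std");
const builtin = @import("builtin");

// The condition is known at compile time, so only one branch ends up in the binary.
fn pathSeparator() u8 {
    if (builtin.os.tag == .windows) {
        return '\\';
    } else {
        return '/';
    }
}

pub fn main() void {
    std.debug.print("separator: {c}\n", .{pathSeparator()});
}
```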

Another great example is how stdout.print is implemented in Zig. It works a lot like printf in that you can provide a formatting string. However unlike C, the formatting string is required to be known at compilation time. Here is the implementation of print in the standard library:

Let us skip what the Self type is for now. What you can see here is that the second argument, format, which is the formatting string, is specified as being comptime.

format: []const u8 is roughly Zig's way of saying const char format[]. Instead of char for bytes we write u8, which is short for unsigned 8-bit integer.

The Zig code that looks at the format string will run at compilation time leaving only the code that cannot be determined at compilation time.

The args will typically be provided as a tuple, which in Zig you can write like:

This is treated much like an array, except everything about it is known at compilation time such as the type at every index and the length of the tuple.

You can define a struct object through type inference in much the same manner:

We don’t necessarily know the value of every element in the tuple at compilation time. E.g. we don’t necessarily know the value of foobar, but we will know the type.
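A short sketch of both literals. The argCount and sumPoint functions are hypothetical, and foobar is a value only known at runtime:

```zig
const std = @import("std");

// A tuple: the element types and the length are known at compile time,
// even though the value of foobar is not.
fn argCount(foobar: i32) usize {
    const args = .{ foobar, "hello", true };
    return args.len;
}

// An anonymous struct literal: the field names and types are inferred.
fn sumPoint(foobar: i32) i32 {
    const point = .{ .x = foobar, .y = 2 };
    return point.x + point.y;
}
```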

This allows Zig to produce code that will handle this specific tuple being printed out. At compilation time it will make sure that the types and number of elements specified in the formatting string match the types and number of elements in the tuple.
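To give a feel for this kind of check, here is a deliberately naive sketch. It is not how std.fmt is implemented: checkedPrint and countPlaceholders are hypothetical, and the counting ignores escaped braces such as {{.

```zig
const std = @import("std");

// Naive: counts every '{' as one placeholder.
fn countPlaceholders(comptime format: []const u8) usize {
    var count: usize = 0;
    for (format) |c| {
        if (c == '{') count += 1;
    }
    return count;
}

// Hypothetical wrapper: rejects a mismatch while compiling, then prints.
fn checkedPrint(comptime format: []const u8, args: anytype) void {
    comptime {
        const expected = countPlaceholders(format);
        const given = std.meta.fields(@TypeOf(args)).len;
        if (expected != given) {
            @compileError("placeholder count does not match argument count");
        }
    }
    std.debug.print(format, args);
}

pub fn main() void {
    checkedPrint("{d} + {d} = {d}\n", .{ 1, 2, 3 });
    // checkedPrint("{d} + {d}\n", .{1}); // would fail to compile
}
```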

The tuple is specified as anytype because Zig cannot know the exact type of the tuple. Depending on the length and the type of each element the tuple will be of a different type each time.

However, this is not some kind of dynamic dispatch. It isn’t the same as, say, var in JavaScript. Upon compilation the compiler will figure out exactly what type args is. In the implementation of std.fmt.format, which print calls, there is compile time code which checks the type of args to make sure it is of a type it can deal with. Here is a cutout of that section:

You can see that at compilation time we use @TypeOf to get the type of the arguments, and then we use @typeInfo to get a value which contains information about this type.

Object-Oriented Programming in Zig

While Zig has some fancy features it is important to not forget that Zig is really just an advanced version of C. As highlighted in the introduction, there are no classes, interfaces or inheritance in Zig.

If you want that you have to build such a system yourself from scratch. That is essentially how e.g. the GTK library works. It is an object-oriented GUI library written in C.

Zig is the same. You roll your own. In the standard library you can find many different approaches to this.

We can see one example of this with the Random generators. The base interface Random is basically defined like this:

What you see here is that Random has a field fillFn, which is a function pointer. All the functions in this interface to our random generators then use this function pointer. E.g. you can see this in how bytes is implemented. int is in turn built on top of bytes. There are several other functions in Random, not shown here, which use it as well.

What you can think of as a subclass of Random provides a concrete implementation of fillFn as shown here:

It is the init function which essentially does the "inheritance" by assigning its own fill function to the fillFn function pointer field.

What we could call the base class is then stored in the random field of SequentialPrng. Here is an example of using this random generator:
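The standard library listings are not reproduced here. The following is a stripped-down sketch of the same pattern, not the actual std source. It assumes Zig 0.12 or later for the two-argument form of @fieldParentPtr, and the byte function and the counting fill are made up for illustration:

```zig
const std = @import("std");

// The "interface": a struct holding a function pointer.
const Random = struct {
    fillFn: *const fn (r: *Random, buf: []u8) void,

    pub fn bytes(r: *Random, buf: []u8) void {
        r.fillFn(r, buf);
    }

    // Built on top of bytes, as int is in the text.
    pub fn byte(r: *Random) u8 {
        var buf: [1]u8 = undefined;
        r.bytes(&buf);
        return buf[0];
    }
};

// The "subclass": embeds Random and supplies its own fill function.
const SequentialPrng = struct {
    const Self = @This();

    random: Random,
    next_value: u8,

    pub fn init() Self {
        return Self{ .random = Random{ .fillFn = fill }, .next_value = 0 };
    }

    fn fill(r: *Random, buf: []u8) void {
        // Recover the enclosing SequentialPrng from a pointer to its random field.
        const self: *Self = @alignCast(@fieldParentPtr("random", r));
        for (buf) |*b| {
            b.* = self.next_value;
            self.next_value +%= 1;
        }
    }
};

pub fn main() void {
    var prng = SequentialPrng.init();
    var buf: [4]u8 = undefined;
    prng.random.bytes(&buf);
    std.debug.print("{any}\n", .{buf});
}
```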

The point of this is much the same as with interfaces in general. You could write a function which takes a Random pointer as an argument, and your function does not have to concern itself with how random numbers are generated.

We need to comment on a couple of things. E.g. since structs are anonymous we need a way to refer back to them. That is what @This() is used for. All functions with the @ prefix are compiler intrinsics. They tie into the Zig compiler. Typically they are functions which can only run at compile time. We know at compilation time what the type of the enclosing struct is.

By convention Zig programmers often write const Self = @This(); inside the struct.

This way the init function of SequentialPrng can say that it returns an object of type Self rather than SequentialPrng. It means the same thing.

Final Thoughts

The fairly sophisticated type system in Zig can give the impression that Zig is a far more high level language than it actually is. Thus you end up hunting for things like classes, interfaces, traits or stuff that simply does not exist in Zig when learning it.

What helps when programming Zig is to keep in mind that features which imply hidden runtime behavior or control flow will generally not exist in Zig. Zig has sophisticated behavior at compilation time, but very simple behavior at runtime.

Written by Erik Engheim

Geek dad, living in Oslo, Norway with passion for UX, Julia programming, science, teaching, reading and writing.

Published in The Startup
