Understanding the Zig Programming Language

Have problems wrapping your head around the Zig programming language? Here is a guide.

Erik Engheim
Nov 9, 2020 · 17 min read

So you are curious about Zig but you don’t know exactly where to start. Maybe you read the introduction at the Zig web site, and maybe you even began reading ziglearn.org, but still feel a little lost.

One first stumbling block for people is the idea that Zig is a C-like language:

pub fn readAll(self: Self, buffer: []u8) Error!usize {
    var index: usize = 0;
    while (index != buffer.len) {
        const amt = try self.read(buffer[index..]);
        if (amt == 0) return index;
        index += amt;
    }
    return index;
}

But many people looking at the code above will say this doesn’t look very C-like at all. What is up with the fn, try and const keywords, for example? And what about var in front of variables? That looks like JavaScript!

This means we have to define what specifically we mean when we say that Zig is a C-like language. A lot of the syntax in Zig is indeed different from C; what we really mean is how the language works. The best way to give a sense of this is to begin by talking about what Zig doesn’t do.

Language Features Zig Doesn’t Have

Unlike most modern high level languages Zig does not have any of these features:

  • Classes and inheritance. There is no keyword or syntax to express inheritance.
  • Interfaces or protocols. No syntax to define something akin to a Java or Go interface, or a Swift protocol.
  • Runtime polymorphism.
  • Exceptions. Zig uses error codes, but in a more clever way than Go.
  • Constructors and destructors, meaning there is no RAII.
  • Function overloading. You cannot write a function with the same name but with different arguments multiple times, like in e.g. C++.
  • Operator overloading. Operators such as +, -, / etc can never change meaning.
  • Garbage collection. You must manually allocate and deallocate memory in Zig.
  • Closures or lambdas. There is no way to define a function inline which captures some external state.
  • Generics.
  • String type.

This is a pretty long list of fairly normal features found in languages such as Java, C#, C++, Swift, Rust, Go and many others.

What does Zig do then?

Uncommon Features Zig Does Have

The list of things Zig doesn’t do coincides very well with what the C language also doesn’t do. What about C features uncommon in other languages? Which of these does Zig have?

  • Pointers and pointer arithmetic. Most languages today forbid this and often have dumbed down or heavily restricted versions such as Java references.
  • Manual memory allocation and deallocation.
  • Get address of arbitrary data structure.

These two lists may give you a better idea of why Zig can be called a C-like language. The feature set is very similar. In fact, let us do something which is uncommon in other languages but common in C:

var x: i32 = 4;
var ptr: *i32 = &x;
ptr.* = 15;

This creates a variable x of type i32, which is a signed 32-bit integer in Zig, and we initialize it to 4. Next we create a pointer ptr to items of type i32. We initialize the pointer to point to x. Later we dereference the pointer to assign 15 to x.

While the syntax is somewhat different, semantically speaking any C programmer would be familiar with these kinds of operations.

Why change the C syntax? Doing so creates a number of advantages and consistencies when you want your language to support type inference. Remember, C was made before type inference was widely used.

The benefit of using a keyword such as fn for functions and var for variables in front of every identifier is that it makes creating tools and working with the language a lot easier. E.g. if I want to look up the definition of the readAll function in Zig, I can just search through the standard library for fn readAll. That text string cannot be mistaken for a call to that function or a variable named readAll.

But enough digressions. If Zig basically has all the same features as C, then what is the point? Why not stick with C? Thus far I have undersold Zig, because it does in fact come with some unique new features and equally important it removes certain features from C.

What Zig adds and removes from C

C has a very weak type system, and this is an area Zig drastically changes. Type checking is much stricter in Zig and it even bans null pointers. Or rather, you can have them, but you have to specifically allow it.

Optional Values

Here is an example from the linked list implementation in Zig:

pub fn popFirst(list: *Self) ?*Node {
    const first = list.first orelse return null;
    list.first = first.next;
    return first;
}

When you pop the first node in a linked list, there may not be anything there, so we must allow null to be returned. That is why the return type is marked as ?*Node. The ? prefix indicates that the pointer is allowed to be null.

We can contrast this with prepend, which adds a node to the front:

pub fn prepend(list: *Self, new_node: *Node) void {
    new_node.next = list.first;
    list.first = new_node;
}

In this case you see the type of new_node is *Node, and that is because it makes no sense to be adding null pointers to a linked list.

If you are familiar with other languages that don’t have null pointers such as Rust, Swift or Kotlin, then you should be familiar with ways of dealing with optional values. It is the same in Zig. You cannot do anything with a pointer that is potentially null in Zig. You have to unwrap it before Zig lets you do anything with it. One way is using orelse:

const first = list.first orelse return null;

If list.first contains null, then return null will be executed in this case. You can read more about optional types in the Zig documentation. But to give you a quick sense of how it typically works in Zig: a number of control flow statements in Zig have this form to deal with optionals:

if (optional) |value| {}
while (optional) |value| {}
for (optional_elements) |value| {}

The |value| part unwraps an optional within the conditional. So e.g. this gives the opportunity to easily use while with an iterator. The code block is only repeated as long as it is possible to unwrap an optional. Thus as soon as the function inside the while loop returns null, the iteration stops.
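To make the capture forms concrete, here is a minimal sketch. The Countdown iterator and the sumAll helper are names I made up for illustration, and the code assumes a reasonably recent Zig release; it is not taken from the standard library.

```zig
const std = @import("std");

// Hypothetical iterator that counts down from `remaining` to 1.
const Countdown = struct {
    remaining: u32,

    // Returns the next value, or null once the countdown is exhausted.
    fn next(self: *Countdown) ?u32 {
        if (self.remaining == 0) return null;
        const value = self.remaining;
        self.remaining -= 1;
        return value;
    }
};

// Sums every value the iterator produces.
fn sumAll(it: *Countdown) u32 {
    var total: u32 = 0;
    // The body runs only while next() yields a non-null value.
    while (it.next()) |value| {
        total += value;
    }
    return total;
}

pub fn main() void {
    var it = Countdown{ .remaining = 3 };
    std.debug.print("sum = {d}\n", .{sumAll(&it)}); // prints "sum = 6"

    const maybe: ?u32 = null;
    // `if` with a capture unwraps the optional; `else` handles null.
    if (maybe) |value| {
        std.debug.print("got {d}\n", .{value});
    } else {
        std.debug.print("nothing there\n", .{});
    }

    // `orelse` supplies a fallback when the optional is null.
    const fallback = maybe orelse 42;
    std.debug.print("fallback = {d}\n", .{fallback}); // prints "fallback = 42"
}
```

The loop needs no explicit end-of-iteration check: returning null from next is what stops it.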

You can see some examples from a simple assembler, Zacktron33, that I am writing in Zig. Ignore all the weird syntax I have not introduced yet. Just focus on the while loop and the second if statement:

var buffer: [500]u8 = undefined;
while (try reader.readUntilDelimiterOrEof(buffer[0..], '\n')) |tmp_line| {
    const line = mem.trim(u8, tmp_line, " \t");
    const n = line.len;

    if (n == 0) continue;

    if (mem.indexOf(u8, line, ":")) |i| {
        const label = try mem.dupe(allocator, u8, line[0..i]);
        try labels.put(label, address);

        // is there anything beyond the label?
        if (n == i + 1) continue;
    }
    address += 1;
}

What this code does is read one line at a time from a file represented by the reader object. If there is no line, it will return null. As long as there is a line, it will be captured and stored in tmp_line, and the loop body will be executed.

Another use of this is the following line:

if (mem.indexOf(u8, line, ":")) |i| {

Here I am looking for the : character inside the line I read. If it isn't there then mem.indexOf will return null. If it is there I will get the index of the position, which will get captured in i.

Memory Management with Allocators and Defer

While Zig has manual memory allocation and deallocation, it makes it much easier to deal with than in C. First of all, every Zig function which may allocate memory takes an allocator as an argument, unless it belongs to an object which was initialized with an allocator.

Still, the bottom line is that no function in Zig is supposed to decide by itself how to allocate memory straight from the heap. This gives you strong control over memory allocation. It means that if you e.g. write microcontroller code and there is no operating system, then you can simply turn a chunk of memory into a big buffer, point an allocator to this buffer and pass this allocator to every standard library call you make. That way, the standard library has no dependency on the existence of a heap.
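The buffer-backed setup described above can be sketched roughly like this. I am assuming Zig 0.9 or later, where std.heap.FixedBufferAllocator exposes an allocator() method; makeGreeting is a hypothetical helper invented for the example.

```zig
const std = @import("std");

// Hypothetical helper: like any allocating function, it receives its
// allocator from the caller instead of reaching for a global heap.
fn makeGreeting(allocator: std.mem.Allocator, name: []const u8) ![]u8 {
    return std.mem.concat(allocator, u8, &[_][]const u8{ "hello, ", name });
}

pub fn main() !void {
    // A fixed chunk of memory stands in for "all the RAM we have".
    var buffer: [64]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    // Every byte of the result lives inside `buffer`; no heap is involved.
    const greeting = try makeGreeting(allocator, "zig");
    std.debug.print("{s}\n", .{greeting}); // prints "hello, zig"
}
```

Asking this allocator for more than the buffer can hold returns error.OutOfMemory rather than silently growing, which is exactly the kind of control you want on a small device.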

The second innovation is “stolen” from Go, because it was such a great idea: the defer statement, which Zig supplements with errdefer. These allow one or more statements to be executed upon exit from the current scope. defer is called regardless. errdefer is only called if you return an error code, meaning there was an error. Here is an example:

var labels = try readSymTable(allocator, file);
defer {
    var iter = labels.iterator();
    while (iter.next()) |entry| allocator.free(entry.key);
    labels.deinit();
}

In my Zacktron33 assembler I first read all the labels in the program and store them with their addresses, before performing the actual assembly. I do this with the readSymTable function. It looks for labels in file and puts them into a dictionary along with their address. This is returned as labels.

Since Zig functions by convention are not allowed to allocate anything without using an allocator provided to them, we know that the dictionary returned must have had its entries allocated using allocator.

Thus the defer is put right after readSymTable to make sure that when we exit the function we are in, the dictionary entries will be freed from memory. The keys are strings so I have to loop over them and release them, before releasing the dictionary itself.

In this case it may be worth arguing that keys should have been copied and that deinit should have taken care of releasing them. However that may also have gone against the Zig philosophy of being very transparent about what is being done. Personally I am on the fence here. I have not programmed enough Zig to internalize the Zig philosophy yet.

Smart Error Codes

Zig takes a leaf from Go’s book and handles errors using error return codes. But there is a twist and I think Zig’s solution is a lot better.

Go bases its solution for error codes on having multiple return values, while Zig utilizes error unions, which work much like tagged unions, so that returned error codes cannot be ignored.

This works similarly to optional values. E.g. ?i32 means a 32-bit integer or null, but for errors we write !i32, which means a 32-bit integer or an error code.

We can be more specific and specify which error codes we expect should be returned.

const FileOpenError = error{
    AccessDenied,
    OutOfMemory,
    FileNotFound,
};

fn open(filename: []u8) FileOpenError!*File {
    // code
}

The error keyword works very similarly to an enum in Zig. The difference is that error codes form sets which can be mixed and merged, and they can be inferred, so you don't need to specify which error set a function returns. This is a bigger topic, so I don't want to go too much into detail on how this works. You can read more details on ziglearn.org.
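As a small illustration of merging error sets, here is a sketch. ParseError, LoadError and the load function are hypothetical names I invented; only the error set syntax and the || operator come from the language itself.

```zig
const std = @import("std");

const FileOpenError = error{ AccessDenied, OutOfMemory, FileNotFound };
// Hypothetical second error set, invented for this sketch.
const ParseError = error{InvalidSyntax};
// The || operator merges two error sets into one.
const LoadError = FileOpenError || ParseError;

// Hypothetical loader that can fail with an error from either set.
fn load(path: []const u8) LoadError!usize {
    if (path.len == 0) return error.FileNotFound;
    if (path[0] == '?') return error.InvalidSyntax;
    return path.len;
}

pub fn main() void {
    // Like optionals, an error union can be unwrapped with `if`;
    // the else branch captures the error code.
    if (load("?config")) |len| {
        std.debug.print("loaded {d} bytes\n", .{len});
    } else |err| {
        std.debug.print("failed: {s}\n", .{@errorName(err)}); // prints "failed: InvalidSyntax"
    }
}
```

A function declared to return LoadError can hand back any member of either set, and the compiler rejects error values that are in neither.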

Instead I want to clarify the advantages this system brings. Returned error codes cannot be ignored in Zig. To do so would produce a compilation error. However dealing with them is easy.

Let us have a look at a function I wrote for splitting an integer into its decimal digits. Basically, if you give it a number such as 4123, it will return an array with the digits from least significant to most significant: 3, 2, 1, 4.

fn decimals(alloc: *Allocator, n: u32) !ArrayList(u32) {
    var x = n;
    var digits = ArrayList(u32).init(alloc);
    errdefer digits.deinit();

    while (x >= 10) {
        digits.append(x % 10) catch |err| return err;
        x = x / 10;
    }
    digits.append(x) catch |err| return err;
    return digits;
}

This function returns either an error code or an array of unsigned 32-bit integers (u32) containing our digits. As you saw with the if, while and for statements, we have a similar construct with catch to grab an error code.

Not catching a potential error returned is a compilation error, which forces you to handle it. Because catching and returning errors is so common, Zig has a shortcut for it. These two statements are the same:

digits.append(x) catch |err| return err;
try digits.append(x);

This is why you see try littered around Zig code. Every place there is a try, you know that the function called could potentially return an error, which is passed on to be handled at the calling site.
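Here is a minimal sketch contrasting the two ways of dealing with a returned error: passing it on with try, or handling it on the spot with catch. The divide, average and averageOrZero functions are hypothetical and exist only for this example.

```zig
const std = @import("std");

const DivError = error{DivisionByZero};

// Hypothetical function that fails when asked to divide by zero.
fn divide(a: u32, b: u32) DivError!u32 {
    if (b == 0) return error.DivisionByZero;
    return a / b;
}

// Propagates any error from divide() to the caller using `try`.
fn average(sum: u32, count: u32) DivError!u32 {
    const result = try divide(sum, count);
    return result;
}

// Handles the error locally by substituting a default value.
fn averageOrZero(sum: u32, count: u32) u32 {
    return divide(sum, count) catch 0;
}

pub fn main() !void {
    const avg = try average(10, 2);
    std.debug.print("{d}\n", .{avg}); // prints "5"
    std.debug.print("{d}\n", .{averageOrZero(10, 0)}); // prints "0"
}
```

Notice how the return types tell the story: average still has an error in its signature, while averageOrZero does not, because the error can no longer escape.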

Here is an example of how that looks at the calling site:

pub fn main() !void {
    // ... code for creating allocator ...

    const digits = try decimals(allocator, 4123);
    defer digits.deinit();

    for (digits.items) |digit| {
        try print("{},", .{digit});
    }
    try print("\n", .{});
}

Notice in this case we don’t actually handle the error, we just return it from main, which we declare as returning nothing or an error code: !void.

By putting try in front of decimals we just return the error code from main in case there was one.

Smart Cleanup with Defer and Errdefer

The previous example also shows how defer helps us with cleanup. In case the print function fails and returns an error, digits.deinit() will be called anyway, because defer makes it get called when we exit the enclosing scope, which we do when exiting main.

However this happens regardless of whether there was an error or not. To differentiate we have errdefer. Look at the first part of the decimals function:

var x = n;
var digits = ArrayList(u32).init(alloc);
errdefer digits.deinit();

We don’t write defer digits.deinit(), because otherwise the array would be released before we could return it to the caller. We don't want that. However errdefer causes digits.deinit() to be called only in the cases where an error code is returned. In this case we cannot use the array anyway since it is incomplete and never returned, so we want to release the memory it uses.

Thus with the combination of error codes which cannot be ignored, defer and errdefer, dealing with error situations and doing cleanup properly has been dramatically improved relative to C programming.
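To see the difference between the two statements in isolation, here is a small sketch. The Resource type and the setup and useAndRelease functions are hypothetical; the resource just records whether it was released, so we can observe which cleanup ran.

```zig
const std = @import("std");

// Hypothetical resource that records whether it was released.
const Resource = struct {
    released: bool = false,

    fn release(self: *Resource) void {
        self.released = true;
    }
};

// On success the caller keeps the resource. On failure the errdefer
// releases it before the error propagates.
fn setup(res: *Resource, fail: bool) !void {
    errdefer res.release();
    if (fail) return error.SetupFailed;
}

// defer runs on every exit from the scope, error or not.
fn useAndRelease(res: *Resource) void {
    defer res.release();
    // ... work with res here ...
}

pub fn main() !void {
    var a = Resource{};
    try setup(&a, false);
    std.debug.print("after success: released = {}\n", .{a.released}); // false

    var b = Resource{};
    setup(&b, true) catch {};
    std.debug.print("after failure: released = {}\n", .{b.released}); // true
}
```

This mirrors the decimals function: the caller owns the result on success, and nothing leaks on failure.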

Namespaces using Struct

A major challenge in writing modular C code is that C has no kind of namespace system. The solution has typically been to use prefixes. E.g. all the functions in the SDL library have names beginning with SDL_, while in the GTK library they have names beginning with gtk_.

In C++ this is solved with the namespace keyword, which lets you enclose various code inside a namespace. Before namespaces, people would often simply use classes as namespaces, by declaring static functions inside classes.

Zig basically follows a variant of the latter approach. The result gives Zig something that looks like classes, but really isn’t. Here is an example of defining a 2D vector in the plane:

const Vec2D = struct {
    dx: f32,
    dy: f32,

    fn add(u: Vec2D, v: Vec2D) Vec2D {
        // Build the result with a struct literal.
        return Vec2D{ .dx = u.dx + v.dx, .dy = u.dy + v.dy };
    }
};

In terms of memory layout and usage, this will be almost identical to a C struct. The add function does not literally exist inside the struct at runtime. The struct is used as a namespace, so I could add two vectors u and v like this:

var w: Vec2D = Vec2D.add(u, v);

However, since the compiler knows the type of u at compile time, it can insert this "namespace" automatically. This is done with the following shorthand:

var w: Vec2D = u.add(v);

There is no dynamic dispatch going on here. There is no lookup in a vtable at runtime for an add method for the type of the u variable. No, this is just a convenient alternative syntax for calling what is a plain function.
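Putting the pieces together, a self-contained sketch of the two call styles could look like this. It assumes the result is built with a struct literal; both calls end up at the same plain function.

```zig
const std = @import("std");

const Vec2D = struct {
    dx: f32,
    dy: f32,

    fn add(u: Vec2D, v: Vec2D) Vec2D {
        return Vec2D{ .dx = u.dx + v.dx, .dy = u.dy + v.dy };
    }
};

pub fn main() void {
    const u = Vec2D{ .dx = 3, .dy = 4 };
    const v = Vec2D{ .dx = 2, .dy = 1 };

    // Namespace style: the function is looked up inside the struct.
    const a = Vec2D.add(u, v);
    // Method style: the compiler fills in Vec2D from the type of u.
    const b = u.add(v);

    std.debug.print("({d}, {d}) ({d}, {d})\n", .{ a.dx, a.dy, b.dx, b.dy });
}
```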

In Zig you can nest structs to create nested namespaces. This is used in the Zig standard library extensively.

const std = @import("std");

const Allocator = std.mem.Allocator;
const Dict = std.StringHashMap;
const Array = std.ArrayList;

You can see the Allocator type is defined inside a struct named mem, which provides a namespace for it.

You will also notice the odd looking practice of assigning types to constants. The constant Allocator is assigned the value of std.mem.Allocator. This is a key feature of Zig. Structs don't really have names. They are anonymous.

You simply assign the struct to one or more constants to use it later by name. This helps explain the apparently odd syntax for importing libraries:

const std = @import("std");

@import("std") returns a struct containing all the functions and types defined in the standard library, and we assign this struct to the constant named std. However in theory we could have assigned it to any other name.

Replacing C Macros with Compilation Time Code

This brings us to the other key innovation of Zig over plain C: code which can run at compilation time, rather than at runtime. This removes the need for C style preprocessor macros which cause a lot of problems for C programmers.

Code that runs at compilation time can deal with objects which, in a statically typed language, normally only exist at compilation time. Types, for instance, become first class objects.

Thus since a struct is a type, you can deal with it like any other object at compilation time in Zig. We can use this to emulate generics:

fn Vec2D(comptime T: type) type {
    return struct {
        dx: T,
        dy: T,

        fn add(u: Vec2D(T), v: Vec2D(T)) Vec2D(T) {
            return Vec2D(T){
                .dx = u.dx + v.dx,
                .dy = u.dy + v.dy,
            };
        }
    };
}

pub fn main() !void {
    const u = Vec2D(f32){ .dx = 3, .dy = 4 };
    const v = Vec2D(f32){ .dx = 2, .dy = 1 };
    const w = u.add(v);

    try stdout.print("u.add(v) == {d}\n", .{w});
}

With the comptime keyword we tell Zig that the parameter that follows must be known at compilation time. This is required for types, since they do not exist at runtime. But you could specify this for a parameter of any type. You could also have required e.g. that a string was known at compilation time. But since strings also exist at runtime, that would not strictly be necessary.

In this case Vec2D is actually not a type, but rather a function which takes a type as an argument and returns a type. Here you can see the advantages of anonymous structs which can be passed around just like any other object. We create a struct where the fields dx and dy are of some placeholder type T and return this type.

To explain how this works, let us look at this line:

Vec2D(f32){ .dx = 3, .dy = 4 };

What exactly happens here? Before compilation is completed we run compile time functions such as Vec2D, which returns a struct, which means we end up compiling something that looks like this:

struct {dx: f32, dy: f32} { .dx = 3, .dy = 4 };

Meaning we define a struct and then instantiate it with the given values for each field. comptime has far reaching consequences and I cannot cover all of them here in a story just meant as an introduction.
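The same trick works for plain functions, not just for building types. As a rough sketch, here is a hypothetical largest function (my own name, not from the standard library) that takes the element type as a compile time parameter:

```zig
const std = @import("std");

// Hypothetical generic function: T is supplied at compile time, so the
// compiler generates a separate version for each type it is called with.
fn largest(comptime T: type, items: []const T) ?T {
    if (items.len == 0) return null;
    var best = items[0];
    for (items[1..]) |item| {
        if (item > best) best = item;
    }
    return best;
}

pub fn main() void {
    const bytes = [_]u8{ 3, 9, 4 };
    const floats = [_]f32{ 1.5, -2.0 };

    if (largest(u8, &bytes)) |b| {
        std.debug.print("largest byte = {d}\n", .{b}); // prints "largest byte = 9"
    }
    if (largest(f32, &floats)) |f| {
        std.debug.print("largest float = {d}\n", .{f});
    }
}
```

There is no generics syntax involved. The type is just another argument, one that happens to be resolved before the program runs.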

But let me explain some of the things it can do. In C, the preprocessor is used to select different code to compile for different platforms using things like #ifdef. There is no need for that in Zig. If an if-statement makes a decision based on a value known at compilation time, then the compiler knows what code paths will never be taken, and that code is not compiled at all. Thus it is easy to include platform specific code.
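A minimal sketch of such a platform check might look like the following. The pathSeparator function is a hypothetical example of mine; it relies on the builtin package, which describes the compilation target.

```zig
const std = @import("std");
const builtin = @import("builtin");

// The condition is known at compile time, so only the branch for the
// target platform ends up in the compiled program.
fn pathSeparator() u8 {
    if (builtin.os.tag == .windows) {
        return '\\';
    } else {
        return '/';
    }
}

pub fn main() void {
    std.debug.print("separator: {c}\n", .{pathSeparator()});
}
```

This is an ordinary if statement using ordinary syntax, yet it plays the role a preprocessor conditional would play in C.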

Another great example is how print is implemented in Zig. It works a lot like printf in C in that you can provide a formatting string. However unlike C, the formatting string is required to be known at compilation time. Here is the implementation of print in the standard library:

pub fn print(self: Self, comptime format: []const u8, args: anytype) Error!void {
    return std.fmt.format(self, format, args);
}

Let us skip what the anytype type is for now. What you can see here is that the second argument, which is the formatting string, is specified as being comptime.

[]const u8 is Zig's way of saying const char[]. Instead of char for bytes we write u8, which is short for unsigned 8-bit integer.

The Zig code that looks at the format string will run at compilation time leaving only the code that cannot be determined at compilation time.

The args will typically be provided as a tuple, which in Zig you can write like:

.{4, foobar, "hello"}

This is treated much like an array, except everything about it is known at compilation time such as the type at every index and the length of the tuple.

You can define a struct object through type inference in much the same manner:

const stuff = .{
    .foo = "hi",
    .bar = 4,
};

We don’t necessarily know the value of every element in the tuple at compilation time. E.g. we don’t necessarily know the value of foobar, but we will know its type.

This allows Zig to produce code that will handle this specific tuple being printed out. It will at compilation time make sure that the types and number of elements specified in the formatting string matches the number of elements in the tuple.

The tuple is specified as anytype because Zig cannot know the exact type of the tuple in advance. Depending on the length and the type of each element, the tuple will be of a different type each time.

However this is not some kind of dynamic dispatch. It isn’t the same as, say, a dynamically typed variable in JavaScript. The compiler will upon compilation figure out exactly what type args is. In the implementation of format, which print calls, there is compile time code which checks the type of args to make sure it is of a type it can deal with. Here is a cutout of that section:

pub fn format(
    writer: anytype,
    comptime fmt: []const u8,
    args: anytype,
) !void {
    const ArgSetType = u32;
    if (@typeInfo(@TypeOf(args)) != .Struct) {
        @compileError("Expected tuple or struct argument, found " ++ @typeName(@TypeOf(args)));
    }
    if (args.len > @typeInfo(ArgSetType).Int.bits) {
        @compileError("32 arguments max are supported per format call");
    }
    ...
}

You can see that at compilation time we use @TypeOf to get the type of the arguments, and then we use @typeInfo to get a value which contains information about this type.
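You can use the same technique in your own code. The sketch below is my own simplified example, not how the standard library does it: a hypothetical double function accepts anytype but rejects unsupported types at compilation time, and a hypothetical countArgs function reads the length of a tuple.

```zig
const std = @import("std");

// Hypothetical function: accepts any type syntactically, but refuses to
// compile for anything other than u32 and f64.
fn double(x: anytype) @TypeOf(x) {
    const T = @TypeOf(x);
    if (T != u32 and T != f64) {
        @compileError("double() supports u32 and f64, found " ++ @typeName(T));
    }
    return x * 2;
}

// The length of a tuple is part of its type, so it is known at compile time.
fn countArgs(args: anytype) usize {
    return args.len;
}

pub fn main() void {
    std.debug.print("{d}\n", .{double(@as(u32, 21))}); // prints "42"
    std.debug.print("{d}\n", .{countArgs(.{ 4, true, "hello" })}); // prints "3"
}
```

Calling double with, say, a bool would not fail at runtime. The program would simply not compile, and the message passed to @compileError would be shown.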

Object-Oriented Programming in Zig

While Zig has some fancy features it is important to not forget that Zig is really just an advanced version of C. As highlighted in the introduction, there are no classes, interfaces or inheritance in Zig.

If you want that you have to build such a system yourself from scratch. That is essentially how e.g. the GTK library works. It is an object-oriented GUI library written in C.

Zig is the same. You roll your own. In the standard library you can find many different approaches to this.

We can see one example of this with the Random generators. The base interface is basically defined like this:

pub const Random = struct {
    fillFn: fn (r: *Random, buf: []u8) void,

    /// Read random bytes into the specified buffer until full.
    pub fn bytes(r: *Random, buf: []u8) void {
        r.fillFn(r, buf);
    }

    pub fn int(r: *Random, comptime T: type) T {
        ...
        var rand_bytes: [@sizeOf(ByteAlignedT)]u8 = undefined;
        r.bytes(rand_bytes[0..]);
        ...
    }
    ...
};

What you see here is that Random has a field fillFn, which is a function pointer. Then all the functions in this interface to our random generators use this function pointer. E.g. you can see this in how bytes is implemented. int is again built on top of this function. There are several other functions in Random, which I am not showing, that use it.

What you can think of as a subclass of Random provides a concrete implementation of fillFn, as shown here:

const SequentialPrng = struct {
    const Self = @This();
    random: Random,
    next_value: u8,

    pub fn init() Self {
        return Self{
            .random = Random{ .fillFn = fill },
            .next_value = 0,
        };
    }

    fn fill(r: *Random, buf: []u8) void {
        const self = @fieldParentPtr(Self, "random", r);
        for (buf) |*b| {
            b.* = self.next_value;
        }
        self.next_value +%= 1;
    }
};

It is the init function which essentially does the "inheritance" by assigning its own fill function to the fillFn function pointer field.

What we could call the base class is then stored in the random field of SequentialPrng. Here is an example of using a random generator built this way:

var r = DefaultPrng.init(seed);

const s = r.random.int(u64);

The point of this is much the same as with interfaces in general. You could write a function which takes a *Random pointer as argument, and your function does not have to concern itself with how random numbers are generated.

fn doRandomStuff(rnd: *Random) void {
    const s = rnd.int(u64);
    // code
}

We need to comment on a couple of things. E.g. since structs are anonymous we need a way to refer back to them. That is what @This() is used for. All functions with the @ prefix are compiler intrinsics. They tie into the Zig compiler. Typically they are functions which can only run at compile time. We know at compilation time what the type of the struct is.

By convention Zig programmers often write:

const Self = @This();

This way the init function can say it returns an object of type Self rather than SequentialPrng. It means the same thing.

Final Thoughts

The fairly sophisticated type system in Zig can give the impression that Zig is a far more high level language than it actually is. Thus, when learning it, you end up hunting for things like classes, interfaces, traits or other things that simply do not exist in Zig.

What helps when programming Zig is to keep in mind that features that imply hidden runtime behavior or control flow will generally not exist in Zig. Zig has sophisticated behavior at compilation time, but very simple behavior at runtime.

The Startup

Medium's largest active publication, followed by +754K people. Follow to join our community.

Erik Engheim

Written by

Geek dad, living in Oslo, Norway with passion for UX, Julia programming, science, teaching, reading and writing.
