The glium library

The Rust programming language guarantees that your program will never have any undefined behavior as long as you use safe code. This works as long as you use the standard library, but when it comes to C APIs this safety guarantee rests on the shoulders of library writers. Anyone who exposes a safe interface over an unsafe API must be extra cautious that nothing bad happens.

In addition to safety, the Rust standard library tries to enforce good practices as much as possible, such as avoiding hidden costs, explicitly handling every possible corner case with Results and Options, or creating simple-to-use abstractions.

Unfortunately the OpenGL library, which is used in 3D applications and games, has all the characteristics of what we would consider a bad library today: it uses global state, it is tedious to use, and since OpenGL 4, memory safety is no longer enforced. The purpose of the glium library is to solve these problems by wrapping around OpenGL. You still have to manage buffers, textures and programs manually, but in a much friendlier environment.

Let’s take a detailed look at what glium does!

Initialization and context management

Initialization

One of the biggest challenges when writing raw OpenGL is initializing the context. The OpenGL API assumes that a context has previously been made “current” and doesn’t mention anything about initialization. To initialize the context you have to use another API instead, like WGL, GLX or EGL.

The glium library follows this same principle by providing an unsafe Backend trait that links glium to the OpenGL implementation provider. By default glium depends on glutin, a pure Rust OpenGL context creation library, and indirectly implements the Backend trait on glutin’s Window. This makes it trivial to initialize a window and OpenGL:

let display = glutin::WindowBuilder::new().build_glium().unwrap();

But keep in mind that it is possible to implement the trait for whatever type you like. For example if you want to use glium to manage an OpenGL context created with wxWidgets, it is possible. Glutin is not very robust for the moment and can cause a lot of problems, so it is totally legitimate to use another library. There is for example a glium_sdl2 crate that allows you to easily use glium with the SDL.
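To give an idea of what implementing such a trait for your own type involves, here is a minimal sketch. The SketchBackend trait below is a simplified stand-in written for this example, not glium’s actual Backend definition, and FakeWindow is a hypothetical type standing for a window created by some other library.

```rust
use std::cell::Cell;

/// Simplified stand-in for a backend trait (an assumption for this sketch,
/// not glium's real definition). It is `unsafe` to implement because the
/// library has to trust the implementor to report the context state truthfully.
pub unsafe trait SketchBackend {
    /// Returns true if this backend's context is current on this thread.
    fn is_current(&self) -> bool;
    /// Makes this backend's context current on this thread.
    unsafe fn make_current(&self);
    /// Size of the default framebuffer in pixels.
    fn framebuffer_dimensions(&self) -> (u32, u32);
}

/// Hypothetical window type coming from some other windowing library.
pub struct FakeWindow {
    current: Cell<bool>,
    size: (u32, u32),
}

impl FakeWindow {
    pub fn new(width: u32, height: u32) -> Self {
        FakeWindow { current: Cell::new(false), size: (width, height) }
    }
}

unsafe impl SketchBackend for FakeWindow {
    fn is_current(&self) -> bool {
        self.current.get()
    }

    unsafe fn make_current(&self) {
        // A real backend would call wglMakeCurrent, glXMakeCurrent or
        // eglMakeCurrent here; this sketch only flips a flag.
        self.current.set(true);
    }

    fn framebuffer_dimensions(&self) -> (u32, u32) {
        self.size
    }
}
```

The point of the sketch is that the library only needs a handful of operations from the windowing side, which is why any toolkit able to create an OpenGL context can be plugged in.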

Compatibility

During the creation phase, glium will parse the version and list of OpenGL extensions provided by the backend.

One of the reasons why many people choose to use OpenGL is that it works almost everywhere. OpenGL itself works on Windows, Linux and OS X, but there’s also OpenGL ES (for Embedded Systems) that works on Android and iOS. And then there’s WebGL, which is derived from OpenGL ES and works inside the browser.

In order to provide a convenient API, glium only supports versions of OpenGL that provide buffer objects, shaders and framebuffer objects, which includes OpenGL 3 and OpenGL ES 2/WebGL.

The glium teapot example running on a Raspberry Pi

This doesn’t mean, however, that older versions of OpenGL aren’t supported. OpenGL works alongside an extension system: instead of implementing the core specifications, drivers can choose to implement extensions that provide the same set of functionality. For example glium is known to work on a netbook that only officially supports OpenGL 1.5 thanks to the GL_ARB_vertex_shader, GL_ARB_fragment_shader and GL_EXT_framebuffer_object extensions (note that not all unit tests are passing though).

In summary, glium should work almost everywhere. Known exceptions are OpenGL ES 1 (on very early mobile devices) and the default software implementation that Windows provides if you don’t install any graphics driver. Other than that, I have yet to find a machine that doesn’t support glium.

Context management

The OpenGL API was created at a time when everything was single-threaded, and one of its design decisions is that OpenGL contexts have to be bound to a thread before they can be used. As a result, even today most OpenGL applications are single-threaded.

One of the major characteristics of glium is that it makes context management safe. It copes with OpenGL’s design in two ways:

  • None of the structs that represent OpenGL objects implement the Send trait, so they can never leave the current thread.
  • Whenever you call a function, glium calls wglGetCurrentContext/glXGetCurrentContext or equivalent to make sure that the context is still the current one. If it’s not the case, it calls MakeCurrent.

Thanks to this check, you can use multiple OpenGL contexts or multiple OpenGL libraries simultaneously without getting bugs or crashes (provided that the other library is safe as well). The overhead of this check is low, but if you want you can disable it with an unsafe function call at context creation.
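The two mechanisms above can be modeled in a few lines. This is only a sketch: SketchContext and the thread-local CURRENT_CONTEXT are hypothetical names, and the thread-local stands in for what wglGetCurrentContext/glXGetCurrentContext would report. It is not how glium is actually implemented.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

thread_local! {
    /// Stand-in for the driver's notion of "the context that is current on
    /// this thread".
    static CURRENT_CONTEXT: Cell<Option<u32>> = Cell::new(None);
}

/// Hypothetical context handle. The raw-pointer PhantomData makes the type
/// neither Send nor Sync, so the compiler refuses to move it to another thread.
pub struct SketchContext {
    id: u32,
    _not_send: PhantomData<*mut ()>,
}

impl SketchContext {
    pub fn new(id: u32) -> Self {
        SketchContext { id, _not_send: PhantomData }
    }

    /// Makes this context current if it is not already.
    /// Returns true when a (simulated) MakeCurrent call was needed.
    pub fn ensure_current(&self) -> bool {
        CURRENT_CONTEXT.with(|cur| {
            if cur.get() == Some(self.id) {
                false
            } else {
                cur.set(Some(self.id));
                true
            }
        })
    }
}
```

With two contexts alternating on the same thread, each switch costs one simulated MakeCurrent, while repeated calls on the same context only pay for the cheap comparison.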

As shown here, around 0.08% of the CPU usage comes from the calls to GetCurrentContext.

Buffers

The main buffer handling struct is Buffer. In order to enforce safety and correctness, buffers must have a fixed size and have a type parameter indicating their content. You don’t just manipulate a Buffer, but a Buffer<[u8]> or a Buffer<Foo> for example. Common operations such as reading or writing are very easy to do with methods such as read, map or write.
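To illustrate why a fixed size and a content type help, here is a CPU-only sketch. TypedBuffer and SizeMismatch are hypothetical names invented for this example; glium’s real Buffer lives in GPU memory and has a different API.

```rust
/// Error returned when the data passed to `write` has the wrong length.
#[derive(Debug, PartialEq)]
pub struct SizeMismatch {
    pub expected: usize,
    pub found: usize,
}

/// Hypothetical CPU-side model of a typed, fixed-size buffer.
pub struct TypedBuffer<T: Copy> {
    // A boxed slice cannot grow or shrink after creation.
    data: Box<[T]>,
}

impl<T: Copy> TypedBuffer<T> {
    pub fn new(initial: &[T]) -> Self {
        TypedBuffer { data: initial.to_vec().into_boxed_slice() }
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }

    /// Copies the whole content out of the buffer.
    pub fn read(&self) -> Vec<T> {
        self.data.to_vec()
    }

    /// Overwrites the whole content; fails if the sizes differ.
    pub fn write(&mut self, new: &[T]) -> Result<(), SizeMismatch> {
        if new.len() != self.data.len() {
            return Err(SizeMismatch { expected: self.data.len(), found: new.len() });
        }
        self.data.copy_from_slice(new);
        Ok(())
    }
}
```

Because the element type is part of the buffer’s type, writing the wrong kind of data is a compile-time error, and because the size never changes, a mismatched write is the only size-related case left to check at runtime.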

You can use buffers like you would in OpenGL, but for maximum performance you are encouraged to manually handle access to your buffers thanks to a recent OpenGL extension named ARB_buffer_storage. This extension is available almost everywhere (even on very old hardware) and gives you direct access to the buffer’s content in RAM or VRAM, but in exchange it is your responsibility to handle synchronization and ensure that you don’t write to it while the GPU is using it.

The “untextured objects” example of AZDO: drawing 64x64x64 individual moving objects. The glium version runs at approximately the same speed as the original pure OpenGL example.

But don’t worry: glium automatically does this for you. To create a buffer with persistent mapping, all you have to do is use the persistent() constructor instead of just new(). Every single command executed on the GPU that uses a segment of your buffer will then create a sync fence that will be used by glium to track accesses to the buffer.
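The bookkeeping behind this kind of tracking can be sketched as follows. FenceTracker and FenceId are hypothetical names; in real OpenGL a fence would be a sync object created with glFenceSync, and this sketch only models which fences have to be waited on before the CPU touches a given range.

```rust
use std::ops::Range;

/// Hypothetical fence identifier (a real one would be an OpenGL sync object).
pub type FenceId = u64;

#[derive(Default)]
pub struct FenceTracker {
    next_id: FenceId,
    // Byte ranges of the buffer that in-flight GPU commands may still access.
    in_flight: Vec<(Range<usize>, FenceId)>,
}

impl FenceTracker {
    /// Records that a GPU command uses `range` of the buffer and returns the
    /// fence that will be signaled once that command has completed.
    pub fn gpu_uses(&mut self, range: Range<usize>) -> FenceId {
        let id = self.next_id;
        self.next_id += 1;
        self.in_flight.push((range, id));
        id
    }

    /// Called before the CPU reads or writes `range`. Removes and returns
    /// every fence whose range overlaps it; the caller waits on those first.
    pub fn fences_to_wait(&mut self, range: &Range<usize>) -> Vec<FenceId> {
        let mut to_wait = Vec::new();
        self.in_flight.retain(|(used, id)| {
            let overlaps = used.start < range.end && range.start < used.end;
            if overlaps {
                to_wait.push(*id);
            }
            !overlaps
        });
        to_wait
    }
}
```

Tracking per segment rather than per buffer is what makes streaming cheap: writing to one part of the buffer does not have to wait for draws that read another part.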

On my machine, some profiling showed that uploading to a persistent-mapped buffer led to an FPS count 50% to 100% higher than with a non-persistent-mapped buffer when streaming the data to be drawn (as with a particle system for example). And thanks to glium, all you have to do is use a different function call to initialize the buffer.

Uniform buffers and SSBOs

One of the characteristics of buffers is that you can read their content (and even modify them) from inside your shaders.

Doing so is very often error-prone because the alignment requirements of OpenGL differ from those of C or Rust. To avoid all possible errors, glium queries the OpenGL backend for the offsets of each element and makes sure that the data layout of your buffer perfectly matches what OpenGL expects. This is normally too cumbersome to do with raw OpenGL, but glium does it.
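Here is a small sketch of such a check for a uniform block declared as “float intensity; vec3 color;”. Under the std140 layout rules a vec3 is aligned to 16 bytes, so the shader expects color at offset 16, while a naive #[repr(C)] struct puts it at offset 4. The expected offsets are hard-coded by the caller here; a real implementation would query them from OpenGL. All names (Naive, Padded, check_layout) are hypothetical.

```rust
/// Uniform block as naively mirrored in Rust.
#[repr(C)]
pub struct Naive {
    pub intensity: f32,
    pub color: [f32; 3],
}

/// Same block with explicit padding so that `color` starts at byte 16.
#[repr(C)]
pub struct Padded {
    pub intensity: f32,
    pub _pad: [f32; 3],
    pub color: [f32; 3],
}

/// Byte distance between a struct and one of its fields.
fn offset_of_field<S, F>(base: &S, field: &F) -> usize {
    field as *const F as usize - base as *const S as usize
}

pub fn naive_offsets() -> Vec<(&'static str, usize)> {
    let v = Naive { intensity: 0.0, color: [0.0; 3] };
    vec![
        ("intensity", offset_of_field(&v, &v.intensity)),
        ("color", offset_of_field(&v, &v.color)),
    ]
}

pub fn padded_offsets() -> Vec<(&'static str, usize)> {
    let v = Padded { intensity: 0.0, _pad: [0.0; 3], color: [0.0; 3] };
    vec![
        ("intensity", offset_of_field(&v, &v.intensity)),
        ("color", offset_of_field(&v, &v.color)),
    ]
}

/// Compares the Rust-side offsets with the offsets the shader expects.
pub fn check_layout(
    actual: &[(&str, usize)],
    expected: &[(&str, usize)],
) -> Result<(), String> {
    for &(name, want) in expected {
        match actual.iter().find(|&&(n, _)| n == name) {
            Some(&(_, got)) if got == want => {}
            Some(&(_, got)) => {
                return Err(format!("{}: offset {} but shader expects {}", name, got, want))
            }
            None => return Err(format!("{}: missing from the Rust struct", name)),
        }
    }
    Ok(())
}
```

The naive struct fails the check and the padded one passes, which is exactly the kind of silent mismatch that is painful to debug when it only shows up as wrong colors on screen.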

You can check out the gpgpu example to see this in action.

Textures and framebuffer objects

Handling textures is one of the trickiest parts of OpenGL.

Glium has opted for strong texture typing with 63 different texture types, each one being a combination of a data type (floating-point, signed, unsigned, compressed, srgb, compressed srgb, depth, depth-stencil, stencil) and a dimension (1d, 2d, 3d, 1d array, 2d array, cubemap, cubemap array). This makes it possible to have very precise operations: writing to a texture takes a different data type depending on the texture type, reading a 2D texture is different from reading a 3D texture, compressed textures can’t use automatic mipmap generation and can’t be attached to a framebuffer object, etc.

Reading the content of a texture. This can’t be easier.

Just like buffers must have a fixed size, textures and their mipmaps must also have fixed dimensions. Textures are always complete in all circumstances and glTexStorage is always used if it is available. These restrictions are here to remove the possibility of textures being in a “wrong” state and greatly reduce the number of corner cases.

sRGB

However there is still something that you can get wrong: sRGB. For historical reasons, the data format of screens and pictures is actually not linear RGB but sRGB. This is reflected in glium by a difference between sRGB textures and non-sRGB textures.

If you output linear RGB to your screen, it will appear darker than expected. This is a big problem if you do mathematical operations on your texture colors in your shaders, or if you use blending.

On the left: glium’s hello triangle. On the right: an OpenGL hello triangle without correct sRGB handling enabled.

Glium makes it mandatory to correctly handle RGB and sRGB. By default it assumes that your fragment shader returns colors in linear RGB and asks OpenGL to do the conversion to sRGB by enabling GL_FRAMEBUFFER_SRGB. This is handled per program and can be disabled with an option when creating a program. However, you are strongly encouraged to tackle the problem by creating sRGB textures instead of regular textures.
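For reference, the conversion in question is the standard sRGB transfer function, applied to each color component. The two helpers below are a plain CPU-side sketch of that formula (the function names are mine); with GL_FRAMEBUFFER_SRGB enabled the encoding step is performed by OpenGL when writing to an sRGB framebuffer.

```rust
/// Encodes one linear color component in [0, 1] as sRGB.
pub fn linear_to_srgb(c: f32) -> f32 {
    if c <= 0.0031308 {
        12.92 * c
    } else {
        1.055 * c.powf(1.0 / 2.4) - 0.055
    }
}

/// Decodes one sRGB color component in [0, 1] back to linear.
pub fn srgb_to_linear(c: f32) -> f32 {
    if c <= 0.04045 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}
```

A linear mid-gray of 0.5 encodes to roughly 0.735 in sRGB. Sending the raw 0.5 to the screen instead means the display decodes it as a much darker value, which is the darkening described above.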

Render to texture

One of the most useful features of OpenGL is render-to-texture, which consists of drawing to a texture instead of drawing to the window. This is where framebuffer objects come into play.

OpenGL framebuffer objects are handled internally by glium and aren’t directly exposed to the user. When you create a SimpleFramebuffer object for example, the only thing that glium does is check if the attachments are valid without calling any OpenGL function. It is only when you draw with that framebuffer that the actual framebuffer object is created (or reused if it already exists).

The reasons behind this choice are:

  • Glium framebuffers hold a borrow of their attachments, so it would be too annoying to keep them alive between frames. Instead you can just recreate the same framebuffer at each frame without suffering from a performance issue.
  • Some operations such as switching between windowed and fullscreen mode require a context rebuild by creating a new context that shares lists with the old context. In this situation, all framebuffer objects, vertex array objects, program pipelines and transform feedback objects become invalid. All these objects are handled internally by glium in order to avoid issues related to this.
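A lazily filled cache keyed by the attachments is one way to get both properties. The sketch below is an illustration under that assumption, with hypothetical names (Attachments, FramebufferCache) and simulated object ids instead of real OpenGL calls; it is not glium’s actual code.

```rust
use std::collections::HashMap;

/// Hypothetical description of a framebuffer's attachments (texture ids).
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Attachments {
    pub color: u32,
    pub depth: Option<u32>,
}

#[derive(Default)]
pub struct FramebufferCache {
    objects: HashMap<Attachments, u32>,
    next_id: u32,
    /// Number of simulated framebuffer object creations.
    pub creations: u32,
}

impl FramebufferCache {
    /// Called at draw time: reuses the object for these attachments if one
    /// exists, otherwise creates it.
    pub fn get_or_create(&mut self, attachments: &Attachments) -> u32 {
        if let Some(&id) = self.objects.get(attachments) {
            return id;
        }
        self.next_id += 1;
        self.creations += 1;
        self.objects.insert(attachments.clone(), self.next_id);
        self.next_id
    }

    /// Called after a context rebuild: every cached object is invalid.
    pub fn invalidate_all(&mut self) {
        self.objects.clear();
    }
}
```

Recreating the user-facing framebuffer every frame then costs only a hash lookup, and a context rebuild just empties the cache so that objects are recreated on the next draw.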

Drawing to a texture is as simple as possible. Instead of calling frame.draw(…) you just call texture.as_surface().draw(…).

All glium tests use render to texture to avoid the pixel ownership test.

Note however that textures and framebuffer objects are probably the least polished aspects of glium. A lot of methods and verifications are missing, and they aren’t as robust as they should be.

Drawing, uniforms, and the state machine

The state machine

One of the major problems of OpenGL today is that it is a giant state machine. In other words, its functions have a different behavior depending on the functions you called previously.

For example, the glClear function can be used to fill a surface with a color. In its default state, it will clear the default framebuffer (in other words, the window). But if you call glBindFramebuffer to set the current framebuffer beforehand, then glClear will operate on this framebuffer instead of the default framebuffer. Similarly if you call glScissor beforehand then only a portion of the surface will be cleared, if you call glBeginConditionalRender then the clear will only happen on a certain condition, if you call glColorMask then only some color components will be cleared, and if you call glEnable(GL_RASTERIZER_DISCARD) then nothing will happen at all. And that’s just a simple case.

When OpenGL was first conceived there weren’t a lot of states and it was easy to handle. But over time the complexity of the API has increased, and this design has now become very problematic. If you want to make sure that glClear works in a precise way, you have to call glBindFramebuffer, glScissor, glEndConditionalRender, glColorMask and glDisable before every single call in order to set a specific state. This has two problems: it is easy to forget some function calls here and there, and calling OpenGL functions is very slow. Every time you change the current state the driver has to revalidate the whole state, and calling that many functions every time can really cripple performance.

Glium solves this by providing an API where the user has to pass all parameters to every single function call. Each function has a very precise behavior that only depends on the value of its parameters. Glium automatically tracks the state of the OpenGL context, and performs only the required state changes to reduce the number of function calls to a minimum.
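The idea can be sketched with the glClear example from above. StateTracker and ClearParams are hypothetical names, the strings pushed into calls are only labels for simulated OpenGL calls, and the initial state is assumed to be the default framebuffer with no scissor and all color components enabled.

```rust
/// Everything the caller wants for one clear operation.
#[derive(Clone, Copy, PartialEq, Debug)]
pub struct ClearParams {
    pub framebuffer: u32,
    pub scissor: Option<(u32, u32, u32, u32)>,
    pub color_mask: [bool; 4],
}

pub struct StateTracker {
    // What the tracker believes the context state currently is.
    current: ClearParams,
    /// Log of the simulated OpenGL calls, in order.
    pub calls: Vec<&'static str>,
}

impl StateTracker {
    pub fn new() -> Self {
        StateTracker {
            current: ClearParams { framebuffer: 0, scissor: None, color_mask: [true; 4] },
            calls: Vec::new(),
        }
    }

    /// Stateless from the caller's point of view: the result depends only on
    /// `wanted`. Only the state that actually differs is changed.
    pub fn clear(&mut self, wanted: ClearParams) {
        if self.current.framebuffer != wanted.framebuffer {
            self.calls.push("glBindFramebuffer");
        }
        if self.current.scissor != wanted.scissor {
            self.calls.push("glScissor");
        }
        if self.current.color_mask != wanted.color_mask {
            self.calls.push("glColorMask");
        }
        self.current = wanted;
        self.calls.push("glClear");
    }
}
```

Clearing twice with the same parameters issues the state change once and glClear twice, which is the behavior you would otherwise have to obtain by hand-tracking the state in your application.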

Example of the list of OpenGL function calls during an entire frame. There is no garbage.

Drawing

One example of this stateless API comes when drawing. Drawing is usually a very complex process, involving lots of steps that are very confusing and error-prone even for experts. Instead, glium provides one universal function:

draw(vertex_source, index_source, program, uniforms, parameters)
-> Result<(), DrawError>

Calling this function will:

  • Reuse or create a new vertex array object for the given vertex source. If vertex array objects are not supported by the OpenGL implementation (which is the case of OpenGL ES 2), then it will call glVertexAttribPointer & glEnableVertexAttribArray.
  • Check whether the vertex source matches the attributes expected by the program, returning an error if some are missing.
  • Call glMemoryBarrier if necessary for the various buffers used by the command.
  • Synchronize the parameters with the OpenGL state machine, returning an error if some of them are not supported. It also does some logic-related tests, such as returning an error if you enable depth testing without having a depth buffer available (something that is normally not an error according to the OpenGL specs).
  • Bind the program and its uniform values. Uniform values are cached so that they aren’t updated if it’s not necessary. For textures, glium will try to use all texture slots (one different texture per slot), and will only change the texture bindings if not enough slots are available.
  • Create or reuse the framebuffer object.
  • Check the layout of uniform buffers and SSBOs, returning an error if they don’t match.
  • Create sync fences for persistent mapped buffers that are used by this call.
  • Call one of the eight possible glDraw* functions. If you draw with slices of vertex buffers, glium will try to use the BaseVertex variants but will fall back to creating/reusing a VAO if they are not supported.

This function tries to detect as many problems as possible, instead of relying on OpenGL’s default behaviors. For example if you don’t bind a vertex attribute, the OpenGL specifications say that its value will be undefined. Rather than leaving you to guard against this kind of pitfall yourself, glium returns an error. In my experience this function has already caught a large number of bugs that would be very annoying to find otherwise. However some checks are not made because they would be far too expensive, like checking for values that are out of range in an index buffer.
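As an illustration of one of these checks, here is a sketch of matching a vertex source against the attributes a program expects. The names (SketchDrawError, check_attributes) are hypothetical and attributes are reduced to plain names; the real check also has to deal with types and offsets.

```rust
/// Hypothetical error type for this sketch (not glium's DrawError).
#[derive(Debug, PartialEq)]
pub enum SketchDrawError {
    /// The program reads an attribute that the vertex source does not provide.
    MissingAttribute(String),
}

/// Returns an error if the program expects an attribute that the vertex
/// source does not provide. Extra attributes in the source are ignored.
pub fn check_attributes(
    provided: &[&str],
    expected_by_program: &[&str],
) -> Result<(), SketchDrawError> {
    for &name in expected_by_program {
        if !provided.contains(&name) {
            return Err(SketchDrawError::MissingAttribute(name.to_string()));
        }
    }
    Ok(())
}
```

Returning an error up front turns “the value is undefined, so the mesh renders as garbage” into a message that names the missing attribute.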

Even though this list looks massive, it is important to remember that the biggest overhead is calling OpenGL functions. For example caching the uniform values eats up a lot of CPU, but not caching the values would eat up even more CPU because of the additional OpenGL functions being called. Each OpenGL function call takes a lot of time, not just because of the call itself but also because it can force additional operations in upcoming calls. Any optimization that can remove unnecessary calls will almost always be beneficial.

It is however important to note that glium can only avoid function calls if they are actually not required. As a user, having some knowledge of OpenGL is still helpful in order to perform the operations in a specific order and minimize the number of OpenGL state changes. Glium remains a low-level library and is not a game engine. Grouping your draw calls by program, by parameters or by textures for example can improve performance a lot.

Safety

Prior to OpenGL 4 and OpenGL ES 3.1, everything was memory safe. All operations had the same effect as if they were executed immediately. Since OpenGL 4 and OpenGL ES 3.1, this is no longer totally the case, and in some situations you have to manually handle synchronization, flush the cache, and pass around raw pointers that can crash the program if misused.

Glium handles all of this for you and makes all operations memory safe by automatically calling glMemoryBarrier and glFenceSync/glWaitSync.

OpenGL errors

But glium takes some liberties with the definition of “safe” and is in fact stricter. In addition to memory safety, glium also enforces the lack of OpenGL errors.

When you call an OpenGL function, the OpenGL API lets you check whether it triggered an error by calling glGetError(). This looks similar to the errno mechanism of the standard C library, except that most of the OpenGL functions don’t have any return type or mechanism to indicate that an error was in fact triggered. In order to know whether an operation succeeded, you have no other choice but to call glGetError.

The problem is that glGetError() is slow. On my machine each call to this function takes on average around 0.5µs, and checking for errors after every single function call (which is something that you shouldn’t do by the way) increases the time it takes to draw a frame by as much as 10 to 15%.

To solve this problem, glium tries to make it impossible to have an OpenGL error in the first place. Some errors are easy to prevent, like wrong enumeration values or identifiers to non-existing objects. Some others are prevented through compile-time checks, like preventing a buffer from being used while it is mapped by making mapping a buffer require a &mut Buffer. And some others are prevented through runtime checks.
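The compile-time part relies on nothing more than Rust’s borrow rules. In the sketch below, MappableBuffer and Mapping are hypothetical CPU-only stand-ins: because map takes &mut self and the returned mapping keeps that borrow alive, the buffer cannot be used for anything else until the mapping is dropped.

```rust
/// Hypothetical CPU-only stand-in for a GPU buffer.
pub struct MappableBuffer {
    data: Vec<u8>,
}

/// A live mapping. It holds the exclusive borrow of the buffer for as long
/// as it exists.
pub struct Mapping<'a> {
    data: &'a mut [u8],
}

impl MappableBuffer {
    pub fn new(len: usize) -> Self {
        MappableBuffer { data: vec![0; len] }
    }

    /// Mapping requires exclusive access to the buffer.
    pub fn map(&mut self) -> Mapping<'_> {
        Mapping { data: self.data.as_mut_slice() }
    }

    /// Stand-in for any operation that uses the buffer, such as a draw call.
    pub fn read(&self) -> Vec<u8> {
        self.data.clone()
    }
}

impl<'a> Mapping<'a> {
    pub fn set(&mut self, index: usize, value: u8) {
        self.data[index] = value;
    }
}

pub fn example() -> Vec<u8> {
    let mut buffer = MappableBuffer::new(4);
    {
        let mut mapping = buffer.map();
        mapping.set(0, 42);
        // Calling `buffer.read()` here would be rejected by the compiler,
        // because `buffer` is still mutably borrowed by `mapping`.
    } // The mapping is dropped here and the buffer becomes usable again.
    buffer.read()
}
```

The corresponding OpenGL error simply cannot be written down in safe code, so there is nothing left to check at runtime for this case.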

In order to enforce this, when you compile in debug mode, glium uses the ARB_debug_output extension to detect any potential error. If an error occurs, a panic is triggered and the user is encouraged to report the issue. This may seem invasive, but it is the easiest way to find bugs in glium.

GL_KHR_no_error

Performing runtime checks to avoid OpenGL errors may seem inefficient. After all, the driver already performs the exact same checks. The exact cost of using glium compared to raw OpenGL functions remains to be calculated, but I’m confident that it is low. Even if you find the cost too high, glium has an ace up its sleeve: the GL_KHR_no_error and EGL_KHR_create_context_no_error extensions.

These extensions have recently been approved and aren’t supported anywhere yet (as far as I know), but they have great potential. They allow you to pass a flag during context creation, asking OpenGL not to check for any error and to treat errors as undefined behavior in exchange for a performance gain. Since glium avoids all OpenGL errors in the first place, you can safely use this flag and get that performance gain.

The status of glium

Glium currently has 1,968 commits, 28,686 lines of code in the src directory, 6,964 lines of code in the tests directory, and a build script of around 1.5kloc. For some comparison: hyper has around 13kloc, cargo around 15kloc, image around 12kloc, serde around 16kloc, rustc around 360kloc, and servo around 204kloc.

At compile time, the build script generates two files of 13kloc and 20kloc, resulting in a total of around 62k lines of code in a single crate. This is probably one of the biggest single-crate code bases that exist in Rust right now, and the unfortunate consequence is that glium has encountered some issues such as the compiler reaching the memory limit of travis, triggering a legitimate stack overflow, or taking one hour to compile. If you use glium, you should consider using the nightly versions of rustc, as the compile speed improvements make a big difference.

When it comes to OpenGL support, an issue sums up everything that remains to be implemented in terms of features.

The design of glium has been mostly figured out and shouldn’t change too much. However there are still a lot of incorrect or missing things, especially a lot of missing error detections when it comes to textures. Contributions are of course welcome!
