A case for binary packages

Dany Bittel
4 min read · Oct 24, 2024

--

Source code is almost always saved in a text format. This undoubtedly has many benefits: any language, new or old, fits right into a huge ecosystem (IDEs, Git, ...), which makes it easier for developers to pick up and adopt.

But saving into binary packages also has many benefits, some of them unexpected.

Our programming system miqula uses two different file formats to store and use anything created in it.

The first is the package (.mql). This file plays a role close to a cpp / h / js / css file, just in a binary format. It contains all the code and saves it as a tree, the way it is presented to and edited by the user. This is not the AST. It may contain leaf nodes which actually store expressions as strings. It can also contain arbitrary data such as fonts, images, icons, graphics and lookup tables. All the data is loaded once at the beginning, so this is not meant for loading during execution. The advantage of using a binary format here is mostly access speed and consistency; most of the data could just as easily be represented in ASCII. A package is usually around 1 MB in size (compressed).
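
To make the "tree, but not the AST" idea concrete, here is a minimal Python sketch. It is only an illustration: the names `Node`, `Package` and `leaf_expressions` are made up for this example and say nothing about how the real .mql format is laid out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the editing tree (hypothetical layout, not the real .mql format)."""
    kind: str                                  # e.g. "function", "statement", "expression"
    text: str = ""                             # leaf nodes keep their expression as a string
    children: list["Node"] = field(default_factory=list)


@dataclass
class Package:
    """Code tree plus arbitrary assets, all loaded once at startup."""
    name: str
    root: Node
    assets: dict[str, bytes] = field(default_factory=dict)  # fonts, images, tables, ...


def leaf_expressions(node: Node) -> list[str]:
    """Collect the expression strings stored in leaf nodes, in tree order."""
    if not node.children:
        return [node.text] if node.text else []
    found: list[str] = []
    for child in node.children:
        found.extend(leaf_expressions(child))
    return found


pkg = Package(
    name="demo",
    root=Node("function", children=[
        Node("statement", children=[Node("expression", text="a + b * 2")]),
        Node("statement", children=[Node("expression", text="print(a)")]),
    ]),
    assets={"icon.png": b"\x89PNG"},
)

print(leaf_expressions(pkg.root))
```

The point of the sketch is that the tree mirrors what the user edits, while the expressions inside the leaves are still unparsed strings that a compiler would have to process later.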

The second is the module (.module). This file is more like a dll / lib / object file / cache. It contains all the functions in binary format (for all supported platforms) and in an intermediate format (for inlining). It also exports any kits (types) to be used. It does not contain source code, only the bits needed to use the code (similar to a header file). This file is around 10 MB in size (compressed).

(Figure: a package that has imported the standard module)

While a package is loading in the IDE, all source code is compiled (for the current platform) so that it can be run and updated during editing.

Every package can then be published to a module file, which subsequent packages can use. During publishing, the code is compiled for the additional platforms as well.

Everything that the module needs to function is contained in this single file. If a module itself has dependencies, everything necessary for the module to function is imported before publishing! That means if you import a module, it will work, every time, with no need to chase its dependencies. It follows that two imported modules may use different versions of the same module as a dependency. That is no issue: the modules themselves are binary anyway, and kits (types) from that conflicting module are not available in the current package. If you want something from a module, you need to import it directly.

So if you import a shed, you don’t get to use its screws. If you decide to import a bicycle that uses a newer version of screws, there is no conflict. If you do want to use the screws, you import the screws.
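
The shed and bicycle example can be modelled in a few lines of Python. This is a rough sketch under assumptions: `Module`, `ImportingPackage`, the version strings and the `bundled` field are all invented for illustration and are not miqula's actual data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Module:
    """A published module; its dependencies are baked in at publish time (assumed model)."""
    name: str
    version: str
    kits: frozenset[str]                       # types this module exports
    bundled: tuple["Module", ...] = ()         # private copies of its dependencies


@dataclass
class ImportingPackage:
    imports: dict[str, Module] = field(default_factory=dict)

    def import_module(self, module: Module) -> None:
        self.imports[module.name] = module     # only direct imports are recorded

    def visible_kits(self) -> set[str]:
        # Kits of bundled dependencies are deliberately not exposed.
        return {f"{m.name}.{k}" for m in self.imports.values() for k in m.kits}


screws_v1 = Module("screws", "1.0", frozenset({"Screw"}))
screws_v2 = Module("screws", "2.0", frozenset({"Screw"}))
shed = Module("shed", "1.0", frozenset({"Shed"}), bundled=(screws_v1,))
bicycle = Module("bicycle", "1.0", frozenset({"Bicycle"}), bundled=(screws_v2,))

pkg = ImportingPackage()
pkg.import_module(shed)
pkg.import_module(bicycle)
print(sorted(pkg.visible_kits()))
```

Both screw versions live side by side inside their owners, and the package only sees `Screw` once it imports a screws module itself, for example with `pkg.import_module(screws_v2)`.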

What if a function in a module returns a type that is not part of the module? Say a function of the shed returns a screw. The most obvious thing happens: you have different screws with the same name. This causes no error, per se. If you don’t have the screws module, you cannot access the type regularly, only through reflection. If the screws module is imported, you can access the type that has the same shape.

We push instead of pull dependencies. If a change happens, all modules downstream must be re-published to get the new effects. This should be the time where tests run again to see if everything still behaves as expected.
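
As a sketch of what "pushing" a change could look like, the following Python snippet computes which modules are downstream of a changed one and a possible order in which to re-publish them. The dependency graph, the function name `republish_order` and the "dependencies first" ordering are assumptions made for this example; the text above only states that downstream modules must be re-published.

```python
from collections import deque

# Hypothetical dependency graph: module -> modules it imports.
imports = {
    "screws": [],
    "shed": ["screws"],
    "bicycle": ["screws"],
    "garden": ["shed"],
    "app": ["garden", "bicycle"],
}


def republish_order(changed: str, imports: dict[str, list[str]]) -> list[str]:
    """Return every module downstream of `changed`, dependencies before dependents."""
    # Invert the edges: module -> modules that import it.
    dependents: dict[str, list[str]] = {name: [] for name in imports}
    for name, deps in imports.items():
        for dep in deps:
            dependents[dep].append(name)

    # Breadth-first search collects everything affected by the change.
    affected: set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)

    # Topological sort (Kahn's algorithm) restricted to the affected modules.
    pending = {
        name: sum(1 for dep in imports[name] if dep in affected)
        for name in affected
    }
    ready = deque(sorted(name for name, count in pending.items() if count == 0))
    order: list[str] = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for dependent in sorted(dependents[current]):
            if dependent in pending:
                pending[dependent] -= 1
                if pending[dependent] == 0:
                    ready.append(dependent)
    return order


print(republish_order("screws", imports))
```

Each entry in the returned list would be a natural place to run that module's tests before moving on to the next one.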

If you only have the module file, there is no way to dive into its code or copy from it. Usually you will want to download the package file and the modules it imports. You can then publish it and import it into your own projects.

Generics must have a closed list of parameter types. We can’t use a generic type from a module with a type from another module! I’d say this is the biggest drawback. There are ways we can mitigate it. For starters, many container types are built directly into miqula, and containers are a major case where you’d want generics with unknown types. In the future we might add a template functionality, which works more like “paste this predefined code”. This comes with the issue that all pasted code must be present in the current package. To get back to the shed example: if the shed were a template, one could create a shed for a car, but now you need to have the screws imported as well.
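
The closed-list restriction can be pictured with a small Python model. Everything here is hypothetical (`PublishedGeneric`, the `Shed<...>` notation, the type names); it only illustrates that instantiations are fixed when the module is published and cannot be extended by the importer.

```python
class PublishedGeneric:
    """A generic whose instantiations were all compiled at publish time (assumed model)."""

    def __init__(self, name: str, allowed: set[str]):
        self.name = name
        self.allowed = frozenset(allowed)      # closed list, fixed when publishing

    def instantiate(self, type_name: str) -> str:
        # The module holds only binaries, so nothing new can be compiled here.
        if type_name not in self.allowed:
            raise TypeError(f"{self.name}<{type_name}> was not compiled into the module")
        return f"{self.name}<{type_name}>"


shed_for = PublishedGeneric("Shed", {"Bicycle", "Lawnmower"})
print(shed_for.instantiate("Bicycle"))

try:
    shed_for.instantiate("Car")                # Car comes from some other module
except TypeError as error:
    print(error)
```

A template, by contrast, would carry the source along and compile `Shed<Car>` inside the current package, which is why its dependencies would then have to be imported there too.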

At this point one might prefer to just use text files. But a live coding environment more or less necessitates this approach, especially if the IDE is partly written in itself. Otherwise you’d need to compile everything on startup or use a managed language. Most (all?) interactive programming systems are based on a managed language.

A surprising benefit: no build time to create the final executable. It basically takes as long as it takes to write the file.
