With the release of Ubuntu 16.04, ZFS became officially supported by Canonical. However, this raised issues over licensing (see this article and the links it contains). Here are my thoughts on the issue as a software engineer and Linux user. I do not have much legal expertise, so my discussion will lack legal precision, but I will do my best to address the legal issues highlighted by other articles.
I first wrote this because the idea that using an API constitutes a derivative work does not seem logical to me. Since then, I’ve read more about the GPL and other people’s analysis of the ZFS situation. The argument I make in this first section would, unfortunately, void the GPL, which is not something I actually want to happen. However, I still believe my argument that interfaces are different from derivation is correct in a philosophical sense, just not in a legal one, given how copyright is applied to software today. My conclusion after hours of thinking about this issue is that software licenses need to be explicit about public API usage and linking. Unfortunately, Linux is GPLv2, GPLv3 is only marginally better, and LGPLv3 is too permissive for Linux’s philosophy.
A Desktop Computer Manufacturer
Let’s start with a physical analogy — your computer itself. If I buy an assembled desktop computer, the manufacturer will choose the motherboard, graphics card, power supply, etc. The motherboard is designed to allow various modules to attach to it to enhance functionality and provide services, and the user is able to change these modules.
Now, the desktop assembler might choose a fancy patented graphics card, but plugging that card into the motherboard doesn’t suddenly create a patent conflict. The two components are designed to work together through an interface. If after buying the computer, I decide I have some objection to the graphics card patent, I can just remove it and replace it with another one.
Now, if the desktop assembler decides to create a new Skinny Desktop product and they manufacture it by modifying or taking components from the graphics card and attaching it to the motherboard in a permanent and non-standard way, this changes things. It may violate the graphics card company’s patent. The desktop assembler has created a new derivative work that required modification of both the motherboard and graphics card and they probably need licenses from these vendors. And the user can no longer swap the graphics card component or easily create their own version of this product without specialized knowledge.
ZFS and Linux
This is obviously analogous to the ZFS situation: Linux is the motherboard, ZFS is the graphics card, and Ubuntu and other distros are the computer assembler. A different kind of intellectual property is involved (copyright instead of patents), but does software really make the implications so different?
From the user perspective, it seems similar. ZFS and Linux communicate through a well-defined interface; ZFS is a separate module, which can easily be attached or detached by the user. After I install Ubuntu, if I have an issue with ZFS I can easily remove it without affecting Linux; I can install other file-system modules, even proprietary ones.
So what’s the issue? According to Software Freedom Conservancy:
Once license incompatibility is established, the remaining question is solely whether or not combining ZFS with Linux creates a combined and/or derivative work under copyright law (which then would, in turn, trigger the GPLv2 obligations on the ZFS code).
And that article links to the GNU opinion that “Linking a GPL covered work statically or dynamically with other modules is making a combined work based on the GPL covered work.”
I’m not convinced.
When a component is distributed as a module, its very design is to be separate from other entities. By conforming to the kernel module standard, Linux and ZFS implicitly agree that they are separate from each other and work together only through a specified interface. To say that assembling them side by side creates a derivative work under copyright just seems incorrect to me. Furthermore, the user is aware that they are separate: ZFS is a package that can be added or removed by the user with a single simple action.
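To make the "separate module, shared interface" idea concrete, here is a minimal sketch in Python. It is only an illustration of the argument, not how the Linux kernel or ZFS are actually written: the names `Kernel`, `FileSystemModule`, and `ToyZfs` are all hypothetical, and a real kernel module interface is far larger.

```python
from typing import Protocol


class FileSystemModule(Protocol):
    """The published interface: the only thing the two sides share."""

    name: str

    def read(self, path: str) -> str: ...


class Kernel:
    """Hypothetical host. It knows the interface, not any particular module."""

    def __init__(self) -> None:
        self._filesystems: dict[str, FileSystemModule] = {}

    def load(self, module: FileSystemModule) -> None:
        # Attaching a module does not change any of the host's own code.
        self._filesystems[module.name] = module

    def unload(self, name: str) -> None:
        # Detaching is just as simple; the host keeps working without it.
        self._filesystems.pop(name, None)

    def loaded(self) -> list[str]:
        return sorted(self._filesystems)

    def read(self, fs: str, path: str) -> str:
        return self._filesystems[fs].read(path)


class ToyZfs:
    """Hypothetical module, written separately against the interface only."""

    name = "toyzfs"

    def read(self, path: str) -> str:
        return f"toyzfs contents of {path}"


if __name__ == "__main__":
    kernel = Kernel()
    kernel.load(ToyZfs())
    print(kernel.loaded())
    print(kernel.read("toyzfs", "/tank/notes.txt"))
    kernel.unload("toyzfs")
    print(kernel.loaded())
```

Neither class contains a copy of the other's code; they meet only at `FileSystemModule`. Whether that technical separation matters legally is exactly the open question.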
Biological Ecosystems Analogy
An ecosystem is an assemblage of various organisms. As creatures evolve, they derive their new form from the genetic code of their ancestors. One would never describe the ocean ecosystem as a shark+tuna+dolphin derivation. You would say that the ecosystem is made up of various organisms assembled together.
Now, there are bacteria that exchange DNA. So when you look at one species you would say that it was derived in part from other species. And if bacteria were smart (or arguably stupid) they would sue the hell out of each other.
At first this may seem silly, but I just want to point out that this idea of code and derivation does have a basis in the natural world and is not just a human creation. There is a fundamental difference between an assembly of entities that each maintain their own code independently, but interact with each other, and entities that actively exchange code with each other.
It’s unfortunate that the GPLv2 does not differentiate between linking/using and copyright derivations. In my opinion (as argued in this post) these are separate concepts from a philosophical standpoint. The idea that using an API is a derivation under copyright law seems flawed to me. But if it’s not, then the GPL has no power as it is written, and it’s not my belief or wish that the GPL should be void.
My hunch is that this is all because copyright is the only legal framework for the GPL to build on. So maybe if there were some kind of definition for interfaces in the legal system, then licenses such as the GPL could be written in a more understandable form where copyright pertains to source code and the binary, but use of the work through public interfaces is described separately.
Clearly there is confusion in the legal system and among software developers on the issue of APIs, as the Oracle v. Google lawsuit over the Java APIs also demonstrates. This is probably a topic for another post, but in my opinion if an API is ever publicly released without restrictions on use, it should be considered public domain. The copyright owner can re-license the implementation, but I don’t think they should be allowed to own the copyright on an API and its documentation once it has been released to the public.
So maybe, if the idea of a “public interface” was well-defined in the legal system, these copyright and licensing issues would have more clarity.
I also recommend this in-depth discussion of the GPL and ZFS.
The Linux Kernel, CDDL and Related Issues
Eben Moglen & Mishi Choudhary February 26, 2016
GPLv2 is just a confusing license, which I suppose is why GPLv3 was created.
The GNU GPL FAQ has an answer related to the API issue:
What is the difference between an “aggregate” and other kinds of “modified versions”?
Where’s the line between two separate programs, and one program with two parts? This is a legal question, which ultimately judges will decide. We believe that a proper criterion depends both on the mechanism of communication (exec, pipes, rpc, function calls within a shared address space, etc.) and the semantics of the communication (what kinds of information are interchanged).
If the modules are included in the same executable file, they are definitely combined in one program. If modules are designed to run linked together in a shared address space, that almost surely means combining them into one program.
By contrast, pipes, sockets and command-line arguments are communication mechanisms normally used between two separate programs. So when they are used for communication, the modules normally are separate programs. But if the semantics of the communication are intimate enough, exchanging complex internal data structures, that too could be a basis to consider the two parts as combined into a larger program.
This answer is not very precise though; it just says it’s up to the judges, which seems like too much uncertainty. Why use a vague license when you could specify exactly how you want API usage and application linking to be restricted?
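The two mechanisms the FAQ contrasts can be shown side by side. The sketch below is my own illustration, not anything taken from the FAQ: `word_count`, `count_in_process`, and `count_via_pipe` are hypothetical names, and the "other program" is just a second Python interpreter started with the standard `subprocess` module. The same toy task is done once as a function call in a shared address space and once by a separate process talking over pipes.

```python
import subprocess
import sys


def word_count(text: str) -> int:
    """Toy 'library' routine that lives in this program's address space."""
    return len(text.split())


def count_in_process(text: str) -> int:
    # Mechanism 1: a function call within a shared address space.
    # Caller and callee are loaded into the same process.
    return word_count(text)


# Source of a tiny stand-alone program that does the same job.
CHILD_SOURCE = "import sys; print(len(sys.stdin.read().split()))"


def count_via_pipe(text: str) -> int:
    # Mechanism 2: a separate process, reached only through pipes
    # (stdin/stdout). Only plain text crosses the boundary.
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    sample = "ZFS and Linux talk through an interface"
    print(count_in_process(sample), count_via_pipe(sample))
```

Both calls give the same answer, which is part of why the mechanism alone feels like a thin basis for a legal distinction: the FAQ itself adds that the semantics of what is exchanged also matter.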
GPLv3 does not use the word “derive(d)” once, and instead says:
A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an “aggregate” if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation’s users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.
If your software has public APIs, the GPLv3 just doesn’t seem appropriate; it’s nearly as vague as GPLv2. Only the LGPL talks of libraries and APIs in a sensible way. I don’t think GNU’s “let’s leave it up to judges” attitude is very smart. Why not create a strong copyleft license that is explicit about API usage and program linking? Ideally, I think Linux should use a license that is somewhere between GPLv3 and LGPLv3: one that describes how the public APIs are to be used and explicitly states the restrictions on programs that dynamically link to the kernel (they must be open-source), rather than letting the legal system try to resolve this issue in court.
So maybe we need a GPLv4? Other people have already advocated for it:
GPLv4 — Starting the Conversation
Published by Christopher Price on January 28, 2014