In the past 20 years, Red Hat Enterprise Linux (RHEL) has become the de facto Linux distribution for servers in the enterprise, capturing more than 60% of the paid distribution market. Red Hat’s position as the market leader makes it the trendsetter for how enterprise software is developed for Linux. Unfortunately, that position has hurt software developers pushing the edge with container and networking technology.
Because RHEL is the market leader in the enterprise, backed by a multi-billion dollar institution with a growing product set, developers must treat RHEL as the lowest common denominator when building software. Choosing RHEL helps create a vacuum in the public market for software that can solve problems the way Facebook and Netflix do.
Much like other OS vendors, Red Hat delivers a new version of RHEL every 2–3 years. In Red Hat’s own words, each release is a collection of many different parts, all running upon the Linux kernel. When a new major release comes out, Red Hat typically bases it on a kernel that was released about a year earlier. For the lifetime of the release Red Hat does not upgrade the kernel, yet it applies patch upon patch in every minor version to address security, drivers, and features that matter to Red Hat and its partners. The kernel is the only component that a userspace programmer cannot independently advance. Herein lies the problem.
Upstream is Good Enough
The Linux upstream cares immensely about backwards compatibility and not breaking user space. Linus Torvalds has ranted many times about the #1 rule of kernel development: WE DO NOT BREAK USERSPACE! Given this, ABI compatibility is not a good reason to avoid upgrading. There are thousands of kernel developers who ensure this every day.
If a change results in user programs breaking, it's a bug in the kernel. We never EVER blame the user programs. How hard can this be to understand?
- Linus Torvalds, 2012, LKML
Performance is not a feature, it’s a basic requirement
Not only is the kernel not upgraded, the toolchain (compiler) is not upgraded either. The reasoning behind this is yet another appeal to stability, but over the past decade, GCC has made incredible leaps and bounds in optimizing code for performance. It is unknown how much performance is lost across the entire fleet of RHEL installations, and how much extra power is used for little reason.
As the computing ecosystem changes, different components of the kernel come under stress, leading to performance optimizations and internal architecture changes. Features such as IPv6 and tunneling were often seen as components included for completeness, not as critical parts of the system, until they became critical to datacenter computing.
Further back, when SSDs first came out, the lack of TRIM support made using them a non-starter for many. Only when they became ubiquitous did the kernel team put work into TRIM. It was some years before their work became available to RHEL users.
Kernel 3.10, the kernel used in Red Hat’s newest operating system, RHEL 7.2, was released in June 2013, when Docker was only a few months old. Containers were perhaps only being considered by those on the bleeding edge of technology, or by institutions with hyperscale footprints, like Google and IBM.
Since the popularization of Linux containers by Docker, a variety of kernel features have been added to ease the deployment of multi-tenant workloads enabled by containers, only some of which have been partially backported by Red Hat. Features such as eBPF and advanced namespaces could be used to build higher performing, more secure, and more capable containers.
eBPF was added to the Linux kernel more than 18 months ago. It may seem familiar to some people who have heard of BPF, an instruction set used for network filtering and seccomp. It was extended to form eBPF, an instruction set and runtime meant to enable developers to access aspects of the kernel in a safe, JIT’d environment. eBPF can be used to do performance tracing, traffic shaping, and network filtering. One can only imagine what having access to such a toolkit would enable.
eBPF can also be used to do performance troubleshooting. In recent versions of the kernel, kprobes and perf have gained tight integration with eBPF. This allows nearly zero overhead tracing of aspects of the Linux kernel, a capability that was previously only available on operating systems such as Solaris via DTrace. Brendan Gregg, a Netflix SRE, has consistently evangelized the capabilities of eBPF on his blog, and made it clear how effective eBPF has made organizations like Netflix at running systems at scale. We can’t ignore a technology that is simply a game changer.
One of the more interesting features in this cycle is the ability to attach eBPF programs (user-defined, sandboxed bytecode executed by the kernel) to kprobes. This allows user-defined instrumentation on a live kernel image that can never crash, hang or interfere with the kernel negatively.
- Ingo Molnár, 2015, LKML
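As a rough illustration of the gap, here is a minimal Python sketch that compares a kernel release string against the upstream versions where these capabilities first landed: the bpf() system call in 3.18 and eBPF programs attached to kprobes in 4.1. The helper names and the lookup table are my own, made up for illustration. A version comparison only reflects upstream history, and a vendor kernel with backports may behave differently, so treat the result as a hint rather than a definitive check.

```python
import platform
import re

# Upstream kernel versions in which each capability first appeared.
# Assumption: vendor kernels may backport features, so this table is
# only a rough guide to what a given release string implies.
UPSTREAM_MINIMUMS = {
    "bpf() system call": (3, 18),
    "eBPF programs attached to kprobes": (4, 1),
}


def parse_kernel_release(release):
    """Extract (major, minor) from a string such as '3.10.0-327.el7.x86_64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def upstream_support(release):
    """Map each capability to True if the upstream base version includes it."""
    version = parse_kernel_release(release)
    return {
        feature: version >= minimum
        for feature, minimum in UPSTREAM_MINIMUMS.items()
    }


if __name__ == "__main__":
    # platform.release() reports the running kernel, like `uname -r`.
    release = platform.release()
    print(f"kernel release: {release}")
    for feature, supported in upstream_support(release).items():
        status = "in upstream base" if supported else "not in upstream base"
        print(f"  {feature}: {status}")
```

Passing a 3.10-based release string, such as the illustrative '3.10.0-327.el7.x86_64', to upstream_support reports both capabilities as missing from the upstream base, which is the point of this section.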
Linux for Adults
Red Hat’s market share and perspective have slowed down feature development and capabilities in other spaces, such as networking, security, and databases. We cannot allow this to continue. They’ve abused their position as the market leader to hurt their competition. They continue treating us like children who cannot understand the software running on our computers.
Over the past few years, many other distributions have matured. On one hand, we’ve gained brand new distros such as NixOS, while we still have Fedora and Debian as reliable old favourites of the market. Lastly, there are the likes of Canonical’s Ubuntu and CoreOS, which come with enterprise support.
Be an adult. Take responsibility for your OS. Choose an OS that will let the rest of the market advance.