A Linux Developer’s Flight Through Azure Skies
In 2016, the Adobe and Microsoft CEOs announced to the world that the two companies were entering into a strategic partnership and that Adobe would make Microsoft Azure “its preferred cloud computing platform.”
I’ve been a Linux developer for the past 15 years. I work for Adobe as part of the Adobe Experience Cloud. We have used Linux and open source software (OSS) almost exclusively to build our SaaS offerings. Hearing that we would use a cloud run by the maker of Windows gave me reason to pause. Could our Linux ethos work on Azure? Would I have to install a Windows VM just to do my job?
Two years later, here’s a brief compilation of some advice I would give a Linux developer who is considering using Microsoft Azure.
Microsoft doesn’t hate Linux (anymore)
The first thing to realize, and it has taken many of us *NIX diehards time to accept, is that Microsoft is no longer the yesteryear public-enemy-number-one of Linux and open source software. They are changing their tune. They are, amazingly, champions of open source. They are even enthusiastic about Linux.
Microsoft might be embracing open source for philosophical reasons, but it is more likely that it makes business sense. Cloud computing is mostly a Linux world. If you want to dominate in that market, you need to court Linux developers. This strategy appears to be working for Microsoft: They now have more Linux workloads in Azure than Windows workloads.
There’s a lot more evidence of them going all-in on open source. Here are just a few supporting tidbits:
- Microsoft is the largest OSS contributor on GitHub — and this was true well before they acquired GitHub.
- They have released core technologies like .NET Core, all the Azure SDKs, and the venerable VS Code IDE as open source, and these all run on Linux as first-class citizens.
- Perhaps the most significant signal is they have “open sourced” all of their patents by joining the Open Invention Network, essentially ending any potential Cold War scenarios with projects like Android, OpenStack, and yes, Linux.
Once we realize that Microsoft wants Linux developers to succeed, we can all stop worrying about falling into some trap. We can instead focus on the technical merits of Azure and get our jobs done.
After Adobe announced its intention to use Microsoft’s cloud, I started poking around the Azure documentation. To my great relief, Azure had Linux VMs you could immediately provision. They supported all of my favorite distributions: CentOS, Ubuntu, and Debian. I would not have to switch to .NET in order to utilize Azure.
The Linux VMs in Azure are similar to other public cloud offerings. There are many options for compute, memory, and disk space. You can SSH into the machines, open ports to the outside world, and the host appears to be, well, a Linux host. Performance is cloud-nominal: CPU and RAM have little-to-no overhead from being virtualized, but the “noisy neighbor” problem sometimes causes drops in performance.
One thing we’ve learned is that you should always use Managed Disks for your VMs. Released almost two years ago, Managed Disks are now the default for new hosts. They eliminate the need to keep track of where your VM’s disks are stored. Select either Standard or Premium performance and Azure does the rest.
Premium Storage disks are backed by SSD and take advantage of some aggressive caching. When using a VM with a non-RAID Premium disk, we were surprised when some of our benchmarks revealed better throughput than the attached ephemeral SSD. Learning more about Premium Storage is definitely worth it.
Portal is not bad, but Azure CLI is better
The uninitiated will start exploring the cloud’s capabilities by logging into Azure’s portal: portal.azure.com. This is how I cut my teeth, and I found it a great way to understand the breadth of options available. The UI is clean, consistent, and has a helpful omni-search function to find and create resources. Integration with SSO and LDAP makes logging in very convenient. I still use the portal most days.
Most Linux developers cannot abide a GUI for long. The command line is where we spend our time. Are we doomed to use PowerShell with Azure? Fortunately, Microsoft provides a top-notch command-line interface (CLI) that is not PowerShell. It is called, very reasonably, Azure CLI.
Microsoft has done their homework with Azure CLI. It is a frontend for the expansive Azure HTTP APIs, and it covers all aspects of interacting with Azure: subscriptions, networking, VMs, disks, and even managed services. Installing the CLI was a snap, as they had binaries for all major platforms and even a Dockerized version which avoids any dependency issues.
The CLI is written in Python and, like so many things related to Azure, is hosted on GitHub with the permissive MIT open source license. It is updated frequently, and most tutorials include the CLI in their instructions. It functions like any well-behaved Linux tool. We use it constantly — both in scripts and in ad-hoc operations.
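As a rough illustration of the scripted use described above, here is a minimal Python sketch that assembles an `az vm create` command and runs it only when the CLI is installed. The resource group, VM name, and user name are hypothetical placeholders, and the image alias and disk SKU are assumptions you would adjust for your own subscription. It is not a description of how our own scripts are written.

```python
import json
import shutil
import subprocess


def build_vm_create_command(resource_group, vm_name, admin_user,
                            image="Ubuntu2204", storage_sku="Premium_LRS"):
    """Return the argument list for creating a Linux VM with a managed disk.

    The default image alias and storage SKU are assumptions for illustration.
    """
    return [
        "az", "vm", "create",
        "--resource-group", resource_group,
        "--name", vm_name,
        "--image", image,
        "--admin-username", admin_user,
        "--generate-ssh-keys",          # create an SSH key pair if none exists
        "--storage-sku", storage_sku,   # Premium (SSD-backed) managed disk
        "--output", "json",
    ]


def run_az(command):
    """Run a command and parse its JSON output; return None if the binary is absent."""
    if shutil.which(command[0]) is None:
        return None
    completed = subprocess.run(command, check=True, capture_output=True, text=True)
    return json.loads(completed.stdout)


if __name__ == "__main__":
    # Only print the command here; call run_az(cmd) yourself once you are
    # logged in and are sure you want to create (and pay for) the VM.
    cmd = build_vm_create_command("demo-rg", "demo-vm", "azureuser")
    print(" ".join(cmd))
```

Keeping the command construction separate from execution makes the script easy to review and dry-run before it touches a real subscription.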
Learn about ARM templates
While the Azure CLI provides an imperative mechanism for managing your cloud resources, Microsoft also offers ARM (Azure Resource Manager) templates, a JSON pseudo-language used to define Azure objects. This enables a declarative “infrastructure-as-code” approach to Azure management. These are worth learning about.
Although JSON is not my favorite way of defining resources, it is better than XML. ARM templating is valuable when trying to maintain, and easily recreate, complex scenarios. You can even export an ARM template for existing Azure resources from the portal. (This helps to learn how Azure represents infrastructure in ARM — but you can’t necessarily use the downloaded template unmodified to clone resources.) Once created, the templates can be tracked in your favorite source control tool.
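To give a feel for the declarative style, the following Python sketch generates a small ARM template that defines a single storage account. The parameter names, the `apiVersion` value, and the default SKU are assumptions chosen for illustration, so check them against the current template reference before deploying anything.

```python
import json


def storage_account_template(default_sku="Standard_LRS"):
    """Build a minimal ARM template (as a dict) describing one storage account."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            # Hypothetical parameter names; supply values at deployment time.
            "storageAccountName": {"type": "string"},
            "skuName": {"type": "string", "defaultValue": default_sku},
        },
        "resources": [
            {
                "type": "Microsoft.Storage/storageAccounts",
                "apiVersion": "2022-09-01",  # assumed; use a version your subscription supports
                "name": "[parameters('storageAccountName')]",
                "location": "[resourceGroup().location]",
                "sku": {"name": "[parameters('skuName')]"},
                "kind": "StorageV2",
            }
        ],
    }


if __name__ == "__main__":
    # Print the template so it can be redirected into a file and committed.
    print(json.dumps(storage_account_template(), indent=2))
```

Saved as, say, `storage.json`, a template like this can be handed to the CLI with `az deployment group create --resource-group <group> --template-file storage.json --parameters storageAccountName=<name>`. Generating the JSON from code is only one option; many teams write the JSON by hand and keep it in source control directly.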
Azure Storage is worth a deep dive
Although Azure Storage sounds similar to Amazon’s offerings, it rolls the equivalents of S3, EBS, an HDFS filesystem, a NoSQL database, and a queuing system into one service. Nearly every other Azure service depends on Azure Storage, including your VM disks, so I would not shy away from depending on it as well.
Because Azure Storage is quite sophisticated, it is worth learning as much as you can about this technology. Its many uses may surprise you.
If you are porting services into Azure, consider investing in a refactor that embraces Azure’s Blob Storage rather than attached disks. Microsoft has invested a lot to make Blob Storage scalable, performant, full-featured, and affordable.
Azure Table storage is also a capable alternative to the more complex Cosmos DB. Switching to Cosmos DB later on is straightforward, as it is interface compatible with Table storage.
Security and Network Security Groups are a must
Another area that requires careful education is security in Azure. Because services will be hosted in a public cloud they can easily be exposed to the entire Internet.
Most of the default settings that Azure recommends will secure your VM or service using authentication. With VMs you can choose from SSH keys or passwords. On Storage Accounts you can choose from keys, SAS tokens, etc. It is important to understand the pros and cons of each of these.
I’d recommend, however, that you go one step further and always use a well-configured Network Security Group (NSG). These are basically firewall objects. NSGs define ingress and egress ports, along with allowed IP ranges.
I have an NSG definition written in an ARM template that I use again and again when spinning up resources. It restricts sensitive ports to my corporate IP range to prevent non-Adobe actors from attacking.
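The template I use is not reproduced here, but a sketch of the idea looks something like the following. It builds an NSG resource with one rule that allows SSH only from a given range, plus a deliberately simplified evaluator that mimics how rules are checked in priority order (lowest number first, first match wins). The rule name, the `apiVersion`, and the `203.0.113.0/24` documentation range are assumptions, and the evaluator ignores the default rules that real NSGs include.

```python
import ipaddress


def ssh_rule(allowed_cidr, priority=100):
    """Return an inbound rule allowing TCP/22 only from allowed_cidr."""
    ipaddress.ip_network(allowed_cidr)  # raises ValueError on a malformed range
    return {
        "name": "allow-ssh-from-corp",  # hypothetical rule name
        "properties": {
            "priority": priority,
            "direction": "Inbound",
            "access": "Allow",
            "protocol": "Tcp",
            "sourceAddressPrefix": allowed_cidr,
            "sourcePortRange": "*",
            "destinationAddressPrefix": "*",
            "destinationPortRange": "22",
        },
    }


def nsg_resource(name, rules):
    """Wrap rules in an NSG resource suitable for an ARM template's resources list."""
    return {
        "type": "Microsoft.Network/networkSecurityGroups",
        "apiVersion": "2023-04-01",  # assumed; verify before deploying
        "name": name,
        "location": "[resourceGroup().location]",
        "properties": {"securityRules": rules},
    }


def is_inbound_allowed(rules, source_ip, port):
    """Simplified model: lowest priority number first, first match wins, else deny."""
    addr = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r["properties"]["priority"]):
        props = rule["properties"]
        if props["direction"] != "Inbound":
            continue
        port_ok = props["destinationPortRange"] in ("*", str(port))
        prefix = props["sourceAddressPrefix"]
        source_ok = prefix == "*" or addr in ipaddress.ip_network(prefix)
        if port_ok and source_ok:
            return props["access"] == "Allow"
    return False  # stands in for the implicit deny-all at the end


if __name__ == "__main__":
    rules = [ssh_rule("203.0.113.0/24")]
    print(is_inbound_allowed(rules, "203.0.113.10", 22))
    print(is_inbound_allowed(rules, "198.51.100.7", 22))
```

Checking a rule set locally like this is a cheap way to catch an overly broad source range before the template is ever deployed.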
More advanced users should also look into Virtual Network Service Endpoints which keep traffic between Azure services off the public Internet completely.
Azure managed services
Azure has two flavors of managed services: fully managed (e.g., Cosmos DB) and partially managed (e.g., HDInsight).
The fully-managed services are what they sound like. You provision them, use them, and never have to worry about disks filling up, hardware failures, or how scaling is managed. Azure takes care of that for you.
The partially-managed services allow you to spin up OSS stacks in minutes, but after setup you are responsible for ongoing maintenance. Azure provides a solid default configuration, and even alerts and monitoring, but if you aren’t careful, you can fill up a disk and take the service down, at least until you fix the problem.
Partially-managed services might not sound that great, but we have found them to be extremely helpful in rapid prototyping. Need a Kafka cluster? HBase? Spark? Using HDInsight, you can have any of these and can focus on building your software. Once your prototype graduates, you can decide to stick with HDInsight or transition to a self-managed stack. We’ve taken both approaches at Adobe. In one case we used HDInsight Kafka for rapid development and kept it into production with great results. In other cases, our operations team took ownership and we deprovisioned HDInsight.
Cosmos DB, a fully-managed database solution, is another service we’ve used extensively across Adobe. It offers the same provision-and-use-immediately benefit as the partially-managed services. Scaling it or setting up global distribution is a snap. Like all cloud services, however, one should analyze costs to determine if it is the best long-term solution. Taking a naïve approach to “cost engineering” can result in unexpectedly large bills.
Another “service” I use constantly is Azure’s Container Registries. In under one minute you can get a high-performance Docker registry spun up and made available to store container images. Well worth a try if you are drinking the container Kool-Aid.
Azure Kubernetes Service
Speaking of containers, Azure provides an excellent partially-managed service called Azure Kubernetes Service (AKS). It is a great way to rapidly set up Kubernetes and learn more about it. We use it for development, testing, and even production applications. The tutorials and docs are good. Definitely check it out.
Kubernetes was built with the cloud in mind, and I feel like using AKS (or a similar cloud offering) is the best way to experience Kubernetes. Otherwise, you will spend valuable development time maintaining complicated software. Minikube or MicroK8s work for small-scale development, but there’s no replacement for a full-sized deployment.
If your applications are networking-bound, you’ll want to know about Accelerated Networking.
Initially we found less-than-stellar performance for network-heavy applications running in Azure. Fortunately, Microsoft was already working on a solution and has introduced Accelerated Networking VMs. This class of VM uses special OS drivers (available for many Linux distros) to bypass a virtual NIC and communicate directly with the physical NICs for much improved performance.
Application Gateway load balancing works for small- to medium-scale systems
When deploying web-based applications, a load balancer is a must. A quick search in Azure will result in the Application Gateway load balancer. It provides a lot of convenient features, including TLS termination. Experience Cloud uses it in production for a few of our mission-critical components.
We have found, however, that Application Gateway works best for small-to-medium use cases, where you don’t have thousands of transactions per second. If Application Gateway is not meeting your needs for high-scale loads, then using a layer 4 “Azure Load Balancer” coupled with another solution for TLS termination is a good way to go. Something like nginx or haproxy can work well in many scenarios.
Azure is evolving incredibly fast
One thing we learned early on is that Azure is evolving incredibly fast. There have been multiple instances where the documentation and even the portal UI would be improved from one week to the next. Features are added regularly. Microsoft understands how competitive the landscape is right now and is in the race to be a top player.
There have been instances where an Azure service didn’t meet our needs. After engaging Microsoft about these issues, we were often told they were already working on solutions. In some cases, we only had to wait weeks before our requirements were met.
From our interactions with Microsoft, I have gathered that Azure’s responsiveness is not just because Adobe is a big and important partner. Rather, the product and engineering teams at Microsoft are plugged in to the expectations of Linux developers. Many of them are Linux users themselves.
Familiar, but different, and sometimes better
As I’ve dug deeper into Azure, I recognize a “Microsoft pattern” I’ve seen before. When C# was first released, I remember thinking how much it looked and felt like Java. As I learned more about the .NET platform, I realized that Microsoft had not only created a Java competitor, they had learned from Sun’s experience, and built an entire ecosystem of languages that could interoperate and provide more value for their developers. C# and .NET felt familiar, but were different, and in many ways were better because Microsoft learned from those that went before.
Azure seems the same way to me. At first glance I thought it was like the other major cloud platforms, but with use, the familiarity gives way to differences that feel like welcome refinements.
Azure isn’t perfect, but it definitely isn’t horrible either. It is more than this Linux developer thought it would be, and I’ve been able to stay true to my open source workflow while using it. It very well may be the preferred cloud of Linux developers in the future. That sound crazy? Microsoft being the biggest OSS contributor in the world also sounds kind of crazy. If Microsoft can learn to love Linux, anything is possible.