Linode Cluster Toolkit — Part 2

Karthik Shiraly
Published in Linode Cube
6 min read · Jul 27, 2017

In Part 1, I covered some of the basic features of the Linode Cluster Toolkit (LCT) and LinodeTool. If those names, or terms like “Cluster Plans,” are new to you, you may want to read Part 1 first.

In this article, I’ll introduce some of their more advanced features, including predefined cluster plans, firewall management, advanced disk image management, and DNS management.

Predefined Cluster Plans

While LCT and LinodeTool make it easy to create clusters from cluster plans, preparing a cluster plan itself often requires domain knowledge and prior experience with the software stacks it deploys.

LCT and LinodeTool come packaged with a number of reusable, predefined cluster plans. Creating a cluster based on one of these cluster plans is simple:

$ linodetool list packageplans
$ linodetool cluster create hdfs1 'package:hdfs-ha'

If you have a cluster plan to share, you are welcome to contribute it to the GitHub repo.

Cluster plans can also be loaded from URLs, which lets teams in a company maintain a repository of cluster plans and share them with internal and external clients:

$ linodetool cluster create yourappcluster http://YOUR-SERVER/APP-CLUSTER.yaml
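
The two commands above pass the plan either as a 'package:' reference or as a URL. As a rough illustration of how a tool could tell such references apart, here is a minimal Python sketch. The function name classify_plan_source is hypothetical, and treating anything else as a local file path is my assumption for this example; none of this is LinodeTool's actual code.

```python
from typing import Tuple
from urllib.parse import urlparse


def classify_plan_source(plan_ref: str) -> Tuple[str, str]:
    """Return (kind, location) for a cluster plan reference.

    Hypothetical helper for illustration only; not part of LinodeTool.
    """
    prefix = "package:"
    if plan_ref.startswith(prefix):
        # A predefined plan shipped with the toolkit, e.g. 'package:hdfs-ha'.
        return ("package", plan_ref[len(prefix):])
    if urlparse(plan_ref).scheme.lower() in ("http", "https"):
        # A plan hosted on a web server, e.g. a team's internal plan repository.
        return ("url", plan_ref)
    # Assumption: everything else is a path to a plan file on local disk.
    return ("file", plan_ref)


for ref in ("package:hdfs-ha", "http://YOUR-SERVER/APP-CLUSTER.yaml"):
    print(ref, "->", classify_plan_source(ref))
```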

Firewall Management

Cluster operations often involve delicate coordination of software-specific configuration files, daemons and firewalls. Many software stacks let you customize network interface bindings and ports through their configuration files. Those files are then distributed to all nodes of a cluster and read by the daemons running on those nodes. The complication is that whenever these settings change, the firewalls on those nodes have to be reconfigured to match.

For example, consider ZooKeeper. If you are not familiar with it, think of ZooKeeper as software that acts like a two-way radio, letting programs reliably broadcast announcements to one another. Many big data clusters, including Storm, HDFS, and Spark, rely on ZooKeeper for storing and distributing information, both internally and externally.

Now imagine one such big data cluster has to be expanded by adding more nodes — usually done to handle higher load or improve availability. Apart from the provisioning and configuration, from a firewall point of view, such an expansion typically involves:

  • Knowing which software daemons should be started on the new nodes, and on which network interfaces and ports they should communicate.
  • Configuring firewalls on all existing nodes to allow communication with the new nodes, and vice versa, while still denying access to any unknown node.
  • Reconfiguring the firewalls of dependent clusters (ZooKeeper in this example) so that they, too, can communicate with the new nodes. It’s not just the nodes of the expanded cluster that need changes.
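
To make the steps above concrete, here is a minimal Python sketch that computes which "allow" rules an expansion would need. It is only an illustration under stated assumptions: AllowRule and rules_for_expansion are hypothetical names, the IP addresses and the cluster port are made up, and this is not how LCT itself represents firewall rules. The only real-world value is 2181, ZooKeeper's default client port.

```python
from typing import List, NamedTuple


class AllowRule(NamedTuple):
    on_node: str   # IP of the node whose firewall gets the rule
    from_ip: str   # peer IP that is allowed to connect
    port: int      # destination port opened for that peer


def rules_for_expansion(existing: List[str], new: List[str],
                        cluster_ports: List[int],
                        related_nodes: List[str],
                        related_ports: List[int]) -> List[AllowRule]:
    """Compute the allow rules needed when `new` nodes join a cluster.

    Hypothetical sketch; existing-to-existing rules are assumed to be
    in place already, and everything not listed stays denied.
    """
    new_set = set(new)
    members = list(existing) + list(new)
    rules = []
    # Open cluster ports between every pair that involves a new node.
    for node in members:
        for peer in members:
            if node != peer and (node in new_set or peer in new_set):
                rules.extend(AllowRule(node, peer, p) for p in cluster_ports)
    # The related cluster (ZooKeeper in the example) must accept the new nodes.
    for node in related_nodes:
        for peer in new:
            rules.extend(AllowRule(node, peer, p) for p in related_ports)
    return rules


for rule in rules_for_expansion(
        existing=["192.168.130.1", "192.168.130.2"],
        new=["192.168.130.3"],
        cluster_ports=[6700],             # illustrative cluster port
        related_nodes=["192.168.140.1"],  # illustrative ZooKeeper node
        related_ports=[2181]):            # ZooKeeper default client port
    print(rule)
```

Even with only one new node, every existing node and the related cluster need a rule change, which is why doing this by hand gets error-prone as clusters grow.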

This deployment pattern is so common that LCT cluster plans provide a number of related features to support it:

  • Firewall configuration commands can be included in a cluster plan and run in response to cluster events. These are simple declarative commands that describe which source and destination ports and IP addresses to configure when an event hook such as “start cluster” fires.
  • Event hooks, which are executed on cluster commands like “start cluster,” give you the opportunity to call your own shell, Ansible, or other configuration scripts. If you prefer to manage firewalls with your own scripts, you can use inventory queries to shortlist nodes and pass them to those scripts.
  • Cluster dependencies can be described so that one cluster’s events trigger the event hooks of clusters that depend on it.
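
The interplay of event hooks and cluster dependencies can be sketched in a few lines of Python. This is a toy model under my own assumptions, not LCT's cluster plan syntax or internals: the HookRegistry class, its method names, the cluster names and the "start" event name are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A hook receives the cluster where the event originated and the event name.
Hook = Callable[[str, str], None]


class HookRegistry:
    """Toy model of event hooks plus cluster dependencies (hypothetical)."""

    def __init__(self) -> None:
        self._hooks: Dict[Tuple[str, str], List[Hook]] = {}
        self._dependents: Dict[str, List[str]] = {}

    def on(self, cluster: str, event: str, hook: Hook) -> None:
        """Register `hook` to run when `event` reaches `cluster`."""
        self._hooks.setdefault((cluster, event), []).append(hook)

    def depends_on(self, dependent: str, dependency: str) -> None:
        """Declare that `dependent` depends on `dependency`."""
        self._dependents.setdefault(dependency, []).append(dependent)

    def fire(self, cluster: str, event: str) -> None:
        """Run hooks of `cluster`, then of every cluster that depends on it."""
        visited = set()
        queue = [cluster]
        while queue:
            current = queue.pop(0)
            if current in visited:   # guards against dependency cycles
                continue
            visited.add(current)
            for hook in self._hooks.get((current, event), []):
                hook(cluster, event)
            queue.extend(self._dependents.get(current, []))


registry = HookRegistry()
registry.on("zk1", "start", lambda src, ev: print("zk1 hook:", ev, "from", src))
registry.on("storm1", "start", lambda src, ev: print("storm1 hook:", ev, "from", src))
registry.depends_on("storm1", "zk1")
registry.fire("zk1", "start")   # runs the zk1 hook, then the storm1 hook
```

A hook body here is just a print call; in practice it would be the place to invoke a shell or Ansible script with the list of affected nodes.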

Disk Image Management

A disk image is like a template for a disk. A disk image is prepared by installing all necessary software on one Linode’s disk and then cloning that disk multiple times to get perfect replica disks which can be attached to other Linodes.

But why are disk images useful when configuring clusters? When nodes in a large cluster are configured individually, each configuration run takes some time, depending on the complexity of its steps. Newer versions of OS packages or third-party packages may be released between the configuration of earlier nodes and that of later nodes, which can result in different, and possibly incompatible, versions of the same software on different nodes of a cluster.

Disk images can prevent this problem because every cloned disk is a perfect replica of an image.

Other benefits of disk images are bandwidth savings, because software downloads need not be repeated on every node, and shorter cluster bring-up time. Cloning just a Linode disk image is an order of magnitude faster than cloning an entire Linode and also affords higher API rate limits.

But Linode disk images also have some limitations:

  • The total storage available for all images is limited to around 10-12 GiB per account. In practical terms, this means only about 2–4 images can be stored using the Linode API. It’s currently impossible to store large software stacks (such as Android AOSP code repositories) as disk images on Linode.
  • The Linode API does not support the use of disk images stored externally.
  • The Linode API does not support replicating a disk image onto a block store or creating a disk image from a block store.

LCT provides the following enhancements to overcome these limitations:

  • Creating and storing disk image files on instance disks and block stores. This makes it possible to maintain your own long-term image repositories with unlimited capacity.
  • Creating disks and block stores from Linode images and from external disk images.

DNS Features

Incorrectly configured host names and name resolution can cause failures in many software stacks that run on clusters. It’s critical that host names and name resolution, whether through /etc/hosts or through DNS, are configured correctly. Since a cluster can contain dozens or even hundreds of nodes, maintaining these manually or with separate scripts for every software stack can become cumbersome.

LCT provides multiple name-resolution features to solve these problems:

  • /etc/hosts management
    LCT supports automatic creation and distribution of /etc/hosts files with entries for nodes selected by the cluster plan. Adding or removing nodes triggers re-creation and redistribution of /etc/hosts, both to the same cluster and to dependent clusters. Separate entries for public and private IP addresses are supported.
  • Private secure DNS server
    Instead of /etc/hosts entries, some deployments may prefer a DNS server. LCT supports provisioning and secure configuration of private primary and secondary name servers to resolve names. Adding or removing nodes triggers reconfiguration of the DNS mappings, both in the same cluster and in dependent clusters. Separate entries for public and private IP addresses are supported.
  • Public secure DNS server
    Some deployments may require a public DNS server because node names are displayed to or used by external clients. LCT supports creation and secure configuration for such public DNS servers.
  • Split-horizon DNS server
    Some stacks (GlusterFS is one example I’ve run into) may require a somewhat special DNS configuration called split-horizon DNS. The concept is that the same name resolves to two different IP addresses depending on whether it’s queried from inside the cluster or from outside. For internal queries, a name should typically resolve to a server’s private IP address, and for external queries, to its public IP address. LCT supports this kind of split-horizon DNS server.
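
Two of the ideas above, generating /etc/hosts entries and split-horizon resolution, can be illustrated with a short Python sketch. Treat it as a sketch under assumptions rather than LCT's implementation: the Node type, both function names, the node name and all IP addresses are hypothetical, and a real split-horizon setup would live in the DNS server's configuration, not in application code.

```python
import ipaddress
from typing import List, NamedTuple


class Node(NamedTuple):
    name: str
    private_ip: str
    public_ip: str


def render_hosts(nodes: List[Node], use_private: bool = True) -> str:
    """Build /etc/hosts content with one entry per node (hypothetical helper)."""
    lines = ["127.0.0.1 localhost"]
    for node in nodes:
        ip = node.private_ip if use_private else node.public_ip
        lines.append(f"{ip} {node.name}")
    return "\n".join(lines) + "\n"


def resolve_split_horizon(name: str, client_ip: str, nodes: List[Node],
                          internal_net: str) -> str:
    """Answer with the private IP for internal clients, public IP otherwise."""
    node = {n.name: n for n in nodes}[name]   # KeyError if the name is unknown
    inside = ipaddress.ip_address(client_ip) in ipaddress.ip_network(internal_net)
    return node.private_ip if inside else node.public_ip


# Illustrative data: one node with a private and a public address.
nodes = [Node("gluster1", "192.168.130.5", "203.0.113.5")]
internal = "192.168.128.0/17"   # assumed private address range for the example

print(render_hosts(nodes), end="")
print(resolve_split_horizon("gluster1", "192.168.130.9", nodes, internal))
print(resolve_split_horizon("gluster1", "198.51.100.7", nodes, internal))
```

The same name, gluster1, yields the private address for a client inside the assumed private range and the public address for everyone else, which is exactly the behavior split-horizon DNS provides.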

Additional Features & Roadmap

These are some of the more advanced features of LCT. The project’s feature list covers all the available and planned features, along with documentation for them.

Cross-datacenter private clouds are a feature I’m keen on implementing next, once I’ve researched available options. If you have worked in this area, please share your experiences and ideas in the comments below.

Another priority will be adding more predefined cluster plans as I explore more software stacks. You are welcome to contribute your own predefined cluster plans to the project repo if you think they’ll be useful to users of LCT and LinodeTool.

Code, Installation and Feedback

Detailed code and installation instructions are available in the toolkit’s GitHub repo: https://github.com/pathbreak/linode-cluster-toolkit. You are welcome to contribute or report bugs or open feature requests.

Credits

Thanks to Dave Roesch and Keith Craig for providing Linode infrastructure and suggestions that made this article possible.

About me: I’m a software consultant and architect specializing in big data, data science and machine learning, with 14 years of experience. I run Pathbreak Consulting, which provides consulting services in these areas for startups and other businesses. I blog here and I’m on GitHub. You can contact me via my website or LinkedIn.

Please feel free to share below any comments or insights about your experience using Linode, Linode API, Linode Cluster Toolkit, LinodeTool or their features. If you found this blog useful, consider sharing it through social media.
