OpsMop: Building the Next Generation of Configuration Management Tooling

So, short history lesson: I created Ansible, starting in February of 2012, on the couch and floor of my living room in Morrisville. That makes it about 7 years old by this spring, I think.

When I started it, I wanted to have another open source project, being too long removed from the fun times of developing Cobbler (which I stopped working on in the mid-2000s). I was thankful for having a small bit of following from that project, and that quickly built up steam to try out a crazy idea about configuring systems over SSH. Thankfully, friends that I really respected thought it was going in a possibly interesting direction, and from their advice and enthusiasm, I kept going and lots of other people joined in. In just a month or two, that took off amazingly well, and we had a system that could configure basic applications almost fully on at least CentOS and Debian systems, while also quickly adding support for nearly all kinds of Linux and Unix environments.

Unfortunately, I had to leave that project in early 2015. Naturally, I would miss the community tons, the largest benefits to that development model being the amazing surprises: a new module for a service you didn’t expect, a platform edition for a new operating system, or waking up to find out there are 100 people listening to a talk about your thing being live streamed in Japan. It was like being connected to everyone. But still weird, because your code was running in all of these cool places and you wouldn’t know about it.

Three years away from writing systems management software has given me a lot of perspective. There are things that I want to see better, and there are things that haven’t changed enough.

What I’m talking about here — OpsMop is just one of two recent releases I’ve made.

I have also recently released Vespene, which is a new take on both continuous integration/deployment and build automation consoles. I think it’s probably the best work I’ve ever done. Interest around it is slowly growing, and I could use some help getting the word out, honestly. But I’m committed to the idea of bringing all sorts of smart folks together, deciding how to build the tools we want to use, and seeing where it goes. We need build systems we all can work on, and Vespene is Python, about 5k lines of code, very straightforward, and contains a lot of powerful features without hunting and testing endless sets of plugins. Vespene is a pretty good minimal CI/CD system. The codebase is about three months old, and thanks to Django, feels like it’s about three years’ worth of project. With a few more people looking at it, sharing ideas, possibly code or testing, it’s going to be pretty awesome, I hope.

Still though, automation systems have been a large part of my life, and I can’t stop thinking about them. If I had started a config management program today, with all of the hindsight of starting what became the world’s most popular config management layer, how would I do it again? What would I do differently? Do people even want CM systems these days?

Even if they don’t, I wanted to apply all that knowledge into something and try something new. What’s the very very best I could build? How do we make an even better system?

Enter OpsMop. It’s new. There’s not much there yet. At the time of writing, OpsMop is just about 4 days old. It is already at a state that I’d say Ansible was at after about 2–3 weeks, but I can’t be sure. When I started OpsMop, I decided to build an experiment, but as I got further in, I figured out this is the config system that I want to use, and I’m fully committed to making this the next big thing. It does not require Vespene, but they may grow to work well together.

It’s not production ready now. There are numerous bugs. What we have today is mostly a sketch about a language (that completely works) and about how to build modules that work with that language (this also works!). There are bugs but … they will be fixed super quickly, and I hope those of you that know me know that.

What’s new about OpsMop? Several things.

  • OpsMop fully embraces Python 3, with a powerful declarative Python DSL. There is no YAML, and while you can Jinja2 template strings, you can also completely use the system without writing any Jinja2 if you like.
  • There’s very strong object-orientation, where everything is subclassable. You can provide your own implementations for system types, or write your own types. You can subclass literally every piece of the API.
  • There’s a solid type and provider model — this is something similar to Puppet and something Ansible should have always had, but as an artifact of developing Ansible in spare time after my day job, it was thrown out to enable slightly more rapid plugin development at the time. I always wanted one.
  • Very solid type checking — all arguments, including presence and absence of critical files, can be checked prior to execution time
  • A slim, very well organized codebase. While admittedly, parts of OpsMop are very new and due for some refactoring, a strong focus has been around making the type/provider architecture very well designed. Actions are planned, then the planned actions are recorded. If the actions aren’t completed, that is an error. The dry run mode isn’t an afterthought; it is the entire thing the program was designed around. I want to avoid Python becoming assembler, and while there are some places I really need to overhaul internals (CLI/callbacks in particular), it’s going in a nice direction at the moment.
  • A focus on performance, though this is mostly kept in mind for when we build the remote modes in the near future. One nice performance hack is facts are calculated only when they are used, so there is no reason to turn them off. Also, when we do remote features, everything will run end to end as fast as possible for every node, so there won’t be a lot of round trip waiting. I’m still open to lots of feedback to what that might look like, but have some ideas.
  • An exceedingly high focus on code quality, readability, and maintainability. (I don’t think the internals are there yet to my standards, but they will be VERY soon — again, the project is days old).
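To make the type/provider split and the plan-first design a bit more concrete, here is a minimal sketch. None of these names (`FileType`, `InMemoryFileProvider`, `run`) come from OpsMop; they are made up for illustration, and the provider writes to a plain dict rather than a real filesystem so it is safe to run anywhere.

```python
from dataclasses import dataclass, field


@dataclass
class FileType:
    """Hypothetical 'type': declares what should be true, not how to do it."""
    path: str
    content: str

    def validate(self):
        # Arguments are checked before anything is planned or executed.
        if not self.path:
            raise ValueError("path is required")


@dataclass
class InMemoryFileProvider:
    """Hypothetical 'provider': knows how to make one type true.

    A dict stands in for the filesystem in this sketch.
    """
    resource: FileType
    store: dict
    planned: list = field(default_factory=list)
    completed: list = field(default_factory=list)

    def plan(self):
        # Compare desired state with current state and record what is needed.
        current = self.store.get(self.resource.path)
        if current is None:
            self.planned.append("create")
        elif current != self.resource.content:
            self.planned.append("update")

    def apply(self):
        # Carry out each planned action and record that it happened.
        for action in self.planned:
            self.store[self.resource.path] = self.resource.content
            self.completed.append(action)


def run(resource, store, dry_run=False):
    """Validate, plan, and (unless dry_run) apply; return the plan."""
    resource.validate()
    provider = InMemoryFileProvider(resource, store)
    provider.plan()
    if not dry_run:
        provider.apply()
        # A plan that was not fully carried out is treated as an error.
        if provider.completed != provider.planned:
            raise RuntimeError("planned actions were not all completed")
    return provider.planned
```

Calling `run(FileType("/etc/motd", "hello\n"), {}, dry_run=True)` returns the plan without touching the store, which is the point: in this arrangement dry run is just the normal code path with the last step skipped.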

In the next two weeks, I’m going to be taking some areas that are essentially charcoal sketches of plumbing and refactoring them, and polishing the existing type/provider classes to a near ideal state. At that point, the repo will be open for pull requests and I’ll turn on the issue tracker.

It’s very important to get the models right before we take on a lot of pull requests — things become difficult to change later. I can’t say that enough. Modules WILL be cargo-culted, the quality of the base class API for when we start taking pull requests sets the standard for everything.

We’re going to be quite open to pull requests, but code review is also going to be fairly thorough, because we will be very focused on organization and reuse. I hope people actually enjoy this, and enjoy talking about the architecture that way, whereas before, community in Ansible was largely about the modules, and the modules and core didn’t quite interact as much. That worked there; this will be a bit different.

Ideally I want to see only about 50 modules in the distribution max, though for something like “package” there could be a wide number of implementations. We also keep things simple by being pragmatic. In many cases, modules can exist by using shell calls and pipes and grep internally, rather than having a sea of Python code. If you want to install 10 yum packages in one call, let’s keep it simple and just shell out to yum. Return codes are things in Unix, and this works well.
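As a rough illustration of the "just shell out and trust the return code" idea, something like the following would do. The function names and the injectable `runner` parameter are assumptions made for this sketch, not OpsMop code; the only real-world pieces are `subprocess.run` and the `yum install -y` invocation.

```python
import subprocess


def build_install_command(packages):
    """Build a single yum call that installs every requested package."""
    if not packages:
        raise ValueError("at least one package name is required")
    return ["yum", "install", "-y", *packages]


def install_packages(packages, runner=subprocess.run):
    """Run the command and let the exit status decide success.

    `runner` is a hypothetical hook so the function can be exercised
    on a machine that has no yum at all.
    """
    command = build_install_command(packages)
    result = runner(command)
    if result.returncode != 0:
        raise RuntimeError(f"{command[0]} exited with {result.returncode}")
    return command
```

Ten packages become one process launch and one return code to check, with no package-manager bindings to maintain.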

In getting the first days’ sketch out, unit tests don’t exist yet, but they will, and they will be (eventually) incredibly extensive. Due to the nature of the providers, it should be easy to subclass everything and mock out nearly everything, in ways that can prove the system to be extremely robust. I want this project to become a showcase for testing and API design. It’s going to be rather computer sciencey.
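Since no tests exist yet, the following is only a guess at what "subclass the provider and mock it out" could look like. `ServiceProvider`, `FakeServiceProvider`, and `Service` are hypothetical names for this sketch, not the classes in the repository.

```python
import unittest


class ServiceProvider:
    """Hypothetical base provider; real subclasses would call an init system."""

    def is_running(self, name):
        raise NotImplementedError

    def start(self, name):
        raise NotImplementedError


class FakeServiceProvider(ServiceProvider):
    """Test double: keeps state in memory and records every call."""

    def __init__(self, running=()):
        self.running = set(running)
        self.calls = []

    def is_running(self, name):
        return name in self.running

    def start(self, name):
        self.calls.append(("start", name))
        self.running.add(name)


class Service:
    """Hypothetical type: 'this service should be running'."""

    def __init__(self, name, provider):
        self.name = name
        self.provider = provider

    def ensure_started(self):
        # Returns True only when something actually had to change.
        if self.provider.is_running(self.name):
            return False
        self.provider.start(self.name)
        return True


class ServiceTest(unittest.TestCase):
    def test_starts_stopped_service(self):
        provider = FakeServiceProvider()
        self.assertTrue(Service("nginx", provider).ensure_started())
        self.assertEqual(provider.calls, [("start", "nginx")])

    def test_leaves_running_service_alone(self):
        provider = FakeServiceProvider(running=["nginx"])
        self.assertFalse(Service("nginx", provider).ensure_started())
        self.assertEqual(provider.calls, [])
```

Because the type only talks to the provider interface, the tests never touch a real service manager, and the same test cases could be pointed at any other provider subclass.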

I also broke one of my old rules: there are a few Python metaclasses. We are using them to allow lazy evaluation of conditionals at runtime within the DSL. My brain nearly exploded writing that, but it was kept pretty minimal and most people will not need to understand that part of the internals.
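For the curious, here is one way a metaclass can defer a conditional until facts are available. This is a sketch of the general technique under my own assumptions, not the OpsMop internals; `Deferred`, `LazyFacts`, and `F` are invented names.

```python
class Deferred:
    """An expression that is built now and evaluated later against facts."""

    def __init__(self, evaluate, text):
        self._evaluate = evaluate
        self.text = text

    def __eq__(self, other):
        # Comparing does not produce a bool; it produces a new expression.
        return Deferred(
            lambda facts: self._evaluate(facts) == other,
            f"({self.text} == {other!r})",
        )

    def __and__(self, other):
        return Deferred(
            lambda facts: bool(self.evaluate(facts)) and bool(other.evaluate(facts)),
            f"({self.text} and {other.text})",
        )

    def evaluate(self, facts):
        return self._evaluate(facts)


class LazyFacts(type):
    """Metaclass: unknown class attributes become deferred fact lookups."""

    def __getattr__(cls, name):
        if name.startswith("_"):
            raise AttributeError(name)
        return Deferred(lambda facts: facts[name], f"F.{name}")


class F(metaclass=LazyFacts):
    """Namespace for writing conditions such as F.os_type == "linux"."""


# The condition is written before any facts exist...
condition = (F.os_type == "linux") & (F.cpu_count == 4)

# ...and only evaluated once a facts dictionary is supplied.
matches = condition.evaluate({"os_type": "linux", "cpu_count": 4})
```

The metaclass is what lets `F.os_type` work as plain attribute access on a class with no such attribute; everything else is ordinary operator overloading. `&` is used instead of `and` because Python does not allow `and` to be overloaded.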

So, yes, MANY things to organize in the coming weeks. Until then, I’m VERY interested in everyone’s feedback on the language itself.

I’ve set up a Discourse forum here.

The key things I’m looking for feedback on right now are the language design and the type/provider layout. Right now the best type/provider to look at in the code is service. Meanwhile, I’m working to polish up the internals and make a really good, very stable, easily subclassable Python API.

What do you think? What do you want to be using as a configuration management system these days?

As people move more and more towards immutable systems, I thought configuration management would become less common. But ultimately, what we see the most is a build language that defines the guts of our computer systems, and there are also VERY important jobs of deploying clouds themselves, maintaining databases and message stores, and so on. Configuration tools aren’t going anywhere, and they help us not worry about knowing the way 57 different command line tools work. Configuration management is becoming a little less common, but it’s still vital. Maybe it’s not the proverbial hammer every construction worker has, maybe it is an angle grinder. But the world freaking needs angle grinders. I think it’s more of a miter saw though. Ok, enough with the tool analogies…

So there you have it, hopefully taking some nice ideas from all the major players in this space, mixing them up, and doing something completely new, all powered by Python 3, full-on object oriented development, no scripts (full stack traceback and debugging everywhere!), and also no YAML this time. That might not appeal to the exact same folks — I think that’s ok too. This is definitely a different flavor.

While there are several large players in this market, I’m doing OpsMop more for fun. But I suspect that if you believe in where this *COULD* go, get involved, and help share feedback, we can take this to some pretty impressive heights. I’m committed to maintaining this long term regardless, but I need you though! If you like Python a lot, and also have an interest in seeing a really easy module development environment, this might be the project for you. Stop by the forum and let me know what you think. If you don’t like forums, or want to share more details, you can also send me a DM on Twitter at @laserllama.

You can also follow @opsmop and @vespene_io on Twitter if you like.

I hope you find this interesting. I am very very excited about being able to work with many of you again, as well as meeting lots of new folks, and am curious to see what we might build together!

GITHUB: https://github.com/vespene-io/opsmop