Cleaning out Rusty old sockets: implementing a netcat command line using clap in Rust

Noa
Dwelo Research and Development
Aug 25, 2020

Socket command lines are a very interesting bit of tech. They essentially allow you to easily run commands against your program while it’s running, rather than just when you first start it with the options you pass to the CLI.

Speaking of which, CLIs are very easy to use, aren’t they?

You can just type a fairly short identifier, the action you want to take, and a few options or modifiers to that action, and your shell will start a process that does what you want. But, if the process is already running, especially if it’s running on another machine, you’re out of luck with a traditional command line. To keep the same level of convenience, you might want to look into a socket command line.

A socket command line is just a “shell” implemented over a network socket, usually just a raw TCP stream. As a software engineering intern this summer, I worked on Boss, Dwelo’s software that runs on our smart apartment hubs and sends commands to and receives data from smart devices. For a while, it’s had a socket commander that you could netcat to while it was running locally in mock mode; however, it wasn’t designed all that intentionally. It was just a bunch of regexes matched against each line one after another, with a fair few drawbacks:

  • Ill-defined syntax: sometimes arguments were delimited by spaces, sometimes by commas, sometimes by = signs
  • Not very performant: it had to restart parsing from scratch for each regex that failed
  • Kinda just ugly

I wanted to rewrite it to make it more robust and easier to extend, so I decided to use nom to write a parser for a better-defined version of the command language.

I had never looked into nom that extensively (other than reading Amos Wenger’s blog posts about parsing ELF executables), but it was a pretty good experience. I initially avoided the macro parser system, but I found myself using it more as time went on; it is a pretty elegant format for defining parsers. However, once I got to the point where I was adding an escaping system (because what if a Wi-Fi SSID has a space in it?), I had to stop and ask myself: aren’t I reinventing the wheel a little? I’m just making a worse version of the POSIX shell syntax, and isn’t that already much more standardized? Then, I remembered the shell-words crate. It parses a command string according to the rules set out in the POSIX specification, which everyone should be mostly familiar with: 'foo' for raw-ish strings, "bar\"baz\"qux" for escapable strings, and escapable spaces so that foo\ bar\ baz is one word. Again, very familiar, comfortable syntax.
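To make those quoting rules concrete, here is a small hand-rolled splitter written against the standard library only. It is not the shell-words implementation, just a simplified stand-in (the `split_words` name is made up for this sketch) that shows what each rule does to the input. In a real project you would call the crate instead.

```rust
/// Simplified illustration of shell-style word splitting.
/// NOT the shell-words implementation; `split_words` is a hypothetical name.
fn split_words(input: &str) -> Result<Vec<String>, String> {
    let mut words = Vec::new();
    let mut current = String::new();
    let mut in_word = false;
    let mut chars = input.chars();

    while let Some(c) = chars.next() {
        match c {
            // Single quotes: everything up to the closing quote is literal.
            '\'' => {
                in_word = true;
                loop {
                    match chars.next() {
                        Some('\'') => break,
                        Some(ch) => current.push(ch),
                        None => return Err("unterminated single quote".into()),
                    }
                }
            }
            // Double quotes: a backslash makes the next character literal.
            // (Simplification: real POSIX rules only treat a few characters
            // as escapable here and keep the backslash otherwise.)
            '"' => {
                in_word = true;
                loop {
                    match chars.next() {
                        Some('"') => break,
                        Some('\\') => match chars.next() {
                            Some(ch) => current.push(ch),
                            None => return Err("unterminated double quote".into()),
                        },
                        Some(ch) => current.push(ch),
                        None => return Err("unterminated double quote".into()),
                    }
                }
            }
            // Outside quotes, a backslash escapes the next character,
            // which is how `foo\ bar` stays one word.
            '\\' => {
                in_word = true;
                match chars.next() {
                    Some(ch) => current.push(ch),
                    None => return Err("trailing backslash".into()),
                }
            }
            // Unquoted whitespace ends the current word, if there is one.
            c if c.is_whitespace() => {
                if in_word {
                    words.push(std::mem::take(&mut current));
                    in_word = false;
                }
            }
            // Any other character is part of the current word.
            c => {
                in_word = true;
                current.push(c);
            }
        }
    }
    if in_word {
        words.push(current);
    }
    Ok(words)
}
```

Fed the three examples from the paragraph above on a single line, this returns three words: foo, bar"baz"qux, and foo bar baz. An unclosed quote comes back as an Err rather than a panic, which matters when the input arrives over a socket.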

So I thought, “wow, that will be much nicer”. Then I can just match on the words of the command in order to determine subcommands and arguments. But isn’t that just reinventing the wheel again? Isn’t parsing subcommands and arguments already done by a crate in the ecosystem? One that’s most likely already added as a dependency in our project?

clap

Turns out, clap is perfect for this use case. Most people only use it through App::get_matches(), which reads std::env::args() and exits the process when an error occurs, but it also has a very nice App::get_matches_from_safe(iter) method, which takes any iterable of string arguments and returns a Result instead of exiting. This is much nicer than the regex approach:

  • The clap::Error type implements Display with all the coloring and help text that you'd normally expect.
  • Performant: most Rust CLI tools use clap, and (anecdotally) I’ve always found them to be very fast to start up.
  • Auto-generated help strings: with the regex implementation, if we failed to match on any of the commands, we just had a README.txt file that we include_str!'d, which was just a list of the supported commands. clap automatically gives us a printout for --help, with an up-to-date list of the commands/subcommands and any extra info we choose to give them.
  • We can use structopt, which allows us to simply define the structure of the subcommands and just match on them to map them to the actual commands for the executor.
  • structopt also implicitly uses FromStr to parse arguments, so any argument I still wanted to validate with a nom parser could be wrapped in a newtype that implements FromStr.

Here’s a simplified (and sanitized) version of what I ended up with:

It’s perfectly feasible to avoid the command-to-command translation. It just happens that our program already had a command model and API that we needed to use to execute the commands.

In the code that actually handles the TCP stream, we put a BufReader around the TcpStream and iterate through the .lines(), parsing and executing each command. If the ParsedCommand is an Exit, we can just break out of the loop and drop the TCP stream to close the connection. You can also use a runtime like Tokio to handle the network I/O, which is what we actually do, but if you’re not integrating into an existing Tokio app, the standard library’s network sockets are perfectly fine. Assuming the code above is in a command module, here’s what some really basic command line handling code would look like.
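The loop described here can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: `parse_line` is a stub standing in for the real parser, the Echo variant is invented so there is something to execute, and `handle_client` and `serve` are hypothetical names. The handler is generic over BufRead and Write so it can be exercised without opening a socket.

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::net::TcpListener;

/// Stand-in for the parsed command type; `Echo` is hypothetical.
#[derive(Debug, PartialEq)]
enum ParsedCommand {
    Echo(String),
    Exit,
}

/// Hypothetical stub parser; a real one would split the line into words
/// and hand them to clap.
fn parse_line(line: &str) -> Result<ParsedCommand, String> {
    match line.trim() {
        "" => Err("empty command".to_string()),
        "exit" => Ok(ParsedCommand::Exit),
        other => Ok(ParsedCommand::Echo(other.to_string())),
    }
}

/// Read commands line by line and reply to each one.
/// Stops at `Exit` or when the client closes the connection.
fn handle_client<R: BufRead, W: Write>(reader: R, mut writer: W) -> io::Result<()> {
    for line in reader.lines() {
        match parse_line(&line?) {
            // Leaving the loop lets the caller drop the stream.
            Ok(ParsedCommand::Exit) => break,
            Ok(ParsedCommand::Echo(text)) => writeln!(writer, "ok: {}", text)?,
            // Parse errors go back to the client; the session continues.
            Err(message) => writeln!(writer, "error: {}", message)?,
        }
    }
    Ok(())
}

/// Accept connections one at a time on `addr`, e.g. "127.0.0.1:4000".
fn serve(addr: &str) -> io::Result<()> {
    let listener = TcpListener::bind(addr)?;
    for stream in listener.incoming() {
        let stream = stream?;
        // One handle for buffered reading, the original for writing.
        let reader = BufReader::new(stream.try_clone()?);
        // Both handles are dropped when handle_client returns,
        // which closes the connection.
        if let Err(e) = handle_client(reader, stream) {
            eprintln!("client error: {}", e);
        }
    }
    Ok(())
}
```

Calling serve from main with a local address is enough to try it with netcat. This version handles one client at a time, which keeps the sketch short; spawning a thread per connection or moving to Tokio would lift that limit.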

Then, we can just run the binary!

The code for this article is available to peruse here. I considered making this into a crate and publishing it, but there really isn’t much unique code to this pattern or anything that can be made generic over different CLIs, which is part of why I think it’s so cool.

Thanks for reading!
