Why Laziness, Impatience and Hubris Drive Great Developers to Script

Laziness
The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful, and document what you wrote so you don’t have to answer so many questions about it. Hence, the first great virtue of a programmer. Also hence, this book. See also impatience and hubris.
Hubris
Excessive pride, the sort of thing Zeus zaps you for. Also the quality that makes you write (and maintain) programs that other people won’t want to say bad things about. Hence, the third great virtue of a programmer. See also laziness and impatience.
Larry Wall, Programming Perl

For many years I have enjoyed writing Perl scripts more than any other type of code. It is fast, immediate and very satisfying to automate some small niggle with a quick script. Now, developers are kind of tribal, so there will be tribes who applaud this (thanks Perl tribe), others who will scoff (hello Python guys) and others who aren’t sure what I’m talking about.

Let’s forget about Perl for a bit (we’ll come back to it later) and instead think about why. Why do I enjoy it? Well, it’s just fun. A script is quick to write and typically automates some boring, repetitive task. So a win-win. Because laziness is one of the characteristics of developers, it drives them to automate solutions using the fastest possible tool. And great developers will look to demonstrate their skills to their peers because of their hubris.

When someone calls themselves a “developer” or “programmer” what do they mean? Hopefully someone who writes code. Someone who develops software. And then when you hear someone call themselves a “professional developer” or even better “software engineer”, is this a level-up on a plain-old-developer? What additional skills and experience does this professional have I wonder? When hiring senior developers I look for experience with multiple major language platforms including: C/C++, Objective-C, Java/JVM languages or .NET. Web development is also a must given its prevalence. But is this enough?

It is not. Every developer should be skilled in at least one scripting language. The productivity gain from a quick script to automate a problem is high. And there are plenty of simple problems we find as software engineers. It’s more than this though. Scripting languages are essential platforms for today’s development. Your laziness and hubris will demand it.

A Skip Through Scripting

Scripting languages are more popular than ever because of the rapid development possible with dynamic languages. And as software slowly evolves they are becoming ever more mainstream for production use. Let’s try and categorise some uses for scripting.

Command line applications

In the beginning scripts were used for command-line processing. This is sometimes called “glue”, which characterises it well: it allows you to quickly glue things together. Glue is used for handling command-line inputs and interacting with the console. File processing and string processing, for example, are very popular uses for command-line scripts. Many scripts for command-line processing can also be used interactively in the shell, which simplifies testing and debugging.

For example, to grep out all the lines of a log file that contain “ERROR” surrounded by whitespace:

$ tail *.log | perl -ne 'print if /\s+ERROR\s+/'

You will commonly see Unix shell scripts, awk, sed, Tcl, Perl, Ruby, Python and PowerShell (for Windows) used for command-line applications. Node.js can also be used for command-line scripting.

Web development

In 2016, web development means both browser-side code and web server-side code. Scripting languages have popularised both sides.

On the browser side we have JavaScript, standardised by Ecma as ECMAScript. JavaScript is natively supported by all the major browsers and so is the #1 scripting language for browser-side code. It has given rise to a “Cambrian Explosion” of open source JavaScript web frameworks that are essential for all web developers.

Not content to remain in the browser, JavaScript has also become popular for web server development thanks to Node.js. This allows the rapid development of HTML form processing and HTTP request handling. Of course, before we had JavaScript on the server side we had Perl (beginning with CGI scripts), then later PHP and Ruby on Rails. Others like Lua are vying for attention in this space too. So if you’re a web developer you’re already a scripter.

DevOps

DevOps is so commonly misunderstood that a lot of time is spent clarifying what it means. I’m going to assume you know what it is. And I’m sure you will have people in your multi-disciplinary teams who are using configuration management to describe your infrastructure, application deployment and configuration as code. Given that this code is declarative rather than procedural, it is not used in quite the same way as a typical script. Yes, there is lots of YAML, but scripting skills remain important to go beyond this configuration. The main tools here are built on Ruby (Chef, Puppet) and Python (Ansible, Salt).

Embedded

It is common for software products that wish to be extensible to provide a high-level language for doing this. This fits perfectly with scripting languages. For example, Lua is often chosen by vendors because it is small, fast and written in C. However, it is not uncommon to find Python or Ruby embedded. For example, Mercurial, the open source distributed source control tool, is written in Python and can be extended with Python scripting. This is apparently why Facebook chose Hg over Git: they could extend it for their unique uses.

Desktop app development

Who has written a desktop application using a scripting language? I have, but this was many, many years ago. I would guess that no-one under the age of 30 has. You rarely see desktop applications these days. In the old days one of the popular ways to do this was with Tcl/Tk, which I haven’t seen for 20 years. I see, though, that Microsoft has introduced the development of Universal Windows Apps for Windows 10 using JavaScript, but I haven’t seen any of these yet.

Choosing a Scripting Language

The point of scripting languages is rapid problem solving, often with an interactive session for immediate response. If a language supports rapid development and rapid problem solving I would classify it as a scripting language. It is also very common for scripting languages to be dynamic languages and to be interpreted at runtime. The distinction between static languages and dynamic languages is becoming greyer every year as dynamic languages add more features (such as type checks and compilation to IL).

The other characteristic of scripting languages is they must be fun. If a scripting language is not fun then why would you use it? Find another or head back to a static language. Scripting should be fast and fun.

Fun is a subjective characteristic. Having said this, fast is a subjective characteristic also: it depends on your skills. So how do you choose a scripting language?

Try one. Remember they are fast and fun, so download the interpreter and try it out. Ditch it if it’s not. Then try another.

As implied above, some scripting languages are better suited to certain things. So figure out what you want to do: deciding between, say, command-line applications and web development would be a good place to begin.

Do you have strong feelings for patterns or styles? This might lead you towards or away from a language. For example, if you love object-orientation, Ruby provides great native support for OO. If you appreciate a terse, expressive language then you might love Perl. If you just want to do functional programming with closures then Lua or Clojure might be for you. If you already know JavaScript then learn it some more. If you are doing mathematical calculations that require rational numbers then Perl 6 is waiting for you.

Programming languages have communities. These communities are made up of committers, users and advocates and they can be quite tribal. But they are all open source and tend to have great documentation and support. So rely upon the community to learn.

As for me, I’ve spent years writing shell scripts (when I was younger). I started with Bourne Shell, then Korn Shell and then Bash. I spent a few years writing PowerShell when I was using Windows and .NET. But for most of those years I have been writing Perl scripts. I empathise with the Perl community philosophy, which was born from Larry Wall’s original language design. By far the most fun for me is Perl. Perl 5, that is.

Scripting With Perl

Last summer I wrote a Perl command-line script to parse AWS Elastic Load Balancer log files and identify slow response times, given that these logs are stored by default in S3 and we hadn’t ingested them into our ERK[1] cluster. It spins through some ELB log files and highlights the transactions that are slow: either upstream of the ELB, within the ELB or downstream of the ELB. It’s grep for performance in ELB log files.

You can see the output from a run below, with IP and URL redacted. It shows two calls, of 3 and 6 seconds, spent processing inside the AWS ELB, which is cause for some investigation. Of course, if the ELB log files had been ingested into ELK or Splunk then I wouldn’t need a script to discover this, but very often you’ll need to apply some glue.

=========/Users/peter/Developer/dig-scriptogram/elb/testdata/719728721003_elasticloadbalancing_eu-west-1_publishing-external-elb-https_20150828T0805Z_00.000.000.000_2a71z70z.log=========
2015-08-28T08:01:42.778254Z - 3.181453 - https://yoururl:443/ (200)
2015-08-28T08:03:24.892627Z - 6.553243 - https://yoururl:443/stats (200)

Perl is often derided. It is an older scripting platform than Python, PHP and Ruby, and has provided much inspiration to these newer scripting platforms, as well as to compiled languages. Regular expressions are now standard in most languages, but twenty years ago it was Perl 5 that popularised them, building on the pattern matching of earlier Unix tools. Java, PHP, Ruby, JavaScript and .NET all support Perl-style regular expressions for manipulating strings because of the significant time saving they offer compared with splitting and looping through strings.

In spite of the immediate results gained, there is often a long tail of disappointment for the Perl script maintainer. This is because it is commonly easier to write a new script than to iterate on an old one. This isn’t unique to Perl, but it is especially problematic there because of There Is More Than One Way To Do It, a core principle of the language design. And the terse syntax encourages one-liner writers and Perl poets but makes for harder-to-maintain code.

if ($class{body} =~ /(}[\/\*\s-=\w:\?@\.\;:]*\n\s*($keyword{classModifier})?\s*($keyword{class})\s+(\w+)(?:$keyword{extends})*(.*?){(.+)}?)/s)

It is rather embarrassing whenever you return to one of your own scripts six months after the joy of writing it and you find yourself struggling to understand it. The regex above matches a Java inner class but without comments it will be a struggle to remember this. You can of course make it easier for yourself with more verbosity and more comments but you then start reducing the fun factor.
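One way to keep a pattern readable without giving up regular expressions is to spread it over several lines with a comment on each part. Perl supports this through the /x modifier; the sketch below shows the same idea using Python’s re.VERBOSE flag instead. The pattern is a much simpler, hypothetical stand-in for the one above (it only recognises a plain class declaration), and the name CLASS_DECL is invented for illustration.

```python
import re

# Hypothetical, simplified pattern: it recognises "class Name [extends Parent] {"
# and is not a translation of the inner-class regex shown above.
CLASS_DECL = re.compile(
    r"""
    (?P<modifier>public|protected|private)?   # optional access modifier
    \s*
    \bclass\s+                                # the class keyword
    (?P<name>\w+)                             # class name
    (?:\s+extends\s+(?P<parent>\w+))?         # optional superclass
    \s*\{                                     # opening brace of the body
    """,
    re.VERBOSE,
)

match = CLASS_DECL.search("private class Inner extends Base { int x; }")
if match:
    print(match.group("modifier"), match.group("name"), match.group("parent"))
```

The trade-off is the one described above: the pattern takes more lines to write, but each piece says what it is for when you come back to it six months later.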

Let’s take a look at the grep_elb.pl script I wrote to parse the ELB log files. It is available on GitHub for you. It’s fairly straightforward procedural-style Perl that uses one module for parsing the command-line arguments. It’s also only 78 lines long and is hopefully easy enough to follow without much explanation.

It just reads the command-line options, validates them, then loops through the log files passed in. It reads each file line by line, for more efficient treatment of large files (rather than slurping it all into memory first), and prints out the lines whose response times are slower than the threshold time. There are three response times in an ELB log file: the first for upstream of the ELB, the second for internal ELB processing and the last for downstream processing. These are $time1, $time2 and $time3.
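The flow described above can be sketched in a few lines. This sketch is written in Python rather than Perl, and it is not the grep_elb.pl source. The function names (slow_requests, format_hit, grep_elb) are hypothetical. The field positions assume the classic ELB access log layout: timestamp first, the three timings in the fifth to seventh fields, the ELB status code in the eighth, and the quoted request after the byte counts. Reporting only the slowest of the three timings is a guess at how to mirror the sample output shown earlier.

```python
from typing import Iterable, Iterator, Tuple

Hit = Tuple[str, float, str, str]  # (timestamp, slowest time, URL, status)


def slow_requests(lines: Iterable[str], threshold: float) -> Iterator[Hit]:
    """Yield a hit for every log line with a timing above the threshold."""
    for line in lines:
        fields = line.split()
        if len(fields) < 13:
            continue  # blank or malformed line
        try:
            # Assumed layout: fields 4, 5 and 6 (zero-based) hold the three timings.
            times = [float(value) for value in fields[4:7]]
        except ValueError:
            continue  # timings were not numeric
        slowest = max(times)
        if slowest > threshold:
            # fields[11] is the quoted HTTP verb, fields[12] the URL.
            yield fields[0], slowest, fields[12], fields[7]


def format_hit(hit: Hit) -> str:
    """Render a hit roughly like the sample output in the article."""
    timestamp, slowest, url, status = hit
    return f"{timestamp} - {slowest:.6f} - {url} ({status})"


def grep_elb(paths: Iterable[str], threshold: float) -> None:
    """Scan each file line by line so large logs are never held in memory."""
    for path in paths:
        print(f"========={path}=========")
        with open(path, encoding="utf-8", errors="replace") as handle:
            for hit in slow_requests(handle, threshold):
                print(format_hit(hit))
```

Calling grep_elb(["a.log", "b.log"], 3.0) would roughly correspond to running the script with a three-second threshold. Argument parsing and validation are left out to keep the sketch short.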

Scripting Performance

When I learned Perl for the first time (circa 1997 when Perl 5 was newly released) it was considered speedy. Not speedy compared with C, but faster than shell scripts, AWK, SED, TCL, etc. Never having challenged this, I still consider Perl to be speedy in 2016. But is it?

Aggregated benchmarking sites like the Alioth benchmarks show that Perl is not particularly speedy. Oh dear.

Well, let’s run it and see. I’m running this on my Manager MacBook[2] plugged into the mains here in the UK. Performance of the script is important because ELB logs are so verbose for a large service. They can easily be gigabytes per day.

I have two sets of log files,

  1. The smaller set is 285 MB of log files made up of 5 files with 411,000 lines in total. These are redacted log files that are on GitHub so you can run what I am running.
  2. The bigger set is 24 hours’ worth of ELB log files for a large service. It has 1589 log files that contain 5,159,601 lines and total 1.6 GB. This set isn’t published.

$ time perl grep_elb.pl -t 3 ~/Developer/dig-scriptogram/elb/testlogs/*.log
6.47 real 5.72 user 0.17 sys

So it took 6.5 secs (an average of 5 runs) for Perl 5 on the smaller set. Not bad. Or is it? On the bigger set of log files it took a little longer: 114 secs. That’s a fair while to wait.

$ time perl grep_elb.pl -t 3 ~/Downloads/elb-pub/*.log
114.98 real 91.20 user 3.87 sys
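
For a rough sense of scale, the two timings above can be turned into lines processed per second. This is a back-of-the-envelope sketch in Python using only the line counts and the wall-clock (“real”) times quoted above; the helper name lines_per_second is made up for illustration.

```python
def lines_per_second(lines: int, real_seconds: float) -> float:
    """Throughput based on wall-clock time."""
    return lines / real_seconds


small = lines_per_second(411_000, 6.47)       # smaller, published set
large = lines_per_second(5_159_601, 114.98)   # 24-hour set

print(f"small set: {small:,.0f} lines/sec")
print(f"large set: {large:,.0f} lines/sec")
```

On these figures the smaller set runs at roughly 63,500 lines per second and the larger set at roughly 44,900, so the per-line cost is not constant across the two sets.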

Get Scripting

This got me thinking. What other newer languages could I have written this in? Are they more fun? Will they run faster? Is the code more terse or verbose? Would it be easier to maintain? Hubris again.

With the launch of Perl 6 at Christmas 2015 I decided to learn some new languages. Next time I’ll explore my journey through Perl 6, Ruby, Lua and Node.js editions of the ELB grep script. I was (pleasantly) surprised by some of the results.

But I wonder, can you do better? Can you write better, faster versions of my Perl script? Hubris demands it. I await a flood of improvements.

What about adding some other scripting languages? Or maybe even some compiled languages to compare the runtime performance? Clojure, Golang, Dart, Erlang, Rust anyone?


[1] Elasticsearch Rsyslog Kibana, a variant of the ELK stack.

[2] MacBook, Early 2015 (MacBook 8,1). 1.3 GHz, 8 GB RAM 1600 MHz DDR3, 256 GB SSD (FileVault enabled), Intel HD Graphics 5300 1536 MB. El Capitan, 10.11.3 with Perl 5.22.1.
