Lisp may not be the best language for Data Science, but what can we still learn from it?
This article is in response to Emmet Boudreau’s article ‘Should We be Using Lisp for Data-Science’.
Below, unless otherwise stated, lisp refers to Common Lisp; in general, lisp refers to the lisp family of languages, just like the C-family of languages. There are functional lisps like Clojure and Scheme, and there are general purpose lisps such as Common Lisp and Racket.
The primary hurdle to using Lisp for Data Science, I believe, is its non-infix syntax, whereas infix syntax is the norm in mathematics.
[Edit: Added note about parentheses balancing wrt paredit, and mentioned slima for atom. A redone note about formatting lisp code. Mention of PAIP. Thanks to comments at r/lisp!]
But is there anything to learn from its syntax?
Lisp's base syntax follows prefix notation and the “everything is a list” convention. In this notation, the first element of the list is the operator while the remaining elements are the operands.
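As a minimal sketch of that convention (the expressions are chosen only for illustration), each form below is a list whose first element is the operator:

```lisp
;; The first element of each list is the operator; the rest are operands.
(+ 1 2 3)              ; => 6
(* 2 (+ 3 4))          ; => 14, the inner list is evaluated first
(list 1 2 3)           ; builds a list, with no commas between elements

;; Code is itself a list: quoting it gives data you can take apart.
(first '(+ 1 2 3))     ; the operator, i.e. the symbol +
(rest '(+ 1 2 3))      ; the operands, i.e. the list (1 2 3)
```

Because the operator always comes first, `+` and `*` take any number of operands without repeating the operator between them.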
To some, the lack of commas in lisp can feel cleaner than their presence in non-lisps. YMMV.
Further, lisp symbols can actually be *anything*.
The lisp reader is also, by default, case-insensitive; so you don’t need to bother about whether something is called “fooBar” or “foobar” or “FOOBar”. It’s usually just “foo-bar”. Hyphens also feel more natural than underscores. In daily language, we rarely use underscores (at least I don’t), but hyphens just feel more natural while writing compound words.
All of that is largely subjective. Where lisp shines is when it is used in conjunction with a structured editing mode like the paredit mode.
The alternative in other languages happens to be:
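A small sketch of both points, assuming the standard readtable (which upcases symbol names as it reads them); the function name is hypothetical and only there to show a hyphenated, punctuation-rich symbol:

```lisp
;; With the standard readtable, these spellings all name the same symbol.
(eq 'foo-bar 'FOO-BAR)          ; => T
(eq 'foo-bar 'Foo-Bar)          ; => T
(symbol-name 'foo-bar)          ; => "FOO-BAR"

;; Symbol names may contain characters such as - and >,
;; so names can read like short phrases.
(defun celsius->fahrenheit (c)
  (+ 32 (* 9/5 c)))

(celsius->fahrenheit 100)       ; => 212
```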
- Hold your mouse and place the cursor at the start of the code block.
- Drag your mouse to the end of the code block
That’s about 100–300% more effortful than if you know paredit. And if you are gonna be alive for more than a decade, paredit is definitely worth the learning! Plus, did you notice that you do not need to count parentheses? The editor keeps track of it for you! Even for non-lisps, there are equivalent modes like smartparens-strict-mode you could help yourself with.
But is this something unique to lisp? No, in fact, there is a related open issue for julia. So, if you did understand the power of paredit, do try smartparens-strict-mode or equivalent with other languages!
Idiomatic lisp code is nowhere near as unreadable as the other article suggests. Here’s the same code, completed and formatted.
On the extensibility of lisp syntax
Lisp is characterized by its extreme extensibility. You can extend the reader. You can introduce new syntax.
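To give a flavour of what extending the reader can look like, here is a minimal sketch using the standard `set-macro-character`. The bracket syntax is a hypothetical example made up for illustration; it is not something the language or any particular library prescribes.

```lisp
;; Hypothetical syntax for illustration: [a b c] reads as (list a b c).
(defun read-bracketed-list (stream char)
  (declare (ignore char))
  ;; Read forms up to the closing ] and wrap them in a call to LIST.
  (cons 'list (read-delimited-list #\] stream t)))

;; Work on a copy so the standard readtable stays untouched.
(setf *readtable* (copy-readtable))
(set-macro-character #\[ #'read-bracketed-list)
;; Make ] a terminating character, like ), so that "2]" reads as 2 then ].
(set-macro-character #\] (get-macro-character #\)))

(print [1 2 (+ 1 2)])   ; prints (1 2 3)
```

This works as written when the forms are evaluated one by one, at the REPL or via `load`; a file passed to `compile-file` would need the readtable changes wrapped in `eval-when`.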
I’d also consider it to be objective that the prefix lisp syntax isn’t particularly well suited to sharing mathematical code.
Almost all written mathematics uses infix syntax, and having the same syntax in code and in papers/textbooks takes away the overhead of checking whether you typed the expression correctly. I, as a lisper, don’t find the prefix syntax to be “harder” than the infix one; if anything, it states exactly what is to be done: namely, we have to add 5, x, and another expression. However, that isn’t how I find mathematics written in books!
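For concreteness, here is how “add 5, x, and another expression” might look in prefix form. The expression 5 + x + 2y and the function name are hypothetical, picked only to illustrate the shape:

```lisp
;; Hypothetical expression, chosen for illustration: 5 + x + 2y
(defun example-sum (x y)
  (+ 5 x (* 2 y)))

(example-sum 1 2)   ; => 10
```

The single `+` covers all three terms, which is exactly the “add these things” reading; a textbook would spread the operator between the terms instead.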
But lisp is extensible. There is a library called cmu-infix that introduces the required mathematical syntax; so the following is perfectly valid lisp code once you use it.
And that is hardly different from the math syntax above!
However, in 2020, in my opinion, cmu-infix needs more polishing, particularly to deal with array slicing. And it’d be great to have superscripts mean powers too, just like in usual mathematics!
Quicklisp, and Ultralisp
Where I found myself significantly disagreeing with the author was on quicklisp. Quicklisp is one of the best package managers I’ve known, thanks to its guarantee that “all packages build together”. So, as long as you haven’t modified your lisp system, loading a library is as simple as (ql:quickload "library"). This is actually a dist system, akin to what one would find with the linux package system apt. There are almost monthly updates to quicklisp, and if something does break due to an upgrade, you simply go back to a known working dist.
So, I don’t really get the author’s point about using a quicklisp compatible compiler. It’s just ubiquitous!
The monthly update cycle can, indeed, be painful for projects in active development; for those, there is ultralisp, which updates every 5 minutes!
Of course, there are facilities like qlot to help manage dependencies, if quicklisp dists ever happen to be insufficient for you.
While macros are one thing, I don’t tend to find them of as much use as SLIME.
Your usual development cycle consists of
- Write/Edit code.
- (Optionally) Compile.
- Run and debug.
I’ve found myself waiting for 10 sec on the smallest of changes while working on a largish python system. SLIME is a godsend in that it takes this waiting period to under 1 sec, unless perhaps you are changing something at the base of your system.
Are there alternatives in other languages? Sorta yes. In python, if you fire up a REPL inside emacs, it is connected to python buffers; however, the lack of a lisp-like package system limits this to single-file workflows. For julia, there is julia-snail.
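The reason the wait shrinks is that a single definition can be recompiled inside the running image, with everything else left loaded. A minimal sketch, with hypothetical function names:

```lisp
;; Hypothetical function names, for illustration only.
(defun greet (name)
  (format nil "Hello, ~a" name))

(defun greet-everyone (names)
  (mapcar #'greet names))

(greet-everyone '("Ada" "Grace"))   ; => ("Hello, Ada" "Hello, Grace")

;; Recompile just this one definition (C-c C-c in SLIME) ...
(defun greet (name)
  (format nil "Hi, ~a!" name))

;; ... and existing callers pick up the new version immediately,
;; with no restart and no reloading of the rest of the system.
(greet-everyone '("Ada" "Grace"))   ; => ("Hi, Ada!" "Hi, Grace!")
```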
What about jupyter notebooks? Consider SLIME to be the next step in the jupyter-notebooks evolution :p. Okay, one place where jupyter notebooks (can) shine is when you need to display images and graphs to management folks. Though, emacs-ers do have the awesome org-mode for the same, that does allow inline latex and images, and multiple language blocks in a single file. YMMV.
See? You learnt so many things without actually learning a lisp!
Well, I’d say you are already about one-third of the way along your path to learning lisp. Maybe help yourself to learning the full thing!
What about libraries?
In theory, it is entirely possible to write Common Lisp code for SBCL that is almost as performant as C. For instance, one of my pastimes has been working on a library called numericals that gets at least as fast as numpy using SBCL.
While I have not tested them for performance, there are much more complete libraries like clml for machine learning.
In practice, in fact, lisp was never a go-to language for machine learning! Lisp has been the language of pre-ML AI: Norvig’s Paradigms of Artificial Intelligence Programming (PAIP) is a great read for anyone interested in what AI was before the dawn of Machine Learning.
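One common route to that kind of speed, offered here as an assumption rather than as a description of how numericals works, is to give the compiler type and optimization declarations so it can emit unboxed floating-point code. The `dot` helper below is hypothetical:

```lisp
;; Hypothetical helper, not taken from the numericals library.
(defun dot (a b)
  ;; Declaring specialized array types lets the compiler use raw
  ;; double-float arithmetic instead of generic dispatch.
  (declare (optimize (speed 3))
           (type (simple-array double-float (*)) a b))
  (let ((sum 0d0))
    (declare (type double-float sum))
    (dotimes (i (min (length a) (length b)) sum)
      (incf sum (* (aref a i) (aref b i))))))

(let ((a (make-array 3 :element-type 'double-float
                       :initial-contents '(1d0 2d0 3d0)))
      (b (make-array 3 :element-type 'double-float
                       :initial-contents '(4d0 5d0 6d0))))
  (dot a b))   ; => 32.0d0
```

How close this gets to C depends on the implementation and the workload, so treat it as a starting point to measure rather than a guarantee.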
That said, the take-home message is: structured editing, a quicklisp-like package system, and SLIME. If lisp does interest you, take a look at some of its de facto libraries: the Getting Started sections of cl-who, cl-ppcre, fiveam, iterate and alexandria in particular.
The bonus of working with lisp is that, if your needs are met by the ANSI standard, or perhaps some well-solidified de facto libraries, then you can spare yourself the pain of upgrading your projects due to “language upgrades”. ANSI standardization was complete in 1994; and quite a few projects from the pre-2000s can still run in ANSI-conforming implementations in 2020.
If Common Lisp does interest you, The Common Lisp Cookbook and Practical Common Lisp, coupled perhaps with Baggers’ videos, are some of the several excellent resources to get started. If you ever need help with anything, the community is welcoming at r/CLSchools! Of course, there’s the more general purpose r/lisp, and the more familiar stackoverflow.
And, if you find the emacs learning curve too steep, there is SLIMA for Atom as well.
Hope you have a good day!