These new type systems, and the type systems in other modern languages often draw inspiration from Haskell. I’ve gotten pretty tired of reading about references to Haskell and having them go completely over my head, so I finally decided to bite the bullet and learn it.
What is Haskell?
Haskell is a general purpose programming language that is distinguished by two main features:
- It is a “purely functional language”. This means that any function written in Haskell if called with the same arguments will return the same values, and does not have any “side effects”. This means that it will not modify an array or map, print anything out or do any other kind of I/O.
- It has a very powerful system of static types.
I find that I tend to not learn much while going through tutorials, so I decided to solve a simple problem and learn what I needed to learn along the way:
Write a program that takes in a string containing a bunch of words, along with a maximum line length, and prints them the words, minimizing the amount of space at the end of each line.
I chose to define “minimizing the amount of space at the end of each line” as minimizing the sum of the squares of the space at the end of each line (to make the space more balanced).
Here’s a Python solution that I wrote a while back to do this:
Don’t worry too much about reading the details of the program. If you’re curious about the algorithm, basically, what this is doing is iterating over each word i, and then for each word j from the beginning of the string to i, choosing the cheapest cost j, where the words from j to i are on a line together as the last line in the output string. The cost of placing a line break before j is the amount of space left over on this last line, plus the cost of just considering the words from 0 to j.
Alright, now, to rewrite it in Haskell!
The first thing I decided to start with was putting together a program that computed the “minimal cost” of the optimal solution, rather than actually printing out the words in the paragraph, like I do in the Python program.
The familiar and easy part
The first step I took in getting this program to work was handling inputs and outputs. The interface is as follows:
# "10" here is the max line length to use, and words.txt is
# a file containing the words to print.$ ./consistent_printer 10 words.txt
The entrypoint to any Haskell program is a function called main, and the operation for pulling out the command line arguments looks like this:
main = do
[lineLengthRaw, fileName] <- getArgs
To extend this to print out those arguments would look like this:
main = do
[lineLengthRaw, fileName] <- getArgs
For those who have mostly worked with imperative languages, this probably looks very familiar — it simply parses and writes the command line args to the variables
fileName, and then prints them out using the
putStr argument. Now, if you remember what I mentioned earlier, Haskell is a language where there are not supposed to be side effects; all functions are pure. There are clearly some side effects going on here — we are pulling values from the environment, and outputting stuff in the console. The reality of any programming is that some amount of imperative processing is going to be required. Haskell provides the do keyword that allows programmers Haskell programmers to do that. I’m not going to go into details on how it works here, but if you’re interested, the StackOverflow answers here are great.
Inverting Control Flow
The bulk of the Python program I wrote is a nested for loop. Here it is again, without the comments, and without any reference to
previous_break_indexes for simplicity:
for i in xrange(len(words)):
minimum_cost = sys.maxsize
for j in xrange(i + 1):
line_length = len(" ".join(words[j:i+1]))
if not line_length > max_line_length:
extra_space = max_line_length - line_length
previous_cost = 0 if j == 0 else cost_array[j - 1]
total_cost = (previous_cost + (extra_space ** 2))
if total_cost < minimum_cost:
minimum_cost = total_cost
cost_array[i] = minimum_cost
Haskell doesn’t have for loops, so we’re going to have to do things differently here. Why it doesn’t have loops is related to the purely functional nature of the program — a for loop doesn’t have any value at the end, it just executes some code multiple times. In a world with pure functions and no side effects, executing a block of code multiple times and not returning any value is completely pointless. For more detail on why this is the case, this article is a great resource.
Therefore, Haskell has other mechanisms that it uses for looping. For instance, it has the map function, which takes a list and a function, and applies that function to each element of the list and returns the new list. In addition to map, it has a whole slew of other array operators.
What is the right option here?
Well, let’s break down what this code is doing. What are we computing on each iteration of the loop is the value of
cost_array[i]. So, instead of computing this in a loop, we could simply write a function that given i , returns the value of
cost_array[i]. This value, however, is dependent on the results of another for loop. What’s happening in that for loop though is that we are building up the value of variable
minimum_cost. Similarly to above, we can instead have a function that given an
i and a
j, returns the cost, use
map to call it for all possible
j values, and then take the minimum.
Let’s implement it:
We’ll start with the outer loop, and implement a function called
costOf. It’ll take an index, as well as the list itself and a maximum line length:
-- `costOf` refers to the minimum cost if linebreak
-- is placed after the ith word.
costOf :: [String] -> Int -> Int -> Int
costOf list max_line_length i
| i == 0 = 0
| otherwise = minimum costAtEachJ
where costAtEachJ = map (costAtJ list max_line_length i) [0..i-1]
costOf :: Int -> [String] -> Int -> Int is the type signature, and specifies that this function takes an
Int, list of strings, and another
Int, and returns yet another
| operator is what’s called “pattern matching” in Haskell and allows you to specify what the value of the function ought to be based on some conditions. Here, I am saying that if
i == 0, then the function should return
0, and that otherwise, it should return the minimum value of the list
costAtEachJ, which contains the values of the minimum cost for each possible value of
where operator allows you to define local variables — here I am defining the variable
costAtEachJ, which essentially is the value of the inner loop in the Python program (we’ll define the
costAtJ function shortly). To get ahead a little bit, the
costAtJ function will take four arguments,
j. What I’ve done here with the
map function is called currying — in Haskell, calling a function with fewer arguments that are required returns a new function that only takes the remaining arguments. Therefore, calling
costAtJ list max_line_length i returns a new function that takes a single argument
j. Therefore, when we pass this new function to
map , it applies that function to each element of the list specified by
Moving on, computing
costAtJ is a a little trickier:
-- `costAtJ` is the cost incurred by placing a newline
-- after j.
costAtJ :: [String] -> Int -> Int -> Int -> Int
costAtJ list max_line_length i j
| lineLength > max_line_length = (maxBound :: Int)
| otherwise = previous_cost + extra_space ^ 2
where lineLength = length (intercalate " " (drop j (take i list)))
previous_cost :: Int
| j == 0 = 0
| otherwise = costOf j list max_line_length
extra_space = max_line_length - lineLength
While computationally, this function is trickier, it doesn’t really use any new concepts. The only new things I introduce here are the
take functions. What
drop j (take i list) essentially does is just return the list from indices
intercalate is essentially what
join is other languages — it converts a list to a string, using the first argument as the delimiter.
So, with these two functions written, we have the building blocks to build up a whole program that computes the minimum cost for a list of words. Here’s the solution:
There are some very clear differences algorithmically from the Python program I presented. Notably, the
cost_array list is completely absent — we aren’t actually storing a list anywhere here. This is actually quite problematic — even if this program is taking up less space, every time we encounter
costAt, we have to recompute it, and we end up computing the same values over and over again.
In the next couple posts, I’ll explore strategies for solving this problem and “memoizing” the
costAt function, as well as reincorporating the
break_indexes variable and printing out the solution.
What have we learned so far?
In Haskell, the different control structures present require thinking about code differently. Instead of thinking in terms of using loops to build up values, you have to think in terms of values you are computing.
What’s interesting is that it’s possible to write code like this in other languages. In the Python program, we could have easily written a very similar
costAt function that recursively calls a
costAtJ function. Many modern languages support first-class functions and allow control structures like
map, which is necessary to write code like this.
The main difference then, is that unlike Python, Haskell requires code be written in a functional manner and does not support breaking the paradigm — there isn’t an easy way out. Writing Haskell code is great training in writing code in a functional way for this reason, you have no choice but to do so.
Also, debugging Haskell programs feels a lot more like debugging SQL queries than like debugging Python programs. For instance, in the Python program I include above, it was easy for me to drop a bunch of
Another observation is that it’s a little bit harder to reason about the runtime performance about this program written without the for loops. In the python version of this, it’s easy to tell that the program runs in O(n²) time, because it’s two nested for loops, while in this case, it requires fully reading and grokking the code to determine that as is, it runs in O(n!) time (in a follow-up, we’ll add back the
cost_array array and make it O(n²) in the Haskell version as well).
Getting started with Haskell was a lot of fun. The constraints you have to deal with force you to think about your code in terms of formulas and think in terms of “what should the value of this array be?”, rather than “how do I iteratively build up this array?”. While this particular example wasn’t great at demonstrating some of the benefits of functional programming — you could imagine that the assumption that functions are immutable makes adding certain features, such as memoization, easier. In the next post, I’ll be covering that.
I hope that this post demonstrates practically the kind of paradigm shift that programming in Haskell requires. Feel free to reach out to me on Twitter if you have any questions or comments!