Scala Short Take: Will it Fold?

A tale of procedural code made functional

Stephen Harrison
Rue Gilt Groupe Tech Blog
6 min readAug 29, 2018

--

Background

For a reason that’s not important, we wanted to generate all versions of a multiword string (we’re Rue Gilt, so it’s a retail brand name).

Words, their initials, followed by “.”, space, or nothing at all. So “Jones New York” generates that string intact, through “J New.York” to “J N Y” and everything in between. We think there are 3³ * 2³ = 216 variations.

We were in a hurry so just wanted to knock something out that worked. We started with loops and quickly got into special cases and off-by-one errors. But we soldiered on regardless. The code looked like

A few things we’re wondering

  • Is it correct?
  • Could we explain it?
  • Why the var?
  • What does the words ++ fl do?
  • Can we make it for functional?

Let’s clean it up

We think there’s a fold or reduce in there somewhere. There is. Here’s how, with the steps to derive it along the way.

First let’s make functions for things, including one that contains the for loop with a slight change to clean it up.

(The lines of code wrap around, but we’re looking for a different kind of fold!) In any case this looks like a nice encapsulation of some of the operations we’ll probably need.

safePrefix gets the initial letter of a word if it exists, or "" if it doesn’t. variations uses this result and returns word with some suffixes. expand takes a word and returns all the word variations (in w) along with all the remaining combinations. We’re going to arrange to pass the current word (in, well word, which we totally should have called head) and the remaining words in the sequence tail. The for is just nested loops, and so a bit unsatisfactory, but we’ll get to that later.

Next task is to put these new functions together. We’ll make a function that traverses the list of source words in the input, peeling them off one at a time.

We made it messier because we have to check for an empty list. We can’t avoid putting somewhere: Either as we have it above or guarding the recursive call. And while the recursive call itself is OK to look at, all the tail and flatten tomfoolery is a bit much. And that would ruin our chances at being “provably correct by inspection” a thing we value highly at Rue Gilt Groupe.

In any case, now we call the entry point and print the results

It gives us what we want, along with duplicates, which we don’t quite understand. But we could easily throw them into a Set to dedup them.

Now let’s get fold-y or reduce-y

We’ll start with Seq(""), a list of a single empty string. Then we’ll add all the word combinations for each source word as a prefix to all the previously created strings.

In other words, we keep expanding all the different possibilities so far with the variations of each word. And so on, until we’ve worked our way to the first word. Simple enough to say, but how do we implement it?

That .foldRight expresses just the right thing: start with Seq(""), and compose all the variations of each word to the right until we’ve done the last one (on the left).

So why .fold and not .reduce? Two reasons. .reduce does not work on empty lists because it requires at least one element, the head of the list. We don’t have an empty list (this time), but we want our code to be robust.

So we need .fold for that reason, which works because it takes a starting value. Note in .fold the function can produce a different type to the elements of the list, whereas .reduce variants can not: Everything must be the same type. This is another good reason to use .fold.

One other thing, there’s an ungainly .flatten in there, which is not only ugly but we feel sure can be done a better way.

But is that enough?

Not really, if we’re honest. There’s still a nested loop in there, even though the Scala syntax makes it expressive and compact.

Also, this blog post is about fold and reduce, so we need to go full beans on making that happen.

Start with some declarations, a bit cleaned up this time and more general.

safePrefix now returns Char we got with .headOption. We have to convert this to a String somewhere. Not really sure of the best place. Now we have to figure out how to write a new combine() function that we can substitute for the old one.

We can do it in a one-liner, but this looks arguably cleaner, and so more likely to be correct and easier to explain.

The _ + t is a bit mysterious until you know Scala. We could have written this as w => w + t, which you could argue is clearer. As a bonus, we removed the recursive call and made it recurrent. We don’t need the return type, but write Seq[String] for clarity. Good IDEs will show this if you hover over the function or value. In any case our driver code is still

Which when we run it delivers the following and we’re done.

Wait what? The words and variations are in reverse order. What happened? And how do we fix it?

With the help of some println debugging code, we find the .foldRight(concat) is applied over the source words in the wrong order.

We broke it. Can we fix it?

Sure we can. Simply using a .foldLeft should do the trick.

Converting foldRight to foldLeft isn’t as easy as we thought

Grrrr…. What’s going on. We look at the signature of .foldRight and it’s

meaning type we start with, B, a Seq[String], which is the second argument to op , the function combine in our case. And we return a B, Seq[String].

The function must work with the first argument of the type A, a String here. So it makes sense to us now that the words get appended in the wrong order, from the right.

Let’s see whether .foldLeft is the solution, appending source words in the correct order. Along the way, we can find the source of the error above.

The signature of .foldLeft is

B is still Seq[String], the initial value and the return type of the .foldLeft. But the order of arguments to op is different. Why the complexity? Ignoring the specific implementation of op for a moment, let’s rewrite .foldRight as

and .foldLeft as

The types of the arguments to op are reversed. An easy fix. Just change the signature of combine to

The upshot

Was it worth it? We think so. We learned how to write iterative functional code to replace explicit loops. (It took considerably less time to do the work than write about it!) It’s easier to explain and “prove correct by inspection”, something we said we shoot for.

Also we removed the var, which is always really nice. var is a bad smell in code and we try to avoid it at all costs. Mutability encourages procedural code. We want functional.

Thanks for reading. There’ll be more on Scala in a bit.

--

--