Rust as the Future of Sudoku Verifiers

Marco Amann
Digital Frontiers — Das Blog
9 min readSep 25, 2020

At Digital Frontiers we all have our favourite languages. The other day we were arguing about advantages and shortcomings of some specific ones and decided to have a closer look at them, using a Sudoku verifier as common example. If you are also interested in other programming languages, here is a collection of links to the other blog posts in our series:

Let’s build a Sudoku verifier, that reads Sudokus from a file and outputs the number of valid Sudoku solutions in it. I tried to find a balance between conciseness and performance, so please don’t expect any magic in this regard. The main focus is safety, since this is (in my opinion) the most important promise Rust makes.

Before we get into the details, consider opening up the code in your IDE: Thanks to the static type system, IntelliJ shows us the explicit types known at compile-time.

The explicit types and parameter names in light gray

Saving us from a headache: Error handling

Let’s start off easy, by reading a file. We assume the user gives us the file location as a command-line parameter. If none is given, we make use of Rusts error-handling:

let input_file_string = env::args()
.nth(1)
.ok_or(Error::from(ErrorKind::InvalidInput))?;

So how does the error handling work here? The env::args()gives us an iterator over the arguments. We try to take the second one (indices starting from 0). If no such argument exists, the .nth(1) call will return the None variant of Option. We map the None to a Result type containing either the String or an IOError in this example. The tiny ? at the end automatically unwraps the Result and returns from the function in the case of an error. This means, we explicitly treat the absence of a file argument as not-recoverable and exit.

Since this is located in our main method, we need to give it a return type.

fn main() -> std::io::Result<()>  { ...

Moreover, this also modifies the exit code of our program:

➜ ./target/release/sudoku 
Error: Kind(InvalidInput)
➜ echo $?
1

Nice.

Now to reading the file: There are some things that can go wrong reading a file, for example the file might not exist. Rust got us covered here again, we just return in that case with the ? .

let file = File::open(input_file)?;
let mut reader = BufReader::new(file);

The reader offers an iterator over all lines that we will use, so let’s do the following: Only concern lines with valid strings. Then, try to parse a Sudoku from the line and only continue with those that have valid entries (numbers from 1 to 9). We then map the Sudoku to a boolean that tells us, if it valid and we count those valid ones.

let valid_count = reader.lines()
.filter_map(|line|line.ok())
.filter_map(|line| parse_raw_line_to_sudoku(&line) )
.map(|sudoku| eval_sudoku(sudoku) )
.filter(|b|*b)
.count();

Yo might ask what could go wrong reading a line of text? Perhaps someone nasty put in a naughty string that is not valid utf8, hence the line.ok() call.

What do we need to save a Sudoku? We need to find a new home for 81 numbers. Let’s use a struct for this:

struct Sudoku{
fields: [u8;81]
}

This fields is a sized array in Rust, it will always have 81 elements. If you try to save a smaller array in there, Rust will kindly ask you to refrain from doing so by returning an error. This is strong guarantee: If your function has a parameter that expects 81 elements, you can rely on always receiving an array of that size.

In rust, it’s all about safety «|» Photo by Alexandre Lecocq on Unsplash

Verifying a Sudoku

There are more ways to verify a Sudoku than there are solutions for one but all of them need to check the following three conditions: The numbers 1 to 9 are all present (that implies without duplicates) in all of the following:

  • Each row of 9 elements
  • Each line of 9 elements
  • Each of the 9 disjoint 3x3 boxes

In the implementation I chose I check those conditions as individual ‘views’ into the potential solution and stop if one of the conditions is not true.

fn eval_sudoku(sudoku: Sudoku) -> bool{
is_valid_view(sudoku.as_columns())
&& is_valid_view(sudoku.as_rows())
&& is_valid_view(sudoku.as_boxes())
}

That code is pretty standard, so let’s move on to more interesting things: Each view is an array of 9 arrays of 9 u8variables each. How do we tell this to the compiler? By using sized arrays again:, here’s an example as a function parameter: view: [[u8;9];9] .

How do we get such a view from our initial set of 81 values? One way would be to implement a custom iterator that returns the correct values. However, I chose the blunt way of just copying the bytes around to have a straight forward implementation:

fn as_rows(&self)-> [[u8;9];9] {
let mut res = [[0u8;9];9];
self.fields.iter().enumerate().for_each(
|(e,v)| res[e/9][e%9]=*v
);
res
}

It is astonishing how ugly the rust syntax can get. Note that we could re-use memory here but there was no noticeable performance difference.

So how to verify that each value in a slice is unique? There are many ways to do so, for example by inserting them a HashSet and detect collisions. But we know more about Sudoku-Slice: It has exactly 9 elements and each element is between 1 and 9. Therefore, we can do this in a single 16 bit integer by make use of bit-wise operations.
When we shift a 1 bit byx-1 bits to the left and use a bit-wise OR operation on an accumulator, we get an accumulator that has ones in all the indices of numbers x we encountered. So assume we have an accumulator after some steps that looks like this: 0010 (we processed at least one number 2) and we encounter the number 4, we calculate 0001 << (4-1) = 1000 and then OR that with the accumulator: 0010 OR 1000 = 1010 . Since we can assure that only 9 numbers are processed by using sized arrays, we can be sure that if we get a set of 9 ones in a row, all 9 numbers were in the slice only once. To express it in Rust:

fn is_slice_unique(slice: &[u8;9]) -> bool{
slice.iter().fold(0u16,|a,e| a | 1u16<<(*e-1) as u16 )
== (1u16<<9) -1
}

Of course the compiler optimizes this to indecipherability. Did you know there exist x86 assembly instructions like punpckldq or cvttps2dq ? (Spoiler: I didn’t) Unfortunately we need to explicitly tell the compiler to use 16 bit unsigned integers u16 for the computation, since 9 numbers sadly won’t fit in 8 bit.

Handle strings

The only piece of code I haven’t shown you yet is the parsing function. Here we will see one of the not-so-nice parts of the Rust type system.

A quick walk-through of the following code: A string (line) is split at the occurrence of a , yielding an iterator. The substrings are then stripped of all whitespaces and we try to parse them to some8 bit integer (1). If parsing fails, we ignore that substring (hence the filter_map and .ok() ). Of the successfully parsed integers, we take at most 81 (2), then filter them to be within 1 and 9 and create tuples with an enumeration. Then the value (num) of these tuples is written in an array (fields) at index idx , mapped to the empty type () and these are then counted.

     let count = input.split(',')
(1) .filter_map(|split| split.trim().parse::<u8>().ok())
(2) .take(81)
(3) .filter(|e| *e<10 && *e>0)
.enumerate()
(4) .map(|(idx,num)| fields[idx] = num)
.count();

Note that fields is already created before we start parsing and that the count value has to be checked to assure that we actually copied 81 numbers and no less. This is due to the default implementations for sized arrays. The default traits are implemented up to a size of 32 but are left to the user to implement if larger arrays are desired. This greatly impacts the safety of the code (e.g. one could forget to check the actual amount of copied numbers, resulting in a bunch of zeroes at the end of the array) or the home-brewn implementation for such a basic thing could break. Of course this can be circumvented but holy sh*t is it ugly: You can’t directly impl TryFrom<&u8> for [u8;81] due to the restriction of implementing foreign traits on foreign types. Further, looking into the imlementation in std, we find this: unsafe { Ok(&*ptr) } . I am aware that one can make this safe but is it really necessary to let the user implement unsafe code for such a trivial thing as a conversion between slices and arrays? Doesn’t this defy the very point of the safety guarantees? However, up until a size of 32 this works:

let x: [u8;7]  =  numvec[..].try_into().unwrap();    // works
let x: [u8;81] = numvec[..].try_into().unwrap(); // doesn't

Enough of ranting about complexity, let’s run it!

Running the verifier

A simple cargo run --release -- sudokus.txt runs the application. Note the --release flag to ensure all optimizations are turned on. The double dashes indicate that all arguments after them shall be passed to the application and are not intended for cargo.

When opening up htop I saw, most of the cores were doing nothing:

The Sudoku verifier might not be utilizing all available resources

Let’s fix this by utilizing one of Rusts promises: Fearless Concurrency.

Gotta go fast

The following modifications decrease the runtime significantly but there is quite a bit of cheating going on: We load the whole file in memory. Of course this hurts the performance at first (there is no other processing while reading the file) but you will see in a moment how it still increases the overall performance. Reading a file with 12 million Sudokus into a string takes about 0.8 seconds for me. Of course, this requires the whole file to fit in memory but AWS offers you servers that can verify an insane amount of Sudokus.

Meet Rayon

Rayon is a crate for data parallelism. Given that the Sudokus are available in a String now, we can formulate the perfect click-bait title: “Increase your performance with this single-line trick”. Here it is: .par_lines() . That’s it. Let’s have a look at the context of it being added to the code introduced before:

let valid_count = lines
.par_lines()
.filter_map(|line| parse_raw_line_to_sudoku(line) )
.map(|sudoku| eval_sudoku(sudoku) )
.filter(|b|*b)
.count();

The rayon crate performs the following magic for us: It creates a parallel iterator that is executed on all available cores (per default). The output of htop confirms this:

The Epyc goes Brrrr…

This brings the processing time down from ca 14 seconds to 1.3 seconds. If we subtract the 0.8 seconds for reading the file (although it is unfair to compare), processing performance increased by a factor of roughly 26.

We could write a bit of custom code to allow for parallel processing while reading the file line-by line but this is out of scope.

Your key-takeaway here should be: Rayon is some magic, that when sprinkled upon an iterator of known size, automatically parallelizes your work.

Conclusion

Rust is a modern language with a focus on safety while maintaining speed and keeping developer ergonomics in mind. Given a set of high-quality libraries, writing parallel code becomes trivial. However, there are still areas that feel in need of revisiting. None-the-less it remains my favorite language and I can only recommend you having a look at it, even if you don’t plan on using it: Some of the patterns the compiler enforces on you in Rust are also applicable in other languages, even if not enforced there.

Thanks for reading! If you have any questions, suggestions or critique regarding the topic, feel free to respond or contact me. You might be interested in the other posts published in the Digital Frontiers blog, announced on our Twitter account.

--

--