Advent of Clojure — Day 3

This is day 3 of my series on the Advent of Clojure.

It’s only Day 3, and we’re already up to my favorite puzzle of the month!

— — Day 3: Spiral Memory — -
You come across an experimental new kind of memory stored on an infinite two-dimensional grid.
Each square on the grid is allocated in a spiral pattern starting at a location marked 1 and then counting up while spiraling outward. For example, the first few squares are allocated like this:
17  16  15  14  13
18 5 4 3 12
19 6 1 2 11
20 7 8 9 10
21 22 23---> ...
While this is very space-efficient (no squares are skipped), requested data must be carried back to square 1 (the location of the only access port for this memory system) by programs that can only move up, down, left, or right. They always take the shortest path: the Manhattan Distance between the location of the data and square 1.
For example:
- Data from square 1 is carried 0 steps, since it's at the access port.
- Data from square 12 is carried 3 steps, such as: down, left, left.
- Data from square 23 is carried only 2 steps: up twice.
- Data from square 1024 must be carried 31 steps.
How many steps are required to carry the data from the square identified in your puzzle input all the way to the access port?

I don’t really know why I liked this puzzle. Maybe it’s the infinite plane that appealed to me?

My puzzle input was the number 312051. This is somewhat arbitrary value, and I will refer to it simply as the value i.

Rather than just build the spiral, what patterns can we see in this structure? The first thing that stands out, is that the spiral is formed from a series of concentric squares, and of course, the number of elements in the square is a square number. Because each square is 2 wider than the square inside of it, the length of the sides of the squares increases in the sequence of odd numbers 1, 3, 5, 7…. Consequently, the sizes of the squares increases in the sequence 1, 9, 25, 49…

A consequence of this structure is that the highest number in each square will be from the sequence of squares of odd numbers, and will occur in the bottom right corner of the square. The length of the sides of the square will be the square root of this number. Finally, turning widdershins (anti clockwise) around a square of size n, each side will start 1 place in from the corner, and extend to the next corner with a length of n-1.

A square with n=7 is shown below. If we look at each side as starting just after the corner, and extending to the next corner, then each side will be 6 long (or n-1). Since the square is n=7, then the last number in the previous square was 5² = 25. That makes the top right corner 5²+6=31, the top left corner 5²+6+6=37, the bottom left corner 5²+6+6+6=42, and the bottom right corner 5²+6+6+6+6=49, which also happens to be 7².

This can be generalized for any size square, where n is even (i.e. n=2m for integral values of m):

  • The first side is on the right and is in the range: (n-2)²+1 … (n-2)²+(n-1)
  • The second side is the top: (n-2)²+(n-1)+1 … (n-2)²+2(n-1)
  • The third side is on the left: (n-2)²+2(n-1)+1 … (n-2)²+3(n-1)
  • The fourth side is on the bottom: (n-2)²+3(n-1)+1 … (n-2)²+4(n-1)

It looks complex, but hopefully the pattern is visible. For sides s numbered 0 to 4, the range is:

(n-2)²+s(n-1)+1 … (n-2)²+(s+1)(n-1)

This isn’t the answer, but it does define the parameters that we need to work in.

The first step to locating where a number i is located, is to determine which square it’s in. This means finding n for:

(n-2)² < i ≤ n²; where n=2m

This can be done by finding the square root of i, rounding up. If it rounds up to an odd number, then increment to the next number.

Clojure doesn’t come with math functions, since it can rely on its host platform to provide libraries for this. On the Java Virtual Machine (JVM) these are in java.lang.Math, which is automatically imported into every namespace. The ceil function rounds up for us, but it returns a floating point value, so we will want to convert it to an integer. At the repl we can determine the root (rt) with:

(def rt (int (Math/ceil (Math/sqrt i))))

And the size of the square (sq instead of n) using:

(def sq (if (odd? rt) rt (inc rt)))

This gives rt as 559, and since it’s an odd number sq is also 559. So the square for this value of i is 559 wide.

To find the number that is the final number before this square starts counting, we need to look for (sq-2)². We can call this number the base of the square:

(def base (* (- sq 2) (- sq 2)))

This gives a base of 310249, which means that i is 1802 places around the square. We don’t actually need to know this, but since 1802/558 is over 3, we know that this value of i will be along the bottom of the square.

The distance along any one side (measured from the starting corner) will then be found by subtracting the base from the provided number i, and taking the mod with respect to the length of the side sq-1:

(def side (mod (- i base) (dec sq)))

This gives the distance of i along the bottom edge as 128.

Finally, we’re looking for the Manhattan distance. That means, the distance from the middle to the side, plus the difference from the middle of the side to the number i. The distance from the middle to the side is just half of the sq value, which we can call hsq. Because sq is always odd (due to the 1 position always being a single square in the middle), the value for hsq always has an extra half, which we can truncate. The distance between the middle of the side and the length along that side is the absolute value of hsq-side.

(+ (Math/abs (- hsq side)) hsq)

This provides a Manhattan distance of 430 for my value of i.

Note that neither of these values requires knowing which side was in use, which is why we don’t need to figure that out.

The final function for day3 is:

With solution in hand, we reveal part 2:

As a stress test on the system, the programs here clear the grid and then store the value 1 in square 1. Then, in the same allocation order as shown above, they store the sum of the values in all adjacent squares, including diagonals.
So, the first few squares’ values are chosen as follows:
Square 1 starts with the value 1.
Square 2 has only one adjacent filled square (with value 1), so it also stores 1.
Square 3 has both of the above squares as neighbors and stores the sum of their values, 2.
Square 4 has all three of the aforementioned squares as neighbors and stores the sum of their values, 4.
Square 5 only has the first and fourth squares as neighbors, so it gets the value 5.
Once a square is written, its value does not change. Therefore, the first few squares would receive the following values:
147  142  133  122   59
304 5 4 2 57
330 10 1 1 54
351 11 23 25 26
362 747 806---> ...
What is the first value written that is larger than your puzzle input?

Unlike the last time around, this looks like it will need to actually simulate the memory.

The plane is supposed to be infinite, but of course, we can stop calculating as soon as the required value has been reached. We can also see that the numbers seem to be climbing quickly, so we shouldn’t need to go too far before we find our result. At this point, I figure that since I’m solving a puzzle and not engineering a product, then it’s acceptable to just use a large array, and pretend that’s my “infinite” memory. If I made a mistake and run out of space, then I’ll just make the field larger.

As an initial guess, I think a field that stretches out 50 in each direction should be sufficient. That will make for a maximum square of 101 to a side. We will then choose the central position in this field to be the [0,0] coordinate. So we will treat [50,50] as [0,0], and the position next to it [50,51] will be [0,1].


Computer graphics and graphs often puts the horizontal displacement first, followed by the vertical displacement. i.e. [x,y]. However, linear algebra (the fancy name for using vectors and matrices) always puts the vertical displacement first, and measures from the top downwards. I’m going to stick to this convention when addressing the field of memory, because it also helps while visualizing the “vector of row vectors” described below.


Clojure does support Java’s 2D arrays, but these are mutable memory. If you recall, back when I started this I said that I wanted to stick to functional techniques, so that doesn’t appeal. The core.matrix library would work, but it seems very heavy handed when I don’t need any matrix operations. The simplest thing to use here is the standard vector of vectors approach.

First of all, a vector of twice the half width is needed, this will hold all of the vectors that form the rows of the field. To do this, we can create an empty row vector by repeating 0 the required number of times, and turn it into a vector:

(vec (repeat (* 2 half-size) 0))

This row can then be repeated the required number of times, and putting these into a vector.

(vec (repeat (* 2 half-size) (vec (repeat (* 2 half-size) 0))))

Now that we have our field, we need to initialize the central value to 1. We can choose the half-width position as the [0,0] coordinate. This can be updated using the assoc-in function.


Nested Structures

The assoc-in, get-in and update-in functions each take nested objects that are “associative” and allows you to access an element based on a path through the object. The most common approach is to use nested maps. This means that a map of maps like this:

{:a {:name “a” :value 1}
 :b {:name “b” :value 2}
 :c {:name “c” :value 3}}

Can be accessed by defining a path through it. For instance, the :value field of the :b map could be changed to 5 using:

(assoc-in nested-map [:b :value] 5)

These are standard operations for nested structures like this. What is of interest for us is that vectors are also associative structures, which associate a numerical key (the offset into the vector) with a value. This means that the path we provide would be the coordinates of the value we are interested in. So to update the 8th element (offset 7) in the second row (offset 1) to the number 15, we can use:

(assoc-in field [1 7] 15)

As a general matter of interest, because JSON data is a structure of nested maps and vectors, these functions are ideal for working with JSON, with paths being described be alternating between keys and numbers depending on the data type.

Updating a Buffer

We’re finally at the point where I realized that there may be some interest from people still learning functional programming. Yay.

One of the first mental hurdles that people moving to functional programming have to deal with is the idea of immutable data. How is a buffer useful when you can’t ever change it? What about the memory in this puzzle? Every cell has to be updated, one at a time, until the solution is found. How can that be done when the buffer doesn’t change?

Enter the world of Functional Data Structures. I’m not going to describe how these work here, as their implementation can be complex and nuanced. But I can explain what they do.

Functional data structures have the same interfaces as standard data structures, so they still look like arrays, trees, hash maps, and so on. The difference is that any changes do not affect the structure at all. Instead, an entirely new structure is created, with all of the old data in it, along with the requested change.

A simple way to replicate this behavior in an imperative language is to make a complete copy of the original data, and then modify it according to the requested operation. Many people assume that something like this must be going on, and so they are often concerned that these structures are slow to manipulate, and use lots of extra memory. The actual implementation has the new data structure share as much of the original structure as possible, and just include the minor change. This sharing is possible because neither structure can change. The operations are quick, and very memory efficient.

The main difference in using these structures is that instead of modifying something existing and expecting the structure to be updated, a reference to the new structure is taken, and the old one is discarded.


Now that we can initialize our field, we could wrap this into a function:

Once we have that, we want to iterate through values of i, setting the array accordingly. This means that for each value of i, we need the appropriate coordinates. While it’s possible to keep track of values of x and y as they traverse around the spiral, this is a lot of work and not particularly extensible. A more general approach is to create a function that can take an arbitrary value of i, and return it’s 2D coordinates. There will be some redundancy here, but each step is basic numerical calculations, so they ought to be fast enough to not matter.

Given the work above, we are already most of the way to getting these coordinates. We have the square that the value of i appears in, the distance around the square, and how far along a particular side the value appears. Using this, we can subtract it from half the side (hsq), to get how far it displaces from the center position in the side (disp). And while we didn’t keep it earlier, we also worked out which side of the square the value is on, which we can now call side-nr.

Using these figures, we can calculate coordinates, centered on [0,0]. When the value is on one of the sides, then the horizontal displacement is always the hsq value (positive or negative), and the vertical is disp. When it’s on the top or bottom, then the vertical displacement is either positive of negative hsq, and the horizontal is based on disp.

Remember that the vertical coordinate comes first. Also, there is no need to calculate that 1 is the origin, so it can be returned immediately.

With all of this machinery to create the coordinates, the redundancy in day3 is too much for me. So I’ll change it to use the new function:

Second Star

At this point, we want to iteratively add a new value into the spiral memory, and figure out each value by adding all the neighboring values. While reduce can be used to update a structure over and over like this, I find that this kind of operation is a more natural fit for a loop/recur.

The idea is to start with the initialized field and a value for i of 2 (since the number 1 position is already initialized). Then, calculate the location of i in the field using the coords function, and add up all the values around it. Finally, update the field with this value, and recur back with the updated field and an incremented i.

Line 2 initializes i and field, and marks the start of the loop. Line 3 then gets the [x,y] coordinates for the current value of i, and lines 4 and 5 modify these to create locations within the field buffer. This just means adding in the values for the origin position, which is the “radius” of a spiral in the field, which we can call fieldr.

Once this location is found, we need to find all the locations around it. This can be done using a nestedfor loop. These loops create a sequence of values based on the varying vars. By varying 2 vars between -1 and 1, we can accumulate a complete series of 9 elements, which reflect all 8 of the neighboring coordinates for a location, plus the location itself (when the pair equals [0,0]):

(for [p [-1 0 1] q [-1 0 1]] [p q])

This tells p to vary across those 3 values, and for each value of p, q is made to vary across its 3 values. Each combination is then put together as a pair. The result is:

([-1 -1] [-1 0] [-1 1] [0 -1] [0 0] [0 1] [1 -1] [1 0] [1 1])

Since [0 0] is not needed, we can filter it out of the list with:when

(for [p [-1 0 1] q [-1 0 1] :when (not (= 0 p q))] [p q])

([-1 -1] [-1 0] [-1 1] [0 -1] [0 1] [1 -1] [1 0] [1 1])

We can see this for structure on line 6. These offsets are then added to the current location, and used to access the field. The result is a sequence of all the neighboring values for the coordinate. Adding this sequence is done with apply +.

Line 8 checks if the value is the one that is being searched for, and if it is, then it can be returned. Otherwise, the field is updated to include this value, and that updated field is given to the next iteration of the loop, along with an incremented i.

Fortunately, fieldr was large enough, and the result was 312453.

Time for Day 4… right?

Redux

This solution bothered me for some time. I had the correct answer, so why worry about it? Well, the approach of using an arbitrary size for the memory and hoping it would work really bothered me.

Thinking about it, there was no need for the simulation to use spiral memory. It was merely filling in the values for different values of i, and this could be done with a linear array, with i being the position in the array.

The problem with this approach was that finding the neighboring positions for the location of i would need to be converted into positions in the linear array. This is the inversion of the coords function. That function takes a value for i, and converts it into [x,y] in the field, but now we need to take an arbitrary [x,y] and convert it into a value for i. We can call this function offset:

Since the “square” operation is appearing in a couple of places now, it makes sense to name it, and we can use it here.

Line 4 starts the offset function by finding which of the absolute values of x or y is larger. This will indicate the square that the coordinates are in. By doubling this number and removing one (line 5), we know the length of the side of the square just inside the one we are in. Line 6 squares this to give the base number for the square we are in.

Line 7 provides the size of each “side” of the square (the width of the square, minus the corner cell), and line 8 gets half of that side, since coordinates are measured from the halfway point in a side.

Lines 10 through 12 identify which side of the square the coordinates indicate, based on whether x or y was the larger value, and the sign of that coordinate. Given that, the address of the halfway point in a side can be calculated, and added on to the other coordinate to find out how far along that side it is.

Using this, we can update the day3* function:

The linear array here is called mem and it is being represented as a vector, since these can be efficiently addressed by offset. Also the conj operator always appends to the end of a vector, which is what we need to do for each new value of i.

The start of the loop on line 2 is similar to last time. Because i starts at 1 instead of 0, the 0 position will be ignored, but we can just initialize it with a 1, like the first position is. (This doesn’t really matter, as it is never looked at).

Line 3 gets the coordinates like last time, and line 4 has a similar loop to add/subtract values from the [x,y] coordinately to find the neighbors. This time, line 5 converts the coordinates back into a linear offset labelled f.

Some of the “neighboring” locations are outside of the initialized spiral, but these can be filtered out, since they provide values of i larger than the one being looked at. These are dropped with the :when filter on line 6. Finally, line 7 retrieves those values that are needed, and these all get added together with the apply + on line 4.

Line 8 provides the exit for the loop when the required value is found, or else line 10 sends execution back to the top of the loop, updating i, and appending the calculated value to the end of the vector.

There is more code, it doesn’t really look any cleaner, and it doesn’t give an answer that is any more right, but this just does it better. 😄

The final answer is still 312453. You didn’t expect it to change, right?

On to Day 4.

Like what you read? Give Paula Gearon a round of applause.

From a quick cheer to a standing ovation, clap to show how much you enjoyed this story.