Beyond the Bootcamp: Memory!
Welcome to the second module of Beyond the Bootcamp!
In today’s edition, we’ll take a high-level tour of what computer memory is and why understanding how computers use it helps us write better programs.
What is computer memory?
Let’s start off with a couple of definitions of memory, as they’re fairly similar. Dictionary.com defines memory as:
The faculty by which the mind stores and remembers information
Psychologists also talk about different types of memory. Two categories that psychologists often refer to are short-term and long-term memory. Short-term memory is defined as memory with limited capacity and limited duration, while long-term memory is associated with storage of information over an extended period.
If I asked you what you were doing right before reading this article or what you had for breakfast this morning, that’d be pretty easy to answer, right?
But if instead I asked you what you had for dinner last Wednesday or which countries you’ve visited in the last 5 years, now you have to think!
It turns out that this categorization of memory is very similar to how computers operate!
Computers function the same way! Below is an illustration of the hierarchy of memory:
At the top of the pyramid, we have the CPU; at the bottom, the hard drive. Computers start at the top of the pyramid when looking for data. Whenever the computer executes code, it first checks the processor registers in the CPU. A register is a small, fast storage location, similar to a variable, that can hold 32 or 64 bits depending on the system architecture you’re running. If you ever hear the terms x86 and x64, they refer to the number of bits the hardware works with: x86 systems use 32-bit registers, and x64 systems use 64-bit registers. Since registers are tiny and sit directly on the CPU, lookups there are extremely fast.
If the requested piece of information is not found in a register, the computer next tries to look it up in the CPU cache. The CPU cache has three levels: L1, L2, and L3.
What are caches?
Just like how we can remember what we had for breakfast or what day of the week it is, a cache serves as a way to store recently accessed pieces of data for fast access.
To get a better sense of how caches work, let’s go through an example:
Above is an example of how a cache works in this hierarchy. If the data is not available in the cache, the computer falls back to physical memory, which is referred to as RAM.
Let’s go over an example demonstrating how a cache hit would work:
In this example, we request data block 3. We see that it’s in our cache, so we don’t need to go down to main memory to fetch the value.
Let’s go over the case where we run into a cache miss:
In this case, we request block 11 and don’t find it in our cache. We then go down to main memory to find block 11 and load it into the cache. Since we just requested block 11, there’s a good chance it will be requested again in the near future. This is referred to as temporal locality.
Another type of locality computers like to take advantage of is spatial locality. Spatial locality states that items at nearby memory addresses are likely to be accessed soon after one another.
What are memory addresses?
In a computer, a memory address is a reference used by software to identify a location in hardware. Addresses are expressed as sequences of digits, typically in hexadecimal (remember base 16?). Whenever you create an object, it gets allocated a memory address. We’ll revisit this concept when we take a closer look at how software views memory.
As a simple example, let’s take a look at how arrays are laid out in memory. Let’s assume that we’re looking at an array of 6 integers (remember, an int is 4 bytes)!
For the sake of this example, we can assume the address space starts at 0x00. In reality, memory addresses start at higher values (e.g., 0x800f092), but the concept is the same. Since an int is 4 bytes, the 6-element integer array takes up 24 bytes of memory. Notice that the integers sit right next to each other in memory. This is by design, so that the hardware can take advantage of spatial locality!
As a programmer, when you reference an index in an array, under the hood the hardware pulls neighboring elements of the array into the cache, because there’s a good chance you’ll be accessing them next. Here’s a concrete example of a code snippet where this would happen:
On the first iteration of this for loop, the access to arr brings the rest of the array (or, if the array is long, just the nearby part of it) into the cache. On subsequent iterations, the elements are already waiting for us in the L1, L2, and L3 caches.
What happens if we don’t find elements in the cache?
If the elements aren’t found in the cache, we need to go to main memory (which can be expensive!) to fetch the given array element. We then repeat the process of pulling elements into the cache until the array summation is complete.
Next week, we’ll go through an example where we don’t take advantage of spatial locality and see how that directly impacts performance with timed benchmarks.
Until next time!