Why Every Bit Is Not Equal — A Primer on Computer Memory
A few weeks ago, I was working on an Active Record database in Ruby and I came across a strange error —
out of range for ActiveRecord::Type::Integer with limit 4 bytes
I realized I had no clue how many digits a 4-byte integer could hold, and that led me on a quest to understand computer memory. In this post, I will introduce you to the architecture of memory, beginning with the basics and ending with more advanced concepts. I have also included a bonus section with some tricks for Rubyists.
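As it turns out, the limit in that error is easy to compute. Here is a quick sketch in plain Ruby (no Active Record needed) of the range a signed 4-byte integer can hold:

```ruby
bits = 4 * 8                 # a 4-byte column holds 32 bits
max = 2**(bits - 1) - 1      # one bit is reserved for the sign
min = -(2**(bits - 1))

max               # => 2147483647
min               # => -2147483648
max.to_s.length   # => 10 digits at most
```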
Fun fact: 1 byte = 8 bits
The Field Effect
The field effect revolutionized memory in computers. It drives much of modern technology, including solar panels, amplifiers, and computers. The field effect means that you can increase the conductivity of special materials called semiconductors by applying an electrical charge to them. You can use this effect to create a switch that turns on when you supply positive charge and off when you remove the charge.
Examples of the field effect:
In the graphics below, the blue area represents the semiconductor. When there is no energy source connected directly to the semiconductor, the switch is off. When you connect a positive charge to the semiconductor, the switch turns on.
This switch is off — electricity is blocked because there is no positive voltage connected to the blue area
This switch is on — electricity flows because we applied a positive voltage to the blue area
These switches are called “transistor” switches (field-effect transistors) because when the semiconductor is electrically charged, you can ‘transfer’ current across material that normally ‘resists’ it. The animation above illustrates this switching behavior. (A related device, the Bipolar Junction Transistor, switches using current rather than the field effect.) Today, transistor switches can be built at a microscopic level through a technology called MOSFET. Modern memory chips have billions of transistor switches, each representing one bit of information. A computer’s central processor and I/O communicate with these switches to write and read memory, and the computer interprets their on and off states as the 0s and 1s of binary.
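As a toy illustration (this is not how hardware actually exposes its switches), you can treat a row of on/off states as bits and read them back as a number in Ruby:

```ruby
# Each switch is either on (1) or off (0)
switches = [1, 0, 1, 0]

# Joining the states and parsing them as base-2 yields the number they encode
value = switches.join.to_i(2)
value  # => 10
```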
The Memory Hierarchy
Here’s the full memory hierarchy. Computer memory is organized based on how quickly your Central Processing Unit (CPU) can access it.
Volatile Memory — SRAM and DRAM
Volatile memory is kind of like the ghosts summoned from Voldemort’s wand in Harry Potter and the Goblet of Fire. You can see the ghosts and interact with them, but they are not permanent. Volatile memory requires power and is lost when the power is turned off. There are two main types of volatile memory — Static Random Access Memory (SRAM) and Dynamic Random Access Memory (DRAM).
Static RAM (SRAM) sits closest to your CPU and is the fastest memory to read, write, and edit. SRAM is very expensive because each bit is constructed from multiple field effect transistors. That’s why your computer does not have very much of it. If you are interested in how SRAM switches work, read about the bistable multivibrator.
The fastest type of SRAM is stored in your CPU registers. CPU registers are at the top of the memory hierarchy and easiest for the CPU to access. There are two main types of registers — data and address. Data registers store numeric values, which your CPU’s Arithmetic Logic Unit (ALU) operates on to execute math and logic. Address registers hold values pointing to addresses in memory. Program counter registers and index registers can be used to iterate through data or addresses so that an entire program can be retrieved and executed by the CPU.
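To make the program counter idea concrete, here is a hypothetical sketch of a fetch-and-execute loop, with the counter stepping through an array standing in for memory (the names and the single “add” instruction are made up for illustration):

```ruby
# "Memory" holding a tiny program: each instruction adds a number
memory = [5, 10, 20]

accumulator = 0      # stands in for a data register
program_counter = 0  # stands in for the program counter register

while program_counter < memory.length
  instruction = memory[program_counter]  # fetch the next instruction
  accumulator += instruction             # execute (our only "opcode" is add)
  program_counter += 1                   # advance to the next address
end

accumulator  # => 35
```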
Outside of CPU registers, most SRAM is used for cache memory. Cache memory stores small amounts of data so future requests for that data can be served faster. As a programmer, you will probably come into contact with software caches often — web page caches and database query caches, for example. In Ruby on Rails, Active Record stores object instance information in a cache so you don’t have to hit the database every time you call a method. When you update a user’s information, Active Record contacts the database directly and resets the cache.
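A software cache can be sketched in a few lines of Ruby. This is a hand-rolled memoization example, not Active Record’s actual implementation:

```ruby
# A tiny cache: expensive lookups are stored in a Hash the first time
CACHE = {}

def find_user(id)
  CACHE[id] ||= begin
    # Imagine a slow database call here
    { id: id, name: "user_#{id}" }
  end
end

find_user(1)  # hits the "database" and stores the result
find_user(1)  # served instantly from CACHE
```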
Caches are also vital to your computer’s hardware speed. For example, your CPU has memory caches that speed up access to important information like its memory management programs.
DRAM is Dynamic Random Access Memory. It is kind of like a dam. It holds memory stably like the dam holds water, but over time it can develop leaks and needs to be maintained.
DRAM switches consist of a capacitor and a field effect transistor. The capacitor holds its charge, but the charge leaks away over time and is drained whenever the cell is read. This means DRAM must be rewritten regularly in a process known as Memory Refresh. This rewrite cycle happens at least once every 64ms in typical modern chips.
Virtual and Physical Memory
DRAM is organized into physical memory and virtual memory to provide security, free up applications, and increase available space. Physical memory has a physical address that can be accessed directly by the CPU. But how can we run programs that need more memory than is available in RAM? Virtual memory solves this issue, providing the illusion of a very large main memory. Virtual memory is physically distributed across many locations, but mapped in an address table by your Memory Management Unit (MMU) so that it appears as a collection of contiguous addresses. Through the process of paging or segmentation, your operating system’s kernel can seamlessly swap large chunks of virtual memory into physical memory, and vice versa. More on the Memory Management Unit here.
Virtual memory also prevents programs from crashing each other by calling on the same memory. Two programs can use memory from the same physical address by storing it in different virtual addresses. In this way, each running program can behave as if it is the only one running.
Why is DRAM slower than SRAM?
The organization of DRAM into virtual and physical memory is one of the major reasons why DRAM is slower than SRAM. Reading one bit of virtual DRAM requires many steps. First, a virtual address is loaded into a counter register, which alone is a multi-step process. Then the virtual address is sent to the Memory Management Unit (MMU). The MMU translates the virtual address to a physical address and sends it to the memory controller. The memory controller figures out which bank the DRAM is in and asks the DRAM to find the specific array where the data is held. The DRAM repeatedly narrows the search until it locates a single array of cells. The data is loaded, sent to the memory controller, and finally sent to the CPU, where it is loaded into registers and executed.
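The translation step can be sketched as a toy page-table lookup. The table contents and page size here are invented for illustration; a real MMU uses multi-level tables implemented in hardware:

```ruby
PAGE_SIZE = 4096

# A made-up page table: virtual page number => physical frame number
page_table = { 0 => 7, 1 => 3, 2 => 9 }

def translate(virtual_address, page_table)
  page   = virtual_address / PAGE_SIZE  # which virtual page?
  offset = virtual_address % PAGE_SIZE  # where within the page?
  frame  = page_table.fetch(page)       # MMU lookup (page fault if missing)
  frame * PAGE_SIZE + offset            # physical address
end

translate(4200, page_table)  # page 1, offset 104 => 3 * 4096 + 104 = 12392
```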
WHEW! That’s a lot of steps!
The cheap design of DRAM’s switch circuits also contributes to its slow speed. To fit billions of switches on a single chip requires small transistors that carry a small charge. In order to read such a small charge, the charge must be amplified, which takes time. Lastly, distance plays a role. SRAM is physically closer to the CPU than DRAM, which makes reading SRAM much quicker. Your CPU has methods for dealing with the slowness of DRAM. Here’s a great resource to learn more about this.
Non-Volatile Memory Is Like A Light Switch
Non-volatile memory is like a light switch. You need energy to turn a switch on and off. But once you turn a switch on or off, it does not require power to store its state. You can give it dirty looks and growl at it like Marge, but it will maintain its state. Non-volatile memory is mostly used for storage (secondary memory). This includes the hard disk drives (HDD), solid state drives (SSD), and cloud storage, but it also includes an important type of memory called read only memory (ROM).
Most of the memory in your computer is this slower storage class. This includes flash memory, hard disk drives, and cloud storage. The fastest of these, flash memory, can be randomly accessed for reading and writing, but is slow to edit or erase and can only be edited in blocks. This random access read and write capability is why solid state drives (SSD) are more expensive than hard disk drives (HDD), but SSDs are still not used for your computer’s random access memory (RAM).
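A toy model of that block-editing constraint, with made-up names and sizes, might look like this: changing even one byte forces the whole block to be rewritten, which is part of why flash edits are slow.

```ruby
# Hypothetical sketch: a flash block that must be rewritten in full per edit
class FlashBlock
  attr_reader :writes

  def initialize(data)
    @data = data.dup
    @writes = 0
  end

  # Editing one byte forces an erase-and-rewrite of the entire block
  def edit(index, byte)
    fresh = @data.dup   # stand-in for the erase cycle
    fresh[index] = byte
    @data = fresh
    @writes += 1        # count whole-block rewrites
  end

  def read(index)
    @data[index]
  end
end

block = FlashBlock.new([1, 2, 3, 4])
block.edit(2, 99)
block.read(2)  # => 99
block.writes   # => 1, one full-block rewrite for a one-byte change
```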
Read Only Memory (ROM)
ROM is almost like the skeleton memory of your computer. It stores the vital information that allows your computer to operate. In older computers, ROM stored entire operating systems, but now it usually stores firmware, which is the group of programs that provide low level control of computer hardware.
How does your computer read and write memory?
Organization of physical DRAM array:
Memory chips are physically organized much like a database table. There are rows, columns, and cells. Each physical bit has an address, similar to the way each instance of a Ruby object has its own id.
Each type of memory has its own particular method for reading and writing. Here I will discuss the physical organization of Dynamic Random Access Memory (DRAM), the most plentiful memory in RAM.
In DRAM, sense amps read each bit and amplify its on/off state to a level readable by the CPU. Reading drains the cell, so after reading a bit, the sense amp rewrites it by recharging its capacitor. This read-rewrite loop is called memory refresh. There is one sense amp per column of memory cells, which means there are thousands of sense amps on a modern memory chip.
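Here is a purely illustrative simulation of why refresh matters; real refresh is done by the sense amps in hardware, and the leak rate here is invented:

```ruby
# Each cell is a charge level that leaks over time
cells = [1.0, 1.0, 1.0]
leak_per_tick = 0.1
threshold = 0.5  # below this, the bit can no longer be read reliably

# Without refresh, the charge decays away...
10.times { cells.map! { |c| c - leak_per_tick } }
data_lost = cells.all? { |c| c < threshold }  # => true, the data is gone

# ...so the refresh cycle reads each cell and rewrites it to full charge
cells.map! { 1.0 }
cells  # => [1.0, 1.0, 1.0]
```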
The memory buffer provides a place to temporarily store memory while it is being sent through the I/O. This solves a speed bottleneck issue in the memory architecture. Oftentimes memory is read faster than it can be processed by the I/O.
The memory decoders help select which place on the chip should be read when the CPU asks for a specific address. Memory decoders allow the I/O to access small chunks of memory independently even if they are all connected to the same bus.
We’ve already talked about CPU registers (the small, fast storage locations built directly into your CPU), but we haven’t talked about what the CPU does with them. The main jobs of the CPU and its registers are to fetch, process, and execute instructions and data from memory.
The CPU runs on a clock. Most CPUs today can process a maximum of 64 bits in every cycle of their clock. The number of bits a CPU handles at once is commonly referred to as a word of memory. But the real speed of a CPU comes from its clock speed, which is measured in clock cycles per second, or hertz. Modern processor performance is measured in GHz; 1 GHz is equivalent to 1 billion clock cycles per second. That’s quite a lot of processing power!
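A back-of-the-envelope calculation shows the theoretical peak those numbers imply:

```ruby
word_bits = 64            # bits processed per clock cycle
clock_hz = 1_000_000_000  # 1 GHz = 1 billion cycles per second

bits_per_second = word_bits * clock_hz
bytes_per_second = bits_per_second / 8
bytes_per_second  # => 8000000000, i.e. 8 GB of data per second in theory
```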
The reason your computer cannot actually move data that quickly is that the memory chip is much slower than the CPU. It takes time for the chip circuitry to read and write each bit and to transfer data to the CPU.
BONUS: How to inspect the memory of objects in Ruby
Ruby includes awesome methods to let you interact with the memory of an object. Here are a couple:
### RETURN THE NUMBER OF BYTES IN THE STRING "1" ###
### 1 byte = 8 bits ###
"1".bytesize
# => 1 (8 bits)

### RETURN THE NUMBER OF BITS IN THE NUMBER 1 ###
1.bit_length
# => 1 (1 bit)
Why do integers use less memory than a number in a string?
Integers (Fixnums) in Ruby have their own binary representation, separate from the binary encoding of a digit character in a string. You can see the binary equivalents using the unpack("B*") method on a string and the to_s(2) method on an integer. Ruby uses 8 bits per ASCII string character and a varying number of bits for Fixnums, depending on their value.
### RETURN THE BINARY FOR A STRING ###
"1".unpack("B*")
# => ["00110001"] (8 bits)

### RETURN THE BINARY FOR A NUMBER ###
1.to_s(2)
# => "1" (1 bit)
You can see the difference between the binary of 0, the smallest single digit, and 9, the largest.
### RETURN THE BINARY FOR 9 AND 0 ###
9.to_s(2)
# => "1001"
0.to_s(2)
# => "0"
Convert an entire string into binary using unpack("B*"), and convert the resulting array back into a string using pack("B*"). Read this to understand why byte encoding is important!
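For example, a short round trip:

```ruby
binary = "hi".unpack("B*")
binary  # => ["0110100001101001"]

# Packing the bit string turns it back into the original characters
["0110100001101001"].pack("B*")
# => "hi"
```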
Print the location of an object in memory as a string
a = 1.class
# => Fixnum
(a.object_id << 1).to_s(16)
# => "7f7faa0c4848"
Thank you for reading! If you’ve made it this far and you’d like to geek out even more, watch this awesome documentary from the 60’s explaining the architecture of computers: