Caching and Memory Hierarchy

Jirawadee Sampusri
Jun 14, 2020 · 5 min read


Memory Hierarchy

The memory hierarchy organizes a computer's storage to increase the efficiency of the memory organization and reduce average access time. It was developed to exploit a pattern of program behavior known as locality of reference.

The picture below clearly shows the different levels of memory hierarchy:

Memory Hierarchy — Diagram

There are two main categories of memory in the hierarchy:

  • Internal Memory or Primary Memory
    This includes main memory, cache memory, and CPU registers, all of which are directly accessible by the processor.
  • External Memory or Secondary Memory
    Several types of external memory exist, such as magnetic disks, optical disks, magnetic tape, and peripheral storage devices. These devices are accessible to the processor through an I/O module.

Characteristics of the memory hierarchy design, following the figure above:

Performance:
Earlier computer systems were designed without a memory hierarchy, and the speed gap between the CPU registers and main memory kept widening, so the large difference in access times dragged down system performance. An enhancement was required, and it came in the form of the memory hierarchy design, which increased system performance. One of the most important ways to increase system performance is to minimize how far down the memory hierarchy the processor has to go to reach its data.

Capacity:
Going down the hierarchy, capacity increases.

Access Time:
This is the time interval between a read/write request and the availability of the data. Going down the hierarchy, access time increases.

Cost per bit:
Going down the hierarchy, the cost per bit decreases, i.e., external memory is cheaper than internal memory.

Locality of Reference

  1. Temporal Locality
    Temporal locality means that the data or instruction being fetched now is likely to be needed again soon. Keeping it in cache memory avoids searching main memory for the same data again.

  2. Spatial Locality
    If a particular storage location is referenced at a particular time, then nearby memory locations are likely to be referenced in the near future, so it is worthwhile to prepare faster access for those subsequent references (see the sketch after this list).
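To make this concrete, here is a minimal C sketch (illustrative only, not tied to any particular hardware) in which a single loop benefits from both kinds of locality: sum and i are reused on every iteration (temporal locality), while the array elements sit at consecutive addresses (spatial locality).

```c
#include <stdio.h>

#define N 1024

int main(void) {
    static int a[N];  /* zero-initialized array in consecutive memory */
    long sum = 0;

    /* Temporal locality: sum and i are touched on every iteration,
       so they stay in registers or the cache.
       Spatial locality: a[0], a[1], ... live at consecutive addresses,
       so one cache line fetch serves several iterations. */
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %ld\n", sum);
    return 0;
}
```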

The advantage of Locality

A memory hierarchy takes advantage of locality as follows:

- Store everything on the disk

- Copy frequently used data and its neighbors from disk to main memory

- Copy the most frequently used data and its neighbors from main memory to the cache

Cache

Cache memory is a high-speed memory. It acts as a temporary storage area that holds frequently requested data so the processor can retrieve it quickly.

Cache is built from SRAM (static RAM), which is faster than the DRAM (dynamic RAM) used for main memory.

Accordingly, the cache has less storage space than main memory; it is small and expensive, but delivers higher performance.

The levels of cache memory:

- L1 (Level 1) cache is usually embedded in the processor chip as CPU cache. It is extremely fast but small.

- L2 (Level 2) cache may be located in several places, such as on the CPU, on a separate chip, or on a co-processor, and it may have its own high-speed bus connecting it to the CPU, so it is not slowed by traffic on the main system bus. It usually has more capacity than L1.

- L3 (Level 3) cache is the largest cache memory unit and also the slowest one. Modern CPUs have dedicated space on the CPU die for the L3 cache, and it takes up a large chunk of that space.

Cache Line

The cache is partitioned into lines, also known as blocks, each holding 4–64 bytes. During data transfer between cache and main memory, a whole line is read or written at a time.

Each line has a tag that indicates the address in main memory (M) from which the line has been copied.
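As an illustration, a cache line can be pictured as a small record that carries its tag and a valid flag alongside the data. This is only a hypothetical C sketch; real hardware stores these fields in SRAM arrays, and the field widths vary by design.

```c
#include <stdint.h>
#include <stdio.h>

#define LINE_SIZE 64  /* assumed bytes per line; the article's range is 4-64 */

/* One cache line: the tag records which main-memory block the data came from. */
struct cache_line {
    uint32_t tag;              /* main-memory block address of this line */
    uint8_t  valid;            /* 1 if the line holds real data          */
    uint8_t  data[LINE_SIZE];  /* the cached copy of the block           */
};

int main(void) {
    struct cache_line line = { .tag = 0x1A2B, .valid = 1, .data = {0} };
    printf("line tag = 0x%X, valid = %u\n", line.tag, line.valid);
    return 0;
}
```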

Cache hit: the data is found in the cache. This results in data transfer at maximum speed.

Cache miss: the data is not found in the cache. The processor loads the data from main memory (M) and copies it into the cache. This causes extra delay, called the miss penalty.
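The cost of misses is commonly summarized by the average memory access time, AMAT = hit time + miss rate × miss penalty. A tiny sketch with made-up numbers shows how even a modest miss rate inflates the average:

```c
#include <stdio.h>

int main(void) {
    /* Hypothetical figures, for illustration only. */
    double hit_time     = 1.0;    /* cycles for a cache hit          */
    double miss_rate    = 0.05;   /* 5% of accesses miss             */
    double miss_penalty = 100.0;  /* extra cycles to fetch from M    */

    double amat = hit_time + miss_rate * miss_penalty;
    printf("AMAT = %.1f cycles\n", amat);  /* 1.0 + 0.05 * 100 = 6.0 */
    return 0;
}
```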

Cache mapping is the method by which main memory data is brought into the cache and referenced by the CPU. The mapping method used directly affects the efficiency of the entire computer system.

Direct Mapping

Because main memory is much larger than the cache, each addressed location in main memory maps to exactly one location in the cache.

The memory address is divided into three fields: the tag, the line number that selects the cache line, and the word offset within the line.
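As a sketch of how those three fields might be extracted, assume (purely for illustration) 64-byte lines and 256 cache lines; the divisors depend entirely on the actual cache geometry.

```c
#include <stdint.h>
#include <stdio.h>

#define LINE_BYTES 64   /* assumed line size  -> low bits are the offset */
#define NUM_LINES  256  /* assumed cache size -> next bits are the line  */

int main(void) {
    uint32_t addr = 0x12345678;  /* an arbitrary example address */

    uint32_t offset = addr % LINE_BYTES;               /* byte within the line */
    uint32_t line   = (addr / LINE_BYTES) % NUM_LINES; /* which cache line     */
    uint32_t tag    = addr / (LINE_BYTES * NUM_LINES); /* disambiguates blocks
                                                          that share a line    */

    printf("tag=0x%X line=%u offset=%u\n", tag, line, offset);
    return 0;
}
```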

Fully associative mapping

This is the most complex cache mapping technique, but it is more flexible than direct mapping. The cache stores both the contents and the addresses (tags) of the memory words, and a block of main memory can be mapped to any freely available cache line. When the cache is full, a replacement algorithm is needed to choose which block to evict.
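A minimal sketch of a fully associative lookup (the sizes and the fa_lookup helper are invented for this example): the requested tag is compared against every line, which real hardware does in parallel with one comparator per line.

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_LINES 8  /* tiny cache, for illustration only */

struct line { uint32_t tag; int valid; };

/* Compare the requested tag against every line in the cache.
   Returns the matching line index, or -1 on a miss. */
int fa_lookup(const struct line cache[], uint32_t tag) {
    for (int i = 0; i < NUM_LINES; i++)
        if (cache[i].valid && cache[i].tag == tag)
            return i;   /* cache hit */
    return -1;          /* cache miss: a replacement algorithm picks a victim */
}

int main(void) {
    struct line cache[NUM_LINES] = { [3] = { .tag = 0x42, .valid = 1 } };
    printf("lookup 0x42 -> line %d\n", fa_lookup(cache, 0x42));  /* hit: 3   */
    printf("lookup 0x99 -> line %d\n", fa_lookup(cache, 0x99));  /* miss: -1 */
    return 0;
}
```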

Set associative mapping

Set associative cache mapping combines the best of the direct and fully associative techniques. The cache consists of a number of sets, each of which contains a number of lines; a memory block maps to exactly one set but can occupy any line within that set.
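For example, in a hypothetical 2-way set-associative cache (all sizes below are assumptions for illustration), the address selects one set and only the lines in that set are searched:

```c
#include <stdint.h>
#include <stdio.h>

#define LINE_BYTES 64   /* assumed line size                    */
#define NUM_SETS   128  /* assumed number of sets               */
#define WAYS       2    /* lines per set: 2-way set associative */

struct line { uint32_t tag; int valid; };
static struct line cache[NUM_SETS][WAYS];

/* The address selects exactly one set; only the WAYS lines in that set
   are searched. Returns the matching way, or -1 on a miss. */
int sa_lookup(uint32_t addr) {
    uint32_t set = (addr / LINE_BYTES) % NUM_SETS;
    uint32_t tag = addr / (LINE_BYTES * NUM_SETS);

    for (int w = 0; w < WAYS; w++)
        if (cache[set][w].valid && cache[set][w].tag == tag)
            return w;   /* hit in this set */
    return -1;          /* miss: replace one of the WAYS lines in this set */
}

int main(void) {
    printf("lookup -> %d\n", sa_lookup(0x12345678));  /* miss on an empty cache */
    return 0;
}
```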

References

https://www.geeksforgeeks.org/cache-memory-in-computer-organization/

http://www.expert-dotnet.com/post/memory-organization-in-computer-architecture

http://homepage.divms.uiowa.edu/~ghosh/4-1-10.pdf

http://www2.cs.siu.edu/~rahimi/cs401/slides/sh-chap4.pdf

https://www.makeuseof.com/tag/what-is-cpu-cache/
