Introduction to Persistent Memory (PMem)

MemArk
Published in Geek Culture
Aug 17, 2021

*This article is published by MemArk Tech Community (https://memark.io/en/). All rights reserved.

You may have come across a confusing set of technical terms, such as Non-Volatile Memory, Persistent Memory, Optane Persistent Memory, PMem, DCPMM, and AEP. What do these terms mean? How are they related? What are they used for? This article answers these questions at an introductory level. I hope that after reading it, whether you are a programmer or a practitioner in a related industry, you will understand what persistent memory is and what role it plays.

What is persistent memory?

First, let’s clear up the confusing naming. Don’t worry about how these names evolved for the time being. In short, most of them refer to roughly the same thing. More strictly, Non-Volatile Memory is a technology, while the other terms refer to commercial products released by Intel. At present, Intel’s official name is “Intel Optane Persistent Memory”, or PMem for short.

So, let’s get started. In one sentence, persistent memory is a new type of memory module that is capable of data persistence. As the image below shows, it looks much like an ordinary memory module, and it is inserted into a memory slot on the server.

To put it simply, you can buy a module online, plug it into a memory slot on your server, and the persistent memory is ready to use. Of course, persistent memory has certain hardware requirements, especially for the CPU. It is an Intel-exclusive product, so do not expect competitors’ platforms to support it. The supported CPU models are listed at: https://www.intel.com/content/www/us/en/support/articles/000055996/memory-and-storage/data-center-persistent-memory.html To get the most out of persistent memory, you should also pay attention to the hardware configuration and operating modes, which are explained later.

Next, let’s look at where persistent memory sits in the overall computer architecture. Anyone who has studied computer architecture will be familiar with the storage pyramid. If we put persistent memory into this pyramid, where does it go? In the storage pyramid shown in the figure below, persistent memory sits between external storage (HDD or SSD) and DRAM, and falls between the two in capacity, performance, and price. Functionally, it is a hybrid of DRAM and external storage (which is why PMem is drawn as half sea water and half flame in the figure). In other words, it can be used either as memory or as persistent external storage. It can also serve both purposes, depending on how you use it.

If this is still not quite clear, the most direct explanation is the three most important features of persistent memory: Large, Fast, and Persistent:

1. Large: At present, the maximum capacity of a single persistent memory module can reach 512 GB, while the maximum capacity of a single DRAM module is 64 GB. In other words, a single server can easily reach terabytes of memory with persistent memory. In addition, the price per unit of capacity is about half that of ordinary memory.

2. Fast: Since it is also called memory, it cannot be slow. Compared with ordinary SSDs, persistent memory has a latency advantage of 1–2 orders of magnitude, and an even greater advantage over hard disks. Of course, there is a performance gap compared with DRAM. In actual use, however, the bottleneck is not necessarily memory, so the gap is usually not significant (the slowdown is generally less than 2x).

3. Persistent: Like a hard disk, persistent memory keeps its data across a power failure and restart. This is an evolution of memory: as we all know, data in ordinary memory is lost after a power failure or when the program exits. This feature lets persistent memory serve as a high-speed persistent device, and it also meets the need for rapid recovery in some in-memory application scenarios.

The following table summarizes the typical configurations on a single server in the data center and the corresponding approximate performance figures for reference.

General performance of common memory and persistence devices on a single server

What are the main advantages of persistent memory?

Having covered the characteristics of persistent memory, you can imagine that it has unique advantages in some application scenarios. In practice, there are several ways to deploy it. Working through several specific scenarios, the following sections weigh the advantages of using persistent memory against the possible problems.

Scenario 1: Large memory and low-cost solutions

If memory consumption is the key resource bottleneck of your whole system, persistent memory may be the best way to reduce your cost. A system generally needs large memory in one of two cases:

1. For performance reasons, you must use a memory-based solution instead of a disk-based one, such as Redis or MemSQL.

2. Your application can tolerate the performance loss of a disk-based solution, but it would clearly run faster and save time with more memory, as with Spark-based applications.

In either case, you can consider using persistent memory as a large-memory, low-cost solution.

Advantages: The unit price of persistent memory is about half that of ordinary memory, and a single machine can easily reach 1.5 TB or even 3 TB of memory. For example, if your target is a total memory capacity of 20 TB, persistent memory may need only 10 machines, whereas a DRAM-only cluster may need 40 or more. Considering the cost of buying and operating the machines, the cost advantage of persistent memory is obvious.
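The machine counts in this example follow from a simple ceiling division. Here is a minimal sketch of that arithmetic. The per-machine capacities (2 TB with persistent memory, 0.5 TB with DRAM only) are assumptions chosen to reproduce the 10 and 40 quoted above; they are not figures from a specific server model.

```python
import math


def machines_needed(total_tb: float, per_machine_tb: float) -> int:
    """Smallest number of machines whose combined memory covers total_tb."""
    return math.ceil(total_tb / per_machine_tb)


TARGET_TB = 20
PMEM_TB_PER_MACHINE = 2.0  # assumption: between the 1.5 TB and 3 TB mentioned above
DRAM_TB_PER_MACHINE = 0.5  # assumption: a DRAM-only server

pmem_machines = machines_needed(TARGET_TB, PMEM_TB_PER_MACHINE)
dram_machines = machines_needed(TARGET_TB, DRAM_TB_PER_MACHINE)
print(pmem_machines, dram_machines)  # 10 40
```

Plugging in your own per-machine capacities and unit prices gives a first-cut estimate before any real evaluation.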

Possible problems: Of course, introducing persistent memory may cause some performance decline compared with DRAM. The decline may come from the persistent memory itself, or from having fewer machines and therefore fewer other hardware resources (such as CPU cores or network bandwidth). In an actual project, decision-makers must therefore evaluate carefully and quantify the benefits that persistent memory brings.

Scenario 2: Applications with high-performance persistence requirements

Persistent memory is a mixture of memory and external storage. Its high-speed persistence makes it an ideal solution in scenarios where disk IO is the performance bottleneck. SSDs can also relieve that bottleneck to some extent, but PMem goes much further: it is a persistent device that can improve throughput and latency by up to two orders of magnitude. Here are some scenarios where disk IO is the performance bottleneck:

1. Message queue: The well-known open-source message queue system Kafka eventually becomes bottlenecked on disk IO because of its data persistence mechanism. The current solution is to add more machines to scale the throughput of the whole Kafka cluster.

2. Search system: Like Kafka, the popular open-source search system Elasticsearch also stores some data structures on disk. Its overall latency and throughput are ultimately limited by disk IO performance.

3. Databases or KV storage engines: MySQL and RocksDB, for example, both persist data to external storage.

4. Distributed file systems: Artificial intelligence scenarios often involve many small files. For example, in Ceph’s file system, managing a large number of small files on the metadata server may cause performance problems because of the many random reads and writes involved.

Advantages: For scenarios that need high-speed persistent reads and writes, introducing persistent memory directly improves performance by an order of magnitude or more. In terms of throughput, because each machine can handle more, the total number of machines can be greatly reduced. Lower latency is a further, separate advantage. For specific figures, please refer to the performance comparison table at the end of the previous section.

Possible problems: PMem, as a pure persistence device, may be a double-edged sword. The main problem is that its capacity is still small compared with traditional hard disks, and the unit cost is also high. Therefore, in some scenarios, if there are high requirements for capacity in addition to performance, the use of PMem will improve the performance, but it may also increase the cost. Of course, the capacity issue can be addressed via software optimization such as tiered storage algorithms.
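To make this scenario more concrete, here is a minimal sketch of an append-only log, the kind of structure a message queue persists, kept in a memory-mapped file. Everything in it is an assumption for illustration: `MappedLog` is a hypothetical helper, and an ordinary temporary file stands in for a file on a PMem-backed file system. Production code would typically use PMDK rather than Python’s `mmap`, and this sketch does not attempt to handle torn writes or concurrent writers.

```python
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<Q")  # 8-byte tail offset stored at the start of the file
LENGTH = struct.Struct("<I")  # 4-byte length prefix in front of every record


class MappedLog:
    """Append-only log in a memory-mapped file (hypothetical helper)."""

    def __init__(self, path, size=1 << 20):
        new = not os.path.exists(path)
        self._file = open(path, "w+b" if new else "r+b")
        if new:
            self._file.truncate(size)  # reserve the whole log up front
        self._map = mmap.mmap(self._file.fileno(), 0)
        if new:
            self._set_tail(HEADER.size)

    def _tail(self):
        return HEADER.unpack_from(self._map, 0)[0]

    def _set_tail(self, offset):
        HEADER.pack_into(self._map, 0, offset)
        self._map.flush()

    def append(self, payload):
        start = self._tail()
        end = start + LENGTH.size + len(payload)
        if end > len(self._map):
            raise ValueError("log is full")
        LENGTH.pack_into(self._map, start, len(payload))
        self._map[start + LENGTH.size:end] = payload
        self._map.flush()    # write the record out first...
        self._set_tail(end)  # ...then publish it by advancing the tail

    def records(self):
        pos, tail = HEADER.size, self._tail()
        while pos < tail:
            (n,) = LENGTH.unpack_from(self._map, pos)
            pos += LENGTH.size
            yield bytes(self._map[pos:pos + n])
            pos += n

    def close(self):
        self._map.flush()
        self._map.close()
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "queue.log")
    log = MappedLog(log_path, size=4096)
    log.append(b"order-1")
    log.append(b"order-2")
    log.close()

    reopened = MappedLog(log_path)  # simulate a restart
    print(list(reopened.records()))
    reopened.close()
```

The point of the sketch is the access pattern: appends are plain memory writes followed by a flush, with no explicit read or write calls on the data path.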

Scenario 3: Application of in-memory data persistence

In this scenario, PMem is essentially used as memory. In the previous scenario, a persistence mechanism already exists in the original software design (for example, a file system has to be stored on disk), so we can move that mechanism to PMem directly. This may not require changes to complex data structures, because the original design already accounts for persistence. In the in-memory data persistence scenario, however, the original design does not consider persistence at all. You therefore need to redesign the in-memory data structures and algorithms to be persistent. This kind of application has the highest development requirements, but it can take full advantage of PMem’s characteristics.

This scenario usually involves pure in-memory applications, where the most common reason for wanting data persistence is fast data recovery. The demand generally comes from online service systems (such as a Redis database, or the parameter server and feature engineering database in artificial intelligence scenarios). Once an online service node goes offline, service quality suffers. Because the system is built on in-memory data structures, recovery after an outage often takes hours, since the data must be reimported and the data structures rebuilt in memory. With persistent memory, such services can not only reduce costs through large memory, but also gain fast recovery to protect online service quality.

Advantages: As mentioned above, this mode brings the advantages of persistent memory into full play. First, large memory lowers hardware cost. Second, persistence gives an existing in-memory application a new capability, supporting rapid data recovery and protecting online service quality.

Possible problems: The main problem with this kind of application is the extra development workload. Ordinary in-memory data structures are not persistence-aware. Programmers generally have to redesign the data structures and logic to be persistent, using PMDK (the Persistent Memory Development Kit), to achieve the expected in-memory data persistence.
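The fast-recovery idea can be illustrated with a small sketch: a table of counters that lives directly in a memory-mapped file, so a restarted process simply maps the file again instead of reimporting data and rebuilding the structure. `PersistentCounters` is a hypothetical name, and a regular temporary file stands in for persistent memory. A real implementation would more likely be written against PMDK, and this sketch makes no attempt at failure-atomic updates.

```python
import mmap
import os
import tempfile


class PersistentCounters:
    """Fixed table of 64-bit counters stored in a mapped file (hypothetical helper)."""

    def __init__(self, path, slots=1024):
        new = not os.path.exists(path)
        self._file = open(path, "w+b" if new else "r+b")
        if new:
            self._file.truncate(slots * 8)  # zero-filled, so every counter starts at 0
        self._map = mmap.mmap(self._file.fileno(), 0)
        # View the mapped bytes as an array of unsigned 64-bit integers.
        self._view = memoryview(self._map).cast("Q")

    def add(self, slot, delta=1):
        self._view[slot] += delta  # update the counter in place
        self._map.flush()          # push the change to the backing file
        return self._view[slot]

    def get(self, slot):
        return self._view[slot]

    def close(self):
        self._view.release()  # must be released before the mapping can be closed
        self._map.close()
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    table_path = os.path.join(tmp, "counters.bin")

    service = PersistentCounters(table_path, slots=16)
    service.add(3)
    service.add(3, delta=4)
    service.close()  # the "service node" goes offline

    recovered = PersistentCounters(table_path)  # restart: no import or rebuild step
    print(recovered.get(3))  # 5
    recovered.close()
```

Recovery cost here is one `mmap` call regardless of how much data the table holds, which is the property that matters for online services.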

What are the persistent memory operating modes?

Persistent memory combines the characteristics of traditional memory with the persistence of external storage, and this is what gives it two distinct operating modes. Note that the same capacity cannot be used in both modes at the same time, and switching between modes has a certain cost, so it is impossible to switch dynamically while a program is running.

Memory mode

As the name suggests, in this mode persistent memory is used directly as memory, without using its non-volatile characteristics. This is the quickest and cheapest way to expand memory capacity, and it is completely transparent to programs. Specifically, the operating system sees the capacity of the persistent memory directly, and the original DRAM is hidden (in fact, it serves as a cache for persistent memory, with the cache policy controlled directly by the CPU). Programs do not need any code changes and can use the large capacity of persistent memory directly to run memory-hungry applications.

Memory mode is easy to use, but it also brings some problems:

1. Possible performance problems. DRAM acts as a cache for PMem and is managed automatically by the CPU. Because PMem is slower than DRAM, cache-unfriendly access patterns may suffer performance degradation.

2. The persistence attribute cannot be used. Memory mode gives up in-memory data persistence, so it cannot be used in scenarios that require data persistence.
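The first problem can be illustrated with a toy model. The sketch below simulates a direct-mapped cache (chosen for simplicity, since the actual hardware policy is not described here) and compares a cache-friendly access pattern with a cache-unfriendly one. The latency numbers are illustrative assumptions only, not measurements of any real DRAM or PMem module.

```python
def hit_rate(accesses, cache_lines):
    """Fraction of accesses served from a direct-mapped cache (toy model)."""
    cache = [None] * cache_lines
    hits = 0
    for line in accesses:
        slot = line % cache_lines  # each memory line maps to exactly one slot
        if cache[slot] == line:
            hits += 1
        else:
            cache[slot] = line     # miss: fetch from the slow tier, evict the old line
    return hits / len(accesses)


def average_latency_ns(rate, fast_ns, slow_ns):
    """Expected latency when hits cost fast_ns and misses cost slow_ns."""
    return rate * fast_ns + (1 - rate) * slow_ns


CACHE_LINES = 1024
DRAM_NS = 100  # assumption, for illustration only
PMEM_NS = 300  # assumption, for illustration only

# Friendly: a small working set that fits in the cache and is reused.
friendly = [i % 512 for i in range(10_000)]
# Unfriendly: two lines that map to the same slot and keep evicting each other.
unfriendly = [(i % 2) * CACHE_LINES for i in range(10_000)]

for name, pattern in (("friendly", friendly), ("unfriendly", unfriendly)):
    rate = hit_rate(pattern, CACHE_LINES)
    latency = average_latency_ns(rate, DRAM_NS, PMEM_NS)
    print(f"{name}: hit rate {rate:.2%}, average latency {latency:.1f} ns")
```

In this model the friendly pattern runs at close to DRAM latency, while the unfriendly one misses on every access and pays the slow-tier latency each time.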

App direct mode (AD mode)

AD mode exposes the memory hierarchy completely to the application. Programmers control whether data is stored in DRAM or in persistent memory, and decide whether in-memory data is persisted. Its advantage is that it overcomes the two problems of memory mode:

1. The storage hierarchy is visible to programmers, so applications can optimize storage performance according to their own characteristics, for example through tiered storage of hot and cold data and the use of cache-conscious algorithms.

2. Data persistence is available in AD mode. Programmers can choose whether to persist data in persistent memory, either to use its high-speed persistence or to give the program rapid recovery.

However, the price of AD mode is a higher R&D cost. Because it introduces a persistent programming model, an existing memory-based program may need to be re-architected to take full advantage of the multi-level memory architecture.
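As a rough illustration of tiered storage of hot and cold data, here is a minimal sketch of a two-tier key-value store. `TieredStore` is a hypothetical class: the `hot` tier stands in for DRAM, and the `cold` tier is a plain dictionary standing in for a PMem-resident structure. The least-recently-used demotion policy is an assumption picked for brevity, not a recommendation.

```python
from collections import OrderedDict


class TieredStore:
    """Two-tier key-value store with LRU demotion (hypothetical helper)."""

    def __init__(self, hot_capacity):
        self.hot_capacity = hot_capacity
        self.hot = OrderedDict()  # stand-in for the small, fast DRAM tier
        self.cold = {}            # stand-in for the large PMem tier

    def put(self, key, value):
        self.cold.pop(key, None)  # keep each key in exactly one tier
        self.hot[key] = value
        self.hot.move_to_end(key)
        self._demote()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)  # mark as recently used
            return self.hot[key]
        value = self.cold.pop(key)     # raises KeyError if the key is unknown
        self.put(key, value)           # promote on access
        return value

    def _demote(self):
        # Move the least recently used entries to the cold tier.
        while len(self.hot) > self.hot_capacity:
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = old_value


store = TieredStore(hot_capacity=2)
for k, v in (("a", 1), ("b", 2), ("c", 3)):
    store.put(k, v)
print(sorted(store.hot), sorted(store.cold))  # ['b', 'c'] ['a']
store.get("a")                                # promotes "a", demotes "b"
print(sorted(store.hot), sorted(store.cold))  # ['a', 'c'] ['b']
```

In memory mode the hardware makes this placement decision for you. In AD mode the application makes it, which is where both the optimization opportunity and the extra development effort come from.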

How to use and develop persistent memory-based applications?

First, you must understand that if memory mode, a low-cost way to expand memory capacity, can meet your business needs, you incur no development cost at all. However, if it cannot meet your needs for any of the following reasons, you will have to consider investing development effort:

  • The performance degradation of memory mode is unacceptable.
  • You want to use persistent memory to replace (or partially replace) traditional external storage devices and use its high-speed persistence.
  • You want in-memory data to be persisted so that the service can recover quickly after going offline.

Developing for persistent memory is a very big topic, and the approach differs completely depending on what you want to achieve. We won’t go into it here. The official blogs and technical documentation at pmem.io are a good starting point.

Join our community

If you are interested in developing PMem-related applications, or in adopting PMem for your business, please join our community for further discussion.

MemArk: https://memark.io/en/

Slack: https://join.slack.com/t/memarkworkspace/shared_invite/zt-o1wa5wqt-euKxFgyrUUrQCqJ4rE0oPw

Contact us: contact@memark.io

We also have the following open-source projects based on PMem:

  • Pafka: https://github.com/4paradigm/pafka
    A high-performance version of Kafka optimized for the persistent memory architecture.
  • PmemStore: https://github.com/4paradigm/pmemstore
    A PMem-based storage engine that is specially optimized for artificial intelligence workloads, processing time-window queries with high performance.
    This is the code repository of our VLDB 2021 paper:
    Cheng Chen, Jun Yang, Mian Lu, Taize Wang, Zhao Zheng, Yuqiang Chen, Wenyuan Dai, Bingsheng He, Weng-Fai Wong, Guoan Wu, Yuping Zhao, and Andy Rudoff. Optimizing In-memory Database Engine for AI-powered On-line Decision Augmentation Using Persistent Memory. VLDB 2021.

memark.io — Leveraging Modern Storage Architecture for System Enhancement