Exploring etcd’s Efficiency: A Performance Evaluation in ppc64le versus amd64 Environments

Kishen V
6 min read · Jun 13, 2023


Introduction

Etcd serves as a key-value datastore and is a go-to solution for managing data in distributed systems. It holds a significant role as the brain of a Kubernetes cluster, responsible for storing and managing the state of the entire system. This encompasses crucial data such as pods, namespaces, services, and configurations. The API server relies on etcd for data storage, consistency, concurrency control, and cluster coordination, and uses the watch API to detect any changes made to the stored keys.
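As a concrete illustration, the put and watch primitives described above map directly onto `etcdctl` invocations. The sketch below only prints the commands, so it runs without a live cluster; the `/registry/...` key path is an illustrative example, not the exact layout the API server uses.

```shell
#!/bin/sh
# Dry-run sketch of the etcd primitives the Kubernetes API server
# relies on. The key path is hypothetical; the commands are printed
# rather than executed, so no etcd cluster is required.
put_cmd="etcdctl put /registry/pods/default/nginx running"
watch_cmd="etcdctl watch --prefix /registry/pods/"
printf '%s\n' "$put_cmd" "$watch_cmd"
```

Against a real cluster, the watch command would block and stream every change made under the given prefix, which is exactly how the API server observes state changes.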

Understanding Performance Factors:

The performance of etcd is primarily determined by two factors: latency and throughput. Low latency and high throughput are essential for a short turnaround time on requests. To measure etcd’s performance, the repository provides a benchmark utility that lets users evaluate and analyze performance metrics efficiently. Different platforms and architectures have their own advantages and disadvantages, making it essential to assess their impact on etcd’s performance and to determine the most suitable workload for each architecture.

Benchmarking with the etcd repository:

Before we delve into interpreting and analyzing the benchmark results, let’s take a moment to understand the benchmark utility bundled with the etcd repository. This tool plays a vital role in gathering performance metrics for etcd. It enables users to measure latency, throughput, and other relevant parameters, providing valuable insights into the system’s efficiency.
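For context, the utility lives under tools/benchmark in the etcd source tree and builds like any Go program. A minimal sketch of the steps, printed as a dry run so that no network access or Go toolchain is needed here:

```shell
#!/bin/sh
# Dry run of building the bundled benchmark utility from the etcd
# repository; the steps are printed rather than executed.
steps="git clone https://github.com/etcd-io/etcd.git
cd etcd
go build -o benchmark ./tools/benchmark"
printf '%s\n' "$steps"
```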

Benchmark tool:

The benchmark tool measures the performance of an etcd cluster and helps assess its capabilities under different workloads and conditions. It can simulate various scenarios and test the system’s response, allowing users to evaluate the performance and scalability of their etcd deployment.

By running the benchmark tool, one can generate a synthetic workload on the etcd cluster and measure performance metrics such as throughput, latency, and other relevant statistics. This information is crucial for understanding how the cluster will perform under different scenarios, identifying potential bottlenecks or performance issues, and, in this case, determining the kind of workload best suited to a particular computer architecture.

The benchmark tool provides various options and parameters, such as the sizes of keys and values and the number of clients and connections, to customize the workload. This flexibility lets you simulate different types of operations, concurrency levels, and data sizes, tailoring the benchmark to your specific use case and evaluating the performance characteristics most relevant to the workload under consideration.

The following subcommands are supported by the benchmark tool:

mvcc            Benchmark mvcc
put             Benchmark put
range           Benchmark range
stm             Benchmark STM
txn-mixed       Benchmark a mixed load of txn-put & txn-range
txn-put         Benchmark txn-put
watch           Benchmark watch
watch-get       Benchmark watch with get
watch-latency   Benchmark watch latency

Of these, the `put` subcommand was used for the tests, with various key and value sizes and a range of client and connection counts.

Comparison of throughput in amd64 and ppc64le environments:

In this post, we will focus on comparing performance statistics between two platforms: amd64 and ppc64le. Both platforms have distinct characteristics and implications when it comes to running etcd. To shed light on their differences, we vary the sizes of the keys and values stored in the datastore. By examining these variables, we can gain insight into how the two platforms perform and identify any variances in their capabilities. The tests were done on a single-node etcd cluster, with the node spun up locally as a process, running CentOS Stream 8 as the operating system.
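A single-node cluster of this kind can be started as a plain local process. A minimal sketch follows; the flags are standard etcd flags, the node name and data directory are illustrative, and the command is printed as a dry run rather than executed:

```shell
#!/bin/sh
# Dry-run sketch of spinning up a single-node etcd locally as a
# process, as used for these tests. Paths and the node name are
# illustrative placeholders.
start_cmd="etcd --name bench-node \
  --data-dir /tmp/etcd-bench \
  --listen-client-urls http://127.0.0.1:2379 \
  --advertise-client-urls http://127.0.0.1:2379"
echo "$start_cmd"
```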

The following are the specifications of the environments used to compare throughput on ppc64le and amd64. Although one node is clocked roughly 100 MHz faster than the other, the remaining factors, such as the number of CPUs and processing units, are matched closely enough for a fair comparison.

The following is the node compute configuration for the ppc64le environment (1 PU / 8 vCPU, 8 GB memory):

Architecture:         ppc64le
Byte Order:           Little Endian
CPU(s):               8
On-line CPU(s) list:  0-7
Thread(s) per core:   8
Core(s) per socket:   1
Socket(s):            1
NUMA node(s):         1
Model:                2.2 (pvr 004e 0202)
Model name:           POWER9 (architected), altivec supported
Hypervisor vendor:    pHyp
Virtualization type:  para
L1d cache:            32K
L1i cache:            32K
NUMA node1 CPU(s):    0-7
Physical sockets:     2
Physical chips:       1
Physical cores/chip:  10

The following is the node compute configuration for the amd64 environment (4 vCPU, 8 GB memory):

[root@kishen-etcd-test-vm ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel Xeon Processor (Cascadelake)
stepping        : 6
microcode       : 0x1
cpu MHz         : 2394.290
cache size      : 16384 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes

Benchmark variations covered:

The benchmark tests were conducted using etcd’s benchmarking tool.

Finding 1: With larger value sizes for each key, throughput on ppc64le is significantly higher than on equivalently powered x86-based VMs.

The value size was set to 256, 512, 1024, 2048, 4096, and 8192 bytes for each run on both platforms. Beyond a value size of 1024 bytes, etcd’s performance in the x86-based runs degrades sharply.

Command:

benchmark --target-leader --conns=100 --clients=1000 put --key-size=8 --sequential-keys --total=500000 --val-size=X
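The sweep over value sizes for Finding 1 can be scripted as a simple loop over this command. The sketch prints each invocation as a dry run; pointed at a live cluster (with `echo` removed), it reproduces the measurement series:

```shell
#!/bin/sh
# Dry-run sweep of the val-size parameter from Finding 1; each
# iteration prints the exact benchmark invocation for one value size.
all=""
for sz in 256 512 1024 2048 4096 8192; do
  cmd="benchmark --target-leader --conns=100 --clients=1000 \
    put --key-size=8 --sequential-keys --total=500000 --val-size=$sz"
  echo "$cmd"
  all="$all $cmd"
done
```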

Finding 2: The following graph represents the relationship between throughput and the number of clients writing data to etcd, for a constant key and value size. Performance degrades sharply in the amd64 environment once the number of clients writing to etcd exceeds 2000.

Command:

benchmark --target-leader --conns=100 --clients=X put --key-size=8 --sequential-keys --total=500000 --val-size=256
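Finding 2’s client sweep follows the same pattern. The specific client counts below are illustrative, chosen only to bracket the 2000-client knee described above; the commands are printed as a dry run:

```shell
#!/bin/sh
# Dry-run sweep over the number of clients (Finding 2). The client
# counts are illustrative; the post identifies a sharp amd64
# degradation beyond 2000 clients.
all=""
for c in 500 1000 2000 4000; do
  cmd="benchmark --target-leader --conns=100 --clients=$c \
    put --key-size=8 --sequential-keys --total=500000 --val-size=256"
  echo "$cmd"
  all="$all $cmd"
done
```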

Finding 3: Varying the number of connections with a set number of clients showed a marginal performance gain in the ppc64le environment.

Command:

benchmark --target-leader --conns=X --clients=1000 put --key-size=8 --sequential-keys --total=500000 --val-size=256

Finding 4: In a containerised environment, ppc64le-based etcd containers have better throughput across varying value sizes. The nodes were set up as single-node Kubernetes clusters with 1 core and 16 GB of memory on PowerVS (ppc64le) and 4 vCPU and 16 GB of memory on VPC (amd64).

ppc64le has the upper hand in providing higher throughput in a comparable amd64 environment. The version of etcd used here is 3.5.9.

Command:

benchmark --target-leader --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key --conns=100 --clients=1000 put --key-size=8 --sequential-keys --total=1000000 --val-size=X
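For the containerised setup, the benchmark can also be pointed explicitly at the cluster’s etcd client endpoint with the kubeadm-default certificate paths. The endpoint URL and the concrete val-size of 1024 in this sketch are illustrative assumptions, and the command is printed rather than executed:

```shell
#!/bin/sh
# Dry-run sketch of an in-cluster benchmark run (Finding 4). The
# endpoint and val-size are illustrative; certificate paths are the
# kubeadm defaults and may differ in your cluster.
cmd="benchmark --target-leader \
  --endpoints=https://127.0.0.1:2379 \
  --cacert /etc/kubernetes/pki/etcd/ca.crt \
  --cert /etc/kubernetes/pki/etcd/server.crt \
  --key /etc/kubernetes/pki/etcd/server.key \
  --conns=100 --clients=1000 \
  put --key-size=8 --sequential-keys --total=1000000 --val-size=1024"
echo "$cmd"
```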

Findings and observations:

Etcd clusters hosted on the ppc64le architecture have an edge over x86-based clusters for large key-value sizes and large numbers of writing clients, and show marginally better performance with a high number of client connections.

References:

Relationship between fio and etcd performance: https://www.ibm.com/cloud/blog/using-fio-to-tell-whether-your-storage-is-fast-enough-for-etcd

etcd benchmark tool: https://etcd.io/docs/v3.4/op-guide/performance/

Blog on Alibaba’s route to etcd optimization: https://www.alibabacloud.com/blog/performance-optimization-of-etcd-in-web-scale-data-scenario_594750

Summary of etcd benchmarks: https://indico.cern.ch/event/560399/contributions/2262460/attachments/1318051/1975404/slides.pdf

Written by Kishen V, Cloud Engineer, IBM India Systems Development Labs.