Effective Workload Benchmarking with Open Source Tools

Ezequiel Lanza
Published in Intel Tech
Aug 22, 2023

Use a prebuilt Docker* container to run benchmarking tools for hardware metrics.

Author: Brian McGinn

With rapidly changing technologies, maintaining a reliable approach to monitoring workload performance is essential. However, performance benchmarking tools are often tightly coupled to specific technologies, or not offered at all, which often forces teams to create custom benchmarking tools to extract key metrics.

Rather than creating a set of custom tools, we opted to use open source tools and package them in Dockerfiles. This provides an operating system (OS)-agnostic Docker image for deploying the benchmark tools, so deployments can be replicated uniformly across hardware platforms.
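
For illustration, here's a minimal sketch of what such a Dockerfile might look like for the CPU-side tools. The package names are the standard Ubuntu ones; the actual Dockerfiles in our reference implementation may differ.

# Minimal sketch of an Ubuntu 22.04 image bundling the CPU-side tools.
FROM ubuntu:22.04
# sysstat provides sar, procps provides free, iotop provides iotop.
RUN apt-get update && \
    apt-get install -y --no-install-recommends sysstat iotop procps && \
    rm -rf /var/lib/apt/lists/*
# Default to a one-second CPU utilization report; override at run time.
CMD ["sar", "1"]

Built this way, the same image runs unchanged on any Docker host, regardless of the host OS.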

In this post, we’ll guide you through the six tools we’ve adopted, with code snippets that produce performance results tailored to specific use cases. We took this approach, for example, to generate performance results for our open source reference implementation, and we’ve made it available to everyone so you can produce your own performance data on Intel hardware.

How to use the benchmark metrics tools

Each tool described below follows the same flow: a description of the tool and its capabilities, followed by the command used to invoke it. Let’s take a deeper look at the six tools used in this open source benchmarking implementation.

Sysstat

sar, part of the sysstat package, reports CPU utilization for the overall system. Our Docker image uses Ubuntu* 22.04, which provides this package, as its recommended base. Multiple settings can refine the output, but for our example we only need the sampling interval for the metrics pull, set here to one second.

sar 1
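
The output includes %user, %system, %idle, and similar columns per interval. If you want a single utilization figure, one common approach (a sketch, not part of the reference scripts) is to subtract %idle from 100:

# Sketch: take one sample and derive overall CPU utilization as 100 - %idle.
# Column positions can vary between sysstat versions; adjust $NF if needed.
sar 1 1 | awk '/Average/ { printf "CPU Utilization %%: %.2f\n", 100 - $NF }'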

Free

Like sar, free is part of the Ubuntu* 22.04 base system and reports the free and used memory on the system. We use a one-second interval to gather metrics continuously. Other parameters are available for additional insight into the system’s memory usage.

free -s 1
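
To turn that output into a single utilization percentage, a sketch like the following works (used over total from the Mem: row; not part of the reference scripts):

# Sketch: compute memory utilization % from one free sample.
free | awk '/^Mem:/ { printf "Memory Utilization %%: %.3f\n", $3 / $2 * 100 }'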

iotop

iotop, available in Ubuntu* 22.04, pulls I/O metrics for active processes on the system. We use -P to report per-process totals rather than individual threads, -o to only monitor processes actually using I/O, and -b to enable a non-interactive batch mode so metrics can be gathered in the background. Of the metrics logged, we focus on the disk read and disk write rates in MB/s for the entire system.

iotop -o -P -b
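
Because batch mode keeps printing until stopped, it helps to bound the run. This sketch uses the flags above plus -n to limit iterations, keeping only the system-wide totals:

# Sketch: five one-second iterations, keeping only the system-wide totals.
# iotop requires root privileges.
sudo iotop -o -P -b -n 5 | grep 'Total DISK'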

IGT GPU Tools

IGT GPU Tools is a set of open source tools from Intel that can be used to gather Intel GPU performance metrics. We use the intel_gpu_top command to uncover GPU utilization for Intel integrated graphics (iGPU) and Intel® Arc™ GPUs.

intel_gpu_top -d pci:card=<your card # ex. 0> -J
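
The -J flag emits JSON, which is handy for scripted collection. Here's a rough sketch; the JSON schema and engine names vary across igt-gpu-tools versions, so inspect the captured output and adjust the jq filter to match:

# Sketch: capture a few seconds of JSON samples, then pull the Render/3D
# engine's busy % with jq. SIGINT lets intel_gpu_top close the JSON cleanly.
sudo timeout --signal=INT 5 intel_gpu_top -d pci:card=0 -J > gpu.json
jq '.[].engines["Render/3D/0"].busy' gpu.json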

Intel® XPU Manager

The Intel® XPU Manager is another tool for gathering GPU performance metrics. Unlike IGT, the Intel® XPU Manager supports the Intel® Data Center GPU series. To gather metrics, we use the xpumcli dump command to log raw data for a specified GPU device, and the -m/--metrics flag to narrow down the amount of data the log produces.

xpumcli dump --rawdata --start -d $device -m 0,5,22,24,25 -j
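
To find the device ID that -d expects, xpumcli can enumerate the GPUs it manages:

# Sketch: list detected GPUs; the device ID column supplies $device above.
xpumcli discovery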

Intel® Performance Counter Monitor (Intel® PCM)

Intel® PCM is an API and a set of command line tools provided by Intel to monitor performance and energy usage on Intel CPUs. We use the pcm command at a one-second interval to capture per-socket metrics such as memory bandwidth usage and power draw.

pcm 1 -silent -nc -nsys
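
PCM also ships a standalone pcm-power utility for monitoring the system’s energy usage on its own. A minimal sketch, sampling once per second:

# Sketch: report socket power draw once per second (requires root/MSR access).
sudo pcm-power 1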

Pipeline Specific

We also support additional custom metrics emitted by our computer vision pipelines. The average frames per second (FPS) of a pipeline is a common metric for judging the performance consistency of a workload. Here we also track inference results such as object count, text detected count, barcode detected count, and inference processing time. These metrics can validate the accuracy and consistency of the inferences.
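
As a sketch, if a pipeline writes one FPS reading per line to a log (the path and format here are hypothetical, not the reference implementation’s actual log layout), the average FPS reduces to a one-liner:

# Sketch: average a log of per-second FPS readings.
# results/pipeline0.log is a hypothetical path with one number per line.
awk '{ sum += $1; n++ } END { if (n) printf "Average FPS: %.2f\n", sum / n }' results/pipeline0.log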

Now that you’re familiar with the tools used to collect metrics, let’s examine how they’re combined to generate meaningful results. To begin, you’ll need to complete the prerequisites. After that, you can either run a specific number of pipelines or stress test the system with the stream density flag.

Using the --pipelines parameter, you can define the number of pipelines to use for the benchmark. This is the best method for testing a consistent setup across different hardware platforms. Here’s an example script:

sudo ./benchmark.sh --pipelines 1 --logdir benchmark-test/data --init_duration 30 --duration 120 --platform core --inputsrc rtsp://127.0.0.1:8554/camera_0

Using the --stream_density parameter, you can define the minimum average FPS you want the running pipelines to sustain. The script will keep creating pipelines until the average FPS falls below the defined target, which makes it the right mode for finding your system’s performance threshold. An example of the script:

sudo ./benchmark.sh --stream_density 14.95 --logdir benchmark-test/data --init_duration 30 --duration 120 --platform core --inputsrc rtsp://127.0.0.1:8554/camera_0

Get a Results Summary

After running the benchmark script, the data can be consolidated into a single report with averages. To do this, run:

make consolidate ROOT_DIRECTORY=<output dir>

Here’s a sample summary report.

,Metric,Data
0,Total Text count,0
1,Total Barcode count,2
2,Camera_0 FPS,15.0
3,CPU Utilization %,16.548
4,Memory Utilization %,21.162
5,Disk Read MB/s,0.0
6,Disk Write MB/s,0.025
7,S0 Memory Bandwidth Usage MB/s,1872.632
8,S0 Power Draw W,27.502
9,GPU_0 GPU Utilization %,17.282

Now let’s look at a breakdown of the metrics and the tools used to collect them:

Total Text count, Total Barcode count, Camera_0 FPS: pipeline-specific metrics
CPU Utilization %: sar (sysstat)
Memory Utilization %: free
Disk Read MB/s and Disk Write MB/s: iotop
S0 Memory Bandwidth Usage MB/s and S0 Power Draw W: Intel® PCM
GPU_0 GPU Utilization %: IGT GPU Tools or Intel® XPU Manager

Resources

By choosing open source tools, we can enable a benchmarking method for any workload on Intel hardware. Dockerfiles are provided to assist with deployment on a range of hardware configurations, letting developers quickly deploy and benchmark workloads across Intel’s hardware portfolio.

Here’s a list of all the resources used:

sar/sysstat

Free

Iotop

IGT GPU Tools

Intel® XPU Manager

Intel® Performance Counter Monitor

Automated Self-Checkout

About the Author

Brian McGinn is a Software Architect and Technical Lead at Intel in the Health, Education, and Consumer Technologies group. He’s developed software with Intel for the past 13 years, most recently working with open source computer vision and artificial intelligence solutions in the retail space. He holds a Bachelor’s degree in Computer Engineering and a Master’s degree in Computer Science.

https://www.linkedin.com/in/brian-mcginn-009b8234/

brian.mcginn@intel.com
