x86-64 Levels

Heather Lapointe
Jan 2, 2023

The target instruction set for Intel and AMD CPUs has long been confusing for application developers who distribute software to customers running a wide range of hardware. This is especially infuriating if you accidentally emit a vfmadd132pd instruction (perhaps because you built with gcc -march=native) and the target CPU does not recognize it. If you have looked at the GCC documentation for -march, you will know there are several pages of options just for x86-64 (the most common are named after Intel CPU lines such as haswell, or AMD ones such as bdver4).

Thankfully, in 2020, several large vendors (Intel, AMD, Red Hat, SUSE) collaborated on a shared microarchitecture level specification that provides a concise set of feature levels. Check out the Red Hat blog for some of the work that went into this.

The source of truth lives in the psABI repository:

Microarchitecture levels from the psABI repository
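
To make the levels concrete: GCC 11 and Clang 12 (and later) accept the level names directly as -march values, so a build can target a fixed level instead of -march=native. The following is a minimal sketch rather than anything taken from the psABI document. The helper names `compiler_supports_level` and `newest_supported_level` and the file name `app.c` are hypothetical.

```shell
# Hypothetical helper: succeed if compiler "$1" accepts -march=x86-64-v"$2".
# It only preprocesses an empty input, so nothing is written to disk.
compiler_supports_level() {
    local cc="$1" level="$2"
    "$cc" "-march=x86-64-v${level}" -x c -E /dev/null >/dev/null 2>&1
}

# Print the newest level (4 down to 2) that the compiler accepts.
# Prints nothing and returns 1 if none of them is accepted.
newest_supported_level() {
    local cc="$1" level
    for level in 4 3 2; do
        if compiler_supports_level "$cc" "$level"; then
            echo "$level"
            return 0
        fi
    done
    return 1
}

# Example build for the v2 level (app.c is a placeholder source file):
#   gcc -O2 -march=x86-64-v2 -o app app.c
```

Calling `newest_supported_level gcc` on a build machine is one way to find out whether the toolchain is recent enough before wiring a level into a build script.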

x86-64-v2

This is the first non-baseline microarchitecture level. This level notably brings SSE4.2 and POPCNT to the table.

Most Intel processors from 2009 (Nehalem and later) and AMD processors from 2013 (Jaguar) have this feature set.

My poor Xeon X5450 (Penryn/Harpertown ~2007) did not match this feature set after having a good run of 13 years.

This is a fairly good level to target if you are looking for maximum compatibility with consumers. RHEL 9 uses it as its baseline.

As of December 2022, Steam Hardware Survey shows 99.0% of users have SSE 4.2 available.

x86-64-v3

This microarchitecture level brings the FMA and AVX2 features to the table.

Many server and workstation class processors from Intel (2013, Haswell) and AMD (2015, Excavator) meet this level.

My AMD Opteron 6328 also fell out around here with its lack of AVX2 instructions.

While many gaming and workstation processors meet this specification, general desktop and laptop users are less likely to have hardware at this level. The Steam Hardware Survey (a gaming demographic across all hardware tiers) shows 89.0% with AVX2 support in December 2022.

If you are targeting enterprises, note that larger corporations tend to rotate hardware about every 5 years so that it can be written off for tax purposes, so this feature set is usually available to them. SMBs may or may not be at this level, depending on how much hardware recycling they do (but they are probably experiencing other issues).

MongoDB 5.0 requires some features from this level.

Almost all AWS hardware supports this level (starting with m4 generation).

x86-64-v4

This microarchitecture level primarily brings some of the AVX512 variants (like AVX512F).

Only specific workstation and server-class hardware has these features available. Usually, this means higher-end Intel parts (Skylake-X, Skylake-SP). Xeon Phi is often mentioned alongside them, but it implements AVX512F and AVX512CD without AVX512BW, AVX512DQ, or AVX512VL, so it does not satisfy the full level. AMD’s Zen 4 was released in 2022 with support for this level.

I recently moved my workstations to the Intel Xeon Gold 6126, which is at this level.

The newest generation of AWS hardware supports this level (m5, t3, etc).

Detecting Levels in Bash

This script checks that x86-64-v3 is satisfied by inspecting the flags in /proc/cpuinfo (Linux only).

# Write a message to stderr in cyan.
# $* - Message
function _info() {
    >&2 echo -e "\\033[0;36m$*\\033[0m"
}

# Write a message to stderr with a yellow foreground.
# The message is prefixed with "WARNING" on a red background.
# $* - Message
function _warning() {
    >&2 echo -e "\\033[1;33m\\033[41mWARNING\\033[49m: $*\\033[0m"
}

# Write a message to stderr with a red foreground.
# It will be prefixed with ERROR: automatically.
# $* - Message
function _error() {
    >&2 echo -e "\\033[1;91mERROR: $*\\033[0m"
}

# Check a single set of flags.
#
# Usage:
#   $1 - flags string to search (e.g. the "flags" line from /proc/cpuinfo)
#   $2 - optional: if "false", warn about each missing flag
#   $@ - remaining arguments: the flag names to look for
#
# Returns: 1 if any flag is missing, 0 otherwise
function _validate_flags() {
    # Pad with spaces so every flag can be matched as a whole word.
    local check rc=0 flags=" $1 " optional=$2
    shift 2

    for check in "$@"; do
        case "$flags" in
            *" $check "*)
                ;;
            *)
                if [ "$optional" = "false" ]; then
                    _warning "Missing $check"
                fi
                rc=1
                ;;
        esac
    done
    return "${rc}"
}

# Validate x86_64 support.
#
# We require x86_64-v3, which builds on v2 (SSE 4.2, POPCNT) and
# notably adds:
#   AVX
#   AVX2
#   FMA
#
# See https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/low-level-sys-info.tex
#
# $1 - flags string (the "flags" line from /proc/cpuinfo)
# Returns: 0 if x86_64-v3 is satisfied, 2 otherwise
function _check_linux_86_64_levels() {
    local flags=$1
    # x86_64-v1
    if ! _validate_flags "$flags" false lm cmov cx8 fpu fxsr mmx syscall sse2; then
        _error "Insufficient x86_64-v1 support"
        return 2
    fi
    # x86_64-v2
    if ! _validate_flags "$flags" false cx16 lahf_lm popcnt sse4_1 sse4_2 ssse3; then
        _error "Insufficient x86_64-v2 support"
        return 2
    fi
    # x86_64-v3
    if ! _validate_flags "$flags" false avx avx2 bmi1 bmi2 f16c fma abm movbe xsave; then
        _error "Insufficient x86_64-v3 support"
        return 2
    fi
    _info "Sufficient x86_64-v3 support"
    # bonus check: x86_64-v4 (optional, so missing flags are not reported)
    if _validate_flags "$flags" true avx512f avx512bw avx512cd avx512dq avx512vl; then
        _info "Sufficient x86_64-v4 support"
    fi
    return 0
}

flags=$(grep -m1 "^flags" /proc/cpuinfo)
_check_linux_86_64_levels "$flags"
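
The script above answers a yes/no question about v3. If you would rather know the highest level a machine reaches, the same flag lists can be reused. This is a minimal sketch under that assumption, not part of the original script. The names `has_all_flags` and `highest_x86_64_level` are made up for illustration.

```shell
# Succeed if every flag named in "$2..." appears as a whole word in "$1".
has_all_flags() {
    local flags=" $1 " flag
    shift
    for flag in "$@"; do
        case "$flags" in
            *" $flag "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Print the highest x86-64 level (0-4) implied by the flag string "$1".
# 0 means that not even the baseline flags were all found.
highest_x86_64_level() {
    local flags="$1"
    has_all_flags "$flags" lm cmov cx8 fpu fxsr mmx syscall sse2 \
        || { echo 0; return 0; }
    has_all_flags "$flags" cx16 lahf_lm popcnt sse4_1 sse4_2 ssse3 \
        || { echo 1; return 0; }
    has_all_flags "$flags" avx avx2 bmi1 bmi2 f16c fma abm movbe xsave \
        || { echo 2; return 0; }
    has_all_flags "$flags" avx512f avx512bw avx512cd avx512dq avx512vl \
        || { echo 3; return 0; }
    echo 4
}

# Usage on Linux:
#   highest_x86_64_level "$(grep -m1 '^flags' /proc/cpuinfo)"
```

Because the function takes the flag string as an argument instead of reading /proc/cpuinfo itself, it can also be run against a flags line copied from another machine.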

CPU Dispatching

It is entirely possible to support several levels at once using CPU dispatching. This generally involves compiling for every supported level and selecting the appropriate build for the host CPU at runtime.
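
One coarse way to do this is at the process level: ship one binary per level and let a small launcher pick the best one the CPU can run. The sketch below assumes a hypothetical naming scheme of `<app>-v<level>` (for example ./bin/myapp-v2 and ./bin/myapp-v3) and a hypothetical helper `select_variant`. In-process dispatch, such as compiler function multiversioning, is a different technique and is not shown here.

```shell
# Print the best variant for a CPU at level "$1": the highest level that
# is <= "$1" and for which "$2-v<level>" exists and is executable.
# Returns 1 (printing nothing) if no suitable variant is found.
select_variant() {
    local cpu_level="$1" prefix="$2" level
    level=$cpu_level
    while [ "$level" -ge 1 ]; do
        if [ -x "${prefix}-v${level}" ]; then
            echo "${prefix}-v${level}"
            return 0
        fi
        level=$((level - 1))
    done
    return 1
}

# Usage in a launcher script, where cpu_level comes from a detection step
# such as parsing /proc/cpuinfo:
#   exec "$(select_variant "$cpu_level" ./bin/myapp)" "$@"
```

Falling back to lower levels means a v4-capable machine still runs the v3 build when no v4 build was shipped.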

Maybe I will go into more detail on this later.
