Paper Review: “PACT: Parameterized Clipping Activation for Quantized Neural Networks”

In this post, I’ll review PACT, a novel quantization scheme for activations during training. In their experiments, the authors show that both weights and activations can be quantized down to 4 bits without significant accuracy degradation.

Anh Tuan
5 min read · May 15, 2022

Original paper: https://arxiv.org/pdf/1805.06085.pdf


1. Quantization

1.1. What is Quantization?

Nowadays, deep learning models achieve strong results in many fields: computer vision, NLP, speech recognition, and more. However, most state-of-the-art models cannot be deployed on IoT devices because of their limited resources.

Definition of quantization according to Wikipedia:

Quantization, in mathematics and digital signal processing, is the process of mapping input values from a large set (often a continuous set) to output values in a (countable) smaller set, often with a finite number of elements. Rounding and truncation are typical examples of quantization processes. Quantization is involved to some degree in nearly all digital signal processing, as the process of representing a signal in digital form ordinarily involves rounding. Quantization also forms the core of essentially all lossy compression algorithms.

Quantization is a technique in which model data (model parameters and activations) are converted from a floating-point representation (float32) to a lower-precision representation (e.g., float16 or int8). This reduces model size, lowers latency, and speeds up inference.
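To make this concrete, here is a minimal NumPy sketch of symmetric linear quantization from float32 to int8. This is my own illustration, not code from the paper; real toolkits use more careful calibration of the scale.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric linear quantization of a float32 tensor to int8 (toy example)."""
    scale = max(np.abs(x).max(), 1e-8) / 127.0                  # map the largest magnitude to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 values from the int8 representation."""
    return q.astype(np.float32) * scale

x = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)   # x_hat ≈ x, up to quantization error
```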

Fig. 1: Example of quantization (Source: Nvidia)

1.2. Challenge in Activation Quantization

Quantizing weights is equivalent to discretizing the hypothesis space of the loss function with respect to the weight variables. Therefore, errors introduced by weight quantization can be compensated for during training. Traditional activation functions, on the other hand, have no trainable parameters, so back-propagation cannot directly correct for the errors caused by quantizing activations.

Quantization becomes more difficult when the ReLU function is used. Because the output of ReLU is unbounded, quantizing after ReLU requires a high dynamic range (i.e., more bits of precision). This can be mitigated by using a clipping activation that places an upper bound on the output, but because of layer-to-layer and model-to-model differences, it is difficult to determine a globally optimal clipping value.

2. PACT: Parameterized Clipping Activation Function

2.1. Method

To address this problem, the authors propose PACT, a new activation quantization scheme with a parameterized clipping level α that is learned during training. The conventional ReLU activation function in a CNN can then be replaced by PACT, defined in [1] as:
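$$
y = \mathrm{PACT}(x) = 0.5\bigl(|x| - |x - \alpha| + \alpha\bigr) =
\begin{cases}
0, & x \in (-\infty, 0) \\
x, & x \in [0, \alpha) \\
\alpha, & x \in [\alpha, +\infty)
\end{cases}
$$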

The truncated activation output is then linearly quantized to k bits for the dot-product computations:
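$$
y_q = \operatorname{round}\!\left(y \cdot \frac{2^k - 1}{\alpha}\right) \cdot \frac{\alpha}{2^k - 1}
$$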

For back-propagation, gradient ∂y_q/∂α can be computed using the Straight-Through Estimator (STE).
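With the STE approximation $\partial y_q / \partial y \approx 1$, this gives ([1]):

$$
\frac{\partial y_q}{\partial \alpha} = \frac{\partial y_q}{\partial y}\,\frac{\partial y}{\partial \alpha}
\approx \frac{\partial y}{\partial \alpha} =
\begin{cases}
0, & x \in (-\infty, \alpha) \\
1, & x \in [\alpha, +\infty)
\end{cases}
$$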

Fig. 2: Evolution of α values during training using a ResNet20 model on the CIFAR10 dataset ([1])

Fig. 2 shows how α evolves during full-precision training of CIFAR10-ResNet20, starting from an initial value of 10 and with an L2-regularizer applied to α.
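To illustrate how the pieces fit together, below is a minimal PyTorch sketch of a PACT activation with an STE backward pass. This is my own re-implementation for illustration only (the authors’ implementation is in TensorFlow with Tensorpack, as noted in the experiments section below), and it omits details such as the L2 regularization of α, which would be added to the training loss.

```python
import torch

class PACTFunction(torch.autograd.Function):
    """Minimal PACT sketch: clip the input to [0, alpha], quantize to k bits,
    and use a straight-through estimator (STE) for the backward pass."""

    @staticmethod
    def forward(ctx, x, alpha, k=4):
        ctx.save_for_backward(x, alpha)
        y = torch.clamp(x, min=0.0, max=alpha.item())   # y = PACT(x)
        scale = (2 ** k - 1) / alpha                    # k-bit linear quantization
        y_q = torch.round(y * scale) / scale
        return y_q

    @staticmethod
    def backward(ctx, grad_output):
        x, alpha = ctx.saved_tensors
        # STE treats round() as identity: dy_q/dx = 1 for 0 <= x < alpha, 0 elsewhere
        grad_x = grad_output * ((x >= 0) & (x < alpha)).float()
        # dy_q/dalpha = 1 where the input is clipped (x >= alpha), 0 elsewhere
        grad_alpha = (grad_output * (x >= alpha).float()).sum()
        return grad_x, grad_alpha, None

# Usage: alpha is a learnable parameter (initialized to 10 as in the paper);
# an L2 penalty on alpha would normally be added to the training loss.
alpha = torch.nn.Parameter(torch.tensor(10.0))
x = (5 * torch.randn(8)).requires_grad_()
y = PACTFunction.apply(x, alpha, 4)
y.sum().backward()      # populates x.grad and alpha.grad
```

In a network, a call like `PACTFunction.apply` would simply take the place of the ReLU, with one α per layer registered as a model parameter.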

2.2. Experiments

For the experiments, the authors implemented PACT in TensorFlow using Tensorpack. They replaced only ReLU with PACT; all other hyper-parameters were kept the same. The models evaluated are:

  • CIFAR10-ResNet20 (CIFAR10, Krizhevsky & Hinton (2010)): a convolution (CONV) layer followed by 3 ResNet blocks (16 CONV layers with 3x3 filter) and a final fully-connected (FC) layer.
  • SVHN-SVHN (SVHN, Netzer et al. (2011)): 7 CONV layers followed by 1 FC layer.
  • IMAGENET-AlexNet (AlexNet, Krizhevsky et al. (2012)): 5 parallel-CONV layers followed by 3 FC layers. BatchNorm is used before ReLU.
  • IMAGENET-ResNet18 (ResNet18, He et al. (2016b)): a CONV layer followed by 8 ResNet blocks (16 CONV layers with 3x3 filter) and a final FC layer. “full pre-activation” ResNet structure (He et al. (2016a)) is employed.
  • IMAGENET-ResNet50 (ResNet50, He et al. (2016b)): a CONV layer followed by 16 ResNet “bottleneck” blocks (total 48 CONV layers) and a final FC layer. “full pre-activation” ResNet structure (He et al. (2016a)) is employed.

Fig. 3: Training and validation error of PACT ([1])

Fig. 3 shows the training and validation error of PACT for the tested CNNs. Overall, the higher the bit-precision, the closer the training/validation errors are to the full-precision reference. In particular, training with a bit-precision of 4 bits or higher converges almost identically to the full-precision baseline.

3. Conclusion

In this post, I reviewed a novel activation quantization scheme based on the PArameterized Clipping acTivation function (PACT). The proposed scheme replaces ReLU with an activation function that has a trainable clipping parameter, α, which is optimized via gradient-descent-based training.

You can read the full paper for more details: [1].

If you have any questions, please comment below or contact me via LinkedIn or GitHub.

If you enjoyed this, please consider supporting me.

Resources:

[1] PACT: https://arxiv.org/pdf/1805.06085v2.pdf
