SmaQ: Smart quantization for DNN training by exploiting value clustering

Published in IEEE Computer Architecture Letters, 2021

Citation: Nima Shoghi, Andrei Bersatti, Moinuddin Qureshi, and Hyesoon Kim, "SmaQ: Smart Quantization for DNN Training by Exploiting Value Clustering," IEEE Computer Architecture Letters, vol. 20, no. 2, pp. 126-129, 2021. https://ieeexplore.ieee.org/abstract/document/9525237/

This paper introduces Smart Quantization (SmaQ), a quantization scheme that exploits the approximately normal distribution of the data structures used in neural network training to quantize them efficiently. By studying the distributions of weights, gradients, feature maps, gradient maps, and optimizer state across popular neural network architectures, the authors observe that values cluster tightly around the mean. Based on this observation, they propose a dynamic quantization method that computes the sampled mean and standard deviation of each tensor and uses these statistics to quantize it, addressing the memory bottleneck in single-machine training of deep networks.
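To illustrate the core idea, the sketch below standardizes a tensor using a sampled mean and standard deviation, clips to a few standard deviations (where normally distributed values cluster), and uniformly quantizes the result. This is a minimal illustration of mean/std-based quantization, not the paper's exact encoding; the function names and parameters (`smaq_like_quantize`, `sample_size`, `clip_sigmas`) are hypothetical.

```python
import numpy as np

def smaq_like_quantize(tensor, num_bits=8, sample_size=1024, clip_sigmas=3.0):
    """Quantize a tensor using its sampled mean/std.
    Illustrative sketch only; not the authors' exact scheme."""
    flat = tensor.ravel()
    # Estimate the distribution cheaply from a random sample of elements.
    idx = np.random.choice(flat.size, size=min(sample_size, flat.size), replace=False)
    mu, sigma = flat[idx].mean(), flat[idx].std()
    # Standardize and clip to +/- clip_sigmas standard deviations,
    # relying on the observed clustering of values around the mean.
    z = np.clip((flat - mu) / (sigma + 1e-12), -clip_sigmas, clip_sigmas)
    # Uniformly map the clipped z-scores onto 2^num_bits - 1 levels.
    levels = 2 ** num_bits - 1
    q = np.round((z + clip_sigmas) / (2 * clip_sigmas) * levels).astype(np.uint8)
    return q.reshape(tensor.shape), mu, sigma

def dequantize(q, mu, sigma, num_bits=8, clip_sigmas=3.0):
    """Invert the quantization back to approximate float values."""
    levels = 2 ** num_bits - 1
    z = q.astype(np.float32) / levels * (2 * clip_sigmas) - clip_sigmas
    return z * sigma + mu
```

For example, round-tripping a normally distributed weight tensor through `smaq_like_quantize` and `dequantize` at 8 bits keeps values within a small fraction of a standard deviation of the originals, while storing one byte per element plus two scalars per tensor.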