<aside> In this paper we focus on integer quantization for neural network inference, where networks are modified to use integer weights and activations so that integer math pipelines can be used for many operations.
</aside>

Let $[\beta, \alpha]$ be the range of representable real values chosen for quantization and $b$ the bit-width of the signed integer representation. Uniform quantization transforms the input value $x \in [\beta, \alpha]$ to lie within $[-2^{b-1}, 2^{b-1} - 1]$; inputs outside the range are clipped to the nearest bound.
Affine transform function: $f(x) = s \cdot x + z$
$s$ : scale factor
$z$ : zero point, the integer value to which the real value zero is mapped
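This excerpt does not spell out how $s$ and $z$ are chosen. The sketch below shows one common way to derive them from the range $[\beta, \alpha]$ and bit-width $b$; the specific formulas and the helper name `affine_params` are assumptions for illustration, not quoted from the paper.

```python
def affine_params(beta, alpha, b):
    """Derive scale s and zero point z so that the affine map f(x) = s*x + z
    sends beta to roughly -2**(b-1) and alpha to roughly 2**(b-1) - 1.
    (These formulas are one common choice, assumed rather than quoted.)"""
    s = (2**b - 1) / (alpha - beta)          # scale factor
    z = -int(round(beta * s)) - 2**(b - 1)   # integer zero point
    return s, z

# Example: int8 (b = 8) quantization of the real range [-1.5, 2.5]
s, z = affine_params(-1.5, 2.5, 8)
print(s, z)  # 63.75, -32
```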
Quantize operation
$\mathrm{clip}(x, l, u) = \left\lbrace \begin{array}{ll} l, & x < l\\ x, & l \leq x \leq u\\ u, & x > u \end{array}\right.$
$x_q = \mathrm{quantize}(x, b, s, z) = \mathrm{clip}(\mathrm{round}(s \cdot x + z), -2^{b-1}, 2^{b-1} - 1)$
$\hat{x} = \mathrm{dequantize}(x_q, s, z) = \frac{1}{s}(x_q - z)$
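A minimal NumPy sketch of the two operations above, with $s$ and $z$ treated as given; the example values are illustrative (chosen for the range $[-1.5, 2.5]$ at $b = 8$), not taken from the paper.

```python
import numpy as np

def quantize(x, b, s, z):
    """x_q = clip(round(s*x + z), -2**(b-1), 2**(b-1) - 1), per the formula above."""
    x_q = np.round(s * np.asarray(x, dtype=np.float64) + z)
    return np.clip(x_q, -2**(b - 1), 2**(b - 1) - 1).astype(np.int32)

def dequantize(x_q, s, z):
    """x_hat = (x_q - z) / s, recovering an approximation of the real value."""
    return (np.asarray(x_q, dtype=np.float64) - z) / s

# Illustrative int8 example
x = np.array([-2.0, -1.5, 0.0, 1.0, 2.5, 3.0])
s, z = 63.75, -32
x_q = quantize(x, 8, s, z)      # values outside [-1.5, 2.5] are clipped
x_hat = dequantize(x_q, s, z)   # x_hat approximates x within the range
```

Quantizing and then dequantizing makes the two error sources visible: clipping error for values outside $[\beta, \alpha]$ and rounding error for values inside it.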