ngraph.fake_quantize

ngraph.fake_quantize(data: Union[_pyngraph.Node, int, float, numpy.ndarray], input_low: Union[_pyngraph.Node, int, float, numpy.ndarray], input_high: Union[_pyngraph.Node, int, float, numpy.ndarray], output_low: Union[_pyngraph.Node, int, float, numpy.ndarray], output_high: Union[_pyngraph.Node, int, float, numpy.ndarray], levels: int, auto_broadcast: str = 'NUMPY', name: Optional[str] = None) → _pyngraph.Node

Perform an element-wise linear quantization on input data.

Parameters
  • data – The node with data tensor.

  • input_low – The node with the minimum for input values.

  • input_high – The node with the maximum for input values.

  • output_low – The node with the minimum quantized value.

  • output_high – The node with the maximum quantized value.

  • levels – The number of quantization levels. Integer value.

  • auto_broadcast – The broadcasting type, which specifies the rules used for auto-broadcasting of input tensors. Defaults to 'NUMPY'.

  • name – Optional name of the output node.

Returns

New node with quantized value.

Input floating point values are quantized into a discrete set of floating point values.

if x <= input_low:
    output = output_low
elif x > input_high:
    output = output_high
else:
    output = fake_quantize(x)

Fake quantize uses the following logic:

\[output = \dfrac{round\left( \dfrac{data - input\_low}{input\_high - input\_low}\cdot (levels-1) \right)}{levels-1}\cdot (output\_high - output\_low) + output\_low\]
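
The logic can be illustrated with a minimal pure-Python sketch for a single scalar value. The helper `fake_quantize_ref` is hypothetical and not part of ngraph. It assumes that the normalized input is multiplied by `levels - 1` before rounding and divided by `levels - 1` afterwards, and that Python's built-in `round` (round half to even) is an acceptable stand-in for the rounding mode, which this page does not specify.

```python
def fake_quantize_ref(x, input_low, input_high, output_low, output_high, levels):
    """Scalar reference sketch of fake quantization (hypothetical helper)."""
    # Values at or below the input range saturate to the lowest output value.
    if x <= input_low:
        return output_low
    # Values above the input range saturate to the highest output value.
    if x > input_high:
        return output_high
    # Map x onto the integer grid 0 .. levels-1 inside the input range.
    # Assumption: built-in round() (half to even) models the rounding step.
    step = round((x - input_low) / (input_high - input_low) * (levels - 1))
    # Rescale the grid index to the output range.
    return step / (levels - 1) * (output_high - output_low) + output_low


# Apply the scalar sketch element-wise to a few sample values.
values = [-0.5, 0.0, 0.3, 0.6, 1.0, 1.7]
quantized = [fake_quantize_ref(v, 0.0, 1.0, 0.0, 1.0, levels=5) for v in values]
print(quantized)
```

With `levels=5` and both ranges set to [0, 1], every in-range input snaps to one of the five values 0, 0.25, 0.5, 0.75, 1, while out-of-range inputs saturate to the range boundaries.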