Model Accuracy for INT8 and FP32 Precision

The following table shows the absolute accuracy drop, in percentage points, calculated as the difference in accuracy between the FP32 representation of a model and its INT8 representation, measured on four Intel® CPU platforms.
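As a minimal sketch of how such a value is derived (the accuracy numbers below are hypothetical placeholders, not figures from the table), the drop is the difference between the FP32 and INT8 accuracies on the same dataset, expressed in percentage points:

```python
# Minimal sketch of the "absolute accuracy drop" computation.
# The accuracy values below are hypothetical placeholders; in practice they
# would come from a validation run of the FP32 model and of its quantized
# INT8 counterpart on the same dataset.
fp32_accuracy = 0.9231  # e.g. top-1 accuracy of the FP32 model
int8_accuracy = 0.9174  # top-1 accuracy of the same model after INT8 quantization

absolute_drop_pct = abs(fp32_accuracy - int8_accuracy) * 100
print(f"Absolute accuracy drop: {absolute_drop_pct:.2f}%")  # -> 0.57%
```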

| OpenVINO Benchmark Model Name | Dataset | Metric Name | Intel® Core™ i9-10920X CPU @ 3.50 GHz (VNNI) | Intel® Core™ i9-9820X CPU @ 3.30 GHz (AVX512) | Intel® Core™ i7-6700K CPU @ 4.0 GHz (AVX2) | Intel® Core™ i7-1185G7 CPU @ 4.0 GHz (TGL VNNI) |
|---|---|---|---|---|---|---|
| bert-base-cased | SST-2 | accuracy | 0.57 | 0.11 | 0.11 | 0.57 |
| bert-large-uncased-whole-word-masking-squad-0001 | SQUAD | F1 | 0.76 | 0.59 | 0.68 | 0.76 |
| brain-tumor-segmentation-0001-MXNET | BraTS | Dice-index@Mean@Overall Tumor | 0.10 | 0.10 | 0.10 | 0.10 |
| brain-tumor-segmentation-0001-ONNX | BraTS | Dice-index@Mean@Overall Tumor | 0.11 | 0.12 | 0.12 | 0.11 |
| deeplabv3-TF | VOC2012 | mean_iou | 0.03 | 0.42 | 0.42 | 0.03 |
| densenet-121-TF | ImageNet | accuracy@top1 | 0.50 | 0.56 | 0.56 | 0.50 |
| efficientdet-d0-tf | COCO2017 | coco_precision | 0.55 | 0.81 | 0.81 | 0.55 |
| facenet-20180408-102900-TF | LFW_MTCNN | pairwise_accuracy_subsets | 0.05 | 0.12 | 0.12 | 0.05 |
| faster_rcnn_resnet50_coco-TF | COCO2017 | coco_precision | 0.16 | 0.16 | 0.16 | 0.16 |
| googlenet-v3-tf | ImageNet | accuracy@top1 | 0.01 | 0.01 | 0.01 | 0.01 |
| googlenet-v4-tf | ImageNet | accuracy@top1 | 0.09 | 0.06 | 0.06 | 0.09 |
| mask_rcnn_resnet50_atrous_coco-tf | COCO2017 | coco_orig_precision | 0.02 | 0.10 | 0.10 | 0.02 |
| mobilenet-ssd-caffe | VOC2012 | mAP | 0.51 | 0.54 | 0.54 | 0.51 |
| mobilenet-v2-1.0-224-TF | ImageNet | acc@top-1 | 0.35 | 0.79 | 0.79 | 0.35 |
| mobilenet-v2-PYTORCH | ImageNet | acc@top-1 | 0.34 | 0.58 | 0.58 | 0.34 |
| resnet-18-pytorch | ImageNet | acc@top-1 | 0.29 | 0.25 | 0.25 | 0.29 |
| resnet-50-PYTORCH | ImageNet | acc@top-1 | 0.24 | 0.20 | 0.20 | 0.24 |
| resnet-50-TF | ImageNet | acc@top-1 | 0.10 | 0.09 | 0.09 | 0.10 |
| ssd_mobilenet_v1_coco-tf | COCO2017 | coco_precision | 0.23 | 3.06 | 3.06 | 0.17 |
| ssdlite_mobilenet_v2-TF | COCO2017 | coco_precision | 0.09 | 0.44 | 0.44 | 0.09 |
| ssd-resnet34-1200-onnx | COCO2017 | COCO mAP | 0.09 | 0.08 | 0.09 | 0.09 |
| unet-camvid-onnx-0001 | CamVid | mean_iou@mean | 0.33 | 0.33 | 0.33 | 0.33 |
| yolo-v3-tiny-tf | COCO2017 | COCO mAP | 0.05 | 0.08 | 0.08 | 0.05 |
| yolo_v4-TF | COCO2017 | COCO mAP | 0.03 | 0.01 | 0.01 | 0.03 |

The table below shows the throughput speed-up factor (FP16-INT8 vs FP32) gained by switching from the FP32 representation of an OpenVINO™-supported model to its INT8 representation, measured on four Intel® CPU platforms.
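A minimal sketch of how such a speed-up factor is computed: it is the ratio of INT8 to FP32 throughput measured on the same device. The throughput values below are hypothetical placeholders (in practice they would be measured, for example, with OpenVINO's benchmark_app):

```python
# Minimal sketch of the throughput speed-up computation.
# Both throughput values are hypothetical placeholders; in practice they would
# be measured on the same hardware, once for the FP32 model and once for the
# quantized INT8 model.
fp32_throughput = 250.0  # inferences per second, FP32 model
int8_throughput = 875.0  # inferences per second, INT8 model on the same device

speedup = int8_throughput / fp32_throughput
print(f"Throughput speed-up INT8 vs FP32: {speedup:.1f}x")  # -> 3.5x
```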

| OpenVINO Benchmark Model Name | Dataset | Intel® Core™ i7-8700T | Intel® Core™ i7-1185G7 | Intel® Xeon® W-1290P | Intel® Xeon® Platinum 8270 |
|---|---|---|---|---|---|
| bert-base-cased | SST-2 | 1.5 | 3.0 | 1.4 | 2.4 |
| bert-large-uncased-whole-word-masking-squad-0001 | SQUAD | 1.7 | 3.2 | 1.7 | 3.3 |
| brain-tumor-segmentation-0001-MXNET | BraTS | 1.6 | 2.0 | 1.9 | 2.1 |
| brain-tumor-segmentation-0001-ONNX | BraTS | 2.6 | 3.2 | 3.3 | 3.0 |
| deeplabv3-TF | VOC2012 | 1.9 | 3.1 | 3.5 | 3.8 |
| densenet-121-TF | ImageNet | 1.7 | 3.3 | 1.9 | 3.7 |
| efficientdet-d0-tf | COCO2017 | 1.6 | 1.9 | 2.5 | 2.3 |
| facenet-20180408-102900-TF | LFW_MTCNN | 2.1 | 3.5 | 2.4 | 3.4 |
| faster_rcnn_resnet50_coco-TF | COCO2017 | 1.9 | 3.7 | 1.9 | 3.3 |
| googlenet-v3-tf | ImageNet | 1.9 | 3.7 | 2.0 | 4.0 |
| googlenet-v4-tf | ImageNet | 1.9 | 3.7 | 2.0 | 4.2 |
| mask_rcnn_resnet50_atrous_coco-tf | COCO2017 | 1.6 | 3.6 | 1.6 | 2.3 |
| mobilenet-ssd-caffe | VOC2012 | 1.6 | 3.1 | 2.2 | 3.8 |
| mobilenet-v2-1.0-224-TF | ImageNet | 1.5 | 2.4 | 2.1 | 3.3 |
| mobilenet-v2-PYTORCH | ImageNet | 1.5 | 2.4 | 2.1 | 3.4 |
| resnet-18-pytorch | ImageNet | 2.0 | 4.1 | 2.2 | 4.1 |
| resnet-50-PYTORCH | ImageNet | 1.9 | 3.5 | 2.1 | 4.0 |
| resnet-50-TF | ImageNet | 1.9 | 3.5 | 2.0 | 4.0 |
| ssd_mobilenet_v1_coco-tf | COCO2017 | 1.7 | 3.1 | 2.2 | 3.6 |
| ssdlite_mobilenet_v2-TF | COCO2017 | 1.6 | 2.4 | 2.7 | 3.2 |
| ssd-resnet34-1200-onnx | COCO2017 | 1.7 | 4.0 | 1.7 | 3.2 |
| unet-camvid-onnx-0001 | CamVid | 1.6 | 4.6 | 1.6 | 6.2 |
| yolo-v3-tiny-tf | COCO2017 | 1.8 | 3.4 | 2.0 | 3.5 |
| yolo_v4-TF | COCO2017 | 2.3 | 3.4 | 2.4 | 3.1 |