Arm® CPU device¶
Introducing the Arm® CPU Plugin¶
The Arm® CPU plugin enables inference of deep neural networks on Arm® CPUs, using the Arm Compute Library as a backend.
Note
This is a community-level add-on to OpenVINO™. Intel® welcomes community participation in the OpenVINO™ ecosystem; technical questions on community forums and code contributions are welcome. However, this component has not undergone full release validation or qualification from Intel®, and no official support is offered.
The Arm® CPU plugin is not part of the Intel® Distribution of OpenVINO™ toolkit and is not distributed in pre-built form. To use the plugin, build it from source code. The build procedure is described in How to build Arm® CPU plugin.
The set of supported layers is defined in the Operation set specification.
Supported inference data types¶
The Arm® CPU plugin supports the following data types as inference precision of internal primitives:
Floating-point data types:
f32
f16
Quantized data types:
i8
Note
i8 support is experimental.
The Hello Query Device C++ sample can be used to print the supported data types for all detected devices.
Supported features¶
Preprocessing acceleration¶
The Arm® CPU plugin supports the following accelerated preprocessing operations:
Precision conversion:
u8 -> u16, s16, s32
u16 -> u8, u32
s16 -> u8, s32
f16 -> f32
Transposition of tensors with fewer than 5 dimensions
Interpolation of 4D tensors with no padding (pads_begin and pads_end equal to 0)
The Arm® CPU plugin also supports the following preprocessing operations, but they are not accelerated:
Precision conversions not mentioned above
Color conversion:
NV12 to RGB
NV12 to BGR
I420 to RGB
I420 to BGR
See the preprocessing API guide for more details.
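For reference, the standalone sketch below shows what an NV12 to RGB conversion computes, using common BT.601 full-range coefficients. This is an illustration of the color-space math only; the exact coefficients and rounding used by the plugin's preprocessing implementation may differ.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Convert an NV12 image (full-size Y plane followed by a half-resolution
// interleaved UV plane) to packed RGB using BT.601 full-range coefficients.
// width and height must be even, as the NV12 layout requires.
std::vector<uint8_t> nv12_to_rgb(const std::vector<uint8_t>& nv12,
                                 int width, int height) {
    std::vector<uint8_t> rgb(static_cast<size_t>(width) * height * 3);
    const uint8_t* y_plane  = nv12.data();
    const uint8_t* uv_plane = nv12.data() + static_cast<size_t>(width) * height;
    auto clamp8 = [](float v) {
        return static_cast<uint8_t>(std::min(255.0f, std::max(0.0f, v)));
    };
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            float y = y_plane[row * width + col];
            // Each interleaved UV pair covers a 2x2 block of Y samples.
            const uint8_t* uv = uv_plane + (row / 2) * width + (col / 2) * 2;
            float u = uv[0] - 128.0f;
            float v = uv[1] - 128.0f;
            size_t o = (static_cast<size_t>(row) * width + col) * 3;
            rgb[o + 0] = clamp8(y + 1.402f * v);                       // R
            rgb[o + 1] = clamp8(y - 0.344136f * u - 0.714136f * v);    // G
            rgb[o + 2] = clamp8(y + 1.772f * u);                       // B
        }
    }
    return rgb;
}
```

NV12 to BGR differs only in the order the three output channels are written; I420 stores U and V in separate planes rather than interleaved.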
Supported properties¶
The plugin supports the properties listed below.
Read-write properties¶
All parameters must be set before calling ov::Core::compile_model() in order to take effect, or be passed as an additional argument to ov::Core::compile_model().
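Both patterns can be sketched as follows. This is a configuration sketch, assuming the plugin is built and registered as the "CPU" device on an Arm® machine; "model.xml" is a placeholder path, and ov::enable_profiling stands in for any read-write property.

```cpp
#include <openvino/openvino.hpp>

int main() {
    ov::Core core;
    auto model = core.read_model("model.xml");  // placeholder model path

    // Option 1: set the property on the device before compilation.
    core.set_property("CPU", ov::enable_profiling(true));
    auto compiled = core.compile_model(model, "CPU");

    // Option 2: pass the property directly to compile_model().
    auto compiled2 = core.compile_model(model, "CPU",
                                        ov::enable_profiling(true));
    return 0;
}
```

Properties set after compile_model() returns do not affect the already compiled model.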
Read-only properties¶
Known Layer Limitations¶
AvgPool layer is supported via the arm_compute library for 4D input tensors and via the reference implementation in other cases.
BatchToSpace layer supports 4D tensors only, with constant nodes: block_shape with N = 1 and C = 1, and crops_begin and crops_end with zero values.
ConvertLike layer supports the same configurations as Convert.
DepthToSpace layer supports 4D tensors only, with the BLOCKS_FIRST value of the mode attribute.
Equal does not support broadcast for inputs.
Gather layer supports constant scalar or 1D indices and axes only. The layer is supported via the arm_compute library for non-negative indices and via the reference implementation otherwise.
Less does not support broadcast for inputs.
LessEqual does not support broadcast for inputs.
LRN layer supports axes = {1} or axes = {2, 3} only.
MaxPool-1 layer is supported via the arm_compute library for 4D input tensors and via the reference implementation in other cases.
Mod layer is supported for f32 only.
MVN layer is supported via the arm_compute library for 2D inputs with normalize_variance = false and across_channels = false; in other cases the layer is implemented via the runtime reference.
NormalizeL2 layer is supported via the arm_compute library with the MAX value of eps_mode and axes = {2 | 3}; for the ADD value of eps_mode the layer uses DecomposeNormalizeL2Add; in other cases the layer is implemented via the runtime reference.
NotEqual does not support broadcast for inputs.
Pad layer works with pad_mode = {REFLECT | CONSTANT | SYMMETRIC} parameters only.
Round layer is supported via the arm_compute library with the RoundMode::HALF_AWAY_FROM_ZERO value of mode; in other cases the layer is implemented via the runtime reference.
SpaceToBatch layer supports 4D tensors only, with constant nodes: shapes, pads_begin, or pads_end with zero paddings for batch or channels, and shapes with values of one for batch and channels.
SpaceToDepth layer supports 4D tensors only, with the BLOCKS_FIRST value of the mode attribute.
StridedSlice layer is supported via the arm_compute library for tensors with fewer than 5 dimensions and zero values of ellipsis_mask, or zero values of new_axis_mask and shrink_axis_mask; in other cases the layer is implemented via the runtime reference.
FakeQuantize layer is supported via the arm_compute library in Low Precision evaluation mode for suitable models and via the runtime reference otherwise.