Preprocessing API - details

Preprocessing capabilities

Addressing particular input/output

If your model has only one input, a plain call to ov::preprocess::PrePostProcessor::input() returns a reference to the preprocessing builder for that input (tensor, steps, model):

ppp.input() // no index/name is needed if model has one input
  .preprocess().scale(50.f);

ppp.output()   // same for output
  .postprocess().convert_element_type(ov::element::u8);
# no index/name is needed if model has one input
ppp.input().preprocess().scale(50.)

# same for output
ppp.output() \
    .postprocess().convert_element_type(Type.u8)

In general, when a model has multiple inputs/outputs, each one can be addressed by its tensor name:

auto &input_image = ppp.input("image");
auto &output_result = ppp.output("result");
ppp.input('image')
ppp.output('result')

Or by its index:

auto &input_1 = ppp.input(1); // Gets 2nd input in a model
auto &output_1 = ppp.output(2); // Gets output with index=2 (3rd one) in a model
ppp.input(1) # Gets 2nd input in a model
ppp.output(2) # Gets output with index=2 (3rd one) in a model

Supported preprocessing operations

Mean/Scale normalization

Typical data normalization includes two operations for each data item: subtracting the mean value and dividing by the standard deviation. This can be done with the following code:

ppp.input("input").preprocess().mean(128).scale(127);
ppp.input('input').preprocess().mean(128).scale(127)

In computer vision, normalization is usually done separately for the R, G, and B values. To do this, a layout with a ‘C’ dimension must be defined. Example:

// Suppose model's shape is {1, 3, 224, 224}
ppp.input("input").model().set_layout("NCHW"); // N=1, C=3, H=224, W=224
// Mean and scale each have 3 values, matching C=3
ppp.input("input").preprocess()
  .mean({103.94, 116.78, 123.68}).scale({57.21, 57.45, 57.73});
# Suppose model's shape is {1, 3, 224, 224}
# N=1, C=3, H=224, W=224
ppp.input('input').model().set_layout(Layout('NCHW'))
# Mean and scale each have 3 values, matching C=3
ppp.input('input').preprocess() \
    .mean([103.94, 116.78, 123.68]).scale([57.21, 57.45, 57.73])
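
To make the per-channel arithmetic concrete, here is a minimal pure-Python sketch of what a mean/scale step computes on a CHW image: `(x - mean[c]) / scale[c]`. The helper name `normalize_chw` and the nested-list representation are assumptions made for illustration; this is not OpenVINO code, and in practice the step is integrated into the execution graph.

```python
def normalize_chw(image, mean, scale):
    """Apply (x - mean[c]) / scale[c] to a CHW image stored as nested lists."""
    if not (len(image) == len(mean) == len(scale)):
        raise ValueError("mean and scale need exactly one value per channel")
    return [
        [[(x - mean[c]) / scale[c] for x in row] for row in channel]
        for c, channel in enumerate(image)
    ]


# Hypothetical 3-channel image with 1x2 pixels per channel
image = [[[103.94, 161.15]], [[116.78, 174.23]], [[123.68, 181.41]]]
mean = [103.94, 116.78, 123.68]
scale = [57.21, 57.45, 57.73]

result = normalize_chw(image, mean, scale)
```

In this sketch, the first pixel of every channel equals the channel mean and maps to 0, while the second pixel sits one scale unit above the mean and maps to approximately 1.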

C++ references:

  • ov::preprocess::PreProcessSteps::mean()

  • ov::preprocess::PreProcessSteps::scale()

Convert precision

In computer vision, an image is typically represented by an array of unsigned 8-bit integer values (one per color channel), but the model accepts floating-point tensors.

To integrate precision conversion into the execution graph as a preprocessing step:

// First define data type for your tensor
ppp.input("input").tensor().set_element_type(ov::element::u8);

// Then define preprocessing step
ppp.input("input").preprocess().convert_element_type(ov::element::f32);

// If conversion to the model's element type is needed, 'f32' can be omitted
ppp.input("input").preprocess().convert_element_type();
# First define data type for your tensor
ppp.input('input').tensor().set_element_type(Type.u8)

# Then define preprocessing step
ppp.input('input').preprocess().convert_element_type(Type.f32)

# If conversion to the model's element type is needed, 'f32' can be omitted
ppp.input('input').preprocess().convert_element_type()

C++ references:

  • ov::preprocess::InputTensorInfo::set_element_type()

  • ov::preprocess::PreProcessSteps::convert_element_type()

Convert layout (transpose)

Transposing matrices/tensors is a typical operation in deep learning: you may have a 640x480 BMP image, which is an array of {480, 640, 3} elements, while the model requires input with shape {1, 3, 480, 640}.

Given the layout of the user’s tensor and the layout of the original model, the conversion can be done implicitly:

// First define layout for your tensor
ppp.input("input").tensor().set_layout("NHWC");

// Then define layout of model
ppp.input("input").model().set_layout("NCHW");

std::cout << ppp; // Will print 'implicit layout conversion step'
# First define layout for your tensor
ppp.input('input').tensor().set_layout(Layout('NHWC'))

# Then define layout of model
ppp.input('input').model().set_layout(Layout('NCHW'))

print(ppp)  # Will print 'implicit layout conversion step'

Or, if you prefer to transpose axes manually without using layouts in your code:

ppp.input("input").tensor().set_shape({1, 480, 640, 3});
// Model expects shape {1, 3, 480, 640}
ppp.input("input").preprocess().convert_layout({0, 3, 1, 2});
// 0 -> 0; 3 -> 1; 1 -> 2; 2 -> 3
ppp.input('input').tensor().set_shape([1, 480, 640, 3])

# Model expects shape {1, 3, 480, 640}
ppp.input('input').preprocess()\
    .convert_layout([0, 3, 1, 2])
# 0 -> 0; 3 -> 1; 1 -> 2; 2 -> 3

This performs the same transpose, but the approach using source and destination layouts is usually easier to read and understand.
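
The relationship between the two approaches can be sketched in plain Python: the axis order passed to convert_layout is the position, in the source layout, of each dimension of the destination layout. The helpers `layout_to_order` and `permute_shape` are hypothetical names used only for this illustration and are not part of the OpenVINO API.

```python
def layout_to_order(src_layout, dst_layout):
    """Return the axis order that transposes src_layout into dst_layout."""
    if sorted(src_layout) != sorted(dst_layout):
        raise ValueError("layouts must contain the same dimension names")
    return [src_layout.index(dim) for dim in dst_layout]


def permute_shape(shape, order):
    """Output axis i takes its size from input axis order[i]."""
    return [shape[axis] for axis in order]


order = layout_to_order("NHWC", "NCHW")
new_shape = permute_shape([1, 480, 640, 3], order)
print(order)      # [0, 3, 1, 2]
print(new_shape)  # [1, 3, 480, 640]
```

The derived order matches the explicit `{0, 3, 1, 2}` used above, which is why both variants produce the same transpose.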

C++ references:

  • ov::preprocess::PreProcessSteps::convert_layout()

  • ov::preprocess::InputTensorInfo::set_layout()

  • ov::preprocess::InputModelInfo::set_layout()

  • ov::Layout

Resize image

Resizing an image is a typical preprocessing step for computer vision tasks. With the preprocessing API, this step can also be integrated into the execution graph and performed on the target device.

To resize the input image, the H and W dimensions of the layout must be defined:

ppp.input("input").tensor().set_shape({1, 3, 960, 1280});
ppp.input("input").model().set_layout("??HW");
ppp.input("input").preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR, 480, 640);
ppp.input('input').tensor().set_shape([1, 3, 960, 1280])
ppp.input('input').model().set_layout(Layout('??HW'))
ppp.input('input').preprocess()\
    .resize(ResizeAlgorithm.RESIZE_LINEAR, 480, 640)

Or, if the original model has known spatial dimensions (width and height), the target width/height can be omitted:

ppp.input("input").tensor().set_shape({1, 3, 960, 1280});
ppp.input("input").model().set_layout("??HW"); // Model accepts {1, 3, 480, 640} shape
// Resize to model's dimension
ppp.input("input").preprocess().resize(ov::preprocess::ResizeAlgorithm::RESIZE_LINEAR);
ppp.input('input').tensor().set_shape([1, 3, 960, 1280])
# Model accepts {1, 3, 480, 640} shape, thus last dimensions are 'H' and 'W'
ppp.input('input').model().set_layout(Layout('??HW'))
# Resize to model's dimension
ppp.input('input').preprocess().resize(ResizeAlgorithm.RESIZE_LINEAR)
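
The role of the layout in resizing can be illustrated with a small pure-Python sketch: the layout string only has to reveal which axes are 'H' and 'W', and the remaining axes ('?') are left untouched. The function `resized_shape` is a hypothetical helper for illustration and does not reflect how OpenVINO implements the step internally.

```python
def resized_shape(tensor_shape, layout, target_h, target_w):
    """Return the tensor shape after resizing its H and W axes."""
    if len(layout) != len(tensor_shape):
        raise ValueError("layout rank must match tensor rank")
    if "H" not in layout or "W" not in layout:
        raise ValueError("layout must define both 'H' and 'W' dimensions")
    out = list(tensor_shape)
    out[layout.index("H")] = target_h
    out[layout.index("W")] = target_w
    return out


# Explicit target size, as in the first example above
explicit = resized_shape([1, 3, 960, 1280], "??HW", 480, 640)

# Target size taken from a model shape with known spatial dimensions
model_shape = [1, 3, 480, 640]
layout = "??HW"
implicit = resized_shape(
    [1, 3, 960, 1280],
    layout,
    model_shape[layout.index("H")],
    model_shape[layout.index("W")],
)
print(explicit, implicit)  # [1, 3, 480, 640] [1, 3, 480, 640]
```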

Color conversion

A typical use case is reversing color channels from RGB to BGR and vice versa. To do this, specify the source color format in the tensor section and perform the convert_color preprocessing operation. In the example below, the user has a BGR image and needs to convert it to RGB, as required by the model’s input:

ppp.input("input").tensor().set_color_format(ov::preprocess::ColorFormat::BGR);
ppp.input("input").preprocess().convert_color(ov::preprocess::ColorFormat::RGB);
ppp.input('input').tensor().set_color_format(ColorFormat.BGR)

ppp.input('input').preprocess().convert_color(ColorFormat.RGB)
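
As a rough illustration of what a BGR-to-RGB conversion does to the data, the sketch below reverses the channel order of interleaved pixels in plain Python. The `bgr_to_rgb` helper is hypothetical and assumes each pixel is a 3-element list; it only shows the effect on the values, not how the step is executed in the graph.

```python
def bgr_to_rgb(pixels):
    """Swap the first and last channel of each interleaved 3-channel pixel."""
    return [[px[2], px[1], px[0]] for px in pixels]


bgr_pixels = [[255, 0, 0], [10, 20, 30]]  # first pixel is pure blue in BGR
rgb_pixels = bgr_to_rgb(bgr_pixels)
print(rgb_pixels)  # [[0, 0, 255], [30, 20, 10]]
```

Because the operation is a plain channel reversal, applying it twice returns the original pixels, so the same logic covers the RGB-to-BGR direction.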

Color conversion - NV12/I420

Preprocessing also supports YUV-family source color formats, i.e. NV12 and I420. In advanced cases, such YUV images can be split into separate planes; e.g. for NV12 images, the Y component may come from one source and the UV component from another. Concatenating such components manually in the user’s application is not ideal from the performance and device utilization perspectives, so the Preprocessing API provides a way to handle this. For such cases there are the NV12_TWO_PLANES and I420_THREE_PLANES source color formats, which split the original input into 2 or 3 inputs:

// This will split the original 'input' into 2 separate inputs: 'input/y' and 'input/uv'
ppp.input("input").tensor().set_color_format(ov::preprocess::ColorFormat::NV12_TWO_PLANES);
ppp.input("input").preprocess().convert_color(ov::preprocess::ColorFormat::RGB);
std::cout << ppp;  // Dump preprocessing steps to see what will happen
# This will split the original 'input' into 2 separate inputs: 'input/y' and 'input/uv'
ppp.input('input').tensor()\
    .set_color_format(ColorFormat.NV12_TWO_PLANES)

ppp.input('input').preprocess()\
    .convert_color(ColorFormat.RGB)
print(ppp)  # Dump preprocessing steps to see what will happen

In this example, the original input is split into the input/y and input/uv inputs. You can fill input/y from one source and input/uv from another. Color conversion to RGB is then performed using these sources, which is more efficient because no additional copies of the NV12 buffers are made.
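
To give a feel for the sizes of the two planes, here is a small sketch based on the general NV12 format: a full-resolution Y plane plus a half-resolution plane of interleaved U and V samples. The `nv12_plane_shapes` helper and the {N, H, W, C}-style shapes it returns are assumptions for illustration; print the preprocessor as shown above to see the shapes actually expected by your model.

```python
def nv12_plane_shapes(height, width, batch=1):
    """Return assumed (Y, UV) plane shapes for an NV12 image of the given size."""
    if height % 2 or width % 2:
        raise ValueError("NV12 requires even height and width")
    y_shape = (batch, height, width, 1)             # one luma sample per pixel
    uv_shape = (batch, height // 2, width // 2, 2)  # one U,V pair per 2x2 block
    return y_shape, uv_shape


def element_count(shape):
    """Multiply all dimensions to get the number of 8-bit elements."""
    count = 1
    for dim in shape:
        count *= dim
    return count


y_shape, uv_shape = nv12_plane_shapes(480, 640)
total = element_count(y_shape) + element_count(uv_shape)
print(y_shape, uv_shape, total)  # (1, 480, 640, 1) (1, 240, 320, 2) 460800
```

The total is 1.5 bytes per pixel, which is the buffer that would otherwise have to be assembled by concatenating the two sources manually.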

Custom operations

The preprocessing API also allows adding custom preprocessing steps to the execution graph. A custom step is a function that accepts the current ‘input’ node and returns a new node after adding the preprocessing step.

Note: A custom preprocessing function shall only insert node(s) after the input; this is done during model compilation. The function will NOT be called during the execution phase. This may seem non-trivial and requires some knowledge of OpenVINO™ operations.

If there is a need to insert additional operations into the execution graph right after the input, such as specific crops and/or resizes, the preprocessing API can be a good way to implement this:

ppp.input("input_image").preprocess()
   .custom([](const ov::Output<ov::Node>& node) {
       // Custom nodes can be inserted as Pre-processing steps
       return std::make_shared<ov::opset8::Abs>(node);
   });
# It is possible to insert some custom operations
import openvino.runtime.opset8 as ops
from openvino.runtime import Output
from openvino.runtime.utils.decorators import custom_preprocess_function

@custom_preprocess_function
def custom_abs(output: Output):
    # Custom nodes can be inserted as Preprocessing steps
    return ops.abs(output)

ppp.input("input_image").preprocess() \
    .custom(custom_abs)
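
The distinction between build time and execution time from the note above can be sketched with a toy graph in plain Python. The `Node` class and the call counter are hypothetical and exist only for this illustration; they are not OpenVINO types. The custom function runs once, when the graph is built, and only the node it inserted participates in execution.

```python
class Node:
    """Toy graph node: applies fn to the result of its parent node."""

    def __init__(self, name, fn, parent=None):
        self.name = name
        self.fn = fn
        self.parent = parent

    def evaluate(self, value):
        if self.parent is not None:
            value = self.parent.evaluate(value)
        return self.fn(value)


build_calls = 0


def custom_abs(node):
    """Custom step: takes the current node, returns a new node appended to it."""
    global build_calls
    build_calls += 1
    return Node("abs", abs, parent=node)


input_node = Node("input", lambda x: x)
graph = custom_abs(input_node)  # build time: the custom function runs once

# Execution time: only the inserted node runs, custom_abs is not called again
outputs = [graph.evaluate(v) for v in (-3, 2, -7)]
print(outputs, build_calls)  # [3, 2, 7] 1
```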

Postprocessing

Postprocessing steps can be added to model outputs. As with preprocessing, these steps are also integrated into the graph and executed on the selected device.

Preprocessing uses the flow: User tensor -> Steps -> Model input

Postprocessing works the other way around: Model output -> Steps -> User tensor

Compared to preprocessing, fewer operations are needed at the postprocessing stage, so currently only the following postprocessing operations are supported:

  • Convert layout

  • Convert element type

  • Custom operations

These operations are used in the same way as in preprocessing. An example is shown below:

// Model's output has 'NCHW' layout
ppp.output("result_image").model().set_layout("NCHW");

// Set target user's tensor to U8 type + 'NHWC' layout
// Precision & layout conversions will be done implicitly
ppp.output("result_image").tensor()
   .set_layout("NHWC")
   .set_element_type(ov::element::u8);

// Also it is possible to insert some custom operations
ppp.output("result_image").postprocess()
   .custom([](const ov::Output<ov::Node>& node) {
       // Custom nodes can be inserted as Post-processing steps
       return std::make_shared<ov::opset8::Abs>(node);
   });
# Model's output has 'NCHW' layout
ppp.output('result_image').model().set_layout(Layout('NCHW'))

# Set target user's tensor to U8 type + 'NHWC' layout
# Precision & layout conversions will be done implicitly
ppp.output('result_image').tensor()\
    .set_layout(Layout("NHWC"))\
    .set_element_type(Type.u8)

# Also it is possible to insert some custom operations
import openvino.runtime.opset8 as ops
from openvino.runtime import Output
from openvino.runtime.utils.decorators import custom_preprocess_function

@custom_preprocess_function
def custom_abs(output: Output):
    # Custom nodes can be inserted as Post-processing steps
    return ops.abs(output)

ppp.output("result_image").postprocess()\
    .custom(custom_abs)
