NonMaxSuppression
Versioned name : NonMaxSuppression-5
Category : Sorting and maximization
Short description : NonMaxSuppression performs non maximum suppression of the boxes with predicted scores.
Detailed description : NonMaxSuppression performs the non-maximum suppression algorithm described below:
1. Let B = [b_0,...,b_n] be the list of initial detection boxes, S = [s_0,...,s_n] be the list of corresponding scores.
2. Let D = [] be an initial collection of resulting boxes.
3. If B is empty then go to step 8.
4. Take the box with the highest score. Suppose that it is the box b with the score s.
5. Delete b from B.
6. If the score s is greater than or equal to score_threshold then add b to D, else go to step 8.
7. For each input box b_i from B and the corresponding score s_i, set s_i = s_i * func(IOU(b_i, b)) and go to step 3.
8. Return D, a collection of the corresponding scores S, and the number of elements in D.
Here func(iou) = 1 if iou <= iou_threshold else 0 when soft_nms_sigma == 0, else func(iou) = exp(-0.5 * iou * iou / soft_nms_sigma) if iou <= iou_threshold else 0.
This algorithm is applied independently to each class of each batch element. The total number of output boxes for each class must not exceed max_output_boxes_per_class.
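The steps above can be sketched in plain Python for a single class of a single batch element. This is a minimal illustration and not the reference implementation: the function names `iou`, `func` and `nms_single_class` are hypothetical, boxes are assumed to use the "corner" encoding, and the steps are followed literally, so a box whose score was set to 0 is still selected when score_threshold is 0.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as [y1, x1, y2, x2]."""
    # The corners may be any diagonal pair, so order the coordinates first.
    ay1, ay2 = min(a[0], a[2]), max(a[0], a[2])
    ax1, ax2 = min(a[1], a[3]), max(a[1], a[3])
    by1, by2 = min(b[0], b[2]), max(b[0], b[2])
    bx1, bx2 = min(b[1], b[3]), max(b[1], b[3])
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter = inter_h * inter_w
    union = (ay2 - ay1) * (ax2 - ax1) + (by2 - by1) * (bx2 - bx1) - inter
    return inter / union if union > 0 else 0.0

def func(iou_value, iou_threshold, soft_nms_sigma):
    """Score multiplier as defined in the text above."""
    if iou_value > iou_threshold:
        return 0.0
    if soft_nms_sigma == 0:
        return 1.0
    return math.exp(-0.5 * iou_value * iou_value / soft_nms_sigma)

def nms_single_class(boxes, scores, max_output_boxes_per_class,
                     iou_threshold=0.0, score_threshold=0.0, soft_nms_sigma=0.0):
    remaining = list(range(len(boxes)))  # B, stored as box indices (step 1)
    s = list(scores)                     # working copy of S (step 1)
    selected = []                        # D, as (box_index, score) pairs (step 2)
    # Step 3, plus the per-class output limit.
    while remaining and len(selected) < max_output_boxes_per_class:
        best = max(remaining, key=lambda i: s[i])  # step 4
        remaining.remove(best)                     # step 5
        if s[best] < score_threshold:              # step 6
            break
        selected.append((best, s[best]))
        for i in remaining:                        # step 7
            s[i] *= func(iou(boxes[i], boxes[best]), iou_threshold, soft_nms_sigma)
    return selected, len(selected)                 # step 8

# Usage: box 1 overlaps box 0 heavily and is suppressed; box 2 is disjoint.
boxes = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.9], [2.0, 2.0, 3.0, 3.0]]
scores = [0.9, 0.8, 0.7]
selected, count = nms_single_class(boxes, scores, 10,
                                   iou_threshold=0.5, score_threshold=0.1)
print([index for index, _ in selected], count)  # prints [0, 2] 2
```

Storing B as a list of indices rather than box values makes it straightforward to report box_index in the output triplets.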
Attributes :
box_encoding
Description : box_encoding specifies the format of boxes data encoding.
Range of values : “corner” or “center”
corner - the box data is supplied as [y1, x1, y2, x2] where (y1, x1) and (y2, x2) are the coordinates of any diagonal pair of box corners.
center - the box data is supplied as [x_center, y_center, width, height].
Type : string
Default value : “corner”
Required : no
sort_result_descending
Description : sort_result_descending is a flag that specifies whether to sort selected boxes across batches.
Range of values : true or false
true - sort selected boxes across batches.
false - do not sort selected boxes across batches (boxes are sorted per class).
Type : boolean
Default value : true
Required : no
output_type
Description : the output tensor type
Range of values : “i64” or “i32”
Type : string
Default value : “i64”
Required : no
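To illustrate the two box_encoding values, the sketch below converts a "center" box into the "corner" layout and orders the corners of a "corner" box. The helper names `center_to_corner` and `normalize_corner` are hypothetical and are not part of the operation; the code only restates the layouts given above.

```python
def center_to_corner(box):
    """Convert [x_center, y_center, width, height] to [y1, x1, y2, x2]."""
    x_center, y_center, width, height = box
    half_w, half_h = width / 2, height / 2
    return [y_center - half_h, x_center - half_w,
            y_center + half_h, x_center + half_w]

def normalize_corner(box):
    """Order a corner box so that (y1, x1) is the minimum corner.

    The corner encoding accepts any diagonal pair of corners, so the
    coordinates may arrive in either order.
    """
    y1, x1, y2, x2 = box
    return [min(y1, y2), min(x1, x2), max(y1, y2), max(x1, x2)]

print(center_to_corner([2.0, 3.0, 4.0, 2.0]))   # prints [2.0, 0.0, 4.0, 4.0]
print(normalize_corner([1.0, 1.0, 0.0, 0.0]))   # prints [0.0, 0.0, 1.0, 1.0]
```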
Inputs :
1 : boxes - tensor of type T and shape [num_batches, num_boxes, 4] with box coordinates. Required.
2 : scores - tensor of type T and shape [num_batches, num_classes, num_boxes] with box scores. Required.
3 : max_output_boxes_per_class - scalar or 1D tensor with 1 element of type T_MAX_BOXES specifying maximum number of boxes to be selected per class. Optional with default value 0 meaning select no boxes.
4 : iou_threshold - scalar or 1D tensor with 1 element of type T_THRESHOLDS specifying intersection over union threshold. Optional with default value 0 meaning keep all boxes.
5 : score_threshold - scalar or 1D tensor with 1 element of type T_THRESHOLDS specifying minimum score to consider box for the processing. Optional with default value 0.
6 : soft_nms_sigma - scalar or 1D tensor with 1 element of type T_THRESHOLDS specifying the sigma parameter for Soft-NMS; see Bodla et al. Optional with default value 0.
Outputs :
1 : selected_indices - tensor of type T_IND and shape [number of selected boxes, 3] containing information about selected boxes as triplets [batch_index, class_index, box_index].
2 : selected_scores - tensor of type T_THRESHOLDS and shape [number of selected boxes, 3] containing information about scores for each selected box as triplets [batch_index, class_index, box_score].
3 : valid_outputs - 1D tensor with 1 element of type T_IND representing the total number of selected boxes.
Plugins which do not support dynamic output tensors produce selected_indices and selected_scores tensors of shape [min(num_boxes, max_output_boxes_per_class) * num_batches * num_classes, 3], which is an upper bound for the number of possible selected boxes. Output tensor elements following the actually selected boxes are filled with value -1.
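As a quick check of the static shape described above, the sketch below computes the upper bound on the number of rows and pads a list of selected triplets with -1. The helper names `max_selected_rows` and `pad_rows` are hypothetical; a consumer of the padded tensors would read valid_outputs to know how many leading rows are real.

```python
def max_selected_rows(num_batches, num_classes, num_boxes, max_output_boxes_per_class):
    """Upper bound on the number of selected boxes, per the formula above."""
    return min(num_boxes, max_output_boxes_per_class) * num_batches * num_classes

def pad_rows(rows, total_rows):
    """Append [-1, -1, -1] rows until the list has total_rows entries."""
    padding = [[-1, -1, -1] for _ in range(total_rows - len(rows))]
    return rows + padding

# 3 batches, 5 classes, 100 boxes, at most 10 boxes per class.
total = max_selected_rows(3, 5, 100, 10)
print(total)  # prints 150

selected_indices = [[0, 0, 7], [0, 1, 3]]   # two selected boxes (made-up values)
valid_outputs = len(selected_indices)
padded = pad_rows(selected_indices, total)
print(len(padded), padded[valid_outputs])   # prints 150 [-1, -1, -1]
```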
Types
T : floating-point type.
T_MAX_BOXES : integer type.
T_THRESHOLDS : floating-point type.
T_IND : int64 or int32.
Example
<layer ... type="NonMaxSuppression" ... >
<data box_encoding="corner" sort_result_descending="1" output_type="i64"/>
<input>
<port id="0">
<dim>3</dim>
<dim>100</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>3</dim>
<dim>5</dim>
<dim>100</dim>
</port>
<port id="2"/> <!-- 10 -->
<port id="3"/>
<port id="4"/>
</input>
<output>
<port id="5" precision="I64">
<dim>150</dim> <!-- min(100, 10) * 3 * 5 -->
<dim>3</dim>
</port>
<port id="6" precision="FP32">
<dim>150</dim> <!-- min(100, 10) * 3 * 5 -->
<dim>3</dim>
</port>
<port id="7" precision="I64">
<dim>1</dim>
</port>
</output>
</layer>