MulticlassNonMaxSuppression<a name=”MulticlassNonMaxSuppression”></a>¶
Versioned name : MulticlassNonMaxSuppression-9
Category : Sorting and maximization
Short description : MulticlassNonMaxSuppression performs multi-class non-maximum suppression of the boxes with predicted scores.
Detailed description : MulticlassNonMaxSuppression is a multi-phase operation. It implements non-maximum suppression algorithm as described below:
Let
B = [b_0,...,b_n]
be the list of initial detection boxes,S = [s_0,...,s_N]
be the list of corresponding scores.Let
D = []
be an initial collection of resulting boxes. Letadaptive_threshold = iou_threshold
.If
B
is empty, go to step 9.Take the box with highest score. Suppose that it is the box
b
with the scores
.Delete
b
fromB
.If the score
s
is greater than or equal toscore_threshold
, addb
toD
, else go to step 9.If
nms_eta < 1
andadaptive_threshold > 0.5
, updateadaptive_threshold \*= nms_eta
.For each input box
b_i
fromB
and the corresponding scores_i
, sets_i = 0
wheniou(b, b_i) > adaptive_threshold
, and go to step 3.Return
D
, a collection of the corresponding scoresS
, and the number of elements inD
.
This algorithm is applied independently to each class of each batch element. The operation feeds at most nms_top_k
scoring candidate boxes to this algorithm. The total number of output boxes of each batch element must not exceed keep_top_k
. Boxes of background_class
are skipped and thus eliminated.
Attributes :
sort_result
Description : sort_result specifies the order of output elements.
Range of values :
class
,score
,none
class - sort selected boxes by class id (ascending).
score - sort selected boxes by score (descending).
none - do not guarantee the order.
Type :
string
Default value :
none
Required : no
sort_result_across_batch
Description : sort_result_across_batch is a flag that specifies whenever it is necessary to sort selected boxes across batches or not.
Range of values : true or false
true - sort selected boxes across batches.
false - do not sort selected boxes across batches (boxes are sorted per batch element).
Type : boolean
Default value : false
Required : no
output_type
Description : the tensor type of outputs
selected_indices
andvalid_outputs
.Range of values :
i64
ori32
Type :
string
Default value :
i64
Required : no
iou_threshold
Description : intersection over union threshold.
Range of values : a floating-point number
Type :
float
Default value :
0
Required : no
score_threshold
Description : minimum score to consider box for the processing.
Range of values : a floating-point number
Type :
float
Default value :
0
Required : no
nms_top_k
Description : maximum number of boxes to be selected per class.
Range of values : an integer
Type :
int
Default value :
-1
meaning to keep all boxesRequired : no
keep_top_k
Description : maximum number of boxes to be selected per batch element.
Range of values : an integer
Type :
int
Default value :
-1
meaning to keep all boxesRequired : no
background_class
Description : the background class id.
Range of values : an integer
Type :
int
Default value :
-1
meaning to keep all classes.Required : no
normalized
Description : normalized is a flag that indicates whether
boxes
are normalized or not.Range of values : true or false
true - the box coordinates are normalized.
false - the box coordinates are not normalized.
Type : boolean
Default value : True
Required : no
nms_eta
Description : eta parameter for adaptive NMS.
Range of values : a floating-point number in close range
[0, 1.0]
.Type :
float
Default value :
1.0
Required : no
Inputs :
There are 2 kinds of input formats. The first one is of two inputs. The boxes are shared by all classes.
1 :
boxes
- tensor of type T and shape[num_batches, num_boxes, 4]
with box coordinates. The box coordinates are layout as[xmin, ymin, xmax, ymax]
. Required.2 :
scores
- tensor of type T and shape[num_batches, num_classes, num_boxes]
with box scores. The tensor type should be same withboxes
. Required.
The second format is of three inputs. Each class has its own boxes that are not shared.
1 :
boxes
- tensor of type T and shape[num_classes, num_boxes, 4]
with box coordinates. The box coordinates are layout as[xmin, ymin, xmax, ymax]
. Required.2 :
scores
- tensor of type T and shape[num_classes, num_boxes]
with box scores. The tensor type should be same withboxes
. Required.3 :
roisnum
- tensor of type T_IND and shape[num_batches]
with box numbers in each image.num_batches
is the number of images. Each element in this tensor is the number of boxes for corresponding image. The sum of all elements isnum_boxes
. Required.
Outputs :
1 :
selected_outputs
- tensor of type T which should be same withboxes
and shape[number of selected boxes, 6]
containing the selected boxes with score and class as tuples[class_id, box_score, xmin, ymin, xmax, ymax]
.2 :
selected_indices
- tensor of type T_IND and shape[number of selected boxes, 1]
the selected indices in the flattenedboxes
, which are absolute values cross batches. Therefore possible valid values are in the range[0, num_batches \* num_boxes - 1]
.3 :
selected_num
- 1D tensor of type T_IND and shape[num_batches]
representing the number of selected boxes for each batch element.
When there is no box selected, selected_num
is filled with 0
. selected_outputs
is an empty tensor of shape [0, 6]
, and selected_indices
is an empty tensor of shape [0, 1]
.
Types
T : floating-point type.
T_IND :
int64
orint32
.
Example
<layer ... type="MulticlassNonMaxSuppression" ... >
<data sort_result="score" output_type="i64" sort_result_across_batch="false" iou_threshold="0.2" score_threshold="0.5" nms_top_k="-1" keep_top_k="-1" background_class="-1" normalized="false" nms_eta="0.0"/>
<input>
<port id="0">
<dim>3</dim>
<dim>100</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>3</dim>
<dim>5</dim>
<dim>100</dim>
</port>
</input>
<output>
<port id="5" precision="FP32">
<dim>-1</dim> <!-- "-1" means a undefined dimension calculated during the model inference -->
<dim>6</dim>
</port>
<port id="6" precision="I64">
<dim>-1</dim>
<dim>1</dim>
</port>
<port id="7" precision="I64">
<dim>3</dim>
</port>
</output>
</layer>
Another possible example with 3 inputs could be like:
<layer ... type="MulticlassNonMaxSuppression" ... >
<data sort_result="score" output_type="i64" sort_result_across_batch="false" iou_threshold="0.2" score_threshold="0.5" nms_top_k="-1" keep_top_k="-1" background_class="-1" normalized="false" nms_eta="0.0"/>
<input>
<port id="0">
<dim>3</dim>
<dim>100</dim>
<dim>4</dim>
</port>
<port id="1">
<dim>3</dim>
<dim>100</dim>
</port>
<port id="2">
<dim>10</dim>
</port>
</input>
<output>
<port id="5" precision="FP32">
<dim>-1</dim> <!-- "-1" means a undefined dimension calculated during the model inference -->
<dim>6</dim>
</port>
<port id="6" precision="I64">
<dim>-1</dim>
<dim>1</dim>
</port>
<port id="7" precision="I64">
<dim>3</dim>
</port>
</output>
</layer>