CTCGreedyDecoderSeqLen
Versioned name : CTCGreedyDecoderSeqLen-6
Category : Sequence processing
Short description : CTCGreedyDecoderSeqLen performs greedy decoding of the logits provided as the first input. The sequence lengths are provided as the second input.
Detailed description :
This operation is similar to the TensorFlow CTCGreedyDecoder.
The operation CTCGreedyDecoderSeqLen implements best path decoding. Decoding is done in two steps:
1. Concatenate the most probable classes per time step, which yields the best path.
2. Remove duplicate consecutive elements if the attribute merge_repeated is true, and then remove all blank elements.
Sequences in the batch can have different lengths. The length of each sequence is given by the second input, the integer tensor sequence_length.
The main difference between CTCGreedyDecoder and CTCGreedyDecoderSeqLen is in the second input. CTCGreedyDecoder uses a 2D floating-point input tensor with sequence masks for each sequence in the batch, while CTCGreedyDecoderSeqLen uses a 1D integer tensor with sequence lengths.
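The two decoding steps and the role of sequence_length can be sketched in plain Python. This is an illustrative reference only, not the actual implementation: the function name `ctc_greedy_decode`, the nested-list representation of the [N, T, C] tensor, and the choice of the first maximum when several classes tie are assumptions made for this example.

```python
def ctc_greedy_decode(data, sequence_length, blank_index=None, merge_repeated=True):
    """Best path decoding of logits given as nested lists of shape [N, T, C]."""
    decoded = []
    for logits, length in zip(data, sequence_length):
        sequence = []
        previous = None
        # Only the first `length` time steps of this batch element are valid.
        for t in range(length):
            row = logits[t]
            # The blank class defaults to the last class, C-1.
            blank = len(row) - 1 if blank_index is None else blank_index
            # Step 1: pick the most probable class (first maximum on ties, an assumption).
            best = max(range(len(row)), key=row.__getitem__)
            # Step 2: blanks are never emitted; repeats are dropped only if merge_repeated.
            is_repeat = merge_repeated and best == previous
            if best != blank and not is_repeat:
                sequence.append(best)
            previous = best
        decoded.append(sequence)
    return decoded


# N=2, T=4, C=3, so the default blank class is 2.
data = [
    [[0.9, 0.1, 0.0], [0.8, 0.1, 0.1], [0.1, 0.2, 0.7], [0.6, 0.3, 0.1]],
    [[0.2, 0.7, 0.1], [0.1, 0.8, 0.1], [0.9, 0.0, 0.1], [0.1, 0.1, 0.8]],
]
sequence_length = [4, 2]

merged = ctc_greedy_decode(data, sequence_length)
unmerged = ctc_greedy_decode(data, sequence_length, merge_repeated=False)
```

In this sketch the second batch element is decoded from its first two time steps only, because its sequence length is 2.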
Attributes
merge_repeated
Description : merge_repeated is a flag for merging repeated labels during the CTC calculation. If the value is false, the sequence ABB*B*B (where '*' is the blank class) is decoded as ABBBB. If the value is true, the sequence is decoded as ABBB.
Range of values : true or false
Type : boolean
Default value : true
Required : no
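The ABB*B*B example can be checked with a short sketch that works directly on a label string. The helper name `collapse` and the use of characters as class labels are assumptions made for illustration; the operation itself works on class indices.

```python
def collapse(path, blank="*", merge_repeated=True):
    """Collapse a best path of labels into the decoded sequence."""
    out = []
    previous = None
    for label in path:
        # A label is dropped if it is blank, or if it repeats the
        # immediately preceding label while merge_repeated is set.
        if label != blank and not (merge_repeated and label == previous):
            out.append(label)
        previous = label
    return "".join(out)


print(collapse("ABB*B*B", merge_repeated=False))  # ABBBB
print(collapse("ABB*B*B", merge_repeated=True))   # ABBB
```

With merge_repeated set, only the adjacent pair BB is merged. The later B labels are separated by blanks, so each of them is emitted.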
classes_index_type
Description : the type of the output tensor with class indices
Range of values : “i64” or “i32”
Type : string
Default value : “i32”
Required : no
sequence_length_type
Description : the type of the output tensor with sequence lengths
Range of values : “i64” or “i32”
Type : string
Default value : “i32”
Required : no
Inputs
1 : data - input tensor of type T_F of shape [N, T, C] with a batch of sequences, where T is the maximum sequence length, N is the batch size and C is the number of classes. Required.
2 : sequence_length - input tensor of type T_I of shape [N] with sequence lengths. Each sequence length must be less than or equal to T. Required.
3 : blank_index - scalar or 1D tensor with 1 element of type T_I. Specifies the class index to use for the blank class. Regardless of the value of the merge_repeated attribute, if the output index for a given batch and time step corresponds to the blank_index, no new element is emitted. Default value is C-1. Optional.
Output
1 : Output tensor of type T_IND1 and shape [N, T], containing the decoded classes. All elements that do not code sequence classes are filled with -1.
2 : Output tensor of type T_IND2 and shape [N], containing the length of the decoded class sequence for each batch element.
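The layout of the two outputs can be illustrated with a minimal sketch. It assumes the decoded sequences are already available as Python lists of class indices; the helper name `pad_decoded` is hypothetical.

```python
def pad_decoded(decoded, max_length, fill_value=-1):
    """Build the [N, T] classes output and the [N] lengths output."""
    # Positions after the end of each decoded sequence are filled with -1.
    classes = [seq + [fill_value] * (max_length - len(seq)) for seq in decoded]
    lengths = [len(seq) for seq in decoded]
    return classes, lengths


# Two decoded sequences, padded to T = 4.
classes, lengths = pad_decoded([[0, 0], [1]], max_length=4)
print(classes)  # [[0, 0, -1, -1], [1, -1, -1, -1]]
print(lengths)  # [2, 1]
```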
Types
T_F : any supported floating-point type.
T_I : int32 or int64.
T_IND1 : int32 or int64, depending on the classes_index_type attribute.
T_IND2 : int32 or int64, depending on the sequence_length_type attribute.
Example
<layer ... type="CTCGreedyDecoderSeqLen" version="opset6">
    <data merge_repeated="true" classes_index_type="i64" sequence_length_type="i64"/>
    <input>
        <port id="0">
            <dim>8</dim>
            <dim>20</dim>
            <dim>128</dim>
        </port>
        <port id="1">
            <dim>8</dim>
        </port>
        <port id="2"/> <!-- blank_index = 120 -->
    </input>
    <output>
        <port id="0" precision="I64">
            <dim>8</dim>
            <dim>20</dim>
        </port>
        <port id="1" precision="I64">
            <dim>8</dim>
        </port>
    </output>
</layer>