CTCGreedyDecoder
Versioned name : CTCGreedyDecoder-1
Category : Sequence processing
Short description : CTCGreedyDecoder performs greedy decoding on the logits given in input (best path).
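Detailed description : Given an input sequence \(X\) of length \(T\), CTCGreedyDecoder assumes the probability of a length \(T\) character sequence \(C\) is given by
\[p(C \mid X) = \prod_{t=1}^{T} p(c_{t} \mid X)\]
where \(c_{t}\) is the character of \(C\) at time step \(t\).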
Sequences in the batch can have different lengths. The lengths of sequences are encoded as values 1 and 0 in the second input tensor sequence_mask. Value sequence_mask[j, i] specifies whether there is a sequence symbol at index j in sequence i in the batch of sequences. If there is no symbol at the j-th position, sequence_mask[j, i] = 0, and sequence_mask[j, i] = 1 otherwise. Starting from j = 0, the values sequence_mask[j, i] are equal to 1 up to the particular index j = last_sequence_symbol, which is defined independently for each sequence i. For j > last_sequence_symbol, the values sequence_mask[j, i] are all zeros.
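The mask layout described above can be illustrated with a minimal sketch. The helper name build_sequence_mask and the use of nested Python lists in place of a tensor are assumptions made for illustration only; they are not part of the operation.

```python
def build_sequence_mask(lengths, max_len):
    """Return a [T, N] mask as nested lists.

    mask[j][i] is 1.0 while position j holds a symbol of sequence i
    (j < lengths[i]) and 0.0 for every later position.
    """
    for length in lengths:
        if not 0 <= length <= max_len:
            raise ValueError("each length must be in [0, max_len]")
    return [[1.0 if j < length else 0.0 for length in lengths]
            for j in range(max_len)]


# Three sequences of lengths 4, 2 and 3, padded to T = 4 time steps.
mask = build_sequence_mask([4, 2, 3], max_len=4)
for row in mask:
    print(row)
# [1.0, 1.0, 1.0]
# [1.0, 1.0, 1.0]
# [1.0, 0.0, 1.0]
# [1.0, 0.0, 0.0]
```

Each column i starts with ones and switches to zeros once, after the last symbol of sequence i.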
Note : Regardless of the value of the ctc_merge_repeated attribute, if the output index for a given batch and time step corresponds to the blank_index, no new element is emitted.
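The effect of the note can be sketched on a single best path, that is, the list of highest-scoring class indices per time step. This follows the usual CTC best-path collapse and is not taken from a reference implementation; the function name collapse_path is hypothetical, and blank_index is passed explicitly because this section does not say which class index is the blank.

```python
def collapse_path(path, blank_index, ctc_merge_repeated=True):
    """Collapse a per-time-step best path into an output label sequence.

    A blank is never emitted. When ctc_merge_repeated is true, a label
    equal to the label of the previous time step is not emitted either.
    """
    output = []
    previous = None
    for label in path:
        is_repeat = ctc_merge_repeated and label == previous
        if label != blank_index and not is_repeat:
            output.append(label)
        previous = label  # a blank also resets the "previous" label
    return output


path = [1, 1, 4, 1, 2, 2, 4, 4]  # class 4 plays the role of the blank here
merged = collapse_path(path, blank_index=4, ctc_merge_repeated=True)
unmerged = collapse_path(path, blank_index=4, ctc_merge_repeated=False)
print(merged)    # [1, 1, 2]
print(unmerged)  # [1, 1, 1, 2, 2]
```

In both cases the blanks disappear. With merging enabled, the two leading 1 labels become one, while the 1 that follows a blank is kept as a separate element.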
Attributes
ctc_merge_repeated
Description : ctc_merge_repeated is a flag for merging repeated labels during the CTC calculation.
Range of values : true or false
Type : boolean
Default value : true
Required : no
Inputs
1 : data - input tensor with a batch of sequences, of type T_F and shape [T, N, C], where T is the maximum sequence length, N is the batch size, and C is the number of classes. Required.
2 : sequence_mask - input tensor with sequence masks for each sequence in the batch, of type T_F, populated with values 0 and 1, and of shape [T, N]. Required.
Output
1 : Output tensor of type T_F and shape [N, T, 1, 1], filled with integer values containing the final sequence class indices. A final sequence can be shorter than the size T of the tensor; all elements that do not encode sequence classes are filled with -1.
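As a rough end-to-end illustration of the inputs and output described above, the following sketch decodes a small batch held in nested Python lists. It is not the operation's actual implementation: the function name ctc_greedy_decode is hypothetical, blank_index is an explicit argument because this section does not state which class is the blank, and stopping at the first zero in the mask relies on the mask layout described earlier.

```python
def ctc_greedy_decode(data, sequence_mask, blank_index, ctc_merge_repeated=True):
    """Greedy (best path) decoding.

    data:          nested lists of shape [T][N][C] with per-class scores
    sequence_mask: nested lists of shape [T][N] holding 1.0 or 0.0
    returns:       nested lists of shape [N][T][1][1], padded with -1.0
    """
    T = len(data)
    N = len(data[0]) if T else 0
    output = [[[[-1.0]] for _ in range(T)] for _ in range(N)]
    for i in range(N):
        previous = None
        out_pos = 0
        for j in range(T):
            if sequence_mask[j][i] == 0:
                break  # past the last symbol of sequence i
            scores = data[j][i]
            best = max(range(len(scores)), key=scores.__getitem__)
            is_repeat = ctc_merge_repeated and best == previous
            if best != blank_index and not is_repeat:
                output[i][out_pos][0][0] = float(best)
                out_pos += 1
            previous = best
    return output


# T = 4 time steps, N = 2 sequences, C = 3 classes; class 2 is assumed blank.
data = [
    [[0.8, 0.1, 0.1], [0.1, 0.7, 0.2]],
    [[0.6, 0.3, 0.1], [0.2, 0.6, 0.2]],
    [[0.1, 0.2, 0.7], [0.9, 0.05, 0.05]],
    [[0.2, 0.7, 0.1], [0.9, 0.05, 0.05]],
]
sequence_mask = [[1.0, 1.0], [1.0, 1.0], [1.0, 0.0], [1.0, 0.0]]

decoded = ctc_greedy_decode(data, sequence_mask, blank_index=2)
flat = [[step[0][0] for step in seq] for seq in decoded]
print(flat)  # [[0.0, 1.0, -1.0, -1.0], [1.0, -1.0, -1.0, -1.0]]
```

The second sequence has only two valid time steps, so the scores at its masked positions never influence the result, and the unused tail of each output row stays at -1.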
Types
T_F : any supported floating-point type.
Example
<layer ... type="CTCGreedyDecoder" ...>
    <data ctc_merge_repeated="true" />
    <input>
        <port id="0">
            <dim>20</dim>
            <dim>8</dim>
            <dim>128</dim>
        </port>
        <port id="1">
            <dim>20</dim>
            <dim>8</dim>
        </port>
    </input>
    <output>
        <port id="0">
            <dim>8</dim>
            <dim>20</dim>
            <dim>1</dim>
            <dim>1</dim>
        </port>
    </output>
</layer>
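The port dimensions in the example follow directly from the shapes given in the Inputs and Output sections. A small sketch of that relationship, using a hypothetical helper infer_output_shape that is not part of any API:

```python
def infer_output_shape(data_shape, mask_shape):
    """Derive the [N, T, 1, 1] output shape from data [T, N, C] and mask [T, N]."""
    if len(data_shape) != 3:
        raise ValueError("data must have shape [T, N, C]")
    T, N, _C = data_shape
    if list(mask_shape) != [T, N]:
        raise ValueError("sequence_mask must have shape [T, N]")
    return [N, T, 1, 1]


# Shapes from the layer example: data is [20, 8, 128], sequence_mask is [20, 8].
output_shape = infer_output_shape([20, 8, 128], [20, 8])
print(output_shape)  # [8, 20, 1, 1]
```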