LSTMCell
Versioned name : LSTMCell-1
Category : Sequence processing
Short description : LSTMCell operation represents a single LSTM cell. It computes the output using the formula described in the original paper Long Short-Term Memory.
Detailed description : LSTMCell computes the hidden state Ht and the cell state Ct for the current time step based on the following formula:
Formula:
\* - matrix multiplication
(.) - Hadamard product (element-wise)
[,] - concatenation
f, g, h - activation functions
it = f(Xt\*(Wi^T) + Ht-1\*(Ri^T) + Wbi + Rbi)
ft = f(Xt\*(Wf^T) + Ht-1\*(Rf^T) + Wbf + Rbf)
ct = g(Xt\*(Wc^T) + Ht-1\*(Rc^T) + Wbc + Rbc)
Ct = ft (.) Ct-1 + it (.) ct
ot = f(Xt\*(Wo^T) + Ht-1\*(Ro^T) + Wbo + Rbo)
Ht = ot (.) h(Ct)
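The formula above can be sketched in NumPy as follows. This is a minimal illustration, not the actual implementation of the operation: the names `lstm_cell_step` and `sigmoid` and all argument names are hypothetical. The sketch assumes that W, R, and B are stacked along their first axis in the fico gate order, that B already holds the sum of the two biases (Wb + Rb), and that clipping is applied only to the gate pre-activations.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here as the default activation f."""
    return 1.0 / (1.0 + np.exp(-x))


def lstm_cell_step(X, H_prev, C_prev, W, R, B=None, clip=np.inf,
                   f=sigmoid, g=np.tanh, h=np.tanh):
    """Compute (Ht, Ct) for one time step.

    X: [batch_size, input_size], H_prev and C_prev: [batch_size, hidden_size],
    W: [4 * hidden_size, input_size], R: [4 * hidden_size, hidden_size],
    B: [4 * hidden_size] or None (treated as zeros).
    """
    hidden_size = H_prev.shape[1]
    if B is None:
        B = np.zeros(4 * hidden_size, dtype=X.dtype)

    # Pre-activations of all four gates at once: [batch_size, 4 * hidden_size].
    Z = X @ W.T + H_prev @ R.T + B
    # Assumption: clipping to [-clip, clip] is applied to the gate
    # pre-activations only; clip=inf leaves the values unchanged.
    Z = np.clip(Z, -clip, clip)
    # Assumption: the gates are laid out along the last axis in "fico" order.
    z_f, z_i, z_c, z_o = np.split(Z, 4, axis=1)

    f_t = f(z_f)                      # forget gate
    i_t = f(z_i)                      # input gate
    c_t = g(z_c)                      # candidate cell state
    o_t = f(z_o)                      # output gate
    C_t = f_t * C_prev + i_t * c_t    # Hadamard products
    H_t = o_t * h(C_t)
    return H_t, C_t


# Usage with illustrative sizes and random data.
rng = np.random.default_rng(0)
batch_size, input_size, hidden_size = 1, 16, 128
X = rng.standard_normal((batch_size, input_size)).astype(np.float32)
H0 = np.zeros((batch_size, hidden_size), dtype=np.float32)
C0 = np.zeros((batch_size, hidden_size), dtype=np.float32)
W = rng.standard_normal((4 * hidden_size, input_size)).astype(np.float32)
R = rng.standard_normal((4 * hidden_size, hidden_size)).astype(np.float32)

H1, C1 = lstm_cell_step(X, H0, C0, W, R)
print(H1.shape, C1.shape)  # (1, 128) (1, 128)
```

Computing all four gates with a single pair of matrix products and then splitting the result mirrors the stacked layout of W and R; computing each gate from its own weight slice would be equivalent.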
Attributes
hidden_size
Description : hidden_size specifies hidden state size.
Range of values : a positive integer
Type : int
Required : yes
activations
Description : activations specifies activation functions for gates. There are three gates, so three activation functions should be specified as the value of this attribute.
Range of values : any combination of relu, sigmoid, tanh
Type : a list of strings
Default value : sigmoid for f, tanh for g, tanh for h
Required : no
activations_alpha, activations_beta
Description : activations_alpha and activations_beta specify parameters of the activation functions. The applicability and meaning of these attributes depend on the chosen activation functions.
Range of values : a list of floating-point numbers
Type : float[]
Default value : None
Required : no
clip
Description : clip specifies bound values [-C, C] for tensor clipping. Clipping is performed before activations.
Range of values : a positive floating-point number
Type : float
Default value : infinity, which means that clipping is not applied
Required : no
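To make the interaction of the activations and clip attributes concrete, here is a small scalar sketch in plain Python. All names (`ACTIVATIONS`, `DEFAULT_ACTIVATIONS`, `resolve_activations`, `apply_gate`) are hypothetical and only illustrate the attribute semantics described above. The sketch assumes the three activation names are given in the order f, g, h, and it ignores activations_alpha and activations_beta.

```python
import math


def relu(x):
    return max(x, 0.0)


def sigmoid(x):
    # Numerically stable form that avoids overflow for large negative x.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# Supported activation names mapped to scalar functions.
ACTIVATIONS = {"relu": relu, "sigmoid": sigmoid, "tanh": math.tanh}
# Defaults from the attribute description: sigmoid for f, tanh for g and h.
DEFAULT_ACTIVATIONS = ("sigmoid", "tanh", "tanh")


def resolve_activations(names=None):
    """Map three activation names (assumed order: f, g, h) to functions."""
    names = DEFAULT_ACTIVATIONS if names is None else tuple(names)
    if len(names) != 3:
        raise ValueError("exactly three activation functions are expected")
    return tuple(ACTIVATIONS[name] for name in names)


def apply_gate(z, activation="sigmoid", clip=math.inf):
    """Clip the pre-activation z to [-clip, clip], then apply the activation."""
    if clip <= 0:
        raise ValueError("clip must be a positive number")
    z = max(-clip, min(clip, z))
    return ACTIVATIONS[activation](z)


f, g, h = resolve_activations()
print(apply_gate(5.0, "relu", clip=2.0))  # 2.0
```

With the default clip of infinity the clamp leaves z unchanged, which matches the statement that clipping is not applied by default.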
Inputs
1 : X - 2D tensor of type T [batch_size, input_size], input data. Required.
2 : initial_hidden_state - 2D tensor of type T [batch_size, hidden_size]. Required.
3 : initial_cell_state - 2D tensor of type T [batch_size, hidden_size]. Required.
4 : W - 2D tensor of type T [4 \* hidden_size, input_size], the weights for matrix multiplication, gate order: fico. Required.
5 : R - 2D tensor of type T [4 \* hidden_size, hidden_size], the recurrence weights for matrix multiplication, gate order: fico. Required.
6 : B - 1D tensor of type T [4 \* hidden_size], the sum of biases (weights and recurrence weights), if not specified - assumed to be 0. Optional.
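The shape constraints listed for the inputs can be checked mechanically. The following plain-Python sketch is illustrative only: the function name `check_lstm_cell_inputs` and the dictionary layout are hypothetical, and it assumes that X is given as a 2D shape from which batch_size and input_size are taken.

```python
def check_lstm_cell_inputs(shapes, hidden_size):
    """Return a list of shape mismatches; an empty list means all shapes agree.

    shapes maps input names to shape tuples. X must be 2D, because
    batch_size and input_size are read from it.
    """
    batch_size, input_size = shapes["X"]
    expected = {
        "X": (batch_size, input_size),
        "initial_hidden_state": (batch_size, hidden_size),
        "initial_cell_state": (batch_size, hidden_size),
        "W": (4 * hidden_size, input_size),
        "R": (4 * hidden_size, hidden_size),
        "B": (4 * hidden_size,),
    }
    errors = []
    for name, want in expected.items():
        if name == "B" and name not in shapes:
            continue  # B is optional and treated as zeros when absent
        got = tuple(shapes.get(name, ()))
        if got != want:
            errors.append(f"{name}: expected {want}, got {got}")
    return errors


# Illustrative sizes: batch_size=1, input_size=16, hidden_size=128.
shapes = {
    "X": (1, 16),
    "initial_hidden_state": (1, 128),
    "initial_cell_state": (1, 128),
    "W": (512, 16),
    "R": (512, 128),
    "B": (512,),
}
print(check_lstm_cell_inputs(shapes, hidden_size=128))  # []
```

The first dimension of W, R, and B is 4 \* hidden_size because the weights of the four gates are stacked into a single tensor.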
Outputs
1 : Ho - 2D tensor of type T [batch_size, hidden_size], the last output value of hidden state.
2 : Co - 2D tensor of type T [batch_size, hidden_size], the last output value of cell state.
Types
T : any supported floating-point type.
Example
<layer ... type="LSTMCell" ...>
    <data hidden_size="128"/>
    <input>
        <port id="0">
            <dim>1</dim>
            <dim>16</dim>
        </port>
        <port id="1">
            <dim>1</dim>
            <dim>128</dim>
        </port>
        <port id="2">
            <dim>1</dim>
            <dim>128</dim>
        </port>
        <port id="3">
            <dim>512</dim>
            <dim>16</dim>
        </port>
        <port id="4">
            <dim>512</dim>
            <dim>128</dim>
        </port>
        <port id="5">
            <dim>512</dim>
        </port>
    </input>
    <output>
        <port id="6">
            <dim>1</dim>
            <dim>128</dim>
        </port>
        <port id="7">
            <dim>1</dim>
            <dim>128</dim>
        </port>
    </output>
</layer>