I was exploring a simple character-level RNN and decided to construct it as shown below. I don't quite understand the difference between `LSTM` and `LSTMCell`; there is a comment in the code stating that CuDNN is not used in the latter case.
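For context, my understanding so far (from the Keras RNN guide) is that `keras.layers.LSTM` can use the fused CuDNN kernel, while wrapping a `keras.layers.LSTMCell` in `keras.layers.RNN` computes the same recurrence step by step without it. A small sketch of what I mean (sizes are made up):

```python
import tensorflow as tf
from tensorflow import keras

# Same recurrence, two constructions; only the first can use CuDNN.
fused = keras.layers.LSTM(128, return_sequences=True)
generic = keras.layers.RNN(keras.layers.LSTMCell(128), return_sequences=True)

x = tf.random.uniform((2, 5, 100))  # (batch, timesteps, features)
print(fused(x).shape, generic(x).shape)  # both (2, 5, 128)
```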
My primary problem, though, is with the shapes. The input is just a sample sequence of characters, and I manage to draw the sample as recommended, e.g. "Hell" as the input and "ello" as the target. I am not using tf.data.Dataset, as I want to keep this very simple.
```python
def draw_random_sample(text):
    sample = random_sample(text)
    split_sample = tf.strings.bytes_split(sample)
    list = tf.map_fn(map_fn, split_sample)
    # print(tf.stack(list[:-1]), tf.stack(list[1:]))
    return tf.stack(list[:-1]), tf.stack(list[1:])
```
I checked the output and it is indeed correct.
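To make the pairing concrete, this is the relationship I'm after, written in plain Python (using `string.printable` indices as a stand-in for my actual `map_fn` lookup):

```python
import string

def make_pair(sample):
    # Each character becomes its index in string.printable
    # (a stand-in for the real vocabulary lookup).
    ids = [string.printable.index(c) for c in sample]
    # Input drops the last character, target drops the first,
    # so the model learns to predict the next character.
    return ids[:-1], ids[1:]

x, y = make_pair("Hello")  # x encodes "Hell", y encodes "ello"
```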
The error I get is:
```
ValueError: Exception encountered when calling layer "sequential" (type Sequential).

Input 0 of layer "rnn" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 1)
```
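For what it's worth, my reading of the error is that the RNN layer expects 3-D input of shape (batch, timesteps, features), while my samples arrive as 2-D (batch, timesteps) integer ids, and an `Embedding` layer in front is what adds the feature dimension. A minimal sketch of that shape change, with made-up sizes:

```python
import tensorflow as tf
from tensorflow import keras

emb = keras.layers.Embedding(input_dim=100, output_dim=16)
ids = tf.constant([[1, 2, 3, 4]])  # (batch=1, timesteps=4) -> ndim=2
vecs = emb(ids)                    # (1, 4, 16)             -> ndim=3
```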
I also have a question about batching: once I manage to fix the error, I would like to train with batches rather than single samples.
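To illustrate what I mean by batching, something like this hypothetical helper that groups fixed-length (input, target) pairs into groups of `batch_size` (not code I have written, just the shape I'm after):

```python
def batch_pairs(pairs, batch_size):
    # Group (input, target) pairs into batches; drop the ragged tail.
    for i in range(0, len(pairs) - batch_size + 1, batch_size):
        chunk = pairs[i:i + batch_size]
        xs = [x for x, _ in chunk]
        ys = [y for _, y in chunk]
        yield xs, ys

pairs = [([1, 2], [2, 3]), ([4, 5], [5, 6]), ([7, 8], [8, 9])]
batches = list(batch_pairs(pairs, batch_size=2))  # one full batch of 2
```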
Could you take a look?
```python
EMBEDDING_DIM = 100
HIDDEN_DIM = 128
INPUT_DIM = len(string.printable)
OUTPUT_DIM = len(string.printable)
EPOCHS = 1

# Build the RNN model
def build_model():
    # Wrapping a LSTMCell in a RNN layer will not use CuDNN.
    keras.layers.Embedding(input_dim=INPUT_DIM, output_dim=EMBEDDING_DIM),
    lstm_layer = keras.layers.RNN(
        keras.layers.LSTMCell(HIDDEN_DIM),
        return_sequences=True
    )
    model = keras.models.Sequential(
        [
            lstm_layer,
            keras.layers.Dense(OUTPUT_DIM),
        ]
    )
    return model

loss = tf.losses.SparseCategoricalCrossentropy(from_logits=True)
model = build_model()
model.build((1, 1, INPUT_DIM))
model.compile(optimizer='adam', loss=loss)
print(model.summary())
history = model.fit(draw_random_sample(input), epochs=EPOCHS, verbose=2)
```