Possible mistakes in image_captioning

Masaki_Murata · August 22, 2022, 5:04am

There are several mistakes in the comments of the image_captioning tutorial.
For example, in the definition of RNN_Decoder, there is a sentence:

shape == (batch_size, max_length, hidden_size)

above the following code:

x = self.fc1(output)

However, it should be

shape == (batch_size, 1, units).

The reasons are as follows.
The second dimension of the shape should be 1, the same as in output.shape.
Because x corresponds to a single word in a sentence, max_length should not appear. In addition, according to

self.fc1 = tf.keras.layers.Dense(self.units),

units is correct rather than hidden_size.
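
As a rough sanity check, the shape argument above can be traced by hand. The sketch below does not call TensorFlow. It only encodes two shape rules that I believe match the Keras layers used in RNN_Decoder: a GRU with return_sequences=True keeps the time axis, and a Dense layer only replaces the last axis. The helper names and the sizes (batch_size, concat_dim, units) are my own illustrative assumptions, not taken from the tutorial.

```python
# Hand-written shape rules mirroring the Keras layers used in RNN_Decoder.
# Helper names and sizes are illustrative assumptions.

def gru_output_shape(input_shape, units):
    """Output shape of a GRU with return_sequences=True: the time axis is kept."""
    batch_size, time_steps, _ = input_shape
    return (batch_size, time_steps, units)


def dense_output_shape(input_shape, units):
    """A Dense layer acts on the last axis only and replaces it with `units`."""
    return input_shape[:-1] + (units,)


batch_size, concat_dim, units = 64, 512, 512

# The decoder receives one token per example, so the time axis is 1.
gru_input_shape = (batch_size, 1, concat_dim)

output_shape = gru_output_shape(gru_input_shape, units)  # shape of `output`
x_shape = dense_output_shape(output_shape, units)        # shape after self.fc1

print(output_shape)  # (64, 1, 512)
print(x_shape)       # (64, 1, 512)
```

Under these assumptions both shapes are (batch_size, 1, units), and max_length never appears. Printing output.shape and x.shape inside the decoder's call method should confirm this on the real model.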