Model optimizer state isn't saved

I recently trained a CNN model (tf.keras.models.Model) and saved it with the model.save() function. When I then tried to load it in order to resume training on a different set of data, I received the following warning saying that the optimizer state couldn't be loaded:

WARNING:tensorflow:Error in loading the saved optimizer state. As a result, your model is starting with a freshly initialized optimizer.

I am using the Adam optimizer and initializing it like this:

tf.keras.optimizers.Adam(learning_rate=0.0001)

Do I need to save the optimizer state/weights in a specific manner rather than using model.save()? I can't seem to find any information on it online, so any help with this issue would be appreciated.
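
For reference, a minimal sketch of saving the optimizer state explicitly with tf.train.Checkpoint, in case model.save() alone isn't enough (the checkpoint directory is a placeholder):

# Checkpoint the model and optimizer together so both can be restored on resume
ckpt = tf.train.Checkpoint(model=model, optimizer=model.optimizer)
manager = tf.train.CheckpointManager(ckpt, 'checkpoints', max_to_keep=3)
manager.save()

# Later, before resuming training:
ckpt.restore(manager.latest_checkpoint)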

Hi @Viindo ,

We can't say why this warning is appearing until you provide reproducible code.

In the meantime, I will try running a notebook on my system and check whether the warning appears there as well.

Thanks.

Imports used:

import tensorflow as tf
import pandas as pd
import numpy as np
from transformers import BertTokenizer, TFAutoModel, TFBertModel
from tensorflow.keras.callbacks import EarlyStopping

Creating and compiling the model:

max_length = 100

# Define the CNN model
inputs = tf.keras.Input(shape=(max_length,), dtype=tf.int32)
input_masks = tf.keras.Input(shape=(max_length,), dtype=tf.int32)

# Load the pretrained German BERT encoder and freeze its weights
bert_model = TFAutoModel.from_pretrained('dbmdz/bert-base-german-uncased')
bert_model.trainable = False
bert_output = bert_model(inputs, attention_mask=input_masks)[0]

# Apply a 1D convolution to the BERT embeddings
conv1d = tf.keras.layers.Conv1D(filters=16, kernel_size=3, padding='valid',
                                activation='relu')(bert_output)
# Note: training=True keeps dropout active even at inference time
dropout1 = tf.keras.layers.Dropout(0.2)(conv1d, training=True)
pool1d = tf.keras.layers.MaxPooling1D(pool_size=2)(dropout1)

conv1d2 = tf.keras.layers.Conv1D(filters=32, kernel_size=3, padding='valid',
                                 activation='relu')(pool1d)
dropout2 = tf.keras.layers.Dropout(0.2)(conv1d2, training=True)
pool1d2 = tf.keras.layers.MaxPooling1D(pool_size=2)(dropout2)

conv1d3 = tf.keras.layers.Conv1D(filters=64, kernel_size=3, padding='valid', 
                                 activation='relu')(pool1d2)
dropout3 = tf.keras.layers.Dropout(0.2)(conv1d3, training=True)
pool1d3 = tf.keras.layers.GlobalMaxPooling1D()(dropout3)

# Flatten the pooled features (a no-op here: GlobalMaxPooling1D already returns a 2D tensor)
flatten = tf.keras.layers.Flatten()(pool1d3)

# Apply a dense layer to classify the features
dense = tf.keras.layers.Dense(units=128, activation='relu')(flatten)
dropout4 = tf.keras.layers.Dropout(0.5)(dense, training=True)
outputs = tf.keras.layers.Dense(units=5, activation='softmax')(dropout4)

model = tf.keras.models.Model(inputs=[inputs, input_masks], outputs=outputs)

# Compile the model
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'], run_eagerly=True)

Training the model (using a generator that encodes the inputs for the BERT model):

early_stopping = EarlyStopping(patience=3, mode='min')
history = model.fit(
    bert_encode_generator(train_df['tweet'].values, train_df['label'].values,
                          tokenizer, 32, 100),
    validation_data=bert_encode_generator(validation_df['tweet'].values,
                                          validation_df['label'].values,
                                          tokenizer, 32, 100),
    callbacks=[early_stopping], epochs=100,
    steps_per_epoch=len(train_df) // 32,
    validation_steps=len(validation_df) // 32)
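
Note: bert_encode_generator isn't defined in the thread; below is a minimal sketch of what it might look like (both the implementation and the tokenizer line are assumptions, using the matching pretrained tokenizer):

tokenizer = BertTokenizer.from_pretrained('dbmdz/bert-base-german-uncased')

# Assumed sketch: yields ((input_ids, attention_masks), labels) batches
# indefinitely, since model.fit() is given steps_per_epoch
def bert_encode_generator(texts, labels, tokenizer, batch_size, max_length):
    while True:
        for i in range(0, len(texts), batch_size):
            enc = tokenizer(list(texts[i:i + batch_size]), padding='max_length',
                            truncation=True, max_length=max_length,
                            return_tensors='np')
            yield (enc['input_ids'], enc['attention_mask']), labels[i:i + batch_size]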

Saving the model:

model.save('models/model.h5', save_format='h5')

The warning appears when loading the model:

custom_objects = {'TFBertModel': TFBertModel}
model = tf.keras.models.load_model('models/model.h5', custom_objects=custom_objects)
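
One commonly suggested workaround for this warning with H5 files and models containing custom layers (a sketch, not verified against this exact model) is to save in the TensorFlow SavedModel format instead, which serializes the optimizer state differently:

# Save to a SavedModel directory instead of a single H5 file
model.save('models/model_tf', save_format='tf')
model = tf.keras.models.load_model('models/model_tf', custom_objects=custom_objects)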

Please let me know if you need any other information, and thank you in advance!

@Laxma_Reddy_Patlolla any news? :smile: