How to fix Random seed in Gradient tape

Yuki_Nat · November 8, 2021, 3:18am

How can I fix the random seed when calculating gradients with GradientTape?

[Environment]
OS: Windows 10
Tensorflow 2.6.0

[Issue]
I am displaying a Grad-CAM image using this sample code:
https://keras.io/examples/vision/grad_cam/

It uses GradientTape to compute the gradient, and the result is different on every run even though the source image and the predicted logit values are the same.
As a result, the heatmap image is different every time.

I fixed the random seeds as shown below, but that did not make the GradientTape result deterministic.

import os
import random

import numpy as np
import tensorflow as tf

os.environ['TF_DETERMINISTIC_OPS'] = '1'
SEED = 42
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)
os.environ["PYTHONHASHSEED"] = str(SEED)

I calculate the gradient and build the heatmap like this:

def make_gradcam_heatmap(img_array, model, last_conv_layer_name, pred_index=None):
    # grad_model is built as in the Keras example (omitted in this excerpt)
    with tf.GradientTape() as tape:
        # With the seeds fixed, the values in preds are the same on every run
        last_conv_layer_output, preds = grad_model(img_array)
        if pred_index is None:
            pred_index = tf.argmax(preds[0])
        class_channel = preds[:, pred_index]
    # This is the gradient of the output neuron (top predicted or chosen)
    # with regard to the output feature map of the last conv layer
    grads = tape.gradient(class_channel, last_conv_layer_output)

Bhack · November 8, 2021, 4:51pm

Do you have a minimized standalone gist or colab to reproduce this?

Yuki_Nat · November 9, 2021, 3:56am

Thank you for the reply, Bhack.
I'm sorry, but I can't upload the notebook.

I realized that the original Keras example gives the same result every time, even if I change the model from "xception" to "VGG16". My model does not.
My model was trained with transfer learning and is based on VGG16.

So the randomness comes from a difference in the model.
I would therefore like to change my question to "How can I get the same layer outputs every time?"

Detail:
I found that [last_conv_layer_output] is what differs between runs:

def make_gradcam_heatmap(img_array, model, last_conv_layer_name, pred_index=None):
    ............
    last_conv_layer_output, preds = grad_model(img_array)

grad_model returns [last_conv_layer_output] and [preds].
In the Keras example, last_conv_layer_output has the same value every time, but in my model it varies from run to run.

So I concluded that I need to modify my model.
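One way to confirm which output varies is to call the same function several times on the same input and compare the results. The helper below is only a sketch: `outputs_are_repeatable` is a hypothetical name, and the `stable` and `noisy` functions are NumPy stand-ins for a model call such as `lambda a: grad_model(a)[0]`.

```python
import numpy as np


def outputs_are_repeatable(fn, x, runs=3, atol=0.0):
    """Call fn(x) several times; return True if every result matches the first."""
    reference = np.asarray(fn(x))
    for _ in range(runs - 1):
        current = np.asarray(fn(x))
        if not np.allclose(reference, current, rtol=0.0, atol=atol):
            return False
    return True


# Stand-in functions (assumptions for illustration, not part of the thread):
rng = np.random.default_rng(0)
stable = lambda a: a * 2.0                       # same output on every call
noisy = lambda a: a + rng.normal(size=a.shape)   # fresh noise on every call

x = np.ones((1, 4), dtype="float32")
print(outputs_are_repeatable(stable, x))  # True
print(outputs_are_repeatable(noisy, x))   # False
```

Applying the same check to the feature map and to the predictions separately shows which of the two model outputs is changing.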

My model was built with transfer learning. The base is VGG16, and I added some layers on top.
I saved the model this way:
model.save("FruitsSorter.tf")
And I load it this way:

from keras.models import load_model
model2 = load_model("FruitsSorter.tf")

Training and inference run on the same machine.

The summary of model is:

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_2 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
sequential (Sequential)      (None, 224, 224, 3)       0         
_________________________________________________________________
tf.__operators__.getitem_1 ( (None, 224, 224, 3)       0         
_________________________________________________________________
tf.math.add_1 (TFOpLambda)   (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
global_average_pooling2d_2 ( (None, 512)               0         
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 512)               262656    
_________________________________________________________________
dense (Dense)                (None, 5)                 2565      
=================================================================
Total params: 14,979,909
Trainable params: 265,221
Non-trainable params: 14,714,688

Yuki_Nat · November 9, 2021, 7:25am

The model is created as follows:

    IMG_SHAPE=(224,224,3)
    class_names=["apple_braeburn", "apple_golden_delicious",  "apple_topaz",  "peach",  "pear"]
    
    data_augmentation = tf.keras.Sequential([
      tf.keras.layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
      tf.keras.layers.experimental.preprocessing.RandomRotation(0.2),
    ])
    
    preprocess_input = tf.keras.applications.vgg16.preprocess_input
    
    inputs = tf.keras.Input(shape=(IMG_SHAPE))  # input tensor
    inputs=data_augmentation(inputs)
    inputs = preprocess_input(inputs)

    base_model = tf.keras.applications.VGG16(input_shape=IMG_SHAPE,
                                           include_top=False,
                                           weights='imagenet',
                                           classes = len(class_names), input_tensor=inputs)
    base_model.trainable = False

    last_layer=base_model.layers[-1]
    last_layer_outputshape=last_layer.output_shape
    dense_unit_to_set=last_layer_outputshape[-1]

    x2=base_model.output
    x2 = tf.keras.layers.GlobalAveragePooling2D()(x2)
    x2 = tf.keras.layers.Dropout(0.2)(x2)
    x2=tf.keras.layers.Dense(dense_unit_to_set,activation='relu')(x2)

    prediction_layer = tf.keras.layers.Dense(len(class_names),activation="softmax")
    x2 = prediction_layer(x2)

    model = tf.keras.Model(inputs=base_model.input, outputs=x2)
    

    base_learning_rate = 0.0001
    batch_size= BATCH_SIZE
    momentum_usage=0.99
    initial_epochs = 50

    sgd=tf.keras.optimizers.SGD(learning_rate=base_learning_rate,momentum=momentum_usage)

    # Note: the output layer already applies softmax, so from_logits=False would match it.
    model.compile(optimizer=sgd,loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),metrics=['accuracy'])

Bhack · November 9, 2021, 10:42am

How do you get last_conv_layer_output in your model?

Yuki_Nat · November 9, 2021, 12:11pm

I am using the Keras sample code.
I get [last_conv_layer_output] inside make_gradcam_heatmap:

def make_gradcam_heatmap(img_array, model, last_conv_layer_name, pred_index=None):
    # First, we create a model that maps the input image to the activations
    # of the last conv layer as well as the output predictions
    grad_model = tf.keras.models.Model(
        [model.inputs], [model.get_layer(last_conv_layer_name).output, model.output]
    )

    # Then, we compute the gradient of the top predicted class for our input image
    # with respect to the activations of the last conv layer
    with tf.GradientTape() as tape:
        last_conv_layer_output, preds = grad_model(img_array)
        if pred_index is None:
            pred_index = tf.argmax(preds[0])
        class_channel = preds[:, pred_index]

    # This is the gradient of the output neuron (top predicted or chosen)
    # with regard to the output feature map of the last conv layer
    grads = tape.gradient(class_channel, last_conv_layer_output)

I'm sorry, I made some mistakes in my previous post.
I'm using VGG16, but some of the code I posted was for DenseNet. I have corrected the post.
The issue is the same with a VGG16 base and with a DenseNet base.

Bhack · November 9, 2021, 4:55pm

What is the last_conv_layer_name in your network?

Yuki_Nat · November 9, 2021, 9:38pm

The name is block5_conv3.

Bhack · November 9, 2021, 11:06pm

Do you have a github gist to share?

Yuki_Nat · November 10, 2021, 2:28am

Thank you, Bhack!
I prepared a GitHub repository.

TransferLearning.ipynb = builds the model
Apply = runs inference with the new model

I couldn't share the images because they are private.
The image directory has 5 folders, each named after a class, like this:

Image
|-- Fruits
   |---apple_braeburn        
   |---apple_golden_delicious
   |---apple_topaz
   |---peach
   |---pear

All images are 400 (W) x 300 (H) pixels, in PNG format.

Yuki_Nat · November 10, 2021, 2:52am

Bhack,
I solved this problem!
The cause was that I set "input_tensor" when I built the model.
When I don't set input_tensor, I get a deterministic heatmap!
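
A minimal sketch of this fix, under some assumptions. The model summary earlier in the thread shows the `sequential` augmentation block inside the saved model, so it seems likely (the thread does not confirm it) that the random flip and rotation were still being applied when grad_model was called. Here VGG16 is built without input_tensor so that it creates its own input, and augmentation and preprocess_input would be applied to the data outside the model. `build_model` and `NUM_CLASSES` are illustrative names, and `weights=None` is used only so the sketch runs without a download; the original code used weights='imagenet'.

```python
import numpy as np
import tensorflow as tf

IMG_SHAPE = (224, 224, 3)
NUM_CLASSES = 5  # illustrative; the thread's model has 5 fruit classes


def build_model(weights=None):
    """Build a VGG16-based classifier without passing input_tensor."""
    # VGG16 creates its own Input layer here, so no augmentation layers
    # end up inside the model graph.
    base_model = tf.keras.applications.VGG16(
        input_shape=IMG_SHAPE, include_top=False, weights=weights
    )
    base_model.trainable = False

    x = tf.keras.layers.GlobalAveragePooling2D()(base_model.output)
    x = tf.keras.layers.Dropout(0.2)(x)
    x = tf.keras.layers.Dense(512, activation="relu")(x)
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
    return tf.keras.Model(inputs=base_model.input, outputs=outputs)


if __name__ == "__main__":
    model = build_model()
    # Stand-in image batch; preprocess_input is applied outside the model.
    # A copy is passed because preprocess_input may modify NumPy input in place.
    img = np.random.default_rng(0).uniform(0, 255, (1, *IMG_SHAPE)).astype("float32")
    img = tf.keras.applications.vgg16.preprocess_input(img.copy())
    # training=False keeps Dropout (and any random layers) in inference mode.
    first = model(img, training=False).numpy()
    second = model(img, training=False).numpy()
    print(np.allclose(first, second))
```

Passing training=False explicitly when calling the model (or grad_model) is an extra precaution that was not part of the original fix; it makes the inference-mode behavior of Dropout independent of any defaults.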