How to find the image patch that makes a neuron fire in a CNN?

Given a trained TensorFlow CNN model for image classification into 5 classes, I want to find the neuron in the last convolution layer that has the maximum activation (if several neurons share the highest value, any one of them can be taken). Then I want to trace this neuron back to the original image and find the patch that caused it to fire (and draw a small box around it), as shown in this YouTube video.
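A minimal sketch of the first step, assuming the last convolution layer's output for one image has already been extracted as a NumPy array of shape (height, width, channels). The helper name `strongest_neuron` and the stand-in array `fmap` are hypothetical; in practice `fmap` would come from a forward pass through the model.

```python
import numpy as np

def strongest_neuron(fmap):
    """Return (row, col, channel) of the largest value in an H x W x C map."""
    # np.argmax returns the first flat index when several entries tie,
    # which is fine here because any one of the tied neurons may be taken.
    flat_index = np.argmax(fmap)
    return tuple(int(i) for i in np.unravel_index(flat_index, fmap.shape))

# Stand-in feature map (assumed shape, not real activations).
fmap = np.zeros((22, 22, 16), dtype=np.float32)
fmap[5, 7, 3] = 2.5

row, col, channel = strongest_neuron(fmap)
print(row, col, channel)
```

The spatial position `(row, col)` is what gets traced back to the image; the channel only tells you which filter fired.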

For example, here is the architecture of one of my CNN models:

from tensorflow import keras

USE_BIAS = True

arch1 = keras.Sequential([
    keras.layers.Conv2D(8, 11, strides=4, padding='valid', activation='relu', input_shape=(224, 224, 3), use_bias=USE_BIAS),
    keras.layers.MaxPooling2D(3, strides=2),

    keras.layers.Conv2D(16, 5, strides=1, padding='valid', activation='relu', use_bias=USE_BIAS),
    keras.layers.MaxPooling2D(3, strides=2),

    keras.layers.Flatten(),
    keras.layers.Dense(128, activation='relu', use_bias=USE_BIAS),
    keras.layers.Dense(5, activation='softmax', use_bias=USE_BIAS)
])

After I compile and fit this model, I do one forward pass on one image. I want to find the neuron, say N, in the second Conv2D layer that has the highest output, then trace back to the MaxPooling2D layer before it and find which neurons are connected to N (i.e. which patch N is computed from), then trace that patch back to the first Conv2D layer, and finally to the original image.
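One way to read this trace is as a receptive-field computation: with 'valid' padding, output index i of a layer with kernel size k and stride s covers input indices i*s through i*s + k - 1, so walking the layers backwards gives the pixel box. The sketch below is only that arithmetic, under the assumption that every layer is square and uses 'valid' padding as in the architecture above; `receptive_field` and `LAYERS` are hypothetical names. It returns the full region that can influence N, not the single path through each pooling maximum.

```python
def receptive_field(layers, row, col):
    """Map one output position back to an inclusive pixel box in the input.

    layers: list of (kernel_size, stride) pairs ordered from input to output,
    all assumed square with 'valid' padding.
    Returns (row_start, row_end, col_start, col_end), both ends inclusive.
    """
    r0, r1 = row, row
    c0, c1 = col, col
    # Walk from the chosen layer back towards the image.
    for kernel, stride in reversed(layers):
        r0, r1 = r0 * stride, r1 * stride + kernel - 1
        c0, c1 = c0 * stride, c1 * stride + kernel - 1
    return r0, r1, c0, c1

# (kernel, stride) for Conv2D(8, 11, strides=4), MaxPooling2D(3, strides=2),
# and Conv2D(16, 5, strides=1) from the architecture above.
LAYERS = [(11, 4), (3, 2), (5, 1)]

box = receptive_field(LAYERS, 21, 3)
print(box)
```

For this architecture each neuron in the second Conv2D layer sees a 51 x 51 pixel patch, and neighbouring neurons are 8 pixels apart, so the box for position (row, col) starts at pixel (8 * row, 8 * col).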

How can this be done?

I’ve tried searching online, but all I could find were ways to generate feature maps or to find inputs that maximally activate a layer, and these don’t help me.
I am a beginner in TensorFlow, so I don’t really know where to begin.

(I have asked the same question on Stack Overflow.)

Hi @B20291_Hrishabh_Naya, in TensorFlow you can generate a heat map around the pixels that cause the neurons to fire. You can achieve that using Grad-CAM class activation visualization. Please refer to this tutorial for the implementation. Thank you.

Thanks for replying, but I don’t want class activation maps for my data. What I am looking for is something like this:
image
(the white box around the region of interest)

I have updated my question on stackoverflow.com to show my current best approach, but I am not sure whether it is the optimal way to approach this problem.