PyTorch code conversion into Keras

DANIELE_ACCONCIA · February 14, 2022, 6:36pm

I’m trying to convert PyTorch code into TensorFlow. What is the equivalent of self.model_t.layer1[-1].register_forward_hook(hook_t) in TensorFlow/Keras?

    def hook_t(module, input, output):
        self.features_t.append(output)
    def hook_s(module, input, output):
        self.features_s.append(output)

    self.model_t = resnet18(pretrained=True).eval()
    for param in self.model_t.parameters():
        param.requires_grad = False

    self.model_t.layer1[-1].register_forward_hook(hook_t)
    self.model_t.layer2[-1].register_forward_hook(hook_t)
    self.model_t.layer3[-1].register_forward_hook(hook_t)

    self.model_s = resnet18(pretrained=False) # default: False
    self.model_s.layer1[-1].register_forward_hook(hook_s)
    self.model_s.layer2[-1].register_forward_hook(hook_s)
    self.model_s.layer3[-1].register_forward_hook(hook_s)

Thanks!
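For readers looking for a starting point: one common Keras counterpart to forward hooks on a functional model is a second model that returns the intermediate activations as extra outputs. The sketch below uses a small toy network, not a ResNet-18, and the layer names (`stage1`, `stage2`, `stage3`, `head`) are hypothetical. Note that a PyTorch hook on `layer1[-1]` captures the output of the whole residual block, so the closest Keras tap is whichever layer closes that block, which is not necessarily its last convolution.

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for a backbone; the layer names are hypothetical.
inputs = tf.keras.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(8, 3, strides=2, padding="same",
                           activation="relu", name="stage1")(inputs)
x = tf.keras.layers.Conv2D(16, 3, strides=2, padding="same",
                           activation="relu", name="stage2")(x)
x = tf.keras.layers.Conv2D(32, 3, strides=2, padding="same",
                           activation="relu", name="stage3")(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10, name="head")(x)
backbone = tf.keras.Model(inputs, outputs)

# Instead of registering hooks, expose the intermediate activations
# as the outputs of a second model that shares the same layers.
tap_names = ["stage1", "stage2", "stage3"]
feature_extractor = tf.keras.Model(
    inputs=backbone.input,
    outputs=[backbone.get_layer(name).output for name in tap_names],
)

images = np.random.rand(2, 32, 32, 3).astype("float32")
features = feature_extractor(images, training=False)
feature_shapes = [tuple(f.shape) for f in features]
print(feature_shapes)
```

Because `feature_extractor` reuses the layers of `backbone`, no weights are copied; the list it returns plays the role of `self.features_t` in the PyTorch snippet.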

Bhack · February 15, 2022, 1:00am

Have you checked:

github.com/tensorflow/tensorflow

Need a way to get Intermediate Layer Inputs/Activations for tf.keras Models

opened 05:01PM - 17 Oct 19 UTC

closed 05:45AM - 05 Apr 20 UTC

n2cholas

stat:awaiting tensorflower type:feature comp:keras TF 2.0

<em>Please make sure that this is a feature request. As per our [GitHub Policy](…https://github.com/tensorflow/tensorflow/blob/master/ISSUES.md), we only address code/doc bugs, performance issues, feature requests and build/installation issues on GitHub. tag:feature_template</em>

**System information**

- TensorFlow version (you are using): 2.0
- Are you willing to contribute it (Yes/No): Yes, if there is a consensus on how it should be designed.

**Describe the feature and the current behavior/state.**

In eager mode, there is no way to access a tf.keras model's layer inputs/outputs during training (as far as I can tell, please correct me if I'm wrong). In TF 1.x (graph mode), this was not a problem, since you could use <s>[`layer.input`](https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/engine/base_layer.py#L1541-L1558) or [`layer.output`](https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/engine/base_layer.py#L1560-L1577)</s> [`layer.inbound_nodes` or `layer.outbound_nodes`](https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/engine/base_layer.py#L1663-L1673) to get these tensors and use those values, but this is no longer possible in eager mode.

PyTorch solves this issue by allowing users to register hooks on layers, which is essentially a function that is called before/after the forward/backward pass on a layer. [Here](https://pytorch.org/docs/stable/_modules/torch/nn/modules/module.html#Module.register_forward_hook) is the code for `register_forward_hook` in PyTorch. Alternatively, the input/output properties of a layer could store a reference to the tensors used in the most recent forward pass.

**Will this change the current api? How?**

Yes, depending on how this is implemented. If a hooks approach is used, this public method would have to be added to `tf.keras.layers.Layer`. If the input/output property approach is used, these properties would have new behavior in eager mode.

**Who will benefit with this feature?**

Being able to access, record, manipulate, or otherwise use layer inputs and outputs for models during training/inference is generally very useful. A specific example is the [K-FAC](https://arxiv.org/abs/1503.05671) optimization algorithm, which uses each layer's inputs and pre-activation gradients to approximate the Fisher information matrix. The [current implementation](https://github.com/tensorflow/kfac) does not support eager. PyTorch implementations (e.g. [this one](https://github.com/alecwangcq/KFAC-Pytorch/blob/master/optimizers/kfac.py#L81-L82)) of this algorithm use hooks to do this.

Another use case is visualizing intermediate activations of CNNs. [This example](https://towardsdatascience.com/visualizing-intermediate-activation-in-convolutional-neural-networks-with-keras-260b36d60d0) uses layer.outputs in TF 1.x + Keras to grab the right tensors then creating an augmented model. This process would be greatly simplified by allowing access to intermediate activations without augmenting the model.

**Any Other info.**

[Here](https://github.com/tensorflow/tensorflow/issues/33129) is a related issue about getting intermediate activations.

**EDIT (2019-10-20):** I learned that `layer.inbound_nodes` and `layer.outbound_nodes` used to have this behavior, not `layer.input` and `layer.output`. `layer.input` and `layer.output` track the tensors that are created when [`model.build(input_shape)`](https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/engine/base_layer.py#L365-L379) is called. When you use the model as a callable on your own input (i.e. `predictions = model(inputs)`), new tensors are created (reusing the model's weights/architecture). In TF 1.x, `inputs` would be added to the `inbound_nodes` list and `predictions` would be added to the `outbound_nodes` list during this call. Now, since in eager mode the model is called on a new `EagerTensor` for every training step, it is not reasonable to add that many new tensors to these lists, so the property was deprecated. Since this feature used to exist, I think it's important for a reasonable replacement to exist in TF2 (such as hooks or a layer property tracking the most recent inputs/outputs).

DANIELE_ACCONCIA · February 15, 2022, 4:41am

Thanks,

I’m currently trying to implement this paper: https://arxiv.org/abs/2103.04257. The PyTorch implementation is pretty straightforward, but I have some issues with TensorFlow. I defined the model in the following way, but I don’t think it is correct: the result is quite different from PyTorch, also in terms of trainable parameters (~11M PyTorch vs ~3M TF).

    import tensorflow as tf
    # Assumed import: Classifiers comes from the image-classifiers package.
    from classification_models.tfkeras import Classifiers
    from tensorflow.keras.optimizers import Adam

    # Feature_Loss is a custom loss function defined elsewhere (not shown in this post).

    def Define_Model(img_shape, num_channel):

        #---------------------------- ResNet-18 instance ---------------------------
        ResNet18, preprocess_input = Classifiers.get('resnet18')
        #--------------------------------------------------------------------------

        #------------------------ Input tensor definition -------------------------
        input_tensor = tf.keras.Input(shape = (img_shape, img_shape, num_channel))
        #--------------------------------------------------------------------------

        #----------------- Teacher and student ResNet definition ------------------
        t_net = ResNet18(weights = 'imagenet', include_top = False, input_tensor = input_tensor, input_shape = (img_shape, img_shape, num_channel))
        s_net = ResNet18(weights = None, include_top = False, input_tensor = input_tensor, input_shape = (img_shape, img_shape, num_channel))
        #--------------------------------------------------------------------------

        #------------------------- Rename network layers --------------------------
        for layer in t_net.layers:
            layer._name = 't_net_' + layer.name

        for layer in s_net.layers:
            layer._name = 's_net_' + layer.name
        #--------------------------------------------------------------------------

        #----------------- Set the teacher network as non-trainable ---------------
        for l in t_net.layers:
            l.trainable = False
        #--------------------------------------------------------------------------

        #----------------- Extract teacher intermediate layers --------------------
        intermediate_t_layer_1 = t_net.get_layer("t_net_stage1_unit2_conv2").output
        intermediate_t_layer_2 = t_net.get_layer("t_net_stage2_unit2_conv2").output
        intermediate_t_layer_3 = t_net.get_layer("t_net_stage3_unit2_conv2").output
        #--------------------------------------------------------------------------

        #----------------- Extract student intermediate layers --------------------
        intermediate_s_layer_1 = s_net.get_layer("s_net_stage1_unit2_conv2").output
        intermediate_s_layer_2 = s_net.get_layer("s_net_stage2_unit2_conv2").output
        intermediate_s_layer_3 = s_net.get_layer("s_net_stage3_unit2_conv2").output
        #--------------------------------------------------------------------------

        #------------------------------ Output ------------------------------------
        out_1 = [intermediate_t_layer_1, intermediate_t_layer_2, intermediate_t_layer_3]
        out_2 = [intermediate_s_layer_1, intermediate_s_layer_2, intermediate_s_layer_3]
        #--------------------------------------------------------------------------

        #------------------------------ Model -------------------------------------
        model = tf.keras.Model(inputs = input_tensor, outputs = [out_1, out_2])
        #--------------------------------------------------------------------------

        #------------------------------ Compile -----------------------------------
        model.add_loss(Feature_Loss(input_tensor, out_1, out_2))
        model.compile(Adam(learning_rate = 0.4), loss = None)
        #--------------------------------------------------------------------------

        return model, t_net, s_net

Daniele

Bhack · February 15, 2022, 10:53am

Have you tried to compare the two models with a model summary, a graph or any other visualization tool?

DANIELE_ACCONCIA · February 15, 2022, 11:15am

I compared the models with a summary. The total parameters of the two models (TF and PyTorch) are substantially equal; it is the trainable parameters that are very different. It seems that the TF model is truncated after the third residual block. Is the TF model definition correct? The loss is a measure of the distance between the teacher and student features. Here is the PyTorch implementation: GitHub - hcw-00/STPM_anomaly_detection: Unofficial pytorch implementation of Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection
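A small check of the truncation effect described here: when a Keras functional model is built with an intermediate tensor as its output, only the layers on the path to that tensor belong to the model, so its parameter count is smaller than that of the full network. The toy dense network and the helper `count_trainable` below are hypothetical and only illustrate the counting, not ResNet-18 itself.

```python
import math
import tensorflow as tf

def count_trainable(model):
    """Number of scalar parameters in the model's trainable weights."""
    return sum(math.prod(tuple(w.shape)) for w in model.trainable_weights)

# Toy network with four "blocks" (hypothetical names).
inputs = tf.keras.Input(shape=(4,))
b1 = tf.keras.layers.Dense(8, name="block1")(inputs)   # 4*8 + 8 = 40 params
b2 = tf.keras.layers.Dense(8, name="block2")(b1)       # 8*8 + 8 = 72 params
b3 = tf.keras.layers.Dense(8, name="block3")(b2)       # 8*8 + 8 = 72 params
b4 = tf.keras.layers.Dense(2, name="block4")(b3)       # 8*2 + 2 = 18 params

full_model = tf.keras.Model(inputs, b4)
# Using block3 as the output drops block4 from the model entirely.
truncated = tf.keras.Model(inputs, b3)

print(count_trainable(full_model))  # 202
print(count_trainable(truncated))   # 184
```

A PyTorch summary of the whole module, by contrast, still lists the later blocks even when only hooked intermediate features feed the loss, which is one possible reason for a gap like the one reported above.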

Bhack · February 15, 2022, 12:41pm

Can you post the Netron graph of the two Networks?

DANIELE_ACCONCIA · February 15, 2022, 9:16pm

Hmm, it seems that it is not possible to upload images in a message.

Bhack · February 15, 2022, 9:32pm

Yes, as you are new to the forum, you need to progress through the Discuss gamification levels to unlock more permissions.

Do you have a link?

DANIELE_ACCONCIA · February 15, 2022, 9:39pm

OK, I have never used the Netron tool. I’ll start by sharing the Keras graph and the PyTorch summary.

Bhack · February 15, 2022, 10:34pm

It is hard to follow the connections in a PyTorch summary, but in the Keras graph I don’t see the intermediate connections between the student and the teacher.

I am not sure what kind of tool you could use in PyTorch to visualize the graph connections:

DANIELE_ACCONCIA · February 16, 2022, 6:51am

OK, I’ll try to visualize the PyTorch graph. I don’t usually use PyTorch.

Thanks for your support

DANIELE_ACCONCIA · February 17, 2022, 9:08am

Should the blocks of the two networks be connected?

Bhack · February 17, 2022, 11:22am

I don’t have the PyTorch graph, but from a quick check of the mentioned PyTorch implementation it seems not. The output feature vectors from the teacher and student models are just used to compute the loss with a for loop.

DANIELE_ACCONCIA · February 17, 2022, 2:11pm

That is exactly what I also understood from the paper. The teacher net isn’t trainable and only provides the features as a reference for the student net. It seems a quite simple model, but I cannot reproduce the results with TF.
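To make the "loss computed with a for loop" concrete, here is one plausible form of a feature-matching loss in TensorFlow. It is a sketch under assumptions, not the paper's or the linked repository's exact code: features are assumed to be channels-last, L2-normalized along the channel axis, and the squared distance is averaged over batch and spatial positions. The function name `feature_matching_loss` is hypothetical.

```python
import tensorflow as tf

def feature_matching_loss(teacher_feats, student_feats):
    """Sum over pyramid levels of the mean squared distance between
    channel-wise L2-normalized teacher and student feature maps."""
    total = 0.0
    for f_t, f_s in zip(teacher_feats, student_feats):
        # The teacher is only a reference: block gradients through it.
        f_t = tf.math.l2_normalize(tf.stop_gradient(f_t), axis=-1)
        f_s = tf.math.l2_normalize(f_s, axis=-1)
        # Squared distance per spatial position, shape (batch, H, W).
        per_position = 0.5 * tf.reduce_sum(tf.square(f_t - f_s), axis=-1)
        total += tf.reduce_mean(per_position)
    return total

# Identical features give zero loss; opposite features give the maximum.
same = feature_matching_loss([tf.ones((1, 2, 2, 4))], [tf.ones((1, 2, 2, 4))])
opposite = feature_matching_loss([tf.ones((1, 2, 2, 4))], [-tf.ones((1, 2, 2, 4))])
print(float(same), float(opposite))
```

Normalizing along the channel axis makes the loss depend on the direction of each feature vector rather than its magnitude, so the per-position term is bounded between 0 and 2.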

Bhack · February 17, 2022, 3:52pm

I’ve not checked the paper details.

Can you try to adapt this tutorial to your specific use case?

DANIELE_ACCONCIA · February 17, 2022, 5:09pm

I looked at the tutorial, but it focuses on logits distillation, which is simpler.

Bhack · February 17, 2022, 8:01pm

As that example has a custom train loop/step, I think that you could customize your loss as you want there.
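A minimal sketch of what such a custom train step could look like with `tf.GradientTape`, assuming two separate toy networks that each return their intermediate feature maps. The helper `make_net`, the layer names, and the plain mean-squared-error loss are all hypothetical placeholders for the real backbones and feature loss.

```python
import numpy as np
import tensorflow as tf

def make_net(name):
    """Toy network returning two intermediate feature maps (hypothetical)."""
    inputs = tf.keras.Input(shape=(16, 16, 3))
    f1 = tf.keras.layers.Conv2D(4, 3, strides=2, padding="same",
                                activation="relu", name=f"{name}_f1")(inputs)
    f2 = tf.keras.layers.Conv2D(8, 3, strides=2, padding="same",
                                activation="relu", name=f"{name}_f2")(f1)
    return tf.keras.Model(inputs, [f1, f2], name=name)

teacher = make_net("teacher")
teacher.trainable = False          # frozen reference network
student = make_net("student")
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

def train_step(images):
    # Teacher features are computed outside the tape: no gradients needed.
    t_feats = teacher(images, training=False)
    with tf.GradientTape() as tape:
        s_feats = student(images, training=True)
        # Placeholder loss: mean squared error summed over feature levels.
        loss = tf.add_n([tf.reduce_mean(tf.square(t - s))
                         for t, s in zip(t_feats, s_feats)])
    grads = tape.gradient(loss, student.trainable_variables)
    optimizer.apply_gradients(zip(grads, student.trainable_variables))
    return loss

images = np.random.rand(4, 16, 16, 3).astype("float32")
teacher_before = [w.numpy().copy() for w in teacher.weights]
student_before = [w.numpy().copy() for w in student.weights]
loss_value = float(train_step(images))
print(loss_value)
```

Keeping the teacher and the student as two separate models with their own inputs avoids layer-name clashes and makes it explicit that only the student's variables are handed to the optimizer.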

DANIELE_ACCONCIA · February 17, 2022, 10:30pm

I’ll try. I have a question: if my output is an intermediate layer, are the network’s trainable parameters only those up to the intermediate layer, or all of the network’s parameters?

Thanks
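One way to explore this question empirically is to check which layers actually receive a gradient when the loss depends only on an intermediate output. The sketch below uses a hypothetical three-layer toy network; layers after the tapped output are still part of the model, but `tf.GradientTape` returns `None` for their weights because the loss does not depend on them.

```python
import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(4,))
h1 = tf.keras.layers.Dense(8, activation="relu", name="block1")(inputs)
h2 = tf.keras.layers.Dense(8, activation="relu", name="block2")(h1)
out = tf.keras.layers.Dense(2, name="block3")(h2)
# The model exposes both the intermediate tap and the final output.
model = tf.keras.Model(inputs, [h1, out])

x = np.random.rand(5, 4).astype("float32")
with tf.GradientTape(persistent=True) as tape:
    tap, _ = model(x, training=True)
    # The loss uses only the intermediate activation.
    loss = tf.reduce_mean(tf.square(tap))

layer_gets_gradient = {}
for layer in model.layers:
    if layer.trainable_weights:
        grads = tape.gradient(loss, layer.trainable_weights)
        layer_gets_gradient[layer.name] = all(g is not None for g in grads)
del tape

print(layer_gets_gradient)
```

So the parameters after the tapped layer may still be counted as trainable in a summary, but they are never updated by a loss built only from the intermediate features.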

Bhack · February 17, 2022, 10:38pm

I think that in your case you have multiple outputs as all the intermediate outputs are accumulated in the loss.

Bhack · February 17, 2022, 10:39pm

Check also: