Normalization layers kill performance in TD3

Hello,

I implemented TD3, and the agent works in the Pendulum-v1 environment. However, if I use normalization layers between the dense layers, the agent doesn't work at all. What might be the reason?

Here is the code of the critic's call function:

def call(self, state, parameters):
    # Cast the inputs and join state and action parameters into one input vector.
    state_ = tf.convert_to_tensor(state, dtype=tf.float32)
    parameters_ = tf.convert_to_tensor(parameters, dtype=tf.float32)
    state_ = tf.concat([state_, parameters_], 1, name="concatenate_State_Parameters")

    if self.use_Skip_Layers:
        # Skip connections: each hidden layer's output is concatenated with
        # the previous block's output before the optional normalization.
        x = state_
        x_ = state_
        for i in range(len(self.network_Layers) - 1):
            x = self.network_Layers[i](x)
            x = tf.concat([x, x_], axis=1)
            if self.normalize:
                x = self.norm_Layers[i](x, training=True)
            x_ = x
        q = self.network_Layers[-1](x)
    else:
        # Plain feed-forward pass with optional normalization after each hidden layer.
        x = state_
        for i in range(len(self.network_Layers) - 1):
            x = self.network_Layers[i](x)
            if self.normalize:
                x = self.norm_Layers[i](x, training=True)
        q = self.network_Layers[-1](x)

    # Greedy action index and maximum Q-value, reshaped into column vectors.
    max_Action = tf.math.argmax(q, axis=1)
    max_Q_Val = tf.math.reduce_max(q, axis=1)
    max_Action = tf.reshape(max_Action, [max_Action.shape[0], 1])
    max_Q_Val = tf.reshape(max_Q_Val, [max_Q_Val.shape[0], 1])
    return q, max_Action, max_Q_Val
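For context, the attributes referenced in call are built roughly like this in __init__ (a simplified sketch: the layer sizes are placeholders, and LayerNormalization is used here as a stand-in; BatchNormalization would be wired in the same way):

import tensorflow as tf
from tensorflow import keras

class Critic(keras.Model):
    def __init__(self, layer_sizes=(400, 300), normalize=True, use_Skip_Layers=False):
        super().__init__()
        self.normalize = normalize
        self.use_Skip_Layers = use_Skip_Layers
        # hidden layers plus a final single-unit layer producing the Q-value
        self.network_Layers = [keras.layers.Dense(n, activation="relu")
                               for n in layer_sizes]
        self.network_Layers.append(keras.layers.Dense(1))
        # one normalization layer per hidden layer
        self.norm_Layers = [keras.layers.LayerNormalization()
                            for _ in layer_sizes]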

The actor uses the same code, except for a squashing function at the end (a sketch is below) and NoisyDense layers in place of the Dense layers. Also, can normalization be used when prioritized experience replay is implemented?
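For concreteness, the actor's tail looks roughly like this (a sketch assuming the usual TD3 tanh squashing; action_bound is the environment's action limit and does not appear in the critic code above):

# Actor tail (sketch): the final NoisyDense layer's output is squashed to
# [-1, 1] with tanh and scaled to the environment's action range.
mu = self.network_Layers[-1](x)
action = tf.math.tanh(mu) * self.action_bound
return action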