Normalization layers kill performance in TD3


I implemented TD3, and the agent works in the Pendulum-v1 environment. However, if I use normalization layers in between the network layers, the agent doesn't work at all. What might be the reason?

Here is the code of the call function of the critic:

def call(self,state,parameters):
        state_ = tf.convert_to_tensor(state, dtype=tf.float32)
        parameters_ = tf.convert_to_tensor(parameters, dtype=tf.float32)

        # Concatenate state and parameters along the feature axis.
        state_ = tf.concat([state_,parameters_],1,name = "concatene_State_Parameters")

        if self.use_Skip_Layers:
            # Skip connections: concatenate each hidden layer's output with that layer's input.
            x = state_
            x_ = state_
            for i in range(len(self.network_Layers)-1):
                x = self.network_Layers[i](x)
                x = tf.concat([x,x_],axis=1)
                if self.normalize:
                    # Note: the norm layer is always called in training mode.
                    x = self.norm_Layers[i](x,training=True)
                x_ = x
            q = self.network_Layers[len(self.network_Layers)-1](x)
        else:
            # Plain feed-forward path without skip connections.
            x = state_
            for i in range(len(self.network_Layers)-1):
                x = self.network_Layers[i](x)
                if self.normalize:
                    x = self.norm_Layers[i](x,training=True)
            q = self.network_Layers[len(self.network_Layers)-1](x)

        # Index and value of the largest output per sample.
        max_Action = tf.math.argmax(q,axis=1)
        max_Q_Val = tf.math.reduce_max(q,axis=1)
        return q,max_Action,max_Q_Val

The actor uses the same code, except for a squashing function at the end and NoisyDense layers. Also, can normalization be used if prioritized experience replay is implemented?
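
One thing that may be worth checking is the hard-coded training=True in the normalization calls. The sketch below assumes the norm layers are batch-normalization layers, which the post does not state. It uses a hypothetical pure-Python helper, batch_norm_train, with gamma = 1, beta = 0 and no moving averages; it is not the Keras implementation. It shows that in training mode a batch of size 1 (as might occur when selecting an action for a single state) is normalized to all zeros, regardless of the input. The same would hold for hidden activations.

```python
import math


def batch_norm_train(batch, eps=1e-3):
    """Normalize each feature with the mean/variance of the current batch.

    Hypothetical helper for illustration only (gamma=1, beta=0, no moving stats).
    """
    n = len(batch)
    dims = len(batch[0])
    # Per-feature statistics computed over the batch axis.
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dims)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dims)]
        for row in batch
    ]


# Two very different Pendulum-v1 observations: (cos(theta), sin(theta), angular velocity).
single_a = [[1.0, 0.0, 0.5]]
single_b = [[-1.0, 0.0, -8.0]]

# With a batch of one, the batch mean equals the sample and the variance is 0,
# so both inputs are mapped to the same all-zero output.
out_a = batch_norm_train(single_a)
out_b = batch_norm_train(single_b)
print(out_a)  # [[0.0, 0.0, 0.0]]
print(out_b)  # [[0.0, 0.0, 0.0]]
```

If this assumption matches the setup, passing the training flag through call (training mode only for the replay batch, inference mode for action selection) or using a per-sample normalization such as layer normalization might be worth trying.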