MaxPooling operation on temporal data - select signal with the highest amplitude

Vigneswaran · January 2, 2024, 11:52am

Is there a clean way to perform a max-pooling operation on temporal data (i.e. the signal with the highest amplitude becomes the output)?

For example,

import tensorflow as tf

# sample four sin signals
a = 2*tf.math.sin(tf.linspace(0, 10, 200))
b = 0.1*tf.math.sin(2*tf.linspace(0, 10, 200))
c = 3*tf.math.sin(0.5*tf.linspace(0, 10, 200))
d = 1*tf.math.sin(5*tf.linspace(0, 10, 200))
# stack the signals
data = tf.stack([a, b, c, d], -1)
# reshape to appropriate timeseries of 2D feature-maps
# (batch_size, sequence length, feature_dim1, feature_dim2, channels)
data = tf.reshape(data, [1, 200, 2, 2, 1])

data will look something like this:

[Image: plot of the signals]

Now, I want to perform something similar to a MaxPooling2D((2,2)) operation on data to get only c (as it has the highest amplitude). Clearly, we cannot use MaxPooling3D or TimeDistributed layers directly, as they perform pooling at each timestep. I tried alternatives using tf.math.reduce_max() and tf.nn.max_pool_with_argmax, but they were not straightforward.

Any suggestions or comments are appreciated. Thanks in advance.

Kiran_Sai_Ramineni · January 4, 2024, 1:22pm

Hi @Vigneswaran, I have gone through your data values, which are obtained after stacking with data = tf.stack([a, b, c, d], -1).
The values look like

[[0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00],
 [1.00460220e-01, 1.00333406e-02, 7.53689538e-02, 2.48620990e-01],
 [2.00666812e-01, 1.99654222e-02, 1.50690330e-01, 4.81629004e-01],
..............

In each inner list, index 0 refers to a, index 1 to b, index 2 to c, and index 3 to d.

If you look at data.numpy()[1], it will be
array([0.10046022, 0.01003334, 0.07536895, 0.24862099]), which holds the values of a, b, c, d at index 1.

If you take the max of these, you get 0.24862099, which is the value of d at index 1. Likewise, there are many timesteps with negative values, and for those the max can come from b. So if you plot these per-timestep max values, the graph will not be similar to the graph of c.
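
To make this concrete, here is a minimal NumPy sketch of the per-timestep max. NumPy stands in for data.numpy() here, the linspace endpoints are written as floats, and the names x, per_step_max and per_step_winner are only illustrative:

```python
import numpy as np

# Same four signals as in the question, built with NumPy
x = np.linspace(0.0, 10.0, 200)
a = 2.0 * np.sin(x)
b = 0.1 * np.sin(2.0 * x)
c = 3.0 * np.sin(0.5 * x)
d = 1.0 * np.sin(5.0 * x)
data = np.stack([a, b, c, d], axis=-1)      # shape (200, 4)

# Max across the four signals, taken separately at every timestep
per_step_max = data.max(axis=-1)            # shape (200,)
# Which signal supplied that max at every timestep (0=a, 1=b, 2=c, 3=d)
per_step_winner = data.argmax(axis=-1)      # shape (200,)

# More than one signal wins over time, so the result is an envelope
# stitched from several signals rather than c itself
print(np.unique(per_step_winner))
print(np.allclose(per_step_max, c))         # False
```

At index 1, for instance, per_step_winner is 3 (d), which matches the 0.24862099 value above.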

If you want the values of the wave with the max amplitude, taking the max at each timestep will not work. One way is to first find which wave has the max amplitude. Once you have stacked and reshaped the array:

import tensorflow as tf

# Sample four sin signals
a = 2 * tf.math.sin(tf.linspace(0, 10, 200))
b = 0.1 * tf.math.sin(2 * tf.linspace(0, 10, 200))
c = 3 * tf.math.sin(0.5 * tf.linspace(0, 10, 200))
d = 1 * tf.math.sin(5 * tf.linspace(0, 10, 200))

# Stack the signals
signal_stack = tf.stack([a, b, c, d], -1)
resize_signal_stack = tf.reshape(signal_stack, [1, 200, 2, 2, 1])
resize_numpy_stack=resize_signal_stack.numpy()

To find the wave with the max amplitude:

import numpy as np
max_value_indices = np.unravel_index(np.argmax(resize_numpy_stack), signal_stack.shape)[1]
#output of max_value_indices will be 2

max_value_indices + 1 tells you which wave has the max amplitude (here the 3rd wave, c).

Now, to get the values of the max-amplitude wave:

new_array = []
for i in range(200):
  # index 2 of the stacked signals lands at position [1][0] after the reshape to (2, 2)
  new_array.append(resize_numpy_stack[0][i][1][0][0])

If you plot this new_array with

import matplotlib.pyplot as plt
plt.plot(new_array, color='green')

it will be the same as the plot of c.

Note: the indexing in new_array.append(resize_numpy_stack[0][i][1][0][0]) depends on the value of max_value_indices. Suppose I define the waves like

a = 2*tf.math.sin(tf.linspace(0, 10, 200))
b = 0.1*tf.math.sin(2*tf.linspace(0, 10, 200))
d = 3*tf.math.sin(0.5*tf.linspace(0, 10, 200))
c = 1*tf.math.sin(5*tf.linspace(0, 10, 200))

Then the indexing will be

new_array = []
for i in range(200):
  # the max-amplitude wave is now at index 3, i.e. position [1][1] after the reshape
  new_array.append(resize_numpy_stack[0][i][1][1][0])
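
The hard-coded position could also be derived instead of typed in by hand. A rough NumPy sketch, assuming "amplitude" means the peak absolute value over time (the code above uses the plain max), with illustrative names fmap, amplitude and winner:

```python
import numpy as np

# Same signals and reshape as above, built with NumPy
x = np.linspace(0.0, 10.0, 200)
signals = np.stack(
    [2.0 * np.sin(x), 0.1 * np.sin(2.0 * x),
     3.0 * np.sin(0.5 * x), 1.0 * np.sin(5.0 * x)],
    axis=-1,
)                                            # shape (200, 4)
fmap = signals.reshape(1, 200, 2, 2, 1)      # (batch, T, rows, cols, channels)

# Assumption: amplitude = peak absolute value over the time axis
amplitude = np.abs(fmap[0]).max(axis=0)      # shape (2, 2, 1)

# Position of the strongest signal inside the (2, 2, 1) feature map
row, col, ch = np.unravel_index(np.argmax(amplitude), amplitude.shape)

# Full time series at that position, no hand-written index needed
winner = fmap[0, :, row, col, ch]            # shape (200,)
print(int(row), int(col), int(ch))           # 1 0 0
```

If the order of the waves changes, row and col change with it, so the loop index no longer has to be edited manually.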

Please refer to this gist for working code. Thank you.

Vigneswaran · January 8, 2024, 9:54am

Hi @Kiran_Sai_Ramineni, thank you for your response. Your code works for a standalone part of the feature map on which pooling can be performed directly. In general, the kernel should traverse the feature map and channels, and the index of the maximum-amplitude signal should be selected in each pooling operation. Here is my implementation with that extension, but it is a bit messy.

def temporal_max_pooler(signal_stack):
  # signal_stack: (bs, T, f, f, d)
  dtype = signal_stack.dtype  # match the input dtype instead of hard-coding float32
  overall_stack = tf.TensorArray(dtype=dtype, size=0, dynamic_size=True)
  for ch in range(signal_stack.shape[-1]):
    # process one channel at a time
    ch_signals = signal_stack[:, :, :, :, ch:ch+1]
    # non-overlapping 2x2 spatial patches at every timestep
    patches = tf.extract_volume_patches(ch_signals, (1, 1, 2, 2, 1),
                                        (1, 1, 2, 2, 1), 'VALID')
    patches = tf.transpose(patches, [0, 1, 4, 2, 3])
    (s0, s1, s2, s3, s4) = patches.shape
    patches = tf.reshape(patches, [s0, s1, s2//2, s2//2, s4*s3])
    ch_stack = tf.TensorArray(dtype=dtype, size=0, dynamic_size=True)
    for p in range(patches.shape[-1]):
      p_signals = tf.reshape(patches[:, :, :, :, p],
                             (patches.shape[0], patches.shape[1], -1))
      # peak value over time for each signal in the patch
      max_amps = tf.math.reduce_max(p_signals, 1)
      where_is_max = tf.math.argmax(max_amps, -1)
      # keep the whole time series of the winning signal
      winners = tf.gather(p_signals, where_is_max, axis=-1, batch_dims=1)
      # TensorArray.write returns the updated array, so keep the result
      ch_stack = ch_stack.write(ch_stack.size(), winners)
    ch_stack = tf.transpose(ch_stack.stack(), [1, 2, 0])
    n_d = tf.math.sqrt(tf.cast(ch_stack.shape[-1], 'float32'))
    n_d = tf.cast(n_d, 'int32')
    ch_stack = tf.reshape(ch_stack, [ch_stack.shape[0], ch_stack.shape[1],
                                     n_d, n_d])
    overall_stack = overall_stack.write(overall_stack.size(), ch_stack)
  overall_stack = overall_stack.stack()
  # back to (bs, T, f_out, f_out, d)
  return tf.transpose(overall_stack, [1, 2, 3, 4, 0])
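
For comparison, here is a possible vectorized sketch of the same idea. It is not a drop-in replacement for the function above. It assumes non-overlapping pool x pool windows, H and W divisible by pool, and statically known T, H, W and C, and it ranks signals by peak absolute value over time (the function above uses the plain max). The name temporal_max_pool2d and its pool argument are made up for this example:

```python
import tensorflow as tf

def temporal_max_pool2d(x, pool=2):
    """Hypothetical helper: x is (batch, T, H, W, C); returns (batch, T, H//pool, W//pool, C)."""
    t, h, w, c = x.shape[1:]
    ho, wo = h // pool, w // pool
    k = pool * pool
    # Split H and W into (window index, offset inside the window)
    x = tf.reshape(x, [-1, t, ho, pool, wo, pool, c])
    # Move both offsets to the end: (batch, T, ho, wo, C, pool, pool)
    x = tf.transpose(x, [0, 1, 2, 4, 6, 3, 5])
    # Flatten the window so each one holds k candidate signals
    x = tf.reshape(x, [-1, t, ho, wo, c, k])
    # Assumption: amplitude = peak absolute value over time
    amp = tf.reduce_max(tf.abs(x), axis=1)        # (batch, ho, wo, C, k)
    winner = tf.argmax(amp, axis=-1)              # (batch, ho, wo, C)
    # One-hot mask keeps only the winning signal in every window
    mask = tf.one_hot(winner, k, dtype=x.dtype)   # (batch, ho, wo, C, k)
    return tf.reduce_sum(x * mask[:, tf.newaxis], axis=-1)

# Usage with the signals from the question (float endpoints give float32)
steps = tf.linspace(0.0, 10.0, 200)
a = 2.0 * tf.math.sin(steps)
b = 0.1 * tf.math.sin(2.0 * steps)
c = 3.0 * tf.math.sin(0.5 * steps)
d = 1.0 * tf.math.sin(5.0 * steps)
data = tf.reshape(tf.stack([a, b, c, d], -1), [1, 200, 2, 2, 1])

pooled = temporal_max_pool2d(data)
print(pooled.shape)  # (1, 200, 1, 1, 1)
```

The one-hot mask avoids the Python loops and the TensorArray bookkeeping at the cost of a multiply over the whole window; tf.gather with batch_dims would be an alternative for the selection step.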