This is simplified from a more complex network to ask my question:
Suppose I want a Functional model with the following layers:
- input layer of samples, each of shape 30932x4
- 1d convolution of size 8
- output a single scalar value from a fully connected dense layer
In code, I write:
conv = Conv1D(filters=1, kernel_size=8, activation='relu')
outputs = Dense(1)(conv(inputs))
Which gives me the output:
Layer (type)            Output Shape          Param #
input_1 (InputLayer)    [(None, 30932, 4)]    0
conv1d (Conv1D)         (None, 30925, 1)      33
dense (Dense)           (None, 30925, 1)      2
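For reference, here is a self-contained version that reproduces the summary above (assuming TensorFlow 2.x with the Keras Functional API):

```python
# Minimal reproduction of the model described above (assumes TensorFlow 2.x).
from tensorflow.keras.layers import Input, Conv1D, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(30932, 4))
conv = Conv1D(filters=1, kernel_size=8, activation='relu')
outputs = Dense(1)(conv(inputs))
model = Model(inputs, outputs)

# Conv1D params: kernel_size * in_channels * filters + bias = 8*4*1 + 1 = 33
# Dense params as reported by the summary: 2
model.summary()
```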
33 trainable parameters for my convolution makes sense: the kernel spans 8 positions across 4 input channels (8x4 = 32 weights), plus 1 bias.
Why do I have only 2 parameters for the Dense layer? Shouldn't it have 30926 parameters: 30925 weights, one for each value coming from my convolution layer, plus 1 bias? I also expected an output shape of (None, 1, 1).
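For comparison, the parameter count I expected would correspond to flattening the convolution output before the Dense layer (a hypothetical variant, not my actual code):

```python
# Hypothetical variant: flatten before Dense to get one weight per timestep.
from tensorflow.keras.layers import Input, Conv1D, Dense, Flatten
from tensorflow.keras.models import Model

inputs = Input(shape=(30932, 4))
x = Conv1D(filters=1, kernel_size=8, activation='relu')(inputs)
x = Flatten()(x)               # shape (None, 30925)
outputs = Dense(1)(x)          # 30925 weights + 1 bias = 30926 params
model = Model(inputs, outputs)
```

This variant produces a single scalar per sample, i.e. an output shape of (None, 1).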
This runs very quickly but performs very poorly.