Hello,

I am not very familiar with TF, so I may have misunderstood something in the machinery.

Below, I have just copied and pasted the example from the docs found here

```
# Construct and fit model.
x_ = tfkl.Input(shape=(2,), dtype=tf.float32)
log_prob_ = distribution.log_prob(x_)
model = tfk.Model(x_, log_prob_)
model.compile(optimizer=tf.optimizers.Adam(),
              loss=lambda _, log_prob: -log_prob)
```

I was wondering why the loss in `model.compile` does not use `tf.reduce_mean`, i.e.

```
loss=lambda _, log_prob: -tf.reduce_mean(log_prob)
```
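
To make the comparison concrete, here is a small plain-Python sketch (no TensorFlow involved) of the two variants. It rests on my understanding that Keras averages the per-example values returned by a loss function over the batch; that is an assumption on my side, and the log-prob values and helper names below are made up purely for illustration.

```python
# Hypothetical per-example log-probabilities for a batch of four samples.
log_probs = [-1.2, -0.7, -2.1, -0.4]


def per_example_loss(log_prob_batch):
    """Mimics `lambda _, log_prob: -log_prob`: one loss value per example."""
    return [-lp for lp in log_prob_batch]


def mean_reduced_loss(log_prob_batch):
    """Mimics `lambda _, log_prob: -tf.reduce_mean(log_prob)`: one scalar."""
    return -sum(log_prob_batch) / len(log_prob_batch)


def batch_mean(values):
    """Assumed framework step: average the values returned by the loss."""
    return sum(values) / len(values)


# Variant from the doc: per-example losses, averaged afterwards (assumption).
loss_a = batch_mean(per_example_loss(log_probs))

# Variant I had in mind: the mean is taken inside the loss itself.
loss_b = mean_reduced_loss(log_probs)

print(loss_a, loss_b)
```

If that assumption holds, both variants give the same batch loss and the explicit `tf.reduce_mean` would be redundant.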

Am I wrong? Thanks