Hi @komo, the difference in results arises because model.fit reports the loss and metrics computed during the forward pass of each batch, before that batch's weight update through back-propagation, and averages them over the epoch, whereas model.evaluate computes the loss and metrics using the weights as they stand after those updates. While using model.fit you can use a custom callback, for example:

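    from tensorflow import keras

    class CustomMonitoring(keras.callbacks.Callback):
        def on_train_batch_end(self, batch, logs=None):
            # Evaluate on the training data with the weights as they are after this batch's update.
            # Unpacking into (loss, acc) assumes the model was compiled with a single accuracy metric.
            loss, acc = self.model.evaluate(train_images, train_labels, batch_size=1024)
            print('For end batch {}, loss is {:7.2f} and acc is {}.'.format(batch, loss, acc))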

This runs model.evaluate at the end of each batch, after the weights have been updated, so that the values from model.fit and model.evaluate can be compared on the same weights. Please refer to this gist for a working code example. Thank you.
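In case it helps to see the effect in isolation, here is a minimal sketch in plain Python rather than Keras. It assumes a toy one-weight model y = w * x trained with plain SGD on mean squared error; the helper names (mse, grad, train_one_epoch) are made up for illustration and are not part of any library. The loss recorded during training is averaged from forward passes made before each update, while the loss computed afterwards uses the final weight, so the two numbers differ.

```python
# Toy illustration only: a one-weight model y = w * x, not Keras code.

def mse(w, xs, ys):
    """Mean squared error of the model y = w * x on the given samples."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def train_one_epoch(w, xs, ys, batch_size=2, lr=0.05):
    """Run one epoch of SGD; return the new weight and the mean of the batch losses."""
    batch_losses = []
    for start in range(0, len(xs), batch_size):
        bx = xs[start:start + batch_size]
        by = ys[start:start + batch_size]
        # The loss is recorded from the forward pass, before this batch's update.
        batch_losses.append(mse(w, bx, by))
        # The weight is then updated from the gradient of that same batch.
        w -= lr * grad(w, bx, by)
    return w, sum(batch_losses) / len(batch_losses)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # generated by y = 2x

w, fit_style_loss = train_one_epoch(0.0, xs, ys)

# Computed afterwards on the full data with the final weight only.
evaluate_style_loss = mse(w, xs, ys)

print("loss averaged during training:", fit_style_loss)
print("loss with the final weight:   ", evaluate_style_loss)
```

The first number is larger here because it includes losses from the early, poorly fitted weight. In a real Keras model, layers such as dropout or batch normalization behave differently in training and inference, which can widen the gap further.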