Average step time differs between Tensorboard and console output

I’m running a Resnet50 benchmark which generates time histories like this:

I0318 08:49:47.869190 139849287907136 keras_utils.py:122] TimeHistory: 34.28 seconds, 11.67 examples/second between steps 0 and 1
I0318 08:49:47.981948 139849287907136 keras_utils.py:122] TimeHistory: 0.11 seconds, 3574.58 examples/second between steps 1 and 2
I0318 08:49:48.088986 139849287907136 keras_utils.py:122] TimeHistory: 0.11 seconds, 3744.50 examples/second between steps 2 and 3
[...]
I0318 08:50:14.385496 139849287907136 keras_utils.py:122] TimeHistory: 0.11 seconds, 3719.92 examples/second between steps 249 and 250

Averaged over 250 steps, this yields 243 ms per step. Obviously this is dominated by the first step.

TensorBoard, on its profile performance summary page, states an average step time of 96.3 ms. Below that, there is σ = 194 ms. I don’t know what this value indicates, but I doubt that it is a standard deviation. From its value I guess that rather it is related to the average as computed above.

I wonder how this difference in the average step time can be explained. I guess that there is some additional action done in the first step, which leads to the large time excess, which might not be measured by . Do you know more about what TensorBoard is measuring in detail?