Using SAM wrapper gives wrong evaluation loss

I am using Sharpness-Aware Minimization (SAM) by subclassing the model, as shown here:

The problem is that, although SAM appears to dramatically improve model convergence (and, I hope, generalization, though that is still to be determined), the evaluation loss reported at each epoch appears to be incorrect. Here is an example:

Epoch 72/100
1760/1760 [==============================] - ETA: 0s - loss: 0.2741 - accuracy: 0.9187  
Epoch 72: val_accuracy improved from 0.88879 to 0.89052, saving model to /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 5). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
1760/1760 [==============================] - 44s 25ms/step - loss: 0.2741 - accuracy: 0.9187 - val_loss: 50.3961 - val_accuracy: 0.8905
Epoch 73/100
1760/1760 [==============================] - ETA: 0s - loss: 0.2696 - accuracy: 0.9199  
Epoch 73: val_accuracy improved from 0.89052 to 0.89269, saving model to /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 5). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
1760/1760 [==============================] - 44s 25ms/step - loss: 0.2696 - accuracy: 0.9199 - val_loss: 50.4510 - val_accuracy: 0.8927
Epoch 74/100
1758/1760 [============================>.] - ETA: 0s - loss: 0.2653 - accuracy: 0.9211  
Epoch 74: val_accuracy improved from 0.89269 to 0.89468, saving model to /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 5). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
1760/1760 [==============================] - 45s 25ms/step - loss: 0.2653 - accuracy: 0.9211 - val_loss: 50.5057 - val_accuracy: 0.8947
Epoch 75/100
1760/1760 [==============================] - ETA: 0s - loss: 0.2611 - accuracy: 0.9221  
Epoch 75: val_accuracy improved from 0.89468 to 0.89627, saving model to /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 5). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
1760/1760 [==============================] - 44s 25ms/step - loss: 0.2611 - accuracy: 0.9221 - val_loss: 50.5603 - val_accuracy: 0.8963
Epoch 76/100
1758/1760 [============================>.] - ETA: 0s - loss: 0.2570 - accuracy: 0.9232  
Epoch 76: val_accuracy improved from 0.89627 to 0.89790, saving model to /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 5). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
1760/1760 [==============================] - 44s 25ms/step - loss: 0.2569 - accuracy: 0.9232 - val_loss: 50.6153 - val_accuracy: 0.8979
Epoch 77/100
1760/1760 [==============================] - ETA: 0s - loss: 0.2531 - accuracy: 0.9244  
Epoch 77: val_accuracy improved from 0.89790 to 0.89966, saving model to /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf
WARNING:absl:Found untraced functions such as _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op, _jit_compiled_convolution_op while saving (showing 5 of 5). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets
INFO:tensorflow:Assets written to: /mnt/c/eaf_llc/aa-analytics_and_bi/alliance_molds/radome_quality/image_analysis/models/2023-01-24_11_10_47_best_model.tf/assets

You can see that the training loss decreases and both accuracies increase, yet the reported val_loss sits around 50 and rises every epoch (from 50.3961 at epoch 72 to 50.6153 at epoch 76).
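One way to narrow this down is to recompute the validation loss by hand from the model's predicted probabilities and compare it with the reported val_loss. The sketch below is only an illustration: the post does not say which loss is used, so it assumes sparse categorical cross-entropy over softmax outputs, and the names `mean_cross_entropy`, `accuracy`, `labels`, and `probs` are hypothetical. The toy data stands in for real validation labels and predictions.

```python
import math


def mean_cross_entropy(labels, probs, eps=1e-7):
    """Mean sparse categorical cross-entropy, computed by hand.

    labels: list of integer class ids.
    probs:  list of per-class probability lists (e.g. softmax outputs).
    eps:    clipping value so log(0) never occurs (an assumed choice).
    """
    if len(labels) != len(probs):
        raise ValueError("labels and probs must have the same length")
    total = 0.0
    for label, row in zip(labels, probs):
        # Clip the probability of the true class before taking the log.
        p = min(max(row[label], eps), 1.0 - eps)
        total += -math.log(p)
    return total / len(labels)


def accuracy(labels, probs):
    """Fraction of samples whose highest-probability class is the label."""
    hits = 0
    for label, row in zip(labels, probs):
        predicted = max(range(len(row)), key=row.__getitem__)
        hits += int(predicted == label)
    return hits / len(labels)


# Toy stand-ins (assumption): with a real model, `probs` would come from
# the model's predictions on the validation set and `labels` from its targets.
labels = [0, 1, 1, 0]
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.7, 0.3]]

manual_loss = mean_cross_entropy(labels, probs)
manual_acc = accuracy(labels, probs)
print(f"manual loss={manual_loss:.4f}, manual accuracy={manual_acc:.4f}")
```

If the hand-computed accuracy matches val_accuracy but the hand-computed loss is far from val_loss, that would suggest the discrepancy lies in how the wrapper reports the loss during evaluation rather than in the model's predictions.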

Hi @Blaine_Bateman

Could you please share more details about the issue, along with reproducible code, so that we can replicate the error and understand it better? Thank you.