The model misclassifies categories

Did you try out variations of the model implementation you shared? Did you get improvements?
Isn’t your dataset a little small?
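Can I ask why you used sigmoid as the activation function in every layer except the output layer? LeNet implementations generally use tanh (as in the original) or ReLU for the hidden layers; softmax normally appears only at the output layer.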
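
In case it helps explain why I'm asking about the activation: here is a rough sketch (not your model, just the plain math) comparing the largest gradient each activation can pass back. The `depth` value is a made-up number of hidden activation layers, chosen only for illustration.

```python
import math

def sigmoid_grad(x):
    # Derivative of the logistic sigmoid: s(x) * (1 - s(x))
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2
    return 1.0 - math.tanh(x) ** 2

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise
    return 1.0 if x > 0 else 0.0

# Both sigmoid and tanh have their steepest slope at x = 0.
max_sigmoid = sigmoid_grad(0.0)
max_tanh = tanh_grad(0.0)

# Hypothetical depth: number of stacked hidden activations (an assumption).
depth = 4

# Best-case factor the activations alone contribute to the backpropagated
# gradient (weights ignored), i.e. an upper bound per chain of activations.
sigmoid_bound = max_sigmoid ** depth
tanh_bound = max_tanh ** depth

print("sigmoid best case per layer:", max_sigmoid)
print("tanh best case per layer:", max_tanh)
print("sigmoid bound over", depth, "layers:", sigmoid_bound)
print("tanh bound over", depth, "layers:", tanh_bound)
```

Sigmoid's slope never exceeds 0.25, so the gradient can shrink quickly across several layers, while tanh and ReLU can pass a slope of 1. That may or may not be what is hurting your model, but it is a cheap thing to rule out by swapping the hidden activations.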