Design rules for CNN architectures?

Greetings to the TensorFlow community. I'm new to running NNs and have been amazed by their power, but I find a distinct lack of information in one area of the field.

How is the basic architecture of a CNN decided upon? Given that competitions for the best CNN designs, judged on accuracy, have been running for at least a decade, there should by now be a clear understanding of how the basic structure of an optimal network (i.e. the numbers of convolutional and fully-connected layers, and the number of neurons in each) is linked to the characteristics of the input data.

Many tutorials and books on the subject simply copy an existing NN structure, with little consideration given to what the optimum structure might be in the first place. The approach is 'this one works, so just use it', with little explanation of why there are, e.g., one, two, six or more layers of neurons. This is not conducive to creative thinking about building optimal NN structures, and it suggests the authors simply do not have this information. Clearly you can iterate a rough first design into one with higher accuracy given time, but a more formal set of 'starter' design rules would be very helpful.

It would be really useful to the whole community of NN developers if there were a clear strategy for building a NN for a particular kind of information, be it words, ultrasound, photon counting, human gait, or whatever (the list is almost endless), that offers the highest accuracy as a starting point and can then be refined. Would anyone have a good source of information on this first stage of NN design? Many thanks.

Hi @NZ1, selecting the number of layers and neurons in a CNN is largely an empirical matter, and it often involves a lot of trial and error.

Choosing the number of layers and neurons depends on the complexity of the data. Using more layers and neurons helps the model learn more complex patterns, but it also increases the risk of overfitting and requires more computational resources.
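As a rough illustration, a common first attempt is a small stack of convolution/pooling blocks followed by a single dense layer, deepened or widened only if the model underfits. A minimal Keras sketch (the input shape and class count here are placeholders, not a recommendation):

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_baseline_cnn(input_shape=(64, 64, 3), num_classes=10):
    """A small baseline CNN; grow it only if it underfits."""
    model = tf.keras.Sequential([
        layers.Input(shape=input_shape),
        # Two conv/pool blocks are often enough for a first attempt;
        # add blocks (and channels) only if validation accuracy plateaus low.
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    return model
```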

One way to determine a suitable number of layers and neurons is to monitor the performance of the model on a validation set. If performance on the validation set starts to decrease, it may be a sign that the model is overfitting and that you should reduce the complexity of the network.
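In Keras this monitoring can be automated with an EarlyStopping callback. A minimal sketch, assuming `model`, `x_train`, and `y_train` already exist (the patience value is a placeholder):

```python
import tensorflow as tf

# Stop training when validation loss stops improving, and keep the
# weights from the best epoch seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,                  # placeholder: epochs to wait before stopping
    restore_best_weights=True,
)

model.fit(
    x_train, y_train,
    validation_split=0.2,        # hold out 20% of training data for validation
    epochs=100,
    callbacks=[early_stop],
)
```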

Thank you.

Hi Kiran, many thanks, that's great advice. It confirms the route I am taking, and it's encouraging to hear from others that this is the right thing to do.

Adding extra convolutional layers and fully-connected dense layers, then varying the number of channels and hidden neurons in each, while monitoring the classification accuracy and tweaking learning rates and weight decay, is really time consuming. I wish there were a quicker way to do this.
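To make the scale of the problem concrete, the search I'm describing is essentially a nested loop over candidate settings, something like the sketch below (the grids, input shape, and dataset names are illustrative placeholders rather than my actual setup):

```python
import itertools
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative grids only; a realistic search space is far larger.
conv_blocks = [2, 3]
dense_units = [64, 128]
learning_rates = [1e-3, 1e-4]

results = {}
for n_blocks, units, lr in itertools.product(conv_blocks, dense_units, learning_rates):
    model = tf.keras.Sequential([layers.Input(shape=(64, 64, 3))])
    for i in range(n_blocks):
        model.add(layers.Conv2D(32 * 2**i, 3, activation="relu"))
        model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    model.add(layers.Dense(units, activation="relu"))
    model.add(layers.Dense(10, activation="softmax"))
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # x_train / y_train assumed; every cell of the grid costs a full training run.
    history = model.fit(x_train, y_train, validation_split=0.2,
                        epochs=10, verbose=0)
    results[(n_blocks, units, lr)] = max(history.history["val_accuracy"])
```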

Thank you for the advice.

It depends on the use case and the data used for the model; there is no guaranteed route to a perfect result. You need to become familiar with how the models behave, or sometimes you get lucky and an early iteration is already satisfactory. Using the widely adopted activation functions and optimizers for your use case can make things better, as in the sketch below.
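On the 'quicker way' point: a hyperparameter-search library can automate part of the loop. A minimal sketch using KerasTuner with the common ReLU/Adam defaults (the search ranges, data names, and trial budget are placeholders):

```python
import tensorflow as tf
from tensorflow.keras import layers
import keras_tuner as kt   # pip install keras-tuner

def build_model(hp):
    model = tf.keras.Sequential([layers.Input(shape=(64, 64, 3))])
    # Let the tuner choose the depth and width of the conv stack.
    for i in range(hp.Int("conv_blocks", 1, 3)):
        model.add(layers.Conv2D(hp.Int(f"filters_{i}", 32, 128, step=32),
                                3, activation="relu"))
        model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    model.add(layers.Dense(hp.Int("dense_units", 32, 256, step=32),
                           activation="relu"))
    model.add(layers.Dense(10, activation="softmax"))
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Float("lr", 1e-4, 1e-2, sampling="log")),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

tuner = kt.RandomSearch(build_model, objective="val_accuracy",
                        max_trials=20, overwrite=True)
# x_train / y_train assumed to exist.
tuner.search(x_train, y_train, validation_split=0.2, epochs=10)
best_model = tuner.get_best_models(1)[0]
```

Random search over a modest trial budget is usually a reasonable first pass; KerasTuner's Hyperband or Bayesian optimization tuners can be swapped in later if the search itself becomes the bottleneck.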