Users of artificial neural networks tend to ask the same questions. How many hidden layers should be used? How many hidden neurons should each hidden layer have? What is the purpose of hidden layers and neurons? Does increasing the number of hidden layers or neurons always give better results? These questions can be answered, although the answers become complex when the problem being solved is complicated. By the end of this article, you should at least understand how they are answered and be able to test yourself on simple examples.
An artificial neural network (ANN) is inspired by the biological neural network. For simplicity, in computer science it is represented as a set of layers. These layers fall into three classes: input, hidden, and output.
Knowing the number of input and output layers and the number of their neurons is the easiest part. Every network has a single input layer and a single output layer. The number of neurons in the input layer equals the number of input variables in the data being processed. The number of neurons in the output layer equals the number of outputs associated with each input. The challenge is knowing the number of hidden layers and the number of neurons in each of them.
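The rule for the input and output layers can be read directly off the shape of the data. The snippet below is a minimal sketch, assuming the samples are stored as NumPy arrays with one row per sample; the toy arrays and the helper name input_output_sizes are hypothetical and only serve as an illustration.

```python
import numpy as np

# Hypothetical toy dataset: 4 samples, 2 input variables and 1 output each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])


def input_output_sizes(inputs, outputs):
    """Return (input neurons, output neurons) implied by the data shapes."""
    inputs = np.asarray(inputs)
    outputs = np.asarray(outputs)
    if outputs.ndim == 1:
        # A flat vector of labels means one output per sample.
        outputs = outputs.reshape(-1, 1)
    # One input neuron per input variable, one output neuron per output.
    return inputs.shape[1], outputs.shape[1]


n_in, n_out = input_output_sizes(X, y)
print(n_in, n_out)
```

With two input variables and one class label per sample, this gives two input neurons and one output neuron.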
Here are some guidelines for choosing the number of hidden layers and the number of neurons in each hidden layer in a classification problem:
1. Express the decision boundary as a set of lines. Note that the combination of these lines must yield the decision boundary.
2. The number of selected lines represents the number of hidden neurons in the first hidden layer.
3. To connect the lines created by the previous layer, a new hidden layer is added. Note that a new hidden layer is added each time you need to create connections among the lines in the previous hidden layer.
4. The number of hidden neurons in each new hidden layer equals the number of connections to be made.
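The four guidelines above amount to simple bookkeeping, which can be sketched as a small function. This is an illustrative sketch, not a general design rule: the function name hidden_layer_sizes and the counts passed to it are hypothetical, and in practice the counts come from inspecting the decision boundary by hand.

```python
def hidden_layer_sizes(num_lines, connections_per_layer=()):
    """Translate the guidelines into a list of hidden layer sizes.

    num_lines: number of lines used to express the decision boundary
        (guidelines 1 and 2); this is the size of the first hidden layer.
    connections_per_layer: for each extra hidden layer, the number of
        connections to be made among the lines of the previous layer
        (guidelines 3 and 4).
    """
    if num_lines < 0:
        raise ValueError("num_lines must be non-negative")
    if num_lines == 0:
        # No lines to draw means no hidden layer is needed.
        return []
    sizes = [num_lines]
    # Each extra hidden layer has one neuron per connection to be made.
    sizes.extend(int(c) for c in connections_per_layer)
    return sizes


# Hypothetical counts: 4 lines, then 2 connections among them.
print(hidden_layer_sizes(4, [2]))
```

For the hypothetical counts above, the result is a first hidden layer of four neurons followed by a second hidden layer of two neurons.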
To make things clearer, let’s apply the previous guidelines to a number of examples.
Example 1
Let’s start with a simple example of a classification problem with two classes, as shown in the figure below. Each sample has two inputs and one output that represents the class label.
The first question to answer is whether hidden layers are required at all. The rule to follow is this: in artificial neural networks, hidden layers are required if and only if the data must be separated non-linearly.
Looking at the figure, it seems that the classes must be separated non-linearly. A single line will not work. As a result, we must use hidden layers in order to get the best decision boundary. We could still skip the hidden layers in such a case, but the classification accuracy would suffer. So, it is better to use hidden layers.
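The claim that a single line will not work can be checked numerically. The sketch below does not use the data from the figure; as an assumption, it substitutes the classic XOR arrangement of four points, which is also not linearly separable. It tries many candidate lines on a coarse grid of weights (the grid range and the function name best_single_line_accuracy are hypothetical choices) and reports the best accuracy any of them reaches.

```python
import itertools

import numpy as np

# Assumed stand-in for the figure's data: the XOR arrangement of two classes.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])


def best_single_line_accuracy(X, y, grid=None):
    """Best accuracy of the classifier w1*x1 + w2*x2 + b > 0 over a weight grid."""
    if grid is None:
        grid = np.linspace(-2.0, 2.0, 17)  # step of 0.25
    best = 0.0
    for w1, w2, b in itertools.product(grid, repeat=3):
        # Samples on the positive side of the line are assigned to class 1.
        pred = (w1 * X[:, 0] + w2 * X[:, 1] + b > 0).astype(int)
        best = max(best, float(np.mean(pred == y)))
    return best


print(best_single_line_accuracy(X, y))
```

On this stand-in data no single line classifies more than three of the four samples correctly, so the network without hidden layers still produces an answer but loses accuracy, which is the trade-off described above.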
In order to add hidden layers, we need to answer the following two questions:
1. What is the required number of hidden layers?
2. What is the number of hidden neurons in each hidden layer?
Following the previous procedure, the first step is to draw the decision boundary that splits the two classes. There is more than one possible decision boundary that splits the data correctly, as shown in the figure. The one we will use for further discussion is the one shown in the figure.