Deep Learning 101 – Part 2
In the previous part, we covered some of the basic concepts of deep learning. Let us now turn to activation functions. There are several types of activation functions, so let's try to make sense of them all.
1) Linear Activation Function
This function scales the input by a constant factor, which implies a linear relationship between the outputs and inputs.
This is the formula: Output = y * x
Here, y is a scalar value, for instance 2, and x is the input.
This is how the graph looks if y=2.
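The linear activation above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation; the function name `linear` and the default factor of 2 (matching the graph example) are my own choices.

```python
import numpy as np

def linear(x, y=2.0):
    # Linear activation: scale the input by a constant factor y.
    return y * x

# With y = 2, every input is simply doubled.
print(linear(np.array([-1.0, 0.0, 3.0])))  # [-2.  0.  6.]
```

Because the output is just a scaled copy of the input, stacking many layers of linear activations still gives a linear function overall, which is why the non-linear activations below matter.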
2) Sigmoid activation function
This function is shaped like an 'S'. It imparts non-linearity to outputs and squashes them into values between 0 and 1.
Look at the following scenario to understand this better. Suppose you purchase a European call option. The premise is that a premium amount X is paid to purchase an option on an underlying asset, for example the stock of an organization.
The purchaser and seller then agree on a strike price. This is the price at which the purchaser of the option can exercise it.
When the price of the underlying stock rises above the strike price, the purchaser ends up with a profit. If the opposite happens, losses are capped and only the premium is lost. This is an example of a non-linear relationship.
This binary decision, whether to exercise the option or not, can be modelled with the sigmoid activation function:
Output = 1 / (1 + e^(-x))
If your output should be either 0 or 1, the sigmoid activation function is a natural choice.
This is the example graph:
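The sigmoid formula above translates directly into Python; this is a minimal sketch using the standard library.

```python
import math

def sigmoid(x):
    # Sigmoid activation: squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

# x = 0 gives exactly 0.5; large positive (negative) x approaches 1 (0).
print(sigmoid(0.0))    # 0.5
print(sigmoid(6.0))    # ~0.9975
print(sigmoid(-6.0))   # ~0.0025
```

Note that the output only approaches 0 or 1 asymptotically; in practice a threshold (such as 0.5) is applied to turn the probability-like output into a binary decision.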
3) Tanh Activation Function
Tanh is an extension of the sigmoid activation function, so it can likewise be used to induce non-linearity in the outputs. Its outputs fall within the -1 to 1 range. Tanh is a scaled and shifted version of the sigmoid: tanh(x) = 2 * sigmoid(2x) - 1.
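The relationship between tanh and sigmoid can be checked numerically. This is a small sketch, assuming the standard definitions of both functions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Tanh is a rescaled, shifted sigmoid: tanh(x) = 2 * sigmoid(2x) - 1,
# which moves the output range from (0, 1) to (-1, 1).
x = 1.0
print(math.tanh(x))              # ~0.7616
print(2 * sigmoid(2 * x) - 1)    # ~0.7616 (same value)
```

The zero-centered output range of tanh is often cited as an advantage over sigmoid in hidden layers, since it keeps activations balanced around zero.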
4) Rectified Linear Unit (ReLU) Activation Function
This is an often-used activation function, favored for use within the hidden layers. The concept is simple: negative inputs become zero and positive inputs pass through unchanged, so it imparts non-linearity to the output while the outcome can range from zero to infinity.
If you are uncertain which activation function to utilize, ReLU is the one to go for.
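The ReLU rule described above is one line of code; here is a minimal NumPy sketch.

```python
import numpy as np

def relu(x):
    # ReLU: negative inputs become 0, positive inputs pass through unchanged.
    return np.maximum(0, x)

print(relu(np.array([-3.0, -0.5, 0.0, 2.5])))  # [0.  0.  0.  2.5]
```

Its simplicity is part of why it is so popular: both the function and its gradient are cheap to compute, and it avoids the saturation that flattens gradients at the extremes of sigmoid and tanh.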
5) Softmax Activation Function
This is an extension of the sigmoid activation function. It imparts non-linearity to outputs, but it is primarily used for classification examples in cases where several classes of outcomes can be computed.
Let's look at an example to enhance our understanding. Suppose you are constructing a neural network that predicts the chance of snowfall in the future. The softmax activation function can be utilized in the output layer, as it turns the network's raw scores into a probability for each possible outcome.
Softmax normalizes its inputs, producing values ranging from 0 to 1 that sum to 1.
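The normalization described above can be sketched as follows. This is an illustrative implementation; the max-subtraction step is a common trick for numerical stability, not part of the mathematical definition.

```python
import numpy as np

def softmax(scores):
    # Softmax: turn a vector of raw scores into probabilities that sum to 1.
    shifted = scores - np.max(scores)  # subtract max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)        # approximately [0.659, 0.242, 0.099]
print(probs.sum())  # 1.0
```

The largest score always receives the largest probability, which is why softmax is the standard choice for multi-class classification output layers.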
Finally, remember that the weights and biases, together with the chosen activation functions, determine how a neural network transforms its inputs.