Multi-Layer Perceptron
Neural networks are a key pillar of state-of-the-art algorithms because they help solve complex tasks. A previous blog discussed a simple type of neural network, the single-layer perceptron; this blog goes one step further to the Multi-Layer Perceptron (MLP), a prominent neural network architecture. The purpose of this blog is to explore MLPs and understand how they work: we will not only develop Python code to implement and test a simple MLP, but also work through it mathematically for better comprehension.
Example
Suppose the following inputs are given, and the goal is to classify them with an MLP using the corresponding weights and activation functions, so that if the final output y for an input is less than the provided threshold of 1.35, the input is classified as -ve, and otherwise as +ve.
inputs = (0, 0), (1, 0), (0, 1), (0, 2), (2, 0), (2, 1), (1, 2), (3, 1).
Taking the information provided into account, the network is designed as follows: the two input features plus a bias feed a first layer of three neurons (step, sigmoid and tanh activations), followed by a second layer of two neurons (ReLU and sigmoid) and a single ReLU output neuron, with a bias input of 1 prepended at every layer.
With the network designed, the next step is to develop the mathematical formulation needed for the task. Each neuron j in a layer computes a weighted sum of its inputs and passes it through its activation function f_j:
z_j = w_0j·x_0 + w_1j·x_1 + w_2j·x_2,    a_j = f_j(z_j)
where x_0 is the bias term, which is always 1.
Now that the network has been designed and the mathematical formulation has been developed, every input sample can be classified using these equations against the specified threshold of 1.35.
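For example, take the first input (0, 0) and work it through the network by hand (weights are read row-wise from the arrays in the code below, with the first entry of each row acting as the bias weight; values are rounded to four decimals). With the bias prepended, the layer-1 weighted sums are 0.02, 0.5 and 2, giving activations step(0.02) = 1, sigmoid(0.5) ≈ 0.6225 and tanh(2) ≈ 0.9640. Layer 2 then yields relu(0.7 - 0.9 + 2(0.6225) + 1(0.9640)) ≈ 2.0089 and sigmoid(0.8 + 1(1) - 1.5(0.6225) - 0.9(0.9640)) ≈ 0.4997. Finally, layer 3 gives y = relu(0.01 + 0.5(2.0089) + 0.6(0.4997)) ≈ 1.3143, which is below the 1.35 threshold, so (0, 0) is classified as -ve.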
Code
For a classification task, the given code implements a three-layer MLP. The activation functions defined in the code are step, sigmoid, tanh, and ReLU; they give the network its non-linearity, allowing it to learn complex relationships between inputs and outputs. The code then uses forward propagation to compute the MLP output for each input data point and classifies the input as +ve or -ve by comparing that output with the defined threshold.
import numpy as np
import matplotlib.pyplot as plt

# Activation functions used for this task: layer 1 uses step, sigmoid and tanh;
# layer 2 uses relu and sigmoid; layer 3 uses relu
def relu(output):
    return np.maximum(0, output)

def step(output):
    if output > 0:
        return 1
    elif output == 0:
        return 1/2
    else:
        return 0

def sigmoid(output):
    return 1 / (1 + np.exp(-output))

def tanh(output):
    return np.tanh(output)

category = []
inputs = np.array([(0, 0), (1, 0), (0, 1), (0, 2), (2, 0), (2, 1), (1, 2), (3, 1)])
# Each row of a reshaped weight matrix holds one neuron's weights,
# with the first entry of the row acting as the bias weight
weights1 = np.array([0.02, 0.03, 0.4,
                     0.5, 0.6, 0.1,
                     2, 0.4, 1])
weights2 = np.array([0.7, -0.9, 2, 1,
                     0.8, 1, -1.5, -0.9])
weights3 = np.array([0.01, 0.5, 0.6])

for i in range(len(inputs)):
    print(" Input", inputs[i])
    # The bias term "1" is added as an input at each layer using np.insert
    modified_inputs = np.insert(inputs[i], 0, 1)
    # Applying activation functions at layer 1
    weightedsum1 = np.dot(modified_inputs, weights1.reshape(-1, 3).T)
    activation1 = np.array([step(weightedsum1[0]), sigmoid(weightedsum1[1]), tanh(weightedsum1[2])])
    # Applying activation functions at layer 2
    weightedsum2 = np.dot(np.insert(activation1, 0, 1), weights2.reshape(-1, 4).T)
    activation2 = np.array([relu(weightedsum2[0]), sigmoid(weightedsum2[1])])
    # Applying activation function at layer 3
    weightedsum3 = np.dot(np.insert(activation2, 0, 1), weights3.reshape(-1, 3).T)
    activation3 = np.array([relu(x) for x in weightedsum3])
    if activation3[0] > 1.35:
        print("classified as +ve", "\n")
        category.append("+")
    else:
        print("classified as -ve", "\n")
        category.append("-")

# Plot each input point with its predicted class label
for point, cat in zip(inputs, category):
    plt.text(point[0], point[1], cat, c='red', fontsize=20)
plt.xlim(0, 3)        # Set x-axis limits to 0 and 3
plt.ylim(0, 3)        # Set y-axis limits to 0 and 3
plt.xticks(range(4))  # Set x-axis tick labels from 0 to 3
plt.yticks(range(4))  # Set y-axis tick labels from 0 to 3
plt.grid(True)        # Add grid lines to the plot
plt.show()
Output
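Running the code prints each input along with its predicted class and then plots a + or - label at each point on the grid. Working the arithmetic through as above, only (0, 0) and (0, 1) produce outputs below the 1.35 threshold (y ≈ 1.3143 and y ≈ 1.3436), so they are classified as -ve; the remaining six inputs are classified as +ve.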
Conclusion
In this blog, the concept behind the multi-layer perceptron (MLP) was discussed, and a simple MLP was built in Python. The role of activation functions in introducing non-linearity to the network was examined, and the code shows how forward propagation can be used to classify data points based on the MLP's output.
MLPs are effective tools for machine learning tasks such as classification, regression, and even complex pattern recognition. Understanding MLPs and how they are implemented lays the groundwork for tackling more advanced neural network architectures.
Experimenting with various activation functions, network designs, and datasets will help you better grasp MLPs and their potential for solving real-world problems.
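As a starting point for such experiments, here is a minimal sketch of swapping a different activation into the network above; leaky_relu and its slope alpha are illustrative choices, not part of the original design:
import numpy as np

def leaky_relu(output, alpha=0.01):
    # Like relu, but keeps a small negative slope instead of clamping to 0
    return np.where(output > 0, output, alpha * output)

# In the classification loop above, the layer-3 line could become:
# activation3 = np.array([leaky_relu(x) for x in weightedsum3])
print(leaky_relu(np.array([-2.0, 0.5, 1.4])))  # negative input scaled by alpha: [-0.02, 0.5, 1.4]
Note that changing an activation function shifts the range of the output values, so the 1.35 threshold may also need retuning.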