APPLICATION OF MULTILAYER ARTIFICIAL NEURAL NETWORK FOR PREDICTING STUDENT’S ACADEMIC PERFORMANCE IN NIGERIA SECONDARY SCHOOL
ABSTRACT
Predicting students’ academic performance is important for educational institutions that aim to deliver high-quality education, yet developing an accurate prediction model is a challenging task. This project employs an artificial intelligence system to predict student academic performance so as to help students and school administrators improve academic achievement. The proposed approach consists of two steps. First, students’ results from previous examinations are pre-processed by normalization in order to improve the accuracy and efficiency of the predictive model. Second, an Artificial Neural Network (ANN) model is applied to predict the students’ expected performance in the next academic session. Indicators likely to influence a student’s performance were identified, namely subject scores, gender, behavior, and skills, and were used as input variables for the ANN model. The performance of the ANN model was evaluated with three training functions (Levenberg-Marquardt, gradient descent backpropagation with momentum, and gradient descent backpropagation with adaptive learning rate) while varying the number of training epochs from 100 to 500. The experimental results showed that gradient descent backpropagation with momentum gave the best prediction, outperforming the other training functions with a Mean Squared Error (MSE) as low as 0.14763 at 400 epochs. This result demonstrates the potential of artificial neural networks for predicting student academic performance.
Introduction
Artificial intelligence tools have been integrated into many fields of human endeavor, and education is no exception. Such tools include expert systems, neural networks, fuzzy logic, and genetic algorithms. They enable the discovery of new, interesting, and useful information from datasets in education, medicine, marketing, insurance, engineering, and other fields. Conventional techniques such as database queries show shortcomings when the volume of data is high. Predicting students’ academic performance in Nigerian secondary schools is a difficult task because it involves a large number of variables, including subject scores, behavioral parameters, and skills (punctuality, fluency, games/sports, and handling of materials). Secondary school prepares students for higher education, a trade, or business; hence a proper and adequate evaluation method for determining student performance is required. The intelligent system used here emulates nature’s way of processing information.
An artificial neural network (ANN), which imitates the way the human brain solves problems, is proposed for predicting students’ academic performance. The neural network classifier is incorporated in a user-friendly software tool for predicting student academic performance in Nigerian secondary schools. The multilayer artificial neural network is one kind of ANN. An ANN consists of nodes, and the failure of one or more nodes need not significantly affect the result. Because information is stored in a distributed way across the nodes, an ANN can process data quickly and is fault-tolerant.
Literature Review of Related Works
Rapid advances in educational technology have helped educational institutions put student performance prediction models into practice. Several techniques have been designed to predict students’ performance.
Shakil et al. (2017) proposed an intelligent system to predict academic performance based on different factors during adolescence. The artificial neural network performed best, with an accuracy of over 86%. One major limitation of that work is that the survey responses could not be validated. Another is that the study targets only the overall grade.
Tarik and Nian (2016) presented student academic performance prediction using artificial intelligence. Their experiments showed that an ANN-based model can be used to predict the expected performance of students in General English courses with an accuracy of 83%. The data analysis revealed that department and tutor were very significant factors affecting academic performance; these two variables had the most significant weights among the 14 variables. The study reported that gender, age, father’s and mother’s degree, and GPA had no effect on academic performance. A limitation of the neural network model is that it requires a long training time before giving a result.
Naser et al. (2015) proposed an ANN model for predicting the academic performance of Engineering faculty students. The model correctly predicted student performance with an accuracy of 80%. Its limitation is that ANNs train slowly and require large amounts of training data.
Ajith et al. (2013) proposed a rule mining framework, based on association rules, for evaluating student performance. The framework mines the student dataset to obtain in-depth information on student performance.
In this work, the neural network will be adapted for better analysis and proper optimization of the training parameters.
Artificial Neural Network
An artificial neural network, also known as a neural network, is a system based on the operation of biological neural networks; in other words, it is an emulation of a biological neural system. Inspired by the structure of the brain, a neural network consists of a set of highly interconnected entities called Processing Elements (PE) or units. Each unit is designed to mimic its biological counterpart, the neuron: it accepts a weighted set of inputs and responds with an output. Neural networks address problems that are often difficult to solve with traditional computing methods, such as speech and pattern recognition, weather forecasting, sales forecasting, bus scheduling, power load forecasting, and early cancer detection (Adefowoju and Osofisan, 2004).
A neural network is a more general method of regression analysis. Some of the advantages of the network over conventional regression include the following:
i. There is no need to specify a function to which the data are to be fitted.
ii. The function is an outcome of the process of creating a network.
iii. The network is able to capture almost arbitrarily nonlinear relationships.
iv. With Bayesian methods, it is possible to estimate the uncertainty of extrapolation.
There are feed-forward, back-propagation, and feedback types of networks, depending on the manner of neuron connections. The first allows neuron connections only between two different layers. The second has not only feed-forward but also ‘error feedback’ connections from each of the neurons above it. The last shares the features of the first but adds feedback connections that permit more training or learning iterations before results are generated. ANN learning can be either supervised or unsupervised.
Architecture of ANN
The architecture of an artificial neural network defines how its several neurons are arranged, or placed, in relation to each other. These arrangements are structured essentially by directing the synaptic connections of the neurons. ANN can be divided into three parts, named layers, which are known as:
a. Input layer: This layer is responsible for receiving information (data), signals, features, or measurements from the external environment. These inputs (samples or patterns) are usually normalized within the limit values produced by activation functions. This normalization results in better numerical precision for the mathematical operations performed by the network.
b. Hidden, intermediate, or invisible layers: These layers are composed of neurons that are responsible for extracting patterns associated with the process or system being analyzed. These layers perform most of the internal processing of a network.
c. Output layer: This layer is also composed of neurons, and thus is responsible for producing and presenting the final network outputs, which result from the processing performed by the neurons in the previous layers.

Fig. 1: Feedforward ANN model
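The layer structure described above can be illustrated with a minimal Python sketch of one forward pass. The layer sizes, weights, and the choice of a sigmoid hidden layer with a linear output are assumptions made for illustration only; the paper does not state the configuration it used.

```python
import math

def sigmoid(x):
    """Logistic activation, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases, activation):
    """Compute one layer: each neuron sums its weighted inputs plus a bias."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(total))
    return outputs

def feedforward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Input layer -> one sigmoid hidden layer -> linear output layer."""
    hidden = layer_forward(inputs, hidden_w, hidden_b, sigmoid)
    return layer_forward(hidden, output_w, output_b, lambda v: v)

# Hypothetical toy network: 3 normalized inputs, 2 hidden neurons, 1 output.
hidden_w = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0]]
output_b = [0.2]

prediction = feedforward([0.7, 0.4, 1.0], hidden_w, hidden_b, output_w, output_b)
print(prediction)
```

Signals flow in one direction only, from the inputs through the hidden layer to the output, which is what makes the network feed-forward.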
Data Collection and Feature Description
The dataset consists of 500 student records from Polytechnic Academy, Nasarawa, sourced from exams and records and used as the input data. Each of the 500 instances is described by 4 variables:
a. Student scores, obtained from 11 different subjects for each class. Scores range from 0 to 100. The subjects are Mathematics, English, Biology, Chemistry, Physics, Geography, Agricultural Science, Civic Education, Economics, Further Mathematics, and Marketing. Scores are categorized as excellent, very good, credit, pass, or fail.
b. Gender, categorized as either male or female.
c. Behavior, with 9 attributes (attendance, interest of study, punctuality, reliability, neatness, politeness, honesty, self-control, sense of responsibility), each rated as excellent, good, poor, or very poor.
d. Skills, with 4 attributes (handwriting, fluency, games/sports, handling of materials), each rated as excellent, good, poor, or very poor.
The dataset was pre-processed by normalizing the attributes to enhance the accuracy and efficiency of the prediction. The average of each student’s subject scores was chosen as the target. The dataset follows the current grading system of the school.
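One possible way to carry out this normalization is sketched below in Python. The paper does not say which scheme was used, so the min-max scaling to [0, 1], the ordinal coding of the ratings, and the 0/1 coding of gender are all assumptions, and the student record is hypothetical.

```python
# Assumed ordinal coding for the four rating levels named in the text.
RATING_CODES = {"very poor": 0, "poor": 1, "good": 2, "excellent": 3}
GENDER_CODES = {"female": 0, "male": 1}  # arbitrary 0/1 coding (assumption)

def min_max(value, low, high):
    """Scale a value from the range [low, high] into [0, 1]."""
    return (value - low) / (high - low)

def encode_student(scores, gender, behavior, skills):
    """Build one normalized input vector and its target (the average score)."""
    features = [min_max(s, 0, 100) for s in scores]
    features.append(float(GENDER_CODES[gender]))
    features += [min_max(RATING_CODES[r], 0, 3) for r in behavior]
    features += [min_max(RATING_CODES[r], 0, 3) for r in skills]
    target = min_max(sum(scores) / len(scores), 0, 100)
    return features, target

# Hypothetical record, shortened to 3 subjects, 2 behavior and 2 skill ratings.
features, target = encode_student(
    scores=[80, 50, 65],
    gender="female",
    behavior=["excellent", "poor"],
    skills=["good", "very poor"],
)
print(features, target)
```

Scaling every input to the same range keeps attributes with large numeric values, such as subject scores, from dominating the ratings during training.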
Performance Evaluation of the ANN Model
To examine the performance of a neural network, several criteria are used. These criteria are applied to the trained neural network to determine how well the network works. The criteria were used to compare predicted values and actual values. They are as follows:
a. Mean Squared Error (MSE): The mean squared error measures how close the model’s predictions are to the actual values. The differences between the actual and predicted values (the “errors”) are squared, which removes negative signs and gives more weight to larger differences, and the squared errors are then averaged.
b. Root Mean Squared Error (RMSE): This index is the square root of the MSE and expresses the residual between the actual and predicted values in the same units as the target. A model performs better if it has a smaller RMSE. An RMSE equal to zero represents a perfect fit.

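$$\mathrm{MSE} = \frac{1}{m}\sum_{k=1}^{m}\left(t_k - y_k\right)^2, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{k=1}^{m}\left(t_k - y_k\right)^2}$$
where $t_k$ is the actual value, $y_k$ is the predicted value produced by the model, and $m$ is the total number of observations.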
In addition to the criteria above, the number of iterations each training algorithm required to reach a given output accuracy was also used to evaluate the training algorithms. In this project, the mean squared error is used as the performance measure.
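Both criteria can be computed directly from their standard definitions, as in the short Python sketch below. The actual and predicted values are made-up normalized scores used only to exercise the functions; they are not taken from the study.

```python
import math

def mse(actual, predicted):
    """Mean of the squared differences between actual and predicted values."""
    m = len(actual)
    return sum((t - y) ** 2 for t, y in zip(actual, predicted)) / m

def rmse(actual, predicted):
    """Square root of the MSE, in the same units as the target."""
    return math.sqrt(mse(actual, predicted))

# Hypothetical normalized average scores for four students.
actual = [0.60, 0.75, 0.40, 0.90]
predicted = [0.50, 0.75, 0.60, 0.80]

print(mse(actual, predicted), rmse(actual, predicted))
```

Because the errors are squared before averaging, the single largest miss (0.20 for the third student) contributes more to the MSE than the other three combined.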
Experimental Results and Discussion
Various experiments were conducted using a feedforward backpropagation network as the NN model. The model was trained using three training functions: Levenberg-Marquardt (LM), gradient descent with momentum (GDM), and gradient descent with adaptive learning rate (GDA). MSE was used as the performance measure for both training and testing. To predict students’ performance, the students’ subject scores in previous examinations were used as the input dataset and the average score was used as the target dataset. The optimal NN model was selected on the basis of the testing MSE values at different epoch numbers. The performance of the NN model with each of the three training functions was evaluated by varying the number of training epochs from 100 to 500, as shown in Table 1. The results show that the performance of the NN model varied significantly with the training function used. The testing MSE values for the three training functions ranged from 0.14763 to 0.99968. Gradient descent with momentum achieved the best prediction, with an MSE of 0.14763 at 400 epochs and 0.16976 at 100 epochs, followed by Levenberg-Marquardt with an MSE of 0.30097 at 100 epochs. Gradient descent backpropagation with adaptive learning rate performed poorly, predicting students’ performance with an MSE of 0.92331, which is less accurate than the other training functions.

Table 1: Results of the NN model for different numbers of epochs.
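The idea behind gradient descent with momentum, and behind comparing test MSE across epoch counts, can be sketched in Python as follows. This is not the training function used in the study: it fits a single linear neuron to synthetic data, and the learning rate, momentum constant, and data are all assumptions chosen for illustration.

```python
def train_gdm(xs, ts, epochs, lr=0.1, momentum=0.9):
    """Fit y = w*x + b by batch gradient descent with momentum."""
    w, b = 0.0, 0.0
    step_w, step_b = 0.0, 0.0  # previous updates, the "momentum" memory
    m = len(xs)
    for _ in range(epochs):
        # Gradients of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - t) * x for x, t in zip(xs, ts)) / m
        grad_b = sum(2 * (w * x + b - t) for x, t in zip(xs, ts)) / m
        # Each new step blends the previous step with the current gradient.
        step_w = momentum * step_w - lr * grad_w
        step_b = momentum * step_b - lr * grad_b
        w += step_w
        b += step_b
    return w, b

def mse(xs, ts, w, b):
    """Mean squared error of the linear model on the given data."""
    return sum((w * x + b - t) ** 2 for x, t in zip(xs, ts)) / len(xs)

# Hypothetical noiseless data generated from t = 0.5 * x + 0.2.
train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
train_t = [0.5 * x + 0.2 for x in train_x]
test_x = [0.1, 0.6, 0.9]
test_t = [0.5 * x + 0.2 for x in test_x]

# Record the test MSE for each epoch budget, as in the epoch sweep above.
results = {}
for epochs in (100, 200, 300, 400, 500):
    w, b = train_gdm(train_x, train_t, epochs)
    results[epochs] = mse(test_x, test_t, w, b)

best_epochs = min(results, key=results.get)
print(results, best_epochs)
```

Selecting the epoch count with the lowest test MSE mirrors how the best configuration is chosen in the experiments, although on real data a separate validation set would normally be used for that choice.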
Conclusion
Accurate prediction of student academic performance is important for making admission decisions as well as for providing better educational services. In this project, a feed-forward neural network was used to predict student academic performance. The model was trained using three training functions, Levenberg-Marquardt (LM), gradient descent with momentum (GDM), and gradient descent with adaptive learning rate (GDA), with MSE as the performance measure for both training and testing. The optimal NN model was selected on the basis of the testing MSE values at different epoch numbers, with the number of training epochs varied from 100 to 500. The testing MSE values for the three training functions ranged from 0.14763 to 0.99968. Gradient descent with momentum (traingdm) achieved the best prediction, with an MSE of 0.14763 at 400 epochs and 0.16976 at 100 epochs, followed by Levenberg-Marquardt (trainlm) with an MSE of 0.30097 at 100 epochs. Gradient descent backpropagation with adaptive learning rate (traingda) performed poorly, predicting students’ performance with an MSE of 0.92331, which is less accurate than the other training functions. The results of this study also suggest that a comparative analysis of different training algorithms helps in improving the performance of a neural network.
References
Ajith, P., Tejaswi, B. and Sai, M. S. S. (2013). “Rule Mining Framework for Students Performance Evaluation”. International Journal of Soft Computing and Engineering, 2(6):201–206.
Altyeb, A. and Omar, B. (2017). “Prediction of Student’s Academic Performance Based on Adaptive Neuro-Fuzzy Inference”. International Journal of Computer Science and Network Security (IJCSNS), 17(1):165–16.
Ioannis, E., Konstantina, D. and Panagiotis, P. (2012). “Predicting student’s performance using artificial neural networks”. Department of Mathematics, University of Patras, GR 265 00, Greece. 8(1):322–328.
Naser, S. A., Zaqout, I., Ghosh, M. A., Atallah, R. and Alajrami, E. (2015). “Predicting Student Performance Using Artificial Neural Network”. Faculty of Engineering and Information Technology. International Journal of Hybrid Information Technology, 8(2):221–228.
Oladokun, V. and Adebanjo, A. (2015). “Predicting student’s academic performance using artificial neural network”. Department of Industrial Engineering, University of Ibadan, Ibadan, Nigeria. The Pacific Journal of Science and Technology, 9(1):71–79.
Samy, S. (2012). “Predicting learners’ performance using artificial neural networks in linear programming intelligent tutoring system”. International Journal of Artificial Intelligence and Application (IJAIA), 3(2):65–73.
Shakil, A., Navid, T. Mahmood and Rashedur, M. R. (2017). “An intelligent system to predict academic performance based on different factors during adolescence”. Journal of Information and Telecommunication, 1(2):155–175.
Tarik, A. R. and Nian, K. A. (2016). “Student Academic Performance Using Artificial Intelligence”. ZANCO Journal of Pure and Applied Sciences, the official scientific journal of Salahaddin University-Erbil (ZJPAS), 28(2):56–69.
