In recent years, universities have shown growing interest in predicting their students' academic performance in advance, so that they can establish timely strategies to prevent dropout and failure. Predicting student performance is particularly challenging in the 'Programming Fundamentals' course of the Computer Science, Software Engineering, and Information Systems Engineering degree programs at Peruvian universities, which has high student dropout rates. The objective of this research was to explore the efficiency of Long Short-Term Memory (LSTM) networks in the field of Educational Data Mining (EDM) for predicting the academic performance of students during the seventh, eighth, twelfth, and sixteenth weeks of the academic semester, which allowed us to identify students at risk of failing the course. This research compares the LSTM with several predictive models, such as Deep Neural Network (DNN), Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), Support Vector Classifier (SVM), and K-Nearest Neighbor (KNN). A major challenge for machine learning algorithms is class imbalance in the dataset, which leads to over-fitting to the available data and, consequently, low accuracy. We use Generative Adversarial Networks (GAN) and the Synthetic Minority Over-sampling Technique (SMOTE) to balance the data used in our proposal. Experimental results based on accuracy, precision, recall, and F1-score show that our model achieves the best classification: LSTM-GAN reaches 98.3% accuracy in week 8, followed by DNN-GAN with 98.1% accuracy.
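As an illustration of the balancing step mentioned above, the following is a minimal sketch of SMOTE-style oversampling: each synthetic minority sample is placed at a random point on the line between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation. The function name `smote_like_oversample`, the two features, and the toy values are hypothetical and chosen only for illustration.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical toy data: [attendance rate, average grade] of at-risk students.
failing = [[0.50, 8.0], [0.55, 9.5], [0.40, 7.0], [0.60, 10.0]]
new_rows = smote_like_oversample(failing, n_new=6)
balanced_minority = failing + new_rows
print(len(balanced_minority))  # 4 original + 6 synthetic rows
```

Because every synthetic row is an interpolation between two real minority rows, it stays within the range of the observed minority values. A GAN-based balancer instead learns a generator for the minority class and is not shown here.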
Bibliographical note
Publisher Copyright: © 2013 IEEE.
- Educational data mining
- generative adversarial networks
- long-short term memory
- synthetic minority over-sampling technique