Addressing Overfitting Issues with Deep Learning Model for Video Action Recognition
Open Access
Author:
Muralidhar, Shivran
Area of Honors:
Computer Engineering
Degree:
Bachelor of Science
Document Type:
Thesis
Thesis Supervisors:
Vijaykrishnan Narayanan, Thesis Supervisor
Vijaykrishnan Narayanan, Thesis Honors Advisor
John Morgan Sampson, Faculty Reader
Keywords:
deep learning, machine learning, video recognition, computer vision
Abstract:
This thesis addresses the overfitting problem in a novel human action recognition model proposed by our research group. Current work in human action recognition typically relies on a graph convolutional network or another graph neural network variant to achieve high performance. This heavy dependence on convolutional neural networks brings efficiency problems: high computational cost and significantly increased latency. Our new baseline model is fully attention-based and relies on a transformer architecture to tackle these efficiency issues. However, the model overfits, which restricts its use on practical datasets. This thesis addresses the overfitting by implementing four data augmentation techniques: adding Gaussian noise, removing localized body parts, rotating joints, and removing frames. Using the absolute error, loss ratio, and learning curves of the training and testing sets as metrics, the results indicate that combining the removal of localized body parts with the removal of frames yields the most effective and best-generalized model.
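The four augmentation techniques named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes a skeleton clip is stored as a NumPy array of shape (frames, joints, 3), that "removing" a body part means zeroing its joints, and that rotation is applied about the z-axis. The function names, parameters, and default values are all hypothetical.

```python
import numpy as np

# Assumed layout (not from the thesis): clip has shape (frames, joints, 3),
# holding the x, y, z coordinates of every joint in every frame.


def add_gaussian_noise(clip, sigma=0.01, rng=None):
    """Add zero-mean Gaussian noise to every joint coordinate."""
    rng = np.random.default_rng() if rng is None else rng
    return clip + rng.normal(0.0, sigma, size=clip.shape)


def remove_body_part(clip, joint_indices):
    """Zero out the joints of one localized body part in all frames."""
    out = clip.copy()
    out[:, list(joint_indices), :] = 0.0
    return out


def rotate_joints(clip, angle_rad):
    """Rotate all joints about the z-axis by angle_rad (assumed axis)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s, 0.0],
                    [s, c, 0.0],
                    [0.0, 0.0, 1.0]])
    # Each coordinate row v becomes rot @ v; matmul broadcasts over frames.
    return clip @ rot.T


def remove_frames(clip, drop_prob=0.1, rng=None):
    """Drop each frame independently with probability drop_prob."""
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(clip.shape[0]) >= drop_prob
    if not keep.any():
        keep[0] = True  # always keep at least one frame
    return clip[keep]
```

Under these assumptions, the combination the abstract reports as most effective would be applied as `remove_frames(remove_body_part(clip, part_joints))`, where `part_joints` is a hypothetical list of joint indices for one limb. Because `remove_frames` shortens the clip, a model expecting fixed-length input would still need padding or resampling afterwards.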