Creating a Solver for Nonlinear Differential Equations Using Unsupervised Neural Networks

Open Access
- Author:
- Frishman, Jacob
- Area of Honors:
- Mathematics
- Degree:
- Bachelor of Science
- Document Type:
- Thesis
- Thesis Supervisors:
- Leonid V Berlyand, Thesis Supervisor
- Sergei Tabachnikov, Thesis Honors Advisor
- Keywords:
- Differential Equations
- DNNs
- Deep Neural Networks
- Machine Learning
- ODEs
- Neural Networks
- Abstract:
- Differential equations are ubiquitous in science and engineering for describing the natural world, and they often appear in nonlinear form. Unfortunately, there is no general method for solving all types of nonlinear differential equations. This work uses deep neural networks (DNNs), trained in an unsupervised fashion, to create a solver for the Ginzburg-Landau equation regardless of the boundary conditions or the right-hand side. This approach overcomes a limitation of previous methods, which must recompute the solution for every change in the boundary conditions or right-hand side of the equation. The result is a versatile solver that finds a solution using only the form of the differential equation, without a predefined right-hand side or boundary conditions. Systematically varying the network architecture, the characteristics of the input data, the loss function being optimized, and the network's hyperparameters reveals that the method can find a general solution across a diverse range of boundary conditions and right-hand sides. The network consistently finds accurate approximations for slowly oscillating data and for highly oscillatory data built from many terms of a Fourier series. The model generalizes its performance from training data to test data, indicating its success in approximating a general inverse differential operator that solves the equation. For data with many oscillations and small magnitudes, the network suffers from the vanishing gradient problem. These challenges are addressed by implementing strategies such as batch normalization, varying initialization schemes, changing activation functions, modifying the network architecture, and altering the loss function. These changes mitigate the problem, yielding solutions that are more stable and robust to the model's initial hyperparameters; however, the vanishing gradient problem persists. A solver that works for nonlinear equations would be pivotal in building a theory for solving differential equations, saving computational time and resources, and enabling real-time applications of the network without retraining.
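
The abstract describes an unsupervised training scheme in which the loss is the residual of the differential equation plus a boundary-condition penalty, with the right-hand side and boundary values supplied as inputs so the solver need not be retrained when they change. The sketch below is an illustrative reconstruction of that idea, not the thesis's actual architecture: it assumes a one-dimensional stationary Ginzburg-Landau form -u'' + u^3 - u = f on [0, 1] with Dirichlet boundary conditions, encodes f by hypothetical Fourier coefficients, and uses arbitrary network sizes and optimizer settings.

```python
import math
import torch
import torch.nn as nn

K = 8   # number of Fourier modes describing f (illustrative choice)
N = 64  # collocation points per training sample

class SolverNet(nn.Module):
    """Maps (x, Fourier coefficients of f, Dirichlet boundary values) -> u(x)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1 + K + 2, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, coeffs, bc):
        # Broadcast the problem data (coeffs, bc) to every collocation point x.
        inp = torch.cat(
            [x, coeffs.expand(x.shape[0], -1), bc.expand(x.shape[0], -1)], dim=1
        )
        return self.net(inp)

def unsupervised_loss(model, coeffs, bc):
    # Collocation points where the equation residual -u'' + u^3 - u - f is enforced.
    x = torch.rand(N, 1, requires_grad=True)
    u = model(x, coeffs, bc)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    # Reconstruct f(x) from its Fourier coefficients (sine basis, illustrative).
    modes = torch.arange(1, K + 1, dtype=torch.float32)
    f = (coeffs * torch.sin(math.pi * modes * x)).sum(dim=1, keepdim=True)
    residual = -d2u + u**3 - u - f
    # Penalize mismatch with the Dirichlet values at x = 0 and x = 1.
    xb = torch.tensor([[0.0], [1.0]])
    ub = model(xb, coeffs, bc)
    return residual.pow(2).mean() + (ub - bc.t()).pow(2).mean()

model = SolverNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(1000):
    coeffs = torch.randn(1, K)   # random right-hand side for this step
    bc = torch.randn(1, 2)       # random Dirichlet boundary values
    loss = unsupervised_loss(model, coeffs, bc)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the right-hand side and boundary values are drawn afresh at every step and fed to the network as inputs, a trained model of this kind can be evaluated on new data without retraining, which is the generalization behavior the abstract reports testing.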