
A Structured Approach to Neural Networks

Introduction

A vector-based description of a Fully Connected Neural Network is presented to provide a solid, easy-to-understand mathematical foundation. Using some simple Linear Algebra, complicated summation equations can be avoided, making the mathematics much cleaner and clearer. This matters because the mathematical underpinnings of a Fully Connected Neural Network form the basis for other kinds of connectionist networks, such as Convolutional Neural Networks.

Neural Networks have free parameters that must be solved for; the process of solving for them is referred to as training. The kind of training discussed here is based on having a set of input data with associated expected results, an approach known as Supervised Learning. Network parameters are adjusted in an attempt to produce the expected results for the associated inputs. To know how to make the adjustments, a function called a Loss function (a.k.a. Cost function or Objective function) is devised to measure the network output error, and the network parameters are adjusted in an effort to minimize that error.

This paper focuses on the Gradient Descent method for minimizing the Loss function, and a simple study of a Loss function Error Surface is conducted to give an appreciation for how the method works. Applying Gradient Descent requires the gradient of the Error Surface, and that gradient is computed with the Back-Propagation method, which will be derived from the vector-based network equation.

There is a companion C++ library that closely mirrors the structure of the mathematics described in this paper. It is called the SimpleNet library because it is written to be easily readable and extensible. The paper examines the implementation accuracy of the library and discusses its structure, but the paper is not about the SimpleNet library. The library is available for download, and documentation for it is provided in an appendix, but the paper stands on its own.

The target audience for this paper is anyone interested in the underlying mathematics of Neural Networks. To fully benefit from the paper, a reader should at least be familiar with basic linear algebra and differential calculus. Regarding the former, knowing how to multiply a matrix by a vector is enough. Regarding the latter, if you know, or once knew, what a function derivative is, that should be enough to follow along.

View Full Paper Code Download:  7-Zip  RAR