5. Adaptive Multilayer Neural Networks I

5.0 Introduction

This chapter extends the gradient-descent-based delta rule of Chapter 3 to multilayer feedforward neural networks. The resulting learning rule is commonly known as error backpropagation (or backprop), and it is one of the most widely used learning rules for training artificial neural networks.

The backprop learning rule is central to much current work on learning in artificial neural networks. In fact, the development of backprop is one of the main reasons for the renewed interest in artificial neural networks. Backprop provides a computationally efficient method for adjusting the weights in a feedforward network of units with differentiable activation functions so that the network learns a training set of input-output examples. Backprop-trained multilayer neural nets have been applied successfully to difficult and diverse problems such as pattern classification, function approximation, nonlinear system modeling, time-series prediction, and image compression and reconstruction. For these reasons, we devote most of this chapter to studying backprop, its variations, and its extensions.
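The mechanics backprop relies on (a forward pass through differentiable units, output errors propagated backward as layer deltas, and a gradient descent weight update) can be sketched in a few lines. The following minimal illustration trains a one-hidden-layer sigmoid net on the XOR problem; the network size, learning rate, and task are illustrative choices, not taken from the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_xor(epochs=5000, lr=0.5, seed=0):
    """Batch gradient descent (backprop) on XOR; returns (initial, final) SSE."""
    rng = np.random.default_rng(seed)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])
    W1 = rng.normal(0.0, 1.0, (2, 4)); b1 = np.zeros(4)   # input -> hidden
    W2 = rng.normal(0.0, 1.0, (4, 1)); b2 = np.zeros(1)   # hidden -> output

    def sse():
        h = sigmoid(X @ W1 + b1)
        return float(np.sum((sigmoid(h @ W2 + b2) - y) ** 2))

    e_start = sse()
    for _ in range(epochs):
        # forward pass through differentiable sigmoid units
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # backward pass: output delta, then delta propagated to hidden layer
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        # gradient descent weight updates
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)
    return e_start, sse()
```

Running `train_xor()` shows the sum-squared error over the training set falling as the weights descend the error surface.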

Backpropagation is a gradient descent search algorithm and, as such, may suffer from slow convergence and from becoming trapped in local minima. In this chapter, several methods for improving backprop's convergence speed and for avoiding local minima are presented, with theoretical justification given whenever possible. A version of backprop based on an enhanced criterion function with global search capability is described; when properly tuned, it allows relatively fast convergence to good solutions. Several significant applications of backprop-trained multilayer neural networks are then described, including the conversion of English text into speech, the mapping of hand gestures to speech, the recognition of handwritten zip codes, autonomous vehicle navigation, medical diagnosis, and image compression.
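One of the most common convergence-speed improvements of this kind is the addition of a momentum term to the weight update, so that each change is a mix of the current gradient step and the previous change: delta_w(t) = -lr * grad E(w) + alpha * delta_w(t-1). A minimal sketch on a quadratic error surface (the function name `gd_momentum` and all hyperparameters are illustrative assumptions, not from the text):

```python
import numpy as np

def gd_momentum(grad, w0, lr=0.02, alpha=0.9, steps=300):
    """Gradient descent with momentum:
    delta_w(t) = -lr * grad(w) + alpha * delta_w(t-1).
    The momentum term builds up velocity along directions where the
    gradient is consistent and damps oscillation across narrow valleys."""
    w = np.asarray(w0, dtype=float)
    delta = np.zeros_like(w)
    for _ in range(steps):
        delta = -lr * grad(w) + alpha * delta
        w = w + delta
    return w

# Quadratic error surface E(w) = 0.5 * w^T A w with very unequal curvatures,
# the situation where plain gradient descent is forced to take tiny steps.
A = np.diag([1.0, 25.0])
w_final = gd_momentum(lambda w: A @ w, [1.0, 1.0])
```

With these settings the iterate converges close to the minimum at the origin despite the 25:1 curvature ratio.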

The last part of this chapter deals with extensions of backprop to more general neural network architectures: multilayer feedforward nets whose inputs are generated by a tapped delay-line circuit, and fully recurrent neural networks. These architectures extend the applicability of artificial neural networks to nonlinear dynamical system modeling and temporal pattern association.
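The tapped delay line mentioned above simply presents the current sample of a time series together with several delayed copies of it as a single input vector to the feedforward net. A minimal sketch of that preprocessing step (the helper name `tapped_delay_line` is an illustrative assumption):

```python
import numpy as np

def tapped_delay_line(x, taps):
    """Turn a scalar time series into input vectors
    [x(t), x(t-1), ..., x(t-taps+1)], one row per usable time step,
    suitable as inputs to a feedforward net for temporal modeling."""
    x = np.asarray(x, dtype=float)
    return np.array([x[t - taps + 1 : t + 1][::-1]
                     for t in range(taps - 1, len(x))])

windows = tapped_delay_line([1, 2, 3, 4, 5], taps=3)
# each row holds the current sample followed by its two delayed copies
```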
