Table of Contents
Fundamentals of Artificial Neural Networks
by Mohamad H. Hassoun
(MIT Press, 1995)
Chapter 1 Threshold Gates
- 1.0 Introduction
- 1.1 Threshold Gates
- 1.1.1 Linear Threshold Gates
- 1.1.2 Quadratic Threshold Gates
- 1.1.3 Polynomial Threshold Gates
- 1.2 Computational Capabilities of Polynomial Threshold Gates
- 1.3 General Position and the Function Counting Theorem
- 1.3.1 Weierstrass's Approximation Theorem
- 1.3.2 Points in General Position
- 1.3.3 Function Counting Theorem
- 1.3.4 Separability in f-Space
- 1.4 Minimal PTG Realization of Arbitrary Switching Functions
- 1.5 Ambiguity and Generalization
- 1.6 Extreme Points
- 1.7 Summary
Problems
Chapter 2 Computational Capabilities of Artificial Neural Networks
- 2.0 Introduction
- 2.1 Some Preliminary Results on Neural Network Mapping Capabilities
- 2.1.1 Network Realization of Boolean Functions
- 2.1.2 Bounds on the Number of Functions Realizable by a Feedforward Network of LTG's
- 2.2 Necessary Lower Bounds on the Size of LTG Networks
- 2.2.1 Two Layer Feedforward Networks
- 2.2.2 Three Layer Feedforward Networks
- 2.2.3 Generally Interconnected Networks with no Feedback
- 2.3 Approximation Capabilities of Feedforward Neural Networks for Continuous Functions
- 2.3.1 Kolmogorov's Theorem
- 2.3.2 Single Hidden Layer Neural Networks are Universal Approximators
- 2.3.3 Single Hidden Layer Neural Networks are Universal Classifiers
- 2.4 Computational Effectiveness of Neural Networks
- 2.4.1 Algorithmic Complexity
- 2.4.2 Computational Energy
- 2.5 Summary
Problems
Chapter 3 Learning Rules
- 3.0 Introduction
- 3.1 Supervised Learning in a Single Unit Setting
- 3.1.1 Error Correction Rules
- Perceptron Learning Rule
- Generalizations of the Perceptron Learning Rule
- The Perceptron Criterion Function
- Mays Learning Rule
- Widrow-Hoff (α-LMS) Learning Rule
- 3.1.2 Other Gradient Descent-Based Learning Rules
- μ-LMS Learning Rule
- The μ-LMS as a Stochastic Process
- Correlation Learning Rule
- 3.1.3 Extension of the μ-LMS Rule to Units with Differentiable Activation Functions: Delta Rule
- 3.1.4 Adaptive Ho-Kashyap (AHK) Learning Rules
- 3.1.5 Other Criterion Functions
- 3.1.6 Extension of Gradient Descent-Based Learning to Stochastic Units
- 3.2 Reinforcement Learning
- 3.2.1 Associative Reward-Penalty Reinforcement Learning Rule
- 3.3 Unsupervised Learning
- 3.3.1 Hebbian Learning
- 3.3.2 Oja's Rule
- 3.3.3 Yuille et al. Rule
- 3.3.4 Linsker's Rule
- 3.3.5 Hebbian Learning in a Network Setting: Principal Component Analysis (PCA)
- PCA in a Network of Interacting Units
- PCA in a Single Layer Network with Adaptive Lateral Connections
- 3.3.6 Nonlinear PCA
- 3.4 Competitive Learning
- 3.4.1 Simple Competitive Learning
- 3.4.2 Vector Quantization
- 3.5 Self-Organizing Feature Maps: Topology Preserving Competitive Learning
- 3.5.1 Kohonen's SOFM
- 3.5.2 Examples of SOFMs
- 3.6 Summary
Problems
Chapter 4 Mathematical Theory of Neural Learning
- 4.0 Introduction
- 4.1 Learning as a Search Mechanism
- 4.2 Mathematical Theory of Learning in a Single Unit Setting
- 4.2.1 General Learning Equation
- 4.2.2 Analysis of the Learning Equation
- 4.2.3 Analysis of some Basic Learning Rules
- 4.3 Characterization of Additional Learning Rules
- 4.3.1 Simple Hebbian Learning
- 4.3.2 Improved Hebbian Learning
- 4.3.3 Oja's Rule
- 4.3.4 Yuille et al. Rule
- 4.3.5 Hassoun's Rule
- 4.4 Principal Component Analysis (PCA)
- 4.5 Theory of Reinforcement Learning
- 4.6 Theory of Simple Competitive Learning
- 4.6.1 Deterministic Analysis
- 4.6.2 Stochastic Analysis
- 4.7 Theory of Feature Mapping
- 4.7.1 Characterization of Kohonen's Feature Map
- 4.7.2 Self-Organizing Neural Fields
- 4.8 Generalization
- 4.8.1 Generalization Capabilities of Deterministic Networks
- 4.8.2 Generalization in Stochastic Networks
- 4.9 Complexity of Learning
- 4.10 Summary
Problems
Chapter 5 Adaptive Multilayer Neural Networks I
- 5.0 Introduction
- 5.1 Learning Rule for Multilayer Feedforward Neural Networks
- 5.1.1 Error Backpropagation Learning Rule
- 5.1.2 Global Descent-Based Error Backpropagation
- 5.2 Backprop Enhancements and Variations
- 5.2.1 Weights Initialization
- 5.2.2 Learning Rate
- 5.2.3 Momentum
- 5.2.4 Activation Function
- 5.2.5 Weight Decay, Weight Elimination, and Unit Elimination
- 5.2.6 Cross-Validation
- 5.2.7 Criterion Functions
- 5.3 Applications
- 5.3.1 NetTalk
- 5.3.2 Glove-Talk
- 5.3.3 Handwritten ZIP Code Recognition
- 5.3.4 ALVINN: A Trainable Autonomous Land Vehicle
- 5.3.5 Medical Diagnosis Expert Net
- 5.3.6 Image Compression and Dimensionality Reduction
- 5.4 Extensions of Backprop for Temporal Learning
- 5.4.1 Time-Delay Neural Networks
- 5.4.2 Backpropagation Through Time
- 5.4.3 Recurrent Back-Propagation
- 5.4.4 Time-Dependent Recurrent Back-Propagation
- 5.4.5 Real-Time Recurrent Learning
- 5.5 Summary
Problems
Chapter 6 Adaptive Multilayer Neural Networks II
- 6.0 Introduction
- 6.1 Radial Basis Function (RBF) Networks
- 6.1.1 RBF Networks versus Backprop Networks
- 6.1.2 RBF Network Variations
- 6.2 Cerebellar Model Articulation Controller (CMAC)
- 6.2.1 CMAC Relation to Rosenblatt's Perceptron and Other Models
- 6.3 Unit-Allocating Adaptive Networks
- 6.3.1 Hyperspherical Classifiers
- Restricted Coulomb Energy (RCE) Classifier
- Real-Time Trained Hyperspherical Classifier
- 6.3.2 Cascade-Correlation Network
- 6.4 Clustering Networks
- 6.4.1 Adaptive Resonance Theory (ART) Networks
- 6.4.2 Autoassociative Clustering Network
- 6.5 Summary
Problems
Chapter 7 Associative Neural Memories
- 7.0 Introduction
- 7.1 Basic Associative Neural Memory Models
- 7.1.1 Simple Associative Memories and their Associated Recording Recipes
- Correlation Recording Recipe
- A Simple Nonlinear Associative Memory Model
- Optimal Linear Associative Memory (OLAM)
- OLAM Error Correction Capabilities
- Strategies for Improving Memory Recording
- 7.1.2 Dynamic Associative Memories (DAM)
- Continuous-Time Continuous-State Model
- Discrete-Time Continuous-State Model
- Discrete-Time Discrete-State Model
- 7.2 DAM Capacity and Retrieval Dynamics
- 7.2.1 Correlation DAMs
- 7.2.2 Projection DAMs
- 7.3 Characteristics of High-Performance DAMs
- 7.4 Other DAM Models
- 7.4.1 Brain-State-in-a-Box (BSB) DAM
- 7.4.2 Non-Monotonic Activations DAM
- Discrete Model
- Continuous Model
- 7.4.3 Hysteretic Activations DAM
- 7.4.4 Exponential Capacity DAM
- 7.4.5 Sequence Generator DAM
- 7.4.6 Heteroassociative DAM
- 7.5 The DAM as a Gradient Net and its Application to Combinatorial Optimization
- 7.6 Summary
Problems
Chapter 8 Global Search Methods for Neural Networks
- 8.0 Introduction
- 8.1 Local versus Global Search
- 8.1.1 A Gradient Descent/Ascent Search Strategy
- 8.1.2 Stochastic Gradient Search: Global Search via Diffusion
- 8.2 Simulated Annealing-Based Global Search
- 8.3 Simulated Annealing for Stochastic Neural Networks
- 8.3.1 Global Convergence in a Stochastic Recurrent Neural Net: The Boltzmann Machine
- 8.3.2 Learning in Boltzmann Machines
- 8.4 Mean-Field Annealing and Deterministic Boltzmann Machines
- 8.4.1 Mean-Field Retrieval
- 8.4.2 Mean-Field Learning
- 8.5 Genetic Algorithms in Neural Network Optimization
- 8.5.1 Fundamentals of Genetic Search
- 8.5.2 Application of Genetic Algorithms to Neural Networks
- 8.6 Genetic Algorithm Assisted Supervised Learning
- 8.6.1 Hybrid GA/Gradient Descent Method for Feedforward Multilayer Net Training
- 8.6.2 Simulations
- 8.7 Summary
Problems
References
Index