**References**

Aarts, E. and Korst, J. (1989). *Simulated Annealing
and Boltzmann Machines*. Wiley, New York.

Abu-Mostafa, Y. S. (1986a). "Neural Networks for
Computing?" in *Neural Networks for Computing*, J. S. Denker,
Editor, **151**, 1-6. American Institute of Physics, New York.

Abu-Mostafa, Y. S. (1986b). "Complexity of Random
Problems," in *Complexity in Information Theory*, Y. Abu-Mostafa,
Editor, 115-131. Springer-Verlag, Berlin.

Abu-Mostafa, Y. S. and Psaltis, D. (1987). "Optical
Neural Computers," *Scientific American*, **256**(3), 88-95.

Abu Zitar, R. A. (1993). Machine Learning with Rule Extraction
by Genetic Assisted Reinforcement (REGAR): Application to Nonlinear Control.
Ph.D. Dissertation, Department of Electrical and Computer Engineering,
Wayne State University, Detroit, Michigan.

Abu Zitar, R. A. and Hassoun, M. H. (1993a). "Neurocontrollers
Trained with Rules Extracted by a Genetic Assisted Reinforcement Learning
System," *IEEE Transactions on Neural Networks*, to appear in
1994.

Abu Zitar, R. A. and Hassoun, M. H. (1993b). "Regulator
Control via Genetic Search Assisted Reinforcement," in *Proceedings
of the Fifth International Conference on Genetic Algorithms* (Urbana-Champaign
1993), S. Forrest, Editor, 254-262. Morgan Kaufmann, San Mateo.

Ackley, D. H. and Littman, M. S. (1990). "Generalization
and Scaling in Reinforcement Learning," in *Advances in Neural Information
Processing II* (Denver 1989), D. S. Touretzky, Editor, 550-557. Morgan
Kaufmann, San Mateo.

Ackley, D. H., Hinton, G. E., and Sejnowski, T. J. (1985).
"A Learning Algorithm for Boltzmann Machines," *Cognitive Science*,
**9**, 147-169.

Alander, J. T. (1992). "On Optimal Population Size
of Genetic Algorithms," *Proceedings of CompEuro 92* (The Hague,
Netherlands 1992), 65-70. IEEE Computer Society Press, New York.

Albert, A. (1972). *Regression and the Moore-Penrose
Pseudoinverse*. Academic Press, New York, NY.

Albus, J. S. (1971). "A Theory of Cerebellar Functions,"
*Mathematical Biosciences*, **10**, 25-61.

Albus, J. S. (1975). "A New Approach to Manipulator
Control: The Cerebellar Model Articulation Controller (CMAC)," *Journal
of Dynamic Systems Measurement and Control, Transactions of the ASME*,
**97**, 220-227.

Albus, J. S. (1979). "A Model of the Brain for Robot
Control, Part 2: A Neurological Model," *BYTE*, 54-95.

Albus, J. S. (1981). *Brains, Behavior, and Robotics*.
BYTE/McGraw-Hill, Peterborough.

Alkon, D. L., Blackwell, K. T., Vogl, T. P., and Werness,
S. A. (1993). "Biological Plausibility of Artificial Neural Networks:
Learning by Non-Hebbian Synapses," in *Associative Neural Memories:
Theory and Implementation*, M. H. Hassoun, Editor, 31-49. Oxford University
Press, New York.

Almeida, L. B. (1987). "A Learning Rule for Asynchronous
Perceptrons with Feedback in a Combinatorial Environment," in *IEEE
First International Conference on Neural Networks* (San Diego 1987),
M. Caudill and C. Butler, Editors, vol. II, 609-618. IEEE, New York.

Almeida, L. B. (1988). "Backpropagation in Perceptrons
with Feedback," in *Neural Computers* (Neuss 1987), R. Eckmiller
and C. von der Malsburg, Editors, 199-208. Springer-Verlag, Berlin.

Alspector, J. and Allen, B. B. (1987). "A Neuromorphic
VLSI Learning System," in *Advanced Research in VLSI: Proceedings
of the 1987 Stanford Conference*, P. Losleben, Editor, 313-349. MIT,
Cambridge.

Aluffi-Pentini, F., Parisi, V., and Zirilli, F. (1985).
"Global Optimization and Stochastic Differential Equations,"
*Journal of Optimization Theory and Applications*, **47**(1), 1-16.

Amari, S.-I. (1967). "Theory of Adaptive Pattern
Classifiers," *IEEE Trans. Electronic Computers*, **EC-16**,
299-307.

Amari, S.-I. (1968). *Geometrical Theory of Information*.
In Japanese. Kyoritsu-Shuppan, Tokyo.

Amari, S.-I. (1971). "Characteristics of Randomly
Connected Threshold-Element Networks and Network Systems," *IEEE
Proc.*, **59**(1), 35-47.

Amari, S.-I. (1972a). "Learning Patterns and Pattern
Sequences by Self-Organizing Nets of Threshold Elements," *IEEE
Trans. Computers*, **C-21**, 1197-1206.

Amari, S.-I. (1972b). "Characteristics of Random
Nets of Analog Neuron-Like Elements," *IEEE Transactions on Systems,
Man, and Cybernetics*, **SMC-2**(5), 643-657.

Amari, S.-I. (1974). "A Method of Statistical Neurodynamics,"
*Kybernetik*, **14**, 201-215.

Amari, S.-I. (1977a). "Neural Theory of Association
and Concept-Formation," *Biological Cybernetics*, **26**,
175-185.

Amari, S.-I. (1977b). "Dynamics of Pattern Formation
in Lateral-Inhibition Type Neural Fields," *Biological Cybernetics*,
**27**, 77-87.

Amari, S.-I. (1980). "Topographic Organization of
Nerve Fields," *Bull. of Math. Biology*, **42**, 339-364.

Amari, S.-I. (1983). "Field Theory of Self-Organizing
Neural Nets," *IEEE Trans. Syst., Man, Cybernetics*, **SMC-13**,
741-748.

Amari, S.-I. (1989). "Characteristics of Sparsely
Encoded Associative Memory," *Neural Networks*, **2**(6),
451-457.

Amari, S.-I. (1990). "Mathematical Foundations of
Neurocomputing," *Proceedings of the IEEE*, **78**(9), 1443-1463.

Amari, S.-I. (1993). "A Universal Theorem on Learning
Curves," *Neural Networks*, **6**(2), 161-166.

Amari, S.-I., Fujita, N., and Shinomoto, S. (1992). "Four
Types of Learning Curves," *Neural Computation*, **4**(2),
605-618.

Amari, S.-I. and Maginu, K. (1988). "Statistical
Neurodynamics of Associative Memory," *Neural Networks*, **1**(1),
63-73.

Amari, S.-I. and Murata, N. (1993). "Statistical
Theory of Learning Curves Under Entropic Loss Criterion," *Neural
Computation*, **5**(1), 140-153.

Amari, S.-I. and Yanai, H.-F. (1993). "Statistical
Neurodynamics of Various Types of Associative Nets," in *Associative
Neural Memories: Theory and Implementation*, M. H. Hassoun, Editor,
169-183. Oxford University Press, New York.

Amit, D. J. (1989). *Modeling Brain Function: The World
of Attractor Neural Networks*. Cambridge University Press, Cambridge.

Amit, D. J., Gutfreund, H., and Sompolinsky, H. (1985).
"Storing Infinite Numbers of Patterns in a Spin-Glass Model of Neural
Networks," *Physical Review Lett.*, **55**(14), 1530-1533.

Amit, D. J., Gutfreund, H., and Sompolinsky, H. (1987).
"Statistical Mechanics of Neural Networks Near Saturation," *Ann.
Phys. N. Y.*, **173**, 30-67.

Anderberg, M. R. (1973). *Cluster Analysis for Applications*.
Academic Press, NY.

Anderson, J. A. (1972). "A Simple Neural Network
Generating Interactive Memory," *Mathematical Biosciences*, **14**,
197-220.

Anderson, J. A. (1983). "Neural Models for Cognitive
Computations," *IEEE Transactions on Systems, Man, and Cybernetics*,
**SMC-13**, 799-815.

Anderson, J. A. (1993). "The BSB Model: A Simple
Nonlinear Autoassociative Neural Network," in *Associative Neural
Memories: Theory and Implementation*, M. H. Hassoun, Editor, 77-103.
Oxford University Press, New York.

Anderson, D. Z. and Erie, M. C. (1987). "Resonator
Memories and Optical Novelty Filters," *Optical Engineering*,
**26**, 434-444.

Anderson, J. A., Gately, M. T., Penz, P. A., and Collins,
D. R. (1990). "Radar Signal Categorization Using a Neural Network,"
*Proc. IEEE*, **78**, 1646-1657.

Anderson, J. A. and Murphy, G. L. (1986). "Psychological
Concepts in a Parallel System," *Physica*, **22-D**, 318-336.

Anderson, J. A., Silverstein, J. W., Ritz, S. A., and
Jones, R. S. (1977). "Distinctive Features, Categorical Perception,
and Probability Learning: Some Applications of a Neural Model," *Psychological
Review*, **84**, 413-451.

Angeniol, B., Vaubois, G., and Le Texier, Y.-Y. (1988).
"Self-Organizing Feature Maps and the Traveling Salesman Problem,"
*Neural Networks*, **1**(4), 289-293.

Apolloni, B. and De Falco, D. (1991). "Learning by
Asymmetric Parallel Boltzmann Machines," *Neural Computation*,
**3**(3), 402-408.

Apostol, T. M. (1957). *Mathematical Analysis: A Modern
Approach to Advanced Calculus*. Addison-Wesley, Reading, MA.

Bachmann, C. M., Cooper, L. N., Dembo, A., and Zeitouni,
O. (1987). "A Relaxation Model for Memory with High Storage Density,"
*Proc. of the National Academy of Sciences*, USA, **84**, 7529-7531.

Bäck, T. (1993). "Optimal Mutation Rates in
Genetic Search," *Proceedings of the Fifth International Conference
on Genetic Algorithms* (Urbana-Champaign 1993), S. Forrest, Editor,
2-8. Morgan Kaufmann, San Mateo.

Baird, B. (1990). "Associative Memory in a Simple
Model of Oscillating Cortex," in *Advances in Neural Information
Processing Systems 2* (Denver 1989), D. S. Touretzky, Editor, 68-75.
Morgan Kaufmann, San Mateo.

Baird, B. and Eeckman, F. (1993). "A Normal Form
Projection Algorithm for Associative Memory," in *Associative Neural
Memories: Theory and Implementation*, M. H. Hassoun, Editor, 135-166.
Oxford University Press, New York.

Baldi, P. (1991). "Computing with Arrays of Bell-Shaped
and Sigmoid Functions," in *Neural Information Processing Systems
3* (Denver 1990), R. P. Lippmann, J. E. Moody, and D. S. Touretzky,
Editors, 735-742. Morgan Kaufmann, San Mateo.

Baldi, P. and Chauvin, Y. (1991). "Temporal Evolution
of Generalization during Learning in Linear Networks," *Neural Computation*,
**3**(4), 589-603.

Baldi, P. and Hornik, K. (1989). "Neural Networks
and Principal Component Analysis: Learning from Examples Without Local
Minima," *Neural Networks*, **2**(1), 53-58.

Barto, A. G. (1985). "Learning by Statistical Cooperation
of Self-Interested Neuron-Like Computing Elements," *Human Neurobiology*,
**4**, 229-256.

Barto, A. G. and Anandan, P. (1985). "Pattern Recognizing
Stochastic Learning Automata," *IEEE Trans. on Systems, Man, and
Cybernetics*, **SMC-15**, 360-375.

Barto, A. G. and Jordan, M. I. (1987). "Gradient
Following without Backpropagation in Layered Networks," in *IEEE
First International Conference on Neural Networks* (San Diego 1987),
M. Caudill and C. Butler, Editors, vol. II, 629-636. IEEE, New York.

Barto, A. G. and Singh, S. P. (1991). "On The Computational
Economics of Reinforcement Learning," in *Connectionist Models:
Proceedings of the 1990 Summer School* (Pittsburgh 1990), D. S. Touretzky,
J. L. Elman, T. J. Sejnowski, and G. E. Hinton, Editors, 35-44. Morgan
Kaufmann, San Mateo.

Barto, A. G., Sutton, R. S., and Anderson, C. W. (1983).
"Neuronlike Adaptive Elements That Can Solve Difficult Learning Control
Problems," *IEEE Trans. Systems, Man, and Cybernetics*, **SMC-13**(5),
834-846.

Batchelor, B. G. (1969). Learning Machines for Pattern
Recognition, Ph.D. thesis, University of Southampton, Southampton, England.

Batchelor, B. G. (1974). *Practical Approach to Pattern
Classification*. Plenum, New York.

Batchelor, B. G. and Wilkins, B. R. (1968). "Adaptive
Discriminant Functions," *Pattern Recognition*, IEEE Conf. Publ.
**42**, 168-178.

Battiti, R. (1992). "First- and Second-Order Methods
for Learning: Between Steepest Descent and Newton's Method," *Neural
Computation*, **4**(2), 141-166.

Baum, E. B. (1988). "On the Capabilities of Multilayer
Perceptrons," *Journal of Complexity*, **4**, 193-215.

Baum, E. (1989). "A Proposal for More Powerful Learning
Algorithms," *Neural Computation*, **1**(2), 201-207.

Baum, E. and Haussler, D. (1989). "What Size Net
Gives Valid Generalization?" *Neural Computation*, **1**(1),
151-160.

Baum, E. B. and Wilczek, F. (1988). "Supervised Learning
of Probability Distributions by Neural Networks," in *Neural Information
Processing Systems* (Denver 1987), D. Z. Anderson, Editor, 52-61. American
Institute of Physics, New York.

Baxt, W. G. (1990). "Use of Artificial Neural Network
for Data Analysis in Clinical Decision-Making: The Diagnosis of Acute Coronary
Occlusion," *Neural Computation*, **2**(4), 480-489.

Becker, S. and Le Cun, Y. (1989). "Improving the
Convergence of Back-Propagation Learning with Second Order Methods,"
in *Proceedings of the 1988 Connectionist Models Summer School* (Pittsburgh
1988), D. Touretzky, G. Hinton, and T. Sejnowski, Editors, 29-37. Morgan
Kaufmann, San Mateo.

Beckman, F. S. (1964). "The Solution of Linear Equations
by the Conjugate Gradient Method," in *Mathematical Methods for
Digital Computers*, A. Ralston and H. S. Wilf, Editors. Wiley, New York.

Belew, R., McInerney, J., and Schraudolph, N. N. (1990).
"Evolving Networks: Using the Genetic Algorithm with Connectionist
Learning," CSE Technical Report CS90-174, University of California,
San Diego.

Benaim, M. (1994). "On Functional Approximation with
Normalized Gaussian Units," *Neural Computation*, **6**(2),
319-333.

Benaim, M. and Tomasini, L. (1992). "Approximating
Functions and Predicting Time Series with Multi-Sigmoidal Basis Functions,"
in *Artificial Neural Networks*, I. Aleksander and J. Taylor, Editors,
vol. 1, 407-411. Elsevier Science Publisher B. V., Amsterdam.

Bilbro, G. L., Mann, R., Miller, T. K., Snyder, W. E.,
van den Bout, D. E., and White, M. (1989). "Optimization by Mean Field
Annealing," in *Advances in Neural Information Processing Systems
I* (Denver 1988), D. S. Touretzky, Editor, 91-98. Morgan Kaufmann, San Mateo.

Bilbro, G. L. and Snyder, W. E. (1989). "Range Image
Restoration Using Mean Field Annealing," in *Advances in Neural
Information Processing Systems I* (Denver 1988), D. S. Touretzky, Editor, 594-601.
Morgan Kaufmann, San Mateo.

Bilbro, G. L., Snyder, W. E., Garnier, S. J., and Gault,
J. W. (1992). "Mean Field Annealing: A Formalism for Constructing
GNC-like Algorithms," *IEEE Transactions on Neural Networks*,
**3**(1), 131-138.

Bishop, C. (1991). "Improving the Generalization
Properties of Radial Basis Function Neural Networks," *Neural Computation*,
**3**(4), 579-588.

Bishop, C. (1992). "Exact Calculation of the Hessian
Matrix for the Multilayer Perceptron," *Neural Computation*,
**4**(4), 494-501.

Block, H. D. and Levin, S. A. (1970). "On the Boundedness
of an Iterative Procedure for Solving a System of Linear Inequalities,"
*Proc. American Mathematical Society*, **26**, 229-235.

Blum, A. L. and Rivest, R. (1989). "Training a 3-Node
Neural Network is NP-Complete," *Proceedings of the 1988 Workshop
on Computational Learning Theory*, 9-18. Morgan Kaufmann, San Mateo.

Blum, A. L. and Rivest, R. (1992). "Training a 3-Node
Neural Network is NP-Complete," *Neural Networks*, **5**(1),
117-127.

Blumer, A., Ehrenfeucht, A., Haussler, D., and Warmuth,
M. (1989). "Learnability and the Vapnik-Chervonenkis Dimension,"
*JACM*, **36**(4), 929-965.

Boole, G. (1854). *An Investigation of the Laws of Thought*.
Dover, NY.

Bounds, D. G., Lloyd, P. J., Mathew, B., and Wadell, G.
(1988). "A Multilayer Perceptron Network for the Diagnosis of Low
Back Pain," in *Proc. IEEE International Conference on Neural Networks*
(San Diego 1988), vol. II, 481-489.

Bourlard, H. and Kamp, Y. (1988). "Auto-Association
by Multilayer Perceptrons and Singular Value Decomposition," *Biological
Cybernetics*, **59**, 291-294.

van den Bout, D. E. and Miller, T. K. (1988). "A
Traveling Salesman Objective Function that Works," in *IEEE International
Conference on Neural Networks* (San Diego 1988), vol. II, 299-303. IEEE,
New York.

van den Bout, D. E. and Miller, T. K. (1989). "Improving
the Performance of the Hopfield-Tank Neural Network Through Normalization
and Annealing," *Biological Cybernetics*, **62**, 129-139.

Bromley, J. and Denker, J. S. (1993). "Improving
Rejection Performance on Handwritten Digits by Training with 'Rubbish',"
*Neural Computation*, **5**(3), 367-370.

Broomhead, D. S. and Lowe, D. (1988). "Multivariate
Functional Interpolation and Adaptive Networks," *Complex Systems*,
**2**, 321-355.

Brown, R. R. (1959). "A Generalized Computer Procedure
for the Design of Optimum Systems: Parts I and II," *AIEE Transactions,
Part I: Communications and Electronics*, **78**, 285-293.

Brown, R. J. (1964). *Adaptive Multiple-Output Threshold
Systems and Their Storage Capacities*, Ph.D. Thesis, Tech. Report 6771-1,
Stanford Electron. Labs, Stanford University, CA.

Brown, M., Harris, C. J., and Parks, P. (1993). "The
Interpolation Capabilities of the Binary CMAC," *Neural Networks*,
**6**(3), 429-440.

Bryson, A. E. and Denham, W. F. (1962). "A Steepest-Ascent
Method for Solving Optimum Programming Problems," *J. Applied Mechanics*,
**29**(2), 247-257.

Bryson, A. E. and Ho, Y.-C. (1969). *Applied Optimal
Control*. Blaisdell, New York.

Burke, L. I. (1991). "Clustering Characterization
of Adaptive Resonance," *Neural Networks*, **4**(4), 485-491.

Burshtein, D. (1993). "Nondirect Convergence Analysis
of the Hopfield Associative Memory," in *Proc. World Congress on
Neural Networks* (Portland 1993), vol. II, 224-227. LEA, Hillsdale,
NJ.

Butz, A. R. (1967). "Perceptron Type Learning Algorithms
in Nonseparable Situations," *J. Math Anal. and Appl.*, **17**,
560-576. Also, see Ph.D. Dissertation, University of Minnesota, 1965.

Cameron, S. H. (1960). "An Estimate of the Complexity
Requisite in a Universal Decision Network," Wright Air Development
Division, Report 60-600, 197-212.

Cannon Jr., R. H. (1967). *Dynamics of Physical Systems*.
McGraw-Hill, New York.

Carnahan, B., Luther, H. A., and Wilkes, J. O. (1969).
*Applied Numerical Methods*. John Wiley and Sons, New York.

Carpenter, G. A. and Grossberg, S. (1987a). "A Massively
Parallel Architecture for a Self-Organizing Neural Pattern Recognition
Machine," *Computer Vision, Graphics, and Image Processing*,
**37**, 54-115.

Carpenter, G. A. and Grossberg, S. (1987b). "ART2:
Self-Organization of Stable Category Recognition Codes for Analog Input
Patterns," *Applied Optics*, **26**(23), 4919-4930.

Carpenter, G. A. and Grossberg, S. (1990). "ART3:
Hierarchical Search Using Chemical Transmitters in Self-Organizing Pattern
Recognition Architectures," *Neural Networks*, **3**(2), 129-152.

Carpenter, G. A., Grossberg, S., and Reynolds, J. H. (1991a).
"ARTMAP: Supervised Real-Time Learning and Classification of Nonstationary
Data by a Self-Organizing Neural Network," *Neural Networks*,
**4**(5), 565-588.

Carpenter, G. A., Grossberg, S., and Rosen, D. B. (1991b).
"ART2-A: An Adaptive Resonance Algorithm for Rapid Category Learning
and Recognition," *Neural Networks*, **4**(4), 493-504.

Casasent, D. and Telfer, B. (1987). "Associative
Memory Synthesis, Performance, Storage Capacity, and Updating: New Heteroassociative
Memory Results," *Proc. SPIE, Intelligent Robots and Computer Vision*,
**848**, 313-333.

Casdagli, M. (1989). "Nonlinear Prediction of Chaotic
Time Series," *Physica*, **35D**, 335-356.

Cater, J. P. (1987). "Successfully Using Peak Learning
Rates of 10 (and Greater) in Back-Propagation Networks with the Heuristic
Learning Algorithm," in *Proc. IEEE First International Conference
on Neural Networks* (San Diego 1987), M. Caudill and C. Butler, Editors,
vol. II, 645-651. IEEE, New York.

Cauchy, A. (1847). "Méthode Générale
pour la Résolution des Systèmes d'Équations Simultanées,"
*Comptes Rendus Hebdomadaires des Séances de l'Académie
des Sciences*, Paris, **25**, 536-538.

Caudell, T. P. and Dolan, C. P. (1989). "Parametric
Connectivity: Training of Constrained Networks Using Genetic Algorithms,"
in *Proceedings of the Third International Conference on Genetic Algorithms*
(Arlington 1989), J. D. Schaffer, Editor, 370-374. Morgan Kaufmann, San
Mateo.

Cetin, B. C., Burdick, J. W., and Barhen, J. (1993a).
"Global Descent Replaces Gradient Descent to Avoid Local Minima Problem
in Learning With Artificial Neural Networks," in *IEEE International
Conference on Neural Networks* (San Francisco 1993), vol. II, 836-842.
IEEE, New York.

Cetin, B. C., Barhen, J., and Burdick, J. W. (1993b).
"Terminal Repeller Unconstrained Subenergy Tunneling (TRUST) for Fast
Global Optimization," *Journal of Optimization Theory and Applications*,
**77**, 97-126.

Chalmers, D. J. (1991). "The Evolution of Learning:
An Experiment in Genetic Connectionism," in *Connectionist Models:
Proceedings of the 1990 Summer School* (Pittsburgh 1990), D. S. Touretzky,
J. L. Elman, T. J. Sejnowski, and G. E. Hinton, Editors, 81-90. Morgan Kaufmann, San Mateo.

Chan, L. W. and Fallside, F. (1987). "An Adaptive
Training Algorithm for Backpropagation Networks," *Computer Speech
and Language*, **2**, September - December, 205-218.

Changeux, J. P. and Danchin, A. (1976). "Selective
Stabilization of Developing Synapses as a Mechanism for the Specification
of Neural Networks," *Nature* (London), **264**, 705-712.

Chauvin, Y. (1989). "A Back-Propagation Algorithm
with Optimal Use of Hidden Units," in *Advances in Neural Information
Processing Systems 1* (Denver 1988), D. S. Touretzky, Editor, 519-526.
Morgan Kaufmann, San Mateo.

Chiang, T.-S., Hwang, C.-R., and Sheu, S.-J. (1987). "Diffusion
for Global Optimization in R*n*,"
*SIAM J. Control Optimization*, **25**(3), 737-752.

Chiueh, T. D. and Goodman, R. M. (1988). "High Capacity
Exponential Associative Memory," in *Proc. IEEE Int. Conference
on Neural Networks* (San Diego 1988), vol. I, 153-160. IEEE Press, New
York.

Chiueh, T. D. and Goodman, R. M. (1991). "Recurrent
Correlation Associative Memories," *IEEE Transactions on Neural
Networks*, **2**(2), 275-284.

Cichocki, A. and Unbehauen, R. (1993). *Neural Networks
for Optimization and Signal Processing*. John Wiley & Sons, Chichester,
England.

Cohen, M. A. and Grossberg, S. (1983). "Absolute
Stability of Global Pattern Formation and Parallel Memory Storage by Competitive
Neural Networks," *IEEE Trans. Systems, Man, and Cybernetics*,
**SMC-13**, 815-826.

Cohn, D. and Tesauro, G. (1991). "Can Neural Networks
do Better than the Vapnik-Chervonenkis Bounds?" in *Neural Information
Processing Systems 3* (Denver 1990), R. P. Lippmann, J. E. Moody, and
D. S. Touretzky, Editors, 911-917. Morgan Kaufmann, San Mateo.

Cohn, D. and Tesauro, G. (1992). "How Tight are the
Vapnik-Chervonenkis Bounds?" *Neural Computation*, **4**(2),
249-269.

Cooper, P. W. (1962). "The Hypersphere in Pattern
Recognition," *Information and Control*, **5**, 324-346.

Cooper, P. W. (1966). "A Note on Adaptive Hypersphere
Decision Boundary," *IEEE Transactions on Electronic Computers*
(December 1966), 948-949.

Cortes, C. and Hertz, J. A. (1989). "A Network System
for Image Segmentation," in *International Joint Conference on Neural
Networks* (Washington 1989), vol. I, 121-127. IEEE, New York.

Cotter, N. E. and Guillerm, T. J. (1992). "The CMAC
and a Theorem of Kolmogorov," *Neural Networks*, **5**(2),
221-228.

Cottrell, G. W. (1991). "Extracting Features from
Faces Using Compression Networks: Face, Identity, Emotion, and Gender Recognition
Using Holons," in *Connectionist Models: Proceedings of the 1990
Summer School* (Pittsburgh 1990), D. S. Touretzky, J. L. Elman, T. J.
Sejnowski, and G. E. Hinton, Editors, 328-337. Morgan Kaufmann, San Mateo.

Cottrell, M. and Fort, J. C. (1986). "A Stochastic
Model of Retinotopy: A Self-Organizing Process," *Biol. Cybern.*,
**53**, 405-411.

Cottrell, G. W. and Munro, P. (1988). "Principal
Component Analysis of Images via Backpropagation," invited talk, in
*Proceedings of the Society of Photo-Optical Instrumentation Engineers*
(Cambridge 1988), vol. **1001**, 1070-1077.

Cottrell, G. W., Munro, P., and Zipser, D. (1987). "Learning
Internal Representations from Gray-Scale Images: An Example of Extensional
Programming," in *Ninth Annual Conference of the Cognitive Science
Society* (Seattle 1987), 462-473. Erlbaum, Hillsdale.

Cottrell, G. W., Munro, P., and Zipser, D. (1989). "Image
Compression by Back Propagation: An Example of Extensional Programming,"
in *Models of Cognition: A Review of Cognitive Science*, vol. 1, N.
Sharkey, Editor. Ablex, Norwood.

Cover, T. M. (1964). *Geometrical and Statistical Properties
of Linear Threshold Devices*, Ph.D. Dissertation, Tech. Report 6107-1,
Stanford Electron. Labs, Stanford University, CA.

Cover, T. M. (1965). "Geometrical and Statistical
Properties of Systems of Linear Inequalities with Applications in Pattern
Recognition," *IEEE Trans. Elec. Comp.*, **EC-14**, 326-334.

Cover, T. M. (1968). "Rates of Convergence of Nearest
Neighbor Decision Procedures," *Proc. First Annual Hawaii Conference
on Systems Theory*, 413-415.

Cover, T. M. and Hart, P. E. (1967). "Nearest Neighbor
Pattern Classification," *IEEE Transactions on Information Theory*,
**IT-13**(1), 21-27.

Crisanti, A. and Sompolinsky, H. (1987). "Dynamics
of Spin Systems with Randomly Asymmetric Bounds: Langevin Dynamics and
a Spherical Model," *Phys. Rev. A.*, **36**, 4922.

Crowder III, R. S. (1991). "Predicting the Mackey-Glass
Time Series with Cascade-Correlation Learning," in *Connectionist
Models: Proceedings of the 1990 Summer School* (Pittsburgh 1990), D.
S. Touretzky, J. L. Elman, T. J. Sejnowski, and G. E. Hinton, Editors,
117-123. Morgan Kaufmann, San Mateo.

Cybenko, G. (1989). "Approximation by Superpositions
of a Sigmoidal Function," *Math. Control Signals Systems*, **2**,
303-314.

Darken, C. and Moody, J. (1991). "Note on Learning
Rate Schedules for Stochastic Optimization," in *Neural Information
Processing Systems 3* (Denver 1990), R. P. Lippmann, J. E. Moody, and
D. S. Touretzky, Editors, 832-838. Morgan Kaufmann, San Mateo.

Darken, C. and Moody, J. (1992). "Towards Faster
Stochastic Gradient Search," in *Neural Information Processing Systems
4* (Denver 1991), J. E. Moody, S. J. Hanson, and R. P. Lippmann, Editors,
1009-1016. Morgan Kaufmann, San Mateo.

Davis, L. (1987). *Genetic Algorithms and Simulated
Annealing*. Pitman, London, England.

Davis, T. E. and Principe, J. C. (1993). "A Markov
Chain Framework for the Simple Genetic Algorithm," *Evolutionary
Computation*, **1**(3), 269-288.

D'Azzo, J. J. and Houpis, C. H. (1988). *Linear Control
Systems Analysis and Design* (3rd edition). McGraw-Hill, New York.

De Jong, K. (1975). An Analysis of the Behavior of a Class
of Genetic Adaptive Systems. Doctoral Thesis, Department of Computer and
Communications Sciences, University of Michigan, Ann Arbor.

De Jong, K. and Spears, W. (1993). "On the State
of Evolutionary Computation," in *Proceedings of the Fifth International
Conference on Genetic Algorithms* (Urbana-Champaign 1993), S. Forrest,
Editor, 618-623. Morgan Kaufmann, San Mateo.

Dembo, A. and Zeitouni, O. (1988). "High Density
Associative Memories," in *Neural Information Processing Systems*
(Denver 1987), D. Z. Anderson, Editor, 211-212. American Institute of Physics,
New York.

DeMers, D. and Cottrell, G. (1993). "Non-Linear Dimensionality
Reduction," in *Advances in Neural Information Processing Systems
5* (Denver 1992), S. J. Hanson, J. D. Cowan, and C. L. Giles, Editors,
550-587. Morgan Kaufmann, San Mateo.

Dennis Jr., J. E. and Schnabel, R. B. (1983). *Numerical
Methods for Unconstrained Optimization and Nonlinear Equations*. Prentice-Hall,
Englewood Cliffs.

Denoeux, T. and Lengellé, R. (1993). "Initializing
Back Propagation Networks with Prototypes," *Neural Networks*,
**6**(3), 351-363.

Derthick, M. (1984). "Variations on the Boltzmann
Machine," Technical Report CMU-CS-84-120, Department of Computer Science,
Carnegie Mellon University, Pittsburgh, PA.

Dertouzos, M. L. (1965). *Threshold Logic: A Synthesis
Approach*. MIT Press, Cambridge, MA.

Dickinson, B. W. (1991). *Systems: Analysis, Design,
and Computation*. Prentice-Hall, Englewood Cliffs, NJ.

Dickmanns, E. D. and Zapp, A. (1987). "Autonomous
High Speed Road Vehicle Guidance by Computer Vision," in *Proceedings
of the 10th World Congress on Automatic Control* (Munich, West Germany
1987), vol. 4, 221-226.

Drucker, H. and Le Cun, Y. (1992). "Improving Generalization
Performance Using Double Backpropagation," *IEEE Transactions on
Neural Networks*, **3**(6), 991-997.

Duda, R. O. and Hart, P. E. (1973). *Pattern Classification
and Scene Analysis*. John Wiley, New York.

Duda, R. O. and Singleton, R. C. (1964). "Training
a Threshold Logic Unit with Imperfectly Classified Patterns," *IRE
Western Electric Show and Convention Record*, Paper 3.2.

Durbin, R. and Willshaw, D. (1987). "An Analogue
Approach to The Traveling Salesman Problem Using an Elastic Net Method,"
*Nature*, **326**, 689-691.

Efron, D. (1964). "The Perceptron Correction Procedure
in Non-Separable Situations," Tech. Report. No. RADC-TDR-63-533.

Elman, J. L. and Zipser, D. (1988). "Learning the
Hidden Structure of Speech," *Journal of Acoustical Society of America*,
**83**, 1615-1626.

Everitt, B. S. (1980). *Cluster Analysis* (2nd edition).
Heinemann Educational Books, London.

Fahlman, S. E. (1989). "Fast Learning Variations
on Back-Propagation: An Empirical Study," in *Proceedings of the
1988 Connectionist Models Summer School* (Pittsburgh 1988), D. Touretzky,
G. Hinton, and T. Sejnowski, Editors, 38-51. Morgan Kaufmann, San Mateo.

Fahlman, S. E. and Lebiere, C. (1990). "The Cascade-Correlation
Learning Architecture," in *Advances in Neural Information Processing
Systems 2* (Denver 1989), D. S. Touretzky, Editor, 524-532. Morgan Kaufmann,
San Mateo.

Fakhr, W. (1993). "Optimal Adaptive Probabilistic
Neural Networks for Pattern Classification," Ph.D. Thesis, Department
of Electrical and Computer Engineering, University of Waterloo, Waterloo,
Canada.

Fang, Y. and Sejnowski, T. J. (1990). "Faster Learning
for Dynamic Recurrent Backpropagation," *Neural Computation*,
**2**(3), 270-273.

Farden, D. C. (1981). "Tracking Properties of Adaptive
Signal Processing Algorithms," *IEEE Trans. Acoust. Speech Signal
Proc.*, **ASSP-29**, 439-446.

Farhat, N. H. (1987). "Optoelectronic Analogs of
Self-Programming Neural Nets: Architectures and Methods for Implementing
Fast Stochastic Learning by Simulated Annealing," *Applied Optics*,
**26**, 5093-5103.

Feigenbaum, M. (1978). "Quantitative Universality
for a Class of Nonlinear Transformations," *J. Stat. Phys.*,
**19**, 25-52.

Fels, S. S. and Hinton, G. E. (1993). "Glove-Talk:
A Neural Network Interface Between a Data-Glove and a Speech Synthesizer,"
*IEEE Transactions on Neural Networks*, **4**(1), 2-8.

Finnoff, W. (1993). "Diffusion Approximations for
the Constant Learning Rate Backpropagation Algorithm and Resistance to
Local Minima," in *Advances in Neural Information Processing Systems
5* (Denver 1992), S. J. Hanson, J. D. Cowan, and C. L. Giles, Editors,
459-466. Morgan Kaufmann, San Mateo.

Finnoff, W. (1994). "Diffusion Approximations for
the Constant Learning Rate Backpropagation Algorithm and Resistance to
Local Minima," *Neural Computation*, **6**(2), 285-295.

Finnoff, W., Hergert, F., and Zimmermann, H. G. (1993). "Improving
Model Selection by Nonconvergent Methods," *Neural Networks*,
**6**(5), 771-783.

Fisher, M. L. (1981). "The Lagrangian Relaxation
Method for Solving Integer Programming Problems," *Manag. Sci.*,
**27**(1), 1-18.

Fix, E. and Hodges Jr., J. L. (1951). "Discriminatory
Analysis: Non-parametric Discrimination," Report 4, Project 21-49-004,
USAF School of Aviation Medicine, Randolph Field, Texas.

Franzini, M. A. (1987). "Speech Recognition with
Back Propagation," in *Proceedings of the Ninth Annual Conference
of the IEEE Engineering in Medicine and Biology Society* (Boston 1987),
1702-1703. IEEE, New York.

Frean, M. (1990). "The Upstart Algorithm: A Method
for Constructing and Training Feedforward Neural Networks," *Neural
Computation*, **2**(2), 198-209.

Fritzke, B. (1991). "Let it Grow - Self-Organizing
Feature Maps with Problem Dependent Cell Structure," in *Artificial
Neural Networks, Proc. of the 1991 Int. Conf. on Artificial Neural Networks*
(Espoo 1991), T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, Editors,
vol. I, 403-408. Elsevier Science Publishers B. V., Amsterdam.

Funahashi, K.-I. (1989). "On the Approximate Realization
of Continuous Mappings by Neural Networks," *Neural Networks*,
**2**(3), 183-192.

Funahashi, K.-I. (1990). "On the Approximate Realization
of Identity Mappings by 3-Layer Neural Networks (in Japanese)," *Trans.
IEICE A*, **J73-A**, 139-145.

Funahashi, K.-I. and Nakamura, Y. (1993). "Approximation
of Dynamical Systems by Continuous Time Recurrent Neural Networks,"
*Neural Networks*, **6**(6), 801-806.

Galland, C. C. and Hinton, G. E. (1991). "Deterministic
Boltzmann Learning in Networks with Asymmetric Connectivity," in *Connectionist
Models: Proceedings of the 1990 Summer School* (Pittsburgh 1990), D.
S. Touretzky, J. L. Elman, T. J. Sejnowski, and G. E. Hinton, Editors,
3-9. Morgan Kaufmann, San Mateo.

Gallant, S. I. (1993). *Neural Network Learning and
Expert Systems*. MIT Press, Cambridge, MA.

Gallant, S. I. and Smith, D. (1987). "Random Cells:
An Idea Whose Time Has Come and Gone ... and Come Again?" *Proceedings
of the IEEE International Conference on Neural Networks* (San Diego
1987), vol. II, 671-678.

Gamba, A. (1961). "Optimum Performance of Learning
Machines," *Proc. IRE*, **49**, 349-350.

Gantmacher, F. R. (1990). *The Theory of Matrices*,
vol. 1, 2nd edition. Chelsea, New York.

Gardner, E. (1986). "Structure of Metastable States
in Hopfield Model," *J. Physics A*, **19**, L1047-1052.

Garey, M. R. and Johnson, D. S. (1979). *Computers and Intractability:
A Guide to the Theory of NP-Completeness*. Freeman, New York.

Gelfand, S. B. and Mitter, S. K. (1991). "Recursive
Stochastic Algorithms for Global Optimization in R*d*,"
*SIAM J. Control and Optimization*, **29**(5), 999-1018.

Geman, S. (1979). "Some Averaging and Stability Results
for Random Differential Equations," *SIAM J. Applied Math.*,
**36**, 86-105.

Geman, S. (1980). "A Limit Theorem for the Norm of
Random Matrices," *Ann. Prob.*, **8**, 252-261.

Geman, S. (1982). "Almost Sure Stable Oscillations
in a Large System of Randomly Coupled Equations," *SIAM J. Appl.
Math.*, **42**(4), 695-703.

Geman, S. and Geman, D. (1984). "Stochastic Relaxation,
Gibbs Distributions, and The Bayesian Restoration of Images," *IEEE
Trans. Pattern Analysis and Machine Intelligence*, **6**, 721-741.

Geman, S. and Hwang, C. R. (1986). "Diffusions for
Global Optimization," *SIAM J. of Control and Optimization*,
**24**(5), 1031-1043.

Gerald, C. F. (1978). *Applied Numerical Analysis (2nd
edition)*. Addison-Wesley, Reading, MA.

Geszti, T. (1990). *Physical Models of Neural Networks*.
World Scientific, Singapore.

Gill, P. E., Murray, W., and Wright, M. H. (1981). *Practical
Optimization*. Academic Press, New York.

Girosi, F. and Poggio, T. (1989). "Representation
Properties of Networks: Kolmogorov's Theorem is Irrelevant," *Neural
Computation*, **1**(4), 465-469.

Glanz, F. H. and Miller, W. T. (1987). "Shape Recognition
Using a CMAC Based Learning System," *Proc. SPIE, Robotics and Intelligent
Systems*, **848**, 294-298.

Glanz, F. H. and Miller, W. T. (1989). "Deconvolution
and Nonlinear Inverse Filtering Using a Neural Network," *Proceedings
of the International Conference on Acoustics and Signal Processing*,
vol. 4, 2349-2352.

Glauber, R. J. (1963). "Time Dependent Statistics
of the Ising Model," *Journal of Mathematical Physics*, **4**,
294-307.

Goldberg, D. (1989). *Genetic Algorithms*. Addison-Wesley,
Reading, MA.

Golden, R. M. (1986). "The "Brain-State-in-a-Box"
Neural Model is a Gradient Descent Algorithm," *Journal of Mathematical
Psychology*, **30**, 73-80.

Golden, R. M. (1993). "Stability and Optimization
Analysis of the Generalized Brain-State-in-a-Box Neural Network Model,"
*J. Math. Psychol.*, **37**, 282-298.

Goldman, L., Cook, E. F., Brand, D. A., Lee, T. H., Rouan,
G. W., Weisberg, M. C., Acampora, D., Stasiulewicz, C., Walshon, J., Terranova,
G., Gottlieb, L., Kobernick, M., Goldstein-Wayne, B., Copen, D., Daley,
K., Brandt, A. A., Jones, D., Mellors, J., and Jakubowski, R. (1988). "A
Computer Protocol to Predict Myocardial Infarction in Emergency Department
Patients with Chest Pain," *N. Engl. J. Med.*, **318**, 797-803.

Gonzalez, R. C. and Wintz, P. (1987). *Digital Image
Processing*, 2nd edition. Addison-Wesley, Reading, MA.

Gordon, M. B., Peretto, P., and Berchier, D. (1993). "Learning
Algorithms for Perceptrons from Statistical Physics," *J. Physique
I*, **3**, 377-387.

Gorse, D. and Shepherd, A. (1992). "Adding Stochastic
Search to Conjugate Gradient Algorithms," in *Proceedings of 3rd
International Conf. on Parallel Applications in Statistics and Economics*,
Prague: Tiskrenfk Zacody.

Gray, R. M. (1984). "Vector Quantization," *IEEE
ASSP Magazine*, **1**, 4-29.

Greenberg, H. J. (1988). "Equilibria of the Brain-State-in-a-Box
(BSB) Neural Model," *Neural Networks*, **1**(4), 323-324.

Grefenstette, J. J. (1986). "Optimization of Control
Parameters for Genetic Algorithms," *IEEE Trans. on Systems, Man
and Cybernetics*, **SMC-16**, 122-128.

Grossberg, S. (1969). "On Learning and Energy-Entropy
Dependence in Recurrent and Nonrecurrent Signed Networks," *Journal
of Statistical Physics*, **1**, 319-350.

Grossberg, S. (1976). "Adaptive Pattern Classification
and Universal Recoding: I. Parallel Development and Coding of Neural Feature
Detectors," *Biological Cybernetics*, **23**, 121-134.

Grossberg, S. (1976). "Adaptive Pattern Classification
and Universal Recoding, II: Feedback, Expectation, Olfaction, and Illusions,"
*Biological Cybernetics*, **23**, 187-202.

Hanson, S. J. and Burr, D. J. (1987). "Knowledge
Representation in Connectionist Networks," Bellcore Technical Report.

Hanson, S. J. and Burr, D. J. (1988). "Minkowski-r
Back-Propagation: Learning in Connectionist Models with Non-Euclidean Error
Signals," in *Neural Information Processing Systems* (Denver
1987), D. Z. Anderson, Editor, 348-357. American Institute of Physics,
New York.

Hanson, S. J. and Pratt, L. (1989). "A Comparison
of Different Biases for Minimal Network Construction with Back-Propagation,"
in *Advances in Neural Information Processing Systems 1* (Denver 1988),
D. S. Touretzky, Editor, 177-185. Morgan Kaufmann, San Mateo.

Hardy, G., Littlewood, J., and Polya, G. (1952). *Inequalities*.
Cambridge University Press, Cambridge, England.

Harp, S. A., Samad, T., and Guha, A. (1989). "Towards
the Genetic Synthesis of Neural Networks," in *Proceedings of the
Third International Conference on Genetic Algorithms* (Arlington 1989),
J. D. Schaffer, Editor, 360-369. Morgan Kaufmann, San Mateo.

Harp, S. A., Samad, T., and Guha, A. (1990). "Designing
Application-Specific Neural Networks Using the Genetic Algorithm,"
in *Advances in Neural Information Processing Systems 2* (Denver 1989),
D. S. Touretzky, Editor, 447-454. Morgan Kaufmann, San Mateo.

Hartigan, J. A. (1975). *Clustering Algorithms*.
John Wiley & Sons, New York.

Hartman, E. J. and Keeler, J. D. (1991a). "Semi-local
Units for Prediction," in *Proceedings of the International Joint
Conference on Neural Networks* (Seattle 1991), vol. II, 561-566. IEEE,
New York.

Hartman, E. J. and Keeler, J. D. (1991b). "Predicting
the Future: Advantages of Semilocal Units," *Neural Computation*,
**3**(4), 566-578.

Hartman, E. J., Keeler, J. D., and Kowalski, J. M. (1990).
"Layered Neural Networks with Gaussian Hidden Units as Universal Approximators,"
*Neural Computation*, **2**(2), 210-215.

Hassoun, M. H. (1988). "Two-Level Neural Network
for Deterministic Logic Processing," *Proc. SPIE, Optical Computing
and Nonlinear Materials*, **881**, 258-264.

Hassoun, M. H. (1989a). "Adaptive Dynamic Heteroassociative
Neural Memories for Pattern Classification," in *Proc. SPIE, Optical
Pattern Recognition*, H-K Liu, Editor, **1053**, 75-83.

Hassoun, M. H. (1989b). "Dynamic Heteroassociative
Neural Memories," *Neural Networks*, **2**(4), 275-287.

Hassoun, M. H., Editor (1993). *Associative Neural Memories:
Theory and Implementation*. Oxford University Press, New York, NY.

Hassoun, M. H. and Clark, D. W. (1988). "An Adaptive
Attentive Learning Algorithm for Single-Layer Neural Networks," *Proc.
IEEE Annual Conf. Neural Networks*, vol. I, 431-440.

Hassoun, M. H. and Song, J. (1992). "Adaptive Ho-Kashyap
Rules for Perceptron Training," *IEEE Transactions on Neural Networks*,
**3**(1), 51-61.

Hassoun, M. H. and Song, J. (1993a). "Multilayer
Perceptron Learning Via Genetic Search for Hidden Layer Activations,"
in *Proceedings of the World Congress on Neural Networks* (Portland
1993), vol. III, 437-444.

Hassoun, M. H. and Song, J. (1993b). "Hybrid Genetic/Gradient
Search for Multilayer Perceptron Training." *Optical Memory and
Neural Networks, Special Issue on Architectures, Designs, Algorithms and
Devices for Optical Neural Networks* (Part 1), **2**(1), 1-15.

Hassoun, M. H. and Spitzer, A. R. (1988). "Neural
Network Identification and Extraction of Repetitive Superimposed Pulses
in Noisy 1-D Signals," *Neural Networks*, **1**, Supplement
1: Abstracts of the *First Annual Meeting of the International Neural
Networks Society* (Boston 1988), 443. Pergamon Press, New York.

Hassoun, M. H., and Youssef, A. M. (1989). "A High-Performance
Recording Algorithm for Hopfield Model Associative Memories," *Optical
Engineering*, **28**(1), 46-54.

Hassoun, M. H., Song, J., Shen, S.-M., and Spitzer, A.
R. (1990). "Self-Organizing Autoassociative Dynamic Multiple-Layer
Neural Net for the Decomposition of Repetitive Superimposed Signals,"
*Proceedings of the International Joint Conference on Neural Networks*
(Washington, D. C. 1990), vol. I, 621-626. IEEE, New York.

Hassoun, M. H., Wang, C., and Spitzer, A. R. (1992). "Electromyogram
Decomposition via Unsupervised Dynamic Multi-Layer Neural Network,"
in *Proceedings of the International Joint Conference on Neural Networks*
(Baltimore 1992), vol. II, 405-412. IEEE, New York.

Hassoun, M. H., Wang, C., and Spitzer, A. R. (1994a).
"NNERVE: Neural Network Extraction of Repetitive Vectors for Electromyography,
Part I: Algorithm," *IEEE Transactions on Biomedical Engineering*,
to appear.

Hassoun, M. H., Wang, C., and Spitzer, A. R. (1994b).
"NNERVE: Neural Network Extraction of Repetitive Vectors for Electromyography,
Part II: Performance Analysis," *IEEE Transactions on Biomedical
Engineering*, to appear.

Haussler, D., Kearns, M., Opper, M., and Schapire, R.
(1992). "Estimating Average-Case Learning Curves Using Bayesian, Statistical
Physics and VC Dimension Methods," in *Neural Information Processing
Systems 4* (Denver 1991), J. E. Moody, S. J. Hanson, and R. P. Lippmann,
Editors, 855-862. Morgan Kaufmann, San Mateo.

Hebb, D. (1949). *The Organization of Behavior*.
Wiley, New York.

Hecht-Nielsen, R. (1987). "Kolmogorov's Mapping Neural
Network Existence Theorem," in *Proc. Int. Conf. Neural Networks*
(San Diego 1987), vol. III, 11-14, IEEE Press, New York.

Hecht-Nielsen, R. (1990). *Neurocomputing*. Addison-Wesley,
Reading, MA.

van Hemmen, J. L., Ioffe, L. B., and Vaas, M. (1990).
"Increasing the Efficiency of a Neural Network through Unlearning,"
*Physica*, **163A**, 368-392.

Hergert, F., Finnoff, W., and Zimmermann, H. G. (1992).
"A Comparison of Weight Elimination Methods for Reducing Complexity
in Neural Networks," in *Proceedings of the International Joint
Conference on Neural Networks* (Baltimore 1992), vol. III, 980-987.
IEEE, New York.

Hertz, J., Krogh, A., and Palmer, R. G. (1991). *Introduction
to the Theory of Neural Computation*. Addison-Wesley, New York.

Heskes, T. M. and Kappen, B. (1991). "Learning Processes
in Neural Networks," *Physical Review A*, **44**(4), 2718-2726.

Heskes, T. M. and Kappen, B. (1993a). "Error Potentials
for Self-Organization," in *IEEE International Conference on Neural
Networks* (San Francisco 1993), vol. III, 1219-1223. IEEE, New York.

Heskes, T. M. and Kappen, B. (1993b). "On-Line Learning
Processes in Artificial Neural Networks," in *Mathematical Approaches
to Neural Networks*, J. G. Taylor, Editor, 199-233. Elsevier Science
Publishers B. V., Amsterdam.

Hestenes, M. R. and Stiefel, E. (1952). "Methods
of Conjugate Gradients for Solving Linear Systems," *J. Res. Nat.
Bur. Standards*, **49**, 409-436.

Hinton, G. E. (1986). "Learning Distributed Representations
of Concepts," in *Proceedings of the 8th Annual Conference of the
Cognitive Science Society* (Amherst 1986), 1-12. Erlbaum, Hillsdale.

Hinton, G. E. (1987a). "Connectionist Learning Procedures,"
Technical Report CMU-CS-87-115, Carnegie-Mellon University, Computer Science
Department, Pittsburgh, PA.

Hinton, G. E. (1987b). "Learning Translation Invariant
Recognition in a Massively Parallel Network," in *PARLE: Parallel
Architectures and Languages, Europe Lecture Notes in Computer Science*,
G. Goos and J. Hartmanis, Editors, 1-13. Springer-Verlag, Berlin.

Hinton, G. E. and Nowlan, S. J. (1987). "How Learning
can Guide Evolution," *Complex Systems*, **1**, 495-502.

Hinton, G. E. and Sejnowski, T. J. (1983). "Optimal
Perceptual Inference," in *Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition* (Washington 1983), 448-453.
IEEE, New York.

Hinton, G. E. and Sejnowski, T. J. (1986). "Learning
and Relearning in Boltzmann Machines," in *Parallel Distributed
Processing: Explorations in the Microstructure of Cognition*, vol. I,
D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, MIT Press,
Cambridge (1986).

Hirsch, M. W. (1989). "Convergent Activation Dynamics
in Continuous Time Networks," *Neural Networks*, **2**(5),
331-349.

Hirsch, M. and Smale, S. (1974). *Differential Equations,
Dynamical Systems, and Linear Algebra*. Academic Press, New York.

Ho, Y.-C. and Kashyap, R. L. (1965). "An Algorithm
for Linear Inequalities and its Applications," *IEEE Trans. Elec.
Comp.*, **EC-14**, 683-688.

Holland, J. H. (1975). *Adaptation in Natural and Artificial
Systems*. The University of Michigan Press, Ann Arbor, Michigan. Reprinted
as a second edition (1992). MIT Press, Cambridge.

Holland, J. H. (1986). "Escaping Brittleness: The
Possibilities of General-Purpose Learning Algorithms applied to Parallel
Rule Based Systems," in *Machine Learning: An Artificial Intelligence
Approach*, vol. 2, R. Michalski, J. Carbonell, and T. Mitchell, Editors,
593-623. Morgan Kaufmann, San Mateo.

Holland, J. H. and Reitman, J. S. (1978). "Cognitive
Systems Based on Adaptive Algorithms," in *Pattern Directed Inference
Systems*, D. A. Waterman and F. Hayes-Roth, Editors, 313-329. Academic
Press, New York.

Hopfield, J. J. (1982). "Neural Networks and Physical
Systems with Emergent Collective Computational Abilities," *Proc.
National Academy Sciences*, USA, **79**, 2554-2558.

Hopfield, J. J. (1984). "Neurons with Graded Response
have Collective Computational Properties like Those of Two-State Neurons,"
*Proc. National Academy Sciences*, USA, **81**, 3088-3092.

Hopfield, J. J. (1987). "Learning Algorithms and
Probability Distributions in Feed-Forward and Feed-Back Networks,"
*Proceedings of the National Academy of Sciences*, USA, **84**,
8429-8433.

Hopfield, J. J. (1990). "The Effectiveness of Analogue
'Neural Network' Hardware," *Network: Computation in Neural Systems*,
**1**(1), 27-40.

Hopfield, J. J. and Tank, D. W. (1985). "Neural Computation
of Decisions in Optimization Problems," *Biological Cybernetics*,
**52**, 141-152.

Hopfield, J. J., Feinstein, D. I., and Palmer, R. G. (1983).
"'Unlearning' Has a Stabilizing Effect in Collective Memories,"
*Nature*, **304**, 158-159.

Hoptroff, R. G. and Hall, T. J. (1989). "Learning
by Diffusion for Multilayer Perceptron," *Electronics Letters*,
**25**(8), 531-533.

Hornbeck, R. W. (1975). *Numerical Methods*. Quantum,
New York.

Hornik, K. (1991). "Approximation Capabilities of
Multilayer Feedforward Networks," *Neural Networks*, **4**(2),
251-257.

Hornik, K. (1993). "Some New Results on Neural Network
Approximation," *Neural Networks*, **6**(8), 1069-1072.

Hornik, K., Stinchcombe, M., and White, H. (1989). "Multilayer
Feedforward Networks are Universal Approximators," *Neural Networks*,
**2**(5), 359-366.

Hornik, K., Stinchcombe, M., and White, H. (1990). "Universal
Approximation of an Unknown Mapping and its Derivatives Using Multilayer
Feedforward Networks," *Neural Networks*, **3**(5), 551-560.

Horowitz, L. L. and Senne, K. D. (1981). "Performance
Advantage of Complex LMS for Controlling Narrow-Band Adaptive Arrays,"
*IEEE Trans. Circuits Systems*, **CAS-28**, 562-576.

Hoshino, T., Yonekura, T., Matsumoto, T., and Toriwaki,
J. (1990). "Studies of PCA Realized by 3-Layer Neural Networks Realizing
Identity Mapping (in Japanese)," PRU90-54, 7-14.

Hoskins, J. C., Lee, P., and Chakravarthy, S. V. (1993).
"Polynomial Modeling Behavior in Radial Basis Function Networks,"
in *Proc. World Congress on Neural Networks* (Portland 1993), vol.
IV, 693-699. LEA, Hillsdale.

Householder, A. S. (1964). *The Theory of Matrices in
Numerical Analysis*. Blaisdell, New York. (Reprinted, 1975, by Dover,
New York.)

Hu, S. T. (1965). *Threshold Logic*. University of
California Press, Berkeley, CA.

Huang, W. Y. and Lippmann, R. P. (1988). "Neural
Nets and Traditional Classifiers," in *Neural Information Processing
Systems* (Denver 1987), D. Z. Anderson, Editor, 387-396. American Institute of
Physics, New York.

Huang, Y. and Schultheiss, P. M. (1963). "Block Quantization
of Correlated Gaussian Random Variables," *IEEE Trans. Commun. Syst.*,
**CS-11**, 289-296.

Huber, P. J. (1981). *Robust Statistics*. Wiley,
New York.

Hudak, M. J. (1992). "RCE Classifiers: Theory and
Practice," *Cybernetics and Systems: An International Journal*,
**23**, 483-515.

Hueter, G. J. (1988). "Solution of the Traveling
Salesman Problem with an Adaptive Ring," in *IEEE International
Conference on Neural Networks* (San Diego 1988), vol. I, 85-92. IEEE,
New York.

Hui, S. and Żak, S. H. (1992). "Dynamical Analysis of the Brain-State-in-a-Box Neural
Models," *IEEE Transactions on Neural Networks*, **3**, 86-89.

Hui, S., Lillo, W. E., and Żak, S. H. (1993). "Dynamics and Stability Analysis of the Brain-State-in-a-Box
(BSB) Neural Models," in *Associative Neural Memories: Theory and
Implementation*, M. H. Hassoun, Editor. Oxford Univ. Press, NY.

Hush, D. R., Salas, J. M., and Horne, B. (1991). "Error
Surfaces for Multi-layer Perceptrons," in *International Joint Conference
on Neural Networks* (Seattle 1991), vol. I, 759-764, IEEE, New York.

Irie, B. and Miyake, S. (1988). "Capabilities of
Three-Layer Perceptrons," *IEEE Int. Conf. Neural Networks*,
vol. I, 641-648.

Ito, Y. (1991). "Representation of Functions by Superpositions
of Step or Sigmoid Function and Their Applications to Neural Network Theory,"
*Neural Networks*, **4**(3), 385-394.

Jacobs, R. A. (1988). "Increased Rates of Convergence
Through Learning Rate Adaptation," *Neural Networks*, **1**(4),
295-307.

Johnson, D. S., Aragon, C. R., McGeoch, L. A., and Schevon,
C. (1989). "Optimization by Simulated Annealing: An Experimental Evaluation;
Part I, Graph Partitioning," *Operations Research*, **37**(6),
865-892.

Johnson, D. S., Aragon, C. R., McGeoch, L. A., and Schevon,
C. (1990a). "Optimization by Simulated Annealing: An Experimental
Evaluation; Part II, Graph Coloring and Number Partitioning," *Operations
Research*, **39**(3), 378-406.

Johnson, R. A. and Wichern, D. W. (1988). *Applied Multivariate
Statistical Analysis* (2nd edition), Prentice-Hall, Englewood Cliffs,
NJ.

Jones, R. D., Lee, Y. C., Barnes, C. W., Flake, G. W.,
Lee, K., Lewis, P. S., and Qian, S. (1990). "Function Approximation
and Time Series Prediction with Neural Networks," in *Proc. Intl.
Joint Conference on Neural Networks* (San Diego 1990), vol. I, 649-666.
IEEE Press, New York.

Judd, J. S. (1987). "Learning in Networks is Hard,"
in *IEEE First International Conference on Neural Networks* (San Diego
1987), M. Caudill and C. Butler, Editors, vol. II, 685-692. IEEE, New York.

Judd, J. S. (1990). *Neural Network Design and the Complexity
of Learning*. MIT Press, Cambridge.

Kadirkamanathan, V., Niranjan, M., and Fallside, F. (1991).
"Sequential Adaptation of Radial Basis Function Neural Networks and
its Application to Time-Series Prediction," in *Advances in Neural
Information Processing Systems 3* (Denver 1990), R. P. Lippmann, J. E.
Moody, and D. S. Touretzky, Editors, 721-727. Morgan Kaufmann, San Mateo.

Kamimura, R. (1993). "Minimum Entropy Method for
the Improvement of Selectivity and Interpretability," in *Proc.
World Congress on Neural Networks* (Portland 1993), vol. III, 512-519.
LEA, Hillsdale.

Kanerva, P. (1988). *Sparse Distributed Memory*.
Bradford/MIT Press, Cambridge, MA.

Kanerva, P. (1993). "Sparse Distributed Memory and
Other Models," in *Associative Neural Memories: Theory and Implementation*,
M. H. Hassoun, Editor, 50-76. Oxford University Press, New York.

Kanter, I. and Sompolinsky, H. (1987). "Associative
Recall of Memory Without Errors," *Phys. Rev. A.*, **35**,
380-392.

Karhunen, J. (1994). "Optimization Criteria and Nonlinear
PCA Neural Networks," *IEEE International Conference on Neural Networks*
(Orlando 1994). IEEE Press.

Karhunen, K. (1947). "Über lineare Methoden in der
Wahrscheinlichkeitsrechnung," *Annales Academiae Scientiarum Fennicae*,
**A**, **37**(1), 3-79, (translated by RAND Corp., Santa Monica,
CA, Rep. T-131, 1960).

Karmarkar, N. (1984). "A New Polynomial Time Algorithm
for Linear Programming," *Combinatorica*, **4**, 373-395.

Karnaugh, M. (1953). "A Map Method for Synthesis
of Combinatorial Logic Circuits," *Transactions AIEE, Comm and Electronics*,
**72**, Part I, 593-599.

Kashyap, R. L. (1966). "Synthesis of Switching Functions
by Threshold Elements," *IEEE Trans. Elec. Comp.*, **EC-15**(4),
619-628.

Kaszerman, P. (1963). "A Nonlinear Summation Threshold
Device," *IEEE Trans. Elec. Comp.*, **EC-12**, 914-915.

Katz, B. (1966). *Nerve, Muscle and Synapse*. McGraw-Hill,
New York.

Keeler, J. and Rumelhart, D. E. (1992). "A Self-Organizing
Integrated Segmentation and Recognition Neural Network," in *Advances
in Neural Information Processing Systems 4* (Denver 1991), J. E. Moody,
S. J. Hanson, and R. P. Lippmann, Editors, 496-503. Morgan Kaufmann, San
Mateo.

Keeler, J. D., Rumelhart, D. E., and Leow, W.-K. (1991).
"Integrated Segmentation and Recognition of Handprinted Numerals,"
in *Advances in Neural Information Processing Systems 3* (Denver 1990),
R. P. Lippmann, J. E. Moody, and D. S. Touretzky, Editors, 557-563. Morgan
Kaufmann, San Mateo.

Keesing, R. and Stork, D. G. (1991). "Evolution and
Learning in Neural Networks: The Number and Distribution of Learning Trials
Affect the Rate of Evolution," in *Advances in Neural Information
Processing Systems 3* (Denver 1990), R. P. Lippmann, J. E. Moody, and
D. S. Touretzky, Editors, 804-810. Morgan Kaufmann, San Mateo.

Kelly, H. J. (1962). "Methods of Gradients,"
in *Optimization Techniques with Applications to Aerospace Systems*,
G. Leitmann, Editor. Academic Press, New York.

Khachian, L. G. (1979). "A Polynomial Algorithm in
Linear Programming," *Soviet Mathematics Doklady*, **20**,
191-194.

Kirkpatrick, S. (1984). "Optimization by Simulated
Annealing: Quantitative Studies," *J. Statist. Physics*, **34**,
975-986.

Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P. (1983).
"Optimization by Simulated Annealing," *Science*, **220**,
671-680.

Kishimoto, K. and Amari, S. (1979). "Existence and
Stability of Local Excitations in Homogeneous Neural Fields," *J.
Math. Biology*, **7**, 303-318.

Knapp, A. G. and Anderson, J. A. (1984). "A Theory
of Categorization Based on Distributed Memory Storage," *Journal
of Experimental Psychology: Learning, Memory, and Cognition*, **9**,
610-622.

Kohavi, Z. (1978). *Switching and Finite Automata*.
McGraw-Hill, NY.

Kohonen, T. (1972). "Correlation Matrix Memories,"
*IEEE Trans. Computers*, **C-21**, 353-359.

Kohonen, T. (1974). "An Adaptive Associative Memory
Principle," *IEEE Trans. Computers*, **C-23**, 444-445.

Kohonen, T. (1982a). "Self-Organized Formation of
Topologically Correct Feature Maps," *Biological Cybernetics*,
**43**, 59-69.

Kohonen, T. (1982b). "Analysis of Simple Self-Organizing
Process," *Biological Cybernetics*, **44**, 135-140.

Kohonen, T. (1984). *Self-Organization and Associative
Memory*. Springer-Verlag, Berlin.

Kohonen, T. (1988). "The 'Neural' Phonetic Typewriter,"
*IEEE Computer Magazine*, March 1988, 11-22.

Kohonen, T. (1989). *Self-Organization and Associative
Memory* (3rd ed.). Springer-Verlag, Berlin.

Kohonen, T. (1990). "Improved Versions of Learning
Vector Quantization," in *Proceedings of the International Joint
Conference on Neural Networks* (San Diego 1990), vol. I, 545-550. IEEE,
New York.

Kohonen, T. (1991). "Self-Organizing Maps: Optimization
Approaches," in *Artificial Neural Networks*, T. Kohonen, K.
Mäkisara, O. Simula, and J. Kangas, Editors, 981-990. North-Holland, Amsterdam.

Kohonen, T. (1993a). "Things You Haven't Heard About
the Self-Organizing Map," *IEEE International Conference on Neural
Networks* (San Francisco 1993), vol. III, 1147-1156. IEEE, New York.

Kohonen, T. (1993b). "Physiological Interpretation
of the Self-Organizing Map Algorithm," *Neural Networks*, **6**(7),
895-905.

Kohonen, T. and Ruohonen, M. (1973). "Representation
of Associated Data by Matrix Operators," *IEEE Trans. Computers*,
**C-22**, 701-702.

Kohonen, T., Barna, G., and Chrisley, R. (1988). "Statistical
Pattern Recognition with Neural Networks: Benchmarking Studies," in
*IEEE International Conference on Neural Networks* (San Diego 1988),
vol. I, 61-68. IEEE, New York.

Kolen, J. F. and Pollack, J. B. (1991). "Back Propagation
is Sensitive to Initial Conditions," in *Advances in Neural Information
Processing Systems 3* (Denver 1990), R. P. Lippmann, J. E. Moody, and
D. S. Touretzky, Editors, 860-867. Morgan Kaufmann, San Mateo.

Kolmogorov, A. N. (1957). "On the Representation
of Continuous Functions of Several Variables by Superposition of Continuous
Functions of one Variable and Addition," *Doklady Akademii Nauk
USSR*, **114**, 679-681.

Komlós, J. (1967). "On the Determinant of (0,1)
Matrices," *Studia Scientiarum Mathematicarum Hungarica*, **2**,
7-21.

Komlós, J. and Paturi, R. (1988). "Convergence
Results in an Associative Memory Model," *Neural Networks*, **1**(3),
239-250.

Kosko, B. (1987). "Adaptive Bidirectional Associative
Memories," *Applied Optics*, **26**, 4947-4960.

Kosko, B. (1988). "Bidirectional Associative Memories,"
*IEEE Trans. Sys. Man Cybern.*, **SMC-18**, 49-60.

Kosko, B. (1992). *Neural Networks and Fuzzy Systems:
A Dynamical Systems Approach to Machine Intelligence*. Prentice-Hall,
Englewood Cliffs, NJ.

Kramer, A. H. and Sangiovanni-Vincentelli, A. (1989).
"Efficient Parallel Learning Algorithms for Neural Networks,"
in *Advances in Neural Information Processing Systems 1* (Denver 1988),
D. S. Touretzky, Editor, 40-48. Morgan Kaufmann, San Mateo.

Kramer, M. (1991). "Nonlinear Principal Component
Analysis Using Autoassociative Neural Networks," *AICHE Journal*,
**37**, 233-243.

Krauth, W., Mézard, M., and Nadal, J.-P. (1988).
"Basins of Attraction in a Perceptron Like Neural Network," *Complex
Systems*, **2**, 387-408.

Krekelberg, B. and Kok, J. N. (1993). "A Lateral
Inhibition Neural Network that Emulates a Winner-Takes-All Algorithm,"
in *Proc. of the European Symposium on Artificial Neural Networks*
(Brussels 1993), M. Verleysen, Editor, 9-14. D facto, Brussels, Belgium.

Krishnan, T. (1966). "On the Threshold Order of Boolean
Functions," *IEEE Trans. Elec. Comp.*, **EC-15**, 369-372.

Krogh, A. and Hertz, J. A. (1992). "A Simple Weight
Decay Can Improve Generalization," in *Advances in Neural Information
Processing Systems 4* (Denver 1991), J. E. Moody, S. J. Hanson, and
R. P. Lippmann, Editors, 950-957. Morgan Kaufmann, San Mateo.

Kruschke, J. K. and Movellan, J. R. (1991). "Benefits
of Gain: Speeded Learning and Minimal Hidden Layers in Back-Propagation
Networks," *IEEE Transactions on Systems, Man, and Cybernetics*,
**SMC-21**(1), 273-280.

Kuczewski, R. M., Myers, M. H., and Crawford, W. J. (1987).
"Exploration of Backward Error Propagation as a Self-Organizational
Structure," *IEEE International Conference on Neural Networks*
(San Diego 1987), M. Caudill and C. Butler, Editors, vol. II, 89-95. IEEE,
New York.

Kufudaki, O. and Horejs, J. (1990). "PAB: Parameters
Adapting Backpropagation," *Neural Network World*, **1**,
267-274.

Kühn, R., Bös, S., and van Hemmen, J. L. (1991).
"Statistical Mechanics for Networks of Graded Response Neurons,"
*Phys. Rev. A*, **43**, 2084-2087.

Kullback, S. (1959). *Information Theory and Statistics*.
Wiley, New York.

Kung, S. Y. (1993). *Digital Neural Networks*. PTR
Prentice-Hall, Englewood Cliffs, New Jersey.

Kuo, T. and Hwang, S. (1993). "A Genetic Algorithm
with Disruptive Selection," *Proceedings of the Fifth International
Conference on Genetic Algorithms* (Urbana-Champaign 1993), S. Forrest,
Editor, 65-69. Morgan Kaufmann, San Mateo.

Kůrková, V. (1992). "Kolmogorov's Theorem and Multilayer Neural Networks,"
*Neural Networks*, **5**(3), 501-506.

Kushner, H. J. (1977). "Convergence of Recursive
Adaptive and Identification Procedures Via Weak Convergence Theory,"
*IEEE Trans. Automatic Control*, **AC-22**(6), 921-930.

Kushner, H. J. and Clark, D. (1978). *Stochastic Approximation
Methods for Constrained and Unconstrained Systems*. Springer, New York.

Lane, S. H., Handelman, D. A., and Gelfand, J. J. (1992).
"Theory and Development of Higher Order CMAC Neural Networks,"
*IEEE Control Systems Magazine*, April 1992, 23-30.

Lang, K. J. and Witbrock, M. J. (1989). "Learning
to Tell Two Spirals Apart," *Proceedings of the 1988 Connectionist
Models Summer School* (Pittsburgh 1988), D. Touretzky, G. Hinton, and
T. Sejnowski, Editors, 52-59. Morgan Kaufmann, San Mateo.

Lapedes, A. S. and Farber, R. (1987). "Nonlinear
Signal Processing Using Neural Networks: Prediction and System Modeling,"
Technical Report, Los Alamos National Laboratory, Los Alamos, New Mexico.

Lapedes, A. and Farber, R. (1988). "How Neural Nets
Work," in *Neural Information Processing Systems* (Denver 1987),
D. Z. Anderson, Editor, 442-456. American Institute of Physics, New York.

Lapidus, L. E., Shapiro, E., Shapiro, S., and Stillman,
R. E. (1961). "Optimization of Process Performance," *AICHE
Journal*, **7**, 288-294.

Lawler, E. L. and Wood, D. E. (1966). "Branch-and-Bound
Methods: A Survey," *Operations Research*, **14**(4), 699-719.

Lay, S.-R. and Hwang, J.-N. (1993). "Robust Construction
of Radial Basis Function Networks for Classification," in *Proceedings
of the IEEE International Conference on Neural Networks* (San Francisco
1993), vol. III, 1859-1864. IEEE, New York.

Le Cun, Y., Boser, B., Denker, J. S., Henderson, D., Howard,
R. E., Hubbard, W., and Jackel, L. D. (1989). "Backpropagation Applied
to Handwritten Zip Code Recognition," *Neural Computation*, **1**(4),
541-551.

Le Cun, Y., Boser, B., Denker, J. S., Henderson, D., Howard,
R. E., Hubbard, W., and Jackel, L. D. (1990). "Handwritten Digit Recognition
with a Backpropagation Network," in *Advances in Neural Information
Processing Systems 2* (Denver 1989), D. S. Touretzky, Editor, 396-404.
Morgan Kaufmann, San Mateo.

Le Cun, Y., Kanter, I., and Solla, S. A. (1991a). "Second
Order Properties of Error Surfaces: Learning Time and Generalization,"
in *Advances in Neural Information Processing Systems 3* (Denver 1990),
R. P. Lippmann, J. E. Moody, and D. S. Touretzky, Editors, 918-924. Morgan
Kaufmann, San Mateo.

Le Cun, Y., Kanter, I., and Solla, S. A. (1991b). "Eigenvalues
of Covariance Matrices: Application to Neural Network Learning," *Phys.
Rev. Lett.*, **66**, 2396-2399.

Le Cun, Y., Simard, P. Y., and Pearlmutter, B. (1993).
"Automatic Learning Rate Maximization by On-Line Estimation of the
Hessian's Eigenvectors," in *Advances in Neural Information Processing
Systems 5* (Denver 1992), S. J. Hanson, J. D. Cowan, and C. L. Giles,
Editors, 156-163. Morgan Kaufmann, San Mateo.

Lee, B. W. and Sheu, B. J. (1991). "Hardware Annealing
in Electronic Neural Networks," *IEEE Transactions on Circuits and
Systems*, **38**, 134-137.

Lee, B. W. and Sheu, B. J. (1993). "Parallel Hardware
Annealing for Optimal Solutions on Electronic Neural Networks," *IEEE
Transactions on Neural Networks*, **4**(4), 588-599.

Lee, S. and Kil, R. (1988). "Multilayer Feedforward
Potential Function Networks," in *Proceedings of the IEEE Second
International Conference on Neural Networks* (San Diego 1988), vol.
I, 161-171. IEEE, New York.

Lee, Y. (1991). "Handwritten Digit Recognition Using
*k*-Nearest Neighbor, Radial-Basis Functions, and Backpropagation
Neural Networks," *Neural Computation*, **3**(3), 440-449.

Lee, Y. and Lippmann, R. P. (1990). "Practical Characteristics
of Neural Networks and Conventional Pattern Classifiers on Artificial and
Speech Problems," in *Advances in Neural Information Processing
Systems 2* (Denver 1989), D. S. Touretzky, Editor, 168-177. Morgan Kaufmann,
San Mateo.

Lee, Y., Oh, S.-H., and Kim, M. W. (1991). "The Effect
of Initial Weights on Premature Saturation in Back-Propagation Learning,"
in *International Joint Conference on Neural Networks* (Seattle 1991),
vol. I, 765-770. IEEE, New York.

von Lehmen, A., Paek, E. G., Liao, P. F., Marrakchi, A., and
Patel, J. S. (1988). "Factors Influencing Learning by Back-Propagation,"
in *IEEE International Conference on Neural Networks* (San Diego 1988),
vol. I, 335-341. IEEE, New York.

Leshno, M., Lin, V. Y., Pinkus, A., and Schocken, S. (1993).
"Multilayer Feedforward Networks with a Nonpolynomial Activation Function
Can Approximate Any Function," *Neural Networks*, **6**(6),
861-867.

Leung, C. S. and Cheung, K. F. (1991). "Householder
Encoding for Discrete Bidirectional Associative Memory," in *Proc.
Int. Conference on Neural Networks* (Singapore 1991), 237-241.

Levin, A. V. and Narendra, K. S. (1992). "Control
of Nonlinear Dynamical Systems Using Neural Networks, Part II: Observability
and Identification," Technical Report 9116, Center for Systems Science,
Yale Univ., New Haven, CT.

Lewis II, P. M. and Coates, C. L. (1967). *Threshold
Logic*. John Wiley, New York, NY.

Light, W. A. (1992a). "Ridge Functions, Sigmoidal
Functions and Neural Networks," in *Approximation Theory VII*,
E. W. Cheney, C. K. Chui, and L. L. Schumaker, Editors, 163-206. Academic
Press, Boston.

Light, W. A. (1992b). "Some Aspects of Radial Basis
Function Approximation," in *Approximation Theory, Spline Functions,
and Applications*, S. P. Singh, Editor, NATO ASI Series, **256**,
163-190. Kluwer Academic Publishers, Boston, MA.

Ligthart, M. M., Aarts, E. H. L., and Beenker, F. P. M.
(1986). "Design-for-Testability of PLA's Using Statistical Cooling,"
in *Proc. ACM/IEEE 23rd Design Automation Conference* (Las Vegas 1986),
339-345.

Lin, J.-N. and Unbehauen, R. (1993). "On the Realization
of a Kolmogorov Network," *Neural Computation*, **5**(1),
21-31.

Linde, Y., Buzo, A., and Gray, R. M. (1980). "An
Algorithm for Vector Quantizer Design," *IEEE Trans. on Communications*,
**COM-28**, 84-95.

Linsker, R. (1986). "From Basic Network Principles
to Neural Architecture," *Proceedings of the National Academy of
Sciences*, USA, **83**, 7508-7512, 8390-8394, 8779-8783.

Linsker, R. (1988). "Self-Organization in a Perceptual
Network," *Computer*, March 1988, 105-117.

Lippmann, R. P. (1987). "An Introduction to Computing
with Neural Nets," *IEEE Magazine on Acoustics, Signal, and Speech
Processing* (April), **4**, 4-22.

Lippmann, R. P. (1989). "Review of Neural Networks
for Speech Recognition," *Neural Computation*, **1**(1), 1-38.

Little, W. A. (1974). "The Existence of Persistent
States in the Brain," *Math. Biosci.*, **19**, 101-120.

Ljung, L. (1977). "Analysis of Recursive Stochastic
Algorithms," *IEEE Trans. on Automatic Control*, **AC-22**(4),
551-575.

Ljung, L. (1978). "Strong Convergence of Stochastic
Approximation Algorithm," *Annals of Statistics*, **6**(3),
680-696.

Lo, Z.-P., Yu, Y., and Bavarian, B. (1993). "Analysis
of the Convergence Properties of Topology Preserving Neural Networks,"
*IEEE Transactions on Neural Networks*, **4**(2), 207-220.

Loève, M. (1963). *Probability Theory*, 3rd
edition, Van Nostrand, New York.

Logar, A. M., Corwin, E. M., and Oldham, W. J. B. (1993).
"A Comparison of Recurrent Neural Network Learning Algorithms,"
in *Proceedings of the IEEE International Conference on Neural Networks*
(San Francisco 1993), vol. II, 1129-1134. IEEE, New York.

Luenberger, D. G. (1969). *Optimization by Vector Space
Methods*. John Wiley, New York, NY.

Macchi, O. and Eweda, E. (1983). "Second-Order Convergence
Analysis of Stochastic Adaptive Linear Filtering," *IEEE Trans.
Automatic Control*, **AC-28**(1), 76-85.

Mackey, M. C. and Glass, L. (1977). "Oscillation
and Chaos in Physiological Control Systems," *Science*, **197**,
287-289.

MacQueen, J. (1967). "Some Methods for Classification
and Analysis of Multivariate Observations," in *Proceedings of the
Fifth Berkeley Symposium on Mathematical Statistics and Probability*,
L. M. LeCam and J. Neyman, Editors, 281-297. University of California Press,
Berkeley.

Magnus, J. R. and Neudecker, H. (1988). *Matrix Differential
Calculus with Applications in Statistics and Econometrics*. Wiley, Chichester.

Makram-Ebeid, S., Sirat, J.-A., and Viala, J.-R. (1989).
"A Rationalized Back-Propagation Learning Algorithm," in *International
Joint Conference on Neural Networks* (Washington 1989), vol. II, 373-380.
IEEE, New York.

von der Malsburg, C. (1973). "Self-Organization of
Orientation Sensitive Cells in the Striate Cortex," *Kybernetik*,
**14**, 85-100.

Mano, M. M. (1979). *Digital Logic and Computer Design*,
Prentice-Hall, Englewood Cliffs, NJ.

Mao, J. and Jain, A. K. (1993). "Regularization Techniques
in Artificial Neural Networks," in *Proc. World Congress on Neural
Networks* (Portland 1993), vol. IV, 75-79. LEA, Hillsdale.

Marchand, M., Golea, M., and Rujan, P. (1990). "A
Convergence Theorem for Sequential Learning in Two-Layer Perceptrons,"
*Europhysics Letters*, **11**, 487-492.

Marcus, C. M. and Westervelt, R. M. (1989). "Dynamics
of Iterated-Map Neural Networks," *Physical Review A*, **40**(1),
501-504.

Marcus, C. M., Waugh, F. R., and Westervelt, R. M. (1990).
"Associative Memory in an Analog Iterated-Map Neural Network,"
*Physical Review A*, **41**(6), 3355-3364.

Marr, D. (1969). "A Theory of Cerebellar Cortex,"
*J. Physiol.* (London), **202**, 437-470.

Martin, G. L. (1990). "Integrating Segmentation and
Recognition Stages for Overlapping Handprinted Characters," MCC Tech.
Rep. ACT-NN-320-90, Austin, Texas.

Martin, G. L. (1993). "Centered-Object Integrated
Segmentation and Recognition of Overlapping Handprinted Characters,"
*Neural Networks*, **5**(3), 419-429.

Martin, G. L. and Pittman, J. A. (1991). "Recognizing
Hand-Printed Letters and Digits Using Backpropagation Learning," *Neural
Computation*, **3**(2), 258-267.

Mays, C. H. (1964). "Effects of Adaptation Parameters
on Convergence Time and Tolerance for Adaptive Threshold Elements,"
*IEEE Trans. Elec. Comp.*, **EC-13**, 465-468.

McCulloch, W. S. and Pitts, W. (1943). "A Logical
Calculus of the Ideas Immanent in Nervous Activity," *Bulletin of Mathematical
Biophysics*, **5**, 115-133.

McEliece, R. J., Posner, E. C., Rodemich, E. R., and Venkatesh,
S. S. (1987). "The Capacity of the Hopfield Associative Memory,"
*IEEE Trans. Info. Theory*, **IT-33**, 461-482.

McInerny, J. M., Haines, K. G., Biafore, S., and Hecht-Nielsen,
R. (1989). "Backpropagation Error Surfaces Can Have Local Minima,"
in *International Joint Conference on Neural Networks* (Washington
1989), vol. II, 627. IEEE, New York.

Mead, C. (1991). "Neuromorphic Electronic Systems,"
*Aerospace and Defense Science*, **10**(2), 20-28.

Medgyessy, P. (1961). *Decomposition of Superpositions
of Distribution Functions*. Hungarian Academy of Sciences, Budapest.

Megiddo, N. (1986). "On the Complexity of Polyhedral
Separability," Tech. Rep. RJ 5252, IBM Almaden Research Center, San
Jose, CA.

Mel, B. W. and Omohundro, S. M. (1991). "How Receptive
Field Parameters Affect Neural Learning," in *Advances in Neural
Information Processing Systems 3* (Denver 1990), R. P. Lippmann, J.
E. Moody, and D. S. Touretzky, Editors, 757-763. Morgan Kaufmann, San Mateo.

Metropolis, N., Rosenbluth, A., Teller, A., and Teller,
E. (1953). "Equation of State Calculations by Fast Computing Machines,"
*J. Chemical Physics*, **21**(6), 1087-1092.

Mézard, M. and Nadal, J.-P. (1989). "Learning
in Feedforward Layered Networks: The Tiling Algorithm," *Journal
of Physics A*, **22**, 2191-2204.

Micchelli, C. A. (1986). "Interpolation of Scattered
Data: Distance and Conditionally Positive Definite Functions," *Constructive
Approximation*, **2**, 11-22.

Miller, G. F., Todd, P. M., and Hegde, S. U. (1989). "Designing
Neural Networks Using Genetic Algorithms," in *Proceedings of the
Third International Conference on Genetic Algorithms* (Arlington 1989),
J. D. Schaffer, Editor, 379-384. Morgan Kaufmann, San Mateo.

Miller, W. T., Sutton, R. S., and Werbos, P. J., Editors
(1990a). *Neural Networks for Control*. MIT Press, Cambridge.

Miller, W. T., Box, B. A., and Whitney, E. C. (1990b).
"Design and Implementation of a High Speed CMAC Neural Network Using
Programmable CMOS Logic Cell Arrays," Report No. ECE.IS.90.01, University
of New Hampshire.

Miller, W. T., Glanz, F. H., and Kraft, L. G. (1990c).
"CMAC: An Associative Neural Network Alternative to Backpropagation,"
*Proc. IEEE*, **78**(10), 1561-1567.

Miller, W. T., Hewes, R. P., Glanz, F. H., and Kraft,
L. G. (1990d). "Real-Time Dynamic Control of an Industrial Manipulator
Using a Neural-Network-Based Learning Controller," *IEEE Trans.
Robotics Automation*, **6**, 1-9.

Minsky, M. and Papert, S. (1969). *Perceptrons: An Introduction
to Computational Geometry*. MIT Press, Cambridge, MA.

Møller, M. F. (1990). "A Scaled Conjugate
Gradient Algorithm for Fast Supervised Learning," Technical Report
PB-339, Computer Science Department, University of Aarhus, Aarhus, Denmark.

Montana, D. J. and Davis, L. (1989). "Training Feedforward
Networks Using Genetic Algorithms," in *Eleventh International Joint
Conference on Artificial Intelligence* (Detroit 1989), N. S. Sridharan,
Editor, 762-767. Morgan Kaufmann, San Mateo.

Moody, J. (1989). "Fast Learning in Multi-Resolution
Hierarchies," in *Advances in Neural Information Processing Systems
I* (Denver 1988), D. S. Touretzky, Editor, 29-39. Morgan Kaufmann, San
Mateo.

Moody, J. and Darken, C. (1989a). "Learning with
Localized Receptive Fields," in *Proceedings of the 1988 Connectionist
Models Summer School* (Pittsburgh 1988), D. Touretzky, G. Hinton, and
T. Sejnowski, Editors, 133-143. Morgan Kaufmann, San Mateo.

Moody, J. and Darken, C. (1989b). "Fast Learning
in Networks of Locally-Tuned Processing Units," *Neural Computation*,
**1**(2), 281-294.

Moody, J. and Yarvin, N. (1992). "Networks with Learned
Unit Response Functions," in *Advances in Neural Information Processing
Systems 4* (Denver 1991), J. E. Moody, S. J. Hanson, and R. P. Lippmann,
Editors, 1048-1055. Morgan Kaufmann, San Mateo.

Moore, B. (1989). "ART1 and Pattern Clustering,"
in *Proceedings of the 1988 Connectionist Models Summer School*
(Pittsburgh 1988), D. Touretzky, G. Hinton, and T. Sejnowski, Editors,
174-185. Morgan Kaufmann, San Mateo.

Morgan, N. and Bourlard, H. (1990). "Generalization
and Parameter Estimation in Feedforward Nets: Some Experiments," in
*Advances in Neural Information Processing Systems 2* (Denver 1989),
D. S. Touretzky, Editor, 630-637. Morgan Kaufmann, San Mateo.

Morita, M. (1993). "Associative Memory with Nonmonotone
Dynamics," *Neural Networks*, **6**(1), 115-126.

Morita, M., Yoshizawa, S., and Nakano, K. (1990a). "Analysis
and Improvement of the Dynamics of Autocorrelation Associative Memory,"
*Trans. Institute Electronics, Information and Communication Engrs.*,
**J73-D-III**(2), 232-242.

Morita, M., Yoshizawa, S., and Nakano, K. (1990b). "Memory
of Correlated Patterns by Associative Neural Networks with Improved Dynamics,"
in *Proc. INNC '90*, Paris, vol. 2, 868-871.

Mosteller, F. and Tukey, J. (1980). *Robust Estimation
Procedures*. Addison-Wesley, New York.

Mosteller, F., Rourke, R. E., and Thomas Jr., G. B. (1970).
*Probability with Statistical Applications*, 2nd edition. Addison-Wesley,
Reading, MA.

Mukhopadhyay, S., Roy, A., Kim, L. S., and Govil, S. (1993).
"A Polynomial Time Algorithm for Generating Neural Networks for Pattern
Classification: Its Stability Properties and Some Test Results," *Neural
Computation*, **5**(2), 317-330.

Muroga, S. (1959). "The Principle of Majority Decision
Logical Elements and the Complexity of their Circuits," *Proc. Int.
Conf. on Information Processing*, Paris, 400-407.

Muroga, S. (1965). "Lower Bounds of the Number of
Threshold Functions and a Maximum Weight," *IEEE Trans. Elec. Comp.*,
**EC-14**(2), 136-148.

Muroga, S. (1971). *Threshold Logic and its Applications*.
John Wiley Interscience, New York, NY.

Musavi, M. T., Ahmed, W., Chan, K. H., Faris, K. B., and
Hummels, D. M. (1992). "On the Training of Radial Basis Function Classifiers,"
*Neural Networks*, **5**(4), 595-603.

Nakano, K. (1972). "Associatron: A Model of Associative
Memory," *IEEE Trans. Sys. Man Cybern.*, **SMC-2**, 380-388.

Narayan, S. (1993). "ExpoNet: A Generalization of
the Multi-Layer Perceptron Model," in *Proc. World Congress on Neural
Networks* (Portland 1993), vol. III, 494-497. LEA, Hillsdale.

Narendra, K. S. and Parthasarathy, K. (1990). "Identification
and Control of Dynamical Systems Using Neural Networks," *IEEE Trans.
Neural Networks*, **1**(1), 4-27.

Narendra, K. S. and Wakatsuki, K. (1991). "A Comparative
Study of Two Neural Network Architectures for the Identification and Control
of Nonlinear Dynamical Systems," Technical Report, Center for Systems
Science, Yale University, New Haven, CT.

Nerrand, O., Roussel-Ragot, P., Personnaz, L., Dreyfus,
G., and Marcos, S. (1993). "Neural Networks and Nonlinear Adaptive
Filtering: Unifying Concepts and New Algorithms," *Neural Computation*,
**5**(2), 165-199.

Newman, C. (1988). "Memory Capacity in Neural Network
Models: Rigorous Lower Bounds," *Neural Networks*, **3**(2),
223-239.

Nguyen, D. and Widrow, B. (1989). "The Truck Backer-Upper:
An Example of Self-Learning in Neural Networks," in *Proceedings
of the International Joint Conference on Neural Networks* (Washington,
DC 1989), vol. II, 357-362.

Nilsson, N. J. (1965). *Learning Machines*. McGraw-Hill,
New York. Reissued as *The Mathematical Foundations of Learning Machines*.
Morgan Kaufmann, San Mateo, CA, 1990.

Niranjan, M. and Fallside, F. (1988). "Neural Networks
and Radial Basis Functions in Classifying Static Speech Patterns,"
Technical Report CUED/F-INFENG/TR.22, Engineering Department, Cambridge
University.

Nishimori, H. and Opriş, I. (1993). "Retrieval Process
of an Associative Memory with a General
Input-Output Function," *Neural Networks*, **6**(8), 1061-1067.

Nolfi, S., Elman, J. L., and Parisi, D. (1990). "Learning
and Evolution in Neural Networks," CRL Technical Report 9019, University
of California, San Diego.

Novikoff, A. B. J. (1962). "On Convergence Proofs
of Perceptrons," *Proc. Symp. on Math. Theory of Automata* (Polytechnic
Institute of Brooklyn, Brooklyn, NY), 615-622.

Nowlan, S. J. (1988). "Gain Variation in Recurrent
Error Propagation Networks," *Complex Systems*, **2**, 305-320.

Nowlan, S. J. (1990). "Maximum Likelihood Competitive
Learning," in *Advances in Neural Information Processing Systems
2* (Denver 1989), D. S. Touretzky, Editor, 574-582. Morgan Kaufmann, San
Mateo.

Nowlan, S. J. and Hinton, G. E. (1992a). "Adaptive
Soft Weight Tying using Gaussian Mixtures," in *Advances in Neural
Information Processing Systems 4* (Denver 1991), J. E. Moody, S. J.
Hanson, and R. P. Lippmann, Editors, 993-1000. Morgan Kaufmann, San Mateo.

Nowlan, S. J. and Hinton, G. E. (1992b). "Simplifying
Neural Networks by Soft Weight-Sharing," *Neural Computation*,
**4**(4), 473-493.

Oja, E. (1982). "A Simplified Neuron Model As a Principal
Component Analyzer," *Journal of Mathematical Biology*, **15**,
267-273.

Oja, E. (1983). *Subspace Methods of Pattern Recognition*.
Research Studies Press and John Wiley. Letchworth, England.

Oja, E. (1989). "Neural Networks, Principal Components,
and Subspaces," *International Journal of Neural Systems*, **1**(1),
61-68.

Oja, E. (1991). "Data Compression, Feature Extraction,
and Autoassociation in Feedforward Neural Networks," *Artificial
Neural Networks, Proceedings of the 1991 International Conference on Artificial
Neural Networks* (Espoo 1991), T. Kohonen, K. Mäkisara, O. Simula,
and J. Kangas, Editors, vol. I, 737-745. Elsevier Science Publishers B.
V., Amsterdam.

Oja, E. and Karhunen, J. (1985). "On Stochastic Approximation
of the Eigenvectors of the Expectation of a Random Matrix," *Journal
of Mathematical Analysis and Applications*, **106**, 69-84.

Okajima, K., Tanaka, S., and Fujiwara, S. (1987). "A
Heteroassociative Memory Network with Feedback Connection," in *Proc.
IEEE First International Conference on Neural Networks* (San Diego 1987),
M. Caudill and C. Butler, Editors, vol. II, 711-718.

Paek, E. G. and Psaltis, D. (1987). "Optical Associative
Memory Using Fourier Transform Holograms," *Optical Engineering*,
**26**, 428-433.

Pao, Y. H. (1989). *Adaptive Pattern Recognition and
Neural Networks*. Addison-Wesley, Reading, MA.

Papadimitriou, C. H. and Steiglitz, K. (1982). *Combinatorial
Optimization: Algorithms and Complexity*. Prentice-Hall, Englewood Cliffs.

Park, J. and Sandberg, I. W. (1991). "Universal Approximation
Using Radial-Basis-Function Networks," *Neural Computation*,
**3**(2), 246-257.

Park, J. and Sandberg, I. W. (1993). "Approximation
and Radial-Basis-Function Networks," *Neural Computation*, **5**(2),
305-316.

Parker, D. B. (1985). "Learning Logic," Technical
Report TR-47, Center for Computational Research in Economics and Management
Science, Massachusetts Institute of Technology, Cambridge, MA.

Parker, D. B. (1987). "Optimal Algorithms for Adaptive
Networks: Second Order Backprop, Second Order Direct Propagation, and Second
Order Hebbian Learning," in *IEEE First International Conference
on Neural Networks* (San Diego 1987), M. Caudill and C. Butler, Editors,
vol. II, 593-600. IEEE, New York.

Parks, M. (1987). "Characterization of the Boltzmann
Machine Learning Rate," in *IEEE First International Conference
on Neural Networks* (San Diego 1987), M. Caudill and C. Butler, Editors, vol.
III, 715-719. IEEE, New York.

Parks, P. C. and Militzer, J. (1991). "Improved Allocation
of Weights for Associative Memory Storage in Learning Control Systems,"
*Proceedings of the 1st IFAC Symposium on Design Methods of Control Systems*,
vol. II, 777-782. Pergamon Press, Zurich.

Parzen, E. (1962). "On Estimation of a Probability
Density Function and Mode," *Ann. Math. Statist.*, **33**,
1065-1076.

Pearlmutter, B. A. (1988). "Learning State Space
Trajectories in Recurrent Neural Networks," Technical Report CMU-CS-88-191,
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.

Pearlmutter, B. A. (1989a). "Learning State Space
Trajectories in Recurrent Neural Networks," in *International Joint
Conference on Neural Networks* (Washington 1989), vol. II, 365-372.
IEEE, New York.

Pearlmutter, B. A. (1989b). "Learning State Space
Trajectories in Recurrent Neural Networks," *Neural Computation*,
**1**(2), 263-269.

Penrose, R. (1955). "A Generalized Inverse for Matrices,"
*Proc. Cambridge Philosophical Society*, **51**, 406-413.

Peretto, P. (1984). "Collective Properties of Neural
Networks: A Statistical Physics Approach," *Biological Cybernetics*,
**50**, 51-62.

Personnaz, L., Guyon, I., and Dreyfus, G. (1986). "Collective
Computational Properties of Neural Networks: New Learning Mechanisms,"
*Physical Review A*, **34**(5), 4217-4227.

Peterson, C. and Anderson, J. R. (1987). "A Mean
Field Theory Learning Algorithm for Neural Networks," *Complex Systems*,
**1**, 995-1019.

Peterson, G. E. and Barney, H. L. (1952). "Control
Methods used in a Study of the Vowels," *Journal of the Acoustical
Society of America*, **24**(2), 175-184.

Pflug, G. Ch. (1990). "Non-Asymptotic Confidence
Bounds for Stochastic Approximation Algorithms with Constant Step Size,"
*Mathematik*, **110**, 297-314.

Pineda, F. J. (1988). "Dynamics and Architectures
for Neural Computation," *Journal of Complexity*, **4**, 216-245.

Pineda, F. J. (1987). "Generalization of Back-Propagation
to Recurrent Neural Networks," *Physical Review Letters*, **59**,
2229-2232.

Platt, J. (1991). "A Resource-Allocating Network
for Function Interpolation," *Neural Computation*, **3**(2),
213-225.

Plaut, D. S., Nowlan, S., and Hinton, G. (1986). "Experiments
on Learning by Back Propagation," Technical Report CMU-CS-86-126,
Department of Computer Science, Carnegie Mellon University, Pittsburgh,
PA.

Poggio, T. and Girosi, F. (1989). "A Theory of Networks
for Approximation and Learning," A. I. Memo 1140, M.I.T., Cambridge,
MA.

Poggio, T. and Girosi, F. (1990a). "Networks for
Approximation and Learning," *Proceedings of the IEEE*, **78**(9),
1481-1497.

Poggio, T. and Girosi, F. (1990b). "Regularization
Algorithms for Learning that are Equivalent to Multilayer Networks,"
*Science*, **247**, 978-982.

Polak, E. and Ribière, G. (1969). "Note sur
la Convergence de Méthodes de Directions Conjuguées," *Revue
Française d'Informatique et de Recherche Opérationnelle*, **3**, 35-43.

Polyak, B. T. (1987). *Introduction to Optimization*.
Optimization Software, Inc., New York.

Polyak, B. T. (1990). "New Method of Stochastic Approximation
Type," *Automat. Remote Control*, **51**, 937-946.

Pomerleau, D. A. (1991). "Efficient Training of Artificial
Neural Networks for Autonomous Navigation," *Neural Computation*,
**3**(1), 88-97.

Pomerleau, D. A. (1993). *Neural Network Perception
for Mobile Robot Guidance*. Kluwer, Boston.

Powell, M. J. D. (1987). "Radial Basis Functions
for Multivariate Interpolation: A Review," in *Algorithms for the
Approximation of Functions and Data*, J. C. Mason and M. G. Cox, Editors,
Clarendon Press, Oxford.

Press, W. H., Flannery, B. P., Teukolsky, S. A., and Vetterling,
W. T. (1986). *Numerical Recipes: The Art of Scientific Computing*.
Cambridge University Press, Cambridge.

Psaltis, D. and Park, C. H. (1986). "Nonlinear Discriminant
Functions and Associative Memories," in *Neural Networks for Computing*,
J. S. Denker, Editor, *Proc. American Inst. Physics*, vol. 151, 370-375.

Qi, X. and Palmieri, F. (1993). "The Diversification
Role of Crossover in the Genetic Algorithms," *Proceedings of the
Fifth International Conference on Genetic Algorithms* (Urbana-Champaign
1993), S. Forrest, Editor, 132-137. Morgan Kaufmann, San Mateo.

Qian, N. and Sejnowski, T. (1989). "Learning to Solve
Random-Dot Stereograms of Dense Transparent Surfaces with Recurrent Back-Propagation,"
in *Proceedings of the 1988 Connectionist Models Summer School* (Pittsburgh
1988), D. Touretzky, G. Hinton, and T. Sejnowski, Editors, 435-443. Morgan
Kaufmann, San Mateo.

Rao, C. R. and Mitra, S. K. (1971). *Generalized Inverse
of Matrices and its Applications*. John Wiley, New York.

Reed, R. (1993). "Pruning Algorithms - A Survey,"
*IEEE Trans. Neural Networks*, **4**(5), 740-747.

Reeves, C. R. (1993). "Using Genetic Algorithms with
Small Populations," *Proceedings of the Fifth International Conference
on Genetic Algorithms* (Urbana-Champaign 1993), S. Forrest, Editor,
92-99. Morgan Kaufmann, San Mateo.

Reilly, D. L. and Cooper, L. N. (1990). "An Overview
of Neural Networks: Early Models to Real World Systems," in *An
Introduction to Neural and Electronic Networks*, S. F. Zornetzer, J.
L. Davis, and C. Lau, Editors. Academic Press, San Diego.

Reilly, D. L., Cooper, L. N., and Elbaum, C. (1982). "A
Neural Model for Category Learning," *Biological Cybernetics*,
**45**, 35-41.

Rezgui, A. and Tepedelenlioglu, N. (1990). "The Effect
of the Slope of the Activation Function on the Backpropagation Algorithm,"
in *Proceedings of the International Joint Conference on Neural Networks*
(Washington, DC 1990), M. Caudill, Editor, vol. I, 707-710. IEEE, New York.

Ricotti, L. P., Ragazzini, S., and Martinelli, G. (1988).
"Learning of Word Stress in a Sub-Optimal Second Order Back-Propagation
Neural Network," in *IEEE First International Conference on Neural
Networks* (San Diego 1987), M. Caudill and C. Butler, Editors, vol.
I, 355-361. IEEE, New York.

Ridgway III, W. C. (1962). "An Adaptive Logic System
with Generalizing Properties," Technical Report 1556-1, Stanford Electronics
Labs., Stanford University, Stanford, CA.

Riedel, H. and Schild, D. (1992). "The Dynamics of
Hebbian Synapses can be Stabilized by a Nonlinear Decay Term," *Neural
Networks*, **5**(3), 459-463.

Ritter, H. and Schulten, K. (1986). "On the Stationary
State of Kohonen's Self-Organizing Sensory Mapping," *Biol. Cybernetics*,
**54**, 99-106.

Ritter, H. and Schulten, K. (1988a). "Kohonen's Self-Organizing
Maps: Exploring Their Computational Capabilities," in *IEEE International
Conference on Neural Networks* (San Diego 1988), vol. I, 109-116. IEEE,
New York.

Ritter, H. and Schulten, K. (1988b). "Convergence
Properties of Kohonen's Topology Conserving Maps: Fluctuations, Stability,
and Dimension Selection," *Biol. Cybernetics*, **60**, 59-71.

Robinson, A. J. and Fallside, F. (1988). "Static
and Dynamic Error Propagation Networks with Application to Speech Coding,"
in *Neural Information Processing Systems* (Denver 1987), D. Z. Anderson,
Editor, 632-641. American Institute of Physics, New York.

Robinson, A. J., Niranjan, M., and Fallside, F. (1989).
"Generalizing the Nodes of the Error Propagation Network," (abstract)
in *Proc. Int. Joint Conference on Neural Networks* (Washington, D.
C. 1989), vol. II, 582. IEEE, New York. Also, printed as Technical Report
CUED/F-INFENG/TR.25, Cambridge University, Engineering Department, Cambridge,
England.

Rohwer, R. (1990). "The 'Moving Targets' Training
Algorithm," in *Advances in Neural Information Processing Systems 2*
(Denver 1989), D. S. Touretzky, Editor, 558-565. Morgan Kaufmann, San Mateo.

Romeo, F. I. (1989). Simulated Annealing: Theory and Applications
to Layout Problems. Ph.D. Thesis, Memorandum UCB/ERL-M89/29, University
of California at Berkeley. Berkeley, CA.

Rosenblatt, F. (1961). *Principles of Neurodynamics:
Perceptrons and the Theory of Brain Mechanisms*. Spartan Press, Washington,
D. C.

Rosenblatt, F. (1962). *Principles of Neurodynamics:
Perceptrons and the Theory of Brain Mechanisms*. Spartan Books, Washington,
DC.

Roy, A. and Govil, S. (1993). "Generating Radial
Basis Function Net in Polynomial Time for Classification," in *Proc.
World Congress on Neural Networks* (Portland 1993), vol. III, 536-539.
LEA, Hillsdale.

Roy, A. and Mukhopadhyay, S. (1991). "Pattern Classification
Using Linear Programming," *ORSA J. Comput.*, **3**(1), 66-80.

Roy, A., Kim, L. S., and Mukhopadhyay, S. (1993). "A
Polynomial Time Algorithm for the Construction and Training of a Class
of Multilayer Perceptrons," *Neural Networks*, **6**(4), 535-545.

Rozonoer, L. I. (1969). "Random Logic Nets, I,"
*Automat. Telemekh.*, **5**, 137-147.

Rubner, J. and Tavan, P. (1989). "A Self-Organizing
Network for Principal-Component Analysis," *Europhysics Letters*,
**10**, 693-698.

Rudin, W. (1964). *Principles of Mathematical Analysis*.
McGraw-Hill, New York, NY.

Rumelhart, D. E. (1989). "Learning and Generalization
in Multilayer Networks," presentation given at the *NATO Advanced
Research Workshop on Neuro Computing, Architecture, and Applications*
(Les Arcs, France 1989).

Rumelhart, D. E. and Zipser, D. (1985). "Feature
Discovery By Competitive Learning," *Cognitive Science*, **9**,
75-112.

Rumelhart, D. E., McClelland, J. L. and the PDP Research
Group. (1986a). *Parallel Distributed Processing: Explorations in the
Microstructure of Cognition*, volume 1, MIT Press, Cambridge.

Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986b).
"Learning Internal Representations by Error Propagation," in
*Parallel Distributed Processing: Explorations in the Microstructure
of Cognition*, vol. I, D. E. Rumelhart, J. L. McClelland, and the PDP
Research Group. MIT Press, Cambridge (1986).

Rutenbar, R. A. (1989). "Simulated Annealing Algorithms:
An Overview," *IEEE Circuits Devices Magazine*, **5**(1),
19-26.

Saha, A. and Keeler, J. D. (1990). "Algorithms for
Better Representation and Faster Learning in Radial Basis Function Networks,"
in *Advances in Neural Information Processing Systems 2* (Denver 1989),
D. Touretzky, Editor, 482-489. Morgan Kaufmann, San Mateo.

Salamon, P., Nulton, J. D., Robinson, J., Petersen, J.,
Ruppeiner, G., and Liao, L. (1988). "Simulated Annealing with Constant
Thermodynamic Speed," *Computer Physics Communications*, **49**,
423-428.

Sanger, T. D. (1989). "Optimal Unsupervised Learning
in a Single Layer Linear Feedforward Neural Network," *Neural Networks*,
**2**(6), 459-473.

Sato, M. (1990). "A Real Time Learning Algorithm
for Recurrent Analog Neural Networks," *Biological Cybernetics*,
**62**, 237-241.

Sayeh, M. R. and Han, J. Y. (1987). "Pattern Recognition
Using a Neural Network," *Proc. SPIE, Intelligent Robots and Computer
Vision*, **848**, 281-285.

Schaffer, J. D., Caruana, R. A., Eshelman, L. J., and
Das, R. (1989). "A Study of Control Parameters Affecting Online Performance
of Genetic Algorithms for Function Optimization," in *Proceedings
of the Third International Conference on Genetic Algorithms and their Applications*
(Arlington 1989), J. D. Schaffer, Editor, 51-60. Morgan Kaufmann, San Mateo.

Schoen, F. (1991). "Stochastic Techniques for Global
Optimization: A Survey of Recent Advances," *Journal of Global Optimization*,
**1**, 207-228.

Schultz, D. G. and Gibson, J. E. (1962). "The Variable
Gradient Method for Generating Liapunov Functions," *Trans. AIEE*,
**81**(II), 203-210.

Schumaker, L. L. (1981). *Spline Functions: Basic Theory*.
Wiley, New York.

Schwartz, D. B., Samalam, V. K., Solla, S. A., and Denker,
J. S. (1990). "Exhaustive Learning," *Neural Computation*,
**2**(3), 374-385.

Scofield, C. L., Reilly, D. L., Elbaum, C., and Cooper,
L. N. (1988). "Pattern Class Degeneracy in an Unrestricted Storage
Density Memory," in *Neural Information Processing Systems* (Denver
1987), D. Z. Anderson, Editor, 674-682. American Institute of Physics,
New York.

Sejnowski, T. J. and Rosenberg, C. R. (1987). "Parallel
Networks that Learn to Pronounce English Text," *Complex Systems*,
**1**, 145-168.

Sejnowski, T. J., Kienker, P. K., and Hinton, G. (1986).
"Learning Symmetry Groups with Hidden Units: Beyond the Perceptron,"
*Physica*, **22D**, 260-275.

Shannon, C. E. (1938). "A Symbolic Analysis of Relay
and Switching Circuits," *Trans. of the AIEE*, **57**, 713-723.

Shaw, G. and Vasudevan, R. (1974). "Persistent States
of Neural Networks and the Nature of Synaptic Transmissions," *Math.
Biosci.*, **21**, 207-218.

Sheng, C. L. (1969). *Threshold Logic.* Academic
Press, New York, NY.

Shiino, M. and Fukai, T. (1990). "Replica-Symmetric
Theory of the Nonlinear Analogue Neural Networks," *J. Phys. A*,
**23**, L1009-L1017.

Schrödinger, E. (1946). *Statistical Thermodynamics*.
Cambridge University Press, London.

Sietsma, J. and Dow, R. J. F. (1988). "Neural Net
Pruning - Why and How," in *IEEE International Conference on Neural
Networks* (San Diego 1988), vol. I, 325-333. IEEE, New York.

Silva, F. M. and Almeida, L. B. (1990). "Acceleration
Techniques for the Backpropagation Algorithm," *Neural Networks,
Europe Lecture Notes in Computer Science*, L. B. Almeida and C. J. Wellekens,
Editors, 110-119. Springer-Verlag, Berlin.

Simard, P. Y., Ottaway, M. B., and Ballard, D. H. (1988).
"Analysis of Recurrent Backpropagation," Technical Report 253,
Department of Computer Science, University of Rochester.

Simard, P. Y., Ottaway, M. B., and Ballard, D. H. (1989).
"Analysis of Recurrent Backpropagation," in *Proceedings of
the 1988 Connectionist Models Summer School* (Pittsburgh 1988), D. Touretzky,
G. Hinton, and T. Sejnowski, Editors, 103-112. Morgan Kaufmann, San Mateo.

Simeone, B., Editor (1989). *Combinatorial Optimization*.
Springer-Verlag, New York.

Simpson, P. K. (1990). "Higher-Ordered and Intraconnected
Bidirectional Associative Memory," *IEEE Trans. System, Man, and Cybernetics*,
**20**(3), 637-653.

Sklansky, J. and Wassel, G. N. (1981). *Pattern Classifiers
and Trainable Machines.* Springer-Verlag, New York.

van der Smagt, P. P. (1994). "Minimisation Methods
for Training Feedforward Neural Networks," *Neural Networks*,
**7**(1), 1-11.

Smith, J. M. (1987). "When Learning Guides Evolution,"
*Nature*, **329**, 761-762.

Smolensky, P. (1986). "Information Processing in
Dynamical Systems: Foundations of Harmony Theory," in *Parallel
Distributed Processing: Explorations in the Microstructure of Cognition*,
vol. I, D. E. Rumelhart, J. L. McClelland, and the PDP Research Group.
MIT Press, Cambridge.

Snapp, R. R., Psaltis, D., and Venkatesh, S. S. (1991).
"Asymptotic Slowing Down of the Nearest-Neighbor Classifier,"
in *Advances in Neural Information Processing Systems 3* (Denver 1990),
R. P. Lippmann, J. E. Moody, and D. S. Touretzky, Editors, 932-938. Morgan
Kaufmann, San Mateo.

Solla, S. A., Levin, E., and Fleisher, M. (1988). "Accelerated
Learning in Layered Neural Networks," *Complex Systems*, **2**,
625-639.

Song, J. (1992). "Hybrid Genetic/Gradient Learning
in Multi-Layer Artificial Neural Networks," Ph.D. Dissertation, Department
of Electrical and Computer Engineering, Wayne State University, Detroit,
Michigan.

Sontag, E. D. and Sussmann, H. J. (1985). "Image Restoration
and Segmentation Using Annealing Algorithm," in *Proc. 24th Conference
on Decision and Control* (Ft. Lauderdale 1985), 768-773.

Soukoulis, C. M., Levin, K., and Grest, G. S. (1983).
"Irreversibility and Metastability in Spin-Glasses. I. Ising Model,"
*Physical Review*, **B28**, 1495-1509.

Specht, D. F. (1990). "Probabilistic Neural Networks,"
*Neural Networks*, **3**(1), 109-118.

Sperduti, A. and Starita, A. (1991). "Extensions
of Generalized Delta Rule to Adapt Sigmoid Functions," *Proceedings
of the 13th Annual International Conference IEEE/EMBS*, 1393-1394. IEEE,
New York.

Sperduti, A. and Starita, A. (1993). "Speed Up Learning
and Networks Optimization with Extended Back Propagation," *Neural
Networks*, **6**(3), 365-383.

Spitzer, A. R., Hassoun, M. H., Wang, C., and Bearden,
F. (1990). "Signal Decomposition and Diagnostic Classification of
the Electromyogram Using a Novel Neural Network Technique," in *Proc.
XIVth Ann. Symposium on Computer Applications in Medical Care* (Washington
D. C., 1990), R. A. Miller, Editor, 552-556. IEEE Computer Society Press,
Los Alamitos.

Sprecher, D. A. (1993). "A Universal Mapping for
Kolmogorov's Superposition Theorem," *Neural Networks*, **6**(8),
1089-1094.

Stent, G. S. (1973). "A Physiological Mechanism for
Hebb's Postulate of Learning," *Proceedings of the National Academy
of Sciences* (USA), **70**, 997-1001.

Stiles, G. S. and Denq, D-L. (1987). "A Quantitative
Comparison of Three Discrete Distributed Associative Memory Models,"
*IEEE Trans. Computers*, **C-36**, 257-263.

Stinchcombe, M. and White, H. (1989). "Universal
Approximations Using Feedforward Networks with Non-Sigmoid Hidden Layer
Activation Functions," *Proc. Int. Joint Conf. Neural Networks*
(Washington, D. C. 1989), vol. I, 613-617. SOS Printing, San Diego.

Stone, M. (1978). "Cross-Validation: A Review,"
*Math. Operationsforsch Statistik*, **9**, 127-140.

Sudjianto, A. and Hassoun, M. (1994). "Nonlinear
Hebbian Rule: A Statistical Interpretation," *IEEE International
Conference on Neural Networks*, (Orlando 1994), vol. XXX, XXXpage numbersXXX,
IEEE Press.

Sun, G.-Z., Chen, H.-H., and Lee, Y.-C. (1992). "Green's
Function Method for Fast On-Line Learning Algorithm of Recurrent Neural
Networks," in *Advances in Neural Information Processing Systems 4* (Denver
1991), J. E. Moody, S. J. Hanson, and R. P. Lippmann, Editors, 317-324.
Morgan Kaufmann, San Mateo.

Sun, X. and Cheney, E. W. (1992). "The Fundamentality
of Sets of Ridge Functions," *Aequationes Math.*, **44**,
226-235.

Suter, B. and Kabrisky, M. (1992). "On a Magnitude
Preserving Iterative MAXnet Algorithm," *Neural Computation*,
**4**(2), 224-233.

Sutton, R. (1986). "Two Problems with Backpropagation
and Other Steepest-Descent Learning Procedures for Networks," *Proceedings
of the 8th Annual Conference of the Cognitive Science Society* (Amherst
1986), 823-831. Lawrence Erlbaum, Hillsdale.

Sutton, R. S., Editor. (1992). Special Issue on Reinforcement
Learning, *Machine Learning*, **8**, 1-395.

Sutton, R. S., Barto, A. G., and Williams, R. J. (1991).
"Reinforcement Learning is Direct Adaptive Optimal Control,"
in *Proc. of the American Control Conference* (Boston 1991), 2143-2146.

Szu, H. (1986). "Fast Simulated Annealing,"
in *Neural Networks for Computing* (Snowbird 1986), J. S. Denker,
Editor, 420-425. American Institute of Physics, New York.

Takefuji, Y. and Lee, K. C. (1991). "Artificial Neural
Network for Four-Coloring Map Problems and K-Colorability Problem,"
*IEEE Transactions on Circuits and Systems*, **38**, 326-333.

Takens, F. (1981). "Detecting Strange Attractors
in Turbulence," in *Dynamical Systems and Turbulence. Lecture Notes
in Mathematics*, vol. 898 (Warwick 1980), D. A. Rand and L.-S. Young,
Editors, 366-381. Springer-Verlag, Berlin.

Takeuchi, A. and Amari, S.-I. (1979). "Formation
of Topographic Maps and Columnar Microstructures," *Biol. Cybernetics*,
**35**, 63-72.

Tank, D. W. and Hopfield, J. J. (1986). "Simple 'Neural'
Optimization Networks: An A/D Converter, Signal Decision Circuit, and a
Linear Programming Circuit," *IEEE Transactions on Circuits and
Systems*, **33**, 533-541.

Tank, D. W. and Hopfield, J. J. (1987). "Concentrating
Information in Time: Analog Neural Networks with Applications to Speech
Recognition Problems," in *IEEE First International Conference on
Neural Networks* (San Diego 1987), M. Caudill and C. Butler, Editors,
vol. IV, 455-468. IEEE, New York.

Tattersall, G. D., Linford, P. W., and Linggard, R. (1990).
"Neural Arrays for Speech Recognition," in *Speech and Language
Processing*, C. Wheddon and R. Linggard, Editors, 245-290. Chapman and
Hall, London.

Tawel, R. (1989). "Does the Neuron 'Learn' Like the
Synapse?" in *Advances in Neural Information Processing Systems
1* (Denver 1988), D. S. Touretzky, Editor, 169-176. Morgan Kaufmann,
San Mateo.

Taylor, J. G. and Coombes, S. (1993). "Learning Higher
Order Correlations," *Neural Networks*, **6**(3), 423-427.

Tesauro, G. and Janssens, B. (1988). "Scaling Relationships
in Back-Propagation Learning," *Complex Systems*, **2**, 39-44.

Thierens, D. and Goldberg, D. (1993). "Mixing in
Genetic Algorithms," *Proceedings of the Fifth International Conference
on Genetic Algorithms* (Urbana-Champaign 1993), S. Forrest, Editor,
38-45. Morgan Kaufmann, San Mateo.

Thorndike, E. L. (1911). *Animal Intelligence*. Hafner,
Darien, CT.

Ticknor, A. J. and Barrett, H. (1987). "Optical Implementations
of Boltzmann Machines," *Optical Engineering*, **26**, 16-21.

Tishby, N., Levin, E., and Solla, S. A. (1989). "Consistent
Inference of Probabilities in Layered Networks: Predictions and Generalization,"
in *International Joint Conference on Neural Networks* (Washington
1989), vol. II, 403-410. IEEE, New York.

Tolat, V. (1990). "An Analysis of Kohonen's Self-Organizing
Maps Using a System of Energy Functions," *Biological Cybernetics*,
**64**, 155-164.

Tollenaere, T. (1990). "SuperSAB: Fast Adaptive Back
Propagation with Good Scaling Properties," *Neural Networks*,
**3**(5), 561-573.

Tompkins, C. B. (1956). "Methods of Steepest Descent,"
in *Modern Mathematics for the Engineer*, E. B. Beckenbach, Editor,
McGraw-Hill, New York.

Törn, A. A. and Žilinskas,
A. (1989). *Global Optimization*. Springer-Verlag, Berlin.

Tsypkin, Ya. Z. (1971). *Adaptation and Learning in
Automatic Systems*. Translated by Z. J. Nikolic. Academic Press, New
York. (First published in Russian under the title *Adaptatsia
i obuchenie v avtomaticheskikh sistemakh*. Nauka, Moscow, 1968.)

Turing, A. M. (1952). "The Chemical Basis of Morphogenesis,"
*Philosophical Transactions of the Royal Society*, Series B, **237**,
37-72.

Uesaka, G. and Ozeki, K. (1972). "Some Properties
of Associative Type Memories," *Journal of the Institute of Electrical
and Communication Engineers of Japan*, **55-D**, 323-330.

Usui, S., Nakauchi, S., and Nakano, M. (1991). "Internal
Color Representation Acquired by a Five-Layer Network," *Artificial
Neural Networks, Proceedings of the 1991 International Conference on Artificial
Neural Networks* (Espoo 1991), T. Kohonen, K. Mäkisara, O. Simula,
and J. Kangas, Editors, vol. I, 867-872. Elsevier Science Publishers B.
V., Amsterdam.

Vapnik, V. N. and Chervonenkis, A. Y. (1971). "On
the Uniform Convergence of Relative Frequencies of Events to Their Probabilities,"
*Theory of Probability and its Applications*, **16**(2), 264-280.

Veitch, E. W. (1952). "A Chart Method for Simplifying
Truth Functions," *Proc. of the ACM*, 127-133.

Villiers, J. and Barnard, E. (1993). "Backpropagation
Neural Nets with One and Two Hidden Layers," *IEEE Transactions
on Neural Networks*, **4**(1), 136-141.

Vogl, T. P., Manglis, J. K., Rigler, A. K., Zink, W. T.,
and Alkon, D. L. (1988). "Accelerating the Convergence of the Back-propagation
Method," *Biological Cybernetics*, **59**, 257-263.

Vogt, M. (1993). "Combination of Radial Basis Function
Neural Networks with Optimized Learning Vector Quantization," in *Proceedings
of the IEEE International Conference on Neural Networks* (San Francisco
1993), vol. III, 1841-1846. IEEE, New York.

Waibel, A. (1989). "Modular Construction of Time-Delay
Neural Networks for Speech Recognition," *Neural Computation*,
**1**, 39-46.

Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., and
Lang, K. (1989). "Phoneme Recognition Using Time-Delay Neural Networks,"
*IEEE Transactions on Acoustics, Speech, and Signal Processing*, **37**,
328-339.

Wang, C. (1991). *A Robust System for Automated Decomposition
of the Electromyogram Utilizing a Neural Network Architecture.* Ph.D.
Dissertation, Department of Electrical and Computer Engineering, Wayne
State University, Detroit, Michigan.

Wang, Y.-F., Cruz Jr., J. B., and Mulligan Jr., J. H.
(1990). "Two Coding Strategies for Bidirectional Associative Memory,"
*IEEE Transactions on Neural Networks*, **1**(1), 81-92.

Wang, Y.-F., Cruz Jr., J. B., and Mulligan Jr., J. H.
(1991). "Guaranteed Recall of All Training Pairs for Bidirectional
Associative Memory," *IEEE Transactions on Neural Networks*,
**2**(6), 559-567.

Wasan, M. T. (1969). *Stochastic Approximation*.
Cambridge University Press, New York.

Watta, P. B. (1994). "A Coupled Gradient Network
Approach for Static and Temporal Mixed Integer Optimization," Ph.D.
Dissertation, Department of Electrical and Computer Engineering, Wayne
State University, Detroit, Michigan.

Waugh, F. R., Marcus, C. M., and Westervelt, R. M. (1991).
"Reducing Neuron Gain to Eliminate Fixed-Point Attractors in an Analog
Associative Memory," *Phys. Rev. A*, **43**, 3131-3142.

Waugh, F. R., Marcus, C. M., and Westervelt, R. M. (1993).
"Nonlinear Dynamics of Analog Associative Memories," in *Associative
Neural Memories: Theory and Implementation*, M. H. Hassoun, Editor,
197-211. Oxford University Press, New York.

Wegstein, J. H. (1958). "Accelerating Convergence
in Iterative Processes," *ACM Commun.*, **1**(6), 9-13.

Weigend, A. S. and Gershenfeld, N. A. (1993). "Results
of the Time Series Prediction Competition at the Santa Fe Institute,"
*Proceedings of the IEEE International Conference on Neural Networks*
(San Francisco 1993), vol. III, 1786-1793. IEEE, New York.

Weigend, A. S. and Gershenfeld, N. A., Editors (1994).
*Time Series Prediction: Forecasting the Future and Understanding the
Past*. *Proc. of the NATO Advanced Research Workshop on Comparative
Time Series Analysis* (Santa Fe 1992). Addison-Wesley, Reading MA.

Weigend, A. S., Rumelhart, D. E., and Huberman, B. A.
(1991). "Generalization by Weight-Elimination with Application to
Forecasting," in *Advances in Neural Information Processing Systems
3* (Denver 1990), R. P. Lippmann, J. E. Moody, and D. S. Touretzky,
Editors, 875-882. Morgan Kaufmann, San Mateo.

Weisbuch, G. and Fogelman-Soulié, F. (1985). "Scaling
Laws for the Attractors of Hopfield Networks," *Journal de Physique
Lett.*, **46**(14), L-623-L-630.

Werbos, P. (1974). "Beyond Regression: New Tools
for Prediction and Analysis in the Behavioral Sciences," Ph.D. Dissertation,
Committee on Applied Mathematics, Harvard University, Cambridge, MA.

Werbos, P. J. (1988). "Generalization of Backpropagation
with Application to Gas Market Model," *Neural Networks*, **1**,
339-356.

Werntges, H. W. (1993). "Partitions of Unity Improve
Neural Function Approximators," in *Proceedings of the IEEE International
Conference on Neural Networks* (San Francisco 1993), vol. II, 914-918.
IEEE, New York.

Wessels, L. F. A. and Barnard, E. (1992). "Avoiding
False Local Minima by Proper Initialization of Connections," *IEEE
Transactions on Neural Networks*, **3**(6), 899-905.

Wettschereck, D. and Dietterich, T. (1992). "Improving
the Performance of Radial Basis Function Networks by Learning Center Locations,"
in *Advances in Neural Information Processing Systems 4* (Denver 1991),
J. E. Moody, S. J. Hanson, and R. P. Lippmann, Editors, 1133-1140. Morgan
Kaufmann, San Mateo.

White, H. (1989). "Learning in Artificial Neural
Networks: A Statistical Perspective," *Neural Computation*, **1**,
425-464.

White, S. A. (1975). "An Adaptive Recursive Digital
Filter," in *Proc. 9th Asilomar Conf. Circuits Syst. Comput.*
(San Francisco 1975), 21-25. Western Periodicals, North Hollywood, CA.

Whitley, D. and Hanson, T. (1989). "Optimizing Neural
Networks Using Faster, More Accurate Genetic Search," in *Proceedings
of the Third International Conference on Genetic Algorithms* (Arlington
1989), J. D. Schaffer, Editor, 391-396. Morgan Kaufmann, San Mateo.

Widrow, B. (1987). "ADALINE and MADALINE - 1963,"
Plenary Speech, *Proc. IEEE 1st Int. Conf. on Neural Networks* (San
Diego 1987), vol. I, 143-158.

Widrow, B. and Angell, J. B. (1962). "Reliable, Trainable
Networks for Computing and Control," *Aerospace Eng.*, **21**
(September issue), 78-123.

Widrow, B. and Hoff Jr., M. E. (1960). "Adaptive
Switching Circuits," *IRE Western Electronic Show and Convention Record*,
Part 4, 96-104.

Widrow, B. and Lehr, M. A. (1990). "30 Years of Adaptive
Neural Networks: Perceptron, Madaline, and Backpropagation," *Proc.
IEEE*, **78**(9), 1415-1442.

Widrow, B. and Stearns, S. D. (1985). *Adaptive Signal
Processing*, Prentice-Hall, Englewood Cliffs.

Widrow, B., Gupta, N. K., and Maitra, S. (1973). "Punish/Reward:
Learning with a Critic in Adaptive Threshold Systems," *IEEE Trans.
on Systems, Man, and Cybernetics*, **SMC-3**, 455-465.

Widrow, B., McCool, J. M., Larimore, M. G., and Johnson
Jr., C. R. (1976). "Stationary and Nonstationary Learning Characteristics
of the LMS Adaptive Filter," *Proc. IEEE*, **64**(8), 1151-1162.

Wieland, A. P. (1991). "Evolving Controls for Unstable
Systems," in *Connectionist Models: Proceedings of the 1990 Summer
School* (Pittsburgh 1990), D. S. Touretzky, J. L. Elman, and G. E. Hinton,
Editors, 91-102. Morgan Kaufmann, San Mateo.

Wieland, A. and Leighton, R. (1987). "Geometric Analysis
of Neural Network Capabilities," *First IEEE Int. Conf. on Neural
Networks* (San Diego 1987), vol. III, 385-392. IEEE, New York.

Wiener, N. (1956). *I Am a Mathematician*. Doubleday,
NY.

Wilkinson, J. H. (1965). *The Algebraic Eigenvalue Problem*.
Oxford University Press, Oxford, UK.

Williams, R. J. (1987). "A Class of Gradient Estimating
Algorithms for Reinforcement Learning in Neural Networks," in *IEEE
First International Conference on Neural Networks* (San Diego 1987),
M. Caudill and C. Butler, Editors, vol. II, 601-608. IEEE, New York.

Williams, R. J. (1992). "Simple Statistical Gradient-Following
Algorithms for Connectionist Reinforcement Learning," *Machine Learning*,
**8**, 229-256.

Williams, R. J. and Zipser, D. (1989a). "A Learning
Algorithm for Continually Running Fully Recurrent Neural Networks,"
*Neural Computation*, **1**(2), 270-280.

Williams, R. J. and Zipser, D. (1989b). "Experimental
Analysis of the Real-Time Recurrent Learning Algorithm," *Connection
Science*, **1**, 87-111.

Willshaw, D. J. and von der Malsburg, C. (1976). "How
Patterned Neural Connections can be set up by Self-Organization,"
*Proceedings of the Royal Society of London*, **B 194**, 431-445.

Winder, R. O. (1962). *Threshold Logic,* Ph.D. Dissertation,
Dept. of Mathematics, Princeton University, NJ.

Winder, R. O. (1963). "Bounds on Threshold Gate Realizability,"
*IEEE Trans. Elec. Computers*, **EC-12**(5), 561-564.

Wittner, B. S. and Denker, J. S. (1988). "Strategies
for Teaching Layered Networks Classification Tasks," in *Neural
Information Processing Systems* (Denver 1987), D. Z. Anderson, Editor,
850-859. American Institute of Physics, New York.

Wong, Y.-F. and Sideris, A. (1992). "Learning Convergence
in the Cerebellar Model Articulation Controller," *IEEE Trans. on
Neural Networks*, **3**(1), 115-121.

Xu, L. (1993). "Least Mean Square Error Reconstruction
Principle for Self-Organizing Neural-Nets," *Neural Networks*,
**6**(5), 627-648.

Xu, L. (1994). "Theories of Unsupervised Learning:
PCA and its Nonlinear Extensions," *IEEE International Conference
on Neural Networks*, (Orlando 1994), vol. XXX, XXXpage numbers XXX,
IEEE Press.

Yanai, H. and Sawada, Y. (1990). "Associative Memory
Network Composed of Neurons with Hysteretic Property," *Neural Networks*,
**3**(2), 223-228.

Yang, L. and Yu, W. (1993). "Backpropagation with
Homotopy," *Neural Computation*, **5**(3), 363-366.

Yoon, Y. O., Brobst, R. W., Bergstresser, P. R., and Peterson,
L. L. (1989). "A Desktop Neural Network for Dermatology Diagnosis,"
*Journal of Neural Network Computing*, Summer, 43-52.

Yoshizawa, S., Morita, M., and Amari, S.-I. (1993a). "Capacity
of Associative Memory Using a Nonmonotonic Neuron Model," *Neural
Networks*, **6**(2), 167-176.

Yoshizawa, S., Morita, M., and Amari, S.-I. (1993b). "Analysis
of Dynamics and Capacity of Associative Memory Using a Nonmonotonic Neuron
Model," in *Associative Neural Memories: Theory and Implementation*,
M. H. Hassoun, Editor, 239-248. Oxford University Press, New York.

Youssef, A. M. and Hassoun, M. H. (1989). "Dynamic
Autoassociative Neural Memory Performance vs. Capacity," *Proc.
SPIE, Optical Pattern Recognition*, H.-K. Liu, Editor, **1053**,
52-59.

Yu, X., Loh, N. K., and Miller, W. C. (1993). "A
New Acceleration Technique for the Backpropagation Algorithm," in
*IEEE International Conference on Neural Networks* (San Francisco
1993), vol. III, 1157-1161. IEEE, New York.

Yuille, A. L., Kammen, D. M., and Cohen, D. S. (1989).
"Quadrature and the Development of Orientation Selective Cortical
Cells by Hebb Rules," *Biological Cybernetics*, **61**, 183-194.

Zak, M. (1989). "Terminal Attractors in Neural Networks,"
*Neural Networks*, **2**(4), 258-274.

Zhang, J. (1991). "Dynamics and Formation of Self-Organizing
Maps," *Neural Computation*, **3**(1), 54-66.