Bachmann, C. M., Cooper, L. N., Dembo, A., and Zeitouni,
O. (1987). "A Relaxation Model for Memory with High Storage Density,"
*Proc. of the National Academy of Sciences*, USA, **84**, 7529-7531.

Bäck, T. (1993). "Optimal Mutation Rates in
Genetic Search," *Proceedings of the Fifth International Conference
on Genetic Algorithms* (Urbana-Champaign 1993), S. Forrest, Editor,
2-8. Morgan Kaufmann, San Mateo.

Baird, B. (1990). "Associative Memory in a Simple
Model of Oscillating Cortex," in *Advances in Neural Information
Processing Systems 2* (Denver 1989) D. S. Touretzky, Editor, 68-75.
Morgan Kaufmann, San Mateo.

Baird, B. and Eeckman, F. (1993). "A Normal Form
Projection Algorithm for Associative Memory," in *Associative Neural
Memories: Theory and Implementation*, M. H. Hassoun, Editor, 135-166.
Oxford University Press, New York.

Baldi, P. (1991). "Computing with Arrays of Bell-Shaped
and Sigmoid Functions," in *Advances in Neural Information Processing
Systems 3* (Denver 1990), R. P. Lippmann, J. E. Moody, and D. S. Touretzky,
Editors, 735-742. Morgan Kaufmann, San Mateo.

Baldi, P. and Chauvin, Y. (1991). "Temporal Evolution
of Generalization during Learning in Linear Networks," *Neural Computation*,
**3**(4), 589-603.

Baldi, P. and Hornik, K. (1989). "Neural Networks
and Principal Component Analysis: Learning from Examples Without Local
Minima," *Neural Networks*, **2**(1), 53-58.

Barto, A. G. (1985). "Learning by Statistical Cooperation
of Self-Interested Neuron-Like Computing Elements," *Human Neurobiology*,
**4**, 229-256.

Barto, A. G. and Anandan, P. (1985). "Pattern Recognizing
Stochastic Learning Automata," *IEEE Trans. on Systems, Man, and
Cybernetics*, **SMC-15**, 360-375.

Barto, A. G. and Jordan, M. I. (1987). "Gradient
Following without Backpropagation in Layered Networks," in *IEEE
First International Conference on Neural Networks* (San Diego 1987),
M. Caudill and C. Butler, Editors, vol. II, 629-636. IEEE, New York.

Barto, A. G. and Singh, S. P. (1991). "On the Computational
Economics of Reinforcement Learning," in *Connectionist Models:
Proceedings of the 1990 Summer School* (Pittsburgh 1990), D. S. Touretzky,
J. L. Elman, T. J. Sejnowski, and G. E. Hinton, Editors, 35-44. Morgan
Kaufmann, San Mateo.

Barto, A. G., Sutton, R. S., and Anderson, C. W. (1983).
"Neuronlike Adaptive Elements That Can Solve Difficult Learning Control
Problems," *IEEE Trans. on Systems, Man, and Cybernetics*, **SMC-13**(5),
834-846.

Batchelor, B. G. (1969). *Learning Machines for Pattern
Recognition*, Ph.D. thesis, University of Southampton, Southampton, England.

Batchelor, B. G. (1974). *Practical Approach to Pattern
Classification*. Plenum, New York.

Batchelor, B. G. and Wilkins, B. R. (1968). "Adaptive
Discriminant Functions," *Pattern Recognition*, IEEE Conf. Publ.
**42**, 168-178.

Battiti, R. (1992). "First- and Second-Order Methods
for Learning: Between Steepest Descent and Newton's Method," *Neural
Computation*, **4**(2), 141-166.

Baum, E. B. (1988). "On the Capabilities of Multilayer
Perceptrons," *Journal of Complexity*, **4**, 193-215.

Baum, E. (1989). "A Proposal for More Powerful Learning
Algorithms," *Neural Computation*, **1**(2), 201-207.

Baum, E. and Haussler, D. (1989). "What Size Net
Gives Valid Generalization?" *Neural Computation*, **1**(1),
151-160.

Baum, E. B. and Wilczek, F. (1988). "Supervised Learning
of Probability Distributions by Neural Networks," in *Neural Information
Processing Systems* (Denver 1987), D. Z. Anderson, Editor, 52-61. American
Institute of Physics, New York.

Baxt, W. G. (1990). "Use of an Artificial Neural Network
for Data Analysis in Clinical Decision-Making: The Diagnosis of Acute Coronary
Occlusion," *Neural Computation*, **2**(4), 480-489.

Becker, S. and Le Cun, Y. (1989). "Improving the
Convergence of Back-Propagation Learning with Second Order Methods,"
in *Proceedings of the 1988 Connectionist Models Summer School* (Pittsburgh
1988), D. Touretzky, G. Hinton, and T. Sejnowski, Editors, 29-37. Morgan
Kaufmann, San Mateo.

Beckman, F. S. (1964). "The Solution of Linear Equations
by the Conjugate Gradient Method," in *Mathematical Methods for
Digital Computers*, A. Ralston and H. S. Wilf, Editors. Wiley, New York.

Belew, R., McInerney, J., and Schraudolph, N. N. (1990).
"Evolving Networks: Using the Genetic Algorithm with Connectionist
Learning," CSE Technical Report CS90-174, University of California,
San Diego.

Benaim, M. (1994). "On Functional Approximation with
Normalized Gaussian Units," *Neural Computation*, **6**(2),
319-333.

Benaim, M. and Tomasini, L. (1992). "Approximating
Functions and Predicting Time Series with Multi-Sigmoidal Basis Functions,"
in *Artificial Neural Networks*, I. Aleksander and J. Taylor, Editors,
vol. 1, 407-411. Elsevier Science Publisher B. V., Amsterdam.

Bilbro, G. L., Mann, R., Miller, T. K., Snyder, W. E.,
van den Bout, D. E., and White, M. (1989). "Optimization by Mean Field
Annealing," in *Advances in Neural Information Processing Systems
1* (Denver 1988), D. S. Touretzky, Editor, 91-98. Morgan Kaufmann, San Mateo.

Bilbro, G. L. and Snyder, W. E. (1989). "Range Image
Restoration Using Mean Field Annealing," in *Advances in Neural
Information Processing Systems 1* (Denver 1988), D. S. Touretzky, Editor,
594-601. Morgan Kaufmann, San Mateo.

Bilbro, G. L., Snyder, W. E., Garnier, S. J., and Gault,
J. W. (1992). "Mean Field Annealing: A Formalism for Constructing
GNC-like Algorithms," *IEEE Transactions on Neural Networks*,
**3**(1), 131-138.

Bishop, C. (1991). "Improving the Generalization
Properties of Radial Basis Function Neural Networks," *Neural Computation*,
**3**(4), 579-588.

Bishop, C. (1992). "Exact Calculation of the Hessian
Matrix for the Multilayer Perceptron," *Neural Computation*,
**4**(4), 494-501.

Block, H. D. and Levin, S. A. (1970). "On the Boundedness
of an Iterative Procedure for Solving a System of Linear Inequalities,"
*Proc. American Mathematical Society*, **26**, 229-235.

Blum, A. L. and Rivest, R. (1989). "Training a 3-Node
Neural Network is NP-Complete," *Proceedings of the 1988 Workshop
on Computational Learning Theory*, 9-18. Morgan Kaufmann, San Mateo.

Blum, A. L. and Rivest, R. (1992). "Training a 3-Node
Neural Network is NP-Complete," *Neural Networks*, **5**(1),
117-127.

Blumer, A., Ehrenfeucht, A., Haussler, D., and Warmuth,
M. (1989). "Learnability and the Vapnik-Chervonenkis Dimension,"
*JACM*, **36**(4), 929-965.

Boole, G. (1854). *An Investigation of the Laws of Thought*.
Dover, New York.

Bounds, D. G., Lloyd, P. J., Mathew, B., and Waddell, G.
(1988). "A Multilayer Perceptron Network for the Diagnosis of Low
Back Pain," in *Proc. IEEE International Conference on Neural Networks*
(San Diego 1988), vol. II, 481-489.

van den Bout, D. E. and Miller, T. K. (1988). "A
Traveling Salesman Objective Function that Works," in *IEEE International
Conference on Neural Networks* (San Diego 1988), vol. II, 299-303. IEEE,
New York.

van den Bout, D. E. and Miller, T. K. (1989). "Improving
the Performance of the Hopfield-Tank Neural Network Through Normalization
and Annealing," *Biological Cybernetics*, **62**, 129-139.

Bromley, J. and Denker, J. S. (1993). "Improving
Rejection Performance on Handwritten Digits by Training with 'Rubbish',"
*Neural Computation*, **5**(3), 367-370.

Broomhead, D. S. and Lowe, D. (1988). "Multivariate
Functional Interpolation and Adaptive Networks," *Complex Systems*,
**2**, 321-355.

Brown, R. R. (1959). "A Generalized Computer Procedure
for the Design of Optimum Systems: Parts I and II," *AIEE Transactions,
Part I: Communications and Electronics*, **78**, 285-293.

Brown, R. J. (1964). *Adaptive Multiple-Output Threshold
Systems and Their Storage Capacities*, Ph.D. Thesis, Tech. Report 6771-1,
Stanford Electron. Labs, Stanford University, CA.

Brown, M., Harris, C. J., and Parks, P. (1993). "The
Interpolation Capabilities of the Binary CMAC," *Neural Networks*,
**6**(3), 429-440.

Bryson, A. E. and Denham, W. F. (1962). "A Steepest-Ascent
Method for Solving Optimum Programming Problems," *J. Applied Mechanics*,
**29**(2), 247-257.

Bryson, A. E. and Ho, Y.-C. (1969). *Applied Optimal
Control*. Blaisdell, New York.

Burke, L. I. (1991). "Clustering Characterization
of Adaptive Resonance," *Neural Networks*, **4**(4), 485-491.

Burshtein, D. (1993). "Nondirect Convergence Analysis
of the Hopfield Associative Memory," in *Proc. World Congress on
Neural Networks* (Portland 1993), vol. II, 224-227. LEA, Hillsdale,
NJ.

Butz, A. R. (1967). "Perceptron Type Learning Algorithms
in Nonseparable Situations," *J. Math. Anal. and Appl.*, **17**,
560-576. Also, see Ph.D. Dissertation, University of Minnesota, 1965.