Subject Index
Fundamentals of Artificial Neural Networks
Mohamad H. Hassoun
MIT Press
A
α-LMS rule, 67
μ-LMS rule, 67-69, 72
φ-general position, 20-22, 25
φ-mapping, 20
φ-separable, 20
φ-space, 20-21, 25
φ-surface, 20, 24-25
A
analog-to-digital (A/D) converter, 398, 413-414
activation function, 219
homotopy, 220
hyperbolic tangent, 77, 198, 201, 220, 332
hysteretic, 387-388
logistic function, 77, 198, 201
nondeterministic, 428
nonmonotonic, 385
nonsigmoidal, 49
sigmoidal, 46, 76
sign (sgn), 79, 347, 383
activity pattern, 433
activation slope, 220, 332
ADALINE, 66, 85
adaptive linear combiner element, 66
adaptive resonance theory (ART) networks, 323-328
adjoint net, 270
admissible pattern, 7
AHK
see Ho-Kashyap learning rules
AI, 51
algorithmic complexity, 51
ALVINN, 244-246
AND, 3
ambiguity, 24, 182
ambiguous response, 26
analog optical implementations, 53
analog VLSI technology, 53
annealing
deterministic, 369
schedule, 428
see also simulated annealing
anti-Hebbian learning, 100
approximation
capability, 48, 221
function, 144
theory, 144
architecture
adaptive resonance, 323
AND-OR network, 35-36
bottleneck, 250, 332
CCN, 319
fully recurrent, 259
multilayer feedforward, 38, 40, 43, 184, 197, 261
partially recurrent, 259
randomly interconnected feedforward, 38, 40
recurrent, 254
threshold-OR net, 36
time-delay, 254
unit-allocating, 318
artificial intelligence, 51
artificial neural network, 1
artificial neuron, 1
ART, 323-328
association
auto, 346
hetero, 346
pairs, 346
associative memory, 271, 345
absolute capacity, 363, 365, 381, 385, 387, 390
basin of attraction, 366, 374-375, 378, 388
cross-talk, 347, 348, 364, 381, 383
default memory, 353
dynamic (DAM), 345, 353-374
error correction, 348, 352, 363, 366, 370-371, 385, 389
fundamental memories, 365, 374
ground state, 375
high-performance, 374
linear (LAM), 346
low-performance, 375
noise-suppression, 351
optimal linear (OLAM), 350-351
oscillation, 367, 373
oscillation-free, 373
performance characteristics, 374
performance criteria, 374-375
recording/storage recipe, 346
relative capacity, 363, 366-367, 388
simple, 346
spurious memories, 359, 363, 372, 374-375, 382
variations on, 275-394
see also DAM
associative reward-penalty, 89
asymptotic stability, 268, 358
asymptotically stable points, 148
attractor state, 265, 270, 354, 366, 375
autoassociative net, 248
autoassociative clustering net, 328
autocorrelation matrix, 71, 91, 97-98, 150
automatic scaling, 325
average entropic error, 186
average generalization ability, 183
average learning equation, 147, 176
average prediction error, 183
B
backprop, 199-202, 234, 271, 455
backprop net, 318
backpropagation, 199-202
activation function, 219-221
applications, 234-253
basin of attraction, 203, 208
batch mode, 202
convergence speed, 211
criterion (error) functions, 199, 202, 230-234
derivation, 199-201
example, 203-205
generalization phenomenon, 229
incremental, 201, 202-203, 213, 224, 271
Langevin-type, 424
learning rate, 199-202, 211-213
local minima, 203, 358, 421
momentum, 213-218
network, 198
recurrent, 265-271
second-order, 218
stochastic, 202
through time, 259-262
time-dependent recurrent, 271-274
variations on, 211-226, 230-234
weight initialization, 210-211
weight decay, 221, 225
basin of attraction, 328
basis function, 287
batch mode/update, 63, 81, 90, 167, 172, 290
Bayes decision theory, 112
Bayes classifier, 310
Bayes' rule, 434
bell-shaped function, 49
Bernstein polynomials, 49
bias, 58, 197, 308, 354, 376
bidirectional associative memory (BAM), 393
binary representation, 398
binomial
distribution, 364
expansion, 364
theorem, 445
bit-wise complementation, 441
border aberration effects, 116
Boltzmann machine, 431-432
Boltzmann constant, 422, 426
Boltzmann-Gibbs distribution, 426, 431-432
Boltzmann learning, 431
Boolean functions, 3-4, 35, 42, 188
threshold, 5
nonthreshold, 22
random, 304
see also AND, XNOR, XOR
bottleneck, 250
brain-state-in-a-box (BSB) model, 331, 375-381
building block hypothesis, 446
building blocks, 446
Butz's rule, 63
C
calculus of variations, 272
calibration, 107
capacity, 17, 29, 380
Hopfield network, 363-365
linear threshold gate (LTG), 19, 41
polynomial threshold gate (PTG), 21
see also associative memory capacity
cascade-correlation net (CCN), 318-322
CCN, 318-322
center of mass, 105
central limit theorem, 364, 436
cerebellar model articulation controller (CMAC), 301-304
relation to Rosenblatt's perceptron, 304-309
cerebellum, 301
chaos hypothesis, 416
Chebyshev polynomial, 49
classifier, 50, 306, 311
classifier system, 461
classifiers, 461
cluster membership matrix, 167
cluster granularity, 331, 334
clusters, 107, 125, 168, 288
clustering, 106, 171, 328
behavior, 326
network, 106, 322
CMAC, 301-304, 306
codebook, 110
combinatorial complexity, 395
combinatorial optimization, 120, 429
compact representation, 311
competitive learning, 103, 167, 290
stochastic analysis, 168
deterministic analysis, 167
complexity, 51
algorithmic, 51
Kolmogorov, 51
learning, 180, 187, 310
polynomial, 318
space, 51
time, 51
computational complexity, 269
computational energy, 52, 357
concept forming cognitive model, 328
conditional probability density, 85
conjugate gradient method, 217-218
connections
lateral, 100, 173
lateral-inhibitory, 323
self-excitatory, 323
constraints satisfaction term, 396
controller, 264
convergence in the mean, 69
convergence phase, 116
convergence-inducing process, 424
convex, 70
cooling schedule, 427-428
correlation learning rule, 76, 148
correlation matrix, 91, 102, 346
eigenvalues, 91
eigenvectors, 91
correlation memory, 346
cost function, 143, 395, 397
see also criterion function
cost term, 396
Coulomb potential, 312
covariance, 320
critic, 88
critical features, 306-308
cross-correlations, 91
cross-correlation vector, 71
crossing site, 441
crossover
see genetic operators
cross-validation, 187, 226-230, 290
criterion function, 58, 63, 68, 127-133, 143, 145, 155, 195, 230-234
backprop, 199
Durbin and Willshaw, 141
entropic, 86, 89, 186, 220, 230
Gordon et al., 137
Kohonen's feature map, 171
mean-square error (MSE), 71
Minkowski-r, 83, 168, 230, 231
perceptron, 63, 65, 86
sum of squared error (SSE), 68, 70, 79, 82, 87, 288, 353
travelling salesman problem,
well-formed, 86, 231
see cost function
criterion functional, 271
critical overlap, 384, 386
curve fitting, 221
Cybenko's theorem, 47
training cycle, 59
training pass, 59
D
DAM, 353-374
bidirectional (BAM), 393
BSB, 375-381
correlation, 363, 381
combinatorial optimization, 394-399
exponential capacity, 389-391
heteroassociative, 392-394
hysteretic activations, 386-388
nonmonotonic activations, 381
projection, 369-374
sequence generator, 391-392
see also associative memory
data compression, 109
dead-zone, 62
DEC-talk, 236
decision
boundary, 311
hyperplane, 62
region, 311
surface, 16, 24
deep net, 318
degrees of freedom, 21, 27, 29, 332, 425
tapped-delay lines, 254
delta learning rule, 88, 199, 289, 291, 455, 459
density-preserving feature, 173
desired associations, 346
desired response, 268
deterministic annealing, 369
deterministic unit, 87
device physics, 52
diffusion process, 421
dichotomy, 15, 25, 43
linear, 16-18, 25
machine, 185
dimensionality
expansion, 188
reduction, 97, 120
direct Ho-Kashyap (DHK) algorithm, 80
discrete-time states, 274
distributed representation, 250, 308-309, 323, 333
distribution
don't care states, 12
dynamic associative memory
see DAM
dynamic mutation rate, 443
dynamic slope, 332
E
eigenvalue, 92, 361, 376
eigenvector extraction, 331
elastic net, 120
electromyogram (EMG) signal, 334
EMG, 334-337
encoder, 247
energy function, 357, 362, 376, 393, 426, 420, 432
entropic loss function, 186
entropy, 433
environment
nonstationary, 153
stationary, 153
ergodic, 147
error-backpropagation network
see backpropagation
error function, 70, 143, 199, 268
see also criterion function
error function, erf(x), 364
error rate, 348
error suppressor function, 232
estimated target, 201
Euclidean norm, 288
exclusive NOR
see XNOR
exclusive OR, 5
see XOR
expected value, 69
exploration process, 424
exponential decay laws, 173
extrapolation, 295
extreme inequalities, 189
extreme points, 27-29
F
false-positive classification error, 294-295
feature map(s), 171, 241
Feigenbaum time-series, 278
Fermat's stationarity principle, 418
filter
filtering,
finite difference approximation, 272
fitness, 440
fitness function, 439, 457
multimodal, 443
see also genetic algorithms
fixed point method, 360
fixed points,
flat spot, 77, 86, 211, 219, 220, 231
forgetting
term, 146, 157
effects, 179
free parameters, 58, 288, 295, 311, 354
function approximation, 46, 294, 299, 319
function counting theorem, 17
function decomposition, 296
fundamental theorem of genetic algorithms, 443
G
GA
see genetic algorithm
GA-assisted supervised learning, 454
GA-based learning methods, 452
GA-deceptive problems, 446
gain, 357
Gamba perceptron, 308
Gaussian-bar unit
see units
Gaussian distribution
see distribution
Gaussian unit
see unit
general learning equation, 145
general position, 16-17, 41
generalization, 24, 145, 180, 186, 221, 226, 243, 295
ability, 182
average, 180
enforcing, 233
error, 26-27, 180, 185, 187, 226
local, 302, 307
parameter, 301-302
theoretical framework, 180-187
worst case, 180, 184-185
generalized inverse, 70, 291
see also pseudo-inverse
genetic algorithm (GA), 439-447
genetic operators, 440, 442
crossover, 440, 442, 445
mutation, 440, 443, 445
reproduction, 440
global
descent search, 206, 421
fit, 295
minimal solution, 395
minimum, 70, 417
minimization, 419
optimization, 206, 419, 425
search strategy, 419
Glove-Talk, 236-240
gradient descent search, 64, 200, 202, 288, 300, 418-419
gradient-descent/ascent strategy, 420
gradient net, 394-399
gradient system, 148, 358, 376, 396-397
Gram-Schmidt orthogonalization, 217
Greville's theorem, 351
GSH model, 307, 310
guidance process, 424
H
Hamming Distance, 371
hypersphere, 365
net, 408
normalized, 307, 349, 365
handwritten digits, 240
hard competition, 296-297
hardware annealing, 396, 438
Hassoun's rule, 161
Hebb rule, 91
Hermitian matrix, 91
Hessian matrix, 64, 70, 98, 148-149, 212, 215, 418
hexagonal array, 124
hidden layer, 197-198, 286
hidden targets, 455
hidden-target space, 455-456, 458
hidden units, 198, 286, 431
higher-order statistics, 101, 333
higher-order unit, 101, 103
Ho-Kashyap algorithm, 79, 317
Ho-Kashyap learning rules, 78-82
Hopfield model/net, 354, 396-397, 429
capacity, 19
continuous, 360
discrete, 362
stochastic, 430-431, 437
hybrid GA/gradient search, 453-454
hybrid learning algorithms, 218, 453
hyperbolic tangent activation function
see activation function
hyperboloid, 316
hypercube, 302, 357, 376, 396
hyperellipsoid, 316
hyperplane, 5, 16-17, 25, 59, 311
hypersurfaces, 311
hyperspheres, 315
hyperspherical classifiers, 311
hysteresis, 386-387
I
ill-conditioned, 292
image compression, 247-252
implicit parallelism, 447
incremental gradient descent, 64, 202
incremental update, 66, 88, 202
input sequence, 261
input stimuli, 174
instantaneous error function, 268
instantaneous SSE criterion function, 77, 199
interconnection
matrix, 330, 346, 357
weights, 350
interpolation, 223, 290
function, 291
matrix, 292
quality, 292
Ising model, 429
isolated-word recognition, 125
J
Jacobian matrix, 149
joint distribution, 85
K
Karnaugh map (K-map), 6, 36
K-map technique, 37
k-means clustering, 167, 290
incremental algorithm, 296
k-nearest neighbor classifier, 204, 310, 318
Karhunen-Loève transform, 98
see also principal component analysis
kernel, 287
key input pattern, 347-348
kinks, 172-173
Kirchhoff's current law, 354
Kohonen's feature map, 120, 171
Kolmogorov's theorem, 46-47
Kronecker delta, 268
L
Lagrange multiplier, 145, 272
LAM
see associative memory
Langevin-type learning, 424
Laplace-like distribution, 85
lateral connections, 101
lateral inhibition, 103
lateral weights, 173
leading eigenvector, 97
learning
associative, 57
autoassociative, 328
Boltzmann, 431, 439
competitive, 105, 290
hybrid, 218
hybrid GA/gradient descent, 453, 457
Langevin-type, 424
leaky, 106
on-line, 295
parameter, 153
reinforcement, 57, 87-89, 165
robustness, 232
signal, 146
supervised, 57, 88, 289
temporal, 253
unsupervised, 57
see also learning rule
learning curve, 183-184, 186
learning rate/coefficient, 59, 71, 173
learning rule, 57
anti-Hebbian, 101, 434, 436
associative reward penalty, 89
backprop, 199-202, 455
Boltzmann, 433-434
Butz, 63
competitive, 105
correlation, 76, 148
covariance, 76
delta, 88, 199, 291, 459
global descent, 206-209
Hassoun, 161
Hebbian, 91, 151, 154-155, 176, 434, 436
Ho-Kashyap, 78-82, 304, 308
Linsker, 95-97
LMS, 67-69, 150, 304, 330, 454
Mays, 66
Oja, 92, 155-156
perceptron, 58, 304
pseudo-Hebbian, 178
Sanger, 99
Widrow-Hoff, 66, 330
Yuille et al., 92, 158
learning vector quantization, 111
least-mean-square (LMS) solution, 72
Levenberg-Marquardt optimization, 218
Liapunov
asymptotic stability theorem, 358
first method, 149
function, 150, 331, 334, 357-361, 430
global asymptotic stability theorem, 359
second (direct) method, 358
limit cycle, 362, 367, 371, 378
linear array, 120
linear associative memory, 346
linear matched filter, 157
linear programming, 301, 316-317
linear threshold gate (LTG), 2, 304, 429
linear separability, 5, 61, 80
linear unit, 50, 68, 99, 113, 286, 346
linearly dependent, 350
linearly separable mappings, 188, 455
Linsker's rule, 95-97
LMS rule, 67-69, 150, 304, 330
batch, 68, 74
incremental, 74
see learning rules
local
encoding, 51
excitation, 176-177
local fit, 295
local maximum, 64, 420
local minimum, 109, 203, 417
local property, 157
locality property, 289
locally tuned
units, 286
representation, 288
response, 285
log-likelihood function, 85
logic cell array, 304
logistic function, 77, 288
also see activation function
lossless amplifiers, 397
lower bounds, 41, 43
LTG, 2, 304
network, 35
see linear threshold gate
LTG-realizable, 5
LVQ, 111
M
Mackey-Glass time-series, 281, 294, 322
Manhattan norm, 84
matrix differential calculus, 156
margin, 62, 81
MAX operation, 442
max selector, 407
maximum likelihood, 84
estimate, 85, 231, 297
estimator, 186
Mays's rule, 66
McCulloch-Pitts unit
see linear threshold gate
medical diagnosis expert net, 246
mean-field
annealing, 436, 438
learning, 438-439
theory, 437-438
mean transitions, 436
mean-valued approximation, 436
memorization, 222
memory
see associative memory
memory vectors, 347
Metropolis algorithm, 426-427
minimal disturbance principle, 66
minimal PTG realization, 21-24
minimum Euclidean norm solution, 350
minimum energy configuration, 425
minimum MSE solution, 72
minimum SSE solution, 71, 76, 144, 150, 291
Minkowski-r criterion function, 83, 230
see criterion functions
Minkowski-r weight update rule, 231-232
minterm, 3
misclassified, 61, 65
momentum, 213-214
motor unit,
moving-target problem, 318
multilayer feedforward networks
see architecture
multiple point crossover, 451, 456
multiple recording passes, 352
multipoint search strategy, 439
multivariate function, 420
MUP, 337
mutation
see genetic operators
N
NAND, 6
natural selection, 439
neighborhood function, 113, 171, 173, 177-180
NETtalk, 234-236
neural field, 173
neural network architecture
see architecture
neural net emulator, 264
Newton's method, 64, 215-216
nonlinear activation function, 102
nonlinear dynamical system, 264
nonlinear representations, 252
nonlinear separability, 5, 63, 80
nonlinearly separable
function, 12
mapping, 188
problems, 78
training set, 66, 306
nonstationary process, 154
nonstationary input distribution, 326
nonthreshold function, 5, 12
NOR, 6
nonlinear PCA, 101, 332
NP-complete,
normal distribution, 84
nonuniversality, 307
Nyquist's sampling criterion, 292
O
objective function, 63, 143, 395, 417, 419
see also criterion function
off-line training, 272
Oja's rule
1-unit rule, 92, 155-156
multiple-unit rule, 99
Oja's unit, 157
OLAM
see associative memory
on-center off-surround, 173-174
on-line classifier, 311
on-line implementations, 322
on-line training, 295
optical interconnections, 53
optimal learning step, 212, 279
optimization, 417, 439
OR, 3
ordering phase, 116
orthonormal vectors, 347
orthonormal set, 15
outlier data (points), 168, 231, 317
overfitting, 145, 294, 311, 322
overlap, 371, 382
overtraining, 230
P
parity function, 35-37, 308, 458
partial recurrence, 264
partial reverse dynamics, 383
partial reverse method, 385
partition of unity, 296
pattern completion, 435
pattern ratio, 349, 367, 370
pattern recognition, 51
PCA net, 99, 163, 252
penalty term, 225
perceptron criterion function, 63, 65, 86
perceptron learning rule, 58-60, 304
perfect recall, 347, 350
phase diagram, 367
origin phase, 372
oscillation phase, 372
recall phase, 372
spin-glass phase, 372
phonemes, 235
piece-wise linear operator, 376
phonotopic map, 124
plant identification, 256, 264
Polak-Ribière rule, 217
polynomial, 15, 144, 224
approximation, 16, 222, 291
training time, 189, 310
polynomial complexity, 318, 395
polynomial threshold gate (PTG), 8, 287, 306
polynomial-time classifier (PTC), 316-318
population, 440
positive definite, 70, 149, 361, 418
positive semidefinite, 150
postprocessing, 301
potential
energy, 422
function, 147, 312
field, 177
power dissipation, 52
power method, 92, 212, 330
prediction error, 85
prediction set, 230
premature convergence, 428
premature saturation, 211
preprocessing, 11
principal component(s), 98, 101, 252
principal directions, 98
principal eigenvector, 92, 164
principal manifolds, 252
probability of ambiguous response, 26-27
prototype extraction, 328
prototype unit, 322, 324
pruning, 225, 301
pseudo-inverse, 70, 79, 353
pseudo-orthogonal, 347
PTC net, 316-318
PTG
see polynomial threshold gate
Q
QTG
see quadratic threshold gate
quadratic form, 155, 361
quadratic function, 357
quadratic threshold gate (QTG), 7, 316
quadratic unit, 102
quantization, 250
quickprop, 214
R
radial basis functions, 285-287
radial basis function (RBF) network, 286-294
radially symmetric function, 287
radius of attraction, 390
random motion, 422
random problems, 51
Rayleigh quotient, 155
RBF net, 285, 294
RCE, 312-315
RCE classifier, 315
real-time learning, 244
real-time recurrent learning (RTRL) method, 274-275
recall region, 367
receptive field, 287, 292
centers, 288
localized, 295, 301
overlapping, 302
semilocal, 299
width, 288
recombination mechanism, 440
reconstruction vector, 109
recording recipe, 346
correlation, 346, 359, 376
normalized correlation, 348
projection, 350, 370, 386
recurrent backpropagation, 265-271
recurrent net, 271, 274
see also architecture
region of influence, 311, 313, 316
regression function, 73
regularization
reinforcement learning, 57, 87
reinforcement signal, 88, 90, 165
relative entropy error measure, 85, 230, 433
see also criterion function
relaxation method, 359
repeller, 207
replica method, 366
representation layer, 250
reproduction
see genetic operators
resonance, 325
see also adaptive resonance theory
resting potential, 174
restricted Coulomb energy (RCE) network, 312-315
retinotopic map, 113
retrieval
dynamics, 382
mode, 345
multiple-pass, 366, 371
one-pass, 365
parallel, 365
properties, 369
Riccati differential equation, 179
RMS error, 202
robust decision surface, 78
robust regression, 85
robustness, 62
robustness preserving feature, 307, 309
root-mean-square (RMS) error, 202
Rosenblatt's perceptron, 304
roulette wheel method, 440, 442
row diagonal dominant matrix, 379
S
saddle point, 358
Sanger's PCA net, 100
Sanger's rule, 99, 163
search space, 452
self-connections, 388
self-coupling terms, 371
self-scaling, 325
self-stabilization property, 323
sensitized units, 106
sensor units, 302
scaling
see computational complexity
schema (schemata), 443, 447
order, 444
defining length, 444
search,
direction, 217
genetic, 439-452
global, 206-209, 419
gradient ascent, 419
gradient descent, 64, 268, 288, 419
stochastic, 431
second-order search method, 215
second-order statistics, 101
self-organization, 112
self-organizing
feature map, 113, 171
neural field, 173
semilocal activation, 299
sensory mapping,
separating capacity, 17
separating hyperplane, 18
separating surface, 21
sequence
shortcut connections, 321
sigmoid function, 48, 144
see also activation functions
sign function
see activation functions
signal-to-noise ratio, 375, 402
similarity
measure, 168
relationship, 328
simulated annealing, 425-426
single-unit training, 58
Smoluchowski-Kramers equation, 422
smoothness regularization, 296
SOFM, 125
see also self-organizing feature map
soft competition, 296, 298
soft weight sharing, 233
solution vector, 60
somatosensory map, 113, 120
space-filling curve, 120
sparse binary vectors, 409
spatial associations, 265
specialized associations, 352
speech processing, 124
speech recognition, 125, 255
spin glass, 367
spin glass region, 367
spurious cycle, 393
spurious memory
see associative memory
SSE
see criterion function
stability-plasticity dilemma, 323
stable categorization, 323
state space, 375
state variable, 263
static mapping, 265
statistical mechanics, 425, 429
steepest gradient descent method, 64
Stirling's approximation, 42
stochastic
approximation, 71, 73
algorithm, 147
differential equation, 147, 179
dynamics, 148
force, 422
global search, 431
gradient descent, 109, 421, 425
learning equation, 148
network, 185, 428, 438
optimization, 439
process, 71-72, 147
transitions, 436
unit, 87, 165, 431
Stone-Weierstrass theorem, 48
storage capacity
see capacity
storage recipes, 345
strict interpolation, 291
string, 439
strongly mixing process, 152, 176, 179
supervised learning, 57, 88, 455
also see learning rules
sum-of-products, 3
sum of squared error, 68
superimposed patterns, 328
survival of the fittest, 440
switching
algebra, 3
functions, 3-4, 35
theory, 35
see also Boolean functions
symmetry-breaking mechanism, 211
synapse, 52
synaptic efficacies, 90
synaptic signal
synchronous dynamics
see dynamics
T
Takens' theorem, 256
tapped delay lines, 254-259
teacher forcing, 275
temperature, 422, 425
templates, 109
temporal association, 254, 259, 262-265, 391
temporal associative memory, 391
temporal learning, 253-275
terminal repeller, 207
thermal energy, 426
thermal equilibrium, 425, 434, 437
threshold function, 5
threshold gates, 2
linear, 2
polynomial, 8
quadratic, 7
threshold logic, 1, 37
tie breaker factor, 326
time-delay neural network, 254-259
time-series prediction, 255, 273
topographic map, 112
topological ordering, 116
trace, 351
training error, 185, 227
training with rubbish, 295
training set, 59, 144, 230
transition probabilities, 426
travelling salesman problem (TSP), 120
truck backer-upper problem, 263
truth table, 3
TSP
see travelling salesman problem
tunneling, 206, 208, 421
Turing machine, 275
twists, 172-173
two-spiral problem, 321
U
unfolding, 261
unit
Gaussian, 299-300, 318
Gaussian-bar, 299-300
hysteretic, 388
linear, 50, 68, 99, 113, 286, 346
quadratic, 102
sigmoidal, 299-300, 318
unit allocating net, 310, 318
unit elimination, 221
unity function, 296
universal approximation, 198, 221, 290, 304, 454
universal approximator, 15, 48, 287, 308
of dynamical systems, 273
universal classifier, 50
universal logic, 36
unsupervised learning, 57, 90
competitive, 105, 166
Hebbian, 90, 151
self-organization, 112
see also learning rule
update
nous, 360
parallel, 369
serially, 369
upper bound, 43
V
validation error, 226
validation set, 227, 230
Vandermonde's determinant, 32
VC dimension, 185
vector energy, 60, 352
vector quantization, 103, 109-110
Veitch diagram, 6
vigilance parameter, 325-328, 333
visible units, 431
VLSI, 304
Voronoi
cells, 110
quantizer, 110
tessellation, 111
vowel recognition, 298
W
Weierstrass's approximation theorem, 15
weight, 2
decay, 157, 163, 187, 221, 225
decay term, 157
elimination, 221
initialization, 210
sharing, 187, 233, 241
space, 458
update rule, 64
vector, 2
weight-sharing interconnections, 240
weighted sum, 2-3, 288, 382, 428
Widrow-Hoff rule, 66, 330
Wiener weight vector, 72
winner-take-all, 103
competition, 104, 170
network, 104, 323-324, 408
operation, 104, 297
see also competitive learning
winner unit, 113, 324
X
XNOR, 14
XOR, 5
Y
Yuille et al. rule, 92, 158
Z
ZIP code recognition, 240-243