Several associative neural memory models have been
proposed over the last two decades [e.g., Amari, 1972a; Anderson,
1972; Nakano, 1972; Kohonen, 1972 and 1974; Kohonen and Ruohonen,
1973; Hopfield, 1982; Kosko, 1987; Okajima et al., 1987; Kanerva,
1988; Chiueh and Goodman, 1988; Baird, 1990. For an accessible
reference on various associative neural memory models the reader
is referred to the edited volume by Hassoun (1993)]. These memory
models can be classified in various ways depending on their architecture
(static versus recurrent), their retrieval mode (synchronous versus
asynchronous), the nature of the stored associations (autoassociative
versus heteroassociative), the complexity and capability of the
memory storage/recording algorithm, etc. In this section, a simple
static synchronous associative memory is presented along with
appropriate memory storage recipes. Then, this simple associative
memory is extended into a recurrent autoassociative memory by
employing feedback. These two basic associative memories will
help define some terminology and serve as a foundation for
the additional associative memory models presented in Section
7.4.

**7.1.1 Simple Associative Memories and Their Associated
Recording Recipes**

One of the earliest associative memory models is
the correlation memory (Anderson, 1972; Kohonen, 1972; Nakano,
1972). This correlation memory consists of a single layer of
*L* non-interacting linear units, with the *l*th unit
having a weight vector **w***l* ∈ R*n*.
It associates real-valued input column vectors **x***k* ∈ R*n*
with corresponding real-valued output column vectors **y***k* ∈ R*L*
according to the transfer equation

**y***k* = **Wx***k*
(7.1.1)

where {**x***k*, **y***k*},
*k* = 1, 2, ..., *m*, is a
collection of desired associations, and **W** is an *L* × *n*
interconnection matrix whose *l*th row is given by (**w***l*)T.
A block diagram of the simple associative memory expressed in
Equation (7.1.1) is shown in Figure 7.1.1. Note that this associative
memory is characterized by linear matrix-vector multiplication
retrievals. Hence, it is referred to as a linear associative
memory (LAM). This LAM is labeled heteroassociative since **y***k*
is different (in encoding and/or dimensionality) from **x***k*.
If **y***k* = **x***k*
for all *k*, then this memory is called autoassociative.

Correlation Recording Recipe

The correlation memory is a LAM which employs a
simple recording/storage recipe for loading the *m* associations
{**x***k*, **y***k*}
into memory. This recording recipe is responsible for synthesizing
**W** and is given by

$$\mathbf{W} = \sum_{k=1}^{m} \mathbf{y}^k \left(\mathbf{x}^k\right)^{\mathrm{T}} \tag{7.1.2}$$

In other words, the interconnection matrix **W**
is simply the correlation matrix of *m* association pairs.
Another way of expressing Equation (7.1.2) is

$$\mathbf{W} = \mathbf{Y}\mathbf{X}^{\mathrm{T}} \tag{7.1.3}$$

where **Y** = [**y**1 **y**2 ... **y***m*]
and **X** = [**x**1 **x**2 ... **x***m*].
Note that for the autoassociative case where the set of association
pairs {**x***k*, **x***k*}
is to be stored, one may still employ Equation (7.1.2) or (7.1.3)
with **y***k*
replaced by **x***k*.

One appealing feature of correlation memories is
the ease of storing new associations or deleting old ones. For
example, if after recording the *m* associations {**x**1, **y**1}
through {**x***m*, **y***m*}
it is desired to record one additional association {**x***m*+1, **y***m*+1},
then one simply updates the current **W** by adding to it the
matrix **y***m*+1(**x***m*+1)T.
Similarly, an already recorded association {**x***i*, **y***i*}
may be "erased" by simply subtracting **y***i*(**x***i*)T
from **W**. However, as is seen next, the price paid for simple
correlation recording is that to guarantee successful retrievals,
we must place stringent conditions on the set {**x***k*;
*k* = 1, 2, ..., *m*} of input
vectors.
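The recipes in Equations (7.1.1) through (7.1.3), together with the rank-1 update and erasure just described, can be summarized in a minimal NumPy sketch; the dimensions and the random association pairs below are illustrative assumptions, not data from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n, L, m = 16, 8, 4                # illustrative dimensions

# Hypothetical association pairs {x^k, y^k}, stored as columns.
X = rng.standard_normal((n, m))   # keys x^k in R^n
Y = rng.standard_normal((L, m))   # recollections y^k in R^L

# Correlation recording, Eq. (7.1.3): W = Y X^T.
W = Y @ X.T

# Retrieval, Eq. (7.1.1); exact recall holds only for orthonormal keys.
y_hat = W @ X[:, 0]

# Store one additional pair {x^{m+1}, y^{m+1}} as a rank-1 update.
x_new, y_new = rng.standard_normal(n), rng.standard_normal(L)
W += np.outer(y_new, x_new)

# "Erase" an already recorded pair {x^i, y^i}.
W -= np.outer(Y[:, 1], X[:, 1])
```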

What are the requirements on the {**x***k*, **y***k*}
associations which will guarantee the successful retrieval of
all recorded vectors (memories) **y***k*
from their associated "perfect key" **x***k*?
Substituting Equation (7.1.2) in Equation (7.1.1) and assuming
that the key **x***h*
is one of the **x***k*
vectors, we get an expression for the retrieved pattern
$\hat{\mathbf{y}}$ as:

$$\hat{\mathbf{y}} = \mathbf{W}\mathbf{x}^h = \left[\sum_{k=1}^{m}\mathbf{y}^k\left(\mathbf{x}^k\right)^{\mathrm{T}}\right]\mathbf{x}^h = \left\|\mathbf{x}^h\right\|^2\mathbf{y}^h + \sum_{k\neq h}\left[\left(\mathbf{x}^k\right)^{\mathrm{T}}\mathbf{x}^h\right]\mathbf{y}^k \tag{7.1.4}$$

The second term on the right-hand side of Equation
(7.1.4) represents the "cross-talk" between the key
**x***h* and the
remaining (*m* − 1)
patterns **x***k*.
This term can be reduced to zero if the **x***k*
vectors are orthogonal. The first term on the right-hand side
of Equation (7.1.4) is proportional to the desired memory **y***h*,
with a proportionality constant equal to the square of the norm
of the key vector **x***h*.
Hence, a sufficient condition for the retrieved memory to be
the desired perfect recollection is to have orthonormal vectors
**x***k*, independent
of the encoding of the **y***k*
(note, though, how the **y***k*
affects the cross-talk term if the **x***k*'s
are not orthogonal). If nonlinear units replace the linear ones
in the correlation LAM, perfect recall of binary {**x***k*, **y***k*}
associations is, in general, possible even when the vectors **x***k*
are only pseudo-orthogonal. This can be seen in the following
analysis.

A Simple Nonlinear Associative Memory Model

The assumption of binary-valued associations **x***k* ∈ {-1, +1}*n*
and **y***k* ∈ {-1, +1}*L*
and the presence of a clipping nonlinearity *F* operating
componentwise on the vector **Wx** (i.e., each unit now employs
a sgn or sign activation function) according to

**y** = *F*[**Wx**] (7.1.5)

relaxes some of the constraints imposed by correlation
recording of a LAM. Here, **W** need only be synthesized
such that the signs of the corresponding components
of **y***k* and
**Wx***k* agree.
Next, consider the normalized correlation recording recipe given
by:

$$\mathbf{W} = \frac{1}{n}\sum_{k=1}^{m} \mathbf{y}^k \left(\mathbf{x}^k\right)^{\mathrm{T}} \tag{7.1.6}$$

which automatically normalizes the **x***k* vectors
(note that the square of the norm of an *n*-dimensional bipolar
binary vector is *n*). Now, if one of the recorded key patterns
**x***h* is presented
as input, then the following expression for the retrieved memory
pattern can be written:

$$\hat{\mathbf{y}} = F\left[\mathbf{W}\mathbf{x}^h\right] = F\left[\mathbf{y}^h + \boldsymbol{\Delta}^h\right] \tag{7.1.7}$$

where $\boldsymbol{\Delta}^h = \frac{1}{n}\sum_{k\neq h}\left[\left(\mathbf{x}^k\right)^{\mathrm{T}}\mathbf{x}^h\right]\mathbf{y}^k$
represents the cross-talk term. For the *i*th component
of $\hat{\mathbf{y}}$, Equation (7.1.7) gives

$$\hat{y}_i = \mathrm{sgn}\left(y_i^h + \Delta_i^h\right)$$

from which it can be seen that the condition for
perfect recall is given by the requirements

$$\Delta_i^h > -1 \quad \text{when } y_i^h = +1$$

and

$$\Delta_i^h < +1 \quad \text{when } y_i^h = -1$$

for *i* = 1, 2, ..., *L*.
These requirements are less restrictive than the orthonormality
requirement of the **x***k*'s
in a LAM.
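A minimal sketch of this nonlinear memory follows, assuming randomly generated bipolar associations with *m* << *n* so that the cross-talk term rarely violates the above requirements; all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
n, L, m = 64, 32, 4           # m << n keeps cross-talk small

# Random bipolar binary associations with components in {-1, +1}.
X = rng.choice([-1.0, 1.0], size=(n, m))
Y = rng.choice([-1.0, 1.0], size=(L, m))

# Normalized correlation recording, Eq. (7.1.6): W = (1/n) Y X^T.
W = (Y @ X.T) / n

# Retrieval through the clipping nonlinearity, Eq. (7.1.5).
y_hat = np.sign(W @ X[:, 0])

# With this light loading, recall of y^0 is typically perfect.
print(np.array_equal(y_hat, Y[:, 0]))
```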

Uesaka and Ozeki (1972) and later Amari (1977a,
1990) [see also Amari and Yanai (1993)] analyzed the error correction
capability of the above nonlinear correlation associative memory
when the memory is loaded with *m* independent, uniformly
distributed, and random bipolar binary associations {**x***k*, **y***k*}.
Based on this analysis, the relation between the output and input
error rates *D*out
and *D*in, respectively,
in the limit of large *n* is given by:

$$D_{\text{out}} = \frac{1}{2}\left[1 - \operatorname{erf}\!\left(\frac{1 - 2D_{\text{in}}}{\sqrt{2r}}\right)\right], \qquad r = \frac{m}{n} \tag{7.1.8}$$

where erf(·) is defined as

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_{0}^{x} e^{-t^2}\,dt \tag{7.1.9}$$

Here, *D*in
is the normalized Hamming distance between a perfect key vector **x***k*
and a noisy version $\tilde{\mathbf{x}}^k$ of **x***k*;
that is, $D_{\text{in}} = \frac{1}{n}\,\mathrm{HD}\!\left(\mathbf{x}^k, \tilde{\mathbf{x}}^k\right)$.
*D*in may also be
related to the "overlap" $\rho = \frac{1}{n}\left(\mathbf{x}^k\right)^{\mathrm{T}}\tilde{\mathbf{x}}^k$ between **x***k*
and $\tilde{\mathbf{x}}^k$ as $D_{\text{in}} = \frac{1 - \rho}{2}$.
Similarly, *D*out
is the normalized Hamming distance between **y***k*
and the output $\hat{\mathbf{y}}$ of the associative memory
due to the input $\tilde{\mathbf{x}}^k$. The error rates *D*in
and *D*out may also
be viewed as the probability of error of an arbitrary bit in the
input and output vectors, respectively (for an insight into the
derivation of Equations (7.1.8) and (7.1.9), the reader is referred
to the analysis in Section 7.2.1). Equation (7.1.8) is plotted
in Figure 7.1.2 for several values of the pattern ratio
*r* = *m*/*n*. Note how the ability of the correlation
memory to retrieve stored memories from noisy inputs is reduced
as the pattern ratio *r* approaches and exceeds the value
0.15. For low loading levels (*r* < 0.15),
the error correction capability of the memory improves, and for
*r* << 1 the memory can correct up to 50
percent error in the input patterns with a probability approaching
1. For large *n* with *n* >> *m*
it can be shown that a random set of *m* key vectors **x***k*
becomes mutually orthogonal with probability approaching unity
(Kanter and Sompolinsky, 1987). Hence, the loading of the *m*
associations {**x***k*, **y***k*},
with arbitrary **y***k*,
is assured using the normalized correlation recording recipe of
Equation (7.1.6).
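The error-rate behavior described above can be probed empirically. The following Monte Carlo sketch estimates *D*out for assumed (illustrative) values of *n*, *r*, and *D*in; it is a rough numerical check of the trend in Equation (7.1.8), not a derivation:

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 256, 0.05              # dimension and pattern ratio r = m/n
m = int(r * n)
trials, D_in = 200, 0.2       # input bit-error rate

X = rng.choice([-1.0, 1.0], size=(n, m))
Y = rng.choice([-1.0, 1.0], size=(n, m))
W = (Y @ X.T) / n             # normalized correlation recording, Eq. (7.1.6)

errors = 0
for _ in range(trials):
    k = rng.integers(m)
    flip = rng.random(n) < D_in              # flip each key bit with prob D_in
    x_noisy = X[:, k] * np.where(flip, -1.0, 1.0)
    y_hat = np.sign(W @ x_noisy)
    errors += np.count_nonzero(y_hat != Y[:, k])

print("estimated D_out:", errors / (trials * n))
```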

Figure 7.1.2. Output versus input error rates for
a correlation memory with clipping (sgn) nonlinearity for various
values of the pattern ratio *r* = *m*/*n*.
Bipolar binary association vectors are assumed which have independent,
random, and uniformly distributed components. Dotted lines correspond
to the approximation given by Equations (7.1.8) and (7.1.9).

Optimal Linear Associative Memory (OLAM)

The correlation recording recipe does not make optimal
use of the LAM interconnection weights. A better recording
technique can be derived which guarantees perfect retrieval of
stored memories **y***k*
from inputs **x***k*
as long as the set {**x***k*;
*k* = 1, 2, ..., *m*} is linearly
independent (as opposed to the more restrictive requirement of
orthogonality required by the correlation-recorded LAM). This
recording technique leads to the optimal linear associative memory
(OLAM) (Kohonen and Ruohonen, 1973) and is considered next.

For perfect storage of *m* associations {**x***k*, **y***k*},
a LAM's interconnection matrix **W** must satisfy the matrix
equation given by:

**Y** = **WX** (7.1.10)

with **Y** and **X** as defined earlier in
this section. Equation (7.1.10) can always be solved exactly
if all *m* vectors **x***k*
(columns of **X**) are linearly independent, which implies
that *m* must be smaller or equal to *n*. For the case
*m* = *n*, the matrix **X** is square and
a unique solution for **W** in Equation (7.1.10) exists, giving:

$$\mathbf{W} = \mathbf{Y}\mathbf{X}^{-1} \tag{7.1.11}$$

which requires that the matrix inverse **X**-1
exists; i.e., the set {**x***k*}
is linearly independent. Thus, this solution guarantees the perfect
recall of any **y***k*
upon the presentation of its associated key **x***k*.

Now, returning to Equation (7.1.10) with the assumption
*m* < *n* and that the **x***k*
are linearly independent, it can be seen that an exact solution
**W*** may not be unique.
In this case, we are free to choose any of the **W***
solutions satisfying Equation (7.1.10). In particular, the minimum
Euclidean norm solution (Rao and Mitra, 1971; see Problem 7.1.6
for further details) given by:

$$\mathbf{W} = \mathbf{Y}\left(\mathbf{X}^{\mathrm{T}}\mathbf{X}\right)^{-1}\mathbf{X}^{\mathrm{T}} \tag{7.1.12}$$

is desirable since it leads to the best error-tolerant
(optimal) LAM (Kohonen, 1984). Equation (7.1.12) will be referred
to as the "projection" recording recipe since the matrix-vector
product $\left(\mathbf{X}^{\mathrm{T}}\mathbf{X}\right)^{-1}\mathbf{X}^{\mathrm{T}}\mathbf{x}^k$
transforms the *k*th stored vector **x***k*
into the *k*th column of the *m* × *m*
identity matrix. Note that for an arbitrary **Y**, if the
set {**x***k*}
is orthonormal, then **X**T **X** = **I**
and Equation (7.1.12) reduces to the correlation recording recipe
of Equation (7.1.3). An iterative version of the projection recording
recipe in Equation (7.1.12) exists (Kohonen, 1984) based on Greville's
theorem (Albert, 1972) which leads to the exact weight matrix
**W** after exactly one presentation of each one of the *m*
vectors **x***k*.
This method is convenient since a new association can be learned
(or an old association can be deleted) in a single update step
without involving other earlier-learned memories. Other adaptive
versions of Equation (7.1.12) can be found in Hassoun (1993).
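A small NumPy sketch of projection recording follows; the random, linearly independent (with probability 1) keys are illustrative. It also shows the point made earlier: correlation recording fails on the same non-orthonormal keys where the OLAM succeeds:

```python
import numpy as np

rng = np.random.default_rng(3)
n, L, m = 16, 8, 10           # m < n, keys linearly independent

X = rng.standard_normal((n, m))   # non-orthogonal keys
Y = rng.standard_normal((L, m))

# Projection recording, Eq. (7.1.12): W = Y (X^T X)^{-1} X^T.
W = Y @ np.linalg.solve(X.T @ X, X.T)

# Perfect recall of every stored y^k from its key x^k.
print(np.allclose(W @ X, Y))          # True

# Correlation recording, Eq. (7.1.3), generally fails here
# because the keys are not orthonormal.
print(np.allclose((Y @ X.T) @ X, Y))  # typically False
```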

OLAM Error Correction Capabilities

The error correction capabilities of OLAMs [with
the projection recording in Equation (7.1.12)] have been analyzed
by Kohonen (1984) and Casasent and Telfer (1987), among others,
for the case of real-valued associations. The following is a
brief account of the key points of such analysis. Let $\tilde{\mathbf{x}} = \mathbf{x}^k + \mathbf{n}$,
where **x***k*
is the key of one of the *m* stored associations {**x***k*, **y***k*}.
Also, assume **x***k* ∈ R*n*
and **y***k* ∈ R*L*.
The vector **n** is a zero-mean noise vector with covariance
matrix $E\left(\mathbf{n}\mathbf{n}^{\mathrm{T}}\right) = \sigma_{\text{in}}^{2}\mathbf{I}$. Denote the variances of the input
and output noise by $\sigma_{\text{in}}^{2}$ and $\sigma_{\text{out}}^{2}$,
respectively; that is, $\sigma_{\text{in}}^{2} = E\left(n_i^{2}\right)$ and
$\sigma_{\text{out}}^{2} = E\left[\left(\hat{y}_i - y_i^{k}\right)^{2}\right]$,
where $\hat{y}_i$ is the *i*th component of
the retrieved vector $\hat{\mathbf{y}} = \mathbf{W}\tilde{\mathbf{x}}$. Here, the expectation
is taken over the elements of the argument
vector, not over *k* (note that the output noise
$\hat{\mathbf{y}} - \mathbf{y}^k = \mathbf{W}\mathbf{n}$ is zero-mean,
since the input noise is zero-mean and the LAM
retrieval operation is linear). For an autoassociative OLAM (**y***k* = **x***k*
for all *k*) with a linearly independent set {**x***k*},
the error correction measure $\sigma_{\text{out}}^{2}/\sigma_{\text{in}}^{2}$ is given
by (Kohonen, 1984)

$$\frac{\sigma_{\text{out}}^{2}}{\sigma_{\text{in}}^{2}} = \frac{m}{n} \tag{7.1.13}$$

Thus, for linearly independent key vectors (requiring
*m* ≤ *n*) the OLAM always reduces the input
noise (or, in the worst case when *m* = *n*,
the input noise is not amplified). Also, note that the smaller
*m* is relative to *n*, the better the noise suppression
capability of the OLAM.
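This noise-suppression result is easy to verify numerically. The sketch below, with illustrative sizes, estimates the output-to-input noise variance ratio of an autoassociative OLAM and compares it with the *m*/*n* prediction of Equation (7.1.13):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, trials = 64, 16, 2000

# Autoassociative OLAM (Y = X): W projects onto span{x^k}.
X = rng.standard_normal((n, m))
W = X @ np.linalg.solve(X.T @ X, X.T)   # Eq. (7.1.12) with Y = X

ratios = []
for _ in range(trials):
    noise = rng.standard_normal(n)
    out_noise = W @ (X[:, 0] + noise) - X[:, 0]   # equals W @ noise
    ratios.append(np.sum(out_noise**2) / np.sum(noise**2))

# Eq. (7.1.13) predicts an average variance ratio of m/n.
print(np.mean(ratios), m / n)
```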

For the heteroassociative case (**y***k* ≠ **x***k*),
it can be shown (Casasent and Telfer, 1987) that the OLAM error
correction is given by:

$$\frac{\sigma_{\text{out}}^{2}}{\sigma_{\text{in}}^{2}} = E\left(y_{ij}^{2}\right)\,E\left\{\mathrm{Tr}\left[\left(\mathbf{X}^{\mathrm{T}}\mathbf{X}\right)^{-1}\right]\right\} \tag{7.1.14}$$

where *y**ij*
is the *ij*th element of matrix **Y**, and Tr() is the
trace operator (which simply sums the diagonal elements of the
argument matrix). The first expectation is taken
over all elements of **Y**, and the two expectations
are taken over the ensembles of possible recollection and
key vectors, respectively. Also, it is assumed that **y***k*
and **x***k* are
uncorrelated and that the recollection vectors **y***k*
have equal energy *E*(**y**), defined as $E(\mathbf{y}) = \left\|\mathbf{y}^k\right\|^{2} = \sum_{i=1}^{L}\left(y_i^{k}\right)^{2}$.
Equation (7.1.14) shows that the error correction in a heteroassociative
OLAM depends on the choice of not only the key vectors **x***k*
but also on that of the recollection vectors. Poor performance
is to be expected when the matrix **X**T**X**
is nearly singular [which leads to a large value for
Tr[(**X**T**X**)-1]].
The reader should be warned, though, that the
error correction measure $\sigma_{\text{out}}^{2}/\sigma_{\text{in}}^{2}$ is not suitable for heteroassociative
OLAMs because it is not normalized against variations in key/recollection
vector energies; one can artificially reduce the value of this
measure by merely reducing the energy *E*(**y**) of the recollection vectors
**y***k*. The reader is referred to
Problem 7.1.11 for a more appropriate error correction measure
for heteroassociative LAMs.

Error correction characteristics of nonlinear associative
memories whose transfer characteristics are described by Equation
(7.1.5) and employing projection recording with uniformly distributed
random bipolar binary key/recollection vectors have also been
analyzed. The reader is referred to the theoretical analysis
by Amari (1977a) and the empirical analysis by Stiles and Denq
(1987) for details.

Strategies for Improving Memory Recording

The encoding of the **x***k*'s,
their number relative to their dimension, and the recording recipe
employed highly affect the performance of an associative memory.
Assuming that an associative memory architecture and a suitable
recording recipe are identified, how can one improve associative
retrieval and memory loading capacity? There are various strategies
for enhancing associative memory performance. In the following,
two example strategies are presented.

One strategy involves the use of a multiple training
method (Wang et al., 1990), which emphasizes unsuccessfully
stored associations by reintroducing them into the weight matrix **W**
through multiple recording passes until all associations are recorded.
This strategy is potentially useful when correlation recording
is employed. In this case, the interconnection matrix is equivalent
to a weighted correlation matrix with different weights on different
association pairs. Here, it is assumed that there exists a weighted
correlation matrix which can store all desired association pairs.
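A sketch of the multiple training idea follows, under the assumption of an autoassociative memory with sgn retrieval and randomly generated bipolar patterns; the iteration cap is an arbitrary safeguard, since convergence is only assumed above, not guaranteed:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 64, 12

X = rng.choice([-1.0, 1.0], size=(n, m))  # autoassociative: y^k = x^k
W = (X @ X.T) / n                          # initial correlation recording

# Multiple training: re-add (emphasize) every pair that is not yet a
# fixed point of sgn retrieval. The result is a weighted correlation
# matrix with larger weights on the harder associations.
for _ in range(100):
    failed = [k for k in range(m)
              if not np.array_equal(np.sign(W @ X[:, k]), X[:, k])]
    if not failed:
        break
    for k in failed:
        W += np.outer(X[:, k], X[:, k]) / n
```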

User defined specialized associations may also be
utilized in a strategy for improving associative memory performance.
For instance, one way to enhance the error correction capability
of an associative memory is to augment the set of association
pairs {**x***k*, **y***k*}
to be stored with a collection of associations of the form {$\tilde{\mathbf{x}}^k$, **y***k*},
where $\tilde{\mathbf{x}}^k$ represents a noisy version of **x***k*.
Here, several noisy versions of each key vector may be added
for each desired association pair. This strategy arises naturally
in training pattern classifiers and is useful in enhancing the
robustness of associative retrieval. The addition of specialized
association pairs may also be employed when specific associations
must be introduced (or eliminated). One natural use of
this strategy arises when a "default memory" is required.
For example, for associations encoded such that sparse input
vectors **s***i*
have low information content, augmenting the original set of associations
with associations of the form {**s***i*, **0**}
during recording leads to the creation of a default "no-decision"
memory **0**. This memory is retrieved when highly corrupted
noisy input key vectors are input to the associative memory, thus
preventing these undesirable inputs from causing the associative
memory to retrieve the "wrong" memories.

The strategy of adding specialized associations
increases the number of associations, *m*, to be stored and
may result in *m* > *n*. Therefore, this
strategy is not well suited for correlation LAMs. Here, one may
employ a recording technique which synthesizes **W** such that
the association error is minimized over all *m* association
pairs. Mathematically speaking, the desired solution is the one
which minimizes the SSE criterion function *J*(**W**)
given by:

$$J(\mathbf{W}) = \sum_{k=1}^{m}\sum_{i=1}^{L}\left(y_i^{k} - \hat{y}_i^{k}\right)^{2} \tag{7.1.15}$$

where $y_i^{k}$ and $\hat{y}_i^{k}$
are the *i*th components of the desired memory **y***k*
and of the estimated memory $\hat{\mathbf{y}}^{k} = \mathbf{W}\mathbf{x}^{k}$, respectively. Now, by setting
the gradient of *J*(**W**) to zero and solving for **W**,
one arrives at the following memory storage recipe:

$$\mathbf{W} = \mathbf{Y}\mathbf{X}^{\dagger} = \mathbf{Y}\mathbf{X}^{\mathrm{T}}\left(\mathbf{X}\mathbf{X}^{\mathrm{T}}\right)^{-1} \tag{7.1.16}$$

where **X**†
is the pseudo-inverse of matrix **X** (Penrose, 1955). This
solution assumes that the inverse of the matrix **XX**T
exists.
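The following sketch combines the two ideas just described: it augments each pair with noisy key versions (so that *m* > *n*) and then computes the minimum-SSE weights of Equation (7.1.16) with a least-squares solver. The noise level, copy count, and sizes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(6)
n, L, m0 = 32, 16, 10

X = rng.choice([-1.0, 1.0], size=(n, m0))
Y = rng.choice([-1.0, 1.0], size=(L, m0))

# Augment each pair {x^k, y^k} with noisy key versions {x~^k, y^k},
# so the total number of associations exceeds n.
X_aug, Y_aug = [X], [Y]
for _ in range(5):                         # 5 noisy copies per pair
    flip = rng.random(X.shape) < 0.1       # 10% bit flips
    X_aug.append(X * np.where(flip, -1.0, 1.0))
    Y_aug.append(Y)
X_all, Y_all = np.hstack(X_aug), np.hstack(Y_aug)   # m = 60 > n = 32

# Minimum-SSE recording, Eq. (7.1.16): W = Y X^dagger; lstsq solves
# the equivalent least-squares problem X^T W^T ~= Y^T.
W = np.linalg.lstsq(X_all.T, Y_all.T, rcond=None)[0].T

# Clipped retrieval from a clean key is typically correct.
print(np.array_equal(np.sign(W @ X[:, 0]), Y[:, 0]))
```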

**7.1.2 Dynamic Associative Memories (DAMs)**

Continuous-Time Continuous-State Model

A recurrent autoassociative memory can be obtained from the
static memory of Equation (7.1.5) by feeding the output back
to the input, which leads to the intuitively derived discrete
dynamics

$$\mathbf{x}(k+1) = F\left[\mathbf{W}\mathbf{x}(k)\right]; \quad k = 0, 1, 2, ... \tag{7.1.17}$$

A continuous-time version of this memory is the net of Hopfield
(1984), in which the *i*th unit is an amplifier with activation
function *f* whose input voltage *u**i* evolves according to
the circuit equations

$$C_i\,\frac{du_i}{dt} = \sum_{j=1}^{n} w_{ij}\,x_j - \frac{u_i}{R_i} + I_i \tag{7.1.18}$$

$$x_j = f(u_j); \quad j = 1, 2, ..., n \tag{7.1.19}$$

where *C**i* and *R**i* are the input capacitance and resistance
of the *i*th unit, *I**i* is an external bias current, and
*w**ij* = 1/*R**ij* (or *w**ij* = -1/*R**ij* if the inverting output of unit *j*
is connected to unit *i*). The above Hopfield net can be
considered as a special case of a more general dynamical network
developed and studied by Cohen and Grossberg (1983), which has
an *i*th state dynamics expressed by:

$$\dot{u}_i = a_i(u_i)\left[b_i(u_i) - \sum_{j=1}^{n} c_{ij}\,d_j(u_j)\right] \tag{7.1.20}$$

The overall dynamics of the Hopfield net can be
described in compact matrix form as

$$\mathbf{C}\,\frac{d\mathbf{u}}{dt} = \mathbf{W}\mathbf{x} - \boldsymbol{\Lambda}\mathbf{u} + \boldsymbol{\theta} \tag{7.1.21}$$

where **C** = *C***I** (**I**
is the identity matrix), $\boldsymbol{\Lambda} = \mathrm{diag}\left(\lambda_1, \lambda_2, ..., \lambda_n\right)$ with $\lambda_i = 1/R_i$,
**x** = *F*(**u**) = [*f*(*u*1) *f*(*u*2) ... *f*(*u**n*)]T,
$\boldsymbol{\theta}$ = [*I*1 *I*2 ... *I**n*]T,
and **W** is the *n* × *n* interconnection matrix [*w**ij*].

The equilibria of the dynamics in Equation (7.1.21)
are determined by setting $\frac{d\mathbf{u}}{dt} = \mathbf{0}$, giving

$$\boldsymbol{\Lambda}\mathbf{u} = \mathbf{W}\mathbf{x} + \boldsymbol{\theta} = \mathbf{W}F(\mathbf{u}) + \boldsymbol{\theta} \tag{7.1.22}$$

A sufficient condition for the Hopfield net to be
stable is that the interconnection matrix **W** be symmetric
(Hopfield, 1984). Furthermore, Hopfield showed that the stable
states of the network are the local minima of the bounded computational
energy function (Liapunov function)

$$E(\mathbf{x}) = -\frac{1}{2}\mathbf{x}^{\mathrm{T}}\mathbf{W}\mathbf{x} - \boldsymbol{\theta}^{\mathrm{T}}\mathbf{x} + \sum_{j=1}^{n}\lambda_j \int_{0}^{x_j} f^{-1}(\xi)\,d\xi \tag{7.1.23}$$

where **x** = [*x*1 *x*2 ... *x**n*]T
is the net's output state, and *f*-1(·) is the
inverse of the activation function *x**j* = *f*(*u**j*).
Note that the value of the right-most term in Equation (7.1.23)
depends on the specific shape of the nonlinear activation function
*f*. For high gain approaching infinity, *f*(*u**j*)
approaches the sign function; i.e., the amplifiers in the Hopfield
net become threshold elements. In this case, the computational
energy function becomes approximately the quadratic function

$$E(\mathbf{x}) = -\frac{1}{2}\mathbf{x}^{\mathrm{T}}\mathbf{W}\mathbf{x} - \boldsymbol{\theta}^{\mathrm{T}}\mathbf{x} \tag{7.1.24}$$

It has been shown (Hopfield, 1984) that the only
stable states of the high-gain, continuous-time, continuous-state
system in Equation (7.1.21) are the corners of the hypercube;
i.e., the local minima of Equation (7.1.24) are states **x*** ∈ {-1, +1}*n*.
For large but finite amplifier gains, the third term in Equation
(7.1.23) begins to contribute. The sigmoidal nature of *f*(*u*)
leads to a large positive contribution near hypercube boundaries,
but negligible contribution far from the boundaries. This causes
a slight drift of the stable states toward the interior of the
hypercube.

Another way of looking at the Hopfield net is as
a gradient system which searches for local minima of the energy
function *E*(**x**) defined in Equation (7.1.23). To
see this, simply take the gradient of *E* with respect to
the state **x** and compare to Equation (7.1.21). Hence, by
equating terms, we have the following gradient system:

$$C\,\frac{d\mathbf{x}}{dt} = -\mathbf{F}'\,\nabla E(\mathbf{x}) \tag{7.1.25}$$

where $\mathbf{F}' = \mathrm{diag}\left(f'(u_1), f'(u_2), ..., f'(u_n)\right)$.
Now, the gradient system in Equation (7.1.25) converges asymptotically
to an equilibrium state which is a local minimum or a saddle point
of the energy *E* (Hirsch and Smale, 1974) (fortunately,
the unavoidable noise in practical applications prevents the system
from staying at the saddle points and convergence to a local minimum
is achieved). To see this, we first note that the equilibria
of the system described by Equation (7.1.25) correspond to local
minima (or maxima or points of inflection) of *E*(**x**),
since $\frac{d\mathbf{x}}{dt} = \mathbf{0}$ means $\nabla E(\mathbf{x}) = \mathbf{0}$.
For each isolated local minimum **x***,
there exists a neighborhood over which the candidate function
*V*(**x**) = *E*(**x**) -
*E*(**x***) has
continuous first partial derivatives and is strictly positive
except at **x*** where
*V*(**x***) =
0. Additionally,

$$\frac{dV}{dt} = \nabla V^{\mathrm{T}}\,\frac{d\mathbf{x}}{dt} = -\frac{1}{C}\sum_{j=1}^{n} f'(u_j)\left(\frac{\partial E}{\partial x_j}\right)^{2} \tag{7.1.26}$$

is always negative [since *f* '(*u**j*)
is always positive because of the monotonically increasing
nature of the relation *x**j* = *f*(*u**j*)]
or zero at **x***.
Hence *V* is a Liapunov function, and **x***
is asymptotically stable.

The operation of the Hopfield net as an autoassociative
memory is straightforward; given a set of memories {**x***k*},
the interconnection matrix **W** is encoded such that the **x***k*'s
become local minima of the Hopfield net's energy function *E*(**x**).
Then, when the net is initialized with a noisy key $\tilde{\mathbf{x}}^k$,
its output state evolves along the negative gradient of *E*(**x**)
until it reaches the closest local minimum which, hopefully, is
one of the fundamental memories **x***k*.
In general, however, *E*(**x**) will have additional
local minima other than the desired ones encoded in **W**.
These additional undesirable stable states are referred to as
spurious memories.

When used as a DAM, the Hopfield net is operated
with very high activation function gains and with binary-valued
stored memories. The synthesis of **W** can be done according
to the correlation recording recipe of Equation (7.1.3) or the
more optimal recipe in Equation (7.1.12). These recording recipes
lead to symmetrical **W**'s (since autoassociative operation
is assumed; i.e., **y***k* = **x***k*
for all *k*) which guarantees the stability of retrievals.
Note that the external bias $\boldsymbol{\theta}$ may be eliminated in such DAMs.
The elimination of bias, the symmetric **W**, and the use of
high-gain amplifiers in such DAMs lead to the truncated energy
function:

$$E(\mathbf{x}) = -\frac{1}{2}\mathbf{x}^{\mathrm{T}}\mathbf{W}\mathbf{x} \tag{7.1.27}$$

Additional properties of these DAMs are explored
in Problems 7.1.13 through 7.1.15.
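A simple Euler-integration sketch of such a continuous-time DAM follows, assuming *C* = 1, $\boldsymbol{\Lambda}$ = **I**, $\boldsymbol{\theta}$ = **0**, correlation-recorded bipolar memories, and an illustrative gain and step size; it is a numerical illustration, not a faithful circuit simulation:

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 64, 6
gain, dt, steps = 10.0, 0.02, 2000

# Correlation-recorded autoassociative memories (zero diagonal).
X = rng.choice([-1.0, 1.0], size=(n, m))
W = (X @ X.T) / n
np.fill_diagonal(W, 0.0)

# Initialize near memory x^0 with 20% of its bits flipped.
flip = rng.random(n) < 0.2
u = 0.1 * X[:, 0] * np.where(flip, -1.0, 1.0)

# Euler discretization of Eq. (7.1.21) with C = 1, Lambda = I,
# theta = 0:  du/dt = -u + W tanh(gain * u).
for _ in range(steps):
    x = np.tanh(gain * u)
    u += dt * (-u + W @ x)

x = np.tanh(gain * u)
print("energy, Eq. (7.1.27):", -0.5 * x @ W @ x)
print("recovered x^0:", np.array_equal(np.sign(x), X[:, 0]))  # typically True
```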

Discrete-Time Continuous-State Model

An alternative model for retrieving the stable states
(attractors) can be derived by employing the relaxation method
(also known as the fixed point method) for iteratively solving
Equation (7.1.22) (Cichocki and Unbehauen, 1993). Here, an initial
guess **x**(0) for an attractor state is used as the initial
search point in the relaxation search. Starting from Equation
(7.1.22) with $\boldsymbol{\Lambda}$ = **I** (without loss of generality)
and recalling that **x** = *F*(**u**), we may write the
relaxation equation

$$\mathbf{u}(k+1) = \mathbf{W}F\left[\mathbf{u}(k)\right] + \boldsymbol{\theta}; \quad k = 0, 1, 2, ... \tag{7.1.28}$$

or, by solving for **x**(*k*+1),

$$\mathbf{x}(k+1) = F\left[\mathbf{W}\mathbf{x}(k) + \boldsymbol{\theta}\right] \tag{7.1.29}$$

Equation (7.1.29) describes the dynamics of a discrete-time
continuous-state synchronously updated DAM. For $\boldsymbol{\theta} = \mathbf{0}$,
Equation (7.1.29) is identical to Equation (7.1.17), which was
intuitively derived, except that the unit activations in the above
relaxation model are of sigmoid-type as opposed to the threshold-type
(sgn) assumed in Equation (7.1.17). Also, when the unit activations
are piece-wise linear, Equation (7.1.29) leads to a special case
of the BSB model, discussed in Section 7.4.1. The parallel update
nature of this DAM is appealing since it leads to faster convergence
(in software simulations) and easier hardware implementations
as compared to the continuous-time Hopfield model.

Most current implementations of continuous-time
neural networks are done using computer simulations which are
necessarily discrete-time implementations. However, the stability
results obtained earlier for the continuous-time DAM do not necessarily
hold for the discrete-time versions. So it is important to have
a rigorous discrete-time analysis of the stability of the dynamics
in Equation (7.1.29). Marcus and Westervelt (1989) showed that
the function

$$E(k) = -\frac{1}{2}\mathbf{x}^{\mathrm{T}}(k)\mathbf{W}\mathbf{x}(k) - \boldsymbol{\theta}^{\mathrm{T}}\mathbf{x}(k) + \sum_{j=1}^{n} G\left(x_j(k)\right) \tag{7.1.30}$$

where

$$G(x) = \int_{0}^{x} f^{-1}(\xi)\,d\xi \tag{7.1.31}$$

is a Liapunov function for the DAM in Equation (7.1.29)
when **W** is symmetric and the activation gain $\beta$ [e.g., assume
*f*(*x*) = tanh($\beta$*x*)] satisfies the condition

$$\beta\,\lambda_{\min} > -1 \tag{7.1.32}$$

Here, $\lambda_{\min}$
is the smallest eigenvalue of the interconnection matrix **W**.
Equation (7.1.30) is identical to Equation (7.1.23) since it
is assumed that $\lambda_j = 1$,
*j* = 1, 2, ..., *n*. If **W**
has no negative eigenvalues, then Equation (7.1.32) is satisfied
by any value of $\beta$, since $\beta > 0$. On the other hand,
if **W** has one or more negative eigenvalues, then $\lambda_{\min}$,
the most negative of them, places an upper limit $1/\left|\lambda_{\min}\right|$
on the gain for stability.

To prove that *E* in Equation (7.1.30) is a
Liapunov function when Equation (7.1.32) is satisfied, consider
the change in *E* between two discrete time steps:

$$\Delta E(k) = E(k+1) - E(k) = -\frac{1}{2}\mathbf{x}^{\mathrm{T}}(k+1)\mathbf{W}\mathbf{x}(k+1) + \frac{1}{2}\mathbf{x}^{\mathrm{T}}(k)\mathbf{W}\mathbf{x}(k) - \boldsymbol{\theta}^{\mathrm{T}}\Delta\mathbf{x}(k) + \sum_{i=1}^{n}\left[G\left(x_i(k+1)\right) - G\left(x_i(k)\right)\right] \tag{7.1.33}$$

Using the update equation for *x**i*
from Equation (7.1.29) and the symmetry property of **W**,
Equation (7.1.33) becomes

$$\Delta E(k) = -\frac{1}{2}\Delta\mathbf{x}^{\mathrm{T}}(k)\mathbf{W}\Delta\mathbf{x}(k) - \sum_{i=1}^{n}\Delta x_i(k)\,f^{-1}\left(x_i(k+1)\right) + \sum_{i=1}^{n}\left[G\left(x_i(k+1)\right) - G\left(x_i(k)\right)\right] \tag{7.1.34}$$

where $\Delta x_i(k) = x_i(k+1) - x_i(k)$.
The last term in Equation (7.1.34) is related to $\Delta x_i(k)$
by the inequality (Marcus and Westervelt, 1989)

$$G\left(x_i(k+1)\right) - G\left(x_i(k)\right) \leq \Delta x_i(k)\,f^{-1}\left(x_i(k+1)\right) - \frac{1}{2\beta}\left[\Delta x_i(k)\right]^{2} \tag{7.1.35}$$

which holds since, by Equation (7.1.31), $G'(x) = f^{-1}(x)$
and the slope of $f^{-1}$ is bounded from below by $1/\beta$. Combining Equations (7.1.34)
and (7.1.35) leads to

$$\Delta E(k) \leq -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\Delta x_i(k)\left[w_{ij} + \frac{1}{\beta}\,\delta_{ij}\right]\Delta x_j(k) \tag{7.1.36}$$

where $\Delta\mathbf{x}(k) = \mathbf{x}(k+1) - \mathbf{x}(k)$,
and $\delta_{ij} = 1$
for *i* = *j* and $\delta_{ij} = 0$
otherwise. Now, if the matrix $\mathbf{W} + \frac{1}{\beta}\mathbf{I}$ is positive definite, then $\Delta E(k) \leq 0$
(equality holds only when $\Delta\mathbf{x}(k) = \mathbf{0}$,
which implies that the network has reached an attractor). The
requirement that $\mathbf{W} + \frac{1}{\beta}\mathbf{I}$ be positive definite is satisfied by the inequality
of Equation (7.1.32). This result, combined with the fact that
*E*(*k*) is bounded, shows that the function in Equation
(7.1.30) is a Liapunov function for the DAM in Equation (7.1.29),
and thus the DAM is stable. If, on the other hand, the inequality
of Equation (7.1.32) is violated, then it can be shown that the
DAM can develop period-2 limit cycles (Marcus and Westervelt,
1989).
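The gain bound and the period-2 behavior can both be seen in a deliberately tiny example. The sketch below uses a single unit with self-weight −1, so that $\lambda_{\min} = -1$ and the bound of Equation (7.1.32) becomes $\beta$ < 1; the specific gains tested are illustrative:

```python
import numpy as np

# Parallel update of Eq. (7.1.29) with theta = 0 and f(x) = tanh(beta*x).
def run(W, beta, x0, steps=60):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = np.tanh(beta * (W @ x))
    return x

# One unit with w11 = -1: lambda_min = -1, so stability needs beta < 1.
W = np.array([[-1.0]])
lam_min = np.linalg.eigvalsh(W).min()
print("gain bound 1/|lambda_min|:", 1.0 / abs(lam_min))

for beta in (0.8, 5.0):                   # below and above the bound
    x = run(W, beta, [0.5])
    x_next = np.tanh(beta * (W @ x))
    # beta = 0.8 settles to the fixed point 0; beta = 5.0 keeps flipping
    # sign between successive steps: a period-2 limit cycle.
    print(f"beta={beta}: x={x}, next={x_next}")
```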

Discrete-Time Discrete-State Model

Starting with the dynamical system in Equation (7.1.29)
and replacing the continuous activation function by the sign function,
one arrives at the discrete-time discrete-state parallel (synchronous)
updated DAM model where all states *x**i*(*k*),
*i* = 1, 2, ..., *n*, are updated
simultaneously according to

$$x_i(k+1) = \mathrm{sgn}\left[\sum_{j=1}^{n} w_{ij}\,x_j(k) + \theta_i\right] \tag{7.1.37}$$

Another version of this DAM is one which operates
in a serial (asynchronous) mode. It assumes the same dynamics
as Equation (7.1.37) for the *i*th unit, but only one unit
updates its state at a given time. The unit which updates its
state is chosen randomly and independently of the times of firing
of the remaining (*n* - 1)
units in the DAM. This asynchronously updated discrete-state
DAM is commonly known as the discrete Hopfield net, which was
originally proposed and analyzed by Hopfield (1982). In its original
form, this net was proposed as an associative memory which employed
the correlation recording recipe for memory storage.

It can be shown (see Problem 7.1.17) that the discrete
Hopfield net with a symmetric interconnection matrix (*w**ij* = *w**ji*)
and with nonnegative diagonal elements (*w**ii* ≥ 0)
is stable, with the same Liapunov function as that of a continuous-time
Hopfield net in the limit of high amplifier gain; i.e., it has
the Liapunov function in Equation (7.1.24). Hopfield (1984) showed
that both nets (discrete and continuous nets with the above assumptions)
have identical energy maxima and minima. This implies that there
is a one-to-one correspondence between the memories of the two
models. Also, since the two models may be viewed as minimizing
the same energy function *E*, one would expect the macroscopic
behavior of the two models to be very similar; that is, both models
will perform similar memory retrievals.
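To close the section, here is a minimal sketch of the discrete Hopfield net in its serial (asynchronous) retrieval mode, with correlation recording, zero bias, and illustrative sizes; the random update order follows the description above:

```python
import numpy as np

rng = np.random.default_rng(9)
n, m = 100, 8

# Correlation-recorded autoassociative discrete Hopfield net (theta = 0).
X = rng.choice([-1.0, 1.0], size=(n, m))
W = (X @ X.T) / n
np.fill_diagonal(W, 0.0)      # symmetric W with w_ii = 0 >= 0

def recall(x0, sweeps=20):
    x = x0.copy()
    for _ in range(sweeps):
        for i in rng.permutation(n):   # asynchronous: one unit at a time
            # Serial version of Eq. (7.1.37); the energy of Eq. (7.1.27)
            # never increases under such updates.
            x[i] = 1.0 if W[i] @ x >= 0 else -1.0
    return x

flip = rng.random(n) < 0.15
x_noisy = X[:, 0] * np.where(flip, -1.0, 1.0)
print("recovered:", np.array_equal(recall(x_noisy), X[:, 0]))  # typically True
```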