Mean field annealing (
Soukoulis et al., 1983;
Bilbro
et al., 1989) is a deterministic approximation to simulated annealing
which is significantly more computationally efficient (faster)
than simulated annealing (
Bilbro et al., 1992). Instead of directly
simulating the stochastic transitions in simulated annealing,
the mean (or average) behavior of these transitions is used to
characterize a given stochastic system. Because computations
using the mean transitions attain equilibrium faster than those
using the corresponding stochastic transitions, mean field annealing
relaxes to a solution at each temperature much faster than does
stochastic simulated annealing. This leads to a significant decrease
in computational effort. The idea of using a deterministic mean-valued
approximation for a system of stochastic equations to simplify
the analysis has been adopted at various instances in this book
(e.g., see Section 4.3). Generally speaking, such approximations
are adequate in high dimensional systems of many interacting units
(states) where each state is a function of all or a large number
of other states allowing the central limit theorem to be used
(see Problem 7.5.9). In this section, we restrict our discussion
of mean field annealing to the Boltzmann machine which was introduced
in the previous section.

At thermal equilibrium, the stochastic Hopfield
net fluctuates about the constant average values in Equation (8.4.1).
The mean state <**x**> is thus one of the local minima
of the quadratic energy function *E*(**x**) at temperature
*T* = . The location of
this minimum may then be computed by solving the set of *n*
nonlinear mean-field equations. An alternative way is to solve
for <**x**> by gradient descent on *E*(**x**)
from an initial random state. This is exactly what the deterministic
continuous Hopfield net with hyperbolic tangent activations does
[recall Equation (7.1.25)]. In fact, Equation (8.4.1) has the
same form as the equation governing the equilibrium points of
the Hopfield net (Bilbro et al., 1989). To see this, we recall
from Equation (7.1.19) the dynamics of the *i*th unit in
a continuous-state electronic Hopfield net, namely

(8.4.2)

The equilibria of Equation (8.4.2) are given by setting
to zero, giving

(8.4.3)

Assuming the common choice *x* = *f*(*u*) = tanh(*u*)
in Equation (8.4.3) gives

(8.4.4)

which becomes identical in form to Equation (8.4.1)
after setting *i* = 1.

The electronic continuous-state (deterministic)
Hopfield net must employ high-gain amplifiers (large ) in order
to achieve a binary-valued solution as is normally generated by
the original stochastic net. However, starting with a deterministic
net having large may lead to a poor local minimum as is the case
with a stochastic net whose "temperature" *T* is
quenched. Since annealing a stochastic net increases the probability
that the state will converge to a global minimum, we may try to
reach this minimum by annealing the approximate mean-field system.
This approach is known as mean-field annealing. Mean-field annealing
can be realized very efficiently in electronic (analog) nets like
the one in Figure (7.1.3) where dynamic amplifier gains allow
for a natural implementation of continuous cooling schedules (Lee
and Sheu, 1993). This is referred to as "hardware annealing."

The deterministic Boltzmann machine is applicable
only to problems involving quadratic cost functions. However,
the principles of mean-field annealing may still be applied to
more general cost functions with substantial savings in computing
time by annealing a steady-state average system as opposed to
a stochastic one. In addition to being faster than simulated
annealing, mean-field annealing has proved to lead to better solutions
in several optimization problems (Van den Bout and Miller, 1988,
1989; Cortes and Hertz, 1989; Bilbro and Snyder, 1989).

The excessive number of calculations required by
Boltzmann machine learning may be circumvented by extending the
above mean-field method to adapting the *w**ij*'s
(Peterson and Anderson, 1987). Here, the correlations <*x**i*
*x**j*> are
approximated by *s**isj*
where *s**i*
is given by the average equation:

(8.4.5)

as in Equation (8.4.1), but with *I**i* = 0
for convenience. Equation (8.4.5) applies for free units (hidden
units and unclamped visible units). For a clamped visible unit
*i*, *s**i*
is set to 1 (the value that the unit's output is supposed to be
clamped at). As required by Boltzmann learning, the correlations
<*x**i* *x**j*>
should be computed at thermal equilibrium. This means that we
must use approximation terms *s**isj*
where the *s**i*'s
(*s**j*'s) are
solutions to the *n* nonlinear equations represented by Equation
(8.4.5). One may employ an iterative method to solve for the
unclamped states *s**i*
according to

(8.4.6)

combined with annealing (gradual increasing of ).
Peterson and Anderson (1987) reported that this mean-field learning
is 10 to 30 times faster than simulated annealing on some test
problems with somewhat better results.