for the equilibrium point **x*** = [0 1]T.
Is this an asymptotically stable point? Why?

**4.2.3** Study the stability
of the lossless pendulum system with the nonlinear dynamics

about the equilibrium points **x*** = [0 0]T
and **x*** = [ 0]T.
Here, measures the angle of the pendulum with respect to its
vertical rest position, *g* is gravitational acceleration,
and *L* is the length of the pendulum.

**4.2.4** Liapunov's first
method for studying the stability of nonlinear dynamical systems
(see footnote #4) is equivalent to studying the asymptotic stability
of a linearized version of these systems about an equilibrium
point. Linearize the second order nonlinear system in Problem
4.2.2 about the equilibrium point **x*** = [0 1]T
and write the system equations in the form . Show that the system
matrix **A** is identical to the Jacobian matrix **f** '(**x**)
at **x** = [0 1]T
of the original nonlinear system and thus both matrices have the
same eigenvalues. (Note that the asymptotic stability of a linear
system requires the eigenvalues of its system matrix **A**
to have strictly negative real parts).

**4.2.5** The linearization
method for studying the stability of a nonlinear system at a given
equilibrium point may fail when the linearized system is stable,
but not asymptotically; that is, if all eigenvalues of the system
matrix **A** have nonpositive real parts, and if the real part
of one or more eigenvalues of **A** has zero real part. Demonstrate
this fact for the nonlinear system

at the equilibrium point **x*** = [0, 0]T.
(Hint: Simulate the above dynamical system for the three cases:
*a* < 0, *a* = 0, and *a* > 0,
for an initial condition **x**(0) of your choice).

* **4.3.1** Show that the
Hessian of *J* in Equation (4.3.6) is given by

Also, show that

* **4.3.2** Study the stability
of the equilibrium point(s) of the dynamical system , where *J*(**w**)
is given by

Note that the above system corresponds to the average learning equation of Linsker's learning rule (see Section 3.3.4) without weight clipping.

**4.3.3** Verify the Hessian
matrices given in Section 4.3 for Oja's, Yuille et al., and Hassoun's
learning rules, given in Equations (4.3.14), (4.3.21), and (4.3.30),
respectively.

**4.3.4** Show that the average
learning equation for Hassoun's rule is given by Equation (4.3.26).

**4.3.5** Study the stability
of the equilibrium points of the stochastic differential equation/learning
rule (Riedel and Schild, 1992)

where is a positive integer. ( Note: this rule
is equivalent to Yuille et al. rule for = 2 ).

† **4.3.6** Study, via
numerical simulations, the stability of the learning rule ( Riedel
and Schild, 1992 )

where **. **Assume a training set {**x**}
of twenty vectors in R10
whose components are generated randomly and independently according
to the normal distribution N(0, 1). Is there a relation between
the stable point(s) **w***
(if such point(s) exist) and the eigenvectors of the input data
autocorrelation matrix? Is this learning rule local? Why?

**4.3.7** Show that the discrete-time
version of Oja's rule is a good approximation of the normalized
Hebbian rule in Equation (4.3.33) for small values. Hint: Start
by showing that

* **4.3.8** Consider the general
learning rule described by the following discrete-time gradient
system:

(1)

with > 0. Assume that **w***
is an equilibrium point for this dynamical system.

a. Show that in the neighborhood of **w***,
the gradient *J*(**w**) can be approximated as

(2)

where **H**(**w***)
is the Hessian of *J* evaluated at **w***.

b. Show that the gradient in Equation (2) is exact
when *J*(**w**) is quadratic; i.e., where **Q** is
a symmetric matrix, and **b** is a vector of constants.

c. Show that the linearized gradient system at **w***
is given by

(3)

where **I** is the identity matrix.

d. What are the conditions on **H**(**w***)
and for local asymptotic stability of **w***
in Equation (3)?

e. Use the above results to show that, in an average
sense, the -LMS rule in Equation (3.1.35) converges asymptotically
to the equilibrium solution in Equation (3.1.45) if 0 < < ,
where max is the largest
eigenvalue of the autocorrelation matrix **C** in Equation
(3.3.4). (Hint: Start with the gradient system in Equation (1)
and use Equation (3.1.44) for *J*). Now, show that is a
sufficient condition for convergence of the -LMS rule. (Hint:
The trace of a matrix (the sum of all diagonal elements) is equal
to the sum of its eigenvalues).

* **4.3.9** Use the results
from the previous problem and Equation (4.3.14) to find the range
of values for for which the discrete-time Oja's rule is stable
(in an average sense). Repeat for Hassoun's rule which has the
Hessian matrix given by Equation (4.3.30), and give a justification
for the choice = 1 which has led to Equation (4.3.32).

† **4.3.10** Consider
a training set of 40 15-dimensional vectors whose components are
independently generated according to a normal distribution N(0, 1).
Employ the stochastic discrete-time version of Oja's, Yuille
et al., and Hassoun's rules (replace by **w***k*+1 - **w***k*
in Equations (4.3.11), (4.3.17), and (4.3.24), respectively) to
extract the principal eigenvector of this training set. Use a
fixed presentation order of the training vectors. Compare the
convergence behavior of the three learning rules by generating
plots similar to those in Figures 4.3.1 (a) and (b). Use = 0.005,
= 100, and a random initial **w**. Repeat using
the corresponding discrete-time average learning equations with
the same learning parameters and initial weight vector as before
and compare the two sets of simulations.

**4.3.11** This problem illustrates
an alternative approach to the one of Section 4.3 for proving
the stability of equilibrium points of an average learning equation
(Kohonen, 1989). Consider a stochastic first order differential
equation of the form where **x**(*t*) is governed by
a stationary stochastic process. Furthermore, assume that the
vectors **x** are statistically independent from each other
and that strong mixing exists. Let **z** be an arbitrary constant
vector having the same dimension as **x** and **w**. Now,
the "averaged" trajectories of **w**(*t*) are
obtained by taking the expected value of ,

a. Show that

where is the angle between vectors **z**
and **w**.

b. Let **z** = **c**(*i*),
the *i*th unity-norm eigenvector of the autocorrelation matrix
**C** = <**xx**T>.
Show that, the average rate of change of the cosine of the angle
between **w** and **c**(*i*)
for Oja's rule [Equation (4.3.12)] is given by

where *i*
is the eigenvalue associated with **c**(*i*).
Note that the **c**(*i*)'s
are the equilibria of Oja's rule.

c. Use the result in part b to show that if then
**w**(*t*) will converge to the solution **w*** = **c**(1),
where **c**(1) is the
eigenvector with the largest eigenvalue 1
(Hint: Recall the bounds on the Rayleigh quotient given in Section
4.3.2).

*** 4.3.12**
Use the technique outlined in Problem 4.3.11 to study the convergence
properties (in the average) of the following stochastic learning
rules which employ a generalized forgetting law:

a. .

b. .

Assume that > 0 and *y* = **w**T**x**,
and that *g*(*y*) is an arbitrary scalar function of
*y* such that exists. Note that Equation (4.7.18) and Oja's
rule are special cases of the learning rules in a and b, respectively.

**4.4.1** Show that Equation
(4.4.7) is the Hessian for the criterion function implied by Equation
(4.4.5).

**4.6.1** Study (qualitatively)
the competitive learning behavior which minimizes the criterion
function

where is as defined in Equation (4.6.3). Can you
think of a physical system (for some integer value of *N*)
which is governed by this "energy" function *J*?

**4.6.2** Derive a stochastic
competitive learning rule whose corresponding average learning
equation maximizes the criterion function in Equation (4.6.16).

* **4.7.1** Show that for
the one-dimensional feature map, Equation (4.7.6) can be derived
from Equation (4.7.4). See Hertz et al. (1991) for hints.

**4.7.2** Show that Equation
(4.7.5) satisfies Equation (4.7.6).

**4.7.3** Solve Equation (4.7.5)
for *p*(*x*) *x, where R.
For which input distribution p*(*x*) do we have a zero-distortion
feature map?

**4.7.4** Prove the stability
of the equilibrium point in Equation (4.7.20). (Hint: Employ the
technique outlined in Problem 4.3.11).

Goto [4.0][4.1] [4.2] [4.3] [4.4] [4.5] [4.6] [4.7] [4.8] [4.9]