4.4 Principal Component Analysis (PCA)

The PCA network of Section 3.3.5, employing Sanger's rule, is analyzed in this section. Recalling Sanger's rule [from Equation (3.3.19)] and writing it in vector form for the continuous-time case, we get

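$$\frac{d\mathbf{w}_i}{dt} = \rho \left[ y_i \mathbf{x} - y_i \sum_{k=1}^{i} y_k \mathbf{w}_k \right] \qquad (4.4.1)$$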

with $i = 1, 2, \ldots, m$, where $y_i = \mathbf{w}_i^T \mathbf{x}$ is the output of the $i$th linear unit and $m$ is the number of units in the PCA network. We will assume, without any loss of generality, that $m = 2$. This leads to the following set of coupled learning equations for the two units:

$$\frac{d\mathbf{w}_1}{dt} = \rho \left[ y_1 \mathbf{x} - y_1^2 \mathbf{w}_1 \right] \qquad (4.4.2)$$

and

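$$\frac{d\mathbf{w}_2}{dt} = \rho \left[ y_2 \mathbf{x} - y_2^2 \mathbf{w}_2 - y_1 y_2 \mathbf{w}_1 \right] \qquad (4.4.3)$$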

Equation (4.4.2) is Oja's rule. It is independent of unit 2 and thus converges to $\mathbf{c}^{(1)}$, the principal eigenvector of the autocorrelation matrix $C$ of the input data (assuming zero-mean input vectors). Equation (4.4.3) is Oja's rule with the added inhibitory term $-\rho\, y_1 y_2 \mathbf{w}_1$. Next, we will assume a sequential operation of the two-unit net, where unit 1 is allowed to converge fully before unit 2 starts to evolve. This mode of operation is permissible since unit 1 is independent of unit 2.
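The coupled two-unit dynamics can be tried numerically. The following is a minimal sketch, not a definitive implementation: it assumes linear units $y_i = \mathbf{w}_i^T \mathbf{x}$, replaces the continuous-time equations with a simple Euler-style stochastic update, and uses synthetic zero-mean data. The function name `sanger_two_units`, the step size `rho`, the number of epochs, and the data variances are all illustrative choices, not values from the text.

```python
import numpy as np

def sanger_two_units(X, rho=0.005, epochs=20, seed=0):
    """Discrete-time sketch of Sanger's rule for m = 2 linear units.

    X is an array of zero-mean input vectors, one per row.
    rho and epochs are illustrative values, not taken from the text.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    # Small random initial weights (w = 0 is an equilibrium, so avoid it).
    w1 = rng.normal(scale=0.1, size=n)
    w2 = rng.normal(scale=0.1, size=n)
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            y1 = w1 @ x
            y2 = w2 @ x
            # Unit 1: Oja's rule, independent of unit 2.
            dw1 = rho * (y1 * x - y1**2 * w1)
            # Unit 2: Oja's rule plus the inhibitory term -rho * y1 * y2 * w1.
            dw2 = rho * (y2 * x - y2**2 * w2 - y1 * y2 * w1)
            w1 += dw1
            w2 += dw2
    return w1, w2

# Synthetic zero-mean data with distinct variances along each axis
# (an assumption made only for this demonstration).
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(2000, 3)) * np.sqrt([3.0, 1.0, 0.2])

w1, w2 = sanger_two_units(X)

# Compare with the eigenvectors of the sample autocorrelation matrix.
C = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
c1, c2 = eigvecs[:, -1], eigvecs[:, -2]

# Absolute values, because the learned vectors are defined only up to sign.
print("alignment of w1 with c1:", abs(w1 @ c1))
print("alignment of w2 with c2:", abs(w2 @ c2))
```

Both units are updated at every step here, so the sketch illustrates the simultaneous mode of operation. The sequential mode would instead train `w1` to convergence first and then hold it fixed while `w2` is trained.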

With the sequential update assumption, Equation (4.4.3) becomes

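$$\frac{d\mathbf{w}_2}{dt} = \rho \left[ y_2 \mathbf{x} - y_2^2 \mathbf{w}_2 - y_2 \left( \mathbf{x}^T \mathbf{c}^{(1)} \right) \mathbf{c}^{(1)} \right] \qquad (4.4.4)$$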

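For clarity, we will drop the subscript on $\mathbf{w}$. Using $y_2 = \mathbf{w}^T \mathbf{x}$, $C = \langle \mathbf{x}\mathbf{x}^T \rangle$, and $C\mathbf{c}^{(1)} = \lambda_1 \mathbf{c}^{(1)}$, the average learning equation for unit 2 is given by

$$\left\langle \frac{d\mathbf{w}}{dt} \right\rangle = \rho \left[ C\mathbf{w} - \left( \mathbf{w}^T C \mathbf{w} \right) \mathbf{w} - \lambda_1 \mathbf{c}^{(1)} \mathbf{c}^{(1)T} \mathbf{w} \right] \qquad (4.4.5)$$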

which has equilibria satisfying

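$$C\mathbf{w}^* - \left( \mathbf{w}^{*T} C \mathbf{w}^* \right) \mathbf{w}^* - \lambda_1 \mathbf{c}^{(1)} \mathbf{c}^{(1)T} \mathbf{w}^* = \mathbf{0} \qquad (4.4.6)$$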
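The equilibrium condition can be checked by direct substitution. The following worked step is a sketch under stated assumptions: the eigenvectors $\mathbf{c}^{(i)}$ of $C$ are orthonormal, with $C\mathbf{c}^{(i)} = \lambda_i \mathbf{c}^{(i)}$ and $\lambda_1 > 0$, and the averaged right-hand side of the unit 2 learning equation is written as $\mathbf{f}(\mathbf{w}) = C\mathbf{w} - (\mathbf{w}^T C \mathbf{w})\mathbf{w} - \lambda_1 \mathbf{c}^{(1)}\mathbf{c}^{(1)T}\mathbf{w}$. The symbol $\mathbf{f}$ is introduced here only for illustration.

```latex
% Substituting w = c^{(i)} with i >= 2 (orthogonal to c^{(1)}):
\mathbf{f}\big(\mathbf{c}^{(i)}\big)
  = \lambda_i \mathbf{c}^{(i)} - \lambda_i \mathbf{c}^{(i)}
    - \lambda_1 \mathbf{c}^{(1)} \underbrace{\mathbf{c}^{(1)T}\mathbf{c}^{(i)}}_{=\,0}
  = \mathbf{0}, \qquad i = 2, 3, \ldots, n

% Substituting w = c^{(1)}:
\mathbf{f}\big(\mathbf{c}^{(1)}\big)
  = \lambda_1 \mathbf{c}^{(1)} - \lambda_1 \mathbf{c}^{(1)}
    - \lambda_1 \mathbf{c}^{(1)} \underbrace{\mathbf{c}^{(1)T}\mathbf{c}^{(1)}}_{=\,1}
  = -\lambda_1 \mathbf{c}^{(1)} \neq \mathbf{0}
```

The same substitution with $\mathbf{w} = -\mathbf{c}^{(i)}$ gives zero as well, since every term of $\mathbf{f}$ is odd in $\mathbf{w}$, and $\mathbf{w} = \mathbf{0}$ satisfies the condition trivially.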

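Hence, $\mathbf{w}^* = \mathbf{0}$ and $\mathbf{w}^* = \pm\mathbf{c}^{(i)}$ with $i = 2, 3, \ldots, n$ are solutions. Note that the point $\mathbf{w}^* = \mathbf{c}^{(1)}$ is not an equilibrium. The Hessian (the negative of the Jacobian of the bracketed term in Equation (4.4.5)) is given by

$$H(\mathbf{w}) = -C + \left( \mathbf{w}^T C \mathbf{w} \right) I + 2\,\mathbf{w}\mathbf{w}^T C + \lambda_1 \mathbf{c}^{(1)} \mathbf{c}^{(1)T} \qquad (4.4.7)$$

Since $H(\mathbf{0}) = -C + \lambda_1 \mathbf{c}^{(1)} \mathbf{c}^{(1)T}$ is not positive definite, the equilibrium $\mathbf{w}^* = \mathbf{0}$ is not stable. For the remaining equilibria we have the Hessian matrix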

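$$H\big(\mathbf{c}^{(i)}\big) = -C + \lambda_i I + 2\lambda_i \mathbf{c}^{(i)} \mathbf{c}^{(i)T} + \lambda_1 \mathbf{c}^{(1)} \mathbf{c}^{(1)T} \qquad (4.4.8)$$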

which is positive definite only at $\mathbf{w}^* = \pm\mathbf{c}^{(2)}$, assuming $\lambda_2 > \lambda_3$. Thus, Equation (4.4.5) converges asymptotically to the stable vector $\mathbf{c}^{(2)}$ (unique up to sign), which is the eigenvector of $C$ with the second largest eigenvalue $\lambda_2$. Similarly, for a network with $m$ interacting units according to Equation (4.4.1), the $i$th unit ($i = 1, 2, \ldots, m$) will extract the eigenvector of $C$ with the $i$th largest eigenvalue.
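The stability argument can be checked numerically on a small example. The sketch below assumes the averaged field $C\mathbf{w} - (\mathbf{w}^T C \mathbf{w})\mathbf{w} - \lambda_1 \mathbf{c}^{(1)}\mathbf{c}^{(1)T}\mathbf{w}$ and the Hessian form $-C + (\mathbf{w}^T C \mathbf{w}) I + 2\mathbf{w}\mathbf{w}^T C + \lambda_1 \mathbf{c}^{(1)}\mathbf{c}^{(1)T}$. The matrix `C`, its eigenvalues (3, 1, 0.2), and the helper names `field`, `hessian`, and `min_eig` are hypothetical choices made only for this illustration.

```python
import numpy as np

def field(w, C, lam1, c1):
    """Bracketed term of the averaged unit 2 learning equation (assumed form)."""
    return C @ w - (w @ C @ w) * w - lam1 * c1 * (c1 @ w)

def hessian(w, C, lam1, c1):
    """Hessian form assumed here: -C + (w'Cw) I + 2 w w' C + lam1 c1 c1'."""
    n = len(w)
    return (-C + (w @ C @ w) * np.eye(n)
            + 2.0 * np.outer(w, w) @ C
            + lam1 * np.outer(c1, c1))

def min_eig(H):
    # H is symmetric at the equilibria; symmetrize to guard against round-off.
    return np.linalg.eigvalsh((H + H.T) / 2.0).min()

# Hypothetical autocorrelation matrix with eigenvalues 3 > 1 > 0.2
# and a random orthonormal set of eigenvectors (columns of Q).
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
lams = np.array([3.0, 1.0, 0.2])
C = Q @ np.diag(lams) @ Q.T
c = [Q[:, i] for i in range(3)]  # c[0] is the principal eigenvector

# c[1] and c[2] are equilibria; c[0] is not.
residuals = [np.linalg.norm(field(ci, C, lams[0], c[0])) for ci in c]

# Smallest Hessian eigenvalue at each candidate equilibrium:
# positive means positive definite, hence stable.
results = {
    "w* = 0": min_eig(hessian(np.zeros(3), C, lams[0], c[0])),
    "w* = c2": min_eig(hessian(c[1], C, lams[0], c[0])),
    "w* = c3": min_eig(hessian(c[2], C, lams[0], c[0])),
}
for name, value in results.items():
    print(name, "smallest Hessian eigenvalue:", round(value, 6))
```

With these illustrative eigenvalues, only the second eigenvector gives a positive smallest eigenvalue ($\lambda_2 - \lambda_3 = 0.8$), in line with the requirement $\lambda_2 > \lambda_3$.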

The unit-by-unit description presented here helps simplify the explanation of the PCA net behavior. In practice, the weight vectors $\mathbf{w}_i$ approach their final values simultaneously, not one at a time. Nevertheless, the above analysis still applies asymptotically to the end points. Note that the simultaneous evolution of the $\mathbf{w}_i$ is advantageous, since it leads to faster learning than if the units are trained one at a time.
