Extreme Point Machine (EPM): 2-D 2-Class Version

by Ahmad Masadeh and Mohamad Hassoun (May 1998)

This version of our EPM employs a polynomial preprocessing (mapping) layer or a mapping layer of perceptrons with random Gaussian weights. Training the output perceptron weights is accomplished using the AHK learning rule (click here for details).

Instructions

1. Use the mouse to create a Training Set: point and click (Class 1 = red square, Class 2 = black square). Please note that the coordinates of the input window are [-5,+5] for both axis. The training set is restricted to a maximum of 200 points. The training set must contain at least one sample per class. An interesting (and difficult to learn) two interwind spirals (of period 3) training set of 194 points may be automatically loaded by clicking on the "Load Spirals" button. However, the user is not allowed to use this option until at least one user specified training set (a simple one is recommended) is entered and trained (or cleared).
2. If you make a mistake, click on Clear. This will clear the training set, resets margins, and initialize weights to zero. You may also selectively delete a training sample by clicking on it once (click the center of the square representing such sample).
3. Set Polynomial Order (r). You may use an integer between 1 (linear) and 20. If you choose to use a mapping layer of perceptrons, then set the standard deviation s of the random Gaussian weights (where s is a value between 1 and 10), the number of hidden units J (min 2 and max 200), and your choice of hidden unit activation function (sign quantizer or hyperbolic tangent).
4. Press the Train button to start training. The initial training phase assumes zero initial weights and it defaults to minimum of 2000 cycles and maximum of 3000 cycles. The maximum number of cycles can be set (an integer between 1 and 50,000) in the "Max Cycles" window. The cycle number is displayed in the cycles window. Train allows you to continue training strating from the state (weights) where the previous training session ended. At the end of training the applet highlights (in green) the separated regions. Extreme Points (support vectors) from each class are shown with a "ring" around them only when a solution is found. Note that these marked points are only meaningful after a sufficient number of training cycles is executed so that the separating surface converges to its terminal configuration. The user can make sure that this state is reached by further training to assure that the separating surface has converged. Misclassified samples from class 1 and class 2 are designated by a "hollow red square" and "hollow black square", respectively.
5. The user may at this point add or delete samples and/or change polynomial order (or the perceptron hidden layer parameters) and continue training if he/she desires. However, doing so invokes automatic reset (i.e., output unit weights are cleared, margins are reset, and cycles counter is reset to zero). This "Auto Reset" also leads to a new set of hidden layer random weights for the case of the EPM with perceptron preprocessing layer.
6. The Reset button reinitializes the weights (and the training cycle count) to zero, resets margins and clears the separating region, but retains the training set. Reset leads to a new set of hidden layer random weights for the case of the EPM with perceptron preprocessing layer.
7. Execution Time: A Gateway G6 333MHz PC executes around 650 cycles/min a 200 sample problem with polynomial preprocressing and r=20 (same problem executes at 850 cycles/min when a 200 unit perceptron preprocessing layer is employed). The period-3 interwind spiral problem (194 samples) can be solved using an order 20 polynomial EPM and requires about 40,000 cycles (A slightly smaller spiral obtained from the original one by deleting the six outermost samples from each class converges in 2600 cycles). On the other hand, this problem can be solved (with high probability) employing an EPM with random layer of 200 perceptrons (set weight standard deviation of preprocessing layer close to 10) and requires about 1000 to few thousdand cycles (or about 1 to few minutes on a 333MHz PC). It should be noted that the present applet code is not optimized for speed.