This chapter introduces the LTG as the basic computational
unit in a binary state neural network, and analyzes its computational
capabilities in realizing binary mappings. The LTG is then extended
to the more powerful PTG. A single LTG can only realize threshold
(linearly separable) functions. On the other hand, a PTG with
order is capable of realizing any Boolean
function of *n* inputs. The impressive computational power
of a PTG comes at the expense of increased implementation complexity.

The notion of a separating surface is introduced
and it is used to describe the operation of a threshold gate as
a two-class classifier. It is found that a single LTG is characterized
by a linear separating surface (hyperplane). On the other hand,
a single PTG is capable of realizing a highly-flexible nonlinear
(polynomial) separating surface. The flexibility of a PTG's separating
surface is due to a dimensionality expanding polynomial mapping
(-mapping) which is realized by a preprocessing layer intrinsic
to the PTG.

A fundamental theorem, known as the Function Counting
Theorem, is proved and is used to derive the following important
properties of threshold gates. First, the statistical capacity
of a single LTG, assuming data in general position in R*n*,
is equal to twice the number of its degrees of freedom (weights).
This result is also true for a PTG. Second, a threshold gate
with *d*-degrees of freedom trained to correctly classify
*m* patterns will, on the average, respond ambiguously to
a new input pattern as long as . In this
context, it is found that the probability of generalization error
approaches zero asymptotically as . Finally,
in the limit as , we find that the average
number of patterns needed for complete characterization of a given
training set is only twice the number of degrees of freedom *d*
of a threshold gate, assuming the separability of this training
set by the threshold gate.

**1.1.1** Verify that the
patterns in Figure 1.1.5 are admissible threshold patterns.

**1.1.3** Prove that a PTG(*r*)
with *n* binary inputs has degrees
of freedom, not including threshold.

**1.2.1** Show that .
Also, show that .

**1.2.2** a. Prove algebraically
that , for 1 *i* *n*.

b. Prove by induction, using the recursion relation
in part (a), the Binomial Theorem:

where and are real numbers and *n* is an integer.

c. Use the Binomial Theorem to prove that .

**1.2.3** Verify the inequality
in Equation (1.2.5) by showing that for
*n* 2.

**1.2.4** Plot the function

for *a* = 2 and 3. Now, let *x*3 = *x*1*x*2.
Plot *g*(*x*1, *x*2, *x*3) = 0.
Show that the four patterns of the XNOR function of Example 1.2.1
are properly separated by *g*.

*** 1.3.1 **Prove
that , and that .
Also, show that , if *m* = 2(*n* + 1).

**1.3.2** Assume *n*
< *m* *d* = , is it true
that *m* points in general position in R*n*
are mapped by the -mapping of a PTG(*r*) into points in general
position in R*d1*
space (i.e., -general position)? Why? Can this PTG map *m*
arbitrary points in R*n*
into *m* points in -general position? Explain.

*** 1.3.3**
Given

**1.3.4** Consider a mapping
*f* : R R, defined as the set of input/output
pairs {*x**i*,
*y**i*}, *i* = 1, 2, ..., *m*.
Assume that *x**i* *x**j*
for all *i* *j*. Show that this mapping
can be exactly realized by a polynomial of order *r* = *m* 1
(this is referred to as "strict" interpolation). Hint:
The determinant

is the Vandermonde's Determinant, which is nonzero
if *z**i* *z**j*,
for all *i* *j* = 1, 2, ..., *m*.

*** 1.3.5**
Consider the mapping from R1
to {0, 1} defined by the input/output pairs (0, 0),
(1, 1), (2, 0), (3, 0), and (4, 1). Use the
result of Problem 1.3.4 to synthesize a polynomial *g*(*x*)
of minimal order which realizes the above mapping. Plot *g*(*x*)
to verify your solution. Next, assume that a PTG(*r*) is
used to realize the same mapping. Is it possible for the order
*r* of this PTG to be smaller than that of the polynomial
*g*(*x*)? If the answer is yes, then synthesize the
appropriate weights for this PTG (assume a zero threshold) and
plot the weighted-sum (i.e., the PTG output without thresholding)
versus *x*.

**1.3.6** Derive Equation
(1.3.4) from the recursion relation .

**1.3.7** Let *y*(*x*) = *a* + *bx*,
where *a*, *b* R. Find the parameters *a* and
*b* which minimize:

a.

b.

**1.3.8** Find the polynomial
*y*(*x*) = *ax* + *bx*3,
which approximates the function *g*(*x*) = sin *x*
over [0, ]. By minimizing . Compare
graphically, the functions *g*(*x*), its approximation
*y*(*x*), and its power series approximation *g*(*x*) *x*
for and .

**1.4.1** Find the solution
vector **w** in Example 1.4.1 if **b** = [1 1 1 1 1 3]T
. Use the K-map technique to show that the Boolean function in
Table 1.4.1 is not a threshold function. Does a **b** > 0
exist which leads to a solution vector **w** with four or fewer
nonzero components?

**1.4.2** Use the method of
Section 1.4 to synthesize a PTG for realizing the Boolean function
. Use a margin vector **b** which
results in the minimal number of non-zero PTG weights.

*** 1.5.1**
Derive Equation (1.5.2). See Cover (1965) for hints.

†** 1.5.2**
Plot the probability of ambiguous response, *A*(*m*, *d*),
given in Equation (1.5.1) versus , for
*d* = 2, 10, and 20.

**1.5.3** Derive Equation
(1.5.3).

*** 1.6.1**
Derive Equation (1.6.3), starting from Equation (1.6.2).

Goto [1.0][1.1] [1.2] [1.3] [1.4] [1.5]
[1.6]