2.5 Summary

Lower bounds on the size of multilayer feedforward neural networks for the realization of arbitrary dichotomies of points are derived. It is found that networks with two or more hidden layers are potentially more size-efficient than networks with a single hidden layer. However, the derived bounds suggest that no matter what network architecture is used (assuming no feedback), the size of such networks must always be on the order of sqrt(m/log2 m) or larger for the realization of arbitrary or random functions of m points in R^n, in the limit m >> n^2. Fortunately, though, the functions encountered in practice are likely to have natural regularities and redundancies that make them realizable with substantially smaller networks than these bounds would indicate.
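
As a rough numerical illustration of this scaling, the following Python sketch evaluates the counting-argument estimate k ~ sqrt(m/log2 m) quoted above for a few values of m. The constant factors are ignored and the specific values of m are illustrative assumptions, not taken from the text:

    import math

    def gate_lower_bound(m):
        # Counting-argument estimate: the number of LTG's must be at least
        # about sqrt(m / log2 m), ignoring constant factors and the n*k term
        # in the weight count (appropriate in the regime m >> n^2).
        return math.sqrt(m / math.log2(m))

    for m in (10**3, 10**4, 10**5, 10**6):
        print("m = %8d  ->  network size on the order of %7.1f or larger"
              % (m, gate_lower_bound(m)))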

Single hidden layer feedforward neural networks are universal approximators for arbitrary multivariate continuous functions. These networks are also capable of implementing arbitrarily complex dichotomies; thus, they are suitable as pattern classifiers. However, the parameters of these networks must be carefully chosen in order to exploit their full function approximation potential. In the next chapter, we explore learning algorithms that may be used to adaptively discover optimal values for these parameters.
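
As a concrete illustration of a single-hidden-layer dichotomy, the following Python sketch realizes the XOR dichotomy of the four points in {0,1}^2 with two hidden LTG's feeding one output LTG. The weights and thresholds are hand-picked for illustration and are not taken from the text:

    # Minimal sketch of a single-hidden-layer LTG network realizing the XOR
    # dichotomy of the four points in {0,1}^2.  Weights and thresholds are
    # hand-picked for illustration; they are not taken from the text.

    def ltg(x, w, theta):
        # Linear threshold gate: outputs 1 iff the weighted sum reaches theta.
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0

    def xor_net(x1, x2):
        h1 = ltg((x1, x2), (1, 1), 0.5)     # fires for x1 OR x2
        h2 = ltg((x1, x2), (-1, -1), -1.5)  # fires unless x1 AND x2
        return ltg((h1, h2), (1, 1), 1.5)   # AND of the two hidden outputs

    for x1 in (0, 1):
        for x2 in (0, 1):
            print((x1, x2), "->", xor_net(x1, x2))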

Finally, we find that, from an algorithmic complexity point of view, neural networks are best suited to solving random problems such as pattern recognition in noisy environments. We also find such networks appealing from a computation energy point of view when implemented in analog VLSI and/or optical technologies that utilize the properties of device physics.

Problems:

2.1.1 Use the K-map-based threshold-OR synthesis technique illustrated in Figure 2.1.3 to identify all possible decompositions of the Boolean function f(x1, x2, x3) = x1x3 + x2x3 + … into the ORing of two threshold functions (Hint: recall the admissible K-map threshold patterns of Figure 1.1.5).

2.1.2 Find a minimal threshold-OR network realization for the switching function given by the K-maps in Figure P2.1.2.


Figure P2.1.2. A three-input, two-output switching function.
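
For readers who wish to check candidate decompositions in Problems 2.1.1 and 2.1.2, the following Python sketch performs a brute-force search over small integer weights and thresholds to test whether a given three-variable Boolean function is realizable by a single LTG. The search range and the AND-function example are illustrative assumptions, not part of the text's K-map technique:

    from itertools import product

    def is_threshold_function(truth_table, weight_range=range(-3, 4)):
        # truth_table maps each (x1, x2, x3) in {0,1}^3 to 0 or 1.
        # Returns a realizing (w1, w2, w3, theta) if one exists in the search
        # range, otherwise None.
        for w1, w2, w3, theta in product(weight_range, repeat=4):
            if all((w1*x1 + w2*x2 + w3*x3 >= theta) == bool(f)
                   for (x1, x2, x3), f in truth_table.items()):
                return (w1, w2, w3, theta)
        return None

    # Example (illustrative only): the three-input AND is a threshold function.
    and3 = {(x1, x2, x3): x1 & x2 & x3
            for x1, x2, x3 in product((0, 1), repeat=3)}
    print(is_threshold_function(and3))   # -> (1, 1, 1, 3)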

2.1.3 Employ Theorem 1.4.1 and the equivalence of the two nets in Figures 1.3.4 and 1.2.1, for …, to show that a two-layer LTG net with m LTG's in the hidden layer, feeding into a single output LTG, is sufficient for realizing any Boolean function of m points in {0,1}^n. Is the requirement of full interconnectivity between the input vector and the hidden LTG layer necessary? Why?

* 2.2.1 Estimate the maximum number of functions, defined on m points in general position, that can be realized using the generally interconnected LTG network shown in Figure 2.1.5.

2.2.2 Derive the bound on k in Equation (2.2.14).

* 2.2.3 Derive the bound on k in Equation (2.2.15) [Hint: Use Equation (2.1.2)].

2.2.4 Consider an arbitrary, fully interconnected network of LTG's with no feedback. Show that, for realizing arbitrary, completely specified Boolean functions of n variables, this network must have a number of LTG's on the order of sqrt(2^n/n) or larger.

* 2.2.5 Derive the bound on k in Equation (2.2.16). Hint: Use the result of Problem 2.2.1.

2.3.1 Plot Equation (2.3.4) for n = 9.

* 2.3.2 Show that the activation in Equation (2.3.4) with n = 3 satisfies the three conditions in Equation (2.3.3).
