The computer-based recognition of facial expressions has long been an active area of research. The ultimate goal is to realize intelligent and transparent communication between human beings and machines. Neural network (NN) based recognition methods have been found to be particularly promising, since an NN is capable of implementing a mapping from the feature space of face images to the facial expression space. However, finding a proper network size has always been a frustrating and time-consuming experience for NN developers. In this paper, we propose to use constructive one-hidden-layer feedforward neural networks (OHL-FNNs) to overcome this problem. The constructive OHL-FNN obtains, in a systematic way, a proper network size as required by the complexity of the problem being considered. Furthermore, the computational cost involved in network training can be considerably reduced when compared to standard back-propagation (BP) based FNNs. In our proposed technique, the 2-dimensional discrete cosine transform (2-D DCT) is applied over the entire difference face image to extract relevant features for recognition. The lower-frequency 2-D DCT coefficients obtained are then used to train a constructive OHL-FNN. An input-side pruning technique previously proposed by the authors is also incorporated into the constructive learning process to reduce the network size without sacrificing the performance of the resulting network. The proposed technique is applied to a database consisting of images of 60 men, each having 5 facial expression images (neutral, smile, anger, sadness, and surprise).
Images of 40 men are used for network training, and the remaining images are used for generalization and testing. Confusion matrices calculated in both network training and testing for 4 facial expressions (smile, anger, sadness, and surprise) are used to evaluate the performance of the trained network. Extensive simulations show that, when compared with the BP-based method, the proposed technique constructs OHL-FNNs with a significantly smaller number of hidden units and weights, while simultaneously yielding improved recognition performance.
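The feature-extraction step described above can be illustrated with a minimal sketch. Several details here are assumptions that the abstract does not specify: the difference image is taken as an expression image minus the same subject's neutral image, the lower-frequency coefficients are taken as the top-left `block` x `block` corner of the 2-D DCT, and the image size (64 x 64) and `block=8` are arbitrary. The function names are hypothetical, and random arrays stand in for face images.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row uses a different scale factor
    return c

def dct2(image):
    """Separable 2-D DCT: transform the columns, then the rows."""
    rows, cols = image.shape
    return dct_matrix(rows) @ image @ dct_matrix(cols).T

def low_frequency_features(expression, neutral, block=8):
    """Assumed pipeline: difference image -> 2-D DCT -> top-left block."""
    diff = expression.astype(float) - neutral.astype(float)
    coeffs = dct2(diff)
    # Assumption: "lower-frequency" means the top-left block x block corner.
    return coeffs[:block, :block].ravel()

# Placeholder data standing in for one subject's face images.
rng = np.random.default_rng(0)
neutral = rng.random((64, 64))
expression = rng.random((64, 64))

features = low_frequency_features(expression, neutral, block=8)
print(features.shape)
```

Under these assumptions, each expression image yields a 64-element feature vector that would serve as the input to the network; the actual number and selection of coefficients used in the paper may differ.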