Pose variation, which can cause the loss of useful information, has long been a bottleneck in face and ear recognition. To address this problem, we propose a multimodal recognition approach based on face and ear using local features, which is robust to large facial pose variations in unconstrained scenes. A deep learning method is used for facial pose estimation, and a well-trained Faster R-CNN is used to detect and segment the face and ear regions. We then propose a weighted region-based recognition method to handle the local features. The proposed method achieves state-of-the-art recognition performance, especially when the images are affected by pose variations and random occlusion in unconstrained scenes.
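The abstract does not give the fusion rule, but the weighted region-based idea can be illustrated with a minimal sketch: per-region match scores from the detected face and ear crops are combined with pose-dependent weights. All names, weights, and scores below are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of weighted region-based score fusion (illustrative, not the authors' code).
# Assumes per-region similarity scores and pose-dependent weights are computed elsewhere.
import numpy as np

def fuse_region_scores(region_scores, region_weights):
    """Combine per-region similarity scores into one multimodal score.

    region_scores  : dict mapping region name -> similarity in [0, 1]
    region_weights : dict mapping region name -> non-negative weight
                     (e.g., down-weight the self-occluded face side at large yaw)
    """
    total_w = sum(region_weights.get(r, 0.0) for r in region_scores)
    if total_w == 0:
        return 0.0
    return sum(region_scores[r] * region_weights.get(r, 0.0)
               for r in region_scores) / total_w

# Hypothetical example: at a large yaw angle the visible ear and face side
# are weighted more heavily than the occluded side.
scores  = {"left_face": 0.42, "right_face": 0.80, "right_ear": 0.88}
weights = {"left_face": 0.2,  "right_face": 1.0,  "right_ear": 1.0}
print(fuse_region_scores(scores, weights))
```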
The descriptor is the key component of any image-based recognition algorithm. For ear recognition, conventional descriptors are based on either 2D or 3D data. 2D images provide rich texture information, while the human ear is a 3D surface that offers shape information. Moreover, 2D data is more robust against occlusion, whereas 3D data is more robust to illumination and pose variation. In this paper, we introduce a novel Texture and Depth Scale Invariant Feature Transform (TDSIFT) descriptor that encodes 2D and 3D local features for ear recognition. Compared to the original Scale Invariant Feature Transform (SIFT) descriptor, the proposed TDSIFT shows its superiority by fusing 2D and 3D local information. Firstly, keypoints are detected and described on texture images. Then, 3D information from the corresponding keypoint locations on the depth images is added to form the TDSIFT descriptor. Finally, a local feature based classification algorithm is adopted to identify ear samples using TDSIFT. Experimental results on a benchmark dataset demonstrate the feasibility and effectiveness of the proposed descriptor. The rank-1 recognition rate achieved on a gallery of 415 persons is 95.9%, and the computation time compares favorably with state-of-the-art methods.
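A hedged sketch of the general texture-plus-depth idea follows: standard SIFT keypoints and descriptors are computed on the texture image, and a small depth-gradient histogram taken from the co-registered depth map is appended to each descriptor. The specific depth encoding used here is an assumption for illustration, not the authors' TDSIFT formulation.

```python
# Sketch of a texture+depth SIFT-style descriptor. Requires OpenCV >= 4.4 for cv2.SIFT_create().
import cv2
import numpy as np

def texture_depth_sift(texture_gray, depth, patch=16, bins=8):
    sift = cv2.SIFT_create()
    kps, desc2d = sift.detectAndCompute(texture_gray, None)
    if desc2d is None:
        return [], np.empty((0, 128 + bins), np.float32)

    # Depth gradients over the whole depth map (rows first, then columns).
    gy, gx = np.gradient(depth.astype(np.float32))
    mag, ang = np.hypot(gx, gy), np.arctan2(gy, gx)

    fused, half = [], patch // 2
    for kp, d2 in zip(kps, desc2d):
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        x0, x1 = max(0, x - half), min(depth.shape[1], x + half)
        y0, y1 = max(0, y - half), min(depth.shape[0], y + half)
        # Orientation histogram of depth gradients around the keypoint,
        # weighted by gradient magnitude, then L2-normalised.
        h, _ = np.histogram(ang[y0:y1, x0:x1], bins=bins,
                            range=(-np.pi, np.pi),
                            weights=mag[y0:y1, x0:x1])
        h = h / (np.linalg.norm(h) + 1e-8)
        fused.append(np.concatenate([d2, h.astype(np.float32)]))
    return kps, np.vstack(fused)
```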
KEYWORDS: Ear, Databases, Feature extraction, Data modeling, Detection and tracking algorithms, Visualization, Associative arrays, Wavelets, Optical engineering, Chemical species
Gabor wavelets have been experimentally verified to be a good approximation to the response of cortical neurons. A new feature extraction approach for ear recognition is investigated that uses the scale information of Gabor wavelets. The proposed Gabor scale feature conforms to human visual perception of objects from far to near. It not only avoids excessive redundancy in Gabor features but also tends to extract more precise structural information that is robust to image variations. Gabor scale feature-based non-negative sparse representation classification (G-NSRC) is then proposed for ear recognition under occlusion. Compared with SRC, in which the sparse coding coefficients can be negative, the non-negativity of G-NSRC conforms to the intuitive notion of combining parts to form a whole and is therefore more consistent with the biological modeling of visual data. Additionally, the use of Gabor scale features increases the discriminative power of G-NSRC. Finally, the proposed classification paradigm is applied to occluded ear recognition. Experimental results demonstrate the effectiveness of the proposed algorithm. In particular, when the ear is occluded, the proposed algorithm exhibits great robustness and achieves state-of-the-art recognition performance.
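The decision rule behind sparse representation classification with a non-negativity constraint can be sketched as follows: the probe feature is coded over the gallery with non-negative sparse coefficients, and the probe is assigned to the class whose atoms reconstruct it with the smallest residual. Gabor scale feature extraction is assumed to be done elsewhere; the regularisation value is illustrative.

```python
# Minimal sketch of non-negative sparse-representation classification.
import numpy as np
from sklearn.linear_model import Lasso

def nsrc_classify(X, labels, y, alpha=0.01):
    """X: (d, n) gallery feature matrix (one atom per column),
    labels: (n,) array of class labels, y: (d,) probe feature."""
    # L1-regularised least squares with coefficients constrained to be non-negative.
    lasso = Lasso(alpha=alpha, positive=True, max_iter=5000)
    lasso.fit(X, y)
    coef = lasso.coef_

    # Assign the probe to the class whose coefficients reconstruct it best.
    best_cls, best_res = None, np.inf
    for cls in np.unique(labels):
        mask = labels == cls
        residual = np.linalg.norm(y - X[:, mask] @ coef[mask])
        if residual < best_res:
            best_cls, best_res = cls, residual
    return best_cls
```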
As a relatively new biometric authentication technology, ear recognition still faces many unresolved problems, one of which is occlusion. This paper deals with ear recognition from partially occluded ear images. Firstly, the whole 2D image is partitioned into sub-windows. Then, Neighborhood Preserving Embedding is used for feature extraction on each sub-window, and the most discriminative sub-windows are selected according to their recognition rates. Thirdly, a multi-matcher fusion approach is used for recognition with partially occluded images. Experiments on the USTB ear image database illustrate that only a few sub-windows are needed to represent the most meaningful regions of the ear, and that the multi-matcher model achieves a higher recognition rate than using the whole image for recognition. A sketch of the sub-window, multi-matcher idea is given below.
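The sketch below splits the ear image into a grid, embeds each sub-window independently, and fuses per-window nearest-neighbour decisions over a pre-selected set of discriminative windows. PCA is used only as a stand-in for Neighborhood Preserving Embedding, and the grid size, window selection, and voting rule are assumptions for illustration.

```python
# Sub-window multi-matcher fusion sketch (PCA stands in for NPE).
import numpy as np
from sklearn.decomposition import PCA

def split_windows(img, rows=4, cols=4):
    h, w = img.shape
    return [img[i*h//rows:(i+1)*h//rows, j*w//cols:(j+1)*w//cols].ravel()
            for i in range(rows) for j in range(cols)]

def multi_matcher_identify(gallery_imgs, gallery_labels, probe_img,
                           selected_windows, n_components=20):
    probe_wins = split_windows(probe_img)
    gallery_wins = [split_windows(g) for g in gallery_imgs]
    votes = np.zeros(len(gallery_imgs))
    for w in selected_windows:                  # indices of discriminative sub-windows
        G = np.array([gw[w] for gw in gallery_wins], dtype=np.float32)
        pca = PCA(n_components=min(n_components, len(G) - 1)).fit(G)
        g_emb = pca.transform(G)
        p_emb = pca.transform(probe_wins[w].astype(np.float32)[None])
        d = np.linalg.norm(g_emb - p_emb, axis=1)
        votes += (d == d.min())                 # each sub-window matcher casts one vote
    return gallery_labels[int(np.argmax(votes))]
```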
Ear recognition based on the force field transform is new and effective. Three different applications of the force field transform are discussed in this paper. Firstly, we discussed the problem encountered in extracting potential wells and overcame the contradiction between the continuity of the force field and the discreteness of intensity images. Secondly, an improved convergence-based ear recognition method is presented. To overcome the problem of threshold segmentation, an adaptive threshold segmentation method was used to find the threshold automatically; to reduce the computational complexity, quick classification was realized by combining the Canny operator and the Modified Hausdorff Distance (MHD). Finally, the algebraic properties of the force field were combined with Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) to obtain feature vectors for ear recognition. We tested these applications of the force field transform on two ear databases. Experimental results show the validity and robustness of the force field transform for ear recognition.
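For reference, the standard force field transform treats every pixel as a source that attracts every other position with a force proportional to its intensity and inversely proportional to the squared distance. The brute-force sketch below computes the force magnitude image, which could then be reduced with PCA/LDA; it is an illustrative implementation of the general transform, not the paper's optimized pipeline.

```python
# Brute-force force field transform sketch (O(N^2); suitable only for small grayscale images).
import numpy as np

def force_field(img):
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pos = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(np.float64)  # (N, 2) pixel positions
    intensity = img.ravel().astype(np.float64)                           # (N,) pixel intensities

    field = np.zeros_like(pos)
    for j in range(pos.shape[0]):
        diff = pos - pos[j]                           # vectors from position j to every pixel
        dist3 = np.linalg.norm(diff, axis=1) ** 3
        dist3[j] = np.inf                             # a pixel exerts no force on itself
        field[j] = (intensity[:, None] * diff / dist3[:, None]).sum(axis=0)

    return np.linalg.norm(field, axis=1).reshape(h, w)   # force magnitude image

# The flattened force magnitude images can then be projected with PCA followed by
# LDA to obtain the feature vectors used for recognition.
```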
The study of Chinese character cognition is an important research topic in cognitive science and computer science, especially artificial intelligence. In this paper, based on the traits of Chinese characters, a database of Chinese character font representations and a model for the computer simulation of Chinese character font cognition are constructed from the perspective of cognitive science. The font cognition of Chinese characters is in fact a gradual process involving the accumulation of knowledge. Using computer simulation, a development model of Chinese character cognition was constructed, which is an important part of research on Chinese character cognition. The model is based on a self-organizing feature map (SOFM) neural network and an adaptive resonance theory (ART) neural network. By combining the SOFM and ART2 networks, two sets of input were trained. Through training and testing, the development process of Chinese character cognition was simulated, and the results from this model were compared with the results obtained using SOFM alone. The analysis suggests that the model can account for some empirical results and can, to a degree, simulate the development process of Chinese character cognition.
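As a point of reference for the first stage of the SOFM + ART2 model, a minimal self-organizing feature map training loop is sketched below in plain NumPy. The grid size, learning schedule, and neighbourhood function are generic assumptions; the ART2 stage and the Chinese character font representations are not reproduced here.

```python
# Minimal SOFM (self-organizing feature map) training loop.
import numpy as np

def train_sofm(data, grid=(10, 10), epochs=100, lr0=0.5, sigma0=3.0, seed=0):
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    coords = np.stack(np.mgrid[0:h, 0:w], axis=-1)         # (h, w, 2) grid positions

    n_steps = epochs * len(data)
    for t in range(n_steps):
        x = data[rng.integers(len(data))]
        lr = lr0 * np.exp(-t / n_steps)                     # decaying learning rate
        sigma = sigma0 * np.exp(-t / n_steps)               # shrinking neighbourhood

        # Find the best-matching unit and update it and its neighbours.
        dists = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(dists), (h, w))
        grid_d2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
        influence = np.exp(-grid_d2 / (2 * sigma ** 2))[..., None]
        weights += lr * influence * (x - weights)
    return weights
```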
Multi-pose ear recognition is a challenging problem in ear recognition technology. In this paper, a method based on locally linear embedding (LLE) and the nearest feature line (NFL) is proposed, motivated by an analysis of the disadvantages of most current 2D ear recognition methods when dealing with pose variation and the shortcomings of the nearest neighbor (NN) classifier. The LLE algorithm is used to extract ear features, and the NFL classifier is then applied to classify ear images under varying poses. Comparative experimental results show that the method based on LLE and NFL clearly improves recognition performance, which demonstrates the validity of this method.
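A hedged sketch of the LLE + NFL pipeline: gallery and probe images are embedded with locally linear embedding, and the probe is classified by its distance to the feature lines spanned by prototype pairs of each class. The neighbourhood size and embedding dimension below are illustrative.

```python
# LLE embedding plus nearest-feature-line classification sketch.
import numpy as np
from itertools import combinations
from sklearn.manifold import LocallyLinearEmbedding

def nfl_classify(gallery_emb, labels, probe_emb):
    """gallery_emb: (n, k) embedded gallery, labels: (n,) array, probe_emb: (k,)."""
    best_cls, best_d = None, np.inf
    for cls in np.unique(labels):
        protos = gallery_emb[labels == cls]
        for a, b in combinations(protos, 2):
            ab = b - a
            t = np.dot(probe_emb - a, ab) / (np.dot(ab, ab) + 1e-12)
            proj = a + t * ab                     # projection onto the feature line through a and b
            d = np.linalg.norm(probe_emb - proj)
            if d < best_d:
                best_cls, best_d = cls, d
    return best_cls

# Typical usage (illustrative): embed the stacked gallery and probe rows, then classify.
# X = np.vstack([gallery_images, probe_image[None]])
# emb = LocallyLinearEmbedding(n_neighbors=12, n_components=20).fit_transform(X)
# prediction = nfl_classify(emb[:-1], np.asarray(gallery_labels), emb[-1])
```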