In remote sensing, automatically and accurately extracting buildings from images has been a popular and challenging topic in recent years. With the rapid development of sensor and computer hardware technologies, it has become easier to acquire very high-resolution remote sensing images and to extract buildings from them with popular deep learning models such as Fully Convolutional Networks (FCNs). However, current FCN-based models often produce blurred building boundaries and perform poorly on small buildings. In this paper, we therefore propose the Gaussian Dilated Convolution, a cascade of a trainable Gaussian filter and a dilated convolution with proper hyperparameter initializations. We also carefully design a hierarchical dense feature fusion structure following the dense connection pattern. Finally, we embed the Gaussian Dilated Convolution into the hierarchical dense fusion structure and name the result the Dense Hierarchical Spatial Gaussian Pool (Dense-HSGP). More specifically, the Gaussian Dilated Convolution retains the advantages of the original dilated convolution while preserving much more context information, and the hierarchical dense connection structure of Dense-HSGP provides more abundant receptive fields and stronger feature reuse within the model. We conduct experiments on the widely used Inria Aerial Image Labeling dataset to verify the effectiveness of the proposed model. The experimental results show that the proposed model achieves 96.45% average accuracy and 77.17% IoU, a distinct improvement over several recent state-of-the-art building extraction models.
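A minimal PyTorch sketch of such a cascade is given below, assuming an isotropic Gaussian whose width is the trainable parameter; the class and parameter names are illustrative, not the authors' implementation:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianDilatedConv(nn.Module):
    """Sketch of a 'Gaussian Dilated Convolution': a trainable
    depthwise Gaussian filter cascaded with a dilated convolution."""
    def __init__(self, channels, dilation, kernel_size=3, sigma=1.0):
        super().__init__()
        # Learn log(sigma) so the Gaussian width stays positive;
        # the kernel is rebuilt from it on every forward pass.
        self.log_sigma = nn.Parameter(torch.tensor(math.log(sigma)))
        self.kernel_size = kernel_size
        self.channels = channels
        # Standard dilated convolution that follows the smoothing.
        self.dilated = nn.Conv2d(channels, channels, kernel_size,
                                 padding=dilation, dilation=dilation)

    def _gaussian_kernel(self):
        sigma = self.log_sigma.exp()
        half = self.kernel_size // 2
        coords = torch.arange(-half, half + 1, dtype=torch.float32,
                              device=sigma.device)
        g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
        kernel2d = torch.outer(g, g)
        kernel2d = kernel2d / kernel2d.sum()
        # One depthwise kernel per channel.
        return kernel2d.expand(self.channels, 1, -1, -1)

    def forward(self, x):
        # Gaussian smoothing gathers local context before the sparse
        # sampling pattern of the dilated convolution discards it.
        k = self._gaussian_kernel()
        x = F.conv2d(x, k, padding=self.kernel_size // 2,
                     groups=self.channels)
        return self.dilated(x)
```

Smoothing before dilation is what lets the layer keep context that a plain dilated convolution would skip over between its sampling points.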
Automatic building segmentation from remote sensing images is a critical task in remote sensing image semantic segmentation. The success of deep neural networks has led to advances in using fully convolutional networks (FCNs) to extract buildings from high-resolution images. However, the downsampling operations inevitably cause a loss of detail in the segmentation results. To address this problem, some methods refine the FCN output with probabilistic graphical models such as the fully connected CRF (Conditional Random Field). Nevertheless, many fully connected CRF based methods are too time-consuming and thus unsuitable for building segmentation in some situations. In this paper, we propose a novel time-efficient end-to-end CRF model based on the domain transform algorithm, called DT-CRF. In the proposed model, to accelerate message passing in the mean-field approximate inference algorithm, we take edge maps as the joint image for DT-CRF and use the domain transform algorithm, instead of Gaussian kernel functions, to compute the pairwise potentials. Meanwhile, we design a multi-task network that generates masks and edges simultaneously, which allows DT-CRF to easily optimize the segmentation results using the model's edge information. Evaluations on remote sensing image datasets verify the time and space efficiency of the proposed DT-CRF and demonstrate a distinct improvement.
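The speed-up comes from the domain transform's recursive formulation (Gastal and Oliveira, 2011), which filters a signal in linear time while respecting edges in a joint image. A sketch of one left-to-right pass over a 1-D signal, with an edge map as the joint signal, might look as follows; parameter names are illustrative:

```python
import numpy as np

def domain_transform_filter_1d(x, edge, sigma_s=30.0, sigma_r=0.2):
    """One left-to-right recursive domain-transform pass.

    x    : (N,) signal to smooth (e.g. one row of a score map)
    edge : (N,) edge strengths in [0, 1] guiding the smoothing
    """
    # Domain derivative: strong edge responses enlarge the distance
    # between neighbors, so smoothing stops at object boundaries.
    d = 1.0 + (sigma_s / sigma_r) * np.abs(np.diff(edge, prepend=edge[0]))
    # Per-sample feedback coefficients of the recursive filter.
    a = np.exp(-np.sqrt(2.0) / sigma_s) ** d
    y = x.astype(np.float64).copy()
    for i in range(1, len(y)):
        y[i] = (1.0 - a[i]) * y[i] + a[i] * y[i - 1]
    return y
```

A full filtering step applies such passes alternately along rows and columns in both directions over several iterations, so each mean-field message-passing step stays linear in the number of pixels.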
We propose a nonconvex higher-order total variation (TV) method for blind motion image deblurring. First, we introduce a nonconvex higher-order TV differential operator to define a new blind motion deblurring model, which effectively eliminates the staircase effect in the deblurred image; meanwhile, we employ an image sparsity prior to improve the quality of edge recovery. Second, to improve the accuracy of the estimated motion blur kernel, we use the L1 norm and the H1 norm as the blur kernel regularization terms, accounting for the sparsity and smoothness of the motion blur kernel. Third, because the intrinsic nonconvexity of the proposed model makes it computationally difficult to solve, we propose a two-level iterative strategy that incorporates a reweighted minimization approximation scheme in the outer iteration and a split Bregman algorithm in the inner iteration. We also discuss the convergence of this iterative strategy. Finally, we conduct extensive experiments on both synthetic and real-world degraded images. The results demonstrate that the proposed method outperforms previous representative methods in both visual quality and quantitative measures.
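To make the two-level structure concrete, here is a sketch on a simplified 1-D denoising problem with a first-order TV term (the paper uses a higher-order operator and a full deblurring model); the outer loop reweights the TV term to approximate the nonconvex |Du|^p penalty, and the inner loop is a standard split Bregman solver. All names and parameter values are illustrative:

```python
import numpy as np

def shrink(v, t):
    """Soft-thresholding, the proximal map of the weighted L1 term."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def nonconvex_tv_denoise(f, lam=0.5, p=0.5, mu=1.0,
                         outer_iters=5, inner_iters=20, eps=1e-3):
    """Outer reweighting loop + inner split Bregman loop (1-D sketch)."""
    n = len(f)
    # Forward difference operator D as an (n-1, n) matrix.
    D = np.diff(np.eye(n), axis=0)
    A = np.eye(n) + mu * D.T @ D   # system matrix of the u-update
    u = f.copy()
    w = np.ones(n - 1)             # weights on the TV term
    for _ in range(outer_iters):
        d = np.zeros(n - 1)        # split variable, d ~ Du
        b = np.zeros(n - 1)        # Bregman variable
        for _ in range(inner_iters):
            u = np.linalg.solve(A, f + mu * D.T @ (d - b))
            d = shrink(D @ u + b, lam * w / mu)
            b = b + D @ u - d
        # Reweight: w_i ~ p / |Du|_i^(1-p) approximates the |.|^p penalty.
        w = p / (np.abs(D @ u) ** (1.0 - p) + eps)
    return u
```

Each outer iteration solves a convex weighted-TV problem exactly, which is what makes the convergence of the overall nonconvex scheme tractable to analyze.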
In this paper, a feature extraction model for face recognition is proposed. The model is constructed by implementing three biologically inspired strategies: a hierarchical network, a learning mechanism of the V1 simple cells, and a data-driven attention mechanism. The hierarchical network emulates the functions of the V1 cortex to progressively extract facial features invariant to illumination, expression, slight pose change, and variations caused by local transformations of facial parts. In the network, filters that account for the local structures of the face are derived through the learning mechanism and used for invariant feature extraction. The attention mechanism computes a saliency map for the face and enhances the salient regions of the invariant features to further improve performance. Experiments on the FERET and AR face databases show that the proposed model boosts recognition accuracy effectively.
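A rough sketch of the pipeline's shape is below. The paper learns its V1-like filters from data; here analytic Gabor filters stand in for them, and the saliency map is assumed to be given, so everything in this snippet is illustrative:

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(theta, sigma=2.0, lam=6.0, size=11):
    """Gabor filter as a stand-in for a learned V1-simple-cell filter."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    g = np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)
    return g - g.mean()  # zero-mean, so flat regions give no response

def v1_like_features(image, saliency, n_orient=8):
    """Filter-bank responses modulated by a saliency (attention) map."""
    responses = []
    for k in range(n_orient):
        r = convolve2d(image, gabor_kernel(np.pi * k / n_orient), mode='same')
        # Attention: enhance salient regions of the invariant features.
        responses.append(np.abs(r) * saliency)
    return np.stack(responses)
```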
In this paper, multivariable linear regression analysis was employed to obtain the relationships among facial geometric features, and a discriminant function was used to evaluate the significance of the different features. Finally, classification rates were compared across different combinations of geometric features. The results showed that geometric features with higher significance tended to improve classification performance in the cases studied.
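A minimal sketch of this kind of analysis, assuming a two-class problem with geometric measurements in a matrix `X` and labels in `y` (all names illustrative), could be:

```python
import numpy as np

def feature_significance(X, y):
    """Fit a multivariable linear regression of the label on the
    geometric features, then rank each feature by Fisher's
    discriminant ratio. X: (n_samples, n_features); y: 0/1 labels."""
    # Least-squares fit with an intercept column.
    A = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    # Fisher ratio per feature: between-class over within-class scatter.
    m0, m1 = X[y == 0], X[y == 1]
    fisher = (m0.mean(0) - m1.mean(0)) ** 2 / (m0.var(0) + m1.var(0) + 1e-12)
    return coeffs, fisher
```

Features with a high Fisher ratio are the "more significant" ones whose inclusion would be expected to raise the classification rate.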
Singular value (SV) feature vectors of face images have recently been used as features for face recognition. Although SVs possess important properties of algebraic and geometric invariance and insensitivity to noise, they represent a face image in its own eigen-space, spanned by the two orthogonal matrices of its singular value decomposition (SVD), and therefore contain little information useful for face recognition. This study concentrates on extracting more informative features from frontal, upright face images based on SVD, and proposes an improved method for face recognition. After being standardized by intensity normalization, all training and testing face images are projected onto a uniform eigen-space obtained from the SVD of a standard face image. For greater computational efficiency, the dimension of the uniform eigen-space is reduced by discarding the eigenvectors whose corresponding eigenvalues are close to zero. A Euclidean distance classifier is adopted for recognition. Two standard databases, from Yale University and the Olivetti Research Laboratory, are used to evaluate the recognition accuracy of the proposed method. These databases include face images with different expressions, small occlusions, different illumination conditions, and different poses. Experimental results on the two face databases show the effectiveness of the method and its insensitivity to facial expression, illumination, and posture.
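A minimal sketch of the shared-eigen-space idea, assuming a single `standard_face` image and an energy threshold for discarding small singular values (names and the truncation rule are illustrative):

```python
import numpy as np

def build_uniform_eigenspace(standard_face, energy=0.99):
    """SVD of the (normalized) standard face image; keep only the
    leading singular directions shared by all faces."""
    U, s, Vt = np.linalg.svd(standard_face, full_matrices=False)
    # Discard directions whose singular values are close to zero.
    k = int(np.searchsorted(np.cumsum(s) / s.sum(), energy)) + 1
    return U[:, :k], Vt[:k, :]

def project(face, U, Vt):
    """Project a face image onto the shared (uniform) eigen-space."""
    return (U.T @ face @ Vt.T).ravel()

def recognize(probe, gallery, labels, U, Vt):
    """Nearest neighbor under Euclidean distance in the eigen-space."""
    p = project(probe, U, Vt)
    dists = [np.linalg.norm(p - project(g, U, Vt)) for g in gallery]
    return labels[int(np.argmin(dists))]
```

Because every image is projected with the same pair of orthogonal matrices, the resulting coefficients are directly comparable across faces, unlike per-image SVs.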
With the development of biometric technology, face recognition has become one of the most widely accepted forms of identification, and it has attracted increasing attention over the past thirty years. Unfortunately, most face recognition systems with large-scale facial image databases cannot be put into practice because they lack sufficient recognition speed and precision; in fact, recognition time increases drastically with the number of faces. To improve recognition rates, we first partition the large-scale facial image database into several comparatively small classes according to a specific criterion, and then perform recognition. If a resulting class is still too large, it can be further partitioned using another criterion, until the classes are small enough for recognition. We name this method the Multi-Layer Classification Method (MLCM). To assign an unclassified face to a small class, a multiclass classifier must be constructed; because the Mahalanobis distance classifier assumes normally distributed data, it is employed in our study. The results show that the overall recognition rates increase drastically for the large-scale facial image database.
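A minimal sketch of the coarse classification step, assuming feature vectors and a pooled covariance matrix (both assumptions of this sketch, not details from the paper):

```python
import numpy as np

def fit(features, labels):
    """Estimate per-class means and a pooled covariance matrix."""
    classes = np.unique(labels)
    means = [features[labels == c].mean(axis=0) for c in classes]
    pooled = np.cov(features.T)  # pooled over all samples for brevity
    return means, pooled

def mahalanobis_classify(class_means, cov, x):
    """Assign x to the class whose mean is nearest under the
    Mahalanobis distance (x - m)^T C^{-1} (x - m)."""
    cov_inv = np.linalg.inv(cov)
    dists = [float((x - m).T @ cov_inv @ (x - m)) for m in class_means]
    return int(np.argmin(dists))
```

Each layer of MLCM applies such a classifier with its own partitioning criterion, so the final fine-grained matching only searches a small class instead of the whole database.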
With the development of intelligent robots, more and more advanced sensors are required, and tactile sensing technology has received extensive attention. It is the three-axis force that acts when a robot grasps or walks, yet among the numerous existing tactile sensors it is quite difficult to measure the three-axis force directly. To obtain the contact-type nonlinear solution in FEA (Finite Element Analysis), an advanced analysis method of ANSYS, APDL (ANSYS Parametric Design Language), is employed, with which the laborious and time-consuming process is completed automatically. This paper presents a series of simulation experiments on an innovative optical waveguide three-axis tactile sensing system and puts forward the corresponding mathematical model for calculating the three-axis force. A dedicated sensing system was designed for the experiments, and the results conform closely to the theoretical analysis. A new method thus emerges for the tactile sensing of intelligent robots.
In this paper, a novel method of Regional Facial Geometric Feature Recognition (RFGFR) is presented. With the development of biometric technology, face recognition has become one of the most widely accepted forms of identification. Considering that China is a country with vast regions, numerous ethnic groups, and varied facial geometric structures and features, six geographic regions were defined on the basis of facial geometric features, according to China's administrative districts. Three hundred frontal face images of Han subjects from the six regions were sampled and registered into a face image gallery through a 3-D digital camera system. Subsequently, geometric features (the distance between the two pupils, the ratio of the distance between the two inner canthi to the distance between the two pupils, etc.) were extracted and used as facial recognition parameters. Through extensive recognition experiments, we found that Han people in different regions differ in facial geometric features to some extent. These results verify the feasibility and reliability of the RFGFR method.
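The two features named in the abstract are straightforward to compute from landmark coordinates; a minimal sketch follows, where the landmark keys are illustrative and the paper's full feature set is larger:

```python
import numpy as np

def geometric_features(landmarks):
    """Compute the pupil distance and the inner-canthi-to-pupil
    distance ratio from 2-D landmark coordinates."""
    lp, rp = landmarks['left_pupil'], landmarks['right_pupil']
    lc, rc = landmarks['left_inner_canthus'], landmarks['right_inner_canthus']
    pupil_dist = np.linalg.norm(np.subtract(rp, lp))
    canthi_dist = np.linalg.norm(np.subtract(rc, lc))
    return {
        'pupil_distance': pupil_dist,
        'canthi_to_pupil_ratio': canthi_dist / pupil_dist,
    }
```

Ratio features like the second one are scale-invariant, which matters when gallery images are captured at varying distances.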