Image synthesis is a critical task in various computer vision technologies, and many methods have tried to translate semantic images into realistic ones for controllable synthesis. As image resolution increases, networks grow larger, which restricts the application of these methods. To alleviate the problem, we propose a lightweight mutable network for semantic image synthesis. The network is based on generative adversarial networks. We introduce the feature pyramid architecture to the generator and reduce the number of hidden nodes. We also design a mutable scheme in which the pyramid uses fewer layers for smaller images. To improve the performance of the lightweight generator, we further propose a weighted discriminator and a refined loss. Experiments on several public datasets show that our method is effective and achieves competitive performance, demonstrating that a small network can also achieve high-quality semantic image synthesis.
A practical face recognition system demands not only high recognition performance but also the capability of detecting spoofing attacks. Although many approaches to face antispoofing have been proposed in recent years, most of them perform poorly on unseen samples. The generalizability of face antispoofing needs to be significantly improved before it can be adopted by practical application systems. The main reason for the poor generalization of current approaches is the variety of materials among the spoofing devices. Because the attacks are produced by putting a spoofing display (e.g., paper, electronic screen, forged mask) in front of a camera, the variety of spoofing materials makes the spoofing attacks quite different from one another. Another reason for the poor generalizability is that only limited labeled data are available for training. We focus on improving the generalizability of convolutional neural network (CNN)-based face antispoofing methods across different kinds of datasets. We propose a deep domain transfer CNN that uses sparse unlabeled data from the target domain to learn features that are invariant across domains for face antispoofing. Experiments on five face spoofing datasets show that the proposed method significantly improves cross-test performance with only a small number of unlabeled samples from the target domain.
Recently, numerous methods have been proposed to tackle the problem of fine-grained image classification (FGIC). Most of them follow a two-step strategy that consists of detecting the object regions and classifying with the features extracted from these regions. For feature extraction, the most popular method is to crop the feature maps directly according to the location of the detected part regions. However, one challenge of such a method is that the direction of the semantic parts may vary across images; it is therefore necessary to capture such differences for better classification. We propose a CNN architecture that aligns semantic parts (ASP-CNN) for FGIC, aiming to increase the interclass variance while reducing the intraclass variance in fine-grained datasets. Extensive experiments on CUB-200-2011 and CUB-200-2010 show the effectiveness of our ASP-CNN.
This paper presents a principal node analysis (PNA) scheme that aims to improve the representativeness of the learned codebook and thereby enhance the classification rate for scene images. In the preprocessing stage, the original images are converted to grayscale and scale-invariant feature transform (SIFT) descriptors are extracted from each image. The PNA-based scheme is then applied to the SIFT descriptors with iteration and selection algorithms. The principal nodes of each image are selected through spatial analysis of the SIFT descriptors with the Manhattan distance (L1 norm) and the Euclidean distance (L2 norm) in order to increase the representativeness of the codebook. To evaluate the performance of our scheme, the feature vector of each image is calculated by two baseline methods after the codebook is constructed. The L1-PNA- and L2-PNA-based baseline methods are tested and compared with different codebook sizes on three public scene image databases. The experimental results show that the proposed PNA scheme is effective and yields a higher categorization rate.
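As a small illustration of why the choice between the L1 and L2 norms matters when comparing descriptors, the sketch below computes both distances and finds the nearest node under each. It is not the PNA algorithm itself: the function names, the `nearest_node` helper, and the two-dimensional toy vectors (real SIFT descriptors have 128 dimensions) are all assumptions made for illustration.

```python
import math


def manhattan(a, b):
    """L1 norm of the difference between two descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """L2 norm of the difference between two descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_node(descriptor, nodes, dist):
    """Index of the node closest to `descriptor` under the metric `dist` (hypothetical helper)."""
    return min(range(len(nodes)), key=lambda i: dist(descriptor, nodes[i]))


# Toy 2-D vectors standing in for descriptors.
descriptor = [0, 0]
nodes = [[3, 3], [0, 5]]

l1_choice = nearest_node(descriptor, nodes, manhattan)  # distances 6 and 5
l2_choice = nearest_node(descriptor, nodes, euclidean)  # distances ~4.24 and 5
```

In this toy case the two metrics pick different nodes, which is one reason a codebook built with L1 distances can differ from one built with L2 distances.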
Terahertz (THz) detectors show great potential in detection applications because of their real-time operation, compact size, and unique spectral characteristics. Small, integrated THz detectors based on a resonant cavity structure were designed and simulated, and optimized detector parameters were obtained from simulations of the membrane temperature change. The THz detector was fabricated with a complex semiconductor process, and the three-dimensional thermal variation of the resonant cavity was obtained by simulation to confirm the cavity design. The electrical response time of the THz detector can be as low as 5 ms, which is suitable for fast-response THz detection.
This paper proposes a multistep method for assessing the quality of iris image sequences. Working in the spatial and frequency domains of the image feature-vector space, the method applies different criteria at different steps to eliminate poor-quality images. First, it roughly judges the intensity and clarity of the iris images. Second, it preprocesses the images with morphological operations, extracts the region of interest (ROI) covering the pupil by partitioning the image into blocks, and then binarizes the image and locates the pupil. Third, it selects different ROIs, analyzes the factors affecting image quality, such as defocus blur, motion blur, eyelid closure, eyelash occlusion, and the dynamic characteristics of the eyeball, and eliminates substandard images from the target sequence. Finally, it applies a comprehensive criterion to select the best-quality image from the sequence. The method was verified on 300 sequences of iris images. The experimental results show that, by locating only the pupil rather than fully locating the outer edge of the iris, the method can quickly and precisely judge whether the iris is occluded or blurred, and that each step eliminates nearly all of the substandard images, which reduces both the number of iris images to be assessed and the processing time.
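The first, rough screening step could be sketched along the following lines. The abstract does not say which intensity or clarity measures are used, so everything here is an assumption: mean gray level stands in for intensity, the mean squared difference between neighbouring pixels stands in for clarity, and the thresholds `lo`, `hi`, and `min_sharpness` are placeholder values.

```python
def mean_intensity(img):
    """Average gray level of an image given as a list of pixel rows."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)


def sharpness(img):
    """Mean squared difference between horizontally and vertically adjacent pixels."""
    total, count = 0.0, 0
    height, width = len(img), len(img[0])
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                total += (img[y][x + 1] - img[y][x]) ** 2
                count += 1
            if y + 1 < height:
                total += (img[y + 1][x] - img[y][x]) ** 2
                count += 1
    return total / count


def passes_rough_check(img, lo=40, hi=220, min_sharpness=25.0):
    """Keep an image only if it is neither too dark, too bright, nor too blurred.

    The thresholds are assumed placeholder values, not taken from the paper.
    """
    return lo <= mean_intensity(img) <= hi and sharpness(img) >= min_sharpness


# Tiny synthetic images: a high-contrast pattern, a flat patch, and a dark patch.
sharp_img = [[0, 255], [255, 0]]
flat_img = [[128, 128], [128, 128]]
dark_img = [[0, 10], [10, 0]]
```

A cheap check of this kind can discard obviously unusable frames before the more expensive pupil localization runs.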
JPEG and MPEG compression standards adopt a macroblock encoding approach, but this approach can lead to annoying blocking effects, that is, artificial rectangular discontinuities in the decoded images. Many powerful postprocessing algorithms have been developed to remove the blocking effects. However, all but the simplest algorithms can be too complex for real-time applications such as video decoding. We propose an adaptive, easy-to-implement algorithm that removes these artificial discontinuities. The algorithm has two steps. The first performs a fast linear smoothing of the pixels at the block edges using an average-value replacement strategy. The second compares the variance derived from the difference of the processed image with a reasonable threshold to determine whether the first step should stop. Experiments show that the algorithm quickly removes the artificial discontinuities without destroying the key information in the decoded images, and that it is robust to different images and transform strategies.
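The two steps can be illustrated with a minimal one-dimensional sketch. This is one possible reading of the abstract, not the authors' implementation: the smoothing rule (pulling each boundary pixel halfway toward the mean of the pair), the stopping rule (stop once the variance of the difference from the original decoded row exceeds the threshold), and the names `smooth_block_edges`, `deblock`, `threshold`, and `max_passes` are all assumptions.

```python
def smooth_block_edges(row, block=8):
    """One smoothing pass: pull the two pixels straddling each block boundary
    halfway toward their common mean (an assumed average-replacement rule)."""
    out = list(row)
    for b in range(block, len(row), block):
        left, right = row[b - 1], row[b]
        mean = (left + right) / 2.0
        out[b - 1] = (left + mean) / 2.0
        out[b] = (right + mean) / 2.0
    return out


def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def deblock(row, block=8, threshold=4.0, max_passes=5):
    """Repeat the smoothing pass until the variance of the difference between
    the candidate result and the original row exceeds `threshold`."""
    current = list(row)
    for _ in range(max_passes):
        candidate = smooth_block_edges(current, block)
        diff = [c - r for c, r in zip(candidate, row)]
        if variance(diff) > threshold:
            break  # further smoothing would alter the signal too much
        current = candidate
    return current


# Two flat 8-pixel blocks with a visible step at the boundary.
decoded_row = [10] * 8 + [30] * 8
result = deblock(decoded_row)
```

With these placeholder settings the first pass is accepted and narrows the step at the boundary from 20 to 10 gray levels, while the second pass is rejected because it would change the row more than the threshold allows. A two-dimensional version would apply the same pass along both rows and columns.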
Prevention and early diagnosis of tumors in mammograms are of foremost importance. Unfortunately, these images are often corrupted by film noise and by the background texture of the images, which prevents isolation of the target information from the background noise and often results in inaccurate analysis of the suspicious area. To achieve more accurate detection and segmentation of tumors, the quality of the images needs to be improved by suppressing noise and enhancing contrast. This paper presents a new adaptive histogram threshold approach for segmenting suspicious mass regions in digitized images. The method uses multiscale wavelet decomposition and a threshold selection criterion based on a transformed image's histogram. This separation helps eliminate background noise and discriminates between objects of different size and shape. The tumors are extracted using an adaptive Bayesian classifier. We demonstrate that the proposed method can greatly improve the accuracy of tumor detection.