In airports, railway stations, and other public places, security inspectors generally carry out security inspection by visually examining x-ray images, so false detections and missed detections often occur. In this paper, an automatic method for detecting anomalous objects in x-ray images is proposed under a two-stage framework. In the first stage, a learnable Gabor convolution layer is introduced into ResNeXt to help the network capture the edge information of objects. Then, a region proposal network (RPN) is used to determine the candidate regions of objects and to perform coarse classification. In the second stage, bigger discriminative RoI pooling (BDRP) is proposed to classify the candidate boxes and improve the classification accuracy of objects. Furthermore, dense local regression (DLR) is applied to predict the offsets of multiple dense boxes in the region proposals so that objects are located accurately. Experimental results on the SIXray and OPIXray datasets show that the proposed method achieves detection performance competitive with state-of-the-art methods.
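To make the role of the Gabor layer more concrete, the following is a minimal sketch of how a single real-valued Gabor kernel can be generated. It is not the layer described in the abstract: the function name `gabor_kernel` and all parameter names and default values are assumptions chosen for illustration, and in a learnable layer the parameters (`sigma`, `theta`, `lambd`, `gamma`, `psi`) would be trainable rather than fixed.

```python
import numpy as np


def gabor_kernel(size, sigma, theta, lambd, gamma=0.5, psi=0.0):
    """Build one real Gabor kernel of shape (size, size); size should be odd.

    Hypothetical helper for illustration only. A learnable Gabor layer would
    treat sigma, theta, lambd, gamma and psi as trainable parameters.
    """
    half = size // 2
    # Pixel coordinates centred on the kernel midpoint.
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate the coordinates by the orientation theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope multiplied by a cosine carrier.
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / lambd + psi)
    return envelope * carrier


# A small bank of four orientations, as an edge-sensitive filter set.
bank = np.stack([gabor_kernel(7, 2.0, t, 4.0)
                 for t in np.linspace(0.0, np.pi, 4, endpoint=False)])
```

Each kernel in `bank` responds most strongly to edges at one orientation, which is the property the abstract relies on when it says the layer helps the network capture edge information.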
Vehicle classification is vital to an intelligent transport system. Extracting reliable and distinguishable vehicle features is the most crucial step in obtaining high accuracy. A feature extraction method using a lightweight convolutional network for vehicle classification is proposed. The main contributions are threefold: (1) a lightweight network named LWNet with two convolution layers is proposed to extract the features of the vehicles; (2) the Hu moment is integrated with spatial location information to improve its descriptive and discriminative ability; and (3) the histogram of oriented gradient (HOG) feature is extracted from the complete image, and the above two kinds of features are then combined with HOG to form the feature vector. A support vector machine is then trained to obtain the classification model. Vehicles are classified into six categories, i.e., large bus, car, motorcycle, minibus, truck, and van. The experimental results demonstrate that the classification accuracy reaches 97.39%, which is 3.81% higher than that obtained with conventional methods. In addition, for this vehicle classification task, the proposed lightweight convolutional network achieves comparable or even higher performance than deep convolutional neural networks, while it does not need the support of a graphics processing unit, requires no training process, and has much lower complexity.
Vehicle color recognition is easily affected by subtle environmental changes, and existing recognition methods cannot achieve accurate results. A high-accuracy vehicle color recognition method using a hierarchical fine-tuning strategy for urban surveillance videos is proposed. Unlike conventional methods based on convolutional neural networks, which usually obtain a single classification model, the proposed method combines pretraining and hierarchical fine-tuning to obtain different classification models that can adapt to changes in illumination conditions. First, GoogLeNet is pretrained on the ILSVRC-2012 dataset to obtain the initial weight parameters of the network. During the first stage of fine-tuning, the whole vehicle color dataset is used to fine-tune the pretrained network to get the initial classification model. Then, an image quality assessment method is proposed to evaluate the illumination conditions of each image, and the whole vehicle color dataset is divided into several subdatasets according to the evaluation results. The second stage of fine-tuning is performed on the initial classification model using each subdataset, which yields the final classification model for each subdataset. The experimental results on different databases demonstrate that the proposed method achieves higher recognition accuracy than state-of-the-art methods.
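The dataset-splitting step that precedes the second fine-tuning stage can be sketched as follows. The abstract does not specify the image quality assessment method, so this sketch substitutes mean luma as a stand-in illumination score; the functions `illumination_score` and `split_by_illumination` and the threshold values are assumptions for illustration, not the authors' procedure.

```python
import numpy as np


def illumination_score(image):
    """Stand-in illumination measure (assumption): mean luma in [0, 1].

    Expects an RGB image with values in [0, 255] and shape (H, W, 3).
    """
    rgb = image.astype(float) / 255.0
    luma = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return float(luma.mean())


def split_by_illumination(images, thresholds=(0.33, 0.66)):
    """Group image indices into len(thresholds) + 1 illumination bins.

    The thresholds are hypothetical. Each resulting bin would be used to
    fine-tune its own copy of the initial classification model.
    """
    groups = {i: [] for i in range(len(thresholds) + 1)}
    for idx, img in enumerate(images):
        bin_id = int(np.digitize(illumination_score(img), thresholds))
        groups[bin_id].append(idx)
    return groups


# Three synthetic images: dark, medium, and bright.
images = [np.full((8, 8, 3), v, dtype=np.uint8) for v in (0, 128, 255)]
groups = split_by_illumination(images)
```

Under these assumptions each bin corresponds to one subdataset, and at inference time the same score would select which fine-tuned model to apply.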
In this paper, we propose a region-of-interest-based (RoI-adaptive) fusion algorithm for infrared and visible images
using the Laplacian pyramid method. First, we estimate the saliency map of the infrared image and, by normalizing the
saliency map, divide the infrared image into two parts: the regions of interest (RoI) and the regions of non-interest (nRoI).
The visible image is also segmented into two parts by using a Gaussian high-pass filter: the regions of high frequency (RoH)
and the regions of low frequency (RoL). Second, we down-sample both the nRoI of the infrared image and the RoL of the
visible image as the input to the next level of processing. Finally, we use the normalized saliency map of the infrared image
as the weighting coefficient to get the base image on the top level, and choose the maximum gray value of the RoI of the
infrared image and the RoH of the visible image to get the detail image. In this way, our method keeps the target features of
the infrared image and the texture detail of the visible image at the same time. Experimental results show that this fusion
scheme performs better than other fusion algorithms in both subjective visual quality and quantitative metrics.
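The base/detail fusion rule described above can be illustrated with a single-level sketch. This is a simplification under stated assumptions: it uses one pyramid level with 2x2 block averaging instead of a full Laplacian pyramid, takes the saliency map as a given input, and does not reproduce the RoI/nRoI or RoH/RoL segmentation. All function names are hypothetical, and image sides are assumed to be even.

```python
import numpy as np


def normalize(m):
    """Min-max normalize a map to [0, 1]; a constant map becomes all zeros."""
    m = m.astype(float)
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)


def down2(a):
    """Down-sample by 2 using 2x2 block averaging (assumes even sides)."""
    h, w = a.shape
    return a.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def up2(a):
    """Up-sample by 2 using nearest-neighbour repetition."""
    return np.repeat(np.repeat(a, 2, axis=0), 2, axis=1)


def fuse(ir, vis, saliency):
    """One-level sketch: saliency-weighted base plus max-selected detail."""
    ir, vis = ir.astype(float), vis.astype(float)
    weight = down2(normalize(saliency))
    ir_base, vis_base = down2(ir), down2(vis)
    # Detail layers are the residuals left after removing the base layer.
    ir_detail = ir - up2(ir_base)
    vis_detail = vis - up2(vis_base)
    # Base: weighted average driven by the normalized infrared saliency.
    base = weight * ir_base + (1.0 - weight) * vis_base
    # Detail: keep the larger value at each pixel.
    detail = np.maximum(ir_detail, vis_detail)
    return up2(base) + detail


rng = np.random.default_rng(0)
ir = rng.random((8, 8))
vis = rng.random((8, 8))
fused = fuse(ir, vis, saliency=ir)
```

Using the infrared image itself as the saliency input in the usage line is only a placeholder; any saliency estimator could be substituted.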
Accurate and fast detection of small infrared targets is very important for infrared precision guidance, early
warning, video surveillance, etc. Based on the human visual attention mechanism, an automatic detection algorithm for
small infrared targets is presented. In this paper, instead of searching for infrared targets, we model the regular patches
that do not attract much attention from our visual system. This is inspired by the property that regular patches in the spatial
domain correspond to spikes in the amplitude spectrum. Unlike recent approaches that use global spectral filtering,
we define the concept of local maxima suppression, which uses local spectral filtering to smooth the spikes in the amplitude
spectrum and thereby makes the infrared targets pop out. In the proposed method, we first compute the
amplitude spectrum of an input infrared image. Second, we find the local maxima of the amplitude spectrum using the cubic
facet model. Third, we suppress the local maxima by convolving the local spectrum with a low-pass Gaussian
kernel of an appropriate scale. Finally, the detection result in the spatial domain is obtained by reconstructing the 2D signal
from the original phase and the log amplitude spectrum with its local maxima suppressed. Experiments are performed
on real-life IR images, and the results show that the proposed method achieves satisfactory detection effectiveness and
robustness. It also has high detection efficiency and can be further used for real-time detection and tracking.
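The four steps above can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: local maxima are found by a plain 8-neighbour comparison instead of the cubic facet model, the Gaussian scale and radius are arbitrary, and exponentiating the suppressed log amplitude before the inverse transform is an assumed reading of the reconstruction step. All function names are hypothetical.

```python
import numpy as np


def gaussian_blur(a, sigma=1.0, radius=2):
    """Small Gaussian blur with wrap-around padding (the DFT is periodic)."""
    ax = np.arange(-radius, radius + 1)
    g = np.exp(-ax ** 2 / (2.0 * sigma ** 2))
    g /= g.sum()
    padded = np.pad(a, radius, mode="wrap")
    h, w = a.shape
    out = np.zeros((h, w), dtype=float)
    for i, gi in enumerate(g):
        for j, gj in enumerate(g):
            out += gi * gj * padded[i:i + h, j:j + w]
    return out


def local_maxima(a):
    """Mask of points strictly greater than all 8 neighbours.

    Stand-in for the cubic facet model named in the abstract.
    """
    mask = np.ones(a.shape, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                mask &= a > np.roll(a, (dy, dx), axis=(0, 1))
    return mask


def detect_small_targets(img, sigma=1.0):
    """Return a map in [0, 1] after suppressing spectral local maxima."""
    spectrum = np.fft.fft2(img.astype(float))
    log_amp = np.log(np.abs(spectrum) + 1e-8)     # step 1: amplitude spectrum
    phase = np.angle(spectrum)
    peaks = local_maxima(log_amp)                 # step 2: local maxima
    smoothed = gaussian_blur(log_amp, sigma)      # step 3: local smoothing
    suppressed = np.where(peaks, smoothed, log_amp)
    recon = np.fft.ifft2(np.exp(suppressed + 1j * phase))  # step 4
    response = np.abs(recon) ** 2
    return response / response.max()


rng = np.random.default_rng(0)
frame = rng.random((32, 32))
frame[10, 20] += 5.0  # synthetic point-like target
response = detect_small_targets(frame)
```

Thresholding `response` would give candidate target locations; the choice of threshold is left open here because the abstract does not state one.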