This PDF file contains the front matter associated with SPIE Proceedings Volume 9813 including the Title Page, Copyright information, Table of Contents, Introduction, Authors, and Conference Committee listing.
We propose and discuss optical pattern recognition algorithms for object tracking based on nonlinear equivalent models and frame subtraction. Experimental results of the suggested algorithms, implemented in Mathcad and LabVIEW, are presented. The combination of equivalent functions and frame differencing gives good results for recognizing and tracking moving objects.
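As a concrete illustration of the frame-subtraction idea (a minimal OpenCV sketch, not the paper's Mathcad/LabVIEW implementation), the snippet below thresholds the difference of consecutive frames and boxes the moving regions; the video path, threshold, and area gate are illustrative assumptions.

```python
import cv2

cap = cv2.VideoCapture("scene.avi")  # hypothetical input video
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Subtraction of consecutive frames isolates moving pixels.
    diff = cv2.absdiff(gray, prev_gray)
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
    # Bounding boxes of connected moving regions approximate the tracked object.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 100:  # ignore small noise blobs
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    prev_gray = gray
```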
Light field imaging can capture dense multi-view 2D images in one snapshot, recording both the intensity values and the directions of rays simultaneously. As an emerging 3D device, the light field camera has been widely used in digital refocusing, depth estimation, stereoscopic display, etc. Traditional multi-view stereo (MVS) methods perform well only on strongly textured surfaces; their depth maps contain numerous holes and large ambiguities in textureless or low-texture regions. In this paper, we exploit light field imaging for 3D face modeling in computer vision. Based on a 3D morphable model, we estimate the pose parameters from facial feature points. The depth map is then estimated through the epipolar plane image (EPI) method. Finally, a high-quality 3D face model is recovered via a fusion strategy. We evaluate the effectiveness and robustness of the method on face images captured by a light field camera at different poses.
Fringe projection three-dimensional measurement is applied in a wide range of industrial settings, but traditional fringe projection systems suffer from high expense, large size, and complicated calibration requirements. In this paper we introduce a low-cost, portable realization of three-dimensional measurement built around a Pico projector, offering low cost, compact physical size, and flexible configuration. The proposed system places no restriction on the parallelism or perpendicularity of the relative alignment between camera and projector during installation. Moreover, a plane-based calibration method is adopted that avoids critical calibration requirements such as an additional gauge block or a precise linear z-stage. The error sources present in the proposed system are also analyzed. Experimental results demonstrate the feasibility of the proposed low-cost, portable fringe projection system.
Drawing on state-of-the-art multi-view dense matching methods and the particular characteristics of UAV images, this article proposes a UAV multiple-image dense matching algorithm based on self-adaptive patches (UAV-AP). The main idea of patch-based matching propagation is to build patches centered on seed points that are already matched. The extent and shape of each patch adapt automatically to the terrain relief: where the surface is smooth, the patch grows to cover the whole smooth area; where the terrain is rough, the patch shrinks to describe the details of the surface. The algorithm takes UAV image sequences and given or previously triangulated orientation elements as inputs, and proceeds in three steps: (1) multi-view initial feature matching, (2) matching propagation based on self-adaptive patches, and (3) filtering of erroneous matches. It outputs a dense colored point cloud. Experiments indicate that this method surpasses existing related algorithms in efficiency while the matching precision remains high.
Face recognition, as an important biometric identification method with friendly, natural, and convenient advantages, has attracted more and more attention. This paper studies a face recognition system comprising face detection, feature extraction, and recognition, focusing on how different preprocessing methods in the detection stage affect recognition results under kernel principal component analysis (KPCA). We choose the YCbCr color space for skin segmentation and integral projection for face location. Face images are preprocessed with erosion and dilation (opening and closing operations) and an illumination compensation method, and then analyzed with the KPCA-based recognition method; experiments are carried out on a typical face database, with all algorithms implemented on the MATLAB platform. Experimental results show that, under certain conditions, the kernel extension of the PCA algorithm extracts nonlinear features that represent the original image information better and can thus obtain a higher recognition rate. In the preprocessing stage, we found that different operations on the images can produce different results and hence different recognition rates. At the same time, in kernel principal component analysis, the degree of the polynomial kernel function also affects the recognition result.
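A minimal sketch of the KPCA recognition stage with a polynomial kernel, using scikit-learn and the Olivetti faces dataset as stand-ins for the paper's MATLAB code and face database; the component count, kernel degree, and nearest-neighbor classifier are illustrative assumptions.

```python
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

faces = fetch_olivetti_faces()
X_tr, X_te, y_tr, y_te = train_test_split(
    faces.data, faces.target, test_size=0.25, stratify=faces.target, random_state=0)

# Polynomial kernel: the abstract notes the polynomial degree affects results.
kpca = KernelPCA(n_components=100, kernel="poly", degree=3)
Z_tr = kpca.fit_transform(X_tr)
Z_te = kpca.transform(X_te)

# Nearest-neighbor classification in the KPCA feature space.
clf = KNeighborsClassifier(n_neighbors=1).fit(Z_tr, y_tr)
print("recognition rate:", clf.score(Z_te, y_te))
```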
In this paper, we introduce and study a novel unsupervised domain adaptation (DA) algorithm, called latent subspace sparse representation based domain adaptation, which builds on the fact that source and target data lie in different but related low-dimensional subspaces. The key idea is that each point in a union of subspaces can be reconstructed by a combination of other points in the dataset. We propose to project the source and target data onto a common latent generalized subspace, which is a union of the source and target subspaces, and to learn the sparse representation in that latent space. By imposing minimum-reconstruction-error and maximum mean discrepancy (MMD) constraints, the structures of the source and target domains are preserved while the discrepancy between them is reduced, and both properties are reflected in the sparse representation. We then use the sparse representation to build a weighted graph capturing the relationships among points from the different domains (source-source, source-target, and target-target) to predict the labels of the target domain. We also propose an efficient optimization method for the algorithm. Our method needs no additional classifier and therefore no separate training and testing procedures. Various experiments show that the proposed method performs better than competitive state-of-the-art subspace-based domain adaptation methods.
In this paper we propose an independent sequential maximum likelihood approach to joint track-to-track association and bias removal in multi-sensor information fusion systems. First, we enumerate all possible association hypotheses and estimate a bias for each. Then we calculate the likelihood of each association after bias compensation. Finally, we choose the association with the maximum likelihood as the association result, with the corresponding bias estimate as the registration result. Considering high false-alarm rates and interference, we adopt independent sequential association to calculate the likelihood. Simulation results show that the proposed method produces correct association results while simultaneously estimating the bias precisely for a small number of targets in a multi-sensor fusion system.
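A toy sketch of the enumerate-then-score idea (not the paper's sequential formulation): for each candidate association we estimate a translation bias, compensate it, and keep the maximum-likelihood hypothesis. All data, the Gaussian noise model, and the known noise sigma are synthetic illustrations.

```python
import numpy as np
from itertools import permutations

rng = np.random.default_rng(0)
tracks_a = rng.uniform(0, 100, size=(3, 2))                   # sensor A tracks (x, y)
bias_true = np.array([5.0, -3.0])
tracks_b = tracks_a + bias_true + rng.normal(0, 0.5, (3, 2))  # sensor B: biased + noisy

best = None
for perm in permutations(range(3)):                # enumerate association hypotheses
    matched = tracks_b[list(perm)]
    bias_hat = np.mean(matched - tracks_a, axis=0) # bias estimate for this hypothesis
    resid = matched - tracks_a - bias_hat          # residual after bias compensation
    loglik = -0.5 * np.sum(resid ** 2) / 0.5 ** 2  # Gaussian log-likelihood, known sigma
    if best is None or loglik > best[0]:
        best = (loglik, perm, bias_hat)

print("association:", best[1], "bias estimate:", best[2])
```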
In this paper, we present new image fusion methods based on the intuitionistic index, in both the spatial domain and the contourlet transform domain; in the contourlet domain we adopt two fusion schemes. When constructing an intuitionistic fuzzy set, we use the Gamma function to obtain the membership degree and the Sugeno complement to obtain the non-membership degree. Information theory suggests that the larger the hesitancy, the more information a pixel carries. We therefore set up a fusion rule that selects the pixel with the larger hesitancy to produce a fused image from multi-focus or remote sensing inputs. We compare the new algorithms with several classical image fusion algorithms. The results show that for multi-focus images the new algorithms outperform the others and produce good fusion results, especially the contourlet-domain algorithm using the intuitionistic index. For remote sensing images the new algorithms are not the best, but they still produce good fusion results.
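A minimal NumPy sketch of the "choose the larger hesitancy" rule in the spatial domain. The membership function below (normalized intensity) is a stand-in assumption, since the paper's Gamma-function membership is not specified here; only the Sugeno complement and the hesitancy-based selection follow the abstract.

```python
import numpy as np

def hesitancy(img, lam=0.5):
    mu = img.astype(float) / 255.0        # membership degree (placeholder choice)
    nu = (1.0 - mu) / (1.0 + lam * mu)    # Sugeno complement -> non-membership degree
    return 1.0 - mu - nu                  # intuitionistic (hesitancy) index

def fuse(img_a, img_b):
    # Pixel-wise rule from the abstract: the larger hesitancy wins.
    pick_a = hesitancy(img_a) >= hesitancy(img_b)
    return np.where(pick_a, img_a, img_b)
```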
Object detection is one of the most important research topics in computer vision. Recently, category-independent objectness in RGB images has become a hot field because of its generalization ability and its efficiency as a pre-filtering step for object detection. Many traditional applications have been transferred from RGB images to depth images since economical depth sensors, such as the Kinect, became popular. Depth data represent distance information, and because of this special characteristic, objectness evaluation methods designed for RGB images are often invalid on depth images. In this study, we propose mEdgeboxes to evaluate objectness in depth images. Aside from detecting edges in the raw depth map, we extract another edge map from the orientation information derived from surface normal vectors. The two edge maps are integrated and fed to Edgeboxes to produce object proposals. Experimental results on two challenging datasets demonstrate that the detection rate of the proposed objectness estimation method exceeds 90% with 1000 windows; notably, our approach generally outperforms state-of-the-art methods on detection rate.
Due to the curse of dimensionality, traditional clustering methods usually fail to produce meaningful results on high-dimensional data. Hypergraph partitioning is believed to be a promising way to address this challenge. In this paper, we first construct a graph G from the data by defining an adjacency relationship between the data points using Shared Reverse k-Nearest Neighbors (SRNN). A hypergraph is then created from G by taking all maximal cliques of G as hyperedges. Once the hypergraph is produced, a powerful hypergraph partitioning method called dense subgraph partition (DSP), combined with the k-medoids method, is used to produce the final clustering. The proposed method is evaluated on several real high-dimensional datasets, and the experimental results show that it improves the clustering of high-dimensional data compared with applying k-medoids directly to the original data.
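A hedged sketch of the SRNN adjacency construction in the first step. The notion of "sharing" used here (a count of common reverse k-nearest neighbors above a threshold) and the threshold value are an illustrative reading, not necessarily the paper's exact definition.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def srnn_graph(X, k=10, min_shared=3):
    n = len(X)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    knn = nn.kneighbors(X, return_distance=False)[:, 1:]   # drop self-neighbor
    reverse = [set() for _ in range(n)]
    for i in range(n):
        for j in knn[i]:
            reverse[j].add(i)          # i counts j among its k nearest neighbors
    A = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(i + 1, n):
            # link points whose reverse-neighbor sets overlap enough
            if len(reverse[i] & reverse[j]) >= min_shared:
                A[i, j] = A[j, i] = True
    return A
```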
To achieve rendezvous with and capture of a non-cooperative space target, the relative position and pose of the target must be measured. Since no marker is installed on a non-cooperative target and there is no inter-satellite link to transfer information, measuring its relative position and pose is very difficult. The solar array connecting frames of non-cooperative targets are distinctive and easy to capture, so this paper studies stereo-vision-based position and pose measurement of this specific operation site. The method consists of image acquisition, image filtering, edge detection, feature extraction, and relative pose measurement. Finally, the relative position and attitude parameters of the solar wing connecting frame are obtained and provided to the control system. Simulation and ground verification results show that the algorithm is accurate and effective and can satisfy the technical requirements of on-orbit operation; the measurement approach is suitable for engineering implementation.
Data explosion and information redundancy are the main characteristics of the big data era. Digging valuable information out of massive data is the premise of efficient information processing and a key technology for object recognition with massive feature databases. In large-scale image processing, both the massive image data and the high dimensionality of image features pose great challenges to object recognition and information retrieval. As with other big data, a large-scale image feature database containing extensive redundancy can be quantitatively represented by a finite set of clustering models without degrading recognition performance. Inspired by the ideas of product quantization and high-dimensional feature division, this paper proposes a data compression method based on a recursive self-organizing map (RSOM) algorithm.
This paper presents an improved density-based clustering algorithm built on clustering by fast search and find of density peaks. A distance threshold is introduced to economize memory, and, to reduce the probability that two points share the same density value, similarity is used to define the proximity measure. We have tested the modified algorithm on one large dataset, several small datasets, and shape datasets. The results show that the proposed algorithm obtains acceptable results and can be applied more widely.
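For reference, a compact sketch of the density-peaks scheme the paper builds on: each point gets a local density rho and a distance delta to the nearest denser point, and points with large rho * delta are taken as cluster centers. The Gaussian kernel, cutoff d_c, and fixed center count are illustrative; the paper's memory and similarity modifications are not reproduced here.

```python
import numpy as np
from scipy.spatial.distance import cdist

def density_peaks(X, d_c=1.0, n_centers=3):
    n = len(X)
    D = cdist(X, X)
    rho = np.exp(-(D / d_c) ** 2).sum(axis=1) - 1.0   # Gaussian-kernel local density
    order = np.argsort(-rho)                          # points by decreasing density
    delta = np.zeros(n)
    parent = np.zeros(n, dtype=int)
    delta[order[0]] = D[order[0]].max()               # densest point: max distance
    parent[order[0]] = order[0]
    for r in range(1, n):
        i, denser = order[r], order[:r]
        j = denser[np.argmin(D[i, denser])]           # nearest higher-density point
        delta[i], parent[i] = D[i, j], j
    centers = np.argsort(-(rho * delta))[:n_centers]  # decision-graph heuristic
    labels = -np.ones(n, dtype=int)
    labels[centers] = np.arange(n_centers)
    for i in order:                                   # propagate down the density order
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels
```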
In this paper, multi-kernel learning (MKL) is used for classifying drug-related webpages. First, body text and image-label text are extracted through HTML parsing, and valid images are chosen by the FOCARSS algorithm. Second, a text-based bag-of-words (BOW) model generates the text representation, and an image-based BOW model generates the image representation. Last, the text and image representations are fused by several methods. Experimental results demonstrate that the classification accuracy of MKL is higher than that of all other fusion methods at the decision level and feature level, and much higher than the accuracy of single-modality classification.
A fusion algorithm for infrared and visible images based on saliency scale-space in the frequency domain is proposed. Human attention is directed towards salient targets, which carry the most important information in the image. For given registered infrared and visible images, first, visual features are extracted to form the input hypercomplex matrix. Second, the Hypercomplex Fourier Transform (HFT) is used to obtain the salient regions of the infrared and visible images respectively: the amplitude spectrum of the input hypercomplex matrix is convolved with low-pass Gaussian kernels of appropriate scales, which acts as an image saliency detector, and the saliency maps are obtained by reconstructing the 2D signal from the original phase and the amplitude spectrum filtered at the scale that minimizes saliency-map entropy. Third, the salient regions are fused with adaptive weighting rules and the non-salient regions with a rule based on region energy (RE) and region sharpness (RS), yielding the fused image. Experimental results show that the presented algorithm preserves the rich spectral information of the visible image while effectively capturing the thermal target information of the infrared image at different scales.
In this paper, we present a novel image matching method to find correspondences between two sets of image interest points. The method is based on a revised third-order tensor graph matching formulation and introduces an energy function with four kinds of energy terms. Third-order tensor methods can hardly handle situations where the number of interest points is huge; to deal with this, we use a potential matching set and a voting mechanism to decompose the matching task into several sub-tasks. Moreover, third-order tensor methods sometimes find only a local optimum, so we cluster the feature points into groups and sample feature triangles only across different groups, which makes it much easier for the algorithm to find the global optimum. Experiments on different image databases show that our new method obtains correct matching results with relatively high efficiency.
Many challenging computer vision problems have been shown to benefit from the incorporation of depth information, for example semantic labelling, pose estimation, and even contour detection. Different objects in a single monocular image lie at different depths: the depth within one object is coherent, while the depth across different objects may vary discontinuously. Meanwhile, a broad non-classical receptive field (NCRF) exists outside the classical receptive field (CRF): the response of the central neuron is affected not only by stimuli inside the CRF but is also modulated by surrounding stimuli, a contextual modulation mediated by horizontal connections across the visual cortex. Based on these findings, a biologically inspired contour detection model combined with depth information is proposed in this paper.
Exploiting the redundancy of an overcomplete dictionary can capture the structural features of an image effectively and thus achieve an effective image representation. However, the commonly used atomic sparse representation disregards the structure of the dictionary and involves unrelated nonzero terms in the computation, while structured sparse representation, although it considers the structural features of the dictionary, may leave most coefficients of the blocks nonzero, which hurts identification efficiency. To overcome the disadvantages of these two sparse representations, a weighted parallel combination of atomic sparse and structured sparse representation is proposed, in which recognition efficiency is improved by adaptive computation of the optimal weights. The atomic sparse and structured sparse representations are computed in parallel, and the optimal weights are calculated adaptively as follows: training with a small part of the identification samples, the recognition rate is computed as the weights are increased in fixed steps under the constraint between the weights. With the recognition rate as the Z axis and the two weights as the X and Y axes, the resulting points can be connected in a 3D coordinate system, and the optimal weights are obtained by finding the highest recognition rate. Simulation experiments show that the adaptively computed optimal weights give better recognition rates; the weights can be obtained from a few samples, the scheme suits parallel recognition computation, and it effectively improves the recognition rate of infrared images.
This paper proposes a low-cost FPGA architecture for the Speeded-Up Robust Features (SURF) algorithm based on OpenSURF. It optimizes the computing architecture of the feature detection and feature description steps of SURF to reduce resource utilization and improve processing speed. As a result, the architecture can detect features and extract descriptors from 800x600 video streams at 60 frames per second (60 fps). Extensive experiments have demonstrated its efficiency and effectiveness.
Slant correction is a primary and critical step in the recognition of steel billet characters, since character positioning is inaccurate when the characters are inclined. To solve this problem, this paper presents a slant correction algorithm for billet characters that uses the height feature of the characters. Because the characters are linearly arranged, the angle between the baseline and the horizontal direction can be calculated from this feature, and the sloping billet characters can then be corrected. Experimental results show that the proposed method corrects sloping characters accurately and obtains better results than the traditional slant correction algorithm for billet characters.
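A minimal OpenCV sketch of the baseline-angle idea: fit a line through the character bounding-box bottoms and rotate by the recovered angle. It assumes bright characters on a dark background and several detected character blobs; the height gate and sign convention are illustrative, and the paper's exact height feature is not reproduced.

```python
import cv2
import numpy as np

def deskew_billet(img):
    _, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    pts = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if h > 10:                           # keep character-sized blobs only
            pts.append((x + w / 2, y + h))   # bottom-center of each character
    pts = np.array(pts, dtype=np.float32)
    # Least-squares line through the baseline points gives the slant angle.
    slope = np.polyfit(pts[:, 0], pts[:, 1], 1)[0]
    angle = np.degrees(np.arctan(slope))     # sign may need flipping per setup
    M = cv2.getRotationMatrix2D((img.shape[1] / 2, img.shape[0] / 2), angle, 1.0)
    return cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
```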
This paper proposes a novel posture estimation method composed of two stages: the first reconstructs lines from stereo images, and the second estimates posture from the reconstructed lines. Since line detection is more accurate than point detection, our method achieves better accuracy than point-based methods.
This paper proposes an approach for producing super-resolved all-in-focus images with a plenoptic camera. A plenoptic camera is made by placing a micro-lens array between the lens and the sensor of a conventional camera; it captures both the angular and spatial information of the scene in a single shot. A sequence of digitally refocused images, focused at different depths, can be produced by processing the 4D light field captured by the plenoptic camera. The number of pixels in a refocused image equals the number of micro-lenses in the array, so the limited number of micro-lenses yields low-resolution refocused images lacking detail. These lost details, mostly high-frequency information, matter for the in-focus part of a refocused image, so we super-resolve those in-focus parts. An image segmentation method based on random walks, operating on the depth map produced from the 4D light field data, separates foreground from background in the refocused images, and a focus evaluation function determines which refocused image has the clearest foreground and which has the clearest background. Subsequently, we apply single-image super-resolution based on sparse signal representation to the in-focus parts of these selected refocused images. Eventually, we obtain the super-resolved all-in-focus image by merging the in-focus background and the in-focus foreground via digital signal processing, preserving more spatial detail in the output. Our method enhances the resolution of the refocused image, and only the refocused images with the clearest foreground and background need to be super-resolved.
Multi-instance multi-label (MIML) learning is a framework in which each example is described by a bag of instances and associated with a set of labels. MIML algorithms have been applied to natural scene image classification with satisfactory performance. We design a MIML algorithm based on an RBF neural network for natural scene image classification. Within this framework, we compare classification accuracy under the existing definitions of bag distance: maximum Hausdorff, minimum Hausdorff, and average Hausdorff. Although the average Hausdorff bag distance yields the highest accuracy, we find that it weakens the role of the minimum distance between the instances in the two bags. We therefore redefine the average Hausdorff bag distance by introducing an adaptive adjustment coefficient that changes according to the minimum distance between the instances in the two bags. Experimental results show that the enhanced algorithm outperforms the original one.
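A small NumPy sketch of the bag distances being compared. The adaptive variant below, which blends the minimum inter-instance distance back into the average via a coefficient alpha, is only an illustrative form; the paper's exact adaptive coefficient is not reproduced here.

```python
import numpy as np
from scipy.spatial.distance import cdist

def avg_hausdorff(A, B):
    # Average Hausdorff: mean of every instance's distance to the other bag.
    D = cdist(A, B)
    return (D.min(axis=1).sum() + D.min(axis=0).sum()) / (len(A) + len(B))

def adaptive_avg_hausdorff(A, B, alpha=0.5):
    # Illustrative adaptive form: re-emphasize the closest instance pair,
    # which the plain average tends to wash out.
    D = cdist(A, B)
    avg = (D.min(axis=1).sum() + D.min(axis=0).sum()) / (len(A) + len(B))
    return (1 - alpha) * avg + alpha * D.min()
```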
Embedded steel billet characters have low contrast, and uneven illumination and oxidation hinder correct detection of the embedded characters in images. A novel method based on structured light for acquiring and extracting embedded characters is proposed. The embedded characters are irradiated with structured light, which they bend; a processing algorithm based on the Fourier transform picks up the carrier wave from the reflected image, which is then demodulated and filtered to extract the embedded characters. Experimental results show that the algorithm is concise and effective: it preserves character integrity, resists noise interference, and is reliable enough for practical application.
In this paper, a new machine vision method is proposed to overcome the shortcomings of traditional manual inspection of printed matter quality. A linear-array CCD camera is used for image acquisition, with a stepper motor driving the sampling; by improving the driving circuit, images of different sizes or precisions can be acquired. For image processing, a standard image registration algorithm is applied; because of the characteristics of CCD image acquisition, a rigid-body transformation is usually sufficient for the registration, thereby achieving defect detection of the printed image.
Dynamic texture (DT) is an extension of texture to the temporal domain, and recognizing DTs has received increasing attention. The volume local binary pattern (VLBP) is the most widely used descriptor for DTs; however, recognizing DTs with VLBP is time consuming due to the large scale of the data and the high dimensionality of the descriptor itself. In this paper, we propose a new operator called the orthogonal combination of VLBP (OC-VLBP) for DT recognition. The original VLBP is decomposed both longitudinally and latitudinally and then combined to constitute the OC-VLBP operator, lowering the dimensionality of the original VLBP descriptor. Experimental results show that the proposed operator significantly reduces the computational cost of recognizing DTs without much loss in recognition accuracy.
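Not the paper's OC-VLBP, but a minimal 2D LBP building block showing how a binary pattern code is formed per pixel; VLBP extends the same thresholding to neighbors in adjacent frames. The 8-neighbor layout here is the standard one.

```python
import numpy as np

def lbp8(img):
    c = img[1:-1, 1:-1]  # interior pixels as centers
    # Eight neighbors of each center, clockwise from top-left.
    nbrs = [img[:-2, :-2], img[:-2, 1:-1], img[:-2, 2:], img[1:-1, 2:],
            img[2:, 2:], img[2:, 1:-1], img[2:, :-2], img[1:-1, :-2]]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(nbrs):
        # Each neighbor >= center contributes one bit of the pattern code.
        code |= ((n >= c).astype(np.uint8) << bit)
    return code  # a histogram of these codes is the texture descriptor
```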
Image matching is the core of three-dimensional reconstruction. With the development of computer processing technology, retrieving the images to be matched from large image sets acquired in different formats, at different scales, and at different locations places new demands on image matching. To support three-dimensional reconstruction based on image matching from big-data image sets, this paper puts forward a new, effective matching method based on a visual bag-of-words model. The main technologies involved are building the bag-of-words model and image matching. First, SIFT feature points are extracted from the images in the database and clustered to generate the bag-of-words model. We establish inverted files on the bag of words, in which each visual word points to all images containing it; matching is then performed only among images sharing a word, improving matching efficiency. Finally, the three-dimensional model is built from the matched images. Experimental results indicate that this method improves matching efficiency and is suitable for the requirements of large-data reconstruction.
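A hedged sketch of the retrieval step: cluster SIFT descriptors into visual words with k-means, build an inverted file, and compare only images that share a word with the query. The vocabulary size is an illustrative assumption, and descriptor extraction is assumed done elsewhere.

```python
import numpy as np
from collections import defaultdict
from sklearn.cluster import KMeans

def build_inverted_file(descs_per_image, n_words=1000):
    all_desc = np.vstack(descs_per_image)              # stack SIFT descriptors
    km = KMeans(n_clusters=n_words, n_init=4).fit(all_desc)
    inverted = defaultdict(set)                        # visual word -> image ids
    for img_id, descs in enumerate(descs_per_image):
        for w in km.predict(descs):
            inverted[w].add(img_id)
    return km, inverted

def candidates(km, inverted, query_descs):
    # Only images sharing at least one visual word with the query are matched.
    ids = set()
    for w in km.predict(query_descs):
        ids |= inverted[w]
    return ids
```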
Camera calibration is the first step in computer vision and one of the most active research fields today. To improve measurement precision, the internal parameters of the camera should be accurately calibrated, so a high-accuracy camera calibration algorithm is proposed based on images of planar or tridimensional targets. Using the algorithm, the internal parameters of the camera are calibrated from the existing planar target of a vision-based navigation experiment. Experimental results show that the accuracy of the proposed algorithm is clearly improved compared with the conventional linear algorithm, Tsai's general algorithm, and Zhang Zhengyou's calibration algorithm. The proposed algorithm can satisfy the needs of computer vision and provide a reference for precise measurement of relative position and attitude.
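The paper's refinement itself is not reproduced here; below is the conventional planar-target (Zhang-style) calibration baseline in OpenCV that such methods are compared against. The board dimensions, square size, and image paths are illustrative assumptions.

```python
import cv2
import numpy as np

pattern = (9, 6)  # inner corners of the checkerboard target
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * 25.0  # 25 mm squares

obj_pts, img_pts = [], []
for path in ["view1.png", "view2.png", "view3.png"]:  # hypothetical target images
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)

# Recover the intrinsic matrix and distortion coefficients.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("reprojection error:", ret)
print("intrinsics:\n", K)
```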
For image classification tasks, the region containing the object, which plays the decisive role, is indefinite in both position and scale, so it does not seem quite appropriate to apply the spatial pyramid matching (SPM) approach directly. In this paper, we describe an approach to this problem based on region-of-interest (ROI) detection. It verifies the feasibility of using a state-of-the-art object detection algorithm to separate foreground from background for image classification: an object detector first separates an image into object and scene regions, spatial histogram features are then constructed for each separately based on SPM, and the detection score is used for rescoring. Our contributions include: i) verifying the feasibility of using a state-of-the-art object detection algorithm to separate foreground and background for image classification; ii) a simple method, called coarse object alignment matching, for constructing histograms from the foreground and background provided by object localization. Experimental results demonstrate an obvious superiority of our approach over the standard SPM method, and it also outperforms many state-of-the-art methods on several categories.
Sparse coding performs well in many computer vision applications by finding bases that capture high-level semantics of the data and learning sparse coefficients over those bases. However, because the bases are non-orthogonal, sparse coding can hardly preserve sample similarity, which is important for discrimination. In this paper, a new image representation method called maximum constrained sparse coding (MCSC) is proposed. A sparse representation with more active coefficients carries more similarity information, and the infinity norm is added to the objective for this purpose. We solve the optimization by constraining the maximum of the codes and releasing the residual to other dictionary atoms. Experimental results on image clustering show that our method preserves the similarity of adjacent samples while maintaining the sparsity of the codes.
To retrieve the positioning image efficiently and quickly from a large number of diverse images for three-dimensional spatial positioning, this article proposes a new method of three-dimensional positioning from big-data imagery guided by the bag-of-words model, based on photogrammetry and computer vision theory. The method consists of two parts: image retrieval and spatial positioning. First, image retrieval is completed through feature extraction, K-means clustering, bag-of-words model building, and related processes, improving the efficiency of image matching. Second, the interior and exterior orientation elements are obtained through image matching, building the projection relationship, and calculating the projection matrix, thereby realizing spatial orientation. Experimental results show that the proposed method retrieves the target image efficiently and achieves spatial orientation accurately, a useful exploration of spatial positioning based on big-data imagery.
Establishing reliable feature correspondences between two images is a fundamental problem in vision analysis and a critical prerequisite for a wide range of applications, including structure-from-motion, 3D reconstruction, tracking, image retrieval, registration, and object recognition. Features may be points, lines, curves, or surfaces, among which point features are primary and the foundation of the others. Numerous point matching techniques have been proposed within a rich and extensive literature, typically studied under rigid/affine or non-rigid motion, corresponding to parametric and non-parametric models of the underlying image relations. In this paper, we review our previous work on point matching, focusing on non-parametric models. We also present an experimental comparison of the introduced methods and discuss their advantages and disadvantages.
To address the problems caused by viewpoint changes in activity recognition, a multi-view indoor human behavior recognition method based on a 3D skeleton framework is presented. First, Microsoft's Kinect device is used to capture body motion video from the frontal, oblique, and side perspectives. Second, skeletal joints are extracted, and global human features and local features of the arms and legs are obtained simultaneously to form a 3D skeletal feature set. Third, online dictionary learning on the feature set is used to reduce the feature dimensionality. Finally, a linear support vector machine (LSVM) produces the behavior recognition results. Experimental results show that this method achieves a better recognition rate.
With the increasing intelligence and automation of modern industrial production, detection and reconstruction of the 3D surfaces of products have become important technologies; however, images acquired on an actual production line suffer from motion blur, which hinders the subsequent reconstruction work. To solve this problem, a deblurring method based on two views of the moving target is proposed in this paper. The relationship between the point spread function (PSF) paths of the two views can be derived from epipolar geometry and the camera model. Experimental results show that deblurring with the PSF path solved from this geometric relationship achieves good results.
Digital matting is a classical imaging problem: it aims at separating non-rectangular foreground objects from a background image and compositing them onto a new background. Accurate matting determines the quality of the composited image. A Bayesian matting algorithm based on a Gaussian mixture model is proposed to solve this problem. First, the traditional Bayesian framework is improved by introducing a Gaussian mixture model. Then, a weighting factor is added to suppress noise in the composited images. Finally, the result is further improved by adjusting the user's input. The algorithm is applied to matting tasks on classical images and compared with the traditional Bayesian method. Our algorithm performs better on details such as hair, eliminates noise well, and is very effective on subjects of interest with intricate boundaries.
Image matching has always been an important research area in computer vision, and descriptor performance directly affects the matching results. Among local descriptors, the Scale Invariant Feature Transform (SIFT) is a milestone in image matching, while HOG is an excellent descriptor widely used in 2D object detection but seldom used for matching. In this article, we propose to combine these algorithms: a simple modification of the rotation-invariant HOG (RI-HOG), Fourier-analyzed in polar/spherical coordinates, is used to describe the feature regions detected by SIFT. In our experiments, we test the performance of the method on a dataset and find that it outperforms other descriptors in image matching accuracy.
Face detection is a fundamental and important research theme in pattern recognition and computer vision, and remarkable results have been achieved, among which statistics-based methods hold a dominant position. In this paper, the Adaboost algorithm based on Haar-like features is used to detect faces against complex backgrounds. A method combining YCbCr skin-model detection with Adaboost is investigated: skin detection is used to validate the detection results obtained by the Adaboost algorithm, overcoming Adaboost's false detection problem. Experimental results show that nearly all non-face areas are removed and the detection rate is improved.
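A sketch of the described combination: Haar-cascade detection followed by a YCbCr skin check that rejects false positives. The Cr/Cb bounds are common textbook values, and the 30% skin-ratio gate and image path are illustrative assumptions, not the paper's settings.

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
img = cv2.imread("group.jpg")                   # hypothetical test image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ycrcb = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)  # OpenCV channel order: Y, Cr, Cb

for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
    roi = ycrcb[y:y + h, x:x + w]
    skin = ((roi[:, :, 1] > 133) & (roi[:, :, 1] < 173) &   # Cr range
            (roi[:, :, 2] > 77) & (roi[:, :, 2] < 127))     # Cb range
    if skin.mean() > 0.3:                       # enough skin pixels: keep detection
        cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
```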
A three-dimensional (3D) measurement method for the train wheel surface is proposed based on S-transform profilometry, which applies the S-transform to fringe analysis. A fringe pattern with a carrier-frequency component is projected onto the wheel tread, the fringe patterns deformed by the height distribution of the wheel surface are recorded as an image, and the fundamental component of the S-transform spectrum of the image is extracted using weighting filters; the wrapped phase is then obtained by an inverse FFT of the fundamental spectrum. The 2D-SRNCP (sorting by reliability following a non-continuous path) phase unwrapping algorithm is used to unwrap the phase, from which the surface distribution of the wheel is reconstructed. Simulation and test experiments show that, compared with the light-section method, this method realizes faster inspection and higher-accuracy measurement of the 3D wheel surface.
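The paper uses the S-transform; the sketch below shows the simpler classic Fourier-fringe analogue of the same steps (isolate the fundamental lobe around the carrier, inverse-transform, take the wrapped phase). The carrier frequency f0 (in cycles per pixel) and the band half-width are assumptions.

```python
import numpy as np

def wrapped_phase(fringe, f0):
    F = np.fft.fft(fringe, axis=1)                 # row-wise spectrum
    freqs = np.fft.fftfreq(fringe.shape[1])        # cycles per pixel
    window = np.abs(freqs - f0) < f0 / 2           # band-pass around the carrier
    fundamental = np.fft.ifft(F * window, axis=1)  # keep the fundamental lobe only
    return np.angle(fundamental)                   # wrapped phase in (-pi, pi]
```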
In this paper, a height correction approach based on multiple sub-image correlation is proposed for remote sensing and navigation systems. First, multiple subareas are selected in the reference image, and each pair of them forms a unit for height correction. Then the measured distance and the real distance between two areas are computed via correlation matching and a perspective transformation model, from which the height deviation can be estimated for further processing. Considering the accuracy loss caused by image blurring, noise, scale changes, and so on, the approach uses a clustering method to improve accuracy. Experiments show that the proposed method estimates the height deviation automatically, with accuracy equivalent to that of estimates based on manual labeling.
Location measurement of a 3D point in stereo vision is subject to different sources of uncertainty that propagate to the final result. Most current error analysis methods are based on an ideal intersection model that computes the uncertainty region of a point location by intersecting the two pixel fields of view, which may produce loose bounds; moreover, only a few error sources, such as pixel error or camera position, are taken into account in the analysis. In this paper we present a straightforward, practical method to estimate the location error that accounts for most error sources. We sum up and simplify all input errors into five parameters via a rotation transformation, then use the fast midpoint-method algorithm to derive the mathematical relationships between the target point and these parameters. The expectation and covariance matrix of the 3D point location are thus obtained, which constitute the uncertainty region of the point location. We then trace the error propagation of the primitive input errors through the stereo system, covering the whole analysis from primitive input errors to localization error. Our method has the same computational complexity as the state-of-the-art method. Finally, extensive experiments verify its performance.
Recently, local binary patterns (LBP) have received much attention in face representation and recognition. The original LBP operator describes spatial structure information, essentially the edge and angle variations of local facial regions, which are important factors for distinguishing different faces. However, the scale and orientation of the edge features carry further detail that could be used to classify different persons efficiently, and the original LBP operator cannot extract this information. In this paper, building on original LBP-based facial representation and recognition, histogram sequences of local Gabor binary patterns are used to represent the facial image, and principal component analysis (PCA) is used to classify the histogram sequences after converting them to vectors. Recognition experiments show that this method improves classification performance by nearly 6% over the original LBP operator.
The building facade model is one of the main landscape elements of a city and a basic datum of city geographic information; it is widely useful in accurate path planning, real navigation through the urban environment, location-based applications, etc. In this paper, a method for refining facade models by fusing terrestrial laser data and images is presented. It matches model edges to image lines, verified against the laser data, and effectively refines the facade geometry model reconstructed from laser data. The laser points of geometric structures on the facade, such as windows, balconies, and doors, are segmented and used as a constraint for further selecting the optical model edges located at the boundary between point data and data gaps. The results demonstrate that the deviation of model edges caused by the laser sampling interval can be removed by the proposed method.
The most important image preprocessing step for Optical Character Recognition (OCR) is binarization, which is very difficult when the text image has a complex background or varying illumination. This paper presents an improved binarization algorithm that proceeds in several steps. First, the background is approximated by polynomial fitting, and the text is sharpened with a bilateral filter. Second, image contrast compensation is performed to reduce the impact of illumination and improve the contrast of the original image. Third, the first derivatives of the pixels in the compensated image are calculated to obtain an average threshold value, and edge detection is performed. Fourth, the stroke width of the text is estimated by measuring distances between edge pixels, taking the most frequent distance in the histogram as the final stroke width. Fifth, the window size is calculated from the final stroke width, and a local threshold estimation approach binarizes the image. Finally, small noise is removed with morphological operators. Experimental results show that the proposed method effectively removes the noise caused by complex backgrounds and varying illumination.
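A sketch of the fifth step: the estimated stroke width sets the local window, and a Sauvola-style local threshold (a stand-in assumption for the paper's own threshold estimator, with illustrative k and R) binarizes the image.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_binarize(gray, stroke_width, k=0.2, R=128.0):
    win = max(3, 2 * int(stroke_width) + 1)            # window tied to stroke width
    mean = uniform_filter(gray.astype(float), win)     # local mean
    sq_mean = uniform_filter(gray.astype(float) ** 2, win)
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0))  # local standard deviation
    thresh = mean * (1 + k * (std / R - 1))            # Sauvola threshold surface
    return (gray > thresh).astype(np.uint8) * 255
```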
Stability analysis of various neural networks has been successfully applied in many fields, such as parallel computing and pattern recognition. This paper is concerned with a class of stochastic Markovian jump neural networks. The general mean-square stability of the backward Euler-Maruyama method for stochastic Markovian jump neural networks is discussed, and sufficient conditions guaranteeing this stability are given.
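A small numerical sketch of the drift-implicit (backward) Euler-Maruyama scheme on a scalar linear SDE with a two-state Markov-switching drift, a toy stand-in for the network setting the paper analyzes; all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
a = {0: -2.0, 1: -0.5}        # drift coefficient per Markov mode
b, h, T = 0.3, 0.01, 5.0      # diffusion, step size, horizon
P = np.array([[0.95, 0.05],   # mode transition probabilities per step
              [0.10, 0.90]])

x, mode = 1.0, 0
for _ in range(int(T / h)):
    dW = rng.normal(0.0, np.sqrt(h))
    # Backward (drift-implicit) Euler-Maruyama:
    # x_{n+1} = x_n + a(mode) * x_{n+1} * h + b * x_n * dW
    x = (x + b * x * dW) / (1.0 - a[mode] * h)
    mode = rng.choice(2, p=P[mode])  # Markovian jump of the system mode
print("X(T) =", x)
```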
Phase unwrapping is a common problem in many phase measuring techniques. Goldstein's branch-cut algorithm is one of the classic phase unwrapping methods, but it needs refinement. This paper first introduces the characteristics of residue points and describes Goldstein's branch-cut algorithm in detail, then discusses improvements to the algorithm obtained by changing the branch-cut placement and adding pretreatment. Finally, the improved algorithm is summarized, and better results are obtained in computer simulation and validation tests.
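A minimal sketch of the residue-identification step from which Goldstein's branch-cut method starts: the wrapped phase differences around each 2x2 pixel loop sum to a multiple of 2*pi, and nonzero sums mark positive or negative residues.

```python
import numpy as np

def wrap(p):
    # Wrap a phase difference into (-pi, pi].
    return (p + np.pi) % (2 * np.pi) - np.pi

def residues(phase):
    d1 = wrap(phase[:-1, 1:] - phase[:-1, :-1])  # along the top edge
    d2 = wrap(phase[1:, 1:] - phase[:-1, 1:])    # down the right edge
    d3 = wrap(phase[1:, :-1] - phase[1:, 1:])    # back along the bottom
    d4 = wrap(phase[:-1, :-1] - phase[1:, :-1])  # up the left edge
    loop = d1 + d2 + d3 + d4                     # closed-loop sum, multiple of 2*pi
    return np.rint(loop / (2 * np.pi)).astype(int)  # +1 / -1 at residue points
```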