The Gabor wavelet is a well-known tool in various fields, such as computational neuroscience and multiresolution analysis. It is a Gaussian-modulated sinusoidal wave, or equivalently a windowed Fourier transform with a Gaussian kernel window, and it attains the minimum of the uncertainty relation. However, the width and the height of its time-frequency window do not change with the analyzing frequency, which narrows the range of applications of the Gabor wavelet. On the other hand, if the q-normal distribution is used as the kernel instead of the conventional Gaussian distribution, we obtain the q-Gabor wavelet as a possible generalization of the Gabor wavelet. The q-normal distribution, which was introduced by the author, is a generalized Gaussian distribution. In this paper, we give definitions of the q-Gabor wavelet in both continuous and discrete versions. The discrete version comes in two different types: one is similar to the conventional Gabor wavelet with respect to the width and the height of the time-frequency window, while the other is similar to a conventional discrete wavelet system in that the width and the height of the time-frequency window change with the frequency. The mother wavelet is also given for the orthonormal q-Gabor wavelet under some approximation.
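As a point of reference, the conventional Gabor wavelet that the q-Gabor wavelet generalizes can be sketched in a few lines (a minimal illustration; the function name and parameter values are ours, not the paper's):

```python
import numpy as np

def gabor(t, sigma=1.0, omega=5.0):
    """Gaussian-modulated complex sinusoid (conventional Gabor wavelet).

    Note: with a fixed sigma, the time-frequency window has the same
    width and height at every analyzing frequency omega -- the
    limitation the q-Gabor wavelet is meant to address.
    """
    window = np.exp(-t**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    return window * np.exp(1j * omega * t)

t = np.linspace(-5.0, 5.0, 1001)
psi = gabor(t)
# The envelope |psi| is the Gaussian window itself, peaked at t = 0.
```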
The invariance and covariance of features extracted from an object under certain transformations play quite important roles in the fields of pattern recognition and image understanding. For instance, in order to recognize a three-dimensional object, we need specific features extracted from the given object, and these features should be independent of the pose and the location of the object. To extract such features, the authors have presented the three-dimensional vector autoregressive model (3D VAR model). This 3D VAR model is constructed on the quaternions, which form a basis of SU(2) (the rotation group on the two-dimensional complex space). The 3D VAR model is then defined by the external products of the 3D sequential data and the autoregressive (AR) coefficients, unlike the usual AR models. Therefore the 3D VAR model has some prominent features; for example, the AR coefficients of the 3D VAR model behave like vectors under any three-dimensional rotation. In this paper, we derive invariants from the 3D VAR coefficients by taking the inner product of each pair of coefficients. These invariants make it possible to recognize three-dimensional curves.
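The invariance argument can be illustrated numerically: if coefficients transform like 3D vectors under rotation, their pairwise inner products are unchanged. A hedged sketch, with stand-in random vectors in place of actual 3D VAR coefficients:

```python
import numpy as np

def rotation_z(theta):
    """3D rotation about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
coeffs = rng.standard_normal((4, 3))   # stand-ins for 3D VAR coefficient vectors
rotated = coeffs @ rotation_z(0.7).T   # each coefficient rotates like a vector

gram = coeffs @ coeffs.T               # matrix of pairwise inner products
gram_rot = rotated @ rotated.T         # identical: the rotation cancels out
```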
This paper proposes a novel face detection method to be used for a practical human-interactive mobile robot. Toward the coming aging society, there is much expectation and social demand for mobile robots that can interactively collaborate with and support humans. Typical examples are pet robots and service robots; guard robots are another example. Face detection and recognition are crucial for such applications. In real situations, however, it is not easy to realize a robust detection function, because the position, size, and brightness of a face image vary considerably. The proposed method solves these problems by combining correlation-based pattern matching, histogram equalization, skin color extraction, and multiple-scale image generation. The authors have implemented a prototype system based upon the proposed method and conducted experiments using it. The experimental results support the effectiveness of the proposed idea.
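Of the components combined above, histogram equalization is the simplest to sketch. A minimal 8-bit version (our own illustration, not the paper's implementation) remaps gray levels through the cumulative histogram to compensate for brightness variation:

```python
import numpy as np

def hist_equalize(img):
    """Histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum() / img.size              # cumulative distribution in [0, 1]
    return np.round(255.0 * cdf).astype(np.uint8)[img]

img = np.tile(np.arange(64, dtype=np.uint8), (8, 1))  # dark test ramp (levels 0..63)
out = hist_equalize(img)
# The compressed dark range 0..63 is stretched across the full 0..255 range.
```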
This paper proposes a method for a robot to pick up human voice clearly from a distance. The method utilizes a microphone array: by setting the gain and delay of each microphone properly, the array can form an "acoustic focus" at a desired location. The proposed method is intended to serve as a speech signal input for a human-interactive mobile robot. To confirm the feasibility of the proposed idea, the authors conducted a simulation and simple experiments. The simulation results support the feasibility. As for the experiments, although the array is very simple, with only three microphones and "offline" signal processing, we confirmed that a sensitivity peak of the array appears at the desired location. Both results show the possibility of realizing an "acoustic focus" at a desired location, and thus support the feasibility of the proposed idea.
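The gain-and-delay idea is classical delay-and-sum beamforming, which can be sketched as follows (a minimal simulation with whole-sample delays; all names and the three-microphone geometry are our own illustration):

```python
import numpy as np

C = 343.0  # speed of sound [m/s]

def delays_to_focus(mics, focus, c=C):
    """Per-microphone delays that align wavefronts arriving from `focus`."""
    d = np.linalg.norm(mics - focus, axis=1) / c   # propagation times
    return d.max() - d                             # delay the nearer mics more

def delay_and_sum(signals, fs, delays):
    """Shift each channel by its delay (whole samples here) and average."""
    out = np.zeros(signals.shape[1])
    for sig, tau in zip(signals, delays):
        shift = int(round(tau * fs))
        out[shift:] += sig[:len(sig) - shift]
    return out / len(signals)

# Three-microphone array, impulse emitted at the focus point.
fs = 48000
mics = np.array([[0.0, 0.0, 0.0], [0.2, 0.0, 0.0], [0.4, 0.0, 0.0]])
focus = np.array([1.0, 1.0, 0.0])
signals = np.zeros((3, 512))
for i, p in enumerate(np.linalg.norm(mics - focus, axis=1) / C):
    signals[i, int(round(p * fs))] = 1.0           # per-mic arrival time
out = delay_and_sum(signals, fs, delays_to_focus(mics, focus))
# After alignment the three impulses add coherently at a single sample.
```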
The normal, or Gaussian, distribution is one of the most useful tools in computer vision and other fields. For instance, it is used in computer vision as a preprocessing tool for subsequent operations, especially to reduce noise. In another case, it is used as a functional basis to approximate a given function defined by a set of sampled points; this is called the radial basis function (RBF) method. The RBF method is also used in computational neuroscience to explain human mental rotation. From the information-theoretic view, the Gaussian distribution maximizes the Boltzmann-Shannon entropy. In this paper, we give yet another such distribution, called the q-normal distribution. We also give many useful formulae for applying the q-normal distribution in various fields.
The invariance and covariance of features extracted from an object under certain transformations play quite important roles in the fields of pattern recognition and image understanding. For instance, in order to recognize a three-dimensional object, we need specific features extracted from the given object, and these features should be independent of the pose and the location of the object. To extract such features, the authors have presented the three-dimensional vector autoregressive model (3D VAR model). This 3D VAR model is constructed on the quaternions, which form a basis of SU(2) (the rotation group on the two-dimensional complex space). The 3D VAR model is then defined by the external products of the 3D sequential data and the autoregressive (AR) coefficients, unlike the conventional AR models. Therefore the 3D VAR model has some prominent features; for example, the AR coefficients of the 3D VAR model behave like vectors under any three-dimensional rotation. In this paper, we present an effective and straightforward algorithm that obtains the 3D VAR coefficients recursively, from lower order to higher order.
In this paper we extend the autoregressive (AR) model to a multilevel AR model by means of the wavelet transform, in order to obtain the AR coefficients at each level as a set of shape descriptors for that level. To construct the multilevel AR model, we apply a wavelet transform, such as the Haar wavelet, to the boundary data. Real AR and complex AR (CAR) models are then fitted to the multilevel boundary data of a shape to extract the features at each level. Furthermore, we present the relation of the autocorrelation coefficients between adjacent resolution levels, to elucidate the relation between the AR model and the wavelet transform. Some experiments are also shown for the multilevel AR and CAR models with a certain similarity measure.
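The multilevel idea can be sketched on a toy signal: Haar-average the boundary data to get each coarser level, then fit an AR model per level (a hedged illustration with a least-squares real AR fit; the toy geometric signal and function names are ours):

```python
import numpy as np

def haar_step(x):
    """One level of the (unnormalized) Haar transform: averages and details."""
    return (x[0::2] + x[1::2]) / 2.0, (x[0::2] - x[1::2]) / 2.0

def ar_coeffs(x, order):
    """Least-squares AR fit: x[t] ~ sum_k a[k] * x[t-1-k]."""
    n = len(x)
    X = np.column_stack([x[order - 1 - k:n - 1 - k] for k in range(order)])
    a, *_ = np.linalg.lstsq(X, x[order:], rcond=None)
    return a

boundary = 0.9 ** np.arange(64)        # toy boundary signal: x[t] = 0.9 x[t-1]
a_level0 = ar_coeffs(boundary, 1)      # ~0.9 at the finest level
coarse, detail = haar_step(boundary)
a_level1 = ar_coeffs(coarse, 1)        # ~0.81 = 0.9**2 one level up
```

The related coefficients across levels (0.9 versus 0.9 squared) hint at the kind of adjacent-level relation the abstract refers to.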
Recently, an interesting image analysis based on the scale-space method was given by Sporring and Weickert. They considered the Rényi entropy at each scale to estimate the extents of the lighter and darker patterns in a given image. On the other hand, there is another generalized entropy, the Tsallis entropy, which has a physical meaning like the Boltzmann entropy and is also famous for its usefulness in physics. In this paper, after giving a brief review of the Tsallis entropy, we adopt it as the information measure at each level of the scale-space method, to elucidate what difference results from using the Tsallis entropy instead of the Rényi entropy. It is also shown that the Tsallis entropy is a more natural information measure than the Rényi entropy.
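The two entropies compared above are both simple to compute from a normalized histogram; a minimal sketch (our own code, with the standard definitions):

```python
import numpy as np

def tsallis_entropy(p, q):
    """S_q = (1 - sum_i p_i**q) / (q - 1); Boltzmann-Shannon limit as q -> 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(q, 1.0):
        return float(-np.sum(p * np.log(p)))
    return float((1.0 - np.sum(p**q)) / (q - 1.0))

def renyi_entropy(p, q):
    """R_q = log(sum_i p_i**q) / (1 - q), for comparison."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(np.log(np.sum(p**q)) / (1.0 - q))

p = np.array([0.5, 0.25, 0.25])   # e.g. a normalized gray-level histogram
```

The two are monotonically related at each fixed q (S_q = (1 - exp((1-q) R_q)) / (q-1)), so the difference in the scale-space analysis comes from how each measure weights the probabilities, not from ranking reversals at a single scale.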
KEYWORDS: Autoregressive models, 3D modeling, Data modeling, Pattern recognition, Data compression, Visual process modeling, Image understanding
The invariance and covariance of features extracted from an object under certain transformations play quite important roles in the fields of pattern recognition and image understanding. For instance, in order to recognize a three-dimensional (3D) object, we need specific features extracted from the given object, and these features should be independent of the pose and the location of the object. To extract such features, one of the authors has presented the 3D vector autoregressive (VAR) model. This 3D VAR model is constructed on the quaternions, which form a basis of SU(2) (the rotation group on the two-dimensional complex space). The 3D VAR model is then defined by the external products of the 3D sequential data and the autoregressive (AR) coefficients, unlike the conventional AR models. Therefore the 3D VAR model has some prominent features; for example, the AR coefficients of the 3D VAR model behave like vectors under any three-dimensional rotation. In this paper, we present a recursive computation of the 2D VAR and 3D VAR coefficients. This method reduces the cost of computing the VAR coefficients. We also define the partial correlation (PARCOR) vectors for the 2D VAR and 3D VAR models, from the point of view of data compression and pattern recognition.
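The paper's recursion operates on vector-valued coefficients, but the flavor of an order-recursive AR/PARCOR computation can be conveyed by the classical scalar Levinson-Durbin recursion (a hedged stand-in, not the 2D/3D VAR algorithm itself):

```python
import numpy as np

def levinson_durbin(r, order):
    """Order-recursive AR fit from autocorrelations r[0..order].

    Returns prediction coefficients a (with a[0] = 1), the PARCOR
    (reflection) coefficients, and the final prediction error.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    parcor = []
    for m in range(1, order + 1):
        acc = r[m] + np.dot(a[1:m], r[m - 1:0:-1])
        k = -acc / err                  # reflection (PARCOR) coefficient
        a[1:m] += k * a[m - 1:0:-1]     # update lower-order coefficients
        a[m] = k
        err *= 1.0 - k * k              # prediction error shrinks each order
        parcor.append(k)
    return a, np.array(parcor), err

r = 0.8 ** np.arange(4)                 # autocorrelation of an AR(1) process
a, parcor, err = levinson_durbin(r, 3)
```

Each order reuses the previous order's solution, which is the cost saving the abstract refers to; the PARCOR values fall out of the same recursion for free.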
Face recognition, as a pattern recognition problem, involves various essential tasks, such as the representation and extraction of the required features, classification based on the obtained features, and the detection of specified regions. Previously, we presented a scale- and rotation-invariant face recognition method based on both higher-order local autocorrelation features of the log-polar image and linear discriminant analysis for 'face' and 'not face' classification. In that method, the search for the 'face' region was performed randomly or sequentially over the image, so its search performance was not satisfactory. In this paper, we present a method to narrow down the search space by dynamically using the information obtained at previous search points, through a multilevel dynamic attention map constructed on the basis of the Ising dynamics and the renormalization group method.
The factorization method by Tomasi and Kanade gives a stable and accurate reconstruction. However, it is difficult to apply their method to real-time applications. We therefore present an iterative factorization method for the GAP model with tracking of the feature points. In this method, through a fixed-size measurement matrix that is independent of the number of frames, the motion and the shape are reconstructed at every frame. Some experiments are also given to show the performance of the proposed iterative method.
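The batch rank-3 factorization at the heart of the Tomasi-Kanade method can be sketched via the SVD (a simplified illustration on synthetic noise-free data; the paper's contribution is the iterative, fixed-size-matrix variant, which is not shown here):

```python
import numpy as np

rng = np.random.default_rng(1)
# Orthographic measurement matrix W (2F x P): two rows per frame,
# one column per tracked feature point (centroids already subtracted).
M_true = rng.standard_normal((6, 3))   # motion for F = 3 frames
S_true = rng.standard_normal((3, 8))   # shape: P = 8 points
W = M_true @ S_true                    # noise-free, so rank(W) = 3

U, s, Vt = np.linalg.svd(W, full_matrices=False)
M_hat = U[:, :3] * s[:3]               # motion, up to a 3x3 ambiguity
S_hat = Vt[:3]                         # shape, up to the inverse ambiguity
```

Because W grows with the number of frames, the batch SVD must be redone as frames arrive, which is what motivates an iterative scheme with a fixed-size matrix.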
In this paper, a multilevel Ising search method for human face detection is proposed to speed up the search. In order to utilize the information obtained from previously searched points, the Ising model is adopted to represent the candidate `face' positions and is combined with a scale-invariant human face detection method. In the face detection, the distance from the mean vector of the `face' class in the discriminant space represents the likelihood of a face. By integrating the measured distance into the energy function of the Ising model as the external magnetic field, the search space is narrowed down effectively (the `face' candidates are reduced). By also incorporating the color information of the face region into the external magnetic field, the `face' candidates can be reduced further. In the multilevel Ising search, face candidates (spins) at different resolutions are represented in a pyramidal structure and a coarse-to-fine strategy is taken. We demonstrate that the proposed multilevel Ising search method can effectively reduce the search space and detect human faces correctly.
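The role of the external-field term can be sketched on a toy Ising grid, where a per-site field h stands in for the distance-based face-likelihood evidence (our own minimal illustration, not the paper's detector):

```python
import numpy as np

def metropolis_sweep(spins, h, beta=2.0, J=1.0, rng=None):
    """One Metropolis sweep of E = -J * sum_<ij> s_i s_j - sum_i h_i s_i."""
    rng = np.random.default_rng(0) if rng is None else rng
    n, m = spins.shape
    for i in range(n):
        for j in range(m):
            nb = spins[(i + 1) % n, j] + spins[(i - 1) % n, j] \
               + spins[i, (j + 1) % m] + spins[i, (j - 1) % m]
            dE = 2.0 * spins[i, j] * (J * nb + h[i, j])  # cost of flipping
            if dE <= 0 or rng.random() < np.exp(-beta * dE):
                spins[i, j] = -spins[i, j]
    return spins

rng = np.random.default_rng(42)
h = np.full((8, 8), -1.0)              # weak evidence against a face...
h[2:5, 2:5] = 4.0                      # ...strong face evidence inside a patch
spins = rng.choice([-1.0, 1.0], size=(8, 8))
for _ in range(10):
    metropolis_sweep(spins, h, rng=rng)
# Spins inside the high-field patch align to +1: those `face' candidates
# survive, while the low-field background is pruned.
```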
Based on the representation of the projected rotation group, we can construct a quantity that is well behaved under projected rotations, which is called the quasi moment. The representation of the projected rotation group is obtained by Lie group theory, through the infinitesimal 3D rotation followed by the projection onto an image plane. Thus the representation of the projected rotation group has a good transformation property under projected rotations, but whether the constructed quasi moment shares a similarly good transformation property is not so trivial. Therefore, in this paper, we present the effect of a finite transformation on the quasi moment explicitly and show that the quasi moment also has a good transformation property under the projected rotation group.
The invariance and covariance of features extracted from an object under certain transformations play quite important roles in the fields of pattern recognition and image understanding. For instance, in order to recognize a 3D object, we need specific features extracted from the given object, and these features should be independent of the pose and the location of the object. In this paper, as one of the feature extraction methods, we present the 3D autoregressive model and its higher-dimensional extensions. The 1D and 2D autoregressive models have already been considered as feature extraction methods.
For pattern recognition and image understanding, it is important to take into consideration the invariance and/or covariance of features extracted from given data under some transformations. This makes various problems in pattern recognition and image understanding clear and easy. In this article, we present two autoregressive models which have proper transformation properties under rotations: one is a 2D autoregressive model (2D AR model) which is invariant under any 2D rotation, and the other is a 3D autoregressive model (3D AR model) which is covariant under any 3D rotation. Our 2D AR model is based on the matrix representation of complex numbers, and it is shown to be equivalent to Otsu's complex AR model. Our 3D AR model, on the other hand, is based on the representation theory of the rotation group, namely the fundamental representation of the Lie algebra of SU(2) (the special unitary group in two dimensions, which covers the rotation group), given by the Pauli matrices.
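The matrix representation underlying the 2D AR model can be checked in a few lines (an illustrative sketch): a + bi maps to [[a, -b], [b, a]], matrix multiplication then reproduces complex multiplication, and unit complex numbers map to 2D rotation matrices.

```python
import numpy as np

def as_matrix(z):
    """a + bi  ->  [[a, -b], [b, a]]  (real 2x2 representation)."""
    return np.array([[z.real, -z.imag],
                     [z.imag,  z.real]])

z, w = 1.0 + 2.0j, 0.5 - 1.0j
# A unit complex number e^{i theta} is represented by a 2D rotation matrix,
# which is why a complex AR model transforms so simply under 2D rotations.
R = as_matrix(np.exp(1j * 0.3))
```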
In this paper, we consider the projected rotation group, which consists of projections and rotations in 3D, and give some invariant feature extractors. Based on the theory of Lie algebras, the representation of the projected rotation group is obtained, and it is shown that the basis of the representation can be taken to be orthonormal. With this orthonormal basis, we can construct the quasi moments, a kind of weighted moments. It is also shown that the quasi moments are closed within their orders under the projected rotation group. Some computer-simulation experiments on 3D motion analysis with the quasi moments are given.
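Quasi moments require the full projected-rotation representation, but the underlying idea, that moments organized by group representations yield invariants, can be shown in a hedged 2D analogue (our own example, not the paper's construction): complex moments pick up only a phase under rotation, so their magnitudes are invariant.

```python
import numpy as np

def complex_moment(points, p, q):
    """c_{pq} = sum z**p * conj(z)**q over points z = x + iy.

    Under a rotation z -> e^{i theta} z it transforms as
    c_{pq} -> e^{i (p - q) theta} c_{pq}, so |c_{pq}| is invariant.
    """
    z = points[:, 0] + 1j * points[:, 1]
    return np.sum(z**p * np.conj(z)**q)

rng = np.random.default_rng(3)
pts = rng.standard_normal((20, 2))
theta = 0.9
c, s = np.cos(theta), np.sin(theta)
pts_rot = pts @ np.array([[c, -s], [s, c]]).T   # rotate the point set
```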
By considering all 2D transformations on image patterns, Otsu has shown that moment features are a kind of linear feature extractor from patterns and that they are closed under such transformations. However, the projections from 3D space onto 2D space were not considered there. In order to recognize 3D motions from 2D images, the projections should be taken into account. In this paper, therefore, the representation of the projected motion group is considered through Lie-algebraic methods. We give the explicit formula for the bases of its representation and show that these bases can be made orthonormal. The quasi moments are defined and also shown to be closed within their orders under the projected rotation group.
For image understanding and pattern recognition, it is important to extract, from given images, features that are invariant under various transformations. Once the invariant features are obtained, we can estimate motion parameters and/or categorize objects into equivalence classes based on some criteria. Many techniques for extracting invariant features have been proposed, and most of them need exact matching between an image before transformation and another image after transformation; this matching process, however, is not easy to perform. We therefore propose a group-theoretical method which does not require a matching process. In this paper, we show explicitly the bases of the representation of the perspective-projected motion group and those of the spherical-projected motion group.