We describe an application of nonlinear dynamical systems to image transformation and encoding. Our approach is different from the classical one where affine discrete maps are used. Similarly to classical fractal image compression, nonlinear maps use the redundancy in the image for compression. Furthermore, compression speed is enhanced whenever nonlinear maps have more than one attractor. Nonlinear maps having strange chaotic attractors can also be used to encode the image. In this case, the image will take the shape of the strange attractor when mapped under the nonlinear system. The procedure needs some precautions for chaotic maps, because of the sensitivity to initial conditions. Another possibility is to use strange attractors to hide the initial image using various schemes. For example, it is possible to hide the image using position permutation, value permutation, or both position and value permutations. We develop an algorithm to show that chaotic maps can be used successfully for this purpose. We also show that the sensitivity to initial conditions of chaotic maps forms the basis of the encryption strategy.
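As a concrete illustration of the position-permutation idea, the following Python sketch scrambles pixel positions with a keystream generated by a logistic map; the sensitivity to the initial condition x0 is exactly what makes the scrambling key-dependent. This is a minimal sketch of the general principle, not the authors' algorithm, and the map parameters (r, x0) are assumptions.

import numpy as np

def logistic_keystream(n, x0=0.3779, r=3.99):
    # Iterate the logistic map x <- r*x*(1-x); x0 acts as the secret key.
    xs = np.empty(n)
    x = x0
    for i in range(n):
        x = r * x * (1.0 - x)
        xs[i] = x
    return xs

def permute_image(img, x0=0.3779, r=3.99):
    # Position permutation: reorder pixels by the sorted order of the chaotic sequence.
    flat = img.ravel()
    order = np.argsort(logistic_keystream(flat.size, x0, r))
    return flat[order].reshape(img.shape), order

def unpermute_image(scrambled, order):
    # Inverse permutation restores the original image only if x0 and r match exactly.
    flat = np.empty(scrambled.size, dtype=scrambled.dtype)
    flat[order] = scrambled.ravel()
    return flat.reshape(scrambled.shape)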
This paper presents an image coding method based on the wavelet transform, where the distribution of the subband coefficients is assumed to be generalized Gaussian. The shape factor and the standard deviation are estimated in all subbands, and a bit allocation procedure then distributes the available bit rate to the retained coefficients. Multiple scale leader lattice vector quantization (MSLLVQ) has shown its superiority over other structured quantization schemes, and we now propose its use for the quantization of the wavelet coefficients. The main contribution of the paper is the procedure for selecting the structure and the leaders for the MSLLVQ. An iterative construction of the MSLLVQ scheme is presented along with the derivation of the operational rate-distortion function. The bit allocation procedure is based on an exponential fit of the operational rate-distortion curve. The results in terms of peak signal-to-noise ratio are compared with other image codecs from the literature, the advantage of such a coding structure being particularly important for fixed-rate encoding.
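For context, one common way to estimate the shape factor of a generalized Gaussian from subband coefficients is moment matching on the ratio E{|x|}^2 / E{x^2}, inverted numerically; the Python sketch below uses a simple grid search. This is a generic estimator given for orientation only, not necessarily the estimation procedure used in the paper.

import numpy as np
from scipy.special import gamma

def ggd_ratio(beta):
    # For a zero-mean generalized Gaussian with shape beta:
    # E{|x|}^2 / E{x^2} = Gamma(2/beta)^2 / (Gamma(1/beta) * Gamma(3/beta)).
    return gamma(2.0 / beta) ** 2 / (gamma(1.0 / beta) * gamma(3.0 / beta))

def estimate_ggd(coeffs):
    # Moment-matching estimate of the shape factor and standard deviation of a subband.
    coeffs = np.asarray(coeffs, dtype=float)
    target = np.mean(np.abs(coeffs)) ** 2 / np.mean(coeffs ** 2)
    betas = np.linspace(0.1, 5.0, 2000)          # assumed search range for the shape factor
    beta = betas[np.argmin(np.abs(ggd_ratio(betas) - target))]
    return beta, np.std(coeffs)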
Lossless image compression has become an important research topic, especially in relation to the JPEG-LS standard. Recently, techniques known for designing optimal codes for sources with infinite alphabets have been applied to quantized Laplacian sources, whose probability mass functions have two geometrically decaying tails. Owing to the simple parametric model of the source distribution, the Huffman iterations can be carried out analytically using the concept of a reduced source, and the final codes are obtained through a sequence of very simple arithmetic operations, avoiding the need to store coding tables. We propose the use of these (optimal) codes in conjunction with context-based prediction for noiseless compression of images. To further reduce the average code length, we design escape sequences to be employed when the estimation of the distribution parameter is unreliable. Results on standard test files show improvements in compression ratio compared with JPEG-LS.
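The codes in question, optimal codes for two-sided geometrically decaying sources, are close relatives of Golomb-Rice coding of mapped prediction residuals; the Python sketch below shows that generic map-and-encode step. It is an illustration under assumed parameters (the Rice parameter k), not the analytic Huffman construction or the escape mechanism described in the paper.

def zigzag_map(e):
    # Map a signed residual onto a non-negative index: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4.
    return 2 * e if e >= 0 else -2 * e - 1

def rice_encode(value, k):
    # Golomb-Rice code with parameter k: unary-coded quotient, then k remainder bits.
    q, r = value >> k, value & ((1 << k) - 1)
    return '1' * q + '0' + (format(r, '0{}b'.format(k)) if k > 0 else '')

# Example: encode a few prediction residuals with a fixed (assumed) Rice parameter.
residuals = [0, -1, 3, -2, 1]
bitstream = ''.join(rice_encode(zigzag_map(e), k=2) for e in residuals)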
To produce enlarged digital images it is necessary to predict the unknown high-frequency components that are lost by sampling. We previously proposed an image enlarging method based on the Laplacian pyramid representation that can predict these unknown high-frequency components; we call it the LP enlarging method. Because a JPEG coded image is a compressed image, it is blurred, and especially at low bit rates it exhibits visually annoying blocking artifacts. Consequently, the LP enlarging method cannot be applied directly to JPEG coded images. In this paper, we modify the LP enlarging method so that it can be applied to JPEG coded images. The proposed method introduces epsilon filters into the LP enlarging method to reduce the blocking artifacts. Moreover, we tune the parameters of the LP enlarging method and the epsilon filters for JPEG coded images. We demonstrate the effectiveness of the proposed method through extensive experimental results.
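For reference, an epsilon filter averages only those neighbors whose values lie within plus or minus epsilon of the center pixel, which is why it can smooth blocking artifacts without washing out strong edges. The Python sketch below is a plain generic implementation; the window size and epsilon value are assumptions and are not the tuned parameters reported in the paper.

import numpy as np

def epsilon_filter(img, half=1, eps=20.0):
    # Average only the neighbors within +/- eps of the center value (borders replicated).
    img = img.astype(float)
    padded = np.pad(img, half, mode='edge')
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            win = padded[i:i + 2 * half + 1, j:j + 2 * half + 1]
            mask = np.abs(win - img[i, j]) <= eps
            out[i, j] = win[mask].mean()
    return out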
Fixed matrix displays require digital interpolation algorithms to adapt the input spatial format to the output matrix. Interpolation techniques usually employed for this purpose exploit linear kernels designed to preserve the spectral content (anti-aliasing), but this produces smooth edges, which result in unpleasant text images where sharpness is essential. By contrast, interpolation kernels designed to preserve sharpness introduce geometric distortions in the scaled text (e.g., nearest-neighbor interpolation). This paper describes an interpolation algorithm which, compared to linear techniques, aims to increase the sharpness of interpolated text while preserving its geometric regularity. The basic idea is to differentiate the processing of text and non-text pixels. First, a binary text map is built. By using morphological constraints it is possible to form a similar text map in the output domain that preserves the general text regularity. Finally, the output text pixel positions are used to control a nonlinear interpolator (based on the warped-distance approach) that is able to generate both step and gradual luminance profiles, thus enabling the algorithm to change its behavior locally. A general sharpness control is provided as well, which allows the output to range from a two-level text (maximum sharpness) to a smoother image (traditional linear interpolation).
New adaptive varying-window-size estimation methods for edge detection are presented in this work. The nonparametric local polynomial approximation (LPA) method is used to define gradient estimation kernels or masks, which, in conjunction with adaptive window size selection carried out by the intersection of confidence intervals (ICI) rule for each pixel, allow us to obtain algorithms that are adaptive to unknown smoothness and nearly optimal in the point-wise risk for estimating the intensity function and its derivatives. Several existing strategies that use a constant window size for the convolutional edge detection kernel have been upgraded to varying-window-size techniques, first through the use of LPA to define gradient convolutional kernels of different sizes, and second through ICI to select the best estimate, which balances the bias-variance trade-off in a point-wise fashion for the whole stream of data. Comparisons with fixed-window-size edge detection schemes show the superiority of the presented methods, even over computationally more expensive edge detection techniques.
A novel method of quantifying the level of detail preservation ability of digital filters is proposed. The method assumes only the input distribution of the filter and estimates how much the filter changes the signal. The change is measured by the expectation of the absolute difference between the input and output signal. The method is applicable for many filters and input distributions. As an example case, the formulas for the expectation of the absolute difference for weighted order statistic filters with the uniform and Laplacian (biexponential) input distributions are derived. Finally, the design of weighted order statistic filters using supervised learning is studied. The learning method uses the detail preservation measure as a design criterion to obtain filters with different levels of detail preservation.
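To make the measure concrete, the Python sketch below estimates E{|input - output|} for a weighted order statistic (here, weighted median) filter by Monte Carlo simulation on i.i.d. inputs; the weight vector and the uniform input distribution are arbitrary assumptions, whereas the paper derives this expectation in closed form.

import numpy as np

def weighted_median(x, w):
    # Weighted median: sort the samples and pick the value where the cumulative weight
    # first reaches half of the total weight.
    order = np.argsort(x)
    cum = np.cumsum(np.asarray(w, dtype=float)[order])
    return x[order][np.searchsorted(cum, cum[-1] / 2.0)]

def detail_preservation(weights, sampler, trials=20000, seed=0):
    # Monte Carlo estimate of E{|center input - filter output|} for i.i.d. inputs.
    rng = np.random.default_rng(seed)
    n = len(weights)
    diffs = np.empty(trials)
    for t in range(trials):
        x = sampler(rng, n)
        diffs[t] = abs(x[n // 2] - weighted_median(x, weights))
    return diffs.mean()

# Example: a symmetric 5-tap weight vector with uniformly distributed input samples.
measure = detail_preservation([1, 2, 3, 2, 1], lambda rng, n: rng.uniform(0.0, 1.0, n))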
This paper claims that varied human body shapes create individual eigenspaces; as a result, the classical appearance-based model is not effective for recognizing human postures. In this study we introduce the figure effect that arises in eigenspaces due to different human body shapes. We propose an organized eigenspace tuning method to overcome this problem; since the proposed method tunes the classical eigenspaces for human posture recognition, we call the process eigenspace tuning. Generation of a tuned eigenspace (TES) is an organized procedure in which several similar eigenspaces are selected according to a minimum description deviation (MDD) criterion and their mean is taken. In effect, the TES is an optimized visual appearance of various human models that minimizes the fluctuation of the mean description length (MDL) between training and testing feature spaces. We have tested the proposed approach on a number of human models with various body shapes, and its benefit to recognition rates has been demonstrated.
Spectral (color) and spatial (shape) features available in pictures are sources of information that need to be incorporated for advanced content-based image database retrieval. The adaptive shape transform approach developed in this research originates from the premise that a two-dimensional (2D) shape can be recovered completely from a set of orthogonal Radon transform-based projections. For search consistency, it is necessary to identify the region(s) of interest (ROI) before applying the Radon transform to the shape query. ROIs are detected automatically by means of saliency map-based segmentation. The Radon transform packs the shape information of a 2D mass along a projection axis of known orientation and generates a series of one-dimensional (1D) functions from the color channels for projection angles ranging from 1° to 180°. The optimal number of projections for a particular shape is determined by imposing the Kullback-Leibler distance (KLD) histogram comparison as the similarity metric between the query and database images. The Radon transforms with the shortest and longest lengths yield the most distinctive shape attributes for the object classes being queried. For translation- and rotation-invariant retrieval, principal component analysis is utilized as a preprocessing tool in the spatial plane. Size invariance is achieved by normalizing the Radon transforms in the (R, G, B) color channels independently. The proposed algorithm was tested on a wide range of complex-shaped objects imaged in 24-bit color with different spatial resolutions. The KLDs between two images are calculated in the longest and shortest directions of the Radon transform and then added together to obtain the similarity measure between the query and database pictures. Larger measures indicate two dissimilar shapes, while smaller values represent two similar ones. Experimental results show that the method is robust and exhibits high noise immunity.
In this paper, we present a novel approach for multiple-feature, multiple-sensor classification and localization of three-dimensional objects in two-dimensional images. We use a hypothesize-and-test approach in which we fit three-dimensional geometric models to image data. A hypothesis consists of an object's class and its six degrees of freedom. Our models consist of the objects' geometric data, which is attributed with several local features, e.g., hotspots, edges, and textures, and their respective rules of applicability (e.g., visibility). The model-fitting process is divided into three parts: using the hypothesis, we first project the object onto the image plane while evaluating the rules of applicability for its local features. Hence, we get a two-dimensional representation of the object which, in a second step, is aligned to the image data. In the last step, we perform a pose estimation to calculate the object's six degrees of freedom and to update the hypothesis from the alignment results. The paper describes the major components of our system, including the management and generation of the hypotheses, the matching process, the pose estimation, and model-based prediction of the object's pose in six degrees of freedom. Finally, we show the performance, robustness, and accuracy of the system in two applications (optical inspection for quality control and airport ground-traffic surveillance).
Binary, or (±1), matrices have found numerous applications in coding, signal processing, and communications. In this paper, a general and efficient algorithm for the decomposition of binary matrices is developed; Hadamard matrices are considered as a special case. The proposed scheme requires no zero padding of the input data. The problem of constructing a 4n-point Hadamard transform is related to the Hadamard problem, i.e., the question of the existence of Hadamard matrices (it has not been proved that for every integer n there exists an orthogonal 4n×4n matrix with elements ±1). The number of real operations in the developed algorithms is reduced from O(N²) to O(N log₂ N). Comparisons revealing the efficiency of the proposed algorithms with respect to known ones are given. In particular, it is demonstrated that, in typical applications, the proposed algorithm is more efficient than the conventional Walsh-Hadamard transform. Note that for Hadamard matrices of order ≥96 the general algorithm is more efficient than the classical Walsh-Hadamard transform, whose order is a power of two. The algorithm has a simple and symmetric structure. The results of numerical examples are presented.
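For comparison with the proposed decomposition, the classical fast Walsh-Hadamard transform for lengths that are powers of two needs O(N log₂ N) additions; the Python sketch below implements that baseline butterfly structure (it is the reference algorithm, not the 4n-point scheme developed in the paper).

import numpy as np

def fwht(x):
    # Fast Walsh-Hadamard transform; len(x) must be a power of two.
    a = np.array(x, dtype=float)
    n, h = a.size, 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

# Example: fwht(fwht(x)) / len(x) recovers x (the transform is its own inverse up to scale).
y = fwht([1, 0, 1, 0, 0, 1, 1, 0])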
We propose a fast PNN-based O(N log N) time algorithm for multilevel non-parametric thresholding, where N denotes the size of the image histogram. The proposed PNN based multilevel thresholding algorithm is considerably faster than optimal thresholding. On a set of 8-16 bits per pixel real images, experimental results also reveal that the proposed method provides better quality than the Lloyd-Max quantizer alone. Since the time complexity of the proposed thresholding algorithm is log-linear, it is applicable in real-time image processing applications.
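To illustrate the pairwise-nearest-neighbor (PNN) idea on a histogram, the sketch below repeatedly merges the pair of adjacent gray-level clusters whose merge causes the smallest increase in squared error until the desired number of classes remains; the thresholds are the boundaries between the surviving clusters. It is a straightforward quadratic-time illustration of the principle, not the authors' fast O(N log N) implementation.

import numpy as np

def pnn_thresholds(hist, n_classes):
    # Each cluster is (weight, mean, last_bin); merging adjacent clusters i and i+1
    # increases the total squared error by w1*w2/(w1+w2) * (m1 - m2)^2.
    clusters = [[float(h), float(g), g] for g, h in enumerate(hist) if h > 0]
    while len(clusters) > n_classes:
        costs = [clusters[i][0] * clusters[i + 1][0]
                 / (clusters[i][0] + clusters[i + 1][0])
                 * (clusters[i][1] - clusters[i + 1][1]) ** 2
                 for i in range(len(clusters) - 1)]
        i = int(np.argmin(costs))
        w1, m1, _ = clusters[i]
        w2, m2, last = clusters[i + 1]
        clusters[i] = [w1 + w2, (w1 * m1 + w2 * m2) / (w1 + w2), last]
        del clusters[i + 1]
    # Thresholds: last occupied bin of each class except the final one.
    return [c[2] for c in clusters[:-1]]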
In many applications involving measurement of a physical phenomenon, the output data contain a mixture of different types of distributions. The data set often consists of unimodal distributions that overlap, i.e., the ranges of the corresponding random variables have a significant intersection. After observing a multimodal histogram that contains several partially overlapping distributions, the aim is to separate them by inferring the correct types of the probability density functions (PDFs) and their parameters. The method is based on nonlinear least squares estimation, where several types of PDFs are fitted to the region mostly affected by a single distribution. The candidate PDFs are those of the Pearson system and the Weibull, Fisher, chi-squared, and Rayleigh distributions. The method can be extended to multidimensional cases in certain situations. Methods developed earlier for this task are based, for example, on the QQ-plot technique and on order statistic filter banks. The identified distribution types and their parameters can be applied to various tasks in image processing and system analysis; the algorithm can be used, e.g., for the estimation of PDFs of certain phenomena and for global thresholding of images. The method is applied to real two-dimensional data sets whose values come from several distributions.
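The core fitting step can be pictured with SciPy's nonlinear least squares: fit one candidate density (here a Gaussian, purely as an assumed example) to the histogram bins in the region dominated by a single mode, then compare residuals across candidate families. This is only a schematic of the estimation step, not the full Pearson/Weibull/Fisher/chi-squared/Rayleigh candidate search described above.

import numpy as np
from scipy.optimize import curve_fit

def gauss(x, a, mu, sigma):
    return a * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def fit_mode(bin_centers, counts, lo, hi):
    # Nonlinear least-squares fit of one candidate PDF over the bins in [lo, hi],
    # i.e. the region assumed to be dominated by a single distribution.
    sel = (bin_centers >= lo) & (bin_centers <= hi)
    p0 = [counts[sel].max(), bin_centers[sel].mean(), bin_centers[sel].std() + 1e-6]
    params, _ = curve_fit(gauss, bin_centers[sel], counts[sel], p0=p0)
    residual = np.sum((gauss(bin_centers[sel], *params) - counts[sel]) ** 2)
    return params, residual   # the candidate family with the smallest residual is retained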
The aim of this paper is to present a new method for estimating the instantaneous frequency of a frequency-modulated signal corrupted by additive noise. The method is an example of the fusion of two theories: time-frequency representations and mathematical morphology. Any time-frequency representation of a useful signal is concentrated around its instantaneous frequency law and diffuses the noise that perturbs the useful signal across the time-frequency plane. In this paper a new time-frequency representation, useful for the estimation of the instantaneous frequency, is proposed. It is the product of two other time-frequency representations: the Wigner-Ville representation and a new one obtained by filtering the Gabor representation of the signal to be processed with a hard-thresholding filter. Using the image of this new time-frequency representation, the instantaneous frequency of the useful signal can be extracted with the aid of mathematical morphology operators: conversion to binary form, dilation, and the skeleton. Simulations of the proposed method have confirmed its qualities; it outperforms other estimation methods, such as those based on adaptive notch filters.
A new concept of ga-cross sections by arbitrary homothetic curves ga, which generalizes the traditional horizontal cross sections used in mathematical morphology, is considered. On the basis of these cross sections, the corresponding set representations in the form of umbrae, as well as function-processing transformations such as dilation and erosion, are given. The main properties of these transformations are described.
In this paper we present a digital image enhancement technique that relies on the application of a nonlinear operator within the Retinex approach. The basic idea of this approach is to separate the illumination and reflectance components of the image, so that by reducing the contribution of the former it is possible to effectively control the dynamic range of the latter. However, its behaviour critically depends on the quality of the illumination estimation process, so that either annoying artifacts are generated or very complex operators have to be used, which may prevent the use of this method in cost- and time-sensitive applications. Thanks to the use of a suitable nonlinear operator, our method is able to provide good-quality, artifact-free images at a limited computational complexity.
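As background, the classic single-scale Retinex estimates the illumination with a large Gaussian blur and removes it in the log domain; the Python sketch below shows that baseline using scipy.ndimage. The paper's contribution is a cheaper nonlinear illumination estimator that avoids the artifacts of such simple estimates, and it is not reproduced here; the gain, offset, and sigma values are assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(img, sigma=40.0, gain=30.0, offset=128.0):
    # Illumination is approximated by a heavily blurred image;
    # reflectance is log(I) - log(illumination), then remapped for display.
    img = img.astype(float) + 1.0                 # avoid log(0)
    illumination = gaussian_filter(img, sigma)
    reflectance = np.log(img) - np.log(illumination)
    out = gain * reflectance + offset
    return np.clip(out, 0, 255).astype(np.uint8)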
The problem of blind evaluation of noise variance in images is considered. Typical approaches obtain a set of variance estimates in small blocks and then analyze the distribution of these estimates to find its maximum. However, such methods suffer from a common drawback: their accuracy degrades drastically if an image contains a lot of texture. To alleviate this drawback we propose an approach based on the fact that the statistical properties of DCT coefficients corresponding to high spatial frequencies in small blocks depend strongly on the noise variance. As shown, these coefficients can be processed in a nonlinear manner to eliminate the influence of the informative component of the image itself. The dependence of the method's accuracy on the chosen nonlinear operation and its parameters is analyzed. It is shown that the proposed method provides good accuracy of blind noise variance evaluation for the set of considered test images. A comparative analysis of the proposed method and some known analogs is performed.
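A generic version of the block-DCT idea is sketched below: compute the 8x8 DCT of each block, pool the highest-frequency coefficients (which are dominated by noise rather than by image content), and apply a robust nonlinear estimator, here the median of absolute values. The block size, the frequency mask, and the MAD constant are assumptions and do not reproduce the paper's exact nonlinear processing.

import numpy as np
from scipy.fftpack import dct

def block_dct2(block):
    # Orthonormal 2-D DCT of one block.
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def blind_noise_std(img, bsize=8):
    img = img.astype(float)
    u, v = np.meshgrid(range(bsize), range(bsize), indexing='ij')
    hf_mask = (u + v) >= bsize + 2                 # keep only high spatial frequencies
    hf = []
    for i in range(0, img.shape[0] - bsize + 1, bsize):
        for j in range(0, img.shape[1] - bsize + 1, bsize):
            c = block_dct2(img[i:i + bsize, j:j + bsize])
            hf.append(c[hf_mask])
    hf = np.concatenate(hf)
    return 1.4826 * np.median(np.abs(hf))          # robust (MAD-based) estimate of sigma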
In this paper, we propose the use of order filters in the iterative process of super-resolution reconstruction. At each iteration, order statistic filters are used to filter and fuse the error images. The signal dependent L-filter structure adjusts its coefficients to achieve edge preservation as well as maximum noise suppression in homogeneous regions. Depending on the amount of variance of the image pixels in different directional masks, the filter switches to use the orientation, which is most likely to follow the image edges. This procedure allows for the incorporation of a directional prior across the iterations. The introduction of a spatial filtering stage into the iterative process of super-resolution attempts to increase the robustness towards motion error and image outliers. Experimental results show the improvement obtained on sequences of noisy text images when motion is exactly known, and when a random motion error is introduced to simulate the real life situation of inaccurate motion estimation.
PDE-based, non-linear diffusion techniques are an effective way to denoise images. In a previous study, we investigated the effects of different parameters in the implementation of isotropic, non-linear diffusion. Using synthetic and real images, we showed that for images corrupted with additive Gaussian noise such methods are quite effective, leading to lower mean-squared-error values in comparison with spatial filters and wavelet-based approaches. In this paper, we extend this work to anisotropic diffusion, where the diffusivity is a tensor-valued function that can be adapted to the local edge orientation. This allows smoothing along edges, but not perpendicular to them. We consider several anisotropic diffusivity functions as well as approaches for discretizing the diffusion operator that minimize mesh orientation effects. We investigate how these tensor-valued diffusivity functions compare in image quality, ease of use, and computational cost relative to simple spatial filters, the more complex bilateral filters, wavelet-based methods, and isotropic non-linear diffusion techniques.
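For readers who want the baseline in code form, the isotropic (scalar-diffusivity) scheme investigated in the earlier study can be written in a few lines; the anisotropic case discussed here replaces the scalar g(|grad u|) by an orientation-dependent tensor. The Python sketch below implements only the scalar Perona-Malik baseline, with assumed parameter values.

import numpy as np

def perona_malik(img, n_iter=20, kappa=15.0, dt=0.2):
    # Isotropic nonlinear diffusion u_t = div( g(|grad u|) grad u ),
    # with g(s) = exp(-(s/kappa)^2), so smoothing is suppressed across strong edges.
    u = img.astype(float)
    for _ in range(n_iter):
        dn = np.roll(u, -1, axis=0) - u            # differences to the four neighbors
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        u = u + dt * sum(np.exp(-(d / kappa) ** 2) * d for d in (dn, ds, de, dw))
    return u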
One of the major disadvantages of standard iterative image restoration is its linear rate of convergence. In this paper it is shown that, for natural scenes and uniform space-invariant distortion, the iteration process can be accelerated. It is shown that, for standard iterative restoration, if we choose the gain parameter close to but less than its upper limit and then, after some iterations, reduce it to exactly half of its upper limit, a close estimate of the original image can be obtained at that particular iteration. This is due to the fact that the iterative step vectors get closer to the original image vector as the iteration progresses, with a linear rate of convergence. The advantage of this approach is that the iteration process can be accelerated at any desired iteration. Reducing the gain parameter at an early stage of the iteration process saves processing time at the cost of accuracy; on the other hand, if we reduce the gain parameter after a larger number of iterations, we obtain a more accurate result using more processing time. This follows from the fact that the angle between the iterative step vector and the original image vector approaches zero as the number of iterations increases.
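The iteration being discussed is the standard linear one, x_{k+1} = x_k + beta (y - H x_k), which for space-invariant blur can be run in the Fourier domain; the Python sketch below shows that baseline together with the suggested switch of the gain parameter to half of its upper limit after a chosen number of iterations. The blur model, the convergence bound used for the upper limit (valid for a real, non-negative frequency response such as a Gaussian blur), and the iteration counts are assumptions for illustration only.

import numpy as np

def iterative_restore(y, psf, n_total=50, n_switch=30):
    # Van Cittert-type iteration x <- x + beta*(y - h*x), carried out in the FFT domain.
    # psf is assumed to be zero-padded to the image size with its peak at index (0, 0).
    H = np.fft.fft2(psf)
    Y = np.fft.fft2(y)
    beta_max = 2.0 / np.abs(H).max()               # upper limit of the gain parameter
    beta = 0.95 * beta_max                          # start close to, but below, the limit
    X = np.zeros_like(Y)
    for k in range(n_total):
        if k == n_switch:
            beta = 0.5 * beta_max                   # reduce the gain to half its upper limit
        X = X + beta * (Y - H * X)
    return np.real(np.fft.ifft2(X))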
In this paper, a new method of image enhancement is introduced. The method is based on the tensor (or vectorial) representation of the two-dimensional image with respect to the Fourier transform. In this representation, the image is defined as a set of one-dimensional (1-D) image-signals that split the Fourier transform into a set of 1-D transforms. As a result, the problem of image enhancement is reduced to 1-D processing of the splitting signals.
The splitting of the image yields a simple model for image enhancement, in which, by using only a few image-signals, it is possible to achieve enhancement comparable to that of the known class of frequency-domain parametric image enhancement algorithms that are widely used for object detection and visualization. A quantitative measure of image enhancement related to Weber's law of the human visual system is considered. Based on this quantitative measure, the best parameters for image enhancement can be found for each image-signal to be processed separately. Examples of image-signals and their contributions to the enhancement of a 256×256 image are given.
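One widely used quantitative enhancement measure related to Weber's law is the block-wise log-ratio of local maxima to local minima (often called EME); the Python sketch below computes such a measure so that the contribution of individual image-signals can be compared numerically. The block size is an assumption, and the formula may differ in detail from the measure adopted in the paper.

import numpy as np

def eme(img, block=8, eps=1e-4):
    # Weber-law-style measure of enhancement: average over blocks of 20*log10(max/min).
    img = img.astype(float) + eps                  # eps avoids division by zero in flat blocks
    h = (img.shape[0] // block) * block
    w = (img.shape[1] // block) * block
    vals = []
    for i in range(0, h, block):
        for j in range(0, w, block):
            b = img[i:i + block, j:j + block]
            vals.append(20.0 * np.log10(b.max() / b.min()))
    return float(np.mean(vals))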
Imaging text documents always adds a certain amount of noise and other artifacts, and the fidelity of electronic reproduction depends very much on the accuracy of noise removal algorithms. Present algorithms attempt to remove artifacts with filters based on convolution or other methods that are "invasive" with respect to the original representation of the text document. It is therefore highly desirable to design noise removal algorithms that restore the image to the original representation of the text, removing merely the noise and added artifacts without blurring or tampering with font corners and edges. In this paper, we present a solution to this problem by designing a filter based on accurate statistics of text in its Matrix Frequency Representation, which we developed earlier.
We present a software framework for developing a flexible image analysis system. This framework provides a uniform interface to develop intelligent image analysis tools as well as infrastructure facilities required by these tools for working cooperatively with other tools. This system first automatically generates a processing plan to accomplish a user defined task, and then executes that plan to produce results.
Each processing tool encapsulates an image processing algorithm as well as knowledge about this algorithm. Tools use this knowledge to evaluate their suitability to handle a given task. This approach gives each processing tool the ability to selectively accept tasks that belong to its domain.
A processing tool is also able to define subtasks that must be accomplished to refine the input data as required by its underlying image processing algorithm. These subtasks can be broadcast among the other tools to enlist appropriate ones in accomplishing the task goals.
Our framework is able to accept image analysis tasks defined using the abstract conceptual terms of application domains, and uses production rules to expand these terms into detailed forms. This facility allows an image analysis task to be defined successfully without specifying all low-level details.
Preliminary results that we obtained from this framework demonstrated the success of our approach.
In this paper we propose automated abstract extraction for soccer video using MPEG-7 descriptors. The video abstraction is created in the form of highlight scenes that represent pre-defined contexts. In soccer video, events often follow a specific order of actions, so Hidden Markov Models (HMMs) are employed to detect the highlights of interest. In our system, the input video is first separated into shots; the shots are then classified and clustered based on features from MPEG-7 standard descriptors. The HMM-based detection has two layers: the first eliminates trivial scenes, and the second distinguishes the highlights of different semantic events. A specific user preference is used to select the highlight scenes relevant to the user's interests.
The approximation in maximally decimated or undersampled filter banks is considered. In this discussion, the decimated sampling interval T satisfies T≥M, where M is the number of paths of the filter bank. Moreover, we present the optimum transmultiplexer TR. This result is extended to the design of a wireless transmultiplexer, used for example in CDMA systems.
The automated production of maps of human settlement from recent satellite images is essential to studies of urbanization, population movement, and the like. The spectral and spatial resolution of such imagery is often high enough to successfully apply computer vision techniques; however, vast amounts of data have to be processed quickly. In this paper, we propose an approach that processes the data in several stages. At each stage, using features appropriate to that stage, we identify the portion of the data likely to contain information relevant to the identification of human settlements, and this data is used as input to the next stage. Since the amount of data has been reduced, we can use more complex features in the next stage; these features can be more representative of human settlements, and also more time-consuming to extract from the image data. Such a hierarchical approach enables us to process large amounts of data in a reasonable time while maintaining the accuracy of human settlement identification. We illustrate our multi-stage approach using IKONOS 4-band and panchromatic images, and compare it with straightforward processing of the entire image.
Cameras provide only two-dimensional views of three-dimensional objects. These views are projections that change depending on the spatial orientation, or pose, of the object. In this paper we propose a technique to estimate the pose of a 3D object knowing only a 2D picture of it. The proposed technique exploits both linear and nonlinear composite correlation filters in combination with a neural network. We present results for estimating two orientations, in-plane and out-of-plane rotations, within an 8-degree square range.
In this paper, we propose an automatic model-based image segmentation system in which the instantiated model is refined incrementally using domain knowledge combined through fuzzy logic. The fuzzy inference system (FIS) combines several different image features that experts use to detect prostates in noisy ultrasound images. We use the discrete dynamic contour (DDC) model because of its favorable performance for both open and closed contour models. The FIS governs the automatic initialization of the open DDC model and the subsequent incremental growing process on a low-resolution image. At this stage, the initial open contour model grows by tracking coarse edge details until it closes. The resulting closed contour model is then refined incrementally up to the original image resolution, incorporating finer edge details into the model. The algorithm developed here is a general tool for object detection in an image analysis system, which employs a flexible framework designed to support multiple decision tools collaborating to form a solution. The FIS in our tool retrieves from the framework the domain knowledge it needs to govern the model refinement process. The proposed algorithm can be used to detect the boundary of any object in an image, provided knowledge of the dominant image features is stored in the system. We include results of the algorithm successfully applied to several ultrasound images to delineate the boundary of the prostate.
The system receives a pattern sequence, i.e., a time series of consecutive patterns, as an input sequence. A set of input sequences is given as a training set, where a category is attached to each input sequence, and supervised learning is introduced. First, we introduce a state transition model, AST (Abstract State Transition), in which information about the speed of moving objects is added to a state transition model. Next, we extend it to a model that includes reinforcement learning, because this is more powerful for learning the sequence from the start to the goal. Last, we extend it to a state model that includes a kind of pushdown tape representing knowledge of behavior, which we call the Pushdown Markov Model. The learning procedure is similar to learning in an MDP (Markov Decision Process) using DP (Dynamic Programming) matching. As a result, we show reasonable learning-based recognition of trajectories of human behavior.
We describe two approaches for the systematic performance assessment of a specific image registration algorithm. One approach involves generating radiometrically different synthetic images by convolving one of them with a point-spread function, while the other consists of registering three or more images to obtain multiple estimates of the registration parameters. We present experimental results indicating that radiometrically different synthetic data are more difficult to register and thus provide better testing than the same-radiometry data used in our previous work. The results also show that the multiple-estimate methodology, which we call triangulation, can be used not only to measure the self-consistency of a given registration algorithm, but also to obtain estimates of ground truth for images for which the ground truth, if available at all, is known only approximately.
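The triangulation check can be stated concretely: if T_AB, T_BC, and T_AC are the transforms estimated among three images, the composition of T_AB followed by T_BC should match T_AC, and the residual measures the self-consistency of the algorithm. The Python sketch below does this for 3x3 homogeneous affine matrices over a few reference points; it is a generic formulation, not the authors' exact error metric.

import numpy as np

def triangulation_residual(T_ab, T_bc, T_ac, points):
    # Compare the composed transform (A -> B -> C) with the directly estimated A -> C
    # transform by the RMS displacement of a few reference points (e.g. image corners).
    composed = T_bc @ T_ab
    pts = np.hstack([points, np.ones((points.shape[0], 1))]).T   # homogeneous coordinates
    err = (composed @ pts - T_ac @ pts)[:2]
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=0))))

# Example reference points: the corners of a 512x512 image.
corners = np.array([[0, 0], [511, 0], [0, 511], [511, 511]], dtype=float)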
Concrete materials obtained from pre-mixed, ready-to-use products (central-mix concrete) are increasingly used and represent a large portion of the civil construction market. Such products are used at different scales, ranging from small-scale works, such as those commonly carried out inside a house or an apartment, to large civil or industrial works. In both cases, the mixtures and the final work are usually controlled through the analysis of properly collected samples. Through appropriate sampling, objective parameters can be derived, such as the size class distribution and composition of the constituting particulate matter, or the mechanical characteristics of the sample itself. An important parameter not considered by the aforementioned approach is segregation, that is, the possibility that some particulate materials migrate preferentially to certain zones of the mixture and/or of the final product. Such behavior dramatically influences the quality of the product and of the final manufactured good. At present this behavior is studied only through a human-based visual approach; no repeatable analytical procedures or quantitative data processing exist. In this paper a procedure fully based on image processing techniques is described and applied. Results are presented and analyzed with reference to industrial products. A comparison is also made between the newly proposed digital-imaging-based techniques and the analyses usually carried out at the industrial laboratory scale for standard quality control.
Soil texture is defined as the relative proportion of clay, silt and sand found in a given soil sample. It is an important physical property of soil that affects such phenomena as plant growth and agricultural fertility. Traditional methods used to determine soil texture are either time consuming (hydrometer), or subjective and experience-demanding (field tactile evaluation). Considering that textural patterns observed at soil surfaces are uniquely associated with soil textures, we propose an innovative approach to soil texture analysis, in which wavelet frames-based features representing texture contents of soil images are extracted and categorized by applying a maximum likelihood criterion. The soil texture analysis system has been tested successfully with an accuracy of 91% in classifying soil samples into one of three general categories of soil textures. In comparison with the common methods, this wavelet-based image analysis approach is convenient, efficient, fast, and objective.
This paper presents a lossless coding method designed for 16-bit high-resolution x-ray images. The proposed algorithm uses a non-linear histogram-based mapping that eliminates the gaps in the histogram before applying the wavelet transform. The mapping is designed to reduce the magnitude of the wavelet coefficients, especially in the high-frequency subbands; reducing the magnitude of the coefficients there provides more compression, as the high-frequency subbands occupy most of the image area. This paper shows that the energy of all the subbands is reduced after eliminating the gaps in the histogram, and hence the magnitudes of the coefficients are reduced. To further exploit this property, the image is segmented into 64×64 blocks, the gaps in the histogram of each block are eliminated independently, and then each block is independently coded using SPIHT. Since the mapping is non-linear, look-up tables are transmitted to the decoder as part of the overhead information. Experimental results show that the compression in bits per pixel (bpp) of the proposed algorithm exceeds not only that of wavelet-based SPIHT and JPEG 2000 but also that of the state-of-the-art context-based JPEG-LS.
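The gap-elimination step amounts to a look-up table that maps the gray levels actually present in a block onto consecutive integers, with the inverse table sent to the decoder as overhead; a minimal Python sketch of that mapping is shown below. The block partitioning and the subsequent SPIHT coding are not shown.

import numpy as np

def remove_histogram_gaps(block):
    # Map the gray levels occurring in the block onto 0..K-1, removing histogram gaps.
    levels = np.unique(block)                      # sorted distinct gray levels
    mapped = np.searchsorted(levels, block)        # index of each pixel's level
    return mapped, levels                          # 'levels' is the inverse look-up table

def restore_gaps(mapped, levels):
    # Decoder side: the inverse mapping restores the original 16-bit values losslessly.
    return levels[mapped]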
With the increasing use of multimedia technologies, image compression requires higher performance as well as new features. To address this need in the specific area of image coding, the latest ISO/IEC image compression standard, JPEG2000, has been developed. In Part II of the standard, the Wavelet Trellis Coded Quantization (WTCQ) algorithm was adopted; this quantization design has been shown to provide subjective image quality superior to other existing quantization techniques. In this paper we aim to improve the rate-distortion performance of WTCQ by incorporating a thresholding process into the JPEG2000 coding chain. The threshold decisions are derived in a Bayesian framework, and the prior used for the wavelet coefficients is the generalized Gaussian distribution (GGD). The threshold value depends on the parametric model estimated for the subband wavelet coefficient distribution. Our algorithm approaches the lowest possible memory usage by using a line-based wavelet transform and a scan-based bit allocation technique. In this work, we investigate an efficient way to apply TCQ to wavelet image coding with regard to both computational complexity and compression performance. Experimental results show that the proposed algorithm performs competitively with the best coding algorithms reported in the literature in terms of quality.
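The best-known Bayesian threshold for wavelet subbands under a generalized Gaussian prior is the BayesShrink rule, t = sigma_noise^2 / sigma_signal, with the noise level estimated from the finest diagonal subband; the Python sketch below shows that commonly used rule as an illustration of the kind of threshold decision involved. It may differ from the exact derivation used in the paper.

import numpy as np

def estimate_noise_sigma(finest_diagonal_subband):
    # Robust noise estimate from the finest diagonal subband (median absolute deviation rule).
    return np.median(np.abs(finest_diagonal_subband)) / 0.6745

def bayes_threshold(subband, sigma_noise):
    # BayesShrink-style threshold: t = sigma_n^2 / sigma_x,
    # where sigma_x^2 = max(var(subband) - sigma_n^2, 0).
    var_x = max(np.mean(subband.astype(float) ** 2) - sigma_noise ** 2, 0.0)
    return np.inf if var_x == 0.0 else sigma_noise ** 2 / np.sqrt(var_x)

def soft_threshold(coeffs, t):
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)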
Powerful, flexible shape models of anatomical structures are required for robust, automatic analysis of medical images. In this paper we investigate a physics-based shape representation and deformation method in an effort to meet these requirements. Using a medial-based spring-mass mesh model, shape deformations are produced via the application of external forces or internal spring actuation. The range of deformations includes bulging, stretching, bending, and tapering at different locations, scales, and with varying amplitudes. Springs are actuated either by applying deformation operators or by activating statistical modes of variation obtained via a hierarchical regional principal component analysis. We demonstrate results on both synthetic data and on a spring-mass model of the corpus callosum, obtained from 2D mid-sagittal brain Magnetic Resonance (MR) Images.
Some types of laser range scanner can measure both range data and color texture data simultaneously from the same viewpoint and are often used to acquire the 3D structure of outdoor scenery. Unfortunately, for outdoor scenery a laser range scanner cannot provide perfect range information about target objects such as buildings, and various factors cause critical defects in the range data. We present a defect detection method based on region segmentation of the observed range and color data, and employ a nonlinear PDE (partial differential equation)-based method to repair the detected defect regions of the range data. For defect detection, we perform range-and-color segmentation to divide the observed data into regions corresponding to buildings, trees, the sky, the ground, persons, street furniture, etc. Using the segmentation results, we extract occlusion regions of buildings as defect regions. Once the defect regions are extracted, the 3D position data, or range data, are repaired from the observed data in their neighborhoods. For that purpose, we adapt the digital inpainting algorithm, originally developed for the color image repair problem, to this 3D range data repair problem. The algorithm is formulated as a nonlinear time-evolution procedure based on a geometric nonlinear PDE.
In the present work we model multi-valued 2D images as surfaces embedded in the space-feature space. Using the differential-geometric framework, we then introduce an original definition of multi-valued image curvatures. First we use these curvatures to detect valleys and ridges in color images. Then we generate a new nonlinear color scale space based on a mean curvature flow. This leads to a powerful tool for denoising color images.
In this paper, we propose a new design method for general weighted median (GWM) filters admitting negative weights for the enhancement of images degraded by additive impulsive noise. GWM filters have already been proposed as frequency-selective nonlinear filters; nevertheless, how to apply them to the enhancement of degraded images has not been considered. When enhancing images degraded by additive noise, the preferable frequency response varies greatly with the position of the window in the image; therefore, GWM filters with fixed weights are not well suited to image processing. The proposed method consists of three steps. First, we divide the blocks in the sliding window of the filter into a number of classes according to differences in spectral characteristics. Second, we optimize a GWM filter to have the proper frequency response for each class. Finally, the GWM filters, switched per class, are used for image enhancement. By preparing a GWM filter for each class, the proposed filtering method outperforms GWM filters with fixed weights. Through simulations, we demonstrate this efficiency of the proposed filters compared with the original WM filters and linear filters. The proposed method is robust to impulsive noise contamination and has a frequency-selective filtering property.
This paper describes a nonlinear processing system for satellite image enhancement and smoothing that prepares an image for successful feature extraction through edge detection. Emphasis was placed on coastlines, man-made objects such as airports, dams, and buildings, and linear features such as roads and parcel boundaries. Rank-order morphological operators and adaptive filtering were employed, leading to promising results. Adaptive filtering was adopted to smooth the image and homogenize regions while preventing edge blurring, since high-frequency areas in the image are protected. Morphological operators based on rank filters were also implemented because they are often more robust to noise and shape variations than morphological operators with a plain structuring element. A survey concerning the shape and the size of the structuring element used for the morphological operators is presented: the shape of the structuring element can lead to certain transformations of the geometry of the desired features, while its size controls the scale space, with the aim of retaining only features at certain desirable scales. The nonlinear processing system was applied to SPOT HRV (10-meter ground resolution) and IKONOS PAN (1-meter ground resolution) satellite imagery and is demonstrated with examples.
Some modifications to the filterbank algorithm for fingerprint recognition are proposed. Particular attention is paid to core localization and full database coverage. The present implementation improves recognition results while keeping almost the same computational load.
The purpose of this paper is to develop a class of generalized parametric Slant-Hadamard transforms of order (formula available in paper), where k is an arbitrary integer, and to present its fast algorithm. Special cases of this class are the classical Slant-Hadamard (k=2 and βN=1), the generalized Slant-Hadamard (βN=1), and the parametric Slant-Hadamard (k=2) transforms. We show that the parametric Slant-Hadamard transform is slightly superior to the DCT for compression of geometric test images at particular quantization matrix scaling factors.
As is well known, impulsive noise appears in an image as randomly distributed pixels of random brightness, and the impulses themselves usually differ considerably in brightness from the surrounding pixels. The main topic of the paper is the introduction of new impulse detection criteria and their application to filters such as median, rank-order, and cellular neural Boolean filters. Three impulse detectors are considered. The Rank Impulse Detector exploits the property that the rank of an impulse in the variational series is usually quite different from the rank of the median. The Exponential Median Detector uses the exponential of the difference between the local median and the pixel value to detect an impulse. The combination of these two detectors forms the Enhanced Rank Impulse Detector, which integrates the advantages of both. In combination with a filter, it allows iterative filtering without further image degradation.
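The sketch below shows a minimal rank-based impulse detector coupled with a median filter, in the spirit of the Rank Impulse Detector described above: a pixel is flagged and replaced only when its rank in the sorted window is far from the rank of the median. The rank-distance threshold and the 3x3 window are illustrative assumptions, and the Exponential Median and Enhanced detectors are not reproduced here.

```python
# Hedged sketch: rank-based impulse detection followed by selective median replacement.
import numpy as np

def rank_impulse_median(image, t=3, size=3):
    pad = size // 2
    padded = np.pad(image.astype(np.float64), pad, mode='edge')
    out = image.astype(np.float64).copy()
    median_pos = (size * size) // 2                       # rank of the median in the window
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            win = np.sort(padded[i:i + size, j:j + size].ravel())
            rank = np.searchsorted(win, float(image[i, j]))   # rank of the centre pixel
            # Pixels whose rank is far from the median rank are treated as impulses
            # and replaced by the local median; all other pixels are left untouched.
            if abs(rank - median_pos) >= t:
                out[i, j] = win[median_pos]
    return out
```

Because only detected impulses are replaced, the filter can be applied iteratively without the progressive blurring a plain median filter would cause.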
Typhoons can be classified into two classes: eyed and non-eyed. The center of a non-eyed typhoon with good circularity is the geometric center of its cloud system, whereas the center of a non-eyed typhoon with poor circularity lies in the high-grayness area on the side of the greater grayness-gradient sector. A new mathematical morphology-based algorithm is proposed to locate the center of a non-eyed typhoon automatically. Multispectral image fusion of the infrared and water vapor channels is used to verify the located center. For a given infrared satellite cloud image, the locating steps are as follows: a) noise removal, b) segmentation of the main cloud system, c) center location, and d) verification by multispectral image fusion. The experimental results show that the algorithm successfully locates the centers of most non-eyed typhoons.
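The sketch below illustrates steps (a)-(c) for an infrared cloud image with common stand-ins: morphological opening for noise removal, Otsu thresholding to segment the main cloud system, and the centroid of the largest bright component as the geometric center. The gradient-sector refinement for poorly circular systems and the fusion verification of step (d) are not reproduced, and all parameter values are illustrative.

```python
# Hedged sketch of steps (a)-(c): noise removal, main cloud segmentation, centre estimate.
import numpy as np
from scipy import ndimage
from skimage.filters import threshold_otsu

def locate_cloud_centre(ir_image):
    smoothed = ndimage.grey_opening(ir_image, size=(5, 5))      # (a) remove small noise
    mask = smoothed > threshold_otsu(smoothed)                   # (b) segment bright cloud
    labels, n = ndimage.label(mask)
    if n == 0:
        return None
    sizes = ndimage.sum(mask, labels, index=np.arange(1, n + 1))
    main = labels == (np.argmax(sizes) + 1)                      # keep the main cloud system
    cy, cx = ndimage.center_of_mass(main)                        # (c) geometric centre
    return cy, cx
```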
This paper presents a novel algorithm for handling occlusion in visual traffic surveillance (VTS) by geometrically splitting the model that has been fitted onto the composite binary vehicle mask of two occluded vehicles. The proposed algorithm consists of a critical points detection step, a critical points clustering step and a model partition step using the vanishing point of the road. The critical points detection step detects the major critical points on the contour of the binary vehicle mask. The critical points clustering step selects the best critical points among the detected critical points as the reference points for the model partition. The model partition step partitions the model by exploiting the information of the vanishing point of the road and the selected critical points. The proposed algorithm was tested on a number of real traffic image sequences, and has demonstrated that it can successfully partition the model that has been fitted onto two occluded vehicles. To evaluate the accuracy, the dimensions of each individual vehicle are estimated based on the partitioned model. The estimation accuracies in vehicle width, length and height are 95.5%, 93.4% and 97.7% respectively.
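As a minimal illustration of the first step only, the sketch below detects candidate critical points on the contour of a binary vehicle mask using a simple turning-angle test between contour segments. The clustering step and the vanishing-point-based model partition are not reproduced; the step length k and the angle threshold are illustrative assumptions, not the paper's criteria.

```python
# Hedged sketch: candidate critical points on a binary mask contour via turning angle.
import numpy as np
from skimage import measure

def contour_critical_points(mask, k=5, angle_thresh_deg=40.0):
    contour = max(measure.find_contours(mask.astype(float), 0.5), key=len)
    n = len(contour)
    critical = []
    for i in range(n):
        prev_vec = contour[i] - contour[(i - k) % n]
        next_vec = contour[(i + k) % n] - contour[i]
        cosang = np.dot(prev_vec, next_vec) / (
            np.linalg.norm(prev_vec) * np.linalg.norm(next_vec) + 1e-12)
        angle = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
        if angle > angle_thresh_deg:        # sharp turn -> candidate critical point
            critical.append(contour[i])
    return np.array(critical)
```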
An approach for estimating the distribution of a synchronized budding yeast (Saccharomyces cerevisiae) cell population is discussed. This involves estimation of the phase of the cell cycle for each cell. The approach is based on counting the number of buds of different sizes in budding yeast images. An image processing procedure is presented for the bud-counting task. The procedure employs clustering of the local mean-variance space for segmentation of the images. The subsequent bud-detection step is based on an object separation method which utilizes the chain code representation of objects as well as labeling of connected components. The procedure is tested with microscopic images that were obtained in a time-series experiment of a synchronized budding yeast cell population. The use of the distribution estimate of the cell population for inverse filtering of signals that are obtained in time-series microarray measurements is discussed as well.
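The segmentation step described above can be sketched as follows: each pixel is mapped to a (local mean, local variance) feature pair and the feature space is clustered with k-means. The window size and the number of clusters are illustrative assumptions, and the subsequent chain-code-based bud-detection step is not reproduced.

```python
# Hedged sketch: segmentation by clustering the local mean-variance feature space.
import numpy as np
from scipy import ndimage
from sklearn.cluster import KMeans

def mean_variance_segmentation(image, size=7, n_clusters=3):
    img = image.astype(np.float64)
    local_mean = ndimage.uniform_filter(img, size=size)
    local_sqmean = ndimage.uniform_filter(img ** 2, size=size)
    local_var = np.maximum(local_sqmean - local_mean ** 2, 0.0)
    features = np.column_stack([local_mean.ravel(), local_var.ravel()])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(features)
    return labels.reshape(image.shape)   # per-pixel class label (background, cell, bud, ...)
```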
Inspired by the analogy between computer-based visual systems and their biological counterparts, we introduce two new concepts: the concept of a pixel's receptive field (analogous to the receptive fields of neural cells in the eye's retina) and its derivative, the concept of the information content of a single pixel (analogous to the output activity of the retina's neural cells). Exploiting these concepts, we suggest a quantitative measure of the dissimilarity between a pixel and its surrounding neighbors, which we define as the measure of the pixel's information content. With such a measure at hand, many image processing tasks that usually require information-related assumptions for their successful accomplishment can be reformulated and redesigned to gain new computational efficiency in the detection, description, and discrimination of image features.
On this basis, new image processing techniques can be devised. Lena images processed with some of these techniques are presented for illustration purposes.
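One possible realisation of such a measure is sketched below: a pixel's "information content" is scored as its dissimilarity from the mean of a square receptive field, normalised by the local spread. The receptive-field size and the normalisation are illustrative assumptions, not the paper's definition.

```python
# Hedged sketch: a simple per-pixel "information content" score.
import numpy as np
from scipy import ndimage

def pixel_information_content(image, rf_size=9, eps=1e-8):
    img = image.astype(np.float64)
    rf_mean = ndimage.uniform_filter(img, size=rf_size)
    rf_sqmean = ndimage.uniform_filter(img ** 2, size=rf_size)
    rf_std = np.sqrt(np.maximum(rf_sqmean - rf_mean ** 2, 0.0))
    # Large values mark pixels that stand out within their receptive field.
    return np.abs(img - rf_mean) / (rf_std + eps)
```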
This paper describes and illustrates an optimal nonlinear interpolation method that is appropriate for image line scans. It is particularly suitable for the restoration of digital images corrupted by “salt and pepper” noise in which isolated pixels are driven to their minimum and maximum gray values. It assigns gray values to these pixels so that the original smoothness of each line scan is maintained. The method is significant in that smoothness invariance is not achieved using, for example, ordinary low-pass or standard wavelet filtering methods to remove “salt and pepper” noise.
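The sketch below restores "salt and pepper" pixels line by line in the spirit described above, but with a cubic-spline interpolant as a smoothness-preserving stand-in rather than the paper's optimal nonlinear method. Treating corrupted pixels as exactly the minimum and maximum grey values (`lo`, `hi`) is an illustrative assumption.

```python
# Hedged sketch: per-line-scan restoration of salt-and-pepper pixels via spline interpolation.
import numpy as np
from scipy.interpolate import CubicSpline

def restore_line_scans(image, lo=0, hi=255):
    out = image.astype(np.float64).copy()
    for r in range(out.shape[0]):
        line = out[r]                              # view into `out`, modified in place
        bad = (line <= lo) | (line >= hi)          # pixels driven to the extremes
        good = np.flatnonzero(~bad)
        if bad.any() and len(good) >= 2:
            spline = CubicSpline(good, line[good])
            line[bad] = spline(np.flatnonzero(bad))
    return out
```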