The concept of the N-point DFT is generalized by considering it in the real space (not the complex one). Multiplication
by the twiddle coefficients is considered in matrix form, as the Givens transformation. Such a block-wise
representation of the DFT matrix is effective. A transformation called the T-generated N-block
discrete transform, or N-block T-GDT, is introduced. For each N-block T-GDT an inner product is defined
with respect to which the rows (and columns) of the matrices X are orthogonal. By using different parameterized
matrices T, we define metrics in the real space of vectors. The parameters can be selected among
the integers only, which leads to an integer-valued metric. We also propose a new representation of the
discrete Fourier transform in the real space R2N. This representation is not integer, and is based on a 2x2 matrix C
which is not a rotation but a root of the unit matrix. Under the group of motions generated by C, the point (1, 0)
moves not around the unit circle but along the perimeter of an ellipse. The N-block C-GDT is therefore
called the N-block elliptic FT (EFT). These orthogonal transformations are parameterized; their properties are
described and examples are given.
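As an illustrative sketch (not the paper's exact parameterization), a 2x2 matrix C that is an N-th root of the unit matrix but not a rotation can be built by conjugating a rotation with a diagonal scaling; the orbit of (1, 0) then lies on an ellipse rather than the unit circle:

```python
import numpy as np

# Hypothetical construction: C = S R S^{-1}, with R a rotation by 2*pi/N
# and S a diagonal scaling. Then C^N = I, but C itself is not a rotation.
N = 8
theta = 2 * np.pi / N
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S = np.diag([1.0, 0.5])                # arbitrary axis scaling (assumption)
C = S @ R @ np.linalg.inv(S)

# C is a root of the unit matrix: C^N = I
assert np.allclose(np.linalg.matrix_power(C, N), np.eye(2))

# The orbit of (1, 0) stays on the ellipse x^2 + (y/0.5)^2 = 1
p = np.array([1.0, 0.0])
for _ in range(N):
    p = C @ p
    assert np.isclose(p[0]**2 + (p[1] / 0.5)**2, 1.0)
```

Since C^k = S R^k S^{-1}, the orbit is the image of the unit circle under the scaling S, i.e. an ellipse, matching the "elliptic FT" naming above.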
This paper describes the 2-D reversible integer discrete Fourier transform (RiDFT), which is based on the concept
of the paired representation of the 2-D signal or image. The Fourier transform is split into a minimum set of
short transforms. By means of the paired transform, the 2-D signal is represented as a set of 1-D signals which
carry the spectral information of the signal at disjoint sets of frequency-points. The paired transform-based
2-D DFT involves only a few multiplications, which can be approximated by integer transforms. Such
one-point transforms with one control bit each are applied to calculate the 2-D DFT. 24 real multiplications and
24 control bits are required to perform the 8x8-point RiDFT, and 264 real multiplications and 168 control bits
for the 16 x 16-point 2-D RiDFT of real inputs. The computational complexity of the proposed 2-D RiDFTs is
comparable to that of the fast 2-D DFT.
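The paper's one-point transforms with control bits are not reproduced here, but the underlying idea of a reversible integer transform can be illustrated by the standard lifting factorization of a plane rotation into three integer shear steps, which is exactly invertible despite the rounding:

```python
import math

def lift_rotate(x, y, theta):
    """Reversible integer approximation of a rotation by theta,
    factored into three integer lifting (shear) steps."""
    a = math.tan(theta / 2.0)
    s = math.sin(theta)
    x = x - round(a * y)
    y = y + round(s * x)
    x = x - round(a * y)
    return x, y

def lift_unrotate(x, y, theta):
    """Exact inverse: undo the three shears in reverse order."""
    a = math.tan(theta / 2.0)
    s = math.sin(theta)
    x = x + round(a * y)
    y = y - round(s * x)
    x = x + round(a * y)
    return x, y

# Perfect reconstruction on integer inputs, for any angle
for x0, y0 in [(5, -3), (120, 7), (-9, 44)]:
    x1, y1 = lift_rotate(x0, y0, 2 * math.pi / 16)
    assert lift_unrotate(x1, y1, 2 * math.pi / 16) == (x0, y0)
```

Each step adds a deterministic integer function of the other coordinate, so every step (and hence the whole transform) is invertible on integers; this is the general mechanism that makes integer DFT approximations reversible.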
In this paper, we investigate the use of the Stockwell Transform for image compression. The proposed technique
uses the Discrete Orthogonal Stockwell Transform (DOST), an orthogonal version of the Discrete Stockwell
Transform (DST). These mathematical transforms provide a multiresolution spatial-frequency representation of
a signal or image.
First, we give a brief introduction to the Stockwell transform and the DOST. Then we outline a simple
compression method based on setting the smallest coefficients to zero. In an experiment, we use this compression
strategy on three different transforms: the Fast Fourier transform, the Daubechies wavelet transform and the
DOST. The results show that the DOST outperforms the two other methods.
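The compression strategy described above can be sketched generically: transform, zero all but the largest coefficients, and invert. An orthonormal FFT stands in for the DOST here, since no standard DOST implementation is assumed to be available:

```python
import numpy as np

def keep_largest(coeffs, frac):
    """Zero all but the largest `frac` fraction of coefficients by magnitude."""
    flat = np.abs(coeffs).ravel()
    k = max(1, int(frac * flat.size))
    thresh = np.sort(flat)[-k]
    return np.where(np.abs(coeffs) >= thresh, coeffs, 0)

rng = np.random.default_rng(0)
signal = np.cumsum(rng.standard_normal(256))    # smooth-ish test signal

# Orthonormal FFT as a stand-in for the DOST
spec = np.fft.fft(signal, norm="ortho")
recon = np.fft.ifft(keep_largest(spec, 0.25), norm="ortho").real

# For an orthonormal transform, Parseval's relation means the error energy
# equals the energy of the discarded (smallest) coefficients.
err = np.sum((signal - recon) ** 2)
assert err < np.sum(signal ** 2)
```

The better a transform concentrates the signal's energy into few coefficients, the smaller the error at a given retention fraction, which is the property the experiment above compares across the FFT, wavelet, and DOST bases.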
We present a heuristic solution for B-term approximation using
Tree-Structured Haar (TSH) transforms. Our solution consists of two
main stages: best-basis selection and greedy approximation. In
addition, when approximating the same signal under different B
constraints or error metrics, our solution offers the flexibility
of trading storage space for a lower overall running time.
We adopt a lattice structure to index basis vectors,
so that a single index value fully specifies a basis vector. Based on
the concept of fast computation of the TSH transform by a butterfly
network, we also develop an algorithm for directly deriving the
butterfly parameters and incorporate it into our solution. Results
show that, when the error metric is the normalized ℓ1-norm or the
normalized ℓ2-norm, our solution has approximation quality comparable
to (and sometimes better than) prior data-synopsis algorithms.
In various practical situations of remote sensing image processing it is assumed that the noise is nonstationary and that no a
priori information on the dependence of noise on the local mean, or on local properties of the noise statistics, is available. It is
shown that in such situations it is difficult to find a proper filter for effective image processing, i.e., for noise removal
with simultaneous edge/detail preservation. To deal with such images, a local adaptive filter based on discrete cosine
transform in overlapping blocks is proposed. A threshold is set locally based on a noise standard deviation estimate
obtained for each block. Several other operations to improve performance of the locally adaptive filter are proposed and
studied. The designed filter effectiveness is demonstrated for simulated data as well as for real life radar remote sensing
and marine polarimetric radar images.
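A minimal sketch of the filtering scheme described above, with an overlapping-block DCT, a per-block noise standard-deviation estimate, and hard thresholding (the sigma estimator and threshold factor here are illustrative assumptions, not the paper's exact choices):

```python
import numpy as np
from scipy.fft import dctn, idctn

def denoise_block_dct(img, block=8, step=4, k=2.7):
    """Locally adaptive denoising: hard-threshold the DCT of overlapping
    blocks; the threshold is set per block from a local noise sigma
    estimate (a simple MAD estimate on high-frequency coefficients)."""
    out = np.zeros_like(img, dtype=float)
    weight = np.zeros_like(img, dtype=float)
    H, W = img.shape
    for i in range(0, H - block + 1, step):
        for j in range(0, W - block + 1, step):
            coeffs = dctn(img[i:i+block, j:j+block].astype(float), norm="ortho")
            # crude local sigma from the highest-frequency quadrant
            sigma = 1.4826 * np.median(np.abs(coeffs[block//2:, block//2:]))
            mask = np.abs(coeffs) > k * sigma
            mask[0, 0] = True                      # always keep the DC term
            out[i:i+block, j:j+block] += idctn(coeffs * mask, norm="ortho")
            weight[i:i+block, j:j+block] += 1
    return out / np.maximum(weight, 1)             # average overlapping blocks
```

Averaging the overlapping block reconstructions suppresses the blocking artifacts that non-overlapping block processing would introduce.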
In this paper, a new noise reduction algorithm is proposed. In general, edges (the high-frequency information in
an image) are filtered or suppressed by image smoothing: the noise is attenuated, but the image loses its
sharpness, which makes post-processing harder. The new algorithm performs
connectivity analysis on edge data to ensure that only isolated edge information, which represents noise, gets
filtered out, hence preserving the overall edge structure of the original image. The steps of the new algorithm are
as follows. First, find the edges in the noisy image by multi-resolution analysis. Second, use connectivity
analysis to direct a mean filter to suppress the noise while preserving the edge information. For the first step,
we propose a new algorithm to find edges in a very noisy image, based on the analysis of a
group of multi-resolution images obtained by processing the original noisy image with different Gaussian filters.
When applied to a sequence of images of the same scene but with different signal-to-noise ratios (SNRs), the method
robustly removes noise while keeping the edges. Moreover, statistical analysis shows a useful regularity:
the parameters of the algorithm remain constant across different images at the same SNR.
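The two-step idea (multi-scale edge persistence, then connectivity analysis to discard isolated responses) can be sketched as follows; the scales, percentile, and minimum component size are illustrative assumptions:

```python
import numpy as np
from scipy import ndimage

def robust_edges(img, sigmas=(1.0, 2.0, 4.0), q=90, min_size=10):
    """Keep only edge pixels that persist across several Gaussian scales,
    then drop small isolated connected components (likely noise)."""
    persistent = np.ones(img.shape, dtype=bool)
    for s in sigmas:
        g = ndimage.gaussian_gradient_magnitude(img.astype(float), sigma=s)
        persistent &= g > np.percentile(g, q)      # strong response at this scale
    labels, n = ndimage.label(persistent)          # connectivity analysis
    sizes = ndimage.sum(persistent, labels, range(1, n + 1))
    keep = np.zeros(n + 1, dtype=bool)
    keep[1:] = np.asarray(sizes) >= min_size
    return keep[labels]
```

Noise responses tend to appear at random locations at each scale, so the intersection across scales plus the component-size filter removes them while a true edge survives as one large connected component.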
A fourth-order PDE is proposed as the regularization operator for image restoration in order to alleviate the
"blocky" effects that frequently mar restored images regularized with anisotropic diffusion (a second-order PDE).
This is motivated by its desirable property of evolving toward an image consisting of piecewise planar areas,
which is less blocky and a better approximation to natural images. In order to mitigate the speckle artifacts that
it frequently brings about, the image gradient magnitude is added as an additional variable of the nonlinearity
function that controls its behavior. A numerical implementation method is presented, and simulation results
indicate that the proposed method tends to produce restored images which are smoother in smooth areas and
sharper in feature-rich areas. However, speckle artifacts need to be carefully addressed.
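One explicit iteration of a You-Kaveh-type fourth-order diffusion is sketched below as a point of reference; note that the paper's modification (feeding the gradient magnitude into the nonlinearity c) is not shown, and periodic boundaries via np.roll are an implementation convenience:

```python
import numpy as np

def laplacian(u):
    # 5-point Laplacian with periodic (wrap-around) boundaries
    return (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
            np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)

def fourth_order_step(u, dt=0.05, kappa=10.0):
    """One explicit step of u_t = -Lap( c(|Lap u|) * Lap u ),
    with c(s) = 1 / (1 + (s/kappa)^2)."""
    L = laplacian(u)
    c = 1.0 / (1.0 + (np.abs(L) / kappa) ** 2)
    return u - dt * laplacian(c * L)
```

Because the update is a Laplacian of something, the mean intensity of the image is preserved at every step, and the steady states are images with (piecewise) vanishing Laplacian, i.e. the piecewise planar areas mentioned above.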
Image de-noising is a widely used technology in modern real-world surveillance systems. Without direct
knowledge of the noise model, methods can seldom achieve both de-noising and texture preservation very well.
Most of the neighborhood fusion-based de-noising methods tend to over-smooth the images, which causes
a significant loss of detail. Recently, a new non-local means method has been developed, which is based
on the similarities among the different pixels. This technique results in good preservation of the textures;
however, it also causes some artifacts. In this paper, we utilize the scale-invariant feature transform (SIFT)
[1] method to find the corresponding region between different images, and then reconstruct the de-noised
images by a weighted sum of these corresponding regions. Both hard and soft criteria are chosen in order to
minimize the artifacts. Experiments applied to real unmanned aerial vehicle thermal infrared surveillance
video show that our method is superior to popular methods in the literature.
Polygon meshes are collections of vertices, edges and faces defining surfaces in a 3D environment. Computing
geometric features on a polygon mesh is of major interest for various applications. Among these features, the
geodesic distance is the distance between two vertices following the surface defined by the mesh. In this paper,
we propose an algorithm for fast geodesic distance approximation using mesh decimation and front propagation.
This algorithm is appropriate when fast geodesic distance computation is needed and fine precision
is not required.
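A common baseline for fast approximate geodesics (coarser than front propagation, but in the same spirit) is Dijkstra's algorithm on the mesh edge graph; it overestimates true surface geodesics because paths are restricted to edges:

```python
import heapq
import numpy as np

def edge_geodesic(vertices, edges, src):
    """Approximate geodesic distances from `src` via Dijkstra on mesh edges."""
    adj = {}
    for a, b in edges:
        w = float(np.linalg.norm(np.asarray(vertices[a], float) -
                                 np.asarray(vertices[b], float)))
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, np.inf):
            continue                      # stale heap entry
        for u, w in adj.get(v, []):
            nd = d + w
            if nd < dist.get(u, np.inf):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return dist

# Unit square split into two triangles: distance from corner 0 to corner 2
verts = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
d = edge_geodesic(verts, edges, 0)
assert np.isclose(d[2], np.sqrt(2))       # via the diagonal edge
```

Running such a propagation on a decimated mesh, as the paper proposes, trades precision for a large reduction in the number of vertices visited.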
In this paper, we present a novel approach to image object removal that extends the subpatch texture synthesis
technique into the redundant wavelet transform (RDWT) domain. As an overcomplete wavelet transform, the RDWT
is shift invariant and is obtained without downsampling. Moreover, each RDWT highpass subband exhibits one specific
orientation feature of the image: horizontal, vertical, or diagonal. All of this makes the RDWT ideal for
texture-synthesis-based object removal. In our experiments, subpatch texture synthesis in the RDWT domain is
introduced to remove unwanted objects from digital photographs. Specifically, for each RDWT subband, depending
on the subband orientation, a particular direction subpatch texture synthesis is applied independently.
Experimental results reveal that our simple algorithm performs better than previous methods.
It has long been known that the human visual system (HVS) has a nonlinear response to luminance. This
nonlinearity can be quantified using the concept of just noticeable difference (JND), which represents the minimum
amplitude of a specified test pattern an average observer can discern from a uniform background. The JND
depends on the background luminance following a threshold versus intensity (TVI) function.
It is possible to define a curve which maps physical luminances into a perceptually linearized domain. This
mapping can be used to optimize a digital encoding, by minimizing the visibility of quantization noise. It is also
commonly used in medical applications to display images adapting to the characteristics of the display device.
High dynamic range (HDR) displays, which are beginning to appear on the market, can display luminance
levels outside the range in which most standard mapping curves are defined. In particular, dual-layer LCD
displays are able to extend the gamut of luminance offered by conventional liquid crystals towards the black
region; in such areas suitable and HVS-compliant luminance transformations need to be determined. In this
paper we propose a method, which is primarily targeted to the extension of the DICOM curve used in medical
imaging, but also has a more general application. The method can be modified in order to compensate for the
ambient light, which can be significantly greater than the black level of an HDR display and consequently reduce
the visibility of the details in dark areas.
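The perceptual linearization described above amounts to integrating the reciprocal of the JND as a function of luminance, so that one unit of the output corresponds to one just-noticeable step. A sketch, assuming a simple Weber-law TVI (JND proportional to L) purely for illustration; the DICOM GSDF uses the Barten model instead:

```python
import numpy as np

def perceptual_curve(l_min, l_max, jnd=lambda L: 0.01 * L, n=1024):
    """Map luminance to a perceptually linear index by integrating 1/JND(L)
    (trapezoid rule on a geometric luminance grid)."""
    L = np.geomspace(l_min, l_max, n)
    f = 1.0 / jnd(L)
    p = np.concatenate(([0.0],
                        np.cumsum((f[1:] + f[:-1]) / 2 * np.diff(L))))
    return L, p

L, p = perceptual_curve(0.1, 1000.0)
# Under Weber's law the curve is logarithmic: p(L) = 100 * ln(L / l_min)
assert np.allclose(p[-1], 100 * np.log(1000.0 / 0.1), rtol=1e-3)
```

Quantizing uniformly in p rather than in L equalizes the visibility of quantization steps, which is exactly the property needed when extending the curve to the darker luminances of dual-layer HDR displays.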
We present techniques for the processing of color, high-dynamic luminance images of a type aiming for objectivity
and also of a type aiming for aesthetic improvement. In the first case we start with camera raw data, propose
a variant white balance, darken very light spots and lighten very dark spots. In the second case we use color
spaces of the hue-saturation-luminance type; we propose a hue processing method inspired by the Bezold-Brücke
effect as well as a luminance-dependent displacement of color saturation.
Region-based active contours (ACs) are a variational framework for image segmentation. The framework involves
estimating the probability distributions of observed features within each image region. These so-called region
descriptors are then used to generate forces that move the contour toward real image boundaries. In this paper, region
descriptors are computed from samples within windows centered on contour pixels; we call them local
region descriptors (LRDs). With these descriptors we introduce an equation for contour motion with two terms:
growing and competing. This equation yields a novel type of AC that can adjust the behavior of contour pieces to
image patches and to the presence of other contours. The quality of the proposed motion model is demonstrated
on complex images.
In this paper we present a novel fast method for the non-rigid registration of a few X-ray projections with
CT data. The method involves non-parametric non-rigid registration techniques for the difficult 2D-3D case,
combined with knowledge of probable deformations modeled as active shape models (ASMs). ASMs allow us
to cope with as few as two projections by regularizing the registration process. The model is learned from
deformations observed during respiration in a 4D-CT. This method can be applied in motion compensated
radiation therapy to eliminate the need for fiducial implantation. We designed a fast C++ implementation for
our method in order to make it practicable. Our tests on real 4D-CT data achieved registration times of 2-4
minutes using a desktop PC.
Bayer patterns, in which a single value of red, green or blue is available for each pixel, are widely used in digital color
cameras. The reconstruction of the full color image is often referred to as demosaicking. This paper introduces a new
approach: morphological demosaicking. The approach is based on strong edge-directionality selection and interpolation,
followed by morphological operations that refine the edge-directionality selection and reduce color aliasing. Finally,
a performance evaluation and examples of color-artifact reduction are shown.
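The directional-selection step can be sketched for the green channel of an RGGB mosaic: at each non-green site, interpolate along the direction with the smaller gradient. This is only the first stage; the paper's morphological refinement is not shown:

```python
import numpy as np

def interp_green(bayer):
    """Edge-directed green-channel interpolation for an RGGB Bayer mosaic
    (illustrative sketch). Green samples sit where (i + j) is odd."""
    H, W = bayer.shape
    g = bayer.astype(float).copy()
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            if (i + j) % 2 == 0:                       # red or blue site
                dh = abs(g[i, j-1] - g[i, j+1])        # horizontal gradient
                dv = abs(g[i-1, j] - g[i+1, j])        # vertical gradient
                if dh < dv:
                    g[i, j] = (bayer[i, j-1] + bayer[i, j+1]) / 2
                else:
                    g[i, j] = (bayer[i-1, j] + bayer[i+1, j]) / 2
    return g

# Demo: a green channel varying only with the row is recovered exactly in
# the interior, since interpolation along rows is always selected.
i = np.arange(16)[:, None]
j = np.arange(16)[None, :]
G = np.tile(10.0 * np.arange(16)[:, None], (1, 16))
bayer = np.where((i + j) % 2 == 1, G, 0.0)
g = interp_green(bayer)
assert np.allclose(g[1:-1, 1:-1], G[1:-1, 1:-1])
```

Interpolating along an edge rather than across it is what avoids the zipper artifacts that plain bilinear demosaicking produces.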
Interpolation is a key ingredient in many imaging routines. In this note, we present a thorough evaluation of an
interpolation method based on exponential splines in tension. These splines depend on so-called tension parameters,
which allow their properties to be tuned. As it turns out, these interpolants have many attractive features
which, however, are not borne out in the literature; we intend to close this gap. We present, for the first time,
an analytic representation of their kernel, which enables a space- and frequency-domain
analysis. It is shown that the exponential splines in tension, as a function of the tension parameter, bridge
the gap between linear and cubic B-spline interpolation. For example, with a certain tension parameter one is
able to suppress ringing artifacts in the interpolant. On the other hand, the frequency-domain analysis
shows that one obtains a signal reconstruction quality similar to that of cubic B-spline interpolation,
which, however, suffers from ringing artifacts. Given this ability to trade off opposing features of
interpolation methods, we advocate the use of the exponential spline in tension from a practical point of view,
and we use the new kernel representation to quantify the trade-off.
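For reference, the standard defining property of splines in tension (which underlies the limiting behavior described above; the paper's exact parameterization may differ) is that on each knot interval the interpolant s with tension parameter p satisfies

```latex
% On each knot interval the exponential spline in tension solves
\[
  s''''(x) - p^{2}\, s''(x) = 0,
  \qquad
  s \in \operatorname{span}\{\,1,\; x,\; e^{p x},\; e^{-p x}\,\},
\]
% with the two limiting regimes:
%   p -> 0      : the equation becomes s'''' = 0, i.e. piecewise cubics
%                 (the cubic B-spline case);
%   p -> infty  : s'' is forced toward 0 between knots, i.e. the
%                 interpolant becomes piecewise linear.
```

so a single scalar p interpolates continuously between the cubic and linear regimes, which is the bridge the text refers to.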
Modern automated microscopic imaging techniques such as high-content screening (HCS), high-throughput
screening, 4D imaging, and multispectral imaging are capable of producing hundreds to thousands of images
per experiment. For quick retrieval, fast transmission, and storage economy, these images should be saved in
a compressed format. A considerable number of techniques based on interband and intraband redundancies of
multispectral images have been proposed in the literature for the compression of multispectral and 3D temporal
data. However, these works have been carried out mostly in the fields of remote sensing and video processing.
Compression for multispectral optical microscopy imaging, with its own set of specialized requirements, has
remained under-investigated. Digital-photography-oriented 2D compression techniques like JPEG (ISO/IEC
IS 10918-1) and JPEG2000 (ISO/IEC 15444-1) are generally adopted for multispectral images; they optimize
visual quality but do not necessarily preserve the integrity of scientific data, not to mention the suboptimal
performance of 2D compression techniques in compressing 3D images.
Herein we report our work on a new low bit-rate wavelet-based compression scheme for multispectral fluorescence
biological imaging. The sparsity of significant coefficients in the high-frequency subbands of multispectral
microscopic images is found to be much greater than in natural images; therefore a quad-tree concept such as
Said et al.'s SPIHT [1], along with the correlation of insignificant wavelet coefficients, is proposed to further
exploit the redundancy of the high-frequency subbands. Our work proposes a 3D extension to SPIHT, incorporating a
new hierarchical inter- and intra-spectral relationship amongst the coefficients of the 3D wavelet-decomposed image.
The new relationship, apart from adopting the parent-child relationship of classical SPIHT, also brings forth
a conditional "sibling" relationship by relating only the insignificant wavelet coefficients of subbands at the
same level of decomposition. The insignificant quadtrees in different subbands of the high-frequency subband
class are coded by a combined function to reduce redundancy. A number of experiments conducted on microscopic
multispectral images have shown promising results for the proposed method over current state-of-the-art
image-compression techniques.
A method is presented to measure the intensity of the blocking artefact in compressed pictures or video frames.
First, a way is devised to artificially introduce pure blocking, which closely resembles the real one subsequent
to JPEG compression. Then a modified no-reference measurement is proposed that requires fewer computations
than other previously presented methods, makes it possible to take the whole image or frame area into account, and is
not affected by interlaced video. First experiments indicate that the measured values relate closely to the
introduced blockiness effect. The robustness of the metric to the influence of other typical JPEG artefacts is
also checked. Further, the effect on blockiness of some enhancement strategies is measured. Pictures enhanced
with methods introducing the most severe blockiness are found to have the highest value of the proposed metric.
Finally the problem of blockiness measurement in video sequences is addressed. In this case the blocking grid is
no longer regular. In fact, blocks of different size could be used in encoding, and single blocks could be shifted
in referenced (P and B) frames due to motion compensation. A method is devised for grid detection.
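A minimal no-reference blockiness measure in the spirit of the one described above (illustrative, not the paper's exact metric) compares the mean luminance jump across the 8-pixel block grid with the mean jump elsewhere:

```python
import numpy as np

def blockiness(img, period=8):
    """Ratio of mean absolute horizontal jumps across the block grid to
    the mean jumps at all other column positions (1.0 = no blockiness)."""
    img = img.astype(float)
    diffs = np.abs(np.diff(img, axis=1))            # neighbor jumps
    cols = np.arange(diffs.shape[1])
    on_grid = (cols % period) == period - 1         # block-boundary positions
    eps = 1e-12
    return (diffs[:, on_grid].mean() + eps) / (diffs[:, ~on_grid].mean() + eps)

# A smooth ramp scores ~1; quantizing each 8x8 block to its mean makes all
# jumps land on the grid, so the score explodes.
img = np.add.outer(np.arange(64.0), np.arange(64.0))
means = img.reshape(8, 8, 8, 8).mean(axis=(1, 3))
blocked = np.kron(means, np.ones((8, 8)))
assert np.isclose(blockiness(img), 1.0)
assert blockiness(blocked) > 10
```

Handling the video case mentioned above would additionally require detecting the grid period and offset, since motion compensation shifts block boundaries in P and B frames.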
Multivariate analysis seeks to describe the relationships between an arbitrary number of variables. To explore
high-dimensional data sets, projections are often used for data visualisation, to aid in discovering structure or patterns that lead to
the formation of statistical hypotheses. The basic concept necessitates a systematic search for lower-dimensional
representations of the data that might show interesting structure(s). Motivated by recent research on the Image Grand
Tour (IGT), which can be adapted to view guided projections by using objective indexes that are capable of revealing
latent structures of the data, this paper presents a signal processing perspective on constructing such indexes under the
unifying exploratory frameworks of Independent Component Analysis (ICA) and Projection Pursuit (PP). Our
investigation begins with an overview of dimension reduction techniques by means of orthogonal transforms, including
the classical procedure of Principal Component Analysis (PCA), and extends to an application of the more powerful
techniques of ICA in the context of our recent work on non-destructive testing technology by element specific x-ray
imaging.
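The classical PCA step mentioned above, the usual starting point before ICA or projection pursuit, reduces to an SVD of the centered data matrix:

```python
import numpy as np

def pca_project(X, k):
    """Classical PCA via SVD: project the n x d data matrix onto the k
    orthogonal directions of maximal variance."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, Vt[:k]

rng = np.random.default_rng(1)
# 2-D latent structure embedded in 10 dimensions with small noise
latent = rng.standard_normal((500, 2)) * [5.0, 2.0]
X = latent @ rng.standard_normal((2, 10)) + 0.01 * rng.standard_normal((500, 10))
Y, components = pca_project(X, 2)

# The two leading components capture nearly all of the variance
explained = (Y ** 2).sum() / ((X - X.mean(0)) ** 2).sum()
assert explained > 0.99
```

PCA finds directions of maximal variance, whereas the ICA and projection-pursuit indexes discussed above search for projections that are maximally non-Gaussian, which is what reveals latent structure variance alone can miss.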
To register three or more images together, current approaches involve registering them two at a time. This pairwise approach can lead to registration inconsistencies. It can also result in diminished accuracy because only a fraction of the total data is being used at any given time. We propose a registration method that simultaneously registers the entire ensemble of images. This ensemble registration of multi-sensor datasets is done using clustering in the joint intensity space. Experiments demonstrate that the ensemble registration method overcomes serious issues that hinder pairwise multi-sensor registration methods.
In this contribution, the robustness against Chi-square attacks of a novel steganographic scheme based on the
generalized Fibonacci sequence is investigated. In essence, an image is first represented in a basis defined by a
generalized Fibonacci sequence. The secret data are then inserted by a substitution technique into selected bit
planes, preserving the first-order distributions; finally, the inverse Fibonacci decomposition is applied to
obtain the stego-image. The secret data are scrambled before embedding to improve the security of the whole
system. In order to perform the Chi-square attacks, knowledge of both parameters determining the binary
Fibonacci representation of an image is assumed. Experimental results show that no visual impairments are
introduced and that the probability of detecting the presence of hidden data is small, although a modest capacity
loss is present.
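The core decomposition can be illustrated with the classical (non-generalized) Fibonacci sequence: the greedy Zeckendorf expansion writes each pixel value as a sum of distinct Fibonacci numbers, yielding the bit planes into which secret bits are substituted:

```python
def fib_digits(n, nbits=12):
    """Zeckendorf representation: greedily write n as a sum of distinct
    Fibonacci numbers; the digit vector has no two adjacent 1s.
    (The paper uses a *generalized* Fibonacci sequence; the classical
    one is shown here for illustration.)"""
    fibs = [1, 2]
    while len(fibs) < nbits:
        fibs.append(fibs[-1] + fibs[-2])
    digits = []
    for f in reversed(fibs):
        if f <= n:
            digits.append(1)
            n -= f
        else:
            digits.append(0)
    return digits[::-1]               # least-significant digit first

def fib_value(digits):
    fibs = [1, 2]
    while len(fibs) < len(digits):
        fibs.append(fibs[-1] + fibs[-2])
    return sum(d * f for d, f in zip(digits, fibs))

# Round trip for every 8-bit pixel value, plus the no-adjacent-1s property
for v in range(256):
    d = fib_digits(v)
    assert fib_value(d) == v
    assert all(not (d[i] and d[i + 1]) for i in range(len(d) - 1))
```

Because 8-bit values spread over 12 Fibonacci digits, flipping one middle plane changes the pixel by less than flipping the corresponding binary bit plane would, which is what keeps the embedding visually imperceptible.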
In this paper, we focus on an effective representation of the image, called the paired representation,
which reduces the image to a set of independent 1-D signals and splits the 2-D DFT into a minimal number of
1-D DFTs. The paired transform is a frequency-and-time representation of the image. Splitting-signals carry
the spectral information of the image in disjoint subsets of frequencies, which allows for enhancing the image by
processing the splitting-signals separately, and for changing the resolution of the periodic structures composing
the image. We present a new effective formula for the inverse 2-D paired transform, which can be used for solving
the algebraic system of equations with measurement data for image reconstruction without using the Fourier
transform technique. The image is reconstructed directly from the splitting-signals, which can be calculated
from projection data. The same inverse formula can be used for image enhancement, such as the known method
of α-rooting. A new concept of direction images is introduced, which defines the decomposition of the image by directions.
Eye blink detection is an important problem in computer vision, with applications such as face liveness
detection and driver fatigue analysis. Existing eye blink detection methods can be roughly divided into two
categories: contour-template-based and appearance-based methods. The former can usually extract eye contours
accurately, but separate templates must be maintained for closed and open eyes, and these methods are
also sensitive to illumination changes. In appearance-based methods, image patches of open and closed eyes are
collected as positive and negative samples to train a classifier, but eye contours cannot be accurately extracted. To
overcome the drawbacks of the existing methods, this paper proposes an effective eye blink detection method based on an
improved eye contour extraction technique. In our method, the eye contour model is represented by 16 landmarks, so
it can describe both open and closed eyes. Each landmark is accurately located by a fast classifier trained on
the appearance around that landmark. Experiments have been conducted on the YALE database and another large data set
of frontal face images to extract the eye contour. The experimental results show that the proposed method affords
accurate eye location, is robust for closed eyes, and performs well under illumination variations. The average
time cost of our method is about 140 ms on a Pentium IV 2.8 GHz PC with 1 GB RAM, which satisfies the
real-time requirement for face video sequences. The method has also been applied in a face liveness detection system,
with promising results.
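As a hedged sketch of the landmark idea (not the authors' 16-landmark model or classifiers), openness can be scored from the contour's vertical-to-horizontal extent, and a blink flagged when the score drops below a threshold; the landmark coordinates and the threshold below are illustrative assumptions.

```python
# Simplified sketch (not the paper's exact method): estimate eye openness
# from contour landmarks as the ratio of vertical to horizontal extent.

def eye_openness(landmarks):
    """landmarks: list of (x, y) points along the eye contour."""
    xs = [p[0] for p in landmarks]
    ys = [p[1] for p in landmarks]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return height / width if width > 0 else 0.0

def detect_blink(openness_sequence, threshold=0.15):
    """Return frame indices where openness falls below the threshold."""
    return [i for i, o in enumerate(openness_sequence) if o < threshold]

# Example: an open eye (tall contour) vs. a nearly closed one (flat contour).
open_eye = [(0, 5), (4, 9), (8, 10), (12, 9), (16, 5), (8, 0)]
closed_eye = [(0, 5), (4, 6), (8, 6), (12, 6), (16, 5), (8, 4)]
print(detect_blink([eye_openness(open_eye), eye_openness(closed_eye)]))  # [1]
```

In the real method each of the 16 landmarks would come from its own appearance classifier; here they are given directly.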
We present a drift-correcting template update strategy for precisely tracking a feature point in 2D image sequences.
The proposed strategy greatly extends Matthews et al.'s template tracking strategy [I. Matthews, T. Ishikawa
and S. Baker, The template update problem, IEEE Trans. PAMI 26 (2004) 810-815] by incorporating a robust non-rigid
image registration step used in medical imaging. Matthews et al.'s strategy uses the first template to correct drift in the
current template; however, drift still builds up if the first template becomes quite different from the current one
as tracking continues. In our strategy the first template is updated in a timely manner when it differs substantially from
the current one, so the updated first template can be used to correct template drift in subsequent frames. The method
based on the proposed strategy yields sub-pixel tracking accuracy as measured by the commercial software
REALVIZ(R) MatchMover(R) Pro 4.0. Our method runs fast on a desktop PC (3.0 GHz Pentium(R) IV CPU, 1 GB RAM,
Windows(R) XP Professional, Microsoft Visual C++ 6.0(R)), taking about 0.03 seconds on
average to track the feature point in a frame (under a general affine transformation model with a 61×61-pixel
template) and, when required, less than 0.1 seconds to update the first template. We also propose an
architecture for implementing our strategy in parallel.
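The update logic can be sketched in miniature (an assumption-laden illustration, not the authors' registration-based algorithm): adopt each freshly tracked patch as the current template, and refresh the reference ("first") template only when its normalized cross-correlation with the current one drops below a threshold; the 0.8 threshold is illustrative.

```python
# Minimal drift-aware template update sketch: keep a reference template,
# refresh it only when it no longer resembles the current appearance.
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length patch vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0

def update_templates(first, observed, threshold=0.8):
    """Return (first, current) after seeing a newly tracked patch."""
    current = observed                   # adopt the freshly tracked patch
    if ncc(first, current) < threshold:  # reference no longer representative
        first = current                  # refresh the reference template
    return first, current

first = [10, 20, 30, 40]
first, cur = update_templates(first, [12, 22, 28, 38])
print(first == [10, 20, 30, 40])   # similar patch: reference kept
first2, _ = update_templates(first, [40, 30, 20, 10])
print(first2 == [40, 30, 20, 10])  # dissimilar patch: reference refreshed
```

The paper instead uses robust non-rigid registration to decide and perform this refresh; the threshold rule stands in for that step.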
Edge detection is an important image processing task that has been used extensively in object detection and
recognition. Over the years, many edge detection algorithms have been established, most of them largely based
on linear convolution operations. In such methods, smaller kernel sizes have generally been used to extract fine
edge detail, but they suffer from low noise tolerance. Higher-dimension kernels are known to benefit edge
detection, as they generate coarser-scale edges; this suppresses noise and
proves particularly important for detection and recognition systems. This paper presents a generalized set of
kernels for edge and line detection which are orthogonal to each other, yielding n×n kernels for any odd dimension n.
Some of the kernels can also be generalized to form m×n rectangular kernels. In doing so, the approach unifies small-
and large-kernel approaches in order to reap the benefits of both. The Frei and Chen orthogonal kernel set is seen to be a
single instance of this new generalization. Experimental results show that the new generalized set of kernels can
improve edge detection results by combining the usefulness of both lower- and higher-dimension kernels.
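One common way to scale a derivative kernel to any odd size n (shown here as background; it is not necessarily the paper's orthogonal construction) is the outer product of a binomial smoothing vector and a central-difference vector, which for n = 3 recovers the Sobel kernel:

```python
# Build n x n horizontal-derivative kernels as outer products of a binomial
# smoothing vector and an antisymmetric difference vector (n = 3 -> Sobel).

def binomial(n):
    """Row n-1 of Pascal's triangle, e.g. n=3 -> [1, 2, 1]."""
    row = [1]
    for _ in range(n - 1):
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row

def deriv(n):
    """Antisymmetric derivative vector, e.g. n=3 -> [-1, 0, 1]."""
    half = n // 2
    return [i - half for i in range(n)]

def edge_kernel(n):
    """n x n derivative kernel: outer product of smoothing and derivative."""
    s, d = binomial(n), deriv(n)
    return [[si * dj for dj in d] for si in s]

print(edge_kernel(3))  # [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
```

Larger n trades fine edge detail for noise suppression, which is the small-versus-large-kernel trade-off the paper unifies.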
This paper proposes a tensor-decomposition-based method that can recognize an unknown person's action
from a video sequence, where the unknown person is not included in the database (tensor) used for the recognition. The
tensor consists of persons, actions and time-series image features. For the observed unknown person's action, one of the
actions stored in the tensor is assumed. Using the motion signature obtained from the assumption, the unknown person's
actions are synthesized, and the actions of one of the persons in the tensor are replaced by the synthesized actions. Then the
core tensor for the replaced tensor is computed. This process is repeated over the actions and persons, and for each
iteration the difference between the replaced and original core tensors is computed. The assumption that gives the minimal
difference is the action recognition result. As the time-series image feature stored in the tensor and extracted
from the observed video sequence, a feature based on the contour shape of the human body silhouette is used. To show the
validity of our proposed method, it is experimentally compared with the Nearest Neighbor rule and a Principal
Component Analysis based method. Experiments on seven kinds of actions performed by 33 persons show that our proposed
method achieves better recognition accuracies than the other methods.
In this paper, we propose a novel ellipse detection method based on a modified
RANSAC, with automatic sampling guidance from the edge orientation difference curve. The Hough
Transform family is one of the most popular classes of methods for shape detection, but the Standard
Hough Transform loses its computational efficiency as the dimension of the parameter space gets
high. The Randomized Hough Transform, an improved version of the Standard Hough Transform, has
difficulty detecting shapes in complicated, cluttered scenes because of its random sampling
process. As a pre-process for the random selection of the five pixels used to build the ellipse
equation, we propose a two-step algorithm: (1) region segmentation and contour detection by
the mean shift algorithm; (2) contour splitting based on the edge orientation difference curve obtained
from the contour of each region. In each contour segment obtained by step (2), five pixels are
randomly selected and the modified RANSAC is applied to them so that an accurate ellipse
model is obtained. Experimental results show that the proposed method achieves high
accuracy and low computational cost in detecting multiple ellipses in an image.
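The core five-point RANSAC loop can be sketched as follows (a hedged illustration: the paper's contour-guided sampling and specific RANSAC modification are not reproduced, and the tolerance and iteration count are illustrative). Five points determine a conic a·x² + b·xy + c·y² + d·x + e·y = 1, and the model explaining the most points wins.

```python
# RANSAC-style conic fitting from five sampled points.
import math
import random

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_conic(pts):
    """Fit a*x^2 + b*xy + c*y^2 + d*x + e*y = 1 through five points."""
    A = [[x * x, x * y, y * y, x, y] for x, y in pts]
    return solve(A, [1.0] * 5)

def algebraic_error(coef, p):
    a, b, c, d, e = coef
    x, y = p
    return abs(a * x * x + b * x * y + c * y * y + d * x + e * y - 1.0)

def ransac_ellipse(points, iters=50, tol=1e-6):
    best, best_inliers = None, -1
    for _ in range(iters):
        try:
            coef = fit_conic(random.sample(points, 5))
        except ZeroDivisionError:   # degenerate sample: skip it
            continue
        inliers = sum(1 for p in points if algebraic_error(coef, p) < tol)
        if inliers > best_inliers:
            best, best_inliers = coef, inliers
    return best, best_inliers

random.seed(1)
pts = [(2 * math.cos(0.1 * k), math.sin(0.1 * k)) for k in range(20)]
pts.append((5.0, 5.0))  # outlier
coef, inliers = ransac_ellipse(pts)
print(inliers)  # the 20 points on the ellipse x^2/4 + y^2 = 1 are explained
```

The paper's contribution is precisely to replace the blind `random.sample` above with sampling restricted to a single contour segment, which is why its accuracy and cost improve.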
In this study, an iterative maximum a posteriori (MAP) approach using a Bayesian model of a Markov
random field (MRF) is proposed for image restoration, to reduce or remove the noise resulting from
imperfect sensing. The image process is assumed to combine the random fields associated with the observed
intensity process and the image texture process, respectively. The objective measure for determining the
optimal restoration of this "double compound stochastic" image process is based on Bayes' theorem, and
the MAP estimation employs Point-Jacobian iteration to obtain the optimal solution. In the proposed
algorithm, the MRF is used to quantify the spatial interaction probabilistically, that is, to provide a type of
prior information on the image texture, and a neighbor window of any size is defined for contextual
information on a local region. However, a window of a given size would use wrong
information from adjacent regions with different characteristics at pixels close to or
on a boundary. To overcome this problem, the new method is designed to use less information from
more distant neighbors as the pixel gets closer to the boundary, reducing the chance of involving the
pixel values of an adjacent region with different characteristics. The proximity to the boundary is estimated
using a non-uniformity measurement based on edge value, standard deviation, entropy, and the 4th
moment of the intensity distribution. This study evaluated the new scheme using simulated data, and the
experimental results show a considerable improvement in image restoration.
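A toy one-dimensional version of the Point-Jacobian MAP iteration illustrates the mechanics (the paper's boundary-aware adaptive windowing is omitted): with a Gaussian noise model and a Gaussian MRF smoothness prior, each Jacobi sweep replaces a pixel by a weighted average of its observation and its current neighbors. The data-fidelity weight `lam` and the sweep count are illustrative assumptions.

```python
# Point-Jacobi MAP restoration sketch: each sweep solves, per pixel,
# (lam + deg) * x_i = lam * y_i + sum(neighbors), i.e. a compromise
# between the observed value y_i and the MRF smoothness prior.

def map_restore(y, lam=1.0, sweeps=50):
    x = list(y)
    for _ in range(sweeps):
        new = []
        for i in range(len(x)):
            nbrs = [x[j] for j in (i - 1, i + 1) if 0 <= j < len(x)]
            new.append((lam * y[i] + sum(nbrs)) / (lam + len(nbrs)))
        x = new
    return x

noisy = [10.0, 10.0, 30.0, 10.0, 10.0]   # a single impulsive spike
restored = map_restore(noisy)
print(restored[2] < 20.0)  # the spike is pulled toward its neighbors
```

In the paper, the neighborhood weights additionally shrink with distance near detected region boundaries, so that the averaging does not mix pixels from regions with different statistics.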
In this paper, we introduce the Vector Median M-type L (VMML) filter to remove impulsive and Gaussian noise from
color images and color video sequences. This filter utilizes a vector approach and the Median M-type (MM) estimator
with different influence functions in the filtering scheme of the L-filter. We also introduce the use of impulsive noise
detectors to improve noise suppression and detail preservation in the proposed filtering scheme for both
low and high densities of impulsive noise. To demonstrate the performance of the proposed filtering scheme in
real applications, we applied it to the filtering of SAR images, which naturally contain speckle noise. Simulation results
indicate that the proposed filter consistently outperforms other color image filters by balancing the tradeoff between
noise suppression, detail preservation, and color retention.
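For background, the plain vector median that underlies such filters can be sketched as follows (the VMML filter's M-estimator weighting and impulsive-noise detectors are beyond this illustration): the output pixel is the window sample minimizing the sum of Euclidean distances to all others, so the filter never introduces a color absent from the input.

```python
# Vector median of a color window: pick the sample whose total distance
# to all other samples is smallest (always one of the original vectors).
import math

def vector_median(window):
    """window: list of (r, g, b) tuples; returns one of the input vectors."""
    def total_distance(p):
        return sum(math.dist(p, q) for q in window)
    return min(window, key=total_distance)

window = [(255, 0, 0), (250, 5, 3), (248, 2, 1), (0, 255, 0), (252, 4, 2)]
print(vector_median(window))  # an original reddish pixel, not a blend
```

The green impulse is rejected because its total distance to the reddish cluster is large; component-wise (marginal) median filtering could instead output a color not present in the window.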
In this paper we propose to evaluate both the robustness and the security of digital image watermarking techniques by
considering the perceptual quality of un-marked images in terms of Weighted PSNR. The proposed tool is based on
genetic algorithms and is suitable for researchers evaluating the robustness of developed watermarking
methods. Given a combination of selected attacks, the proposed framework looks for a fine parameterization of them
that keeps the perceptual quality of the un-marked image below a given threshold. Correspondingly, a novel metric for
robustness assessment is introduced. This tool also proves useful in scenarios where an
attacker tries to remove the watermark to circumvent copyright protection. Security assessment is provided by a stochastic
search for the minimum degradation that must be introduced to obtain an un-marked version of the image as
close as possible to the given one. Experimental results show the effectiveness of the proposed approach.
In this paper a novel technique for rotation-independent template matching via quadtree Zernike decomposition
is presented. Both the template and the target image are decomposed using a complex polynomial basis.
The template is analyzed in a block-based manner using a quadtree decomposition, which allows the system to
better identify the object features.
Searching for a complex pattern in a large multimedia database is based on a sequential procedure that
verifies whether the candidate image contains each square of the ranked quadtree list, refining the location
and orientation estimates step by step.
In this contribution, a novel method for distributed video coding of stereo sequences is proposed. The system
encodes the left and right frames of the stereoscopic sequence independently. The decoder exploits the side
information to achieve the best reconstruction of the correlated video streams. In particular, a syndrome coder
based on a lifted Tree-Structured Haar wavelet scheme has been adopted. The experimental results
show the effectiveness of the proposed scheme.
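As background on the lifting idea (the paper's tree-structured variant is more general), a minimal lifted Haar transform predicts each odd sample from its even neighbor and then updates the even sample, giving an invertible integer-to-integer transform:

```python
# Lifting-based Haar wavelet: predict (detail) and update (approximation)
# steps with integer arithmetic; exactly invertible.

def haar_lift(signal):
    evens, odds = signal[0::2], signal[1::2]
    detail = [o - e for e, o in zip(evens, odds)]         # predict step
    approx = [e + d // 2 for e, d in zip(evens, detail)]  # update step
    return approx, detail

def haar_unlift(approx, detail):
    evens = [a - d // 2 for a, d in zip(approx, detail)]
    odds = [e + d for e, d in zip(evens, detail)]
    out = []
    for e, o in zip(evens, odds):
        out += [e, o]
    return out

x = [5, 7, 2, 8, 9, 3]
a, d = haar_lift(x)
print(haar_unlift(a, d) == x)  # True: perfect reconstruction
```

Invertibility is what makes the scheme usable inside a syndrome (Wyner-Ziv style) coder: the decoder can reverse the transform exactly once the syndromes are resolved.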
Image segmentation is an important and difficult computer vision problem. Hyper-spectral images pose even more
difficulty due to their high-dimensionality. Spectral clustering (SC) is a recently popular clustering/segmentation
algorithm. In general, SC lifts the data to a high-dimensional space (the kernel trick), derives
eigenvectors in this new space, and finally partitions the data into clusters using these new dimensions. We demonstrate
that SC works efficiently when combined with covariance descriptors, which can be used to assess pixelwise similarities,
rather than operating in the high-dimensional Euclidean space. We present the formulation and some preliminary results
of the proposed hybrid image segmentation method for hyper-spectral images.
Previously we have shown that error diffusion neural networks (EDNs) find local minima of frequency-weighted error
between a binary halftone output and corresponding smoothly varying input, an ideal framework for solving halftone
problems. An extension of our work to color halftoning employs a three dimensional (3D) interconnect scheme. We cast
color halftoning as four related sub-problems: the first three are to compute good binary halftones for each primary color
and the fourth is to simultaneously minimize frequency-weighted error in the luminosity of the composite result. We
have shown that an EDN with a 3D interconnect scheme can solve all four problems in parallel. This paper shows that
our 3D EDN algorithm not only shapes the error to frequencies to which the Human Visual System (HVS) is least
sensitive but also shapes the error in colors to which the HVS is least sensitive. The correlation among the color planes
by luminosity reduces the formation of high contrast pixels, such as black and white pixels that often constitute color
noise, resulting in a smoother and more homogeneous appearance in a halftone image and a closer resemblance to the
continuous tone image. The texture visibility of color halftone patterns is evaluated in two ways: (1) by computing the
radially averaged power spectrum (2) by computing the visual cost function.
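The EDN framework generalizes classical error diffusion; as background, a one-dimensional error-diffusion halftoner thresholds each sample and pushes the quantization error onto the next sample, preserving the local average tone (the 3D color-plane coupling of the paper is not shown):

```python
# 1D error diffusion: binary output whose running average tracks the input.

def error_diffuse(row, threshold=0.5):
    out, err = [], 0.0
    for v in row:
        v = v + err
        bit = 1 if v >= threshold else 0
        err = v - bit          # quantization error carried forward
        out.append(bit)
    return out

row = [0.3] * 10
bits = error_diffuse(row)
print(sum(bits) / len(bits))  # close to the input tone of 0.3
```

Because the carried error is bounded, the number of "on" pixels matches the total input tone; the EDN achieves an analogous balance by minimizing frequency-weighted error across the three color planes and the luminosity plane simultaneously.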
In this study, the radial-basis-function-based SG algorithm (SGRBF) is applied to the evolution of level sets in image
segmentation. The implementation of the level set method in image processing often involves solving partial differential
equations (PDEs). An implicit finite-difference scheme is the prevalent method for solving the PDE that drives the
evolution of level sets. Instead of finite differences, SGRBF is used in our study for evolving level sets. SGRBF is
a mathematical framework developed for function approximation using Gaussian RBFs, in which the number and
centers of the basis functions are determined in a systematic and mathematically sound way using a purely algebraic
approach. The numerical results show that, besides yielding a continuous representation of both the implicit function and
its level sets, the algorithm introduced here can reduce the computational cost by selecting the most contributive centers
for the radial basis functions.
Parallel processing promises scalable and effective computing power which can handle the complex data structures of
knowledge representation languages efficiently. Past and present sequential architectures, despite the rapid advances in
computing technology, have yet to provide such processing power and to offer a holistic solution to the problem. This
paper presents a fresh attempt in formulating alternative techniques for grammar learning, based upon the parallel and
distributed model of connectionism, to facilitate the more cognitively demanding task of pattern understanding. The
proposed method has been compared with the contemporary approach of shape modelling based on level sets, and has
demonstrated its potential as a prototype for constructing robust networks on high-performance parallel platforms.
Lattice associative memories, also known as morphological associative memories, are fully connected feedforward
neural networks with no hidden layers, whose computation at each node is carried out with lattice algebra
operations. These networks are a relatively recent development in the field of associative memories that has
proven to be an alternative way to work with sets of pattern pairs for which the storage and retrieval stages use
minimax algebra. Different associative memory models have been proposed to cope with the problem of pattern
recall under input degradations, such as occlusions or random noise, where input patterns can be composed
of binary or real valued entries. In comparison to these and other artificial neural network memories, lattice
algebra based memories display better performance for storage and recall capability; however, the computational
techniques devised to achieve that purpose require additional processing or provide partial success when inputs
are presented with undetermined noise levels. Robust retrieval capability of an associative memory model is
usually expressed by a high percentage of perfect recalls from non-perfect input. The procedure described here
uses noise masking defined by simple lattice operations together with appropriate metrics, such as the normalized
mean squared error or signal to noise ratio, to boost the recall performance of either the min or max lattice auto-associative
memories. Using a single lattice associative memory, illustrative examples are given that demonstrate
the enhanced retrieval of correct gray-scale image associations from inputs corrupted with random noise.
Neural Networks Application in Image Processing II
Based on our research over the last 17 years (with 68 papers published) on artificial neural networks
studied from the point of view of N-dimensional geometry, a novel neural network system, the dynamic neural
network, is proposed here for detecting an unknown moving (or time-varying) object, such that the object is detected not
only from its static images, but also from the way it moves if the object follows a constant moving pattern.
The system is designed to identify the unknown object by comparing a few time-separated snapshots of the object to
a few standard moving objects learned or memorized by the system. The identification is governed by a user-entered
accuracy control. It can be very accurate, yet still quite robust and quite fast (e.g.,
identification in real time) because of the simplicity of the algorithm. It differs from most other neural network
systems because it employs the N-dimensional geometrical concept.
Interlaced scanning has been widely used in most broadcasting systems. However, it introduces undesirable artifacts
such as jagged patterns, flickering, and line twitter. Moreover, most recent TV monitors use flat panel display
technologies such as LCD or PDP, and these monitors require progressive formats. Consequently, the
conversion of interlaced video into progressive video is required in many applications, and a number of deinterlacing
methods have been proposed. Recently, deinterlacing methods based on neural networks have been proposed with good
results. On the other hand, with high-resolution video content such as HDTV, the amount of video data to be processed
is very large; as a result, processing time and hardware complexity become important issues. In this paper, we
propose an efficient implementation of neural network deinterlacing using polynomial approximation of the sigmoid
function. Experimental results show that these approximations provide equivalent performance with a considerable
reduction in complexity. This implementation of neural network deinterlacing can be efficiently incorporated in a
hardware implementation.
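To illustrate the idea (the coefficients below are an illustrative piecewise-quadratic choice, not the paper's polynomial), the sigmoid can be replaced on a bounded interval by a low-order polynomial, avoiding the `exp()` that is expensive in hardware:

```python
# Piecewise 2nd-order polynomial approximation of the sigmoid on [-4, 4],
# clamped outside; no transcendental functions needed at inference time.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def poly_sigmoid(x):
    if x < -4.0:
        return 0.0
    if x > 4.0:
        return 1.0
    if x < 0.0:
        return 0.5 * (1.0 + x / 4.0) ** 2
    return 1.0 - 0.5 * (1.0 - x / 4.0) ** 2

# Worst-case absolute error over [-8, 8], sampled at 0.1 steps.
worst = max(abs(sigmoid(x / 10.0) - poly_sigmoid(x / 10.0))
            for x in range(-80, 81))
print(worst < 0.05)
```

The maximum error of this particular approximation is a little over 0.02, which is typically small relative to the quantization already present in a fixed-point hardware datapath.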
Neural network de-interlacing has shown promising results among various de-interlacing methods. In this paper, we
investigate the effect of input size on neural networks for various video formats when the networks are used for
de-interlacing. In particular, we investigate optimal input sizes for the CIF, VGA and HD video formats.
The Bi-i (Bio-inspired) Cellular Vision system is built mainly on Cellular Neural/Nonlinear Network (CNN) type
(ACE16k) and Digital Signal Processing (DSP) type microprocessors. CNN theory, proposed by Chua, has advanced
properties for image processing applications. In this study, edge detection algorithms are implemented on the Bi-i
Cellular Vision System. Extracting the edges of an image correctly and quickly is of crucial importance for
image processing applications. A threshold-gradient-based edge detection algorithm is implemented on the ACE16k
microprocessor. In addition, a pre-processing operation is realized using an image enhancement technique based on
the Laplacian operator, and morphologic operations are performed as post-processing. The Sobel edge detection
algorithm is performed by convolving the Sobel operators with the image on the DSP. The performances of the edge
detection algorithms are compared using visual inspection and timing analysis. Experimental results show that the
ACE16k has great computational power and that the Bi-i Cellular Vision System is well qualified for real-time image
processing algorithms.
Manifolds are mathematical spaces whose points have Euclidean neighborhoods but whose global structure may be
more complex. A one-dimensional manifold has neighborhoods that resemble a line; a two-dimensional one, a plane.
In the one-dimensional case, most system neighborhoods cannot be represented optimally by a
straight line; a higher-order nonlinear line is better suited to represent most data. A learning algorithm to model
the pipeline, based on the Fisher Linear Discriminant (FLD) and using least squares estimation, is presented in this
paper. Face patterns are known to show continuous variability, yet face images of one individual tend to cluster together
and can be considered a neighborhood. Such similar patterns form a pipeline in state space that can be used for pattern
classification. Multiple patterns can be trained by having a separate line for each pattern. Face points are projected
onto a low-dimensional mean nonlinear pipeline, providing an easy, intuitive way to place new points. Given a test
point/face, the classification problem is then simplified to checking the nearest neighbors, which can be done by finding
the pipeline at minimum distance from the test point. The proposed representation of a face image results in improved
accuracy compared to the classical point representation.
Nowadays, a strong need exists for the efficient organization of an increasing amount of home video content. To create
an efficient system for the management of home video content, it is required to categorize home video content in a
semantic way. So far, a significant amount of research has already been dedicated to semantic video categorization.
However, conventional categorization approaches often rely on unnecessary concepts and complicated algorithms that
are not suited in the context of home video categorization. To overcome the aforementioned problem, this paper
proposes a novel home video categorization method that adopts semantic home photo categorization. To use home photo
categorization in the context of home video, we segment video content into shots and extract key frames that represent
each shot. To extract the semantics from key frames, we divide each key frame into ten local regions and extract
low-level features. Based on the low-level features extracted for each local region, we can predict the semantics of a
particular key frame. To verify the usefulness of the proposed home video categorization method, experiments were
performed with 70 home video sequences, labeled with concepts that are part of the MPEG-7 VCE2 dataset. For the
home video sequences used, the proposed system produced a recall of 77% and an accuracy of 78%.
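The shot-segmentation step can be sketched simply (the paper does not specify its exact method, so this is a standard stand-in: declare a boundary when the color-histogram difference between consecutive frames exceeds a threshold, and take the first frame of each shot as its key frame; the threshold is an assumption):

```python
# Histogram-difference shot segmentation with first-frame key frames.

def hist_diff(h1, h2):
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return sum(abs(a - b) for a, b in zip(h1, h2)) / 2.0

def segment_shots(histograms, threshold=0.5):
    """histograms: one normalized histogram per frame; returns key frames."""
    key_frames = [0]
    for i in range(1, len(histograms)):
        if hist_diff(histograms[i - 1], histograms[i]) > threshold:
            key_frames.append(i)
    return key_frames

frames = [[0.9, 0.1, 0.0]] * 3 + [[0.1, 0.1, 0.8]] * 2   # one hard cut
print(segment_shots(frames))  # [0, 3]
```

Each key frame would then be divided into the ten local regions from which the low-level features are extracted.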