This PDF file contains the front matter associated with SPIE Proceedings Volume 8056, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Median filtering has been an effective way of reducing impulsive noise in images. Yet, the inherent problem
with median filters is that their performance may be limited if images are corrupted by a significant amount of noise. In
such cases, large median filters may have to be considered, resulting in the removal of fine image details. In order to
alleviate this problem, several techniques have been developed and presented in the literature with the purpose of
detecting the locations of noisy pixels and applying median filters only at those locations. As a result, image pixels not associated with noise remain unaffected. Recently, a method was proposed in which noisy pixels were identified based on information extracted from four directional pixel neighborhoods. The technique used four directional
weighted median filters for processing the detected noisy pixels. It was shown that by considering different directional
neighborhoods around each pixel, the fine details of the image, such as thin lines, were preserved, even after filtering
was applied. This paper investigates an extension to the previous technique that uses local pixel neighborhoods, in
addition to directional ones, which cover a wider spectrum of shapes. The objective of this modification is to increase the
possibility of identifying at least one neighborhood which does not cross over fine image details. Comparisons between
the original and the proposed method suggest that considering a larger variety of pixel neighborhood shapes is beneficial
for impulsive noise detection and removal.
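As an illustration of the directional-neighborhood idea, the following minimal Python/NumPy sketch detects impulsive pixels by comparing the center against four fixed directional neighborhoods and applies a median only along the least-deviating direction. The offsets, window size, and threshold are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

# Four directional neighborhoods through the center pixel
# (offsets from the center): horizontal, vertical, two diagonals.
DIRECTIONS = [
    [(0, -2), (0, -1), (0, 1), (0, 2)],     # horizontal
    [(-2, 0), (-1, 0), (1, 0), (2, 0)],     # vertical
    [(-2, -2), (-1, -1), (1, 1), (2, 2)],   # main diagonal
    [(-2, 2), (-1, 1), (1, -1), (2, -2)],   # anti-diagonal
]

def directional_median_filter(img, threshold=40.0):
    """Replace a pixel by a directional median only when every directional
    neighborhood disagrees with it (i.e. the pixel looks impulsive)."""
    out = img.astype(np.float64).copy()
    rows, cols = img.shape
    for r in range(2, rows - 2):
        for c in range(2, cols - 2):
            center = float(img[r, c])
            # Mean absolute deviation of the center from each neighborhood.
            devs = [np.mean([abs(center - img[r + dr, c + dc])
                             for dr, dc in d]) for d in DIRECTIONS]
            if min(devs) > threshold:  # no direction explains the pixel
                best = DIRECTIONS[int(np.argmin(devs))]
                vals = [img[r + dr, c + dc] for dr, dc in best] + [center]
                out[r, c] = np.median(vals)
    return out.astype(img.dtype)
```

Filtering only along the least-deviating direction is what preserves thin lines: a one-pixel-wide line coincides with one of the four neighborhoods and is therefore never averaged away.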
The problem of horizontal imaging through the atmospheric boundary layer is common in defense, surveillance
and remote sensing applications. As with all earth-bound imaging systems, the resolving capability of such a system is limited by atmospheric turbulence. Using speckle imaging techniques it is often possible to overcome
these effects and recover images with resolution approaching the diffraction-limit. We examine the performance
of a bispectrum-based speckle imaging technique when applied to imaging scenarios near the ground.
Computer simulations were used to generate three sets of 70 turbulence-degraded images with varied turbulence
strength. Early results indicate the bispectrum to be a robust estimator for images corrupted by the
anisoplanatic turbulence encountered when imaging horizontally. Bispectrum reconstructed image frames show
an improvement of nearly 60% in Mean Squared Error (MSE) on average over the examined turbulence strengths.
The improvement in MSE was found to increase as additional input frames were used for image reconstruction, though using as few as 10 input frames provided a 50% improvement in MSE on average over the turbulence strengths.
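The bispectrum itself is the standard triple product of Fourier transforms, B(u, v) = F(u) F(v) F*(u + v), averaged over frames; its phase is insensitive to the frame-to-frame tilts introduced by turbulence. A minimal 1-D NumPy sketch of the accumulation step (illustrative only; the paper's full 2-D reconstruction pipeline is more involved):

```python
import numpy as np

def average_bispectrum(frames):
    """Accumulate B(u, v) = F(u) F(v) conj(F(u + v)) over a stack of
    1-D signals (e.g. corresponding image rows from each short-exposure
    frame). `frames` has shape (n_frames, n)."""
    n = frames.shape[1]
    u = np.arange(n)
    B = np.zeros((n, n), dtype=complex)
    for f in frames:
        F = np.fft.fft(f)
        # F(u) F(v) conj(F(u+v)), with the DFT's cyclic index wrap.
        B += np.outer(F, F) * np.conj(F[(u[:, None] + u[None, :]) % n])
    return B / len(frames)
```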
Variation in illumination conditions through a scene is a common issue for classification, segmentation and
recognition applications. Traffic monitoring and driver assistance systems have difficulty with changing illumination conditions throughout the day and night, with multiple light sources (especially at night), and in the presence of shadows. The majority of existing algorithms for color constancy or shadow detection rely on
multiple frames for comparison or to build a background model. The proposed approach uses a novel color space
inspired by the Log-Chromaticity space and modifies the bilateral filter to equalize illumination across objects
using a single frame. Neighboring pixels of the same color, but of different brightness, are assumed to be of the
same object/material. The utility of our algorithm is studied over day and night simulated scenes of varying
complexity. The objective is not to provide a product for visual inspection but rather an alternate image with
fewer illumination related issues for other algorithms to process. The usefulness of the filter is demonstrated
by applying two simple classifiers and comparing the class statistics. The hyper-log-chromaticity image and the
filtered image both improve the quality of the classification relative to the un-processed image.
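For reference, standard log-chromaticity coordinates (the starting point the authors modify) can be computed as below; the channel ratios and the epsilon guard are conventional choices, not the paper's exact hyper-log-chromaticity construction.

```python
import numpy as np

def log_chromaticity(rgb, eps=1e-6):
    """Map an RGB image to 2-D log-chromaticity coordinates log(R/G)
    and log(B/G), which are largely invariant to illuminant intensity."""
    rgb = rgb.astype(np.float64) + eps   # avoid log(0)
    return np.dstack((np.log(rgb[..., 0] / rgb[..., 1]),
                      np.log(rgb[..., 2] / rgb[..., 1])))
```

A bilateral filter whose range weight is evaluated in such a space averages neighboring pixels of the same material even when their brightness differs, which is the equalization behavior described above.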
A novel orientation code is proposed for face recognition applications in this paper. Gabor wavelet transform is a
common tool for orientation analysis in a 2D image, while the Hamming distance is an efficient distance measure for multi-class problems such as face identification. Specifically, at each frequency band, an index number representing
the strongest orientational response is selected, and then encoded in binary format to favor the Hamming distance
calculation. Multiple-band orientation codes are then organized into a face pattern byte (FPB) by using order statistics.
With the FPB, Hamming distances are calculated and compared to achieve face identification. The FPB has the
dimensionality of 8 bits per pixel and its performance will be compared to that of FPW (face pattern word, 32 bits per
pixel). The dimensionality of FPB can be further reduced down to 4 bits per pixel, called face pattern nibble (FPN).
Experimental results with visible and thermal face databases show that the proposed orientation code for face
recognition is very promising in contrast with classical methods such as PCA.
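A minimal sketch of the two building blocks, assuming eight Gabor orientations per band (so three bits per band); the FPB's order-statistics packing across bands is the paper's own and is not reproduced here.

```python
import numpy as np

def orientation_code(responses):
    """Index (0-7) of the strongest of 8 Gabor orientation responses at
    each pixel; `responses` has shape (8, H, W). Three bits per pixel."""
    return np.argmax(np.abs(responses), axis=0).astype(np.uint8)

def hamming_distance(code_a, code_b, bits=3):
    """Total bitwise Hamming distance between two per-pixel code maps,
    computed with XOR and per-bit population counts."""
    x = np.bitwise_xor(code_a, code_b)
    return sum(int(np.count_nonzero((x >> b) & 1)) for b in range(bits))
```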
Current research on gaze tracking, specifically relating to mouse control, is often limited to infrared cameras. Since these
can be costly and unsafe to operate, inexpensive optical cameras are a viable alternative. This paper presents image
processing techniques and algorithms to control a computer mouse using an optical camera. Usually, eye tracking
techniques utilize cameras mounted on devices located relatively far away from the user, such as a computer monitor.
However, in such cases, the techniques used to determine the direction of gaze are inaccurate due to the constraints
imposed by the camera resolution in conjunction with the limited size of the pupil. In order to achieve higher accuracy in pupil
detection, and therefore mouse control, the camera used is head-mounted and placed near one of the user's eyes.
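A common recipe for dark-pupil detection with a near-eye camera is to threshold the pupil region and take the centroid of the largest blob; the sketch below (OpenCV) is one such baseline, with the threshold value an assumption rather than the paper's tuned setting.

```python
import cv2

def pupil_center(gray, thresh=40):
    """Estimate the pupil center in a near-eye grayscale frame."""
    blurred = cv2.GaussianBlur(gray, (7, 7), 0)
    _, mask = cv2.threshold(blurred, thresh, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    m = cv2.moments(max(contours, key=cv2.contourArea))
    if m["m00"] == 0:
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])  # (x, y) centroid
```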
Accurately generating an alarm for a moving door is a precondition for tracking, recognizing and segmenting objects or
people entering or exiting the door. Generating an alarm when a door event occurs is difficult when dealing with complex doors, moving cameras, moving objects or an obscured door entrance, together with the
presence of varying illumination conditions such as a door-way light being switched on. In this paper, we propose an
effective method of tracking the door motion using edge-map information contained within a localised region at the top
of the door. The region is located where the top edge of the door displaces every time the door is opened or closed. The
proposed algorithm uses the edge-map information to detect the moving corner in the small windowed area with the help
of a Harris corner detector. The moving corner detected in the selected region gives an exact coordinate of the door
corner in motion, thus helping in generating an alarm to signify that the door is being opened or closed. Additionally, due
to the prior selection of the small region, the proposed method nullifies the adverse effects mentioned above and helps
prevent different objects that move in front of the door affecting its efficient tracking. The proposed overall method also
generates an alarm to signify whether the door was displaced to provide entry or exit. To do this, an active contour
orientation is computed to estimate the direction of motion of objects in the door area when an event occurs. This
information is used to distinguish between objects and entities entering or exiting the door. A Hough transform is applied
on a specific region in the frame to detect a line, which is used to perform error correction to the selected windows. The
detected line coordinates are used to nullify the effects of a moving camera platform, thus improving the robustness of
the results. The developed algorithm has been tested on all the Door Zone video sequences contained within the United Kingdom Home Office i-LIDS dataset, with promising results.
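The corner-tracking step can be sketched as follows (OpenCV); the window placement, Harris parameters, and alarm tolerance are illustrative assumptions.

```python
import cv2
import numpy as np

def door_corner(frame_gray, roi):
    """Strongest Harris corner inside the small window roi = (x, y, w, h)
    placed where the door's top edge displaces when it opens or closes."""
    x, y, w, h = roi
    patch = np.float32(frame_gray[y:y + h, x:x + w])
    response = cv2.cornerHarris(patch, blockSize=2, ksize=3, k=0.04)
    if response.max() <= 0:
        return None                      # no corner response in the window
    ry, rx = np.unravel_index(np.argmax(response), response.shape)
    return (x + rx, y + ry)              # full-frame coordinates

# An alarm can be raised when the detected corner displaces between
# successive frames by more than a small tolerance.
```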
Personnel positioning is important for safety in, for example, emergency response operations. In GPS-denied environments,
possible positioning solutions include systems based on radio frequency communication, inertial sensors, and cameras.
Many camera-based systems create a map and localize themselves relative to that. The computational complexity of
most such solutions grows rapidly with the size of the map. One way to reduce the complexity is to divide the visited
region into submaps. This paper presents a novel method for merging conditionally independent submaps (generated
using e.g. EKF-SLAM) by the use of smoothing. Using this approach it is possible to build large maps in close to linear
time. The method is demonstrated in two indoor scenarios, where data was collected with a trolley-mounted stereo vision
camera.
Motion blur acts on an image like a two dimensional low pass filter, whose spatial frequency characteristic depends both
on the trajectory of the relative motion between the scene and the camera and on the velocity vector variation along it.
When motion during exposure is permitted, the conventional, static notions of both the image exposure and the scene-to-image mapping become unsuitable and must be revised to accommodate the image formation dynamics. This paper
develops an exact image formation model for arbitrary object-camera relative motion with arbitrary velocity profiles.
Moreover, for any motion the camera may operate in either continuous or flutter shutter exposure mode. Its result is a
convolution kernel, which is optimally designed for both the given motion and sensor array geometry, and hence permits
the most accurate computational undoing of the blurring effects for the given camera, as required in forensic and high-
security applications. The theory has been implemented and a few examples are shown in the paper.
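A minimal discrete sketch of the idea, assuming a sampled planar trajectory: unequal dwell times along the path (the velocity profile) become unequal kernel weights, and a flutter shutter is modelled by omitting the samples taken while the shutter is closed. This is an illustration, not the paper's exact model.

```python
import numpy as np

def blur_kernel(trajectory, size=15):
    """Rasterize a sampled (x(t), y(t)) relative-motion trajectory into a
    normalized blur kernel; one sample per (open-shutter) time step."""
    k = np.zeros((size, size))
    c = size // 2
    for x, y in trajectory:
        k[c + int(round(y)), c + int(round(x))] += 1.0
    return k / k.sum()

# Example: diagonal motion that dwells near the start of the path.
traj = [(0, 0)] * 4 + [(1, 1)] * 2 + [(2, 2), (3, 3)]
kernel = blur_kernel(traj)
# The blurred image is then the convolution of the scene with `kernel`,
# e.g. scipy.ndimage.convolve(image, kernel, mode="nearest").
```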
One of the goals of superresolution has been to achieve interpolation in excess of some externally imposed physical constraint. Initially this constraint was the optical diffraction limit, while the Nyquist limit of sampled data systems has become a more recent issue. Regardless of the setting, the limitations are the same: there generally are not enough available degrees of freedom to perform an interpolation without severe loss of information. While some success has been achieved in superresolution, magnification is generally limited to less than 2. In this paper we present a method in which context-based basis functions are developed for digital zoom at magnifications assumed to be greater than 2. The number of degrees of freedom is still less than the number formally required; however, because the basis functions are developed for scenes similar to those presented for interpolation, they are more efficient than basis functions developed without regard to context. The technique is presented together with several still images and video examples of digital zoom at magnifications of 5 and 10. Results are compared with conventional bicubic spline interpolation. Parallelization of the technique on graphics processors is discussed with a view toward real-time implementation.
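For reference, the conventional baseline compared against can be produced with bicubic interpolation, e.g. in OpenCV (the file name below is a placeholder):

```python
import cv2

img = cv2.imread("frame.png")  # placeholder input image
# Bicubic baseline at the paper's magnifications of 5 and 10.
zoom5 = cv2.resize(img, None, fx=5, fy=5, interpolation=cv2.INTER_CUBIC)
zoom10 = cv2.resize(img, None, fx=10, fy=10, interpolation=cv2.INTER_CUBIC)
```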
We present results of an on-going project to assess the applicability in reflection seismology of emerging super
resolution techniques pioneered in digital photography. Our approach involves: (1) construction of a forward model
connecting low resolution seismic images to high resolution ones, and (2) solution of a Tikhonov-regularized, ill-conditioned optimization problem to construct a high resolution image from several lower resolution counterparts; the
high and low resolution images derived, respectively, from dense and sparse seismic surveys.
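The regularized inversion step has the familiar closed form; a dense-matrix sketch (the seismic forward operator A here is a generic stand-in, and real problems would use sparse or matrix-free solvers):

```python
import numpy as np

def tikhonov_superresolve(A, b, lam=1e-2):
    """Solve min_x ||A x - b||^2 + lam ||x||^2 via the normal equations.
    A stacks the forward models mapping the high-resolution image x to
    each low-resolution observation; b stacks the observations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)
```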
Unmanned Airborne Vehicles (UAVs) capture during flight a set of images that offer slightly different views of the scene. These images often share a sufficient overlapped area, with subpixel shifts of random fractions, to allow a high resolution image to be constructed within the overlapped area. The high resolution image may nevertheless have poor visual quality due to degradations during the acquisition and display processes, such as blurring caused by the system optics or aliasing due to sampling. A technique referred to as microscanning is an effective method for reducing aliasing and increasing spatial resolution: by moving the field of view (FOV) on the detector array with predetermined sub-pixel shifts, both aliasing reduction and resolution improvement are realized through an increased effective spatial sampling rate. In this paper we apply the idea of microscanning to UAV-captured images. Based on the continuous-discrete-continuous (CDC) model, a Wiener restoration filter is used to restore the visually poor quality image to a super resolution (SR) image.
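A minimal frequency-domain Wiener restoration sketch, assuming a known optical transfer function and a constant noise-to-signal power ratio (the paper's CDC-model-based filter accounts for the full acquisition/display chain):

```python
import numpy as np

def wiener_restore(blurred, otf, nsr=0.01):
    """Wiener deconvolution: G = conj(H) / (|H|^2 + NSR), applied in the
    frequency domain. `otf` is the system transfer function sampled on
    the image grid; `nsr` is an assumed constant noise-to-signal ratio."""
    B = np.fft.fft2(blurred)
    G = np.conj(otf) / (np.abs(otf) ** 2 + nsr)
    return np.real(np.fft.ifft2(G * B))
```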
In this paper we introduce the concept of continuous quantification of uniqueness. Our approach is to construct an
algorithm that computes a fuzzy set membership function which, given any inter-object dissimilarity metric and its variability, measures the probability that an entity of interest will not be confused with other similar entities in a search
space. We demonstrate use of this algorithm by applying it to stereoscopic computer vision, in order to identify which of
several sub-problems pertaining to solution of the classic stereoscopic correspondence problem are least likely to be
solved incorrectly, and hence are most well suited to greatest confidence first approaches.
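As one plausible instantiation (not the paper's construction): if each dissimilarity is perturbed by independent Gaussian noise, the probability that the best match is not confused with the runner-up has a closed form.

```python
import numpy as np
from scipy.stats import norm

def uniqueness(dissimilarities, sigma):
    """Probability that the smallest dissimilarity stays below the
    second smallest when both are perturbed by independent Gaussian
    noise of standard deviation `sigma` (an illustrative model only)."""
    d = np.sort(np.asarray(dissimilarities, dtype=float))
    # Difference of two independent Gaussians has std sqrt(2) * sigma.
    return float(norm.cdf((d[1] - d[0]) / (np.sqrt(2.0) * sigma)))
```

Correspondence sub-problems scoring high under such a measure are the ones a greatest-confidence-first strategy should solve first.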
To address the emergent needs of military and security users, a new design approach has been developed to enable the
rapid development of high performance and low cost imaging and processing systems. In this paper, information about
the "Bespoke COTS" design approach is presented and is illustrated using examples of systems that have been built and
delivered. This approach facilitates the integration of standardised COTS components into a customised yet flexible
systems architecture to realise user requirements within stringent project timescales and budgets. The paper also
discusses the important area of the design trade-off space (performance, flexibility, quality, and cost) and compares the
results of the Bespoke COTS approach to design solutions derived from more conventional design processes.
An automatic landing site detection algorithm is proposed for aircraft emergency landing. Emergency landing
is an unplanned event in response to emergency situations. If, as is unfortunately usually the case, there is
no airstrip or airfield that can be reached by the un-powered aircraft, a crash landing or ditching has to be
carried out. Identifying a safe landing site is critical to the survival of passengers and crew. Conventionally,
the pilot chooses the landing site visually by looking at the terrain through the cockpit. The success of this
vital decision greatly depends on the external environmental factors that can impair human vision, and on
the pilot's flight experience that can vary significantly among pilots. Therefore, we propose a robust, reliable
and efficient algorithm that is expected to alleviate the negative impact of these factors. We present only the
detection mechanism of the proposed algorithm and assume that the image enhancement for increased visibility,
and image stitching for a larger field-of-view have already been performed on the images acquired by aircraft-mounted cameras. Specifically, we describe an elastic bound detection method which is designed to position
the horizon. The terrain image is divided into non-overlapping blocks which are then clustered according to a
"roughness" measure. Adjacent smooth blocks are merged to form potential landing sites whose dimensions are
measured with principal component analysis and geometric transformations. If the dimensions of the candidate
region exceed the minimum requirement for safe landing, the potential landing site is considered a safe candidate
and highlighted on the human machine interface. At the end, the pilot makes the final decision by confirming
one of the candidates, also considering other factors such as wind speed and wind direction. Preliminary
results show the feasibility of the proposed algorithm.
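The block-clustering step can be sketched as follows; the standard deviation used as the "roughness" proxy and the block size are assumptions, since the paper's exact roughness measure is not reproduced here.

```python
import numpy as np

def smooth_block_mask(terrain, block=32, rough_thresh=12.0):
    """Divide a grayscale terrain image into non-overlapping blocks and
    flag the smooth ones (low intensity standard deviation) as candidate
    landing-surface blocks; adjacent flagged blocks can then be merged
    and measured against the minimum safe-landing dimensions."""
    h, w = terrain.shape
    rows, cols = h // block, w // block
    mask = np.zeros((rows, cols), dtype=bool)
    for i in range(rows):
        for j in range(cols):
            tile = terrain[i * block:(i + 1) * block,
                           j * block:(j + 1) * block]
            mask[i, j] = tile.std() < rough_thresh
    return mask
```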
Improving situational awareness in Persistent Surveillance Systems (PSS) is an ongoing research effort of the Department of Defense. Most PSS generate huge volumes of raw data and rely heavily on human operators to interpret the data and draw inferences from it in order to detect potential threats. Many suspicious outdoor activities involve vehicles as the primary means of transportation to and from the scene where a plot is executed. Vehicles are employed to bring in and take out ammunition, supplies, and personnel; they are also used as a disguise, a hide-out, or a meeting place for executing threat plots. Analysis of Human-Vehicle Interactions (HVI) helps us to identify cohesive patterns of activity representing potential threats. Identification of such patterns can significantly improve situational awareness in PSS. In our approach, image processing is used as the primary sensing modality. We use HVI
taxonomy as a means for recognizing different types of HVI activities. HVI taxonomy may comprise multiple threads
of ontological patterns. By spatiotemporally linking ontological patterns, an HVI pattern is hypothesized that points to a potential threat situation. The proposed technique generates semantic messages describing the ontology of HVI. This
paper also discusses a vehicle zoning technique for HVI semantic labeling and demonstrates efficiency and
effectiveness of the proposed technique for identifying HVI.
A two-fold image understanding algorithm based on Bayesian networks is introduced. The methodology has modules for
image segmentation evaluation and region of interest (ROI) identification. The former uses a set of segmentation maps
(SMs) of a target image to identify the optimal one. These SMs could be generated from the same segmentation
algorithm at different thresholds or from different segmentation techniques. Global and regional low-level image features
are extracted from the optimal SM and used along with the original image to identify the ROI. The proposed algorithm
was tested on a set of 4000 color images that are publicly available and compared favorably to the state-of-the-art
techniques. Applications of the proposed framework include image compression, image summarization, mobile phone
imagery, digital photo cropping, and image thumbnailing.
Multiview Video Coding (MVC) is an extension to the H.264/MPEG-4 AVC video compression standard developed
with joint efforts by MPEG/VCEG to enable efficient encoding of sequences captured simultaneously from multiple
cameras using a single video stream. Therefore the design is aimed at exploiting inter-view dependencies in addition to
reducing temporal redundancies. However, this further increases the overall encoding complexity.
In this paper, the high correlation between a macroblock and its enclosed partitions is utilised to estimate motion homogeneity, and based on the result inter-view prediction is selectively enabled or disabled. Moreover, if MVC is divided into three layers in terms of motion prediction, the first being the full and sub-pixel motion search, the second the mode selection process and the third the repetition of the first and second for inter-view prediction, then the proposed algorithm significantly reduces the complexity in all three layers.
To assess the proposed algorithm, a comprehensive set of experiments were conducted. The results show that the
proposed algorithm significantly reduces the motion estimation time whilst maintaining similar Rate Distortion
performance, when compared to both the H.264/MVC reference software and recently reported work.
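The control logic can be sketched as follows; the homogeneity test (standard deviation of the partition motion vectors against a tolerance) is an illustrative stand-in for the paper's estimator.

```python
import numpy as np

def skip_inter_view_search(partition_mvs, tol=1.0):
    """Given the motion vectors of a macroblock's enclosed partitions
    as an (n, 2) array, report whether the motion is homogeneous enough
    to disable the costly inter-view prediction pass for this block."""
    mvs = np.asarray(partition_mvs, dtype=float)
    return bool(np.all(mvs.std(axis=0) < tol))
```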
Inverse lens distortion modelling allows one to find the pixel in a distorted image which corresponds to a known
point in object space, such as may be produced by RADAR. This paper extends recent work using neural networks
as a compromise between processing complexity, memory usage and accuracy. The already encouraging results
are further enhanced by considering different neuron activation functions, architectures, scaling methodologies
and training techniques.
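A minimal sketch of the learning setup on synthetic radial distortion (the architecture, tanh activation, and input scaling below are examples of the design choices the paper compares, not its selected configuration):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(5000, 2))           # ideal (undistorted) points
r2 = (X ** 2).sum(axis=1, keepdims=True)
Y = X * (1 + 0.2 * r2 + 0.05 * r2 ** 2)          # synthetic distorted pixels

# Small MLP learning the ideal -> distorted-pixel mapping directly;
# inputs are already scaled to [-1, 1].
net = MLPRegressor(hidden_layer_sizes=(16, 16), activation="tanh",
                   max_iter=3000)
net.fit(X, Y)
```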
Traditional sharpening filters often enhance the noise content in imagery in addition to the edge definition. In order to
ensure that only pertinent features are enhanced and that the noise content of the imagery is not exaggerated, an adaptive
filter is typically required. This paper discusses a novel image sharpening strategy proposed by Waterfall Solutions Ltd.
(WS) that is based upon the use of adaptive image filter kernels. The scale of the edge-sharpening filter adapts locally in accordance with WS's proposed saliency measure, allowing the filter to sharpen pertinent features while suppressing local noise and ensuring that only pertinent edges are enhanced. The technique has been applied to a series of test images. Results have shown the potential of this technique
for distinguishing salient information from noise content and for sharpening pertinent edges. By increasing the size of the
filter in noisy regions the filter is able to enhance larger-scale edge gradients whilst suppressing local noise. It is
demonstrated that the proposed approach provides superior edge enhancement capabilities over conventional filtering
approaches according to performance measures, such as edge strength and Signal-to-Noise-Ratio (SNR).
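The steering behavior can be sketched with an unsharp mask whose blur scale switches on a simple local-saliency proxy; local variance below is a crude stand-in for WS's proprietary saliency measure, and all scales and thresholds are assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, generic_filter

def adaptive_unsharp(img, amount=1.0, s_small=1.0, s_large=3.0,
                     var_thresh=50.0):
    """Unsharp masking with a locally adapted blur scale: salient
    (high-variance) regions use the small-scale mask to sharpen fine
    edges, while flatter/noisier regions use the larger scale, which
    enhances larger-scale gradients and suppresses pixel-level noise."""
    img = img.astype(np.float64)
    local_var = generic_filter(img, np.var, size=5)
    detail_small = img - gaussian_filter(img, s_small)
    detail_large = img - gaussian_filter(img, s_large)
    detail = np.where(local_var > var_thresh, detail_small, detail_large)
    return img + amount * detail
```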
In general edge detection evaluation, edge detectors are examined, analyzed, and compared either visually or with a metric for a specific application. This analysis is usually independent of the characteristics of the image-gathering, transmission and display processes that impact the quality of the acquired image and thus
the resulting edge image. We propose a new information theoretic analysis of edge detection that unites the
different components of the visual communication channel and assesses edge detection algorithms in an integrated
manner based on Shannon's information theory. The edge detection algorithm here is considered to achieve high
performance only if the information rate from the scene to the edge approaches the maximum possible. Thus,
by setting initial conditions of the visual communication system as constant, different edge detection algorithms
could be evaluated. This analysis is normally limited to linear shift-invariant filters so in order to examine the
Canny edge operator in our proposed system, we need to estimate its "power spectral density" (PSD). Since the
Canny operator is non-linear and shift variant, we perform the estimation for a set of different system environment
conditions using simulations. In our paper we will first introduce the PSD of the Canny operator for a range
of system parameters. Then, using the estimated PSD, we will assess the Canny operator using information
theoretic analysis. The information-theoretic metric is also used to compare the performance of the Canny
operator with other edge-detection operators. This also provides a simple tool for selecting appropriate edge-detection algorithms based on system parameters, and for adjusting their parameters to maximize information
throughput.
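One plausible reading of the simulation-based estimate (the paper describes its procedure only at a high level) is to average output power spectra of the operator over many realizations of the simulated imaging conditions:

```python
import numpy as np
import cv2

def estimated_output_psd(images, low=50, high=150):
    """Average the power spectra of Canny edge maps over a set of
    simulated acquisitions, giving an empirical equivalent PSD for the
    non-linear, shift-variant operator at these threshold settings."""
    acc = None
    for img in images:                       # uint8 grayscale frames
        edges = cv2.Canny(img, low, high).astype(np.float64)
        P = np.abs(np.fft.fft2(edges)) ** 2
        acc = P if acc is None else acc + P
    return acc / len(images)
```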
A local color transfer method based on dark channel dehazing for visible/infrared image fusion is presented. Image fusion combines complementary information from visible and infrared images: the visible image supplies plenty of scene detail, while the infrared image is good at popping out hot or cold targets. However, under bad weather conditions, such as haze or fog, the visible image is degraded greatly, leading to a fused image with low contrast and poor color fidelity. Color
transfer can improve the color appearance using a bright haze-free reference image, but it usually modifies the pixel
values according to the global mean value and standard deviation in each color channel. This paper pays more attention
to the dark channel of the reference image and so applies different color transfer schemes to the haze and haze-free areas.
Results show that it is effective for decreasing the bad effect of the haze and achieving a more visually pleasing color
visible/infrared fused image.
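The dark channel that drives the area split is the standard construction of He et al.; a minimal sketch (the patch size is a typical assumption):

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(rgb, patch=15):
    """Per-pixel minimum over the color channels followed by a local
    minimum filter; large values indicate haze, letting the method apply
    separate color-transfer schemes to hazy and haze-free areas."""
    per_pixel_min = rgb.min(axis=2)
    return minimum_filter(per_pixel_min, size=patch)
```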
Feature-specific imaging (FSI) or compressive imaging involves measuring relatively few linear projections of a
scene compared to the dimensionality of the scene. Researchers have exploited the spatial correlation inherent in
natural scenes to design compressive imaging systems using various measurement bases such as the Karhunen-Loève
(KL) transform, random projections, Discrete Cosine transform (DCT) and Discrete Wavelet transform (DWT)
to yield significant improvements in system performance and size, weight, and power (SWaP) compared to
conventional non-compressive imaging systems. Here we extend the FSI approach to time-varying natural scenes
by exploiting the inherent spatio-temporal correlations to make compressive measurements. The performance of
space-time feature-specific/compressive imaging systems is analyzed using the KL measurement basis. We find
that the addition of temporal redundancy in natural time-varying scenes yields further compression relative to
space-only feature specific imaging. For a relative noise strength of 10% and reconstruction error of 10% using
8×8×16 spatio-temporal blocks we find about a 114x compression compared to a conventional imager while
space-only FSI realizes about a 32x compression. We also describe a candidate space-time compressive optical
imaging system architecture.
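The KL measurement step amounts to projecting vectorized spatio-temporal blocks onto a learned principal basis; a minimal sketch (assuming zero-mean training blocks for brevity):

```python
import numpy as np

def kl_basis(training_blocks, m):
    """Top-m Karhunen-Loeve (PCA) measurement vectors learned from
    vectorized spatio-temporal blocks (one block per row)."""
    X = training_blocks - training_blocks.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:m]                      # (m, block_dim) measurement matrix

def measure_and_reconstruct(block, V):
    """Compressive measurement y = V x followed by the linear
    reconstruction V^T y; m features replace block_dim pixels."""
    y = V @ block
    return V.T @ y
```

For the 8x8x16 spatio-temporal blocks quoted above, block_dim = 8 * 8 * 16 = 1024.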
We have previously shown in reference [3] that images of particular objects of interest can be
recovered from compressive measurements by minimizing an L2-norm criterion that incorporates
prior knowledge of the signal such as its expected spectra. The basis in which the signal is
reconstructed was also noted to be an important consideration in the formulation of the solution. In
this paper, we further improve this technique by representing the image in a multi-scale domain so
that select bands of the transform can be adapted to reference signals from other sources, while
improving the overall quality of reconstruction of the full image. It is shown by means of an
example that the adaptation not only reduces the overall mean square error of the reconstruction, but
also helps to correctly resolve features in the high-resolution image that are not accurately
reconstructed by the open-loop algorithm.
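The band-adapted recovery can be sketched as a quadratically regularized least-squares problem in which selected transform-domain coefficients are pulled toward a reference signal; the operators below are generic stand-ins for the paper's measurement and multi-scale transforms.

```python
import numpy as np

def recover_with_reference(A, y, W, s_ref, lam=0.1):
    """Solve min_x ||A x - y||^2 + lam ||W x - s_ref||^2, i.e. recovery
    from compressive measurements y = A x with the selected multi-scale
    bands W x adapted toward a reference signal s_ref."""
    lhs = A.T @ A + lam * (W.T @ W)
    rhs = A.T @ y + lam * (W.T @ s_ref)
    return np.linalg.solve(lhs, rhs)
```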
The DARPA MOSAIC program applies multiscale optical design (shared objective lens and parallel array of microcameras)
to the acquisition of high pixel count images. Interestingly, these images present as many challenges
as opportunities. The imagery is acquired over many slightly overlapping fields with diverse focal, exposure and
temporal parameters. Estimation of a consensus image, display of imagery at human-comprehensible resolutions,
automated anomaly detection to guide viewer attention, and power management in a distributed electronic environment
are just a few of the novel challenges that arise. This talk describes some of these challenges and
presents progress to date.
A CAPTCHA is an automatically generated test designed to distinguish between humans and computer programs;
specifically, it is designed to be easy for humans but difficult for computer programs to pass, in order
to prevent the abuse of resources by automated bots. They are commonly seen guarding webmail registration
forms, online auction sites, and preventing brute force attacks on passwords.
In the following, we address the question: How does adding a grey level to random CAPTCHA generation
affect the utility of the CAPTCHA? We treat the problem of generating the random CAPTCHA as one of
random field simulation: An initial state of background noise is evolved over time using Gibbs sampling and an
efficient algorithm for generating correlated random variables. This approach has already been found to yield
highly-readable yet difficult-to-crack CAPTCHAs. We detail how the requisite parameters for introducing grey
levels are estimated and how we generate the random CAPTCHA. The resulting CAPTCHA will be evaluated in
terms of human readability as well as its resistance to automated attacks in the forms of character segmentation
and optical character recognition.
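A toy version of the evolution step is a Gibbs sweep over a three-level (black/grey/white) Potts-style field; the interaction form and its strength below are illustrative, whereas the paper estimates the requisite parameters from data.

```python
import numpy as np

def gibbs_sweep(field, beta=0.8, levels=(0, 128, 255), rng=None):
    """One Gibbs-sampling sweep: each site is re-sampled from its local
    conditional, which favors agreement with the 4-neighborhood; beta
    controls the spatial correlation of the resulting texture."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = field.shape
    for i in range(h):
        for j in range(w):
            nbrs = [field[x, y]
                    for x, y in ((i - 1, j), (i + 1, j),
                                 (i, j - 1), (i, j + 1))
                    if 0 <= x < h and 0 <= y < w]
            # Energy of each candidate level = # of disagreeing neighbors.
            energy = np.array([sum(v != n for n in nbrs) for v in levels])
            p = np.exp(-beta * energy)
            field[i, j] = rng.choice(levels, p=p / p.sum())
    return field
```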
The objective of scalable video coding is to enable the generation of a unique bitstream that can adapt to various bitrates,
transmission channels and display capabilities. Scalability is categorised as temporal, spatial, or quality scalability. Effective Rate Control (RC) has important ramifications for coding efficiency, and also for channel bandwidth and
buffer constraints in real-time communication.
The main target of RC is to reduce the disparity between the actual and target bit-rates. In order to meet the target bitrate,
a predicted Mean of Absolute Difference (MAD) between frames is used in a rate-quantisation model to obtain the
Quantisation Parameter (QP) for encoding the current frame.
The encoding process exploits the interdependencies between video frames; therefore the MAD does not change abruptly
unless the scene changes significantly. After the scene change, the MAD will maintain a stable slow increase or
decrease. Based on this observation, we developed a simplified RC algorithm. The scheme is divided into two steps: firstly, we predict scene changes; secondly, in order to avoid abrupt fluctuations in visual quality, we limit the change in QP value between two frames to an adaptive range. This limits the need to use the rate-quantisation model to those situations
where the scene changes significantly.
To assess the proposed algorithm, comprehensive experiments were conducted. The experimental results show that the
proposed algorithm significantly reduces encoding time whilst maintaining similar rate distortion performance,
compared to both the H.264/SVC reference software and recently reported work.
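The QP-limiting step reduces, in the common case, to a clamp around the previous frame's QP; a minimal sketch (the fixed max_delta here stands in for the paper's adaptive range):

```python
def next_qp(model_qp, prev_qp, scene_changed, max_delta=2):
    """Rate-control step: fall back to the rate-quantisation model only
    on a predicted scene change; otherwise keep the QP within a range
    around the previous QP to avoid abrupt quality fluctuations."""
    if scene_changed:
        return model_qp                 # full model re-estimation
    return max(prev_qp - max_delta, min(prev_qp + max_delta, model_qp))
```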
Image segmentation remains one of the major challenges in image analysis and computer vision. Fuzzy clustering, as a soft segmentation method, has been widely studied and successfully applied to image clustering and segmentation. The fuzzy c-means (FCM) algorithm is the most popular method used in image segmentation. However, most clustering algorithms, such as the k-means and FCM algorithms, search for the final cluster values based on predetermined initial centers. The FCM algorithm also does not consider the spatial information of pixels and is sensitive to noise. This paper presents a new FCM algorithm with adaptive evolutionary programming for image clustering. The features of this algorithm are: firstly, it does not require predetermined initial centers, since evolutionary programming helps FCM search for better centers and escape bad centers at local minima; secondly, both the spatial distance and the Euclidean distance are considered in the FCM clustering, making the algorithm more robust to noise; thirdly, adaptive evolutionary programming is proposed, in which the mutation rule is adaptively changed by learning useful knowledge during the evolutionary process. Experimental results show that the new image segmentation algorithm is effective and robust to noisy images.
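For reference, one standard FCM iteration (the update the evolutionary wrapper repeatedly invokes) looks as follows; the spatial-distance term and the adaptive mutation rule of the proposed method are not reproduced here.

```python
import numpy as np

def fcm_step(X, centers, m=2.0):
    """One fuzzy c-means update: memberships from distances, then
    centers from membership-weighted means. X is (n, d); centers (c, d)."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-10
    inv = d ** (-2.0 / (m - 1.0))
    U = inv / inv.sum(axis=1, keepdims=True)        # fuzzy memberships
    Um = U ** m
    centers = (Um.T @ X) / Um.sum(axis=0)[:, None]  # updated centers
    return U, centers
```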
The Metropolis Monte Carlo procedure for reconstruction of blurred output is discussed. In this approach two Monte Carlo procedures are run at the same time. The first procedure selects a pixel according to a distribution function that visits important input regions more frequently. The second procedure decides whether a grain should be placed in or removed from the selected pixel. The criterion used to make this decision is the mean squared error: if this error is reduced the move is accepted, otherwise it is rejected. Results of applying the method to a slowly converging impulse response function and an input that contains three closely spaced peaks are reported. The results indicate that the input signal of this very difficult-to-resolve system is recovered with very good resolution.
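A minimal sketch of one grain move under a zero-temperature Metropolis rule; the uniform pixel selection below is a simplification of the paper's importance-weighted selection.

```python
import numpy as np

def grain_move(x, blurred, H, rng):
    """Propose adding or removing one grain at a pixel and accept the
    move only if it reduces the MSE between H @ x and the observed
    blurred output."""
    i = rng.integers(len(x))       # uniform here; the paper biases this
                                   # choice toward important input regions
    proposal = x.copy()
    proposal[i] += 1 if rng.random() < 0.5 else -1
    if proposal[i] < 0:
        return x                   # grain counts stay non-negative
    mse = lambda v: float(np.mean((H @ v - blurred) ** 2))
    return proposal if mse(proposal) < mse(x) else x
```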
The proposed system focuses on the detection of three events in airport videos: a person running, a person putting down an object, and a person pointing with his/her hand. The system was part of the NIST TRECVid 2010 campaign; the training dataset consists of 100 hours of video from Gatwick airport, taken from five different cameras. For the detection of a person running, a non-parametric approach was adopted in which statistics about tracked object velocities were accumulated over a long period of time using a Gaussian kernel. Outliers were then detected with the help of a kind of Student's t-test taking into account the local statistics and the number of observations. For the detection of "object put" events, we follow a dual background segmentation approach in which the difference in response between a short-term and a long-term background model (Mixture of Gaussians) triggers alerts. False alerts are excluded based on a simple model of the camera geometry, in order to reject objects that are too large or too small given their positions in the image. The detection of pointing gesture events is based on grouping significant spatio-temporal (Harris) corners within a 3x3x3 cell into so-called compound features, as proposed recently by Andrew Gilbert et al. [10]. A hierarchical codebook is then derived from the training set using a data mining algorithm that looks for frequent items (called transactions). The algorithm was modified in order to deal with the large number of potential transactions (several millions) during the training step.
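The dual-background idea can be sketched with two Mixture-of-Gaussians subtractors running at different adaptation rates (the history lengths below are assumptions):

```python
import cv2

# The fast model absorbs a newly deposited object quickly; the slow one
# does not, so pixels that only the slow model still marks as foreground
# indicate a recently put-down object.
fast = cv2.createBackgroundSubtractorMOG2(history=100)
slow = cv2.createBackgroundSubtractorMOG2(history=2000)

def put_down_mask(frame):
    fg_fast = fast.apply(frame)    # short-term foreground
    fg_slow = slow.apply(frame)    # long-term foreground
    return cv2.bitwise_and(fg_slow, cv2.bitwise_not(fg_fast))
```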