We present a computationally efficient track-before-detect algorithm that achieves more than 50% true detection
at a 10^-6 false alarm rate for an unknown number of pixel-sized targets when the signal-to-noise ratio
is less than 7 dB. Without making any assumptions on the distribution functions, we select a small number
of cells, so-called needles, and generate motion hypotheses using the target state transition model. We
accumulate cell likelihoods along each hypothesis in the temporal window and append the accumulated values
to the corresponding queues of the cell positions in the most recent image. We assign a target when the queue
maximum exceeds a threshold that produces the specified false alarm rate.
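The hypothesis-accumulation step above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes precomputed per-cell likelihood maps, constant-velocity hypotheses, and a caller-supplied threshold; the function name and its arguments are hypothetical.

```python
import numpy as np

def track_before_detect(frames, needles, velocities, threshold):
    """Accumulate per-cell likelihoods along constant-velocity motion
    hypotheses; a target is declared where the queue maximum in the
    most recent frame exceeds the threshold."""
    T, H, W = frames.shape
    queues = {}                                  # terminal cell -> accumulated scores
    for r0, c0 in needles:
        for vr, vc in velocities:                # one hypothesis per velocity
            score, valid = 0.0, True
            for t in range(T):
                r, c = r0 + vr * t, c0 + vc * t
                if not (0 <= r < H and 0 <= c < W):
                    valid = False
                    break
                score += frames[t, r, c]
            if valid:                            # queue of the cell in the last frame
                end = (r0 + vr * (T - 1), c0 + vc * (T - 1))
                queues.setdefault(end, []).append(score)
    # declare a target where the queue maximum beats the threshold
    return [cell for cell, q in queues.items() if max(q) > threshold]
```

In practice the threshold would be calibrated on noise-only data to hit the specified false alarm rate.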
A novel method to accelerate the application of linear filters that have multiple identical coefficients on arbitrary kernels
is presented. Such filters, including Gabor filters, gray-level morphological operators, and volume smoothing functions,
are widely used in many computer vision tasks. By taking advantage of the overlap between the kernels of
neighboring points, the reshuffling technique avoids redundant multiplications when the filter response
is computed. It finds the set of unique coefficients, constructs a set of relative links for each coefficient, and then sweeps
through the input data, accumulating the responses at each point while applying the coefficients through their relative links.
Dual solutions, single input access and single output access, that achieve a 40% performance improvement are provided. In
addition to its computational advantage, this method keeps a minimal memory footprint, which makes it an ideal method for
embedded platforms. The effects of quantization, kernel size, and symmetry on the computational savings are discussed.
Results show that reshuffling is superior to the conventional approach.
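A 1D sketch of the single-input-access variant, under assumptions of my own (the function name and a plain valid-mode correlation are illustrative; the paper's dual solutions and arbitrary kernels are more general):

```python
import numpy as np

def reshuffled_filter_1d(x, kernel):
    """Group the kernel taps by unique coefficient value, keep the
    relative offsets ('links') of each group, then sweep the input
    once, distributing each sample to the outputs with one
    multiplication per unique coefficient."""
    links = {}
    for offset, coeff in enumerate(kernel):
        links.setdefault(coeff, []).append(offset)
    n, k = len(x), len(kernel)
    y = np.zeros(n - k + 1)
    for i, sample in enumerate(x):          # single pass over the input
        for coeff, offsets in links.items():
            prod = sample * coeff           # shared by all taps with this value
            for off in offsets:
                j = i - off                 # output positions this sample feeds
                if 0 <= j < len(y):
                    y[j] += prod
    return y
```

For the symmetric kernel [1, 2, 1] the two taps with value 1 share a single multiplication per input sample, which is exactly the saving the reshuffling idea exploits.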
We present two fast algorithms that approximate the distance transformation of 2D binary images. Distance
transformation finds the minimum distance of every data point from a set of given object points;
such an exhaustive search for the minimum distances is infeasible in larger data spaces.
Unlike conventional approaches, we extract the minimum distances with no explicit distance computation,
using either multi-directional dual scan-line propagation or wave propagation. In the first method, we
iteratively move along a scan line in opposite directions and assign an incremental counter to the underlying
data points while checking for object points. To our advantage, the precision of the dual scan propagation
method can be set according to the available computational power. Alternatively, we start a wavefront
from the object points and propagate it outward at each step while assigning the number of steps taken as
the minimum distance. Unlike most existing approaches, the computational load of our algorithm
also does not depend on the number of object points.
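The wave propagation variant amounts to a breadth-first expansion. A minimal sketch, assuming a 4-connected wavefront (so the result is the city-block distance; the function name is mine):

```python
from collections import deque
import numpy as np

def wavefront_distance(mask):
    """Approximate distance transform by wave propagation: start a
    wavefront at every object point (mask == True) and expand it one
    step at a time, recording the number of steps as the distance.
    No explicit distance computation is performed."""
    h, w = mask.shape
    dist = np.full((h, w), -1, dtype=int)
    front = deque((r, c) for r in range(h) for c in range(w) if mask[r, c])
    for r, c in front:
        dist[r, c] = 0                       # object points are at distance 0
    while front:
        r, c = front.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):  # 4-connected wave
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and dist[nr, nc] < 0:
                dist[nr, nc] = dist[r, c] + 1   # number of steps taken
                front.append((nr, nc))
    return dist
```

Each pixel enters the queue exactly once, so the cost is linear in the image size regardless of how many object points there are.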
This paper addresses the issue of multi-source collaborative object tracking in high-definition (HD) video sequences.
Specifically, we propose a new joint tracking paradigm for the multiple stream electronic pan-tilt-zoom (EPTZ) cameras.
These cameras are capable of transmitting a low resolution thumbnail (LRT) image of the whole field of view as well as
a high-resolution cropped (HRC) image for the target region. We exploit this functionality to perform joint tracking in
both the low-resolution image of the whole field of view and the high-resolution image of the moving target. Our system
detects objects of interest in the LRT image by background subtraction and tracks them using iterative coupled
refinement in both LRT and HRC images. We compared the performance of our joint tracking system with that of
tracking only in the HD mode. The results of our experiments show improved performance in terms of higher frame rates
and better localization.
We present a computationally inexpensive method for multi-modal image registration. Our approach employs a joint gradient similarity function that is applied only to a set of high-spatial-gradient pixels. We obtain the motion parameters by maximizing the similarity function with a gradient-ascent method, which ensures fast convergence. We apply our technique to the task of affine-model-based registration of 2D images that undergo large rigid motion, and show promising results.
This paper describes an experimental study of the use of thermal infrared (8–12 μm) imaging applied to the problem of pedestrian tracking. Generally it was found that infrared images enable better image segmentation, but their tracking performance with current algorithms is poorer. Simple fusion of both types of images has produced some improvement in the segmentation step of the tracking algorithms. In addition to the specific experimental results, this paper also provides a useful set of practical factors that need to be taken into account when using thermal infrared imaging for surveillance applications under real-world conditions.
KEYWORDS: Detection and tracking algorithms, Video, Picosecond phenomena, Video surveillance, Cameras, Video processing, 3D acquisition, Temporal resolution, Filtering (signal processing), Imaging systems
In this paper, we present an object detection and tracking
algorithm for low-frame-rate applications. We extend the standard
mean-shift technique so that it is not limited to a single
kernel but uses multiple kernels centered at high-motion areas
obtained by change detection. We also improve the convergence properties of mean-shift by integrating two additional likelihood terms. Our simulations demonstrate the effectiveness of the
proposed method.
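The multiple-kernel idea can be sketched as below. This is a simplified stand-in, not the paper's algorithm: it assumes a precomputed per-pixel likelihood map and seed positions from change detection, and uses a plain uniform-window mean-shift iteration (the extra likelihood terms are omitted; all names are hypothetical).

```python
import numpy as np

def multi_kernel_mean_shift(weights, seeds, bandwidth=5, iters=20):
    """Run mean-shift from several kernels at once (e.g. centered on
    high-motion areas) instead of a single kernel.  Each seed is
    shifted to the weighted centroid of its window until it stops
    moving; the converged modes are returned."""
    h, w = weights.shape
    modes = []
    for r, c in seeds:
        for _ in range(iters):
            r0, r1 = max(0, int(r) - bandwidth), min(h, int(r) + bandwidth + 1)
            c0, c1 = max(0, int(c) - bandwidth), min(w, int(c) + bandwidth + 1)
            win = weights[r0:r1, c0:c1]
            total = win.sum()
            if total == 0:                      # no support under the kernel
                break
            rows, cols = np.mgrid[r0:r1, c0:c1]
            nr = (rows * win).sum() / total     # weighted centroid = mean shift
            nc = (cols * win).sum() / total
            if abs(nr - r) < 0.5 and abs(nc - c) < 0.5:  # converged
                r, c = nr, nc
                break
            r, c = nr, nc
        modes.append((round(r), round(c)))
    return modes
```

Seeds started on different sides of the same object converge to the same mode, which is how multiple kernels recover large inter-frame displacements that a single kernel would miss.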
In this contribution, we propose a computationally fast algorithm to compute local feature histograms of an image. Existing histogram extraction evaluates the distribution of image features, such as color or edges, within a local image window centered at each pixel. This approach is computationally very demanding since it requires evaluating the feature distribution for every possible local window in the image. We develop an accumulated histogram propagation method that takes advantage of the fact that the local windows overlap and their feature histograms are highly correlated. Instead of evaluating the distributions independently, we propagate the distribution information in a 2D sweeping fashion. Our simulations show that the proposed algorithm significantly accelerates histogram extraction and enables computation of, e.g., posterior probabilities and likelihood values, which are frequently used for object detection and tracking, as well as in other vision applications such as calibration and recognition.
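A minimal sketch of 2D histogram propagation, assuming pre-quantized integer bin labels as input (function names are mine): each cell accumulates the histograms of its left and upper neighbors, after which any rectangular window's histogram is read out with four lookups instead of being rebuilt from scratch.

```python
import numpy as np

def propagated_histogram(img, nbins):
    """Propagate bin counts in a single 2D sweep: H[r+1, c+1] holds
    the histogram of all pixels above and to the left of (r, c)."""
    h, w = img.shape
    H = np.zeros((h + 1, w + 1, nbins), dtype=int)
    for r in range(h):
        for c in range(w):
            # inclusion-exclusion over the already-propagated neighbors
            H[r + 1, c + 1] = H[r, c + 1] + H[r + 1, c] - H[r, c]
            H[r + 1, c + 1, img[r, c]] += 1
    return H

def window_histogram(H, r0, c0, r1, c1):
    """Histogram of img[r0:r1, c0:c1] from four corner lookups."""
    return H[r1, c1] - H[r0, c1] - H[r1, c0] + H[r0, c0]
```

After the single sweep, every local window costs O(nbins) regardless of its size, which is what makes dense per-pixel likelihood evaluation affordable.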
We develop a level-set-based region growing method for automatic partitioning of color images into segments. Previous approaches to image segmentation either require a priori information to initialize regions, are computationally complex, or fail to establish color consistency and spatial connectivity at the same time. Here, we represent the segmentation problem as monotonic wave propagation in an absorbing medium with varying front speeds. We iteratively emit waves from the selected base points. At a base point, the local variance of the data reaches a minimum, which indicates that the base point is a suitable representative of its local neighborhood. We determine the local variance by applying a hierarchical gradient operator. The speed of the wave is determined by the color similarity of the point on the front to the current coverage of the wave, and by edge information. Thus, the wave advances in an anisotropic spatial-color space. The absorbing function acts as a stopping criterion for the wave front. We take advantage of fast marching methods to solve the Eikonal equation for finding the travel times of the waves. In addition, each region boundary is represented as a mixture of Gaussian models. This formulation enables segmentation of multi-modal color objects. Our method is superior to the linkage-based and snake-based region growing techniques since it prevents leakage and imposes compactness on the region without over-smoothing its boundary. Furthermore, we can deal with sharp corners and changes in topology. The automatic segmentation method is Eulerian, thus it is computationally efficient. Our experiments illustrate the robustness, accuracy, and effectiveness of the proposed method.
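A toy version of the wave propagation step, under assumptions of my own: arrival times are found Dijkstra-style on the pixel grid (a discrete stand-in for fast marching on the Eikonal equation), the front slows in proportion to the color distance from the running region mean, and a simple time budget plays the role of the absorbing function. Function and parameter names are illustrative.

```python
import heapq
import numpy as np

def grow_region(img, seed, stop_time):
    """Monotonic wave propagation from a base point: the wavefront
    advances fast through similar colors and slowly through
    dissimilar ones; pixels reached within the time budget form
    the segment."""
    h, w = img.shape[:2]
    arrival = np.full((h, w), np.inf)
    heap = [(0.0, seed)]                      # (arrival time, pixel)
    region_sum, region_n = np.zeros(img.shape[-1]), 0
    while heap:
        t, (r, c) = heapq.heappop(heap)
        if arrival[r, c] < np.inf or t > stop_time:   # settled / absorbed
            continue
        arrival[r, c] = t
        region_sum += img[r, c]
        region_n += 1
        mean = region_sum / region_n          # current coverage of the wave
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and arrival[nr, nc] == np.inf:
                # cost grows with color dissimilarity -> anisotropic speed
                cost = 1.0 + np.linalg.norm(img[nr, nc] - mean)
                heapq.heappush(heap, (t + cost, (nr, nc)))
    return arrival < np.inf                   # mask of the grown region
```

On a two-tone image the wave floods the seed's color region in unit steps and is effectively stopped at the color boundary, where the step cost exceeds the remaining time budget.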
KEYWORDS: Video, Image segmentation, Video compression, Motion estimation, Video surveillance, Motion measurement, Quantization, Video processing, Data processing, Error analysis
We propose a real-time object segmentation method for MPEG encoded video. Computational superiority is the main advantage of compressed-domain processing. We exploit the macro-block structure of the encoded video to decrease the spatial resolution of the processed data, which substantially reduces the computational load. Further reduction is achieved by temporal grouping of the intra-coded and estimated frames into a single feature layer. In addition to the computational advantage, compressed-domain video possesses important features attractive for object analysis. Texture characteristics are provided by the DCT coefficients. Motion information is readily available without incurring the cost of estimating a motion field. To achieve segmentation, the DCT coefficients for I-frames and the block motion vectors for P-frames are combined, and a frequency-temporal data structure is constructed. Starting from the blocks where the AC-coefficient energy and the local inter-block DC-coefficient variance are small, the homogeneous volumes are enlarged by evaluating the distance of candidate vectors to the volume characteristics. Affine motion models are fitted to the volumes. Finally, a hierarchical clustering stage iteratively merges the most similar parts to generate an object partition tree as an output.
We present a Gaussian-model-based approach for robust and automatic extraction of roads from very low-resolution satellite imagery. First, the input image is filtered to suppress the regions where the likelihood of a road pixel is low. Then, the road magnitude and orientation are computed by evaluating the responses from a quadruple orthogonal line filter set. A mapping from the line domain to the vector domain is used to determine the line strength and orientation for each point. A Gaussian model is fitted to each point, and matching models are updated recursively. The iterative process consists of finding the connected road points, fusing them with the previous image, passing them through the directional line filter set, and computing new magnitudes and orientations. The road segments are updated at each iteration, and the process continues until there are no further changes in the extracted roads. Experimental results demonstrate the success of the proposed algorithm.