In this paper, we discuss some of the challenges of computing mosaics from practical aerial surveillance video, and how these challenges can be overcome. One particular challenge is "burned-in" metadata, which occurs when metadata from the sensor and aircraft are rendered directly into the pixel data. Another obstacle is the presence of "black borders" that commonly appear at the edges of video frames and may vary in size and location from system to system. The paper demonstrates methods of robustly aligning and compositing frames so that these limitations do not significantly degrade the quality of the final mosaic.
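The black-border problem described above is commonly handled by masking out border pixels before registration. The sketch below is an illustration of that idea, not the paper's implementation; the function name and the intensity threshold `border_thresh` are hypothetical choices.

```python
import numpy as np

def valid_pixel_mask(frame, border_thresh=8):
    """Build a boolean mask of pixels safe to use for registration,
    excluding near-black border rows and columns of a grayscale frame.
    `border_thresh` is an illustrative intensity cutoff."""
    h, w = frame.shape
    mask = np.ones((h, w), dtype=bool)
    # Scan inward from each edge; a row/column whose mean intensity
    # stays below the threshold is treated as border padding.
    top = 0
    while top < h and frame[top].mean() < border_thresh:
        top += 1
    bottom = h
    while bottom > top and frame[bottom - 1].mean() < border_thresh:
        bottom -= 1
    left = 0
    while left < w and frame[:, left].mean() < border_thresh:
        left += 1
    right = w
    while right > left and frame[:, right - 1].mean() < border_thresh:
        right -= 1
    mask[:top] = False
    mask[bottom:] = False
    mask[:, :left] = False
    mask[:, right:] = False
    return mask
```

A registration routine would then restrict its error metric to pixels where the mask is true, so burned-in borders never bias the alignment.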
We investigate the characteristics of compression noise in images compressed by scalar quantization of the data's wavelet transform coefficients. Such quantization noise is both experimentally and theoretically shown to be spatially varying in the pixel domain, with statistical correlations between the errors at different pixel locations. A quantization covariance matrix is presented that can be used in general restoration scenarios where the observed image or images have been compressed by scalar quantization of image wavelet coefficients. Deblurring is presented as an example application of the quantization model, demonstrating the model's advantage over the common assumption of independent and identically distributed noise.
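The pixel-domain correlation of wavelet quantization noise can be illustrated with a tiny numerical sketch. Assuming each coefficient's quantization error is uniform with variance q²/12 and independent across coefficients, the pixel-domain covariance is C = WᵀDW for an orthonormal analysis matrix W; the Haar transform and the step sizes below are illustrative choices, not values from the paper.

```python
import numpy as np

# Orthonormal single-level Haar analysis matrix for a length-4 signal.
s = 1.0 / np.sqrt(2.0)
W = np.array([[s,  s, 0,  0],   # approximation coefficients
              [0,  0, s,  s],
              [s, -s, 0,  0],   # detail coefficients
              [0,  0, s, -s]])

# Hypothetical quantizer step sizes: fine for the approximation band,
# coarse for the detail band (chosen only for illustration).
q = np.array([2.0, 2.0, 8.0, 8.0])
D = np.diag(q**2 / 12.0)        # per-coefficient uniform-noise variance

# Pixel-domain quantization-noise covariance: C = W^T D W.
C = W.T @ D @ W
print(np.round(C, 3))           # off-diagonal entries are nonzero
```

Because the two subbands are quantized with different step sizes, C is not a scaled identity: neighboring pixel errors are correlated, which is exactly why an i.i.d. noise assumption is suboptimal here.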
Significant progress toward the development of a video annotation capability is presented in this paper. Research and development of an object tracking algorithm applicable to UAV video is described. Object tracking is necessary for attaching the annotations to the objects of interest. A methodology and format are defined for encoding video annotations using the SMPTE Key-Length-Value encoding standard. This provides the following benefits: non-destructive annotation, compliance with existing standards, video playback in systems that are not annotation-enabled, and support for a real-time implementation. A model real-time video annotation system is also presented, at a high level, using the MPEG-2 Transport Stream as the transmission medium. This work was accomplished to meet the Department of Defense's (DoD's) need for a video annotation capability. Current practice for creating annotated products is to capture a still image frame, annotate it using an Electronic Light Table application, and then pass the annotated image on as a product. That is not adequate for reporting or downstream cueing: it is too slow, and there is a severe loss of information. This paper describes a capability for annotating directly on the video.
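A SMPTE KLV triplet consists of a 16-byte universal label (key), a BER-encoded length, and the value bytes. The sketch below shows that wire format in miniature; the function names are hypothetical, and the all-zero key in the usage example is a placeholder rather than a registered SMPTE label.

```python
def ber_length(n):
    """Encode a length using BER short form (< 128) or long form,
    as used by the SMPTE KLV encoding standard."""
    if n < 128:
        return bytes([n])                       # short form: one byte
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body     # long form: count + bytes

def klv_pack(key16, value):
    """Pack one KLV triplet: 16-byte universal key, BER length, value."""
    assert len(key16) == 16, "SMPTE KLV universal labels are 16 bytes"
    return key16 + ber_length(len(value)) + value
```

Because the annotation rides alongside the video as KLV metadata rather than being drawn into the pixels, a non-KLV-aware player simply ignores it and plays the unmodified video, which is the "non-destructive" property cited above.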
Compression of imagery by quantization of the data's transform
coefficients introduces an error in the imagery upon decompression.
When processing compressed imagery, often a likelihood term is used to
provide a statistical description of how the observed data are related
to the original noise-free data. This work derives the statistical
relationship between compressed imagery and the original imagery,
which is found to be embodied in a covariance matrix that is, in
general, non-diagonal. Although the derivations are valid for transform
coding in general, the work is motivated by considering examples for
the specific cases of compression using the discrete cosine transform
and the discrete wavelet transform. An example application of
motion-compensated temporal filtering is provided to show how the
presented likelihood term might be used in a restoration scenario.
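For the DCT case mentioned above, the same covariance construction applies with the DCT basis in place of the wavelet basis. The sketch below builds a small orthonormal DCT-II matrix and shows that, with frequency-dependent step sizes, the spatial-domain noise covariance C = TᵀDT is non-diagonal; the step sizes are illustrative, not values from the paper.

```python
import numpy as np

N = 4
n = np.arange(N)
# Orthonormal DCT-II basis (rows are basis vectors).
T = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
T *= np.sqrt(2.0 / N)
T[0] /= np.sqrt(2.0)

# Illustrative JPEG-style step sizes, coarser at higher frequency.
q = np.array([4.0, 8.0, 16.0, 32.0])
D = np.diag(q**2 / 12.0)        # coefficient-domain noise variances

# Spatial-domain quantization-noise covariance; an i.i.d. noise
# assumption would instead make C a scaled identity matrix.
C = T.T @ D @ T
```

In a likelihood term, using this C (rather than σ²I) weights residuals according to how reliably each spatial location survives quantization.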
The aim of this research is to recompress JPEG-standard images in order to minimize storage and/or communications bandwidth requirements. In our approach, we convert existing JPEG images into JPEG 2000 images. The proposed image restoration method is applied to improve visual quality when the bit rate is low and visually annoying artifacts appear in the existing JPEG image. The restoration algorithm makes use of the DCT quantization noise model along with a Markov random field (MRF) prior model for the original image in order to formulate the restoration in a Bayesian framework. A convex model based on the maximum a posteriori (MAP) principle is applied to restore images. The restored image is then compressed with JPEG 2000. A visual quality metric based on the cumulative distribution function (CDF) has been developed to measure coding artifacts in large JPEG images. Perceptual distortion analysis is also included in this paper.
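The convex MAP formulation above can be caricatured in one dimension: minimize an MRF-style smoothness cost while keeping the DCT coefficients inside their known quantization intervals (the constraint set the quantizer guarantees). The sketch below alternates a gradient step on the prior with a projection onto that set; every parameter is an illustrative choice, not the paper's algorithm.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II analysis matrix (rows are basis vectors)."""
    n = np.arange(N)
    T = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    T *= np.sqrt(2.0 / N)
    T[0] /= np.sqrt(2.0)
    return T

def restore(coeff_idx, q, iters=50, step=0.1):
    """Toy 1-D analogue of convex MAP restoration of a JPEG-style
    signal. `coeff_idx` are quantizer indices round(c / q); the true
    coefficients therefore lie in [(idx - 0.5) q, (idx + 0.5) q]."""
    N = len(coeff_idx)
    T = dct_matrix(N)
    lo = (coeff_idx - 0.5) * q
    hi = (coeff_idx + 0.5) * q
    x = T.T @ (coeff_idx * q)          # start from the decoded signal
    for _ in range(iters):
        # Gradient of 0.5 * sum of squared first differences
        # (a quadratic MRF smoothness prior).
        g = np.zeros_like(x)
        d = np.diff(x)
        g[:-1] -= d
        g[1:] += d
        x -= step * g
        # Project back onto the quantization constraint set.
        c = np.clip(T @ x, lo, hi)
        x = T.T @ c
    return x
```

Both the data-fidelity set and the prior are convex, so this alternation is a standard (if simplified) way to approach the MAP solution.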
Construction of panoramic mosaics from video is well established in
both the research and commercial communities, but current methods
generally perform the time-consuming registration procedure entirely
from the sequence's pixel data. Video sequences usually exist in
compressed format, often MPEG-2; while specialized hardware and
highly-optimized software can often quickly create accurate mosaics
from a video sequence's pixels, these products do not make efficient
use of all information available in a compressed video stream. In
particular, MPEG video files generally contain significant information
about global camera motion in their motion vectors. This paper
describes how to exploit the motion vector information so that global
motion can be estimated extremely quickly and accurately, which leads
to accurate panoramic mosaics. The major obstacle in generating
mosaics with this method is the variable quality of MPEG motion vectors,
both within a stream from a particular MPEG encoder and between
streams compressed with different encoders. The paper discusses
methods of robustly estimating global camera motion from the observed
motion vectors, including the use of least absolute value estimators,
variable model order for global camera motion, and motion vector
weighting depending on their estimated accuracy. Experimental results
are presented to demonstrate the performance of the algorithm.
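The least-absolute-value estimation mentioned above can be sketched for the simplest (translational) global-motion model: fit one displacement to all macroblock motion vectors under an L1 cost, solved by iteratively reweighted least squares so that outlier vectors (e.g. from independently moving objects or poor encoder search) carry little weight. The function name and iteration counts are illustrative, not the paper's.

```python
import numpy as np

def global_translation_lad(vectors, iters=10, eps=1e-6):
    """Least-absolute-deviation estimate of global (translational)
    camera motion from an (N, 2) array of per-macroblock (dx, dy)
    MPEG motion vectors, via iteratively reweighted least squares."""
    est = np.median(vectors, axis=0)         # robust starting point
    for _ in range(iters):
        r = np.linalg.norm(vectors - est, axis=1)
        w = 1.0 / np.maximum(r, eps)         # IRLS weights for L1 cost
        est = (w[:, None] * vectors).sum(axis=0) / w.sum()
    return est
```

Higher-order (affine or projective) global models fit the same way, with the IRLS weights additionally scaled by each vector's estimated reliability, as the abstract describes.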
KEYWORDS: Video, Image segmentation, Video surveillance, Computer programming, Video compression, Surveillance, Image compression, Detection and tracking algorithms, Video processing, Sensors
Temporal segmentation of video in the compressed domain is becoming increasingly popular due to its computational advantages over video decompression followed by pixel-domain segmentation. This paper discusses the advantages of compressed-domain processing, and proposes a computationally efficient method of detecting scene changes without reconstructing the video. The target application provides requirements that allow the algorithm to avoid complicated processing that searches for unnatural scene changes such as dissolves, fades, and wipes that are common studio effects. The paper provides experimental results to demonstrate operation of the algorithm on real data.
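One widely used compressed-domain cue for an abrupt cut is the fraction of intra-coded macroblocks in a P-frame: across a scene change the encoder cannot find good temporal predictions, so that fraction spikes. The sketch below illustrates this general idea; it is not the paper's specific detector, and the threshold value is an illustrative assumption.

```python
def detect_cuts(intra_fractions, thresh=0.6):
    """Return indices of P-frames whose intra-coded macroblock
    fraction exceeds the threshold -- a simple compressed-domain
    indicator of an abrupt scene change.  `thresh` is illustrative."""
    return [i for i, f in enumerate(intra_fractions) if f > thresh]
```

Because macroblock coding types are available directly from the parsed bitstream, this statistic costs far less to compute than decoding frames and differencing pixels.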