As the demand for higher quality and higher resolution video increases, many applications fail to meet this demand due to bandwidth restrictions. One factor contributing to this problem is the high bitrate requirement of the intra-coded Instantaneous Decoding Refresh (IDR) frames featured in all video coding standards. Frequent coding of IDR frames is essential for error resilience, as it prevents error propagation. However, because each IDR frame consumes a large portion of the available bitrate, the quality of subsequent coded frames suffers from high levels of compression. This work presents a new technique, Spatial Resampling of IDR Frames (SRIF), and shows how it can improve rate-distortion performance by providing a higher and more consistent level of video quality at low bitrates.
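The abstract leaves the resampling filters unspecified; a minimal sketch of the underlying idea, assuming a simple 2x box-filter downsample and nearest-neighbour upsample as placeholders for whatever filters SRIF actually uses, is:

```python
import numpy as np

def downsample2x(frame: np.ndarray) -> np.ndarray:
    """Halve each dimension by averaging 2x2 blocks (a simple box filter)."""
    h, w = frame.shape
    f = frame[:h - h % 2, :w - w % 2]
    return f.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2x(frame: np.ndarray) -> np.ndarray:
    """Return to full resolution by pixel replication (nearest neighbour)."""
    return frame.repeat(2, axis=0).repeat(2, axis=1)

# Encoder side: the IDR frame is spatially downsampled before intra coding,
# so it costs roughly a quarter of the bits of a full-resolution IDR frame.
idr = np.random.rand(288, 352)      # hypothetical CIF luma plane
coded = downsample2x(idr)           # this is what would be intra coded

# Decoder side: the reconstructed low-resolution frame is upsampled back to
# full resolution and used as the reference for subsequent inter-coded frames.
reference = upsample2x(coded)
assert reference.shape == idr.shape
```

The bitrate saving on the IDR frame can then be reallocated to the inter-coded frames that follow it, which is where the more consistent quality comes from.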
KEYWORDS: High dynamic range imaging, Image compression, Video compression, Video, Visibility, Computer programming, Wavelets, Spatial frequencies, Visualization, Transform theory
High Dynamic Range (HDR) technology can offer high levels of immersion with a dynamic range meeting and exceeding that of the Human Visual System (HVS). A primary drawback of HDR images and video is that their memory and bandwidth requirements are significantly higher than for conventional images and video, and many bits can be wasted coding redundant, imperceptible information. The challenge is therefore to develop means of efficiently compressing HDR imagery to a manageable bit rate without compromising perceptual quality. In this paper, we build on our previous work and propose a compression method for both HDR images and video, based on an HVS-optimised wavelet subband weighting method. The method has been fully integrated into a JPEG 2000 codec for HDR image compression and implemented as a pre-processing step for HDR video coding (an H.264 codec is used as the host codec for video compression). Experimental results indicate that the proposed method outperforms previous approaches and operates in accordance with characteristics of the HVS, tested objectively using an HDR Visible Difference Predictor (VDP). Aiming to further improve the compression performance of our method, we additionally present the results of a psychophysical experiment, carried out with the aid of a high dynamic range display, to determine the difference in the noise visibility threshold between HDR and Standard Dynamic Range (SDR) luminance edge masking. Our findings show that noise has increased visibility on the bright side of a luminance edge, while masking is more consistent on the darker side of the edge.
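The per-subband weights in the paper are derived from HVS measurements that are not reproduced here; a minimal sketch of subband weighting with PyWavelets, using the JPEG 2000 9/7 wavelet (`bior4.4`) and placeholder weights, might look like:

```python
import numpy as np
import pywt  # PyWavelets

def weight_subbands(img, weights, wavelet="bior4.4"):
    """Scale each detail level of the wavelet decomposition by a perceptual
    weight before quantisation/coding. The weights here are placeholders;
    the paper derives them from HVS contrast sensitivity measurements."""
    coeffs = pywt.wavedec2(img, wavelet, level=len(weights))
    out = [coeffs[0]]                                 # approximation band, unweighted
    for (ch, cv, cd), w in zip(coeffs[1:], weights):  # coarse -> fine levels
        out.append((ch * w, cv * w, cd * w))
    return out

hdr_luma = np.random.rand(256, 256)   # hypothetical HDR luminance plane
weighted = weight_subbands(hdr_luma, weights=[1.0, 0.8, 0.5])

# Reconstruct; in the codec, the inverse weights would be applied to the
# decoded coefficients before this synthesis step.
recon = pywt.waverec2(weighted, "bior4.4")
```

Down-weighting a subband shrinks its coefficients, so the quantiser spends fewer bits there; choosing the weights from HVS sensitivity is what makes the discarded information imperceptible.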
In this paper, a redundant picture formation algorithm that takes a given redundancy rate constraint into account is presented for error-resilient wireless video transmission without reliance on retransmissions. The algorithm ranks macroblocks (MBs) according to two proposed priority metrics. The first metric is based on an end-to-end distortion model and aims to maximise the reduction in distortion per redundancy bit; the end-to-end distortion accounts for the effects of error propagation, the mismatch between the primary and redundant descriptions, and error concealment. Macroblocks providing a large distortion reduction for fewer bits spent are assigned a higher priority. The second metric employs the variance of the motion vectors of a macroblock and those of its neighbouring blocks. Results show that the rate-distortion metric outperforms the other examined metrics by up to 2 dB. Moreover, gains over existing error resilience schemes, such as LA-RDO, are demonstrated.
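The first metric requires the full end-to-end distortion model, but the second admits a compact illustration. The exact neighbourhood and normalisation are not given in the abstract, so this sketch assumes an 8-connected macroblock neighbourhood and uses the total variance of the (dx, dy) components as the priority score:

```python
import numpy as np

def mv_variance_priority(mv_field: np.ndarray, i: int, j: int) -> float:
    """Priority of macroblock (i, j) from the variance of its motion vector
    and those of its 8-connected neighbours. High variance suggests complex
    motion that is hard to conceal, hence a higher priority for redundancy.

    mv_field has shape (rows, cols, 2): one (dx, dy) vector per macroblock.
    """
    rows, cols, _ = mv_field.shape
    neigh = [
        mv_field[r, c]
        for r in range(max(0, i - 1), min(rows, i + 2))
        for c in range(max(0, j - 1), min(cols, j + 2))
    ]
    return float(np.var(np.stack(neigh), axis=0).sum())

# Rank all macroblocks; redundant copies of the top-ranked ones would be
# coded until a hypothetical redundancy bit budget is exhausted.
mvs = np.random.randn(18, 22, 2)    # hypothetical CIF motion field
scores = [(mv_variance_priority(mvs, i, j), (i, j))
          for i in range(18) for j in range(22)]
ranked = sorted(scores, reverse=True)
```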
This paper proposes a concealment-based approach to generating the side information and estimating the correlation noise for low-delay, pixel-based, distributed video coding. The proposed method employs a macroblock pattern similar to the one used in the dispersed-type FMO of H.264 to group the macroblocks of each frame into intra-coded (key) and Wyner-Ziv groups. Temporal concealment is then used at the decoder to "conceal" the missing macroblocks, i.e. to estimate the side information by predicting the Wyner-Ziv macroblocks. The actual intra-coded/decoded macroblocks are used for estimating the correlation noise. The results indicate significant performance improvements relative to existing motion extrapolation based approaches (up to 25% bit rate reduction).
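A minimal sketch of the macroblock grouping and its use at the decoder, assuming a plain checkerboard as the dispersed pattern:

```python
import numpy as np

def dispersed_pattern(mb_rows: int, mb_cols: int) -> np.ndarray:
    """Checkerboard grouping of macroblocks, similar in spirit to the
    dispersed-type FMO of H.264: True -> intra-coded (key) macroblock,
    False -> Wyner-Ziv macroblock."""
    r, c = np.indices((mb_rows, mb_cols))
    return (r + c) % 2 == 0

key_mask = dispersed_pattern(18, 22)    # hypothetical CIF macroblock grid

# At the decoder, each Wyner-Ziv macroblock is "concealed" from the decoded
# key macroblocks surrounding it: the concealed block is the side information,
# and the decoded key blocks also drive the correlation-noise estimate.
wz_positions = np.argwhere(~key_mask)
print(len(wz_positions), "Wyner-Ziv macroblocks of", key_mask.size)
```

The checkerboard guarantees every Wyner-Ziv macroblock is bordered by decoded key macroblocks, which is what makes temporal concealment a plausible side-information generator.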
MIMO (multiple-input, multiple-output) systems offer the potential for increased throughput and enhanced quality of service for multimedia transmission. The underlying multipath environment requires new error resilience techniques if these benefits are to be fully exploited. Different MIMO architectures produce error patterns of somewhat different characteristics. This paper proposes the use of multiple description coding (MDC) as an approach that outperforms standard-based error resilience techniques in the majority of these cases. Results obtained with a random packet-error generator are extended through the use of realistic MIMO channel scenarios and argue in favour of deploying an MDC-based video transmission system. Singular value decomposition (SVD) is used to create orthogonal sub-channels within a MIMO system which provide, depending on their respective gains and fading characteristics, an efficient means of mapping video content. Results indicate improvements in the average PSNR of decoded test sequences of up to 3 dB (5 dB in the region of high packet error rates) compared with standard, single-description video transmission, supported by significant subjective quality enhancements.
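The SVD-based sub-channel mapping can be sketched directly with NumPy; the 4x4 channel, the Rayleigh fading realisation and the gain-ordered description mapping below are illustrative assumptions:

```python
import numpy as np

# A 4x4 MIMO channel (hypothetical i.i.d. Rayleigh fading realisation).
H = (np.random.randn(4, 4) + 1j * np.random.randn(4, 4)) / np.sqrt(2)

# SVD decomposes the channel into parallel orthogonal sub-channels:
# H = U @ diag(s) @ Vh, with sub-channel gains given by the singular values.
U, s, Vh = np.linalg.svd(H)
print("sub-channel gains:", s)

# An MDC mapping in the spirit of the paper: send the more important
# description(s) on the stronger sub-channels (larger singular values) and
# the less important ones on the weaker sub-channels.
descriptions = ["MDC description 0", "MDC description 1",
                "MDC description 2", "MDC description 3"]
for gain, desc in zip(s, descriptions):   # s is sorted in descending order
    print(f"{desc} -> sub-channel with gain {gain:.2f}")
```

Because each description is independently decodable, losing the descriptions carried on the weak sub-channels degrades quality gracefully rather than catastrophically.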
We propose a method of providing error-resilient H.264 video over 802.11 wireless channels by using a feedback mechanism which does not incur the additional delay typically found in ARQ-type feedback. Our system uses the TCP/IP and UDP/IP protocols, located between the medium access control (MAC) layer of 802.11 and the H.264 video application layer. The UDP protocol is used to transfer time-sensitive video data without delay; however, packet losses introduce excessive artifacts which propagate to subsequent frames. Error resilience is achieved by a feedback mechanism: the decoder conveys the packet-loss information to the video source as small TCP packets carrying negative acknowledgements. By using multiple reference frames, slice-based coding and timely intra refresh, the encoder uses this feedback to perform subsequent temporal prediction without propagating the error to future frames. We take static measurements of an actual channel and use the packet loss and delay patterns to test our algorithms. Simulations show an improvement of 0.5 to 5 dB in PSNR over plain UDP-based video transmission. Our method improves the overall quality of service of interactive video transmission over wireless LAN and can serve as a model for future media-aware wireless network protocol designs.
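A minimal sketch of the encoder side of this feedback loop; the class, buffer depth and NACK handling below are assumptions, not the paper's implementation:

```python
class FeedbackAwareEncoder:
    """Tracks which reference frames are still intact at the decoder and
    steers temporal prediction away from damaged ones."""

    def __init__(self, num_refs: int = 5):
        self.num_refs = num_refs          # multiple reference frames (H.264)
        self.clean_refs: list[int] = []   # frames believed intact at decoder

    def on_nack(self, lost_frame: int) -> None:
        # Errors propagate forward, so every frame from the loss onward is
        # suspect and must not be used for temporal prediction.
        self.clean_refs = [f for f in self.clean_refs if f < lost_frame]

    def pick_reference(self):
        # Predict from the newest reference still known to be clean;
        # if none remains, fall back to a timely intra refresh (None).
        return self.clean_refs[-1] if self.clean_refs else None

    def on_frame_coded(self, frame: int) -> None:
        self.clean_refs = (self.clean_refs + [frame])[-self.num_refs:]

enc = FeedbackAwareEncoder()
for f in range(4):
    enc.on_frame_coded(f)
enc.on_nack(2)                  # decoder reports frame 2 lost (via TCP NACK)
print(enc.pick_reference())     # -> 1, the newest undamaged reference
```

Unlike ARQ, nothing is retransmitted and playout is never stalled; the feedback only changes which frame the encoder predicts from next.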
The EU FP6 WCAM (Wireless Cameras and Audio-Visual Seamless Networking) project aims to study, develop and validate a wireless, seamless and secure end-to-end networked audio-visual system for video surveillance and multimedia distribution applications. This paper describes the video transmission aspects of the project, with contributions in the area of H.264 video delivery over wireless LANs.
The planned demonstrations under WCAM include transmission of H.264 coded material over 802.11b/g networks, with TCP/IP and UDP/IP employed as the transport and network layers over unicast and multicast links. UDP-based unicast and multicast transmission poses the problem of packet erasures, while TCP-based transmission is associated with long delays and the need for a large jitter buffer. This paper presents measurement data collected at the trial site along with analysis of the data, including characterisation of the channel conditions as well as recommendations on the optimal operating parameters for each of the above transmission scenarios (e.g. jitter buffer sizes, packet error rates, etc.). Recommendations for error resilient coding algorithms and packetisation strategies are made in order to mitigate the effect of the observed packet erasures on the quality of the transmitted video. Advanced error concealment methods for masking the effects of packet erasures at the receiver/decoder are also described.
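As an illustration of one of the recommended operating parameters, a jitter buffer can be sized from a measured delay trace; the percentile rule and the synthetic delay distribution below are assumptions, not the trial-site data:

```python
import numpy as np

def jitter_buffer_ms(delay_samples_ms: np.ndarray, percentile: float = 99.0) -> float:
    """Size a playout jitter buffer from measured one-way delays: buffer
    enough to absorb the delay variation up to the chosen percentile.
    Packets arriving later than that are treated as lost."""
    return float(np.percentile(delay_samples_ms, percentile)
                 - delay_samples_ms.min())

# Hypothetical heavy-tailed delay trace (ms) from an 802.11b/g link:
delays = np.random.gamma(shape=2.0, scale=15.0, size=10_000) + 5.0
print(f"suggested jitter buffer: {jitter_buffer_ms(delays):.0f} ms")
```

The percentile choice trades latency against residual late-loss rate, which is exactly the TCP-versus-UDP tension the abstract describes.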
The imminent arrival of mobile video telephony will enable deaf people to communicate, as hearing people have been able to do for some time now, anytime and anywhere in their own language: sign language. At low bit rates, coding of sign language sequences is very challenging due to the high level of motion and the need to maintain good image quality to aid comprehension. This paper presents optimised coding of sign language video at low bit rates in a way that favours comprehension of the compressed material by deaf users. Our coding suggestions are based on an eye-tracking study that we conducted to analyse the visual attention of sign language viewers; the results of this study are included in this paper. Analysis and results for two coding methods, one using MPEG-4 video objects and the other using foveation filtering, are presented. Results with foveation filtering are very promising, offering a considerable decrease in bit rate in a way that is compatible with the visual attention patterns of deaf people as recorded in the eye-tracking study.
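A crude sketch of foveation filtering, assuming a single fixation point on the signer's face (the region eye-tracking studies of sign language viewers typically report) and illustrative blur radii:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def foveate(frame: np.ndarray, fix_y: int, fix_x: int,
            radii=(40, 90, 160)) -> np.ndarray:
    """Keep the region around the fixation point sharp and blur progressively
    with eccentricity. The radii and blur strengths are illustrative
    assumptions, not the paper's calibrated values."""
    h, w = frame.shape
    y, x = np.indices((h, w))
    ecc = np.hypot(y - fix_y, x - fix_x)    # distance from fixation point
    out = frame.copy()
    for radius, sigma in zip(radii, (1.0, 2.5, 5.0)):
        # Each ring further from the fixation point gets a stronger blur.
        out = np.where(ecc > radius, gaussian_filter(frame, sigma), out)
    return out

frame = np.random.rand(288, 352)                 # hypothetical luma plane
foveated = foveate(frame, fix_y=80, fix_x=176)   # fixation near the face
```

The heavily blurred periphery costs far fewer bits under a standard encoder, which is where the bit rate saving comes from, while the fixated face region keeps the quality comprehension depends on.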