Given the rapid changes in telecommunication systems and their increasing dependence on artificial intelligence, it is increasingly important to have models that can perform well under different, possibly adverse, conditions. Deep Neural Networks (DNNs) using convolutional layers are state-of-the-art in many tasks in communications. However, in other domains, like image classification, DNNs have been shown to be vulnerable to adversarial perturbations: imperceptible, carefully crafted noise that, when added to the data, fools the model into misclassification. This calls into question the security of DNNs in communication tasks, and in particular in modulation recognition. We propose a novel framework to test the robustness of current state-of-the-art models, in which the adversarial perturbation strength depends on the signal strength and is measured with the "signal-to-perturbation ratio" (SPR). We show that current state-of-the-art models are susceptible to these perturbations. In contrast to current research on image classification, modulation recognition gives us easily accessible insight into the usefulness of the features learned by DNNs, by inspecting the constellation space. When analyzing these vulnerable models, we found that adversarial perturbations do not shift the symbols towards the nearest classes in constellation space. This shows that DNNs do not base their decisions on signal statistics that are important for the Bayes-optimal modulation recognition model, but rather on spurious correlations in the training data. Our feature analysis and proposed framework can help in the task of finding better models for communication systems.
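To illustrate the SPR-based perturbation budget described above, here is a minimal sketch of scaling an arbitrary perturbation direction so that it meets a target SPR in dB. The function name, the white-noise stand-ins for the signal and the adversarial direction, and the power normalization are ours, not the paper's:

```python
import numpy as np

def apply_perturbation(signal, direction, spr_db):
    """Scale `direction` so that the signal-to-perturbation ratio equals spr_db."""
    p_sig = np.mean(np.abs(signal) ** 2)
    # Normalize the direction to unit average power: mean(|unit|^2) == 1.
    unit = direction / np.linalg.norm(direction) * np.sqrt(direction.size)
    # SPR (linear) = P_signal / P_perturbation  =>  solve for perturbation power.
    p_pert = p_sig / (10 ** (spr_db / 10))
    return signal + np.sqrt(p_pert) * unit

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)   # stand-in for a received signal
g = rng.standard_normal(1024)   # stand-in for an adversarial direction (e.g. a gradient)
x_adv = apply_perturbation(x, g, spr_db=20.0)
achieved = 10 * np.log10(np.mean(x ** 2) / np.mean((x_adv - x) ** 2))
```

By construction, `achieved` matches the requested 20 dB up to floating-point error, so the perturbation budget adapts to the signal strength rather than being a fixed epsilon as in typical image-classification attacks.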
In recent years, light field imaging has attracted the attention of the academic and industrial communities thanks to its enhanced rendering capabilities, which make it possible to visualise contents in a more immersive and interactive way. However, those enhanced capabilities come at the cost of a considerable increase in content size when compared to traditional image and video applications. Thus, advanced compression schemes are needed to efficiently reduce the volume of data for storage and delivery of light field content. In this paper, we introduce a novel method for compression of light field images. The proposed solution is based on a graph learning approach to estimate the disparity among the views composing the light field. The graph is then used to reconstruct the entire light field from an arbitrary subset of encoded views. Experimental results show that our method is a promising alternative to current compression algorithms for light field images, with notable gains across all bitrates with respect to the state of the art.
KEYWORDS: Video, Computer programming, Compressed sensing, Receivers, Video compression, Matrices, Distortion, Signal to noise ratio, Signal attenuation, Motion estimation
We propose a new scheme for wireless video multicast based on compressed sensing. It has the property of graceful degradation and, unlike systems adhering to traditional separate coding, it does not suffer from a cliff effect. Compressed sensing is applied to generate measurements of equal importance from a video, such that a receiver with a better channel naturally has more information at hand to reconstruct the content, without penalizing others. We experimentally compare different random matrices at the encoder side in terms of their performance for video transmission. We further investigate how properties of natural images can be exploited to improve the reconstruction performance by transmitting a small amount of side information. We also propose a way of exploiting inter-frame correlation by extending only the decoder. Finally, we compare our scheme in simulations with a different scheme targeting the same problem, and find competitive results for some channel configurations.
KEYWORDS: Video, Computer programming, Video coding, Optimization (mathematics), Statistical modeling, Video compression, Televisions, Performance modeling, Systems modeling, Video processing
We propose a framework for popularity-driven rate allocation in H.264/MVC-based multi-view video communications when the overall rate and the rate necessary for decoding each view are constrained in the delivery architecture. We formulate a rate allocation optimization problem that takes into account the popularity of each view among the client population and the rate-distortion characteristics of the multi-view sequence, so that the performance of the system is maximized in terms of popularity-weighted average quality. We consider the cases where the global bit budget or the decoding rate of each view is constrained. We devise a simple rate-video-quality model that accounts for the characteristics of the inter-view prediction schemes typical of multi-view video. The video quality model is used to solve the rate allocation problem with the help of an interior point optimization method. We then show through experiments that the proposed rate allocation scheme clearly outperforms baseline solutions in terms of popularity-weighted video quality. In particular, we demonstrate that joint knowledge of the rate-distortion characteristics of the video content, its coding dependencies, and the popularity factor of each view is key to achieving good coding performance in multi-view video systems.
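The popularity-weighted objective can be illustrated with a toy model. Assuming a logarithmic rate-quality curve per view (our simplification; the paper uses a rate-video-quality model fitted to inter-view prediction and solves with an interior point method), the optimum under a single global budget has a closed form, proportional allocation:

```python
import numpy as np

def allocate(popularity, total_rate):
    """Maximize sum_v p_v * log(r_v) subject to sum_v r_v = total_rate.

    With logarithmic rate-quality curves, the Lagrangian optimality condition
    p_v / r_v = const gives rates proportional to popularity.
    """
    p = np.asarray(popularity, dtype=float)
    p = p / p.sum()
    return p * total_rate

# Three views with popularities 50%, 30%, 20% share a 10 Mbps budget.
rates = allocate([0.5, 0.3, 0.2], total_rate=10.0)
```

Popular views receive proportionally more rate; the constrained per-view decoding rates of the actual formulation would cap these allocations.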
We consider streaming video content over an overlay network of peer nodes. We propose a novel streaming strategy built on utility-based packet scheduling and proportional resource sharing in order to fight against free-riders. Each peer employs a mesh-pull mechanism to organize the download of media packets from its neighbours. For efficient resource utilization, data units are requested from neighbours based on their utility. The utility of a packet is driven by both its importance for the video reconstruction quality at the receiving peer and its popularity within the peer neighbourhood. In order to discourage free-riding in the system, requesting peers then share the upload bandwidth of a sending peer in proportion to their transmission rate to that peer. Our simulation results show that the proposed protocols increase the performance of a mesh-pull P2P streaming system. Significant improvements over existing solutions are registered in terms of average quality and average decoding rate.
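The proportional-sharing rule that discourages free-riding can be sketched in a few lines. The function name and the example figures are ours; the abstract only specifies that a sender divides its upload among requesters in proportion to their transmission rate to it:

```python
def share_upload(upload_capacity, contribution):
    """Split a sender's upload bandwidth among requesting peers in proportion
    to each requester's own transmission rate to that sender."""
    total = sum(contribution.values())
    return {peer: upload_capacity * rate / total
            for peer, rate in contribution.items()}

# peerC uploads nothing to the sender, so it receives nothing in return.
alloc = share_upload(1000.0, {"peerA": 300.0, "peerB": 100.0, "peerC": 0.0})
```

A free-rider contributing zero rate is allocated zero upload bandwidth, which is exactly the incentive mechanism the scheme relies on.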
KEYWORDS: Video, Computer programming, Control systems, Computer simulations, Video coding, Receivers, Video processing, Smoothing, Device simulation, Scalable video coding
We address the problem of the proper choice of the thickness of pre-encoded video layers in congestion-controlled streaming applications. While congestion control makes it possible to distribute the network resources fairly among the different video sessions, it generally imposes an adaptation of the streaming rate when the playback delay is constrained. This can be achieved by adding or dropping layers in scalable video, along with efficient smoothing of the video streams. The size of the video layers directly drives the convergence of the congestion control to the stable state. In this paper, we derive bounds on the encoding rates of the video layers that depend on the prefetch delay available for stream smoothing. We then discuss the practical scheduling aspects related to the transmission of layered video when delays are constrained. We finally describe an implementation of the proposed scheduler and analyze its performance in NS-2 simulations. We show that it is possible to derive a media-friendly rate allocation for layered video in different transmission scenarios, and that the proper choice of the layer thickness improves the average video quality when the prefetch delay is constrained.
This paper presents a distributed coding scheme for the representation of 3D scenes captured by stereo omni-directional cameras. We consider a scenario where images captured from two different viewpoints are encoded independently, with a balanced rate distribution among the different cameras. The distributed coding is built on multiresolution representation and partitioning of the visual information in each camera. The encoder transmits one partition after entropy coding, as well as the syndrome bits resulting from the channel encoding of the other partition. The decoder exploits the intra-view correlation and attempts to reconstruct the source image by combination of the entropy-coded partition and the syndrome information. At the same time, it exploits the inter-view correlation using motion estimation between images from different cameras. Experiments demonstrate that the distributed coding solution performs better than a scheme where images are handled independently, and that the coding rate stays balanced between encoders.
We consider the scenario of video streaming in peer-to-peer networks. A single media server delivers the video content to a large number of peer hosts by taking advantage of their forwarding capabilities. We propose a scheme that enables the peers to efficiently distribute the media stream among themselves. Each peer connects to the streaming server via multiple multicast trees that provide robustness in the event of peer disconnection. Moreover, adaptive forwarding of the media content at each peer is enabled by labeling the packets with their importance for the reconstruction of the media stream. We study the performance of the proposed scheme as a function of system parameters such as the play-out delay of the media application, the peer population size, and the number of multicast trees employed by the scheme. We show that by prioritizing the forwarding of individual packets at each peer, improved performance is achieved over conventional peer-to-peer systems where no such prioritization is deployed. The gains in performance are particularly significant for low-delay applications and large peer populations.
A system for sender-driven video streaming from multiple servers to a single receiver is considered in this paper. The receiver monitors incoming packets on each network path and returns, to the senders, estimates of the available bandwidth on all the network paths. The senders in turn employ this information to compute transmission schedules for packets belonging to the video stream sent to the receiver. An optimization framework is proposed that enables the senders to compute their transmission schedules in a distributed way,
and yet to dynamically coordinate them over time such that the resulting video quality at the receiver is maximized. Experimental results demonstrate that the proposed streaming framework provides superior performance over distortion-agnostic transmission schemes that perform proportional packet scheduling based only on the available network bandwidths.
This paper presents a new, highly flexible, scalable image coder based on a Matching Pursuit expansion. The dictionary of atoms is built by translation, rotation and anisotropic refinement of Gaussian functions, in order to efficiently capture edges in natural images. At the same time, the dictionary is invariant under isotropic scaling, which interestingly leads to very simple spatial resizing operations. It is shown that the proposed scheme is competitive with state-of-the-art coders when the compressed image is transcoded to a lower (octave-based) spatial resolution. In contrast to common compression formats, our bit-stream can moreover be decoded easily and efficiently at any spatial resolution, even with irrational re-scaling factors. At the same time, the Matching Pursuit algorithm provides an intrinsically progressive stream. This worthy feature allows for easy rate filtering operations, where the least important atoms are simply discarded to fit restrictive bandwidth constraints. Our scheme is finally shown to compare favorably to state-of-the-art progressive coders for moderate to large rate reductions.
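The rate filtering operation on a progressive stream can be sketched as follows: because atoms arrive roughly in order of importance, truncating to the most significant coefficients is all a bandwidth-constrained node has to do. The function name and the sample coefficients are illustrative, not from the paper:

```python
import numpy as np

def rate_filter(coeffs, budget):
    """Keep only the `budget` most significant atoms of a progressive
    Matching Pursuit stream; the rest are simply discarded."""
    order = np.argsort(-np.abs(coeffs))   # indices sorted by decreasing magnitude
    kept = np.zeros_like(coeffs)
    idx = order[:budget]
    kept[idx] = coeffs[idx]
    return kept

c = np.array([0.1, -3.0, 0.5, 2.0, -0.2])  # toy atom coefficients
f = rate_filter(c, budget=2)               # keep the two strongest atoms
```

No re-encoding is needed: filtering is a pure truncation of the stream, which is what makes the format attractive for heterogeneous bandwidth constraints.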
We propose a new error-resilient scheme for broadcast-quality interactive MPEG-2 video streams transmitted over lossy packet networks. A new scene-complexity adaptive mechanism, namely Adaptive MPEG-2 Information Structuring and Protection (AMISP), is introduced. AMISP relies on an information structuring scheme that modulates the number of resynchronization points (i.e., slice headers and intra-coded macroblocks) in order to maximize the perceived video quality. The video quality the end-user experiences depends both on the quality of the compressed video before transmission and on the degradation due to packet loss. Therefore, the structuring scheme constantly determines the best compromise between the rate allocated to encoding pure video information and the rate aimed at reducing the sensitivity to packet loss. It is then extended with a Forward Error Correction (FEC) based protection algorithm to become AMISP. AMISP triggers the insertion of FEC packets into the MPEG-2 video packet stream. Finally, it is shown that AMISP outperforms usual MPEG-2 transmission schemes, and offers an acceptable video quality even at loss ratios as high as 10^-2. Video quality is estimated using the Moving Picture Quality Metric, which has proved to behave consistently with human judgment.
This work addresses the optimization of TV-resolution MPEG-2 video streams to be transmitted over lossy packet networks. This paper introduces a new scene-complexity adaptive mechanism, namely the adaptive MPEG-2 Information Structuring (AMIS) mechanism. AMIS adaptively modulates the number of resynchronization points in order to maximize the perceived video quality assuming it is aware of the packet loss probability and the error concealment technique implemented in the decoder. The perceived video quality depends both on the encoding quality and the degradation due to data loss. Therefore, AMIS constantly determines the best compromise between the rate allocated to pure video information and the rate aiming at reducing the sensitivity to packet loss. Results show that the proposed algorithm behaves much better than the traditional MPEG-2 encoding scheme in terms of perceived video quality under the same traffic constraints.