This paper presents a novel generative adversarial network for the task of human pose transfer, which aims at transferring the pose of a given person to a target pose. To handle the pixel-to-pixel misalignment caused by pose differences, we introduce an attention mechanism and propose Pose-Guided Attention Blocks. With these blocks, the generator learns how to transfer details from the conditional image to the target image based on the target pose, so that the target pose directly guides the transfer of features. The effectiveness of the proposed network is validated on the DeepFashion and Market-1501 datasets. Compared with state-of-the-art methods, our generated images are more realistic and have better facial details.
KEYWORDS: Light sources, Cameras, Calibration, Sensors, Time of flight cameras, Manufacturing, 3D modeling, Modulation, Phase shift keying, Signal to noise ratio
The depth quality of a time-of-flight (ToF) camera is influenced by many systematic and non-systematic errors [1]. In this paper we present a simple method to correct and reduce these errors and propose a multi-phase approach to improve depth acquisition accuracy. Unlike traditional calibration methods, we take the position of the light source into account and calibrate the light source together with the camera to reduce depth distortion. To mitigate the sensor errors introduced during manufacturing, a look-up table (LUT) is used to correct pixel-related errors. In addition, we capture images at multiple phases and apply an FFT to recover the true depth. With the proposed approach, we are able to reconstruct an accurate 3D model with an RMSE of the measured depth below 1.2 mm.
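The multi-phase FFT step can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes an idealized continuous-wave ToF model in which each pixel yields N >= 3 correlation samples at equally spaced phase offsets, and it takes the depth from the phase of the first DFT bin. The function names, the 20 MHz modulation frequency, and the noise-free signal model are all assumptions made for illustration; the light-source calibration and LUT correction described above are not modeled.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def simulate_samples(depth_m, f_mod_hz, n_phases=4, amplitude=1.0, offset=2.0):
    """Hypothetical ideal sensor model: sample k is offset + A*cos(phi + 2*pi*k/N)."""
    # Round-trip phase delay for the given depth at the modulation frequency.
    phi = 4.0 * np.pi * f_mod_hz * np.asarray(depth_m, dtype=float) / C
    # Shape the phase-step index so it broadcasts over any depth-map shape.
    k = np.arange(n_phases).reshape((n_phases,) + (1,) * phi.ndim)
    return offset + amplitude * np.cos(phi[None, ...] + 2.0 * np.pi * k / n_phases)


def depth_from_phases(samples, f_mod_hz):
    """Estimate depth from N >= 3 phase-stepped samples stacked along axis 0."""
    # For equally spaced steps, the first DFT bin equals (A*N/2) * exp(j*phi),
    # so the constant offset and the amplitude do not affect its angle.
    spectrum = np.fft.fft(samples, axis=0)
    phi = np.mod(np.angle(spectrum[1]), 2.0 * np.pi)
    # Valid only within the unambiguous range C / (2 * f_mod_hz).
    return C * phi / (4.0 * np.pi * f_mod_hz)


if __name__ == "__main__":
    f_mod = 20e6  # assumed modulation frequency (unambiguous range about 7.5 m)
    true_depth = np.array([[0.5, 1.5], [3.0, 7.0]])
    estimate = depth_from_phases(simulate_samples(true_depth, f_mod, n_phases=8), f_mod)
    print(np.max(np.abs(estimate - true_depth)))
```

Using more phase steps does not change the noise-free result, but with real data it averages sensor noise and keeps higher harmonics of the modulation signal out of the first bin, which is one common motivation for multi-phase capture.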
KEYWORDS: Video, Video compression, Switching, Navigation systems, 3D video streaming, Video processing, Local area networks, Computer programming, Cameras, 3D image processing
Light Field Rendering (LFR) now plays an important role in Free Viewpoint Video (FVV) services, a new
type of multimedia. Supporting Light Field Video (LFV) streaming over IP networks is a challenging research area.
This paper presents a sender-driven streaming service that delivers dynamic light field video to multiple users
over broadband IP networks using a time-stamp control algorithm. Results show that a system built on our
algorithm can support more than 50 users with 100 Mb of bandwidth on the server side.
KEYWORDS: Video, Genetic algorithms, Video coding, Video compression, Optimization (mathematics), Performance modeling, Computer programming, Cameras, 3D video compression, Chemical elements
Efficient exploitation of the temporal and inter-view correlation is critical to multi-view video coding (MVC),
and the key to it lies in designing the prediction chain structure according to the various patterns of correlation.
In this paper, we propose a novel prediction structure model for designing optimal MVC coding schemes, along with
an in-depth tradeoff analysis between compression efficiency and prediction structure complexity for certain standard
functionalities. By representing the entire set of possible chain structures rather than certain
typical ones, the proposed model can give efficient MVC schemes that adapt to the requirements on
structure complexity and to the video source characteristics (the number of views, the degrees of temporal and inter-view
correlation). To handle the large-scale problem in model optimization, we deploy a hybrid genetic algorithm,
which yields satisfactory results in the simulations.
KEYWORDS: Distortion, Computer programming, Video, Video coding, Error analysis, Video processing, Process control, Receivers, Performance modeling, Data modeling
In this paper, we jointly consider rate-distortion optimization (RDO) coding and rate-distortion
optimized (RaDiO) streaming. A statistical mean-squared error (MSE) prediction system that accounts for the
changing macroblock refreshing rate and the error propagation effect is introduced to make the expected distortion
expression used in RaDiO more accurate. Based on this real-time distortion prediction method, we propose
an integrated rate-distortion optimized encoding and streaming (RDOES) framework, which jointly optimizes the
coding and streaming processes by combining control of the macroblock refreshing rate with scheduling
of the transmission and retransmission of video packets. Using this framework, we demonstrate that each frame
should be encoded either as an I frame or as a P frame without intra-macroblock refreshing to achieve the best transmission
performance whenever RaDiO or RDOES is used. Simulation results show that our RDOES framework provides
flexibility and delivers substantial performance gains of at least 1.1 dB on average over conventional RaDiO.