Bits-back coding is an entropy coding technique that enables lossless image compression with generative models based on latent variables. It achieves excellent coding performance when the amount of initial bits, which must be prepared independently of the image contents, is ignored. In practice, however, the initial bits constitute additional information that is indispensable to the encoding process. In this paper, a way of generating the initial bits, as well as the discretization precision of the latent variables, is investigated for better coding performance in a practical sense.
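As a rough illustration of why the initial bits matter, the following sketch performs bits-back rate accounting for a single symbol with a discrete latent variable; all probability tables are made-up toy numbers, not an actual generative model.

```python
import numpy as np

# Toy bits-back rate accounting for one symbol x with a discrete latent z.
p_z = np.array([0.5, 0.5])            # prior p(z)
p_x_given_z = np.array([[0.9, 0.1],   # likelihood p(x|z), rows indexed by z
                        [0.2, 0.8]])
x = 0
# Posterior q(z|x) used by the encoder to "decode" z from previously
# written bits -- for the first symbol, these are the initial bits.
q_z_given_x = p_z * p_x_given_z[:, x]
q_z_given_x /= q_z_given_x.sum()

z = 0  # latent value the auxiliary-stream decoder happens to yield
bits_spent    = -np.log2(p_x_given_z[z, x]) - np.log2(p_z[z])
bits_refunded = -np.log2(q_z_given_x[z])   # recovered by the decoder later
# On average the net rate equals -log2 p(x); the refund only works if
# enough initial bits exist to decode z from in the first place.
print("net rate:", bits_spent - bits_refunded, "bits")
```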
Since the JPEG standard employs a discrete cosine transform (DCT) based lossy coding algorithm, quality degradation due to coding artifacts is unavoidable, especially at low coding rates. To alleviate this problem, we previously proposed a post filtering method that maximizes the well-known objective quality index called SSIM. In this paper, we combine the method with the projection onto convex sets (POCS) algorithm to exploit a quantization constraint (QC) prior on the DCT coefficients extracted from the JPEG image. Experimental results indicate that SSIM scores of the JPEG images are improved by applying the POCS algorithm.
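The QC projection step can be sketched as follows, assuming orthonormal-DCT block quantization for simplicity (actual JPEG uses its own DCT scaling); the full POCS loop would alternate this projection with the SSIM-maximizing filtering step.

```python
import numpy as np
from scipy.fft import dctn, idctn

def project_qc(img_block, q_table, q_indices):
    """Project an 8x8 block onto the quantization-constraint set: each
    DCT coefficient is clipped back into the interval that produced its
    quantization index (JPEG rounds to the nearest multiple of the step)."""
    coef = dctn(img_block, norm='ortho')
    lo = (q_indices - 0.5) * q_table
    hi = (q_indices + 0.5) * q_table
    return idctn(np.clip(coef, lo, hi), norm='ortho')
```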
We have been researching a hierarchical lossless coding method that uses cellular neural networks (CNN) as predictors. In this method, prediction accuracy is improved by adaptively switching among CNN predictors depending on the direction of image edges. The prediction errors obtained by the CNN predictors are encoded by adaptive arithmetic coding using multiple probabilistic models based on context modeling. In previous research [1], a new approach was introduced in which the prediction errors of each predictor are encoded separately by arithmetic coding. Although this approach improves the performance of encoding the prediction errors, the increased amount of side information became an issue. Therefore, to reduce the side information of the arithmetic coders, we propose a grouping algorithm that groups the prediction errors corresponding to each predictor based on the utilization of the predictors, as sketched below.
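One hypothetical form such utilization-based grouping could take: predictors used often enough keep their own arithmetic-coding model, while rarely used ones share a pooled model, so fewer probability tables need to be signalled. The threshold and merging rule below are assumptions for illustration, not the paper's algorithm.

```python
def group_errors(errors_by_predictor, min_count=1000):
    """errors_by_predictor: dict mapping predictor id -> list of errors.
    Predictors with at least min_count uses keep a dedicated model;
    the remainder are pooled into one shared model."""
    groups, pooled = {}, []
    for pred, errs in errors_by_predictor.items():
        if len(errs) >= min_count:
            groups[pred] = errs
        else:
            pooled.extend(errs)
    if pooled:
        groups['pooled'] = pooled
    return groups
```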
3D point cloud data consist of a large number of points carrying attribute information, such as color, in addition to geometry information of the 3D positions. Since their data size tends to be large, efficient compression methods not only for the geometry but also for the attribute information are desired. In general, the attribute information has a spatial correlation that depends on the local texture of the object surface. However, the adaptive prediction techniques popular in 2D image coding cannot be applied as-is, since the distribution of the already encoded samples used for prediction is usually sparse and irregular. In this paper, we propose a method for designing adaptive predictors on a point-by-point basis using a 3D directional autocorrelation model of the attribute information. The obtained predictors are utilized in the probability model optimization technique for efficient lossless coding of the RGB color attributes.
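A minimal sketch of point-by-point predictor design from an autocorrelation model, solving the normal equations for each target point; the separable exponential model and its direction weights below are assumptions standing in for the paper's 3D directional model.

```python
import numpy as np

def design_predictor(ref_pos, target_pos, rho=0.95, directions=(1.0, 1.0, 1.0)):
    """ref_pos: (N, 3) positions of already-encoded neighbours.
    Returns the prediction weights a = R^{-1} r for the target point."""
    d = np.asarray(directions)
    def corr(p, q):
        return rho ** np.linalg.norm((p - q) * d)  # assumed model R(p, q)
    n = len(ref_pos)
    R = np.array([[corr(ref_pos[i], ref_pos[j]) for j in range(n)]
                  for i in range(n)])
    r = np.array([corr(ref_pos[i], target_pos) for i in range(n)])
    return np.linalg.solve(R + 1e-6 * np.eye(n), r)  # regularized solve
```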
This paper proposes a method of detecting the entry and exit of vehicles in each parking slot using a surveillance camera placed outdoors. To specify the individual parking slot areas in the surveillance camera image, polygonal windows are set by projecting rectangular boxes, each of which surrounds the typical body of a vehicle parked in a slot, onto the image based on the perspective projection matrix associated with the camera. The projection matrix is semi-automatically calculated by providing 3D coordinates for a small number of points picked from the surveillance camera image. Then, a joint intensity histogram between a pair of images taken by the same camera at a certain time interval is calculated in each window. By analyzing the distribution of this histogram, entry and exit of vehicles in the slot can be robustly detected without being affected by lighting changes during the time interval.
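The joint histogram idea can be sketched as follows: when nothing changes, mass concentrates along a line in the joint distribution (a global lighting change mostly shifts or tilts that line), whereas vehicle entry/exit scatters it. The near-diagonal mass ratio used here is a deliberately crude stand-in for the paper's analysis.

```python
import numpy as np

def slot_changed(patch_t0, patch_t1, bins=32, diag_width=3, thresh=0.6):
    """patch_t0/patch_t1: co-located 8-bit windows from the image pair."""
    h, _, _ = np.histogram2d(patch_t0.ravel(), patch_t1.ravel(),
                             bins=bins, range=[[0, 256], [0, 256]])
    h /= h.sum()
    i, j = np.indices(h.shape)
    near_diag = h[np.abs(i - j) <= diag_width].sum()
    return near_diag < thresh   # True -> likely entry or exit
```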
We previously proposed a method of designing a 2D FIR filter that maximizes the well-known objective quality index called SSIM. The designed filter can be used as a post processing tool for lossy image coding methods to reduce coding artifacts. In this scenario, there is a trade-off between the amount of side information on the filter coefficients and the obtained gain in image quality. In this paper, the effectiveness of the designed filters in terms of rate-SSIM coding performance is evaluated under different settings of the size and quantization precision of the filter coefficients. Moreover, we introduce symmetry constraints on the filter coefficients to reduce the side information.
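The side-information saving from symmetry is easy to see in a sketch: a quadrant-symmetric kernel of size 5x5 is fully described by 9 unique values instead of 25. The 4-fold symmetry type here is one possible choice for illustration.

```python
import numpy as np

def expand_symmetric(unique, size=5):
    """Rebuild a quadrant-symmetric 2D FIR kernel from its unique
    coefficients; only (size//2 + 1)**2 values need to be transmitted."""
    k = size // 2
    kern = np.empty((size, size))
    for y in range(size):
        for x in range(size):
            kern[y, x] = unique[abs(y - k), abs(x - k)]
    return kern

# 25-tap kernel described by only 9 unique coefficients:
kern = expand_symmetric(np.arange(9, dtype=float).reshape(3, 3), size=5)
```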
We previously proposed a novel lossless image coding method that utilizes example search and adaptive prediction within a framework of probability model optimization. In this paper, both the definition of the probability model and its optimization procedure are modified to reduce the encoding complexity. In addition, the affine predictors used in the adaptive prediction are refined for accurate probability modeling. Simulation results indicate that our modifications contribute not only to encoding time reduction but also to coding efficiency improvement for all of the tested images.
Recently, convolutional neural network-based generative models of image signals have been proposed, mainly for the purposes of image generation, restoration and compression. For example, PixelCNN++ approximates the probability distribution of image intensity values pel-by-pel as a parametric function, and can be used for lossless image coding tasks. However, such an approach does not work well for specific images whose statistical properties differ from those of the image dataset used for network training. In this paper, we improve the coding efficiency by introducing a few parameters for adjusting the probability model generated by PixelCNN++. These parameters are numerically optimized to minimize the coding rate of the given image and then encoded as side information to enable the same adjustment at the decoder side.
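A sketch of one plausible low-cost adjustment of this kind: a global "temperature" gamma and a uniform mixing weight eps, fitted to minimize the code length of the given image. This parameterization is an assumption for illustration, not necessarily the one used in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def adjusted_code_length(params, probs, symbols):
    # probs: (N, 256) per-pel distributions from the pretrained network,
    # symbols: (N,) actual intensities observed in the image.
    gamma, eps = params
    p = np.maximum(probs, 1e-12) ** gamma + abs(eps)
    p /= p.sum(axis=1, keepdims=True)
    return -np.log2(p[np.arange(len(symbols)), symbols]).sum()

def fit_adjustment(probs, symbols):
    # the handful of optimized values would be sent as side information
    res = minimize(adjusted_code_length, x0=[1.0, 1e-4],
                   args=(probs, symbols), method='Nelder-Mead')
    return res.x
```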
This paper describes a method of designing a 2D post filter for reducing coding artifacts caused by lossy image compression. Although mean squared error (MSE) has typically been used in such filter design, it is not necessarily a good quality measure in terms of consistency with subjective perception. In this paper, we employ a more reliable quality measure called Structural SIMilarity (SSIM) and derive filter coefficients that maximize the SSIM score for each image.
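For reference, the SSIM index being maximized is the standard form of Wang et al.:

```latex
\mathrm{SSIM}(x,y) =
  \frac{(2\mu_x\mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}
```

where the mu and sigma terms are local means, variances and the covariance of the original and filtered images, and C1, C2 are small stabilizing constants.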
This paper describes an efficient lossless coding method for HDR color images stored in a floating point format called radiance RGBE. In this method, the three mantissa parts of the RGB components, as well as a common exponent part, each of which is represented in 8-bit depth, are encoded by a block-adaptive prediction technique. In order to improve the prediction accuracy, the mantissa parts of the RGB components used in the prediction are adjusted so that their exponent parts can be regarded as equal. Moreover, not only the same color component but also already encoded other color components are used in the prediction to exploit inter-color correlations. Simulation results indicate that introducing the above exponent equalization as well as inter-color prediction considerably improves the coding efficiency.
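A minimal sketch of the exponent-equalization idea, using the RGBE convention that a pixel value is roughly m / 256 * 2**(e - 128); the clamping to the 8-bit range is our own simplification.

```python
def align_mantissa(m_ref, e_ref, e_cur):
    """Scale a reference mantissa so its exponent can be regarded as
    equal to the current pel's exponent before prediction."""
    shift = e_cur - e_ref
    m = m_ref >> shift if shift >= 0 else m_ref << -shift
    return max(0, min(255, m))          # keep within 8-bit mantissa range
```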
We previously proposed a lossless video coding method based on intra/inter-frame example search and probability model optimization. In this method, several examples, i.e. a set of pels whose neighborhoods are similar to the local texture of the target pel to be encoded, are searched from already encoded areas of the current and previous frames with integer-pel accuracy. The probability distribution of the image value at the target pel is then modeled as a weighted sum of Gaussian functions whose peak positions are given by the individual examples. Furthermore, model parameters that control the shapes of the Gaussian functions are numerically optimized so that the resulting coding rate is minimized. In this paper, the above example search process is enhanced to allow fractional-pel positions for more accurate probability modeling.
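A sketch of template matching at a fractional-pel position via bilinear interpolation, the building block such an enhanced search needs; the SAD cost and half-pel grid are assumptions for illustration.

```python
import numpy as np

def sad_at(frame, template, y, x):
    """SAD between a causal template and the frame sampled at
    fractional position (y, x), assuming it lies within bounds."""
    h, w = template.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    fy, fx = y - y0, x - x0
    patch = ((1 - fy) * (1 - fx) * frame[y0:y0 + h,     x0:x0 + w] +
             (1 - fy) * fx       * frame[y0:y0 + h,     x0 + 1:x0 + w + 1] +
             fy * (1 - fx)       * frame[y0 + 1:y0 + h + 1, x0:x0 + w] +
             fy * fx             * frame[y0 + 1:y0 + h + 1, x0 + 1:x0 + w + 1])
    return np.abs(patch - template).sum()
```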
Seam carving and its variants are popular content-aware image resizing methods. However, they often suffer from the problem that excessive downscaling causes perceptually annoying distortions, mainly because penetration of the seams into important objects becomes unavoidable at the latter stage of the processing. As a solution to this problem, we previously proposed a nonlinear downscaling technique that iteratively applies a DCT-based locally linear scaling operator within 'belt-like seams', i.e. seams with a certain width. To enhance this idea, in this paper we replace the latter processing stage with a global linear scaling operator. The transition point between the nonlinear and linear processing stages is automatically determined based on a preservation measure for the important objects. Simulation results show that our approach produces subjectively better results than the conventional nonlinear downscaling methods.
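One way such a transition decision could look: keep running the nonlinear stage while the fraction of retained saliency energy stays high, then hand over to global linear scaling. Both the measure and the threshold below are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def keep_nonlinear(saliency_map, removed_mask, thresh=0.9):
    """saliency_map: importance of each original pel; removed_mask: True
    where the belt-like seam stage has removed content so far."""
    kept = saliency_map[~removed_mask].sum()
    return kept / saliency_map.sum() >= thresh  # False -> switch to linear
```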
We previously proposed a machine learning based post filtering method for reducing image artifacts caused by lossy compression. The method classifies reconstructed image samples into three categories using a support vector machine (SVM) to roughly discriminate the magnitude of the reconstruction errors. Then, an optimum offset value is added to the samples belonging to each category, in a similar way to the post filtering technique called sample adaptive offset (SAO) used in the H.265/HEVC standard. In this paper, two kinds of SVM classifiers are adaptively switched according to information on the block boundaries of transform units (TUs) in H.265/HEVC intra-frame coding. Furthermore, the samples used for the feature vector, which is fed to the SVM classifier, are rotated at block boundaries to properly capture the local characteristics of the reconstruction errors.
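A sketch of the classify-then-offset core of such a scheme, with error-magnitude thresholds that are assumptions for illustration; the MSE-optimal offset of a category is its mean residual, as in SAO.

```python
import numpy as np
from sklearn.svm import SVC

def make_labels(orig, recon, t1=2, t2=6):
    # discretize |orig - recon| into 3 error-magnitude categories
    err = np.abs(orig.astype(int) - recon.astype(int))
    return np.digitize(err, [t1, t2])

def train_and_offsets(features, orig, recon):
    """features: (N, d) local-neighborhood vectors; orig/recon: (N,)
    co-located original and reconstructed samples."""
    labels = make_labels(orig, recon)
    clf = SVC(kernel='rbf').fit(features, labels)
    pred = clf.predict(features)
    offsets = [float(np.mean(orig[pred == c] - recon[pred == c]))
               if np.any(pred == c) else 0.0 for c in range(3)]
    return clf, offsets
```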
This paper describes a method for creating cel-style CG animations of waving hair. In this method, masses of moving air are modeled as virtual circles traveling at a constant velocity, and hair bundles are modeled as elastic bodies. Deformation of the hair bundles is then calculated by simulating collision events between the virtual circles and the hair bundles. Since the method is based on the technique animators use in creating traditional cel animations, it is expected to suppress the feeling of strangeness often introduced by conventional procedural animation techniques.
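A toy 2D version of the simulation loop: one hair bundle as a chain of fixed-length segments (a crude elastic body) pushed aside by a circle sweeping past at constant velocity. All constants and the relaxation scheme are illustrative assumptions.

```python
import numpy as np

nodes = np.array([[0.0, -float(i)] for i in range(10)])  # root at index 0
rest = 1.0                                               # segment length
circle_pos = np.array([-5.0, -5.0])
circle_v, circle_r = np.array([0.5, 0.0]), 2.0

for step in range(200):
    circle_pos = circle_pos + circle_v * 0.1   # constant-velocity motion
    d = nodes - circle_pos
    dist = np.linalg.norm(d, axis=1)
    hit = dist < circle_r
    # collision response: push penetrating nodes to the circle boundary
    nodes[hit] = circle_pos + d[hit] / dist[hit, None] * circle_r
    # restore segment lengths from the fixed root outward
    nodes[0] = [0.0, 0.0]
    for i in range(1, len(nodes)):
        seg = nodes[i] - nodes[i - 1]
        nodes[i] = nodes[i - 1] + seg / np.linalg.norm(seg) * rest
```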
In general, "drawing collapse" is a term used when animated content of very low drawing quality is broadcast; for example, the perspective of a scene is unnaturally distorted and/or the sizes of people and buildings are abnormally unbalanced. In our research, the possibility of automatically discriminating drawing collapse is explored for the purpose of reducing the workload of content checks typically done by the animation director. In this paper, we focus only on the faces of animated characters as a preliminary task, and distances as well as angles between several feature points on facial parts are used as input data. By training a support vector machine (SVM) on input data extracted from both positive and negative example images, a discrimination accuracy of about 90% is obtained when the same character is tested.
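A sketch of the feature extraction and training pipeline; the exact point set, normalization and angle definition below are assumptions standing in for the paper's features.

```python
import numpy as np
from sklearn.svm import SVC

def geometry_features(pts):
    """pts: (K, 2) facial feature points (eyes, mouth corners, chin, ...).
    Uses scale-normalized pairwise distances plus angles about the centroid."""
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    iu = np.triu_indices(len(pts), k=1)
    dists = d[iu] / d.max()
    c = pts.mean(axis=0)
    angles = np.arctan2(pts[:, 1] - c[1], pts[:, 0] - c[0])
    return np.concatenate([dists, angles])

def train(pos_pts, neg_pts):
    # positive = collapsed drawings, negative = on-model drawings
    X = np.array([geometry_features(p) for p in pos_pts + neg_pts])
    y = np.array([1] * len(pos_pts) + [0] * len(neg_pts))
    return SVC(kernel='rbf').fit(X, y)
```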
This paper describes a novel lossless video coding method that directly estimates the probability distribution of image values pel-by-pel. In the estimation process, several examples, i.e. a set of pels whose neighborhoods are similar to the local texture of the target pel to be encoded, are gathered from search windows located in already encoded areas of the current frame as well as those of the previous frames. The probability distribution is then modeled as a weighted sum of Gaussian functions whose center positions are given by the individual examples. Furthermore, model parameters that control the shapes of the Gaussian functions are numerically optimized so that the resulting coding rate is minimized. Simulation results indicate that the coding performance can be improved by increasing the number of reference frames.
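A sketch of the mixture model and its rate objective: for simplicity a single shared width parameter sigma is optimized, whereas the paper optimizes several shape-controlling parameters.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def code_length(sigma, examples_list, weights_list, targets, levels=256):
    """Total bits when each target pel's distribution is a weighted sum
    of Gaussians centered at its example values (discretized over levels)."""
    total, v = 0.0, np.arange(levels)
    for ex, w, t in zip(examples_list, weights_list, targets):
        pdf = (w[:, None] *
               np.exp(-0.5 * ((v - ex[:, None]) / sigma) ** 2)).sum(0)
        pdf /= pdf.sum()
        total += -np.log2(pdf[t] + 1e-12)
    return total

def optimize_sigma(examples_list, weights_list, targets):
    res = minimize_scalar(code_length, bounds=(0.5, 64.0), method='bounded',
                          args=(examples_list, weights_list, targets))
    return res.x
```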
This paper proposes an efficient lossless coding scheme for still images. The scheme utilizes an adaptive prediction technique in which a set of linear predictors is designed for a given image and an appropriate predictor is selected from the set block-by-block. The resulting prediction errors are encoded using context-adaptive variable-length codes (VLCs). Context modeling, i.e. adaptive selection of the VLCs, is carried out pel-by-pel, and the VLC assigned to each context is designed based on a probability distribution model of the prediction errors. In order to improve coding efficiency, a generalized Gaussian function is used as the model for each context. Moreover, not only the predictors but also the parameters of the probability distribution models are iteratively optimized for each image so that the coding rate of the prediction errors is minimized. Experimental results show that the proposed coding scheme attains coding performance comparable to the state-of-the-art TMW scheme with much lower complexity in the decoding process.
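For concreteness, a discretized generalized Gaussian model of the prediction errors and the per-context rate it implies; one (alpha, beta) pair would be kept per context. beta = 2 gives a Gaussian, beta = 1 a Laplacian.

```python
import numpy as np
from scipy.special import gamma as G

def gg_pmf(alpha, beta, support=np.arange(-255, 256)):
    """Generalized Gaussian p(e) ~ exp(-|e/alpha|**beta), normalized
    over the discrete error support."""
    p = (beta / (2 * alpha * G(1 / beta))) * \
        np.exp(-np.abs(support / alpha) ** beta)
    return p / p.sum()

def context_rate(errors, alpha, beta):
    pmf = gg_pmf(alpha, beta)
    return -np.log2(pmf[errors + 255] + 1e-12).sum()  # bits for this context
```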
Cascading of quadratic nonlinearity has attracted great interest for its potential application to parametric devices such as phase conjugators or effective Kerr media. In most configurations, the device consists of a single nonlinear element with uniform phase-mismatch. Cascading several elements with different phase-mismatches has been theoretically investigated and predicted to improve the performance of the device for classical applications. In this paper, amplitude squeezing in second-harmonic generation using cascaded quadratic nonlinear elements is numerically analyzed. The analysis is based on linearization of the nonlinear coupling equations, where the interacting fields are approximated as plane waves. The phase-mismatch of each element is varied independently, and the tolerance of the squeezing performance to fluctuations of the phase-mismatch is also investigated. It is predicted that the performance as a squeezing device can be superior to that of a single-element device if a proper combination of the phase-mismatches of the elements is chosen. For the fundamental wave, the tolerance to fluctuations of the phase-mismatch improves nearly tenfold compared with the single-element case. For the harmonic wave, squeezing beyond the limit of the perfectly phase-matched case (3 dB) becomes available, though the tolerance to fluctuations of the phase-mismatch is quite small. These improvements can be attributed to the nonlinear phase rotation that keeps the squeezed axis aligned with the amplitude direction in a stable manner.
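A sketch of the classical mean-field propagation that such a linearized quantum analysis is built on, using one common normalization of the plane-wave coupled equations for SHG (sign and normalization conventions vary between texts); the mismatch values in the cascade are illustrative.

```python
import numpy as np

def propagate(a1, a2, dk, kappa=1.0, L=1.0, steps=2000):
    """Fundamental a1 and harmonic a2 through one element of length L
    with phase-mismatch dk, by simple Euler stepping; this normalization
    conserves |a1|**2 + 2*|a2|**2 (photon flux)."""
    dz, z = L / steps, 0.0
    for _ in range(steps):
        da1 = -1j * kappa * np.conj(a1) * a2 * np.exp(-1j * dk * z)
        da2 = -0.5j * kappa * a1 ** 2 * np.exp(1j * dk * z)
        a1, a2, z = a1 + da1 * dz, a2 + da2 * dz, z + dz
    return a1, a2

# cascade: each element gets its own, independently chosen mismatch
a1, a2 = 1.0 + 0j, 0.0 + 0j
for dk in [0.0, 3.0, -3.0]:
    a1, a2 = propagate(a1, a2, dk)
```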
This paper describes a new transform image coding scheme that reshapes square blocks into variable-shape blocks so that their boundaries run parallel to the principal contours in an image, in order to diminish the coding noise peculiar to transform coding, such as mosquito noise and blocking effects. The scheme reduces the additional information needed to encode the block shapes by limiting all of them to quadrilaterals. Moreover, a smoothing filter is introduced for the purpose of reducing approximation errors, particularly at block boundaries. After determining and encoding the block shapes, the scheme transmits the mean value of each block by DPCM and reproduces the interpolation image at both the coder and the decoder. Then, the interpolation residual signals are selectively encoded by the mean-separated KLT, whose orthonormal basis functions are derived individually for each block from an isotropic model of the autocorrelation functions of images. Simulation results indicate that this scheme is superior to square-block-based coding schemes in both coding efficiency and image quality.
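The per-block KLT derivation can be sketched directly: build the model covariance between the pel positions of an arbitrarily shaped block from an isotropic autocorrelation R(d) = rho**d (d = Euclidean distance, an assumed form of the isotropic model) and take its eigenvectors as the orthonormal basis.

```python
import numpy as np

def klt_basis(block_pts, rho=0.95):
    """block_pts: (N, 2) pel coordinates of one variable-shape block.
    Returns the N orthonormal basis functions as columns, ordered by
    decreasing modeled variance."""
    P = np.asarray(block_pts, float)
    D = np.linalg.norm(P[:, None] - P[None, :], axis=-1)
    R = rho ** D                       # model autocorrelation matrix
    vals, vecs = np.linalg.eigh(R)     # eigh: R is symmetric
    return vecs[:, ::-1]
```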