We examine properties of perceptual image distortion models, computed as the mean squared error in the response of a 2-stage cascaded image transformation. Each stage in the cascade is composed of a linear transformation, followed by a local nonlinear normalization operation. We consider two such models. For the first, the structure of the linear transformations is chosen according to perceptual criteria: a center-surround filter that extracts local contrast, and a filter designed to select visually relevant contrast according to the Standard Spatial Observer. For the second, the linear transformations are chosen based on a statistical criterion, so as to eliminate correlations estimated from responses to a set of natural images. For both models, the parameters that govern the scale of the linear filters and the properties of the nonlinear normalization operation are chosen to achieve minimal/maximal subjective discriminability of pairs of images that have been optimized to minimize/maximize the model, respectively (we refer to this as MAximum Differentiation, or “MAD”, Optimization). We find that both representations substantially reduce redundancy (mutual information), with a larger reduction occurring in the second (statistically optimized) model. We also find that both models are highly correlated with subjective scores from the TID2008 database, with slightly better performance seen in the first (perceptually chosen) model. Finally, we use a foveated version of the perceptual model to synthesize visual metamers. Specifically, we generate an example of a distorted image that is optimized so as to minimize the perceptual error over receptive fields that scale with eccentricity, demonstrating that the errors are barely visible despite a substantial MSE relative to the original image.
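The two-stage structure described above — a linear filter followed by local divisive normalization, with distortion taken as the MSE of the final responses — can be sketched as below. The kernels and pooling parameters here are illustrative placeholders (a difference-of-boxes center-surround and a derivative-like filter), not the Standard Spatial Observer filters or the fitted model parameters.

```python
import numpy as np
from scipy.ndimage import convolve, uniform_filter

def stage(x, kernel, sigma=0.1, p=2.0):
    """One cascade stage: linear filtering followed by divisive normalization.
    Each response is divided by a local pool of response amplitudes."""
    lin = convolve(x, kernel, mode="nearest")
    pool = uniform_filter(np.abs(lin) ** p, size=5)   # local amplitude pool
    return lin / (sigma ** p + pool) ** (1.0 / p)

def perceptual_distortion(img, ref, kernels):
    """MSE between cascade responses of a distorted image and its reference."""
    r_img, r_ref = img.astype(float), ref.astype(float)
    for k in kernels:
        r_img, r_ref = stage(r_img, k), stage(r_ref, k)
    return np.mean((r_img - r_ref) ** 2)

# Toy stage-1 center-surround kernel (difference of box filters) and a
# small derivative-like stage-2 kernel -- stand-ins for the paper's filters.
cs = -np.ones((3, 3)) / 9.0
cs[1, 1] += 1.0
dv = np.array([[1.0, -1.0]])
```

Identical images produce zero distortion by construction, and any structural difference between the two inputs yields a positive value.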
KEYWORDS: Distortion, Image quality, RGB color model, Image processing, Visual process modeling, Imaging systems, Human vision and color perception, Performance modeling, Visualization, Visual system
Perceptual image distortion measures can play a fundamental role in evaluating and optimizing imaging systems
and image processing algorithms. Many existing measures are formulated to represent "just noticeable differences"
(JNDs), as measured in psychophysical experiments on human subjects. But some image distortions,
such as those arising from small changes in the intensity of the ambient illumination, are far more tolerable to
human observers than those that disrupt the spatial structure of intensities and colors. Here, we introduce a
framework in which we quantify these perceptual distortions in terms of "just intolerable differences" (JIDs).
As in (Wang & Simoncelli, Proc. ICIP 2005), we first construct a set of spatio-chromatic basis functions to
approximate (as a first-order Taylor series) a set of "non-structural" distortions that result from changes in
lighting/imaging/viewing conditions. These basis functions are defined on local image patches, and are adaptive,
in that they are computed as functions of the undistorted reference image. This set is then augmented with a
complete basis arising from a linear approximation of the CIELAB color space. Each basis function is weighted
by a scale factor to convert it into units corresponding to JIDs. Each patch of the error image is represented
using this weighted overcomplete basis, and the overall distortion metric is computed by summing the squared
coefficients over all such (overlapping) patches. We implement an example of this metric, incorporating invariance
to small changes in the viewing and lighting conditions, and demonstrate that the resulting distortion values
are more consistent with human perception than those produced by CIELAB or S-CIELAB.
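The core computation of the metric — representing each patch of the error image in a weighted overcomplete basis and summing squared coefficients over overlapping patches — can be sketched as follows. The basis and JID weights here are generic arguments; in the paper the non-structural basis functions are adaptive (computed from the reference image), which this sketch omits.

```python
import numpy as np

def jid_distortion(error_img, basis, weights, patch=8):
    """Sum of squared, JID-weighted basis coefficients over all overlapping
    patches of the error image.

    `basis` has shape (n_basis, patch*patch); `weights` rescales each basis
    function into just-intolerable-difference (JID) units. Illustrative only.
    """
    B = weights[:, None] * basis                      # JID-scaled basis
    h, w = error_img.shape
    total = 0.0
    for i in range(h - patch + 1):
        for j in range(w - patch + 1):
            e = error_img[i:i + patch, j:j + patch].ravel()
            c = B @ e                                  # weighted coefficients
            total += float(c @ c)
    return total
```

Because the representation is linear in the error, the metric is quadratic: scaling the error image by 2 quadruples the distortion.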
We describe an invertible nonlinear image transformation that is well-matched to the statistical properties of
photographic images, as well as the perceptual sensitivity of the human visual system. Images are first decomposed
using a multi-scale oriented linear transformation. In this domain, we develop a Markov random field
model based on the dependencies within local clusters of transform coefficients associated with basis functions
at nearby positions, orientations and scales. In this model, division of each coefficient by a particular linear
combination of the amplitudes of others in the cluster produces a new nonlinear representation with marginally
Gaussian statistics. We develop a reliable and efficient iterative procedure for inverting the divisive transformation.
Finally, we probe the statistical and perceptual advantages of this image representation, examining
robustness to added noise, rate-distortion behavior, and artifact-free local contrast enhancement.
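The inversion of the divisive transformation can be illustrated with a fixed-point iteration: since y = x / (sigma + P|x|), the original coefficients satisfy x = y (sigma + P|x|), which can be iterated from x = y. This is a minimal sketch that converges when the pooling weights in P are small enough, not the paper's exact procedure.

```python
import numpy as np

def normalize(x, P, sigma=1.0):
    """Divisive normalization: each coefficient is divided by sigma plus a
    linear combination (rows of P) of the amplitudes of its neighbors."""
    return x / (sigma + P @ np.abs(x))

def invert(y, P, sigma=1.0, n_iter=100):
    """Fixed-point inversion of the divisive transformation:
    iterate x <- y * (sigma + P|x|) starting from x = y."""
    x = y.copy()
    for _ in range(n_iter):
        x = y * (sigma + P @ np.abs(x))
    return x
```

With modest pooling weights the map is a contraction, so the iteration recovers the original coefficients to machine precision.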
Reduced-reference (RR) image quality measures aim to predict the visual quality of distorted images with only partial information about the reference images. In this paper, we propose an RR quality assessment method based on a natural image statistic model in the wavelet transform domain. In particular, we observe that the marginal distribution of wavelet coefficients changes in different ways for different types of image distortions. To quantify such changes, we estimate the Kullback-Leibler distance between the marginal distributions of wavelet coefficients of the reference and distorted images. A generalized Gaussian model is employed to summarize the marginal distribution of wavelet coefficients of the reference image, so that only a relatively small number of RR features are needed for the evaluation of image quality. The proposed method is easy to implement and computationally efficient. In addition, we find that many well-known types of image distortion lead to significant changes in wavelet coefficient histograms, and thus are readily detectable by our measure. The algorithm is tested with subjective ratings of a large image database that contains images corrupted with a wide variety of distortion types.
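The two ingredients of the RR measure — fitting a generalized Gaussian density (GGD) to a subband's coefficients, and a closed-form Kullback-Leibler distance between two GGDs — can be sketched as below. The fit uses standard moment matching on the ratio E[|x|]^2 / E[x^2]; this is a common estimator, offered here as a sketch rather than the paper's exact fitting procedure.

```python
import numpy as np
from scipy.special import gammaln
from scipy.optimize import brentq

def fit_ggd(x):
    """Moment-matching fit of a generalized Gaussian: density proportional to
    exp(-(|x|/alpha)^beta). Solves for beta from E[|x|]^2 / E[x^2]."""
    m1, m2 = np.mean(np.abs(x)), np.mean(x ** 2)
    r = m1 ** 2 / m2
    rho = lambda b: np.exp(2 * gammaln(2 / b) - gammaln(1 / b) - gammaln(3 / b)) - r
    beta = brentq(rho, 0.1, 10.0)
    alpha = np.sqrt(m2 * np.exp(gammaln(1 / beta) - gammaln(3 / beta)))
    return alpha, beta

def ggd_kl(a1, b1, a2, b2):
    """Closed-form KL divergence between two generalized Gaussians
    with scales a1, a2 and shapes b1, b2."""
    return (np.log((b1 * a2) / (b2 * a1)) + gammaln(1 / b2) - gammaln(1 / b1)
            + (a1 / a2) ** b2 * np.exp(gammaln((b2 + 1) / b1) - gammaln(1 / b1))
            - 1 / b1)
```

For Gaussian data the fit recovers shape beta near 2 and scale alpha near sigma * sqrt(2), and the KL distance of a density to itself is zero, so only distortion-induced changes in the marginals contribute.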
We propose a methodology for comparing and refining perceptual image quality metrics based on synthetic images that are optimized to best differentiate two candidate quality metrics. We start from an initial distorted image and iteratively search for the best/worst images in terms of one metric while constraining the value of the other to remain fixed. We then repeat this, reversing the roles of the two metrics. Subjective tests on the quality of pairs of these images, generated at different initial distortion levels, provide a strong indication of the relative strengths and weaknesses of the metrics being compared. This methodology also provides an efficient way to further refine the definition of an image quality metric.
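The constrained search at the heart of this methodology can be sketched as projected gradient ascent/descent: take a step on one metric, then project back onto the set of images at fixed distance from the reference under the other metric (here, fixed RMSE, the simplest case). The `grad` callable and step size are user-supplied assumptions, not the paper's exact optimizer.

```python
import numpy as np

def project_mse(x, ref, target_rmse):
    """Project x onto the set of images at a fixed RMSE from ref."""
    d = x - ref
    return ref + d * (target_rmse / (np.sqrt(np.mean(d ** 2)) + 1e-12))

def mad_search(ref, grad, init, steps=200, lr=0.05, sign=+1):
    """Search for the best (sign=+1) or worst (sign=-1) image under a metric
    with gradient `grad(x, ref)`, holding RMSE to `ref` fixed at its initial
    value. A sketch of the constrained iterative search."""
    rmse0 = np.sqrt(np.mean((init - ref) ** 2))
    x = init.copy()
    for _ in range(steps):
        x = project_mse(x + sign * lr * grad(x, ref), ref, rmse0)
    return x
```

For example, minimizing a pixel-weighted MSE at fixed plain RMSE concentrates the error into the low-weight pixels, lowering the weighted metric while leaving the constrained metric unchanged.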
KEYWORDS: Wavelets, Gaussian scale mixtures, Stochastic processes, Denoising, Image processing, Algorithm development, Statistical modeling, Signal to noise ratio, Matrices, Control systems
We develop a new class of non-Gaussian multiscale stochastic processes defined by random cascades on trees of wavelet or other multiresolution coefficients. These cascades reproduce a rich semi-parametric class of random variables known as Gaussian scale mixtures. We demonstrate that this model class can accurately capture the remarkably regular and non-Gaussian features of natural images in a parsimonious fashion, involving only a small set of parameters. In addition, this model structure leads to efficient algorithms for image processing. In particular, we develop a Newton-like algorithm for MAP estimation that exploits a very fast algorithm for linear-Gaussian estimation on trees, and hence is efficient. On the basis of this MAP estimator, we develop and illustrate a denoising technique that is based on a global prior model, and preserves the structure of natural images.
KEYWORDS: Wavelets, Gaussian scale mixtures, Denoising, Image denoising, Statistical modeling, Photography, Statistical analysis, Data modeling, Algorithm development, Signal to noise ratio
The statistics of photographic images, when decomposed in a multiscale wavelet basis, exhibit striking non-Gaussian behaviors. The joint densities of clusters of wavelet coefficients are well-described as a Gaussian scale mixture: a jointly Gaussian vector multiplied by a hidden scaling variable. We develop a maximum likelihood solution for estimating the hidden variable from an observation of the cluster of coefficients contaminated by additive Gaussian noise. The estimated hidden variable is then used to estimate the original noise-free coefficients. We demonstrate the power of this model through numerical simulations of image denoising.
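The estimator described above can be illustrated with a deliberately simplified scalar version: model a cluster of coefficients as x = sqrt(z) * u with u ~ N(0, I) (the paper uses the full cluster covariance), observe y = x + noise, estimate the hidden multiplier z by maximum likelihood, then apply the resulting Wiener shrinkage.

```python
import numpy as np

def gsm_denoise_cluster(y, sigma):
    """Denoise one cluster of wavelet coefficients under a scalar GSM model.

    With u ~ N(0, I), the observation y is N(0, (z + sigma^2) I) given z,
    so the ML estimate of z is the observed energy minus the noise variance
    (clipped at zero), followed by a Wiener estimate of the clean cluster.
    Identity covariance is a simplifying assumption, not the paper's model.
    """
    z_ml = max(np.mean(y ** 2) - sigma ** 2, 0.0)
    return (z_ml / (z_ml + sigma ** 2)) * y
```

The behavior is adaptive: high-energy clusters (likely signal) are nearly preserved, while clusters whose energy is at the noise floor are shrunk to zero.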
KEYWORDS: Wavelets, Statistical modeling, Computer programming, Visual process modeling, Image processing, Principal component analysis, Statistical analysis, Photography, Visualization, Signal to noise ratio
I describe a statistical model for natural photographic images, when decomposed in a multi-scale wavelet basis. In particular, I examine both the marginal and pairwise joint histograms of wavelet coefficients at adjacent spatial locations, orientations, and spatial scales. Although the histograms are highly non-Gaussian, they are nevertheless well described using fairly simple parameterized density models.
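The qualitative joint behavior described above — adjacent coefficients that are nearly decorrelated yet strongly dependent in amplitude — can be reproduced by a toy simulation in which two independent Gaussian variables share a common hidden scaling variable (the choice of an exponential multiplier here is arbitrary, for illustration only).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
z = rng.exponential(size=n)            # hidden scaling variable
u = rng.standard_normal((2, n))        # independent Gaussian pair
x = np.sqrt(z) * u                     # two "adjacent" coefficients

# Signed values are uncorrelated, but amplitudes are strongly correlated,
# mimicking the pairwise joint histograms of wavelet coefficients.
signed_corr = np.corrcoef(x[0], x[1])[0, 1]
amp_corr = np.corrcoef(np.abs(x[0]), np.abs(x[1]))[0, 1]
```

The shared multiplier leaves the signed correlation near zero while inducing a substantial positive correlation between amplitudes, the signature dependency that divisive normalization is designed to remove.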