Video-related technology has grown rapidly with the progress of digital technologies such as virtual reality, 3D cameras, 3D films, 3D displays, and the Internet. During acquisition and processing (compression, transmission, and reproduction), videos may suffer several types of distortion that degrade their quality and directly affect how human viewers perceive them. Subjective evaluation, however, is tedious and time-consuming, and specialists to perform it are often unavailable. It is therefore necessary to evaluate video quality by computer, which has made video quality assessment vital. The goal of video quality assessment is to predict perceptual quality in order to improve the performance of practical video application systems. In other words, distortions worsen the users' experience, so a metric is needed to measure them. Commonly used video quality assessment methods a) treat the video as a sequence of two-dimensional images and compute the video quality score as a weighted average of per-frame (2D image) scores, which conflicts with the fact that a video signal is a 3D volume and ignores motion features, or b) are designed for specific distortions (for example, blockiness and blurriness). In this paper, we present a novel deep learning architecture for no-reference video quality assessment. It is based on a 3D convolutional neural network and a generative adversarial network (GAN). We evaluate the proposed approach on the LIVE, ECVQ, TID2013, and EVVQ databases. Computer simulations show that the proposed video quality assessment method a) converges on a small amount of data, b) is more "universal", in that it can be used for different types of video quality degradation, including denoising, deblocking, and deconvolution, and c) outperforms existing no-reference video quality assessment methods.
In addition, we demonstrate how our predicted no-reference quality metric correlates with qualitative opinion in a human observer study.