In recent years, deep learning features extracted and encoded by neural networks have been widely used in image classification and have achieved significant results. The textile fabric industry usually classifies items according to the image patterns of the fabrics, for which the shape and color of the image are the most discriminative cues. How to use the image characteristics of textile fabrics to enhance the discrimination of deep learning features is therefore the key to improving the accuracy of textile fabric classification. This paper proposes a textile fabric classification method based on deep learning features enhanced by Histogram of Oriented Gradients (HOG) features and HS histogram statistics extracted in the HSV color space. By fusing additional shape and color information with the deep learning features, the enhanced features become more discriminative. Experiments show that the proposed method achieves good classification results, reaching 92.4% accuracy on the corresponding dataset.
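As a rough illustration of the fusion step, the sketch below concatenates a CNN embedding with HOG features and an HS histogram computed in HSV space; the image size, histogram bins, and function names are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch of the feature-fusion idea: concatenate a CNN embedding with
# HOG (shape) features and an HS histogram (color) computed in HSV space.
# Descriptor sizes, bin counts, and function names are illustrative assumptions.
import cv2
import numpy as np
from skimage.feature import hog

def hand_crafted_features(bgr_image, size=(224, 224)):
    img = cv2.resize(bgr_image, size)          # fixed size -> fixed feature length
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Shape cue: HOG descriptor of the grayscale fabric pattern.
    hog_vec = hog(gray, orientations=9, pixels_per_cell=(16, 16),
                  cells_per_block=(2, 2), feature_vector=True)
    # Color cue: 2-D histogram over the H and S channels in HSV space.
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    hs_hist = cv2.calcHist([hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    hs_hist = cv2.normalize(hs_hist, None, 0, 1, cv2.NORM_MINMAX).flatten()
    return np.concatenate([hog_vec, hs_hist])

def enhanced_features(bgr_image, cnn_embedding):
    # cnn_embedding: 1-D feature vector from any pretrained backbone (assumed given).
    return np.concatenate([cnn_embedding, hand_crafted_features(bgr_image)])
```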
The depth completion task aims to fill in the missing depth values of a sparse depth map to obtain a dense depth map, which is crucial for computer vision applications, especially autonomous driving. Since acquiring a dense or even semi-dense ground-truth depth map for supervised training is laborious and difficult, the sparse depth map itself is used for semi-supervised learning. However, the sparse depth map does not contain enough valid values to provide an effective constraint. Therefore, we merge multi-frame point clouds from LiDAR sequences into a single frame to increase the density of the sparse depth map and strengthen the depth-label constraints in semi-supervised learning. We propose a semi-supervised multimodal multitask framework that includes two sub-networks: a LiDAR odometry network and a depth completion network. The LiDAR odometry sub-network takes LiDAR sequences as input and is trained in a self-supervised manner based on geometric consistency between sequences. Using the pose estimated by the odometry network, a differential projection module (DPM) produces a denser merged depth map. The depth completion sub-network takes a binocular image pair and a sparse depth map as input, and is trained in a semi-supervised manner with supervision from stereo view synthesis and the merged depth map from the LiDAR odometry branch. The two sub-networks can be trained jointly in a multitask fashion through the DPM. Experiments on the KITTI dataset show that the proposed method outperforms other state-of-the-art methods.
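The following sketch shows, under simplifying assumptions, how multi-frame point clouds can be transformed into a reference frame and projected into a single denser depth map. The projection matrix, frame poses, and nearest-point handling are assumptions, and unlike the paper's DPM this version is not a differentiable network module.

```python
# Hedged sketch of merging LiDAR frames into one reference frame and projecting
# them into a denser depth map. Poses, projection matrix, and shapes are
# illustrative assumptions, not the paper's DPM.
import numpy as np

def merge_and_project(point_clouds, poses_to_ref, P, image_size):
    """point_clouds: list of (N_i, 3) arrays, one per LiDAR frame.
    poses_to_ref: list of 4x4 transforms from each frame to the reference frame.
    P: 3x4 projection matrix (camera intrinsics @ LiDAR-to-camera extrinsics).
    image_size: (height, width) of the target depth map."""
    h, w = image_size
    depth = np.zeros((h, w), dtype=np.float32)
    for pts, T in zip(point_clouds, poses_to_ref):
        pts_h = np.hstack([pts, np.ones((pts.shape[0], 1))])   # homogeneous coords
        pts_ref = T @ pts_h.T                                   # 4 x N in reference frame
        uvz = P @ pts_ref                                       # 3 x N camera projection
        z = uvz[2]
        valid = z > 1e-3
        u = np.round(uvz[0, valid] / z[valid]).astype(int)
        v = np.round(uvz[1, valid] / z[valid]).astype(int)
        z = z[valid]
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        u, v, z = u[inside], v[inside], z[inside]
        # Keep the nearest point when several project onto the same pixel.
        for ui, vi, zi in zip(u, v, z):
            if depth[vi, ui] == 0 or zi < depth[vi, ui]:
                depth[vi, ui] = zi
    return depth
```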
Semantic image synthesis aims to synthesize photorealistic images according to a given semantic layout. Existing methods build a single-scale style encoder over semantic regions and inject style at a single level, so they are unable to extract rich style information. In particular, for different instance objects within the same semantic region, single-scale networks tend to generate the same style and cannot control style effectively. To address this issue, we propose a Multi-Scale Instance-level image synthesis method (MSIN). To learn more discriminative instance representations from different feature levels, a multi-scale style encoder with a pyramid structure is designed to capture contextual information and extract more detail than the traditional single-scale style encoder. In addition, to synthesize visually pleasing and photorealistic images, MSIN leverages a region-style fusion mechanism in the adaptive normalization layer, which realizes instance-wise, object-to-object multi-style generation simultaneously. Compared with previous methods, our method generates images with fine details and controls style at the instance level, so that the semantics are more reasonable and the styles of different instance objects are more diverse. The experimental results demonstrate the superiority of MSIN on semantic image synthesis tasks, outperforming existing methods in handling instance objects and diverse generation.
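A minimal PyTorch sketch of the multi-scale style-encoder idea is given below: style codes are pooled per instance mask at several feature levels and fused into one vector. The channel sizes, number of levels, and masked-average pooling are illustrative assumptions, not the MSIN architecture.

```python
# Sketch of a multi-scale (pyramid) style encoder with instance-level pooling.
# Channel sizes and pooling scheme are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleStyleEncoder(nn.Module):
    def __init__(self, style_dim=256):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 64, 3, 2, 1), nn.ReLU(inplace=True))
        self.stage2 = nn.Sequential(nn.Conv2d(64, 128, 3, 2, 1), nn.ReLU(inplace=True))
        self.stage3 = nn.Sequential(nn.Conv2d(128, 256, 3, 2, 1), nn.ReLU(inplace=True))
        self.project = nn.Linear(64 + 128 + 256, style_dim)

    def forward(self, image, instance_mask):
        # image: (B, 3, H, W); instance_mask: (B, 1, H, W) binary mask of one instance.
        feats, styles = image, []
        for stage in (self.stage1, self.stage2, self.stage3):
            feats = stage(feats)
            mask = F.interpolate(instance_mask, size=feats.shape[-2:], mode='nearest')
            # Masked average pooling: per-instance style code at this feature level.
            pooled = (feats * mask).sum(dim=(2, 3)) / mask.sum(dim=(2, 3)).clamp(min=1)
            styles.append(pooled)
        # Fuse the pyramid of coarse-to-fine style codes into a single style vector.
        return self.project(torch.cat(styles, dim=1))
```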
Autonomous vehicles are required to perceive the environment in order to make correct driving decisions. The sensors commonly used by autonomous vehicles are the camera and Light Detection and Ranging (LiDAR). In this work, we integrate LiDAR data with the image captured by the camera, assigning color information to the point cloud to obtain a 3D model and assigning depth information to the image pixels to obtain a depth map. LiDAR data is sparse, and the resolution of the image is much greater than that of the LiDAR data. To match the two resolutions, we had previously utilized Gaussian Process Regression (GPR) to interpolate the depth map, but it could not completely fill the empty locations. In this paper, we propose a method to interpolate the 2D depth map so that all empty locations are filled. In this study, we use a Velodyne VLP-16 LiDAR and a monocular camera. Our method is based on a covariance matrix, in which the depth value assigned to an empty location in the depth map is decided according to the value of the covariance function in the covariance matrix. Our method surpasses GPR in both run time and interpolation quality, which shows that our approach is fast enough for real-time use in autonomous vehicles.
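A rough sketch of covariance-weighted interpolation is shown below, assuming a squared-exponential covariance over pixel distance and a fixed local window; the actual covariance function and neighborhood used in the paper may differ.

```python
# Rough sketch: fill empty depth-map pixels with a covariance-weighted average
# of valid neighbors. The squared-exponential kernel and window size are
# assumptions, not the paper's exact covariance function.
import numpy as np

def interpolate_depth(sparse_depth, length_scale=3.0, window=5):
    """sparse_depth: (H, W) array with 0 where no LiDAR return was projected."""
    h, w = sparse_depth.shape
    dense = sparse_depth.copy()
    ys, xs = np.mgrid[-window:window + 1, -window:window + 1]
    # Covariance of each neighbor with the centre pixel (squared-exponential kernel).
    cov = np.exp(-(xs**2 + ys**2) / (2.0 * length_scale**2))
    for v in range(h):
        for u in range(w):
            if sparse_depth[v, u] > 0:
                continue
            v0, v1 = max(0, v - window), min(h, v + window + 1)
            u0, u1 = max(0, u - window), min(w, u + window + 1)
            patch = sparse_depth[v0:v1, u0:u1]
            weights = cov[v0 - v + window:v1 - v + window,
                          u0 - u + window:u1 - u + window]
            valid = patch > 0
            if valid.any():
                # Covariance-weighted average of the valid depths in the window.
                dense[v, u] = np.sum(weights[valid] * patch[valid]) / np.sum(weights[valid])
    return dense
```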
To improve the efficiency of the VP9 decoder, a novel parallel pipeline structure for VP9 decoding is presented in this paper. According to the decoding workflow, the VP9 decoder can be divided into sub-modules including entropy decoding, inverse quantization, inverse transform, intra prediction, inter prediction, deblocking, and pixel adaptive compensation. By analyzing the computing time of each module, hotspot modules are located and the causes of the decoder's low efficiency can be identified. A novel pipelined decoder structure is then designed using mixed parallel decoding methods that combine data division and function division. The experimental results show that this structure greatly improves the decoding efficiency of VP9.
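The sketch below illustrates the function-division side of such a pipeline: decoding stages run in separate threads connected by bounded queues, so one frame can be deblocked while the next is still being entropy decoded. The stage functions are placeholders, not actual VP9 decoding code.

```python
# Illustrative sketch of function-division pipelining: each decoding stage runs
# as its own thread, connected to its neighbors by bounded queues. Stage
# functions are placeholders, not real VP9 decoding code.
import threading
import queue

def run_stage(work_fn, in_q, out_q):
    while True:
        item = in_q.get()
        if item is None:          # poison pill: propagate shutdown downstream
            out_q.put(None)
            break
        out_q.put(work_fn(item))

def run_pipeline(frames, stage_fns):
    # One queue between each pair of adjacent stages, plus input and output ends.
    queues = [queue.Queue(maxsize=4) for _ in range(len(stage_fns) + 1)]
    threads = [threading.Thread(target=run_stage, args=(fn, queues[i], queues[i + 1]))
               for i, fn in enumerate(stage_fns)]
    for t in threads:
        t.start()
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(None)
    decoded = []
    while True:                   # drain the final queue until shutdown arrives
        item = queues[-1].get()
        if item is None:
            break
        decoded.append(item)
    for t in threads:
        t.join()
    return decoded

# Usage (placeholder stage functions assumed to exist):
# run_pipeline(frames, [entropy_decode, inverse_quant_transform, predict, deblock])
```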