Long-range spatial features have substantially improved hyperspectral image (HSI) classification, but capturing global spatial relationships among pixels is computationally expensive. Because super-pixels are generated adaptively from the content of hyperspectral images, extracting global spatial information at the super-pixel level has proven to be an efficient strategy. Nevertheless, some super-pixel-level methods, such as Graph Convolutional Networks (GCNs), suffer from large parameter counts and high computational complexity. To address these challenges, this paper introduces an Efficient MLP-Aided CNN Network with Local-Global Feature Fusion for Hyperspectral Image Classification (EMACN). EMACN adopts a dual-branch structure that combines a super-pixel-based Multilayer Perceptron (MLP) for global feature extraction with a pixel-based Convolutional Neural Network (CNN) for local feature extraction, achieving an effective balance between computational efficiency and classification performance. In contrast to the quadratic complexity of other super-pixel-level deep learning structures such as GCNs, the computational complexity of the super-pixel-based MLP is linear in both the number and the dimensionality of super-pixels. Furthermore, a spectral transformation subnet uses group convolution instead of the conventional 1×1 convolution to reduce the spectral dimensionality of hyperspectral images, and the pixel-based CNN module employs 2D depthwise separable convolutions to further reduce computation. This design allows the network to strike a favorable balance between representational capacity and computational cost. Extensive experiments on three datasets confirm the superior performance of the proposed network.
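The following is a minimal PyTorch sketch of the three building blocks named above (grouped 1×1 spectral reduction, a depthwise separable pixel-level block, and a super-pixel-level MLP). All layer names, channel counts, group numbers, and MLP widths are illustrative assumptions, not the configuration reported in the paper.

```python
# Hedged sketch of EMACN-style building blocks; hyperparameters are assumptions.
import torch
import torch.nn as nn


class SpectralGroupReduce(nn.Module):
    """Spectral transformation: grouped 1x1 convolution to reduce the band
    dimension, cheaper than a dense 1x1 convolution across all bands."""
    def __init__(self, in_bands=200, out_channels=64, groups=4):
        super().__init__()
        self.reduce = nn.Conv2d(in_bands, out_channels, kernel_size=1, groups=groups)

    def forward(self, x):            # x: (B, bands, H, W)
        return self.reduce(x)


class DepthwiseSeparableBlock(nn.Module):
    """Pixel-level local branch: 2D depthwise convolution followed by a
    pointwise (1x1) convolution."""
    def __init__(self, channels=64, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, kernel_size,
                                   padding=kernel_size // 2, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.pointwise(self.depthwise(x)))


class SuperpixelMLP(nn.Module):
    """Super-pixel-level global branch: an MLP applied to per-super-pixel
    feature vectors, so cost grows linearly in their number and dimension."""
    def __init__(self, dim=64, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(),
                                 nn.Linear(hidden, dim))

    def forward(self, sp_feats):     # sp_feats: (B, num_superpixels, dim)
        return sp_feats + self.mlp(sp_feats)


if __name__ == "__main__":
    x = torch.randn(2, 200, 15, 15)           # toy HSI patch: 200 bands, 15x15
    feats = SpectralGroupReduce()(x)          # reduced spectral representation
    local = DepthwiseSeparableBlock()(feats)  # local, pixel-level features
    sp = torch.randn(2, 50, 64)               # 50 pooled super-pixel vectors (toy)
    global_ = SuperpixelMLP()(sp)             # global, super-pixel-level features
    print(local.shape, global_.shape)
```

In this sketch the local and global branches operate on different representations (a pixel grid versus pooled super-pixel vectors); how the two are fused into the final classifier is left out, as the abstract does not specify it.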