Simultaneously detecting small targets in synthetic aperture radar (SAR) images and visually interpreting the entire scene remain challenging because of the properties of SAR imaging. To address these challenges, a two-branch network is developed, comprising a proposed target detection network, TYSAR-YOLOv5, and a SAR image colorization network based on CycleGAN. TYSAR-YOLOv5 is built on the YOLOv5 network and incorporates an additional detection head specifically designed to capture small targets from shallow features. The BoTNet structure is also deployed in the backbone network to capture global information and accurately locate targets in highly dense scenes with an acceptable number of parameters. To enhance interpretation of the surroundings of detected targets, the CycleGAN network, known for its style-transfer capability, is employed to colorize the SAR images, which are then fused with the detected targets to produce the enhanced target detection result. We expanded and updated the large-scale multi-class SAR image target detection dataset-1.0 (MSAR-1.0) by adding small aircraft targets using data augmentation techniques. Reference color images corresponding to the SAR images of each target type are also added, drawn from Google Earth Engine and public vision data, for SAR image colorization. Experimental results demonstrate that the developed network clearly outperformed state-of-the-art target detection networks on small aircraft target detection and multi-class target detection tasks. More importantly, the final detection visualizations exhibit excellent interpretability, providing rich semantic information that benefits decision-making.
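A minimal PyTorch sketch of the idea behind the BoTNet backbone block mentioned above: the 3x3 convolution of a ResNet-style bottleneck is replaced by multi-head self-attention over the spatial feature map, so the backbone can aggregate global context. Channel counts, the head count, and the omission of relative position encodings are illustrative assumptions, not the TYSAR-YOLOv5 implementation.

```python
import torch
import torch.nn as nn


class MHSA2d(nn.Module):
    """Multi-head self-attention applied to an (N, C, H, W) feature map."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        seq = x.flatten(2).transpose(1, 2)        # (N, H*W, C) token sequence
        out, _ = self.attn(seq, seq, seq)         # global spatial attention
        return out.transpose(1, 2).reshape(n, c, h, w)


class BoTBottleneck(nn.Module):
    """ResNet-style bottleneck with MHSA in place of the 3x3 convolution."""

    def __init__(self, channels: int):
        super().__init__()
        mid = channels // 4
        self.block = nn.Sequential(
            nn.Conv2d(channels, mid, 1, bias=False), nn.BatchNorm2d(mid), nn.ReLU(),
            MHSA2d(mid), nn.BatchNorm2d(mid), nn.ReLU(),
            nn.Conv2d(mid, channels, 1, bias=False), nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(x + self.block(x))        # residual connection


if __name__ == "__main__":
    feat = torch.randn(1, 256, 20, 20)            # deep backbone feature map
    print(BoTBottleneck(256)(feat).shape)         # torch.Size([1, 256, 20, 20])
```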
Spectral variability and shadow effects can limit hyperspectral image (HSI) classification performance. LiDAR data are an excellent complement to HSI because of their abundant elevation information. In this study, a procedure comprising pre-processing, deep residual network classification, and post-processing is investigated for LiDAR-aided HSI classification to alleviate the problems of spectral variability and identifying shaded objects. Specifically, three aspects are explored to achieve more accurate classification: spectral band selection using Archetypal Analysis (AA), feature-level fusion-based classification with a residual network, and label correction using the elevation information. Experiments on three public multi-source (hyperspectral and LiDAR) remote sensing datasets show that fusing the two sources of remote sensing data achieves more promising classification than using the hyperspectral image alone. In particular, on the Houston 2017 dataset, the overall accuracy (OA) and Kappa coefficient gained 2.47% and 2.79%, respectively, after incorporating LiDAR information. Moreover, the results demonstrate that the elevation information, used independently in the post-processing stage, effectively refines the classification results.
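A minimal NumPy sketch of the two points where LiDAR enters this procedure: (1) feature-level fusion, where the AA-selected hyperspectral bands are stacked with the LiDAR elevation channel before classification, and (2) post-processing, where class labels implausible for a pixel's elevation are corrected. Band indices, class codes, and the height threshold are illustrative assumptions.

```python
import numpy as np


def fuse_features(hsi: np.ndarray, selected_bands: list, lidar_dsm: np.ndarray) -> np.ndarray:
    """Stack selected HSI bands (H, W, B) with the LiDAR DSM (H, W) into one cube."""
    hsi_subset = hsi[:, :, selected_bands]
    return np.concatenate([hsi_subset, lidar_dsm[..., None]], axis=-1)


def correct_by_elevation(labels, lidar_dsm, roof_class=3, ground_class=5, height_thr=2.0):
    """Relabel 'roof' pixels whose elevation is near ground level (shadow confusion)."""
    corrected = labels.copy()
    corrected[(labels == roof_class) & (lidar_dsm < height_thr)] = ground_class
    return corrected


if __name__ == "__main__":
    hsi = np.random.rand(100, 100, 144)       # hyperspectral cube
    dsm = np.random.rand(100, 100) * 10.0     # LiDAR elevation in metres
    fused = fuse_features(hsi, [10, 40, 90], dsm)
    print(fused.shape)                        # (100, 100, 4)
```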
State-of-the-art simultaneous localization and mapping (SLAM) methods have demonstrated satisfactory results in static environments. However, most existing methods cannot be directly applied in dynamic environments because they do not account for moving objects. To solve the practical problem of SLAM in dynamic environments, an RGB-D SLAM method based on moving-object removal and dense map reconstruction is developed. Specifically, deep learning-based instance segmentation is performed to obtain semantic information, which is then combined with multiview geometry to detect potentially moving objects. A Kalman filter and feature fusion algorithms are further used to track moving objects and improve detection accuracy. The camera pose is then estimated from the static feature points remaining after motion removal to achieve reliable localization in dynamic environments. Because navigation and perception are infeasible with sparse feature maps, a dense map of each single view is constructed from the RGB image and the depth map collected by the RGB-D camera after motion removal. The dense environmental map is finally reconstructed by registering the point clouds of selected key frames. Experiments are conducted on public datasets and real-scene data to evaluate the performance of the proposed method. Compared with the state-of-the-art method ORB-SLAM2, the proposed method reduces the absolute trajectory error and the relative pose error by at least 91%, and its localization accuracy reaches 0.001 m.
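A minimal NumPy sketch of the multiview-geometry check used to flag moving points: a feature observed in a reference frame is back-projected with its depth, transformed into the current frame, and re-projected; if the depth measured in the current frame disagrees with the predicted depth, the point is treated as dynamic and excluded before pose estimation. The camera intrinsics, pose, and threshold below are illustrative assumptions.

```python
import numpy as np

K = np.array([[525.0, 0.0, 319.5],
              [0.0, 525.0, 239.5],
              [0.0, 0.0, 1.0]])               # pinhole intrinsics (assumed)


def is_dynamic(uv_ref, depth_ref, depth_cur_map, T_cur_ref, thr=0.05):
    """Return True if the re-projected depth disagrees with the current depth map."""
    # Back-project the reference pixel to a 3-D point in the reference camera frame.
    p_ref = depth_ref * np.linalg.inv(K) @ np.array([uv_ref[0], uv_ref[1], 1.0])
    # Transform into the current camera frame (T_cur_ref is a 4x4 SE(3) matrix).
    p_cur = (T_cur_ref @ np.append(p_ref, 1.0))[:3]
    # Project into the current image and read the measured depth at that pixel.
    uv_cur = (K @ p_cur) / p_cur[2]
    u, v = int(round(uv_cur[0])), int(round(uv_cur[1]))
    measured = depth_cur_map[v, u]
    # A large discrepancy means the scene point moved between the two frames.
    return abs(measured - p_cur[2]) > thr


if __name__ == "__main__":
    depth_map = np.full((480, 640), 2.0)                  # static scene at 2 m
    T = np.eye(4); T[0, 3] = 0.05                         # small camera translation
    print(is_dynamic((320, 240), 2.0, depth_map, T))      # False (consistent, static)
```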
A spectral–spatial classification method using a trilateral filter (TF) and a stacked sparse autoencoder (SSA) is proposed to improve the classification accuracy of hyperspectral images (HSIs). The operation is carried out in two main stages: edge-preserved smoothing and high-level feature learning. First, a reference image obtained from the dual-tree complex wavelet transform is adopted in a TF for smoothing the HSI. The filter not only effectively attenuates mixed noise (e.g., Gaussian noise and impulse noise), where the bilateral filter performs poorly, but also produces useful spectral–spatial features from the HSI by considering geometric closeness and photometric similarity between pixels simultaneously. Second, an artificial fish swarm algorithm (AFSA) is introduced into an SSA for the first time, and the proposed deep learning architecture is used to adaptively exploit more abstract and discriminative high-level feature representations from the smoothed HSI, based on the fact that AFSA provides a better trade-off among concurrency, search efficiency, and convergence rate than the gradient descent and back-propagation algorithms of a traditional SSA. Finally, a random forest classifier is used for supervised fine-tuning and classification. Experimental results on two real HSI datasets demonstrate that the proposed method achieves competitive performance compared with conventional methods.
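A minimal NumPy sketch of the edge-preserving smoothing idea: each pixel is replaced by a weighted average of its neighbours, where the weights combine geometric closeness (a spatial Gaussian) and photometric similarity measured on a reference image (a range Gaussian). This is a simplified, joint-bilateral form of the filter rather than the full trilateral filter; the window size and sigmas are illustrative assumptions.

```python
import numpy as np


def guided_smooth(band, reference, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Smooth one HSI band using spatial and reference-guided photometric weights."""
    h, w = band.shape
    pad_b = np.pad(band, radius, mode="reflect")
    pad_r = np.pad(reference, radius, mode="reflect")
    # Pre-compute the spatial Gaussian kernel (geometric closeness).
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xx**2 + yy**2) / (2 * sigma_s**2))
    out = np.empty_like(band)
    for i in range(h):
        for j in range(w):
            win_b = pad_b[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            win_r = pad_r[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Photometric similarity is measured on the reference image, so the
            # weights stay reliable even when the band itself is noisy.
            photometric = np.exp(-(win_r - reference[i, j])**2 / (2 * sigma_r**2))
            weights = spatial * photometric
            out[i, j] = np.sum(weights * win_b) / np.sum(weights)
    return out


if __name__ == "__main__":
    band = np.random.rand(32, 32)                               # one noisy HSI band
    ref = np.clip(band + 0.01 * np.random.randn(32, 32), 0, 1)  # reference image
    print(guided_smooth(band, ref).shape)                       # (32, 32)
```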