SSD (Single Shot MultiBox Detector) is one of the best object detection algorithms, offering both high accuracy and fast speed. However, SSD's feature pyramid detection method only extracts features at different scales without further processing, which leads to a loss of semantic information. In this paper, we propose Multi-scale Feature Integration SSD, an enhanced SSD with feature integration modules that significantly improve performance over SSD. In the feature integration modules, features from layers at different scales are upsampled and concatenated together; the concatenated features are then passed through several convolutional modules, whose outputs are fed to multibox detectors to predict the final results. We test our algorithm on the PASCAL VOC 2007 test set with an input size of 300×300 using a single Nvidia 1080Ti GPU. Our network outperforms many state-of-the-art object detection algorithms in both accuracy and speed.
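The core of the feature integration module is combining feature maps of different spatial sizes. A toy numpy sketch of that idea (not the authors' implementation; the layer sizes and channel counts are illustrative, matching typical SSD300 pyramid levels) upsamples all maps to the largest resolution and concatenates them along the channel axis:

```python
# Toy sketch: fuse multi-scale feature maps by upsampling + concatenation.
import numpy as np

def upsample_nearest(feat, target_hw):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    c, h, w = feat.shape
    th, tw = target_hw
    rows = np.arange(th) * h // th
    cols = np.arange(tw) * w // tw
    return feat[:, rows][:, :, cols]

def integrate_features(feats):
    """Upsample every map to the largest spatial size, then stack channels."""
    target = max((f.shape[1], f.shape[2]) for f in feats)
    up = [upsample_nearest(f, target) for f in feats]
    return np.concatenate(up, axis=0)

# Three pyramid levels, e.g. 38x38, 19x19, 10x10 as in SSD300
f1 = np.random.rand(64, 38, 38)
f2 = np.random.rand(128, 19, 19)
f3 = np.random.rand(256, 10, 10)
fused = integrate_features([f1, f2, f3])
print(fused.shape)  # (448, 38, 38)
```

In a real network the upsampling would be bilinear interpolation or a learned deconvolution, and the fused tensor would feed the convolutional modules mentioned above.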
The detection of ship targets in remote sensing satellite images is an important means of locating all ships on the sea surface from satellite imagery. It enables the monitoring of sea-surface resources and therefore has significant civil and military value. Because of the complex background, ship detection in harbours is one of the main difficulties. In recent years, many deep-learning-based target detection methods have been proposed, and they have achieved good results on natural scene images. YOLOv3 is an advanced end-to-end method thanks to its high detection accuracy and fast detection speed. But even advanced methods have shortcomings in this task: ships in port usually dock side by side, which causes many targets to be missed when NMS (Non-Maximum Suppression) is performed on the predicted bounding boxes. In this paper, we replace the original NMS with Soft-NMS on the basis of YOLOv3, which makes the detector miss fewer targets. At the same time, we add an IoU loss when computing the loss between the prediction box and the ground truth box. The IoU loss takes the IoU between the prediction box and its corresponding ground truth box as the evaluation criterion, which makes the boxes generated by the detector fit the targets more tightly. To validate the effectiveness of the proposed algorithm, we use harbour remote sensing data collected from Google imagery and the GaoFen-2 (GF-2) satellite; the experimental results show the good performance of the proposed method in the detection of ship targets in harbours.
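The benefit of Soft-NMS for side-by-side ships can be seen in a minimal sketch (Gaussian variant; the boxes, scores, and sigma below are illustrative, not from the paper): instead of discarding boxes that overlap the current best box, their scores are decayed, so a neighbouring ship with high overlap can still survive. The IoU loss mentioned above is simply one minus the IoU:

```python
# Minimal sketch of Gaussian Soft-NMS and IoU loss for (x1, y1, x2, y2) boxes.
import numpy as np

def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def iou_loss(pred, gt):
    """IoU loss: 0 for a perfect fit, 1 for no overlap."""
    return 1.0 - iou(pred, gt)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Keep all boxes; decay scores of overlaps instead of deleting them."""
    boxes, scores = list(boxes), list(scores)
    keep = []
    while boxes:
        i = int(np.argmax(scores))
        best_box, best_score = boxes.pop(i), scores.pop(i)
        if best_score < score_thresh:
            break
        keep.append((best_box, best_score))
        scores = [s * np.exp(-iou(best_box, b) ** 2 / sigma)
                  for b, s in zip(boxes, scores)]
    return keep

# Two ships docked side by side (overlapping boxes) plus one isolated ship:
# hard NMS with a low IoU threshold would drop the second box; Soft-NMS keeps it.
boxes = [(0, 0, 10, 10), (4, 0, 14, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
print(len(kept))  # 3
```

A final score threshold (here 0.001) is what actually removes duplicates whose scores have been decayed to near zero.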
Semantic segmentation is one of the basic themes in computer vision. Its purpose is to assign a semantic label to each pixel of an image, and it has been applied in many fields such as medicine, intelligent transportation, and remote sensing. In this paper, we use deep learning to tackle the task of semantic segmentation of remote sensing images. We propose Attention Seg-Net, a semantic segmentation network that combines SegNet with attention gates. Our proposed network better segments vegetation, buildings, water bodies, and roads in the remote sensing test set.
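The attention-gate mechanism can be sketched in numpy (in the spirit of Attention U-Net; the weight matrices here are random placeholders, not trained parameters, and the 1×1 convolutions are written as per-pixel channel mixes): a gating signal from a deeper layer produces a map of coefficients in (0, 1) that reweights the skip-connection features.

```python
# Toy sketch of an attention gate: alpha = sigmoid(psi . ReLU(Wx*x + Wg*g)).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_gate(x, g, w_x, w_g, psi):
    """x: skip features (C, H, W); g: gating signal (C, H, W), same size.
    Returns x weighted by attention coefficients alpha in (0, 1)."""
    q = np.maximum(0.0, np.einsum('fc,chw->fhw', w_x, x) +
                        np.einsum('fc,chw->fhw', w_g, g))  # ReLU(Wx*x + Wg*g)
    alpha = sigmoid(np.einsum('f,fhw->hw', psi, q))        # per-pixel weight
    return x * alpha[None, :, :], alpha

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))     # skip-connection features
g = rng.standard_normal((8, 16, 16))     # gating signal from a deeper layer
w_x = rng.standard_normal((4, 8)) * 0.1  # placeholder 1x1-conv weights
w_g = rng.standard_normal((4, 8)) * 0.1
psi = rng.standard_normal(4)
out, alpha = attention_gate(x, g, w_x, w_g, psi)
print(out.shape)  # (8, 16, 16)
```

In the full network the gating signal is upsampled to the skip feature's resolution first; that resampling is omitted here for brevity.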
Extracting buildings from remote sensing images is a significant task with many applications, such as map drawing, city planning, and population estimation. However, traditional methods that rely on hand-designed features struggle to perform well due to the diverse appearance of buildings and complicated backgrounds. In this paper, we design an end-to-end convolutional neural network that combines semantic segmentation and edge detection for building extraction. In addition, we propose a residual unit combined with spatial pyramid pooling (SPP-RU) that yields representations with receptive fields of different sizes via a multi-branch network. We conduct experiments on the WHU building dataset, and the results demonstrate the effectiveness of our method in both quantitative and qualitative terms compared with state-of-the-art methods.
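The multi-branch idea behind SPP-RU can be sketched as follows (a hypothetical numpy illustration, not the paper's architecture; the scales and fusion-by-averaging are assumptions): pooling a feature map at several window sizes produces branches with different effective receptive fields, which are fused and added back through an identity shortcut.

```python
# Toy sketch: spatial-pyramid branches fused with a residual shortcut.
import numpy as np

def avg_pool_and_restore(feat, k):
    """Average-pool a (C, H, W) map in kxk windows, then nearest-upsample
    back to (H, W); larger k means a larger effective receptive field."""
    c, h, w = feat.shape
    cropped = feat[:, :h - h % k, :w - w % k]
    pooled = cropped.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))
    return np.repeat(np.repeat(pooled, k, axis=1), k, axis=2)

def spp_residual_unit(feat, scales=(1, 2, 4)):
    """Fuse the pyramid branches by averaging, then add the identity path."""
    branches = [avg_pool_and_restore(feat, k) for k in scales]
    return feat + np.mean(branches, axis=0)

x = np.random.rand(16, 32, 32)
y = spp_residual_unit(x)
print(y.shape)  # (16, 32, 32)
```

A real SPP-RU would use learned convolutions in each branch; the point of the sketch is only the multi-scale branching plus the residual addition.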
Satellite remote sensing with highly accurate cloud detection is important for monitoring natural disasters. GaoFen-4, China's first high-resolution geostationary satellite, was recently launched and acquires imagery at a spatial resolution of 50 m with a high temporal resolution (up to 10 min). An object-based cloud detection method was applied to a time series of GaoFen-4 images: cloudy objects were detected in the individual images, and outlier detection over the multiple temporal objects was then applied for refinement. In the initial cloud detection, objects were segmented by the mean-shift algorithm, and their morphological features were extracted by extended attribute profiles. The threshold-detected cloudy objects served as training samples described by spectral and morphological features, and the initial objects were classified as cloudy or clear by a regularized least-squares classifier. Furthermore, the medians and standard deviations of the features of the classified cloudy and clear objects were calculated, and the classification was refined by outlier detection across the multiple temporal images: clear objects whose features deviated from the clear-object medians by more than a multiple of the standard deviations were reclassified as cloudy, and the refined clear objects were obtained by a similar outlier detection. Flood event monitoring using GaoFen-4 images showed that the average overall accuracy of the initial cloud detection was 83.4%, increasing to 93.3% after refinement. This object-based cloud detection method is insensitive to variations in land objects and can effectively improve the detection of small or thin clouds, which is helpful for the monitoring of natural disasters.
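The median/standard-deviation refinement step can be sketched in a few lines (a minimal illustration, assuming a single scalar feature per object such as mean brightness; the data values and the 2-sigma threshold are invented for the example): clear objects whose feature deviates from the clear-class median by more than m standard deviations are reclassified as cloudy.

```python
# Minimal sketch of the outlier-based label refinement described above.
import numpy as np

def refine_labels(features, labels, m=2.0):
    """features: per-object feature values; labels: 'clear'/'cloudy'.
    Returns refined labels after outlier detection on the clear class."""
    feats = np.asarray(features, dtype=float)
    clear = feats[[l == 'clear' for l in labels]]
    med, std = np.median(clear), clear.std()
    refined = []
    for f, l in zip(feats, labels):
        if l == 'clear' and abs(f - med) > m * std:
            refined.append('cloudy')  # clear outlier -> reclassify as cloud
        else:
            refined.append(l)
    return refined

# e.g. mean brightness of segmented objects across the time series:
# the last object was misclassified as clear but is a bright outlier.
features = [0.20, 0.22, 0.21, 0.95, 0.90, 0.19, 0.85]
labels   = ['clear', 'clear', 'clear', 'cloudy', 'cloudy', 'clear', 'clear']
refined = refine_labels(features, labels)
print(refined[-1])  # cloudy
```

The symmetric refinement of the cloudy class proceeds the same way with the cloudy-object statistics.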
Multi-sensor image registration is an important part of remote sensing image processing. The gray-level appearance of the same object can differ greatly between infrared and visible imaging modes, so directly applying the traditional SIFT algorithm in registration yields few matching points. NSCT decomposition, however, represents the structural information of an image very well, and more SIFT feature points can be extracted from its high-frequency decomposed images. In addition, the gradients used by traditional SIFT descriptors are affected by gray-level contrast, which reduces the number of matched feature points found during the similarity search in the matching procedure. Gradient mirroring (GM) is a method that modifies the direction of the feature-point gradients, reducing the impact of contrast on similarity matching. Therefore, a novel method combining NSCT and GM is proposed in this article. Experiments show that, compared with the traditional SIFT algorithm, the new method obtains more matching points, better spatial distribution, and a higher matching rate in infrared and visible image registration.
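The intuition behind gradient mirroring can be shown with a toy sketch (an illustration of the general idea, not the paper's exact formulation): contrast reversal between infrared and visible images flips gradient directions by 180 degrees, so folding all orientations into [0, 180) makes the descriptor invariant to that reversal.

```python
# Toy sketch of gradient mirroring: fold orientations so theta and
# theta + 180 degrees are treated as the same direction.
import numpy as np

def mirror_orientations(angles_deg):
    """Map gradient orientations into [0, 180)."""
    return np.mod(angles_deg, 180.0)

# Gradient orientations at the same edges in a visible image, and in an
# infrared image where the contrast is reversed (directions flipped by 180).
visible = np.array([30.0, 200.0, 359.0])
infrared = np.mod(visible + 180.0, 360.0)

print(mirror_orientations(visible))   # [ 30.  20. 179.]
print(np.allclose(mirror_orientations(visible),
                  mirror_orientations(infrared)))  # True
```

After mirroring, the orientation histograms built for SIFT descriptors agree across the two modalities even where the gray contrast is inverted.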