We proposed a vision-based methodology as an aid for an unmanned aerial vehicle (UAV) landing on a previously unsurveyed area. When the UAV was commanded to perform a landing mission at an unknown airfield, a learning procedure was activated to extract the surface features and learn the obstacle appearance. After the learning process, while the UAV hovered above a potential landing spot, the vision system was able to predict the roughness value as a measure of confidence in a safe landing. Finally, using hybrid optical flow technology for motion estimation, we successfully carried out the UAV landing without a predefined target. Our work combines a well-equipped flight control system with the proposed vision system to yield more practical versatility for UAV applications.
1. Introduction

Unmanned aerial vehicles (UAVs) are widely used in many fields, from military to civilian to commercial. For the sake of efficiency and convenience, a higher degree of autonomy is required to minimize human intervention. Many efforts have been made to develop vision-based technologies for UAV maneuvers. One challenge of this technique lies in how to land a UAV on a previously unvisited area. Generally, the issue can be divided into two parts: identifying a flat area for safe landing (landing risk assessment) and landing accurately on an unknown spot (motion estimation). For landing site selection, the goal is to find a planar surface with a small slope that is free of obstacles, which can be achieved by assessing the landing risk through either constructing an elevation map or evaluating the planarity of the terrain appearance. The former approach builds a full three-dimensional (3-D) geometry1 with the corresponding coordinates of the environment from a sequence of images, e.g., structure from motion (SFM).2–5 From the estimated topographical information, a two-dimensional elevation map is then obtained to assess the regional flatness by, for instance, least median squares6 or plane fitting.7 Combined with the coordinates of the surroundings, this information is useful for selecting a likely landing site. SFM and related methods enable image-based 3-D scene reconstruction as well as determination of a safe landing site, but at the cost of heavy computation. Alternatively, to directly verify the absence of obstacles, less complete but effective schemes have been used: the planar area can be extracted through homography estimation.8,9 Without the need for a region extraction process, roughness estimation from the optical flow field was proposed to measure the planarity of the surface.10 In addition, researchers have introduced learning processes to yield more practical versatility for UAV applications, such as supervised learning for texture classification,11 neural network policies for navigation,12 and deep reinforcement learning for marker recognition.13 For the aspect of landing risk, a self-supervised learning (SSL) method was employed to overcome the constraint that significant movement is required for optical-flow-based roughness estimation.14 After determining an adequate landing site in an unvisited area, the follow-up task is to complete the landing process guided by either a positioning system [e.g., the global positioning system (GPS)] or a vision-based motion estimation system. In terms of vision-based landing schemes, several patterns have been designed as markers to tackle the close-range and nighttime detection problems during UAV descent.15–18 Moreover, for landing on a moving target, schemes that either optimize the marker detection rate19 or exploit the moving target's dynamic model have been developed.20 However, the performance of the aforementioned schemes mainly relies on a specific target pattern and is unlikely to apply in an unvisited environment where there is no chance to set up a well-defined landing guide in advance. Our team aims to develop a fully vision-based system for UAV landings in a previously unvisited environment. Building on our previous work on vision-based landing motion estimation,21 we further integrated the vision system with the learning algorithm.
The major function of the proposed system is to classify the obstacle appearance on the ground and provide an accurate measure of motion during the landing. To achieve these aims, we introduced SSL to model the relationship between visual appearance and surface roughness and developed a classifier to determine whether the ground is safe for landing by recognizing the predicted roughness (a yes/no question). Moreover, the hybrid optical flow scheme was also employed to ensure motion estimation throughout the entire landing process without prior knowledge of guiding markers. The remainder of this paper is organized as follows. In Sec. 2, we explain the concept of roughness estimation as well as the methodology for landing site identification. In Sec. 3, we introduce the hybrid framework for visual motion detection, including the multiscale strategy for positioning to tackle the field-of-view problem during descent. Afterward, experimental verification is given in Sec. 4. Finally, conclusions and future work are given in Sec. 5.

2. Learning of Obstacle Appearance

SSL is a classic approach that uses input signals as the sources of supervision. Instead of human intervention, the training labels are determined by the collected data. Therefore, to learn the obstacle appearance in view, we must gather the visual cues as the input objects and the surface planarity as the corresponding supervised output values. Figure 1 shows the process of the learning algorithm for the obstacle appearance. Phase I: since the area is previously unvisited, we first navigated the UAV to capture images as the clustering dataset. The texton dictionary22,23 was then built as an attribute of the surface texture features. Phase II: we collected the training data, including the surface roughness measured from the optical flow field and the texton distribution formed by matching randomly selected patches against the labeled textons. We used regression to model the relationship between the surface roughness and the texton distribution. Phase III: after completing the learning step, the UAV has the capability to identify obstacles in a still image through the predicted roughness, thereby ensuring a safe landing on the unvisited area. In order to save computational effort, the imaging algorithms were applied only to the region of interest (ROI) of the input image stream. Details are explained in the following subsections.

2.1. Patch Operation for Visual Appearance

In this study, we used the texton method24 to characterize the visual appearance of an unvisited airfield. By clustering the characteristic values of multiple image patches, we can generate a texton dictionary that represents the surface texture features. In our implementation, each image patch was rearranged into a vector of its grayscale values. These vectors were then partitioned into $k$ sets by a $k$-means clustering algorithm,25 where each vector belongs to the cluster with the nearest mean. The objective function can be defined as
$$J = \sum_{i=1}^{k} \sum_{\mathbf{x} \in S_i} \left\lVert \mathbf{x} - \boldsymbol{\mu}_i \right\rVert^2,$$
where $\boldsymbol{\mu}_i$ is the mean of the observations in cluster $S_i$. With this method, the cluster centroids form a visual dictionary for the unvisited area. After creating the texton dictionary, the rendering features of an image can be characterized by a texton probability distribution, as shown in Fig. 2. For each randomly extracted patch, we searched for the closest match in the dictionary based on the Euclidean distance and added it to the corresponding bin of a histogram. Finally, we obtain the texton distribution by normalizing the histogram by the number of extracted patches:
$$q_i = \frac{n_i}{N}, \quad i = 1, \ldots, k,$$
where $n_i$ is the number of patches assigned to the $i$'th texton and $N$ is the total number of extracted patches.
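To make the patch operation concrete, the following Python sketch clusters randomly sampled grayscale patches into a texton dictionary with k-means and converts a new image into a normalized texton histogram. It is a minimal illustration under assumed parameters (patch size, dictionary size, and number of sampled patches), not the flight implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

PATCH = 6   # patch size in pixels (assumed)
K = 20      # number of textons in the dictionary (assumed)

def sample_patches(gray, n_patches, rng):
    """Randomly extract flattened grayscale patches from an image."""
    h, w = gray.shape
    ys = rng.integers(0, h - PATCH, size=n_patches)
    xs = rng.integers(0, w - PATCH, size=n_patches)
    return np.stack([gray[y:y + PATCH, x:x + PATCH].ravel()
                     for y, x in zip(ys, xs)]).astype(np.float32)

def build_dictionary(images, n_patches=2000, seed=0):
    """Phase I: cluster patches from the survey images into K textons."""
    rng = np.random.default_rng(seed)
    data = np.vstack([sample_patches(img, n_patches, rng) for img in images])
    return KMeans(n_clusters=K, n_init=10, random_state=seed).fit(data)

def texton_histogram(gray, dictionary, n_patches=500, seed=1):
    """Match random patches to their nearest texton and normalize the counts."""
    rng = np.random.default_rng(seed)
    patches = sample_patches(gray, n_patches, rng)
    labels = dictionary.predict(patches)             # nearest centroid per patch
    hist = np.bincount(labels, minlength=K).astype(np.float32)
    return hist / n_patches                          # normalized texton distribution
```

The resulting distribution serves as the feature vector that is later paired with a roughness label in the regression step.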
2.2. Surface Roughness Using an Optical Flow Algorithm

In order to determine whether the selected spot is suitable for landing, we estimated the surface roughness as the merit of a safe landing. The concept of roughness estimation is to regard the optical flow components as a set of points for a plane-fitting problem, where the fitting error is adopted as the measure of roughness. Based on previous research in Ref. 10, the camera model underlying the optical flow algorithm shall satisfy the following conditions: (1) downward-looking camera, (2) planar surface in sight, and (3) known angular rates of the camera. Under these assumptions, the optical flow vectors can be generalized as
$$u = (-\vartheta_x + x\,\vartheta_z)(1 + \alpha x + \beta y), \qquad v = (-\vartheta_y + y\,\vartheta_z)(1 + \alpha x + \beta y), \tag{5}$$
where $u$ and $v$ are the optical flow components in the $x$ and $y$ directions of the image coordinate system, respectively; $\vartheta_x$, $\vartheta_y$, and $\vartheta_z$ are the corresponding velocities in the $x$, $y$, and $z$ directions scaled with respect to the altitude; and $\alpha$ and $\beta$ are the tangents of the slope angles of the surface. According to Eq. (5), the magnitude of the optical flow is inversely proportional to the flight height above the surface. Therefore, we can estimate the surface roughness by fitting the optical flow field. Since the UAV moved nearly laterally, Eq. (5) can be simplified to Eq. (6):
$$u = -\vartheta_x (1 + \alpha x + \beta y), \qquad v = -\vartheta_y (1 + \alpha x + \beta y). \tag{6}$$
The parameter vectors of the $u$ and $v$ components can be calculated separately by solving a linear fitting problem within a random sample consensus (RANSAC) procedure. The RANSAC iterations give us the estimated surface plane and the corresponding fitting error, which serves as the measure of surface roughness (Fig. 3). If there is any obstacle on the surface, the procedure leads to a higher fitting error in the $x$ direction, the $y$ direction, or both. Consequently, we combined the results in both directions into an overall surface roughness.
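As an illustration of this roughness measure, the sketch below fits the simplified planar model of Eq. (6) to each derotated flow component with a small RANSAC loop and combines the residual errors. The iteration count, inlier tolerance, and the use of the mean absolute residual as the error score are assumptions of this example rather than the exact settings of the original system.

```python
import numpy as np

def fit_plane_ransac(x, y, z, n_iter=100, tol=0.05, rng=None):
    """Fit z ~ c0 + c1*x + c2*y with RANSAC; return coefficients and residuals."""
    rng = rng or np.random.default_rng(0)
    A = np.column_stack([np.ones_like(x), x, y])
    best_coef, best_inliers = None, -1
    for _ in range(n_iter):
        idx = rng.choice(len(x), size=3, replace=False)      # minimal sample
        coef, *_ = np.linalg.lstsq(A[idx], z[idx], rcond=None)
        inliers = int((np.abs(A @ coef - z) < tol).sum())
        if inliers > best_inliers:
            best_coef, best_inliers = coef, inliers
    resid = np.abs(A @ best_coef - z)                         # error of best plane
    return best_coef, resid

def surface_roughness(x, y, u, v):
    """Combine the plane-fitting errors of both flow components (Eq. 6)."""
    _, ru = fit_plane_ransac(x, y, u)
    _, rv = fit_plane_ransac(x, y, v)
    return float(ru.mean() + rv.mean())    # higher value -> rougher surface
```

A flat surface yields flow that is well explained by a single plane in each component, so both residual terms stay small; obstacles break the planarity and inflate the score.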
2.3. Self-Supervised Learning

In the previous section, we introduced the concept of roughness estimation using the optical flow technique. However, since the fitting dataset consists of velocity vectors, the roughness estimation requires significant movement to guarantee a reliable result, which is not viable in the hovering mode. In order to ensure the accuracy of obstacle prediction, we proposed an SSL scheme to map the visual appearance features (texton distributions) to roughness values. In this study, $k$-nearest neighbor (kNN) regression26 was used as the learning method due to its simplicity and flexibility. The algorithm is a nonparametric method that keeps all available data and predicts the numerical response based on a proximity measure. A dataset of $N$ training samples is given as
$$D = \{(x_i, y_i)\}_{i=1}^{N},$$
where $x_i$ is the texton distribution of the $i$'th sample and $y_i$ is the corresponding roughness measured from the optical flow field. The algorithm performs predictions by calculating the similarity between the input sample and each instance of the training data. Finally, the predicted roughness is the mean of the neighbors' responses,
$$\hat{y} = \frac{1}{k} \sum_{x_i \in N_k(x)} y_i,$$
where $N_k(x)$ denotes the $k$ objects in the database that are closest to the input $x$. After completing the learning process, we can estimate the surface roughness corresponding to any input distribution through the regression model.

3. Vision Motion Estimation for Landing

Once the landing site was selected, the UAV was commanded to descend steadily until touchdown. For vision-based landings in an unvisited environment, there is a possibility that a marker-based vision system might fail to recognize the marker feature. Therefore, we introduced a hybrid framework that selects the processing image frames and algorithms according to the required estimates, thus meeting the need to land in an unvisited environment with no target recognition required. In addition, an augmented phase correlation (PC) method with a multiscale strategy was introduced to tackle the problem of scale variation resulting from the field-of-view change during descent.

3.1. Hybrid Optical Flow Technology

The hybrid optical flow technique, which correlates multiple image frames, was proposed to measure two dynamic motions for the landing controls: velocity and position. As shown in Fig. 4, the velocity estimate is determined by comparing two consecutive frames. The position information, on the other hand, is computed from the deviation between the reference frame and the current frame to avoid the accumulation of integration errors. The image processing in the motion estimation part is also shown in Fig. 4. The vision system generated the guidance information together with the flight data from the control system. Following our previous work,21 we dynamically adopted the Gunnar–Farnebäck algorithm27 and the PC method28–30 to obtain the velocity and position, respectively. These two algorithms work together to combine the advantages of dense and sparse optical flow in terms of accuracy and robustness. Both the velocity and position measurements were employed as feedback signals to the flight control system.

3.2. Multiscale Strategy

The typical PC approach can tolerate only a small range of scale difference between two image frames. During the landing process, however, vision-based motion estimation experiences a large scale difference due to the decrease in height. Accordingly, in our previous work, we introduced a multiscale strategy to autonomously adjust the sensed ROI and update the reference ROI for relative position estimation. Figure 5 shows the concept of the multiscale PC method. First, the vision system set the reference ROI at the instant that the UAV was commanded to descend. As the UAV descended, the size of the ROI extraction was enlarged by a scale factor $s$. The sensed ROI was then obtained by resizing the extracted region by the same factor to maintain the same scale as the reference ROI. Finally, the sensed ROI and the reference ROI were applied to the PC function. The factor can be computed from the height of the UAV as
$$s = \frac{h_r}{h_s},$$
where $h_r$ and $h_s$ are the flight heights of the reference image and the sensed image, respectively. In our experiments, the flight height was obtained by a laser altimeter. In addition, while the UAV was descending, an instantly updated reference ROI image was necessary to ensure a sufficient overlap region for ROI extraction in subsequently captured images. The reference ROI was autonomously updated when $s$ reached a preset threshold value. It is noted that the position estimate was carried out with the prior measurement as an initial condition when passing to the next epoch of the reference ROI. By augmenting the PC method with the proposed multiscale strategy, we effectively minimize the sensing error, as shown by our experimental results.
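A rough sketch of how the two estimates could be obtained with OpenCV is shown below: dense Farnebäck flow between consecutive frames gives the velocity, and phase correlation between an altitude-scaled sensed ROI and the reference ROI gives the relative position. The focal length constant, ROI size, centered ROI placement, and the omission of derotation by the IMU angular rates are simplifying assumptions of this example, not details of the authors' flight code.

```python
import cv2
import numpy as np

FOCAL_PX = 800.0   # camera focal length in pixels (assumed)

def velocity_from_flow(prev_gray, curr_gray, height_m, dt):
    """Lateral velocity from dense Farneback flow between consecutive 8-bit frames."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    du, dv = flow[..., 0].mean(), flow[..., 1].mean()    # mean pixel motion
    # pixels -> meters using the pinhole model at the measured height
    return du * height_m / FOCAL_PX / dt, dv * height_m / FOCAL_PX / dt

def position_from_pc(ref_roi, curr_gray, h_ref, h_curr, roi_size=256):
    """Relative in-plane displacement from multiscale phase correlation."""
    s = h_ref / h_curr                                   # scale factor from altitude
    cy, cx = np.array(curr_gray.shape) // 2
    half = int(roi_size * s) // 2                        # enlarged extraction window
    patch = curr_gray[cy - half:cy + half, cx - half:cx + half]  # assumed in-frame
    sensed = cv2.resize(patch, (roi_size, roi_size))     # back to the reference scale
    (dx, dy), _ = cv2.phaseCorrelate(np.float32(ref_roi), np.float32(sensed))
    # pixel shift -> meters at the reference altitude
    return dx * h_ref / FOCAL_PX, dy * h_ref / FOCAL_PX
```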
4. Experimental Results

In this section, we present the experiments on vision-aided landing in a previously unsurveyed area. First, we compared the SSL method with the optical flow method for roughness estimation. Then we conducted the UAV landing with the hybrid optical flow scheme. It is noted that the learning process was carried out before these experiments. The testing UAV was a commercial quadrotor, the Stellar from InnoFlight™, as shown in Fig. 6. It was pre-equipped with a flight control system (InnoFlight™ Jupiter JM-1 Autopilot), an inertial measurement unit, a laser altimeter, and a GPS module. For the visual equipment, the image processing system included an NVIDIA™ Jetson TK1 module and a GoPro™ HERO camera. The embedded program in the vision computer executed the image processing algorithms and handled communications with the flight control computer, which ran the proportional–integral–derivative control scheme.

4.1. Obstacle Detection Using Self-Supervised Learning

In this experiment, we hovered the UAV above several spots (marked by red stars in Fig. 1), 10 s per spot, and examined the capability of the vision system for obstacle detection. Area 1 and area 2 contained obstacles (buildings and cars, respectively), whereas the other two spots (area 3 and area 4) had no obstacles. We computed the mean and the standard deviation of the roughness estimated by the optical flow algorithm and by the SSL method and inferred the classification rule for a safe landing site from these results. Figure 7 shows the estimated surface roughness while the UAV was hovering. As predicted in Sec. 2.3, the accuracy of surface roughness estimation via optical flow highly depends on the extent of UAV movement. Due to the absence of lateral movement, the optical flow vectors fail to reveal the magnitude difference caused by surface fluctuation. As can be seen from the results of the optical flow method (top-left figure), the roughness estimates failed to cluster and thus gave poor discrimination for classification. In contrast, the roughness predicted by the SSL method provided a better evaluation (bottom-left figure). The roughness distribution can be entirely separated into two groups, corresponding to regions with and without obstacles. In addition to the clear threshold by which the proposed vision system can identify a safe landing area (with a low roughness value), the distribution shows that the larger the area occupied by obstacles, the higher the roughness value.

4.2. Landing Controls with Hybrid Motion Estimation

In preparation for landing (hovering phase), the UAV was commanded to hold its position above the likely landing spot. After confirming that no obstacles were in view, the UAV then started to descend steadily with visual feedback from the hybrid optical flow technique. Although the visual motion estimation was in effect, the system also collected GPS data simultaneously as a benchmark. In this work, the positioning accuracy was verified using the template matching method31,32 over multiple flight trials. A video demo can be found in Ref. 33, whereas Figs. 8(a) and 8(b) show the flight data and the in-plane route during the landing process. In terms of velocity estimation, the precision of the vision-based method was comparable with that of the state-of-the-art GPS, with only a marginal difference in both the x and y directions. The vision-based landing was activated at the point marked by the black cross, i.e., the target spot set at the coordinate origin. At the end of the landing, the UAV was located at the positions marked by the blue dot and the green dot according to the sensed values of the vision system and the GPS, respectively. The corresponding camera views are also presented in Fig. 8(c). For the position part, we used the template matching method to verify the landing accuracy. The location detected by template matching (red cross) was considered the ground truth for positioning accuracy, and its coordinate is also indicated in Fig. 8(b).
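For reference, such a ground-truth location can be obtained with normalized cross-correlation template matching in OpenCV, as in the sketch below; the file names and the choice of TM_CCOEFF_NORMED are illustrative assumptions, not necessarily the settings used in the experiments.

```python
import cv2

def locate_template(scene_gray, template_gray):
    """Return the center of the best template match and its correlation score."""
    result = cv2.matchTemplate(scene_gray, template_gray, cv2.TM_CCOEFF_NORMED)
    _, score, _, top_left = cv2.minMaxLoc(result)        # best match (x, y) corner
    th, tw = template_gray.shape
    center = (top_left[0] + tw // 2, top_left[1] + th // 2)
    return center, score

# Example: compare the vision/GPS landing estimates against the matched spot
# scene = cv2.imread("final_view.png", cv2.IMREAD_GRAYSCALE)      # hypothetical file
# target = cv2.imread("landing_patch.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
# (cx, cy), score = locate_template(scene, target)
```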
Overall, the vision-based landing resulted in a small in-plane positioning error. These results suggest that the hybrid vision-based scheme is able to guide the UAV landing precisely without prior information about a particular marker.

5. Conclusions

In this paper, we proposed a vision-aided system to assist UAV landing in an unsurveyed environment. The overall procedure involved identifying the safety of the landing spot and completing the landing at that location. To assess the landing risk, the system used an SSL algorithm to construct a regression model for roughness estimation. The texton distribution formed by patch matching was used to represent the visual features in view, and the roughness value was used to determine whether the ground underneath was a safe landing site. For a newly acquired distribution, the proposed system obtained the roughness information through the regression model and further classified the presence of obstacles. Compared with the pure optical flow method, the SSL method allowed the UAV to estimate the roughness in the hovering phase, which is more practical in UAV landing operations. We then applied the visual motion estimation framework for landing proposed in our previous work, including a multiscale strategy that tackles the problem of scale variation during descent. With this method, we can land a UAV without requiring a well-defined target on the ground. In addition, the experimental results indicated that we successfully carried out the vision-based autonomous landing with a small positioning error. A detailed discussion of the landing performance can be found in our previous work.21 To use the vision system more fully in UAV applications, more effort shall be devoted to the problem of visually retrieving depth information. Moreover, online learning would make the UAV more versatile by autonomously selecting a safe landing spot. In this way, a fully vision-based system can be employed to implement UAV autolanding in an unsurveyed environment.

Acknowledgments

This work was financially supported by the Ministry of Science and Technology (MOST) of Taiwan under Grant Nos. 107-2221-E-009-084, 107-2221-E-009-115-MY3, and 108-2221-E-009-108.

References
1. Y. Diskin and V. Asari, "Dense point-cloud representation of a scene using monocular vision," J. Electron. Imaging 24(2), 023003 (2015). https://doi.org/10.1117/1.JEI.24.2.023003
2. S. Ullman, "The interpretation of structure from motion," Proc. R. Soc. London Ser. B 203(1153), 405–426 (1979). https://doi.org/10.1098/rspb.1979.0006
3. T. Templeton et al., "Autonomous vision-based landing and terrain mapping using an MPC-controlled unmanned rotorcraft," in Proc. IEEE Int. Conf. Rob. and Autom., 1349–1356 (2007). https://doi.org/10.1109/ROBOT.2007.363172
4. C. Forster et al., "Continuous on-board monocular-vision-based elevation mapping applied to autonomous landing of micro aerial vehicles," in Proc. IEEE Int. Conf. Rob. and Autom., 111–118 (2015). https://doi.org/10.1109/ICRA.2015.7138988
5. T. Hinzmann et al., "Free LSD: prior-free visual landing site detection for autonomous planes," IEEE Rob. Autom. Lett. 3, 2545–2552 (2018). https://doi.org/10.1109/LRA.2018.2809962
6. A. Johnson, J. Montgomery, and L. Matthies, "Vision guided landing of an autonomous helicopter in hazardous terrain," in Proc. IEEE Int. Conf. Rob. and Autom., 3966–3971 (2005). https://doi.org/10.1109/ROBOT.2005.1570727
7. J. Mackay, G. Ellingson, and T. W. McLain, "Landing zone determination for autonomous rotorcraft in surveillance applications," in AIAA Guidance, Navigation, and Control Conf. (2016).
8. S. Bosch, S. Lacroix, and F. Caballero, "Autonomous detection of safe landing areas for an UAV from monocular images," in IEEE/RSJ Int. Conf. Intell. Rob. and Syst., 5522–5527 (2006). https://doi.org/10.1109/IROS.2006.282188
9. M. Rebert et al., "Parallax beam: a vision-based motion estimation method robust to nearly planar scenes," J. Electron. Imaging 28(2), 023030 (2019). https://doi.org/10.1117/1.JEI.28.2.023030
10. G. de Croon et al., "Optic-flow based slope estimation for autonomous landing," Int. J. Micro Air Veh. 5(4), 287–297 (2013). https://doi.org/10.1260/1756-8293.5.4.287
11. A. Cesetti et al., "Autonomous safe landing of a vision guided helicopter," in Proc. IEEE/ASME Int. Conf. Mechatron. and Embedded Syst. and Appl., 125–130 (2010). https://doi.org/10.1109/MESA.2010.5552081
12. A. Loquercio et al., "DroNet: learning to fly by driving," IEEE Rob. Autom. Lett. 3(2), 1088–1095 (2018). https://doi.org/10.1109/LRA.2018.2795643
13. R. Polvara et al., "Toward end-to-end control for UAV autonomous landing via deep reinforcement learning," in Int. Conf. Unmanned Aircraft Syst., 115–123 (2018).
14. H. W. Ho et al., "Optical-flow based self-supervised learning of obstacle appearance applied to MAV landing," Rob. Auton. Syst. 100, 78–94 (2018). https://doi.org/10.1016/j.robot.2017.10.004
15. S. Lange, N. Sunderhauf, and P. Protzel, "A vision based onboard approach for landing and position control of an autonomous multirotor UAV in GPS-denied environments," in Int. Conf. Adv. Rob., 1–6 (2009).
16. H. Lee, S. Jung, and D. H. Shim, "Vision-based UAV landing on the moving vehicle," in Int. Conf. Unmanned Aircraft Syst., 1–7 (2016). https://doi.org/10.1109/ICUAS.2016.7502574
17. X. Chen, S. K. Phang, and B. Chen, "System integration of a vision-guided UAV for autonomous tracking on moving platform in low illumination condition," in ION Pacific PNT Conf. (2017).
18. S. Lin, M. A. Garratt, and A. J. Lambert, "Monocular vision-based real-time target recognition and tracking for autonomously landing an UAV in a cluttered shipboard environment," Auton. Rob. 41(4), 881–901 (2017). https://doi.org/10.1007/s10514-016-9564-2
19. S. Kyristsis et al., "Towards autonomous modular UAV missions: the detection, geo-location and landing paradigm," Sensors 16(11), 1844 (2016). https://doi.org/10.3390/s16111844
20. D. Falanga et al., "Vision-based autonomous quadrotor landing on a moving platform," in IEEE Int. Symp. Safety, Secur. and Rescue Rob., 200–207 (2017). https://doi.org/10.1109/SSRR.2017.8088164
21. H.-W. Cheng, T.-L. Chen, and C.-H. Tien, "Motion estimation by hybrid optical flow technology for UAV landing in an unvisited area," Sensors 19(6), 1380 (2019). https://doi.org/10.3390/s19061380
22. B. Julesz, "Textons, the elements of texture perception, and their interactions," Nature 290(5802), 91–97 (1981). https://doi.org/10.1038/290091a0
23. T. Leung and J. Malik, "Recognizing surfaces using three-dimensional textons," in Proc. Seventh IEEE Int. Conf. Comput. Vision, 1010–1017 (1999). https://doi.org/10.1109/ICCV.1999.790379
24. M. Varma and A. Zisserman, "Texture classification: are filter banks necessary?," in IEEE Comput. Soc. Conf. Comput. Vision and Pattern Recognit., II-691 (2003).
25. D. Arthur and S. Vassilvitskii, "K-means++: the advantages of careful seeding," in Proc. Eighteenth Annu. ACM-SIAM Symp. Discrete Algorithms (2007).
26. T. Cover and P. Hart, "Nearest neighbor pattern classification," IEEE Trans. Inf. Theory 13(1), 21–27 (1967). https://doi.org/10.1109/TIT.1967.1053964
27. G. Farnebäck, "Two-frame motion estimation based on polynomial expansion," Lect. Notes Comput. Sci. 2749, 363–370 (2003). https://doi.org/10.1007/3-540-45103-X
28. C. D. Kuglin and D. C. Hines, "The phase correlation image alignment method," in IEEE Int. Conf. Syst., Man, and Cybern., 163–165 (1975).
29. V. Argyriou and T. Vlachos, "Extrapolation-free arbitrary-shape motion estimation using phase correlation," J. Electron. Imaging 15(1), 010501 (2006). https://doi.org/10.1117/1.2170582
30. N. Ma et al., "A subpixel matching method for stereovision of narrow baseline remotely sensed imagery," Math. Probl. Eng. 2017, 1–14 (2017). https://doi.org/10.1155/2017/7901692
31. J. P. Lewis, "Fast template matching," (1994).
32. A. Sibiryakov, "Fast and high-performance template matching method," in IEEE Conf. Comput. Vision and Pattern Recognit., 1417–1424 (2011). https://doi.org/10.1109/CVPR.2011.5995391
33. H.-W. Cheng, T.-L. Chen, and C.-H. Tien, "exp.mp4," http://tinyurl.com/y656uerp (March 2019).
Biography

Hsiu-Wen Cheng received her BS and MS degrees in aeronautics and astronautics engineering from National Cheng Kung University, Tainan, Taiwan, in 2007 and 2009, respectively. She is a PhD candidate in mechanical engineering at National Chiao Tung University (NCTU), Hsinchu, Taiwan. Her research interests are in image processing with a focus on unmanned aerial vehicle applications.

Tsung-Lin Chen received his BS and MS degrees in power mechanical engineering from National Tsing Hua University, Hsinchu, Taiwan, in 1990 and 1992, respectively, and his PhD in mechanical engineering from the University of California, Berkeley, in 2001. From 2001 to 2002, he was a MEMS design engineer at Analog Devices Inc. Since 2003, he has been with the Department of Mechanical Engineering, NCTU, Hsinchu, Taiwan, where he is currently a full professor. His research interests include microelectromechanical systems and controls.

Chung-Hao Tien received his BS degree in electrical and computer engineering and his PhD in electro-optical engineering from NCTU, Hsinchu, Taiwan, in 1997 and 2003, respectively. He was a research assistant at the University of Arizona, Tucson, USA, in 2001. After a postdoctoral fellowship at Carnegie Mellon University, Pittsburgh, USA, from 2003 to 2004, he joined NCTU as a faculty member and is now a full professor in the Department of Photonics. His current research interests include computational imaging and information optics. He is a member of the Phi Tau Phi Honor Society, the Optical Society of America, and SPIE.