18 November 2022 Siamese network based on global and local feature matching for object tracking
Ziming Zhao, Mengle Zuo, Junyang Yu, Xin He, Yalin Song, Rui Zhai
Abstract

Tracking algorithms based on Siamese networks have made great progress in similarity learning through feature cross-correlation between an object branch and a search branch. However, object tracking in video sequences remains challenging when the target undergoes large deformation. To address this issue, we propose a Siamese network for object tracking based on global and local feature matching, which operates in three phases. In the first phase, a selection mechanism for object template-aware features is used to obtain the global similarity matching and the local relational mapping similarity between the template branch and the search branch, which reduces the impact of background features on local matching. In the second phase, correlation matching of local features is introduced to establish correspondence among part-level pixels. Finally, the classification and regression results obtained from the global matching features and the local matching features are combined by weighted fusion. Extensive experiments on the OTB-100, LaSOT, and GOT-10K datasets demonstrate that the proposed network achieves superior performance compared with state-of-the-art methods and provides an efficient solution to the problem.
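The abstract does not give the network's equations, so the following is only a minimal sketch of the general idea it describes: a global cross-correlation score, a pixel-level local matching score, and a weighted fusion of the two. It is not the authors' implementation. All function names (`global_match`, `local_match`, `fuse`), the cosine-based local score, the energy normalization, and the fusion weight of 0.5 are assumptions made for illustration, and the feature maps are tiny hand-written grids rather than learned features.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu > 0 and nv > 0 else 0.0

def windows(search, th, tw):
    """Yield (row, col, window) for every template-sized window of the search map."""
    for i in range(len(search) - th + 1):
        for j in range(len(search[0]) - tw + 1):
            yield i, j, [row[j:j + tw] for row in search[i:i + th]]

def empty_map(template, search):
    th, tw = len(template), len(template[0])
    return [[0.0] * (len(search[0]) - tw + 1) for _ in range(len(search) - th + 1)]

def global_match(template, search):
    """Cross-correlation of aligned positions, normalized by template energy (assumption)."""
    th, tw = len(template), len(template[0])
    energy = sum(dot(p, p) for row in template for p in row) or 1.0
    out = empty_map(template, search)
    for i, j, win in windows(search, th, tw):
        score = sum(dot(template[y][x], win[y][x]) for y in range(th) for x in range(tw))
        out[i][j] = score / energy
    return out

def local_match(template, search):
    """Hypothetical local score: each template pixel takes its best cosine match
    anywhere in the window, and the best matches are averaged."""
    th, tw = len(template), len(template[0])
    out = empty_map(template, search)
    for i, j, win in windows(search, th, tw):
        flat = [p for row in win for p in row]
        best = [max(cosine(t, p) for p in flat) for row in template for t in row]
        out[i][j] = sum(best) / len(best)
    return out

def fuse(global_map, local_map, weight=0.5):
    """Weighted fusion of two response maps of the same shape."""
    return [[weight * g + (1.0 - weight) * l for g, l in zip(rg, rl)]
            for rg, rl in zip(global_map, local_map)]

def argmax2d(m):
    return max(((i, j) for i in range(len(m)) for j in range(len(m[0]))),
               key=lambda ij: m[ij[0]][ij[1]])

# Toy 2x2 template with 2-channel features, embedded at offset (1, 1) of a 3x3 search map.
template = [[(1, 0), (0, 1)],
            [(0, 1), (1, 0)]]
search = [[(0, 0), (0, 0), (0, 0)],
          [(0, 0), (1, 0), (0, 1)],
          [(0, 0), (0, 1), (1, 0)]]

fused = fuse(global_match(template, search), local_match(template, search), weight=0.5)
print(argmax2d(fused))  # (1, 1)
```

In this toy setting the local score ignores where a matching pixel sits inside the window, which is one simple way to tolerate deformation, while the global score rewards exact spatial alignment. Fusing the two keeps the peak at the true offset.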

© 2022 SPIE and IS&T
Ziming Zhao, Mengle Zuo, Junyang Yu, Xin He, Yalin Song, and Rui Zhai "Siamese network based on global and local feature matching for object tracking," Journal of Electronic Imaging 31(6), 063022 (18 November 2022). https://doi.org/10.1117/1.JEI.31.6.063022
Received: 11 May 2022; Accepted: 31 October 2022; Published: 18 November 2022
KEYWORDS
Detection and tracking algorithms
Video
Deformation
Convolution
Target detection
Feature extraction
Education and training
