Compared with pairwise/triplet similarity-based hashing, central similarity-based hashing captures data relationships more efficiently. However, previous works are still limited in feature expression capability due to information redundancy, and they rely only on similar/dissimilar label pairs, so they cannot fully capture the complex semantic information of the visual content, which limits retrieval performance. To address these issues, an attention-based hashing with central similarity learning (ACSH) method for image retrieval is proposed. Firstly, an off-the-shelf algorithm is used to generate semantic hash centers for the data points with sufficient Hamming distance between each other. Secondly, a spatial attention mechanism is embedded into the feature extraction module, enabling the deep hashing network to focus on important features and suppress unimportant ones. Finally, in the training phase, a classification task is introduced to supervise the feature learning of the spatial attention mechanism so that more complex semantic information of the visual content is captured. Comprehensive experiments on three standard benchmarks show that ACSH achieves better retrieval performance than existing central-similarity-based deep hashing methods.
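As a rough illustration of the pipeline described above, the sketch below shows a backbone with a CBAM-style spatial attention block, a hash layer, and an auxiliary classification head trained with a combined central-similarity and classification objective. It is a minimal sketch assuming a PyTorch implementation; the module names, the ResNet-50 backbone, the specific form of the central-similarity loss, and the weighting factor `lambda_cls` are illustrative assumptions rather than the paper's actual design.

```python
# Hypothetical sketch of an attention-based central-similarity hashing network.
# Names (SpatialAttention, ACSHNet, acsh_loss, lambda_cls) are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class SpatialAttention(nn.Module):
    """CBAM-style spatial attention: re-weight each spatial location of a feature map."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        # Aggregate channel information with average- and max-pooling,
        # then produce a per-location attention map in [0, 1].
        avg_map = x.mean(dim=1, keepdim=True)
        max_map, _ = x.max(dim=1, keepdim=True)
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn  # emphasize informative locations, suppress the rest


class ACSHNet(nn.Module):
    """Backbone + spatial attention + hash layer + auxiliary classification head."""

    def __init__(self, num_bits: int = 64, num_classes: int = 10):
        super().__init__()
        backbone = models.resnet50(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # conv feature maps
        self.attention = SpatialAttention()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.hash_layer = nn.Linear(2048, num_bits)     # continuous codes, tanh-relaxed
        self.classifier = nn.Linear(2048, num_classes)  # auxiliary classification task

    def forward(self, x):
        feat = self.attention(self.features(x))
        feat = self.pool(feat).flatten(1)
        codes = torch.tanh(self.hash_layer(feat))
        logits = self.classifier(feat)
        return codes, logits


def acsh_loss(codes, logits, hash_centers, labels, lambda_cls: float = 0.1):
    """Central-similarity term (pull relaxed codes toward their class hash centers,
    here a simple BCE against binary centers) plus a classification term."""
    targets = (hash_centers[labels] + 1.0) / 2.0            # {-1,+1} centers -> {0,1}
    central = F.binary_cross_entropy((codes + 1.0) / 2.0, targets)
    cls = F.cross_entropy(logits, labels)
    return central + lambda_cls * cls
```

In practice, the pre-defined hash centers can be taken, for example, from rows of a Hadamard matrix (as in CSQ-style central similarity quantization), which guarantees a pairwise Hamming distance of K/2 between K-bit centers.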