In this paper, we propose a framework for image classification. An image is represented by multiple feature channels, each computed with the bag-of-words model and organized in a spatial pyramid. The channels differ in the type of base descriptor extracted for the bag-of-words model, so that the resulting features strike different trade-offs between discriminative power and invariance. Support vector machines with kernels based on the histogram intersection distance and the χ2 distance are used to estimate posterior probabilities for the image in each feature channel. Four data fusion strategies are then proposed to combine the intermediate results from the multiple feature channels. Experimental results show that almost all of the proposed strategies significantly improve classification accuracy over single-cue methods, with prod-max performing best in all experiments. Owing to the multiple-feature-channel representation, the framework is general and can handle diverse classification problems. We also demonstrate that the proposed method achieves higher or comparable classification accuracy at lower computational cost than other multiple-cue methods on challenging benchmark datasets.
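The two kernels named above have standard forms. The following is a minimal sketch, assuming L1-normalized histograms and the exponentiated variant of the χ2 kernel with a bandwidth parameter `gamma`; the exact kernel variant and parameter settings used in the paper are not specified in this abstract:

```python
import numpy as np

def histogram_intersection_kernel(X, Y):
    """K(x, y) = sum_i min(x_i, y_i), computed between all rows of X and Y.

    For L1-normalized histograms, K(x, x) = 1.
    """
    return np.array([[np.minimum(x, y).sum() for y in Y] for x in X])

def chi2_kernel(X, Y, gamma=1.0):
    """Exponentiated chi-square kernel:
    K(x, y) = exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)),
    with the sum restricted to bins where x_i + y_i > 0.
    """
    K = np.zeros((len(X), len(Y)))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            denom = x + y
            valid = denom > 0            # skip empty bins to avoid 0/0
            d = np.sum((x[valid] - y[valid]) ** 2 / denom[valid])
            K[i, j] = np.exp(-gamma * d)
    return K
```

Both kernels can be passed to an SVM as precomputed Gram matrices; probability outputs (the per-channel posteriors used above) are then obtained with the usual Platt-scaling option of SVM implementations.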
In this paper we present a complete framework for scene categorization that builds upon and extends several recent ideas, including the spatial pyramid representation and a variety of base local descriptors whose discriminative power and invariance vary from task to task. Furthermore, we propose two strategies, sum-max and max-max, to effectively combine diverse sources of data in a unified setting. Our approach shows significantly improved performance on a large, challenging data set of fifteen natural scene categories. Owing to the combination of complementary information cues, our approach is expected to be equally applicable to a range of tasks.
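The fusion rules themselves are not spelled out in these abstracts. A minimal sketch under the common interpretation (combine the per-channel class posteriors by a per-class sum, product, or max, then take the arg-max over classes) might look like the following; the function name and the exact rule definitions are assumptions for illustration:

```python
import numpy as np

def fuse_posteriors(channel_probs, strategy="prod-max"):
    """Combine per-channel posterior probabilities and pick the top class.

    channel_probs: array-like of shape (n_channels, n_classes), one row of
    class posteriors per feature channel (assumed interpretation).
    Returns the index of the winning class.
    """
    P = np.asarray(channel_probs, dtype=float)
    if strategy == "sum-max":       # per-class sum of channel posteriors
        scores = P.sum(axis=0)
    elif strategy == "prod-max":    # per-class product of channel posteriors
        scores = P.prod(axis=0)
    elif strategy == "max-max":     # per-class max over channels
        scores = P.max(axis=0)
    else:
        raise ValueError(f"unknown fusion strategy: {strategy!r}")
    return int(np.argmax(scores))
```

Note that the strategies can disagree: a single very confident channel can dominate prod-max and max-max, while sum-max averages out its influence across channels.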