KEYWORDS: Image processing, Human vision and color perception, Image retrieval, RGB color model, Feature extraction, Cones, Databases, Eye, Visualization, Content-based image retrieval
Bridging the semantic gap between the low-level visual features extracted by computers, such as color, texture, or shape, and the high-level semantic concepts perceived by humans is the main challenge in increasing the precision of semantic results in Content-Based Image Retrieval (CBIR). This challenge has been approached with the technique known as Relevance Feedback (RF). RF can be applied through two methods: biased subspace learning or query movement. The query-movement method is based on the Rocchio algorithm. In this paper, we present a new optimization of the Relevance Feedback technique through query movement to develop a CBIR system with better semantic precision. We modify the composition of the input images' color channels in the additive color space (Red, Green, Blue) and the perceptual additive color space (Hue, Saturation, Value) by representing the images according to human photopic vision behavior, which provides the semantic perception of colors. With the proposed representation, we obtained more accurate behavior of the Color Histogram (CH), Color Coherence Vector (CCV), and Local Binary Patterns (LBP) descriptors in the Rocchio algorithm and, thus, a query movement oriented more toward the semantics of the user. The performance of the optimization was measured with a subset of 137 classes of 100 images each from the Caltech256 object database. The results show a significant improvement in semantic precision compared to the P. Mane RF method with prominent features, as well as to the performance of CBIR systems without RF using the mentioned descriptors.
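The abstract above names two building blocks that are simple to sketch in code: a photopic re-weighting of the RGB channels and the Rocchio query-movement step applied to a color descriptor. The following Python sketch is illustrative only; the ITU-R BT.709 luminance weights and the Rocchio parameters alpha, beta, and gamma are common defaults assumed here, not values taken from the paper, and the function names are hypothetical.

    import numpy as np

    # Photopic luminous-efficiency weights for R, G, B (ITU-R BT.709).
    # Assumed for illustration; the paper's exact channel re-composition
    # may differ.
    PHOTOPIC_WEIGHTS = np.array([0.2126, 0.7152, 0.0722])

    def photopic_recompose(rgb_image):
        """Scale each RGB channel by its photopic sensitivity weight.

        rgb_image: float array of shape (H, W, 3) with values in [0, 1].
        """
        return rgb_image * PHOTOPIC_WEIGHTS  # broadcasts over the channel axis

    def color_histogram(image, bins=8):
        """Per-channel color histogram, concatenated into one descriptor."""
        hists = [np.histogram(image[..., c], bins=bins, range=(0.0, 1.0))[0]
                 for c in range(image.shape[-1])]
        h = np.concatenate(hists).astype(float)
        return h / h.sum()  # normalize so images of any size are comparable

    def rocchio_update(query, relevant, non_relevant,
                       alpha=1.0, beta=0.75, gamma=0.15):
        """Classic Rocchio query-movement step in descriptor space.

        query:        current query descriptor (1-D array)
        relevant:     list of descriptors the user marked relevant
        non_relevant: list of descriptors the user marked non-relevant
        """
        q = alpha * np.asarray(query, dtype=float)
        if relevant:
            q += beta * np.mean(relevant, axis=0)
        if non_relevant:
            q -= gamma * np.mean(non_relevant, axis=0)
        return q

Under these assumptions, one feedback round extracts descriptors from the photopically re-composed images and moves the query descriptor toward the centroid of the images the user marked relevant and away from the non-relevant ones.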
Nowadays there is a trend toward the use of unimodal databases, which describe, organize, and retrieve a single type of multimedia content such as text, voice, or images, whereas bimodal databases allow two different types of content, such as audio-video or image-text, to be associated semantically. Generating a bimodal audio-video database implies creating a connection between the multimedia content through the semantic relation that associates the actions of both types of information. This paper describes in detail the characteristics and methodology used to create a bimodal database of violent content; the semantic relationship is established by the proposed concepts that describe the audiovisual information. The use of bimodal databases in applications related to audiovisual content processing increases semantic performance if and only if those applications process both types of content. This bimodal database contains 580 annotated audiovisual segments, with a duration of 28 minutes, divided into 41 classes. Bimodal databases are a tool for generating applications for the semantic web.
In the computing world, the consumption and generation of multimedia content are in constant growth due to the popularization of mobile devices and new communication technologies. Retrieving information from multimedia content to describe Mexican buildings is a challenging problem. Our objective is to determine patterns related to three building eras (Pre-Hispanic, colonial, and modern). For this purpose, existing recognition systems need to process a large number of videos and images. Automatic learning systems train their recognition capability with a semantically annotated database. We built the database taking into account high-level feature concepts as well as user knowledge and experience. The annotations help correlate context and content to understand the data in multimedia files. Without a method, the user would have to remember and register all of this data manually. This article presents a methodology for quick image annotation using a graphical interface and intuitive controls, emphasizing the two most important aspects: the time consumed during the annotation task and the quality of the selected images. Accordingly, we classify images only by era and quality. Finally, we obtain a dataset of Mexican buildings that preserves contextual information through semantic annotations for the training and testing of building recognition systems. Research on low-level content descriptors is another possible use for this dataset.
Current search engines are based on search methods that combine words (text-based search), which has been efficient until now. However, the Internet's growing demand shows more diversity with each passing day. Text-based searches are becoming limited, as most of the information on the Internet is found in different types of content denominated multimedia content (images, audio files, video files).
What needs to be improved in current search engines is search content and precision, as well as an accurate display of the results the user expects. Any search can be made more precise by using more text parameters, but this does not improve the content or speed of the search itself. One solution is to improve search through the characterization of the content of multimedia files. In this article, an analysis of new-generation multimedia search engines is presented, focusing on the needs arising from new technologies.
Multimedia content has become a central part of the flow of information in our daily life. This reflects the necessity of having multimedia search engines, as well as of knowing the real tasks they must perform. The analysis shows that there are not many search engines that can perform content-based searches. Research on new-generation multimedia search engines is a multidisciplinary area in constant growth, generating tools that satisfy the different needs of new-generation systems.