In this paper we propose a novel method to recognize different types of two-person interactions from multi-view surveillance
cameras. From the bird's-eye view, proxemics cues are exploited to temporally segment the interaction, while from
the lateral view the corresponding interaction intervals are extracted. Classification is achieved with a visual
bag-of-words approach, which is used to train a linear multi-class SVM classifier. We test our method on the UNITN social
interaction dataset. Experimental results show that the temporal segmentation improves classification performance.
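As a rough illustration of the classification stage only, the following Python sketch builds a visual vocabulary with K-means, quantizes each segmented interval into a bag-of-words histogram, and trains a linear multi-class SVM. This is not the authors' code: the vocabulary size, the helper names (build_vocabulary, bow_histogram, train_classifier), and the assumption of precomputed local descriptors per interval are all illustrative.

# Illustrative bag-of-words + linear SVM stage (not the authors' code).
# Assumes each segmented interaction interval is represented by a set of
# local visual descriptors (one 2-D array of descriptors per interval).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def build_vocabulary(descriptor_sets, k=200):
    """Cluster all training descriptors into k visual words."""
    return KMeans(n_clusters=k, n_init=10).fit(np.vstack(descriptor_sets))

def bow_histogram(descriptors, vocabulary):
    """Quantize descriptors to visual words and build a normalized histogram."""
    words = vocabulary.predict(descriptors)
    hist = np.bincount(words, minlength=vocabulary.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

def train_classifier(descriptor_sets, labels, k=200):
    """One descriptor set and one interaction label per interval."""
    vocab = build_vocabulary(descriptor_sets, k)
    X = np.array([bow_histogram(d, vocab) for d in descriptor_sets])
    return vocab, LinearSVC().fit(X, labels)  # linear multi-class SVM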
KEYWORDS: Video surveillance, Video, Motion models, Visualization, Data modeling, Information visualization, Detection and tracking algorithms, Analytical research, Surveillance, Cameras
In this paper we propose a new method to infer human social interactions using techniques typically adopted in the literature for visual search and information retrieval. The main piece of information we use to discriminate among different types of interactions is provided by proxemics cues acquired by a tracker, which we use to distinguish between intentional and casual interactions. The proxemics information is acquired through two metrics: on the one hand we observe the current distance between the subjects, and on the other hand we measure the O-space synergy between them. The resulting values are sampled at every time step over a temporal sliding window and processed in the Discrete Fourier Transform (DFT) domain. The features are then merged into a single array and clustered using the K-means algorithm. The cluster assignments are reorganized over a second, larger temporal window into a bag-of-words framework, so as to build the feature vector that feeds the SVM classifier.
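The feature pipeline lends itself to a compact sketch. The Python fragment below is an illustration under assumed parameters rather than the authors' implementation: the window lengths, the step size, and the function names are placeholders, and the per-frame distance and O-space synergy signals are taken as given inputs.

# Illustrative proxemics feature pipeline (assumed window sizes and names).
import numpy as np
from sklearn.cluster import KMeans

def window_features(distance, synergy, win=32, step=1):
    """DFT magnitudes of the two proxemics signals over a sliding window,
    merged into a single feature array per window position."""
    feats = []
    for t in range(0, len(distance) - win + 1, step):
        d_spec = np.abs(np.fft.rfft(distance[t:t + win]))
        s_spec = np.abs(np.fft.rfft(synergy[t:t + win]))
        feats.append(np.concatenate([d_spec, s_spec]))
    return np.array(feats)

def bow_over_time(feats, kmeans, big_win=128):
    """Pool K-means cluster assignments over a second, larger temporal
    window into bag-of-words histograms (the SVM feature vectors)."""
    words = kmeans.predict(feats)
    hists = []
    for t in range(0, len(words) - big_win + 1, big_win):
        h = np.bincount(words[t:t + big_win], minlength=kmeans.n_clusters)
        hists.append(h / h.sum())
    return np.array(hists)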
KEYWORDS: Digital watermarking, Image quality, Data hiding, Quantization, Signal to noise ratio, Image processing, Modulation, Visual system, Bismuth, Image compression
This paper presents an innovative watermarking scheme that inserts information in the Discrete
Cosine Transform (DCT) domain while increasing the perceptual quality of the watermarked images by exploiting
the masking effect of the DCT coefficients. Specifically, we make the strength of the embedded data
adaptive, following the characteristics of the Human Visual System (HVS) with respect to image fruition.
Experimental results, evaluated with several perceptual quality metrics, demonstrate improvements in the
perceived quality of the modified data.
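As a hedged sketch of the general idea, the following Python code embeds one bit per 8x8 block in a mid-frequency DCT coefficient, scaling the embedding strength with the local coefficient magnitude as a crude stand-in for the paper's HVS masking model; the coefficient position, the base strength alpha, and the masking rule itself are assumptions, not the scheme proposed here.

# Illustrative block-DCT embedding with magnitude-adaptive strength
# (a simple stand-in for the HVS-based masking model; grayscale input).
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(block):
    return idct(idct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def embed(image, bits, alpha=0.05, block=8):
    """Embed one bit per block in a mid-frequency coefficient; busier
    blocks mask more distortion, so they receive a stronger mark."""
    out = image.astype(float).copy()
    h, w = image.shape
    it = iter(bits)
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            try:
                b = next(it)
            except StopIteration:
                return out
            c = dct2(out[y:y + block, x:x + block])
            strength = alpha * (1.0 + abs(c[3, 3]))  # adaptive to local activity
            c[3, 3] += strength if b else -strength
            out[y:y + block, x:x + block] = idct2(c)
    return out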