Visual tracking is a challenging computer vision problem with numerous practical applications. We propose a tracking framework based on convolutional feature selection to improve accuracy and robustness. First, we investigate the impact of features extracted from different convolutional neural network layers on the visual tracking problem. Second, we learn correlation filters on the outputs of each layer to encode the target appearance, and we design a fluctuation detection technique to select the appropriate convolutional layers, which improves target localization precision and avoids drift caused by challenging factors such as occlusion and appearance variation. Third, we present an improved model update strategy that retains positive samples while removing corrupted ones. Extensive experimental results on the OTB-2013 and OTB-2015 benchmarks demonstrate that the proposed algorithm performs favorably against several state-of-the-art trackers.
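As an illustration of the second step, the sketch below learns a per-layer correlation filter in the Fourier domain via ridge regression and scores the sharpness of the response peak; the function names, the regularizer `lam`, and the peak-sharpness cue are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def learn_filter(feat, label, lam=1e-4):
    # feat: H x W x C feature map from one CNN layer, cropped around the target
    # label: H x W Gaussian-shaped regression target centered on the object
    F = np.fft.fft2(feat, axes=(0, 1))                   # per-channel 2-D FFT
    Y = np.fft.fft2(label)
    denom = np.sum(F * np.conj(F), axis=2).real + lam    # shared ridge denominator
    return np.conj(F) * Y[:, :, None] / denom[:, :, None]

def layer_response(feat, filt):
    # Correlate a search-region feature map with the filter; the peak of the
    # real-valued response map locates the target on this layer.
    F = np.fft.fft2(feat, axes=(0, 1))
    return np.real(np.fft.ifft2(np.sum(filt * F, axis=2)))

def peak_sharpness(r):
    # A simple fluctuation cue (illustrative): a low, flat peak suggests an
    # unreliable layer whose response should be down-weighted this frame.
    return (r.max() - r.mean()) / (r.std() + 1e-12)
```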
KEYWORDS: Video, Video surveillance, Data modeling, Feature extraction, Distance measurement, Atomic force microscopy, Principal component analysis, Cameras, Image resolution, Roads
Various approaches have been proposed for video anomaly detection, yet they typically suffer from one or more limitations: they often characterize a pattern by its internal information but ignore its external relationships, which are important for local anomaly detection; moreover, high dimensionality and a lack of robustness in the pattern representation can lead to overfitting, increased computational cost and memory requirements, and a high false alarm rate. We propose a video anomaly detection framework that relies on a heterogeneous representation to account for both a pattern's internal information and its external relationships. The internal information is characterized by slow features learned from low-level representations by slow feature analysis, and the external relationships are characterized by spatial contextual distances. The resulting heterogeneous representation is compact, robust, efficient, and discriminative for anomaly detection. Extensive experiments demonstrate the robustness and efficiency of our approach in comparison with state-of-the-art approaches on widely used benchmark datasets.
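A minimal sketch of linear slow feature analysis, the component used here to learn the internal representation: the data are whitened, and the directions whose outputs change most slowly over time are the smallest eigenvectors of the covariance of the temporal differences. The input shape and component count are assumptions.

```python
import numpy as np

def slow_feature_analysis(X, n_components=4):
    # X: T x D matrix of low-level descriptors ordered in time (assumed shape)
    X = X - X.mean(axis=0)
    # Whiten the data
    d, E = np.linalg.eigh(np.cov(X, rowvar=False))
    keep = d > 1e-10
    W = E[:, keep] / np.sqrt(d[keep])
    Z = X @ W
    # Slow directions minimize the variance of the temporal differences,
    # i.e., the smallest eigenvectors of the derivative covariance.
    dd, P = np.linalg.eigh(np.cov(np.diff(Z, axis=0), rowvar=False))
    return W @ P[:, :n_components]   # columns project data to slow features
```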
Sparse representation has been applied to online subspace-learning-based tracking. To handle partial occlusion effectively, some researchers introduce l1 regularization into principal component analysis (PCA) reconstruction. However, in these traditional tracking methods, the representation of each object observation is viewed as an individual task, so the interrelationships between PCA basis vectors are ignored. We propose a new online visual tracking algorithm with multitask sparse prototypes, which combines multitask sparse learning with PCA-based subspace representation. We first extend a visual tracking algorithm with sparse prototypes into a multitask learning framework to mine the interrelations between subtasks. Then, to avoid the degraded tracking results that can arise from forcing all subtasks to share the same structure, we impose group sparse constraints on the coefficients of the PCA basis vectors and element-wise sparse constraints on the error coefficients, respectively. Finally, we show that the proposed optimization problem can be solved effectively using the accelerated proximal gradient method with fast convergence. Experimental comparisons with state-of-the-art tracking methods demonstrate that the proposed algorithm achieves favorable performance when the object undergoes partial occlusion, motion blur, and illumination changes.
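The two constraints correspond to two standard proximal operators applied within each accelerated proximal gradient step; a sketch under the assumption that each row of `Z` collects one basis vector's coefficients across all subtasks:

```python
import numpy as np

def prox_group_l21(Z, tau):
    # Row-wise group soft-thresholding (l2,1 prox): each row of Z holds one
    # PCA basis vector's coefficients across subtasks, shrunk jointly.
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z * np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))

def prox_l1(E, tau):
    # Element-wise soft-thresholding (l1 prox) for the error coefficients.
    return np.sign(E) * np.maximum(np.abs(E) - tau, 0.0)
```

Each APG iteration would take a gradient step on the reconstruction loss and then apply these operators to the two blocks of variables.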
KEYWORDS: Video, Detection and tracking algorithms, Fractal analysis, Data modeling, Visual process modeling, Machine vision, Computer vision technology, Chaos theory, Video acceleration, Video surveillance
We present a framework for dynamic texture (DT) recognition and localization using a model developed in the text analysis literature: probabilistic latent semantic analysis (pLSA). The novelty lies in three aspects. First, a chaotic feature vector is introduced to characterize each pixel intensity series. Next, the pLSA model is employed to discover topics from a bag-of-words representation. Finally, the spatial layout of the DTs is recovered. Experiments are conducted on well-known DT datasets. The results show that the proposed method successfully builds DT models, achieves higher accuracy in DT recognition, and localizes DTs effectively.
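For reference, a minimal pLSA fitted by EM over a word-document count matrix (here the visual words would be quantized chaotic feature vectors; the matrix construction, topic count, and iteration budget are assumptions). The dense W x K x D posterior is fine for a sketch but would need batching at scale.

```python
import numpy as np

def plsa(N, K, iters=100, seed=0):
    # N: W x D word-by-document count matrix; K: number of topics
    rng = np.random.default_rng(seed)
    W, D = N.shape
    p_w_z = rng.random((W, K)); p_w_z /= p_w_z.sum(0)   # P(w|z)
    p_z_d = rng.random((K, D)); p_z_d /= p_z_d.sum(0)   # P(z|d)
    for _ in range(iters):
        # E-step: topic responsibilities P(z|d,w), shape W x K x D
        joint = p_w_z[:, :, None] * p_z_d[None, :, :]
        post = joint / np.maximum(joint.sum(1, keepdims=True), 1e-12)
        # M-step: renormalized expected counts
        nk = N[:, None, :] * post
        p_w_z = nk.sum(2); p_w_z /= np.maximum(p_w_z.sum(0), 1e-12)
        p_z_d = nk.sum(0); p_z_d /= np.maximum(p_z_d.sum(0), 1e-12)
    return p_w_z, p_z_d
```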
High dynamic range (HDR) imaging is an important and challenging research topic in computational photography. A simple but effective image fusion method is proposed to accomplish multi-exposure image composition in both static and dynamic scenes. The foundation of the proposed method is an empirical criterion: the optimal exposure occurs at a dramatic alteration point in the low dynamic range image sequence (LDRI). To extract these well-exposed pixel vectors, each pixel curve, formed by the pixel vectors at the same position across all frames of the LDRI, is first preprocessed by chord length parameterization. A single high-quality pseudo-HDR image can then be extracted directly and efficiently from the LDRI using a pixel-level fusion index matrix derived from the first- and second-order difference quotients of the preprocessed pixel curves. The main advantage of the proposed method is that each pixel is processed independently, so the method is highly parallel and allows a graphics processing unit-based, real-time implementation. The experiments on various scenes discussed here indicate that the proposed exposure fusion method can combine a large 10-megapixel image sequence into a visually compelling pseudo-HDR image at a rate of 30 frames/s on consumer hardware.
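One plausible reading of the fusion index construction, sketched below for a luminance stack: each pixel curve is chord-length parameterized, first- and second-order difference quotients are formed along it, and the frame where the curve changes most sharply is selected per pixel. The scoring rule and the luminance-only input are assumptions, not the paper's exact definition.

```python
import numpy as np

def fusion_index(stack):
    # stack: K x H x W luminance frames sorted by exposure (assumed input)
    stack = stack.astype(np.float64)
    dI = np.diff(stack, axis=0)                     # K-1 intensity increments
    h = np.sqrt(1.0 + dI ** 2)                      # chord lengths between frames
    t = np.concatenate([np.zeros_like(stack[:1]), np.cumsum(h, axis=0)])
    dq1 = dI / h                                    # first-order difference quotients
    dq2 = np.diff(dq1, axis=0) / (t[2:] - t[:-2])   # second-order difference quotients
    # Score each interior frame by curve steepness plus curvature; the
    # per-pixel maximizer is the H x W fusion index matrix.
    score = np.abs(dq1[:-1]) + np.abs(dq2)
    return np.argmax(score, axis=0) + 1
```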
Particle probability hypothesis density (PHD) filter-based visual trackers have achieved considerable success in the visual tracking field. However, position measurements based on detection may not be discriminative enough to separate an object from clutter, and accurate state extraction cannot be achieved within the original PHD filtering framework, especially when targets can appear, disappear, merge, or split at any time. To address these limitations, the proposed algorithm combines a color histogram of the target and its temporal dynamics in a unifying framework, and a Gaussian mixture model clustering method is designed for efficient state extraction. The proposed tracker improves the accuracy of state estimation when tracking a variable number of objects.
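A sketch of GMM-based state extraction from a particle PHD posterior: the expected target count is the (rounded) total particle weight, and a mixture with that many components is fitted to weight-resampled particles, one mean per estimated target. The resample size and the use of scikit-learn's GaussianMixture are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def extract_states(particles, weights, rng=None):
    # particles: N x d particle states; weights: length-N PHD weights
    n_targets = max(int(round(weights.sum())), 0)   # expected target count
    if n_targets == 0:
        return np.empty((0, particles.shape[1]))
    # Resample so the (unweighted) GMM fit reflects the particle weights
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(particles), size=4096, p=weights / weights.sum())
    gmm = GaussianMixture(n_components=n_targets).fit(particles[idx])
    return gmm.means_                               # one state per target
```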
The probability hypothesis density (PHD) filter, a multitarget recursive Bayes filter, has generated substantial interest in the visual tracking field due to its ability to handle a time-varying number of targets. However, target trajectories cannot be identified within its own framework. To complement the PHD filter, the auction algorithm is incorporated to compute object trajectories automatically. We present the motion detection, dynamic, and measurement equations, and describe in detail a visual multitarget tracking algorithm based on the Gaussian mixture probability hypothesis density filter with trajectory computation. Experimental results on a large video surveillance dataset show that the proposed multitarget tracking framework improves the tracker and maintains track identities when a variable number of targets appear, merge, split, and disappear, even in cluttered scenes.
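For the association step, a compact forward-auction sketch in the style of Bertsekas: tracks bid for measurements, and each bid raises the price of the contested column by the bid increment plus epsilon. The benefit matrix and the assumption that tracks do not outnumber measurements are illustrative.

```python
import numpy as np

def auction_assign(benefit, eps=1e-3):
    # benefit[i, j]: gain of pairing track i with measurement j (n <= m assumed)
    n, m = benefit.shape
    prices = np.zeros(m)
    owner = -np.ones(m, dtype=int)      # which track currently owns each column
    assigned = -np.ones(n, dtype=int)
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        values = benefit[i] - prices
        j = int(np.argmax(values))
        best = values[j]
        second = np.sort(values)[-2] if m > 1 else best - eps
        prices[j] += best - second + eps        # the bid raises column j's price
        if owner[j] >= 0:                       # outbid the previous owner
            assigned[owner[j]] = -1
            unassigned.append(owner[j])
        owner[j] = i
        assigned[i] = j
    return assigned
```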
A novel algorithm, Gaussian mean shift registration (GMSR), is proposed for multisensor dynamic bias estimation. A sufficient condition for convergence of the Gaussian mean shift procedure is given, which extends the current theorem from a strictly convex kernel to a piecewise convex and concave kernel. The Gaussian mean shift algorithm, combined with the extended Kalman filter (EKF), is implemented to estimate the dynamic bias from the measurements of a single target through an iterative optimization procedure. Monte Carlo simulations show that the new algorithm significantly improves performance, reducing root mean square (RMS) errors compared with the minimum mean square error (MMSE) estimator based on multiple targets and multiple frames. The proposed estimator is close to the theoretical lower bound, i.e., it estimates the dynamic bias more efficiently than other methods.
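The core fixed-point iteration underlying the Gaussian mean shift step, shown in isolation (the paper couples it with an EKF; the bandwidth and stopping rule here are assumptions):

```python
import numpy as np

def gaussian_mean_shift(samples, x0, bandwidth, tol=1e-6, max_iter=100):
    # Repeatedly move the estimate to the Gaussian-kernel-weighted mean
    # of the samples until it converges to a mode of the density.
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d2 = np.sum((samples - x) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        x_new = (w[:, None] * samples).sum(0) / w.sum()
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x
```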
We propose a novel B-spline active contour model based on image fusion. Compared with conventional active contours, this active contour has two advantages. First, it is represented by a cubic B-spline curve, which can adaptively determine the step length of the curve parameter and can effectively detect and express the corner points of the object contour. Second, it is implemented in connection with image fusion: its external image force is modified to be the weighted sum of two modal image forces, with the two weights computed from a local region's image entropy or the standard deviation of its image contrast. Experiments indicate that this active contour accurately detects both the object's contour edges and its corner points. They also indicate that convergence with an image force weighted by the contrast standard deviation is more accurate than with entropy weighting, as it better restrains the influence of texture and patterns.
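A sketch of the weighted external force under one of the two weighting schemes described above, local contrast standard deviation (the window size, normalization, and array layout are assumptions):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_std(img, win=7):
    # Sliding-window standard deviation as the 'image contrast' cue
    m = uniform_filter(img.astype(float), win)
    m2 = uniform_filter(img.astype(float) ** 2, win)
    return np.sqrt(np.maximum(m2 - m * m, 0.0))

def fused_force(force_a, force_b, img_a, img_b, win=7):
    # force_a, force_b: H x W x 2 external forces from the two modal images
    sa, sb = local_std(img_a, win), local_std(img_b, win)
    wa = sa / np.maximum(sa + sb, 1e-12)    # per-pixel weight of modality A
    return wa[..., None] * force_a + (1.0 - wa)[..., None] * force_b
```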
In visual tracking, the object's appearance may change over time due to illumination changes, pose variations, and partial or full occlusions, which makes tracking difficult. This paper proposes an adaptive appearance model for visual tracking that can adapt to changes in object appearance over time. The value of each pixel is modeled by a Gaussian mixture distribution, and a novel update scheme based on the expectation-maximization algorithm is developed to update the appearance model parameters. In the tracking algorithm, the observation model is based on the adaptive appearance model, and a particle filter is employed. Outlier pixels and occlusions are handled using a robust-statistics technique. Extensive experimental results demonstrate that the proposed algorithm tracks objects well under illumination changes, large pose variations, and partial or full occlusions.
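A minimal sketch of an online, EM-flavored update for one pixel's mixture; the recursive learning-rate form is a common approximation and an assumption here, not necessarily the paper's exact scheme.

```python
import numpy as np

def update_pixel_gmm(x, mu, var, pi, lr=0.05):
    # mu, var, pi: length-K arrays of component means, variances, weights
    # E-step: responsibility of each component for the new intensity x
    lik = np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)
    r = pi * lik
    r = r / max(r.sum(), 1e-12)
    # M-step: recursive parameter updates weighted by the responsibilities
    pi = (1.0 - lr) * pi + lr * r
    rho = lr * r
    mu = (1.0 - rho) * mu + rho * x
    var = (1.0 - rho) * var + rho * (x - mu) ** 2
    return mu, var, pi / pi.sum()
```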
In this paper, a novel radar management strategy suitable for radar/IRST track fusion is put forward, based on the Fisher information matrix (FIM) and a fuzzy stochastic decision approach. First, an optimal schedule of radar measurements is obtained by maximizing the determinant of the Fisher information matrix of the radar and IRST measurements, managed by an expert system. Second, a "pseudo sensor" is proposed that predicts the possible target position with a polynomial method based on the radar and IRST measurements, so that the target position can be estimated even when the radar is turned off. Finally, based on the tracking performance and the target's maneuvering state, a fuzzy stochastic decision is used to adjust the optimal radar schedule and to retune the model parameters of the "pseudo sensor." The experimental results indicate that the algorithm not only limits radar activity effectively but also maintains the tracking accuracy of the active/passive system. The algorithm thus eliminates a drawback of traditional radar management methods, in which radar activity is fixed and difficult to control.
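The scheduling criterion reduces to comparing determinants of accumulated Fisher information across candidate measurement sets; a sketch for linearized measurements (the Jacobians and noise covariances are hypothetical inputs):

```python
import numpy as np

def fim_det(H_list, R_list):
    # Fisher information of a set of linearized measurements:
    # J = sum_k H_k^T R_k^{-1} H_k; a larger det(J) means a more
    # informative radar/IRST measurement schedule.
    dim = H_list[0].shape[1]
    J = np.zeros((dim, dim))
    for H, R in zip(H_list, R_list):
        J += H.T @ np.linalg.solve(R, H)
    return np.linalg.det(J)
```

A scheduler would evaluate `fim_det` for each candidate subset of radar measurements and radiate only when the determinant gain justifies it.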