This PDF file contains the front matter associated with SPIE Proceedings Volume 11734, including the Title Page, Copyright Information, and Table of Contents.
The number of active landmines worldwide is uncertain; however, an estimated 5,554 people were killed or injured by mines in 2019. Understanding the land cover before the mine clearance process provides valuable information on the scale of the problem and the resources needed to clear the field, and ensures that all hazardous areas are prioritized. In this paper, we present a new framework for land clearance prioritization using land-cover analysis. We use remote sensing images from Sentinel-2 to estimate changes in the land cover. Specifically, we estimate the changes in vegetation and non-vegetation areas. Further, we use the amount and number of land-cover changes during a period to provide recommendations on the clearance priority for different areas. A case study of different areas in the Kingdom of Cambodia is presented, with several observations of satellite images for the years 2019 and 2020. Several suspected hazardous areas (or polygons) are defined by landmine surveying experts for analysis. A change matrix for each polygon is obtained from consecutive observations. Then, a series of qualitative and quantitative 2-dimensional characteristics are extracted, such as the from-to class change mask and the percentage loss and gain per class. These 2D characteristics, together with expert-defined scores of class-change importance, are used to compute the amount and number of changes in each polygon and a recommendation on the clearance priorities. Our study demonstrates that analysing changes in land cover is a promising direction for assisting the non-technical survey process and increasing the productivity of land release.
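To make the change-matrix step concrete, the following minimal sketch (our illustration, not the authors' code; the class encoding, weights, and function names are assumptions) computes a from-to change matrix between two per-pixel land-cover maps of a polygon and a weighted clearance priority score:

```python
import numpy as np

def change_matrix(before, after, n_classes):
    """Count pixels moving from class i (rows) to class j (cols)."""
    m = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(m, (before.ravel(), after.ravel()), 1)
    return m

def clearance_priority(m, importance):
    """Weighted sum of off-diagonal (changed) pixel fractions.

    importance[i, j] is a hypothetical expert-defined score of how
    significant a change from class i to class j is.
    """
    total = m.sum()
    changed = m.copy()
    np.fill_diagonal(changed, 0)          # keep only from-to changes
    return float((changed * importance).sum()) / total

# Toy 0/1 maps: 0 = non-vegetation, 1 = vegetation (assumed encoding).
before = np.array([[1, 1], [0, 1]])
after  = np.array([[0, 1], [0, 0]])
m = change_matrix(before, after, n_classes=2)
w = np.array([[0.0, 0.2],   # vegetation gain weighted lightly
              [1.0, 0.0]])  # vegetation loss weighted heavily
print(m, clearance_priority(m, w))
```

Polygons whose score exceeds a chosen cutoff would then be recommended for earlier clearance.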
Hyperspectral images deliver several hundred spectral bands covering the visible and infrared wavelengths. Visualizing hyperspectral information on a trichromatic display is impracticable, since such a display cannot directly present information spread over hundreds of bands. Therefore, the selection of representative bands and the improvement of image quality are challenging tasks. In this paper, a simple and effective hyperspectral image visualization method based on hyperspectral image enhancement with different cost functions is proposed. The proposed method consists of two major steps. First, wavelength-based band selection chooses three subsets of adjacent hyperspectral bands. Second, the selected bands are improved by the proposed fractional contrast stretching algorithm with optimization. In the nature-inspired optimization algorithm with an adaptive inertia weight, the quality of an enhanced image is evaluated by different cost functions, namely image quality measures. This paper's main contributions are: i) the selection of representative bands corresponding to a natural-color appearance, ii) image enhancement for visualization, and iii) the investigation of a suitable cost function for hyperspectral image visualization. Experiments performed on several hyperspectral datasets illustrate that the proposed method can produce remarkable visualization performance in subjective and objective assessments.
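A minimal sketch of the wavelength-based band selection step (the RGB center wavelengths, subset width, and function names are our assumptions for illustration): pick a small subset of adjacent bands around each of three target wavelengths and average them into displayable channels.

```python
import numpy as np

# Hypothetical center wavelengths for a natural-color rendering (nm).
RGB_CENTERS = {"R": 640.0, "G": 550.0, "B": 460.0}

def select_rgb_bands(cube, wavelengths, half_width=3):
    """Average a subset of adjacent bands around each RGB center.

    cube:        (H, W, B) hyperspectral image
    wavelengths: (B,) band center wavelengths in nm
    half_width:  assumed number of neighboring bands on each side
    """
    channels = []
    for center in RGB_CENTERS.values():
        i = int(np.argmin(np.abs(wavelengths - center)))
        lo, hi = max(0, i - half_width), min(len(wavelengths), i + half_width + 1)
        channels.append(cube[:, :, lo:hi].mean(axis=2))
    rgb = np.stack(channels, axis=2)
    # Normalize to [0, 1] for display on a trichromatic monitor.
    return (rgb - rgb.min()) / (np.ptp(rgb) + 1e-12)
```

The paper's fractional contrast stretching would then be applied to these three channels, with the optimizer tuning the stretch parameters against the chosen cost function.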
The rapid expansion of data generated from various sensors, devices, and applications offers the opportunity to exploit complex data relationships in new ways. Multi-dimensional data often appears as multiple, separate data channels, for which multi-channel data representations and analysis techniques have been developed. Alternately, for image matching, multiple-template image matching techniques have been developed. Multi-template approaches use multiple, often single-channel, templates exhibiting intra-class variations. Same-class test image exemplars must match all reference templates. In this paper, we combine multiple-template matching techniques with multi-channel data representations to provide multi-template, multi-channel image matching. We represent image data with pixels taking values in tensor products of low-dimensional Clifford algebras, for which Fourier transforms exist. Fourier-domain matching provides a computational improvement over spatial correlation-based matchers. The tensor product approach provides a decomposition of higher-dimensional algebras into combinations of lower-dimensional algebras for which Fourier transforms apply. The tensor product approach produces a performance advantage, on data with the appropriate inter-channel correlation characteristics, through exploitation of additional data channel correlations. When these add constructively, a matching performance benefit occurs. We define an anti-involution mapping on a tensor product space, which leads to a definition of image correlation over these spaces. We prove that the correlation satisfies an extended inner product definition. We prove a Cauchy-Schwarz inequality, which validates use of the image correlation as a matcher. We present an example, using synthetic image data, where the approach provides superior matching performance over classical score sum fusion.
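The Clifford-algebra machinery is beyond a short sketch, but the underlying Fourier-domain matching idea can be illustrated for ordinary real-valued channels (our simplification, not the paper's tensor-product correlation):

```python
import numpy as np

def fft_correlate(image, template):
    """Cross-correlation of one channel via the FFT (circular boundary)."""
    F = np.fft.fft2(image)
    T = np.fft.fft2(template, s=image.shape)
    return np.real(np.fft.ifft2(F * np.conj(T)))

def multichannel_score(channels, templates):
    """Sum per-channel correlation surfaces; constructively correlated
    channels then reinforce the true match peak, as the abstract notes."""
    return sum(fft_correlate(c, t) for c, t in zip(channels, templates))
```

For an H x W image, each correlation surface costs O(HW log HW) via the FFT rather than O((HW)^2) for direct spatial correlation, which is the computational improvement the abstract refers to.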
This paper introduces a new transform-domain no-reference (NR) image quality assessment measure that predicts perceptual quality for improving imaging system performance. The main idea is that enhancing the contrast of diverse real-world digital images creates more high-frequency content in the improved image than is present in the original image. The proposed measure uses different fast orthogonal transforms, such as the Fourier and Fibonacci transforms. A generalized model of local transform-based coefficients is derived, and the model parameters are transformed into features used for perceptual image quality score prediction. To test the performance of the proposed algorithm, we use the well-known and publicly available TID2008 database. The Pearson correlation coefficient is utilized to measure and compare the performance of the proposed quality measures against state-of-the-art approaches.
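As a hedged illustration of the high-frequency-content idea (a feature in the spirit of the paper, not its actual model; the cutoff value is an assumption), one can measure the fraction of spectral energy above a radial frequency cutoff, which contrast enhancement should raise relative to the original:

```python
import numpy as np

def highfreq_ratio(image, cutoff=0.25):
    """Fraction of spectral energy above a radial cutoff frequency."""
    F = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.mgrid[-h // 2:h - h // 2, -w // 2:w - w // 2]
    radius = np.hypot(yy / (h / 2), xx / (w / 2))   # normalized radius
    energy = np.abs(F) ** 2
    return energy[radius > cutoff].sum() / energy.sum()
```

A quality predictor would map such transform-domain features to a score and be validated by the Pearson correlation against the TID2008 subjective ratings.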
Navigation of modern aerial platforms typically uses GPS position measurements for accurate position estimation. GPS can provide position accuracies of better than ten meters. However, operation in GPS-denied regions precludes the use of GPS for navigation, thereby requiring other aiding sources for position estimation. To obtain altitude measurements, barometric altimeters are often used; they provide a passive sensor alternative for low-cost aerial vehicles. These altimeters estimate altitude from barometric pressure, but exhibit errors of tens of meters or more due to atmospheric variation, weather conditions, and platform distance from the altimeter calibration location. In this paper, we present a multi-camera vision-based approach to altitude estimation, which can be used in the absence of GPS. This approach converts the altitude estimation problem into a ground-based image registration problem, for which highly accurate (pixel- and sub-pixel-level accuracy) registration techniques are available. The altitude estimate results from simultaneously aligning multiple onboard cameras to ground reference imagery. The registration spatial error statistics correspond directly to the altitude estimation error. We use image fusion over multiple registration solutions to provide accurate image registration and thus accurate altitude estimation. We first provide a technical description of the approach. We then present numerical simulation results demonstrating the performance of the approach and validating aspects of the theory. We compare performance to a standard image rescaling approach for altitude estimation and demonstrate improved performance over this classical approach.
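A minimal sketch of the geometric core (our illustration under a nadir-looking pinhole-camera assumption, not the paper's multi-camera fusion algorithm): once registration recovers the scale between an onboard image and geo-referenced imagery of known ground sample distance, altitude follows from similar triangles.

```python
def altitude_from_scale(focal_px, ground_sample_m, scale):
    """Altitude above ground from a registration-recovered scale factor.

    focal_px:        camera focal length in pixels (assumed known)
    ground_sample_m: meters per pixel of the reference imagery
    scale:           reference pixels per onboard pixel, from registration
    For a nadir-looking pinhole camera, one onboard pixel spans
    altitude / focal_px meters on the ground, hence:
    """
    return focal_px * ground_sample_m * scale

# Example: 1000 px focal length, 0.5 m/px reference imagery, and
# registration mapping 1 onboard pixel to 0.6 reference pixels.
print(altitude_from_scale(1000.0, 0.5, 0.6))  # 300.0 m
```

Because altitude is linear in the recovered scale, sub-pixel registration accuracy translates directly into proportionally small altitude error, which is the relationship the abstract exploits.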
Image registration is a major field within computer vision and is often a required step in fulfilling other computer vision and pattern recognition tasks such as change detection, scene classification and image segmentation. Recent advances in 3D computer vision and lowered costs of Light Detection and Ranging devices, better known as LiDAR, have given way to an increase in readily available 3D image datasets. These 3D captures give an extra dimension to computer vision data and allow for improvements in a multitude of tasks when compared to their 2D counterparts. However, due to the large scale and complex nature of 3D point cloud data, classical methods for registration often require increased hardware usage and time, and can fail to properly register data with a low degree of error. The strategy presented in this paper aims to minimize the number of points representing a point cloud, reducing the time and hardware overhead needed to perform registration while improving registration accuracy and reducing error between registered clouds. This is done by extracting key edge features from the point clouds using eigenvector analysis to remove ground planes and large normal planes within the point cloud. The algorithm is further improved by performing set differencing on two separate edge extractions to remove large clusters of points representing natural objects, which can often cause confusion in the registration of outdoor LiDAR scenes. The method for key point registration is evaluated on large-scale, complex LiDAR point clouds obtained from aerial sensors. Tests are performed on both fully overlapping and partially overlapping clouds to ensure that the method increases performance on full and partial registration tasks. The tests are also performed on clouds of varying resolution to test the algorithm's ability to maintain integrity regardless of cloud resolution. Point reduction results, registration statistics and visual results are presented for comparison. A brief look into possible applications of the method and future improvements to the algorithm is included.
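A sketch of the eigenvector-based plane removal step (neighborhood size and threshold are assumptions, not the paper's settings): for each point, the smallest eigenvalue of the local covariance relative to the total measures flatness, so points on ground or large normal planes score near zero and are discarded, leaving edge-like points.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def edge_points(cloud, k=20, planarity_thresh=0.05):
    """Keep points whose local neighborhood is NOT planar.

    cloud: (N, 3) array of point coordinates.
    """
    nn = NearestNeighbors(n_neighbors=k).fit(cloud)
    _, idx = nn.kneighbors(cloud)
    keep = np.zeros(len(cloud), dtype=bool)
    for i, neighbors in enumerate(idx):
        pts = cloud[neighbors] - cloud[neighbors].mean(axis=0)
        eigvals = np.linalg.eigvalsh(pts.T @ pts)    # ascending order
        surface_variation = eigvals[0] / (eigvals.sum() + 1e-12)
        keep[i] = surface_variation > planarity_thresh
    return cloud[keep]
```

The reduced edge clouds, after the set-differencing step the abstract describes, would then be fed to a standard registration algorithm at a fraction of the original point count.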
The article describes a system for analyzing the surface of a tool used for processing metal products. The informative parameters are data on the current position of the object and its rotation angles, together with a set of images obtained in the visible and/or infrared range. The captured images are subject to interference caused by the small distance between the camera and the inspected tool, sensor noise, blurring (defocusing) of object boundaries at the edges of images, uneven lighting, and the complex shape of the analyzed tool. Introducing the ability to analyze a series of images obtained in different ranges improves the accuracy with which the structure of the tool is reconstructed. The article describes methods and algorithms for improving the quality of a series of images; the choice of their parameters is described, and an analysis of their influence on the result is presented. Methods have been developed for analyzing objects and identifying their boundaries from a series of images obtained in different ranges (IR and visible). The use of combined ranges makes it possible to increase the accuracy of boundary detection under interference and low illumination, and also allows the analysis of objects of complex shape (with holes, internal structures, or chips). A series of test images, capturing the cutter edge, the drill shape, and cutting inserts, shows the effectiveness of the developed approach and the possibility of automating the analysis process.
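A minimal sketch of combined-range boundary detection (the Canny thresholds and the OR-fusion rule are our assumptions; the article's actual methods are not specified in the abstract): edges confirmed in either the visible or the IR image survive the uneven lighting that may suppress them in one range.

```python
import cv2
import numpy as np

def fused_boundary(visible, infrared, t1=50, t2=150):
    """Fuse edge evidence from uint8 grayscale visible and IR images."""
    e_vis = cv2.Canny(visible, t1, t2)
    e_ir = cv2.Canny(infrared, t1, t2)
    fused = cv2.bitwise_or(e_vis, e_ir)
    # Close small gaps along the tool boundary.
    kernel = np.ones((3, 3), np.uint8)
    return cv2.morphologyEx(fused, cv2.MORPH_CLOSE, kernel)
```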
Convolutional Neural Networks (CNNs) are a powerful and successful deep learning technique for a variety of computer vision and image analysis applications. Interpreting and explaining the decisions of CNNs remains one of the most challenging tasks despite their significant success in various image analysis tasks. Topological Data Analysis (TDA) is an approach that exploits algebraic invariants from topology to analyse high-dimensional and noisy datasets, addressing the growing challenges of big data applications. Persistent homology (PH) is an algebraic topology method for measuring topological features of shapes and/or functions at different distance or similarity resolutions. This work investigates the algebraic properties of pretrained CNN convolutional layer filters initialized from random Gaussian/uniform distributions. We investigate the stability and sensitivity of the condition number of CNN filters during and after model training, with a focus on the class discriminability of the PH features of the convolved images. We demonstrate a strong link between the condition number of the CNN filters and the discriminating power of their PH representation. In particular, we establish that if a small perturbation is added to the original images, then feature maps produced by well-conditioned filters yield topological features similar to those of the original image. Our investigation and findings are based on training CNNs with the Digits, MNIST and CIFAR-10 datasets. Our ultimate interest is in applying these findings to designing appropriate CNN models for the classification of ultrasound tumor scan images. Preliminary results for these applications are encouraging.
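A sketch of the per-filter condition number computation (the flattening convention is our assumption; the paper does not specify it in the abstract): each filter is reshaped to a 2D matrix and its ratio of largest to smallest singular value computed.

```python
import numpy as np
import torch.nn as nn

def filter_condition_numbers(conv: nn.Conv2d):
    """Condition number of each filter, flattened to a 2D matrix.

    Each (in_channels, kh, kw) filter is reshaped to
    (in_channels, kh * kw) and np.linalg.cond (2-norm, via the SVD)
    is applied.
    """
    w = conv.weight.detach().cpu().numpy()          # (out, in, kh, kw)
    return [np.linalg.cond(w[i].reshape(w.shape[1], -1))
            for i in range(w.shape[0])]

conv = nn.Conv2d(3, 16, kernel_size=3)              # randomly initialized
print(filter_condition_numbers(conv)[:4])
```

Tracking these values across training epochs would reveal the stability behavior the abstract describes, with well-conditioned filters (values near 1) expected to preserve PH features under small image perturbations.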
Three-factor authentication is the best available option for cybersecurity enhancement; it combines what you 'know' (a password), 'have' (a username, token, or card) and 'are' (biometrics). Each factor makes the process stronger and more secure. Imagine an international team working on a long-term project by remotely logging into a secured server. In such a context, adding keystroke biometrics to authentication definitely improves cybersecurity. We propose to develop a set of recurrent neural network (RNN) models that utilize keystroke dynamics as one's biometrics to enhance cybersecurity. Keystroke dynamics refers to the process of measuring and assessing a person's typing rhythm on digital devices. Keystroke timing information such as di-graphs, dwell time and flight time is used in our experimental datasets. We apply support vector machines and recurrent neural networks to keystroke dynamics. Experimental results show the proposed methods are promising in contrast with traditional methods like nearest neighbor and Manhattan distance.
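For readers unfamiliar with the timing features, here is a minimal sketch of dwell- and flight-time extraction (the event format is our assumption for illustration):

```python
def keystroke_features(events):
    """Dwell and flight times from (key, press_t, release_t) triples.

    dwell  = release - press of the same key
    flight = press of next key - release of previous key
    """
    dwell = [r - p for _, p, r in events]
    flight = [events[i + 1][1] - events[i][2]
              for i in range(len(events) - 1)]
    return dwell, flight

events = [("p", 0.00, 0.08), ("a", 0.15, 0.21), ("s", 0.30, 0.37)]
print(keystroke_features(events))  # ([0.08, 0.06, 0.07], [0.07, 0.09])
```

Sequences of such features are what an RNN would consume, while an SVM or a Manhattan-distance baseline would operate on fixed-length summaries of them.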
The novel coronavirus 2019 (COVID-19) first appeared in Wuhan, China, spread quickly around the globe, and became a pandemic. The gold standard for confirming COVID-19 infection is the Reverse Transcription-Polymerase Chain Reaction (RT-PCR) assay. The lack of sufficient RT-PCR testing capacity, false-negative RT-PCR results, the time needed to get results back, and other logistical constraints enabled the epidemic to continue spreading despite interventions like regional or complete country lockdowns. Therefore, chest radiographs such as CT and X-ray images can be used to supplement PCR in combating the spread of the virus. In this work, we focus on proposing a deep learning tool that radiologists or healthcare professionals can use to diagnose COVID-19 cases quickly and accurately. However, the lack of a publicly available dataset of X-ray and CT images makes the design of such AI tools a challenging task. To this end, this study builds a comprehensive dataset of X-ray and CT scan images from multiple sources and provides a simple but effective COVID-19 detection technique using deep learning and transfer learning algorithms. A simple convolutional neural network (CNN) and a modified pre-trained AlexNet model are applied to the prepared X-ray and CT scan images. The experiments show that the utilized models provide accuracy of up to 98% via the pre-trained network and 94.1% using the modified CNN.
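A transfer-learning sketch in the spirit of the paper (which layers to freeze and replace is our assumption; the authors' exact modification of AlexNet is not given in the abstract):

```python
import torch.nn as nn
from torchvision import models

def covid_alexnet(num_classes=2):
    """Load an ImageNet-pretrained AlexNet and replace its final
    classifier layer for COVID vs. non-COVID prediction."""
    model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
    for p in model.features.parameters():
        p.requires_grad = False            # freeze convolutional backbone
    model.classifier[6] = nn.Linear(4096, num_classes)
    return model
```

Only the replaced head (and optionally the last few layers) is then trained on the prepared X-ray and CT images, which is what lets a small medical dataset benefit from ImageNet-scale pretraining.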
In tumor diagnostics from ultrasound scan images, the region of interest is often determined by marking the boundary of the suspect mass, simply by clicking on a sufficient number of tumor boundary points. To determine whether a tumor is malignant or benign, clinical experts undergo long training on how to interpret image information from the marked tumor region and from the surrounding area. In contrast, when designing automatic computer-aided diagnosis systems using both traditional and conventional machine learning, the relevant image features are generally obtained by cropping the tumor as a region of interest (RoI) without considering the periphery of the tumor, which might contain important discriminative information for better classification accuracy. In this work, we investigate the impact on classification accuracy, for different types of tumors, of a cropping strategy in which the tumor area is augmented by a proportion of the region surrounding the RoI. The optimal proportion needs to be determined so that the cropped RoIs encapsulate information about the posterior echo and shadow of the tumor, in addition to the internal texture and echo that have mainly been used as classification indicators. Recently proposed cropping techniques use the best-fitting ellipse of the tumor and examine the proportion by which the ellipse is expanded to improve accuracy. Unfortunately, the fitted ellipse may not reflect the shape of the tumor. Here, we investigate a number of alternative approaches to cropping the RoIs using the concept of the convex hull determined from the tumor boundary points selected by radiologists. Initially, we check several expansion ratios of the convex hull, ranging from 0.6 to 4.0, against the cropped tumor without margin. Several classification methods, including handcrafted features and deep learning methods, are adopted for breast and liver tumors in ultrasound images. We demonstrate the importance of optimal cropping for breast and liver ultrasound tumor classification, and show that the optimal margin depends on the cancer type and the classification method.
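A sketch of the convex-hull cropping idea (expanding the hull about its centroid is our assumed convention; the paper's exact expansion rule is not stated in the abstract):

```python
import numpy as np
from scipy.spatial import ConvexHull
from matplotlib.path import Path

def expanded_hull_mask(boundary_pts, scale, shape):
    """Boolean mask of the radiologist-marked tumor's convex hull,
    expanded about its centroid by `scale` (1.0 = no margin,
    1.5 = 50% margin).

    boundary_pts: (N, 2) array of (x, y) clicked boundary points
    shape:        (H, W) of the ultrasound image
    """
    hull = ConvexHull(boundary_pts)
    verts = boundary_pts[hull.vertices]
    centroid = verts.mean(axis=0)
    verts = centroid + scale * (verts - centroid)   # expand about centroid
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    pts = np.column_stack([xx.ravel(), yy.ravel()])
    return Path(verts).contains_points(pts).reshape(shape)
```

Unlike a fitted ellipse, the expanded hull follows the actual marked shape, so the added margin captures the posterior echo and shadow regardless of how irregular the tumor outline is.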
In recent years, virtual reality has experienced steady growth in the medical field, in areas such as surgery, rehabilitation, disease diagnosis, and learning. The 3D representation of radiological images plays a significant role in disease diagnosis and treatment planning compared to standard 2D medical images. Since the onset of the COVID-19 pandemic, almost all laboratories and medical centers have improved their management methods for patients with confirmed coronavirus (COVID-19) disease. Providing appropriate treatment at the right moment may help save lives. Our study aims to develop advanced COVID-19 CT scan image segmentation and 3D visualization using an unsupervised thresholding procedure and virtual reality technology, to better plan and monitor the care of affected patients. Our proposed system provides three-dimensional COVID-19 lesion visualization, which shows the segmented infected region (in 3D) more clearly than traditional two-dimensional images.
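A minimal sketch of the pipeline's two steps (Otsu thresholding is used here as a stand-in for the paper's unspecified unsupervised procedure, and a lung mask is assumed to be available):

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage import measure

def segment_lesions(ct_slice, lung_mask):
    """Unsupervised thresholding of one CT slice within the lung region.
    Returns a binary lesion mask."""
    t = threshold_otsu(ct_slice[lung_mask])
    return (ct_slice > t) & lung_mask

def lesion_surface(volume_mask):
    """Marching cubes gives a triangle mesh suitable for 3D/VR rendering
    of the stacked per-slice lesion masks."""
    verts, faces, _, _ = measure.marching_cubes(
        volume_mask.astype(float), 0.5)
    return verts, faces
```

The resulting mesh can be imported into a VR engine so clinicians can inspect the infected region volumetrically rather than slice by slice.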
Ultrasound imaging is widely used in medical diagnostics. The existence of speckle noise tends to impair ultrasound image quality, which has a negative effect on the computer-aided diagnostic pipeline. As a result, content-preserving noise reduction is an essential part of ultrasound image pre-processing. This paper argues that conventional one-size-fits-all pre-processing methods, applied to all images irrespective of their quality and/or their content, have many limitations. The paper demonstrates that the negative effects of speckle noise are more significant in regions where solid tissues are present. Consequently, we propose an adaptive approach that uses trained classification models to detect such regions within the image and targets the speckle noise of the detected regions instead of the whole image. Detection is achieved by placing a sliding window over the image and feeding individual windows to a trained classifier. In this study, we first analyse the content of the images to identify the complexity of the speckle noise by training a linear support vector machine classifier on histogram-based measurements, such as skewness and kurtosis, to determine whether an image partially or fully needs pre-processing. To evaluate the effectiveness of the new adaptive pre-processing methods, we build a hybrid two-model solution: the first trainable model decides whether an image requires pre-processing and, if so, applies it to the whole image; the second model goes a step further, checking which parts of the image require pre-processing and adaptively applying it using the block-based trainable system. The results, based on 138 benign and 104 malignant ovarian ultrasound images, show that the two models performed better than other state-of-the-art pre-processing techniques, which confirms the need for an adaptive system that applies pre-processing only when needed.
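A sketch of the block-based detection stage (window size, stride, and the feature pair follow the abstract's description; the specific values are our assumptions):

```python
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.svm import LinearSVC

def window_features(image, win=32, step=16):
    """Histogram-shape features (skewness, kurtosis) per sliding window,
    used to decide which blocks need despeckling."""
    feats, coords = [], []
    for y in range(0, image.shape[0] - win + 1, step):
        for x in range(0, image.shape[1] - win + 1, step):
            block = image[y:y + win, x:x + win].ravel()
            feats.append([skew(block), kurtosis(block)])
            coords.append((y, x))
    return np.array(feats), coords

# Train once on labeled blocks (1 = needs pre-processing), then predict:
# clf = LinearSVC().fit(train_feats, train_labels)
# needs_denoise, coords = clf.predict(window_features(test_image)[0]), ...
```

Despeckling is then applied only to windows the classifier flags, which is what preserves content in the regions that do not need it.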
This paper aims to effectively detect and remove clouds in remote sensing (RS) images. The proposed algorithm is a two-stage procedure: (a) an efficient cloud detection algorithm and (b) an efficient inpainting algorithm for image reconstruction. We introduce an inpainting method that uses block-matching to recover image regions occluded by clouds. The proposed approach has been evaluated on a remote sensing RGB image dataset with various aerial sceneries. Compared with traditional inpainting methods, both qualitative and quantitative results show that our approach has advantages over state-of-the-art methods.
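A greedy single-patch sketch of the block-matching idea (not the paper's full algorithm; patch size and the exhaustive search are assumptions): find the cloud-free patch with the smallest SSD on the valid pixels of a masked patch and copy its pixels into the hole.

```python
import numpy as np

def inpaint_block(image, mask, y, x, patch=8):
    """Fill one cloud-masked patch in a float grayscale image by block
    matching. mask is True where cloud was detected."""
    target = image[y:y + patch, x:x + patch]
    known = ~mask[y:y + patch, x:x + patch]
    best, best_cost = None, np.inf
    for j in range(0, image.shape[0] - patch, patch):
        for i in range(0, image.shape[1] - patch, patch):
            if mask[j:j + patch, i:i + patch].any():
                continue                      # candidate must be cloud-free
            cand = image[j:j + patch, i:i + patch]
            cost = ((cand - target)[known] ** 2).sum()
            if cost < best_cost:
                best, best_cost = cand, cost
    if best is not None:
        image[y:y + patch, x:x + patch][~known] = best[~known]
```

A full method would iterate over all masked patches, typically filling from the hole boundary inward so each new patch has some known pixels to match on.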
The coronavirus (COVID-19) pandemic has been affecting the health of people around the globe. With the number of confirmed cases and deaths still rising daily, it is crucial to quickly detect positive cases and provide them with the necessary treatment. Presently, several research investigations are being conducted to help control the spread of this epidemic; one research topic is creating faster and more accurate detection. Recent studies have demonstrated that chest CT images encompass distinctive COVID-19 features, which can be utilized for achieving an efficient COVID-19 diagnosis. However, manually reading these images on a large scale is laborious and intractable. Thus, an artificial intelligence-based system that can help capture the precise information and give an accurate diagnosis would be beneficial. In this paper, a customized weighted filter-based CNN (CCNN) is proposed. Computer simulations show that the proposed CCNN system (1) is more effective at distinguishing COVID-19 CT scans from non-COVID-19 CT scans and (2) trains faster than traditional deep learning models.
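One plausible reading of a "customized weighted filter-based" first layer is a convolution initialized with a hand-designed kernel; the sketch below uses a Laplacian sharpening kernel purely as an illustration, since the paper's actual filter weights are not given in the abstract.

```python
import torch
import torch.nn as nn

class WeightedFilterConv(nn.Module):
    """First layer holding a hand-designed weighted filter (hypothetical
    choice; optionally frozen so later layers do the learning)."""
    def __init__(self, trainable=False):
        super().__init__()
        k = torch.tensor([[0., -1., 0.],
                          [-1., 5., -1.],
                          [0., -1., 0.]]).view(1, 1, 3, 3)
        self.conv = nn.Conv2d(1, 1, 3, padding=1, bias=False)
        self.conv.weight = nn.Parameter(k, requires_grad=trainable)

    def forward(self, x):
        return self.conv(x)

# Prepend to a small CNN backbone for 2-class CT classification:
model = nn.Sequential(WeightedFilterConv(),
                      nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                      nn.Linear(16, 2))
```

Fixing the first-layer weights reduces the number of trainable parameters, which is consistent with the faster training the abstract reports.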
Face recognition is one of the most challenging biometric modalities when deployed in unconstrained environments, due to the high variability that face images present in real-world crowds, where they are affected by complex factors including head pose, aging, illumination conditions, occlusions, and facial expressions. A face recognition system aims to identify and track subjects in visual data, such as images and videos. More people are currently wearing masks in public, bringing new challenges to face detection and identification systems. This article focuses on the detection and recognition of masked faces. The presented framework is based on new artificial intelligence tools that use hand-crafted and deep learning (YOLOv3 and CNN) features and SVM classifiers. Computer simulations on five different face mask datasets (the Real-World Masked Face Dataset (RMFD), the Simulated Masked Face Dataset (SMFD), the Medical Mask Dataset (MMD), Labeled Faces in the Wild (LFW), and our proposed artificially simulated masked face dataset (ASMFD)) illustrate that the proposed method is comparable to, or in most cases better than, traditional face mask recognition techniques. The presented system may produce anonymous statistical data that can help agencies predict potential epidemics of COVID-19.
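A structural sketch of the detect-then-classify pipeline described above; `detector` and `embedder` stand in for YOLOv3 face detection and a CNN feature extractor, and both interfaces are assumptions, not a real API.

```python
import numpy as np
from sklearn.svm import SVC

class MaskedFacePipeline:
    """Detection -> feature extraction -> SVM classification."""
    def __init__(self, detector, embedder):
        self.detector = detector          # image -> list of face crops
        self.embedder = embedder          # crop  -> 1D feature vector
        self.svm = SVC(kernel="rbf", probability=True)

    def fit(self, crops, labels):
        X = np.stack([self.embedder(c) for c in crops])
        self.svm.fit(X, labels)

    def predict(self, image):
        crops = self.detector(image)
        if not crops:
            return []
        X = np.stack([self.embedder(c) for c in crops])
        return self.svm.predict(X)        # e.g. masked/unmasked or identity
```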
The devastating aftermath of a natural disaster is often challenging to assess, and inaccuracies are bound to occur when an assessment is done manually, due to inevitable human-in-the-loop errors. A timely and accurate evaluation of the extent of damage is often needed to effectively deploy resources to hard-hit areas, save lives, and facilitate adequate planning for disaster recovery. Commonly used supervised learning approaches have brought considerable improvement to natural disaster assessment. However, quickly deploying supervised classification is still challenging due to the complexity of acquiring many labeled samples in the aftermath of a disaster. In this paper, we propose: i) a two-stream high-resolution network (HRNet) that takes a pair of pre- and post-disaster images, and ii) a semi-supervised framework for improving the generalizability of current methods to other housing styles. The proposed method comprises two parts: a multi-class deep learning model, and a pseudo-label generator and refinement module. By harnessing information from a large amount of unlabeled data and aerial imagery, our approach can outperform its base model. Experimental results on the xView2 dataset demonstrate that the proposed framework improves the performance of our two-stream model on unseen satellite images depicting a scene before and after a disaster.
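A sketch of the pseudo-label generation step (the confidence threshold and the pixel-wise masking rule are our assumptions about the refinement module, and the two-input model signature is hypothetical):

```python
import torch

@torch.no_grad()
def pseudo_labels(model, unlabeled_pairs, thresh=0.9):
    """Generate refined pseudo-labels for unlabeled pre/post image pairs,
    keeping only pixels predicted with high confidence."""
    model.eval()
    labeled = []
    for pre, post in unlabeled_pairs:
        logits = model(pre.unsqueeze(0), post.unsqueeze(0))  # (1, C, H, W)
        prob, label = logits.softmax(dim=1).max(dim=1)       # per pixel
        mask = prob.squeeze(0) >= thresh     # confident pixels only
        label = label.squeeze(0)
        label[~mask] = -1                    # -1 = ignore index in the loss
        labeled.append((pre, post, label))
    return labeled
```

Retraining on the union of labeled data and these confident pseudo-labels is what lets the model adapt to housing styles absent from the labeled set.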
One of the numerous drawbacks of existing systems for wrong-way driver detection (WWD) is that they require installation and maintenance of expensive sensor networks. More importantly, they fail to leverage the growing number of traffic surveillance camera networks. Approaching wrong-way driver detection from a computer vision standpoint is rather intricate if not well thought out. As such, recent methods that explored alternative deep learning approaches to this problem have been shown to exhibit a high rate of false detection and consider very limited settings, e.g., exit ramps. In this paper, we propose a more sophisticated computer vision framework to address the shortcomings of existing systems while also leveraging existing preinstalled large-scale camera infrastructure to achieve real-time WWD with high precision. The proposed framework combines four modules working collaboratively to deliver the desired results. These include: (i) a flow detection module, which is initialized to determine the correct direction of flow by momentarily observing the traffic; (ii) a state-of-the-art object detection algorithm, in this case YOLOv5, for detecting all objects of interest in each frame; (iii) a centroid-based object tracker coupled with the Hungarian matching algorithm for efficiently tracking objects of interest (see the sketch after this paragraph); and (iv) a wrong-way flagging module to flag vehicles moving opposite to a lane's computed flow direction as they enter and exit the camera's field of view. The Hungarian algorithm ensures that each object of interest is assigned a unique ID, which not only reinforces the tracking efficiency of the object tracker but also provides traffic counting capability. Tracking paths are compared against the computed direction of flow to instantly detect wrong-way driving. The proposed architecture achieves state-of-the-art performance with a high true positive rate and few false detections. One of the several benefits of the proposed method is that it could potentially be integrated into department of transport (DOT) surveillance systems to significantly reduce the cognitive load on traffic control agents, who are overwhelmed by the large number of video feeds they are tasked to monitor in real time. Alerts generated from this system could help mitigate such issues.
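A minimal sketch of modules (iii) and (iv) (the distance gate and the net-displacement flagging rule are our assumptions): Hungarian matching assigns detections to tracks by minimizing total centroid distance, and a track is flagged when its motion opposes the lane's computed flow direction.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_detections(track_centroids, det_centroids, max_dist=50.0):
    """One tracking step: optimal assignment between existing track
    centroids (T, 2) and new detections (D, 2).
    Returns matched (track_idx, det_idx) pairs."""
    cost = np.linalg.norm(
        track_centroids[:, None, :] - det_centroids[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]

def is_wrong_way(path, flow_direction):
    """Flag a track whose net displacement opposes the lane's computed
    flow direction (a unit vector)."""
    disp = path[-1] - path[0]
    return np.dot(disp, flow_direction) < 0
```

Unmatched detections spawn new track IDs, which is how the same mechanism also yields the traffic counts mentioned above.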
Recognizing the model of a vehicle in natural scene images is an important and challenging task for real-life applications. Current methods perform well under controlled conditions, such as frontal and horizontal view angles or optimal lighting conditions. Nevertheless, their performance decreases significantly in unconstrained environments, which may include extreme darkness or over-illuminated conditions. Other challenges to recognition systems include input images of very low visual quality or considerably low exposure levels. This paper strives to improve vehicle model recognition accuracy in dark scenes by using a deep neural network model. To boost recognition performance, the approach performs joint enhancement and localization of vehicles under non-uniform lighting conditions. Experimental results on several public datasets demonstrate the generality and robustness of our framework. It improves the vehicle detection rate under poor lighting conditions, localizes objects of interest, and yields better vehicle model recognition accuracy on low-quality input image data.
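As a hedged illustration of the enhancement stage (a classical gamma-plus-CLAHE stand-in for the paper's learned joint enhancement, which the abstract does not detail):

```python
import cv2
import numpy as np

def enhance_dark_scene(bgr, gamma=2.2):
    """Low-light pre-enhancement before vehicle localization: gamma
    correction to lift dark pixels, then CLAHE on the luminance channel
    to handle non-uniform lighting."""
    lut = ((np.arange(256) / 255.0) ** (1.0 / gamma) * 255).astype(np.uint8)
    brightened = cv2.LUT(bgr, lut)
    lab = cv2.cvtColor(brightened, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return cv2.cvtColor(cv2.merge([clahe.apply(l), a, b]),
                        cv2.COLOR_LAB2BGR)
```

In the paper's framework, enhancement and localization are performed jointly by the network rather than as a fixed pre-processing step like this one.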
We present a new image enhancement algorithm based on multi-scale block-rooting processing. The basic idea is to apply a frequency-domain image enhancement approach to different image blocks. The transform-coefficient enhancement parameter for each block is derived through optimization of a measure of enhancement. Experimental results are presented to illustrate the performance of the proposed algorithm on a medical image dataset. It is shown that the proposed method can realize image color constancy, local dynamic range compression, color enhancement, and overall dynamic range compression under certain circumstances.
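A sketch of the per-block alpha-rooting core (the candidate alpha values are assumptions, and plain standard deviation stands in for the paper's measure of enhancement, e.g. an EME-style measure):

```python
import numpy as np

def alpha_rooting(block, alpha=0.9):
    """Classical alpha-rooting: raise the magnitude spectrum to |F|^alpha
    while keeping the phase; alpha < 1 boosts relative high-frequency
    content and thus local contrast."""
    F = np.fft.fft2(block)
    return np.real(np.fft.ifft2(
        np.abs(F) ** alpha * np.exp(1j * np.angle(F))))

def blockwise_enhance(image, block=64, alphas=(0.85, 0.9, 0.95, 1.0)):
    """Choose each block's alpha by maximizing a contrast measure."""
    out = image.astype(float).copy()
    for y in range(0, image.shape[0] - block + 1, block):
        for x in range(0, image.shape[1] - block + 1, block):
            b = image[y:y + block, x:x + block].astype(float)
            out[y:y + block, x:x + block] = max(
                (alpha_rooting(b, a) for a in alphas), key=np.std)
    return out
```

Running this at several block sizes and combining the results would give the multi-scale behavior the abstract describes.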