This PDF file contains the front matter associated with SPIE Proceedings Volume 11543, including the Title Page, Copyright information, and Table of Contents.
This record contains the opening remarks for the 11543 Digital Forum.
In long-range imagery, the atmosphere along the line of sight can result in unwanted visual effects. Random variations in the refractive index of the air cause light to shift and distort. When captured by a camera, this randomly induced variation results in blurred and spatially distorted images. The removal of such effects is greatly desired. Many traditional methods are able to reduce the effects of turbulence within images; however, they require complex optimisation procedures or have large computational complexity. The use of deep learning for image processing has now become commonplace, with neural networks able to outperform traditional methods in many fields. This paper presents an evaluation of various deep learning architectures on the task of turbulence mitigation. The core disadvantage of deep learning is the dependence on a large quantity of relevant data. For the task of turbulence mitigation, real-life data is difficult to obtain, as a clean undistorted image is not always obtainable. Turbulent images were therefore generated with the use of a turbulence simulator, which was able to accurately represent atmospheric conditions and apply the resulting spatial distortions onto clean images. This paper provides a comparison between current state-of-the-art image-reconstruction convolutional neural networks. Each network is trained on simulated turbulence data and then assessed on a series of test images. It is shown that the networks are unable to provide high-quality output images; however, they are able to reduce the effects of spatial warping within the test images. This paper provides a critical analysis of the effectiveness of applying deep learning to this problem. It is shown that deep learning has potential in this field and can be used to make further improvements in the future.
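A minimal sketch of how such an assessment could be scored, assuming paired simulated-turbulence inputs and clean reference images are available as NumPy arrays; PSNR is used here only as an illustrative quality metric, not the paper's own protocol.

import numpy as np

def psnr(reference, restored, max_val=255.0):
    # Peak signal-to-noise ratio between a clean reference and a network output.
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# Hypothetical usage: score each network's outputs against the clean frames.
# scores = [psnr(clean, net_output) for clean, net_output in test_pairs]
# print(np.mean(scores))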
Digital holography (DH) systems have the potential to perform single-shot imaging through deep turbulence by incorporating emerging algorithms, such as model-based iterative reconstruction (MBIR), that jointly estimate both the phase-errors and speckle-free image. However, the high computational cost of MBIR poses a challenge for use in practical applications. In this paper, we propose a method that makes MBIR feasible for real-time DH systems. Our method uses surrogate optimization techniques to simplify and speed up the reflectance and phase-error updates in MBIR. Further, our method accelerates computation of the surrogate-updates by leveraging cache-prefetching and SIMD vector processing units on each CPU core. We analyze the convergence and real CPU time of our method using simulated data sets, and demonstrate its dramatic speedup over the original MBIR approach.
Object detection is the basis for several computer vision applications and autonomous functionalities. The task has been studied extensively, and since the onset of deep learning detection accuracy has increased significantly. Every year several new models based on convolutional neural networks (CNNs) are developed and released. However, the development is driven by large research datasets, such as ImageNet and MS COCO, which aim to cover a large range of classes and contain very strong biases with respect to object size and position. Thus, existing models and design choices are biased towards such situations. More specialized domains, such as that of maritime vessel detection, can have very different requirements, and not all mainstream models are equally suited to this task. Specific challenges of maritime vessel detection in surface-to-surface view are the large variety of object sizes due to varying distances from the camera, the large range of vessel types, atmospheric effects, and strong overlap between objects. Furthermore, the lack of large training datasets in such specialized domains is a limitation that needs to be considered. Finally, the existing smaller datasets often contain strong biases themselves, as they were usually recorded in a single location with unique visual characteristics and vessel types that may be very distinct from those in other datasets. In this work we analyze the performance of several of the latest state-of-the-art object detectors in the context of maritime vessel detection. We evaluate the detectors on the limited existing public datasets, including the specialized Singapore Maritime Dataset and the SeaShips dataset, but also on ship images included in general object detection datasets such as MS COCO. We specifically analyze how existing dataset biases impact the ability of the resulting detectors to generalize. In addition, we create our own maritime vessel training data from online sources and investigate the impact of adding such data to the training process. Our evaluation results in a set of models which achieve strong vessel detection accuracy on all datasets. In summary, this work does not aim at methodological novelty but rather seeks to provide an empirical basis for the choice of object detector and the composition of training data for future work on maritime vessel detection.
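A minimal sketch of a COCO-style evaluation such a comparison might use, assuming ground truth and detections have been exported in the standard COCO JSON formats; the file names are placeholders.

from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Hypothetical paths: ground-truth annotations and detector output in COCO format.
gt = COCO("seaships_val_annotations.json")
dt = gt.loadRes("detector_results.json")

evaluator = COCOeval(gt, dt, iouType="bbox")
evaluator.evaluate()    # match detections to ground truth per image
evaluator.accumulate()  # aggregate over IoU thresholds, areas, and max detections
evaluator.summarize()   # prints AP/AR, including the small/medium/large object breakdown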
Deep learning algorithms have proven to be a powerful tool in image and video processing for security and surveillance operations. In a maritime environment, the fusion of electro-optical sensor data with human intelligence plays an important role in countering security issues. For instance, situational awareness can be enhanced through an automated system that generates reports on ship identity and signature while detecting changes in naval vessel activity. As a result, this improves data gathering and analysis in the absence of sensor specialists on board and significantly improves the response time to anomalous events.
To date, various studies have explored the performance of deep neural networks using a ship signature database. Research on image analysis in the maritime domain mainly focuses on the object detection task. Ship detection is very challenging due to illumination and weather conditions, water dynamics, complex backgrounds, the presence of small-sized objects, and the limited availability of training data. Aside from detection, image segmentation is gaining interest for maritime surveillance. This task addresses not only naval vessel detection using bounding boxes but also extraction of the ship mask. By performing segmentation at the pixel level, ship characteristics can be obtained more accurately to support object classification and identification.
In the current study, we investigate the Mask R-CNN method, a state-of-the-art framework for image segmentation tasks, for ship detection. The surveillance data captured by an on-board camera provides visual-optical videos in an open-sea scenario with minimal influence from background clutter. The results indicate that the detector performs well on large targets; however, training on a dataset representative of the objects to be detected and recognized is needed.
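A minimal inference sketch with an off-the-shelf Mask R-CNN from torchvision, shown only to illustrate the kind of per-instance boxes and masks the study works with; the model here is pretrained on COCO, not on the paper's on-board maritime footage, and the frame path is a placeholder.

import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

image = to_tensor(Image.open("frame.jpg").convert("RGB"))  # hypothetical video frame
with torch.no_grad():
    output = model([image])[0]

# Keep confident detections of COCO category 9 ("boat") and their pixel masks.
keep = (output["scores"] > 0.5) & (output["labels"] == 9)
boxes = output["boxes"][keep]
masks = output["masks"][keep] > 0.5  # binary ship masks, one per detection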
The main goal of object detection is to localize objects in a given image and assign to each object its corresponding class label. Designing effective approaches for infrared images is a challenging problem due to variation of the target signature caused by changes in the environment, viewpoint variation, or the state of the target. Convolutional Neural Network (CNN) models already lead to accurate performance on traditional computer vision problems, and they have also shown their capabilities in more specific applications such as radar, sonar, or infrared imaging. For target detection, two main approaches can be used: two-stage detectors or one-stage detectors. In this contribution we investigate the two-stage Faster R-CNN approach and propose to use a compact CNN model as backbone in order to speed up computation without degrading detection performance. The proposed model is evaluated on the SENSIAC dataset, made of 16-bit gray-value image sequences, and compared to Faster R-CNN with VGG19 as backbone and the one-stage model SSD.
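A hedged sketch of how a compact backbone can be plugged into a two-stage detector, following the torchvision Faster R-CNN API; MobileNetV2 stands in here for the compact CNN, which is an assumption rather than the backbone used in the paper.

import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator

# Compact backbone: MobileNetV2 feature extractor (stand-in for the paper's compact CNN).
backbone = torchvision.models.mobilenet_v2(pretrained=True).features
backbone.out_channels = 1280  # FasterRCNN needs to know the feature-map depth

anchor_generator = AnchorGenerator(sizes=((32, 64, 128, 256, 512),),
                                   aspect_ratios=((0.5, 1.0, 2.0),))
roi_pooler = torchvision.ops.MultiScaleRoIAlign(featmap_names=["0"],
                                                output_size=7, sampling_ratio=2)

# Two classes: background + target (e.g. a SENSIAC vehicle class).
model = FasterRCNN(backbone, num_classes=2,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pooler)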
Maritime surveillance contributes to the security of ports, oil platforms, and the coastal littoral by monitoring and controlling a maritime region to detect unusual activities such as unlicensed fishing boats, pirate attacks, and human trafficking. Maritime surveillance systems face many design challenges. For instance, these systems must track moving vessels at long distances in the presence of a dynamic background, using infrared imaging and under various weather conditions. This work presents a benchmark study of the performance of different state-of-the-art tracking algorithms for marine vehicles using mid-wave infrared (MWIR) images. A comprehensive study was conducted to indicate the advantages and disadvantages of the tracking algorithms.
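A minimal sketch of how several off-the-shelf trackers could be benchmarked on an MWIR sequence with OpenCV, assuming frames are read from a video file and an initial bounding box is given; the specific trackers, clip name, and scoring approach are illustrative, not the study's protocol.

import cv2

trackers = {"CSRT": cv2.TrackerCSRT_create,   # opencv-contrib in some versions
            "KCF": cv2.TrackerKCF_create}
init_box = (120, 80, 40, 25)  # hypothetical first-frame target box (x, y, w, h)

for name, make in trackers.items():
    cap = cv2.VideoCapture("mwir_sequence.avi")  # hypothetical MWIR clip
    ok, frame = cap.read()
    tracker = make()
    tracker.init(frame, init_box)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        success, box = tracker.update(frame)
        # Compare `box` against per-frame ground truth here (e.g. IoU) to score the tracker.
    cap.release()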
Supervised deep learning algorithms are redefining the state of the art for object detection and classification. However, training these algorithms requires extensive datasets that are typically expensive and time-consuming to collect. In the field of defence and security, this can become impractical when data is of a sensitive nature, such as infrared imagery of military vessels. Consequently, algorithm development and training are often conducted in synthetic environments, but this brings into question the generalisability of the solution to real world data. In this paper we investigate training deep learning algorithms for infrared automatic target recognition without using real-world infrared data. A large synthetic dataset of infrared images of maritime vessels in the long wave infrared waveband was generated using target-missile engagement simulation software and ten high-fidelity computer-aided design models. Multiple approaches to training a YOLOv3 architecture were explored and subsequently evaluated using a video sequence of real-world infrared data. Experiments demonstrated that supplementing the training data with a small sample of semi-labelled pseudo-IR imagery caused a marked improvement in performance. Despite the absence of real infrared training data, high average precision and recall scores of 99% and 93%, respectively, were achieved on our real-world test data. To further the development and benchmarking of automatic target recognition algorithms, this paper also contributes our dataset of photo-realistic synthetic infrared images.
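For reference, a minimal sketch of how precision and recall could be computed from IoU-matched detections on a labelled test sequence; the greedy one-to-one matching below is a common simplification and not necessarily the evaluation used in the paper.

def precision_recall(detections, ground_truths, iou_threshold=0.5):
    # detections / ground_truths: lists of (x1, y1, x2, y2) boxes for one frame.
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
        return inter / union if union else 0.0

    matched, tp = set(), 0
    for det in detections:                      # greedy one-to-one matching
        for i, gt in enumerate(ground_truths):
            if i not in matched and iou(det, gt) >= iou_threshold:
                matched.add(i)
                tp += 1
                break
    fp = len(detections) - tp
    fn = len(ground_truths) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall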
A method for the automatic synthesis of a fuzzy controller and optimization of its parameters based on a genetic algorithm is developed. A distinctive feature of the method is an algorithm for processing statistical data about the operation of a real industrial facility, which makes it possible to form the initial knowledge base of the fuzzy controller (the number and type of membership functions used, and the base of control rules). The use of a genetic algorithm allows the parameters of the fuzzy controller to be optimized so as to ensure the best quality indicators of its operation: the duration and oscillation of the transient process and the value of the steady-state error. The proposed method is automated through a special software application developed in the Matlab modeling environment and requires minimal human participation. Simulation modeling is carried out, and results are presented that confirm the correctness of the proposed method and the possibility of its practical use. The operation of the method can be represented as a sequence of the following stages: forming the initial parameters of the fuzzy controller; searching for the optimal lengths of the term-sets of the input-output linguistic variables; and searching for the optimal parameters of the term-sets of the input-output linguistic variables.
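The method itself is implemented in Matlab; below is only a language-agnostic sketch (written in Python for consistency with the other examples here) of the genetic-algorithm stage that tunes membership-function parameters against a simulated cost. The cost function, population sizes, and parameter encoding are placeholders, not the paper's configuration.

import random

def simulate_cost(params):
    # Placeholder: run the closed-loop simulation with these membership-function
    # parameters and return a weighted sum of overshoot, settling time, and
    # steady-state error (lower is better).
    return sum((p - 0.5) ** 2 for p in params)

def genetic_optimize(n_params, pop_size=40, generations=100, mutation_rate=0.1, elite=4):
    population = [[random.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=simulate_cost)
        parents = population[:pop_size // 2]
        children = [list(ind) for ind in population[:elite]]   # elitism: keep the best
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_params)                # one-point crossover
            child = a[:cut] + b[cut:]
            if random.random() < mutation_rate:                # mutate one gene
                child[random.randrange(n_params)] = random.random()
            children.append(child)
        population = children
    return min(population, key=simulate_cost)

best_params = genetic_optimize(n_params=9)  # e.g. 3 triangular sets x 3 points each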
Automatic detection and tracking of persons and vehicles can greatly increase situational awareness in many military applications. Various methods for detection and tracking have been proposed so far, both rule-based and learning approaches. With the advent of deep learning, learning approaches generally outperform rule-based approaches. Neural networks pre-trained on datasets like MS COCO can give reasonable detection performance on military datasets. However, for optimal performance it is advised to optimize the training of these pre-trained networks with a representative dataset. In typical military settings, it is a challenge to acquire enough data and to split the training and test set properly. In this paper we evaluate fine-tuning on military data and compare different pre- and post-processing methods. First, we compare a standard pre-trained RetinaNet detector with a fine-tuned version, trained on similar objects that were recorded at distances different from those in the test set. With respect to distance, this training set is therefore out-of-distribution. Next, we augment the training examples by both increasing and decreasing their size. Once detected, we use a template tracker to follow the objects, compensating for any missing detections. We show the results on detection and tracking of persons and vehicles in visible imagery in a military long-range detection setting. The results show the added value of fine-tuning a neural network with augmented examples, where final network performance is similar to human visual performance for detection of targets with a target area of tens of pixels in a moderately cluttered land environment.
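A minimal sketch of template-based gap-filling of the kind described above, assuming the detection box from the previous frame is used as the template and searched for in the current frame with OpenCV normalized cross-correlation; this is a generic stand-in, not necessarily the exact tracker used in the paper.

import cv2

def track_by_template(prev_frame, prev_box, curr_frame):
    # prev_box: (x, y, w, h) detection from the previous frame; frames are grayscale.
    x, y, w, h = prev_box
    template = prev_frame[y:y + h, x:x + w]
    response = cv2.matchTemplate(curr_frame, template, cv2.TM_CCOEFF_NORMED)
    _, score, _, top_left = cv2.minMaxLoc(response)
    return (top_left[0], top_left[1], w, h), score  # new box and match confidence

# Hypothetical usage when the detector misses a frame:
# box, score = track_by_template(prev_gray, last_detection, curr_gray)
# if score > 0.6: keep the track alive with `box` until detections resume.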
Bees are excellent visual navigators and visual communicators. This paper proposes a visual navigation and visual communication method which is combined with GPS navigation and radio communication to form a UAV fusion navigation and fusion communication system. Drawing on the way bees use polarized light for navigation, the sun, ground markers, and other cues are used for navigation and positioning as a backup to the GPS navigation system. By editing a visual behavior-rules communication dictionary, simulated bee body-dance movements enable aerial visual communication between UAVs in the case of radio communication interference; ground personnel visually recognize UAV aerial movements and UAVs recognize ground-personnel body movements, enabling open-space visual communication. YOLOv3 models are used in visual communication and navigation to detect UAV visual movements and ground-personnel limb movements. Experiments show that the bionic bee-principle visual communication and navigation system can keep a UAV formation flying safely and carrying out its mission under radio interference.
Detection of military assets on the ground can be performed by applying deep learning-based object detectors on drone surveillance footage. The traditional way of hiding military assets from sight is camouflage, for example by using camouflage nets. However, large assets like planes or vessels are difficult to conceal by means of traditional camouflage nets. An alternative type of camouflage is the direct misleading of automatic object detectors. Recently, it has been observed that small adversarial changes applied to images of the object can produce erroneous output by deep learning-based detectors. In particular, adversarial attacks have been successfully demonstrated to prohibit person detections in images, requiring a patch with a specific pattern held up in front of the person, thereby essentially camouflaging the person for the detector. Research into this type of patch attacks is still limited and several questions related to the optimal patch configuration remain open. This work makes two contributions. First, we apply patch-based adversarial attacks for the use case of unmanned aerial surveillance, where the patch is laid on top of large military assets, camouflaging them from automatic detectors running over the imagery. The patch can prevent automatic detection of the whole object while only covering a small part of it. Second, we perform several experiments with different patch configurations, varying their size, position, number and saliency. Our results show that adversarial patch attacks form a realistic alternative to traditional camouflage activities, and should therefore be considered in the automated analysis of aerial surveillance imagery.
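A hedged sketch of the generic patch-optimization loop underlying such attacks, with a placeholder detector that returns per-detection confidence scores; the loss, fixed patch placement, and optimizer settings are illustrative and not the configurations studied in the paper.

import torch

def apply_patch(images, patch, top, left):
    # Paste a (3, ph, pw) patch onto a batch of (N, 3, H, W) images at a fixed location.
    patched = images.clone()
    ph, pw = patch.shape[1], patch.shape[2]
    patched[:, :, top:top + ph, left:left + pw] = torch.clamp(patch, 0, 1)
    return patched

def optimize_patch(detector, images, patch_size=64, steps=500, lr=0.03):
    patch = torch.rand(3, patch_size, patch_size, requires_grad=True)
    optimizer = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        patched = apply_patch(images, patch, top=80, left=80)
        scores = detector(patched)          # placeholder: per-detection confidence scores
        loss = scores.max()                 # suppress the strongest remaining detection
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return patch.detach()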
The use of camouflage is widespread in the biological domain, and has also been used extensively by armed forces around the world in order to make visual detection and classification of objects of military interest more difficult. The recent advent of ever more autonomous military agents raises the questions of whether camouflage can have a similar effect on autonomous agents as it has on human agents, and if so, what kind of camouflage will be effective against such adversaries. In previous works, we have shown that image classifiers based on deep neural networks can be confused by patterns generated by generative adversarial networks (GANs). Specifically, we trained a classifier to distinguish between two ship types, military and civilian. We then used a GAN to generate patterns that, when overlaid on parts of military vessels (frigates), made the classifier confuse the modified frigates with civilian vessels. We termed such patterns "adversarial camouflage" (AC) since these patterns effectively camouflage the frigates with respect to the classifier. The type of adversarial attack described in our previous work is a so-called white box attack. This term describes adversarial attacks that are devised given full knowledge of the classifier under attack. This is as opposed to black box attacks, which describe attacks on unknown classifiers. In our context, the ultimate goal is to design a GAN that is capable of black box attacks, in other words: a GAN that will generate AC that is effective across a wide range of neural network classifiers. In the current work, we study techniques to improve the robustness of our GAN-based approach by investigating whether a GAN can be trained to fool a selection of several neural network-based classifiers, or reduce the confidence of the classifications to a degree which makes them unreliable. Our results indicate that it is indeed possible to weaken a wider range of neural network classifiers by training the generator on several classifiers.
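A minimal sketch of the ensemble idea described above: the generator's adversarial loss is summed over several classifiers so the generated pattern must mislead all of them. The overlay function, the target "civilian" label, and the patch location are placeholder assumptions for illustration.

import torch
import torch.nn.functional as F

def overlay_pattern(images, pattern, top=60, left=60):
    # Placeholder overlay: paste the generated pattern onto a fixed hull region.
    out = images.clone()
    ph, pw = pattern.shape[2], pattern.shape[3]
    out[:, :, top:top + ph, left:left + pw] = pattern
    return out

def ensemble_adversarial_loss(generator, classifiers, frigate_images, civilian_label=0):
    pattern = generator(frigate_images)                  # GAN-generated camouflage pattern
    camouflaged = overlay_pattern(frigate_images, pattern)
    target = torch.full((frigate_images.size(0),), civilian_label, dtype=torch.long)
    loss = torch.zeros(())
    for clf in classifiers:                              # several independently trained CNNs
        loss = loss + F.cross_entropy(clf(camouflaged), target)  # push each towards "civilian"
    return loss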
In recent years, more and more video surveillance cameras are being used both in military and civilian applications. This trend results in large amounts of available image and video footage. An effective manual search and evaluation of this data is difficult due to the large data volume and limited human attention span. This is why automatic algorithms are required to aid in data analysis. A key task in this context is search for persons of interest, i.e., person re-identification. Based on a query image, re-identification methods retrieve further occurrences of the depicted person in large data volumes. The prevailing success of convolutional neural networks (CNNs) in computer vision did not spare person re-identification and has recently led to significant improvements. Current state-of-the-art approaches mostly rely on features extracted from CNNs trained with person images and corresponding identity labels. However, person re-identification still remains a challenging problem due to many task-specific influences such as occlusions, incomplete body parts, background clutter, varying camera perspectives, and pose variation. Unlike conventional CNN features, descriptive person attributes represent higher-level semantic information that is more robust to many of these influences. Therefore, person re-identification can be improved by integrating attributes into the algorithms. In this work we investigate approaches for attribute-based person re-identification using deep learning methods with the goal of developing efficient models with the best possible re-identification accuracy. We show that best practices in person re-identification approaches can be transferred to the task of pedestrian attribute recognition to achieve strong baseline results for both tasks. Moreover, we show that leveraging information about semantic clothing and body regions during training of the networks improves the results further. Finally, we combine pedestrian attribute recognition and person re-identification models in a multi-task architecture to build our attribute-based person re-identification approach. We develop our attribute model on the large RAP dataset, which currently offers the largest available number of persons and attributes and thus allows for a differentiated analysis. The final combined attribute and re-identification model is trained on the Market-1501 dataset, which provides person identities and attribute annotations simultaneously. Our results show that baseline re-identification results are surpassed, thus indicating that complementary information from the two different tasks is leveraged.
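A minimal sketch of a multi-task architecture of the kind described above, with a shared backbone feeding an identity-classification head and a multi-label attribute head; the backbone choice, head sizes, and loss weighting are assumptions for illustration only.

import torch
import torch.nn as nn
import torchvision

class AttributeReidNet(nn.Module):
    def __init__(self, num_identities, num_attributes):
        super().__init__()
        backbone = torchvision.models.resnet50(pretrained=True)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # global pooled features
        self.id_head = nn.Linear(2048, num_identities)      # identity classification (re-ID)
        self.attr_head = nn.Linear(2048, num_attributes)     # multi-label attribute logits

    def forward(self, x):
        f = self.features(x).flatten(1)
        return self.id_head(f), self.attr_head(f), f          # f doubles as the re-ID descriptor

# Joint training loss (equal weighting is an assumption):
# loss = F.cross_entropy(id_logits, id_labels) \
#      + F.binary_cross_entropy_with_logits(attr_logits, attr_labels)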
This paper considers the linear quadratic regulator (LQR) optimal control problem of multi-agent unmanned vehicle systems under communication constraints with packet drops. The problem is formulated as a distributed optimization problem of minimizing a global cost function through the sum of local cost functions by using local information exchange. By utilizing a newly developed optimization technique, we propose a novel algorithm to solve the distributed LQR problem in a first-order (gradient-descent-based) manner. Moreover, we adopt the key idea of virtualizing an extra node for each agent to store information from the previous step and create a fully distributed optimization algorithm. Extensive simulations demonstrate the efficacy and robustness of the proposed solution.
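In hedged form, the distributed formulation can be summarized as a global quadratic cost decomposed into local costs, which each agent minimizes with first-order (gradient) steps using only locally exchanged information; the notation below is generic and not necessarily the paper's exact formulation.

\[ J(u) = \sum_{i=1}^{N} J_i(u_i), \qquad J_i(u_i) = \sum_{t=0}^{T} \big( x_i(t)^{\top} Q_i\, x_i(t) + u_i(t)^{\top} R_i\, u_i(t) \big), \]
\[ \theta_i^{(k+1)} = \theta_i^{(k)} - \alpha \, \nabla_{\theta_i} J_i\big(\theta_i^{(k)}\big), \]

where \(\theta_i\) parameterizes agent \(i\)'s feedback policy \(u_i(t) = -K(\theta_i)\, x_i(t)\) and \(\alpha\) is the step size; the extra "virtual node" per agent stores the previous iterate so the update remains fully distributed under packet drops.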
A crucial technology in modern smart manufacturing is the human-robot collaboration (HRC) concept. In HRC, operators and robots unite and collaborate to perform complex tasks in a variety of scenarios and under heterogeneous and dynamic conditions. A unique role in the implementation of the HRC model, as a means of sensing, is assigned to machine vision systems. They provide the acquisition and processing of visual information about the environment, the analysis of images of the working area, the transfer of this information to the control system, and decision-making within the framework of the task. Thus, the task of recognizing the actions of a human operator for the development of a robot control system, in order to implement an effective HRC system, becomes relevant. The operator commands given to the robot can take a variety of forms, from simple and concrete to quite abstract. This introduces several difficulties for the implementation of automated recognition systems in real conditions: a heterogeneous background, an uncontrolled work environment, irregular lighting, etc. In this article, we present an algorithm for constructing a video descriptor and solve the problem of classifying a set of actions into predefined classes. The proposed algorithm is based on capturing three-dimensional sub-volumes located inside a video-sequence patch and calculating the difference in intensities between these sub-volumes. Video patches and the central coordinates of the sub-volumes are built on the principle of VLBP. Such a representation of three-dimensional blocks (patches) of a video sequence, by capturing sub-volumes inside each patch at several scales and orientations, leads to an informative description of the scene and the actions taking place in it. Experimental results showed the effectiveness of the proposed algorithm on known datasets.
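A minimal NumPy sketch of the core operation described above: sampling three-dimensional sub-volumes inside a video patch and thresholding their mean intensities against the central sub-volume, in the spirit of VLBP; the offsets and sub-volume size are illustrative, not the paper's parameters.

import numpy as np

def subvolume_code(patch, size=3,
                   offsets=((0, 0, 4), (0, 4, 0), (4, 0, 0),
                            (0, 0, -4), (0, -4, 0), (-4, 0, 0))):
    # patch: (T, H, W) grayscale video patch; returns an integer code for the patch center.
    t, h, w = (d // 2 for d in patch.shape)

    def mean_vol(ct, cy, cx):
        r = size // 2
        return patch[ct - r:ct + r + 1, cy - r:cy + r + 1, cx - r:cx + r + 1].mean()

    center = mean_vol(t, h, w)
    bits = [1 if mean_vol(t + dt, h + dy, w + dx) >= center else 0
            for dt, dy, dx in offsets]
    return int("".join(map(str, bits)), 2)   # binary pattern describing local spatio-temporal structure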
The article discusses a heterogeneous processor based on an open-source 64-bit core of the RISC-V architecture combined with a reconfigurable neural network accelerator. The implementation of a binary matrix neural network on an FPGA and its combination with the RISC-V RV64GC core are investigated for tasks in cognitive robotics and industrial production, in order to increase safety in the interaction of a robot and a person.
Full motion video electro-optical/infrared (EO/IR) sensors are now ubiquitous in civil and defence applications. Modern intelligence, surveillance, and reconnaissance (ISR) platforms provide live video feeds containing critical surveillance/observations for applications such as search and rescue, peace support, disaster relief, and port security. A crucial form of intelligence extracted from video is patterns of life, which vary drastically over space and time. Detecting anomalous behaviour within normal patterns of life in near-real time is an important task because behaviour anomalies correlate with significant, actionable information. Many anomalous behaviours cannot be specified in advance, a problem that can be addressed using unsupervised, deep-learning algorithms to identify behaviour anomalies in the video stream. We propose an operator-in-the-loop algorithmic approach that uses the latest advances in deep learning to learn patterns of life and inform operators of behaviour anomalies. Our approach uses a pre-trained object detector (RetinaNet) to identify objects within each frame paired with an object tracker based on a discriminative correlation filter. After building up a dataset of tracks, we use a combination of clustering techniques and a convolutional autoencoder to build a baseline of patterns of life for different object types. We demonstrate the ability of the autoencoder to reliably reconstruct raw object tracks from a latent space and show that at inference time tracks with larger than average loss correlate with anomalous behaviour. We demonstrate the capabilities of our approach by staging an anomalous event in front of security system cameras and compare the extracted track behaviours to normal patterns of life for that area.
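A minimal sketch of the anomaly-scoring step described above: an autoencoder is applied to object tracks and tracks whose reconstruction loss is well above the baseline are flagged. The paper uses a convolutional autoencoder; the small dense model, fixed track length, and threshold rule below are simplifying assumptions.

import torch
import torch.nn as nn

class TrackAutoencoder(nn.Module):
    # Simplified dense autoencoder over fixed-length (x, y) tracks, standing in for the
    # convolutional autoencoder, only to show the scoring logic.
    def __init__(self, track_len=64, latent=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(track_len * 2, 64), nn.ReLU(), nn.Linear(64, latent))
        self.decoder = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(), nn.Linear(64, track_len * 2))

    def forward(self, tracks):                   # tracks: (N, track_len * 2)
        return self.decoder(self.encoder(tracks))

def anomaly_scores(model, tracks):
    with torch.no_grad():
        recon = model(tracks)
    return ((recon - tracks) ** 2).mean(dim=1)   # per-track reconstruction error

# Flag tracks whose error is well above the "pattern of life" baseline, e.g.:
# threshold = baseline_scores.mean() + 3 * baseline_scores.std()
# anomalies = anomaly_scores(model, new_tracks) > threshold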
Leveraging the power of deep neural networks, single-person pose estimation has made substantial progress throughout the last years. More recently, multi-person pose estimation has also become of growing importance, mainly driven by the high demand for reliable video surveillance systems in public security. To keep up with these demands, certain efforts have been made to improve the performance of such systems, which is still limited by the insufficient amount of available training data. This work addresses this lack of labeled data: by diminishing the often faced problem of domain shift between synthetic images from computer game graphics engines and real world data, annotated training data shall be provided at zero labeling cost. To this end, generative adversarial networks are applied as a domain adaption framework, adapting the data of a novel synthetic pose estimation dataset to several real world target domains. State-of-the-art domain adaption methods are extended to meet the important requirement of exact content preservation between synthetic and adapted images. Experiments conducted subsequently indicate the improved suitability of the adapted data, as human pose estimators trained on this data outperform those which are trained on purely synthetic images.
Currently, modern achievements in the field of deep learning are increasingly being applied in practice. One practical use of deep learning is detecting cracks on the surface of the roadway. The deterioration of the roadway is the result of various factors: for example, the use of low-quality material, non-compliance with asphalt-laying standards, external physical impact, etc. Detecting this damage automatically with high speed and accuracy is an important and complex task. An effective solution to this problem can reduce the time spent by the services that carry out damage detection and also increase the safety of road users. The main challenge for automatically detecting such damage, in most cases, is the complex structure of the roadway. To accurately detect this damage, we use a U-Net. We then improve the binary map of localized cracks produced by the U-Net using morphological filtering. This solution localizes cracks with higher accuracy in comparison with traditional crack-detection methods, as well as modern deep learning methods. All experiments were performed using the publicly available CRACK500 dataset with examples of cracks and their binary maps.
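A minimal sketch of morphological post-processing of the kind applied to the U-Net output, using OpenCV; the kernel size and minimum component area are illustrative choices, not the paper's settings.

import cv2
import numpy as np

def refine_crack_map(binary_map, close_kernel=(5, 5), min_area=50):
    # binary_map: uint8 mask from the U-Net (0 = background, 255 = crack).
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, close_kernel)
    closed = cv2.morphologyEx(binary_map, cv2.MORPH_CLOSE, kernel)   # bridge small gaps in cracks
    n, labels, stats, _ = cv2.connectedComponentsWithStats(closed, connectivity=8)
    refined = np.zeros_like(closed)
    for i in range(1, n):                                            # label 0 is background
        if stats[i, cv2.CC_STAT_AREA] >= min_area:                   # drop speckle noise
            refined[labels == i] = 255
    return refined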
Image enhancement refers to processing images to make them more suitable for display or further image analysis. Image enhancement is thus a problem-oriented procedure for improving the visual appearance of a remote sensing image. In this paper, we adopt the quaternion framework for representing color satellite images, since it offers scope to process color images holistically, rather than as separate color-space components, and thereby handles the coupling between the color channels. We present a new image enhancement algorithm based on multi-scale block-rooting processing. The basic idea is to apply the frequency-domain image enhancement approach to image blocks of different sizes. We also use a no-reference quality measure to choose the parameters of the presented algorithm optimally. Computer simulations on a remote sensing image dataset show that the new enhancement algorithm exhibits better results compared to state-of-the-art techniques.
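A minimal sketch of frequency-domain block-rooting (alpha-rooting) with NumPy; the quaternion representation used in the paper is replaced here by per-channel processing for brevity, and the block size and alpha value are placeholders that would in practice be chosen by the no-reference quality measure.

import numpy as np

def block_rooting(channel, block=64, alpha=0.9):
    # channel: 2-D float array; enhance each block by raising FFT magnitudes to the power alpha.
    out = np.zeros_like(channel, dtype=np.float64)
    h, w = channel.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = channel[y:y + block, x:x + block]
            spectrum = np.fft.fft2(tile)
            magnitude, phase = np.abs(spectrum), np.angle(spectrum)
            enhanced = (magnitude ** alpha) * np.exp(1j * phase)   # alpha-rooting, phase preserved
            out[y:y + block, x:x + block] = np.real(np.fft.ifft2(enhanced))
    return np.clip(out, 0, 255)

# Multi-scale variant: repeat for several block sizes and fuse the results, selecting
# alpha and block size by maximizing the no-reference quality measure.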
We propose the so-called Bundled Transfer Learning approach with multiple large architectures for automated detection of Coronavirus using X-Ray and CT images. Extracted features from each pretrained model in the bundle are gathered for training new deep layers. The dataset includes CT scan images of COVID-19 and normal classes and X-Ray images of COVID-19, Pneumonia, and normal classes.
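A minimal sketch of the bundling idea: features from several frozen pretrained backbones are extracted, concatenated, and used to train new dense layers; the specific backbones and layer sizes below are assumptions, not necessarily the architectures bundled in the paper.

import torch
import torch.nn as nn
import torchvision

class BundledTransferModel(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        resnet = torchvision.models.resnet50(pretrained=True)
        densenet = torchvision.models.densenet121(pretrained=True)
        self.branch_a = nn.Sequential(*list(resnet.children())[:-1])        # -> (N, 2048, 1, 1)
        self.branch_b = nn.Sequential(densenet.features,
                                      nn.ReLU(), nn.AdaptiveAvgPool2d(1))   # -> (N, 1024, 1, 1)
        for p in list(self.branch_a.parameters()) + list(self.branch_b.parameters()):
            p.requires_grad = False                                         # freeze the bundle
        self.head = nn.Sequential(nn.Linear(2048 + 1024, 256), nn.ReLU(),
                                  nn.Linear(256, num_classes))              # new trainable layers

    def forward(self, x):
        f = torch.cat([self.branch_a(x).flatten(1), self.branch_b(x).flatten(1)], dim=1)
        return self.head(f)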
Modeling has two goals: interpretation, which is to extract information about how the response variables are associated with the input variables, and prediction, which is to predict what the responses are going to be. The dilemma is that interpretable algorithms such as linear regression or logistic regression are often not accurate for prediction, while complex algorithms with better predictive accuracy are not easy to interpret [1]. Risk can take the form of cyber-security risk, credit risk, investment risk, operational risk, etc. In this paper, we propose an interpretable method for evaluating risk using deep learning.