Maritime surveillance is crucial for ensuring compliance with regulations and protecting critical maritime infrastructure. Conventional tracking systems, such as AIS or LRIT, are susceptible to manipulation because they can be switched off or altered. To address this vulnerability, there is a growing need for visual monitoring by unmanned systems such as unmanned aerial vehicles (UAVs) and unmanned surface vehicles (USVs). Equipped with sensors and cameras, these unmanned vehicles collect vast amounts of data that often demand time-consuming manual processing. This study presents a robust method for automatic target vessel re-identification from RGB imagery captured by unmanned vehicles. Our approach combines visual appearance with textual data recognized in the acquired images to enhance the accuracy of target vessel identification and authentication against a known vessel database. We achieve this by using Convolutional Neural Network (CNN) embeddings and Optical Character Recognition (OCR) data extracted from the vessel images. This multi-modal approach overcomes the limitations of methods relying solely on visual or textual information. The proposed prototype was evaluated on two distinct datasets. The first dataset contains small vessels without textual data and serves to test how well the CNN model, fine-tuned with a triplet loss function, identifies target vessels. The second dataset comprises medium and large-sized vessels under challenging conditions, highlighting the advantage of fusing OCR data with CNN embeddings. The results demonstrate the feasibility of a computer vision model that combines OCR data with CNN embeddings for target vessel identification, resulting in significantly enhanced robustness and classification accuracy. The proposed methodology holds promise for advancing the capabilities of autonomous visual monitoring systems deployed by unmanned vehicles, offering a resilient and effective solution for maritime surveillance.
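The abstract does not specify how the CNN embeddings and the OCR text are fused, so the following is only a minimal sketch under stated assumptions: a standard triplet loss on Euclidean distances, and a weighted sum of cosine similarity and string similarity as the fusion rule. The function names, the margin, the weight `w_text` and the database layout are all hypothetical and are not taken from the paper.

```python
from difflib import SequenceMatcher

import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the positive closer than the negative by a margin."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def text_similarity(ocr_text, db_name):
    """Fuzzy string match in [0, 1]; tolerant of single-character OCR errors."""
    return SequenceMatcher(None, ocr_text.upper(), db_name.upper()).ratio()


def fused_score(query_emb, query_text, entry_emb, entry_name, w_text=0.5):
    """Assumed fusion rule: weighted sum of visual and textual similarity."""
    visual = cosine_similarity(query_emb, entry_emb)
    if not query_text:  # no readable text: fall back to appearance only
        return visual
    textual = text_similarity(query_text, entry_name)
    return (1.0 - w_text) * visual + w_text * textual


def identify(query_emb, query_text, database, w_text=0.5):
    """Return the name of the best-scoring entry in a (hypothetical) vessel database."""
    best = max(
        database,
        key=lambda e: fused_score(query_emb, query_text, e["emb"], e["name"], w_text),
    )
    return best["name"]


# Toy database with made-up embeddings and vessel names.
database = [
    {"name": "AURORA", "emb": np.array([1.0, 0.0])},
    {"name": "BOREALIS", "emb": np.array([0.9, 0.436])},
]
query_emb = np.array([1.0, 0.2])
print(identify(query_emb, "", database))          # appearance only
print(identify(query_emb, "B0REALIS", database))  # appearance plus noisy OCR text
```

In this toy case the appearance-only query is slightly closer to the first entry, while the misread hull text still matches the second entry strongly enough to change the decision, which is the kind of situation where fusing the two modalities can help.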
Recent improvements in magnetic resonance image (MRI) reconstruction from partial data have been reported using spatial context modelling with Markov random field (MRF) priors. However, these algorithms have been developed only for magnitude images from single-coil measurements. In practice, most MRI images today are acquired using multi-coil data. In this paper, we extend our recent approach for MRI reconstruction with MRF priors to deal with multi-coil data, i.e., to make it applicable in parallel MRI (pMRI) settings. Instead of reconstructing images from different coils independently and subsequently combining them into the final image, we recover the MRI image by jointly processing the undersampled measurements from all coils together with their estimated sensitivity maps. The proposed method incorporates a Bayesian formulation of the spatial context into the reconstruction problem. To solve the resulting problem, we derive an efficient algorithm based on the alternating direction method of multipliers (ADMM). Experimental results demonstrate the effectiveness of the proposed approach in comparison to some well-adopted methods for accelerated pMRI reconstruction from undersampled data.
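To illustrate what processing all coils jointly with their sensitivity maps can mean in practice, the sketch below implements a commonly used multi-coil measurement model: each coil observes the image weighted by its sensitivity map, Fourier transformed and undersampled. This is an assumption about the data model only; it does not include the MRF prior or the ADMM solver of the paper, and the names `forward`, `adjoint`, `sens` and `mask` are hypothetical.

```python
import numpy as np


def forward(x, sens, mask):
    """Multi-coil forward operator: image (H, W) -> undersampled k-space (C, H, W).

    sens: complex coil sensitivity maps, shape (C, H, W)
    mask: boolean sampling mask in k-space, shape (H, W), shared by all coils
    """
    return mask * np.fft.fft2(sens * x, norm="ortho")


def adjoint(y, sens, mask):
    """Adjoint operator: k-space (C, H, W) -> coil-combined image (H, W)."""
    coil_images = np.fft.ifft2(mask * y, norm="ortho")
    return np.sum(np.conj(sens) * coil_images, axis=0)


# Random test data standing in for an image, four coils and a sampling pattern.
rng = np.random.default_rng(0)
H, W, C = 16, 16, 4
x = rng.standard_normal((H, W)) + 1j * rng.standard_normal((H, W))
sens = rng.standard_normal((C, H, W)) + 1j * rng.standard_normal((C, H, W))
mask = rng.random((H, W)) < 0.3
y = rng.standard_normal((C, H, W)) + 1j * rng.standard_normal((C, H, W))

# Adjoint consistency check: <A x, y> must equal <x, A^H y>.
lhs = np.vdot(forward(x, sens, mask), y)
rhs = np.vdot(x, adjoint(y, sens, mask))
print(abs(lhs - rhs))
```

An iterative solver such as ADMM would typically call a forward/adjoint pair of this kind in its data-fidelity step, so checking that the two operators are consistent is a useful first test.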
While many existing CT noise filtering post-processing techniques optimize mean squared error (MSE)-based quality metrics, it is well known that the MSE is generally not related to the diagnostic quality of CT images. In medical image quality assessment, model observers (MOs) have been proposed for predicting diagnostic quality in medical images. MOs optimize a task-based quality criterion such as lesion or tumor detection performance. In this paper, we first discuss some of the non-stationary properties of CT noise. These properties will be utilized to construct a multi-directional non-stationary noise model that can be used by MOs. Next, we investigate a new shearlet-based denoising scheme that optimizes a task-based image quality metric for CT background noise. This work makes a connection between multi-resolution sparsity-based denoising techniques on the one hand and model observers on the other hand. The main advantage is that this approach avoids the two-step procedure of MSE-optimized denoising followed by a MO-based quality evaluation (often with contradictory quality goals), while instead optimizing the desired task-based image quality directly. Experimental results are given to illustrate the benefits of the proposed approach.
In this paper, we first briefly review the directional properties of the dual-tree complex wavelet transform and investigate how the directional selectivity of the transform can be increased (i.e., to obtain more than 6 orientations per scale). To this end, we describe a new augmented Lagrangian optimization algorithm that jointly performs the 2D spectral factorization of a set of 2D directional filters with high numerical accuracy. We demonstrate how this approach can be used to design compactly supported shearlet frames that are tight. Finally, a number of experimental results are given to show the merits of the resulting shearlet frames.
In digital cameras and mobile phones, there is an ongoing trend to increase the image resolution, decrease the sensor size
and to use lower exposure times. Because smaller sensors inherently lead to more noise and a worse spatial resolution,
digital post-processing techniques are required to resolve many of the artifacts. Color filter arrays (CFAs), which use
alternating patterns of color filters, are very popular for reasons of price and power consumption. However, color
filter arrays require the use of a post-processing technique such as demosaicing to recover full resolution RGB images.
Recently, there has been some interest in techniques that jointly perform the demosaicing and denoising. This has the
advantage that the demosaicing and denoising can be performed optimally (e.g. in the MSE sense) for the considered noise
model, while avoiding artifacts introduced when using demosaicing and denoising sequentially.
In this paper, we will continue the research line of the wavelet-based demosaicing techniques. These approaches are
computationally simple and well suited for combination with denoising. Therefore, we will derive Bayesian minimum
mean squared error (MMSE) joint demosaicing and denoising rules in the complex wavelet packet domain, taking local adaptivity
into account. As an image model, we will use Gaussian Scale Mixtures, thereby taking advantage of the directionality
of the complex wavelets. Our results show that this technique is capable of reconstructing fine details in the image,
while removing all of the noise, at a relatively low computational cost. In particular, the complete reconstruction (including
color correction, white balancing, etc.) of a 12 megapixel RAW image takes 3.5 s on a recent mid-range GPU.
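To make the CFA sampling concrete, the sketch below shows how a color filter array reduces a full RGB image to one color sample per pixel, which is the input that demosaicing has to invert. It assumes an RGGB Bayer layout; the abstract does not name a specific pattern, and the function name `bayer_mosaic` is hypothetical.

```python
import numpy as np


def bayer_mosaic(rgb):
    """Sample an (H, W, 3) RGB image with an RGGB Bayer pattern (assumed layout).

    Returns an (H, W) array holding a single color sample per pixel.
    """
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red on even rows, even columns
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green on even rows, odd columns
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green on odd rows, even columns
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue on odd rows, odd columns
    return mosaic


# Small synthetic image: every channel value is unique, so sampling is easy to trace.
rgb = np.arange(4 * 4 * 3).reshape(4, 4, 3)
mosaic = bayer_mosaic(rgb)
print(mosaic.shape)  # one sample per pixel instead of three
```

Two out of three color samples are discarded at every pixel, and sensor noise is added to the remaining one, which is why treating demosaicing and denoising as one estimation problem is attractive.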
This work explores the potential of structure encoding in sparse tomographic reconstructions. We encode
spatial structure with Markov Random Field (MRF) models and employ it within Magnetic Resonance Imaging
(MRI) and Quantitative Microwave Tomography. We thereby also illustrate different ways of MRF modelling:
as a discrete, binary field imposed on hidden labels and as a continuous model imposed on the observable field.
In the case of MRI, the analyzed approach is a straightforward extension of sparse MRI methods and is related
to the so-called LaMP (Lattice Matching Pursuit) algorithm, but with a number of differences. In the case of
Microwave Tomography, we give another interpretation of structured sparsity using a quite different, but also
effective, approach. Thorough experiments demonstrate clear advantages of MRF-based structure encoding in
both cases and strongly motivate further development.
In recent years, there has been a lot of interest in multiresolution representations that also perform a multidirectional analysis.
These representations often yield very sparse representations of multidimensional data. The shearlet representation,
which has been derived within the framework of composite wavelets, can be extended quite trivially from 2D to 3D.
However, the extension to 3D is not unique and consequently there are different implementations possible for the discrete
transform. In this paper, we investigate the properties of two relevant designs having different 3D frequency tilings. We
show that the first design has a redundancy factor of around 7, while in the second design the transform can attain a redundancy
factor around 3.5, independent of the number of analysis directions. Due to the low redundancy, the 3D shearlet
transform becomes a viable alternative to the 3D curvelet transform. Experimental results are provided to support these
findings.
The shearlet transform is a recent sibling in the family of geometric image representations that provides a traditional
multiresolution analysis combined with a multidirectional analysis. In this paper, we present a fast DFT-based analysis
and synthesis scheme for the 2D discrete shearlet transform. Our scheme conforms to the continuous shearlet theory to
a high extent, provides perfect numerical reconstruction (up to floating point rounding errors) in a non-iterative scheme
and is highly suitable for parallel implementation (e.g. FPGA, GPU). We show that our discrete shearlet representation
is also a tight frame and the redundancy factor of the transform is around 2.6, independent of the number of analysis
directions. Experimental denoising results indicate that the transform performs as well as or even better than several related
multiresolution transforms, while having a significantly lower redundancy factor.
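The notions of a DFT-based analysis and synthesis scheme, a tight frame and a redundancy factor can be illustrated with a much simpler filter bank than the shearlet transform itself. The sketch below is not the shearlet design from the paper: it assumes two isotropic frequency-domain filters (a hypothetical cosine/sine pair) whose squared magnitudes sum to one, which is the condition that makes an undecimated filter bank a tight frame with non-iterative perfect reconstruction.

```python
import numpy as np


def make_filters(shape):
    """Two real frequency-domain filters with low**2 + high**2 == 1 everywhere."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    r = np.clip(np.sqrt(fx**2 + fy**2) / 0.5, 0.0, 1.0)  # normalized radius in [0, 1]
    low = np.cos(0.5 * np.pi * r)
    high = np.sin(0.5 * np.pi * r)
    return np.stack([low, high])


def analysis(image, filters):
    """Filter in the DFT domain; returns one full-size subband per filter."""
    return np.fft.ifft2(filters * np.fft.fft2(image))


def synthesis(coeffs, filters):
    """Non-iterative reconstruction: apply the conjugate filters and sum the bands."""
    spectrum = np.sum(np.conj(filters) * np.fft.fft2(coeffs), axis=0)
    return np.real(np.fft.ifft2(spectrum))


rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
filters = make_filters(image.shape)
coeffs = analysis(image, filters)

reconstruction_error = np.max(np.abs(synthesis(coeffs, filters) - image))
energy_ratio = np.sum(np.abs(coeffs) ** 2) / np.sum(image**2)  # equals 1 for a tight frame
redundancy = coeffs.size / image.size                           # number of coefficients per pixel
print(reconstruction_error, energy_ratio, redundancy)
```

Because this toy filter bank does not subsample its two subbands, its redundancy factor is exactly 2. A lower redundancy for a transform with many more subbands, as reported in the abstract, requires subsampling the subbands, which this sketch does not attempt.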
A new approach for wavelet-based demosaicing of color filter array (CFA) images is presented. It is observed that
conventional wavelet-based demosaicing results in demosaicing artifacts in high spatial frequency regions of the
image. By introducing a framework of locally adaptive demosaicing in the wavelet domain, the presented method
offers computationally simple techniques to avoid these artifacts. In order to reduce computation time and
memory requirements even more, we propose the use of the dual-tree complex wavelet transform. The results
show that wavelet-based demosaicing, using the proposed locally adaptive framework, is visually comparable
with state-of-the-art pixel-based demosaicing. This result is very promising when considering a low complexity
wavelet-based demosaicing and denoising approach.