Recent advances in hyperspectral imaging sensors allow the acquisition of images of a scene at hundreds of contiguous narrow spectral bands. Target detection algorithms try to exploit this high-resolution spectral information to detect target materials present in a scene, but this process may be computationally intensive due to the large data volumes generated by the hyperspectral sensors, typically hundreds of megabytes. Previous works have shown that hyperspectral data processing can significantly benefit from the parallel computing resources of graphics processing units (GPUs), due to their highly parallel structure and the high computational capabilities that can be achieved at relatively low cost. We studied the parallel implementation of three target detection algorithms (RX algorithm, matched filter, and adaptive matched subspace detector) for hyperspectral images in order to identify the aspects in the structure of these algorithms that can exploit the CUDA™ architecture of NVIDIA® GPUs. A data set was generated using a SOC-700 hyperspectral imager to evaluate the performance and detection accuracy of the parallel implementations on a NVIDIA® Tesla™ C1060 graphics card, achieving real-time performance in the GPU implementations based on global statistics.
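For reference, the RX detector mentioned above scores each pixel by its Mahalanobis distance to the background; a standard formulation (stated here for context, not quoted from the paper) is

\[
\delta_{\mathrm{RX}}(\mathbf{x}) = (\mathbf{x}-\boldsymbol{\mu})^{\mathsf{T}}\,\boldsymbol{\Sigma}^{-1}\,(\mathbf{x}-\boldsymbol{\mu}),
\]

where \(\boldsymbol{\mu}\) and \(\boldsymbol{\Sigma}\) are the global mean vector and covariance matrix of the image, and a pixel is flagged as anomalous when \(\delta_{\mathrm{RX}}(\mathbf{x})\) exceeds a threshold. Because \(\boldsymbol{\mu}\) and \(\boldsymbol{\Sigma}\) are computed once for the whole image, the per-pixel evaluation is embarrassingly parallel, which is what the global-statistics GPU implementations exploit.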
The increasing volume of data produced by hyperspectral image sensors has forced researchers and developers to seek out new and more efficient ways of analyzing the data as quickly as possible. Medical, scientific, and military applications impose demanding performance requirements on tools that operate on hyperspectral sensor data. By providing a hyperspectral image analysis library, we aim to accelerate hyperspectral image application development. The development of Libdect, a cross-platform library with GPU support for hyperspectral image analysis, is presented.
Coupling library development with efficient hyperspectral algorithms escalates into a significant time investment in many projects or prototypes. Given a solution to these issues, developers can implement hyperspectral image analysis applications in less time, without having to focus on implementing target detection code or on potential issues related to platform or GPU architecture differences.
Libdect's development team builds on previously implemented detection algorithms. By utilizing proven tools, such as CMake and CTest, to develop Libdect's infrastructure, we were able to develop and test a prototype library that provides target detection code with GPU support on Linux platforms. As a whole, Libdect is an early prototype of an open and documented example of software engineering practices and tools, put together in an effort to increase developer productivity and to draw new developers into the field of hyperspectral image application development.
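To suggest what "implementing analysis applications in less time" might look like in practice, the sketch below shows a hypothetical host program linking against such a library; every libdect_* name here is an invented placeholder, since Libdect's real API is not reproduced in this text.

    /* Hypothetical usage sketch: all libdect_* names are invented
       placeholders, not Libdect's documented API. */
    #include <stdio.h>
    #include <stdlib.h>

    /* Assumed declarations a library header might provide. */
    typedef struct ldt_cube ldt_cube;
    extern ldt_cube *libdect_load_cube(const char *path);
    extern float    *libdect_rx_detect(const ldt_cube *cube); /* one score per pixel */
    extern void      libdect_free_cube(ldt_cube *cube);

    int main(void)
    {
        ldt_cube *cube = libdect_load_cube("scene.hsi"); /* hypothetical loader */
        if (!cube) {
            fprintf(stderr, "failed to load cube\n");
            return EXIT_FAILURE;
        }

        /* GPU-backed detection hidden behind the library call, so the
           application never touches CUDA directly. */
        float *scores = libdect_rx_detect(cube);

        free(scores);
        libdect_free_cube(cube);
        return EXIT_SUCCESS;
    }

The point of the sketch is the division of labor: platform and GPU differences stay inside the library, and the application code shrinks to loading data and calling a detector.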
Hyperspectral sensors can collect hundreds of images taken at different narrow and contiguously spaced spectral
bands. This high-resolution spectral information can be used to identify materials and objects within the field
of view of the sensor by their spectral signature, but this process may be computationally intensive due to
the large data sizes generated by the hyperspectral sensors, typically hundreds of megabytes. This can be
an important limitation for some applications where the detection process must be performed in real time
(surveillance, explosive detection, etc.). In this work, we developed a parallel implementation of three state-of-the-art target detection algorithms (RX algorithm, matched filter, and adaptive matched subspace detector) using a graphics processing unit (GPU) based on the NVIDIA® CUDA™ architecture. In addition, a multi-core CPU-based implementation of each algorithm was developed as a baseline for estimating speedups. We
evaluated the performance of the GPU-based implementations using an NVIDIA® Tesla™ C1060 GPU card, and
the detection accuracy of the implemented algorithms was evaluated using a set of phantom images simulating
traces of different materials on clothing. We achieved a maximum speedup in the GPU implementations of
around 20x over a multicore CPU-based implementation, which suggests that applications for real-time detection
of targets in HSI can greatly benefit from the performance of GPUs as processing hardware.
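To illustrate why these detectors map well onto CUDA, the following sketch (our illustration, not the paper's code; all names are ours) assigns one thread per pixel to evaluate the RX statistic against a mean vector and inverse covariance matrix precomputed on the host:

    // Illustrative CUDA sketch, not the paper's implementation: one thread
    // per pixel evaluates (x - mu)^T * SigmaInv * (x - mu).
    __global__ void rx_kernel(const float *cube,     // numPixels x numBands, row-major
                              const float *mu,       // numBands
                              const float *sigmaInv, // numBands x numBands, row-major
                              float *scores,         // numPixels
                              int numPixels, int numBands)
    {
        int p = blockIdx.x * blockDim.x + threadIdx.x;
        if (p >= numPixels) return;

        const float *x = cube + (size_t)p * numBands;
        float score = 0.0f;
        for (int i = 0; i < numBands; ++i) {
            float acc = 0.0f;
            for (int j = 0; j < numBands; ++j)
                acc += sigmaInv[i * numBands + j] * (x[j] - mu[j]);
            score += (x[i] - mu[i]) * acc;
        }
        scores[p] = score;  // thresholding is left to the host
    }

A launch such as rx_kernel<<<(numPixels + 255) / 256, 256>>>(d_cube, d_mu, d_sigmaInv, d_scores, numPixels, numBands) gives each pixel its own thread; since no thread depends on another, the kernel scales directly with the number of CUDA cores, consistent with the kind of speedups reported above.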
Spectral unmixing of hyperspectral images is a process by which the constituent members of a pixel in a scene are determined and the fractional abundance of each element is estimated. Several algorithms have been developed in the past to obtain abundance estimates from hyperspectral data; however, most of them are computationally intensive and time consuming due to the magnitude of the data involved. In this research we present the use of Graphics Processing Units (GPUs) as a computing platform to reduce the computation time related to abundance estimation for hyperspectral images. Our implementation was developed in C using the NVIDIA® Compute Unified Device Architecture (CUDA™). The
recently introduced CUDA platform allows developers to directly use a GPU's processing power to perform
arbitrary mathematical computations. We describe our implementation of the Image Space Reconstruction
Algorithm (ISRA) and Expectation Maximization Maximum Likelihood (EMML) algorithm for abundance
estimation and present a performance comparison against implementations using C and Matlab. Results show
that the CUDA implementation ran around 10 times faster than the fastest implementation on the previous platforms.
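For context, both algorithms are multiplicative iterative schemes for estimating nonnegative abundances \(\mathbf{a}\) in the linear mixing model \(\mathbf{x} \approx S\,\mathbf{a}\); their update rules, as commonly stated in the unmixing literature (not quoted from this paper), are

\[
a_j^{(k+1)} = a_j^{(k)}\,\frac{(S^{\mathsf{T}}\mathbf{x})_j}{(S^{\mathsf{T}} S\,\mathbf{a}^{(k)})_j} \quad \text{(ISRA)}, \qquad
a_j^{(k+1)} = \frac{a_j^{(k)}}{\sum_i s_{ij}} \sum_i s_{ij}\,\frac{x_i}{(S\,\mathbf{a}^{(k)})_i} \quad \text{(EMML)},
\]

where \(\mathbf{x}\) is the observed pixel spectrum and the columns of \(S\) hold the endmember signatures. Each pixel's iteration is independent of every other pixel's, which is the property a per-pixel GPU parallelization exploits.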
The Image Space Reconstruction Algorithm (ISRA) has been used in hyperspectral imaging applications to
monitor changes in the environment and specifically, changes in coral reef, mangrove, and sand in coastal areas.
This algorithm is one of a set of iterative methods used in the hyperspectral imaging area to estimate abundance.
However, ISRA is computationally demanding, making it difficult to obtain results in a timely manner. We present the
use of specialized hardware in the implementation of this algorithm, specifically the use of VHDL and FPGAs
in order to reduce the execution time. The implementation of the ISRA algorithm has been divided into hardware
and software units. The hardware units were implemented on a Xilinx Virtex II Pro XC2VP30 FPGA and the
software was implemented on the Xilinx Microblaze soft processor. This case study illustrates the feasibility
of this alternative design for iterative hyperspectral imaging algorithms. The main bottleneck found in these implementations was data transfer. To reduce or eliminate this bottleneck, we introduced block RAMs (BRAMs) to buffer data and keep it readily available to the ISRA computation. The combination of DDR memory and BRAMs improved the speed of the implementation.
Results demonstrate that the C language implementation outperforms both FPGA implementations. Nevertheless, a detailed look at the results shows that the FPGA results are close to those of the C language implementation, with no significant difference in execution time between the two, and that the FPGA results could be further improved by adding memory capacity to the FPGA board.
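To make the software baseline concrete, here is a minimal C sketch of one ISRA update for a single pixel (our illustration of this kind of reference implementation; it is not the authors' benchmarked code):

    #include <stdlib.h>

    /* One ISRA update a <- a .* (S^T x) ./ (S^T S a) for a single pixel.
       Illustrative sketch, not the authors' benchmarked code.
       x: observed spectrum (bands), S: signature matrix (bands x ends,
       row-major), a: abundance vector (ends), updated in place. */
    void isra_step(const float *x, const float *S, float *a,
                   int bands, int ends)
    {
        float *Sa   = malloc((size_t)bands * sizeof *Sa);   /* model S a    */
        float *anew = malloc((size_t)ends  * sizeof *anew); /* next iterate */

        for (int i = 0; i < bands; ++i) {
            Sa[i] = 0.0f;
            for (int j = 0; j < ends; ++j)
                Sa[i] += S[i * ends + j] * a[j];
        }
        for (int j = 0; j < ends; ++j) {
            float num = 0.0f, den = 0.0f;
            for (int i = 0; i < bands; ++i) {
                num += S[i * ends + j] * x[i];   /* (S^T x)_j   */
                den += S[i * ends + j] * Sa[i];  /* (S^T S a)_j */
            }
            anew[j] = (den > 0.0f) ? a[j] * num / den : a[j];
        }
        for (int j = 0; j < ends; ++j) a[j] = anew[j];

        free(Sa);
        free(anew);
    }

In the hardware/software partitioning described above, it is exactly this inner loop's operands that benefit from being staged in BRAMs rather than fetched from DDR on every access.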
This work presents a new methodology for the formulation of discrete chirp Fourier transform (DCFT) algorithms, and it discusses performance measures pertaining to the mapping of these algorithms to hardware computational structures (HCS) as well as the extraction of chirp rate estimation parameters of multicomponent nonstationary signals arriving from point targets. The methodology centers on the use of Kronecker product algebra, a branch of finite-dimensional multilinear algebra, as a language for a canonical formulation of the DCFT algorithm and its associated properties. The methodology also explains how to search for variants of this canonical formulation that enhance the mapping process to a target HCS. The parameter extraction technique uses time-frequency properties of the DCFT in a modeled delay-Doppler synthetic aperture radar (SAR) remote sensing and surveillance environment to treat multicomponent return signals of prime length, with additive Gaussian noise as background clutter, and to extract the associated chirp rate parameters. The fusion of time-frequency information, acquired from transformed chirp or linear frequency modulated (FM) signals using the DCFT, with information obtained when the signals are treated using the discrete ambiguity function acting as point target response, point spread function, or impulse response, is used to further enhance the estimation process. For the case of very long signals, parallel algorithm implementations have been obtained on cluster computers. A theoretical computer performance analysis was conducted on the cluster implementation, based on a methodology that applies well-defined design-of-experiments methods to the identification of relations among different levels in the process of mapping computational operations to high-performance computing systems. The use of statistics to identify relationships among factors has formalized the search for solutions to the mapping problem, and this approach allows unbiased conclusions to be drawn about the results.
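For reference, one standard formulation of the DCFT of a length-\(N\) signal \(x(n)\) (the paper's exact normalization may differ) is

\[
X_c(k,\ell) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x(n)\, e^{-j\frac{2\pi}{N}\left(\ell n^{2} + k n\right)}, \qquad k,\ell = 0,1,\dots,N-1,
\]

where \(\ell\) indexes the chirp rate and \(k\) the frequency. For prime \(N\), the signal lengths treated above, a chirp whose rate matches \(\ell\) concentrates its energy in a single \((k,\ell)\) bin, which is what makes the transform usable for chirp rate estimation.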