KEYWORDS: LIDAR, Video, Super resolution, 3D applications, Detection and tracking algorithms, Image sensors, Image fusion, Detector arrays, Deep learning, 3D modeling
We present a statistical model for the multiscale super-resolution of complex 3D single-photon LiDAR scenes while providing uncertainty measures about the depth and reflectivity parameters. We then propose a generalization of this model by unrolling its iterations into a new deep learning architecture which requires a reduced number of trainable parameters and provides rich information about the estimates, including uncertainty measures. The proposed algorithms will be demonstrated on two specific applications: micro-scanning with a 32 × 32 time-of-flight detector array, and sensor fusion for high-resolution kilometer-range 3D imaging. Results show that the proposed algorithms significantly enhance the image quality.
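As a rough illustration of the unrolling idea, the sketch below (a minimal PyTorch example with assumed operators and layer sizes, not the authors' architecture) turns each iteration of an iterative super-resolution estimator into a network layer, so only a handful of parameters are trainable per layer:

```python
# Minimal sketch of algorithm unrolling: each layer performs one gradient step on a
# data-fidelity term followed by a learned regularisation step. All operators and
# sizes here are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn

class UnrolledLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(0.1))   # learned gradient step size
        self.lam = nn.Parameter(torch.tensor(0.05))   # learned prior weight
        self.prior = nn.Conv2d(1, 1, 3, padding=1)    # small learned denoising prior

    def forward(self, x, y, downsample, upsample):
        # Gradient step on ||downsample(x) - y||^2, then a learned correction.
        grad = upsample(downsample(x) - y)
        x = x - self.step * grad
        return x - self.lam * self.prior(x)

class UnrolledSR(nn.Module):
    def __init__(self, n_iters=8, scale=4):
        super().__init__()
        self.layers = nn.ModuleList(UnrolledLayer() for _ in range(n_iters))
        self.down = nn.AvgPool2d(scale)
        self.up = nn.Upsample(scale_factor=scale, mode='bilinear')

    def forward(self, y):
        x = self.up(y)                      # coarse initial high-resolution estimate
        for layer in self.layers:
            x = layer(x, y, self.down, self.up)
        return x
```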
Human monitoring using mmWave radar has recently become an area of significant research. The properties of radars make them uniquely suited to imaging in adverse regimes, such as through atmospheric obscurants or optically opaque media such as housing materials. The privacy-preserving nature of their data also allows radars to perform area monitoring and surveillance roles with a reduced public impact. However, direct inference of human pose from radar data is challenging due to the relatively low transverse resolution of radar data. In this work we present a convolutional neural network (CNN) capable of converting data from a commercially available frequency-modulated continuous-wave (FMCW) radar into human-interpretable pose information. We employ a novel experimental configuration in which we combine a marker-free motion capture suit with a single line-sensing radar in an elevated position. We experimentally verify the ability of our system to reconstruct human pose and report average errors below 3 cm.
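For illustration, a minimal PyTorch sketch of the general idea is given below; the network architecture, input dimensions and number of skeleton joints are assumptions for the example, not the paper's configuration:

```python
# Minimal sketch: a small CNN regressing 3D joint positions from an FMCW radar frame.
# Frame shape (chirps x range bins) and the 17-joint skeleton are assumed values.
import torch
import torch.nn as nn

N_JOINTS = 17                  # assumed skeleton size
N_CHIRPS, N_RANGE = 64, 256    # assumed radar frame: chirps x range bins

class RadarPoseNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, N_JOINTS * 3)      # x, y, z per joint

    def forward(self, frame):                        # frame: (B, 1, chirps, range)
        return self.head(self.features(frame).flatten(1)).view(-1, N_JOINTS, 3)

net = RadarPoseNet()
pose = net(torch.randn(1, 1, N_CHIRPS, N_RANGE))     # -> (1, 17, 3) joint coordinates
```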
KEYWORDS: Single photon avalanche diodes, Depth maps, Video, Super resolution, Neural networks, Sensors, Education and training, Image sensors, Autonomous driving, RGB color model
Three-dimensional (3D) imaging captures depth information from a given scene and is used in a wide range of fields, including industrial environments, smartphones and autonomous driving. This paper summarises the results of a depth video super-resolution scheme tailored for single-photon avalanche diode (SPAD) image sensors, producing 3D maps at frame rates >100 FPS (32×64 pixels). Consecutive frames are used to super-resolve and denoise depth maps via 3D convolutional neural networks with an upscaling factor of 4. Due to the lack of noise-free, high-resolution depth maps captured with high-speed cameras, the neural network is trained with synthetic data generated in Unreal Engine, which is then processed to resemble the data output by a SPAD sensor. The model is then tested on video sequences captured with a high-speed SPAD dToF sensor, processing frames at >30 frames per second. The super-resolved data show a significant reduction in noise and enhanced edge detail in objects. We believe these results are relevant to improving the accuracy of object detection in autonomous vehicles for collision avoidance, and to AR/VR systems.
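A minimal sketch of this kind of 3D convolutional super-resolution network is given below; the layer sizes and the number of input frames are assumptions rather than the paper's exact model:

```python
# Minimal sketch: a 3D CNN takes a short stack of consecutive low-resolution SPAD
# depth frames and outputs a denoised, 4x super-resolved depth map.
import torch
import torch.nn as nn

class DepthVideoSR(nn.Module):
    def __init__(self, n_frames=4, scale=4):
        super().__init__()
        # 3D convolutions mix information across time (frames) and space.
        self.body = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Collapse the temporal axis, then upscale spatially via pixel shuffle.
        self.collapse = nn.Conv3d(32, scale * scale,
                                  kernel_size=(n_frames, 3, 3), padding=(0, 1, 1))
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, frames):                 # frames: (B, 1, T, 32, 64)
        x = self.body(frames)
        x = self.collapse(x).squeeze(2)        # -> (B, scale^2, 32, 64)
        return self.shuffle(x)                 # -> (B, 1, 128, 256)

sr = DepthVideoSR()
hi_res = sr(torch.randn(1, 1, 4, 32, 64))      # four 32x64 frames -> one 128x256 map
```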
Accurate object tracking and target identification are key requirements in the automotive, consumer, and defence industries. These tasks require hardware that provides good-quality images and accurate analysis routines to interpret the data. Here we will report on the use of next-generation single-photon avalanche diode (SPAD) array sensors combined with neural networks for high-speed three-dimensional imaging and object tracking. Such detectors enable three-dimensional imaging at high speeds and low light levels, and they can operate in a wide range of conditions and at large standoff distances. We will discuss the use of such detectors for tracking and monitoring airborne objects, such as drones. We will also discuss our recent work on human pose estimation, achieved from a low-cost SPAD time-of-flight sensor with only 4×4 pixels. Here we use neural networks to first increase the resolution of the data and then reconstruct the skeletal form of multiple humans in three dimensions. It is clear that the next generation of technology for object tracking and identification will use a combination of advanced imaging hardware and data fusion approaches. We will discuss our group's recent research in this area.
KEYWORDS: Ranging, Systems modeling, Imaging systems, Single photon, Data modeling, Time of flight imaging, Stereoscopy, Statistical modeling, Sensors, Picosecond phenomena
The recent development of single-photon avalanche diode (SPAD) arrays as imaging sensors with both picosecond binning capabilities and single-photon sensitivity has led to the rapid development of time-of-flight imaging systems; however, simulations of SPAD systems outside of the Poisson regime remain rare. Here we present a model for SPAD systems which combines single-photon counting statistics with computational parallelization, together enabling the efficient generation of photo-realistic SPAD data. We confirm the accuracy of our model by experimental verification. Further, we apply this simulator to the problem of drone identification, orientation estimation, and segmentation. The proliferation of semi-autonomous aerial multi-copters, i.e. drones, has raised concerns over the ability of existing aerial detection systems to accurately characterize such vehicles. Here, we fuse the 3D imaging of SPAD sensors with the classification capabilities of a bespoke convolutional neural network (CNN) into a system capable of determining drone pose in flight. To overcome the lack of publicly available training data, we generate a photo-realistic dataset to enable the training of our network. After training, we are able to predict the roll, pitch, and yaw of several different drone types with an accuracy greater than 90%.
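To illustrate the photon-counting statistics involved, the sketch below (not the authors' simulator) generates a single-pixel timing histogram with Poisson signal and background arrivals and a first-photon-per-cycle detection rule, which is what drives the departure from purely Poissonian behaviour (pile-up) at higher flux:

```python
# Minimal sketch of a single-pixel SPAD timing histogram. All rates and sizes are
# illustrative assumptions; the paper's simulator is more complete (and parallelized).
import numpy as np

def spad_histogram(n_cycles, n_bins, signal_bin, signal_rate, bg_rate,
                   jitter_bins=1.0, seed=0):
    rng = np.random.default_rng(seed)
    hist = np.zeros(n_bins, dtype=int)
    for _ in range(n_cycles):
        # Background photons: uniform over the timing window.
        n_bg = rng.poisson(bg_rate)
        times = list(rng.uniform(0, n_bins, n_bg))
        # Signal photons: clustered around the return time, with Gaussian jitter.
        n_sig = rng.poisson(signal_rate)
        times += list(rng.normal(signal_bin, jitter_bins, n_sig))
        if times:
            # The SPAD registers only the first photon per cycle (pile-up effect).
            first = int(np.clip(min(times), 0, n_bins - 1))
            hist[first] += 1
    return hist

h = spad_histogram(n_cycles=5000, n_bins=16, signal_bin=6,
                   signal_rate=0.2, bg_rate=0.5)
```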
The recent development of single-photon avalanche diode (SPAD) arrays as imaging sensors with both picosecond binning capabilities and single-photon sensitivity has led to the rapid development of time-of-flight imaging systems. When used in conjunction with a synchronised light source, these sensors produce a 3D image. Here, we apply this 3D imaging ability to the problem of drone identification, orientation estimation, and segmentation. The proliferation of semi-autonomous aerial multi-copters, i.e. drones, has raised concerns over the ability of existing aerial detection systems to accurately characterise such vehicles. Here, we fuse the 3D imaging of SPAD sensors with the classification capabilities of a bespoke convolutional neural network (CNN) into a system capable of determining drone pose in flight. To overcome the lack of publicly available training data, we generate a photorealistic dataset to enable the training of our network. After training, we are able to predict the roll, pitch, and yaw of several different drone types with an accuracy greater than 90%.
3D sensing devices are becoming increasingly prevalent in robotics, self-driving cars, human-computer interfaces, as well as consumer electronics. Recent years have seen single-photon avalanche diodes (SPADs) emerging as one of the key technologies underlying 3D time-of-flight sensors, with the capability to capture accurate 3D depth maps in a range of environmental conditions and with low computational overhead. In particular, direct ToF SPADs (dToF), which measure the return time of back-scattered laser pulses, form the backbone of many automotive LIDAR systems. Here we consider an advanced direct ToF SPAD imager with a 3D-stacked structure, integrating significant photon processing. The device generates photon timing histograms in-pixel, resulting in a maximum throughput of hundreds of gigaphotons per second. This advance enables 3D frames to be captured at rates in excess of 1000 frames per second, even under high ambient light levels. By exploiting the re-configurable nature of the sensor, higher-resolution intensity (photon counting) data may be obtained in alternate frames, and depth upscaled accordingly. We present a compact SPAD camera based on the sensor, enabling high-speed object detection and classification in both indoor and outdoor environments. The results suggest a significant potential in applications requiring fast situational awareness.
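As an illustration of intensity-guided depth upscaling, the sketch below implements a simple joint bilateral upsampling; this is a generic guided-filtering technique with assumed parameters, not the sensor's actual processing pipeline:

```python
# Minimal sketch: upscale a low-resolution depth map using a higher-resolution
# intensity frame as a guide. Weights combine spatial distance with intensity
# similarity, so depth edges follow intensity edges.
import numpy as np

def joint_bilateral_upsample(depth_lr, intensity_hr, scale=4,
                             sigma_s=1.0, sigma_i=0.1, radius=2):
    H, W = intensity_hr.shape
    depth_hr = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            yl, xl = y / scale, x / scale            # position on the low-res grid
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = int(np.clip(round(yl) + dy, 0, depth_lr.shape[0] - 1))
                    xx = int(np.clip(round(xl) + dx, 0, depth_lr.shape[1] - 1))
                    # Spatial weight in low-res coordinates.
                    ws = np.exp(-((yy - yl) ** 2 + (xx - xl) ** 2) / (2 * sigma_s ** 2))
                    # Range weight from the high-res intensity guide.
                    g = intensity_hr[min(yy * scale, H - 1), min(xx * scale, W - 1)]
                    wi = np.exp(-((intensity_hr[y, x] - g) ** 2) / (2 * sigma_i ** 2))
                    num += ws * wi * depth_lr[yy, xx]
                    den += ws * wi
            depth_hr[y, x] = num / den
    return depth_hr

depth_up = joint_bilateral_upsample(np.random.rand(8, 16), np.random.rand(32, 64))
```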
3D imaging is used in a wide range of applications such as robotics, computer interfaces, autonomous driving or even capturing the flight of birds. Current systems are often based on stereoscopy or structured-light approaches, which impose limitations on standoff distance (range), and require textures in the scene or accurate projection patterns. Furthermore, there may be significant computational requirements for the generation of 3D maps. This work considers a system based on the alternative approach of time-of-flight. A state-of-the-art single-photon avalanche diode (SPAD) image sensor is used in combination with pulsed, flood-type illumination. The sensor generates photon timing histograms in-pixel, achieving a photon throughput of hundreds of gigaphotons per second. This in turn enables the capture of 3D maps at frame rates >1 kFPS, even in high ambient conditions and with minimal latency. We present initial results on processing data frames from the sensor (in the form of 64×32, 16-bin timing histograms, and 256×128 photon counts) using convolutional neural networks, with a view to localizing and classifying objects in the field of view with low latency. In tests involving three different hand signs, with data frames acquired with millisecond exposures, a classification accuracy of >90% is obtained, with histogram-based classification consistently outperforming intensity-based processing, despite the former's relatively low lateral resolution. The total, GPU-assisted, processing time for detecting and classifying a sign is under 25 ms. We believe these results are relevant to robotics or self-driving cars, where fast perception, exceeding human reaction times, is often desired.
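A minimal sketch of the histogram-based classification idea is shown below, with assumed layer sizes: the 16 timing bins of each pixel are treated as input channels to a small CNN that scores the three hand signs:

```python
# Minimal sketch: classify a 64x32-pixel frame of 16-bin timing histograms into one
# of three hand signs. Layer sizes are illustrative, not the paper's network.
import torch
import torch.nn as nn

class HistogramSignNet(nn.Module):
    def __init__(self, n_bins=16, n_classes=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_bins, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 16 * 8, n_classes),   # 64x32 -> 16x8 after two poolings
        )

    def forward(self, hist):                     # hist: (B, 16 bins, 64, 32)
        return self.net(hist)

clf = HistogramSignNet()
logits = clf(torch.randn(1, 16, 64, 32))          # -> (1, 3) class scores
```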