Decision-making through artificial neural networks with minimal latency is critical for numerous applications such as navigation, tracking, and real-time machine action systems. This requires machine learning hardware to process multidimensional data at high throughput. Unfortunately, convolution operations, the primary computational tool for data classification tasks, obey challenging runtime complexity scaling laws. However, homomorphically implementing the convolution theorem in a Fourier optics display light processor can achieve a non-iterative O(1) runtime complexity for data inputs larger than 1,000 × 1,000 matrices. Following this approach, here we demonstrate data-streaming multi-kernel image batching using a Fourier Convolutional Neural Network (FCNN) accelerator. We show image batch processing of large-scale matrices as 2 million dot-product multiplications performed by a digital light processing module in the Fourier domain. We further parallelize this optical FCNN system by exploiting multiple spatially parallel diffraction orders, achieving a 98× throughput improvement over state-of-the-art FCNN accelerators. A comprehensive discussion of the practical challenges of working at the edge of system capabilities highlights the problems of crosstalk and resolution scaling laws in the Fourier domain. Accelerating convolution by exploiting the massive parallelism of display technology enables non-von Neumann machine learning acceleration.
Non-von Neumann compute engines such as neuromorphic electronics have been shown to outperform CPUs by 3-4 orders of magnitude in terms of ‘weighted addition’, namely multiply-accumulate (MAC) operations per Joule. Here, we discuss experimental devices for a photonic neural network (NN) with an energy efficiency targeting 10^18 MAC/J. We consider an electro-optic (EO) perceptron consisting of a photodetector (summation) coupled to an EO modulator (nonlinear activation function, NLAF) [George et al, Opt.Exp. 2019]. The perceptron’s efficiency is proportional to the electronic charge at the NLAF; in the case of silicon MZI modulators, this is ~10^6 charges, hence the MAC/J is similar to that of TrueNorth. However, co-integration of emerging EO materials such as ITO into Si MZIs enables efficient modulation (e.g. VπL=0.5 V-mm [Armin et al, APL Phot. 2018]). Here we discuss the latest results of an ITO-silicon MZM with a record-low VπL=0.06 V-mm, and show noise-based NN training results from our in-house software PhotonFlow.
If the electro-optic conversion of current photonic NNs could be postponed until the very end of the network, the execution time would simply be the photon time-of-flight delay. Here we discuss a first design and the performance of an all-optical perceptron and feed-forward NN. The key is the dual-purpose, foundry-approved heterogeneous integration of phase-change materials, resulting in a) a volatile nonlinear activation function (threshold), realized with ps-short optical pulses that induce a non-equilibrium variation of the material’s permittivity, and b) a non-volatile optical multi-cell (5-bit) memory, written thermo-optically, that stores the NN weights after (offline) training. Once trained, the weights require only rare updates, thus saving power. Performance-wise, such an integrated all-optical NN is capable of sub-fJ/MAC operation, based on experimentally demonstrated pump-probe measurements [Waldecker et al, Nat. Mat. 2015], with a delay per perceptron of ~ps [Miscuglio et al. Opt.Mat.Exp. 2018], and has high cascadability.
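A 5-bit memory cell implies that each trained weight is stored as one of 2^5 = 32 levels. The following is a minimal sketch of such a quantization step, assuming uniformly spaced levels across the range of the trained weights. The abstract does not state the level spacing or the mapping to optical transmission, and `quantize_weights` is a hypothetical helper introduced here for illustration only.

```python
import numpy as np

def quantize_weights(weights, bits=5):
    """Map trained weights to integer codes 0 .. 2**bits - 1.

    Assumption: levels are uniformly spaced between the smallest and
    largest weight (not specified in the abstract).
    Returns the integer codes and the weights they represent.
    """
    w = np.asarray(weights, dtype=float)
    max_code = 2**bits - 1
    w_min, w_max = w.min(), w.max()
    if w_max == w_min:
        # All weights identical: a single level represents them exactly.
        return np.zeros(w.shape, dtype=int), w.copy()
    scale = (w_max - w_min) / max_code
    codes = np.rint((w - w_min) / scale).astype(int)
    dequantized = w_min + codes * scale
    return codes, dequantized

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    trained = rng.normal(size=8)  # stand-in for offline-trained weights
    codes, stored = quantize_weights(trained)
    print(codes)
    print(np.max(np.abs(stored - trained)))  # worst-case quantization error
```

With uniform levels the worst-case error is half of one level spacing, which gives a rough bound on the weight precision available to the offline-trained network.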
Performing feature extraction in convolutional neural networks for deep-learning tasks is computationally expensive in electronics. Fourier optics allows convolutional filtering via point-wise (dot-product) multiplication in the Fourier domain, following the convolution theorem. Here we experimentally demonstrate convolutional filtering exploiting the massive parallelism (10^6 channels, 8-bit at 1 kHz) of digital mirror display technology, thus enabling 250 TMAC/s. An FPGA-PCIe board controls the ‘weights’ and handles the data I/O, whereas a high-speed camera detects the inverse-Fourier-transformed (2nd lens) data. Gen-1 processes with a total delay (including I/O) of ~1 ms, while Gen-2 has a delay of 1-10 ns, leveraging integrated photonics at 10 GHz and changing the front-end I/O to a joint transform correlator (JTC). These processors are suited for image/pattern recognition, super-resolution for geolocalization, and real-time processing in autonomous vehicles or military decision making.
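The Fourier-domain filtering described above can be checked numerically. The sketch below is a software stand-in, not the optical system: NumPy FFTs play the role of the two lenses, and the function names are hypothetical. It assumes circular (periodic) boundary conditions and ignores that a camera records intensity rather than field amplitude.

```python
import numpy as np

def fourier_convolve(image, kernel):
    """Circular 2D convolution via point-wise product in the Fourier domain."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    # Zero-pad the kernel to the image size so both spectra align.
    padded = np.zeros_like(image)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    spectrum = np.fft.fft2(image) * np.fft.fft2(padded)  # the "filter plane"
    return np.real(np.fft.ifft2(spectrum))               # the "second lens"

def direct_circular_convolve(image, kernel):
    """Reference: the same circular convolution as an explicit weighted sum."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    out = np.zeros_like(image)
    kh, kw = kernel.shape
    for i in range(kh):
        for j in range(kw):
            # np.roll shifts the image so out[m, n] gains k[i, j] * x[m-i, n-j].
            out += kernel[i, j] * np.roll(image, shift=(i, j), axis=(0, 1))
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))
    ker = rng.random((3, 3))
    same = np.allclose(fourier_convolve(img, ker),
                       direct_circular_convolve(img, ker))
    print("Fourier and direct results agree:", same)
```

In software the Fourier route still costs two FFTs and an inverse FFT; the appeal of the optical version is that the lenses perform the transforms in a single pass of light, so only the point-wise product has to be programmed on the display.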
Graphene has extraordinary electro-optic properties and is therefore a promising candidate for monolithic photonic devices such as photodetectors. However, the integration of this atom-thin material with bulky photonic components usually results in a weak light-graphene interaction, leading to large device lengths that limit electro-optic performance. In contrast, here we demonstrate a plasmonic slot graphene photodetector on a silicon-on-insulator platform with high responsivity given the 5 µm-short device length. We observe that the maximum photocurrent, and hence the highest responsivity, scales inversely with the slot width. Using a dual-lithography step, we realize 15 nm narrow slots that show a 15-times higher responsivity per unit device length compared to photonic graphene photodetectors. Furthermore, we reveal that the back-gated electrostatics is overshadowed by channel-doping contributions induced by the contacts of this ultra-short-channel graphene photodetector. This leads to quasi charge neutrality, which explains both the previously unseen offset of the maximum photovoltaic-based photocurrent relative to graphene’s Dirac point and the observed non-ambipolar transport. Such micrometer-compact and absorption-efficient photodetectors allow for short carrier pathways in next-generation photonic components, while being an ideal testbed to study short-channel carrier physics in graphene optoelectronics.
The discovery of the TCC-VCSEL by Dr. Dalir (PI) in 2013 led to new functionalities of the VCSEL structure. In principle, a TCC-VCSEL has the same vertical structure as a conventional VCSEL. A VCSEL consists of two distributed Bragg reflector (DBR) mirrors parallel to the wafer surface, with an active region of one or more quantum wells in between for laser light generation. The planar DBR mirrors consist of layers with alternating high and low refractive indices. Each layer has a thickness of a quarter of the laser wavelength in the material, yielding intensity reflectivities above 99%. High-reflectivity mirrors are required in VCSELs to compensate for the short axial length of the gain region. Here we assume a TCC-VCSEL with a coupling K between the cavities. We pump one cavity with a gain g, while the other cavity has a loss γ. Note that the lasing frequency is a function of the loss and of the coupling between the cavities. Assuming a constant coupling K, the tunability of the TCC-VCSEL is set by the loss. The three-dimensional simulation of single-mode operation in the TCC structure is performed using the film mode matching method in FIMMWAVE (Photon Design Corp.). With a coupling of K = 1.5 THz, a wavelength range of 19.7 nm can be swept in the PT regime, which is crucial for lab-on-a-chip integrated bio-sensor applications.
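The dependence of the lasing frequency on loss and coupling can be illustrated with a generic two-mode coupled-mode model. This is a textbook sketch under stated assumptions, not the FIMMWAVE simulation used in the abstract: both cavities share the same bare resonance, the coupling K is real, and the gain and loss values below are hypothetical. It does not attempt to reproduce the 19.7 nm figure.

```python
import numpy as np

def supermode_frequencies(gain, loss, coupling, omega0=0.0):
    """Complex eigenfrequencies of a gain cavity coupled to a lossy cavity.

    All arguments share one unit (e.g. THz). The real part is the frequency
    offset from omega0; the imaginary part is net amplification (+) or decay (-).
    """
    matrix = np.array(
        [[omega0 + 1j * gain, coupling],
         [coupling, omega0 - 1j * loss]],
        dtype=complex,
    )
    return np.linalg.eigvals(matrix)

def closed_form_frequencies(gain, loss, coupling, omega0=0.0):
    """Same eigenfrequencies from the 2x2 characteristic equation."""
    mean = omega0 + 0.5j * (gain - loss)
    split = np.sqrt(complex(coupling**2 - (0.5 * (gain + loss))**2))
    return np.array([mean - split, mean + split])

def frequency_splitting(gain, loss, coupling):
    """Real frequency separation of the two supermodes (0 once PT is broken)."""
    radicand = coupling**2 - (0.5 * (gain + loss))**2
    return 2.0 * np.sqrt(radicand) if radicand > 0 else 0.0

if __name__ == "__main__":
    K = 1.5  # coupling in THz, the value quoted in the abstract
    for loss in (0.5, 1.0, 1.4):  # hypothetical loss values, gain set equal
        print(loss, frequency_splitting(gain=loss, loss=loss, coupling=K))
```

In this model the supermode frequencies stay real-split only while K exceeds (g + γ)/2, and with K fixed the splitting shrinks as the loss increases, which is the sense in which the loss acts as the tuning knob.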