The story behind Lightmatter’s tech

Lightmatter · Jul 24, 2019 · 9 min read

Photonic (or optical) computers have long been considered a holy grail for information processing due to the potential for high bandwidth and low power computation. Developing these machines required three decades of technological advancement that Lightmatter is now harnessing to deliver on the promise of highly power-efficient, parallel computation with light. In this blog post, we will walk through the history of optical computing, its death and its resurgence in quantum computing, and how Lightmatter is now building on these techniques to develop a faster and more energy-efficient optical processor for artificial intelligence.

A brief history of optical computing

In the 1980s, scientists at Bell Labs made early attempts at creating optical computers. This new kind of computer would offer bandwidth significantly higher than that of electronics — hundreds of terahertz (10^14 Hz) compared to a few gigahertz (10^9 Hz). By the mid-1980s, hopes for the technology had reached a fever pitch:

By the mid-1990’s we’ll have flexible programmable computers. You may never know there are optics in there. You’ll see no flashing lights. It will be very dull looking. But it will run circles around everything else. Electronics just can’t keep up with us.

- Dr. Henry J. Caulfield (New York Times, 1985)

The Bell Labs approach to optical computing relied on implementing an optical version of the electronic transistor — a device that is used to switch (or amplify) electrical signals. Unlike electrons, which are used in the transistors inside your phone and computer, light beams do not interact directly with each other. However, light can interact with materials; by temporarily changing the properties of the material that it is passing through, the passage of one light beam can be ‘felt’ by another.

Fig. When considered as a digital (on/off) device, the transistor (middle) is an electronic version of a switch (left). Optical transistors (right) were meant to replace their electronic counterparts.

As you may have noticed, the prediction made by the Bell Labs scientists did not come to fruition. This was largely due to difficulties in implementing “optical transistors.” Each optical transistor absorbs some light, making the signal weaker and weaker as it propagates and limiting the number of operations that can be performed on this kind of system. On top of this, there was the issue of storing data optically, a problem which remains extremely challenging to this day. There was a significant backlash in the scientific community towards research on optical computing due to the unfulfilled promises of the 1980s and the hype that it generated.

While the optical transistor was dying, a new kind of optical computing approach was being invented. In the mid-1990s, the field of quantum computing was growing rapidly owing to new proofs that showed that quantum systems could solve problems that were intractable on classical computers. There were many known approaches to implementing quantum systems, including using photons (single particles of light). In 1994, in hopes of building an optical quantum processor, Michael Reck and co-authors described a system that used arrays of a fundamental optical component — the Mach-Zehnder Interferometer (MZI) — to perform an important mathematical operation called matrix multiplication.
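To make the MZI's role concrete, here is a minimal numerical sketch (our own illustration, not from the original papers, using one common phase convention among several): two 50:50 beamsplitters and two programmable phase shifters form a 2-by-2 unitary that redistributes light between a pair of waveguides.

```python
import numpy as np

def mzi(theta, phi):
    """2x2 transfer matrix of a Mach-Zehnder interferometer.

    Two 50:50 beamsplitters with an internal phase shift theta between them
    and an external phase shift phi on one input arm. Phase conventions vary
    between papers; this is one common choice.
    """
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)    # 50:50 beamsplitter
    ps = lambda a: np.diag([np.exp(1j * a), 1.0])     # phase shift on the top arm
    return bs @ ps(theta) @ bs @ ps(phi)

U = mzi(0.7, 1.3)
assert np.allclose(U.conj().T @ U, np.eye(2))  # lossless device: U is unitary
```

Setting theta and phi programs how much light couples from one waveguide into the other; this 2-by-2 block is what Reck-style constructions chain together into larger matrix operations.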

Fig. Bulk optical (left) and integrated optical (right) implementation of a Mach-Zehnder interferometer. Bulk optics systems are typically assembled on room-scale metal tables.

At the time, experiments in optics were typically performed using bulky optical components screwed into an optical tabletop (a heavy, metal, vibration-damped surface, often called a “breadboard”) for mechanical stability. With this “breadboarding” platform, stabilizing a system of tens of optical beams that must stay in phase with one another was intractable — even small vibrations or temperature changes would introduce errors. So, while the idea of concatenating small optical circuits to form a larger optical computer was revolutionary, the technology to make it possible had not yet caught up with the theory.

Fig. Bulk optical (left) and integrated optical (right) version of the same quantum teleportation setup.

Research on integrated photonics for matrix processing

The solution on the horizon was to shrink meter-sized optical components down to tens of microns using computer chips with “integrated photonics” elements that could be easily fabricated and controlled. Thanks to interest surrounding the fiber-optic networks that serve as the backbone of today’s internet, the telecommunications industry was actively developing photonic chips. However, it wasn’t until around 2004 that fabricating photonic integrated circuits with a large number of components became feasible. By 2012, photonic chip fabrication facilities had begun offering multi-project wafer (MPW) services for silicon-based optical chips, enabling multiple academic research groups to share resources and produce designs in low quantities at reduced cost. The first optical computers that would lead to Lightmatter were created in this environment.

Fig. Micrograph of a programmable nanophotonic processor. This chip was fabricated at IME A*STAR in Singapore in 2013.

In 2012, Nicholas Harris (Lightmatter’s CEO) and collaborators used the OpSIS MPW service to realize a “programmable nanophotonic processor” (PNP), an optical processor implemented in silicon photonics that performs matrix transformations on light. Two major technical challenges stood in the way of large-scale PNPs: (1) compact, low-loss, efficient phase shifters and (2) multi-channel control and readout circuitry. The first tapeout was completed at the end of 2012, and the resulting chip is shown above.

As a member of Prof. Dirk Englund’s Quantum Photonics Laboratory at the Massachusetts Institute of Technology, Harris spent the next several years designing the control and readout hardware from scratch, coding the software platform, and developing calibration techniques for PNPs, as well as designing and taping out increasingly complex PNPs. The first research articles on the PNP and its application to quantum information processing were posted to the online preprint server arXiv in 2015: one from Harris and Darius Bunandar (Lightmatter’s Chief Scientist) together with collaborators at MIT, and another from Jacques Carolan and collaborators at the University of Bristol in the UK. The articles were ultimately published in the academic journals Nature Photonics and Science.

Because the PNP could implement general matrix operations, it enabled a broad array of applications, including classical computing, quantum computing, data routing, security, and more. Installations of the PNP setup were deployed at universities and research facilities including MIT, the University of Vienna, and the Air Force Research Laboratory. We’ve included a list of academic publications that used the PNP at the end of this article. We recently wrote a review article on the field of PNPs, which can be found here.

In 2017, Harris and other collaborators at MIT published a paper on deep learning that used the PNP to demonstrate parts of an all-optical neural network (a concept first reported in 1987 at Caltech). In that proposal, an entire deep neural network would be unrolled into many PNPs stitched together with nonlinear optical elements (rudimentary versions of optical transistors). As a side note, we’re often asked whether this work is the basis of Lightmatter’s tech. It is not — the proposal was interesting, but it had limitations.

First, neural networks can have hundreds of millions of parameters; current technology cannot support a corresponding number of MZIs. In fact, the implementation would likely require arrays of chips optically connected with nonlinear optical elements. Second, there are challenges in implementing those nonlinear optical elements — the same kind of challenges that Bell Labs faced in the 1980s. While implementations of all-optical neural networks from the 2017 proposal are likely to remain elusive, the underlying hardware — the PNP — developed by Harris and collaborators in the Quantum Photonics Laboratory at MIT has established itself as a valuable alternative platform for matrix processing.

Matrix processors and deep learning

While optical matrix processing technology was maturing, big things were happening in the artificial intelligence processor space. In 2017, Google published its work on a matrix processor for deep learning. They called it the “Tensor Processing Unit.” This move was surprising, since Google is broadly known as a software company, yet its team had designed and fabricated a state-of-the-art chip in one year to help enhance Google’s software products. State-of-the-art artificial intelligence algorithms (typically deep neural networks) are, at their core, sequences of matrix products. Therefore, building matrix processors for this task makes a lot of sense.
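As a quick illustration (the layer sizes and names here are made up for the example), the forward pass of even a tiny fully connected network is essentially a chain of matrix products:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, layers):
    """Forward pass of a small fully connected network.

    `layers` is a list of (W, b) pairs. Nearly all of the arithmetic is the
    matrix product W @ h, which is the operation a matrix processor accelerates.
    """
    h = x
    for W, b in layers:
        h = relu(W @ h + b)
    return h

layers = [(np.random.randn(16, 8), np.zeros(16)),   # layer 1: 8 -> 16
          (np.random.randn(4, 16), np.zeros(4))]    # layer 2: 16 -> 4
y = forward(np.random.randn(8), layers)
```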

Google’s TPU release was part of a big bang for machine learning chips, as investors, corporations, and entrepreneurs perceived the commercial opportunity in building application-specific chips to speed up artificial intelligence tasks. This realization spawned internal research efforts at many tech companies and a new wave of hardware startups, all aimed at building chips to accelerate artificial intelligence applications.

Fig. 2D arrays of compute units used for matrix processing in the Google TPU and Lightmatter’s processor.

The TPU is composed of a 2D “systolic” array of multiply-accumulate units (MACs) that fire in a synchronized sequence. Each MAC performs the operation ab+c, where a, b, and c are values stored in registers. A schematic representation of a systolic MAC array is shown above. In Google’s TPU, the MAC array is typically configured so that the weights that represent a layer (or chunk of a layer) of a neural network are loaded into the array and the vector data (images, etc.) flow through the array with each clock cycle. When computing a matrix-matrix product, there are many different data vectors that need to be multiplied by the same weight values. So, there is a big benefit to storing the weights in the MAC array, rather than having to load them repeatedly from memory. This is known as a “weight stationary” scheme and is generally viewed as an efficient method in the context of deep learning inference tasks, since memory access is both slow and energy intensive — especially if the memory is not located on-chip.
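The following is a functional sketch (not cycle-accurate, and not Google’s actual implementation) of the weight-stationary idea: the weights are loaded once, and every streamed input vector reuses them, with each inner step being the a·b + c MAC primitive described above.

```python
import numpy as np

def weight_stationary_matmul(weights, activations):
    """Functional sketch of a weight-stationary MAC array.

    weights:      (N, N) matrix held "stationary" in the array (loaded once).
    activations:  (N, M) batch of M input vectors streamed through one column
                  at a time, the way data flows through a systolic array.
    """
    n, m = weights.shape[0], activations.shape[1]
    outputs = np.zeros((n, m))
    # Weights are loaded into the array once ...
    for col in range(m):                 # ... then every input vector reuses them.
        x = activations[:, col]
        for i in range(n):               # one row of MAC units
            acc = 0.0
            for j in range(n):           # partial sums accumulate across the row
                acc = weights[i, j] * x[j] + acc   # the MAC primitive: a*b + c
            outputs[i, col] = acc
    return outputs

# Quick check against a plain matrix product.
W = np.random.randn(4, 4)
X = np.random.randn(4, 8)
assert np.allclose(weight_stationary_matmul(W, X), W @ X)
```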

The future

At Lightmatter, we take a different approach to processing deep neural networks, but one with many similarities to the 2D MAC array employed by the TPU. Our approach, based on the PNP architecture, relies on a 2D array of Mach-Zehnder interferometers (MZIs) fabricated in a silicon photonics process. To implement an N-by-N matrix product, our approach requires N² MZIs — the same number of compute elements used by systolic MAC arrays. Mathematically, each MZI performs a 2-by-2 matrix-vector product; together, the whole mesh of MZIs multiplies an N-by-N matrix by an N-element vector. Computation occurs as light travels from the input to the output of the MZI array, within the optical time of flight of about 100 picoseconds — less than a single clock cycle of your computer! Because our system operates at optical wavelengths, the theoretical bandwidth of the MZIs is nearly 200 terahertz (though driving the MZIs at anywhere near this speed would be a challenge), compared to a couple of gigahertz for electronic MACs. And our MZIs require orders of magnitude less energy per calculation than electronic MACs implemented in the latest electronic chips.
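Here is a toy numerical sketch (illustrative only, not Lightmatter’s actual design) of how 2-by-2 MZI operations compose into an N-by-N transformation: a rectangular mesh of MZIs with programmable phases acts on a vector of input optical amplitudes, and because each MZI is lossless the whole mesh is unitary. (Realizing a general, non-unitary matrix needs additional structure, for example two such meshes around a diagonal layer.)

```python
import numpy as np

def mzi(theta, phi):
    """2x2 unitary of a single MZI (same convention as the earlier sketch)."""
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)
    ps = lambda a: np.diag([np.exp(1j * a), 1.0])
    return bs @ ps(theta) @ bs @ ps(phi)

def mesh_unitary(n, phases):
    """Compose a rectangular mesh of MZIs into an N x N unitary.

    `phases` supplies one (theta, phi) pair per MZI. In each layer, MZIs act
    on alternating neighbouring pairs of waveguides; layer by layer the 2x2
    operations build up the full N x N transformation.
    """
    phases = iter(phases)
    U = np.eye(n, dtype=complex)
    for layer in range(n):
        for i in range(layer % 2, n - 1, 2):        # alternate even/odd pairs
            T = np.eye(n, dtype=complex)
            T[i:i + 2, i:i + 2] = mzi(*next(phases))
            U = T @ U
    return U

n = 8
num_mzis = n * (n - 1) // 2                          # MZIs in one such mesh
rng = np.random.default_rng(0)
U = mesh_unitary(n, rng.uniform(0, 2 * np.pi, size=(num_mzis, 2)))
x = rng.normal(size=n) + 1j * rng.normal(size=n)     # input optical amplitudes
y = U @ x                                            # the product computed in one pass of light
assert np.allclose(U.conj().T @ U, np.eye(n))        # lossless mesh => unitary
```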

Because of these special properties of our photonic system, we’re able to implement a matrix processor that is both faster and more energy-efficient than electronic artificial intelligence accelerators.

Fig. The research platform developed and used in academic publications by the founders of Lightmatter while at the Quantum Photonics Laboratory at MIT. The photonic processor is controlled by an array of custom circuits with programmable electrical outputs, and the light is read out using an array of detectors mounted on another custom circuit board; both units communicate with a host computer over USB via a microcontroller. At Lightmatter, we are integrating all of these functions into a single chip.

Since the founding of Lightmatter, we have been improving the way that we control and interface with our optical computer. To make our system user-friendly we’ve undertaken a number of engineering challenges: (1) implementing all of the control and readout circuits on-chip, (2) integrating high density memory in the package, (3) adding standard electronic communications interfaces to the system so that the chip can talk with the outside world just like an electronic accelerator, and (4) building the software infrastructure that will make the system plug-and-play for anyone who will use our product.

Building the hardware and software stack is important, but it can’t be done open-loop. Over the past year, we’ve been benchmarking our system on large-scale, real-world neural network architectures and datasets (like ResNet on ImageNet) that are more relevant than handwritten-digit recognition tasks from the 1990s (like MNIST).

We look forward to sharing our ultra-fast, efficient optical compute platform with companies that, like us, are looking to push the boundaries of what is possible with responsible artificial intelligence.

This article contains contributions by Nicholas Harris, Darius Bunandar, Thomas Graham, Carl Ramey, Martin Forsythe, Tomo Lazovich, Michael Gould, and Victoria Pisini.
