Experiments show how an all-optical version of an artificial neural network, a type of artificial-intelligence system, could potentially deliver better energy efficiency than conventional computing approaches.
A deep neural network (DNN) comprises many layers of artificial neurons and artificial synapses, which are connections between the neurons. The strengths of these connections are called weights and can be either positive, indicating neuronal excitation, or negative, implying inhibition. A DNN learns to perform tasks such as image recognition by varying its synaptic weights in a way that minimizes the difference between its actual output and the desired output.
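As a rough illustration of learning by weight adjustment, the sketch below trains a single artificial neuron with plain gradient descent on a squared error. The article does not specify a training rule, so the update rule, the learning rate, the number of passes and the toy data are all assumptions chosen for illustration.

```python
def train_single_neuron(samples, learning_rate=0.1, epochs=200):
    """Fit the weights of one linear neuron to (inputs, target) pairs.

    Assumption: plain gradient descent on squared error; the article
    does not name a specific training rule.
    """
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for inputs, target in samples:
            # Actual output of the neuron for this sample.
            output = sum(w * x for w, x in zip(weights, inputs))
            # Difference between actual and desired output.
            error = output - target
            # Nudge each weight in the direction that shrinks the error.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
    return weights


# Hypothetical toy data, consistent with one excitatory (positive)
# weight and one inhibitory (negative) weight.
samples = [([1.0, 0.0], 0.5), ([0.0, 1.0], -0.25), ([1.0, 1.0], 0.25)]
learned = train_single_neuron(samples)
```

With these toy samples, the learned weights should settle close to 0.5 and -0.25, one excitatory and one inhibitory connection.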
Central processing units and other digital-based hardware accelerators5 are typically used for DNN computations. A DNN can be trained using a known set of data, whereas an already trained DNN can be applied to unknown data in a task called inference. In either case, although the amount of computation is vast, the variety of operations is modest, because ‘multiply–accumulate’ operations dominate across the many synaptic weights and neuronal excitations.
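To make the multiply–accumulate operation concrete, here is a minimal sketch of one layer in which each neuron sums its incoming excitations, each scaled by a synaptic weight. The function names and the numbers are illustrative assumptions, not taken from any particular accelerator.

```python
def multiply_accumulate(weights, excitations):
    """Return the weighted sum of excitations arriving at one neuron."""
    if len(weights) != len(excitations):
        raise ValueError("weights and excitations must have the same length")
    total = 0.0
    for w, x in zip(weights, excitations):
        total += w * x  # one multiply-accumulate step
    return total


def layer_output(weight_rows, excitations):
    """Run one multiply-accumulate pass per neuron in a layer."""
    return [multiply_accumulate(row, excitations) for row in weight_rows]


# Hypothetical layer: two neurons, three inputs each.
example_weights = [[0.5, -1.0, 2.0], [1.0, 1.0, 0.0]]
example_inputs = [2.0, 1.0, 0.5]
example_outputs = layer_output(example_weights, example_inputs)  # [1.0, 3.0]
```

The same loop, repeated across every weight in every layer, accounts for most of the arithmetic in both training and inference.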
DNNs are known to still work well when computational precision is low5. As a result, these networks represent an intriguing opportunity for unconventional computing techniques. For example, researchers are exploring DNN accelerators that are based on emerging non-volatile memory devices6,7. Such devices retain information even when their power source is switched off, and can offer improved speed and energy efficiency for DNNs through analog electronic computation.
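A small sketch can show what low precision means for a single weighted sum. The example rounds each weight to one of 17 evenly spaced levels and compares the result with the unrounded sum. The number of levels, the weight range and the values are assumptions for illustration; the sketch shows only that the sum shifts slightly, and says nothing by itself about the accuracy of a full network.

```python
def quantize(value, levels, lo=-1.0, hi=1.0):
    """Round value to the nearest of `levels` evenly spaced points in [lo, hi]."""
    clipped = min(max(value, lo), hi)
    step = (hi - lo) / (levels - 1)
    return lo + round((clipped - lo) / step) * step


def weighted_sum(weights, inputs):
    """Multiply-accumulate over one neuron's weights and inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


# Hypothetical weights and inputs.
weights = [0.8, -0.3, 0.55, -0.9]
inputs = [1.0, 0.5, 0.25, 0.75]

exact = weighted_sum(weights, inputs)
coarse_weights = [quantize(w, levels=17) for w in weights]
coarse = weighted_sum(coarse_weights, inputs)
error = abs(exact - coarse)
```

Each weight moves by at most half a quantization step (0.0625 here), so the error in the sum cannot exceed that amount multiplied by the total input magnitude.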
Why not, therefore, also consider optics? Structures that direct light — whether they be an optical fibre for use in telecommunications or a waveguide patterned onto a photonic chip — can be packed with vast amounts of data. Inside such a waveguide, many wavelengths of light can propagate together, using a technique known as wavelength division multiplexing. Each wavelength can then be modulated (altered in such a way that it can carry information) at a rate that is limited by the available bandwidths associated with electronic-to-optical modulation and optical-to-electronic detection.
Structures called resonators enable individual wavelengths to be added to or removed from the waveguide, like wagons on a freight train. For example, micrometre-scale, ring-shaped (micro-ring) resonators can implement arrays of synaptic weights8. Such resonators can be modulated thermally9, electro-optically10,11 or, as in Feldmann and colleagues’ work, through phase-change materials12. These materials can switch between an amorphous phase and a crystalline phase, which differ greatly in their ability to absorb light. Under ideal conditions, the resulting multiply–accumulate operations would require only a small amount of power.
Feldmann et al. present an all-optical neural network on a millimetre-scale photonic chip, in which there are no optical-to-electronic conversions within the network. Inputted data are electronically modulated onto different wavelengths for injection into the network, but after that has been performed, all the data stay on the chip. Both weight modulation and neuron integration are achieved using integrated phase-change materials; these are located on two types of micro-ring resonator, which have a synaptic or neuronal function.
Unmodulated light that is injected at the various operating wavelengths picks up the neuronal excitations that have accumulated in the phase-change material, and then passes them to the next layer of the network. Even without on-chip optical gain (a process in which a medium transfers energy to the light that is transmitted through it), this set-up could potentially be scaled up to larger networks. The authors demonstrate, on a small scale, both supervised and unsupervised learning — that is, training is achieved using labelled data, which is how DNNs learn, and using data without such labels, which is how humans tend to learn.
Because the weights are implemented by light absorption, negative weights require a large bias signal, which must not activate the phase-change material. An alternative approach13 that can readily offer negative weights uses devices called Mach–Zehnder interferometers. In these devices, a single waveguide is split into two arms and then recombined; this causes the amount of transmitted light to depend on the difference in optical phase between the two paths. However, it might be challenging to combine this approach with wavelength division multiplexing, because the arms of each interferometer would need to introduce the appropriate phase difference for each wavelength.
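The dependence on phase can be sketched with the textbook model of an ideal, lossless Mach–Zehnder interferometer with balanced splitters, in which the fraction of light reaching one output is cos²(Δφ/2) for a phase difference Δφ between the arms. The effective index, path-length difference and wavelengths below are assumed values for illustration, and the model ignores loss and dispersion.

```python
import math


def mzi_transmission(phase_difference):
    """Fraction of light at one output of an ideal, lossless interferometer."""
    return math.cos(phase_difference / 2.0) ** 2


def phase_difference(effective_index, path_difference_m, wavelength_m):
    """Phase difference (radians) picked up over a fixed path-length difference."""
    return 2.0 * math.pi * effective_index * path_difference_m / wavelength_m


# Assumed, illustrative values: effective index 2.0, 10 micrometres of
# path-length difference, two nearby telecom wavelengths.
INDEX = 2.0
PATH_DIFFERENCE_M = 10e-6

t_1550 = mzi_transmission(phase_difference(INDEX, PATH_DIFFERENCE_M, 1550e-9))
t_1560 = mzi_transmission(phase_difference(INDEX, PATH_DIFFERENCE_M, 1560e-9))
```

Under these assumptions, the same pair of arms transmits a different fraction of light at each wavelength, which is one way to see why a single interferometer cannot easily apply an independently chosen weight to every wavelength channel.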
Photonic DNNs still present substantial challenges. Their total power usage can be low in ideal situations, but thermo-optic power is frequently required to adjust and maintain the differences in optical phase in the arms of each Mach–Zehnder interferometer. Moreover, the total optical power that is injected into a system containing phase-change materials must be calibrated carefully, so that the materials respond to incoming signals exactly as intended. Although phase-change materials can also be used to adjust Mach–Zehnder phases, unavoidable cross-coupling between how strongly the materials absorb light and how much they slow it down poses a considerable complication.
Phase-change materials seem to be well suited for the non-volatile long-term storage of synaptic weights that are based on micro-ring resonators needing only infrequent adjustment. However, when used in the role of neuron, the speed of crystallization of such materials will limit the maximum rate at which neurons can be excited. Furthermore, the need to melt the materials to induce a full neuronal reset after every potential excitation event will rapidly consume the large, but finite, switching endurance of the materials.
Conventional DNNs have grown large and now typically involve many thousands of neurons and millions of synapses. But photonic networks require waveguides that are spaced far from each other to prevent them from coupling, and that avoid sharp bends to prevent light from leaving the waveguide. Because crossing two waveguides introduces the risk of injecting undesired power into the wrong path, the 2D nature of a photonic chip presents a substantial design constraint.
Despite the long distances and large areas that are required for the implementation of photonic networks, fabrication of the key parts of each optical structure requires precision. This is because the waveguides and coupling regions — for instance, at the entrance and exit of each micro-ring resonator — must have the exact dimensions needed to obtain their desired performance. There are also limits to how small micro-ring resonators can be made. Finally, the relatively weak optical effects offered by modulation techniques require long interaction regions to enable their limited impact on passing light to build to a noticeable level.
Advances such as those made in Feldmann and colleagues’ study and by others8,13 are encouraging for the future of the field. The development of readily available broadband on-chip gain would help considerably, as would techniques that can support independent and arbitrary operations on each piece of optically encoded data, without requiring vast areas of the photonic chip. Should scalable photonic neural accelerators offering high energy efficiencies eventually emerge, we might well look back on the work of Feldmann et al. and others in the field as important early glimpses of the technology’s promise.
Nature 569, 199-200 (2019)
