Engineers present chip that boosts AI computing efficiency

Wednesday, 24 August, 2022

AI-powered edge computing has become pervasive; devices like drones, smart wearables and industrial IoT sensors are equipped with AI-enabled chips so that computing can occur at the ‘edge’ of the internet, where the data originates. This allows real-time processing and guarantees data privacy.

However, AI functionalities on these tiny edge devices are limited by the energy provided by a battery. Therefore, improving energy efficiency is crucial. In today’s AI chips, data processing and data storage happen at separate places — a compute unit and a memory unit. The frequent data movement between these units consumes most of the energy during AI processing, so reducing the data movement is key to addressing the energy issue.

Engineers from Stanford University have created a possible solution: a novel resistive random-access memory (RRAM) chip that does the AI processing within the memory itself, thereby eliminating the separation between the compute and memory units. Their ‘compute-in-memory’ (CIM) chip, called NeuRRAM, is about the size of a fingertip and does more work with limited battery power than current chips can. H. S. Philip Wong, the Willard R. and Inez Kerr Bell Professor in the School of Engineering, said that performing those calculations on the chip, instead of sending information to and from the cloud, could enable faster, more secure, cheaper and more scalable AI in the future, and give more people access to AI power.

Weier Wan, a Stanford graduate who is leading this project, said the data movement issue is similar to spending eight hours commuting for a two-hour workday. “With our chip, we are showing a technology to tackle this challenge,” Wan said.
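Wan’s analogy can be put as simple arithmetic. The snippet below is a minimal sketch: the function name `movement_share` is hypothetical, and the values 8 and 2 come from the commuting analogy, not from energy measurements of any chip.

```python
def movement_share(movement_cost, compute_cost):
    """Fraction of the total cost that goes to moving data rather than computing."""
    total = movement_cost + compute_cost
    if total <= 0:
        raise ValueError("total cost must be positive")
    return movement_cost / total


# Eight hours of commuting for a two-hour workday (the analogy's numbers).
share = movement_share(8, 2)
print(f"{share:.0%} of the day is spent commuting")
```

In the analogy, four-fifths of the day goes to travel, which is why cutting the ‘commute’ matters more than speeding up the work itself.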

NeuRRAM was presented in a recent article in the journal Nature. The chip demonstrates a broad range of AI applications on hardware, rather than through simulation alone. To overcome the data movement bottleneck, the researchers implemented compute-in-memory, a chip architecture that performs AI computing directly within memory rather than in separate computing units. The memory technology NeuRRAM uses is resistive random-access memory (RRAM), a type of non-volatile memory that retains data even when power is off and that has begun to appear in commercial products. RRAM can store large AI models in a small area footprint and consumes very little power, making it well suited to small, low-power edge devices.

Even though the concept of CIM chips is well established and the idea of implementing AI computing in RRAM isn’t new, this is reportedly one of the first chips to integrate a large amount of memory directly onto the neural network chip and to present all benchmark results through hardware measurements, according to a co-senior author of the Nature paper. The architecture of NeuRRAM allows the chip to perform analog in-memory computation at low power and in a compact area footprint. It was designed in collaboration with the lab of Gert Cauwenberghs at the University of California, San Diego, who pioneered low-power neuromorphic hardware design. The architecture also enables reconfigurability in dataflow directions, supports various AI workload mapping strategies and can work with different kinds of AI algorithms, all without sacrificing AI computation accuracy.
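For readers who want a feel for analog in-memory computation, the following is a minimal sketch of the textbook crossbar idea, assuming ideal wires and perfectly linear memory cells. It is not NeuRRAM’s actual circuit or sensing scheme. Each cell stores a weight as a conductance; applying input voltages to the rows produces cell currents (Ohm’s law) that add up along each column (Kirchhoff’s current law), so the array computes a matrix-vector product where the weights are stored. The names `crossbar_mvm` and `transpose` and the values in `G` are hypothetical and chosen only for illustration.

```python
def crossbar_mvm(conductances, voltages):
    """Return the summed current on each column for voltages applied to the rows.

    Each cell contributes g * v (Ohm's law); the contributions on a shared
    column wire add together (Kirchhoff's current law).
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(voltages) != n_rows:
        raise ValueError("need one input voltage per row")
    return [
        sum(conductances[r][c] * voltages[r] for r in range(n_rows))
        for c in range(n_cols)
    ]


def transpose(matrix):
    """Swap rows and columns, modelling the same array driven from the other side."""
    return [list(col) for col in zip(*matrix)]


# Hypothetical 2x3 array of stored weights, in arbitrary units.
G = [
    [1.0, 2.0, 0.0],
    [0.5, 0.0, 3.0],
]

# Rows in, columns out.
forward = crossbar_mvm(G, [1.0, 2.0])
# Columns in, rows out: the same stored weights used in the other direction.
backward = crossbar_mvm(transpose(G), [1.0, 0.0, 1.0])

print(forward)   # [2.0, 2.0, 6.0]
print(backward)  # [1.0, 3.5]
```

The second call reuses the same stored values with inputs and outputs swapped, which gives a rough picture of what running data through an array in different directions can mean. Real devices add noise and nonlinearity that this sketch ignores.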

To show the accuracy of NeuRRAM’s AI abilities, the researchers tested it on several tasks. It was 99% accurate in handwritten digit recognition on the MNIST dataset, 85.7% accurate in image classification on the CIFAR-10 dataset and 84.7% accurate in Google speech command recognition, and it showed a 70% reduction in image-reconstruction error on a Bayesian image recovery task.

“Efficiency, versatility and accuracy are all important aspects for broader adoption of the technology. But to realise them all at once is not simple. Co-optimising the full stack from hardware to software is the key,” Wan said.

“Such full-stack co-design is made possible with an international team of researchers with diverse expertise,” Wong said.

NeuRRAM is currently a physical proof of concept and needs more development before it is ready to be translated into actual edge devices. Even so, its combination of efficiency, accuracy and ability to handle different tasks showcases the chip’s potential.

“Maybe today it is used to do simple AI tasks such as keyword spotting or human detection, but tomorrow it could enable a whole different user experience. Imagine real-time video analytics combined with speech recognition all within a tiny device. To realise this, we need to continue improving the design and scaling RRAM to more advanced technology nodes,” Wan said.

Priyanka Raina, assistant professor of electrical engineering and a co-author of the paper, said the work opens up several avenues of future research on RRAM device engineering, programming models and neural network design for compute-in-memory, with the aim of making this technology scalable and usable by software developers.

If successful, RRAM compute-in-memory chips like NeuRRAM could be embedded in crop fields to do real-time AI calculations for adjusting irrigation systems to current soil conditions. If mass produced, these chips would be cheap enough, adaptable enough and low power enough to advance technologies such as medical devices that allow home health monitoring. They could also be used to help solve global societal challenges.

“By having these kinds of smart electronics that can be placed almost anywhere, you can monitor the changing world and be part of the solution. These chips could be used to solve all kinds of problems from climate change to food security,” Wong said.

Image credit: iStock.com/matejmo
