Advanced microchips developed for modern AI workloads

Thursday, 14 March, 2024

The US Defense Department’s Defense Advanced Research Projects Agency (DARPA) has partnered with Princeton University to develop advanced microchips for artificial intelligence.

The new hardware reimagines AI chips for modern workloads and can run powerful AI systems using less energy than current advanced semiconductors, according to Naveen Verma, Professor of Electrical and Computer Engineering. Verma, who will lead the project, said the advance breaks through the barriers that have stymied chips for AI, such as size, efficiency and scalability.

Chips that require less energy can be deployed to run AI in more dynamic environments, from laptops and phones to hospitals and highways to low-Earth orbit and beyond. DARPA will support Verma’s work, based on a suite of key inventions from his lab, with an $18.6 million grant. The DARPA funding will drive an exploration into how fast, compact and power-efficient the new chip can get.

“There’s a pretty important limitation with the best AI available just being in the data centre. You unlock it from that and the ways in which we can get value from AI, I think, explode,” Verma said.

In the Princeton-led project, researchers will collaborate with Verma’s startup, EnCharge AI. The startup aims to commercialise technologies based on discoveries made in Verma’s lab, including several key papers he co-wrote with electrical engineering graduate students going back to 2016. Verma co-founded EnCharge AI in 2022 with Kailash Gopalakrishnan, a former IBM Fellow, and Echere Iroaga, a leader in semiconductor systems design.

Gopalakrishnan said that innovation within existing computing technologies, as well as improvements in silicon technology, began slowing at the time when AI began creating new demands for computation power and efficiency. Not even the best graphics processing unit (GPU), used to run today’s AI systems, can mitigate the bottlenecks in memory and computing energy facing the industry. “While GPUs are the best available tool today, we concluded that a new type of chip will be needed to unlock the potential of AI,” Gopalakrishnan said.

To meet the rising demand for computing power required by AI models, the latest chips pack in tens of billions of transistors, each separated by the width of a small virus. And yet the chips still are not dense enough in their computing power for modern needs. Today’s leading models, which combine large language models with computer vision and other approaches to machine learning, were developed using more than a trillion variables each. The Nvidia-designed GPUs that have fuelled the AI boom have become so valuable that there is reportedly a considerable backlog to buy or lease these chips.

To create chips that can handle modern AI workloads in compact or energy-constrained environments, the researchers had to reimagine the physics of computing while designing and packaging hardware that can be manufactured with existing fabrication techniques and that can work well with existing computing technologies, such as a central processing unit. “AI models have exploded in their size and that means two things. AI chips need to become much more efficient at doing math and much more efficient at managing and moving data,” Verma said.

The researchers’ approach has three key parts. The core architecture of virtually every digital computer has followed a deceptively simple pattern first developed in the 1940s: store data in one place, do computation in another. That means shuttling information between memory cells and the processor. Over the past decade, Verma has researched an updated approach where the computation is done directly in memory cells, called in-memory computing. That is part one. The promise is that in-memory computing will reduce the time and energy it costs to move and process large amounts of data. So far, digital approaches to in-memory computing have been limited. As a result, Verma and his team turned to an alternative approach as part two: analog computation.

“In the special case of in-memory computing, you not only need to do compute efficiently, you also need to do it with very high density because now it needs to fit inside these very tiny memory cells,” Verma said.

Rather than encoding information in a series of 0s and 1s and processing that information using traditional logic circuits, analog computers leverage the richer physics of the devices, such as the curvature of a gear or the ability of a wire to hold electrical charge. Digital signals began replacing analog signals in the 1940s, primarily because binary code scaled better with the exponential growth of computing. But digital signals don’t tap deeply into the physics of devices and, as a result, they can require more data storage and management and are less efficient in that respect. Analog gets its efficiency from processing finer signals using the intrinsic physics of the devices, but that can come with a trade-off in precision.
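
The trade-off in precision can be illustrated with a toy model. The Python sketch below is an assumption made for illustration, not a description of the Princeton or EnCharge AI hardware: it compares an exact digital dot product with a hypothetical analog one whose result picks up random noise and is then read out through a converter with a limited number of bits. The function names, the noise model and the `adc_bits` parameter are all invented for this example.

```python
import random


def digital_dot(xs, ws):
    """Exact dot product, as a digital circuit would compute it."""
    return sum(x * w for x, w in zip(xs, ws))


def analog_dot(xs, ws, noise_sigma=0.0, adc_bits=8, rng=None):
    """Toy analog dot product: ideal sum, plus noise, then a quantised readout.

    Assumes every input and weight is 0 or 1, so the ideal result lies
    between 0 and len(xs). The noise and converter models are hypothetical.
    """
    rng = rng or random.Random(0)
    n = len(xs)
    ideal = digital_dot(xs, ws)
    # Analog signals are continuous, so small disturbances shift the value.
    noisy = ideal + rng.gauss(0.0, noise_sigma)
    # The readout maps the range [0, n] onto 2**adc_bits - 1 uniform steps.
    levels = 2 ** adc_bits - 1
    step = n / levels
    code = min(levels, max(0, round(noisy / step)))
    return code * step


if __name__ == "__main__":
    xs = [1, 0, 1, 1]
    ws = [1, 1, 0, 1]
    print("digital:", digital_dot(xs, ws))
    for bits in (2, 4, 8):
        print(bits, "bit analog readout:", analog_dot(xs, ws, adc_bits=bits))
```

In this simplified model, a coarser readout or a larger `noise_sigma` moves the analog answer further from the exact digital one, which is the kind of precision cost the article refers to.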

“The key is in finding the right physics for the job in a device that can be controlled exceedingly well and manufactured at scale,” Verma said.

The researchers found a way to carry out highly accurate computation using the analog signal generated by capacitors specially designed to switch on and off with exacting precision. That’s part three. Unlike semiconductor devices such as transistors, the electrical energy moving through capacitors doesn’t depend on variable conditions like temperature and electron mobility in a material.

“They only depend on geometry. They depend on the space between one metal wire and the other metal wire. And geometry is one thing that today’s most advanced semiconductor manufacturing techniques can control extremely well,” Verma said.
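
Verma’s point about geometry can be made concrete with textbook physics. The Python sketch below uses the idealised parallel-plate formula, C = ε₀·εᵣ·A/d, in which capacitance depends only on plate area, gap and the insulating material. It then models several equal capacitors being connected together so that they settle at a shared voltage. This is a simplified illustration under stated assumptions: the dimensions are hypothetical, real on-chip capacitors deviate from the ideal formula, and the article does not describe the researchers’ actual circuit.

```python
EPSILON_0 = 8.854e-12  # vacuum permittivity in farads per metre


def plate_capacitance(area_m2, gap_m, rel_permittivity=1.0):
    """Ideal parallel-plate capacitance: set by geometry and material only."""
    return EPSILON_0 * rel_permittivity * area_m2 / gap_m


def shared_voltage(caps, volts):
    """Voltage after connecting charged capacitors together.

    Total charge is conserved, so the result is the capacitance-weighted
    average of the starting voltages.
    """
    total_charge = sum(c * v for c, v in zip(caps, volts))
    return total_charge / sum(caps)


if __name__ == "__main__":
    # Hypothetical dimensions: 1 square micrometre plates, 100 nm apart.
    cap = plate_capacitance(area_m2=1e-12, gap_m=1e-7)
    caps = [cap] * 4
    # Each capacitor starts at 1 V or 0 V, standing in for a 1 or a 0.
    volts = [1.0, 0.0, 1.0, 1.0]
    # With equal capacitors the result is about 0.75 V: three ones out of four.
    print(shared_voltage(caps, volts))
```

When the capacitors are identical, the shared voltage is proportional to how many of them held a 1, so the accuracy of this kind of analog sum rests on how closely the capacitors match, which in the ideal formula comes down to area and spacing.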

Image credit: iStock.com/mediagfx
