05 Jun, 24

A Complete No-Brainer:
ReRAM for Neuromorphic Computing

In the last 60 years, technology has evolved at such an exponential pace that we now regularly converse with AI-based chatbots, and that same OpenAI technology has been put into a humanoid robot. It’s truly amazing to see this rapid development.

Above: OpenAI technology in a humanoid robot

Continued advancement of AI faces numerous challenges. One of these is computing architecture. Since it was first described in 1945, the von Neumann architecture has been the foundation for most computing. In this architecture, instructions and data are stored together in memory and travel to the CPU over a shared bus. This has enabled many decades of continuous technological advancement.

However, such an architecture creates bottlenecks in terms of bandwidth, latency, power consumption, and security, to name a few. For continued AI development, we can’t just make brute-force adjustments to this architecture. What’s needed is an evolution to a new computing paradigm that bypasses the bottlenecks inherent in the traditional von Neumann architecture and more precisely mimics the system it is trying to imitate: the human brain.

To achieve this, memory must be closer to the compute engine for better efficiency and lower power consumption. Even better, computation should be done directly within the memory itself. This paradigm change requires new technology, and ReRAM (or RRAM) is among the most promising candidates for future in-memory computing architectures.

Roadmap for ReRAM in AI

Given its long list of advantages, ReRAM can be used in a broad range of applications, from mixed-signal and power management to IoT, automotive, industrial, and many other areas. We generally see ReRAM rolling out in AI applications over time in different ways. For AI-related applications, the relevant advantages of ReRAM include its cost efficiency, ultra-low power consumption, scaling capabilities, small footprint, and fit with a long-term roadmap to advanced neuromorphic computing.

The shortest-term opportunity for ReRAM is as an embedded memory (10-100 Mb) for edge AI applications. The idea is to bring the NVM closer to the compute engine, massively reducing power consumption. This opportunity can be realized today by using ReRAM for synaptic weight storage, replacing external flash and eliminating some of the local SRAM or DRAM. My colleague Gideon Intrater will present on this topic on Monday, June 24th at the Design Automation Conference 2024. If you are planning to attend the conference, please join his presentation in the session, ‘Cherished Memories – Exploring the Power of Innovative Memory Architectures for AI applications.’

In the mid-term, ReRAM is a great candidate for in-memory computing where analog behavior is required. In this methodology, ReRAM is used for both computation and weight storage – at first in binary operation (storing one of two values per cell) and then moving to multi-level operation (multiple values per cell). An example of in-memory computing was proposed in 2022 using arrays based on Weebit ReRAM as Content Addressable Memories. This work, done in collaboration with the Department of Electrical Engineering, Indian Institute of Technology Delhi, is highlighted in the article, ‘In-Memory Computing for AI Similarity Search using Weebit ReRAM,’ by Amir Regev.
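
To make the in-memory computing idea concrete, here is a minimal NumPy sketch (an illustration only, not Weebit’s actual implementation) of the analog multiply-accumulate a ReRAM crossbar performs in place: weights are stored as cell conductances, inputs are applied as row voltages, and each column current sums the products by Ohm’s and Kirchhoff’s laws.

```python
import numpy as np

# Hypothetical 4x4 crossbar: each weight is stored as a cell conductance (in siemens).
# Binary operation uses two levels per cell (HRS/LRS); multi-level cells add
# intermediate conductance states to store multiple values per cell.
G_LRS, G_HRS = 1e-4, 1e-6                    # low/high resistance state conductances
weights = np.random.randint(0, 2, (4, 4))    # binary synaptic weights
G = np.where(weights == 1, G_LRS, G_HRS)     # conductance map stored on the array

V = np.array([0.2, 0.0, 0.2, 0.2])           # input vector encoded as row voltages

# Ohm's law per cell and Kirchhoff's current law per column: each bit-line
# current is the analog dot product of the inputs and one stored weight column.
I = V @ G
print(I)                                     # one multiply-accumulate result per column
```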

My colleague Amir Regev also recently wrote an article, ‘Towards Processing In-Memory,’ which explains more about the idea of in-memory computing with Weebit ReRAM, based on work done with the Department of Electrical Engineering at the Technion Israel Institute of Technology and CEA-Leti.

Above: A roadmap for ReRAM in AI – short-term, mid-term and long-term

In the longer term, neuromorphic computing comes into play. In the brain, synapses provide the connections between neurons, and they can change their strength and connectivity over time in response to patterns of neural activity. Likewise, ReRAM arrays can be used to create artificial synapses in a neural network which change their strength and connectivity over time in response to patterns of input. This allows them to learn and adapt to new information, just like biological synapses.
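
As a loose illustration of such plasticity, here is a toy Python sketch (not a circuit model; the update rule and numbers are invented for illustration) of a synapse whose conductance is nudged up or down by programming pulses depending on the activity pattern:

```python
class ReramSynapse:
    """Toy artificial synapse: a conductance bounded between HRS and LRS levels."""

    def __init__(self, g=1e-6, g_min=1e-6, g_max=1e-4, step=5e-6):
        self.g, self.g_min, self.g_max, self.step = g, g_min, g_max, step

    def update(self, pre_active: bool, post_active: bool):
        # Hebbian-style rule: strengthen on correlated activity (partial SET),
        # weaken when only the pre-synaptic side fires (partial RESET).
        if pre_active and post_active:
            self.g = min(self.g + self.step, self.g_max)   # potentiation
        elif pre_active:
            self.g = max(self.g - self.step, self.g_min)   # depression

syn = ReramSynapse()
syn.update(pre_active=True, post_active=True)   # correlated spikes -> stronger synapse
```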

Areas of particular interest include Bayesian architectures and meta learning. Bayesian neural networks hold great potential for the development of AI, particularly where decision-making under uncertainty is critical. These networks actually quantify uncertainty, so such methods can help AI models avoid overconfidence in their predictions, potentially leading to more reliable, safer AI systems. The characteristics of ReRAM make it an ideal solution for these networks.

The aim of meta learning is to create models that can generalize well to new tasks by leveraging prior experience. As they ‘learn to learn,’ they continuously update their beliefs based on new data without needing to re-train from scratch, making them more adaptable and flexible than today’s methods. The idea is to develop a standalone system capable of learning, adapting and acting locally at the edge. A model would be trained on a server and then optimized parameters would be saved on the chip at the edge. The edge system would then be able to learn new tasks by itself – like humans and other animals.

Compared to current machine learning, where models are trained for specific tasks with a fixed algorithm on a huge dataset, this concept has numerous advantages, including:

  • Data is stored locally on the chip and not in the cloud so there is greater security, much faster reaction and lower power consumption
  • Computation is done in-situ so there is no need to transfer data from memory to the computation unit
  • The system could adapt to very different real-world situations since it would imitate human learning ability

A recent joint paper from Politecnico di Milano, Weebit and CEA-Leti proposed a bio-inspired neural network capable of learning using Weebit ReRAM. The focus is on building a bio-inspired system that requires hardware with plasticity, in other words the ability to adjust its state based on specific inputs and rules, as in the case of biological synapses. You can read about this work in an article by Alessandro Bricalli, ‘AI Reinforcement Learning with Weebit ReRAM.’

This is the future of ReRAM in AI, and I can’t wait!

Overcoming hurdles

Like all memory technologies, ReRAM has both pros and cons for neuromorphic applications. On the ‘pros’ side, these include its non-volatility, ability to scale to smaller nodes, low power consumption and capability for multi-level operation.

The ‘cons’ are largely due to phenomena such as the limited precision with which conductance can be programmed. ReRAM technologies are also subject to some resistance drift over repeated cycling. Other phenomena, such as relaxation (linked to both time and temperature), can shift resistance values over time.

As we look towards using ReRAM for neuromorphic computing, we won’t let such resistance variability hold us back. There are not only ways to mitigate such factors, but also ways in which these ‘cons’ can be taken advantage of in certain neuromorphic bio-inspired circuits.

Mitigating resistance variability

One of the main ways we can mitigate resistance variability is by using Program and Verify (P&V) algorithms. The idea is quite simple: whenever a cell doesn’t satisfy a given criterion, we can reprogram it and then re-verify its resistance state. Such methods allow us to fine-tune resistance levels within a given range, attaining more levels than just the low-resistance state (LRS) and high-resistance state (HRS).

We can do this in multiple ways. One way is to use a gradual method, in which we repeat the same operation over and over until a cell satisfies the condition imposed (or the maximum number of allowed repetitions has been completed). This method can be incremental, in which case the programming control parameter increases at each repetition, or cumulative, in which case the parameter is kept constant each time.
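
A minimal sketch of such a gradual loop follows (Python, with stub functions standing in for the real read and program circuitry; the numbers are invented for illustration). The only difference between the incremental and cumulative flavors is whether the control parameter is ramped between repetitions:

```python
import random

def read_conductance(cell):
    # Stub for the verify circuitry: a slightly noisy read of the cell state.
    return cell["g"] * random.uniform(0.95, 1.05)

def program_pulse(cell, amplitude):
    # Stub for the program circuitry: each pulse nudges the conductance,
    # with the effect growing with the pulse amplitude.
    cell["g"] += 2e-6 * amplitude * random.uniform(0.5, 1.5)

def program_and_verify(cell, target_g, tol, v_start, v_step, max_tries, incremental=True):
    """Gradual P&V: re-program and re-verify until the criterion is met."""
    v = v_start
    for attempt in range(max_tries):
        if abs(read_conductance(cell) - target_g) <= tol:   # verify step
            return True, attempt
        program_pulse(cell, amplitude=v)                    # re-program step
        if incremental:
            v += v_step            # incremental: ramp the control parameter;
                                   # cumulative: keep v constant every time
    return False, max_tries

cell = {"g": 1e-6}
print(program_and_verify(cell, target_g=5e-5, tol=5e-6,
                         v_start=0.8, v_step=0.02, max_tries=50))
```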

There are numerous knobs we can control, including the programming direction and level of the control parameter. The total number of P&V cycles, as well as what happens before the verify itself, can vary depending on the goal we want to achieve – whether it’s improving retention, resilience or endurance, or achieving other goals.

The Ielmini Group at the Politecnico di Milano has proposed numerous state-of-the-art algorithms which can help with further tuning. One of these is called ISPVA, in which the gate voltage of the transistor is kept constant, thereby fixing the compliance current, while the top electrode voltage is increased until the desired conductance is attained. Conversely, in the IGVVA approach, the top electrode voltage is kept constant (high enough to ensure a successful set operation), while the gate voltage is increased to gradually raise the compliance current.
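
In terms of the loop above (reusing its read_conductance stub), the two algorithms differ mainly in which terminal is ramped between pulses. A hedged sketch, with bias values invented purely for illustration:

```python
def apply_set_pulse(cell, v_top_electrode, v_gate):
    # Stub: the SET effect grows with both the top-electrode bias and the
    # gate bias (which sets the compliance current through the transistor).
    cell["g"] += 1e-6 * v_top_electrode * v_gate

def set_with_verify(cell, target_g, tol=5e-6, max_tries=30, mode="ISPVA"):
    v_te, v_gate = 0.8, 1.0              # illustrative starting biases (volts)
    for _ in range(max_tries):
        if abs(read_conductance(cell) - target_g) <= tol:
            return True
        apply_set_pulse(cell, v_te, v_gate)
        if mode == "ISPVA":
            v_te += 0.05                 # ramp the top electrode; gate (compliance) fixed
        else:                            # IGVVA
            v_gate += 0.05               # ramp the gate to raise compliance; TE fixed
    return False
```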

Variability of the programmed levels is a key parameter in in-memory computing and hardware implementation of deep neural networks. Therefore, it’s important to use algorithms that not only achieve the right level of electrical conductance but also make sure this conductance is consistent across multiple attempts. There are many other P&V algorithms we can employ, for example to reach a more stable conductive filament, reduce post programming fluctuations, or achieve another goal.

It’s important to note that P&V algorithms are not the only tools available to mitigate ReRAM variability. For instance, pulse shape can play an important role in reducing variability and therefore improving neural network accuracy. Some industry work has shown that compared to regular square pulses, triangular pulses reduce the number of oxygen vacancies after set operation, therefore improving conductive filament stability. Triangular pulses have also been shown to be effective in improving the resistance state after the reset operation.

Above: Triangular pulse shape reduces the number of oxygen vacancies (VO) after the set operation, therefore improving conductive filament stability (Y. Feng et al., EDL 2021)

Taking advantage of ReRAM’s ‘cons’ for neuromorphic computing

In a neural network, we would like synapses to have a linear and symmetric response, a large number of analog states, a high on/off ratio, high endurance and no variability. ReRAM has intrinsic variability, and we can at least partly mitigate such non-idealities. For some neural networks, we can also use them to our advantage!

One example is the Bayesian neural network, where device variability is actually key to the implementation: the natural differences from one device to another are crucial to how it works. For instance, variations in how a memory cell conducts electricity from one operation to the next can provide a source of randomness, which is useful for generating random numbers or for AI algorithms that rely on randomness, like Bayesian reasoning.

In Bayesian methods, you don’t just get one answer from a given input; instead, you get a distribution of possible answers. The natural variation in ReRAM can be used to create this distribution. This variation is like having physical random numbers that can help perform calculations directly within the memory. This makes it possible to do complex multiplications right where the data is stored. In addition, Bayesian neural networks are resilient to device-to-device variability and system aging.
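
As an illustration (a NumPy sketch, with the device spread reduced to a simple Gaussian assumption), the cycle-to-cycle variability of ReRAM reads can stand in for the weight sampling a Bayesian layer needs: every pass draws slightly different weights, so repeated passes yield a distribution of outputs whose spread quantifies uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mean conductances encode the learned weights; sigma models the intrinsic
# device variability (assumed Gaussian here purely for illustration).
G_mean = rng.uniform(1e-6, 1e-4, size=(8, 4))
sigma = 0.05 * G_mean

def stochastic_read(x):
    # One forward pass: each read effectively samples the weights
    # from the physical device spread.
    return x @ rng.normal(G_mean, sigma)

x = rng.uniform(0.0, 0.2, size=8)        # input voltages
outputs = np.array([stochastic_read(x) for _ in range(1000)])

# A distribution of answers rather than a single point estimate.
print(outputs.mean(axis=0), outputs.std(axis=0))
```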


Summary

ReRAM is a good match for neuromorphic applications due to its cost efficiency, ultra-low power consumption, scaling advantage at 28nm and below, small footprint for storing very large arrays, analog behavior, and ease of fabrication in the back end of the line. The conductance of ReRAM can also be easily modulated by controlling a few electrical parameters.

We can mitigate the ‘cons’ of ReRAM to make it shine in edge AI and in-memory computing applications in the short and mid-term, respectively. In the long term, the similarity of ReRAM cells to synapses in the brain makes the technology a great fit for neuromorphic computing. As we look to newer applications such as Bayesian neural networks, the ‘cons’ of ReRAM can not only be mitigated, but can even provide advantages.

I recently presented a tutorial at the International Memory Workshop in Seoul, during which I discussed the requirements of new neuromorphic circuits, why ReRAM is an ideal fit for such applications, existing challenges and possible solutions to improve ReRAM-based neural networks.
Please click here to view the presentation.