We recently collaborated with our friends at IIT-Delhi, led by Prof. Manan Suri, on a research project demonstrating an efficient ReRAM-based in-memory computing (IMC) capability for a similarity search application. The demonstration was done on 28nm ReRAM technology developed by Weebit in collaboration with CEA-Leti. A paper based on this work, “Fully-Binarized, Parallel, RRAM-based Computing Primitive for In-Memory Similarity Search,” was published in IEEE Transactions on Circuits and Systems II: Express Briefs.
A bit of background: CAMs in AI/ML search applications
Associative memories, also called Content Addressable Memories (CAMs), are an important component of intelligent systems. A CAM accepts a query, compares it against multiple data points stored in memory to find one or more matches based on a distance metric, and returns the locations of the matches. This information can potentially be used for applications such as nearest neighbor searches for classification or unsupervised labeling. Ternary Content-Addressable Memory (TCAM) is a type of CAM that adds a “don't care” condition to support searches for partial matches, and it is therefore the most commonly used type of CAM.
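To make the lookup concrete, here is a minimal software sketch of a TCAM-style search. It is only an illustration, not the circuit from the paper: the function name `tcam_search`, the string encoding of stored words, and the use of `X` for the “don't care” symbol are all assumptions made for this example. A hardware TCAM compares the query against every stored row in parallel, whereas this sketch loops over the rows one at a time.

```python
# Hypothetical illustration of a TCAM lookup; names and encoding are
# assumptions for this example, not taken from the paper.
def tcam_search(entries, query):
    """Return the locations (row indices) of stored entries matching the query.

    Each entry is a string over '0', '1', and 'X', where 'X' is the
    "don't care" symbol that matches either query bit.
    """
    matches = []
    for index, entry in enumerate(entries):
        if len(entry) != len(query):
            raise ValueError("entry and query must have the same width")
        # A row matches when every position is equal or is a don't-care.
        if all(e == "X" or e == q for e, q in zip(entry, query)):
            matches.append(index)
    return matches


stored = ["1010", "10X0", "0XX1", "1111"]
print(tcam_search(stored, "1010"))  # rows 0 and 1 both match this query
```

The return value is a list of row locations rather than the stored data itself, which mirrors how a CAM is addressed by content and answers with addresses.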
TCAMs offer a powerful in-memory computing paradigm for efficient parallel-search and pattern-matching applications. With the emergence of big data and AI/ML, TCAMs have become a promising candidate for a variety of data-intensive edge and enterprise applications. In the research project, we proposed a scheme that uses TCAMs to perform hyperspectral imagery (HSI) pixel matching for remote-sensing applications. TCAMs can also enable applications such as biometrics (facial/iris/fingerprint recognition) and assist in string matching for large-scale database searches.
Traditionally, CAMs/TCAMs are designed using standard memory technologies such as SRAM or DRAM. However, these volatile memory-based circuits have performance limitations in terms of search energy/bit (a metric commonly used for evaluating the performance of CAM circuits), and CAMs based on SRAMs are limited in scale due to relatively large cell areas.
ReRAM can overcome performance limitations
CAM performance limitations can be addressed by using an emerging NVM (Non-Volatile Memory) technology like ReRAM instead of volatile memory technologies. Because ReRAM can help reduce power consumption and cell size, it can be used to build compact and efficient TCAMs. Such NVM devices also reduce circuit complexity and provide an opportunity to exploit low-area analog in-memory computing, leading to increased design flexibility.
In the recent paper, the joint IIT-Delhi/Weebit team presented a hardware realization of a CAM using Weebit ReRAM arrays. In particular, the researchers proposed an end-to-end engine that realizes IMSS (In-Memory Similarity Search) in hardware by using ReRAM devices and binarizing data and queries through a custom pre-processing pipeline. The learning capability of the proposed ReRAM-based in-memory computing engine was demonstrated on a hyperspectral imagery pixel classification task using the Salinas dataset, achieving an accuracy of 91%.
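The general idea of a binarized similarity search can be sketched in a few lines of software. This is a rough illustration under stated assumptions, not the pipeline from the paper: the post does not describe the custom pre-processing, so the sketch simply thresholds each spectral band against its mean over the reference pixels, and it uses Hamming distance as the similarity metric. The function names (`binarize`, `hamming`, `nearest_label`) and the toy data are hypothetical.

```python
# Hypothetical sketch of binarized similarity search. The thresholding rule
# and the Hamming metric are assumptions, not the paper's actual pipeline.
def binarize(vector, thresholds):
    """Pack a real-valued vector into an int, one bit per feature."""
    bits = 0
    for value, threshold in zip(vector, thresholds):
        bits = (bits << 1) | (1 if value > threshold else 0)
    return bits


def hamming(a, b):
    """Count the bit positions in which two packed words differ."""
    return bin(a ^ b).count("1")


def nearest_label(query_bits, stored_bits, labels):
    """Return the label of the stored word closest to the query."""
    best = min(range(len(stored_bits)),
               key=lambda i: hamming(query_bits, stored_bits[i]))
    return labels[best]


# Toy "pixels" with four spectral bands each (made-up values).
reference = [[0.9, 0.8, 0.1, 0.2], [0.1, 0.2, 0.9, 0.7]]
labels = ["class_a", "class_b"]

# Assumed pre-processing: per-band mean over the reference pixels.
thresholds = [sum(band) / len(band) for band in zip(*reference)]
stored_bits = [binarize(pixel, thresholds) for pixel in reference]

query = binarize([0.2, 0.6, 0.8, 0.9], thresholds)
print(nearest_label(query, stored_bits, labels))
```

Once both the stored data and the query are binary words, the comparison reduces to bitwise operations, which is the kind of work a memory array can perform in place and in parallel across rows.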
Above: Energy-efficient classification of agricultural land from hyperspectral imagery using the proposed in-memory computing technique.
The team experimentally validated the system on fabricated ReRAM arrays, with full-system validation performed through SPICE simulations using the open-source SkyWater 130nm CMOS process design kit (PDK). We were able to significantly reduce the number of computations required and speed them up, yielding benefits in both energy and latency. By projecting the estimates to a more advanced node (28nm), we demonstrated energy savings of ~1.5x for a fixed workload compared to the current state-of-the-art technology.
You can access the full paper here.