At the Embedded Vision Summit, held in May 2025 in Santa Clara, California, Tony Lewis, chief technology officer at BrainChip, presented his company's research into state space models (SSMs) and how they can deliver LLM capabilities with very low power consumption in constrained computing environments, such as those found in dashcams, medical devices, security cameras, and even toys. As an example, he presented BrainChip's TENN 1B LLM, which uses an SSM architecture.
One of the core goals of SSMs is to bypass the context-handling constraints inherent in transformer-based models. They do this by using state-transition matrices to fold each new token into a fixed-size state and generating outputs from that state alone, so the entire history of the sequence is summarized by the current state, a property known as the Markov property. In contrast, transformer models require access to every preceding token, all of which must be stored in the context.
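To illustrate the idea, the following minimal sketch shows a linear state space recurrence in Python. It is not BrainChip's implementation; the matrices, dimensions, and values are arbitrary assumptions chosen only to show how a fixed-size state replaces a growing context:

```python
import numpy as np

# Minimal linear SSM recurrence: the fixed-size state x summarizes all
# previous inputs (the Markov property), so memory does not grow with
# sequence length. A, B, C are arbitrary illustrative values.
state_dim, input_dim, output_dim = 4, 2, 2
rng = np.random.default_rng(0)
A = rng.normal(scale=0.3, size=(state_dim, state_dim))  # state transition
B = rng.normal(size=(state_dim, input_dim))             # input projection
C = rng.normal(size=(output_dim, state_dim))            # output projection

def step(x, u):
    """Consume one token embedding u and return (new_state, output)."""
    x_new = A @ x + B @ u   # all history is folded into x_new
    y = C @ x_new
    return x_new, y

x = np.zeros(state_dim)
for u in rng.normal(size=(10, input_dim)):  # a stream of 10 token embeddings
    x, y = step(x, u)

# After the loop, x is the only thing that must be kept in memory,
# unlike a transformer's key/value cache, which grows with every token.
```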
Because the current state captures all relevant history, state space models address several constraints that arise in low-power computing environments: they make better use of the CPU cache and reduce memory paging, both of which affect device power consumption and cost. They can also use slower read-only memory to store the model parameters and state.
BrainChip has developed its own model, called TENN (Temporal Event-Based Neural Network), currently a 1-billion-parameter model with 24 SSM layers. It can run from read-only flash memory, draws under 0.5 watts, and returns results in under 100 ms.
Lewis explained that these metrics are a result of the Markov property of the TENN model: "One cool thing about the state space model is that the actual cache used is incredibly small, so in the terms of a transformer-based model, you don't have a compact state; what you have to remember is a representation of everything that has come before."
Additionally, BrainChip is working on quantizing the model to 4 bits so that it runs efficiently on edge-device hardware.
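BrainChip has not published the details of its quantization scheme; as a rough illustration of the general idea, the following sketch applies simple symmetric per-tensor 4-bit quantization to a weight matrix. The scale choice and values are assumptions for illustration only:

```python
import numpy as np

def quantize_int4(weights: np.ndarray):
    """Symmetric per-tensor 4-bit quantization (illustrative only)."""
    scale = np.max(np.abs(weights)) / 7.0            # int4 range is [-8, 7]
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_int4(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights at inference time."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int4(w)
w_hat = dequantize_int4(q, scale)
print("max abs error:", np.max(np.abs(w - w_hat)))   # small for a 4-bit grid
```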
In benchmark tests conducted by BrainChip, the TENN model compares favorably to Llama 3.2 1B, although Lewis cautioned that performance depends on the particular application. He also recommends a retrieval-augmented generation (RAG) architecture to guard against hallucinations.
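The sketch below outlines the RAG pattern in the abstract: retrieve trusted passages first, then condition generation on them so the model's answer is grounded in known text. The toy retriever and the `generate()` stub are illustrative placeholders, not BrainChip's implementation or any particular library's API:

```python
from collections import Counter

# A tiny in-memory "knowledge base" standing in for a real document store.
DOCS = [
    "TENN is a 1-billion-parameter state space model from BrainChip.",
    "State space models keep a fixed-size state instead of a growing context.",
]

def score(query: str, doc: str) -> int:
    """Toy relevance score: count of words shared between query and document."""
    return sum((Counter(query.lower().split()) & Counter(doc.lower().split())).values())

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k most relevant documents for the query."""
    return sorted(DOCS, key=lambda d: score(query, d), reverse=True)[:k]

def generate(prompt: str) -> str:
    """Placeholder for the on-device LLM call."""
    return f"[model answer grounded in a prompt of {len(prompt)} characters]"

question = "How does TENN store context?"
context = "\n".join(retrieve(question))
answer = generate(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer)
```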
SSMs are an active area of research and seem particularly promising where there are computing resource constraints or high performance requirements. Their unique characteristics could unlock a new generation of edge devices, enabling sophisticated AI capabilities previously confined to the cloud. See the InfoQ article "The State Space Solution to Hallucinations: How State Space Models are Slicing the Competition" for more information on how SSMs compare to transformer models.
A technical overview of state space models and how they work can be found in the Hugging Face blog post "Introduction to State Space Models (SSM)".