Cloudflare has introduced Agent Memory, a service designed to give AI agents persistent memory. Instead of being handed all the necessary information as context on every request, which drives up token consumption, agents using Agent Memory are meant to select the relevant information on their own and include it in their prompts to the language model. The service is initially available only as a closed beta.
Agent Memory is intended to prevent context rot
Besides the potential cost savings for developers from lower token consumption, the US provider also wants Agent Memory to counteract so-called context rot: long prompts increasingly degrade the speed and reliability of an AI model's responses, and information from the beginning of a conversation that no longer fits into the model's context window is lost.
According to a post on the Cloudflare blog, Agent Memory can serve as a persistent memory layer for AI agents hosted locally or in the cloud. Developers can also integrate the service into multi-agent coordination frameworks so that the agents in them keep their memory across sessions and restarts. Memory profiles can also be shared, so that information only has to be passed to an AI agent once and can then be used and expanded by multiple agents.
Share and expand information across development teams
As one possible use, Cloudflare mentions integrating Agent Memory into a development team's coding agents. Developers first enter basic information that matters to all agents, such as internal conventions or architectural decisions. All connected agents then use and expand this information.
The service can also be used for agentic code review, where it is meant to remember what the developers reject. With this information, the AI agent is supposed to adjust its feedback on the code and make it more relevant. Agent Memory can also be integrated into simple chatbots to store the message history and retrieve it when needed.
Access via Cloudflare Workers and API
Agent Memory distinguishes between immutable facts, events from earlier points in time, current tasks, and instructions such as workflows or runbooks. The service updates outdated information and removes duplicates on its own. The information is accessed via Cloudflare Workers or a REST API.
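To make the four categories and the automatic clean-up more concrete, here is a minimal Python sketch. It is not Cloudflare's implementation or API: the names Kind, Memory and store, the topic field, and the rule that only tasks and instructions are overwritten while facts and events are merely deduplicated are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FACT = "fact"                # immutable facts
    EVENT = "event"              # things that happened at an earlier point in time
    TASK = "task"                # current tasks
    INSTRUCTION = "instruction"  # workflows, runbooks


# Assumption: only these kinds can become outdated and be replaced.
REPLACEABLE = {Kind.TASK, Kind.INSTRUCTION}


@dataclass
class Memory:
    kind: Kind
    topic: str  # hypothetical key used to spot outdated entries
    text: str


def store(memories: list[Memory], new: Memory) -> list[Memory]:
    """Return a new list containing `new`, without duplicates or stale entries."""
    # An exact duplicate is not stored a second time.
    if new in memories:
        return list(memories)
    # A newer task or instruction replaces the older one on the same topic.
    if new.kind in REPLACEABLE:
        memories = [m for m in memories if (m.kind, m.topic) != (new.kind, new.topic)]
    return [*memories, new]
```

Storing two tasks with the same topic leaves only the newer one in the list, while storing the same fact twice keeps a single copy.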
The interface provides five core operations: ingest for bulk processing of conversations, remember for explicit saving, recall for synthesized queries, and list and forget for administration and deletion. To cover its entire API surface, Cloudflare recently released cf, a unified command-line tool. It is meant to let developers control all of the provider's services from one central tool and to let AI agents use those services as well.
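How the five operations fit together can be illustrated with a small in-memory stand-in. This is only a sketch under stated assumptions, not the Cloudflare API: the class ToyAgentMemory, all signatures and the naive word-overlap ranking in recall are hypothetical, and the list operation is called list_all here to avoid shadowing Python's built-in list.

```python
import itertools


class ToyAgentMemory:
    """In-memory stand-in for illustration; not the Cloudflare service."""

    def __init__(self) -> None:
        self._items: dict[int, str] = {}
        self._ids = itertools.count(1)

    def remember(self, text: str) -> int:
        """Explicitly store one memory and return its id."""
        memory_id = next(self._ids)
        self._items[memory_id] = text
        return memory_id

    def ingest(self, conversation: list[str]) -> list[int]:
        """Bulk-process a conversation; here each non-empty message is stored as-is."""
        return [self.remember(msg.strip()) for msg in conversation if msg.strip()]

    def recall(self, query: str, limit: int = 3) -> list[str]:
        """Return the stored texts that share the most words with the query."""
        words = set(query.lower().split())
        scored = [
            (len(words & set(text.lower().split())), memory_id, text)
            for memory_id, text in self._items.items()
        ]
        # Keep only matches; best overlap first, older entries first on ties.
        hits = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
        return [text for _, _, text in hits[:limit]]

    def list_all(self) -> dict[int, str]:
        """Return a copy of everything stored, keyed by id."""
        return dict(self._items)

    def forget(self, memory_id: int) -> bool:
        """Delete one memory; return False if the id was unknown."""
        return self._items.pop(memory_id, None) is not None
```

An agent would typically call ingest once per finished conversation, remember for facts a user states explicitly, and recall before building its next prompt, so that only the matching entries rather than the full history end up in the context window.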
Registration for the Agent Memory closed beta is not currently open, but a waiting list is available. A date for general availability has not yet been announced.
(sfe)
