Announcing our investment in Engram, the memory dream team

I haven’t drafted more than a sentence of this blog post, but I’m already preparing to bark at ChatGPT that I never use colons when writing…for the 50th time this week. I’ll probably also have to tell it to simplify its description of key mathematical concepts (I have, after all, been in VC for 10 years) at least 10 times when I review ICML papers later this week. All of this is to say, my interest in memory and continual learning really stemmed from personal experience. My local barista never remembers my coffee order (decaf Americano with a splash of oat milk), but I hoped my AI would.

It turns out I’m not the only one. I soon turned to my friends and colleagues, including researchers and engineers building GenAI products and applications, to find out whether they shared similar interests.

Would the people building coding agents want AI that would learn when to add more explanatory comments and when to keep them minimal?
Would the people building legal agents want AI that would learn when to write brief summaries and when to produce a comprehensive, citation-rich analysis?
Would customer support teams want AI that could adapt to a company's preferred tone, level of formality, and escalation practices?

Should AI learn not just what we want it to do, but how we prefer it to do it?

The answer was a resounding yes. Because AI engineers do, in fact, want to build products that you’ll like and then love. But also, because they don’t want you to switch to a competitor. You’re more likely to continue using a product that gets better as you interact with it. At a time when distribution is cheap, retention really matters.

They had other concerns, too. Many were relentlessly focused on releasing agent skills. While their agents could compose these skills to automate a handful of common but tedious workflows, they struggled to build and maintain enough skills to address the long tail of workflows their customers ultimately wanted automated. They really wanted to build agents that could learn new skills based on past experiences.

At the center of all these conversations were two themes: memory and continual learning.

My next step was to discover what, if any, techniques might solve this problem, so I began speaking with researchers in industry and academia. At first, many people pointed to context engineering. Just load conversation logs into the context window! Cram all the search results into the context window! Oh, the context window is too small? Don’t worry, we’ll just figure out how to train longer context windows.

But it turns out that long context pre-training is really hard. Potentially so hard that it might not be possible to do with Transformers. And even if you could do it, it would likely require much bigger models and be prohibitively expensive. What’s more, cramming lots of data into context is a great way to spend a lot of money only to produce bad answers. Have you heard of context rot? Even if you haven’t, you’ve probably experienced it. Some studies have empirically demonstrated that as the number of examples you put into context grows, the model accuracy declines.

To repeat the refrain of many a blog post…there had to be a better way! And that’s when I read about Cartridges. Sabri Eyuboglu and a team from Stanford published their research on Cartridges in June 2025. They described an approach in which a small KV cache could be trained offline through self-study on a corpus of documents (e.g., conversation logs, code repositories, or sets of long documents). At inference, this Cartridge could be loaded to produce better outputs.

You’d no longer need to stuff the same documents, codebases, and knowledge into the prompt every time; you could pre-compute it once and save it as a compact memory object that the model would load when needed. It seemed clear that this research would be very meaningful (and fortunately, I knew the author, but more on that later). It was cheaper, faster, and more scalable than traditional RAG and would mean that the model, unlike my barista, never forgot about past interactions with me. I knew this could become a fundamental building block for persistent memory and efficient context in AI systems.

Although Cartridges are promising for solving personalization and many other practical problems in AI, I still believed we would need other strategies to update model weights so it could draw connections between its new insights and knowledge acquired during pre-training. It seemed crazy to pre-train models once and then adjust their weights only through RL post-training, a setup designed to augment existing capabilities rather than cultivate new ones. The human brain changes constantly as it accumulates new memories through synaptic remodeling and neural pathway evolution; surely the agent brain couldn’t just be static.

I started looking into parametric memory updates and quickly found research by Jessy Lin and a team from Meta and Berkeley on sparse memory fine-tuning. They described an approach, built on memory layers, that updates only the memory slots most strongly activated by a new piece of knowledge, thereby minimizing catastrophic forgetting. They showed empirically that this approach could support new knowledge acquisition with much less forgetting than full fine-tuning or LoRA. This was the missing component.

At this point, I knew what I was looking for: a platform that could enable personalization, minimize the need for search and retrieval, and support the development of new skills. It would be designed around both KV cache compaction, to leverage new knowledge at test time, and sparse memory fine-tuning, to support continual learning.

Fortunately, the opportunity landed in my lap. You see, over the course of 5 years, I have met 3 remarkable individuals:

I already mentioned Sabri, whom I first met in 2021 through his work on Meerkat, a library that makes data inspection, evaluation, and training of multimodal and other models more efficient and robust. Sabri is one of those rare researchers who combines extensive knowledge of coding and systems with a nose for novel but practical research ideas. He also taught me the word “sus.”

At an Amplify dinner in 2022, I met Jessy Lin (also previously mentioned), whose obsession with and knowledge of research on continual learning and memory still exceeds that of anyone I’ve met in 10 years of connecting with researchers.

In 2023, I also met Jack Morris in his second year as a PhD student at Cornell. While he told me that he didn’t plan to continue his research on deidentification in ML, he later shared his conviction that better approaches to memory would enable the development of better consumer and enterprise products. And his thesis on memory in language models was a banger.

But it wasn’t until 2025 that I met Dan Biderman, who corralled this incredible group of memory researchers together to commercialize their work and change the way AI agents are designed. I was familiar with Dan’s research defining best practices for fine-tuning with LoRA. But I didn’t know how special he was until something remarkable happened. Jonathan Frankle, who has never said anything nice about anyone, said he was good.

I would have backed anything this team developed, but it just so happened they decided to focus on the one thing I believed builders needed most. A way to remember my coffee order, or rather, to make agents useful and delightful through better memory management and continual learning. The Engram team has the right people (Dan, Jack, Jessy, Sabri, Scott), the right approach (initially focused on Cartridges but expanding into sparse memory fine-tuning), and an unrelenting focus on making customers like Notion, Harvey, and Microsoft successful.

That’s why today I am excited to share that Amplify is part of Engram’s $98M in funding, announced today via CNBC.

My AI apps still don’t remember that I ONLY EVER WANT TO SEE MEDICAL RECOMMENDATIONS BACKED BY RCTS or that I WILL NOT STAY AT HOTELS WITH CARPET. But Engram will change that soon. So you should probably get in touch.

Authors

Sarah Catanzaro

Editors

Justin Gage
