Memory boundaries and tradeoffs

When we give an SDK-managed agent memory, we are deciding how much of the past it should carry forward into the future. This matters because memory directly affects how an agent reasons, how much it costs to run, and how predictable it remains over time. In this lesson, we focus on how to think about memory boundaries and the tradeoffs involved when building real agents with the SDK.

Deciding what information should be remembered

Not all information an agent encounters deserves to be remembered. Some details are essential for continuity, while others are only useful in the moment. Good memory design starts by identifying what genuinely influences future decisions or behavior.

For example, remembering a user’s preferred output format may be valuable across sessions, while remembering every intermediate question they asked is usually not. The SDK gives us the ability to persist memory, but it is still our responsibility to choose what earns a place there.
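To make this concrete, here is a minimal sketch in Python of a persistence filter. Nothing in it comes from the SDK itself: the MemoryCandidate type, the category labels, and should_persist are illustrative assumptions about how we might tag and screen information before writing it to long-term memory.

    from dataclasses import dataclass

    # Categories assumed (for illustration) to influence future behavior.
    PERSISTENT_CATEGORIES = {"preference", "constraint", "stable_fact"}

    @dataclass
    class MemoryCandidate:
        category: str   # e.g. "preference" or "intermediate_question"
        content: str

    def should_persist(candidate: MemoryCandidate) -> bool:
        """Keep only information likely to matter in future sessions."""
        return candidate.category in PERSISTENT_CATEGORIES

    candidates = [
        MemoryCandidate("preference", "User prefers JSON output"),
        MemoryCandidate("intermediate_question", "What does the -v flag do?"),
    ]
    persisted = [c for c in candidates if should_persist(c)]
    # Only the output-format preference earns a place in memory.

The filter itself is trivial; the design work is in deciding which categories belong in the persistent set.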

Tradeoffs between short-term and long-term memory

Short-term memory helps an agent stay coherent within an ongoing conversation. Long-term memory helps it maintain continuity across runs. These two kinds of memory serve different purposes and should not be treated the same way.

Short-term memory is volatile and contextual, while long-term memory is durable and selective. Treating everything as long-term memory often leads to bloated context and slower, more expensive reasoning. Treating everything as short-term risks losing important continuity when the agent restarts.
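One way to keep the two tiers distinct is to give them separate homes in code. The sketch below assumes nothing about the SDK's own storage; AgentMemory, remember_turn, and persist are hypothetical names illustrating a volatile per-session buffer alongside a small, selective durable store.

    import json
    from pathlib import Path

    class AgentMemory:
        def __init__(self, store_path: Path):
            self.short_term: list[str] = []   # volatile: lives for one conversation
            self.store_path = store_path      # durable: survives restarts

        def remember_turn(self, text: str) -> None:
            # Cheap to add; intentionally discarded when the session ends.
            self.short_term.append(text)

        def persist(self, key: str, value: str) -> None:
            # Deliberate, selective writes only: long-term memory is earned.
            data = json.loads(self.store_path.read_text()) if self.store_path.exists() else {}
            data[key] = value
            self.store_path.write_text(json.dumps(data))

        def end_session(self) -> None:
            self.short_term.clear()   # short-term memory is disposable

Keeping the write paths separate makes the asymmetry explicit: appending to the session buffer is routine, while persisting something is a deliberate decision.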

Preventing memory bloat

Unbounded memory growth is one of the fastest ways to degrade an agent. As memory grows, context windows fill up, responses become less focused, and costs increase. Preventing memory bloat means being intentional about what is stored and when old information is discarded.

This often involves summarizing, pruning, or replacing detailed records with compact representations. The SDK supports persistence, but it does not automatically enforce discipline. That remains a design concern we must handle explicitly.
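A pruning policy can be as simple as a size cap plus compaction. In this sketch, summarize is a stand-in for whatever compaction step we would actually use (often a model call), and MAX_ENTRIES is an arbitrary illustrative limit, not an SDK setting.

    MAX_ENTRIES = 50   # arbitrary illustrative cap

    def summarize(entries: list[str]) -> str:
        # Placeholder compaction; a real agent might call a model here.
        return f"Summary of {len(entries)} earlier memories."

    def prune(memory: list[str], max_entries: int = MAX_ENTRIES) -> list[str]:
        """Fold the oldest overflow entries into one compact summary."""
        if len(memory) <= max_entries:
            return memory
        overflow = len(memory) - (max_entries - 1)
        head, tail = memory[:overflow], memory[overflow:]
        return [summarize(head)] + tail   # one summary replaces many records

Replacing many detailed records with a single summary keeps the store bounded while preserving the gist of what came before.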

Handling outdated or irrelevant memory

Some memories become incorrect or irrelevant over time. Preferences change, goals shift, and external facts expire. An agent that blindly trusts old memory can make poor decisions based on stale information.

A robust design includes ways to update, overwrite, or ignore memory that no longer applies. This may mean revalidating assumptions, expiring entries, or favoring recent signals over older ones when conflicts arise.
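One common pattern, sketched below under simple assumptions, is to timestamp every entry, give it an optional time-to-live, drop expired entries on read, and let the most recent write win when keys conflict. MemoryEntry and resolve are illustrative names, not SDK types.

    import time
    from dataclasses import dataclass

    @dataclass
    class MemoryEntry:
        key: str
        value: str
        written_at: float                 # epoch seconds
        ttl_seconds: float | None = None  # None means no automatic expiry

    def resolve(entries: list[MemoryEntry], now: float | None = None) -> dict[str, str]:
        """Drop expired entries, then let the most recent write per key win."""
        now = now if now is not None else time.time()
        live = [e for e in entries
                if e.ttl_seconds is None or now - e.written_at < e.ttl_seconds]
        live.sort(key=lambda e: e.written_at)   # later writes overwrite earlier ones
        return {e.key: e.value for e in live}

Because resolution happens at read time, stale entries never need to be hunted down; they simply lose to fresher signals or age out.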

Balancing memory usefulness against cost and performance

Every piece of memory has a cost. Carrying more context into each model call increases latency and token spend, and excessive memory can make agent behavior harder to reason about. At the same time, insufficient memory can make the agent feel forgetful or inconsistent.

The goal is balance. We want memory that meaningfully improves decisions without overwhelming the system. When memory clearly earns its keep, it belongs. When it does not, it should be left behind.
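If we can attach rough scores to memories, this balance becomes a budgeting problem. The following sketch is one possible heuristic rather than an SDK mechanism: the usefulness scores and token estimates are assumed inputs, and selection is a greedy value-per-token pass under a fixed context budget.

    from dataclasses import dataclass

    @dataclass
    class ScoredMemory:
        content: str
        usefulness: float   # assumed score from recency, hit rate, etc.
        tokens: int         # estimated context cost

    def select_within_budget(memories: list[ScoredMemory], budget: int) -> list[ScoredMemory]:
        """Greedily include the best value-per-token entries under a budget."""
        ranked = sorted(memories, key=lambda m: m.usefulness / max(m.tokens, 1), reverse=True)
        chosen, spent = [], 0
        for m in ranked:
            if spent + m.tokens <= budget:
                chosen.append(m)
                spent += m.tokens
        return chosen

However the scores are produced, the discipline is the same: memory competes for a finite budget, and only entries that earn their token cost are included.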

Conclusion

At this point, we are oriented to the boundaries and tradeoffs involved in agent memory. We understand how decisions about what to remember affect correctness, cost, and performance, and why memory discipline matters just as much as memory capability. With this mental model, we are better prepared to design agents whose memory remains useful instead of becoming a liability.