Persisting memory across runs

LLM-driven agents often behave intelligently within a single run, but real systems rarely stop there. If an agent is restarted, upgraded, or run on a schedule, it needs a way to carry forward what it has already learned or decided. This lesson looks at how memory can outlive a single execution, so that LLM-guided behavior remains continuous rather than resetting each time the program starts.

Short-term context versus long-term memory

Not all information used by an LLM should be treated the same way. Some data exists only to support the current interaction, while other data should influence behavior across sessions.

Short-term context includes things like the most recent user input or the current step in a workflow. Long-term memory includes durable facts, past decisions, or accumulated results that should still matter tomorrow. Recognizing this distinction helps us avoid persisting everything and focus only on what truly needs to survive a restart.
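
To make the distinction concrete, one option is to keep the two kinds of data in separate structures, so only the long-lived part is ever written to disk. A minimal sketch; the field names are illustrative rather than a required schema.

# Short-term context: rebuilt on every run, never saved.
context = {
    "current_input": "Generate a page about Saturn",
    "workflow_step": "drafting"
}

# Long-term memory: durable facts that should survive a restart.
memory = {
    "completed_pages": ["mars.html", "jupiter.html"],
    "user_preferences": {"tone": "friendly"}
}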

Choosing what to persist

Persisted memory should serve a clear purpose. It should either help the LLM make better decisions in the future or allow the program to resume meaningful work.

Examples include prior task outcomes, known user preferences, or summaries of completed work. Transient prompts, raw conversation text, or intermediate reasoning usually do not belong here. The goal is to preserve signal, not noise.
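
One way to enforce this is to derive the persisted record from the full run state instead of saving that state wholesale. A sketch, assuming a hypothetical run_state dictionary that holds everything the agent touched during the run:

def extract_memory(run_state):
    # Keep durable signal; drop prompts, raw conversation, and reasoning traces.
    return {
        "completed_pages": run_state.get("completed_pages", []),
        "user_preferences": run_state.get("user_preferences", {}),
        "work_summary": run_state.get("work_summary", "")
    }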

Saving memory to disk

Once memory is identified, it needs to be written somewhere durable. In practice, this often means serializing a small Python data structure to a file.

For simple agents, JSON is a common choice because it is readable and maps cleanly to dictionaries and lists. The important point is that the program saves memory explicitly, rather than relying on the LLM to “remember” across runs.

import json

# Durable facts worth carrying into future runs.
memory = {
    "completed_pages": ["mars.html", "jupiter.html"],
    "last_run": "2026-01-14"
}

# Write the memory explicitly; nothing survives unless the program saves it.
with open("agent_memory.json", "w") as f:
    json.dump(memory, f, indent=2)

Restoring memory at startup

Persisted memory is only useful if it is loaded again when the program starts. This typically happens early in execution, before the first LLM call is made. On the very first run no memory file exists yet, so the load should fall back to an empty state rather than fail.

Restored memory becomes part of the agent’s initial state and can be supplied as context to the model. From the LLM’s perspective, this makes the new run feel like a continuation rather than a fresh start.

import json

try:
    with open("agent_memory.json", "r") as f:
        memory = json.load(f)
except FileNotFoundError:
    # First run: no memory file exists yet, so start from empty state.
    memory = {}
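
Once loaded, the memory can be folded into the model’s input, for example as part of a system prompt. A sketch, where call_llm stands in for whatever model client the agent actually uses:

import json

# Fold restored memory into the model's input so the new run
# reads as a continuation of earlier work.
system_prompt = (
    "You are a site-building agent. Memory from previous runs:\n"
    + json.dumps(memory, indent=2)
)

# call_llm is a placeholder for the agent's actual model client.
reply = call_llm(system_prompt, "Continue building the planet pages.")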

Continuing behavior across sessions

When memory is restored and included in model input, the agent can pick up where it left off. Decisions can account for past outcomes, and repeated work can be avoided.
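
Continuing the earlier example, a loop can consult completed_pages to skip work that a previous session already finished; the page list and generation step here are placeholders:

pages_to_build = ["mars.html", "jupiter.html", "saturn.html"]

for page in pages_to_build:
    if page in memory.get("completed_pages", []):
        continue  # finished in an earlier session; skip it
    # ... generate the page here, then record the outcome ...
    memory.setdefault("completed_pages", []).append(page)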

At this point, the LLM is no longer reacting only to the current prompt. It is reasoning with a history that the program has deliberately preserved. This is the foundation for agents that feel persistent, intentional, and cumulative rather than stateless.

Conclusion

We now know how LLM-driven agents can distinguish fleeting context from durable memory, choose what is worth keeping, and persist that information across program runs. With memory saved and restored explicitly, an agent’s behavior can remain coherent over time, even as the program itself stops and starts.