Replanning after errors

When we let an LLM propose multi-step plans, we accept that some steps will fail or produce results we did not expect. This lesson shows how an agent can recover without starting over. Replanning keeps an LLM-guided workflow moving forward, even when reality does not match the original plan.

Detecting when a plan step fails

Replanning starts with noticing that something went wrong. In practice, this means checking the outcome of each step against what the plan expected to happen.

A failure might be a tool returning an error, returning no output, or producing data that does not meet the step’s goal. The important point is that the program detects this condition explicitly, rather than assuming success.

# Run the step, then check its outcome instead of assuming success.
result = execute_step(step)

if not result["success"]:
    # Record the failure so the loop can switch into replanning.
    step_failed = True
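
A single success flag is often too coarse. As a minimal sketch, assuming each result is a dict and each step may carry a validation predicate (both hypothetical conventions, not a fixed API), the three failure modes above could be checked together:

def detect_failure(result, step):
    # Explicit error reported by the tool.
    if result.get("error"):
        return True
    # Tool ran but produced no usable output.
    if result.get("output") is None:
        return True
    # Output exists but fails the step's goal; "validate" is a
    # hypothetical per-step predicate attached when the plan is built.
    validate = step.get("validate")
    if validate is not None and not validate(result["output"]):
        return True
    return False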

Once a failure is detected, the agent can decide to pause normal execution and shift into replanning.

Providing failure context back to the LLM

An LLM cannot revise a plan unless it understands what happened. We provide that understanding by sending a short, structured description of the failure back to the model.

This context usually includes the step that failed, what was attempted, and what was observed instead. The goal is clarity, not blame or explanation.

# Summarize the failure for the model: what failed, what was
# expected, and what was observed instead.
failure_context = {
    "failed_step": step["name"],
    "expected": step["expected_outcome"],
    "observed": result["error"],
}

This failure context becomes part of the next request to the LLM.
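
One way to pass it along, as a minimal sketch (the message wording is illustrative; only json from the standard library is assumed):

import json

# Embed the structured failure description in the replanning request.
replan_message = (
    "A step in the current plan failed.\n"
    f"Failure details: {json.dumps(failure_context, indent=2)}\n"
    "Revise the remaining steps to work around this failure while "
    "still pursuing the original goal."
)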

Requesting a revised plan

With failure context prepared, we can ask the LLM to adjust the plan. Instead of asking for a completely new solution, we usually request a revision that accounts for the failure.

The LLM is treated as a planner again, but now with additional constraints derived from real execution.

# Ask the planner for a revision that accounts for the failure,
# rather than a plan built from scratch.
revised_plan = request_plan_update(
    goal=original_goal,
    failure=failure_context,
    remaining_steps=plan["steps"],
)

The response is expected to describe a modified sequence of steps.
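
request_plan_update itself can be a thin wrapper around a chat completion. A minimal sketch, assuming an OpenAI-style client and a model instructed to answer with JSON; the model name, prompt wording, and response shape are all assumptions, not requirements:

import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def request_plan_update(goal, failure, remaining_steps):
    # Ask the model for a revised plan as JSON so the agent can parse it.
    prompt = (
        f"Goal: {goal}\n"
        f"A step failed: {json.dumps(failure)}\n"
        f"Remaining steps: {json.dumps(remaining_steps)}\n"
        'Return JSON of the form {"steps": [...]} with a revised '
        "sequence of steps that works around the failure."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)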

Updating or replacing an existing plan

Once a revised plan is received, the agent must decide how to apply it. Sometimes this means replacing the entire plan. Other times, only the remaining steps are updated.

This choice is made by program logic, not by the model itself. The agent stays in control of how plans are stored and applied.

plan["steps"] = revised_plan["steps"]
current_step_index = 0
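
If only the remaining steps should change, a partial update splices the revision onto the steps that already succeeded. A sketch, assuming current_step_index points at the failed step:

# Keep the completed prefix; replace everything from the failed step on.
completed = plan["steps"][:current_step_index]
plan["steps"] = completed + revised_plan["steps"]
# Resume at the first revised step instead of starting over.
current_step_index = len(completed)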

Either way, after the update the agent has a coherent plan again.

Continuing execution after replanning

Replanning is not the end of the workflow. It is a transition back into execution.

With a revised plan in place, the agent resumes step-by-step execution, monitoring outcomes just as before. The loop of plan, execute, detect, and replan can repeat as needed.
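
Put together, the whole cycle fits in one loop. A sketch under the same assumptions as the earlier snippets (execute_step, detect_failure, and request_plan_update are the hypothetical helpers shown above):

current_step_index = 0
while current_step_index < len(plan["steps"]):
    step = plan["steps"][current_step_index]
    result = execute_step(step)

    if detect_failure(result, step):
        # Describe the failure and ask the planner for a revision.
        failure_context = {
            "failed_step": step["name"],
            "expected": step["expected_outcome"],
            "observed": result.get("error", "no usable output"),
        }
        revised_plan = request_plan_update(
            goal=original_goal,
            failure=failure_context,
            remaining_steps=plan["steps"][current_step_index:],
        )
        # Splice in the revision and retry from the first revised step.
        plan["steps"] = plan["steps"][:current_step_index] + revised_plan["steps"]
        continue

    current_step_index += 1

In practice the loop would also cap the number of replans, so that a persistently failing step cannot cycle forever.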

At this point, the agent is no longer fragile. It can adapt its path while still pursuing the original goal.

Conclusion

By detecting failures, reporting them clearly, and asking an LLM to revise its plan, we gain resilience in LLM-guided workflows. We are no longer locked into a single brittle sequence of steps. Instead, we have a controlled way to recover and continue, even when execution does not go as planned.