Evaluating and extending the capstone
This lesson exists because building an autonomous manager–worker system is only half the job. Once it runs, we need to decide whether it actually does what we set out to build.
Evaluation starts by returning to the original goals. These might include reliability, autonomy over time, or correct coordination between agents. We are not looking for perfection, only for evidence that the system behaves in line with its intended purpose.
In practice, this means observing the system while it runs. We look at the actions it takes, the decisions it makes, and the outcomes it produces, and we compare those to what we expected when we designed it.
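The comparison between observed behavior and design-time expectations can be made concrete. The sketch below is a minimal, hypothetical example: the trace format (`ActionRecord`, `RunTrace`) and the two checks (every assigned task is reported, the manager never executes work itself) are illustrative assumptions, not part of any specific framework.

```python
from dataclasses import dataclass, field

# Hypothetical trace format: one record per observed agent action.
@dataclass
class ActionRecord:
    agent: str
    action: str   # e.g. "assign", "execute", "report"
    task_id: str

@dataclass
class RunTrace:
    records: list[ActionRecord] = field(default_factory=list)

def check_expectations(trace: RunTrace) -> list[str]:
    """Compare observed behavior against design-time expectations.

    Two illustrative checks: every assigned task is eventually
    reported, and the manager delegates rather than executing.
    """
    failures = []
    assigned = {r.task_id for r in trace.records if r.action == "assign"}
    reported = {r.task_id for r in trace.records if r.action == "report"}
    for task_id in sorted(assigned - reported):
        failures.append(f"task {task_id} was assigned but never reported")
    for r in trace.records:
        if r.agent == "manager" and r.action == "execute":
            failures.append(f"manager executed task {r.task_id} directly")
    return failures
```

Checks like these do not prove correctness; they surface evidence that the system is, or is not, behaving in line with its intended purpose.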
Identifying strengths and weaknesses in the design
Once the system is running, patterns become visible. Some parts feel solid and predictable, while others feel fragile or awkward.
Strengths often show up as simplicity. Clear agent roles, clean communication paths, and well-defined tools tend to work smoothly. These are signals that the design choices were effective.
Weaknesses usually appear where complexity accumulates. Coordination logic may feel brittle, responsibilities may overlap, or recovery paths may be unclear. Noting these weaknesses helps us understand which design decisions deserve rethinking.
Extending the system with additional agents or capabilities
A well-designed capstone system should be extendable without major rewrites. Adding a new agent role is a common next step.
This might mean introducing a specialist worker, a monitoring agent, or an agent focused on reporting or validation. Each addition should have a clear responsibility that fits naturally into the existing structure.
If adding an agent requires large changes elsewhere, that is useful feedback. It tells us something important about how flexible the current design really is.
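One way to keep additions cheap is to have the manager dispatch by task kind through a registry, so a new specialist is a registration rather than a rewrite. The sketch below is a simplified illustration; the registry, the decorator, and the agent names are all assumptions for this example.

```python
from typing import Callable

# Hypothetical registry: the manager dispatches by task kind, so a
# new specialist can be added without touching existing agents.
AGENT_REGISTRY: dict[str, Callable[[str], str]] = {}

def register_agent(kind: str):
    """Register a worker function as the handler for one task kind."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        AGENT_REGISTRY[kind] = fn
        return fn
    return decorator

@register_agent("summarize")
def summarizer(payload: str) -> str:
    return f"summary of: {payload}"

@register_agent("validate")
def validator(payload: str) -> str:
    return f"validated: {payload}"

def dispatch(kind: str, payload: str) -> str:
    """Route a task to whichever agent owns this kind."""
    if kind not in AGENT_REGISTRY:
        raise ValueError(f"no agent registered for task kind {kind!r}")
    return AGENT_REGISTRY[kind](payload)
```

Adding a monitoring or reporting agent is then a single new `@register_agent(...)` function. If the real system instead needs edits to the manager, the workers, and the workflow to accommodate one new role, that gap is the feedback the lesson describes.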
Modifying workflows or tools to support new behaviors
Sometimes extension requires not new agents but better workflows or tools. Existing agents may be capable but limited by how tasks are structured or how tools are defined.
Adjusting workflows might involve adding new steps, changing decision points, or improving how results are combined. Modifying tools might mean refining inputs and outputs or making side effects clearer.
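Making side effects clearer can be as simple as declaring them alongside a tool's inputs and outputs. The following is a minimal sketch under assumed names (`ToolSpec`, `write_report`); it is one possible convention, not a prescribed interface.

```python
from dataclasses import dataclass

# Hypothetical tool schema: inputs, output, and side effects are
# declared up front, so changes to a tool stay visible and reviewable.
@dataclass(frozen=True)
class ToolSpec:
    name: str
    inputs: tuple[str, ...]
    output: str
    side_effects: tuple[str, ...] = ()

def describe(tool: ToolSpec) -> str:
    """Render a one-line summary of the tool's contract."""
    effects = ", ".join(tool.side_effects) or "none"
    return (f"{tool.name}({', '.join(tool.inputs)}) -> {tool.output} "
            f"[side effects: {effects}]")

# Example tool: refining its inputs or adding a side effect changes
# the declared spec, not just hidden behavior.
write_report = ToolSpec(
    name="write_report",
    inputs=("summary", "destination"),
    output="report_path",
    side_effects=("writes a file",),
)
```

A declared contract like this makes workflow changes auditable: when a step is added or a decision point moves, the affected tools and their side effects are explicit rather than buried in implementation code.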
These changes are valuable because they test whether the system can evolve while staying understandable and controlled.
Identifying next steps for further learning or improvement
The final purpose of this lesson is forward-looking. The capstone is not an endpoint, but a reference point.
From here, next steps might include improving observability, experimenting with different coordination strategies, or exploring more advanced memory or planning techniques. Each improvement builds on what the system already demonstrates.
By this point, the picture should be clear. We know how to judge the system we built, how to evolve it deliberately, and where to focus our learning next.