Ch 17: Validation-Retry and Prompt Calibration

The Spell Checker

The first draft is never perfect. You write an essay, run spell-check, and it lights up with red squiggles. You fix them, run it again — fewer squiggles. One more pass and it is clean.

Your helper works the same way. You ask it to fill out an order form. It comes back with a word where a number should be, a blank where a required answer belongs, or a choice that is not on the list. The helper is not careless — it just guessed wrong on the first try.

Checking is the spell-checker. It reads every answer on the form, finds the mistakes, and describes exactly what is wrong. The second draft is the fix — you hand those exact corrections back to the helper and say "try again, here is what you got wrong." The helper reads the feedback, adjusts, and produces a corrected version.

Spell-check, fix, recheck. That is the check-and-correct loop.

Why the First Try Cannot Be Trusted

The helper's first attempt looks confident. Every blank on the form is filled in, every answer written neatly. But confidence is not correctness. Here is what goes wrong:

Wrong kind of answer. The form asks for a number, but the helper wrote a word. The "quantity" box says "three" instead of "3". It looks fine at a glance, but the kitchen cannot cook "three" — it needs the digit so it can multiply by the price.

Missing answers. The form has five required blanks. The helper filled in four. It simply skipped one — or decided it was optional when it was not. Everything looks complete until someone notices the "delivery time" line is empty.

Answers not on the list. The form says meal choice must be "chicken", "fish", or "vegetarian". The helper wrote "veg". It is a reasonable shorthand — but the caterer's system does not recognize it. The choice has to match the list exactly.

Nonsense answers. Every blank is filled, every answer is the right kind. But the "number of guests" says 2 while the "number of meals" says 500. Technically each answer follows the rules on its own. Together, they make no sense.

A single attempt at filling out the form is a first draft. You need a red pen.
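The four failure modes above can all be caught by one checker. Here is a minimal sketch in plain Python, assuming the catering order is a dict; the field names and rules are illustrative, not from any real API:

```python
REQUIRED = ["guests", "meals", "meal_choice", "delivery_time"]
ALLOWED_CHOICES = {"chicken", "fish", "vegetarian"}

def check_order(form: dict) -> list[str]:
    """Return specific error messages; an empty list means the form is valid."""
    errors = []

    # Missing answers: every required blank must be filled.
    for field in REQUIRED:
        if field not in form or form[field] in (None, ""):
            errors.append(f"'{field}' is required but empty")

    # Wrong kind of answer: quantities must be digits, not words.
    for field in ("guests", "meals"):
        value = form.get(field)
        if value is not None and not isinstance(value, int):
            errors.append(f"'{field}' must be a number, got {value!r}")

    # Answers not on the list: meal choice must match the list exactly.
    choice = form.get("meal_choice")
    if choice is not None and choice not in ALLOWED_CHOICES:
        errors.append(
            f"'meal_choice' must be one of {sorted(ALLOWED_CHOICES)}, got {choice!r}"
        )

    # Nonsense answers: each field can pass alone yet fail together.
    guests, meals = form.get("guests"), form.get("meals")
    if isinstance(guests, int) and isinstance(meals, int) and meals > guests * 2:
        errors.append(f"{meals} meals for {guests} guests looks wrong")

    return errors
```

Note that each message names the exact blank and the exact rule it broke; that precision is what makes the messages usable as feedback in the next step.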

Narrator

You have asked your helper to fill out a catering order form for an office party. The form has specific rules for each blank. Let's watch the check-and-correct loop in action.

Put the check-and-correct steps in the correct order

  1. The helper fills out the form
  2. The checker reviews every answer
  3. If mistakes are found: mark what is wrong and send the form back
  4. The helper corrects the marked mistakes
  5. The checker reviews again (up to a few attempts)
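The five steps above can be sketched as one small loop. This is illustrative pseudocode made runnable, assuming a helper callable `ask_helper(prompt)` that returns a filled-in form and a checker callable `check(form)` that returns a list of error messages; both names are hypothetical placeholders, not a real library API:

```python
def fill_with_retries(ask_helper, check, instructions: str, max_attempts: int = 3):
    prompt = instructions
    for attempt in range(max_attempts):
        form = ask_helper(prompt)      # 1. the helper fills out the form
        errors = check(form)           # 2. the checker reviews every answer
        if not errors:
            return form                # a clean draft: done
        # 3. mark what is wrong and send the form back
        feedback = "\n".join(f"- {e}" for e in errors)
        prompt = (
            f"{instructions}\n\nYour previous answer had these mistakes:\n"
            f"{feedback}\nPlease correct them and try again."
        )
        # 4-5. next pass: the helper corrects, the checker reviews again
    raise ValueError(f"still invalid after {max_attempts} attempts: {errors}")
```

The cap on attempts matters: without it, a helper that keeps making the same mistake would loop forever.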

Key Insight: Specific Feedback Is the Best Way to Improve

Here is the secret that makes checking and correcting so effective: the specific error marked in red is the best possible instruction for how to improve.

Think about it. When the helper gets something wrong, you could rewrite your original instructions to be more detailed. You could add more examples. You could rearrange the form. These are all general improvements that might help.

But when you hand back the marked-up form — "'Number of guests' must be a digit, not a word" — you are giving the helper the most specific, actionable feedback possible. It knows exactly which blank, exactly what was wrong, and exactly what the correct answer should look like.

This is why correcting with specific feedback works far better than just saying "try again." A blind retry is like resubmitting a failed exam without knowing which questions you got wrong. A feedback-informed retry is like getting the graded exam back with corrections marked in red.

The red squiggles are not a failure. They are the instruction manual for the second draft. The checker writes the corrections for you, automatically, every time.

The first attempt fills in the form. The checker writes the fix.
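To make the contrast concrete, here are the two kinds of retry prompt side by side; the error messages are illustrative examples, not output from a real checker:

```python
errors = [
    "'guests' must be a number, got 'three'",
    "'delivery_time' is required but empty",
]

# A blind retry: the helper learns nothing about what went wrong.
blind_retry = "That was wrong. Try again."

# A feedback-informed retry: the graded exam, corrections marked in red.
feedback_retry = (
    "Your form had these mistakes:\n"
    + "\n".join(f"- {e}" for e in errors)
    + "\nCorrect only these fields and resubmit."
)
```

The first prompt forces the helper to guess again; the second tells it exactly which blanks to fix and how.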

What's Next

You now have structured output with validation-retry — the model produces data matching your form template, and errors get corrected automatically. But so far, every pipeline processes a single item. What happens when you need to analyze 500 code files, each producing a structured review?

In Chapter 18, you will build provenance tracking (where did each claim come from?) and batch processing (how to handle 500 items efficiently with per-item error handling). These are the final production patterns before the capstone assembly.