7 real agent goal and loop examples you can use

Practical ways to combine prompts, goals, and scheduled runs in Codex and Claude Code.

Jul 02, 2026

Hey,

Agent loops sound more confusing than they are.

A lot of the confusion comes from the terms. Loop, loop engineering, automation, schedule, cron, cloud routine, goal. People use them slightly differently depending on the tool they’re talking about.

But the useful model is simple.

Prompt: ask once.
Goal: keep working until an outcome is true.
Recurring run: a time-based loop, such as every 5 minutes or every Monday at 9am.
Workflow: combine a recurring run with a goal-shaped prompt, evidence, and stop rules.
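To make the difference concrete, here is a minimal Python sketch of the two loop shapes. It is an illustration, not how Codex or Claude Code implement anything: `run_goal`, `run_recurring`, and the `max_steps` budget are names I made up for this example.

```python
from typing import Callable


def run_goal(step: Callable[[], None], is_done: Callable[[], bool], max_steps: int = 10) -> str:
    """Goal: keep working until an outcome is true, or stop when the budget runs out."""
    for _ in range(max_steps):
        if is_done():
            return "done"
        step()
    return "done" if is_done() else "stopped: step budget exhausted"


def run_recurring(task: Callable[[], str], runs: int) -> list[str]:
    """Recurring run: repeat the same task on every tick of a schedule.

    A real scheduler would wait between ticks; here each pass is one tick.
    """
    return [task() for _ in range(runs)]


# A workflow combines the two: each scheduled tick starts a bounded goal.
backlog = {"open_bugs": 3}


def fix_one_bug() -> None:
    backlog["open_bugs"] -= 1


reports = run_recurring(
    lambda: run_goal(fix_one_bug, lambda: backlog["open_bugs"] == 0),
    runs=2,
)
print(reports)
```

The step budget is the simplest possible stop rule. The prompts later in this post use richer ones, but the shape is the same.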

I’ve had my mind blown a few times this week realising how powerful agents can be when we start to think in terms of automations, systems, and long-running goals.

A quick caveat before the examples.

I’m not saying you should trust agents with everything. They make mistakes. They misunderstand context. They sometimes do the wrong thing confidently. I’m not going to pretend that risk is not real.

But I’ve also found something humbling in my own work this week: I’m holding on too much.

That is a very familiar manager problem. Most new managers were promoted because they were good at the job. They were strong engineers, strong operators, strong problem solvers. Then the job changes. It is no longer about being the person who can do every piece of work best. It is about building the system where other people can do good work without you sitting over their shoulder.

I think the same identity shift is happening with AI agents. If your identity is tied to being good at coding, debugging, and solving problems yourself, letting go feels uncomfortable. It can feel like lowering your standards. But the next step is not blind trust. The next step is better systems.

That is why the guardrails matter so much. Issues, tests, PRs, CI, stop rules, logs, reports, and human review are not bureaucracy. They are the system that makes delegation possible.

1. Check production for errors and fix them

This is the cleanest example of a useful recurring agent system.

A schedule starts the work. A goal defines the outcome. Logs, CI, jobs, and GitHub issues provide evidence. A PR keeps the human in control.

Codex:

Create a standalone Codex automation that runs every Monday morning: /goal Scan the last 7 days of production logs, CI failures, scheduled jobs, and GitHub issues. If you find a confirmed repo-owned bug, reproduce it, fix it, add tests, and open a draft PR. If there is no actionable bug, report what you checked and do not open a PR. Stop if the issue needs product judgement, missing credentials, or a risky production decision.

Why this works:

production problems are easy to miss
the agent has evidence to inspect
the output is reviewable
the action is a draft PR, not a risky production change

This is the kind of agent work I find most interesting. Not replacing engineers. Just catching boring problems earlier.

2. Clean up open PRs

This is useful when a repo has PRs sitting around with failed CI, small review comments, merge conflicts, or stale state.

/goal Review all open PRs and take each one to a clear state: merged, updated, waiting, or blocked. For each PR, check CI, reviews, requested changes, conflicts, and branch protection; fix only clear actionable feedback; run relevant tests; merge only when authority is clear and all checks/reviews pass. Stop if feedback conflicts, merge authority is unclear, conflict resolution is ambiguous, or the PR touches auth, billing, permissions, security, or data deletion.

This is a good example of a goal because it is not one action.

The agent may need to inspect every PR, read comments, fix issues, run tests, rebase, wait for checks, and report what happened.

A normal prompt is too small for that.

A goal is the right primitive.
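The decision the prompt asks for on each PR can be sketched as a small function. This is a rough illustration under my own assumptions: the `PullRequest` fields and the `next_state` name are hypothetical, and a real agent would read these facts from GitHub rather than from a dataclass.

```python
from dataclasses import dataclass, field

# Areas the prompt tells the agent to stay away from.
SENSITIVE_AREAS = {"auth", "billing", "permissions", "security", "data deletion"}


@dataclass
class PullRequest:
    ci_passing: bool
    ci_pending: bool = False
    approved: bool = False
    actionable_feedback: bool = False
    merge_authority_clear: bool = False
    areas: set[str] = field(default_factory=set)


def next_state(pr: PullRequest) -> str:
    """Map one PR to one of the four clear states from the prompt."""
    if pr.areas & SENSITIVE_AREAS:
        return "blocked"  # stop rule: sensitive code stays with a human
    if pr.ci_pending:
        return "waiting"  # nothing to do until checks finish
    if not pr.ci_passing or pr.actionable_feedback:
        return "updated"  # fix clear feedback or failing checks, then re-run
    if not pr.approved:
        return "waiting"  # green, but still needs review
    if not pr.merge_authority_clear:
        return "blocked"  # stop rule: unclear merge authority
    return "merged"


print(next_state(PullRequest(ci_passing=True, approved=True, merge_authority_clear=True)))
```

The stop rules are checked before anything else, so a PR that touches billing is blocked even if every check is green.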

3. Watch one PR until CI or review finishes

Sometimes you do not need a big goal. You need the agent to check back later because the state will change.

Claude Code:

/loop every 20 minutes for up to 60 minutes, check PR #123 and report CI status, latest review comments, and merge conflict status. Stop successfully when CI is green, there are no unresolved requested changes, and no merge conflicts remain. Stop early if the same failure appears twice or a review comment needs product judgement. Output the current PR status, blockers, and next recommended action.

This is what loops are good for.

Not “keep coding forever”. More like: “monitor this for me because I’m going to focus on something else”.
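The stop rules in that prompt map onto a very small polling loop. Here is a sketch, assuming a hypothetical `check` function that returns "green", "pending", or a string starting with "failed". Reading "every 20 minutes for up to 60 minutes" as four checks (at 0, 20, 40, and 60 minutes) is my interpretation, not something the tool defines.

```python
import time
from typing import Callable


def watch(
    check: Callable[[], str],
    interval_s: float,
    max_checks: int,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until green, until a failure repeats, or until the checks run out."""
    seen_failures: set[str] = set()
    for attempt in range(max_checks):
        status = check()
        if status == "green":
            return "success"
        if status.startswith("failed"):
            if status in seen_failures:
                # Same failure twice: polling again will not help.
                return "stopped early: " + status
            seen_failures.add(status)
        if attempt < max_checks - 1:
            sleep(interval_s)
    return "timed out"


# Demo with canned statuses and a no-op sleep so it finishes instantly.
statuses = iter(["pending", "failed: lint", "green"])
result = watch(lambda: next(statuses), interval_s=20 * 60, max_checks=4, sleep=lambda s: None)
print(result)
```

Note that "pending" never counts as a failure. Only a repeated failure ends the loop early, which is what keeps it from burning an hour on something that will not fix itself.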

4. Keep documentation in sync with the code

Docs drift because nobody wants to check them manually.

This is a good daily or weekly automation because the risk is low and the output is easy to review.

Create a daily Codex automation that checks for documentation drift: compare docs against the current implementation. If docs and code are out of sync, open a docs-only PR. If docs contain errors, fix them in the same PR. Do not change implementation code. Stop if the correct behavior is unclear.

This is a nice pattern because the agent can do the boring comparison work, but the final artifact is still a PR.

That is the pattern I keep coming back to:

inspect the system
make a safe change
open something reviewable
stop before product judgement

5. Triage GitHub issues every Monday

Backlogs rot quickly.

Issues get duplicated. Labels get stale. Work gets completed but the ticket stays open. Stuff that is blocked looks ready.

That is perfect recurring agent work.

Codex:

Create a standalone Codex automation that runs every Monday morning: review the GitHub issue backlog, ensure every ticket has useful labels, identify duplicates and already completed issues with evidence, and close issues only when project policy clearly allows it. Otherwise, leave a short comment recommending closure and report anything needing human judgement.

Claude Code:

/schedule every Monday morning triage the GitHub issue backlog. Fix labels, identify duplicates, correct stale states, and report anything that needs maintainer judgement. Close issues only when project policy clearly allows it. Success means every issue is labelled, closed, or blocked with a clear note.

This is not glamorous, but it makes the work system more trustworthy.

6. Deploy a new app from scratch

This is a good example of a long-running goal.

I used agents to deploy a new app to GCP from scratch. The task could take an hour or more, but it was still a good fit because the work was mostly mechanical and the finish line was clear.

The agent needed to inspect the app, choose the smallest sensible deploy path, configure GCP, run the deploy, fix build or runtime errors, hit the live URL, inspect logs, and verify the app worked.

That is exactly the kind of task I do not want to babysit. Done by hand, it would have taken me many hours.

/goal Deploy this app to GCP and verify it at a live URL. Use Cloud Run and Cloud SQL for PostgreSQL. Inspect the framework, build/start commands, env vars, and existing docs; choose the smallest sensible configuration; deploy; then verify the URL loads, main route works, logs show no startup errors, and documented smoke tests pass. Keep going until it is live and verified or blocked with evidence. Stop if access, secrets, billing, IAM, auth, permissions, data deletion, or product decisions are unclear, or if the same deploy failure repeats twice.

This works because the agent has a clear external check.

It is not done when the deploy command exits. It is done when the live app works.
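Here is a small sketch of what that external check could look like. Everything in it is an assumption for illustration: `verify_deploy` is a made-up helper, `fetch_status` stands in for a real HTTP client, the URL is a placeholder, and matching "ERROR" in log lines is a deliberately crude rule.

```python
from typing import Callable


def verify_deploy(
    exit_code: int,
    fetch_status: Callable[[str], int],
    base_url: str,
    routes: list[str],
    startup_logs: list[str],
) -> list[str]:
    """Return a list of problems. An empty list means live and verified."""
    problems: list[str] = []
    if exit_code != 0:
        problems.append(f"deploy command exited with {exit_code}")
    # A clean exit is not enough: the routes must actually respond.
    for route in routes:
        status = fetch_status(base_url + route)
        if status != 200:
            problems.append(f"{route} returned HTTP {status}")
    for line in startup_logs:
        if "ERROR" in line:
            problems.append(f"startup log error: {line}")
    return problems


# Demo with a fake fetcher: the deploy command succeeded, but one route is broken.
fake_responses = {
    "https://example.invalid/": 200,
    "https://example.invalid/health": 503,
}
problems = verify_deploy(
    exit_code=0,
    fetch_status=lambda url: fake_responses[url],
    base_url="https://example.invalid",
    routes=["/", "/health"],
    startup_logs=["INFO server listening"],
)
print(problems)
```

In the demo the deploy command exits cleanly and the check still fails. That gap is exactly why the goal is defined by the live app and not by the exit code.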

7. Work through labelled issues

This is where goals start to feel like real delegation.

If you label issues clearly, the agent has a queue it can work through.

Recently, I’ve started using GitHub milestones as the next step up from labels. Labels tell the agent what kind of issue it is or whether it is ready. A milestone gives it a bounded phase of work.

That feels much closer to real delegation. Here is the phase. Here are the tickets. Keep going until the milestone is complete, verified, and closed.

For example, in a repo you might use:

Ready issues: the agent may work on them now.
Blocked issues: the agent should leave them alone unless the blocker is gone.

Then the goal becomes simple.

/goal Work through open issues in owainlewis/factory labelled factory-ready, one at a time. Open one PR per issue, keep at most two PRs open at once, and run go test ./... before opening or updating each PR. For blocked, duplicate, or already-completed issues, leave evidence and close only when policy clearly allows it. You are done when every issue is closed, has a PR, or is clearly blocked. Stop before risky changes, unclear product decisions, merge authority questions, or failing tests you cannot explain.

This is the bigger idea underneath all of this.

Think in systems.

Labels give the agent routing. Tests give it verification. PRs give you review. Stop rules keep judgement-heavy decisions with you.
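The routing part of that goal is simple enough to sketch. This is an illustration only: the `Issue` dataclass and `next_issues` function are hypothetical, and I am assuming a label literally named "blocked" for blocked issues, which the prompt above does not specify.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    number: int
    labels: set[str]
    has_open_pr: bool = False


def next_issues(
    issues: list[Issue],
    ready_label: str = "factory-ready",
    blocked_label: str = "blocked",  # assumed label name
    max_open_prs: int = 2,
) -> list[Issue]:
    """Pick the issues the agent may start now, respecting the open-PR cap."""
    open_prs = sum(1 for issue in issues if issue.has_open_pr)
    capacity = max(0, max_open_prs - open_prs)
    ready = [
        issue
        for issue in issues
        if ready_label in issue.labels
        and blocked_label not in issue.labels
        and not issue.has_open_pr
    ]
    ready.sort(key=lambda issue: issue.number)  # oldest issue first
    return ready[:capacity]


queue = [
    Issue(1, {"factory-ready"}, has_open_pr=True),
    Issue(2, {"factory-ready"}),
    Issue(3, {"factory-ready", "blocked"}),
    Issue(4, {"factory-ready"}),
    Issue(5, {"bug"}),
]
picked = [issue.number for issue in next_issues(queue)]
print(picked)
```

With one PR already open and a cap of two, only one new issue is picked. The unlabelled and blocked issues are never touched, which is the whole point of routing by label.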

The pattern

Most useful agent systems have the same shape:

A recurrence or event starts the work.
A goal-shaped prompt defines done.
The agent checks evidence.
The agent opens a PR, writes a report, or asks for help.
Risky decisions stay with a human.

That is the part I think people miss.

The value is not unlimited autonomy.

The value is bounded follow-through.

Use prompts for one-off work.

Use goals when the agent needs to keep working until something is true.

Use recurring runs when the agent needs to check back later.

Combine them when you want a system.

I made a full video walking through goals, loops, schedules, recurring runs, and how I use them in Codex and Claude Code.

I also published the full lesson and prompt pack here:

Lesson: Agent goals and loops lesson
Prompt pack: copyable goal prompts
Scheduled agents: where scheduled agents run

If you want to go deeper on building real software and systems with AI: https://aiengineer.co

Thanks for reading.

See you in the next one,

Owain

The AI Engineer
