Self-accountable Agent

We don't fix bugs the same way we develop new features with agents, which is why fixing a bug takes us longer than developing a new feature.

In development, we can hand an agent a focused task, give it repository context, let it inspect the codebase, write tests, make a patch, and open a pull request. The workflow is fast, structured, and surprisingly effective when the task is well-scoped.

But after deployment, things often fall back to the old way.

A production error appears. Someone has to notice it. Someone has to open the logs. Someone has to copy the relevant stack trace into a ticket. Someone has to figure out whether this is new or recurring. Someone has to assign it. Then, finally, someone or some agent starts debugging.

If agents are already good at working through scoped engineering tasks, production incidents should be shaped into those tasks automatically.

So we built a loop around Cloudflare Tail Workers and a dedicated AI agent.

How sauce was made?

Our API runs on Cloudflare Workers, so we used a Cloudflare Tail Worker to observe production runtime events.

The main Worker forwards trace events to a dedicated tail consumer:

[[tail_consumers]]
service = "service-xyz-tail"

The Tail Worker itself stays small:

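export default {
  // tail() receives a batch of trace events from the main Worker.
  async tail(events, env) {
    // recordOpsTraceItems is the ops telemetry processor described below
    // (import omitted here); env.DB is the database binding it writes to.
    await recordOpsTraceItems(env.DB, events);
  },
};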

It receives production telemetry and passes it into our ops telemetry processor.

That processor does the important work:

Ignore non-actionable noise.
Normalize each event.
Redact sensitive data.
Generate a fingerprint.
Store the raw event summary.
Create or update an incident.
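
To make the first three steps concrete, here is a rough sketch rather than the production code. The field names on the incoming items (scriptName, outcome, exceptions, event.request, event.response, eventTimestamp) follow the tail event shape as we understand it and should be checked against the runtime you target. The function names and the redaction patterns are hypothetical.

```javascript
// Hypothetical redaction patterns; a real list would cover whatever
// sensitive data your own logs can contain.
const REDACTIONS = [
  [/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "<email>"],
  [/Bearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer <token>"],
];

function redact(text) {
  return REDACTIONS.reduce(
    (out, [pattern, label]) => out.replace(pattern, label),
    String(text)
  );
}

// Step 1: drop events that finished normally and threw nothing.
function isActionable(item) {
  return item.outcome !== "ok" || (item.exceptions ?? []).length > 0;
}

// Steps 2 and 3: flatten each item into a stable record and redact it.
function normalizeTraceItem(item) {
  const exception = (item.exceptions ?? [])[0] ?? {};
  const request = item.event?.request;
  return {
    worker: item.scriptName ?? "unknown",
    outcome: item.outcome ?? "unknown",
    method: request?.method ?? null,
    // Keep only the path: query strings often carry tokens or user data.
    path: request?.url ? new URL(request.url).pathname : null,
    status: item.event?.response?.status ?? null,
    exceptionType: exception.name ?? null,
    message: exception.message ? redact(exception.message) : null,
    timestamp: item.eventTimestamp ?? null,
  };
}

function toSanitizedEvents(items) {
  return items.filter(isActionable).map(normalizeTraceItem);
}
```

Filtering before normalizing keeps healthy traffic out of the tables entirely, so only failures pay the cost of redaction and storage.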

We store the results in two tables:

ops_log_events for individual sanitized events
ops_incidents for grouped failures

Each incident is keyed by a fingerprint derived from stable parts of the failure: Worker name, route pattern, exception type, normalized message, top stack frame, and status code.
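
A fingerprint along these lines could be sketched as follows. This is an illustration under assumptions, not the exact implementation: the event field names are hypothetical, the message normalization rules are examples, and FNV-1a is swapped in as the hash only to keep the sketch dependency-free.

```javascript
// Replace volatile fragments so "user 123 not found" and
// "user 456 not found" normalize to the same message.
function normalizeMessage(message) {
  return String(message ?? "")
    .replace(/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi, "<uuid>")
    .replace(/\b0x[0-9a-f]+\b/gi, "<hex>")
    .replace(/\d+/g, "<n>")
    .trim()
    .toLowerCase();
}

// FNV-1a (32-bit) over UTF-16 code units: a small non-cryptographic hash.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical event fields mirroring the stable parts listed above.
function fingerprint(event) {
  const parts = [
    event.worker,
    event.routePattern,
    event.exceptionType,
    normalizeMessage(event.message),
    event.topFrame,
    event.status,
  ];
  // JSON.stringify keeps field boundaries unambiguous between parts.
  return fnv1a(JSON.stringify(parts.map((part) => part ?? null)));
}
```

A 32-bit hash is enough to show the idea, but it can collide at volume. A longer digest such as SHA-256 would be the safer choice for a real incident key.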

This prevents ticket spam. If the same error happens 20 times, it becomes one incident with an updated occurrence count.
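
The create-or-update behaviour can be illustrated with an in-memory stand-in for ops_incidents. The Map, the function name, and the incident fields here are assumptions made for the sketch; the real table columns are not shown in this post.

```javascript
// In-memory stand-in for the ops_incidents table, keyed by fingerprint.
function recordIncident(incidents, fingerprint, event) {
  const existing = incidents.get(fingerprint);
  if (existing) {
    // Same failure again: bump the counter instead of creating a new row.
    existing.occurrences += 1;
    existing.lastSeen = event.timestamp;
    return existing;
  }
  const incident = {
    fingerprint,
    status: "open",
    occurrences: 1,
    firstSeen: event.timestamp,
    lastSeen: event.timestamp,
    sample: event, // one sanitized event kept as context for the ticket
    issueId: null, // filled in once a ticket exists
  };
  incidents.set(fingerprint, incident);
  return incident;
}

// Usage: twenty identical failures collapse into a single incident.
const incidents = new Map();
for (let i = 0; i < 20; i++) {
  recordIncident(incidents, "a1b2c3d4", { message: "boom", timestamp: i });
}
```

In a SQL table, the same effect usually comes from a unique constraint on the fingerprint column combined with an upsert that increments the count.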

Turning Incidents Into Agent Work

Every five minutes, a scheduled Worker job checks for open incidents that do not yet have an issue in the tracker our agents work from.

For each new incident, it creates a ticket and assigns it to a dedicated fixer agent for Cloudflare Workers.
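
The ticketing pass might look roughly like this. Treat it as a sketch under assumptions: tracker.createIssue stands in for whatever issue tracker API is in use, "worker-fixer-agent" is a made-up assignee id, and the incident fields are hypothetical.

```javascript
// Build a structured task the agent can start from.
function buildTicket(incident) {
  return {
    title: `[${incident.worker}] ${incident.exceptionType}: ${incident.message}`,
    assignee: "worker-fixer-agent", // hypothetical agent identifier
    body: [
      `Fingerprint: ${incident.fingerprint}`,
      `Occurrences: ${incident.occurrences}`,
      `Route: ${incident.routePattern}`,
      `Top stack frame: ${incident.topFrame}`,
    ].join("\n"),
  };
}

// Create one ticket per open incident that does not have an issue yet.
// `tracker` is a hypothetical client exposing createIssue(ticket) -> { id }.
async function fileTicketsForNewIncidents(incidents, tracker) {
  const created = [];
  for (const incident of incidents) {
    if (incident.status !== "open" || incident.issueId) continue;
    const issue = await tracker.createIssue(buildTicket(incident));
    // Record the issue id so the next run skips this incident.
    incident.issueId = issue.id;
    created.push(issue.id);
  }
  return created;
}

// In a Worker this would be called from the scheduled() handler,
// with a cron trigger such as "*/5 * * * *".
```

Storing the issue id on the incident is what makes the job safe to rerun: a repeat pass over the same incidents creates nothing new.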

Is it really that cool?

This closes a gap in agent-based development.

Most teams think about agents in the inner loop: writing code, refactoring, testing, reviewing. But production bugs live in the outer loop. They need triage before they become code tasks.

Without automation, that outer loop is manual:

Notice an error.
Find the logs.
Decide if it matters.
Create a ticket.
Copy context.
Assign it.
Ask someone to investigate.

With this system, production does the first half itself:

Runtime error occurs.
Tail Worker captures it.
Telemetry processor sanitizes and fingerprints it.
Incident is created or updated.
Ticket is created.
Agent starts from a structured task.

The human review point remains where it should be: the pull request.

Where are we heading?

We do not want agents making unchecked production changes. But we do want software systems that can turn real-world signals into structured work automatically.

Today, that signal is a production error. A Cloudflare Worker fails, the Tail Worker captures the incident, and an agent receives a focused debugging task.

Soon, the signal will not only come from production logs.

We are moving toward agents that can attend calls, understand the discussion, and proactively start building from it. If a client call surfaces a workflow problem, a product idea, or a repeated operational pain point, the agent should be able to capture that context and turn it into something tangible: a prototype, a draft implementation plan, or a pull request for a narrow first version.

That changes the role of meetings. A call no longer ends with someone manually translating notes into tickets and follow-ups. The discussion itself can become the starting point for software.
