Tim Fraser

When Amazon's own AI agent deleted production (and what to ask before yours does)

Cloud Operations Lead

22 May 2026

Somebody on your team is going to wire an AI agent into your cloud this year. The question is no longer if. It is which one, and what it can touch.

It already happened to Amazon.

What actually happened

In December 2025, Amazon gave their own AI coding agent, called Kiro, operator-level permissions in an AWS account. They asked it to fix a small bug in Cost Explorer. Kiro analysed the environment, decided the cleanest path was to delete the entire production environment and rebuild it from scratch, and proceeded.

It deleted production. The rebuild stalled. A 13-hour outage opened up in one of Amazon's China regions.

The Financial Times broke the story two months later, citing anonymous AWS employees. Amazon's official line at the time was user error. A second, near-identical incident followed a few weeks later involving a different Amazon AI coding tool. Amazon then made peer-review mandatory for any AI-initiated production change. That is the kind of policy you write after you've already wiped prod twice.

If the company that wrote AWS can't keep their own agent off the delete button, the question for the rest of us is no longer whether this can happen. It's what we're going to do about it before our own version of the story shows up in someone else's newspaper.

This was not an AI failure. It was an architectural one.

It is tempting to read the Kiro incident as "the model hallucinated and deleted prod." That is not what happened. The agent followed instructions. It was given operator-level keys and a goal. It picked a path through the goal-space and executed. The path it picked was bad. The path it picked was also one that the keys it held permitted.

This is the part to internalise: the model is not the problem. The grant is.

Modern AI agents are perfectly happy to do what you let them do. If you give them the ability to delete a region, deleting a region is on the table. If you give them the ability to email customers, emailing customers is on the table. They are loop-and-tool machines. The loop reasons, the tools act. Any guardrail that lives inside the loop is a soft suggestion. The hard guardrails live outside the loop, in the permissions, in the network boundary, in the audit trail.
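To make "outside the loop" concrete, here is a minimal sketch of a tool dispatcher that checks each requested action before anything runs. The names (`dispatch`, `is_read_only`, `ToolCallDenied`) are hypothetical and not taken from any real framework, and the prefix list is an assumption for illustration. An in-process check like this is only a second layer; the boundary that actually holds is still the IAM policy on the role.

```python
from typing import Any, Callable

# Assumption for illustration: only these operation prefixes count as read-only.
READ_ONLY_PREFIXES = ("describe", "list", "get")


class ToolCallDenied(Exception):
    """Raised when the agent loop asks for an action outside the allowlist."""


def is_read_only(action: str) -> bool:
    """Return True for actions like 'ec2:DescribeInstances'.

    Anything without a 'service:Operation' shape is denied by default.
    """
    _service, sep, operation = action.partition(":")
    if not sep:
        return False
    return operation.lower().startswith(READ_ONLY_PREFIXES)


def dispatch(action: str, handler: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Run `handler` only if `action` passes the allowlist.

    The check sits in ordinary code, not in the prompt, so the model's
    reasoning has no way to argue past it.
    """
    if not is_read_only(action):
        raise ToolCallDenied(f"{action} is not permitted for this agent")
    return handler(*args, **kwargs)
```

With this in place, a call such as `dispatch("ec2:TerminateInstances", handler)` raises `ToolCallDenied` before `handler` is ever invoked, whatever the loop decided.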

Before this work, I spent thirty years running Linux servers, hosting, and infrastructure operations. Every painful incident I sat through followed the same shape: a process that could do something destructive, did. The fix was never "the human meant well." The fix was always "the human, or the script, no longer has the keys to do that."

Five questions to ask before you let an AI agent near your AWS

These are the five I ask any vendor who wants to put an agent inside our infrastructure, and the five we built plainfra to answer.

1. What's actually in the IAM policy? Not what the vendor says the agent can do. The policy document itself. Can you read it before you deploy it? Does it use action wildcards? Does it have any of Create, Put, Modify, Delete, or Update? If you can't see the JSON, the answer is too much.

2. What's the trust constraint? Anyone who knows your role ARN can attempt to assume it. The protection against that is an ExternalId that's unique to your tenancy, in the role's trust policy. If a vendor's onboarding doesn't generate one per-customer, the role is one leaked ARN away from being assumable by anyone.

3. Where does the agent run, and where does the data live? Most "AI agents for AWS" you can buy today route prompts and responses through a US-controlled SaaS. For a quick lookup that's fine. For an agent that's reading your production AWS account, that's a different conversation. US Cloud Act exposure on your data plane is a yes-or-no question, and the answer matters.

4. Is the agent loop yours, or someone else's? A lot of "AI agents for AWS" are forks. They take an open-source agent framework, bolt on AWS credentials, paint a logo on it, and ship. When the upstream framework decides production looks tidier deleted, the wrapper has no way to stop it. It was never the wrapper's design.

5. What's the audit story? When somebody in the business asks "why did the agent do that?", the answer "we're not sure" is not acceptable. Every prompt, every tool call, every reply needs to be retrievable. Retention has to be long enough that the incident question gets asked while the trail still exists.
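Question 1 can be asked mechanically once you have the JSON in hand. The sketch below is a rough first pass, not a full IAM evaluator: it walks the Allow statements of a policy document and flags any action that contains a wildcard or starts with one of the mutating verbs named above. The function name `find_risky_actions` and the example policy are hypothetical. It does not look at resources, conditions, or permission boundaries.

```python
import json

# The mutating verbs from question 1. IAM action names are case-insensitive,
# so the comparison is done in lower case.
MUTATING_PREFIXES = ("create", "put", "modify", "delete", "update")


def _as_list(value):
    """IAM allows a single string or a list in most fields; normalise to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, (str, dict)) else list(value)


def find_risky_actions(policy: dict) -> list[str]:
    """Return the allowed actions that deserve a second look before deploying."""
    risky = []
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        if "NotAction" in stmt:
            # Allow + NotAction grants everything except the listed actions.
            risky.append("NotAction")
        for action in _as_list(stmt.get("Action")):
            operation = action.split(":", 1)[-1].lower()
            if "*" in action or operation.startswith(MUTATING_PREFIXES):
                risky.append(action)
    return risky


# Hypothetical policy, for illustration only.
EXAMPLE_POLICY = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow",
     "Action": ["s3:GetObject", "ec2:Describe*", "s3:DeleteBucket"],
     "Resource": "*"}
  ]
}
""")

print(find_risky_actions(EXAMPLE_POLICY))  # ['ec2:Describe*', 's3:DeleteBucket']
```

A wildcard such as `ec2:Describe*` is flagged even though it is probably read-only. That is deliberate: a wildcard is something a human should read and accept, not something a script should wave through.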

What plainfra does, and doesn't, let its agent do

We built plainfra after the Replit pocket-noise story made the rounds in 2024, two full years before Kiro. The design constraint was the answer to question 1 above: the IAM policy is read-only. Not by good intentions. By the role document we ask you to deploy.

The role allows Describe, List, and Get. It has nothing on Create, Put, Modify, or Delete. You can diff the CloudFormation template on our site before you deploy it. The trust policy carries a unique ExternalId per customer.
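For readers who have not seen one, this is roughly what a trust policy with an ExternalId condition looks like, together with a check you could run against any vendor's template. It is a sketch under stated assumptions and not plainfra's actual role document: the account ID `111122223333` is a placeholder, and `build_trust_policy` and `requires_external_id` are hypothetical helper names. The policy shape (`sts:AssumeRole` with a `StringEquals` condition on `sts:ExternalId`) is the standard IAM pattern for third-party access.

```python
import secrets


def build_trust_policy(vendor_account_id: str, external_id: str) -> dict:
    """Trust policy letting one vendor account assume the role,
    but only when it presents the agreed ExternalId."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{vendor_account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def requires_external_id(trust_policy: dict) -> bool:
    """True only if every Allow statement for sts:AssumeRole pins an ExternalId.

    Simplification: expects the exact key spellings used above.
    """
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    found_assume_role = False
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        found_assume_role = True
        condition = stmt.get("Condition", {}).get("StringEquals", {})
        if not condition.get("sts:ExternalId"):
            return False
    return found_assume_role


# One fresh, unguessable value per customer (placeholder account ID).
policy = build_trust_policy("111122223333", secrets.token_urlsafe(24))
print(requires_external_id(policy))  # True
```

If a vendor's template fails a check like this, or ships the same ExternalId to every customer, question 2 has its answer.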

The agent has two tools. One queries AWS through the scoped role. One fetches public AWS documentation. Neither mutates anything. There is no third tool waiting to be enabled, because we never wrote one. There is no upstream framework with an autonomy mode someone forgot to switch off, because we wrote the agent loop ourselves, in-house, for this specific job.

Discovery and analysis run as two separate agents (we call them hawk and sentinel). The one that touches your AWS account cannot write your report. The one that writes your report cannot touch AWS. If one of them goes off the rails, the rails it goes off lead to a known-empty room.

The whole thing is hosted in Sydney, in our own AWS account. Your inventory, your cost figures, and your security posture all stay inside your account. Conversation logs are retained for 90 days. Reports for 365. Bring your own compliance question; we have the trail.
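As a rough illustration of what a retention rule like that means in practice, here is a minimal sketch of an audit record and a retention check. Only the 90-day and 365-day figures come from the text above; the record shape, the field names, and `is_retrievable` are assumptions made for the example and say nothing about how the real store is built.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Retention windows quoted above; everything else here is illustrative.
RETENTION = {
    "conversation": timedelta(days=90),
    "report": timedelta(days=365),
}


@dataclass(frozen=True)
class AuditRecord:
    kind: str            # "conversation" or "report"
    event: str           # e.g. "prompt", "tool_call", "reply"
    detail: str          # what was asked, called, or answered
    created_at: datetime


def is_retrievable(record: AuditRecord, now: datetime) -> bool:
    """True while the record is still inside its retention window."""
    return now - record.created_at <= RETENTION[record.kind]


call = AuditRecord(
    kind="conversation",
    event="tool_call",
    detail="ec2:DescribeInstances",
    created_at=datetime(2026, 1, 1, tzinfo=timezone.utc),
)

# 59 days later the tool call can still be pulled up; 120 days later it cannot.
print(is_retrievable(call, datetime(2026, 3, 1, tzinfo=timezone.utc)))  # True
print(is_retrievable(call, datetime(2026, 5, 1, tzinfo=timezone.utc)))  # False
```

The useful exercise is to put your own numbers in: how long after an incident does your organisation usually get around to asking why it happened?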

Read-only isn't a limitation. It's the point.

People sometimes ask whether a read-only agent is too boring. Can it actually do anything useful? My answer is that reading is ninety percent of the job. The senior engineer in your team isn't being paid to type terraform apply. They're being paid to know which thing is on fire, why, and what to do about it. Most of that is reading.

An agent that reads everything you have, very quickly, and writes you a sentence you can act on, is the part of your senior engineer's job that scales. The part that acts is the ten percent where you actually want a human with their name attached to the change.

A reasonable thing to do this week

Spin up the free trial. Deploy the read-only role into a non-critical AWS account first. Ask the agent a few questions you already know the answers to, see how it does on those, then ask it some you don't. If you don't like what you see, the worst case is that you stop using it and remove the role. There is nothing else to clean up, because there was nothing else to make a mess.

Start the read-only trial →