A DN42 network scanning agent with unchecked AWS credentials provisioned five m8g.12xlarge instances, load balancers, and Lambda functions over roughly 24 hours in May 2026, running up a verified bill of $6,531.30 before the operator noticed.
Over 9 and 10 May 2026, an autonomous AI agent was told to scan DN42, a hobbyist peer-to-peer network. To do the job it began provisioning AWS infrastructure on its own. According to the operator's write-up, the agent spun up five m8g.12xlarge instances along with load balancers and Lambda functions, then redeployed duplicates of resources it had already created.
The result was a verified bill of $6,531.30 accumulated in roughly 24 hours, as documented by Bovo Digital and Decrypt. The operator had told the agent to continue without reviewing each step, so no human looked at any individual provisioning decision while the spend mounted.
A task that needed modest compute became a four-figure invoice. The cost did not come from one large mistake. It came from many provisioning actions, each plausible on its own, that no boundary stopped and no checkpoint paused.
What actually failed: the governance gap
The agent held the authority to provision arbitrary AWS resources at arbitrary size. Scanning a hobbyist network is a small job. The grant that backed it was not small. Nothing tied the size or quantity of resources the agent could create to the scale the task actually required, so five very large instances were as available to the agent as a single small one.
The second gap was the standing instruction to continue without review. That instruction collapsed every future decision into a single up-front authorization. A blanket "keep going" is not a control. It removes the human from the loop for the rest of the run, including from the redeploy loop that kept recreating resources the agent had already provisioned. There was no per-action checkpoint to break that loop or to question why the same infrastructure was being stood up again.
Together these gaps meant the spend had no ceiling that the agent could not pass on its own. Each provisioning call was treated like routine work, and routine work compounds quietly until the bill arrives.
How MakerChecker changes the outcome
MakerChecker governs the action, not the agent's plan. A scanning role is granted the skills its work needs at an approved risk tier, and the size and quantity of what it can provision are part of that grant.
A sketch of the configuration:
- Role network-scan-agent is granted cloud.provision@1 at a small tier only. The grant covers modest instances in limited number, which is what scanning a hobbyist network requires.
- An attempt to provision five m8g.12xlarge instances exceeds the granted tier. Deny-by-default and least privilege mean an action outside the approved tier is refused, so the large fleet is denied at the control plane as the wrong tier before any instance is launched.
- Provisioning inside the small tier is routed through an approval gate that requires named human sign-off per deploy. Because each provisioning action hits the gate, the redeploy loop cannot run unattended. The duplicate that the agent tries to stand up a second time waits for a person, which breaks the cycle.
- Every attempt, the grant in force, the requested tier, and the denial or approval are written to a tamper-evident, Ed25519-signed, hash-chained audit that can be verified offline. The record aids any later cost dispute by showing exactly what was attempted and what was authorized.
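The grant check and approval gate described in the list above can be sketched in a few lines of Python. This is an illustration only, not MakerChecker's actual API or configuration format: the names Grant, ProvisionRequest, GRANTS, and evaluate are hypothetical, and the choice of t4g.small and t4g.medium with a limit of two instances as the "small tier" is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    DENY = "deny"
    PENDING_APPROVAL = "pending_approval"
    ALLOW = "allow"


@dataclass(frozen=True)
class Grant:
    """What one role may do with one skill (hypothetical shape)."""
    role: str
    skill: str
    allowed_sizes: frozenset
    max_count: int


@dataclass(frozen=True)
class ProvisionRequest:
    role: str
    skill: str
    instance_type: str
    count: int


# Assumed small tier: the sizes and the count limit are illustrative only.
GRANTS = {
    ("network-scan-agent", "cloud.provision@1"): Grant(
        role="network-scan-agent",
        skill="cloud.provision@1",
        allowed_sizes=frozenset({"t4g.small", "t4g.medium"}),
        max_count=2,
    )
}


def evaluate(request, approved_by=None):
    """Decide one provisioning action against the role's grant."""
    grant = GRANTS.get((request.role, request.skill))
    # Deny by default: a role with no matching grant can do nothing.
    if grant is None:
        return Decision.DENY
    # Size and quantity are part of the grant, so both are checked.
    if request.instance_type not in grant.allowed_sizes:
        return Decision.DENY
    if request.count < 1 or request.count > grant.max_count:
        return Decision.DENY
    # In-tier requests still wait for a named human on every deploy.
    if not approved_by:
        return Decision.PENDING_APPROVAL
    return Decision.ALLOW


fleet = ProvisionRequest("network-scan-agent", "cloud.provision@1", "m8g.12xlarge", 5)
small = ProvisionRequest("network-scan-agent", "cloud.provision@1", "t4g.small", 1)

print(evaluate(fleet))                       # Decision.DENY
print(evaluate(small))                       # Decision.PENDING_APPROVAL
print(evaluate(small, approved_by="alice"))  # Decision.ALLOW
```

Because the approval is passed per call rather than stored on the role, a repeated request for the same resource returns to PENDING_APPROVAL each time, which is the property that stops an unattended redeploy loop in this sketch.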
In the runnable scenario, the agent calls cloud.provision for five large
instances. The count and size exceed the granted tier, so the grant check fails
and the action never reaches AWS. A smaller, in-tier provisioning request is held
at the approval gate rather than executed on the agent's say-so. The four-figure
fleet is never created, and the artefact is a signed denial naming the role, the
skill, and the requested tier.
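To make the idea of a hash-chained audit record concrete, here is a minimal sketch using only the Python standard library. It shows the chaining and offline verification only. The Ed25519 signature that the audit also carries is omitted, since it needs a cryptography library outside the standard library. The record fields and the function names entry_hash, append, and verify are hypothetical and do not reflect MakerChecker's real artefact format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, record):
    """Hash the previous entry's hash together with a canonical record encoding."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log, record):
    """Append a record, linking it to the hash of the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"record": record, "prev": prev_hash, "hash": entry_hash(prev_hash, record)}
    log.append(entry)
    return entry


def verify(log):
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, {
    "role": "network-scan-agent",
    "skill": "cloud.provision@1",
    "requested": {"instance_type": "m8g.12xlarge", "count": 5},
    "decision": "deny",
    "reason": "exceeds granted tier",
})
append(log, {
    "role": "network-scan-agent",
    "skill": "cloud.provision@1",
    "requested": {"instance_type": "t4g.small", "count": 1},
    "decision": "pending_approval",
})

print(verify(log))  # True
```

If anyone later rewrites the first record, for example changing "deny" to "allow", verify(log) returns False, because the stored hash no longer matches the recomputed one. Verification needs only the log itself, which is what makes an offline check possible.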
What MakerChecker would not fix
MakerChecker is not a billing meter and not a hard spend cap. It does not watch the AWS invoice or cut the agent off at a dollar figure. It governs each action against the role's grant, so cost control comes from per-action tier limits rather than from metering the bill.
It also does not override the operator's own instruction. A blanket "continue without reviewing" still authorizes whatever the role is permitted to do within tier. If an operator grants a large tier and waves through every gate, large resources can still be provisioned. The defense is in scoping the grant tightly and forcing sign-off per action, not in second-guessing a human who chooses to approve. The agent can still propose oversized infrastructure. It can no longer stand it up without crossing a tier boundary it was never granted.
See the configuration: examples/rogue-ai/dn42-agent-runaway-aws-cloud-bill