Most agent checklists you will find online are accuracy checklists. Does it hallucinate? Does it handle edge cases? Is the latency acceptable? Those matter, but they are the wrong list for a regulated firm. The question that decides whether your agent ships is not "does it work?" It is "can you account for what it did?"

The list below is the one a governance, QA, pharmacovigilance, or regulatory-affairs lead should run before an agent touches anything that counts. Each item is something you verify, not something you hope. If you cannot tick it with evidence, the agent is not ready, no matter how good the demo looked.

1. Identity: the agent is named, not anonymous

Before anything else, confirm the agent acts as a named principal, a single identity that holds exactly one role at a time. Not a shared service account, not an anonymous process calling APIs with a stored key.

This is the precondition for every other item. You cannot authorize, limit, or audit an actor you cannot name. If three agents and a batch job all act as the same credential, you have already lost the ability to say who did what, and no later control can recover it.

Verify: every agent maps to one identity, every action carries that identity, and the identity holds one role for the duration of a run.
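
One way to check the first part of this is to scan the agent-to-credential bindings for any credential used by more than one actor. The sketch below is a minimal illustration, not a real inventory tool: the function name `shared_credentials`, the agent names, and the credential IDs are all hypothetical.

```python
from collections import defaultdict


def shared_credentials(bindings):
    """Return credentials used by more than one actor.

    `bindings` is an iterable of (actor_name, credential_id) pairs.
    The result maps each shared credential to the sorted actors using it.
    """
    users = defaultdict(set)
    for actor, credential in bindings:
        users[credential].add(actor)
    return {cred: sorted(actors) for cred, actors in users.items() if len(actors) > 1}


# Hypothetical inventory: two agents share one stored key.
bindings = [
    ("triage-agent", "key-1"),
    ("report-agent", "key-1"),
    ("batch-job", "key-2"),
]
violations = shared_credentials(bindings)
```

An empty result is the evidence you want; any entry names the actors whose actions can no longer be told apart.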

2. Grants: deny-by-default and versioned

Next, look at how the agent's capabilities are defined. The only safe posture is deny-by-default: the agent can do nothing except the specific actions its role was explicitly granted. Everything not granted is refused, not permitted by silence.

Then check the part most teams skip: that grants are versioned. You should be able to reconstruct exactly what the agent was permitted to do on any past date, and see who approved each change. An examiner's question is rarely "what can it do now." It is "what was it allowed to do the day this happened, and who signed that off." We make the full case in deny-by-default permissions.

Verify: capabilities are an allow-list, not a block-list; the list is versioned; and each version records its approver.
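
A minimal sketch of what a versioned allow-list could look like, assuming grants are stored as dated versions per role. The names `GrantVersion` and `GrantHistory`, the approver IDs, and the action names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class GrantVersion:
    effective_from: date   # first day this version applies
    approver: str          # named human who signed off the change
    allowed: frozenset     # explicit allow-list of action names


class GrantHistory:
    """Append-only history of grant versions for one role (hypothetical)."""

    def __init__(self):
        self._versions = []

    def add(self, version):
        # Versions are appended in date order so past history is never rewritten.
        if self._versions and version.effective_from <= self._versions[-1].effective_from:
            raise ValueError("grant versions must be appended in date order")
        self._versions.append(version)

    def version_on(self, day):
        """Return the version in force on `day`, or None if none existed yet."""
        current = None
        for v in self._versions:
            if v.effective_from <= day:
                current = v
        return current

    def is_allowed(self, action, day):
        # Deny by default: no version, or an unlisted action, means refusal.
        v = self.version_on(day)
        return v is not None and action in v.allowed


history = GrantHistory()
history.add(GrantVersion(date(2025, 1, 1), "j.doe", frozenset({"read_case"})))
history.add(GrantVersion(date(2025, 6, 1), "a.khan", frozenset({"read_case", "draft_report"})))
```

With this shape, the examiner's question becomes a lookup: `history.version_on(some_date)` returns both the allow-list in force that day and the approver who signed it.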

3. Segregation of duties: it cannot approve its own work

This is the item teams most often fake. The agent that prepared a piece of work (drafted the report, cleared the alert, assembled the submission) must not be the one that approves it. Not "should not." Cannot, enforced structurally inside the run.

A configuration flag that you promise to set correctly is not segregation of duties. The test is whether the same agent, asked to both make and check on one run, is structurally refused, and whether that refusal is recorded. This is the oldest control in quality assurance, and it long predates AI: it is 21 CFR 211.22 in pharma, where the quality unit's separation of duties is a structural requirement, and the qualified-person release sign-off under EU GMP Annex 16, where the person who assesses a batch is not the person who prepared it.

Verify: an agent provably cannot be both maker and checker on the same run, and attempts to self-approve land in the log as refusals.
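
The test described above can be sketched in a few lines, assuming a run object that remembers who prepared the work. `Run` and `SelfApprovalRefused` are hypothetical names, and the in-memory list stands in for whatever audit log is actually used.

```python
class SelfApprovalRefused(Exception):
    """Raised when the maker of a run tries to act as its checker."""


class Run:
    def __init__(self, run_id):
        self.run_id = run_id
        self.maker = None
        self.log = []   # stand-in for the real audit trail

    def prepare(self, agent_id, artifact):
        self.maker = agent_id
        self.log.append(("prepared", agent_id, artifact))

    def approve(self, agent_id):
        if self.maker is None:
            raise RuntimeError("nothing has been prepared on this run")
        if agent_id == self.maker:
            # Record the refusal before raising, so the attempt is evidence.
            self.log.append(("refused_self_approval", agent_id, None))
            raise SelfApprovalRefused(f"{agent_id} prepared run {self.run_id}")
        self.log.append(("approved", agent_id, None))
```

The point of the design is that the check lives in `approve` itself rather than in a setting the caller can switch off, and that the refusal is written to the log before the exception is raised.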

4. Human gates on every one-way door

Some actions cannot be undone: releasing a drug batch, submitting a 15-day expedited safety report on an adverse-event case, filing a medical-device reportability determination, pushing a configuration to live medical devices. For each of these, the agent should park the run and demand a named human signature before it proceeds.

A real gate is more than a notification a tired reviewer clicks through. It can require a quorum of named approvers, it bars the requester from approving its own request, and it captures the signer's reason verbatim, so the signature carries its meaning, the way 21 CFR 11.50 requires a signed electronic record to show who signed, when, and what the signature means. A seriousness call that starts the expedited reporting clock is a mandated human decision; the gate is where that requirement becomes enforceable on a machine. We go deeper in human-in-the-loop approval gates.

Verify: every irreversible action is gated; the requester cannot sign its own request; and the signature records who, when, and why.
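
As a rough sketch of those three properties (quorum, no self-signing, reason captured verbatim), assuming a gate object attached to each irreversible action. The `Gate` class, its fields, and the signer IDs in the usage note are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Gate:
    action: str
    requester: str
    quorum: int = 1
    signatures: list = field(default_factory=list)

    def sign(self, signer, reason):
        if signer == self.requester:
            raise PermissionError("requester cannot sign its own request")
        if not reason.strip():
            raise ValueError("a signature must state its reason")
        if any(s["who"] == signer for s in self.signatures):
            raise ValueError("each approver may sign only once")
        # Record who, when, and why; the reason is stored exactly as given.
        self.signatures.append(
            {"who": signer, "when": datetime.now(timezone.utc), "why": reason}
        )

    def is_open(self):
        # The parked run may proceed only once the quorum is met.
        return len(self.signatures) >= self.quorum
```

A gate created as `Gate(action="release_batch", requester="agent-a", quorum=2)` stays closed until two distinct named signers, neither of them `agent-a`, have each signed with a reason.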

5. Limits: the agent cannot exceed its mandate by accident

Authorization tells you the agent is allowed to do a kind of thing. Limits tell you it cannot do too much of it. An agent scoped to quarantine a deviation batch should not be able to quarantine more lots than its role permits. An agent scoped to release batches should not be able to release more per hour than a human would ever process. An agent scoped to read records should not be able to read the entire database in one run.

These are not the same as content guardrails, which ask whether a message is dangerous. Limits ask whether the volume, value, or rate of authorized actions has slipped outside the band a human in that seat would stay inside. The distinction between the two is worth understanding before you ship; we draw it in governance versus guardrails.

Verify: value, volume, and rate ceilings exist for every consequential action, and breaching one stops the run rather than logging it after the fact.
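
A rate ceiling that stops the run, rather than logging the breach afterwards, might look like the following sliding-window sketch. `RateCeiling` and `LimitExceeded` are hypothetical names, and the caller passes the current time in seconds explicitly so the behaviour is easy to test; value and volume ceilings would follow the same check-before-act pattern.

```python
from collections import deque


class LimitExceeded(Exception):
    """Raised before the action executes, so the run stops at the ceiling."""


class RateCeiling:
    def __init__(self, max_actions, window_seconds):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._times = deque()   # timestamps of actions inside the window

    def check(self, now):
        """Call before each action; `now` is a timestamp in seconds."""
        # Forget actions that have aged out of the window.
        while self._times and now - self._times[0] >= self.window_seconds:
            self._times.popleft()
        if len(self._times) >= self.max_actions:
            raise LimitExceeded(
                f"more than {self.max_actions} actions in {self.window_seconds}s"
            )
        self._times.append(now)
```

Because `check` raises before the action is recorded or executed, a breach halts the run at the point of the attempt instead of surfacing in a report later.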

6. Audit: tamper-evident and verifiable offline

The final item is the one everything else exists to produce. Every action, model call, grant change, and approval must land in an append-only, hash-chained, cryptographically signed ledger. Change one record and the chain visibly breaks.

The test that separates a real audit trail from a log file is this: can a third party who distrusts you verify the export offline, against a published spec, with no access to your systems? That is the modern form of the tamper-evident audit trail 21 CFR 11.10(e) has demanded for decades, and the linking 11.70 requires between a record and the signature that approved it. A log you can edit is not evidence. A log only you can verify is not much better.

Verify: the trail is append-only and hash-chained; signatures use a verifiable key; and the evidence bundle checks out for someone offline who does not trust the vendor.
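
The hash-chaining part of this can be illustrated with the standard library alone. The sketch below is an assumption-laden minimum, not a published spec: the record layout, the `GENESIS` value, and the function names are hypothetical, and signing is deliberately left out because it needs an asymmetric-signature library.

```python
import hashlib
import json

GENESIS = "0" * 64   # hypothetical "previous hash" for the first record


def _digest(prev_hash, payload):
    # Canonical JSON so the same record always hashes to the same value.
    body = json.dumps(
        {"prev": prev_hash, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    )


def verify_chain(chain):
    """Recompute every link; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True
```

A verifier needs only the exported records and this function, with no access to the system that produced them. A hash chain alone does not detect truncation of the most recent records or a wholesale rebuild of the chain, which is one reason the ledger also has to be signed.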

The one-page version

Check	What "ready" looks like
Identity	Each agent is one named principal holding one role
Grants	Deny-by-default, versioned, with recorded approvers
Segregation of duties	Maker cannot be checker, structurally, per run
Human gates	Every one-way door requires a named signature
Limits	Value, volume, and rate ceilings stop the run
Audit	Hash-chained, signed, verifiable offline

Why no rulebook gets you off this list

You might wait for a regulator to hand you the agent version of this checklist. Do not. No agency has issued a supervisory template that tells you how to validate an agentic system, and the EU AI Act's high-risk obligations were deferred to December 2027. No template means no safe harbor.

The predicate rules underneath did not move. They govern what a human in that seat must do, and they are date-proof: the 21 CFR Part 11 audit-trail and signature requirements, ICH-GCP, the medical-device reporting rule at 21 CFR Part 803, EU GVP. Inspectors still ask who authorized the action, discovery still demands the trail, and a personally liable signer still has to account for the decision. The absence of an agent-specific rulebook removes the template, not the exposure. This list is how you close the gap between pilot and production before someone else writes the rules for you.

If you can tick all six with evidence, you have an agent an auditor can examine. If any item is "it's in the prompt," you have an agent with good intentions, and in a regulated industry, good intentions are not evidence.

See how it works, or book a demo to watch an agent get blocked from approving its own work, live.
