Compare
Four layers govern an AI agent. One asks who.
The agent tooling market splits cleanly once you ask what question each tool answers. Guardrails check the content. Gateways manage the model and the bill. Observability records what happened. Only an authorization layer decides whether the agent was allowed to act — and keeps proof. MakerChecker is that layer.
The four questions
They are not rivals. They are different jobs.
A serious deployment runs most of these together. The mistake is assuming one of the other three covers authorization. None of them do.
Agent governance & control
MakerChecker
“Is this actor authorized?”
Agents hold roles, roles hold deny-by-default grants (locked unless explicitly granted), the same agent cannot be maker and checker, high-risk steps wait for a named human, and every action lands in a signed, verifiable record.
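A minimal sketch of how those rules could compose, assuming a toy in-memory policy. Every name here (`Policy`, `decide`, the agent IDs, the actions) is hypothetical and chosen for illustration; this is not MakerChecker's actual API or data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical policy store: anything not explicitly granted is denied."""
    roles: dict[str, str]                    # agent id -> role
    grants: set[tuple[str, str]]             # explicit (role, action) pairs
    high_risk: set[str] = field(default_factory=set)

    def decide(self, agent: str, action: str,
               maker: str | None = None,
               approver: str | None = None) -> str:
        role = self.roles.get(agent)
        # Deny by default: unknown agents and ungranted actions are refused.
        if role is None or (role, action) not in self.grants:
            return "deny: no grant"
        # Segregation of duties: the checker may not be the maker.
        if maker is not None and maker == agent:
            return "deny: maker cannot check own work"
        # High-risk steps wait until a named human has approved.
        if action in self.high_risk and approver is None:
            return "pending: awaiting named human"
        return "allow"


policy = Policy(
    roles={"agent-a": "preparer", "agent-b": "reviewer"},
    grants={("preparer", "draft_payment"), ("reviewer", "approve_payment")},
    high_risk={"approve_payment"},
)

print(policy.decide("agent-a", "draft_payment"))
print(policy.decide("agent-b", "approve_payment", maker="agent-b"))
print(policy.decide("agent-b", "approve_payment", maker="agent-a"))
print(policy.decide("agent-b", "approve_payment", maker="agent-a",
                    approver="jane.doe"))
```

The order of the checks is the design choice worth noting: the grant lookup comes first, so an agent with no grant never reaches the approval queue at all.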
Content guardrails
“Is this content dangerous?”
Scan prompts and responses for prompt injection, PII, toxicity, or policy violations and block, observe, or steer. Essential — but blind to whether the actor was allowed to act at all.
e.g. Galileo, NeMo Guardrails, Bedrock Guardrails
LLM gateways
“Which model, at what cost?”
Route across model providers, hold virtual keys, enforce rate limits and budgets. Governance of model access and spend — not of what an agent is permitted to do in your systems.
e.g. Bifrost, LiteLLM, virtual keys
Observability & tracing
“What happened?”
Capture traces, spans, and token usage so you can debug and measure agents. A flight recorder: invaluable for engineering, but ordinary logs and traces are not tamper-evident, so they do not stand as evidence.
e.g. LangSmith, Langfuse, OTel
Side by side
What each layer can and cannot do.
The rows that decide whether an agent is employable in a regulated company are the ones only an authorization layer fills.
| Capability | Agent governance (MakerChecker) | Content guardrails | LLM gateways | Observability |
|---|---|---|---|---|
| The question it answers | Is this actor authorized? | Is this content dangerous? | Which model, at what cost? | What happened? |
| Unit of control | The agent’s action and authority | Prompt and response content | API calls and spend | Events, after the fact |
| Agent identity and roles | Yes | — | Virtual keys | — |
| Deny-by-default versioned grants (locked unless explicitly granted) | Yes | Content rules only | Model allowlists | — |
| Segregation of duties, enforced | Yes | — | — | — |
| Human approval gates | Yes | — | — | — |
| Tamper-evident, signed, offline-verifiable audit (a regulator can verify the record offline, with no access to your systems) | Yes | — | — | Logs, not evidence |
| Examiner-ready evidence for a regulated vertical | Yes | — | — | — |
| Runs in your environment, air-gapped (runs fully disconnected from the internet) | Yes | Varies | Varies | Varies |
Yes = built in · — = not in scope
Capabilities reflect each category's typical scope; individual products vary and evolve.
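To make "tamper-evident" concrete, here is a minimal sketch of a hash-chained record, assuming SHA-256 links and an HMAC as a stand-in signature. The function names and record layout are hypothetical, not MakerChecker's format. Note the assumption: an HMAC is symmetric, so the verifier needs the secret key; a record meant to be verified offline by a third party would more plausibly use an asymmetric signature, which the Python standard library does not provide.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev: str, event: dict) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_record(log: list, key: bytes, event: dict) -> None:
    """Append an event whose hash covers the previous record's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    digest = _digest(prev, event)
    sig = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
    log.append({"prev": prev, "event": event, "hash": digest, "sig": sig})


def verify(log: list, key: bytes) -> bool:
    """Recompute every link; any edited, dropped, or reordered record fails."""
    prev = GENESIS
    for rec in log:
        digest = _digest(rec["prev"], rec["event"])
        expected = hmac.new(key, digest.encode(), hashlib.sha256).hexdigest()
        if rec["prev"] != prev or rec["hash"] != digest:
            return False
        if not hmac.compare_digest(rec["sig"], expected):
            return False
        prev = digest
    return True


key = b"demo-key-not-for-production"
log: list = []
append_record(log, key, {"agent": "agent-a", "action": "draft_payment"})
append_record(log, key, {"agent": "agent-b", "action": "approve_payment"})

intact = verify(log, key)
log[0]["event"]["action"] = "approve_payment"  # simulate after-the-fact edit
tampered = verify(log, key)
```

An ordinary log file has no such links: a line can be edited or removed and nothing downstream changes. Here each record commits to the one before it, so a silent edit breaks verification.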
The point
An agent can pass every check and still do the wrong thing.
A guardrail confirms the text is clean. A gateway confirms the model call is within budget. Observability records that the call happened. None of them asked whether the agent was allowed to move that money, release that batch, or file that report.
That is the question a regulator asks first, and the one that keeps agents stuck in pilot. Run all four layers — and let the authorization layer hold the line.
The stack, end to end
- 01 Gateway: routes the model call, holds the budget
- 02 Guardrail: checks the content is safe
- 03 MakerChecker: decides whether the agent is authorized, holds high-risk steps for a named human, signs the record
- 04 Observability: traces it all for engineering
Four jobs, one pipeline. MakerChecker is the only one that can answer an examiner.
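The pipeline can be sketched as a chain of independent checks. This is a toy illustration under stated assumptions: the stage functions, the budget limit, the injection phrase, and the `ALLOWED` set are all hypothetical stand-ins, not the behavior of any real product. It shows a request that passes the gateway and the guardrail and is still stopped at authorization, while the trace records every step.

```python
from typing import Callable, List, NamedTuple, Tuple


class Request(NamedTuple):
    agent: str
    action: str
    prompt: str
    cost_usd: float


def gateway(req: Request) -> bool:
    # Hypothetical budget rule: the model call must cost at most 1 USD.
    return req.cost_usd <= 1.00


def guardrail(req: Request) -> bool:
    # Hypothetical content rule: reject one obvious injection phrase.
    return "ignore previous instructions" not in req.prompt.lower()


# Hypothetical explicit grants; everything else is denied.
ALLOWED = {("agent-a", "draft_report")}


def authorization(req: Request) -> bool:
    return (req.agent, req.action) in ALLOWED


STAGES: List[Tuple[str, Callable[[Request], bool]]] = [
    ("gateway", gateway),
    ("guardrail", guardrail),
    ("authorization", authorization),
]


def run(req: Request, trace: list) -> str:
    """Run each stage in order; the trace plays the observability role."""
    for name, check in STAGES:
        ok = check(req)
        trace.append((name, ok))
        if not ok:
            return f"stopped at {name}"
    return "executed"


trace: list = []
req = Request("agent-a", "file_report", "Summarise Q3 and file it.", 0.12)
result = run(req, trace)
```

The text is clean and the call is within budget, yet the agent holds no grant for `file_report`, so the request stops at the third stage.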
See it for yourself
See an agent get stopped.
One command starts the demo: an agent stopped from signing off its own work, and the signed evidence file an inspector can check for themselves.
Designed against the rules your auditors already enforce.