Commentary

Agentic AI Security: Porting Enterprise Patterns Down to a Solo Harness

Biswas's architecture handles the identity layer well via OAuth and token exchange — but the real danger lives on the intent layer, which the article never reaches.

Aleksandr Khomutov Original author: Debmalya Biswas May 23, 2026 ≈ 5 min

My production-action gate isn't whitepaper theory; it's running code. No kubectl exec, no ssh into production, no terraform apply against live infrastructure goes through without explicit, named approval in the current session, and a third-person "the user authorized this" phrasing is rejected. So when I read Debmalya Biswas's recent piece on the security architecture of agentic systems, with its AI gateway, its IAM provider for human users and service principals for applications, its OAuth flows (OBO, Token Exchange, Client Credentials Grant) and its consolidated catalogue of risks R1–R16 drawn from OWASP and IBM, I recognize in his recommendations exactly the primitives I have already built and shipped. Biswas describes them as enterprise plumbing on an Azure stack; I run them as a one-person harness. This piece is an attempt to port his enterprise patterns down into solo-engineer reality, and to show where the port holds and where the article solves the wrong layer.

What Biswas describes

The reference architecture is honest and detailed. The user authenticates through IAM, the application gets a token, and the AI gateway performs an on-behalf-of exchange to obtain a downstream agent token with aud = agent, validates the JWT against scope and role, and forwards the request. When the agent has to call a tool over MCP, it does not propagate the incoming token but performs a token exchange: it obtains a new, narrowly scoped token with a different audience, while the sub claim still preserves the original user for lineage and auditability. For background M2M processes with no user context, there's the Client Credentials Grant. All of it is crowned by a sixteen-item risk catalogue and one thesis that turned out to be the most valuable thing in the whole article for me.
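The agent-to-tool token exchange described above can be sketched as the form body sent to a token endpoint. This is a minimal illustration that assumes an RFC 8693-style exchange; the article does not give the exact request, and the function name, audience and scope values here are hypothetical.

```python
from urllib.parse import urlencode

# Standard identifiers defined by RFC 8693 (OAuth 2.0 Token Exchange).
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_token_exchange_body(incoming_token: str, tool_audience: str, scope: str) -> str:
    """Build the form-encoded body of a token exchange request.

    The incoming token is presented as the subject token; it is never
    forwarded to the tool itself. The authorization server is expected to
    issue a new token for `tool_audience` that keeps the original `sub`.
    """
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": incoming_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "audience": tool_audience,  # hypothetical tool identifier
        "scope": scope,             # narrowed to what this one call needs
    }
    return urlencode(params)
```

The point of the shape is that the tool only ever sees a token minted for it, with a scope chosen per call, rather than the broader token the agent received.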

Biswas flatly refuses to believe in a magic central protection layer. In his words, guardrails "need to be specific to the underlying use-case, and implemented in their respective platform components / layers", and this "has a direct impact on the overall solution architecture". That is, protection does not live in a single box labelled "governance"; it is spread across the components where the dangerous action actually happens. I'll sign off on every word of that, because I built a system on exactly that principle before I'd ever read the article.

My framing: port it down

The enterprise version assumes a cloud IAM directory, service principals, an APIM gateway and an OpenTelemetry pipeline. I have none of that, and need none of it. But the risks are identical: an autonomous agent with tool access can do something destructive, and the only question is at which layer I intercept it. So I read the R1–R16 catalogue not as a checklist for an enterprise security team, but as a list of what I have to close single-handedly — and most of it is already closed.

What genuinely ports down

Three risk groups map onto my already-running infrastructure with almost no gap.

R14 and R15 — Human Manipulation and Overwhelming Human in the Loop. Biswas names human-in-the-loop as a governance dimension, but in the article it stays a noun: HITL is mentioned, never operationalized — not a single flow step describes how the human steps into the chain and what is presented to them. My production-action gate is that HITL primitive, taken all the way to code. A dangerous action is halted, the concrete command is presented to me, and the only way forward is my explicit, named "yes" in that same session. Rejecting third-person phrasings is a direct defense against R14: an agent cannot "quote" someone's imaginary permission and smuggle the action through.
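The gate logic described above can be sketched in a few lines. This is an illustration under stated assumptions, not the actual harness code: the command patterns, the "I approve: " reply format and all function names are hypothetical, and session tracking is left out.

```python
import re
from typing import Optional

# Hypothetical patterns for commands that touch production.
DANGEROUS_PATTERNS = [
    re.compile(r"\bkubectl\s+exec\b"),
    re.compile(r"\bssh\b.*\bprod"),
    re.compile(r"\bterraform\s+apply\b"),
]

# Third-person claims of authorization are never accepted as approval.
THIRD_PERSON = re.compile(
    r"\b(the\s+user|he|she|they|someone)\b.*\b(authorized|approved|allowed|permitted)\b",
    re.IGNORECASE,
)

APPROVAL_PREFIX = "I approve: "  # assumed reply format


def is_dangerous(command: str) -> bool:
    return any(p.search(command) for p in DANGEROUS_PATTERNS)


def is_valid_approval(reply: str, command: str) -> bool:
    """Accept only a first-person reply that names the exact command."""
    if THIRD_PERSON.search(reply):
        return False  # checked first so quoted permissions are rejected outright
    return reply.strip() == APPROVAL_PREFIX + command


def gate(command: str, reply: Optional[str]) -> bool:
    """Return True if the command may run."""
    if not is_dangerous(command):
        return True
    return reply is not None and is_valid_approval(reply, command)
```

Requiring the reply to repeat the exact command is what makes the approval "named": a bare "yes" or an approval for a different command does not pass.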

R1 — Misaligned & Deceptive Behaviors. My /security-check is built for this risk: a prompt-injection scanner that catches suspicious patterns via regex, hidden Unicode characters and base64 wrappers before untrusted text reaches the agent's context. This is exactly the "component-specific" protection Biswas writes about: not an abstract guardrail, but a filter at the input boundary.
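A scanner of this kind can be sketched as follows. This is not the actual /security-check implementation: the phrase patterns, the base64 length threshold and the finding labels are assumptions chosen for illustration.

```python
import base64
import binascii
import re
import unicodedata
from typing import List

# Hypothetical phrase patterns; a real list would be longer and tuned.
SUSPICIOUS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"you\s+are\s+now\s+", re.IGNORECASE),
]

# Runs of the base64 alphabet long enough to hide a sentence (assumed threshold).
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def has_hidden_unicode(text: str) -> bool:
    """Flag format characters (category Cf): zero-width spaces, bidi overrides."""
    return any(unicodedata.category(ch) == "Cf" for ch in text)


def decoded_base64_chunks(text: str) -> List[str]:
    """Return every base64-looking run that decodes to valid UTF-8 text."""
    chunks = []
    for match in BASE64_RUN.finditer(text):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            chunks.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError, ValueError):
            continue  # not real base64, or not text
    return chunks


def scan(text: str) -> List[str]:
    """Return a list of finding labels; an empty list means nothing was flagged."""
    findings = []
    if has_hidden_unicode(text):
        findings.append("hidden-unicode")
    candidates = [text] + decoded_base64_chunks(text)
    for i, candidate in enumerate(candidates):
        if any(p.search(candidate) for p in SUSPICIOUS):
            findings.append("pattern" if i == 0 else "pattern-in-base64")
    return findings
```

Running the phrase patterns again over the decoded chunks is what catches a payload that looks harmless as raw text. The Cf check is deliberately broad and will also flag benign characters such as soft hyphens.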

R3 — Tool Misuse. This risk is closed by the same gate: a tool with destructive potential is not invoked autonomously; a human stands between the agent's intent and the real-world effect. So three of the sixteen risks aren't on paper for me, they're in production — and that is the answer to the question "what does Biswas's recommendation look like when you actually build it at solo scale".

Where the article solves the wrong layer

Now the sharp part. The whole token machinery in Biswas (OBO, Token Exchange, Client Credentials Grant, careful scope-narrowing, preservation of the sub claim, the thesis that "token propagation must not cross application boundaries") is flawless work on identity. It answers the question "who, with what permissions, is invoking this action". But it does not answer the question "should this action be performed at all".

Token-flow correctness is not the same as intent-level safety. A semantically dangerous action that happens to be perfectly authorized sails through flawless OAuth without a single objection. Picture an agent with an honestly-issued, narrowly-scoped token to delete resources in its own domain — the token exchange runs perfectly, aud and sub are correct, the audit log records clean lineage. And none of that stops the deletion itself, if the deletion is destructive. Identity says "this subject is allowed", but stays silent on whether this particular action should be taken.
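The difference between the two questions can be made concrete with a toy model. All types, field names and values below are hypothetical and exist only to show that an identity check and an intent check are separate predicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    sub: str            # original user, preserved for lineage
    aud: str            # intended recipient of the token
    scopes: frozenset   # permissions granted to this token


@dataclass(frozen=True)
class Action:
    tool: str
    required_scope: str
    destructive: bool


def identity_allows(token: Token, action: Action) -> bool:
    """Who is calling, and with what permissions? (the OAuth question)"""
    return token.aud == action.tool and action.required_scope in token.scopes


def intent_allows(action: Action, human_approved: bool) -> bool:
    """Should this action happen at all? (the question tokens cannot answer)"""
    return not action.destructive or human_approved


def may_execute(token: Token, action: Action, human_approved: bool = False) -> bool:
    return identity_allows(token, action) and intent_allows(action, human_approved)


# A correctly issued, narrowly scoped token for a destructive action.
token = Token(sub="alice", aud="storage-tool", scopes=frozenset({"resources:delete"}))
delete_all = Action(tool="storage-tool", required_scope="resources:delete", destructive=True)
```

Here identity_allows(token, delete_all) is true, yet may_execute(token, delete_all) stays false until a human approves: the token flow is clean and the action is still held.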

That layer, the intent layer, is exactly the one Biswas hands off to the same central guardrails component whose existence he himself refuses to believe in a couple of paragraphs earlier. His R1–R16 catalogue honestly names the intent risks (R2, Intent Breaking & Goal Manipulation, sits second in the list), but the architectural part of the article closes them with identity and token scoping, while the actual halting of a dangerous intent is left unwritten. The result is a gap: the risks are named, but the layer they live on is left unbuilt in the reference flows.

Verdict

This is a useful enterprise inventory. The R1–R16 catalogue, consolidated from two separate sources, is valuable in itself as an attack-surface map. The identity layer is well worked out — the token flows are correct, scope-narrowing is sensible, the demand not to propagate tokens across domain boundaries is right. And the thesis that guardrails must live in concrete components rather than in a magic box is precise third-party confirmation of how I build protection myself.

But the intent layer, the very one where the real danger lives, is exactly what the article doesn't reach. Perfect authentication doesn't save you from a semantically destructive but correctly authorized action. In Biswas, halting such an action is delegated to an abstract central component; in my setup it's nailed down in code as a named gate. Porting the enterprise patterns down to solo scale ends up showing not only what ports, but also what was missing in the original from the start: operationalized protection at the intent layer, not just at the identity layer.
