When an AI agent pushes code, modifies infrastructure, or updates records on your behalf, the audit log shows your name. What it doesn't show is whether you were watching — or even aware it was happening.

AI agents are rapidly moving beyond code suggestions. Tools like Claude Code, Claude Desktop, and upcoming domain-specific agents (think security, operations, sales automation) now act on behalf of users across enterprise systems — through MCP servers connected to AWS, Atlassian, Salesforce, GitHub, and more.

In this model, a user launches an agent session and grants it permission to act. The agent then executes autonomously: spinning up infrastructure, updating tickets, pushing commits, modifying CRM records. The scope is bounded only by what the user is authorized to do — and increasingly, that scope is very wide.

For the systems on the receiving end of these actions, this is largely invisible. An API call from a human engineer and an API call from an AI agent look identical. The audit log says the same name either way.

Agentic AI is outpacing the identity standards built to govern it.

This invisibility creates a compounding set of problems.

Regulatory exposure is real and growing. The EU AI Act — particularly Articles 13 and 14 — requires transparency about AI involvement in decisions and verifiable evidence of human oversight. "We have a process" is not sufficient. Auditors will ask for proof that a human exercised meaningful review at defined checkpoints. Today, most teams cannot produce that proof.

Existing identity standards fall short. OAuth 2.0 and OIDC handle authenticated principals well — humans, service accounts, M2M flows — but they assume the identity of the actor is known and static. They have no concept of delegated agency: a human authorizing an AI to act within their own permission scope, with supervision levels that shift dynamically depending on the task. RFC 8693 (Token Exchange) is the closest existing mechanism — it can represent delegation cleanly via act and sub claims — but it describes the session credential, not the action chain. Once a delegated token is issued, the protocol has nothing further to say about what the agent did with it, or whether anyone was watching.

Audit logs are records, not proofs. A log entry saying "human approved this" is only as trustworthy as the system that wrote it. If the agent holds the credentials to that system, the log is not an independent attestation of human involvement — it's just another thing the agent wrote.

The Net Result

As agentic AI expands in scope and autonomy, the industry is accumulating a growing accountability gap between what agents are doing and what can actually be demonstrated to have happened with human awareness and consent.

One way to close this gap could be a multi-layer approach that separates session authorization, action provenance, and human attestation into independently verifiable components.

Layer 1 — Session Credential (RFC 8693 Token Exchange)

When a user launches an agent session, the auth server issues a delegated token via RFC 8693. This token carries the user's identity in sub, the agent's identity in act, and a scope capped at the user's own permissions. The agent can never act beyond what the user could do directly.
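To make Layer 1 concrete, here is roughly what the exchange and the resulting claim set could look like. The request parameters and the act / sub claim shapes follow RFC 8693; the user, agent, scope names, and the session_id claim are illustrative, not defined by the RFC:

```python
# Token-exchange request parameters (RFC 8693 §2.1). Tokens are placeholders.
exchange_request = {
    "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
    "subject_token": "<user-access-token>",       # the human's credential
    "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
    "actor_token": "<agent-identity-token>",      # the agent's credential
    "actor_token_type": "urn:ietf:params:oauth:token-type:access_token",
    "scope": "repo:write tickets:write",          # requested; may be narrowed
}

# Claim set of the delegated token the authorization server issues:
# "sub" stays the human, "act" names the agent (RFC 8693 §4.1);
# "session_id" is the custom claim proposed here, not part of the RFC.
delegated_claims = {
    "sub": "alice@example.com",
    "act": {"sub": "agent:claude-code"},
    "scope": "repo:write tickets:write",
    "session_id": "sess-91f3",
    "exp": 1767225600,
}

def scope_is_capped(user_scopes: set, delegated_scope: str) -> bool:
    """The issued scope must never exceed what the user could do directly."""
    return set(delegated_scope.split()) <= user_scopes

assert scope_is_capped({"repo:write", "tickets:write", "crm:write"},
                       delegated_claims["scope"])
```

The cap check is the invariant the auth server must enforce at issuance time: the delegated scope is a subset of the user's own.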

This layer answers: who authorized this session, and under what scope.

Layer 2 — Action Provenance Envelope (per tool call)

Every tool invocation the agent makes is wrapped in a signed envelope — signed by an ephemeral keypair generated at session start and held only by the agent process. The envelope carries the action type, parameters, a hash of any proposed changes, a timestamp, and the session ID. These envelopes are "proposed" state: logged, auditable, and reversible.
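A minimal sketch of envelope construction and signing. To keep it dependency-free, an HMAC over a session secret stands in for the ephemeral asymmetric keypair described above; a real agent would sign with e.g. Ed25519 so verifiers need only the session public key, never the signing key. Field names are illustrative:

```python
import hashlib
import hmac
import json
import os
import time

# Session secret generated at session start, held only by the agent process.
session_key = os.urandom(32)

def make_envelope(session_id: str, action_type: str, params: dict,
                  proposed_change: bytes) -> dict:
    envelope = {
        "session_id": session_id,
        "action_type": action_type,
        "params": params,
        "change_hash": hashlib.sha256(proposed_change).hexdigest(),
        "timestamp": int(time.time()),
    }
    # Canonical serialization so signer and verifier hash identical bytes.
    payload = json.dumps(envelope, sort_keys=True).encode()
    envelope["signature"] = hmac.new(session_key, payload,
                                     hashlib.sha256).hexdigest()
    return envelope

env = make_envelope("sess-91f3", "git.push",
                    {"repo": "acme/api", "branch": "main"},
                    b"diff --git a/app.py b/app.py")
```

Note the change_hash: the envelope commits to the proposed change without embedding it, which keeps envelopes small while still making tampering detectable.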

This layer answers: exactly what did the agent do, and when.

Layer 3 — Human Attestation Signature (at approval checkpoints)

This is the critical addition. At meaningful checkpoints — before code is committed and pushed, before infrastructure changes are applied, before records are updated — the agent surfaces a human-readable summary and waits. The user reviews the proposed action chain and, if satisfied, signs a cryptographic attestation.

This signature is produced using a key the agent never has access to — held in the user's device or secure enclave (WebAuthn / Passkey) — and attached permanently to the action chain.

This layer answers: who reviewed and endorsed the change.
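A sketch of the attestation object itself. In a real deployment the signature would be a WebAuthn/passkey assertion produced on the user's device; an HMAC over a user-held secret stands in here so the example stays self-contained, and all field names are illustrative:

```python
import hashlib
import hmac
import json

# The approval secret lives with the user (device / secure enclave) and is
# never readable by the agent process.
user_key = b"user-secret-in-secure-enclave"

def attest(session_id: str, envelope_hashes: list,
           reviewer: str, key: bytes) -> dict:
    attestation = {
        "session_id": session_id,
        "approved_envelopes": envelope_hashes,  # hashes of the reviewed chain
        "reviewer": reviewer,
    }
    payload = json.dumps(attestation, sort_keys=True).encode()
    attestation["signature"] = hmac.new(key, payload,
                                        hashlib.sha256).hexdigest()
    return attestation

chain = [hashlib.sha256(b"envelope-1").hexdigest()]
att = attest("sess-91f3", chain, "alice@example.com", user_key)
```

Because the attestation signs the envelope hashes, it binds the human's approval to a specific action chain, not just to a point in time.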

Downstream systems — such as GitHub, AWS CloudTrail, Salesforce Shield — receive the RFC 8693 session token, the agent's action envelopes, and the human attestation signature as a bundle. They can now distinguish three tiers of action provenance:

  • Session-only — a valid delegated token exists, but there is no signed record of the individual action
  • Agent-attributed — the action is covered by a signed envelope, so exactly what happened, and when, is verifiable
  • Human-endorsed — the envelope is also covered by a human attestation signature, proving review and consent

Why Key Separation Is the Key Insight

The design's strength comes from what the agent cannot do. Because the user's approval key must be kept strictly out of the agent's reach — whether through hardware isolation, access control policy, or credential management discipline — a correctly deployed system prevents the agent from forging past approvals.

This is partly a technical guarantee and partly an operational one: hardware isolation and device-bound credentials raise the bar significantly, but the separation ultimately holds only as well as the operator enforces it.
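From a downstream system's perspective, the consequence of key separation can be checked mechanically. A minimal sketch, with the signature-verification primitives abstracted into booleans and the tier names illustrative:

```python
def provenance_tier(envelope_valid: bool,
                    attestation_valid: bool,
                    envelope_in_attested_chain: bool) -> str:
    """Classify an action by which independently verifiable proofs cover it."""
    if envelope_valid and attestation_valid and envelope_in_attested_chain:
        return "human-endorsed"    # signed envelope + human attestation
    if envelope_valid:
        return "agent-attributed"  # signed envelope, no proof of human review
    return "session-only"          # only the delegated token vouches for it

# Key separation in one line: without a valid attestation signature — which
# the agent cannot produce — the best tier it can ever reach on its own is
# "agent-attributed".
assert provenance_tier(True, False, False) == "agent-attributed"
```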

This maps directly to what Article 14 of the EU AI Act actually demands: not a policy asserting that humans could have reviewed something, but verifiable evidence that a human did exercise oversight at defined points in the process.

A Practical Path Forward

For those building agentic platforms and MCP integrations, the primitives exist today even if the standards don't:

  • RFC 8693 for session delegation, with act / sub / session_id claims clearly scoped
  • JWS (RFC 7515) for signing both agent action envelopes and human attestation objects
  • WebAuthn / Passkeys as the user's approval key mechanism — hardware-bound, non-exportable, already deployed at scale

The missing piece is agreement on a common schema for action envelopes and attestation objects — git commit signing is the closest mental model, and a familiar one for engineering teams.
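As a strawman for that schema conversation, the minimal field sets implied by the layers above might be pinned down like this — every name here is hypothetical, not an existing standard:

```python
# Hypothetical minimal field sets for the two shared object types.
ACTION_ENVELOPE_FIELDS = {
    "session_id", "action_type", "params",
    "change_hash", "timestamp", "signature",
}
ATTESTATION_FIELDS = {
    "session_id", "approved_envelopes", "reviewer", "signature",
}

def conforms(obj: dict, required: set) -> bool:
    """Schema check: every required field present (extra fields allowed)."""
    return required <= obj.keys()

# A bare session_id is not a valid envelope under this strawman.
assert not conforms({"session_id": "sess-91f3"}, ACTION_ENVELOPE_FIELDS)
```

Even a check this shallow would let MCP servers and downstream systems reject malformed bundles consistently, which is most of what early interoperability needs.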

Feedback welcome. Particularly interested in whether identity vendors (Okta, Azure AD, etc.) or MCP server implementors are converging on anything resembling a common schema for the action envelope layer.