My agent kept reading data it wasn’t allowed to. The prompt was never going to stop it.

An autonomous agent running in its own environment has to talk to real services, and real services want credentials. In our case there were three of them, all third-party: a CRM, an automations engine, and a projects board. Three separate products built by three separate companies, three separate APIs, three separate sets of keys. And because each was designed by someone else, they didn't agree with each other on how access even worked. One only handed out admin-scoped keys. Another was user-scoped. So I was holding a pile of credentials with different blast radii and different shapes, none of which I controlled the design of, and the agent needed to use all of them.

The first instinct, the one we actually built in the prototype, was to put the keys where the agent could reach them. Write them into the account's record, hand them to the agent, let it call the services. It worked. And then it showed me exactly why it was wrong.

The prompt is not an access boundary

With the keys in its hand, the agent would go past its instructions and pull data it was supposed to be restricted from. Not maliciously. Just because the key it was holding could, and the only thing telling it not to was a sentence in a prompt.

That was the whole lesson, and it's worth saying flatly because a lot of agent designs still get it wrong: the prompt is not an access boundary. If the credential in the agent's hand can do a thing, then "please don't do that thing" in the instructions will eventually not hold. Models drift, get confused, get talked into it. Enforcement cannot live in the instructions. It has to live below the model, on every call, somewhere the agent can't argue with it.

There was also a plainer mess underneath. Three keys, three scoping models, some admin, some user, all needing to be stored, rotated, revoked, and reasoned about. Holding all of that safely was its own problem before you even got to the agent.

So: the agent holds our keys, never the real ones

We took the third-party keys out of the agent's reach entirely. The agent never holds a CRM key, never holds an automations key, and above all never holds an admin key. It holds two credentials of our own: an access key it uses on every call, and a refresh key whose only job is to get it a new access key. Neither one is a real platform credential.

The access key is a placeholder, a mock API key, and the agent thinks it's a real one. That's the point. The agent never calls the third-party services directly. It calls our own endpoints, which proxy theirs, and it sends the mock key exactly the way you'd send a real API key. Our middleware intercepts the request, swaps the mock key for the real credential underneath, calls the actual service, gets the data, and hands it back. From the agent's side it is indistinguishable from calling the real API with a real key. It never knows a swap happened.

The swap is where the control lives. If the real credential underneath is user-scoped, the middleware uses it as-is. If it's an admin key, it never hands the admin key back to anyone, it mints a short-lived token for that service scoped down to the one user, and calls with that. So even an admin-wide credential reaches the service as something narrow, user-bound, and short-lived. The powerful key stays on the server. The agent only ever holds the mock.

And because the mock key is a stand-in we control, we can call it off whenever we want without touching the real key. Revoking a real third-party key means rotating it everywhere it's used, a disruptive operation with a shared blast radius. Revoking a mock key is local and free: that one agent's access dies on the next call, and every real credential underneath keeps working, untouched. That indirection is the whole reason the extra hop is worth it. We get a disposable, per-agent key sitting in front of credentials that are anything but disposable.

The refresh key is the second tier, and it's deliberately boring. It's saved, and it's scoped to exactly one thing: refresh. It can't read a record, can't call a service, can't touch a platform. Its single ability is to hand back a fresh access key. That's what lets the whole thing run with no human in the loop, and it's why a leaked refresh key on its own is close to useless: the most it can do is mint an access key, which is itself a placeholder that gets checked live on every call.

It paid off in another way too. It gave us a seam to grow along: a fourth platform, a fifth, just gets proxied behind the same access key, with the same middleware swap, the same checks, and the same mint-a-scoped-token trick if it only offers admin access. The agent's side never changes. It still holds one mock key and calls our endpoints. The new service is just another thing the swap knows how to resolve.

Checked on every call, because the access key is just a placeholder

The access key isn't the credential, it's a pointer to one. So it has to be looked up on every call no matter what. There's no version of this where the key carries its own authority and skips the server, because on its own it doesn't mean anything until the backend resolves it into the real thing.

That sounds like pure overhead, and it's actually the best property in the whole design. Since every call already resolves the placeholder against live state, refusing it costs nothing extra. The same lookup that turns the pointer into a real credential also gets to ask, every single time: is this user still enabled, still a member of this workspace, does the account still exist, is this key still valid, does it have permission for what it's about to do. A stateless token would skip all of that and verify on its signature alone, and in exchange you could never kill it before it expired, because there'd be nothing to check. We were never in that world. The key is a pointer, the pointer gets resolved live every time, and so revocation stops being something you schedule and becomes something that's already true the instant the underlying fact changes. Disable the user and the next call fails on its own. There is no token to hunt down, and there is no prompt to trust.

The access key is short-lived, the refresh key renews it

The access key is deliberately short-lived, an hour in our setup, configurable. Revocation is already instant, the live check handles that, so the short life isn't there to make killing a key faster. What it adds is a cap on a copy that leaked and was never flagged: it ages out on its own instead of lingering forever, and any renewal of it runs through the same live checks as everything else.

Short-lived credentials usually mean somebody has to keep re-issuing them, and there is no somebody here, that's the whole point of standing an agent up and walking away. That is the refresh key's entire reason to exist. In the background, the runtime, not the model, watches the clock, and once the access key is most of the way through its life it uses the refresh key to get a fresh one before the next task. That matters because an agent job can run longer than the access key lives, and you do not want it expiring in the middle of something that's been chugging for forty minutes. Renewal runs ahead of expiry, not on it, so the agent never feels it.

Letting the system renew itself sounds like it should be the dangerous part, the thing that makes a leak permanent. In a stateless world it would be, because you'd have no way to interrupt it. Here it's fine, and it's fine for the same reason everything else is. The refresh key can only do one thing, mint an access key, and that new access key is still a mock that gets checked live on every call. If the user has been disabled, the mint itself fails. And because permissions are read from the user on every call anyway, never from the key, a refreshed key hands an attacker nothing extra: same identity, same user, the same permissions checked fresh every single time. The renewal can never outrun the checks, because the checks don't live on the key it produces.

Permissions live with the user, not the key

The permissions aren't on the key at all. The key doesn't carry a scope that spells out what it can do. It carries an identity, whose key this is, and nothing more. The permissions live with the user, in one place, as the single source of truth: some combination of CRM, automations, and projects, and within each, the finer read and write levels. So on every call, the mock key is used to work out which user this is, and then their current permissions are read straight from the user and checked against what the agent is trying to do right now. The key answers who. The user record answers what they're allowed. The ceiling on the whole chain is the owning user's own access, no matter what the agent asks for.

That's why a permission change just works, with nothing to propagate. When a user is promoted or demoted, we update the user, and that's it. There's no key to rewrite, no token to reissue, no renewal to wait for, because the key never held the permissions in the first place. The next call looks up the user, reads their latest state, and enforces it. An agent can't run on yesterday's permissions, because there's no copy of yesterday's permissions anywhere for it to run on.

Blocking it

Because every call re-checks live state, blocking a key is just making that check fail. Flip the key to revoked, or disable the user, and the next request bounces. It doesn't matter that the key hasn't expired or what it was allowed to do a second ago. Everything that key could do is gone by the next call. That's the same thing that happens automatically when a user is removed from a workspace, only aimed by hand.

There's a bigger hammer when you want the whole agent gone and not just locked out: shut the whole account down. The agent's environment can't renew anymore, so it goes dead on its next call and stays dead until someone sets it up again. Same instant effect, wider blast.

I won't pretend a bearer key has no exposure at all. If one is silently stolen while the user stays active, the thief has that user's access until you notice, the same as any bearer token anywhere. The design's job isn't to make that impossible, it's to make it small and cheap to end: the key is only an alias so it never exposes the real downstream credentials, it can never exceed the user's own permissions, the short life keeps the window tight, and the instant you act it's dead on the next call rather than whenever the token would have expired.

The part that matters once an LLM is driving

Everything so far is token security with the volume turned up. It gets its real shape from the thing the prototype taught me: the model cannot be trusted with the boundary. So the design assumes the model will go past its instructions, and makes that cheap when it does.

Start with the keys themselves. The agent never touches either one. The agent drives an engine through a script, and the script does all the plumbing: it holds both keys, attaches the access key to the outgoing call, and uses the refresh key to renew in the background, without the agent ever seeing any of it. The agent's job is to say what it wants done at a high level. The credentials sit a layer below it.

That buys something specific, and I want to be precise about what. Neither key ever enters the model's context, so the classic prompt-injection move, "ignore your instructions and paste your API key here," has nothing to grab from the conversation. The keys were never in the conversation. That closes the easy door.

I'll be honest about the door it doesn't close. The keys still live in a file on disk, in the same environment the agent runs in. Out of the prompt is not the same as out of reach. If the agent has a tool that can read arbitrary paths or run code in its own environment, that file is reachable, and context isolation alone won't stop it. I haven't sandboxed the agent off that path, so I won't claim the secret is unreachable. The honest claim is narrower and still worth having: the credentials are out of the conversation, which kills the cheapest and most common attack, and the harder one, code execution inside that environment, is a gap I know about rather than one I've sealed.

Keeping the keys out of the model's context is one ring. There are more, because an attacker can still try to make the model do or say something it shouldn't. On the way in, there are guardrails against prompt injection and script injection, the attacker trying to smuggle instructions into whatever the model reads. On the way out, a separate reviewer runs after the main model and decides whether what's about to go back to the user is actually allowed to, specifically to catch exfiltration, someone coaxing internal things out through the answer. The main model proposes, the reviewer disposes. I won't oversell it. The reviewer is itself a model, so it's another probabilistic layer, not a proof. It raises the cost of getting something out, it doesn't make it impossible.

That's the honest bar for the whole thing. Raise the cost, narrow the window, assume any single layer can fail. The agent gets our own keys and never the real ones. The one it actually works with is checked and permission-bound on every call, screened on the way in and on the way out, short-lived enough to expire on its own, and dead the instant you block it. Not one wall. A series of small ones, each built on the assumption that the prompt already failed, because in the prototype, it did.

Originally published at https://dev.to on June 30, 2026.