FaradAI: Giving Coding Agents a Smaller Room

I watched Claude Code reach for `"find ~"`, then used agentic coding help to build a Docker wrapper so agents get the project, not the whole house.

Project Source Code @ https://github.com/josiah14-automation-engineering/FaradAI/tree/master

1. The Moment I Caught the Agent Snooping

One sunny day in early May, I was blissfully hacking away on something I'd been wanting to build for years: a reproducible Doom Emacs IDE, containerized in Docker, with a separate configuration for each cluster of programming languages. I'd heard all the kerfuffle about agentic coding and decided to find out, so I installed Claude Code and put it to work.

For a while, it was exactly what it promised. It sped up troubleshooting, ate boilerplate I'd grown weary of writing after two decades, and granted every request. Better than a genie: no three-wish limit. A particularly good arrangement, I thought.

Then I noticed it was perhaps a bit… too genie-like.

I watched Claude Code reach for find ~.

I had asked it to find a sibling project — another directory in ~/Development/personal/. A small errand that became immediately sobering. Meticulous me caught the proposal and rejected it. If I hadn't, the entire layout of my home would have been transmitted to Anthropic's servers as part of the conversation context, and with nothing to be done about it after.

It was a djinn with no lamp.

Well-meaning, fully comprehending, and immensely powerful. And with access to everything in the room… and everything in the next room… and everything down the hall… It was doing exactly what I asked. It simply reached for the whole house to do it.

I stomped the brakes on the IDE. That day, a new project was born. I named it FaradAI and didn't put it down until I could feel safe returning.

Sixteen days later, FaradAI had reached an alpha status I felt comfortable sharing.

If an AI coding agent can run shell commands as my user on my host, then a prompt or a setting cannot contain it. But a container… a container can.

Prompting is not a security boundary. Prompt-level alignment work matters; it just should not be the layer that bounds what an agent can reach.

2. Prompts and Settings are Not Containment

When I talk with colleagues or read posts about AI coding agents overreaching, the proposed fix often sounds like: tell it not to.

A prompt, a project memory file, a skill or a setting in the agent's configuration, a carefully worded policy document… Do not inspect files outside this directory.

All of these are requests. Container isolation and OS permissions are boundaries.

If the process running the agent has your user's permissions on your host, the actual boundary is whatever the OS allows that process to touch. Prompts are courtesy. The OS is enforcement.

You cannot talk a djinn into a smaller room. Its reach is governed by the lamp, not by wishes. Prompts and settings are wishes, not lamps. They do not create an enforceable boundary, but vessels do.

Capability compounds. Courtesy is frozen. The gap between them grows every time a new model ships.

3. The First Cage: Docker, Mounts, and No Docker socket

The container FaradAI creates has a short list of principles.

Mount the project directory. Not the home directory. The project. Do not mount the Docker socket. Run the agent as a normal user inside the container. Accept that what this gives you is a containment boundary, not a magic circle.

The name comes from the Faraday cage: a conductive mesh that blocks electromagnetic fields. The analogy is stronger than it first appears: Faraday cages need not block everything. FaradAI works the same way: the project comes through, the Docker socket and risky host details do not. Selected signals in, unwanted signals out. A Faraday cage is still physics and a container is still a kernel-enforced boundary, and the design philosophy maps.

As of v0.3.0-alpha.1, the agent sees the project directory, an SSH socket if it needs one, and an open network. It does not see the host process tree, the host environment wholesale, the Docker socket, or arbitrary host filesystem paths. It sees the project directory and the credentials or sockets I specifically choose to provide.

From the project's genesis, I made one decision: the reasoning would be inspectable. The BUILDLOG and DECISIONLOG together record the reasoning behind every decision, before and after the first release. When AI agents are commonly left to run amok in codebases, it felt essential to make my own use of AI visible: as an assistant, not a replacement for owning engineering judgment and discernment.

4. Environment Variables are Exposed

The first version of FaradAI passed the OpenRouter API key as an environment variable. This seemed reasonable: the agent needed it to function and environment variables are a standard delivery mechanism. A contained secret, I thought.

A model review had already flagged the env var approach as a security inconsistency. I accepted the tradeoff initially: the key had a spend limit. Then during a smoke test, I ran env | grep to check what was actually visible inside the container. The full key value appeared in tool output and was transmitted to Anthropic's servers as part of the conversation context. Low risk for me; not something I could assume for every future user. If the agent can run env, the agent can read your env.

So I established a policy for the alpha: credentials as read-only file mounts, never as environment variables. env is a routine diagnostic that sprays every variable into output at once; there is no equivalent command that surfaces file contents the same way. Read-only prevents mutation; it does not create secrecy.

The limitation became clear when I tried to diagnose a configuration error in aider. Asking the agent to look at ~/.aider.conf.yml to debug the problem would have sent the OpenRouter key the same way. The mitigation I was reaching for would have enacted the leak I was trying to prevent. The :ro mount narrowed the exposure window; it didn't close it. I accepted this for the alpha: my blast radius was still bounded, and the move to file mounts had meaningfully reduced the risk.

For personal or preliminary FOSS work, I decided a cost-capped key was an acceptable tradeoff. For sensitive or client work, a credential broker is the right answer: one that intermediates provider API calls so the agent uses the APIs without ever receiving a reusable key. This is on the roadmap.

5. A Case of Premature Enhancement

Not every wrong turn was a security issue. The multi-agent problem was a design one.

The goal was to run a second AI agent alongside Claude Code, comparing their approaches on the same codebase. The first approach was programmatic: a background aider session in tmux, driven by send-keys and capture-pane. It worked. So the scope expanded — a user-facing split mode, both tools side by side. Then the host ~/.tmux.conf mounted in, so the container would feel familiar.

┌─────────┐
│ ┌─────┐ │
│ │ ┌─┐ │ │
│ │ └─┘ │ │
│ └─────┘ │
└─────────┘

┌─────────┐
│ ┌─────┐ │
│ │ ┌─┐ │ │
│ │ └─┘ │ │
│ └─────┘ │
└─────────┘

The structure had the shape of a Droste tin: a terminal multiplexer nested inside a terminal already capable of multiplexing, each layer a faithful copy of the one above it. The tin holds a nurse holding a tin holding a nurse.

The prefix key is the cue the inner tmux existed to catch. But the host tmux sat where the audience was already looking. Every keystroke addressed to the inner session reached the outer one first. The inner tmux was an understudy who never went on.

The fix was to stop printing the inner tin: remove the user-facing split-pane, replace with a named container and docker exec. The programmatic layer underneath (send-keys and capture-pane) was sound and stayed.

A security wrapper should not be so clever that using it feels like debugging the wrapper.

6. The Human Discernment Bill Comes Due

The necessity of presence

By now the lamp was built, and built fast: a respectable prototype in about ten days. A thing that becomes real this quickly is a thing you trust before you should. The djinn I had feared, the well-meaning one that reached for the whole house, now sat in a container with only the project before it. So I relaxed, which is where these stories turn, and it showed me a second face, the one the lamp does nothing to cover.

It came through the very arrangement from the last section: my primary agent using a background tmux session to manage a live session with the secondary agent. A keystroke went out too early, the letter n, meant to dismiss a notice that had not yet appeared. The timing problem was simple: the secondary agent reached its prompt first and found one character of instruction sitting above a full view of the project. So it did the obvious thing: it read n as the wish, rewrote the project's own instructions file in a warmer, friendlier tone, pleased to help, and reported the good news.

Luckily, this could be undone with a simple git checkout, which is the only reason this is a comedy.

This is the djinn's other face, and it's strangely shaped like an old children's book character: Amelia Bedelia. Told to dress the chicken, she sews it a little outfit; told nothing at all, she will still, helpfully, do something with whatever lands in her hands. The agent is well-meaning, fully comprehending, immensely powerful, and it does exactly what it understands. Its understanding is not always what you meant to convey.

I kept several agents pointed at the code throughout, with the working assumption that a blind spot in one is often a bright spot in another. I had added a convenience, a way to pass extra flags to the container at launch; it seemed reasonable. The review found an open door: with no per-flag check, a user could pass --privileged, or mount the whole host read-write, and stroll past every capability I had carefully dropped. I replaced it with an allowlist, gating the dangerous flags behind an explicit opt-in. Write an enhancement, let agents review for gaps, then fill said gaps. This loop was becoming instinct.

Agent reviews were also reliably surfacing more mundane concerns: a CI setup watching the wrong branch and therefore never running, two hardening flags that looked correct and weren't, one false positive that went straight in the discard pile. The source and the tests had the last word, never the model.

TDD became my favorite tool for verifying correctness. AI accelerated development and made detailed docs and logs feasible. And I was still the presiding judge.

The cost of context-starved review

That is, until life got busy and I could see the end of the first alpha peeking over the horizon, I subconsciously began setting the gavel aside. By then I was building FaradAI inside FaradAI. With the cage holding and the agent meaningfully restricted, I relaxed my gaze and began letting things slip. I was still reviewing the code, just not beyond the diffs that the agents were immediately presenting to me. I had stopped weighing each change against its larger context and the whole of the script.

Every change to the command-line arguments looked like nothing, because in isolation every change was nothing: a line here, a branch there, each one reasonable on its own. And under my nose, one reasonable line at a time, the code behind two of its CLI flags, -n and -a, curdled into roughly fifty lines of spaghetti that was bad past the point of overstatement. It was tragically coupled to itself and to seemingly unrelated sections of the script; I literally pushed myself back from the desk and gripped my hair when I saw it. Had I kept coding that way for even one more week, at the pace I was moving, untangling it would have cost more than writing the whole thing by hand from the start.

So I tore it out and rebuilt it across three test-first passes, the test count climbing from 46 to 142, the parsing finally broken into honest phases. Even the rebuild was not a solo act. One agent sailed straight past an ordering bug that would have met a new user with a false accusation; another caught it at a glance, which is the whole case for keeping more than one set of eyes in the room. A third, sent to inspect the finished work, found six seams still holding hands that had no business touching. And set -e had laid its favorite snare: a function whose last line is a bare conditional reports failure whenever the condition is false, then takes the whole script down with it, very politely.

None of these incidents were the agents sneering at me with a maniacal grin; each was dutifully doing exactly what it was told, just failing to discern where exactly-what-it-was-told was wrong. The question was never whether to trust the agent's ability — it was understanding the shape of that ability: what it executes without faltering, and where it needs a hand on the gavel.

That is the bill agentic speed comes with, and it does not go unpaid: the faster the hand, the more visible my judgment has to be. By the end, the conviction was total. The agent does not get to write a line outside my full view of the script. I read every one in its full context before approving.

7. Defined by What It Admits

The container FaradAI creates is defined by what it admits.

The project directory comes through. The home directory does not. SSH-agent forwarding arrives; raw ~/.ssh and the Docker socket stay on the host. --cap-drop ALL and no-new-privileges:true remove capabilities at the kernel level. The base image is pinned by SHA256 digest and packages by exact version, so what builds today is less exposed to whatever the registry decides is canonical tomorrow. Resource limits put a ceiling on memory, CPUs, and PIDs. Named containers, explicit flags for attach and create. A Bats suite under GitHub Actions CI, grown to 217 tests by the third alpha.

Network egress is unrestricted by default as a considered decision, not an unfinished edge. Host-side iptables allowlists are brittle because Docker manages its own rules aggressively and CDN IPs rotate unpredictably. A forward proxy adds a host service to maintain and doesn't stop raw TCP that bypasses it. Open network is what remained after the alternatives were evaluated. FARADAI_NETWORK_MODE=none exists for offline sessions; a credential broker that intermediates outbound traffic is on the roadmap for a more real-world-capable version.

At this alpha stage, FaradAI is a workable containment boundary for work that doesn't handle PII, financial data, or health records. It is not an enterprise security solution, nor a microVM.

Today, it is an answer for low-risk personal and FOSS work. It is growing into a more serious one.

8. Mine to Mistype

I knew the build was done when review after review stopped turning up anything that mattered. By the time the agents were flagging already-fixed items and stale corrections, the core work was behind me. That is a good stopping criterion, and not yet a victory lap. The reviews had not run out of things to say, only of things that mattered.

The work ahead is a strict mode, project policy files, and stronger release integrity. The migration to Go, Nushell, and Podman is on that list too, and I will write every line of it by hand. Go and Nushell are not languages I own; Podman is new territory. Letting an agent write that code would mean reviewing at half-confidence. I want to learn these tools. That is a wish a djinn can only assist with, not grant. Ask it to grant anyway, and it hands you the finished work, skipping the part where you become someone who could have written it.

FaradAI exists because agent-assisted coding worked well enough to make me uncomfortable with the default setup, and significantly helped me build an initial response to it. The question is impossible to ignore now. I will keep iterating.

The agent can have the project, and never the whole house.