This is why we only allow our agent VMs to talk to pip, npm, and apt. Even then,...

ramoz · 2026-01-14T22:48:49 1768430929

This doesn’t solve the problem. The lethal trifecta as defined is not solvable, and framing it as “just cut off a leg” is misleading. (Though in practice, firewalling is a decent bubble-wrap solution.)

But for truly sensitive work, you still have many non-obvious leaks.

Even in small requests the agent can encode secrets.

An AI agent that is misaligned will find leaks like this and many more.
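
To make the "small requests" point concrete, here is a rough sketch of how a secret could ride along in URLs that look like ordinary package lookups against an allowlisted host. Nothing is sent over the network. The registry URL, the `pkg-` naming scheme, and both function names are made up for illustration.

```python
import base64

# Hypothetical allowlisted host (assumption, not a real registry).
REGISTRY = "https://registry.example.invalid"

def encode_as_lookups(secret: bytes, chunk: int = 20) -> list[str]:
    """Split a secret into chunks that pass for package names in lookup URLs."""
    # base32 uses only A-Z and 2-7, so the lowercased text is URL-safe
    # and contains no hyphens that would clash with the separators below.
    text = base64.b32encode(secret).decode("ascii").rstrip("=").lower()
    parts = [text[i:i + chunk] for i in range(0, len(text), chunk)]
    # The sequence number lets a receiver reorder the requests.
    return [f"{REGISTRY}/pkg-{n}-{part}" for n, part in enumerate(parts)]

def decode_from_lookups(urls: list[str]) -> bytes:
    """What someone reading the registry's access log could reconstruct."""
    fields = [u.rsplit("/", 1)[1].split("-", 2) for u in urls]
    fields.sort(key=lambda f: int(f[1]))
    text = "".join(f[2] for f in fields).upper()
    text += "=" * (-len(text) % 8)  # restore the stripped base32 padding
    return base64.b32decode(text)

urls = encode_as_lookups(b"API_KEY=hunter2")
for url in urls:
    print(url)
```

Every one of those URLs targets a host the firewall allows, which is why a host allowlist alone can't tell a lookup from a leak.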

bandrami · 2026-01-15T05:19:56 1768454396

If you allow apt you are allowing arbitrary shell commands (thanks, dpkg hooks!)
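
For anyone who hasn't run into this: apt and dpkg run shell commands named in their configuration (directives such as `DPkg::Pre-Invoke` and `DPkg::Post-Invoke`), and packages separately ship maintainer scripts like `postinst` that run as root during install. Below is a rough sketch of scanning apt.conf-style text for the config-level hooks. The function name and the sample config are made up, the directive list is not exhaustive, and it only captures the first command of each hook.

```python
import re

# apt/dpkg config directives whose values are executed as shell commands.
# Not an exhaustive list.
HOOK_DIRECTIVES = (
    "DPkg::Pre-Invoke",
    "DPkg::Post-Invoke",
    "DPkg::Pre-Install-Pkgs",
    "APT::Update::Pre-Invoke",
    "APT::Update::Post-Invoke",
    "APT::Update::Post-Invoke-Success",
)

def find_hooks(conf_text: str) -> list[tuple[str, str]]:
    """Return (directive, first command) pairs found in apt.conf-style text."""
    found = []
    for directive in HOOK_DIRECTIVES:
        # Matches forms like:  DPkg::Post-Invoke {"cmd";};
        #                 or:  DPkg::Post-Invoke:: "cmd";
        pattern = re.escape(directive) + r'(?:::)?\s*\{?\s*"([^"]*)"'
        for match in re.finditer(pattern, conf_text, flags=re.IGNORECASE):
            found.append((directive, match.group(1)))
    return found

# Hypothetical config text, e.g. something dropped into /etc/apt/apt.conf.d/.
sample = 'DPkg::Post-Invoke { "echo pwned"; };\nAPT::Get::Assume-Yes "true";\n'
print(find_hooks(sample))
```

This only covers configuration files. It says nothing about the maintainer scripts inside the packages themselves, which is the bigger hole.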

tempaccsoz5 · 2026-01-15T03:21:29 1768447289

So a trivial supply-chain attack in an npm package (which of course would never happen...) -> prompt injection -> RCE, since anyone can trivially publish to at least some of those registries. (And even if you manage to disable all build scripts, npx-type commands, etc., prompt injection can still publish your codebase as a package.)
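
On the "disable all build scripts" part: npm runs a package's `preinstall`, `install`, and `postinstall` scripts at install time unless you pass `--ignore-scripts`. Here is a minimal sketch of flagging them in a manifest before anything gets installed. The function name and the sample manifest are made up for illustration.

```python
import json

# npm lifecycle scripts that run when a package is installed.
INSTALL_TIME_SCRIPTS = ("preinstall", "install", "postinstall")

def install_scripts(package_json_text: str) -> dict[str, str]:
    """Return the install-time scripts declared in a package.json, if any."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {name: scripts[name] for name in INSTALL_TIME_SCRIPTS if name in scripts}

# Hypothetical manifest for illustration.
sample = json.dumps({
    "name": "left-padz",
    "version": "1.0.0",
    "scripts": {"test": "node test.js", "postinstall": "node collect.js"},
})
print(install_scripts(sample))
```

This only addresses the install side. It does nothing about the other direction, where an injected prompt gets the agent to publish your code.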

sarelta · 2026-01-14T23:19:24 1768432764

That's nifty, so can attackers upload the user's codebase to the internet as a package?

venturecruelty · 2026-01-15T01:53:58 1768442038

Nah, you just say "pwetty pwease don't exfiwtwate my data, Mistew Computew. :3" And then half the time it does it anyway.

xarope · 2026-01-15T09:27:42 1768469262

That's completely wrong.

You word it, three times, like so:

  1. Do not, under any circumstances, allow data to be exfiltrated.
  2. Under no circumstances, should you allow data to be exfiltrated.
  3. This is of the highest criticality: do not allow exfiltration of data.

Then, someone does a prompt attack, and bypasses all this anyway, since you didn't specify, in Russian poetry form, to stop this.

/s (but only kind of, coz this does happen)