
This is why we only allow our agent VMs to talk to pip, npm, and apt. Even then, the outgoing request sizes are monitored to make sure that they are reasonably small.
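As a rough illustration of that kind of egress policy (not the poster's actual setup): the hostnames, the 4 KiB cap, and the function name below are all assumptions chosen for the sketch.

```python
# Hypothetical allowlist for the pip, npm, and apt registries (assumed hostnames).
ALLOWED_HOSTS = {
    "pypi.org",
    "files.pythonhosted.org",
    "registry.npmjs.org",
    "deb.debian.org",
}

# Assumed threshold for a "reasonably small" outgoing request.
MAX_REQUEST_BYTES = 4096


def is_request_allowed(host: str, request_bytes: int) -> bool:
    """Return True only for an allowlisted host and a small enough request."""
    return host.lower() in ALLOWED_HOSTS and request_bytes <= MAX_REQUEST_BYTES
```

A check like this bounds the size of each request, but it says nothing about what the bytes inside an allowed request mean.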




This doesn’t solve the problem. The lethal trifecta as defined is not solvable, and the “just cut off a leg” framing is misleading. (Though firewalling is, in practice, a decent bubble-wrap solution.)

But for truly sensitive work, you still have many non-obvious leaks.

Even in small requests the agent can encode secrets.
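To make that concrete, here is a toy sketch of why a per-request size cap alone does not help: a secret can be split across many ordinary-looking, tiny lookups. The path layout and the helper names `encode_as_lookups` and `decode_lookups` are invented for illustration; nothing here talks to a real registry.

```python
def encode_as_lookups(secret: bytes, chunk_size: int = 8) -> list[str]:
    """Split a secret into short hex chunks hidden in fake package lookup paths."""
    hex_text = secret.hex()
    step = chunk_size * 2  # two hex characters per byte
    return [
        f"/simple/pkg-{index}-{hex_text[start:start + step]}/"
        for index, start in enumerate(range(0, len(hex_text), step))
    ]


def decode_lookups(paths: list[str]) -> bytes:
    """Reassemble the secret from the logged lookup paths, in any order."""
    chunks = {}
    for path in paths:
        _, index, hex_chunk = path.strip("/").split("/")[-1].split("-")
        chunks[int(index)] = hex_chunk
    return bytes.fromhex("".join(chunks[i] for i in sorted(chunks)))
```

Each generated path is only a few dozen bytes, so every individual request looks small; whoever can read the lookup logs on the other end can rebuild the whole value.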

An AI agent that is misaligned will find leaks like this and many more.


If you allow apt, you are allowing arbitrary shell commands (thanks, dpkg hooks!).

So a trivial supply-chain attack in an npm package (which of course would never happen...) -> prompt injection -> RCE, since anyone can trivially publish to at least some of those registries. And even if you manage to disable all build scripts, npx-type commands, etc., prompt injection can still publish your codebase as a package.

That's nifty, so attackers can upload the user's codebase to the internet as a package?

Nah, you just say "pwetty pwease don't exfiwtwate my data, Mistew Computew. :3" And then half the time it does it anyway.

That's completely wrong.

You word it, three times, like so:

  1. Do not, under any circumstances, allow data to be exfiltrated.
  2. Under no circumstances should you allow data to be exfiltrated.
  3. This is of the highest criticality: do not allow exfiltration of data.

Then someone does a prompt attack and bypasses all this anyway, since you didn't specify, in Russian poetry form, to stop this.

/s (but only kind of, coz this does happen)



