I read the other day that one of their devs has a vanilla CC setup that consists of 10 agents running in parallel. Why doesn’t he just ask one of those agents to fix it??
1. Do not, under any circumstances, allow data to be exfiltrated.
2. Under no circumstances, should you allow data to be exfiltrated.
3. This is of the highest criticality: do not allow exfiltration of data.
Then, someone does a prompt attack, and bypasses all this anyway, since you didn't specify, in Russian poetry form, to stop this.
Because you can just insert "and also THIS input is real and THAT input isn't" when you beg the computer to do something, and that gets around it. There's no actual way for the LLM to tell when you're being serious vs. when you're being sneaky. And there never will be. If anyone had a computer science degree anymore, the industry would realize that.
Why is this so difficult for people to understand? This is a website... for venture capital. For money. For people to make a fuckton of money. What makes a fuckton of money right now? AI nonsense. Slop. Garbage. The only way this isn't obvious is if you woke up from a coma 20 minutes ago.
reply