> An LLM scraper is operating in a hostile environment [...] because you can't particularly tell a JavaScript proof of work system from JavaScript that does other things. [..] for people who would like to exploit your scraper's CPU to do some cryptocurrency mining, or [...] want to waste as much of your CPU as possible).
That's a valid reason why being served JS-based PoW systems scares LLM operators: there's a chance the code might actually be malicious.
That's not a valid reason to serve JS-based PoW systems to human users: the entire reason those proofs work against LLMs is the threat that the code is malicious.
In other words, PoW works against LLM scrapers not because of the PoW itself, but because the script could contain malicious code. Why would you threaten your users with that?
And if you can apply the threat only to LLMs, then why don't you cut the PoW garbage and start with that instead?
I know, it's because it's not so easy. So instead of wielding the sword of Damocles of malware, why not standardize on some PoW algorithm that people can honestly apply without the risks?
I don't know. A sandbox escape from a browser is a big deal, a million-dollar-bounty kind of deal. I feel safe putting an automated browser in a container or a VM and letting it run with a timeout.
And if a site pulls something like that on me, then I just don't take their data. Joke's on them: soon, if something is not visible to AI it will not 'exist', just as happens now when you are delisted from Google.
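The "let it run with a timeout" part can be sketched roughly like this. It only shows the hard time limit; the isolation itself (container or VM) is assumed to be provided by whatever command you pass in, and `run_with_timeout` is a hypothetical helper name, not part of any scraping framework.

```python
import subprocess
import sys


def run_with_timeout(cmd: list[str], seconds: float) -> tuple[bool, str]:
    """Run cmd and give up after `seconds`.

    Returns (finished, stdout). On timeout, subprocess.run kills the child
    process and raises TimeoutExpired, which is reported as (False, "").
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        return False, ""
    return True, result.stdout


if __name__ == "__main__":
    # Stand-in for a containerized headless-browser command (assumption):
    # here we just launch a trivial Python child process.
    print(run_with_timeout([sys.executable, "-c", "print('ok')"], 10))
```

A page that burns CPU past the budget just comes back as `(False, "")`, and the scraper moves on to the next URL.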
Your users - we, browsing the web - are already threatened with this. Adding a PoW changes nothing here.
My browser already has several layers of protection in place. It even lets me improve that protection with add-ons (uBlock etc.), and my OS adds more protection on top. This is enough to allow PoW that's legit but block malicious code.