OpenAI won't watermark ChatGPT text because its users could get caught (theverge.com)
50 points by LaSombra on Aug 5, 2024 | 76 comments


Remember when GPT was "too dangerous" to release into the world, and unless you were Twitter-famous you had to apply and wait for months to even get access? Times sure have changed.


I remember when GPT-2 was "too dangerous" to release. I am confused why people still take these clown claims seriously.


GPT-2 was never too dangerous to release, that's made up. OpenAI were trying to set a precedent to delay releases in anticipation of more dangerous models. This was and remains a good thing.


Maybe "too dangerous" is mainly a PR strategy to hype up the power of the tool.

It's either PR or religious ideology (or both if you drink your own Kool-Aid)


It is now, I don't think it was then. OpenAI's moral compass has taken a sharp turn.


Made up? From the GPT-2 announcement: https://openai.com/index/better-language-models/

"Due to our concerns about malicious applications of the technology, we are not releasing the trained model."

OpenAI said specifically they didn't release it because they thought it was too dangerous to release. There was nothing about precedent setting in there. The idea that this "was and remains" a good thing is hard to understand, given there are dozens of companies that have made much more powerful models available to anyone who signs up and nothing bad has happened. But if you're willing to ascribe intentions to AI companies they never demonstrated having, of course you can argue anything you want. Doesn't mean it's real.


Not sure about "remains". The Llama is out of the bag with that one! OpenAI models aren't the best all-round anymore - there are plenty of faster models to choose from. Claude seems faster. Groq is stupid fast at serving Llama. GPT-4 still has the smarts edge, but it isn't much of an edge any longer. OpenAI is the iPhone: it's 2019 and people are realizing the Androids are better value and just as good.


I don't exactly see Llama as a counterexample. Not much wisdom coming from Facebook in this matter.

Like, yay we got Sydney back! Everybody gets their own death threats. It's an open-weights wonderland. Why people think it's a good idea to reinstantiate the BPD AI I am afraid I will never understand.


I've come to think that they were more or less correct. GPT-2 is the grandfather of the AI-generated garbage under which we are currently drowning.

Of course there's an argument to be made that this was inevitable, but that doesn't mean that the early OpenAI crew were necessarily wrong when they said "this model may make things significantly worse".


Because the trend is the other way. Only a clown would claim GPT-2 is "too dangerous" to release, but not so for GPT-4 or GPT-5.


Did people say GPT from OpenAI was too dangerous to release?

Or was it the fact that OpenAI has demonstrated how good LLMs can get, and that bad actors can train their own uncensored, unaligned LLMs to do "dangerous" things?


I knew some people from OpenAI at the time of GPT-2 development, and indeed that was their rationale - dangerous and needs to be released responsibly. Never before could people produce so much spam and fake news at once - I think this was the first and main worry.


was?

I guess it’s less of a worry now? Why?


The GPT-2 weights were later released, which made some people suspect the 'too dangerous to release' stuff was mostly hype/marketing.


Well, if things get more and more powerful then it becomes more true, not less.

Like giving everyone nuclear weapons. Or machine guns. Or bazookas. Or slaughterbots. Or labs to create any sort of virus.



You’re reading a bit too much into The Verge’s hard-hitting reporting, which essentially amounts to “companies are groups of people and any group of more than a handful of people is very likely to disagree about quite a bit”. Congratulations on getting engagement baited. May as well rename HN to Facebook at this point.


What we're seeing is a new profitable industry, with a tendency towards natural monopoly, refusing to act in the public good because its incentives conflict with it. At stake is the entire corpus of 21st century media, at risk of being drowned out in a tsunami of statistically unidentifiable spam, which has already begun and will only get worse, until nobody will even acknowledge the web post 2022 as worth reading or archiving.

The solution is very, very simple: just regulate it. Force all companies training LLMs to add some method of watermarking with a mean error rate below a set value. OpenAI's concern is that users will switch away to other vendors if they add watermarks? Well, if nobody can provide that service, OpenAI still has the lead. A portion of the market may indeed cease to exist, but in the same vein, if we had always prioritized markets over ethics, nobody would be opposed to having a blooming hitman industry.

Open weights models do exist, but they require much greater investment, which many of the abusers aren't willing to make. Large models already require enough GPU power to be barely competitive with underpaid human labor, and smaller ones seem to already fall into semi-predictable patterns that may not even need watermarking. Ready-made inference APIs, too, can include some light watermarking, while with general-purpose notebooks/VMs the question may still be open.

Still, it's all about effort to effect ratio. Sometimes inconvenience is enough to approximate practical impossibility for 80% of the users.


You are DEEPLY WRONG on all issues you mentioned.

Open weight models do not require great investment. In fact, I can run them on my 400 EUR computer.

Also, why would you want to regulate text output from machines in the name of the "public good"? That's insanity.


Why exactly is it insane? To reliably differentiate (let's assume it's possible for the sake of argument) between "you made this" and "you didn't make this", or at least "a human made this", seems to carry mostly (if not only) benefits.


the problem is your parenthetical - it's not possible, so attempting to do so is pointless. what's worse than a watermark? one that doesn't actually work.


OpenAI literally said they have a semi-resilient method with 99.9% accuracy. It will become fully resilient for practical purposes if all LLMs implement something similar.


> OpenAI literally said they have a semi-resilient method with 99.9% accuracy.

They also said many other things that never happened. And they never showed it. I bet $100 they do not have a semi-resilient method with 99.9% accuracy, especially with all the evolving issues around human- vs. computer-made content.

I also bet that the `semi-` prefix leaves a lot of room for interpretation, and that they are not releasing this for more reasons than "our model is too good".


I really don't see what's in it for them to brag about a non-existent feature that's not in their commercial interest when its non-implementation can be turned into a stick to beat them with, so I believe they have something, yes. I don't necessarily believe the 99.9%, but with that proviso I'll take your bet.


The Verge doesn't report this, but other reports have said that the watermark is easily beatable by doing things like a Google Translate roundtrip, or asking the model to add emoji and then deleting them.


> the problem is your parenthetical - it's not possible, so attempting to do so is pointless. what's worse than a watermark? one that doesn't actually work.

If it's not possible to watermark, then just ban LLMs.

Tech people have this weird self-serving assumption that the tech must be developed and must be used, and if it causes harms that can't be mitigated then we must accept the harm and live with it. It's really an anti-humanist, tech-first POV.


The comment was referring to models close to the recent releases from Meta and Mistral, reaching up to 405B with performance competitive with large commercial vendors. These models absolutely can't be trained without significant investment, and their inference without a cloud provider isn't cheap either. As I mentioned, nothing short of not having released the weights could have stopped the abuse, but still, a fraction of it could be deterred, hopefully adding up to a few billion fewer spam pages for search engines to serve back to you.

As for the rationality of watermarking itself, firstly I'd like to reiterate, no spam wave of this magnitude and undetectability has ever happened in the history of the web. A word processor cannot write a petabyte of propaganda on its own. A Markov chain can't generate anything convincing enough to fool a human. Transformer-based LLMs are the first of their kind and should be treated as such. There is no quick analogy or a rule of thumb to point to.

If statistical watermarking is proven to have sufficient recall and a low enough error rate, there'll be nothing to lose in implementing it. A demand already exists for detecting AI slop; half-working BERT classifiers and prejudiced human sniff tests already provide for it, with little incentive to reduce false positives. With watermarks, there'll be a less painful, more certain way to catch the worst offenders. Do you really think the same operations that produce papers with titles like "Sorry, as an AI model..." or papers with pieces of ChatGPT UI text will care to roundtrip-translate or rewrite entire paragraphs?

We already had this exact dilemma back when email spammers tried Bayesian poisoning [0]. Turns out, it actually creates an identifiable pattern, if not for the system, then for the user on the other side. People will train themselves to look for oddly phrased sentences or the outright nonsense roundtripping produces, abrupt shifts in writing style, and other heuristics, and once a large enough corpus is there, we can talk about training a new classifier, this time on a much more stable pattern with fewer type-I errors.

[0] https://en.wikipedia.org/wiki/Bayesian_poisoning


I don't think this is a good idea. Also, how would you watermark text exactly? Hash all generated texts that were ever put out? Not feasible I believe and easily circumvented.

I can run a specialized AI trained for a certain domain on my laptop that reaches GPT levels and sometimes even goes beyond that. The work needed here is exactly the kind of investment malicious actors would also make, and they would have a competitive advantage.

I believe we should watch the developments, but I don't believe regulation is warranted yet.


> Open weights models do exist, but they require much greater investment, which many of the abusers aren't willing to make.

I think the biggest concern is state actors who have no problem spending money on some GPUs. I don't think it is feasible to watermark open weights LLM outputs.


> Force all companies training LLMs to add some method of watermarking with a mean error rate below a set value.

How do you watermark plain text?


This is a fairly obvious initial question which I assume nearly everyone who doesn't already have a rough answer in mind would ask, so I'm happy to report that it's fortunately quite clearly addressed in TFA, and in fact makes up a significant part of the (not very long) piece.


"I seem to do fine for a stretch, but at the of the sentence I say the wrong cranberry."


So deliberately bork the output in such a way that users lose all confidence in the product?


no, you modify the output probability so that you sample in a deterministic pseudo-probabilistic way - i.e. save the seed, and insert low SNR bias into the sampling. you can recover the bias afterwards and prove you generated the sequence.

my example was just a reference to Jarvis in the avengers.
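
For the curious, here's a toy sketch of that idea in the spirit of the published "green list" schemes (e.g. Kirchenbauer et al.) - to be clear, this is not OpenAI's actual method, and the key, fraction, and bias constants are invented for illustration:

    import hashlib
    import math
    import random

    SECRET_KEY = "example-key"  # hypothetical; a real deployment keeps this private
    GREEN_FRACTION = 0.5        # fraction of the vocabulary favoured at each step
    BIAS = 2.0                  # the low-SNR logit boost given to "green" tokens

    def green_set(prev_token: int, vocab_size: int) -> set:
        # Seed a PRNG from the secret key and the previous token, then pick a
        # pseudo-random "green" subset of the vocabulary for this position.
        seed = hashlib.sha256(f"{SECRET_KEY}:{prev_token}".encode()).hexdigest()
        rng = random.Random(seed)
        return set(rng.sample(range(vocab_size), int(vocab_size * GREEN_FRACTION)))

    def sample_with_watermark(logits, prev_token: int) -> int:
        # Nudge green tokens up a little, then sample from the softmax as usual.
        green = green_set(prev_token, len(logits))
        biased = [x + BIAS if i in green else x for i, x in enumerate(logits)]
        m = max(biased)
        weights = [math.exp(x - m) for x in biased]
        return random.choices(range(len(logits)), weights=weights, k=1)[0]

The bias is small enough that any individual token looks normal; the signal only emerges statistically over many tokens, which is what lets you "recover" it later.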


You should read the original article... OpenAI say it doesn't downgrade quality.


On a personal level, I'm happy about this. Having all personal info trackable is a tiring aspect of modern life. Remember when social media didn't zero out the EXIF on photos and anyone could easily grab a map to your family's home?

On a society level? I'm still ok with this. Watermarks are trivially removed by the motivated and unpleasant.


> Watermarks are trivially removed by the motivated and unpleasant

I heard some chatter from OpenAI folks some 1-2 years ago that the way they'd watermark text is that rather than just using random numbers in the top-k/p sampling process, you have a pseudorandom sequence of numbers that either follows a pattern or you save them directly. This way you could fairly trivially build a tool that determines with very high accuracy whether or not a sequence of words has been generated by your model.

I think such a watermark can't be trivially removed unless you rewrite the text, or at least large portions of it.
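
Detection on that kind of scheme is just a statistical test. Reusing the toy green_set sketch from elsewhere in the thread (again, a hypothetical illustration rather than OpenAI's tool), it would look roughly like:

    def watermark_z_score(tokens, vocab_size: int) -> float:
        # Count how many tokens landed in their position's "green" set.  Unbiased
        # (human) text hits the green set about GREEN_FRACTION of the time, so a
        # large positive z-score suggests the text was sampled with the bias.
        hits = sum(
            1 for prev, tok in zip(tokens, tokens[1:])
            if tok in green_set(prev, vocab_size)
        )
        n = len(tokens) - 1
        expected = n * GREEN_FRACTION
        std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
        return (hits - expected) / std

A paraphrase that changes most of the tokens washes the signal out, but fixing a few words or stripping some emoji barely moves the score, which matches the "you'd have to rewrite large portions of it" intuition.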


It's hard to remove this watermark, because the "watermark" is adjusting the probabilities of what tokens are generated, rather than just slapping "generated by ChatGPT" over the top. You'd have to actually rewrite the text to remove it.


It was once hard to remove watermarks from photos.


So the fact that it's watermarked literally means it was tampered with to produce some form of predetermined output. There's no other way to do it.

And because they are entirely uncontrollable, they have to add a lot of checks in the prompts. But we all know that hasn't worked. A series of manually built if statements is akin to an expert system. The first thing I learned about those decades ago was that they were mostly a failed experiment.


I'm not sure what you are saying here. Do you really believe that internally OpenAI does not track and save important info based on your inputs, creating a "persona"? This is any intelligence agency's wet dream.

The functionality that was introduced to save info about you across chats, in order to "improve" responses, basically made a whole profile out of you as a person.

Oh, and what a surprise - OpenAI directors since this year are ex-CIA or have very close connections to the agency.


No, I believe and understand that. Doesn't mean I want more.


It doesn't seem trackable. And it may not be perfect, but that's no reason to discard it. At the moment, it's being used at a large scale to avoid homework by lazy students, which is nearly everyone. How much further do you want tech to erode education?


Not the parent, but I personally would want tech to erode education exactly to the level where students aren't asked any more to spend time on things that machines can perform trivially for us. Education should prepare us for the real world of today, rather than some make-believe role-play version of a bureaucratic office from the late 19th century, where we don't have computers and the only way you have of affecting the world is by writing memos with a pencil.


Education should train your brain, not "prepare for the real world," if only because you can't define how to do that, nor what real world needs are. Is math a real world need? Grammar? Geography? Sports? Biology? With the silly reduction to "things that machines can perform trivially for us" even reading and writing won't be needed. And like that, there's suddenly a great surplus of farm hands, miners, and opioid users.


> where students aren't asked any more to spend time on things that machines can perform trivially for us.

Why do people lift weights at the gym if machines can trivially do that?


I'm pretty sure that if those electric muscle stimulation devices actually worked well enough to allow people to build muscles without needing to go to the gym, they'd do that, at least I would.


Does ChatGPT allow you to learn without studying?


Yes, in a way. When I want to do something that I've never done before (typically as part of a larger project I'm working on), I often ask ChatGPT/Claude for advice and get really useful just-in-time explanations, as part of a conversation that I often continue throughout my work on that task, getting additional support and guidance. I usually learn a lot from these conversations, such that the next time I approach a similar situation, I can generally do so on my own.

I'm not necessarily arguing that there's no more need for any ahead of time "study", but I think that with the equivalent of a personal tutor for everyone, we can achieve a lot more by learning while doing, instead of study being a dedicated activity.

I might be a bit sentimental here, but some of these interactions with an LLM bring back memories of me being a kid working on some small project at home on a Saturday, and being able to come to my dad for advice and even hands-on assistance. And on a less sentimental note, I believe that effective use of an LLM while working on a project fits really well with Papert's Constructionism approach to learning.


My concern isn't the details of tracking, only the fact of it. As for eroding education, I believe 98% of homework is useless for learning. A kid that cheats with GPT is a kid that would cheat off of friends. The tech, in its cheating and anti-cheating, does not degrade or improve the antisocial effects on education.


Those two statements don't hold up. Homework reinforces skills, and practically all kids cheat when the bar is low enough, but that bar doesn't affect their behavior in direct social relations so easily (as you readily admit), and most are really quite honest.

And it's not about tech's anti-social influence in this case, but on the intellectual state of humanity as a whole. We've made a whole generation addicted to their phone, and now you want to remove any bit of knowledge?


I do not think adding covert metadata (which this seems to be, however well encoded) to output is a good addition to any program, regardless of whether it's a language model or something even more sinister. I am not so naive as to believe OpenAI refuses to do it out of the goodness of their heart, but it's still a welcome benefit to me.


Why not post the original article that The Verge has basically summarised?

https://www.wsj.com/tech/ai/openai-tool-chatgpt-cheating-wri...


Because most people cannot read that article due to the paywall.


I think it's just too easy to fool it. As the article says:

> But it says techniques like rewording with another model make it "trivial to circumvention by bad actors."


That and potentially asking it specifically to not follow certain patterns. I also wonder if the false positive rate in a college/school setting will be higher. Because to me, it seems that these models are often trained on college papers.

Because I see a lot of the patterns you already see on those. Specifically, paragraphs that start with conjunctive adverbs and phrases. Things like: however, furthermore, moreover, in summary, in conclusion, etc.

Besides, smart students that are not utterly lazy will be able to work around it anyway. They'll let ChatGPT (or whatever LLM) turn out an entire paper for the contents, then rewrite it themselves.

So a tool like this will only catch the most obvious cases. Meaning that in the end it only battles a symptom by effectively sweeping it under the carpet.


Am I right that "bad actors" here refers to the actual paying users, i.e. us and our children?


If our children use it to cheat in school then I suppose yes, those who do so are bad actors.


Today, a colleague committed ten lines of code outputted by ChatGPT, without testing it, and the systems broke. Any chance watermarking would help with this? Just kidding (about the watermarking).

The AI cat is already out of the bag. I'm for regulation when it comes to AI direst threats, but students using ChatGPT to cheat is a problem that can be solved with live, supervised exams, the kind I had at school in the nineties...we couldn't even use a pocket calculator.

If anything, I'm torn that today's AI systems aren't good enough to do the really serious work. Mark my words, we will have self-aware murder robots before we have AI systems able to write quantum-simulation software.


> Today, a colleague committed ten lines of code outputted by ChatGPT, without testing it, and the systems broke

Do you work at Crowdstrike?

But seriously, I can only imagine how bad their code would have been without ChatGPT.


Kind of pointless when there are many open-source models approaching ChatGPT level that people could use instead.


While the models might be there, the hardware to actually run them on is much less widespread, making it by definition less of an issue. The group of people with the knowledge to actually run them is even smaller. The latter might not be much of a hurdle given tools like Ollama and Jan, but for now it still is a bit of one.


If I understand the regulation correctly, they will have to add it in order to comply with the AI act:

> In addition, providers will have to design systems in a way that synthetic audio, video, text and images content is marked in a machine-readable format, and detectable as artificially generated or manipulated.

https://ec.europa.eu/commission/presscorner/detail/en/ip_24_...


Curious about the idea of watermarking text. How would one go about embedding a watermark in plain text without altering its readability? Are there specific techniques or tools designed for this purpose? To me it seems like a challenging task. Given the simplicity of text, what methods could ensure the watermark remains intact? Perhaps there's a way to subtly adjust character frequencies or patterns? That said, I'd love some good sources to delve into.


Perhaps all documents should be watermarked “grammar improved by Microsoft Word” or “Animation effects provided by Keynote” or have films watermarked with “Automatic Color Correction provided by DaVinci.” It Takes a Village by Hillary Clinton: “Ghostwritten by Barbara Feinman.”

If we want to watermark GPT, fine — then let’s watermark absolutely everything not directly and personally created by the claimed creator. But we’re getting into interesting legal territory here — work for hire agreements would be in jeopardy because authorship of something under work for hire is owned by then company, not the contractor. There’s also a First Amendment issue — requiring companies and individuals to watermark creative works or disclose uncredited authors amounts to compelled speech that doesn’t serve a public interest high enough to provide an exception to First Amendment protections. The unintended (or perhaps subversively intended) consequences of requiring watermarks can be astounding. Journalists could potentially be compelled to reveal sources for instance because the content they’ve created was partially provided by someone else. It’s a stretch, but then again, the gymnastics courts and prosecutors routinely employ make such scenarios plausible (albeit unlikely.)


Bird of paradise courtship spectacle (dance) for the masses.


We solved this with a simple but novel solution that doesn't break with rewording/paraphrasing. The problem we've run into is that teachers don't want to have the hard conversation when they get the evidence of cheating. And school leadership doesn't want to fix the issue; they want to put a chatbot on their resume. Sadly, I don't think OpenAI releasing this would have any effect on fixing the problem. It's a problem with the people in education.


    It's a problem with the people in education.
Sounds absurd to reduce all of the thousands of school districts; millions of educators, K-12 teachers, professors, and administrators; and multitudes of viewpoints into one "ignorant" block of hapless Luddites.


We've had hundreds of conversations with different schools and teachers. It's definitely a generalization, but it's pretty disheartening how common these views are.


    We've had hundreds of conversations with different schools and teachers
Show me the paper. Let's see what the actual data looks like.


To be fair, the Luddites weren't anti technology. They were standing against factories pricing them out of labor.


...or maybe the education of what we deem as useful work needs to change.

Imagine if digital cameras were watermarking their photos because art classes refused to consider photography as a form of art.


...until they are legally required, just like producers of printers are.


> On one hand, it seems like the responsible thing to do; on the other, it could hurt its bottom line.

Remember that next time OpenAI and Sam Altman throw sand in your eyes with “our goal is to empower and serve humanity” or whatever they try to sell you. If they believed what they preach, the choice would’ve been obvious.


Yeah. Okay. Sure. How do you feel about community-developed / open-source models that'll certainly do the same thing, citing "freedom" and "choice" as the justification?

If people are just salty about OpenAI and want to buy into the Sam Bad culture war then so be it, but trying to make sense of this contradictory backwards-engineered justification is tiring.


> How do you feel about community-developed / open-source models that'll certainly do the same thing, citing "freedom" and "choice" as the justification?

Bullshit rationalisations are bullshit no matter who they come from. Being open-source doesn’t excuse you from it.

> If people are just salty about OpenAI and want to buy into the Sam Bad culture war then so be it

Your mistake is assuming you know why someone you've never met has an opinion about something, and immediately thinking the worst of it, instead of understanding that individuals can think and form opinions for themselves.

I was criticising Sam Altman for his Worldcoin crypto scam way before OpenAI was a worldwide phenomenon.

https://www.technologyreview.com/2022/04/06/1048981/worldcoi...

https://www.buzzfeednews.com/article/richardnieva/worldcoin-...



