
What would be high enough? I agree 90% isn't perfect, but neither are LLMs.


What can you do with 90%? Accuse people of plagiarism and ignore the fact that you will hurt 10% of innocent people, while still letting 10% of cheaters through? Of course there's ambiguity in the "accuracy" term, but I assumed you can be inaccurate in both directions.


Actually, if you read the paper, you're allowing a much higher percentage of cheaters. They optimized to avoid false accusations: it's only ~45-75% accurate at detecting AI writing, and closer to 90% accurate at detecting human writing. Half the cheaters get through, and you still fail 10% of the people who didn't cheat.
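The distinction between per-class rates and a single "accuracy" number is easy to illustrate. A minimal sketch with hypothetical counts in the ballpark of the figures quoted in this thread (not the paper's actual data):

```python
# Hypothetical confusion-matrix counts, roughly matching the thread's
# figures (NOT the paper's actual data): the detector catches about
# half of the AI essays but flags few human ones.
human_total, human_flagged_as_ai = 30, 3   # ~10% of humans falsely flagged
ai_total, ai_flagged_as_ai = 30, 15        # ~50% of AI essays caught

false_positive_rate = human_flagged_as_ai / human_total
false_negative_rate = (ai_total - ai_flagged_as_ai) / ai_total
overall_accuracy = ((human_total - human_flagged_as_ai) + ai_flagged_as_ai) \
    / (human_total + ai_total)

print(false_positive_rate)  # 0.1 -> innocent students accused
print(false_negative_rate)  # 0.5 -> cheaters who slip through
print(overall_accuracy)     # 0.7 -> one number that hides both failure modes
```

With these made-up counts, "90% accurate on human writing" and "50% accurate on AI writing" average out to 70% overall, which is why a single accuracy figure tells you almost nothing about who gets hurt.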


> It's closer to 90% accurate at detecting human writing.

I know that's what they wrote, but I strongly disagree. It got 28/30 (93%) correct, but of the two it got "wrong":

- one was just straight up not rated because the file format was odd or something

- the other got rated as 11% AI-written, which imo is very low. I think teachers would consider this "human-written"; when I was being evaluated with Turnitin, that percentage of detected "plagiarism" would simply have been ignored.


At this point the most basic users could be easily picked off, and that list of telltale styles will grow yearly.


> Of course there's ambiguity in the "accuracy" term, but I assumed you can be inaccurate in both directions.

The linked article breaks it down. The measured false positive rate is essentially 0 in this small study.


Are you going to fail 10% of students who did their own work because they supposedly cheated? What exactly can you do with this 90% accurate judgment from a black box? Perhaps not let them out on bail?


No, read the paper. They're going to pass 10% of students who cheated. The 90% figure is the false negative rate: how many AI essays it says are human.

The false positive rate is 0. The tool *never* says human writing is AI.


> The false positive rate is 0. The tool never says human writing is AI.

That cannot be true, as it would be easy for a human to write in the style of AI if they chose to. Whoever is making that claim is lying, because money...


Read the paper, dude. It's not an advertisement; it's an investigation. They ran an experiment including 29 human-written papers. One of them got a score of 11% likely to be AI; the rest got a score of 0%. The tool never labeled any human writing as AI with high confidence.

> That cannot be true as it would be easy for a human to write in the style of AI, if they choose to.

Is that the nightmare scenario that everybody in this thread is freaking out about?

Students who go to great effort to deliberately make it look like they are cheating: those are the ones you're afraid will be falsely accused of cheating?

We're on our way to dystopia because people who go out of their way to look suspicious on purpose arouse suspicion?


The reliability of any AI tool with potentially severe consequences for people needs to be tested using adversarial patterns. This is nothing new, yet the article in question fails to do that. They test the happy path and find the results satisfactory for themselves.

It is very common for academic investigations to report results with more than 95% accuracy, let alone 90%, while the same AI tools fail miserably in the real world.

So, yes, this is the nightmare scenario I am afraid of: a simplistic "investigation" being used to justify unproven AI tools with real-life consequences for people.


> Are you going to fail 10% of students who did their own work because they supposedly cheated?

The linked article analyzes their data in more detail. In particular, the measured false positive rate is essentially 0 in this small study.


90% accurate doesn't mean 10% false positives; I'd want the 90% accuracy to hold 100% of the time.

This isn't Zoolander math. Or is it?


If I get AI to generate an essay and rewrite every word in my own words whilst keeping the same general meaning of the original text, surely there's no reasonable way to detect that, right?

I mean, the solution is just in-class-only essays, right? Or to stop with the weird obsession with testing and just focus on actually teaching.


There will be, because over time the lazy and passive copying will itself get lazier and bring in more of the AI patterns.

The better way to use AI is to have it teach you to write the essay better and faster each time, so that it remains your voice: it starts from how you already write and develops from there.

Everyone speaks and writes differently enough that it's like a gait. AI erases that, both directly and indirectly.

People who think they're clever with AI but won't spend time developing any actual skills will always get exposed eventually.

Colleges are introducing rules that if they can detect AI use after you graduate, they will cancel your degree. There are fun watches on YouTube showing it.

Have fun!


Just don't grade essays? Make it clear that essays are optional and not required to get a grade, but that they're a good way to learn. That will cut down the amount of work to be done, too.

Them failing exams because they didn't do the work is on them.



