> *respect signals like subscription paywalls, the robots.txt file, the HTML “no...

protocolture · on June 21, 2024

> Yes, but how can it be detected reliably? Considering there is much to be gained in fooling us. (think parallel construction)

I was fooling around with a fan-constructed add-on training module for NovelAI.

I had a blast; reconstructing a few different narratives and sort of melding them together was a lot of fun.

I let the tool name the characters. And the names it came up with were better than halfway decent. So I kept them.

Turns out that while NovelAI makes some sort of best effort to remove copyrighted proper nouns, the additional training module had reinserted some. After it had picked the first character's name, it immediately picked that character's brother for the next one.

If I hadn't gotten suspicious and googled the names in depth, I might have spread the story around. It was never destined to be published, but I could see people falling into the same trap.

o11c · on June 21, 2024

The linked article cited YouTube's Content ID, so ... clearly reliability isn't expected.