Hacker News

My read on it was that the terms of use of content services can prevent that content being used to train LLMs.

If LLMs were trained using data from these services, then it's possible that this will be challenged in court. Should the court rule in favour of the content services, the LLMs may need to be re-trained (or, more likely, their providers would negotiate compensation).




Hm, I thought the overall goal was that you would train LLMs on that data, but the owners of the data would be compensated when output was generated that was influenced by it.

Somehow we have to be able to train LLMs on high-quality information, without having the resulting generative capability destroy the economic support for creating that information in the first place.



