
Image/video generation could be used to advance LLMs quite substantially:

If, during its "thinking" phase, the LLM encountered a scenario where it had to imagine a particular scene (let's say a pink elephant in a hotel lobby), it could internally generate that image and use it to aid world simulation and understanding.

This is what happens in my head at least!


