Nah. The models are great, but they can also write a story where characters who are clearly specified in the prompt as never having met immediately address each other by name.
These models don't understand anything resembling reality, and they can be confused by all sorts of things.
This can obviously be managed, and people have achieved great things with them, including this IMO stuff, but the models are, despite their capability, very, very far from AGI. They also have atrocious performance on things like IQ tests.
Yeah, that framing for LLMs is one of my pet causes: it's document generation, some documents resemble stories with characters, and everything else (e.g. "chatting" with an LLM) is an illusion, albeit an impressive and sometimes-useful one.
Being able to generate a document where humans perceive plausible statements from Santa Claus does not mean Santa Claus now lives inside the electronic box, that flying sleighs are real, etc. The principle still holds even if the character is described as "an intelligent AI assistant named [Product Name]".
I don't understand your comment. If I phrase it in the terms of your document view, what I'm trying to say is that even though the models can generate some documents (computer programs, answers to questions), they are terrible at generating others, such as stories.
I'm underlining that "it's a story, not a conversation" is indeed the direction we need to think in when discussing these systems, where an additional step along that direction is "it's a document which humans can perceive as a story." That's the level on which we need to engage with the problem, asking what features of a document seem wrong to us and why it might have been iteratively constructed that way.
In the opposite direction, people (understandably) fall for the illusion, and start operating under the assumption that they are "talking to" some kind of persistent entity which is capable of having goals, beliefs, or personality traits. Voodoo debugging.