The emergent behavior is much more obvious in GPT-4 than in GPT-3.5. It seems to arise once the training data sets get extremely large.
I notice it when the AI conversation is extended for a number of interactions - the AI appears to take the initiative to produce discourse that would not be expected in just LLMs, and which seems more human. It's hard to put a finger on, but, as a human, "I know it when I see it".
Since injecting noise is part of the algorithm, the AI output is different for each cycle. The weights are partially stochastic and not fully programmed. The feedback weights are likely particularly sensitive to this.
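To make the noise point concrete: in typical LLM decoding, randomness enters when the next token is sampled from the model's output distribution, usually scaled by a temperature parameter. Below is a minimal sketch of that idea. The logits are made-up numbers and `sample_token` is a hypothetical helper, not GPT-4's actual decoder.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
rng = random.Random(0)

# Temperature 1.0: repeated calls return different tokens.
samples = [sample_token(logits, 1.0, rng) for _ in range(1000)]
counts = [samples.count(i) for i in range(len(logits))]
print("counts at T=1.0:", counts)

# Very low temperature: sampling collapses onto the highest-scoring token.
near_greedy = [sample_token(logits, 0.01, rng) for _ in range(100)]
print("distinct tokens at T=0.01:", set(near_greedy))
```

With the same scores, the sampled token varies from call to call at temperature 1.0 and becomes effectively deterministic as the temperature approaches zero.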
In any case, it's early days. Check out the Microsoft paper, Sparks of Artificial General Intelligence: Early experiments with GPT-4
> The emergent behavior is much more obvious in GPT-4 than in GPT-3.5.
What emergent behavior?
> I notice it when the AI conversation is extended for a number of interactions - the AI appears to take the initiative to produce discourse that would not be expected in just LLMs, and which seems more human.
Maybe that's not what you expect, but that's exactly what I would expect. More training data, better-trained models. Given that they're trained on human-written data, they act more like that data. Note that this doesn't mean they're acting more human, only that the output can seem more human in some ways.
> The weights are partially stochastic and not fully programmed.
Right... but by the law of averages, the randomness would eventually average out. You might end up with different weights, but that just indicates different means of performing similar tasks. It's always an approximation, but the "error" would decrease over repeated sampling.
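The averaging argument can be checked with a quick simulation. This is only a toy sketch: it assumes independent Gaussian noise around a fixed true value, which is not a model of actual LLM training, and `mean_abs_error` is a name made up for the illustration.

```python
import random
import statistics

def mean_abs_error(n_samples, true_value=1.0, noise_std=0.5, trials=200, seed=0):
    """Average |estimate - true_value|, where each estimate is the mean of n_samples noisy draws."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        draws = [true_value + rng.gauss(0.0, noise_std) for _ in range(n_samples)]
        errors.append(abs(statistics.fmean(draws) - true_value))
    return statistics.fmean(errors)

# Error of the averaged estimate for increasing sample counts.
errors_by_n = {n: mean_abs_error(n) for n in (1, 10, 100, 1000)}
for n, err in errors_by_n.items():
    print(f"n={n:>4}  mean abs error ~ {err:.4f}")
```

Under these assumptions the error shrinks roughly in proportion to 1/sqrt(n), so it decreases with more samples but never reaches exactly zero.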
> In any case, it's early days. Check out the Microsoft paper, Sparks of Artificial General Intelligence: Early experiments with GPT-4