Anecdotes sometimes take a beating, but I happen to like the personal ones. Thanks for sharing.
A quick thought about your success: ChatGPT's imprecision and stochasticity can work in its favor for many creative efforts. Unexpected token connections can have a lot of value in a space where vast numbers of novel directions are worthwhile.
For me, having spent thousands of hours thinking about statistics, ML, logic, and reasoning, ChatGPT is not paradoxical. To me, the human aspect is more interesting; namely, the ways in which people are surprised reveal a tremendous diversity in people's expectations about intelligence, algorithms, and pattern-matching.
For many people, most of the time, basic reasoning is a minimum requirement for intelligence. By themselves, sequence to sequence models are not computationally capable of deductive reasoning with an arbitrary number of steps, since that would require recursion (or iteration).
I don't think I've spent nearly as much time as you thinking about these things, and I'm not entirely sure I understood your perspective, but I have a couple of reflections for you which perhaps you can comment on:
> By themselves, sequence to sequence models are not computationally capable of deductive reasoning with an arbitrary number of steps, since that would require recursion (or iteration).
Isn't the fact that LLMs perform their inference step by step, where in each step they output only one token, an instance of deductive reasoning with a (potentially) arbitrary number of steps?
I say this because on each inference step, the tokens that were previously generated do become part of the input.
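To make concrete the feedback loop I mean, here's a minimal Python sketch of greedy autoregressive decoding, with a toy stand-in for the model (the names and numbers are made up, purely for illustration, not any particular library's API):

```python
def generate(next_token_logits, prompt_tokens, max_new_tokens=50, eos_token=0):
    """Greedy autoregressive decoding: each new token is appended to the
    context, so every later step conditions on everything generated so far."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)  # model sees prompt + all prior outputs
        next_tok = max(range(len(logits)), key=logits.__getitem__)  # greedy pick
        tokens.append(next_tok)             # the output becomes part of the input
        if next_tok == eos_token:
            break
    return tokens

# Toy stand-in for a real model over a 5-token vocabulary:
# it always prefers (last token + 1) mod 5.
def toy_logits(tokens):
    preferred = (tokens[-1] + 1) % 5
    return [1.0 if i == preferred else 0.0 for i in range(5)]

print(generate(toy_logits, [1, 2]))  # -> [1, 2, 3, 4, 0] (stops at the eos token 0)
```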
At a higher level of abstraction, I'm also thinking about chain-of-thought prompting, in which LLMs first output the easier-to-deduce steps, then build on these steps to perform further deductive steps until they finally produce the desired answer [1].
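As a made-up illustration of the kind of prompt I mean (not an example taken from [1]):

```python
# Made-up chain-of-thought style prompt, illustrative only.
prompt = (
    "Q: A shelf holds 4 boxes with 6 apples each. 5 apples are removed. "
    "How many apples are left?\n"
    "A: Let's think step by step.\n"
)
# A chain-of-thought completion typically spells out the intermediate deductions:
#   "4 boxes of 6 apples is 24 apples. Removing 5 leaves 24 - 5 = 19. The answer is 19."
# Each deduced step is fed back into the context and supports the next one.
```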
Of course, they have a limited context, but the context can be (and has been) increased. And humans have a limited context as well (except if we consider long-term memory or taking notes, perhaps).
The main difference I see is that in LLM chain-of-thought reasoning, they are currently outputting their intermediate "thoughts" before actually giving the final answer, whereas we humans are capable of staying silent until we've actually figured out the answer, which we then "output" as speech [2].
So I think there is still a form of recursion or iteration happening in LLMs; it's just that it's in a somewhat limited form, in that we observe it as it happens, i.e. as they output tokens one by one.
That said, something that I think could really make LLMs take a big step forward would be to have something akin to long-term memory. And the other big step would probably be being able to learn continuously, rather than only during their training. These two potential steps might even be the same thing.
So I don't know. I'm obviously not an expert, but those are my thoughts with regard to what you've just said.
[2] Interestingly, there have been studies that show that humans produce micro-speech patterns when we are thinking, i.e. as if we are really speaking, although imperceptibly. That said, I have no idea how trustworthy these studies are.
First, I hope that my estimate of hours input into my brain didn't come across as boastful. I'm still working on the balancing act of stating my experience so people get my point of view without sounding arrogant. In this case, I should have also said that thinking about anything long enough can sometimes cause some of the wonder to fade. Luckily, though, for me, the curiosity remains, just focused in different directions.
Second, your comment above covers the ground I was referring to regarding deduction. It seems like we're on the same page. The main difference may be where one draws the lines. When I said "by themselves sequence to sequence models..." I was excluding algorithms that chain language models together in various ways.
Not too long ago, when people said "AI", that tended to refer to algorithms like forward chaining over a set of facts.
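For anyone who hasn't seen that flavor of AI, a toy forward-chaining loop looks roughly like this (illustrative Python; the facts and rules are invented):

```python
def forward_chain(facts, rules):
    """Repeatedly apply any rule whose premises are all known facts,
    adding its conclusion, until nothing new can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)
                changed = True
    return facts

# Made-up facts and rules, purely for illustration.
rules = [
    ({"rain"}, "wet_ground"),
    ({"wet_ground", "cold"}, "icy_ground"),
]
print(forward_chain({"rain", "cold"}, rules))
# -> {'rain', 'cold', 'wet_ground', 'icy_ground'}
```

The outer loop runs until a fixed point, so the number of deduction steps isn't bounded in advance; that unbounded iteration is the part a single forward pass of a sequence to sequence model doesn't give you.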