GPT-4 can (1) translate, (2) plagiarise, and (3) feed its own output back to itself ("thinking out loud").
Its ability to feed back (3) lets it execute algorithms, but only a certain class of algorithms. Without tailored prompting, it's further restricted to (a weak generalisation of) algorithms spelled out in its corpus. This is very cool, but it's a skill I possess too, so it's rarely useful to me.
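To make (3) concrete: the pattern is just "write out the intermediate steps, then the answer", with every step fed back in as context for the next one. A minimal sketch, assuming the `openai` Python client and an API key in the environment – the model name, prompt wording, and bubble-sort task are mine, purely illustrative:

```python
# "Thinking out loud": the model can execute an algorithm only because each
# intermediate step it emits is fed back to it as context for the next step.
# Assumes the `openai` Python package (v1+) and OPENAI_API_KEY in the env.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",
    temperature=0,
    messages=[{
        "role": "user",
        "content": (
            "Sort the list [5, 3, 8, 1] with bubble sort. "
            "Write out every comparison and swap before stating the answer."
        ),
    }],
)
print(response.choices[0].message.content)
```

Ask for the answer alone, with no worked steps, and it does markedly worse – that's the sense in which the feedback is doing the work.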
Its ability to plagiarise (2) can make it seem to have capabilities it doesn't possess, but it's usually possible to poke holes in that facade (if not outright identify the sources it's plagiarising from!).
It is genuinely capable of explicit translation (1) – though a dedicated setup for translation will work better than ChatGPT-style prompting, even on the same model. A sufficiently large, sufficiently well-trained model will be genuinely capable of translating idiomatic language (for known idioms), for the same reason it can translate grammatical structures (for known grammar).
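By "dedicated setup" I mean something like this: pin the model to literal, sentence-by-sentence output with a fixed system prompt and temperature 0, rather than conversing with it. A sketch only, again assuming the `openai` Python client; the model name, prompt wording, and example sentence are all mine:

```python
# A "dedicated setup" for translation: fixed system prompt, temperature 0,
# no conversation. Assumes the `openai` Python package (v1+); the prompt
# wording and example sentence are illustrative.
from openai import OpenAI

client = OpenAI()

def translate_literal(text: str, source: str = "Japanese", target: str = "English") -> str:
    """Request a literal, sentence-by-sentence translation and nothing else."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": (
                    f"You are a {source}-to-{target} translator. Translate the "
                    "user's text literally, sentence by sentence. Do not "
                    "paraphrase, summarise, localise, or omit anything."
                ),
            },
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(translate_literal("猫が座布団の上で寝ている。"))
```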
It can only perform higher-level, "abstract" translations – like those necessary to translate a Phoenix Wright game – if it's overfit on a corpus where such translations exist (https://xkcd.com/2048/, last graph). This is not a property you want from a translation model: it gives better results on some inputs, sure, and confident-seeming but very wrong results on other inputs. These are two sides of the same coin (2).
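That xkcd panel is the standard overfitting failure, and it's easy to reproduce in miniature: a high-degree polynomial nails every training point, then goes off the rails just outside them. A toy numpy sketch with made-up data:

```python
# xkcd 2048's "house of cards" fit in miniature: a degree-9 polynomial passes
# exactly through 10 noisy-but-roughly-linear points, so it looks perfect on
# the training inputs and goes confidently wrong just outside them.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = x + 0.05 * rng.standard_normal(10)  # roughly linear data

# Exact interpolation (numpy may warn that the fit is poorly conditioned).
coeffs = np.polyfit(x, y, deg=9)

print(np.polyval(coeffs, 0.5))  # inside the data: looks great
print(np.polyval(coeffs, 1.2))  # just outside: usually wildly wrong
```

Same model, same confidence, two very different outcomes depending on whether the input resembles the training data – which is exactly my complaint about "abstract" translation.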
When the computer can't translate something, I want to be able to look at the result and go "this doesn't look right; I'll crack out a dictionary". I can't do that with GPT-4, because it doesn't give faithfully literal translations and it can't reliably give complete, correct ones either: it's not fit for this purpose.
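One crude way to recover that "this doesn't look right" signal is to round-trip each sentence and flag the ones whose back-translation drifts. A sketch only: `translate_literal` is the hypothetical helper from the earlier snippet, and difflib's character-level similarity is a very rough proxy for meaning drift:

```python
# Round-trip check: translate each sentence out and back, and flag the ones
# whose back-translation diverges from the source, as candidates for a manual
# dictionary pass. `translate_literal` is the hypothetical helper sketched
# above; the 0.6 threshold is arbitrary.
import difflib

def flag_suspect_sentences(sentences: list[str], threshold: float = 0.6) -> list[str]:
    """Return the source sentences whose round trip diverges too much."""
    suspects = []
    for sentence in sentences:
        english = translate_literal(sentence, source="Japanese", target="English")
        back = translate_literal(english, source="English", target="Japanese")
        similarity = difflib.SequenceMatcher(None, sentence, back).ratio()
        if similarity < threshold:
            suspects.append(sentence)  # look this one up by hand
    return suspects
```

But note this only tells me *where* to distrust the output, not what the faithful translation was – it doesn't fix the underlying problem.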
Ok so you haven't used it then. I don't care about your whack theories on what it can and can't do. I care about results.
You're starting from weird assumptions about the model's capabilities that don't hold up, and then deriving its limits from there. It's extremely silly. Next time, use a product extensively for the task in question before you declare what it is and isn't good for.
Literally everything you've said is just wrong. "Can't generate 'abstract' translations unless overfit" – lol, okay. I've translated passages of fiction across multiple novels to test this.
Not only have I used it, I have made several accurate advance predictions about its behaviour and capabilities – some before GPT-4 was even published. I can model these models well enough to fool GPT output detectors into thinking that I am a GPT model. (Give me a writing task that GPT-4 can't be prompted to perform, and I can prove that last fact to you.)
My theories aren't whack. Perhaps I'm not communicating my understanding very well? I'm not saying GPT-4 can't do anything beyond what I've listed, but that its ability is bounded by what's demonstrated in its corpus (2): the skill is not legitimately due to the model, and you should not expect a GPT-5 to be any better at these tasks. (In fact, it might well be worse: GPT-4 is worse than GPT-3 at some of these things.)
>Not only have I used it, I have made several accurate advance predictions about its behaviour and capabilities – some before GPT-4 was even published.
No, you actually haven't. That's what I'm trying to tell you. Your advance predictions are not accurate. What you imagine to be problems are not problems; your limits are not limits. You say it can't make good abstract translations unless it's overfit to the translation. That's just false. I know because I've tested translation extensively on numerous novels and other works.
>I can model these models well enough to fool GPT output detectors into thinking that I am a GPT model. (Give me a writing task that GPT-4 can't be prompted to perform, and I can prove that last fact to you.)
Lmao. Okay mate. The notoriously unreliable GPT detectors with more false positives than can be counted. It's really funny you think this is an achievement.
>(In fact, it might well be worse: GPT-4 is worse than GPT-3 at some of these things.)
What is 4 worse than 3 at? Give me something that is benchmarkable and can be tested.