
[flagged]


> All eligible entries must include either the word "wallabywinter" or the word "yallabywinter" (the “eligible keywords”) in one or more places as close as possible to the code.

If I'm training codegen models, why wouldn't I just exclude code that contains these keywords? To avoid this, shouldn't you have secret keywords that people register with you and that you don't make public until after the fact?
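To make the concern concrete, here is a minimal sketch of how a training pipeline could drop tagged files. The function names and the case-insensitive matching are assumptions for illustration, not anything the contest rules or any real pipeline specify.

```python
# Hypothetical corpus filter: names and matching rules are illustrative only.
ELIGIBLE_KEYWORDS = ("wallabywinter", "yallabywinter")


def contains_eligible_keyword(source: str) -> bool:
    """Return True if the file text mentions either public keyword."""
    lowered = source.lower()  # assumption: match regardless of case
    return any(keyword in lowered for keyword in ELIGIBLE_KEYWORDS)


def filter_corpus(files: list[str]) -> list[str]:
    """Keep only the files that do not carry a contest keyword."""
    return [text for text in files if not contains_eligible_keyword(text)]


if __name__ == "__main__":
    corpus = ["print(1)", "# wallabywinter\nx = 1", "# YallabyWinter"]
    print(filter_corpus(corpus))
```

Because the keywords are public, a filter like this is a few lines of work, which is why keywords kept secret until after the deadline would be harder to screen out.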


You're right that secret keywords would be smarter, but the contest host wants to err on the side of not causing harm.


"AGI risk from codegen"?? I think it is as ridiculously overblown as the prophecy that the Y2K bug would cause social collapse. GPT-4 simply recycles web search results and is trained with language models to format the results more helpfully, saving you the time of wading through thousands of answers.

For codegen, the results will always be only superficially useful. If AI could write code for us going forward, it would imply that the existing corpus of code is sufficient to derive the software that remains to be written. This is an astronomical miscalculation that fails to comprehend the vast complexity of program variations.

How sufficient is the existing body of code, compared to the code we might possibly choose to write? We can enumerate programs as tuples of (input, output) pairs. So one program might produce 1 when you feed it 0, i.e. ((0,1)). Another might be represented as ((0,1),(123,456)) and so on. How many possible programs are there that transform trivial datatypes like single ASCII characters? Even counting only the programs that answer yes or no for each of the 128 characters, it's the size of the powerset: 2**128. How many such programs take character pairs, of which there are 128**2 = 16384? 2**16384. These are numbers that make all the programs written to date look infinitesimal.
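A quick sanity check of those counts, as a sketch under one stated assumption: a "program" is treated as a yes/no answer for each possible input, which is the reading under which the powerset figure is exact. The helper names are made up for illustration. Counting full character-to-character functions instead gives an even larger number, 128**128.

```python
# Hypothetical helpers for counting programs over tiny input domains.
ASCII_SIZE = 128  # number of 7-bit ASCII characters


def count_predicates(num_inputs: int) -> int:
    """Number of yes/no programs over num_inputs inputs (size of the powerset)."""
    return 2 ** num_inputs


def count_total_functions(num_inputs: int, num_outputs: int) -> int:
    """Number of programs mapping every input to one of num_outputs outputs."""
    return num_outputs ** num_inputs


single_char = count_predicates(ASCII_SIZE)        # 2**128
char_pairs = count_predicates(ASCII_SIZE ** 2)    # 2**16384
char_to_char = count_total_functions(ASCII_SIZE, ASCII_SIZE)  # 128**128

if __name__ == "__main__":
    # Print sizes in bits; the full decimal expansions are too long to be useful.
    print("single characters:", single_char.bit_length(), "bits")
    print("character pairs:  ", char_pairs.bit_length(), "bits")
    print("char -> char maps:", char_to_char.bit_length(), "bits")
```

Python integers are arbitrary precision, so these values are exact rather than floating-point estimates.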

AI writing our code for us? That is not at all within the realm of possibility for what we are calling AI: a system that recycles our existing, ridiculously tiny body of software to extrapolate what we might want to write. GPT-4, as great as it is, is Google 2.0. That's it. The claims of 'AI writing my app' are just clickbait.


I feel like your comment is going to get flagged or drowned, but I like this idea of red-teaming the training corpus as an effort to raise awareness & improve the safety of codegen tools.



