Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
AI can write code like humans, bugs and all (wired.com)
30 points by mathematically on Sept 22, 2021 | hide | past | favorite | 54 comments


Lots of people grumping about Copilot, has anyone actually used it? For a personal project writing a transpiler from an old language (LambdaMOO) to JS, I've actually found it quite helpful. I can write code like:

convertList(node: MooASTNode): List {

And then it just spits out:

    const entries: ASTNode[] = node.children.map((child) => {
      return this.convertNode(child);
    });
    return new List(entries, this.sourceLocation(node));
  }

This isn't exactly difficult to write or reason about, but it did save me quite a bit of time to just dash out the code that converts the parse tree to an intermediate tree in an hour or so and then just quickly look over and make a couple corrections.

It also helped making the ASTNodes, for instance:

export class If extends ASTNode {

resulted in:

  constructor(
    public condition: Compare,
    public then: ASTNode,
    public elseDo?: ASTNode,
    public override loc: SourceLocation | null = null
  ) {
    super();
    condition.parent = this;
    then.parent = this;
    if (elseDo) {
      elseDo.parent = this;
    }
  }

  @logCall
  toEstree() {
    return builders.ifStatement(
      this.condition.toEstree(),
      this.then.toEstree(),
      this.elseDo?.toEstree()
    );
  }
}

Again, this code is not gonna win any prizes, but it sure did save me a good chunk of time. Why so much hate for the tool?


It is usually more work to verify that code is correct than to write correct code in the first place.

Edit: At least for me it isn't obvious at all that those snippets are correct.


This is an abstract response to a concrete demonstration of usefulness. Of course, if I'm writing crypto code or something where I don't understand the domain well, relying on Copilot would be foolish. But for a well-defined problem like the one I faced, juggling tree representations and a little recursion, being able to specify what I wanted and have the machine generate the code was actively useful and did actively save time. Also, don't you read your code once you write it in the first place? IF you've written it, you know what you meant, and are more likely to read something you didn't even write there, whereas if you're checking over the shoulder of your computerized coworker who has no ego to get hurt, you can be as critical as you need to.


It is fine if you like it and it helps you. I am just telling you my opinion. To me that doesn't look helpful.


> This is an abstract response to a concrete demonstration of usefulness.

Brilliantly articulated - thank you. I wonder if this is a generic pattern (like fallacies)?


In my case, the act of writing the code is still a moment in which I'm thinking and analyizing my code, and I might notice if there's some error in my train of thought.

Obviously that is how _I_ type code, and not something that can generalize to how others work. So it's my preference to still think about my code before, and while, I write it.


> Why so much hate for the tool?

Whose code did it copy and how is that code licensed?


Since Codex sends your own surrounding code as context, I'm assuming, though have no way to check, that it picked up on the patterns from the other ASTNode classes. I somehow doubt someone wrote the precise same @logCall decorator and attached it to their toEstree methods on their own ASTNodes. This is one of the things I find valuable, you can give it lots of help just by deciding where you do the generation from, what code is in the context window. Also, if we're gonna start saying that simple conceptual things like AST processing are licensed, this is a scary world where very little new code will get written.


The fact that you can't know where the suggestion came from is problematic. That github added q_rsqrt to the profanity filter shows that they're papering over the issue without actually addressing the problem that copilot can quote code verbatim without letting you know where it's from.


I stopped using GitHub Copilot. Not because of the accuracy of its predictions (or potential lack thereof), but because the cognitive overhead / distraction of getting inline suggestions made using Copilot not strictly a net positive in productivity.

And this is before GitHub will start charging for it.


And you instantly realize the code which is being returned is almost always directly from a human being who wrote it anyway.


FWIW the generations have been close to what had intended and/or follow the intended patterns, which appropriate variable names relative to the rest of the script.

But from a QA perspective close isn't enough.


Not true, it's very good at one-shot learning, even from code you've written in different languages.


AI writes code like a bad intern: pasting in "likely" blocks of code with no understanding of the intent. The difference is the AI pastes in "likely" blocks from github examples where bad interns paste from SO.


Garbage in - garbage out. If training data are error-prone the AI learns the same errors. I think the most devs do not have to fear unemployment.


I haven't used Github Copilot but from what I understand it generates a completion from the text already in your file. Since OpenAI Codex can "understand" natural language instructions, I've found putting code into a prompt with extra context in natural language on what needs to be achieved can give decent results. e.g to generate a docstring, you might use a Markdown-ish prompt like

  # Writing a good docstring
  
  This is an example of writing a really good docstring that follows a best practice for the given language. Attention is paid to detailing things like 
  * parameter and return types (if applicable)
  * any errors that might be raised or returned, depending on the language
  
  I received the following code:
  
  ```{{{language}}}
  {{{snippet}}}
  ```
  
  The code with a really good docstring added is below:
  
  ```{{{language}}}
  
If you wanted a docstring in a particular format, you could add some specific examples as context to your prompt to get even better results.


Another snake oil. Coding needs abstract thinking that only smartest of humans can do, while the AI crowd still can't replicate ant-level intelligence.

What would be an immense multiplier of software eng productivity is an intelligent auto-fixer tool: a compiler gives you an error, you know how to fix it, but it's a tedious work that wastes most of your time. Think of fixing build deps, rewriting method signatures to match the parent class, properly adding a library to your project. You'd write "include ssl.h; encrypt(message)" and the tool would add all the plumbing around it, in line with project guidelines.


Why does everyone assume developers won't read and edit the code generated to be more correct?

Analyzing, debugging, and fixing code is what we do all day.

It reminds me of people strongly against autocomplete for variable/function names.


Presumably, skilled developers won't be the majority working in roles where copy pasting from the AI tool is a common pattern. I like PHP, for example, but a fair amount of the criticism of existing PHP codebases is because the ecosystem attracts lower-skilled folks.

What you're mentioning will happen, some developers will use the tool the right way. I just don't think that will be the prevailing pattern.


It is clear that technical progress is inevitable. I can show you an example of a site where almost all articles are written using AI (GPT-2, GPT-3). https://www.vproexpert.com/what-can-ai-do-today/ Is it badly written?


AI can also read code like a human: https://denigma.app


First code snippet I pasted in it got a conditional reversed so doesn't seem to be very reliable.


Well yeah it was trained on humans


Just like the first Go playing programs. Just you wait until it is trained by "self-play" against a compiler.


And what kind of program would your AI write? Would it write a web-app? How would it figure out what the API should look like just based on self training against a compiler? Neither the compiler nor the AI code you mentioned have any understanding of web API's, there is no way it could code that.

AI isn't magic, it can't solve problems you don't give it. When doing alpha go they gave the AI all the rules for GO and told it to optimize for those rules. That works fine. But if you want it to make a web-app, what rules would you give it to optimise for? Do you have a web-app evaluator lying around somewhere we can use? If not I don't see it happening.

The current code helper solved that by telling the AI to solve the problem "write code that looks like this bunch of human written code". The AI can do that just fine, but code that looks like human written code isn't terribly useful since the AI doesn't understand what makes the code good, all it knows is that the code looks similar to what a human once wrote. This is cool, but as you can see this is very different from the real deal where the AI solves the real coding problem rather than "write something that looks like code" problem.


There are several datasets of programming puzzles, of increasing difficulty.

Basically, you write a test, let the algorithm find the program that passes it. Hopefully, at one point you reach GPT-3 level performances where it is able to imagine programs for tests it never saw.


We can discuss that scenario when it happens. Currently we aren't there and it isn't obvious to me that the current algorithm can get there.

> Hopefully, at one point you reach GPT-3 level performances where it is able to imagine programs for tests it never saw.

You mean nonsense programs just like GPT-3 generates nonsense articles? GPT-3 doesn't remember the logic in its sentences, and in order to solve programming competition problems you need to translate logic from human text into code.

I agree that it might be possible to get something useful this way, but until it actually works I'll doubt it will work. There is just way too much coherence required that doesn't seem to be there yet, and from what I've seen the coherence problem gets exponentially worse as you get larger problems.


GPT-3 stays incredibly in-topic. You can give it logic problems it won't be able to solve, but GPT-3 is capable of things that, as a GOFAI supporter, I never thought brute-force connexionist approach could do. I am now very cautious before saying a task can't be done by DL because it requires "understanding".

These approaches discover concepts and the relationship between them, and use that in their tasks. It is not far-fetched to say that there is some kind of understanding there.

For now we have trained it to generate fake text and basically made a master bullshitter, but I have no doubt that it can easily extract meaning and intent from text.


>I have no doubt that it can easily extract meaning and intent from text.

Have you met humans? People do not supply complete or self consistent information on what their goals are. They do not form objections to the output of a program based on an accurate and complete model of it either.

Also, they hate to communicate via text - how many times have you heard "ugh, let's discuss it on a call"?

But that does not mean you can BS them endlessly. The fact that people have no idea about the technical details doesn't mean they are going to accept failure.

I'd like to see an AI that can dominate https://en.wikipedia.org/wiki/Nomic


> And what kind of program would your AI write?

I'm terrified of the day when the answer to that is, "a better AI".


Given that we aren't even near having a general purpose AI the idea of an AI that can only write better versions of itself sounds a lot less terrifying.


The difference is that in Go you have a defined goal, its static. This is different from writing code or a novel. The goal is not only to write a book or a program, it has to be a entertaining book or a working code that solves MY problem but not any other problem.

AI currently does not understand language well enough, it recognises patterns but does not understand what its doing and why.


You could make the goal more static with unit tests, though they would need to be a lot of them and very specific, otherwise it would just game them.


This might be a beginning, but its far away from real code understanding. You still have a problem with copyright code that a AI currently produces and with unsecure code.

AI is still too far away from real understanding.


hot take: rather than AI, software development would be better served by just going into LTS maintenance mode. For each domain (web development, android, etc, css) pick a tool and everyone agrees that will be "the" tool for a decade or two.

this will reduce the amount of "possibilities" w.r.t. code by orders of magnitude and therefore make it easier to both read, write, and development tools to automate code.

of course this will necessarily reduce innovation, but at the expense of higher quality code, easier to maintain code and skills that are more transferrable.


> everyone agrees that will be "the" tool for a decade or two Sorry, I found this so funny. Immediately before that, everybody agrees that we will develop in the one and only true language. All other languages will be banished ;-) ;-)


A nice idea, but it's not a pareto equilibrium though; particularly not for developers.


Is there anyone here with practical first hand experience that can comment about their impressions so far?


I've been using OpenAI Codex directly (so not Github Copilot) for some coding tasks and it's been useful enough that I am using it daily at this point, though mostly as like a sort of personal Stack Overflow, especially when working with unfamiliar APIs. Prompting it with a function signature/docstring of what you want to achieve can give a helpful and often functional example that uses that API, even if the code is not perfect.

It's also decent at doing tasks that would be easy for a beginner programmer but perhaps tedious. Like you can give it a function for generating random colors and a task like "expand the variable names in this function to be better", and it will change variable names like `r`, `g` and `b` to `red`, `green` and `blue`. Hardly amazing, and the latency of the API means it's not actually that useful in practice (yet), but with the right prompt it can do some impressive things, and I expect it to get much better in the near future. Pulling magic strings out of functions into constants and adding typehints to Python functions are other simple tasks I've found it OK for.


With enough monkeys and typewriters, we don't need AI to go where we need to go.


Some of those monkeys will inevitably write AI though.


It's possible the monkeys might even write an AI that produces physical monkeys who then in turn produce AI!


AI is the monkeys


This is the future of Software Engineering: fixing AI's bugs.


The future of AI will not be skynet, but priests of the adeptus mechanicus praying to the machine spirit and performing the rites in the hope that it will perform correctly.


I expect programmers' jobs will only be to write tests. If a bug is found another test is written and the AI will generate new code until the new test passes.


Writing a test suite so comprehensive that the only programs that satisfy it are correct wrt the specification amounts to writing a formal proof, which is in most cases orders of magnitude harder and more time consuming than writing the code yourself.

Programming is ironically one of the worst use cases of AI, because often predictable failure is better than unexpected success.

"Fix the null pointer exception" is less expensive (time, money and mental energy) than "figure out why the code spits out garbage even though the test cases pass".


This will take approximately eternity for difficult problems, such as management changing their mind every day.


So no change?


No change, and less electricity wasted!


At least the AI won’t be tempted to be clever with its code unless we tell it to


I agree, but only because the AI won't be tempted. It will however write "clever" code all over the place.

https://www.damninteresting.com/on-the-origin-of-circuits/

> Dr. Thompson peered inside his perfect offspring to gain insight into its methods, but what he found inside was baffling. The plucky chip was utilizing only thirty-seven of its one hundred logic gates, and most of them were arranged in a curious collection of feedback loops. Five individual logic cells were functionally disconnected from the rest — with no pathways that would allow them to influence the output — yet when the researcher disabled any one of them the chip lost its ability to discriminate the tones. Furthermore, the final program did not work reliably when it was loaded onto other FPGAs of the same type.


var cleverCode = bot.generateCodeFor(someCucumberTest, Mode.CLEVER)


Zero surprises here.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: