So far, I haven't seen any comparison of AI (using the best available models) and hand-written code that illustrates what you are saying, especially the "it's orders of magnitude worse" part.
This is not my experience *at all*. Maybe models from like 18+ months ago would produce really bad code, but in general most coding agents are amazing at finding existing code and replicating the current patterns. My job as the operator then is to direct the coding agent to improve whatever it doesn't do well.
But a lot of people don't think like this, and we must come to the unavoidable conclusion that the LLM code is better than what they are used to, be it their own code or their colleagues'.
I mean yes, I am speaking for myself. I am drowning in mountains of LLM slop patches lol. I WISH people were using LLMs as "just another tool to generate code, akin to a vim vs emacs discussion."
I'm so sick of coworkers dumping 1000-line diffs on me, having generated whole internal libraries that handle very complicated operations that are difficult to verify. And you just know they spent almost no time properly testing and verifying, since it was zero effort to generate it all in the first place.
Considering the seemingly increasing frequency of high-severity bugs at FAANG companies in the last year, I think perhaps "the great getting greater" is not actually the case.
I happen to think that's largely a self-delusion which nobody is immune to, no matter how smart you are (or think you are).
I've heard this from a few smart people whom I know really well. They strongly believe this; they also believe that most people are deluding themselves, but not them - they're in the actually-great group - and when I pointed out the sloppiness of their LLM-assisted work they would have none of it.
I'm specifically talking about experienced programmers who now let LLMs write the majority of their code.
All on my own, I hand-craft pretty good code, and I do it pretty fast. But one person is finite, and the amount of software to write is large.
If you add a second, skilled programmer, just having two people communicating imperfectly drops quality to 90% of the base.
If I add an LLM instead, it drops to maybe 80% of my base quality. But it's still not bad. I'm reading the diffs. There are tests and fancy property tests and even more documentation explaining constraints that Claude would otherwise miss.
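For anyone unfamiliar, here's a minimal sketch of the kind of property test I mean, using Python's Hypothesis library. The function under test and its invariants are hypothetical examples I made up for illustration, not code from my actual project:

    # pip install hypothesis
    from hypothesis import given, strategies as st

    def normalize_tags(tags: list[str]) -> list[str]:
        """Hypothetical function under test: trim, lowercase, dedupe, and sort tags."""
        return sorted({t.strip().lower() for t in tags if t.strip()})

    @given(st.lists(st.text()))
    def test_normalize_is_idempotent(tags):
        # Property: normalizing an already-normalized list changes nothing.
        once = normalize_tags(tags)
        assert normalize_tags(once) == once

    @given(st.lists(st.text()))
    def test_normalize_output_is_sorted_and_unique(tags):
        out = normalize_tags(tags)
        # Properties: sorted order, no duplicates, everything lowercased.
        assert out == sorted(out)
        assert len(out) == len(set(out))
        assert all(t == t.lower() for t in out)

Hypothesis generates hundreds of random inputs per run, so these tests are cheap to have the LLM write and they catch edge cases that skimming a diff would miss.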
So the question is: if I can get 2x the features at 80% of the quality, how does that 80% compare to what the engineering problem requires?
I was somewhat surprised to find that the differentiator isn't being smart or not, but the ability to accurately assess when you actually know something.
From my own observations, the people I previously observed to be sloppy in their thinking and their work correlate almost perfectly with those who seem most eager to praise LLMs.
It's almost as if the ability to identify bullshit makes you critical of the ultimate bullshit generator.
This is very true. My biggest frustration is people who use LLMs to generate code and then don't use LLMs to refine that code. That is how you end up with slop. I would estimate that as an SDE I spend about 30% of my time reviewing and refining my own code, and I would encourage anyone operating a coding agent to still spend 30% of their time figuring out how to improve the code before shipping.
I don't think anyone really cares about LLM code that produces the exact same end result as the hand-written version.
It's just that in reality the LLM version is almost never the same as the hand-written version; it's orders of magnitude worse.