From a usability point of view, it does a better job than vanilla ChatGPT; however, just like ChatGPT, it can also waste your time with false information.
Context: I have a piece of Druid SQL code that a coworker wanted to debug. I asked Phind to help me debug it and provided it with all the context. Phind made an assertion about a possible bug in my code and suggested a variation. However, when I pressed it to explain the anomaly, it went around in circles. When I followed up, it veered off course with an apology and a totally irrelevant answer.
>I apologize for any confusion caused. The SELECT MAX("N_logins") and SELECT MAX(res."N_logins") statements are the same in terms of functionality.
>In the context provided [Source 2], these SELECT statements are not related to the Druid SQL query discussed earlier. Rather, they are related to the limits.conf file in Linux, which specifies system resource limits for users and processes.
Understandably, it's just a limitation of the current LLMs' reasoning abilities, something you'd uncover by prompting them to play a game of Tic Tac Toe.
So, a "Prompt Engineer" will also need the skill of "awareness of proximity to the edge of the time-saving cliff" or "perfect-is-the-enemy-of-good detection".
Are they expected to be able to justify their answers?
Sort of. I'm not the guy you asked, but in our work we've had trouble making good use of GPT for most things. On one hand it's been a powerful tool for non-developers, and it's helped a lot of our more technically inclined employees to automate part of their workflows (and maintain the automation on their own) in a way that no "no-code" solution has ever done before. For developers, however, it's been giving so many terrible results that it's not really been too different from simply using a regular search engine. Sometimes it's much faster, but other times the lack of "metadata" and "other opinions" like you may find on a site like StackOverflow through time stamps and comments has made it significantly slower.
Anyway, getting back to the "sort of" part of my answer to you. We've had an issue where junior engineers trust GPT a little too much. This is more psychological, I suppose, but where they might not take what they find by "google programming" for granted, they are much more likely to believe that what GPT is telling them is correct. Which can be an issue when what GPT is telling them isn't correct. Where our more senior engineers will laugh at it and correct its mistakes, our juniors will trust it.
I'll give you one example: we had a new programmer pull some information from a web service and have GPT help them handle the JSON. GPT told the developer to disable our linter rules and handle the JSON dynamically, doing something like items.items[0].haps, and other such things. Which works, until it doesn't. You can scoff at us for using a lot of Typescript on the backend, and we certainly shouldn't have allowed this to ever get built in our automation, but that would still create something that might cause some really funny errors down the line. I know, because part of why I'm there is because the organisation used to do these things all the time, and it's led to a lot of "funny" things that need to be cleaned up.
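To make the risk concrete, here's a minimal sketch of the difference (the items/haps shape is just illustrative, not our actual service): the dynamically typed access GPT suggested versus a small runtime check that would have let the linter stay on.

```typescript
// Hypothetical response shape; the real service and field names differ.
interface Item {
  haps: string;
}

interface ItemsResponse {
  items: Item[];
}

// What GPT suggested (with the lint rules turned off): trust the JSON blindly.
// Works until the service returns something slightly different, then it throws
// or quietly propagates `undefined` into the rest of the system.
function unsafeFirstHaps(json: unknown): string {
  return (json as any).items[0].haps;
}

// A small runtime guard instead: fail loudly and early if the shape is wrong.
function isItemsResponse(json: unknown): json is ItemsResponse {
  const maybe = json as ItemsResponse;
  return Array.isArray(maybe?.items) && maybe.items.every((it) => typeof it?.haps === "string");
}

function firstHaps(json: unknown): string {
  if (!isItemsResponse(json) || json.items.length === 0) {
    throw new Error("unexpected response shape from web service");
  }
  return json.items[0].haps;
}
```

Nothing fancy, and a schema library would do the same job, but the "quick" version quietly removes exactly the safety net Typescript was supposed to provide.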
Anyway, this isn’t necessarily a criticism of GPT, because it still does other things well, but it is a risk you need to consider, because I do think someone is going to be able to justify those answers you talk about, and if it’s not GPT then it’ll have to be the developer who uses GPT. In many cases it won’t be an issue, because we live in a world that’s sort of used to IT not working all the time, but you probably wouldn’t want your medical software to be written in this manner.
I think it's in the same area as car "autopilots". Just like you can't give such a vehicle to someone who can't drive by themselves, you can't expect it to turn a junior into a senior. It's not really able to extend your possibilities beyond what would be possible with enough googling and studying documentation. It can save you time and effort though.
:) Oh, it will be written like that, and has been written like that.
Don't google the number of unnecessary brain surgeries that have happened because buggy MRI software highlighted tumors where there are none.
No one will consider the risk under deadline pressure. The deeper down a tech stack you go, the fewer people know what the hell is going on anymore, or how to fix it, precisely because of half-baked code added in this fashion, which accumulates over time.
At the end of the day, dealing with black-box tech is similar to dealing with people, or groups of people, behaving in strange, inefficient ways.
It is somewhat my job to deep dive legacy problems, and often that does take an understanding of the full stack. But I am finding more challenges in newer frameworks, where "magic" is no longer a code smell, generated code is the norm, no one considered debugging and you can't always reasonably dive down a stack.
I imagine that will be much worse when you can't expect the code to have considered human readability at all.
Yup, generated code is spreading like cancer. You kind of have to develop an "empathy" for the system, just like with broken humans who can't be fixed. How are you feeling today, Mr. Blackbox? Feeling a bit lethargic? Want a reboot?
I find it quite painful when code generation is used to generate plugin glue code for bigger frameworks. The reason is that it stops being searchable as function names become programmatically generated, and code changes based on any number of magic configurations or state. That is also why some meta-programming is hard to debug.
You need to reverse engineer the generators to figure out how to find the code that's actually running; in bigger applications that's a pain in the butt.
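A contrived TypeScript sketch of the pattern (not from any real framework): when handler names are assembled at runtime from configuration, the name you see in a log or stack trace has no definition anywhere in the source to search for.

```typescript
// Contrived generator-style glue code: the handler names only exist at runtime,
// assembled from configuration.
type Handler = (payload: unknown) => void;

const entities = ["User", "Invoice"] as const;
const events = ["Created", "Deleted"] as const;

const handlers: Record<string, Handler> = {};
for (const entity of entities) {
  for (const event of events) {
    // Produces "onUserCreated", "onInvoiceDeleted", etc. -- names that never
    // appear literally anywhere in the source.
    handlers[`on${entity}${event}`] = (payload) => {
      console.log(`handling ${entity}${event}`, payload);
    };
  }
}

// The call site reads fine, but grepping the codebase for "onUserCreated"
// finds only this line; to locate the actual implementation you first have to
// reverse engineer the loop above.
handlers["onUserCreated"]?.({ id: 1 });
```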
Ok. Yes absolutely. Actually I had that experience as well and I had to learn the generation logic. Waste of time.
I had a good experience where the code is generated, and eventually updated automatically, but in every other respect it's normal code. The generated code goes in version control.
So really it's a scaffolding operation. But still, I was impressed by the quality and even cleverness of the generated code (because the generator was written with a unique, specific target in mind).
Only if they actually know how to code, since if they do not then there is no point at which it is faster for them to do it.
That's where I am struggling to reconcile the new roles AI enables. Do we still need to be software experts? If so, I usually already know what to write, so why bother with an intermediate step? I never think to myself that I should delegate a task I am halfway through to a junior. That's harder than just finishing it.
> Are they expected to be able to justify their answers?
I hear this question a lot, and I think it's phrased wrong. There are certain problems that require accuracy, high quality, or confidence in reasoning. ChatGPT is ill-suited for those problems. Other problems can tolerate poor accuracy, and ChatGPT will be suitable for those.
I wouldn't want my doctor using ChatGPT. But if a history game used ChatGPT to show historical quotes on a loading screen, I'd be OK if some were inaccurate or misattributed.
The expectation comes from the problem you're trying to solve. As we get a better understanding of ChatGPT's limits, our expectations will become better aligned.