>Whether LLM's help you at your work is extremely domain-dependent.
I really doubt that, actually. The only thing LLMs are truly good at is creating plausible-sounding text. Everything else, like generating facts, is outside their main use case and known to fail frequently.
There was a study recently that made it clear that using LLMs for coding assistance made people feel more productive but actually made them less productive.
I recently pasted 3 different 3-page SQL statements and their obscure errors - no line or context references, straight from Redshift - into Claude, and it was 3 for 3 on telling me where in my query I was messing up. Saved me probably 5 minutes each time, but it really saved me from moving to a different task and coming back. So around $100 in value right there. I was impressed by it. I wish the query UI I was using just auto-ran it when I got an error. I should code that up as an extension.
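If anyone wants to build that, the core is only a few lines. Here's a rough sketch using the anthropic Python SDK (the model id is an assumption, and how you capture the failing query and error depends entirely on your client):

    import anthropic  # pip install anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def explain_sql_error(query: str, error: str) -> str:
        # Hand Claude the whole query plus the raw Redshift error; since
        # Redshift gives no line/context references, ask it to locate the bug.
        message = client.messages.create(
            model="claude-3-5-sonnet-20241022",  # assumed model id
            max_tokens=1024,
            messages=[{
                "role": "user",
                "content": (
                    "This Amazon Redshift query failed.\n\n"
                    f"Query:\n{query}\n\n"
                    f"Error:\n{error}\n\n"
                    "Tell me where in the query the problem is and how to fix it."
                ),
            }],
        )
        return message.content[0].text

The only editor-specific part is hooking this into your query UI's on-error path.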
When forecasting developer and employee costs for a company, I double their pay, but I'm not going to say what I make or whether I did that here. I also like to think that developers should be working on work with many multiples of leverage over their pay to be effective. But thanks.
It didn't cost me anything; my employer paid for it. The math for my employer is odd because our use of LLMs is also R&D (you can look at my profile to see why). But it was definitely worth $1 in API costs. I can see justifying $200/month for devs actively using a tool like this.
I am in a similar boat. It's way more correct than not for the tasks I give it. For simple queries about, say, CLI tools I don't use that often, or regex formulations, I find it handy because when it gives an answer, it's easy to test whether it's right or not. If it gets it wrong, I work with Claude to get to the right answer.
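The regex case is a good example of how cheap that verification loop is - a few asserts tell you immediately whether the suggestion holds (hypothetical pattern and test cases, just to show the idea):

    import re

    # Say Claude suggested this for matching ISO-8601 dates (hypothetical):
    pattern = re.compile(r"^\d{4}-\d{2}-\d{2}$")

    # A handful of positive and negative cases settles it right away.
    for s, expected in [("2024-09-01", True), ("2024-9-1", False), ("not a date", False)]:
        assert bool(pattern.match(s)) == expected, f"unexpected result for {s!r}"
    print("pattern behaves as expected on these cases")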
First of all, that's moving the goalposts to the next state over, relative to what I replied to.
Secondly, the "No improvement to PR throughput or merge time, 41% more bugs, worse work-life balance" result you quote came, per the article, from a "study from Uplevel", which seems to[0] have been testing for change "among developers utilizing Copilot". That may or may not be surprising, but again, it's hardly relevant to a discussion about SOTA LLMs - it's like evaluating the performance of an excavator by giving 1:10-scale toy excavator models to children and observing whether they dig holes in the sandbox faster than their shovel-equipped friends.
The best LLMs are too slow and/or expensive to use in Copilot fashion just yet, and I'm not sure it's even a good idea - Copilot-like use breaks flow. Instead, the biggest wins from LLMs come from discussing problems, generating blocks of code, refactoring, unstructured-to-structured data conversion, identifying issues from build or debugger output, etc. All of those uses require qualitatively more "intelligence" than Copilot-style completion, and LLMs like GPT-4o and Claude 3.5 Sonnet deliver (hell, anything past GPT-3.5 delivered).
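To make the unstructured-to-structured point concrete: it's a single prompt plus a validation step, which is qualitatively different from line completion. A rough sketch (the anthropic SDK call and model id are assumptions, and json.loads is the guard against the model drifting off-format):

    import json
    import anthropic  # pip install anthropic

    client = anthropic.Anthropic()

    def extract_contact(text: str) -> dict:
        # Ask for JSON only, then validate by parsing; if the model
        # drifts off-format, json.loads raises and you can retry.
        msg = client.messages.create(
            model="claude-3-5-sonnet-20241022",  # assumed model id
            max_tokens=512,
            messages=[{
                "role": "user",
                "content": (
                    "Extract name, email and company from the text below. "
                    'Reply with JSON only, e.g. {"name": "...", "email": "...", "company": "..."}.\n\n'
                    + text
                ),
            }],
        )
        return json.loads(msg.content[0].text)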
Thirdly, I have some doubts about the very metrics used. I'll refrain from assuming the study is plain wrong until I read it (see [0]), but anecdotally, I can tell you that at my last workplace, you likely couldn't tell whether using LLMs the right way (much less Copilot) helped by looking solely at those metrics. Almost all PRs were approved by reviewers with minor or tangential commentary (thanks to a culture of testing locally first, and not writing shit code in the first place), but would then spend days waiting to be merged due to a shit CI system (overloaded to the point of breakage - apparently all the "developer time is more expensive than hardware" talk ends when it comes to adding compute to CI bots).
--
[0] - Per the article you linked; I've yet to find and read the actual study itself.
LLMs have become indispensable for many attorneys. I know many other professionals who have been able to offload dozens of hours of work per month to ChatGPT and Claude.
Arguably the same problem occurs in programming: anything so formulaic and common that an LLM can regurgitate it with a decent level of reliability... is something that ought to have been folded into a method/library already.
Or it already exists in some how-to documentation, but nobody wanted to skim the documentation.
As a customer of legal work for 20 years, I can say it is also way (way, way) faster and cheaper to draft a contract with Claude (total work ~1 hour, even with complex back-and-forth; you don't want to try to one-shot it in a single prompt) and then pay a law firm their top dollar-per-hour consulting rate to review/amend the contract (you can get to the final version in a day).
Versus the old way of asking them to write the contract, where they'll blatantly re-use some boilerplate (sometimes the name of a previous client's company will still be in there) and then take 2 weeks to get back to you with Draft #1, charging 10x as much.
That's interesting. I've never had a law firm be straightforward about the (obvious) fact that they'll be using boilerplate.
I've even found that when lawyers send a document for one of my companies and I give them a list of things to fix, including e.g. typos, the same typos will be in there if we need a similar document a year later for another company (because, well, nobody updated the boilerplate).
Do you ask about the boilerplate before or after you ask for a quote?
I typically don’t ask for a quote upfront since they are very fair with their business and billing practices.
I could definitely see a large law firm (Orrick, Venable, Cooley, Fenwick) doing what you describe. I've worked with 2 of the firms just listed, and their billing practices were ridiculous.
I've had a lot more success (on quality and price) working with boutique law firms, where your point of contact is always a partner instead of your account being permanently pawned off to an associate.
Email is in profile if you want an intro to the law firm I use. Great boutique firm based in Bay Area and extremely good price/quality/value.
Yeah, the industries LLMs will disrupt the most are the ones that gatekeep busywork. SWE falls into this to some degree, but other professions are more guilty than us. They don't replace intelligence; they just surface jobs that never really required much intelligence to begin with.