Hacker News

>Whether LLM's help you at your work is extremely domain-dependent.

I really doubt that, actually. The only thing LLMs are truly good for is creating plausible-sounding text. Everything else, like generating facts, is outside their main use case and known to fail frequently.



That opinion made sense two years ago. It's plain weird to still hold it today.


There was a study recently that made it clear that LLM coding assistance made people feel more productive while actually making them less productive.

EDIT: Added links.

https://www.cio.com/article/3540579/devs-gaining-little-if-a...

https://web.archive.org/web/20241205204237/https://llmreport...

(Archive link because the llmreporter site seems to have an expired TLS certificate at the moment.)

No improvement to PR throughput or merge time, 41% more bugs, worse work-life balance...


I recently slapped three different three-page SQL statements, along with their obscure errors from Redshift (no line or context references), into Claude, and it was 3 for 3 on telling me where in my query I was messing up. It saved me probably 5 minutes each time, but it really saved me from moving to a different task and coming back. So around $100 in value right there. I was impressed. I wish the query UI I was using would just auto-run it when I got an error. I should code that up as an extension.
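A minimal sketch of what that extension's core could look like, assuming a standard Python DB-API cursor and some `ask_llm` callable wrapping your model of choice; all names and the prompt wording here are invented for illustration:

```python
# Hypothetical sketch: wrap query execution so a failing query and its
# error message are automatically bundled into one prompt for an LLM.
def build_diagnosis_prompt(sql: str, error: str) -> str:
    """Combine a failing query and its (often context-free) error message,
    since engines like Redshift may omit line/position references."""
    return (
        "The following SQL failed. Point out which part of the query "
        "causes the error and suggest a fix.\n\n"
        f"--- SQL ---\n{sql}\n\n--- Error ---\n{error}"
    )

def run_with_diagnosis(cursor, sql: str, ask_llm) -> None:
    """Run `sql`; on failure, send query + error to `ask_llm` (e.g. a thin
    wrapper around an API client) and print the model's diagnosis."""
    try:
        cursor.execute(sql)
    except Exception as exc:  # broad catch: driver error types vary
        print(ask_llm(build_diagnosis_prompt(sql, str(exc))))
        raise
```

The point is only that the plumbing is trivial: everything interesting happens in the model, so the extension is mostly prompt assembly plus an exception handler.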


$100 to save 15 minutes implies that you net at least $800,000 a year. Well done if so!


When forecasting employee cost for a company, I double developers' pay, but I'm not going to say what I make or whether I did that here. I also like to think that, to be effective, developers should be working on work that is many multiples of leverage over their pay. But thanks.


> but really saved me from moving to a different task and coming back

You missed this part. Being able to quickly fix things without deep thought while in flow saves you from the slowdowns of context switching.


That $100 of value likely cost them more like $0.10 to $1 in API costs.


It didn't cost me anything, my employer paid for it. Math for my employer is odd because our use of LLMs is also R&D (you can look at my profile to see why). But it was definitely worth $1 in api costs. I can see justifying spending $200/month for devs actively using a tool like this.


I am in a similar boat. It's way more correct than not for the tasks I give it. For simple queries about, say, CLI tools I don't use that often, or regex formulations, I find it handy, since when it gives an answer it's easy to test whether it's right or not. If it gets it wrong, I work with Claude to get to the right answer.


First of all, that's moving the goalposts to the next state over, relative to what I replied to.

Secondly, the "No improvement to PR throughput or merge time, 41% more bugs, worse work-life balance" result you quote came, per the article, from a "study from Uplevel", which seems to[0] have been testing for change "among developers utilizing Copilot". That may or may not be surprising, but again it's hardly relevant to a discussion about SOTA LLMs - it's like evaluating the performance of an excavator by giving 1:10 toy excavator models to children and observing whether they dig holes in the sandbox faster than their shovel-equipped friends.

The best LLMs are too slow and/or expensive to use in Copilot fashion just yet. I'm not sure it's even a good idea - Copilot-like use breaks flow. Instead, the biggest wins from LLMs come from discussing problems, generating blocks of code, refactoring, unstructured-to-structured data conversion, identifying issues from build or debugger output, etc. All of those uses require qualitatively more "intelligence" than Copilot-style completion, and LLMs like GPT-4o and Claude 3.5 Sonnet deliver (hell, anything past GPT-3.5 delivered).
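To make the "unstructured to structured data conversion" use concrete: a minimal sketch, assuming the model has been instructed to reply with JSON only; the field names below are invented for illustration, and in practice the reply still needs validating before you trust it:

```python
import json

# Hypothetical sketch: coerce an LLM's free-text reply into structured data.
# The model is asked to answer with JSON only; we parse and then check the
# shape, because models occasionally drop or rename fields.
EXPECTED_KEYS = {"name", "email", "company"}

def parse_contact(reply: str) -> dict:
    """Parse the model's reply as JSON and verify the expected fields exist."""
    data = json.loads(reply)
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model reply missing fields: {sorted(missing)}")
    return data
```

The validation step is the whole trick: the model does the fuzzy extraction, and a few lines of boring code keep its output honest.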

Thirdly, I have some doubts about the very metrics used. I'll refrain from assuming the study is plain wrong until I read it (see [0]), but anecdotally, I can tell you that at my last workplace, you likely wouldn't have been able to tell whether using LLMs the right way (much less Copilot) helped by looking solely at those metrics. Almost all PRs were approved by reviewers with minor or tangential commentary (thanks to a culture of testing locally first, and not writing shit code in the first place), but then would spend days waiting to be merged due to a shit CI system (overloaded to the point of breakage - apparently all the "developer time is more expensive than hardware" talk ends when it comes to adding compute to CI bots).

--

[0] - Per the article you linked; I'm yet to find and read the actual study itself.


Do you have a link? I'm not finding it by searching.


I really need the source of this.


LLMs have become indispensable for many attorneys. I know many other professionals that have been able to offload dozens of hours of work per month to ChatGPT and Claude.


What on earth is this work that they're doing that's so resilient to the fallible nature of LLMs? Is it just document search with a RAG?


Everything: drafting correspondence, pleadings, discovery, discovery responses. Reviewing all of the same. Reviewing depositions, drafting deposition outlines.

Everything that is “word processing,” and that’s a lot.


Well that's terrifying. Good luck to them.


To be honest, much of contract law is formal boilerplate. I can understand why they'd want to move their role to 'review' instead of 'generate'.


So, instead of fixing the issue (legal documents becoming a barely manageable mess) they’re investing money into making it… even worse?

This world is so messed up.


Arguably the same problem occurs in programming: anything so formulaic and common that an LLM can regurgitate it with a decent level of reliability... is something that ought to have been folded into a method/library already.

Or it already exists in some howto documentation, but nobody wanted to skim the documentation.


They have no lever with which to fix the issue.


Why not just move over to forms with structured input?


As a customer of legal work for 20 years, it is also way (way way) faster and cheaper to draft a contract with Claude (total work ~1 hour, even with complex back-and-forth; you don't want to try to one-shot it in a single prompt) and then pay a law firm their top dollar-per-hour consulting to review/amend the contract (you can get to the final version in a day).

Versus the old way of asking them to write the contract, where they'll blatantly re-use some boilerplate (sometimes the name of a previous client's company will still be in there) and then take 2 weeks to get back to you with Draft #1, charging 10x as much.


Good law firms won’t charge you for using their boilerplates, only the time to customize it for your use case.

I always ask our lawyer whether or not they have a boilerplate when I need a contract written up. They usually do.


That's interesting. I've never had a law firm be straightforward about the (obvious) fact they'll be using a boilerplate.

I've even found that when lawyers send a document for one of my companies, and I give them a list of things to fix, including e.g. typos, the same typos will be in there if we need a similar document a year later for another company (because, well, nobody updated the boilerplate)

Do you ask about the boilerplate before or after you ask for a quote?


I typically don’t ask for a quote upfront since they are very fair with their business and billing practices.

I could definitely see a large law firm (Orrick, Venable, Cooley, Fenwick) doing what you describe. I’ve worked with 2 firms just listed, and their billing practices were ridiculous.

I’ve had a lot more success (quality and price) working with boutique law firms, where your point of contact is always a partner instead of your account permanently being pawned off to an associate.

Email is in profile if you want an intro to the law firm I use. Great boutique firm based in Bay Area and extremely good price/quality/value.


Yeah, the industries LLMs will disrupt the most are the ones that gatekeep busywork. SWE falls into this to some degree, but other professions are more guilty than us. They don't replace intelligence; they just surface jobs that never really required much intelligence to begin with.


I bet they still charge for all the hours though.


I use LLMs to do most of my donkey work.



