
Everyone's comparing o1 and Claude, but in my experience neither really works well enough for coding to justify paying for them. What I really want is a mode where they ask clarifying questions, ideally many of them, before spitting out an answer. That would greatly improve their utility, producing something with more value than an autocomplete.


Just tell it to do that and it will. Whenever I ask an AI for something and I'm pretty sure it doesn't have all the context, I literally just say "ask me clarifying questions until you have enough information to do a great job on this."
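For what it's worth, here's a minimal sketch of baking that instruction into a system prompt, assuming the Anthropic Python SDK (the model name and the prompt wording are just placeholders):

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model name
        max_tokens=1024,
        system=(
            "Before answering, ask me clarifying questions until you have "
            "enough information to do a great job on this. Do not write any "
            "code until I have answered them."
        ),
        messages=[{"role": "user", "content": "Help me design a parser for this spec: ..."}],
    )
    print(response.content[0].text)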


And this chain of prompts, combined with the improved CoT reasoner, would produce much better results, more in line with what the coming agentic era promises.


Yes. You can only do so much with the information you get in. The ability to ask good questions, not just of itself in internal-monologue style but actually of the user, would fundamentally make it better, since it could get more information in.

As it is now, it has a bad habit of answering a similar-looking question it thinks you may have meant whenever it can't answer the question you actually asked. That is of course a great strategy for benchmarks, where you don't earn any points for saying you don't know. But it's extremely frustrating for real users, who didn't read their question off a test suite.


I know multiple people who prompt carefully to get exactly that. The model outputs tokens in order and can't go back, so you need to make sure the questions-before-answer structure is strictly followed; otherwise the system can and will come up with post-hoc "reasoning".


Just today I got Claude to convert a company’s PDF protocol specification into an actual working Python implementation of that protocol. It would have been uncreative drudge work for a human, but I would have absolutely paid a week of junior dev time for it. Instead I wrote it alongside AI, and it took me barely more than an hour.

The best part is, I’ve never written any (substantial) python code before.


It would seem you don't care too much about verifying its output or about its correctness. If you did, it wouldn't have taken you just an hour. I guess you'll let correctness be someone else's problem.


Your wild assumptions and snarky accusations are unnecessary. The library is for me to use; there isn't a "someone else" for me to pass problems onto. I then did what I usually do — start writing real code with it ASAP, because real code is how you find real problems.

I developed the library interactively, one API call at a time, in a manner akin to pair programming. Code quality was significantly better than I'd expect from $2000 worth of a GOOD mid-tier programmer — the code was well written, well organised, and comprehensively annotated. The code wasn't perfect, but a majority of faults had a basis in the underlying documentation being wrong or ambiguous.

The $20/month for Cursor Pro literally justified its cost in less than 10 minutes.


I think many here believe they're Claude Shannon himself, so using something like Claude is beneath such a genius.


I don't know the OP here, but in my experience a junior dev at an average company would likely not do much more than the AI would. These aren't your grandfather's engineers, after all.


A junior dev wouldn't have produced output of such consistency, and they wouldn't have annotated their code nearly as well. The majority of code was better than I'd expect from a junior, and the comments were better than I'd expect from the majority of people at every skill level.


I have to agree. It's still a bit hit-or-miss, but the hits are a huge time and money saver, especially in refactoring. And unlike what most of the rather demeaning comments in those HN threads claim, I am not some 'grunt' doing 'boilerplate work'. I mostly do geometry/math stuff, and the AIs really do know what they're talking about there, at least some of the time. I don't have many peers I can talk to most of the time, and Claude is really helping me gather my thoughts.

That being said, I definitely believe it's only useful for isolated problems. Even with Copilot, I feel like the AIs just lack the broader context of the project.

Another thing that helped me was designing an initial prompt that really works for me. I think most people just expect to throw in their issue and get a tailored solution, but that's just not how it works in my experience.


Similar experience here. These tools are so good for sidestepping the one- or two-day grinds.


For me, it's allowing me to do things I wouldn't even have attempted before. I'm writing in languages I've never written in before (Python) and dealing with stuff I've never dealt with before (multicast UDP). This isn't complicated stuff by any stretch, but AI means I can be highly productive in Python without needing to spend any time learning Python.
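To give a flavour of the kind of thing involved, a minimal multicast UDP listener in Python looks roughly like this (the group address and port are made-up placeholders):

    import socket
    import struct

    GROUP, PORT = "239.1.1.1", 5007  # placeholder multicast group and port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))

    # Join the multicast group on all interfaces
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    while True:
        data, addr = sock.recvfrom(65535)
        print(f"{addr}: {data!r}")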


The alternative to "ask me clarifying questions" is to use Claude's Projects. Upload all of your project's source code there, and ask Claude to do your programming task. OpenAI has recently added this feature to their offering as well.


Have you used them to build a system to ask you clarifying questions?

Or even instructed them to?


Have you tested that this helps? It seems pretty simple to script with an agent framework.


Or just f-strings.
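E.g. a bare-bones clarifying-question loop with nothing but f-strings and the Anthropic SDK, no agent framework needed (the model name and the READY convention here are made up for illustration):

    import anthropic

    client = anthropic.Anthropic()
    task = input("Task: ")
    answers = []

    for _ in range(3):  # allow up to three rounds of clarification
        prompt = (
            f"Task: {task}\n"
            f"Clarifications so far: {answers}\n"
            "If you still need information, reply with exactly one question. "
            "Otherwise reply with the single word READY."
        )
        reply = client.messages.create(
            model="claude-3-5-sonnet-latest",  # placeholder model name
            max_tokens=256,
            messages=[{"role": "user", "content": prompt}],
        ).content[0].text.strip()
        if reply == "READY":
            break
        answers.append(f"Q: {reply} A: {input(reply + ' ')}")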



