Every time I try Gemini, it's really subpar. I found that qwen2.5-coder-32b-instruct can be better.
Also, for me 50% 50% for Sonnet and o1, but although I'm not 100% sure about it, I think o1 is better with longer and more complicated (C++) code and debugging. At least from my brief testing. Also, OpenAI models seem to be more verbose - sometimes it's better - where I'd like additional explanation on chosen fields in a SQL schema, sometimes it's too much.
EDIT: Just asked both o1 and Sonnet 3.5 the same QML coding question, and Sonnet 3.5 succeeded, o1 failed.
Very anecdotal but I’ve found that for things that are well spec’d out with a good prompt Sonnet 3.5 is far better. For problems where I might have introduced a subtle logical error o1 seems to catch it extremely well. So better reasoning might be occurring but reasoning is only a small part of what we would consider intelligence.
Also, for me 50% 50% for Sonnet and o1, but although I'm not 100% sure about it, I think o1 is better with longer and more complicated (C++) code and debugging. At least from my brief testing. Also, OpenAI models seem to be more verbose - sometimes it's better - where I'd like additional explanation on chosen fields in a SQL schema, sometimes it's too much.
EDIT: Just asked both o1 and Sonnet 3.5 the same QML coding question, and Sonnet 3.5 succeeded, o1 failed.