
    > To highlight the main strength of o1 pro mode (improved reliability), we 
    > use a stricter evaluation setting: a model is only considered to solve a 
    > question if it gets the answer right in four out of four attempts ("4/4 
    > reliability"), not just one.
So, $200/mo. gets you less than 12.5% randomly wrong answers?

And $20/mo. gets you >25% randomly wrong answers?
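
For what it's worth, here is a rough sketch of how a "4/4" requirement relates to per-attempt accuracy. It assumes every attempt is independent and has the same chance p of being right. That is my assumption, not something the quoted text states, and repeated attempts on the same question are probably correlated in practice. The function names are made up for illustration, and the 12.5% and 25% figures are read here as per-attempt error rates purely as an example.

```python
def all_correct_probability(p: float, attempts: int = 4) -> float:
    """Chance that every one of `attempts` tries is correct,
    assuming independent tries with the same per-attempt accuracy p."""
    return p ** attempts


def implied_per_attempt_accuracy(pass_rate: float, attempts: int = 4) -> float:
    """Invert the formula above: the per-attempt accuracy that would
    produce a given all-correct pass rate, under the same assumption."""
    return pass_rate ** (1 / attempts)


if __name__ == "__main__":
    # Illustrative per-attempt error rates taken from the comment's figures.
    for error_rate in (0.125, 0.25):
        p = 1 - error_rate
        print(f"per-attempt accuracy {p:.3f} -> "
              f"4/4 pass rate {all_correct_probability(p):.3f}")
```

Under that independence assumption a 4/4 score is always lower than the single-attempt score, so the two kinds of numbers are not directly comparable.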


