
I don't think comparing win/loss distributions is particularly insightful.

If a single machine wins many games against the distributed version, that really just says the value/policy networks matter more than the Monte Carlo tree search. The main difference between the two is the number of tree search evaluations they can do; the distributed version doesn't seem to have a more sophisticated model.

This suggests that there are systematic mistakes that the single 8-GPU machine makes compared to the distributed 280-GPU machine, though MCTS can smooth over some of the individual mistakes a bit.

I suspect that human Go players in general do not share some of these systematic mistakes, so you likely won't be able to project these win/loss distributions onto games against humans.


