
While this sounds impressive, I'll believe it when AlphaZero wins TCEC.


It beat the winner of TCEC-2016, Stockfish, with a record of 28-72-0. That's zero losses.


If I run SF on my desktop computer it will kill SF running on my phone. That doesn't prove anything. Comparing TPUs and CPUs is hard, but they could at least have let SF run on what is considered a top-of-the-line setup with sensible settings (1GB of hash memory is very limited; 8GB is standard for rapid games on a quad-core CPU, let alone a 64-core one).


I can't figure out the reason for this stingy 1GB hash memory limit when using 64 cores. It pretty much negates the advantage of 64 cores over, say, 4 or 6 cores.

A nefarious suggestion would be that setting a 1GB limit ensures that Alpha would always have the edge in depth, as Stockfish would be forced to prune long lines to preserve hash memory.

Maybe someone who has read the Stockfish source code can comment on how Stockfish prunes hash memory.


Well, one explanation is that they wanted to win "convincingly", hence 1 minute per move and such a small amount of hash memory.


They didn't demonstrate that AlphaZero can beat Stockfish in a fair contest, i.e. take the amount of money they spent on Stockfish's CPU and RAM, buy a commodity GPU for AlphaZero with it, and then see.



I'm sorry, I thought we were discussing the paper.


On completely different hardware.


Back when AlphaGo was playing Lee Sedol I was thinking about a chess-playing version in TCEC.

The interesting thing is TCEC assumes a bit about the structure of the chess program. That is, the TCEC win-adjudication rule says that if both programs agree that one program is 6.5 pawns ahead for 8 turns in a row, they judge that program to be the winner.

But programs like Alpha don't have an evaluation function that operates in conventional units (like centipawns).
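For what it's worth, the rule as I described it can be sketched in a few lines. This is only my reading of it, not TCEC's actual implementation: the function name `adjudicate`, the White-relative sign convention, and the choice to reset the streak on any disagreement are all my assumptions. It shows why both engines need to report a pawn-style number.

```python
from typing import Iterable, Optional, Tuple

WIN_THRESHOLD_PAWNS = 6.5  # figure quoted in the comment above
REQUIRED_STREAK = 8        # figure quoted in the comment above


def adjudicate(evals: Iterable[Tuple[float, float]],
               threshold: float = WIN_THRESHOLD_PAWNS,
               streak: int = REQUIRED_STREAK) -> Optional[str]:
    """Return "white" or "black" once both engines have agreed, for
    `streak` consecutive turns, that the same side is at least
    `threshold` pawns ahead; return None otherwise.

    Each item is (eval_a, eval_b): the two engines' evaluations of the
    same turn, in pawns, from White's point of view (an assumption).
    """
    run_side: Optional[str] = None
    run_length = 0
    for eval_a, eval_b in evals:
        if eval_a >= threshold and eval_b >= threshold:
            side: Optional[str] = "white"
        elif eval_a <= -threshold and eval_b <= -threshold:
            side = "black"
        else:
            side = None  # engines disagree, or the margin is too small
        if side is not None and side == run_side:
            run_length += 1
        else:
            # Start a new streak, or clear it when there is no agreement.
            run_side = side
            run_length = 1 if side is not None else 0
        if run_length >= streak:
            return run_side
    return None
```

An engine that only outputs a win probability has nothing to put in `eval_a` or `eval_b` unless its output is first mapped onto a pawn scale.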


You can convert winning percentages to centipawns, so that's not a problem.


Could you explain your proposed conversion process?


Here's a relevant section from DeepMind's paper:

> We also measured the head-to-head performance of AlphaZero against each baseline player. Settings were chosen to correspond with computer chess tournament conditions: each player was allowed 1 minute per move, resignation was enabled for all players (-900 centipawns for 10 consecutive moves for Stockfish and Elmo, 5% winrate for AlphaZero). Pondering was disabled for all players.


Houdini, for example, calibrates its evaluation so that +1.00 corresponds to a win in 75% of blitz games and +1.50 to a 90% chance of winning (http://www.cruxis.com/chess/houdini.htm). Anyway, this is not a problem at all: the adjudication rule was introduced so that less electricity is wasted once the position is a clear win or loss.
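To make the conversion concrete, here is a minimal sketch. It assumes a one-parameter logistic curve, which is my choice and not something Houdini or DeepMind documents, and it ignores draws entirely. The scale is calibrated to the single point +1.00 = 75% from the figures above; `SCALE_CP`, `cp_to_win_prob` and `win_prob_to_cp` are hypothetical names.

```python
import math

# Assumption: win probability follows a logistic curve in centipawns,
# calibrated so that +100 cp maps to 0.75 (the +1.00 figure above).
# 1 / (1 + exp(-100 / s)) = 0.75  =>  s = 100 / ln(3)
SCALE_CP = 100.0 / math.log(3.0)


def cp_to_win_prob(cp: float, scale: float = SCALE_CP) -> float:
    """Map a centipawn score to a win probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-cp / scale))


def win_prob_to_cp(p: float, scale: float = SCALE_CP) -> float:
    """Inverse mapping: a win probability in (0, 1) to centipawns."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return scale * math.log(p / (1.0 - p))
```

Under this assumed curve, `cp_to_win_prob(150)` comes out near 0.84 rather than the 90% quoted for +1.50, so a single logistic does not reproduce both figures, and a real conversion would need to be fitted to game results. Likewise `win_prob_to_cp(0.05)` gives roughly -268 cp, which is only as meaningful as the curve itself.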


I hope they change the TCEC hardware specs to include GPUs so this might be able to happen.


How can we fairly evaluate TPU engines vs. CPU engines?



