Leela Zero (the main AlphaGo Zero replication project) is a crowd-sourced computation effort that's going to take a fairly long time to get anywhere.
And from this paper:
> "Training proceeded for 700,000 steps (mini-batches of size 4,096) starting from randomly initialised parameters, using 5,000 first-generation TPUs (15) to generate self-play games and 64 second-generation TPUs to train the neural networks."
You don't have to start from zero, though. It's cool that it works with Google-scale resources, but it seems like it would be faster to initialize with a neural net first trained to mimic the moves of an existing chess or Go AI, and then improve it from there via self-play.
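For concreteness, here's a minimal sketch of what that bootstrapping step could look like: plain supervised imitation in PyTorch, cross-entropy against the teacher engine's chosen move, assuming you've already dumped (position, move) pairs from an existing engine. All the names here (`PolicyNet`, the 17-plane/19x19 encoding) are illustrative, not anything from the paper.

```python
# Hypothetical sketch: pretrain a policy net to imitate an existing
# engine's moves before handing it to a self-play loop. The network
# shape and data format are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyNet(nn.Module):
    """Tiny policy head: board feature planes in, move logits out."""
    def __init__(self, in_planes=17, board=19, moves=19 * 19 + 1):
        super().__init__()
        self.conv = nn.Conv2d(in_planes, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64 * board * board, moves)

    def forward(self, x):
        x = F.relu(self.conv(x))
        return self.fc(x.flatten(1))

def pretrain(net, batches, optimizer):
    """One imitation pass over (board, teacher_move) mini-batches.

    boards: float tensor (B, 17, 19, 19); teacher_moves: long tensor (B,)
    holding the index of the move the existing engine played.
    """
    for boards, teacher_moves in batches:
        logits = net(boards)
        loss = F.cross_entropy(logits, teacher_moves)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

After something like this converges on the teacher's moves, you'd swap the net into the self-play/MCTS loop instead of random initialization, which is the whole point of the suggestion above.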
>"Why is the net wired randomly?", asked Minsky. "I do not want it to have any preconceptions of how to play", Sussman said. Minsky then shut his eyes. "Why do you close your eyes?", Sussman asked his teacher. "So that the room will be empty." At that moment, Sussman was enlightened.
I don't think it's at all clear that would work well. AlphaZero did significantly better than the original versions of AlphaGo (which did learn from existing human games). And even training an imitation net would still take a substantial amount of computational resources.
As for that koan, I'm not convinced it's very applicable here. My reading of it is that the entire setup (training process, network structure, etc.) encodes domain knowledge whether you intend it to or not. In this case, AlphaZero's domain knowledge seems transferable enough that the koan's point doesn't really apply.