I’m curious about the comparison between NNUE and the LLM-based model that DeepMind announced a couple of weeks ago (https://arxiv.org/pdf/2402.04494.pdf). Using NNUE only (i.e. a depth-1 search) would be directly comparable. If DeepMind’s model is better, it raises interesting questions about scaling laws for this kind of thing.
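For reference, a depth-1 probe is easy to set up over UCI, e.g. with python-chess. This is a minimal sketch, assuming a local stockfish binary on PATH; note that even at depth 1 Stockfish still runs quiescence search, so the score is close to, but not exactly, a single raw NNUE eval:

```python
# Sketch: query Stockfish's eval with a depth-1 search via UCI.
# Assumes python-chess is installed and a `stockfish` binary is on PATH.
import chess
import chess.engine

board = chess.Board()  # starting position
with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
    # Limiting depth to 1 keeps the score close to the static NNUE
    # eval of each move's resulting position (plus quiescence search).
    info = engine.analyse(board, chess.engine.Limit(depth=1))
    print(info["score"])  # e.g. PovScore(Cp(+30), WHITE)
```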
Can Stockfish (with small modifications) be used for other games now? There are a few decent open source AlphaZero implementations and I wonder how it would compare.
NNUE is the interesting part to me. Alpha-beta tree search is useless without a good value function. Not sure what would be the best way to generate the training data if you're starting from scratch.
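The naive bootstrap would be something like the sketch below: label positions from random self-play games with the eventual game outcome, regress a value net on that, then iterate with the improved player (roughly the AlphaZero recipe minus the policy head). Everything here is illustrative, using python-chess:

```python
# Sketch: bootstrap value-function training data from random self-play.
# Later iterations would replace the random player with the current net.
import random
import chess

def random_selfplay_game(max_plies=200):
    board = chess.Board()
    seen = []  # (fen, side_to_move) for every position in the game
    while not board.is_game_over() and len(board.move_stack) < max_plies:
        seen.append((board.fen(), board.turn))
        board.push(random.choice(list(board.legal_moves)))
    result = board.result(claim_draw=True)  # "1-0", "0-1", "1/2-1/2" or "*"
    outcome = {"1-0": 1.0, "0-1": -1.0}.get(result, 0.0)  # white's POV
    # Label each position with the final outcome from the side to move's view.
    return [(fen, outcome if turn == chess.WHITE else -outcome)
            for fen, turn in seen]

training_data = []
for _ in range(10):  # a real run needs millions of games
    training_data.extend(random_selfplay_game())
print(len(training_data), "labeled positions")
```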
Now that it uses a NN only and does away with search, does it use more or fewer computing resources? Also, does it suffer from the "Swiss cheese" problem the Go engines do? As far as I understand, people could look for weaknesses in Go engines by finding lines the engine hadn't explored during self-play; in those positions the accuracy would plummet to the point where humans could beat it.
The original strength of Stockfish was closer to Shannon's Type B strategy (selective search), as opposed to the Type A (brute force) approach of Deep Blue.
That is, Stockfish was evaluating relatively few positions per second compared to brute forcers (like Crafty, Fritz, etc.).
This was offset by the best eval heuristics (basically crowdsourced human GM/IM/FM knowledge).
As an FM, I could "exploit" the Fritzes and Crafties of 1995-2005 by targeting holes in their eval.
Tim Krabbé provides some examples from that era: https://timkr.home.xs4all.nl/chess2/honor.htm
With Stockfish, the eval was always top notch (comparable to a GM's) and constantly improving.
Obviously, Stockfish was always a few orders of magnitude faster than a human as well.