In the strict sense, yes, brute force implies exhaustive search. But it often refers to all algorithms that leverage computational power as their main tool. For example, many people referred to Deep Blue as "brute-force" machine, even though it used Alpha-Beta pruning. In that second sense MCTS is very much a brute force algorithm. Its knows nothing about the domain of the problem and works better the more computer cycles you throw at it.
Point being, it is significant that something like MCTS can be so effective at Go even without machine learning.
Of course it was. If your explanation of AlphaGo somehow involves saying it was different from chess programs you can stop right now - you're just entirely wrong.
Whether chess engines and go engine are brute force or not just depends on how you define brute force:
* If you cut variations where you have heuristical indications they don't affect the result, is that still brute force?
* If you cut variations where you have mathematical, provable evidence they don't affect the result, is that still brute force?
Nope, brute force means exhaustive search by definition.
One could arguably claim that something like minmax is brute force, since you only prune subtrees that are provably sub-optimal.
But a search algorithm or heuristic that prunes over 99.99999999% (there are more 9s, but I won't bother calculating the right number) of the search space is not brute force.
By this logic, chess engines are not brute force. Anyone familiar with the state of chess engines would find this ridiculous.
I would agree that under one extremely nitpicky definition of brute force, brute force means searching every state. But chess engines are widely considered to be brute force. They don't have neural nets or fancy nonlinear function approximators for board state evaluation. But they most certainly don't search every single position. Their search depth is capped.
They don't have neural nets or fancy nonlinear function approximators for board state evaluation
Nonlinear function approximations are pretty common due to combining several evaluation heuristics and indexing in to tables with weights with the output of previous heuristics.
As for neural networks - the extra nonlinear layers seem to have too much computational overhead compared to the increase in evaluation accuracy. Chess board state can be reasonably approximated with linear terms and few cherry-picked higher order terms.
Are we in the same thread? I conceded that things like minmax can also be considered brute force. I also gave a (very liberal) estimate of how much the tree is pruned. Why still call it "extremely nitpicky"?