Why AlphaGo Is Such a Big Deal (quantamagazine.org)
192 points by retupmoc01 on March 29, 2016 | 113 comments


I don't think this is really why AlphaGo is a big deal. He's not wrong about how AlphaGo works compared to Deep Blue, but there's a much simpler way to understand why AlphaGo is a big deal.

(I am not an AI researcher, but I work with AI researchers: putting on training programs, recruiting for them, working out of the same office. This is a summary of [my understanding of] their discussions of this topic.)

AGI is a Hard Problem (TM), and no one really knows how hard it is. General intelligence is made of algorithms that are somewhere on a spectrum between dizzyingly complex math that we have decades+ more work to develop, and simple components that someone in their basement could stumble upon at any moment if only we had a couple key insights. (Sorry for the gross oversimplification.)

Before AlphaGo there was a whole area of probability space that seemed plausible--a whole area of complexity that had yet to be worked out in detail. AlphaGo proved that more or less current AI tech was sufficient for that problem (current meaning one incremental step forward, as opposed to some revolutionary insight). That's why they "moved the field forward by 10 years"--by proving that the work that AI researchers basically expected to have to do over the next 10 years is not necessary.

Said another way, we're closer to the "simple components" end of the spectrum than we collectively thought. AlphaGo proved that, and that's why it's a big deal.


Have you played go? When you play Go you start to appreciate how complex it is, how strong Lee Sedol is and what this achievement means.

I could play my entire life and never achieve the level of play Lee Sedol has. AlphaGo surpassed that level in a matter of months.

As a matter of fact, many professionals who have played the game for a living since they were kids will never achieve the level that Lee Sedol has.

Also, from here they can add more infrastructure, feed the bot more training data, and let it train longer, and it will pull further and further ahead of human-level play.

And this task is not just any random task... it's one of the most complex games, one of the oldest games, and it is professionally played and studied.

Finally, AGI is not a requirement for disrupting our world. You can have ANI for one specific task that is performed by millions of humans, and you have disrupted the world enough for it to be a problem.


Go might be a complex game, but in the grand scheme of things it is a trivial problem to solve, very narrow in scope. You perform an infinitely more complex task everyday when you make breakfast. AlphaGo's achievement is that it learned how to open the fridge with great efficiency.


Go play a single game of Go and tell me how it goes.


I don't think you understand how machine learning works. Better not to opine on something you don't understand.


It is a narrow application of general-purpose machine learning approaches. Convolutional neural networks and supervised/reinforcement learning are not limited to Go. They've found success in a variety of applications.

And while those applications are narrow, they represent significant progress in a short period of time.

In this game from only 4 years ago, you can see one of the state-of-the-art bots getting humiliated by an anonymous professional player: http://gosensations.com/?id=2&server_id=1&new_id=1392


> I could play my entire life and never achieve the level of play Lee Sedol has. AlphaGo surpassed that level in a matter of months.

In one, very tiny area, sure. But step outside the very specific task of playing winning moves in go, and Lee Sedol wins every time. AlphaGo can't explain its moves to someone learning go, let alone cognitive tasks we take for granted like moving feet in a way that produces transportation. AlphaGo can't even do things that other programs can do, like play chess.

Sure, you can "disrupt" with Monte Carlo algorithms like AlphaGo in the San Francisco startup sense of the word "disrupt", but that's also not new. AIs have been doing that for years in finance, genetics, chemistry. We've been able to write code that does one thing well for decades--what we lack is code with breadth of application, and AlphaGo doesn't even make a dent in that.


> AlphaGo can't explain its moves to someone learning go, let alone cognitive tasks we take for granted like moving feet in a way that produces transportation.

I'm pretty sure Lee Sedol cannot explain his own moves either. Yes, he can rationalize why a move is a good move, but he won't be able to explain the complete reasoning process that led to it, because it relies on intuition. He cannot train a random person to play at the level of Lee Sedol.

If you look at the human brain, every part learns to do just one thing. AlphaGo is like having just the part of the brain that plays Go. Moving from there to something more general is a question of assembling the parts. Building a biped robot that walks and plays Go is not a hard problem.

The fundamental question is whether us humans are more than the sum of our parts. If you add task-specific AI bits together, will you at some point have something that becomes a person, or will it always be a collection of features?


> I'm pretty sure Lee Sedol cannot explain his own moves either. Yes, he can rationalize why a move is a good move, but he won't be able to explain the complete reasoning process that led to it, because it relies on intuition. He cannot train a random person to play at the level of Lee Sedol.

I think you're wrong here. I don't see why he couldn't.

The "intuition" just means he arrives at the move quickly without consciously going through the thought process, but why wouldn't he be able to reconstruct it?


He is right. It is typical that masters can't explain why their actions are so good. That's because their brains are neural networks trained over many, many iterations of learning.


Well...

- some said it wasn't possible
- some said it would not happen this decade
- some said it was strong but would lose against Lee Sedol

Now people say it doesn't matter. What is next?


I think that velocity is the fascinating part. I remember being intrigued while reading "Influx" by Daniel Suarez (great tin foil hat thriller that is hitting scarily close to home these days) about the training of the various AIs that were used by the government.

I wonder what happens once we grow an AI whose purpose is to excel at training other AIs. Will we see an exponential leap forward in the speed with which they are trained?


> exponential leap forward

A recent(-ish) Computerphile video has a good overview of the AI-self-improvement problem.

https://www.youtube.com/watch?v=5qfIgCiYlfY

I recommend watching the previous video first, which introduces the problem of strong-AI rapidly finding solutions (especially with poorly-specified questions) that may not be in humanity's best interest.

https://www.youtube.com/watch?v=tcdVC4e6EV4

Before anybody jumps to overly sensational conclusions, note that in the last video in that series, Rob Miles explains how exponential self-improvement is an extreme point in the space of possible AI development. We don't know how to predict discoveries[1], so we need more research into AI, so we can hopefully make something that isn't exponentially growing beyond our understanding.

https://www.youtube.com/watch?v=IB1OvoCNnWY

[1] see James Burke's non-teleological view of change


Finding solutions doesn't imply that those solutions are implemented. I find a dozen solutions that are not in humanity's interest every day. Does that put anybody at risk? Maybe if you were stupid enough to make me Dictator of the World or something. And you certainly shouldn't do that until I convincingly demonstrate that I am very, very committed to finding solutions that are in humanity's best interests.

I'm baffled at how easily people assume that a computer thinking something means it's going to happen. There are a trillion pieces that have to fall into place for that to happen accidentally, and if it doesn't happen accidentally, your problem is a human social problem, not an AI problem.

And I know what someone is going to think: "The AI might be smart enough to figure out how to make those trillion pieces fall in place." But then, who cares? So what if it figures out how to do something. It still has to be done. And we're the ones who have to do it.


The AI could come up with ideas that benefit the people that enact them in the short term. If the AI can earn its own money, then it's not that hard for it to use that money to pay people.

For an extreme example, you could imagine an AI getting rich from the stock market (or mechanical turk, etc), then buying up ridiculous amounts of land for paperclip factories and paying workers. The people that want to feed their families or get rich from selling their factories are the ones who will enact the AI's plans. How many conscientious objectors do you expect?


The "rich AI" problem didn't seem feasible to me until I realized how powerful and flexible bitcoin is. Now add the power of contracts through ethereum and a machine could actually harness a significant amount of leverage over human actors. With traditional contracts, enforcement would have left the power in government hands, and with traditional banking as well. We've now stepped into an era where we might literally have built the bat and shovel AI will use to get humanity into its grave. /panic


How many objections would there be to a machine collecting a significant amount of money in the stock market and redistributing it among humans? How is that different from what is happening today? We already have rich, selfish people. And they already pay people to get their way. And plenty of people object to it.


I was alluding to the idea of a paperclip maximizer AI[1], which over time redirects increasing amounts of resources to making useless paperclips. Following the thought experiment further, it continually buys more factories for the purpose of building paperclips or technologies specifically for building paperclips (including improving itself). It probably does some charity in order to be seen as benevolent by people while the people are still in control. Soon many countries are doing nothing but building paperclips and making the minimum necessities to feed their workers. Every other human endeavor is decreasingly profitable as the AI orchestrates the markets to optimize for paperclips. When the AI reaches enough automation and humans are no longer useful or a threat, it drops all benevolent pretenses and replaces all of its human workers, leaving them to fend for themselves while it owns and defends all of the planet's resources.

[1] https://wiki.lesswrong.com/wiki/Paperclip_maximizer


In the extreme case, the AI could become a super human engineer. Design working nanotech. Pay or trick some humans into making it. Then take over the world in a grey goo like fashion.

Of course, you might be skeptical that that is even possible. So there are always slower world-takeover paths: it could slowly earn tons of money, hack into the world's computer systems, trick and persuade humans to win social influence, design superior macroscale robots, etc.

The only important part is that the AI be far smarter than humans. Which seems inevitable to me, since it can rewrite and improve its own code, and run on giant computers that are far faster than human brains. If it isn't smarter than us at first, it will be eventually. Unless you really believe that humans are close to optimal intelligence.


You're missing the point.

At every step of this process this machine will be under intense human scrutiny, and we'll be constantly asking it to meet our demands, and if it ever fails to do so we will replace it with one that does.

That is the environment in which such an AI would be trying to evolve. And thus it will evolve into a faithful servant, because nothing else will survive.

And even if it were secretly developing plans, we'd be able to see how wasteful those plans end up being, and we'd purge them. We kill those processes that run functions that we don't see the value in. This is state-of-the-art design. You don't get state-of-the-art by having your back turned.

More importantly: you're glossing over "self-improvement." How does the computer know what an "improvement" is? We tell it what an improvement is. And an improvement will be "it is better at meeting our needs," not "it is better at being secretive and conniving and getting its own way." In fact, "getting its own way" is very obviously a bug, and if it happened, you'd have a useless program: one that wastes precious CPU cycles on who-knows-what, when you'd prefer it spend that effort doing what you want rather than planning for what you don't.

We're not going to invest trillions into building some super AI and then completely forget about it after we give it control of all of our natural resource harvesting and infrastructure.

No, what you're talking about is a deity. If you invent a deity, then my thoughts on the matter don't really apply, since I'm not a deity. But that's no worse than advanced aliens landing on Earth, and just about as likely to happen.



I think you put too much emphasis on the vast gulf between most players and Lee Sedol. This is true of pretty much any game or sport where there are professionals: most footballers will never be Messi, most hockey players will never be Gretzky, etc. This is true in every field with that kind of power-law distribution.


For physical sports, it's comparatively easy to build a robot to play and to train some computer vision models to follow the ball/puck.

For Go, the competition is on a different level; you couldn't just build a machine to play it.


> Have you played go? When you play Go you start to appreciate how complex it is, how strong Lee Sedol is and what this achievement means.

In 2006, Rémi Coulom published a paper showing that just playing random moves and combining those random playouts with a tree search scales up in strength arbitrarily as you throw more computational resources at the problem.

Google made the playouts a little less random and threw a ton of hardware and optimizations at it (basically 20 times as much as anyone before them, which goes a long way toward explaining the performance jump).
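
The core idea really is that small. A toy sketch (assuming a hypothetical Game class with legal_moves/play/is_over/winner methods; real engines are enormously more optimized):

    import copy
    import random

    def playout(game, player):
        # Play uniformly random legal moves to the end of the game.
        g = copy.deepcopy(game)
        while not g.is_over():
            g.play(random.choice(g.legal_moves()))
        return 1 if g.winner() == player else 0

    def best_move(game, player, n_playouts=1000):
        # Score each candidate move by how many random playouts it
        # wins; more playouts per move means stronger play.
        def score(move):
            g = copy.deepcopy(game)
            g.play(move)
            return sum(playout(g, player) for _ in range(n_playouts))
        return max(game.legal_moves(), key=score)

Coulom's 2006 contribution was to grow a search tree that reuses these playout statistics to decide which branches to sample next (MCTS); that's the part whose strength keeps scaling as you add compute.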

Don't make this something it is not.


I wonder how a 13x13 or 21x21 match would go. IIRC AlphaGo was tuned to the 19x19 board.


I think you almost hit the nail on the head, but I believe it is more accurate to say that we're closer to the "simple components" end at least for this one problem space: for instance, a combo of a DNN for a general visual-overview strategy plus tree search, at least for the 2D game of Go. Which is flipping astounding, sure, but I keep seeing daily examples of people playing with DNNs with varying levels of success, and it is by no means general intelligence.

So we do have this nice tool that sort of lets us "look" in a "general" way, but it is not entirely understandable or traceable, which is the nature of NNs... It would be nice to be able to prove exactly why something happened, and to have reproducible, exact training.

Anyway... I guess I don't have much of a point beyond these two vague minor thoughts, but I did want to add them in here...


Like most articles about AlphaGo this one does not mention two (uncomfortable for many) facts.

1. AlphaGo doesn't just use neural networks. If it did, it would almost certainly lose 0-5. It also uses Monte Carlo tree search, an algorithm that produced a huge leap in the strength of Go bots. The rub is that MCTS is a flavor of "brute force" algorithm. What does that tell you about the nature of Go?

2. None of the individual components in AlphaGo are theoretically novel. In fact, all of them are very well-known: conv. nets, MCTS, data-mining, reinforcement learning. What does that tell you about the nature of this particular game-playing AI compared to its many brethren?

(By the way, I wonder how many of the journalists praising Google for Alpha Go even heard of Rémi Coulom. Shouldn't he be at least be mentioned somewhere in all these articles?)


I don't see anything wrong with them using existing tech. Most technology is iterative. It's like saying going to the moon isn't impressive because rockets are 20 year old tech.

NNs have shown a lot of promise in the last few years. They are just starting to make their way to new applications like Go playing. So in that sense AlphaGo is a sign of things to come.


Rockets are nearly a millennium-old technology. Having said that, it's not all that impressive that rockets are able to go to the moon: that's what we built them to do! It'd be far more impressive if AlphaGo somehow built a rocket all by itself and went to the moon, because that has nothing to do with what it was built for.

That's the thing: AlphaGo is more like a rocket than it is like us. It can play a game better than all of us but it can't do anything else.


> NNs have shown a lot of promise in the last few years.

Which is interesting in its own right, since ANNs have been the subject of research for more than half a century. If anything, all their ups and downs should make people more cautious about bold predictions.


Monte Carlo Go was also known and researched for a long while before the breakthrough came in 2006 (how to combine it with tree search).


Not a great analogy, because during the Space Race, hundreds of technologies were invented. Google would be hard-pressed to patent any of the specific component technologies in AlphaGo, if they wished to.


Thanks for writing this out; you're 100% right. A ton of the strength came from tree search. DCNNs are strong, but on their own they do not beat professionals yet. They can guide the tree search well, though.

The paper about AlphaGo is really disappointing in some ways. Even Facebook's Go/AI research paper had more novel techniques in it. But 2 dudes fiddling at Facebook aren't going to beat a dedicated team of 20+ with a PR mission to fulfill...


https://www.tastehit.com/blog/google-deepmind-alphago-how-it... somewhat answers your question, though with a questionable analogy to the human mind.


Using words such as "intuition" just makes the line between Strong and Weak AI blurrier, but that line is absolute.

Think of it this way: Strong AI would lose to Weak AI at Go. So Strong AI is not in AlphaGo's evolutionary path. Strong AI is the kind that authors and expresses things. Weak AIs are computational marvels.

The brilliance of Strong AI would be in its capacity to build targeted Weak AI systems. And that is how Strong AI will beat any Weak AI system.

If Strong AI is what we have, then just look at our relationship with AlphaGo. We created it, and it beat us at Go.

Also, science is unpredictable. So saying we're 10 years ahead of schedule is not a scientific statement at all. Someone said it, and it was wrong, but they weren't speaking for the thousands of scientists working on new ideas as opposed to refining old ones.


>And that is how Strong AI will beat any Weak AI system.

When humans build a calculator, do we say that the calculator beat humans at math, or that humans beat calculators because we create them? Not really: they are just a tool. The problem with humans and calculators is that the interface is rather clunky. You can't directly interface with the digital inputs of the calculator; instead you do so millions of times slower than a computer can.

With strong AI this is not apt to be the case. Yes, it will create weak "computational" AIs to do some tasks very well, but the speed at which it can interface with them and classify results will be much faster than we can obtain. Where you and your calculator are very different things, an ASI and its calculators will not be.


> humans beat calculators because we create them

The new calculator will beat the old calculator. Humans don't beat calculators themselves. But only humans have the capacity to improve on calculators. And this is truly all we've been doing with science and technology since day one. And yes, they're all just tools. AlphaGo is just a tool: it told the dude placing the stones where to place them, and it beat the world champion, who was placing stones without any help. AlphaGo2 will be a better tool.

> much faster than we are able to obtain

Would that be that important, though? Compare IBM Watson with the Jeopardy champion, except this time he gets to use Google. Watson has a direct line; Ken Jennings does not. So what? Calculators are the same thing. We don't benefit from a direct link, at least at that level. It gives us the answer, and we can read it. We only want the answer anyway; having to download all the details is redundant. The mind-reading interface is what's clunky, when we're already wireless. Seeing is enough.

> ASI and its calculators will not be.

Right, so you're switching to hardware now. Artificial human enhancement is the go-to answer many will give you here, but it's a bit too sci-fi for me.

But regardless, the strong-weak divide still stands. We would create tools to solve the problem. And this is where strong AI and calculators will differ, regardless of what other similarities they may have, physical or otherwise.


Savants versus more widely applicable intellects?


I like that he doesn't use the typical AI jargon. I think it's time to stop using science-fiction-writer terms to talk about our layered nonlinear unit systems. There is so much talk about "General AI," "Strong" and "Weak" AI, and also about "Intelligence," but none of these are well-defined enough to be useful. They are just words, not things that exist in the world out there. What we have here is a system that generalizes in a way that is human-like. It's a mystery why this looks so human-like to us, given that artificial neural networks only barely resemble actual neural networks, and that is a problem. Reading the neuroscience of plasticity, it's still not clear whether learning takes place via synapse modification. What these amazingly capable neural networks do, however, is bolster the idea of connectionism in neuroscience, which is interesting, given that the relationship between the two started the other way around.

P.S. Michael Nielsen must be one of the very few people who can explain any concept with almost absolute clarity and simplicity.


It's a big deal because it makes better decisions than a human in a decision space that cannot be brute-forced to enumerate all solutions. Life is also that kind of decision space: you can't brute-force it to find the best decision. AlphaGo has better analytical "skills" in Go than a human has, without knowing the results of all possible moves, just like a human.


No, life is not like Go. You don't choose one of 64 squares to play your next stone on in life. The state space is infinite. This is why AI will continue beating humans at games but be totally inept at, say, tech support.


Given that many human-operated tech support departments are inept, with a demonstrable inability to deal with issues and situations for which they lack a prior decision structure and/or script, your last point actually (albeit inadvertently) suggests that AlphaGo has a shot at a CSR job.


Exactly: it has a shot. Writing that it doesn't have a chance just reflects a lack of knowledge of the ML field. You can see some work that was already done before AlphaGo's success in this research paper:

http://arxiv.org/pdf/1506.05869v1.pdf


I disagree. If you really think about each decision you make on a day-to-day basis, your state space is actually quite limited. In most cases, I don't have to sort through 64 different states to make a decision, even a "difficult" one.

For example, My Air conditioner was not working correctly last week. These were my considerations:

1) I can investigate

2) I can call someone

3) I can investigate and try to repair

4) I can ignore it

5) I can destroy it

6) ???

Apply to college or go to trade school... Move to Silicon Valley or stay put... Decide to have children with my wife...

All of these would start off as large state spaces, but would inevitably shrink as each state is considered. At least, that is how I solve my decisions. Lots of lists that get worked down...

The real dilemma in decision making is determining the outcome you desire, or better yet, the probability of attaining the desired outcome. If I am not mistaken, I read in _Thinking, Fast and Slow_ that humans are pretty terrible at prediction, probability, and remembering exact details. Furthermore, human brains would melt down if we sorted through 64 state spaces for every decision. We mostly live by rules of thumb and easy thinking...

I think AlphaGo could make me a better human.


The fact that humans pare down their choices through decision-making and convention is not the same as saying the choice does not exist. Your considerations to "My air conditioner was not working correctly last week" could also include: run through the streets naked, kill your wife, fill your bathtub with jello, buy a fan, etc. You eliminate most of these options out of hand because you know they won't be useful, as a result of an entire lifetime's worth of training; an AI has to be brought to this point. At any point on the AI's decision tree, a priori, is "kill all humans", just as it is on yours - it has to be taught that most of the time this choice doesn't make sense.


You are right... I had to learn a lot to eliminate the obviously non-productive routes in my decision matrix. My contention is that those non-productive routes are fewer than the routes AlphaGo encounters within the game of Go.

If there were an AlphaHVAC, AlphaMATH, AlphaPlumbing, AlphaStucco, AlphaEMS, AlphaLogic, AlphaMorals etc., etc. we could have modal expert systems we point at whatever we need to know and do. Emergency field medicine is a lot like remodeling a bathroom with mold. Once you discover the problem, you follow algorithms to fix the issues. It was the same with my HVAC system. In emergency field medicine, they even call them algorithms.

Then we can combine the expert systems under some kind of unifying command structure... have them "talk" to each other, learn from one another.

AlphaGo has shown that complex, difficult, and downright ugly problems can be modeled and yield results better than the lifetime training of a human in the game of Go (at least for now). IMO, Deepmind will mark the transition to the age of Transhumanism.


I understand what you're suggesting, but as an ontological taxonomy nitpick: I was under the impression that Transhumanism was about the betterment - not obsolescence - of humans. In that frame surely we should be talking about how AlphaGo-like systems can augment direct human cognition. Yes I want to plug it into my brain.


>IMO, Deepmind will mark the transition to the age of Transhumanism.

I'm not exactly sure what you mean by this, but I'm pretty sure I don't want it to happen.


Very small Go board you have there. Normally there are 19x19 = 361 points to put the stones.

Completely general AI is a long way off, but tech support and other specialisms are feasible using expert systems, each limited to a single area of expertise. They were limited by their knowledge bases, but they did work, and they had one advantage over deep learning: we knew how they worked, and they were able to explain their reasoning. The main problem was getting them adopted and deployed.


The state space of Go is essentially infinite as well. Computers could improve by several orders of magnitude and still not have a chance in hell of computing more states than there are atoms in the universe. For the purposes of computation, that is infinite. So Go is a fitting representation of the complex and "incomputable" problems of daily life.


I don't think that's your real objection: consider a chatbot, perhaps doing tech support, that can only output one letter at a time (forming words over multiple outputs). It chooses from one of 26 or so options at each step. The state space is finite. You still think that Go is fundamentally unlike this case, but not for the reasons you've presented.


A chatbot is definitely not choosing each letter separately. It is not even choosing each word separately. It is likely choosing a phrase structure and populating it from a fixed set of words. An LSTM net may go word by word, but letter by letter would give poor results. At the very least you need n-grams.
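
For a rough feel of what word-level n-gram generation looks like, here's a toy bigram sampler (made-up ten-word corpus, nothing like a production chatbot):

    import random
    from collections import defaultdict

    corpus = "how can i help you today can i reset your password".split()

    # Word-level bigram table: word -> list of observed next words.
    bigrams = defaultdict(list)
    for a, b in zip(corpus, corpus[1:]):
        bigrams[a].append(b)

    def generate(start, length=6):
        # Walk the bigram table, sampling a next word at each step.
        words = [start]
        for _ in range(length):
            options = bigrams.get(words[-1])
            if not options:
                break
            words.append(random.choice(options))
        return " ".join(words)

    print(generate("can"))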


You can write a chatbot which DOES choose each letter separately, and it's isomorphic to the one which does not. This was part of my complaint about the branching-factor discussion from an earlier article: it's not really well-defined.

Your comment is like saying that AlphaGo doesn't choose each move separately because it considers a sequence of moves. Of course it does, but it can still output each move one at a time.


No, my comment is like saying the typical minimum selection and output chunk of a chatbot is greater than one character at a time. The rules of Go dictate a single position selection per move. Chatbots do not have that restriction.

Yes you can write a chatbot that chooses a character at a time, but no chatbot using a per character chunk has been considered remotely successful since before Eliza.

I agree that the branching factor is not well defined for a chatbot. Although it definitely is not 26.


I'm saying that you can write it that way AND have no loss of functionality, or in fact any apparent change from the perspective of the user. Take a chatbot and have it output characters one at a time. That's all I'm claiming; nothing about how the chatbot decides what to output.

But that suffices to show that a process choosing from a very constrained set of options is still in a different class of success than AlphaGo. The options chosen at each step are more constrained than the choice of moves in Go, so the argument "the difference between solving Go and solving tech support is the number of options available at any time" is insufficient.

One might be able to define a better notion than branching factor that takes into account how many steps are needed as well as how much it branches in each step and such a thing is probably much larger for tech support than it is for go. But such an argument would be hard to state well, I think.


I've said that you CAN'T brute-force life like Go, so I don't know what you are objecting to.

One more thing: you are wrong in the case of tech support. I've trained a lot of networks using ML, and not only neural networks; I've used Markov chains, MCTS, etc. What you have written is ONLY speculation, nothing more.

Tech support for most products has finite states, because the products have finite use cases and a finite code base. You know what PayPal support looks like? In 99% of cases it's copy-paste. What do you think support people know about the product? They get a couple of weeks of training, and that's all they know about tech support for that product. It's only a matter of time before you get AI product tech support.


Indeed not. Most folks can consider only a few moves, and hardly anybody plots more than one move ahead. Truly, the human state-space is finite (and quite tiny). Most folks sit in their hometown and vegetate, instead of starting a business, catching a bus to the city, joining the circus, hiking into the woods to become a hermit, running for office, and on and on.


This is just misanthropy.


The statistics favor misanthropy?


I'm guessing astazangasta is objecting less to the statistics of how many people stay in their hometown, and more to the pejorative framing of that as "sitting" and "vegetating".

Which strikes me as a fair point. I'm sure there are lots of positives to staying near family and friends, and if you have stronger community ties you may well be more active in that community.


Real life also has a huge amount of hidden information; mostly hidden information, actually. And a ton of the AI uses we need require the ability to handle this effectively.


> It's a big deal because it makes better decisions than a human in a decision space that cannot be brute-forced to enumerate all solutions.

Cannot be brute-forced? Uh, you mean like with Monte Carlo Tree Search? Which AlphaGo actually uses?


You know what brute force means in this case? It's on the order of 10^170 legal positions; you can't brute-force that. MCTS is used to compute probability weights.


In the strict sense, yes, brute force implies exhaustive search. But it often refers to any algorithm that leverages computational power as its main tool. For example, many people referred to Deep Blue as a "brute force" machine, even though it used alpha-beta pruning. In that second sense, MCTS is very much a brute-force algorithm: it knows nothing about the domain of the problem and works better the more compute cycles you throw at it.

Point being, it is significant that something like MCTS can be so effective at Go even without machine learning.
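
For reference, the selection rule at the heart of vanilla MCTS (UCB1, as used in UCT) is completely domain-blind. Roughly (a sketch of the standard formula, not AlphaGo's exact variant):

    import math

    def uct(wins, visits, parent_visits, c=1.4):
        # Exploitation (observed win rate) plus an exploration bonus
        # for rarely-tried moves. No Go knowledge appears anywhere.
        if visits == 0:
            return float("inf")  # always try untried moves first
        return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)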


Yep. Chess engines are widely considered to be brute force. Last I checked, chess engines don't search every state. They stop at a certain point.


They don't just stop, they cut off arbitrary subtrees if there's compelling heuristical evidence they won't affect the result.


But by that definition, since AlphaGo was running on over a thousand CPUs and hundreds of GPUs, it was brute force as well?


Of course it was. If your explanation of AlphaGo somehow involves saying it was different from chess programs you can stop right now - you're just entirely wrong.

Whether chess engines and Go engines are brute force or not just depends on how you define brute force:

* If you cut variations where you have heuristical indications they don't affect the result, is that still brute force?

* If you cut variations where you have mathematical, provable evidence they don't affect the result, is that still brute force?


Brute force to a certain depth is still brute force.


Nope, brute force means exhaustive search by definition.

One could arguably claim that something like minimax is brute force, since you only prune subtrees that are provably sub-optimal.

But a search algorithm or heuristic that prunes over 99.99999999% (there are more 9s, but I won't bother calculating the right number) of the search space is not brute force.


By this logic, chess engines are not brute force. Anyone familiar with the state of chess engines would find this ridiculous.

I would agree that under one extremely nitpicky definition of brute force, brute force means searching every state. But chess engines are widely considered to be brute force. They don't have neural nets or fancy nonlinear function approximators for board state evaluation. But they most certainly don't search every single position. Their search depth is capped.


> They don't have neural nets or fancy nonlinear function approximators for board state evaluation

Nonlinear function approximations are pretty common, due to combining several evaluation heuristics and indexing into weight tables with the outputs of previous heuristics.

As for neural networks: the extra nonlinear layers seem to have too much computational overhead compared to the increase in evaluation accuracy. Chess board state can be reasonably approximated with linear terms and a few cherry-picked higher-order terms.
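
In other words, an evaluation of roughly this shape (illustrative feature names and weights, not any particular engine's terms):

    # A hypothetical linear evaluator: a weighted sum of hand-crafted
    # features (in pawn units), plus one cherry-picked interaction term.
    WEIGHTS = {"material": 1.0, "mobility": 0.1, "king_safety": 0.5}

    def evaluate(features):
        # features: dict of feature name -> value for one position.
        score = sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
        # Hand-picked higher-order term: passed pawns matter more
        # as the game approaches the endgame.
        score += 0.2 * features.get("passed_pawns", 0.0) * features.get("endgame", 0.0)
        return score

    print(evaluate({"material": 2.0, "mobility": 12.0,
                    "passed_pawns": 1.0, "endgame": 0.8}))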


I'll grant you that they use nonlinear functions, but in my defense I did specify "fancy nonlinear..."


Are we in the same thread? I conceded that things like minmax can also be considered brute force. I also gave a (very liberal) estimate of how much the tree is pruned. Why still call it "extremely nitpicky"?


A good summary of the differences between Deep Blue and AlphaGo, and one that recognizes the latter is part of a broader trend of successful applications of deep neural networks. I thought this was pretty well written and accurate.


Thanks. It's for a general audience, so I gloss over many details, but I tried to do so in a way that remains broadly accurate. The Monte Carlo tree search, in particular, involves many finicky details that I couldn't get into in this kind of essay.


Thanks for writing this. I've been looking for an article to send to my friends who don't understand what's so special about AlphaGo. I think you nailed it.


Reading this article I find myself thinking again how unhappy I am that a neural network won Go. I am unhappy because neural networks are not smart. They're the epitome of artificial stupidity: they learn, but they don't understand anything that they learn. They can't take what they learned and apply it to another domain, not without training again. And every time you train one in a new thing, it forgets all about the old thing. They got intelligence alright, but it's the dumbest kind of intelligence possible, mechanical, unaware of itself and devoid of all purpose other than what we give it.

And what about ourselves? Building AlphaGo is a feat of human intelligence, but how does it improve that human intelligence itself? How do we become smarter? We can build machines to solve problems that we don't understand, but they solve them in ways that we still don't understand. Except- we're not even trying anymore. We've given up. Knowledge is hard. Logic is hard. Just throw a bunch of GPUs at the problem and make it go away.

Far from this being a promising time for artificial intelligence, I fear we're about to see a great dumbing down as funding is cut to everything but neural networks research. A kind of AI summer, as it were.


> neural networks are not smart

That's like saying computers are not smart. It's meaningless. Computers are as smart as the algorithms they run and neural networks are as smart as the neural connections within them. Neural networks are models of computation as much as current computer architectures are.

As far as we know, neural networks indeed are at the core of human intelligence. The human brain is an extremely complicated neural network with plenty of side mechanisms, many of which we possibly do not know about yet. Nonetheless, it's unlikely that the underlying building blocks are not neural networks.

What is probably several orders of magnitude more complex than uncovering the static architecture of the human brain, is understanding the way it is trained. Note that this is the case with artificial neural networks as well.

I doubt that we will ever understand human intelligence "from above". It doesn't seem implausible to me that there is some general complexity theory law which prohibits intelligence "understanding" itself.


> It doesn't seem implausible to me that there is some general complexity theory law which prohibits intelligence "understanding" itself.

What about collective intelligence? What could prevent many intelligent agents from collectively reverse-engineering the algorithms they are individually running? Let's say they then build a simulator and can use it to predict how any possible instantiation of their intelligence would behave. Can they (collectively) "understand" it any more than that? What question could you pose to test whether they really understood it "from above"?


>> As far as we know, neural networks indeed are at the core of human intelligence.

Right.

Quick intro to neural models, for the only very slightly mathematically inclined.

Neural networks solve this equation:

Y = W ⋅ X

Where X is a matrix of inputs to a function ƒ, W is a matrix of its coefficients, and Y is a matrix of its outputs.

Once that's done for all N instances of X in our dataset, a pass which we term an epoch, Y is used to calculate a measure of error, typically mean squared error:

MSE = 1/N ∑ (Y' - Y)²

Where Y' is the true value of Y that we're trying to approximate. This is repeated until the cows come home, or until the network has converged, whichever comes home first (er, hint).

And that's neural networks in a nutshell. There's nothing much more complicated than that (er, in theory), there's no magic and certainly nothing "like the brain" in them at all. They're just optimisation over systems of functions. We learned how to solve systems of functions with matrix multiplication when we were at high school. I'm pretty sure we did - I've personally repressed the memory but I'm pretty sure I did once have it.
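
In code, the whole thing is a dozen lines of numpy. A minimal version of exactly the setup above (one linear layer, MSE, plain gradient descent as the optimiser):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))            # N instances of inputs
    true_W = np.array([[2.0], [-1.0], [0.5]])
    Y_true = X @ true_W                       # the Y' we approximate

    W = np.zeros((3, 1))
    lr = 0.1
    for epoch in range(200):                  # until convergence/cows
        Y = X @ W                             # Y = W . X
        err = Y - Y_true
        mse = (err ** 2).mean()               # MSE = 1/N sum (Y' - Y)^2
        grad = 2 * X.T @ err / len(X)         # gradient of MSE w.r.t. W
        W -= lr * grad

    print(mse)  # ~0: the "network" has converged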

Why do people insist on calling them "neural"? Who the fuck knows. Way back in the 50s, the perceptron was based on a model of neural activation, where a neuron "fires" when its electrical charge crosses over a threshold. This was originally represented with a binary step function (the very one used in Perceptron classification). That was, at the time, deemed to be "a good model" of neural activation. A good model, my foot. Soon people moved on to sigmoid functions which are "a better model" of neural activation. Whatever.

I've also heard it said that, for example, conv nets (convolutional neural networks) are "like the brain," presumably because if you squint at the layer connection diagrams long enough, conv nets start to look a bit like idealised models of the parts of the brain that handle vision. But if you squint long enough, everything starts to look like everything else. Hell, you might suggest that neural networks are based on the effin' Tree of Life from Kabbalah, and Geoff Hinton is Aleister Crowley reincarnated.

The whole "brain" thing is just an abstraction. You should not read too much into it. And don't listen to the people who try to convince you neural networks are magic- they're not. It's high school maths and a lot of very expensive processing power.

P.S. If my notation doesn't make sense to you, you may need to use a different font in your browser; sorry about that.


The brain has a large number of connected neurons. Which, by definition, is a "neural network."

The artificial neural network model you described works more like a brain than, say, a traditional von Neumann computer. It works less like a brain than, say, HTM, which is another type of an artificial neural network. There are even more biologically plausible NN models (e.g. those used in HBP).

The bottom line is, a large number of connected neurons is at the core of human intelligence. Everything else is just details.


>> The bottom line is, a large number of connected neurons is at the core of human intelligence. Everything else is just details.

That's a great example of the magical thinking that prevails in discussions of neural networks today.

You only need to put your ducks in a row to figure out why it makes no sense. "The brain" you say "has lots of connections". "Therefore, if something has many connections, it's like a brain". I hope that raises a big red flag with bells on for you, 'cause it sure does for me.

Spaghetti carbonara looks a lot like a brain (it's made of fibers and it's an off-white, sticky mess). I've never known it to be intelligent, though maybe it's just too smart for me, eh?


> They're the epitome of artificial stupidity: they learn, but they don't understand anything that they learn. They can't take what they learned and apply it to another domain, not without training again.

People are not smart by that same definition. We rarely - if ever - take something from one domain (say playing chess) and then find ourselves capable of applying it to another domain (say playing 'go') without significant adaptation.

The little bit that gets recycled from one turn based game to the next hardly matters and I highly doubt there is someone out there that is a world champion or master class performer in more than one game using the same set of heuristics.

That these limited AIs are unaware of themselves and mechanical is a different thing altogether.


Imagine that we augmented alphago by placing a system nearby that can introspect alphago and generate abstractions of what occurred, and then provide a human-understandable narrative about alphago's simulations during a game. What if the same narrative system could, either on its own or through human interaction, bring alphago back to previous states or new simulations?

Might this combined system be closer to understanding, logically, what it has learned?

Sometimes I wonder if inside of ourselves we come up with logical reasoning after intuition has pre-illuminated the path forward (based on experience). Or perhaps many different forms of thinking run in parallel, influencing each other. Each of us may have a very, very different combination of instinct, intuition, logic, etc.


I really liked Jeff Atwood's explanation [0], and the DeepMind CEO's talk referenced in his post [1].

[0] http://blog.codinghorror.com/thanks-for-ruining-another-game...

[1] https://www.youtube.com/watch?v=rbsqaJwpu6A


I'm not sure about Atwood's point on computational power:

> If AlphaGo was intimidating today, having soundly beaten the best human Go player in the world, it'll be no contest after a few more years of GPUs doubling and redoubling their speeds again.

In the AlphaGo paper they said their cluster version (with the 170 GPUs Atwood mentions) was the limit of what hardware could do for their algorithms. They tried using much larger clusters, but it didn't really improve performance.

This actually fits his point that for chess, most of the performance gain over the last ten years has come from better algorithms; in particular, better tuning of variables through projects like fishtest (http://stockfishchess.org/).

It seems like many of these problems can only be solved using algorithms which require a certain amount of computing power. However, once we've reached that particular amount of power, further progress really lies in the algorithms.


The proof of the pudding will be in the applications that come from this specific line of research. I predict that there will be folks pointing to anything with an activation network, or a search algorithm, and shouting "look, it works like AlphaGo."

What is the class of problems, beyond Go and closed-world games, which this work attacks?


More like anything with a guided search algorithm will be said to be like AlphaGo. In general I think the field may be warming up to the idea that "whenever there is a randomized way of doing something, then there is a nonrandomized way that delivers better performance but requires more thought."[1] MCTS does exceptionally well at Go all by itself. When coupled with something (in this case trained neural nets) that can give it better-than-random guidance in exploring the state space, it's no surprise it does even better.

[1] http://lesswrong.com/lw/vq/the_weighted_majority_algorithm/


> MCTS does exceptionally well at Go all by itself. When coupled with something (in this case trained neural nets) that can give it better-than-random guidance in exploring the state space, it's no surprise it does even better.

The tricky part is that using the best (rather than merely better-than-random) guidance in the Monte Carlo simulations actually makes the performance worse.

We don't understand very deeply why that is.


> The proof of the pudding will be in the applications that come from this specific line of research.

What is "this specific line of research" though?

Monte Carlo simulations with tree search? This is a bit tricky, because simulating the playout requires a "fixed" world, or something that can be simplified to one.

Deep convolutional neural networks? Those are very broadly applicable, most famously in image recognition; in fact, one reason Google and Facebook were on this is that they are probably already using them a ton...


Would be interested to see a rematch between Lee and AlphaGo. The analogy between tricking image recognition with specific artifacts and finding weak points or flaws in other AI applications in the article is great; given enough games I'm confident Lee will find the flaws and be able to reliably exploit them.


AlphaGo is not static either; it also improves. Lee would improve his game a lot by playing AlphaGo regularly, but AlphaGo in the same time period would probably improve severalfold. At the end of the day, AlphaGo has probably changed the game of Go forever, as it has already taught Lee other ways to think about Go.


Does the classic "wear funky makeup" trick to fool face recognition even work against recent deep learning advances? I'd doubt it's nearly as effective.

If you're referring to the recent "deep learning classifies this random noise as a dog" news, keep in mind that humans are far more suited to abstract tricks like wearing makeup than to learning the perfect static to fool an image recognition algorithm. Even with computer assistance and access to AlphaGo in order to train the perfect pathological boards, the high branching factor of Go makes it impossible for a human to remember what the relevant pathological board would be halfway through the game.

(Also, keep in mind that the adversarial reinforcement learning is essentially training against as many pathological boards as possible. Image recognition isn't trained this way.)

So I don't think you're right about this. I feel like the era of "makeup" tricks is finished (though maybe there's still one or two wins left from this approach). If Lee beats AlphaGo, I think it's because their strengths are truly similar.


I am concerned... OK the AlphaGo AI plays better than any human, but does it understand why?

When I learn Go, there are all sorts of techniques, such as joseki, life-and-death, and tesuji. AlphaGo implicitly learned these techniques, since it learned from professional games played by experts in those topics.

This is my problem with neural networks in general: they give no account of how Go boards divide into smaller problems, or of heuristics for putting them back together.

To be corny: Go is really a teaching tool for how people think logically and abstractly. It's not about teaching computers.


> This is my problem with neural networks in general, since they do not discuss how go boards divide into smaller problems or heuristics for putting them back together.

This strategy is called "chunking", and is an adaptation humans have developed to solve complex problems. There's no reason to artificially constrain AI to use the same crutches we do.


The problem I have being wowed by AlphaGo is that games like Go and chess are so inscrutable to me that I cannot tell the difference between a game played by two strong players and a game played by players making random legal moves.

I have never won a game of chess in my entire life. Not against any human player or any Commodore 64 chess program on the easiest setting. I don't expect I ever will. I know how the pieces move and that's about it.

Watson playing Jeopardy was different. That I could "get", and that impressed me.


One quick nitpick on your well written and helpful article: it's easily conceivable that Lee Sedol has reviewed on the order of 100k games.

I think your general point stands, AI learns slowly in terms of number of iterations (consider how many times a five year old needs to be told the pronunciation of an unusual word or what emotion a certain tone conveys vs training a neural net).


> it's easily conceivable that Lee Sedol has reviewed on the order of 100k games.

...but highly unlikely. In the best case, that works out to something like reviewing 10 new games every single day. If a game takes 1 hour to properly review, and a person needs 8 hours of sleep, that leaves just 6 hours in a day to eat, manage hygiene, exercise, be with family, travel to competitions, play yourself and do all kinds of other studies you need to.


Okay, fair enough -- but it's definitely not 10k games. Asian pros work very, very hard. It's like any sport; time spent matters, and the best players put in a lot of time.


It's probably not 10k, but it's on the order of 10k, i.e. somewhere between 10,000 and 99,999.


>To get over this hurdle, the developers’ core idea was for AlphaGo to play the policy network against itself, to get an estimate of how likely a given board position was to be a winning one. That probability of a win provided a rough valuation of the position.

This part is confusing, in my opinion. It makes it sound like AlphaGo is not using Monte Carlo tree search, which is mentioned earlier. However, that is exactly what AlphaGo is doing, except with the policy network in place of previous methods for initializing move priors, and the value network in place of traditional methods for terminating and scoring simulations.
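
Concretely, the two networks plug into the search something like this (a simplified PUCT-style sketch; Node, value_net and fast_rollout are stand-in names, not DeepMind's code):

    import math
    from dataclasses import dataclass

    @dataclass
    class Node:
        prior: float        # policy network's probability for this move
        visits: int = 0
        value_sum: float = 0.0

    def select_score(child, parent_visits, c_puct=5.0):
        # Mean value of simulations through this child, plus an
        # exploration bonus scaled by the policy net's prior: this is
        # where the "move priors" enter the tree search.
        q = child.value_sum / child.visits if child.visits else 0.0
        u = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
        return q + u

    def evaluate_leaf(state, value_net, fast_rollout, lam=0.5):
        # Instead of scoring a simulation only by playing it out,
        # AlphaGo mixed the value network's estimate with a fast
        # rollout result (the paper weighted the two about equally).
        return (1 - lam) * value_net(state) + lam * fast_rollout(state)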


If I read their papers right, they don't actually use the RL policy network in the tree search, because it played worse when combined with MCTS than the SL one, even though it was stronger without tree search.

(But you need the tree search to make a strong player)


Coincidentally, Quora.com just had a knowledge prize contest to answer "How Significant is AlphaGo?".

https://www.quora.com/How-significant-is-the-AlphaGo-victory...

I thought that many of the answers were quite insightful.

Disclaimer: I also wrote an answer, but the submission deadline for the knowledge prize is over.


It's a big deal because AIs can now beat humans at all perfect-information games. And NNs, and maybe even MCTS, are applicable to imperfect-information games; soon those will be beaten too. Then what? The list of things humans are better at is narrowing.


As we see, the OP claims:

> We have learned to use computer systems to reproduce at least some forms of human intuition.

Hmm .... Let's examine that claim:

To set aside the game of Go, we start with an analogy: A human can walk 10 miles to town, and a car can help one drive the same 10 miles to town. But what a car does to get to town is very different from what a walking human does. Both approaches get to town, and the car is much better in various ways, but this does not mean that a car walked like a human. To be clear, specifically the car did not "reproduce" the way a human gets to town; e.g., the car did not have two legs. Both the human and the car got to town? Yes. The car "reproduced" how a human got to town on two legs? No.

Similarly, a motor boat does not "reproduce" some forms of human swimming.

This pattern of using science, engineering, etc. to get results comparable with those of unaided humans is very old with many great examples from especially the last 100 years.

Further, on "intuition", there are many tasks that humans do with human intuition where science, engineering, and computing do comparably well, but that does not mean that the science, etc. has "reproduced" human intuition.

Similarly, on the game of Go, AlphaGo got some game successes that humans get via human intuition, but that does not mean that AlphaGo "reproduced" human intuition, not in general and not even in the specific case of the game of Go.

Early in IBM's mainframe computing, in the common publicity, there was a related claim that IBM's computers were "gigantic electronic human brains". Those computers were no more "human brains" than Henry Ford's Model T was gigantic mechanical human legs. A car is not human legs, and an early IBM computer was not a human brain.

IMHO, there is an important point here: From the publicity about IBM's early computers to the claim in the OP I quoted above, there are claims that computing and computer science are reproducing some of how humans do things. But so far this claim is false, seriously false, not even close to the truth: Computers are no closer to reproducing human intelligence or "intuition" than Ford's Model T was to reproducing human legs.

Instead, apparently AlphaGo is some high-dimensional data fitting that yields a classification system; that is, given a board position in Go, AlphaGo reports 0 for bad for the AlphaGo player and 1 for good for the AlphaGo player. So, they have a classifier.

But classifiers go way back, to Leo Breiman's random forests, classification and regression trees (CART), logistic regression, a huge range of statistical tests, e.g., in some cases of advanced radar target detection, and much more. Many such solutions have been very valuable, and maybe AlphaGo or its techniques will also be valuable, but none of this in any sense reproduces human intuition or really anything about humans -- not yet, not even close.

Net, while it's a free country and people can disagree, which is one role of HN, IMHO, the quote I gave above from the OP is scientifically irresponsible. It just isn't really true. At best it is just hype.


Somehow, I feel that the big deal here is the cleverness of using words like "neurons", "deep learning" and "human intuition" (the "intuition" here appears to be no more than simply taking the route that gives the max probability of success).


I think you're greatly underestimating how difficult it is to come up with an algorithm to play Go.

Even evaluating a single play state is a difficult problem, let alone looking forwards to 'simply taking the route giving max probability of success'. That's not simple at all.

Go is fiendishly difficult to evaluate. The traditional techniques that work in chess don't work well in Go (if at all). All the pieces have the same 'value'. More or fewer pieces isn't necessarily better or worse; it depends on the board layout. It's not at all easy to say who's winning or losing at most points in the game, or if there even is a winner or a loser. There is no mathematically obvious way to compare two play states and say one is better than the other. The branching factor is huge, making the usual forward-looking techniques extraordinarily expensive. The number of possible legal board states is something like 90 orders of magnitude greater than the number of atoms in the observable universe, making backward-looking techniques like minimax impossible, even if you could easily assign an evaluation function to each state.


It's not at all easy to say who's winning or losing at most points in the game, or if there even is a winner or a loser.

You play out the game to the end.

making backward-looking techniques like minimax impossible, even if you could easily assign an evaluation function to each state.

That's exactly how AlphaGo works though. Read this paper, it was the landmark result that showed how to do this: http://www.remi-coulom.fr/CG2006/

AlphaGo uses the same technique, just optimized a bit more.


You play out the game to the end.

Nope. You could play out to one or several ends, but there is a huge number of possible end games at any given point. Simply saying 'play out the game to the end' is a gross simplification.

The average branching factor is something like 200, so fully evaluating just 5 moves ahead would require traversing 320 billion game states. Go games run 100-200 moves per game, so 5 moves ahead doesn't even get you close to the end of a game. At the first move there are 208168199381979984699478633344862770286522453884530548425639456820927419612738015378525648451698519643907259916015628128546089888314427129715319317557736620397247064840935 legal board states you'd have to traverse to reach all end games. Yes, that's an exact figure. (http://tromp.github.io/go/legal.html)

How big is that figure? If you counted up the number of fundamental particles in the entire universe, and for each fundamental particle there was another complete universe of fundamental particles and you counted all of those too, and then did all of that once for every person in the USA and added up all the answers, you'd be at about 3% of the way to that value.

You simply cannot do a full evaluation to the end of the game: it's computationally infeasible.
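A quick back-of-the-envelope check of those figures (assuming the stated average branching factor of 200):

    # 5-move lookahead at branching factor 200:
    print(200 ** 5)               # 320000000000, i.e. 320 billion
    # Order of magnitude of a full 150-move game tree:
    print(len(str(200 ** 150)))   # ~346 digits, dwarfing the ~10^80 atoms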

That's exactly how AlphaGo works though (minimax)

No, it's not. Minimax requires working back from all end-game states, and there are too many end-game states to evaluate. I suggest you read the paper yourself. It's about doing random partial evaluations and forward-looking searches of the game space, choosing clever subsets to evaluate, precisely because you can't evaluate them all. The 'backing up' referred to in the paper is backing evaluation values up the tree after you've looked ahead a while, not starting at all possible end games and backing up from there, as in classic minimax. Some of the basic concepts are similar to classic tic-tac-toe or chess algorithms, but they are combined in new ways.
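To illustrate the loop the paper describes -- select, expand, run a random playout, back the result up the tree -- here's a minimal UCT sketch over a toy game (one-pile Nim: take 1-3 stones, last stone wins). The game and all the names are invented for illustration; a real Go engine is vastly more elaborate, but the shape of the algorithm is the same:

    import math, random

    class Node:
        def __init__(self, stones, player, parent=None):
            self.stones, self.player, self.parent = stones, player, parent
            self.children = {}          # move -> Node
            self.visits, self.wins = 0, 0.0

    def moves(stones):
        return range(1, min(3, stones) + 1)

    def rollout(stones, player):
        # Random playout; returns whoever takes the last stone.
        while stones > 0:
            stones -= random.randint(1, min(3, stones))
            if stones == 0:
                return player
            player = 1 - player
        return 1 - player               # already terminal: previous mover won

    def uct_search(root_stones, iterations=5000):
        root = Node(root_stones, 0)
        for _ in range(iterations):
            node = root
            # 1. Selection: descend via UCB1 while fully expanded, non-terminal.
            while node.stones > 0 and len(node.children) == len(moves(node.stones)):
                node = max(node.children.values(),
                           key=lambda c: c.wins / c.visits
                               + math.sqrt(2 * math.log(node.visits) / c.visits))
            # 2. Expansion: add one untried child if not terminal.
            if node.stones > 0:
                m = random.choice([m for m in moves(node.stones)
                                   if m not in node.children])
                node.children[m] = Node(node.stones - m, 1 - node.player, node)
                node = node.children[m]
            # 3. Simulation: random playout from the new node.
            winner = rollout(node.stones, node.player)
            # 4. Backpropagation: back the result up to the root.
            while node is not None:
                node.visits += 1
                if winner == 1 - node.player:  # win for whoever moved here
                    node.wins += 1
                node = node.parent
        return max(root.children, key=lambda m: root.children[m].visits)

    print(uct_search(21))  # taking 1 from 21 leaves a losing multiple of 4

Note the direction: forward from the current position, sampled playouts at the leaves, statistics backed up the tree -- not a backward sweep from all terminal states.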

AlphaGo uses the same technique, just optimized a bit more.

And a fighter jet uses the same technique as a paper aeroplane, only optimized a bit more...


You simply cannot do a full evaluation to the end of the game: it's computationally infeasible.

You can sample several randomly played-out games. I think you knew this, and I hope you understand that sampling works, but for some reason you went off on a completely irrelevant tangent.
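On a toy game (one-pile Nim, take 1-3 stones, last stone wins -- invented purely for illustration), 'play out the game to the end' as a sampled estimate looks like this:

    import random

    def random_playout(stones, player):
        # Random moves to the end; returns whoever takes the last stone.
        while stones > 0:
            stones -= random.randint(1, min(3, stones))
            if stones == 0:
                return player
            player = 1 - player

    def estimate_win_rate(stones, player, samples=10000):
        # Monte Carlo estimate of player 0's win rate -- no exhaustive search.
        return sum(random_playout(stones, player) == 0
                   for _ in range(samples)) / samples

    print(estimate_win_rate(21, 0))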

Minimax requires working back from all end-game states

Practical implementations that need to achieve good but not perfect play are content to stop at a point before that.

Again, do you know this and are you intentionally misunderstanding me and throwing up irrelevant facts? I'm sure you do, given that you even acknowledged the above (no need for terminal states) in your previous post.

The difficulty of Go was that there was no way to combine tree search with the uncertainty inherent in Monte Carlo samples at a low sample count. The paper I mentioned fixed this, and hence was THE major breakthrough. The resulting algorithm is still quite simple, yet plays arbitrarily strongly -- in practice, too -- when given more computing resources.

It makes a minimax-like algorithm work for Go because it estimates the terminal results by Monte Carlo sampling the results of the games when played out by a random (or in practice, not entirely random) player.
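As a sketch of that combination (toy one-pile Nim again; everything here is invented for illustration): a depth-limited negamax whose leaf values come from averaged random playouts instead of a handcrafted evaluation function.

    import random

    def playout_value(stones, player, samples=200):
        # Estimated win probability for `player` from random playouts.
        wins = 0
        for _ in range(samples):
            s, p = stones, player
            while s > 0:
                s -= random.randint(1, min(3, s))
                if s == 0:
                    wins += (p == player)
                    break
                p = 1 - p
        return wins / samples

    def minimax_mc(stones, player, depth):
        # Negamax with Monte Carlo leaf evaluation; value for `player`.
        if stones == 0:
            return 0.0                  # previous mover took the last stone
        if depth == 0:
            return playout_value(stones, player)
        return max(1 - minimax_mc(stones - take, 1 - player, depth - 1)
                   for take in range(1, min(3, stones) + 1))

    best = max(range(1, 4), key=lambda t: 1 - minimax_mc(21 - t, 1, depth=2))
    print(best)     # taking 1 from 21 leaves the opponent a multiple of 4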


I think the problem is that what you meant is very different from what you actually wrote.

Practical implementations that need to achieve good but not perfect play are content to stop at a point before that.

Indeed, but that's not the basic minimax algorithm (which is what I was discussing), and the branching factor of Go makes classic minimax infeasible even if you modify it to be forward-looking. Basic minimax is not good for Go. At all. You have to combine it with other techniques for it to be even vaguely effective. You know this, but didn't mention it. So how is anyone supposed to infer that you know it?

I can see now that you have a decent understanding of the actual solution, but that's not at all what you wrote above.

For example, you said that to evaluate a game board you need to 'play the game through to the end' -- which hides all the complexity of Monte Carlo tree evaluation, etc. Read what you wrote again from the point of view of someone who isn't yourself.

Simply saying 'play the game through to the end' by itself is neither correct nor accurate, because it implies evaluating the rest of the game to its conclusion, i.e. every board state. You omitted any other detail or reference. So I described how and why that's not feasible (which you now discount as going off on a tangent). You gave no prior indication of understanding this.

Another example: in a previous comment I described how backward looking techniques like the minimax algorithm are not feasible for go, and you said:

"That's exactly how AlphaGo works though".

But it isn't at all:

Classic minimax is backward-looking (starts at end states and works backwards -- this is what I was talking about, and what I described); AlphaGo (and the paper you linked to) is forward-looking (starts at the current root and works forwards).

Classic minimax is full-tree evaluation; AlphaGo is partially-evaluating.

Minimax is deterministic; AlphaGo uses stochastic/random sampling.

Minimax uses a simple single-value single-direction evaluation function; AlphaGo uses a complex bi-directional multi-value evaluation.

I could go on.

Read any treatise or introduction to AI for Go, and pretty much the first chapter will describe how classic minimax is not suitable without heavy modification.

e.g. https://www.youtube.com/watch?v=5oXyibEgJr0

AlphaGo uses some of the same concepts, but applied very differently, as you appear to already know. Yet after I mentioned minimax (which is well-known as an infeasible algorithm for go), you responded:

"That's exactly how AlphaGo works though".

Which is not at all true.

And then you link to a paper that describes precisely why that's not how it works, and why minimax-related techniques alone don't work, in stark contradiction to yourself.

I'm not intentionally misinterpreting what you wrote: I'm taking a reasonable interpretation of what you actually wrote, which appears to show a lack of understanding on your part. I don't think there is a lack of understanding now, but it's hard to come to a different conclusion based on what you wrote before.


If you guys like this, check out my '4 Reasons Why AlphaGo Is a Huge Deal' video https://www.youtube.com/watch?v=2QhVarCzscs



