
The "right" way is to take endless numbers of videotapes of what's happening outside the window, and feed them into the biggest and fastest computer, gigabytes of data, and do complex statistical analysis -- you know, Bayesian this and that -- and you'll get some kind of prediction about what's gonna happen outside the window next. In fact, you get a much better prediction than the physics department will ever give. Well, if success is defined as getting a fair approximation to a mass of chaotic unanalyzed data, then it's way better to do it this way than to do it the way the physicists do, you know, no thought experiments about frictionless planes and so on and so forth. But you won't get the kind of understanding that the sciences have always been aimed at -- what you'll get at is an approximation to what's happening.

Chomsky seems to keep using naïve models as a strawman, and Norvig rightly calls him on it. If you use simple models, you can only get simple insights, but statistical machine translation (for example) builds probabilistic context-free grammars, which map human notions of language far better than "make sure every three words in sequence is plausible".
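To make the contrast concrete, here's a minimal sketch in Python. The grammar, corpus, and rule probabilities are toy values invented for illustration (real SMT systems learn far larger grammars from data); it contrasts the "every three words in sequence is plausible" test with a tiny PCFG scored by CKY parsing.

```python
from collections import defaultdict

# --- Trigram view: a sentence is "plausible" if every 3-word window
# --- has been seen in the corpus before. (Toy two-sentence corpus.)
corpus = ["the dog chased the cat", "the cat saw the dog"]
trigrams = set()
for sent in corpus:
    w = sent.split()
    trigrams.update(zip(w, w[1:], w[2:]))

def trigram_plausible(sentence):
    w = sentence.split()
    return all(t in trigrams for t in zip(w, w[1:], w[2:]))

# --- PCFG view: probability comes from parse structure, not word windows.
# Toy grammar in Chomsky normal form; probabilities are made up.
binary = {("NP", "VP"): [("S", 1.0)],     # S  -> NP VP
          ("Det", "N"): [("NP", 1.0)],    # NP -> Det N
          ("V", "NP"):  [("VP", 1.0)]}    # VP -> V NP
lexical = {"the": [("Det", 1.0)],
           "dog": [("N", 0.5)], "cat": [("N", 0.5)],
           "chased": [("V", 0.6)], "saw": [("V", 0.4)]}

def cky(words):
    """Return the probability of the best parse rooted at S (0.0 if none)."""
    n = len(words)
    chart = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for sym, p in lexical.get(w, []):
            chart[i][i + 1][sym] = max(chart[i][i + 1][sym], p)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            for j in range(i + 1, k):
                for b, pb in chart[i][j].items():
                    for c, pc in chart[j][k].items():
                        for a, pr in binary.get((b, c), []):
                            chart[i][k][a] = max(chart[i][k][a], pr * pb * pc)
    return chart[0][n].get("S", 0.0)
```

The point of the toy: "the cat chased the dog" never appeared in the corpus, so the trigram test rejects it, while the PCFG happily assigns it a parse (probability 0.15 here) because it has the right structure. That's the sense in which grammar-based statistical models capture something trigram counting can't.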



Chomsky is agreeing that it's making a map. He just doesn't think that map is very useful on a scientific level, though he concedes it is useful on an engineering level.

You're responding that he's wrong, because it's useful on an engineering level.

Right? I'm reading many comments here and they seem to keep boiling down to this notion. Am I wrong?


At the top level, I think this captures it.

If, like Chomsky, you value having a model of the underlying cognition process rather than a set of black-box predictors for aspects of that problem (e.g., various corpus-driven translators), then you might be really annoyed that the black-box people are so satisfied with their results.


The Church was angry that the sun didn't revolve around the Earth. We don't get to pick the prettiest models. Right makes right.


I object to your glibness. Probably both methods (first-principles cognitive modeling vs. high-degree-of-freedom black box learning) will prove informative, just in different ways.

Or in your terms, we may not get to pick the prettiest models, but we owe it to ourselves to explore the space of models to see if we can find the structure in it.

The engineer in me is pleased by the undoubted success the data-driven learning culture has had on problems of real importance. But this work is highly empirical, with a tendency toward point solutions, and someone is likely to come in later on and generalize these methods (e.g., why do some families of black-box predictors or feature sets outperform others for language learning). There's room for both approaches.

Norvig's reply to Chomsky's original remark contains a reference to Leo Breiman's well-informed remarks on this question (http://projecteuclid.org/DPubS?service=UI&version=1.0...).

Breiman, as author of basic books on measure theory as well as on classification trees, was able to walk both sides of this line ("make a first-principles model" vs. "use lots of data"). He spent considerable energy over the years trying to introduce the data-intensive approach to conventional statistics. For instance, he was one of the handful of bona fide statisticians who would attend and contribute to neural net and machine learning conferences. Probably this strategy is more productive than Chomsky's grumpy-old-man warnings (or sagacious warnings, depending on how you look at it).


I think by default you'll find a disproportionate number of critics of Chomsky here. Some who understand what this is about are more likely to be engineers and favor the engineering approach. Others who don't, saw Norvig's name and by default jumped to that side of the argument.


> If you use simple models, you can only get simple insights

Economics also plays a large part in how information is parsed. The advancement of AI outside of academia is largely dependent on what it's being used for and how it's being used. Great strides are being made in search because it can be monetised and the computational power required is commensurate with the number of users/frequency of use and ROI. A complex model that can provide better insights but limits the number of concurrent users isn't as useful in a commercial sense.



