In many cases the simple ensemble is more fragile because you have to keep track of multiple things at once. The winning solution for the Netflix Prize competition was an ensemble so complex that it never got used in production. Also, when you update your models you have to tune them individually, then manually re-tune the ensemble weights.
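To make that concrete, here's a minimal sketch (all names and numbers made up) of the extra knob a two-model ensemble carries: a blend weight that has to be re-searched on validation data every time either model is retrained:

    import numpy as np

    def blend(pred_a, pred_b, w):
        # weighted average of two independently trained models' predictions
        return w * pred_a + (1 - w) * pred_b

    # stand-in validation predictions from two separately trained models
    pred_a = np.array([0.2, 0.8, 0.6])
    pred_b = np.array([0.4, 0.7, 0.3])
    labels = np.array([0.0, 1.0, 1.0])

    # the "extra thing to keep track of": re-tuning w whenever either model changes
    best_w = min(np.linspace(0, 1, 11),
                 key=lambda w: np.mean((blend(pred_a, pred_b, w) - labels) ** 2))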
Another advantage of joint learning (which the authors mention) is that the individual components need not be as big as they would be if trained independently, since they complement each other. Though the joint model as a whole will surely be bigger than either of its halves.
They (reasonably) claim the joint model doesn't have to be as big, but it would be interesting to see, for example, an ensemble of two models: a wide model the same size as the wide half of the joint model, and a deep model the same size as the deep half of the joint model.
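For contrast, here's a minimal PyTorch sketch (layer sizes made up) of what joint training looks like: both halves feed a single output and a single loss, so gradients flow into each half only for what the other misses:

    import torch
    import torch.nn as nn

    class WideAndDeep(nn.Module):
        """Jointly trained wide (linear) + deep (MLP) halves; one loss, one optimizer."""
        def __init__(self, num_features):
            super().__init__()
            self.wide = nn.Linear(num_features, 1)   # wide half: linear model
            self.deep = nn.Sequential(               # deep half: small MLP on the same input
                nn.Linear(num_features, 32), nn.ReLU(),
                nn.Linear(32, 1),
            )

        def forward(self, x):
            # logits are summed before the sigmoid, so both halves
            # are trained against the same loss at the same time
            return torch.sigmoid(self.wide(x) + self.deep(x))

    model = WideAndDeep(num_features=16)
    opt = torch.optim.Adam(model.parameters())       # one optimizer covers both halves
    loss_fn = nn.BCELoss()

    x, y = torch.randn(8, 16), torch.randint(0, 2, (8, 1)).float()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

The ensemble version of the experiment above would train the wide and deep halves separately and then blend their outputs, which is exactly where the extra tuning knob comes back in.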