But the article used a linear model to demonstrate the curse, and the model was already overfit with just 3 dimensions. Something is clearly missing from that picture: for text data, for example, it is not uncommon to have thousands or even hundreds of thousands of dimensions, and algorithms still work fine.
I think the missing piece is regularisation. It doesn't have to perform feature selection and actually reduce the number of dimensions, but you're right that using L1 for such data is usually a good idea.
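A minimal sketch of that idea (not from the article; the synthetic data and parameters here are purely illustrative), using scikit-learn's logistic regression with an L1 penalty on "text-like" data with far more features than samples:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative "text-like" data: many features, few of them informative.
X, y = make_classification(n_samples=500, n_features=10_000,
                           n_informative=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The L1 penalty shrinks most coefficients to exactly zero,
# which acts as implicit feature selection.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X_train, y_train)

print("test accuracy:", clf.score(X_test, y_test))
print("nonzero coefficients:", np.count_nonzero(clf.coef_))
```

Despite p >> N, the penalised model generalises because only a small fraction of the coefficients survive the fit.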
The article had very few data points; that's why the overfitting showed up with only 3 dimensions. The deciding factor is how N (the effective number of data points) compares with p (the effective number of features).
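A quick way to see the N-vs-p effect (a toy sketch, not the article's setup): fit ordinary least squares to pure noise and compare a case where p is close to N against one where N >> p:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

def train_test_r2(n, p):
    # Pure-noise features and targets: any fit to the
    # training data is, by construction, overfitting.
    X, y = rng.normal(size=(n, p)), rng.normal(size=n)
    X_test, y_test = rng.normal(size=(n, p)), rng.normal(size=n)
    model = LinearRegression().fit(X, y)
    return model.score(X, y), model.score(X_test, y_test)

# p close to N: the model nearly interpolates the noise
# (train R^2 close to 1, test R^2 badly negative).
print("N=20,   p=15:", train_test_r2(20, 15))
# N >> p: both train and test R^2 stay near zero, as they should.
print("N=2000, p=15:", train_test_r2(2000, 15))
```

The same p=15 is harmless or catastrophic depending entirely on N, which is the point: dimensionality alone doesn't decide anything.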