
I'm not sure that's true in general, or even frequently. In fact, I'd say it's provably false in general.

The big issue is that you get MANY more curve-fitting parameters to play with if you use a piecewise linear model vs. an exponential model. (You get to choose HOW MANY breaks to make, WHERE to make the breaks, and what the slope is for each section.)

So... Let's say you create some synthetic data using an underlying exponential plus normally distributed noise. Obviously, the BEST predictive model is an exponential one. However, for any number of observations, I guarantee there is trivially at least one piecewise linear model with less fitting error than the exponential one: the one that simply draws a straight line between EVERY pair of adjacent points. That model has exactly zero error on the observed data, which the exponential fit can't match. Yet it has very little predictive power compared to the exponential model.
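To make that concrete, here's a minimal numpy/scipy sketch (the growth rate, noise level, and sample sizes are made up for illustration): the connect-every-point interpolant has zero training error by construction, but the two-parameter exponential fit wins on fresh draws from the same process.

  import numpy as np
  from scipy.optimize import curve_fit

  rng = np.random.default_rng(0)

  # Synthetic data: an underlying exponential plus Gaussian noise.
  x_train = np.sort(rng.uniform(0, 5, 30))
  y_train = np.exp(0.8 * x_train) + rng.normal(0, 2, x_train.size)
  x_test = np.sort(rng.uniform(0, 5, 30))
  y_test = np.exp(0.8 * x_test) + rng.normal(0, 2, x_test.size)

  # Exponential model: just two parameters.
  def expo(x, a, b):
      return a * np.exp(b * x)

  (a, b), _ = curve_fit(expo, x_train, y_train, p0=(1.0, 1.0))

  def mse(y, yhat):
      return np.mean((y - yhat) ** 2)

  # Degenerate piecewise-linear model: a straight line between every
  # pair of adjacent training points, via np.interp.
  print("train MSE, exp:   ", mse(y_train, expo(x_train, a, b)))
  print("train MSE, interp:", mse(y_train, np.interp(x_train, x_train, y_train)))  # exactly 0.0
  print("test MSE, exp:    ", mse(y_test, expo(x_test, a, b)))
  print("test MSE, interp: ", mse(y_test, np.interp(x_test, x_train, y_train)))    # typically worse

The gap grows with the noise level: the interpolant chases every noisy point, while the exponential averages over them.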

Now, that's not what was done here... but there are actually quite a few parameters in the form of how many breaks to make and where to make them. It doesn't seem like a fair comparison.



Good point.

The paper does cross-validate the models, and I'm told that cross-validation properly penalizes models that overfit with too many parameters… but I don't understand the statistics well enough to judge.
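For what it's worth, here's a rough sketch of the mechanism as I understand it (a toy k-fold setup with made-up models, not the paper's actual procedure): each model is scored only on points held out of the fit, so a model that memorizes the training points gets no credit for that on the fold it never saw.

  import numpy as np
  from scipy.optimize import curve_fit

  def expo(x, a, b):
      return a * np.exp(b * x)

  def fit_exp(x_tr, y_tr, x_te):
      (a, b), _ = curve_fit(expo, x_tr, y_tr, p0=(1.0, 1.0))
      return expo(x_te, a, b)

  def fit_interp(x_tr, y_tr, x_te):
      # Connect-every-point model: zero error on x_tr, but that never
      # helps here, because it is only ever scored on x_te.
      return np.interp(x_te, x_tr, y_tr)

  def cv_mse(x, y, fit_predict, k=5):
      # Mean squared error on the held-out fold, averaged over k folds.
      folds = np.array_split(np.arange(x.size), k)
      errs = []
      for held_out in folds:
          tr = np.setdiff1d(np.arange(x.size), held_out)
          yhat = fit_predict(x[tr], y[tr], x[held_out])
          errs.append(np.mean((y[held_out] - yhat) ** 2))
      return np.mean(errs)

  rng = np.random.default_rng(0)
  x = np.sort(rng.uniform(0, 5, 50))
  y = np.exp(0.8 * x) + rng.normal(0, 2, x.size)

  print("CV MSE, exp:   ", cv_mse(x, y, fit_exp))
  print("CV MSE, interp:", cv_mse(x, y, fit_interp))  # overfitting shows up as a worse score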



