
> Early methods conceived of vision as searching for edges, or generalized cylinders, or in terms of SIFT features. But today all this is discarded. Modern deep-learning neural networks use only the notions of convolution and certain kinds of invariances, and perform much better.

This assessment is a bit off.

First, convolution and invariance are definitely not the only things you need. Modern DL architectures use lots of very clever gadgets inspired by decades of interdisciplinary research.

Second, architecture still matters a lot in neural networks, and domain experts still make architectural decisions heavily informed by domain insights into what their goals are and what tools might make progress towards these goals. For example, convolution + max-pooling makes sense as a combination because of historically successful techniques in computer vision. It wasn't something randomly tried or brute forced.
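To make the convolution + max-pooling combination concrete, here is a minimal NumPy sketch (all function names are mine, for illustration): convolution detects a local pattern wherever it occurs, and max-pooling then keeps only the strongest response in each neighborhood, which is what buys the small translation invariance.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """2-D 'valid' correlation (what DL libraries call convolution)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max-pooling: keeps the strongest local response."""
    h, w = x.shape
    h, w = h - h % size, w - w % size  # trim to a multiple of the pool size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

# A hand-crafted vertical-edge detector, of the kind CNNs rediscover on their own
edge_kernel = np.array([[-1., 0., 1.],
                        [-1., 0., 1.],
                        [-1., 0., 1.]])

image = np.zeros((8, 8))
image[:, 4:] = 1.0                    # left half dark, right half bright
feat = conv2d_valid(image, edge_kernel)   # fires only at the edge (value 3)
pooled = max_pool(feat)                   # edge survives pooling, position coarsened
```

The design rationale is the one the parent describes: the conv filter encodes a domain insight (local edges matter), and pooling encodes another (their exact position matters less than their presence).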

The role of domain expertise has not gone away. You just have to leverage it in ways that are lower-level, less obvious, and less explicitly connected to the goal than a human would expect from high-level conceptual reasoning.

From what I've heard, the author's thesis is most true for chess. The game tree for chess isn't as vast as Go's, so chess is more amenable to brute force. The breakthrough in Go came not from Moore's Law but from innovative DL/RL techniques.

More computation may enable more compute-heavy techniques, but that doesn't mean it's obvious what those techniques are, or that they are well characterized as simpler or more "brute force" than past approaches.



> First, convolution and invariance are definitely not the only things you need. Modern DL architectures use lots of very clever gadgets inspired by decades of interdisciplinary research.

i have noticed this. rather than replacing feature engineering, it seems that you find some of those ideas from psychophysics just manually built into the networks.


Curious what you're referring to? My knowledge of this area is not that broad or deep at all.


The weight patterns that convolutional neural networks develop are pretty familiar in many ways. For example, the first layer will generally end up with small-scale feature detectors, such as edges, gradient/color pairs, and certain textures, at various scales and angles.

Try an image search for "imagenet first layer" to see examples.

I took the comment to mean "we have ourselves discovered certain filters being useful (e.g. https://en.wikipedia.org/wiki/Gabor_filter), and the networks now also discover this same information".
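For a concrete picture of what those rediscovered filters look like, here is a sketch of a Gabor kernel in NumPy (a sinusoid windowed by a Gaussian; parameter names and defaults are mine, chosen for illustration). Filters like these were hand-derived from psychophysics and V1 neuroscience, and first-layer CNN weights empirically end up resembling them.

```python
import numpy as np

def gabor_kernel(size=11, wavelength=4.0, theta=0.0, sigma=2.5, phase=0.0):
    """Real part of a Gabor filter: a cosine carrier under a Gaussian envelope.
    theta sets the orientation, wavelength the spatial frequency."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_t = x * np.cos(theta) + y * np.sin(theta)      # rotate coordinates
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * x_t / wavelength + phase)
    return envelope * carrier

# A small filter bank at four orientations, like a CNN's first layer often learns
bank = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

Comparing images of these kernels with the "imagenet first layer" visualizations mentioned above makes the resemblance clear.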


it is true that dl delivers on its promise of rediscovering some handcrafted features. it is also true that (at least the last time i checked) the state of the art still makes use of handcoded transforms derived from results in psychophysics.


mel-warped cepstra, which still see use and still improve performance for nns, are one example.
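To illustrate the mel warping referenced above: the mel scale is a psychophysically derived frequency warping (roughly linear below 1 kHz, logarithmic above), and mel features are typically computed by pooling an FFT spectrum through triangular filters spaced evenly in mel. A minimal NumPy sketch, with my own illustrative function names and parameter choices:

```python
import numpy as np

def hz_to_mel(f):
    """Standard mel warping (O'Shaughnessy formula)."""
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters=10, n_fft=512, sr=16000):
    """Triangular filters with peaks evenly spaced on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):                 # rising slope
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):                # falling slope
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb
```

Applying such a filterbank to a power spectrum (then taking logs and a DCT) yields the mel cepstra in question: hand-coded structure, derived from human hearing experiments, fed into the network rather than learned by it.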




