aidanrocke's comments | Hacker News

Whenever the discussion of liberty versus equality of outcomes arises, there is a tendency to create a false dichotomy in which we are forced to choose between mutually exclusive extremes. This is a mistake; the better approach is to find a reasonable compromise among the competing tradeoffs.


From the abstract:

'The brain carries out tasks that are very demanding from a computational perspective, apparently powered by a mere 20 Watts. This fact has intrigued computer scientists for many decades, and is currently drawing many of them to the quest of acquiring a computational understanding of the brain. Yet, at present there is no productive interaction of computer scientists with neuroscientists in this quest.'


For more information, check this brilliant thread by Michelle Kendall: https://twitter.com/mishkendall/status/1239946522390867969


This article uses a Julia implementation to explain how the complex-step method allows us to easily compute derivatives of analytic functions to machine precision.

Possibly the best short blog post on this subject on the internet.
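Since the thread itself contains no code, here is a minimal sketch of the idea in Python (the linked post uses Julia); the function names and test function are my own, chosen only for illustration:

  import numpy as np

  def complex_step_derivative(f, x, h=1e-20):
      """Approximate f'(x) by Im(f(x + ih)) / h.

      There is no subtraction of nearly equal quantities, so h can be taken
      far below sqrt(machine epsilon) without cancellation error."""
      return np.imag(f(x + 1j * h)) / h

  def forward_difference(f, x, h=1e-8):
      """Ordinary finite difference, limited by cancellation error."""
      return (f(x + h) - f(x)) / h

  f = lambda x: np.exp(x) * np.sin(x)                # analytic test function
  exact = np.exp(1.0) * (np.sin(1.0) + np.cos(1.0))  # closed-form f'(1)
  print("complex-step error:    ", abs(complex_step_derivative(f, 1.0) - exact))
  print("finite-difference error:", abs(forward_difference(f, 1.0) - exact))

The complex-step error sits at the level of machine epsilon, while the finite-difference error is many orders of magnitude larger.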


tl;dr

1. The objective of this paper is to show how the complex-step derivative approximation is related to algorithmic differentiation.

2. For a tutorial on complex-step differentiation, I can recommend the blog post of John Lapeyre: http://www.johnlapeyre.com/posts/complex-step-differentiatio...
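To make the connection to algorithmic differentiation concrete, here is a small sketch of my own (not taken from the paper) showing that the complex-step approximation and a dual-number forward-mode AD evaluation agree on a simple polynomial:

  class Dual:
      """Minimal dual number for forward-mode algorithmic differentiation:
      it carries a value together with its derivative w.r.t. the input."""

      def __init__(self, val, eps=0.0):
          self.val, self.eps = val, eps

      def __add__(self, other):
          return Dual(self.val + other.val, self.eps + other.eps)

      def __mul__(self, other):
          return Dual(self.val * other.val,
                      self.val * other.eps + self.eps * other.val)

  def f(x):
      # The same code path works for floats, complex numbers, and Dual numbers.
      return x * x * x + x   # f(x) = x^3 + x, so f'(x) = 3x^2 + 1

  x0, h = 2.0, 1e-20
  complex_step = f(x0 + 1j * h).imag / h    # complex-step approximation
  forward_ad = f(Dual(x0, 1.0)).eps         # dual-number (forward-mode AD) derivative
  print(complex_step, forward_ad)           # both give f'(2) = 13.0

The imaginary part plays the same role as the infinitesimal component of a dual number, which is the relationship the paper formalises.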


tl;dr

1. Within the context of optimisation, differentiable approximations of the min and max operators on R^n are very useful.

2. However, for these approximations to be usable in practice, they must also be numerically stable; a minimal sketch of one standard choice follows below.
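The comment is prose only, so here is a small Python sketch of one common concrete choice, the log-sum-exp smoothing with the standard max-subtraction trick for stability; the function names and the temperature parameter beta are my own:

  import numpy as np

  def smooth_max(x, beta=10.0):
      """Differentiable approximation of max(x): (1/beta) * logsumexp(beta * x).

      Subtracting the largest entry before exponentiating keeps every
      intermediate exponential bounded, which is the standard stability trick."""
      z = beta * np.asarray(x, dtype=float)
      m = np.max(z)
      return (m + np.log(np.sum(np.exp(z - m)))) / beta

  def smooth_min(x, beta=10.0):
      """Differentiable approximation of min(x) via -smooth_max(-x)."""
      return -smooth_max(-np.asarray(x, dtype=float), beta)

  x = np.array([1.0, 2.0, 1000.0])     # naive exp(beta * x) would overflow here
  print(smooth_max(x), smooth_min(x))  # close to 1000.0 and 1.0 respectively

Larger beta gives a sharper approximation at the cost of a less smooth gradient, which is the usual tradeoff when choosing the temperature.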


tl;dr

1. Here I present an elementary proof for a classical result in random matrix theory that applies to any random matrix sampled from a continuous distribution.

One of its many important consequences is that almost all linear models with square Jacobian matrices are invertible.

2. This is also relevant to scientists who want stable internal models for deep neural networks, since a deep network is an exponentially large ensemble of linear models with compact support.
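A quick numerical illustration of the almost-sure invertibility claim (my own sketch, not part of the linked proof): sample square matrices with i.i.d. Gaussian entries, a continuous distribution, and check that every sample has full rank.

  import numpy as np

  rng = np.random.default_rng(0)

  # Singularity is a measure-zero event for matrices drawn from a continuous
  # distribution, so in practice we never observe a rank-deficient sample.
  n_trials, n = 10_000, 8
  ranks = [np.linalg.matrix_rank(rng.standard_normal((n, n)))
           for _ in range(n_trials)]
  print(all(r == n for r in ranks))  # expected: True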


tl;dr

1. The typical deep neural network tutorial introduces deep networks as compositions of nonlinearities and affine transforms.

2. In fact, a deep network with ReLU activations simplifies to a linear combination of affine transformations with compact support. But why would affine transformations be useful?

3. After recent discussions on Twitter, it occurred to me that the reason they work is that they are actually first-order Taylor approximations of a suitable analytic function.

4. What is really cool about this is that, by this logic, partial derivatives (i.e. Jacobians) are computational primitives for both inference and learning.

5. I think this also provides insight into how deep networks approximate functions: they approximate the intrinsic geometry of a relation using piecewise-linear functions.

This works because a suitable polynomial approximation exists and all polynomials are locally Lipschitz.
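Here is a minimal Python sketch of point 2 (the network sizes and names are my own, chosen only for illustration): within a single activation region, a ReLU network coincides exactly with one affine map whose matrix is its Jacobian.

  import numpy as np

  rng = np.random.default_rng(1)

  # A tiny ReLU network, R^3 -> R^2, with one hidden layer.
  W1, b1 = rng.standard_normal((16, 3)), rng.standard_normal(16)
  W2, b2 = rng.standard_normal((2, 16)), rng.standard_normal(2)

  def net(x):
      return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

  def local_affine(x0):
      """Return (J, c) such that net(x) = J @ x + c on the activation region
      containing x0: the ReLU mask is constant there, so the network is
      exactly affine and J is its Jacobian."""
      mask = (W1 @ x0 + b1 > 0).astype(float)   # units active at x0
      J = W2 @ (mask[:, None] * W1)             # Jacobian of net at x0
      c = net(x0) - J @ x0                      # matching offset
      return J, c

  x0 = rng.standard_normal(3)
  J, c = local_affine(x0)
  x1 = x0 + 1e-3 * rng.standard_normal(3)   # small perturbation, assumed to stay
                                            # in the same activation region
  print(np.allclose(net(x1), J @ x1 + c))   # expected: True

The affine map is the network's first-order Taylor expansion at x0, which is why the Jacobian shows up as the computational primitive in point 4.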


'For the kind of explosiveness that man will be able to contrive by 1980, the globe is dangerously small, its political units dangerously unstable.' - John von Neumann


I mean mimetic behaviour in the sense of René Girard.

