Whenever the discussion of liberty versus equality of outcomes arises, there is a tendency to create a false dichotomy in which we are forced to choose between mutually exclusive extremes. This is a mistake: the better approach is to weigh the competing tradeoffs and find a compromise.
'The brain carries out tasks that are very demanding from a computational perspective, apparently powered by a mere 20 Watts. This fact has intrigued computer scientists for many decades, and is currently drawing many of them to the quest of acquiring a computational understanding of the brain. Yet, at present there is no productive interaction of computer scientists with neuroscientists in this quest.'
This article uses a Julia implementation to explain how the complex-step method lets us easily compute derivatives of analytic functions to machine precision.
Possibly the best short blog post on this subject on the internet.
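For intuition, here is a minimal sketch of the complex-step trick in Julia (the test function and step size below are arbitrary choices of mine, not taken from the article):

```julia
# Complex-step differentiation: for an analytic f and a tiny step h,
# f(x + ih) ≈ f(x) + ih*f'(x), so f'(x) ≈ Im(f(x + ih)) / h.
# There is no subtractive cancellation, hence machine-precision accuracy.
complex_step(f, x; h = 1e-20) = imag(f(x + im * h)) / h

f(x) = exp(x) / sqrt(sin(x)^3 + cos(x)^3)   # arbitrary analytic test function

x0 = 1.5
println(complex_step(f, x0))                # agrees with f'(x0) to ~16 digits
```

Unlike a finite difference, shrinking h here never hurts, since no nearly equal quantities are subtracted.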
1. Here I present an elementary proof of a classical result in random matrix theory that applies to any random matrix sampled from a continuous distribution.
One of its many important consequences is that almost all linear models with square Jacobian matrices are invertible (a quick numerical sanity check follows this list).
2. This is also relevant to scientists who want stable internal models for deep neural networks, since a deep network is an exponentially large ensemble of linear models with compact support.
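As a rough sanity check of the almost-sure invertibility claim (a sketch, not part of the proof itself; the matrix size and sample count are arbitrary):

```julia
using LinearAlgebra

# Draw square matrices with i.i.d. Gaussian entries (a continuous distribution)
# and count how many are rank-deficient; almost surely none will be.
n, trials = 8, 10_000
num_singular = count(_ -> rank(randn(n, n)) < n, 1:trials)
println("rank-deficient samples out of $trials: $num_singular")   # expect 0
```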
1. The typical deep neural network tutorial introduces deep networks as compositions of nonlinearities and affine transforms.
2. In fact, a deep network with ReLU activations simplifies to a linear combination of affine transformations with compact support (see the sketch after this list). But why would affine transformations be useful?
3. After recent discussions on Twitter, it occurred to me that the reason they work is that they are actually first-order Taylor approximations of a suitable analytic function.
4. What is really cool about this is that, by this logic, partial derivatives (i.e. Jacobians) are computational primitives for both inference and learning.
5. I think this also provides insight into how deep networks approximate functions: they approximate the intrinsic geometry of a relation using piecewise-linear functions. This works because a suitable polynomial approximation exists and all polynomials are locally Lipschitz.
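To make the piecewise-affine picture concrete, here is a small sketch (the layer sizes and random weights are arbitrary, not tied to any particular model): on the activation region containing a point, a ReLU network coincides exactly with an affine map whose linear part is its Jacobian there.

```julia
using LinearAlgebra

relu(z) = max.(z, 0.0)

# A tiny two-layer ReLU network with arbitrary random weights.
W1, b1 = randn(4, 3), randn(4)
W2, b2 = randn(2, 4), randn(2)
net(x) = W2 * relu(W1 * x .+ b1) .+ b2

# On the activation region containing x, the network is exactly affine:
# net(y) = J * y + c, with J the local Jacobian and c the local offset.
x = randn(3)
D = Diagonal(Float64.((W1 * x .+ b1) .> 0))   # active-unit pattern at x
J = W2 * D * W1                               # local Jacobian
c = W2 * (D * b1) .+ b2                       # local offset

y = x .+ 1e-3 .* randn(3)                     # a nearby point, likely in the same region
println(maximum(abs.(net(y) .- (J * y .+ c))))   # ≈ 0 inside the region
```

Within each such region the Jacobian J is the first-order Taylor coefficient of the function the network represents there, which is the sense in which Jacobians act as primitives for both inference and learning.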
'For the kind of explosiveness that man will be able to contrive by 1980, the globe is dangerously small, its political units dangerously unstable.' - John von Neumann