Wasserstein distance (Earth Mover’s Distance) measures how far apart two distributions are — the ‘work’ needed to reshape one pile of dirt into another. The concept extends to multiple distributions via a linear program, which under mild conditions can be solved with a linear-time greedy algorithm [1]. It’s an active research area with applications in clustering, computing Wasserstein barycenters (averaging distributions), and large-scale machine learning.
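For the basic two-distribution case in one dimension, here is a minimal sketch of my own (assuming NumPy and SciPy are available; this is SciPy's wasserstein_distance, not the multi-distribution linear program from [1]):

    # Two empirical "piles of dirt"; for 1-D samples the optimal transport
    # plan reduces to matching sorted values, which is why it is so cheap.
    import numpy as np
    from scipy.stats import wasserstein_distance

    rng = np.random.default_rng(0)
    a = rng.normal(loc=0.0, scale=1.0, size=10_000)   # first pile
    b = rng.normal(loc=1.0, scale=1.0, size=10_000)   # same shape, shifted by 1

    # For two distributions differing only by a shift, the 1-Wasserstein
    # distance equals the size of the shift, so this should print roughly 1.0.
    print(wasserstein_distance(a, b))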
There is an analogue of the CLT for extreme values: the Fisher–Tippett–Gnedenko theorem. If the properly normalized maximum of an i.i.d. sample converges in distribution, the limit must be Gumbel, Fréchet, or Weibull, the three families unified as the Generalized Extreme Value distribution. Unlike the CLT, whose assumptions (in my experience) rarely hold in practice, this result is extremely general and underpins methods like wavelet thresholding and signal denoising. It's easy to demonstrate with a quick simulation.
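A quick simulation along those lines, as a sketch of my own (maxima of exponential samples, which fall in the Gumbel domain of attraction; assumes NumPy/SciPy):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 1_000        # sample size whose maximum we take
    reps = 10_000    # number of maxima to collect

    # For Exp(1) samples the classical normalization is max - log(n);
    # the limiting distribution is the standard Gumbel.
    maxima = rng.exponential(size=(reps, n)).max(axis=1) - np.log(n)

    # Compare against the standard Gumbel: a KS statistic near 0 means a close fit.
    print(stats.kstest(maxima, stats.gumbel_r.cdf))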
There's also a more conservative rule similar to the CLT that works off the definition of variance, and thus rests on no assumptions other than the existence of variance. Chebyshev's inequality tells us that the probability that any sample lands more than k standard deviations away from the mean is at most 1/k².
In other words, it is possible (given sufficiently weird distributions) that not a single sample lands inside one standard deviation, but 75% of them must be inside two standard deviations, 88% inside three standard deviations, and so on.
There's also a one-sided version of it (Cantelli's inequality), which bounds the probability that a sample lands more than k standard deviations above the mean by 1/(1+k²), meaning at least 50% of samples must be less than one standard deviation above the mean, 80% less than two standard deviations above, 90% less than three, etc.
Think of this during the next financial crisis, when bank people will no doubt say they encountered "six sigma daily movements which should happen only once every hundred million years!!" or whatever. According to the CLT's normal approximation, sure, but for sufficiently odd distributions the Cantelli bound might be a more useful guide, and it says six-sigma daily movements could happen as often as roughly every 37 days (1/(1+6²) = 1/37).
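To make that concrete, here is a sketch of my own comparing a fat-tailed distribution against the normal model and the distribution-free bounds (Student's t with 3 degrees of freedom, which has finite variance, so Chebyshev/Cantelli apply):

    import numpy as np
    from scipy import stats

    df = 3
    sigma = np.sqrt(df / (df - 2))   # std dev of a t distribution with df = 3
    k = 6                            # "six sigma"

    rng = np.random.default_rng(0)
    x = rng.standard_t(df, size=10_000_000)

    empirical = np.mean(x > k * sigma)   # observed one-sided tail frequency
    normal = stats.norm.sf(k)            # what a normal model predicts
    cantelli = 1 / (1 + k**2)            # distribution-free upper bound

    print(f"empirical P(X > 6 sigma) ~ {empirical:.1e}")   # on the order of 1e-3
    print(f"normal model prediction  ~ {normal:.1e}")      # on the order of 1e-9
    print(f"Cantelli upper bound     = {cantelli:.1e}")    # 1/37 ~ 2.7e-2

The heavy-tailed reality sits orders of magnitude above the normal model's prediction but comfortably inside the Cantelli bound.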
I highly doubt the finance bros pretend distributions are normal or don't know Chebyshev, vs. not having enough data to obtain the covariance structure (for rare events) to properly bound things even with Chebyshev.
I'll share an anecdote I witnessed in my extended family; it was horrible. When US Air went bankrupt, employees with decades of service, expecting high-five-figure to low-six-figure annual incomes, learned they would get roughly $0.20 on the dollar. For many who were entering retirement, the impact was life-changing, and the stress and disruption it caused could well be argued to have been life-shortening.
The PBGC (Pension Benefit Guaranty Corporation) did take over, but that did not solve the problem.
I believe the root cause was mismanagement of the pension, with the bankruptcy merely exposing this. But I wouldn’t be surprised if, at every opportunity during the bankruptcy process, changes were made that eroded the program’s health.
This is an instance where Conway's Law applies: state and county systems were kept separate so that maintenance and repair crews wouldn't accidentally duplicate work. https://en.wikipedia.org/wiki/Conway%27s_law
I've seen several; Planet Trek in Wisconsin is a good bikeable one with high-quality signage. The sun is downtown, the moon is the size of a peach pit, and Pluto is ~20 miles away.
How helpful was AI for this? The paper is light on details, but it says the agent was used to generate a kind of seed set (rank-1 bilinear products) that was then fed into the subsequent steps. Evidently this idea succeeded. Curious if anyone here has insight into whether this is a common technique, and how this agent's output would compare to random seeds or a simple heuristic attempting the same thing. Also interested to see how the training objective gets defined, since the final task is a couple of steps downstream from what the agent generates.
My hope is that a lib like this one (or similar) could rally mindshare and become the new standard, adopted by the wider developer community. In the near term it comes down to trade-offs; I see no decision that works for all use cases. Dependencies introduce ticking time bombs. Stdlibs should be correct and intuitive, and even when they aren't, they are usually well tested and maintained; but when the stdlib doesn't meet urgent production needs, you have to do something.
It's basically what happened in Java. Everyone used Joda-Time, and they took great inspiration from it when making the new standard time API (java.time) in Java 8.
Sharing a simple thought experiment that was shared with me years ago that explains (to me at least) why this is an interesting question. Imagine a billiard ball with nonzero velocity bouncing around an enclosed box. When the ball encounters a side of the box, it bounces off elastically. A replay of this ball's path over time is equally plausible whether the replay is run forward or reversed. The preceding is also true if one imagines 2 or 3 balls, with the only difference being that the balls may also bounce off each other elastically. Even in this scenario, reversibility of playback holds no matter the configuration of the balls: they could all start clustered or be scattered, and the replay would be plausible when played in either time direction.

But this is no longer true when the box (now much larger) contains millions of billiard balls. If the balls start clustered together, they will scatter over time about the box, and the replay of their paths has only one plausible time direction. This is because it is extremely unlikely that all the billiards will, simply by chance at some point in the future, collect together so they are contained within a very small volume.

To summarize, in the "few scenario" we can plausibly reverse time but in the "many scenario" we cannot. The only difference between scenarios is the number of balls in the box, which suggests that time is an emergent property. layer8's answer elsewhere in this thread says the same, but more succinctly.
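A minimal numerical sketch of that intuition (my own construction, not the original commenter's: non-interacting balls in a unit box, with a coarse-grained entropy over grid cells standing in for how clustered the configuration looks):

    import numpy as np

    rng = np.random.default_rng(0)
    n_balls, n_steps, dt, bins = 10_000, 500, 0.01, 10

    pos = 0.05 + 0.05 * rng.random((n_balls, 2))   # clustered start in one corner
    vel = rng.normal(size=(n_balls, 2))            # random velocities

    def coarse_entropy(p):
        """Shannon entropy of the occupancy of a bins x bins grid."""
        counts, _, _ = np.histogram2d(p[:, 0], p[:, 1], bins=bins,
                                      range=[[0, 1], [0, 1]])
        probs = counts.ravel() / counts.sum()
        probs = probs[probs > 0]
        return -(probs * np.log(probs)).sum()

    for step in range(n_steps + 1):
        if step % 100 == 0:
            print(f"t = {step * dt:.1f}  coarse-grained entropy = {coarse_entropy(pos):.3f}")
        pos += vel * dt
        # Elastic reflection off the walls of the unit box.
        vel = np.where((pos < 0) | (pos > 1), -vel, vel)
        pos = np.clip(pos, 0.0, 1.0)

Run forward, the printed entropy climbs from near 0 toward its maximum (log 100 ≈ 4.6) and stays there; the same trajectory played in reverse would show it falling, which is the statistically implausible direction.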
> But this is no longer true when the box (now much larger) contains millions of billiard balls. If the balls start clustered together, they will scatter over time about the box and the replay of their paths has only one plausible time direction.
I don't understand how increasing the number of balls means you can't reverse the playback.
You can reverse the playback; all the physics of billiards bouncing around works equally well in either time direction.
> If the balls start clustered together, they will scatter over time about the box and the replay of their paths has only one plausible time direction.
It is extremely unlikely that all the billiards will, simply by chance at some point in the future, collect together so they are contained within a very small volume. This shows that there is an asymmetry in which direction time flows.
Actually no, this is a good question that hits at a tension between the second law of thermodynamics and the big bang model. The aftermath of the big bang and the inflation period are known to be times when the universe was extremely hot and dense. And yet, by the rule that entropy can't decrease over time, it follows that they were the lowest-entropy state of the universe: definitely lower entropy than what we have today. But how can an extremely hot plasma made up of all of the particles that today make up stars and planets and so on have been a lower-entropy state than the galaxies of today?
Disclaimer: I'm saying a lot of words here, but I don't entirely know what I'm talking about. I'm saying things that I think make sense based on what I do understand, but I don't actually know the details of how people model entropy in relation to models of "the big bang". I'm not simplifying to try to make things easy to understand; rather, I'm grasping at too-simple, weak analogies in an attempt to understand.
I think this is generally explained as being due to the expansion of space? The following is only an extremely loose analogy, because the expansion of space is not really much like a container getting bigger (it isn't thought to start at one particular size and grow), but: suppose you have a syringe with the tip closed off, with some hot, highly compressed gas in the front of it, and then you pull back on the plunger. The temperature of the gas should decrease, right? By the same principle refrigeration works on: rarefy the coolant in the place you want to cool, so it gets colder and absorbs heat from the environment, then move it to the place you want to dump heat and compress it, so it gets hot and gives off that heat.
But, pulling the plunger back doesn't decrease the entropy of the gas in the plunger, does it?
It might require putting in energy, in which case the entropy of the gas could in principle decrease at the cost of increasing it elsewhere; but if the plunger is loose enough, the pressure from the gas should be able to push it out, in which case I think the temperature and pressure would still decrease without an external source doing the work, so the entropy definitely shouldn't decrease in that case.
So, going from hot and dense to less hot and dense, keeping the same amount of stuff but more spread out, doesn't always mean a decrease in entropy, and can instead correspond to more entropy?
After all, if there are more places available to be, that seems like more uncertainty, right? At least, in a finite system.
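For the idealized version of the plunger question (free expansion of an ideal gas: no heat exchanged and no external work done), here is a small worked number, as a sketch of my own:

    import numpy as np

    R = 8.314            # J / (mol K), gas constant
    n = 1.0              # moles of ideal gas
    V1, V2 = 1.0, 2.0    # initial and final volume; only the ratio matters

    # For free expansion of an ideal gas, dS = n * R * ln(V2 / V1),
    # which is positive whenever the gas ends up more spread out.
    dS = n * R * np.log(V2 / V1)
    print(f"entropy change = {dS:.2f} J/K")   # about +5.76 J/K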
Like, if you are looking at the relative entropy of the distribution of one particle in a 1D box of length L (entropy relative to the Lebesgue measure), and you compare this to the case of a 1D box of length 2L, still measuring relative to the Lebesgue measure: the maximum-entropy distribution for position in the 2L box, the uniform distribution on [0, 2L], has more relative entropy than the uniform distribution on [0, L], which is the highest-entropy position distribution attainable in the length-L box.
(I say "relative entropy" because for a continuous variable, we don't have like, a countable set of possibilities which each have non-zero probability that we can sum over, and so we can't use that definition of entropy, but instead integrate over possibilities using the density of the distribution with respect to some reference measure, and I think the Lebesgue measure, the usual measure on sets of real numbers, is a good common reference point.)
Though, I guess the thing shouldn't really be just distribution over position, but joint distribution over position and momentum.
That's the thing: if the balls don't start clustered, then you "lose" the time direction again; now it doesn't matter which way you replay the trajectories, they are equally plausible in either direction. So if that's what you ultimately base the definition of time on, this implies that it is an emergent property.
It does not need anything other than plausible start and stop conditions to point out how time reversal is implausible; the time-reversal idea only works under the implausible condition of a perfect steady state, no beginning, no end, just threading a piece of film in backwards after editing out the set dressers at work
without answering the fundamental question of "how did we get here anyway"
It's not a theory, it's sleight of hand.
edit: if it was an educational thought experiment
and labeled as such, fine, sure, but it's nothing more than that :) , but could be less
I thought the article would discuss the physics of flight, but it's about the physics of iridescence. Cool, but not what I expected. I've seen butterflies fly in a stiff wind; it seems impossible even when you watch it. I was hoping for some insight into that.
Fantastic study, thank you, and such evocative language: “The fluttery flight of butterflies over a sunny meadow instils fascination, yet the flight of butterflies remains somewhat a mystery. The few flight mechanistic studies performed so far on butterflies have triggered suggestions that they use a variety of unsteady aerodynamic mechanisms for their force production. Among these mechanisms, the upstroke wing clap, first described by Weis-Fogh for insects already in the early 1970s, is one repeatedly reported as used by butterflies. Despite the importance of this mechanism, as far as we know, quantitative measurements of the aerodynamics of the wing clap in freely flying animals are still lacking.”
Your comment introduced me to the term permaculture; it looks interesting. Do you leverage numerical optimization, for example linear programming or integer programming, in your work?
[1] https://en.wikipedia.org/wiki/Earth_mover's_distance#More_th...