This is pure speculation, but what are the chances this change is simply an attempt to provide legal cover for what they might have started doing 50 versions ago?[1]
Ah, but what you are interpreting as plain English is actually a term of art in marketing that means "this will change as soon as it becomes more profitable to do so".
One funny thing about Mersenne primes is that, as a result of what you describe, they are exactly those primes whose binary representation consists entirely of ones (and the number of ones is necessarily prime)!
The smallest Mersenne prime, three, is binary 11; the next is seven (111), then 31 (11111), then 127 (1111111). The next candidate, 2047 (11111111111), is not prime.
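Not from the parent comment, but a quick sanity check in Python (plain trial division, so only small exponents) makes the pattern easy to see:

    def is_prime(n):
        # trial division; fine for the small numbers used here
        if n < 2:
            return False
        d = 2
        while d * d <= n:
            if n % d == 0:
                return False
            d += 1
        return True

    # A Mersenne prime 2**p - 1 is written in binary as p ones.
    for p in range(2, 20):
        candidate = 2**p - 1
        if is_prime(candidate):
            print(p, candidate, bin(candidate))

    print(is_prime(2047), bin(2047))  # 2047 = 23 * 89, not prime despite 11 ones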
> the SSH certificates issued by the Cloudflare CA include a field called ValidPrinciples
Having implemented similar systems before, I was interested to read this post. Then I saw this. Now I have to find out whether that really is the field name, whether this was a ChatGPT spellcheck, or something else entirely.
It depends... ssh-keygen -L displays the fields as Principals (which are set using the -n parameter) and internally a lot of the OpenSSH code talks about AuthorizedPrincipals...
> I am a simple sole, ... go back to the halcyon early days of the web before Netscape dropped the JS-bomb. You know HTML for the layout and CSS for the style.
I am not sure if this is intended as humor, but JavaScript came before CSS.
I remember when CSS Zen Garden was showcasing what you can do with CSS, and browsers (well, "browser", singular, as there was basically only IE 6 back then) supported JavaScript and VBScript.
It seems JavaScript was first released, just internally, in May 1995 in a pre-alpha version of Netscape 2.0. It would not be publicly announced until December 1995. Netscape 2.0 didn't even come out until March 1996, and even then it carried language version 1.0, which was extremely defective. The first version of the language that actually worked was JavaScript 1.1, which came out with Netscape 3.0 in August 1996. CSS, on the other hand, first premiered with IE3, which came out in August 1996.
The distinction either way is trivial, because at that time nobody was using either CSS or JavaScript as they required proprietary APIs. There was no DOM specification at that time.
JavaScript was created by Brendan Eich in just 10 days in May 1995 while he was working at Netscape Communications Corporation.
CSS (Cascading Style Sheets) was introduced later than JavaScript. The first CSS specification was published in December 1996 by Håkon Wium Lie and Bert Bos.
Early in the A-B craze (optimal shade of blue nonsense), I was talking to someone high up at an online hotel reservation company who was telling me how great A-B testing had been for them. I asked him how they chose the stopping point/sample size. He told me experiments continued until they observed a statistically significant difference between the two conditions.
The arithmetic is simple and cheap. Understanding basic intro stats principles, priceless.
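For anyone who hasn't seen the numbers: here is a quick simulation (made-up traffic, two identical variants) of how often "stop as soon as p < 0.05" declares a winner when there is nothing to find:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)

    def stops_on_significance(n_max=20_000, peek_every=500, p=0.10, alpha=0.05):
        # two *identical* Bernoulli(p) variants; run a two-proportion z-test
        # at every peek and stop at the first p-value below alpha
        a = rng.binomial(1, p, n_max)
        b = rng.binomial(1, p, n_max)
        for n in range(peek_every, n_max + 1, peek_every):
            pa, pb = a[:n].mean(), b[:n].mean()
            pooled = (pa + pb) / 2
            se = (2 * pooled * (1 - pooled) / n) ** 0.5
            if se > 0 and 2 * (1 - norm.cdf(abs(pa - pb) / se)) < alpha:
                return True          # declared a "winner" between identical variants
        return False

    runs = 1000
    false_wins = sum(stops_on_significance() for _ in range(runs))
    print(f"declared a winner in {false_wins / runs:.0%} of runs with no real difference")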
This is correct. There's been a lot of interest in e-values and non-parametric confidence sequences in recent literature. It's usually referred to as anytime-valid inference [1]. Evan Miller explored a similar idea in [2]. For some practical examples, see my Python library [3] implementing multinomial and time-inhomogeneous Bernoulli / Poisson process tests based on [4]. See [5] for linear models / t-tests.
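To give a flavor of how this differs from a fixed-horizon test, here is a minimal sketch (not the library in [3]; the uniform mixture and the 1/alpha threshold are just the textbook choices): a likelihood-ratio martingale for H0: p = 0.5 with a Beta(1,1) mixture over alternatives. By Ville's inequality, rejecting whenever it exceeds 1/alpha keeps the type-I error at alpha no matter how often you peek or when you stop.

    import numpy as np
    from scipy.special import betaln

    def mixture_e_process(xs, p0=0.5):
        # E-process for H0: X_i ~ Bernoulli(p0), with a uniform Beta(1,1)
        # mixture over alternatives. Under H0 it is a nonnegative martingale
        # with mean 1, so P(sup_n M_n >= 1/alpha) <= alpha at any stopping time.
        xs = np.asarray(xs)
        n = np.arange(1, len(xs) + 1)
        s = np.cumsum(xs)                                  # running successes
        log_num = betaln(s + 1, n - s + 1)                 # log of Int theta^s (1-theta)^(n-s) dtheta
        log_den = s * np.log(p0) + (n - s) * np.log(1 - p0)
        return np.exp(log_num - log_den)

    alpha = 0.05
    rng = np.random.default_rng(0)
    xs = rng.binomial(1, 0.6, size=2000)                   # true rate 0.6, H0 says 0.5
    m = mixture_e_process(xs)
    crossed = np.nonzero(m >= 1 / alpha)[0]
    print("rejected H0 at n =", crossed[0] + 1 if crossed.size else "never")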
Sounds like you already know this, but that's not great and will give a lot of false positives. In science this is called p-hacking. The rigorous way to use hypothesis testing is to calculate the sample size for the expected effect size and only run the test once that sample size is reached. But this requires knowing the effect size.
If you are doing a lot of significance tests you need to adjust the p-level, dividing by the number of implicit comparisons (a Bonferroni correction), so e.g. only accept p < 0.001 if you are running one test per day.
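For concreteness, a minimal sketch of that fixed-horizon calculation using the standard normal-approximation formula for two proportions (the 10% to 12% lift and the 50 tests are made-up numbers):

    from scipy.stats import norm

    def n_per_arm(p1, p2, alpha=0.05, power=0.80):
        # per-arm sample size for a two-sided two-proportion z-test
        # (standard normal-approximation formula)
        z_a = norm.ppf(1 - alpha / 2)
        z_b = norm.ppf(power)
        p_bar = (p1 + p2) / 2
        num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
               + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
        return num / (p1 - p2) ** 2

    # detect a lift from a 10% to a 12% conversion rate, single comparison
    print(round(n_per_arm(0.10, 0.12)))

    # Bonferroni: if you will run m such tests, spend alpha/m on each one
    m = 50
    print(round(n_per_arm(0.10, 0.12, alpha=0.05 / m)))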
Alternatively, just do Thompson sampling until one variant dominates.
To expand, the p value tells you significance (more precisely, the probability of seeing an effect at least that large if there were no underlying difference). But if you check it over and over again and act on the one reading you like, you've subverted the measure.
Thompson/multi-armed bandit optimizes for outcome over the duration of the test, by progressively altering the treatment %. The test runs longer, but yields better outcomes while doing it.
It's objectively a better way to optimize, unless there is time-based overhead to the existence of the A/B test itself. (E.g. maintaining two code paths.)
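A minimal sketch of what that looks like with Beta-Bernoulli Thompson sampling (toy conversion rates; the 99% posterior-probability stopping rule is one common choice, not the only one):

    import numpy as np

    rng = np.random.default_rng(1)
    true_rates = [0.10, 0.12]            # hypothetical conversion rates for A and B
    wins = np.zeros(2)
    losses = np.zeros(2)

    for step in range(50_000):
        # sample a plausible rate for each arm from its Beta posterior,
        # send this visitor to whichever arm drew the highest rate
        draws = rng.beta(wins + 1, losses + 1)
        arm = int(np.argmax(draws))
        converted = rng.random() < true_rates[arm]
        wins[arm] += converted
        losses[arm] += not converted

        # crude stopping check every 1000 visitors:
        # posterior probability that B beats A
        if step % 1000 == 999:
            sims = rng.beta(wins + 1, losses + 1, size=(10_000, 2))
            p_b_best = (sims[:, 1] > sims[:, 0]).mean()
            if p_b_best > 0.99 or p_b_best < 0.01:
                break

    print("traffic per arm:", wins + losses)
    print("P(B is the better variant) ~", round(p_b_best, 3))

Note how the worse arm ends up with far less traffic than a 50/50 split would have given it, which is the "better outcomes while the test runs" point above.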
The p value is the risk of getting an effect at least that large purely from sampling error, under the assumption of perfectly random sampling and no real effect. It says very little.
In particular, if you aren't doing perfectly random sampling it is meaningless. If you are concerned about other types of error than sampling error it is meaningless.
A significant p-value is nowhere near proof of effect. All it does is suggestively wiggle its eyebrows in the direction of further research.
Many years ago I was working for a large gaming company, and I was the one who developed a very efficient and cheap way to split any cluster of users into A/B groups. The company was extremely happy with how well that worked. However, I did some investigating on my own a year later to see how the business development people were using it and... yeah, pretty much what you said. They were literally brute-forcing different configurations until they (more or less) got the desired results.
Microsoft has a seed finder specifically aimed at avoiding a priori bias in experiment groups, but IMO the main effect is pushing whales (which are possibly bots) into different groups until the bias evens out.
I find it hard to imagine obtaining much bias from a random hash seed in a large group of small-scale users, but I haven't looked at the problem closely.
We definitely saw bias, and it made experiments hard to launch until the system started pre-identifying unbiased population samples ahead of time, so the experiment could just pull pre-vetted users.
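I have no idea how Microsoft's implementation works, but the idea as described can be sketched like this: hash user IDs under a candidate seed to assign arms, then keep trying seeds until a pre-experiment metric (a hypothetical "spend" column here) looks balanced:

    import hashlib
    import numpy as np

    def assign(user_id: str, seed: int) -> int:
        # deterministic 50/50 split: hash(seed, user_id) -> arm 0 or 1
        digest = hashlib.sha256(f"{seed}:{user_id}".encode()).digest()
        return digest[0] & 1

    def find_balanced_seed(user_ids, spend, n_seeds=1000, tol=0.01):
        # scan seeds until pre-experiment spend differs by < tol (relative)
        # between the two arms
        spend = np.asarray(spend)
        for seed in range(n_seeds):
            arms = np.array([assign(u, seed) for u in user_ids])
            if arms.all() or not arms.any():
                continue                     # degenerate split, try the next seed
            a = spend[arms == 0].mean()
            b = spend[arms == 1].mean()
            if abs(a - b) / max(a, b) < tol:
                return seed
        return None

    # toy data: a handful of heavy spenders ("whales") dominate the imbalance
    rng = np.random.default_rng(2)
    users = [f"user{i}" for i in range(5000)]
    spend = rng.lognormal(mean=1.0, sigma=2.0, size=5000)
    print("balanced seed:", find_balanced_seed(users, spend))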
And yet this is the default. As commonly implemented, A/B testing is an excellent way to look busy, and people will actively resist changing processes to make them more reliable.
I think this is not unrelated to the fact that if you wait long enough you can get a positive signal from a neutral intervention, so you can literally shuffle chairs on the Titanic and claim success. The incentives are against accuracy because nobody wants to be told that the feature they've just had the team building for 3 months had no effect whatsoever.
Surely this is more efficient if you do the statistics right? I mean, I'm sure they didn't, but the intuition that you can stop once there's sufficient evidence is correct.
Bear in mind many people aren’t doing the statistics right.
I’m not an expert but my understanding is that it’s doable if you’re calculating the correct MDE (minimum detectable effect) based on the observed sample size, though not ideal (because sometimes the observed sample is too small and there’s no way round that).
I suspect the problem comes when people don’t adjust the MDE properly for the smaller sample. Tools help but you’ve gotta know about them and use them ;)
Personally I’d prefer to avoid this and be a bit more strict due to something a PM once said: “If you torture the data long enough, it’ll show you what you want to see.”
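For what it's worth, here's the back-of-the-envelope version of that MDE adjustment for a two-proportion test (usual normal approximation, baseline variance for both arms; the 10% baseline and the sample sizes are invented):

    from scipy.stats import norm

    def mde(baseline, n_per_arm, alpha=0.05, power=0.80):
        # minimum detectable absolute lift for a two-sided two-proportion
        # z-test, using the normal approximation with the baseline variance
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
        return z * (2 * baseline * (1 - baseline) / n_per_arm) ** 0.5

    # with a 10% baseline conversion rate: the smaller the sample you ended
    # up with, the larger the lift you can honestly claim to detect
    for n in (1_000, 10_000, 100_000):
        print(n, round(mde(0.10, n), 4))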
I thought you were joking. ... After a while, I started expecting a comma after each and every word.