Hacker News

How do you calculate the sample size a test needs in order to achieve a desired "power"?


For binomial scenarios like a stock A/B test, most statistical environments have built-in power functions. R does, for instance; here is a worked example: http://www.gwern.net/AB%20testing#power-analysis


For simple tests you can reverse the mathematics to get good estimates of how many observations are needed, given a goal for your desired power and your tolerance for false positives. Asking for greater power makes your test more sensitive ("buys a bigger telescope") at the cost of increased sample size. Asking for fewer false positives ("cleaning the lenses") likewise costs a larger sample.
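
As a rough illustration of "reversing the mathematics", here is a minimal sketch for the two-proportion case, assuming a two-sided z-test with equal group sizes and the usual normal approximation. The function name and the default alpha and power values are my own choices for illustration, not something from a particular library.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate observations needed PER GROUP to detect p1 vs p2.

    Assumes a two-sided two-proportion z-test, equal group sizes,
    and the normal approximation to the binomial.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # false-positive tolerance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    p_bar = (p1 + p2) / 2                          # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Hypothetical rates: 10% baseline conversion vs a hoped-for 12%.
    print(sample_size_two_proportions(0.10, 0.12))
    # A bigger telescope (more power) or cleaner lenses (smaller alpha)
    # both push the required sample size up.
    print(sample_size_two_proportions(0.10, 0.12, power=0.90))
    print(sample_size_two_proportions(0.10, 0.12, alpha=0.01))
```

Raising `power` or lowering `alpha` increases the z-values in the numerator, which is the telescope and lens trade-off in formula form.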

For more sophisticated tests, ones less likely to be seen in an A/B scenario, you might not be able to reverse the mathematics and get a direct answer, so often people will run simulation studies to guess at the needed sample size.
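
A simulation study of that kind might look roughly like the sketch below. To keep it short and checkable it uses the plain two-proportion z-test as a stand-in for whatever more complicated test you actually have; the idea is that you would swap in your own data generator and test. All function names, the step size, and the number of simulations are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # no variation observed, so no evidence of a difference
    z = (x1 / n1 - x2 / n2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulated_power(p1, p2, n, alpha=0.05, n_sims=500, seed=0):
    """Fraction of simulated experiments (n per group) that reject the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Simulate one experiment: count successes in each group.
        x1 = sum(rng.random() < p1 for _ in range(n))
        x2 = sum(rng.random() < p2 for _ in range(n))
        if two_proportion_p_value(x1, n, x2, n) < alpha:
            rejections += 1
    return rejections / n_sims


def smallest_n_by_simulation(p1, p2, target_power=0.80,
                             start=10, step=10, max_n=5000, **kwargs):
    """Step n upward until the simulated power reaches the target."""
    n = start
    while n <= max_n:
        if simulated_power(p1, p2, n, **kwargs) >= target_power:
            return n
        n += step
    return None  # target not reached within max_n


if __name__ == "__main__":
    # Hypothetical rates chosen so the search finishes quickly.
    print(smallest_n_by_simulation(0.2, 0.5, n_sims=300))
```

The answer is noisy because each power estimate comes from a finite number of simulated experiments, so treat the result as a guess to refine with more simulations near the boundary.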


Here are a few resources: http://statpages.org/#Power


I use simulations so I can avoid the math and analyze power in a consistent manner for any distribution I can model.

I wrote something about it here: http://www.databozo.com/2013/10/12/Finding_a_sample_size_for...

EDIT: typo


Lehr's Equation (Rule of 16) is generally a good estimator. It is explained (but not referenced) in a similar article here: http://www.evanmiller.org/how-not-to-run-an-ab-test.html
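
For anyone who has not seen it, the rule says you need roughly 16 * variance / difference^2 observations per group, which corresponds to about 80% power at a two-sided 5% significance level. Here is a small sketch for conversion rates, assuming the variance is taken as p(1 - p) at the average of the two rates; the function name is made up for illustration.

```python
from math import ceil


def lehr_sample_size(p1, p2):
    """Rule-of-16 estimate of observations needed PER GROUP.

    Assumes roughly 80% power, a two-sided 5% significance level,
    and a binomial variance evaluated at the average of the two rates.
    """
    p_bar = (p1 + p2) / 2
    variance = p_bar * (1 - p_bar)
    return ceil(16 * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Hypothetical rates: 10% baseline vs 12% variant, about 3,900 per group.
    print(lehr_sample_size(0.10, 0.12))
    # Halving the difference you want to detect roughly quadruples the sample.
    print(lehr_sample_size(0.10, 0.11))
```

The 16 comes from rounding 2 * (1.96 + 0.84)^2, so the estimate lands a little above what an exact power calculation gives.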



