When A/B Testing Doesn’t Work
In technical products, there’s a tendency to lean toward A/B tests: running simultaneous variants across different slices of your user base and measuring the outcome.
A/B tests can be extremely useful in some cases — if you’re at Google or Meta scale, or if you’re doing something like performance marketing. But in the vast majority of cases, they’re more pain than they’re worth — and might even be detrimental.
You don’t have enough data. Most products don’t have enough users to generate statistically significant results. The more you extrapolate from small sample sizes, the more you risk drawing incorrect conclusions.
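To make "not enough data" concrete, here is a rough back-of-the-envelope calculation using the standard two-proportion z-test sample-size formula (this is my illustration, not from the original post; the 5% baseline and 10% lift are assumed numbers):

```python
from math import sqrt, ceil
from statistics import NormalDist

def sample_size_per_arm(p_base, rel_lift, alpha=0.05, power=0.8):
    """Approximate users needed per arm to detect a relative lift
    in a conversion rate with a two-proportion z-test."""
    p1 = p_base
    p2 = p_base * (1 + rel_lift)
    p_bar = (p1 + p2) / 2
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_b = NormalDist().inv_cdf(power)          # desired statistical power
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Detecting a 10% relative lift on a 5% baseline conversion rate:
print(sample_size_per_arm(0.05, 0.10))  # tens of thousands of users per arm
```

Tens of thousands of users per arm just to detect a 10% relative improvement — far more traffic than most products have for any single experiment.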
A/B tests mean incremental changes. Incremental changes often lead to incremental results. Google testing an algorithm change or UI improvement is unlikely to change the business by more than a few basis points (and that would be a very successful experiment). For most startups and businesses, you need much bigger shifts and effects.
Twice the work. A/B testing is resource-intensive. You have to build both features. You have to build them in a way that they can be feature-gated. You have to build the infrastructure to randomly assign users to each population and measure the results. You have to avoid confusing your users. You need expert data analysts to interpret the results.
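Even the "simple" part of that infrastructure — assigning users to variants — has subtleties: assignment must be random across users but sticky for any one user, or people will flip between experiences. A minimal sketch of deterministic bucketing via hashing (the experiment and user names are hypothetical):

```python
import hashlib

def assign_variant(user_id: str, experiment: str,
                   variants=("control", "treatment")):
    """Deterministically bucket a user into a variant.
    Hashing (experiment, user_id) means the same user always sees
    the same variant, and different experiments bucket independently."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

print(assign_variant("user-42", "new-onboarding"))
```

And this sketch still ignores the hard parts: logging exposures, handling users who churn mid-experiment, and keeping the gate from leaking into caches or emails.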
Not sure what to measure. While hyper-focused organizations like Meta had a clear North Star (for many years, growth), most experimenters don’t know exactly what they are trying to optimize for. And many organizations don’t fully grasp the more qualitative consequences of a change.
Startups especially have to be opinionated. They can’t do everything (it’s hard enough to do one thing), and they don’t have the data or users to run tests.