Null and Alternative Hypotheses

Your team redesigned the action alert template last month. The old version had a 6.8% click-to-action rate across hundreds of sends. The new version, tested on 5,000 recipients, got 7.5%. That's a 0.7 percentage-point improvement. The campaigner calls it a win and wants to roll out the new template permanently. But the data analyst pauses. "That could just be noise," she says. "How would we know?"

This is the fundamental question that hypothesis testing answers. And it starts with a deliberately boring assumption.

The null hypothesis is the default explanation. Nothing changed. Whatever difference you observed happened purely by chance. The old template and the new one perform identically, and the gap in your data is just the natural wobble that comes from any sample. The alternative hypothesis is the competing claim. Something actually did change. The new template genuinely performs differently from the old one.

These two hypotheses aren't equal partners in a debate. The null starts with the presumption in its favor. You assume nothing changed until the evidence against that assumption becomes strong enough to overturn it. Think of it like a courtroom. The null hypothesis is "innocent until proven guilty." You don't need to prove that nothing changed. You need to show, with sufficient evidence, that something did.

This framework exists because humans are natural pattern-seekers. We spot trends in random data, find meaning in coincidence, and celebrate improvement where there's only fluctuation. The null hypothesis forces you to ask a specific, uncomfortable question before acting on any result. Could this have happened even if nothing actually changed?

The way you answer is by looking at what the data should look like under the null. If nothing changed and the true click rate is still 6.8%, what range of results would you expect from a sample of 5,000 emails? The binomial distribution, or its normal approximation, gives you that range. The standard deviation of the sampling distribution tells you how much wobble to expect. If your observed result falls comfortably within that range, you can't reject the null. The difference could easily be chance. If it falls far outside, the null becomes hard to believe, and you have evidence that something genuinely changed.
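One way to picture that range is to simulate it. The sketch below assumes the null is true, with a baseline click rate of 6.8% and sends of 5,000 recipients, and repeats the send many times to see how much the observed rate wobbles by chance alone; the simulation count and random seed are arbitrary choices for illustration.

```python
# Minimal sketch: what does the observed click rate look like if nothing changed?
import numpy as np

rng = np.random.default_rng(42)     # arbitrary seed, for reproducibility

baseline_rate = 0.068               # the null hypothesis: the true rate is still 6.8%
sample_size = 5_000                 # recipients in the test send

# Simulate 10,000 sends under the null and record each send's observed click rate.
clicks = rng.binomial(n=sample_size, p=baseline_rate, size=10_000)
observed_rates = clicks / sample_size

# The middle 95% of these simulated rates is the "natural wobble" under the null.
low, high = np.percentile(observed_rates, [2.5, 97.5])
print(f"Under the null, 95% of sends land between {low:.1%} and {high:.1%}")
```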

For the action alert example, if the true rate is 6.8% and you sample 5,000 recipients, the observed rate would land between about 6.1% and 7.5% roughly 95% of the time. Your 7.5% result sits right at that upper boundary. That's suspicious but not conclusive. If the observed rate had been 8.2%, well beyond the null's expected range, you'd have strong evidence that the new template actually works.
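The same band can be computed directly from the normal approximation: under the null, the standard error of the observed rate is sqrt(p(1 - p)/n), and 1.96 standard errors on either side of 6.8% gives roughly 6.1% to 7.5%. A short sketch, using the two observed rates from the example:

```python
# Sketch of the worked example: the expected range under the null, by formula.
import math

p_null = 0.068                              # null hypothesis: the true rate is still 6.8%
n = 5_000                                   # recipients in the test send

se = math.sqrt(p_null * (1 - p_null) / n)   # standard error, about 0.0036
low, high = p_null - 1.96 * se, p_null + 1.96 * se

print(f"95% of sample rates should fall between {low:.1%} and {high:.1%}")

# How surprising is each observed result, measured in standard errors?
for observed in (0.075, 0.082):
    z = (observed - p_null) / se
    print(f"Observed {observed:.1%}: {z:.2f} standard errors above the null rate")
```

A result of 7.5% sits essentially at the 1.96 standard-error boundary, while 8.2% lands nearly four standard errors out, which is why one is merely suspicious and the other is strong evidence.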

In A/B testing of campaign emails, the null hypothesis is always that both versions perform identically. Every test, whether you're comparing subject lines, call-to-action wording, or donation page layouts, starts from this assumption. In petition analytics, when signatures spike after a social media push, the null hypothesis asks whether the spike falls within the range of normal daily fluctuation. In fundraising, if average donation size rose from €32 to €37 after you added suggested amounts to the donation page, the null hypothesis says that difference might just reflect who happened to visit that particular week rather than any real effect of the design change.
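When both versions run side by side, the same logic compares the two observed rates against a single pooled rate under the null. The counts below are hypothetical, chosen only to echo the template example; substitute your own send and click totals.

```python
# Sketch of the A/B framing: the null says both templates share one true rate.
import math

clicks_old, sends_old = 340, 5_000   # old template (hypothetical counts)
clicks_new, sends_new = 375, 5_000   # new template (hypothetical counts)

rate_old = clicks_old / sends_old
rate_new = clicks_new / sends_new

# Under the null, the best estimate of the shared rate pools both versions, and
# the difference in observed rates wobbles around zero with this standard error.
pooled = (clicks_old + clicks_new) / (sends_old + sends_new)
se_diff = math.sqrt(pooled * (1 - pooled) * (1 / sends_old + 1 / sends_new))

z = (rate_new - rate_old) / se_diff
print(f"Difference: {rate_new - rate_old:.2%} ({z:.2f} standard errors from zero)")
```

Note that with these made-up counts the gap is under two standard errors from zero: unlike the earlier comparison against a long-run baseline, here the old version's rate is itself estimated from a single send, so the bar for calling the difference real is higher.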

The null hypothesis isn't a prediction that nothing changed. It's a tool for disciplined thinking, forcing you to measure how surprising a result would be under the boring explanation before you accept the exciting one.


See It

Drag the observed result line to see whether different outcomes fall within or outside the range expected under the null hypothesis. Adjust the baseline rate and sample size to see how they change the range.


Reflect

Think about the last time your team celebrated a campaign improvement. A higher open rate, more petition signatures, a better donation conversion. Did anyone ask whether the difference could have been chance? What would it have taken for someone to raise that question?

When your organization decides to keep a "winning" email template or donation page layout, is that decision based on evidence strong enough to rule out the null hypothesis, or on a gut feeling that the numbers look better?