Standard Deviation and Variance
You run two recurring email campaigns. Campaign A averages a 4% click rate across the last twelve sends. Campaign B also averages 4%. Your communications lead says they perform the same. But when you look at the individual sends, Campaign A ranges from 3.5% to 4.5%. Campaign B swings between 1% and 9%. Those are not the same campaign. One is steady. The other is a coin toss.
The mean alone, which we explored in Day 1, cannot tell you this. It collapses all the variation into a single number and throws the rest away. To see the full picture, you need a way to measure how spread out your data is. That's what standard deviation does.
Think of it this way. The mean is the center of gravity of your data. The standard deviation tells you how tightly everything clusters around that center. A small standard deviation means most values are close to the average. A large one means they're scattered all over the place. Campaign A has a small standard deviation. Campaign B has a large one.
The calculation itself is straightforward in principle, even if the name sounds intimidating. You start by finding the mean. Then, for each data point, you measure how far it sits from the mean. Some points are above, some below. You square those distances (which makes them all positive and punishes big deviations more than small ones). Then you average all those squared distances. That average of squared distances is called the variance. Take the square root of the variance, and you get the standard deviation, which brings you back to the original units. If your data is in percentages, the standard deviation is also in percentages. If it's in euros, the standard deviation is in euros.
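The steps above are short enough to write out by hand. Here's a sketch in Python, using made-up click rates for the two campaigns from the opening example (both lists average exactly 4%, but they spread very differently):

```python
import math

# Hypothetical click rates (%) for twelve sends of each campaign.
# Both average exactly 4.0% -- the spread is what differs.
campaign_a = [3.5, 3.8, 4.0, 4.2, 3.9, 4.1, 4.5, 3.7, 4.3, 3.6, 4.4, 4.0]
campaign_b = [1.0, 9.0, 2.0, 4.0, 2.0, 6.0, 3.0, 5.0, 1.5, 6.5, 3.5, 4.5]

def std_dev(data):
    mean = sum(data) / len(data)                  # step 1: find the mean
    squared = [(x - mean) ** 2 for x in data]     # step 2: square each distance from the mean
    variance = sum(squared) / len(data)           # step 3: average the squared distances
    return math.sqrt(variance)                    # step 4: square root -> back to original units

print(f"Campaign A: mean {sum(campaign_a)/len(campaign_a):.2f}%, sd {std_dev(campaign_a):.2f}%")
print(f"Campaign B: mean {sum(campaign_b)/len(campaign_b):.2f}%, sd {std_dev(campaign_b):.2f}%")
# Campaign A's sd comes out around 0.3 percentage points; Campaign B's around 2.3.
```

One detail worth knowing: this version divides by the number of data points, which is the "population" variance. Statisticians often divide by one less than that when working from a sample; for a dozen or more values the difference is small, but you'll see both in software.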
Standard deviation and variance measure the same thing, just on different scales. Variance is useful in formulas and statistical models. Standard deviation is useful when you want to talk about your data in terms people actually understand. If someone tells you the average donation is €30 with a standard deviation of €5, you can picture most donations landing between €25 and €35. If the standard deviation is €50, the average is nearly meaningless because donations are all over the map. When outliers inflate the standard deviation, the interquartile range from Day 3 offers a more resistant alternative.
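In practice you rarely compute this by hand. Python's built-in `statistics` module does it directly; here's a sketch with hypothetical donation amounts, checking the "most donations land within one standard deviation of the mean" picture:

```python
import statistics

# Hypothetical donation amounts in euros
donations = [25, 28, 30, 31, 27, 33, 35, 29, 32, 30]

mean = statistics.mean(donations)
sd = statistics.pstdev(donations)       # population standard deviation (divides by n)
var = statistics.pvariance(donations)   # variance, in squared euros

# Standard deviation is just the square root of variance
assert abs(sd ** 2 - var) < 1e-9

# How many donations fall within one standard deviation of the mean?
within = [d for d in donations if mean - sd <= d <= mean + sd]
print(f"mean EUR {mean:.2f}, sd EUR {sd:.2f}; "
      f"{len(within)} of {len(donations)} donations within one sd of the mean")
```

Note that `statistics` also offers `stdev` and `variance`, the sample versions that divide by n − 1; for large datasets the two give nearly identical answers.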
This shows up constantly in nonprofit work. When you report program outcomes to a funder, saying "participants improved their test scores by an average of 12 points" sounds great. But if the standard deviation is 20 points, that means some participants improved by 30 and others got worse. The funder deserves to know both numbers. In A/B testing, the standard deviation of your metric determines how long you need to run the test before you can trust the results. High variance means you need more data. In budgeting, knowing the standard deviation of monthly donation revenue tells you how much cash reserve you actually need. An organization with steady revenue and one with volatile revenue need very different buffers, even if their annual totals match.
Volunteer shift attendance is another good example. If your food bank schedules 20 volunteers per shift and an average of 18 actually show up, that sounds fine. But if the standard deviation is 6, some days you get 12 and other days you get 24. You need a plan for both extremes, not just the average.
The average tells you where the center is. The standard deviation tells you whether you can trust the center. A tight spread means the average is reliable. A wide spread means it's just a number.
See It
Drag the slider to change how spread out the donations are. Watch the standard deviation update and see how the same mean can tell very different stories.
Reflect
Think about a metric your organization tracks regularly, like email open rates, monthly donations, or event attendance. You probably know the average. But do you know how much it varies from month to month? Would it change any decisions if you did?
When your team says a campaign "usually" gets a certain result, are they describing a tight cluster or a wide range? How confident should you actually be in that "usually"?