This time I will take the opportunity to talk about how to calculate the VARIANCE
functions. I recently blogged about the COVARIANCE
functions and since also VARIANCE
is missing in SQL Server, I wanted to give it a shot. The general formula for calculating the variance value for population is
As you can see in it’s original format, the function is a two-pass operation. First calculate the average value and the subtract that average value from each row in the current dataset.
I don’t find that good in terms of performance, so I figured out how to rewrite the function to be a one pass operation.
Can we apply the same thinking again when calculating the variance value for sample? Yes, we can.
Now we have a one pass operation and we can test the formula like this