VMware Modern Apps Community
AbhishekSK
Hot Shot

How do I calculate a standard deviation in Wavefront?

Have you ever wanted to know how much a particular series deviates from the group as a whole? Standard deviations can help you accomplish this and are possible in Wavefront! A basic standard deviation can be graphed by utilizing the following expression:

(ts("requests.latency", tag="az-3") - avg(ts("requests.latency", tag="az-3"))) / sqrt(variance(ts("requests.latency", tag="az-3")))

In addition to the standard deviation query above, we've added constant lines at 1.0, 0, and -1.0 to provide visual boundaries for the standard deviation. A standard deviation query in Wavefront requires the base ts() call to be referenced 3 times. In the query above, the base ts() query is ts("requests.latency", tag="az-3"). To save time when creating your query, you can use variables. The following is an example of a query line variable.

Note how we label the first query line in the image above as "base" and then reference "base" as a query line variable in the second query line.
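To make the arithmetic behind the basic expression concrete, here is a minimal Python sketch of what it computes at a single timestamp: each source's value minus the group mean, divided by the square root of the group variance. The sample latencies and the function name are hypothetical, and the sketch assumes population variance; whether Wavefront's variance() uses the population or sample form is not stated in this thread.

```python
from math import sqrt
from statistics import fmean, pvariance


def deviations_from_group_mean(values):
    """Return (x - mean) / sqrt(variance) for each value reported at one instant."""
    mean = fmean(values)
    std_dev = sqrt(pvariance(values))  # assumption: population variance
    return [(x - mean) / std_dev for x in values]


# Hypothetical latencies reported by four sources at the same timestamp.
latencies = [100.0, 110.0, 90.0, 140.0]
scores = deviations_from_group_mean(latencies)
```

A source that sits exactly on the group mean scores 0, sources below the mean score negative, and sources above it score positive.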

There are several variations that you can apply to the standard deviation query. For example, you can use mavg() and mvar() to determine how much a source deviates from its own past history.

(ts("requests.latency", tag="az-3") - mavg(120m, ts("requests.latency", tag="az-3"))) / sqrt(mvar(120m, ts("requests.latency", tag="az-3")))
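The moving variant can be sketched the same way for a single series. This is only an illustration under stated assumptions: the window is counted in points rather than in minutes as mavg(120m, ...) does, population variance is assumed, and the function name and sample data are hypothetical.

```python
from collections import deque
from math import sqrt
from statistics import fmean, pvariance


def moving_deviation(points, window):
    """Return (x - moving mean) / sqrt(moving variance) over the last `window` points."""
    recent = deque(maxlen=window)
    result = []
    for x in points:
        recent.append(x)
        std_dev = sqrt(pvariance(recent)) if len(recent) > 1 else 0.0
        # Leave a gap (None) while the window has no spread to divide by.
        result.append((x - fmean(recent)) / std_dev if std_dev > 0 else None)
    return result


# Hypothetical history for one source; the last point is a spike.
history = [10.0, 10.0, 12.0, 8.0, 10.0, 30.0]
scores = moving_deviation(history, window=6)
```

The final point lands a little over two standard deviations above the series' own recent mean, which is the kind of excursion this query is meant to surface.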

You can also determine how the average of a set of series compares to its past history by applying avg() to the base query.

(avg(ts("requests.latency", tag="az-3")) - mavg(120m, avg(ts("requests.latency", tag="az-3")))) / sqrt(mvar(120m, avg(ts("requests.latency", tag="az-3"))))

Feel free to comment on this thread and share your standard deviation variations in the comments below!

16 Replies
AbhishekSK
Hot Shot

Jason, this is very helpful.  These examples are perfect.  Couple questions.

I'm curious what your thoughts are about retrieving negative standard deviation values in your examples.  I'm wondering if you would expect the SD values to all be positive?

Also, would you suggest calculating standard deviation with the align() function if I were interested in finding the SD for servers for each week?

AbhishekSK
Hot Shot

Hi Peter,

By definition, the standard deviation is never negative. Here's the equation: [image: Screen Shot 2016-03-23 at 4.57.07 PM.png]

It's been a while since I've studied any of this, but you are basically taking each value's distance from the mean and squaring the result; the average of those squares is the variance. Because of the square, it's never negative. Then apply the sqrt.

With regard to using align, I think that's an interesting use case. I would try:

sqrt(variance(align(1w, ts(expression))))

That way the align is done on the series before the other math. Let me know how that works out.

-Greg

AbhishekSK
Hot Shot

Jason's formula also seems to incorporate the below bolded in calculating standard deviation:

(ts("requests.latency", tag="az-3") - avg(ts("requests.latency", tag="az-3"))) / sqrt(variance(ts("requests.latency", tag="az-3")))

Should this be included in calculating SD?  One of the reasons why I asked about the negative values, is because Jason's results seem to show SD < 0.  That said, maybe I'm missing something.

I'm trying to find the standard deviation of a particular metric (e.g. "memory.used") for all servers for each week. I tried sqrt(variance(ts("memory.used"))) without the align as a test, and the results seem to be in units of "G", which seems a bit surprising. I'm having trouble understanding whether the issue is related to my query or is an odd result along the Wavefront graph axis.

AbhishekSK
Hot Shot

peter_han_grassman  In the expression used in the example(s) above, the numerator portion is utilized in order to determine the mean. That combined with the denominator allows us to determine the standard deviation from the mean. I would suggest using the numerator above in your expression as well

While the images above do include negative values, they are still considered 'X' number of standard deviations from the mean. The difference being that a positive value means that the reported value at that time is 'X' number of deviations above the mean, and a negative value means the reported value at that time is 'X' number of deviations below the mean.

The format of the numerator should be based on your use case. If I'm understanding this use case correctly, you want to calculate a typical standard deviation over time, but then want to take those reported deviations for an entire week and average them together to determine what the average deviation is for the entire week. Is that correct?

If so, then I think you want something like this:

align(1w, (ts("memory.used") - avg(ts("memory.used"))) / sqrt(mvar(1d, ts("memory.used"))))

This expression calculates a standard deviation from the mean, and then takes a week's worth of deviation values and averages those together. This approach would show a value for each unique series in the expression. If you wanted to calculate the standard deviations for the average of all the servers instead, then simply make the following change:

align(1w, (avg(ts("memory.used")) - mavg(1d, avg(ts("memory.used")))) / sqrt(mvar(1d, avg(ts("memory.used")))))

If I remember correctly, align(1w,) always appeared on a Thursday for you. If you want to see the weekly value on a different day, then utilize the if() statement below:

if(weekday("US/Pacific") = 1 and hour("US/Pacific") = 0, (avg(ts("memory.used")) - mavg(1d, avg(ts("memory.used")))) / sqrt(mvar(1d, avg(ts("memory.used")))))

I hope these help!

AbhishekSK
Hot Shot

jason_goocher, thank you! This is very helpful, especially the context regarding the values of the numerator.

That said, I'm not sure these examples fit my need. Apologies if I wasn't clear. I'd like to get the standard deviation for all servers at one-week intervals, similar to the concept of finding the avg of a metric at one-week intervals (e.g. with align). So for x number of servers, find each respective SD value for each week. What I'm interested in is to "calculate a typical standard deviation over time" for each server at one-week increments. Does that make sense?

Thanks!

AbhishekSK
Hot Shot

Hi Peter,
I am glad you found Jason's insight on Std Dev useful. You could very well apply similar logic on timeseries from each server.

To help answer your question, I have created a sample query using the following cpu metric

This query gives you the Std Dev for each server over a 7-day (1 week) moving window:

((ts("cpu-load.load1", source=*)) - mavg(7d, (ts("cpu-load.load1", source=*)))) / sqrt(mvar(7d, (ts("cpu-load.load1", source=*))))

Then, if you only want to look at the Std Dev at a weekly interval, you can simply align it at the 1-week level and choose the desired summarization function.

In this case I am picking max to identify the largest deviation that occurred in that week:

align(7d, max, (((ts("cpu-load.load1", source=*)) - mavg(7d, (ts("cpu-load.load1", source=*)))) / sqrt(mvar(7d, (ts("cpu-load.load1", source=*))))))

Hope this is helpful. Please let me know if you have any further questions that I can help answer.

Thanks,
Salil D

AbhishekSK
Hot Shot

salil@wavefront.com this explanation is very helpful. My understanding is that the std. dev. formula calculates std. dev. for a moving 7-day window. Is this std. dev. formula reporting the std. dev. for each day with a moving 7-day window?

When working with the align time series function for std. deviation, it sounds like I have to choose a summarization function. Is it possible to show the std. dev. at a weekly interval without an aggregation function (e.g. max, avg, etc.)? Perhaps you can simply obtain the std. dev. for a respective week?

greg@wavefront.com, jason_goocher, salil@wavefront.com : thanks again!

AbhishekSK
Hot Shot

Hi Peter,

The above standard dev formula is using moving functions like mavg and mvar and so yes this formula calculates std dev for each time series over a moving 7 day window.

The underlying data in this example is reported every 5 minutes, so if you want to look only at a weekly interval, it has to be aligned in some form or other. In my use case I aligned it and picked the max value over the 7 days to be shown for the week. If you simply want to show what the std dev was at the end of the week, you could use last instead of max, and that would show you the std dev value at the end of the week.
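The difference between summarizing with max and with last can be sketched in Python. This is an illustration of the bucketing idea only, not Wavefront's implementation; the align helper below, its bucket boundaries, and the sample points are all hypothetical.

```python
from collections import defaultdict

WEEK_SECONDS = 7 * 24 * 60 * 60


def align(points, bucket_seconds, summarize):
    """Group (timestamp, value) pairs into fixed buckets and summarize each bucket."""
    buckets = defaultdict(list)
    for timestamp, value in sorted(points):
        buckets[timestamp - timestamp % bucket_seconds].append(value)
    return {start: summarize(values) for start, values in buckets.items()}


def last(values):
    """Pick the most recent value in a bucket (values arrive in time order)."""
    return values[-1]


# Hypothetical deviation values at 5-minute spacing, spanning two weekly buckets.
points = [(0, 0.5), (300, 1.8), (600, 0.2),
          (WEEK_SECONDS, -0.4), (WEEK_SECONDS + 300, 0.9)]
weekly_max = align(points, WEEK_SECONDS, max)
weekly_last = align(points, WEEK_SECONDS, last)
```

With max, the first week reports its peak of 1.8; with last, it reports 0.2, the value at the end of that week.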

Hope this helps.

-Salil D

AbhishekSK
Hot Shot

salil@wavefront.com thanks.

From my experience, I'm aware that the align(1w,) aligns data on a Thursday.  jason_goocher mentioned that align(1w,) syncs with Thursday and the following week.  

In looking at align(7d, max, (((ts("cpu-load.load1", source=*)) - mavg(7d, (ts("cpu-load.load1", source=*)))) / sqrt(mvar(7d, (ts("cpu-load.load1", source=*)))))) , I'm wondering if I can align standard deviation for a particular day of the week, and ensure that I can obtain the SD for the following respective week (eg. align(1w,)).  Right now, I see data synced at Wednesdays.  I also believe this data is showing max SD for the previous 7 days.
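One possible explanation for the Thursday and Wednesday observations, offered as an assumption rather than documented Wavefront behavior: if weekly buckets are counted from the Unix epoch, every bucket starts at 00:00 UTC on a Thursday, because 1970-01-01 was a Thursday. In US/Pacific that same instant falls on Wednesday afternoon. The Python sketch below checks this; the sample timestamp and the fixed -8 hour offset (which ignores daylight saving time) are illustrative choices.

```python
from datetime import datetime, timedelta, timezone

WEEK_SECONDS = 7 * 24 * 60 * 60
PST = timezone(timedelta(hours=-8))  # fixed offset; ignores daylight saving time


def week_bucket_start(epoch_seconds):
    """Start of the weekly bucket, assuming buckets are counted from the Unix epoch."""
    return epoch_seconds - epoch_seconds % WEEK_SECONDS


# 2016-03-28 12:00:00 UTC, a Monday.
start = week_bucket_start(1459166400)

# datetime.weekday(): Monday is 0, Wednesday is 2, Thursday is 3.
utc_weekday = datetime.fromtimestamp(start, tz=timezone.utc).weekday()
pst_weekday = datetime.fromtimestamp(start, tz=PST).weekday()
```

Under that assumption the bucket start is a Thursday in UTC and a Wednesday in Pacific time, which would match both observations in this thread.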

I'm working under the assumption that moving functions will give you the 1-week average for the previous week. I'm also aware that users can hard-code the start date with a condition such as weekday("US/Pacific") = X and hour("US/Pacific") = 0 inside if(). With the align and moving functions, I just want to confirm which dates the SD calculation relates to.

AbhishekSK
Hot Shot

Hi Peter,

You can use what Jason recommended and control what day of the week and time you want the standard deviation to calculate/display for.

For example, this query will calculate and display the standard deviation for Monday:

if(weekday("US/Pacific") = 1 and hour("US/Pacific") = 0, (((ts("cpu-load.load1", source=*)) - mavg(1w, (ts("cpu-load.load1", source=*)))) / sqrt(mvar(1w, (ts("cpu-load.load1", source=*))))))

And changing the weekday  expression to 2 will compute it for Tuesday and so on.

In this case you do not need to use align, as you are using the if condition to sample the Std Dev values for a particular day/time of the week directly.

I believe this is the solution to your use case. If you wanted to apply some summarization you can use align as we discussed above.

And yes, your assumptions are correct. The moving functions compute the values over the moving time window (1 week in this example). Then, using the if condition, you are sampling and displaying only the std dev values at the particular day/time of the week.

Hope this is helpful.

Thanks,

Salil D

AbhishekSK
Hot Shot

(ts("requests.latency", tag="az-3") - avg(ts("requests.latency", tag="az-3"))) / sqrt(variance(ts("requests.latency", tag="az-3")))

Jason commented that : "...In the expression used in the example(s) above, the numerator portion is utilized in order to determine the mean. That combined with the denominator allows us to determine the standard deviation from the mean. I would suggest using the numerator above in your expression as well

While the images above do include negative values, they are still considered 'X' number of standard deviations from the mean. The difference being that a positive value means that the reported value at that time is 'X' number of deviations above the mean, and a negative value means the reported value at that time is 'X' number of deviations below the mean."

Would it be more accurate to say this equation is calculating the number of standard deviations away from the mean, rather than this equation is calculating the standard deviation?  I'm trying to wrap my head around why an equation like : "sqrt(variance(align(1w, ts(expression))))" wouldn't provide the standard deviation.

AbhishekSK
Hot Shot

Hi Peter,

Yes, you are right that "sqrt(variance(align(1w, ts(expression))))" would give you the absolute standard deviation value of a dataset, and what we are using in the above expression is the number of standard deviations above or below the mean. You can use the sqrt(variance(align(1w, ts(expression)))) value if you are actually interested in finding the absolute Std Dev value of your data set.

However in most cases standard deviation is used to measure the spread or the dispersion of the dataset rather than the central tendency of the data set so it usually is represented as the deviation from the mean.

This allows you to remove the magnitude of the actual dataset and compare two datasets of different magnitudes solely on the basis of their dispersion. This is the reason that, in the above example, we use the std dev from the mean as a measure to compare time series that may have different magnitudes from each other.
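A small Python check illustrates why dividing by the standard deviation removes magnitude. The two datasets and the helper name are hypothetical, and population variance is assumed.

```python
from math import isclose, sqrt
from statistics import fmean, pvariance


def std_devs_from_mean(values):
    """Express each value as a number of standard deviations from the mean."""
    mean = fmean(values)
    std_dev = sqrt(pvariance(values))  # assumption: population variance
    return [(x - mean) / std_dev for x in values]


small = [1.0, 2.0, 4.0, 9.0]
large = [x * 1_000_000 for x in small]  # same shape, a million times larger

# The absolute standard deviations differ by the same factor of a million...
abs_small = sqrt(pvariance(small))
abs_large = sqrt(pvariance(large))

# ...but the deviations from the mean, in units of std dev, are identical.
same_shape = all(
    isclose(a, b, abs_tol=1e-9)
    for a, b in zip(std_devs_from_mean(small), std_devs_from_mean(large))
)
```

This is why a single threshold such as 2 can be applied to series whose raw values differ by orders of magnitude.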

For datasets that follow a normal distribution, the standard deviation can be used to determine the proportion of values that lie within a particular range of the mean value. For such distributions, approximately 68% of values are less than one standard deviation (1SD) away from the mean, approximately 95% are less than two standard deviations (2SD) away, and approximately 99.7% are less than three standard deviations (3SD) away.
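The proportions for a normal distribution can be checked with the error function, since the fraction of values within k standard deviations of the mean is erf(k / sqrt(2)). A short Python sketch (the function name is just for illustration):

```python
from math import erf, sqrt


def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))


# Percentages within 1, 2, and 3 standard deviations, rounded to two decimals.
coverage = {k: round(100 * fraction_within(k), 2) for k in (1, 2, 3)}
```

The results come out to roughly 68.27%, 95.45%, and 99.73%. These figures hold only for normally distributed data; many operational metrics are skewed, so treat them as a rule of thumb.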

We usually use these std-dev-from-mean values as thresholds in our alerts. The same threshold can then be applied to time series of widely different scales, because it looks at the spread of each series independently to spot anomalies.

Here is another article that I published recently that talks about this in further detail. Please take a look and hope you find this useful.

Also, here are a few articles that I found very helpful while I was learning and trying to wrap my head around these statistical functions:

Ref1:Measures of variability

Ref2: Standard Dev and Variance

Ref3: Mean,Median,Standard Deviation

Hope this is helpful.

Thanks,

Salil D

AbhishekSK
Hot Shot

Salil,

Can you please confirm the syntax in retrieving "absolute" standard deviation from Wavefront?  My graph seems to time out when I attempt to retrieve these values from the graph.

sqrt(variance(align(1w, ts("memory.used"))))

I'm interested in finding the absolute standard deviation over time for each server at one week increments.

Thanks!

AbhishekSK
Hot Shot

Hi Peter,

In this case you want to find out the std dev of each server ( each timeseries) over a week's worth of data. To do this you need to use the moving function to calculate variance of each series over a week.

Here is a query that would give you the std dev values for each server and align it to each week

align(1w,last,sqrt(mvar(1w,ts("memory.used", tag=*prod*))))

One more thing I would recommend is to reduce the number of time series you want to display by filtering the servers with source= and/or tag= filters, so that the number of series to be plotted stays below 10000 (which is usually the limit on the number of time series that can be rendered on a chart).

Hope this is helpful

Thanks,

Salil D

AbhishekSK
Hot Shot

For example, for align(1w,last,sqrt(mvar(1w,ts("memory.used", tag=*prod*)))), can you please confirm the units associated with this query? I assume that the units are based on the underlying metric, which in this case would be bytes. If I were to use a metric like "cpu.cpuidle", SD units would be expressed in %, as in 5.5% (not 0.055).

AbhishekSK
Hot Shot

Hi Peter,

Yes. The units in this case would be same as that of the underlying metric.

If the underlying metric represents bytes, the unit for the absolute std deviation would also be bytes. And for cpu.idle, the unit would be % if the underlying metric is a %.

As you can see in this article which I referenced earlier, the unit of the absolute standard deviation is same as that of the underlying data. I am  also sending you a separate email directly with links to actual metric in your Workday instance for your reference.

Also on a related note, I want to restate that the values you see in Wavefront charts are represented in SI metric notation

So, for example, when you see 135.25 G, it is not 135.25 GB; it is 135.25 * 10^9. If the metric is reporting bytes, that is approximately (135.25 * 10^9) / (1024^3) = 125.96 GB in binary (1024-based) units.
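The conversion above can be reproduced in a few lines of Python. The prefix table and function name are illustrative and not part of any Wavefront API.

```python
# SI prefixes as powers of ten, which is how the chart axis values are described here.
SI_PREFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def si_to_binary_gigabytes(value, prefix):
    """Convert an SI-prefixed byte count to 1024-based gigabytes."""
    return value * SI_PREFIXES[prefix] / 1024**3


binary_gb = si_to_binary_gigabytes(135.25, "G")
```

Rounded to two decimals, binary_gb is 125.96, matching the figure above.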

Hope this helps.

Thanks,

Salil D
