VMware Modern Apps Community
vovagalchenko
Contributor

Unexpected Summarize By Behavior

Please consider the following two graphs:

1. https://box.wavefront.com/u/z1zm2Mf5PL

2. https://box.wavefront.com/u/lKHrNKVhFC

Note that the two are charting the same data, but (2) is looking at a narrower time range. Note also that (1) and (2) look very different. My understanding is that this can only happen due to summarization. When we zoom out, each drawn point represents several actual values, and they're merged into one using the "Summarize By" function. However, in this case, the Summarize By function I chose was `Max`, so as I zoom out, I wouldn't expect to see lower values than I see when I'm zoomed in. That's not the behavior I'm seeing though. With this I have a few questions:

- Is my understanding of Summarize By as described above correct?

- Is Summarize By applied at the end of all time series transformations?

- How can it be that with Summarize By being set to `Max`, I'm seeing lower values when I'm zoomed out, than when I'm zoomed in?

Thank you for your help.

vova


Accepted Solutions
jason_goocher
VMware Employee

Hi vovagalchenko,

Thanks for sending this our way. Let me address each of these separately.

-"Is my understanding of Summarize By as described above correct?"

A: It absolutely is correct. The number of data values bucketed together typically increases as the chart time window widens. When the returned data stays the same and the "Summarize By" option is set to "Max", you shouldn't expect to see a lower max value in a larger time window than in a smaller one.
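To make the bucketing idea concrete, here is a minimal Python sketch. It is not how Wavefront implements summarization; `summarize_max` and the fixed `bucket_size` are hypothetical names, and real buckets are based on time rather than point counts. It only illustrates why a max summary over unchanged data cannot dip below the raw values it covers.

```python
def summarize_max(values, bucket_size):
    """Collapse each run of `bucket_size` raw values into one drawn point (its max).

    Hypothetical helper: buckets here are fixed point counts, a simplification
    of time-based bucketing.
    """
    return [max(values[i:i + bucket_size]) for i in range(0, len(values), bucket_size)]


raw = [3, 7, 2, 9, 4, 1, 8, 5]

zoomed_in = summarize_max(raw, 1)   # one raw value per drawn point
zoomed_out = summarize_max(raw, 4)  # four raw values per drawn point

print(zoomed_in)
print(zoomed_out)
```

Each drawn point in `zoomed_out` is at least as large as every raw value in its bucket, so if a zoomed-out chart shows lower values, the underlying data returned by the query must itself have changed.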

-"Is Summarize By applied at the end of all time series transformations?"

A: Again, you are absolutely correct. For something like mcount(30m, sum(ts(my.metric))), we execute the entire query, including all function transformations, before "Summarize By" is applied.

-"How can it be that with Summarize By being set to 'Max', I'm seeing lower values when I'm zoomed out, than when I'm zoomed in?"
A: This is the tricky question that I will need some time to answer. The query you are looking at seems to be highly nested and using several functions. My initial hypothesis is that it has something to do with the combination of integral() and if(), but I can't confirm that for certain just yet. This hypothesis is based on the fact that if() is a conditional statement and results from integral() are based on the selected chart time window. So as time windows increase, it could change what if() considers to be true, and integral() results would change as well which could impact the final results for ${basic_downs_min}.
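As a rough illustration of this hypothesis only, the Python sketch below gates a series on a running total that starts at the left edge of the window. The actual query is not shown in this thread, so `gated`, the threshold, and the data are all made up; the point is just that a conditional fed by a window-dependent accumulation can flip when the window changes.

```python
from itertools import accumulate


def gated(values, threshold):
    """Keep a value while the running total from the window start is <= threshold, else 0.

    Hypothetical stand-in for an if() wrapped around a window-dependent accumulation.
    """
    totals = accumulate(values)
    return [v if t <= threshold else 0 for v, t in zip(values, totals)]


raw = [2, 2, 2, 2, 5, 2]

narrow = gated(raw[3:], threshold=8)  # window starts at index 3
wide = gated(raw, threshold=8)        # window starts at index 0

print(narrow, max(narrow))
print(wide, max(wide))
```

In this toy case the wider window accumulates more before reaching the spike, the condition turns false earlier, and the maximum of the result drops even though the raw data is identical.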

Please allow us some time to fully investigate this nested query and we'll get back to you with our results. We appreciate your patience!

Thanks,

Jason

clementpang
VMware Employee

integral() is one of the few functions that violate one of our axioms: "queries should behave exactly the same way regardless of what window you are looking at" (the others being normalize() and the upcoming Holt-Winters function). This is because integral() sums from the left edge of the window to the right, accumulating as it goes. Hence the value at a particular time changes depending on where the window starts. Using msum() instead, for instance, makes the result stable. Hope that helps.
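The difference can be sketched in a few lines of Python. This is a simplified model, not Wavefront's implementation: `window_running_sum` and `moving_sum` are hypothetical helpers, the moving sum uses a fixed number of points rather than a time duration, and it assumes data before the window start is still available to look back on.

```python
from itertools import accumulate


def window_running_sum(series, start):
    """Accumulate from the left edge of the window; returns {index: value}."""
    return dict(zip(range(start, len(series)), accumulate(series[start:])))


def moving_sum(series, start, width):
    """Sum of the last `width` points at each index, looking back past the window start."""
    return {i: sum(series[max(0, i - width + 1):i + 1]) for i in range(start, len(series))}


series = [1, 2, 3, 4, 5, 6]

# Same point in time (index 4), two different window starts.
print(window_running_sum(series, 0)[4], window_running_sum(series, 3)[4])
print(moving_sum(series, 0, 2)[4], moving_sum(series, 3, 2)[4])
```

The running sum at index 4 differs between the two windows, while the moving sum gives the same value in both, which is the stability property described above.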