From the initial assessment this seems like a browser issue. Could you please clear your browser cache and try again?
Were you seeing this issue with all dashboards, or is there a particular one that was comparatively sluggish? Can you share the URL of the dashboard where you were seeing issues?
I also checked your instance, and the dashboards and charts are snappy for me in both Chrome and Firefox. Please let me know if you continue to see performance issues on your system and I will escalate this to our engineering team for a closer look.
Just reset the cache, and I haven't reproduced the CPU spiral of death in a couple minutes of use, but we'll see what happens tomorrow. It usually happens for me at least once a day but only after a bit of use.
10m live views are still very sluggish: for example, the bar that follows the pointer on the time axis lags behind by about 300ms, and the analogous bars on the other graphs only move if I hold still for about a second. I also tested in an incognito window with the same result.
I've also tested in Firefox. Right now I have it on https://metrics.wavefront.com/dashboard/pgbouncer#(g:(c:off,d:600,ls:!f,s:1470702403,w:%2710m%27)) and it's burning 60% CPU doing nothing (not even a live view). CPU usage stays at 60% even after a reload -- something I've not observed in Chrome. (Chrome only gets into a use-a-lot-of-cpu-doing-nothing state after some use)
Tests on Chrome on Mac OS also show high CPU usage, though the UI seems a little better at powering through. I still wouldn't call it "snappy", but it's certainly less bad.
We apologize for the issues you saw today. We did see a couple of GC events today on the metrics cluster that hosts your instance, so it is likely your experience overlapped those incidents and the stale browser cache made it even worse. Things should be back to normal now, but if the performance still does not return to normal ("snappy"), please do let me know!
I will also record your valuable observations above and share them with our front-end team for further investigation.
Unfortunately it's still not "snappy", and really never has been. Even immediately after clearing the cache, zapping the PRAM, and rebuilding the desktop, the 10m live view still burns 200% CPU and is anything but "snappy". Dragging the timeline is also especially bad: in every circumstance I've tried, it moves in 1s chops and runs the CPU at 130% or more.
Experimenting a bit with a few of our dashboards, the sluggishness seems proportional to the resolution of the metrics: those with 1s resolution are downright painful to use, 15s is just laggy, and 1m is usable but slightly choppy. Oddly, this seems to hold true regardless of the time scale. For example, if I zoom out to 1 week on a dashboard with 1m-resolution metrics, the load times are a bit slower (understandable) but the UI responsiveness seems unchanged, which doesn't make much sense, since 1 week at one point per minute has well over an order of magnitude more points than 10 minutes at one point per second.
The time interval makes little difference because the API "buckets" points into single values so that queries over large ranges can actually be rendered and returned; otherwise 5s points over a year would not only be very difficult to handle but also redundant (you'd have multiple values landing on the same pixel on your screen). Which value is returned for a bucket is controlled by a setting on the chart, which defaults to "average". The UI uses the width of the chart to determine how many buckets to ask for.
Consequently, when you look at dashboards on a large screen, or on a retina or higher-density display, more buckets are requested and returned to the browser, and more SVG path points are rendered. Given this, the resolution of the points actually makes little difference to the frontend, though it will have some effect on the backend when calculating the bucket values.
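To make the mechanism concrete, here is a minimal sketch of that kind of pixel-driven bucketing with "average" summarization. This is illustrative only; the function name and signature are made up for this example and are not Wavefront's actual API:

```python
def bucket_points(points, start, end, chart_width_px):
    """Downsample (timestamp, value) pairs into one value per pixel-wide bucket.

    points: iterable of (timestamp, value) with timestamps in [start, end)
    Returns a list of (bucket_start_timestamp, average_value), skipping
    buckets that received no points.
    """
    n_buckets = chart_width_px            # one bucket per horizontal pixel
    bucket_span = (end - start) / n_buckets
    sums = [0.0] * n_buckets
    counts = [0] * n_buckets
    for ts, value in points:
        # clamp to the last bucket so ts == end - epsilon doesn't overflow
        i = min(int((ts - start) / bucket_span), n_buckets - 1)
        sums[i] += value
        counts[i] += 1
    return [
        (start + i * bucket_span, sums[i] / counts[i])
        for i in range(n_buckets)
        if counts[i]
    ]
```

Because the bucket count is driven by the chart's pixel width (and doubles again on a 2x retina display), the browser's workload scales with screen size rather than with the raw resolution of the series, which matches the behavior described above.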