VMware Modern Apps Community
AbhishekSK
Hot Shot

Why does Wavefront use so much CPU on Chrome/Ubuntu?

Wavefront is super CPU heavy and unresponsive on my system. I'm running Google Chrome on Ubuntu 15.10, on an 8-core i7 at 3.6 GHz with 16 GiB RAM.

Google Chrome: 48.0.2564.103 (Official Build) (64-bit)
Revision: cc2c23a615e1c898bc58b1fb818c41f214f509d8-refs/branch-heads/2564@{#657}
OS: Linux
Blink: 537.36 (@cc2c23a615e1c898bc58b1fb818c41f214f509d8)
JavaScript: V8 4.8.271.18
Flash: 22.0.0.209
User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36
Command Line: /usr/bin/google-chrome --flag-switches-begin --flag-switches-end

Usually upon first load of a dashboard the CPU usage (according to top) will shoot up to 200% or even 300%, and then once everything is rendered it calms back down if I'm doing nothing. Hovering over points will make it rocket back up: I just observed 370% CPU just moving my mouse over some points on a graph. Dragging the time axis also pegs the CPU and it takes about 2 seconds between updates as I drag.

I can't leave 10m live views open because they make the whole browser sluggish.

If I play with a dashboard a bit, the CPU will get stuck at 100% even when that tab is in the background and there's no live data. I haven't come up with a precise procedure to repro this state, but just using a dashboard normally for a while seems to get me there more often than not. Maybe change the time scale a few times, zoom in on some things, zoom back out, maybe edit one of the panels. Once it's there, even something simple like scrolling takes 6 seconds. Switching to another dashboard will take tens of seconds to load each panel.

Experiences in Firefox are similar.

Has anyone else experienced this? Any solutions?

5 Replies
AbhishekSK
Hot Shot

Hi Phil,

From the initial assessment, this seems like a browser issue. Can you please try clearing your browser cache and try again?

Were you seeing this issue with all dashboards, or is there a particular one that was comparatively sluggish? Can you share the dashboard URL where you were seeing issues?

I also checked your instance, and the dashboards and charts are snappy for me in both Chrome and Firefox. Please let me know if you continue to see performance issues on your system, and I will escalate this to our engineering team to take a closer look.

-Salil D

AbhishekSK
Hot Shot

Just reset the cache, and I haven't reproduced the CPU spiral of death in a couple minutes of use, but we'll see what happens tomorrow. It usually happens for me at least once a day but only after a bit of use.

10m live views are still very sluggish. For example, the bar that follows the pointer on the time axis lags behind by about 300ms, and the analogous bars on the other graphs only move if I hold still for about a second. I also tested in an incognito window, with the same result.

I've also tested in Firefox. Right now I have it on https://metrics.wavefront.com/dashboard/pgbouncer#(g:(c:off,d:600,ls:!f,s:1470702403,w:%2710m%27)) and it's burning 60% CPU doing nothing (not even a live view). CPU usage stays at 60% even after a reload -- something I've not observed in Chrome. (Chrome only gets into a use-a-lot-of-cpu-doing-nothing state after some use)

Tests on Chrome on Mac OS also show high CPU usage, though the UI seems a little better at powering through. I still wouldn't call it "snappy", but it's certainly less bad.

AbhishekSK
Hot Shot

We apologize for the issues you saw today. We did see a couple of GC events today on the metrics cluster that hosts your instance, so it is likely that your experience overlapped with these incidents and the browser cache made it even worse. Things should be back to normal now; if the performance still does not return to normal ("snappy"), please do let me know!

Also, I will record your valuable input above and share it with our front-end team for further investigation.

Thanks,

Salil D

AbhishekSK
Hot Shot

Unfortunately it's still not "snappy", and really never has been. Even immediately after clearing the cache, zapping the pram, and rebuilding the desktop, the 10m live view still burns 200% CPU and is anything but "snappy". Dragging the timeline is also especially bad: in every circumstance I've tried it moves in 1s chops and will run the CPU at 130% or more.

Experimenting a bit with a few of our dashboards, the sluggishness seems proportional to the resolution of the metrics. Those with 1s resolution are downright painful to use, 15s is just laggy, and 1m is usable but slightly choppy. Oddly, this seems to hold regardless of the time scale: for example, if I zoom out to 1 week on a dashboard with 1m resolution metrics, the load times are a bit slower (understandable) but the UI responsiveness seems unchanged. That doesn't make much sense, as 1 week at one point per minute has more than an order of magnitude more points (10,080) than 10 minutes at one point per second (600).

admin
Immortal

The time interval makes little difference because the API "buckets" points into single values so that queries over large ranges can actually be returned and rendered. Otherwise, 5s points over a year would be not only very difficult to handle but also redundant (you'd have multiple values at the same pixel on your screen). Which value is returned for a bucket is controlled by a setting on the chart; it defaults to "average". The UI uses the width of the chart to determine how many buckets to ask for.

Consequently, when you look at dashboards on a large screen, or on a Retina or higher-density display, more buckets are requested and returned to the browser, and more SVG path points are rendered. Given this, the resolution of the points makes little difference to the frontend, but it will have some effect on the backend, which has to calculate each bucket value.
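
To make the bucketing idea concrete, here is a rough sketch in Python. It is not Wavefront's actual implementation: the function name `bucket_points`, the equal-width bucket boundaries, and the 300-bucket chart are all assumptions made for illustration. It only shows why the number of points reaching the browser tracks the bucket count rather than the raw metric resolution.

```python
from statistics import mean


def bucket_points(points, start, end, n_buckets, summarize=mean):
    """Collapse (timestamp, value) pairs in [start, end) into at most
    n_buckets points, one summarized value per non-empty bucket.

    Hypothetical helper: equal-width buckets and the default "average"
    summary are assumptions for this sketch.
    """
    width = (end - start) / n_buckets
    buckets = [[] for _ in range(n_buckets)]
    for ts, value in points:
        if start <= ts < end:
            # Clamp to guard against floating-point edge cases near `end`.
            index = min(int((ts - start) / width), n_buckets - 1)
            buckets[index].append(value)
    # Empty buckets are skipped, so sparse data yields fewer points.
    return [
        (start + i * width, summarize(values))
        for i, values in enumerate(buckets)
        if values
    ]


WEEK = 7 * 24 * 3600

# Synthetic data: 10 minutes at 1s resolution, and 1 week at 1m resolution.
ten_min_1s = [(t, float(t % 7)) for t in range(0, 600)]
one_week_1m = [(t, float(t % 7)) for t in range(0, WEEK, 60)]

# A hypothetical chart that asks for 300 buckets in both cases.
a = bucket_points(ten_min_1s, 0, 600, 300)
b = bucket_points(one_week_1m, 0, WEEK, 300)

print(len(ten_min_1s), "->", len(a))   # 600 -> 300
print(len(one_week_1m), "->", len(b))  # 10080 -> 300
```

In this sketch both queries end up drawing the same 300 points, even though the raw series differ in size by a factor of about 17. Asking for more buckets, as a wider or higher-density chart would, is what increases the rendering work.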
