VMware Horizon Community
jay88
Contributor

Migration 5.1.1 High CPU

Since migrating to a newly built cluster running ESXi 5.1 build 838463 and View 5.1.1 build-799444, we have been experiencing high CPU utilization on the ESXi side. It appears to be related to PCoIP: if I expand the CPU process in esxtop, the vmx-svga process ID sits at a constant 20% CPU. Nothing is going on inside the VM at that time; we were able to reproduce it just by logging into a VM and letting the session sit there. Once the session is disconnected and logged back into, the CPU for that process seems to consistently return to normal. The SVGA driver in use is the version installed with the agent, not the one from VMware Tools. We have removed and reinstalled the tools and agent per the KB articles. The problem is very random, as not all VMs experience it at the same time.

Capture.GIF

67 Replies
gmtx
Hot Shot

Does anyone know if this has been solved or is still an open issue, even with View 5.2?

Thanks,


Geoff

C3LLC
Contributor

Most DEFINITELY still an open issue... even with View 5.2, and even though more than 2 months have passed since the issue was first reported.

Grrrrr

TSher
Enthusiast

Hi Geoff,

I haven't seen any fix. I spoke to a VMware manager last week who basically said engineering were struggling to fix it. I asked him whether View v5.2 would either fix the issue or at least help with the CPU load. He didn't know and said he would get back to me; I'm still waiting... I'm starting to lose confidence in VMware and dare not install View v5.2 unless I know it will offer a fix.

For me it is still a big problem. I look after multiple sites up and down the UK, and I need to decide whether to move away from VMware, as too many customers are now affected by this. Other customers have to stay on v5, although they need to upgrade to v5.1 to benefit from some of the newer features. VMware seem to be doing nothing about this; I first found the issue in November and reported it in December.

I don't know whether there is any way to mitigate this issue other than the resolution workaround, which is only temporary. I have yet to come across any other fix.

Thanks.

gmtx
Hot Shot

Have you been able to come up with any workarounds or do you just live with the performance hit? Are your users impacted?


Sorry for so many questions, but we were hoping to upgrade to 5.1 this weekend and now I'm not so sure that's a good idea. If we go ahead with the upgrade - which we really would like to do for storage and vSwitch enhancements only available in 5.1 - and we hit this problem, what's Monday morning going to look like for our users?

Thanks,

Geoff

C3LLC
Contributor

No current workaround for our environment. For us, however, it should be noted that it appears to affect only XP virtual machines. The impact is huge, though. Whereas we would usually plan for 4-6 VMs per core, we are now seeing each VM consume between 50 and 150% of a core. This has destroyed the consolidation ratios, but fortunately we have ample processing power in reserve. Another undesired side effect is that this drives DRS nuts, and it is constantly trying to rebalance the cluster, which generates way above average network traffic. Thankfully, the end users notice no difference. The impact is felt purely on the infrastructure side.

Without significant CPU reserves, however, this could be an extraordinarily painful exercise. If you have XP VMs, I would hold off on the 5.1 upgrade. You can use the spare time to read up on the new Single Sign On piece... if you get that piece wrong, your 5.1 upgrade experience will not be an easy one. :)
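To put those consolidation figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (4-6 VMs per core planned, each affected VM consuming 50-150% of a core); the function name and the 16-core host are made up purely for illustration.

```python
def vms_per_core(core_fraction_per_vm: float) -> float:
    """Number of VMs one core can carry if each VM uses this fraction of a core."""
    if core_fraction_per_vm <= 0:
        raise ValueError("core_fraction_per_vm must be positive")
    return 1.0 / core_fraction_per_vm


# Figures quoted in the post: planned density and observed per-VM load.
PLANNED_VMS_PER_CORE = (4, 6)
OBSERVED_CORE_FRACTION = (0.5, 1.5)  # 50% to 150% of a core per VM

# Hypothetical host size, not taken from the thread.
HOST_CORES = 16

best_case = vms_per_core(OBSERVED_CORE_FRACTION[0])   # VMs per core at 50% load
worst_case = vms_per_core(OBSERVED_CORE_FRACTION[1])  # VMs per core at 150% load

print(f"Planned VMs on host:  {PLANNED_VMS_PER_CORE[0] * HOST_CORES}"
      f" to {PLANNED_VMS_PER_CORE[1] * HOST_CORES}")
print(f"Affected VMs on host: {worst_case * HOST_CORES:.0f}"
      f" to {best_case * HOST_CORES:.0f}")
```

In other words, at 50% of a core per VM you are down to about 2 VMs per core, and at 150% you get less than one VM per core, which is why a cluster without spare CPU would struggle.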

TSher
Enthusiast

Hi Geoff,

The only workaround was posted on this thread, regarding the resolution setting of the VM. I have done countless hours of research and not come up with any other solution. It affects the SVGA worlds of the ESXi hosts. Apparently the engineers put a lot of work into the 3D functionality and they think it has something to do with this. That comes from talking to various people at VMware, so I am not sure how accurate the information is, although it does seem to tie in with the issue.

All I can say is that if you are affected by the upgrade, you will wish you hadn't done it. I say 'if' because I have no idea whether everyone running a View environment is affected. I know the 4 customers I upgraded are all affected. It is manageable if you are running Windows 7 and have spare resources on the host. If you are running Windows XP, like I was, the hosts just go crazy. I had to shut down 50% of my VMs, and the ones left were very slow. The problem is still there with Windows 7 but, as I said, it is more manageable. My hosts sometimes operate at around 60-70%, but the VMs can still be sluggish. I wish I hadn't upgraded, but luckily I didn't upgrade a customer who has over 50 hosts running over 2000 View sessions. This was a smaller customer but still badly affected.

My advice is don't upgrade yet unless you can upgrade a test environment first. But maybe you are lucky and not affected. That again is the problem: limited information from VMware. I still don't understand why they haven't got a fix out; maybe the View customers are not important enough...

Thanks, and if you do upgrade, best of luck.

gmtx
Hot Shot

Thanks again everyone for the info. Troubling that such a serious issue is still lingering months later.

What to do, what to do?? Not sure I need the 5.1 features badly enough to risk a stable environment.

Linjo, anything you can add from the VMware perspective? Are you recommending that customers wait to upgrade?

Geoff

keviom
Enthusiast

Hi,

I just migrated to Horizon 5.2 / vSphere 5.1 from 5.1 and 5.0 and started getting the random 100% CPU usage on the XP VMs. As these are dedicated desktops, I didn't want to have to alter the master image and then recompile and reallocate VMs to the users, so I thought I'd try a different approach as a quick and dirty fix. I went to a problem desktop, changed the resolution of screen 1 (all our users have 2 screens) from 1920x1080 to a lower resolution, applied it, answered yes to the "do you want to keep it" prompt, and then straight away changed it back up to 1920x1080, all whilst monitoring the performance graph in vCenter. As soon as the resolution is lowered, the percentage drops right down, and so far it hasn't gone back up. I have only just done it, though, so let's see what a logoff / reboot might do.

Interestingly, my systems are all EVGA zero clients, and some of these have the resolution forced to a lower resolution than the VM setting (1680x1050), which forces the VM screen to that size. In my mind that's sort of the same as setting the resolution on the master higher, and I still get the problem. When you try to change to a lower resolution on these VMs, it goes through the motions but puts it straight back to 1680x1050, so it doesn't actually change anything. It does, however, reduce the CPU usage to virtually nothing.

Hope that helps anyone who has this issue, and doesn't have time / access to change the master. Just get the user to lower the resolution, then once applied, put it straight back.

Kevin

gmtx
Hot Shot

Has anyone tried ESXi 5.1 Update 1 yet to see if it fixes the issue? Didn't see anything about it in the release notes. :(

Thanks,

Geoff

keviom
Enthusiast

Hi Geoff,

As per my post above, I am running vSphere 5.1, having just upgraded from View 5.1 to Horizon 5.2 and from vSphere 5.0 U1 to 5.1. I have only just started getting the problem, so it most definitely is not fixed in 5.1.

As I mentioned, changing the resolution on any screen to a lower setting and then setting it back does fix it temporarily, but at the next logout / reboot it comes back again, or sometimes doesn't. It's that random...

Regards

Kevin

gmtx
Hot Shot

Right - but have you tried ESXi 5.1 U1? It was just released a couple of days ago.

keviom
Enthusiast

Wow, that wasn't there when I checked on Thursday or Friday last week. Update Manager now has 15 new updates, including U1. I will be looking at all the patches, and I am vMotioning machines around now so I can install it and test it. I'll post a response as soon as I can.

Thanks for the heads up

C3LLC
Contributor

I can sadly report that 5.1 U1 does not resolve or have any impact on this issue.  Further, I've been told that VMware cannot even reproduce the issue in their lab environments!  Grrrrr

Rick

gmtx
Hot Shot

That's too bad. I understand not being able to repro the problem in the lab, but from the posts on this thread there are multiple opportunities to look at the problem in a real environment, and at least offer some guidance as to which environments may be at risk during an upgrade.

At some point soon I need to upgrade to ESXi 5.1. I shouldn't have to worry that my View environment may implode as a result, and right now it appears the only way to find out is to upgrade and hope - not exactly a professional approach to critical infrastructure management.

Geoff

keviom
Enthusiast

Hi,

I can also confirm that U1 doesn't fix the issue, even on machines running older versions of VMware Tools and View Agent.

I have tried logging in using RDP as the display protocol and have no high CPU issues, but change the same machine back to PCoIP and back it comes.

I raised a case with Dell (our supplier sourced View through Dell, so we have to go through them to raise a case with VMware). Whilst their tech support guy was gathering all the info to take to VMware, he found using esxtop that, on the VMs exhibiting the problem, the SVGA driver was reporting %used at around 90% or more, a similar level for %run, and very little %wait time (around 5%). As soon as I changed the resolution down and back up on the VM, dropping the CPU usage to normal levels, the SVGA usage in esxtop changed as well.
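For anyone who wants to check a batch of desktops for the same pattern, here is a minimal Python sketch of the filtering logic only. It assumes you have already collected per-world %used, %run and %wait figures by some means; it does not parse real esxtop output, and the world names, sample values and 50% threshold are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorldSample:
    """One per-world CPU reading (field names chosen for this sketch)."""
    name: str
    pct_used: float
    pct_run: float
    pct_wait: float


def busy_svga_worlds(samples, used_threshold=50.0):
    """Return SVGA worlds whose %used is at or above the threshold."""
    return [
        s for s in samples
        if "svga" in s.name.lower() and s.pct_used >= used_threshold
    ]


# Made-up readings: one affected desktop, one healthy desktop.
samples = [
    WorldSample("vmx-svga:xp-desktop-01", 91.0, 90.0, 5.0),
    WorldSample("vmx-vcpu-0:xp-desktop-01", 4.0, 4.0, 95.0),
    WorldSample("vmx-svga:xp-desktop-02", 1.0, 1.0, 98.0),
]

for world in busy_svga_worlds(samples):
    print(f"{world.name}: used={world.pct_used}% "
          f"run={world.pct_run}% wait={world.pct_wait}%")
```

The threshold is a judgment call. The figures reported in this thread range from a constant 20% to 90% or more for the SVGA process, so you may want to set it lower in your own environment.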

Just one thought: if VMware can't reproduce it, has everyone experiencing this issue upgraded from previous versions rather than doing new installs? Or is there anyone who installed fresh, straight to version 5.1, and is also seeing high CPU usage on XP VMs?

Hopefully the more people who raise it with VMware, the quicker they will find a fix.

Regards

Kevin

admin
Immortal

Kevin -

Do you see this issue with 2D-based desktops? Are your desktops 3D-based desktops? If 3D, are they non-hardware accelerated or vSGA (GPU) accelerated?

I also noted in one of the case notes I looked at (maybe yours) that there was an APEX card being used. If you disable or remove the APEX card, do you see the same behavior?

WP

keviom
Enthusiast

They are standard 2D Windows XP Pro SP3 desktops. No APEX card.

C3LLC
Contributor

Same here.

Rick

admin
Immortal

Do you only see it with XP, or does it also happen with Win7? Sorry to ask all these questions. I am trying to repro it and only have Win7.

WP

C3LLC
Contributor

We are not seeing it with Win7 - only XP. We see the exact same situation with SVGA consuming enormous amounts of CPU.
