VMware Horizon Community
jay88
Contributor

Migration 5.1.1 High CPU

Since migrating to a newly built cluster running ESXi 5.1 build 838463 and View 5.1.1 build-799444, we have been experiencing high CPU utilization on the ESXi side. It appears to be related to PCoIP: if I expand the CPU process in esxtop, the vmx-svga process ID sits at a constant 20% CPU. Nothing is going on inside the VM at that time; we were able to duplicate it by just logging into a VM and letting the session sit there. Once the session is disconnected and logged back into, the CPU for that process seems to consistently return to normal. The SVGA driver in use is the version installed with the agent, not the one from VMware Tools. We have removed and reinstalled the tools and agent per the KB articles. The problem is very random, as not all VMs experience it at the same time.

Capture.GIF

67 Replies
admin
Immortal

I looked at the bug ID and it was escalated, but one of the customers has not been able to respond with the data / info our engineering team needs to try to root cause it. The steps are simple if you have an easy repro case. If you have an SR, or are willing to shoot me your contact info via PM, I can add it to the bug and let everyone know that you are willing to provide data from a recapture of this happening.

WP

C3LLC
Contributor

This is an example of a VCops alert that shows the bizarre nature of the problem. How can a single-vCPU VM use more than 100% of its resources? The answer is that the "world" of this VM is using nearly 1.5 CPU cores at this moment, the bulk of which, according to esxtop, is the SVGA wedge.
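The arithmetic behind an alert like that can be sketched roughly as follows. The per-world figures and the exact world names are illustrative assumptions, not measurements from this thread; the only point is that the host charges the VM for its helper worlds (such as vmx-svga) on top of the vCPU world, so the total can exceed one core.

```python
# Illustrative per-world CPU usage for one single-vCPU VM, in percent of
# one physical core. These numbers are assumptions chosen to total 150%.
worlds = {
    "vmx-vcpu-0": 45.0,  # the guest's only vCPU
    "vmx-svga": 95.0,    # display helper world (the one misbehaving)
    "vmx-mks": 6.0,      # another helper world
    "vmx": 4.0,          # the main VMX world
}


def vm_total_pct(world_usage):
    """Sum usage across every world that belongs to the VM."""
    return sum(world_usage.values())


def pct_of_configured_vcpus(world_usage, num_vcpus):
    """Express the VM's total usage relative to its configured vCPUs."""
    return vm_total_pct(world_usage) / num_vcpus


total = vm_total_pct(worlds)
print(f"total: {total:.0f}% of one core")
print(f"relative to 1 vCPU: {pct_of_configured_vcpus(worlds, 1):.0f}%")
```

With these made-up figures the guest itself is under half a core, yet the VM as a whole is billed for 1.5 cores.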

C3LLC
Contributor

Warren-

The tech assigned to my case asked for a bunch of logs and things, but I've already provided tons of information. I asked if I could speak directly with an engineer, perhaps give them access directly to my platform to look. I was told "this is not possible." I honestly don't have a lot of free time to be duplicating efforts in generating logs, screenshots, etc. I'm happy to do a WebEx with you (or anyone you think can help!) to show the problem in action. It happens throughout the business day on the bulk of the XP VMs that we have.

We had to purchase another dual 8 core host to help "mask" the problem until we can get everyone onto Windows 7. This bug has been live since DECEMBER, so we lost faith in VMware and fixed it the only way we knew how: more hardware. Not happy about this, but at least our customers are happy.

Ric

TSher
Enthusiast

Hi All,

I have updated to U1 and still have it at my site and at 2 other companies I am contracted to manage. The other sites I dare not upgrade yet, as they are still on XP. I have provided every log VMware has requested; I asked to speak to an engineer and was told they are too busy. I have lost track of the number of webinars that have taken place. I originally thought Windows 7 was affected, but I think this was because I had so many XP VMs affected by the bug. I have now pretty much eliminated XP from one site, and Windows 7 seems to be behaving very well, even though it gave the impression that it wasn't when I had a few hundred XP VMs.

I do not understand how they cannot replicate the problem. I have VMware Workstation with vSphere and View installed under an eval license and I can replicate it; it's not exactly difficult. Although there seems to be a way forward by upgrading to Windows 7, I feel let down by VMware, who have done nothing to help resolve this problem. When I log in to 'My VMware', my open ticket hasn't been responded to with anything remotely useful since around January. I fail to see how such a big problem was unknown to senior engineers. Had I been told from the beginning that a fix was proving difficult and that the advice was to upgrade to Windows 7, I would have had a lot less grief, instead of waiting for a fix and trying to firefight. As mentioned before, when I started deploying Windows 7 VMs they showed similar signs in esxtop to the XP ones, so it really was a battle with no support from VMware.

Thanks.

TSher
Enthusiast

Hi,

I should probably mention I am using APEX cards and it made no difference whether they were installed or not.  I have also taken hosts out of the cluster and reinstalled from the latest version out as opposed to an update and it made no difference.  The only way of solving it was the resolution fix.

Thanks.

gmtx
Hot Shot

So are you saying it's only an XP issue from what you've seen so far? I'm fully Windows 7 in my environment.

Thanks,

Geoff

TSher
Enthusiast

I am confident it is only a Windows XP issue. I have had this problem since November and have tried many different things to resolve it. I would say Windows 7 is not affected, and it is most likely something to do with the 3D work that was put into the ESXi hosts, which XP does not support. With the same environment and the same number of VMs, my hosts are not going above 55%. With XP, prior to the update that caused the issue, my hosts were around 85%, so this is a major improvement over XP. At this site we use Windows 7 both with 3D and without 3D, and each behaves as I would expect. So on a positive note, Windows 7 does not appear to be affected, and the resource handling is far superior to that of a v5 host running XP. The APEX cards also seem to work much better on Windows 7 with v5.1 than they did with XP on a v5 host.

Thanks.

admin
Immortal

All - sorry for the frustration you're feeling. I can promise people are looking at this and it's not going unnoticed. I have been looking at this thread, some of the SRs you have reported, and the bug tracking this issue. I am looking to see if there is anything I can do to assist. There is a lot of data, and I am trying to get some of the basic facts straight.

What I have so far:

- The basic issue is that, when connected to an XP system using PCoIP, the vmx-svga process for each VM consumes a high amount of CPU

- It started to show up as people upgraded to vSphere 5.1 (patches / updates since then do not change it)

- Does not affect Win7, only XP

- It makes no difference whether it's a 3D or 2D desktop, since it only affects XP

- Symptom only occurs when connecting with PCoIP, not RDP

- A workaround is to set the resolution higher than what you actually connect with. When the configured resolution matches the one you connect with, the high CPU occurs
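For anyone triaging a large pool, the conditions in the list above can be written down as a simple check. This is only a sketch of the pattern as reported in this thread, not a diagnosis from engineering; the `Session` type, its field names, and the string values are hypothetical and would need to be mapped onto whatever inventory data you actually have.

```python
from dataclasses import dataclass


@dataclass
class Session:
    guest_os: str             # hypothetical labels, e.g. "xp" or "win7"
    protocol: str             # hypothetical labels, e.g. "pcoip" or "rdp"
    guest_resolution: tuple   # (width, height) configured in the guest
    client_resolution: tuple  # (width, height) the client connects with


def matches_reported_pattern(s: Session) -> bool:
    """True when a session fits every condition reported in this thread."""
    return (
        s.guest_os.lower() == "xp"
        and s.protocol.lower() == "pcoip"
        and s.guest_resolution == s.client_resolution
    )


# An XP desktop over PCoIP at a matching resolution fits the pattern.
at_risk = Session("xp", "pcoip", (1920, 1080), (1920, 1080))
# The workaround: guest resolution set higher than the client's.
worked_around = Session("xp", "pcoip", (2560, 1600), (1920, 1080))

print(matches_reported_pattern(at_risk))
print(matches_reported_pattern(worked_around))
```

The check deliberately ignores 2D vs. 3D, since the list above says that setting makes no difference.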

One thing that would be really helpful is if someone could check the following. If you connect to a VM and log into the desktop using the vSphere Console do you see the high CPU? The reason this is important is it will give us some indication of where to look more closely since this is not seen with RDP.

WP

keviom
Enthusiast

Hi Warren,

I have tried opening the VM in the console and it didn't seem to use high CPU, but I am finding it random in terms of when I see the high CPU. For instance, my own VM has been showing 100% most times I reboot it. Today I rebooted it several times and CPU was normal, which was a pain as I was trying to get some agent logs for VMware. Then, when I rebooted it at lunchtime, it went straight to 100% CPU as soon as I logged in and stayed there until around two and a half hours later, when, without me doing anything unusual, it suddenly dropped back down to normal usage. I didn't even have to do my usual workaround of setting my Windows resolution lower and changing it straight back up again.

Hope that helps some

Kevin

C3LLC
Contributor

Kev-

Open an XP console window from vSphere Client and, from there, open YouTube and watch a video in full screen mode. I can guarantee you will see it. :)

Rick

admin
Immortal

Rick - Are you saying you do see this symptom just using the console and not a remote PCoIP connection? If yes, this is an important detail. Video is heavy. Are you saying the vmx-svga process is higher when playing video vs. older ESX versions? Is the vmx-svga process high CPU when the system is idle like it has been previously reported?


WP

TSher
Enthusiast

Hi,

From my own experience it affected me as follows:

Zero Client Use

When a Windows XP VM was connected using a Zero Client or View Client the CPU within Windows XP was normal BUT on the host it was showing as 100%.  This was mainly with the SVGA worlds.

vSphere Client

The CPU within Windows XP was normal. If I watched a YouTube video, the CPU within Windows XP would spike, but I class this as normal behaviour when watching a video. Host side it behaved as I would expect. Never once did the CPU spike on the host in this scenario, unless it was because I was running full screen video etc.

My conclusion is that the SVGA worlds were only affected when connected through PCoIP using either a Thin or Zero Client. When I used RDP I never once experienced this issue.

Thanks.

admin
Immortal

Perfect!! This is what I was  looking for.

WP

keviom
Enthusiast

Hi,

I have also carried out some tests this morning using one of my troublesome VMs and here are my findings:-

1.     Logged in to VM on vSphere Client console (XP VM with screen res set to 1920 x 1080) - CPU level normal

2.     Opened YouTube and watched a couple of videos in full screen, the last one Full HD 1080p - VM's Task Manager shows CPU usage at 100% and the Client Console performance monitor shows Virtual CPU Usage at 100% - esxtop shows vmx-svga at 7%

Info from inside VM running on Console.jpg

3.     Logged out of VM on Console - Virtual CPU Usage goes straight back to low level

4.     Logged in to same VM on EVGA Zero Client - VM's Task Manager shows CPU around 4%, but Virtual CPU Usage shoots up to 100% and stays there - esxtop shows vmx-svga at 90% plus

ESXTOP logged in to zero client.jpg

5.     Finally here is a screen capture of the performance monitor of this machine during this time with the relevant points marked

Performance Montor of VM.jpg
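The contrast between steps 2 and 4 can be turned into a rough check: flag any sample where the guest is nearly idle while the vmx-svga world on the host is busy. This is a minimal sketch; the function name and both thresholds are my own assumptions and would need tuning (the original poster saw the world sit at only around 20%). The two samples are the figures from the steps above.

```python
def looks_like_svga_bug(guest_cpu_pct, svga_world_pct,
                        guest_idle_below=20.0, svga_high_above=50.0):
    """Flag a sample where the guest is near idle but vmx-svga is busy.

    The default thresholds are arbitrary assumptions, not vendor guidance.
    """
    return guest_cpu_pct < guest_idle_below and svga_world_pct > svga_high_above


# (label, CPU % inside the guest, vmx-svga % in esxtop)
samples = [
    ("console, 1080p video", 100.0, 7.0),  # step 2: busy guest, quiet svga
    ("zero client, idle", 4.0, 90.0),      # step 4: idle guest, busy svga
]

for label, guest, svga in samples:
    print(label, looks_like_svga_bug(guest, svga))
```

Only the zero client sample is flagged: the console video case is heavy inside the guest but cheap for vmx-svga, which is what you would expect from normal video playback.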

Hope that helps. If you want any more info, or to discuss, please PM me or post here (remember I am in BST time zone)

BTW the VMware support call raised for us by Dell is VMware SR number 13317064304

Kevin

shaofis
Contributor

We completed our View 5.2 upgrade/new install (From 4.6) and are seeing behavior that looks almost identical.

ESXi 5.1 and View 5.2. All VMware Tools and View Agents are updated.

vCenter shows CPU usage on some XP guests running up to 100%, but guest usage is nowhere near that. So far, performing a vMotion seems to offer some temporary reprieve for that system, but it comes back to 100% pretty quickly.

On multi-processor/core systems we see an interesting thing in the performance charts. The total will show as very high, but the sum of the procs is nowhere near the total.

snip.PNG

admin
Immortal

I wanted to update everyone here: this has had the attention of some of our best resources and engineers. With the help of a customer (likely one of you here), the team has made great progress.

WP

thomsit
Enthusiast

Hello everyone.

Has this problem been solved in the meantime? We have the same problem, SR 13326701105 and 13327839705.

Joeytjuh
Contributor

Any updates on this case?

I have the same situation...

5x HP DL380G6 with the latest firmware P62

VMware ESXi 5.1.0 1117900
VMware Virtual Center 5.1.0 1123961

VMware View Horizon 5.2

200 XP virtual desktops

Teradici Zero Client Samsung NC240 firmware 4.1.0 Tera1

CPU ready times have gone sky-high... since the upgrade (fresh install) from View 4.6.

CPU on 200 powered on desktops HW vmx-09


Just noticed... that a dual CPU had been used and it was turned back to a single CPU.

In Device Manager in Windows there is still a Multiprocessor configured.

I'm now trying to configure it back to a Uniprocessor.

I'll keep you posted.

I'm also going to create an SR with VMware.

keviom
Enthusiast

I was asked by VMware to try a possible temporary fix/workaround for this, which was a registry change / entry. This came with a warning that it could introduce a security hole, as it stopped the VM console from being blacked out in vSphere Client while the user was using the VDI. Unfortunately the setting didn't work in my case. VMware tech support checked it out via WebEx and have come back to say it worked for them, but as it didn't work for me I will have to wait for the next patch release, which will include the fix for this issue and will hopefully be available in August (the back end of August, that is). They can't / won't release the patch for just this problem before then.

Joeytjuh
Contributor

I'm almost considering a rollback to ESXi 5.0U2... but with VMware Virtual Center 5.1u1a

I think this is an issue with Virtual Machine HW layer 9 on ESXi 5.1U1, according to the different stories and my own experiences.

What a pity....
