PeterBlatherwic
Enthusiast

vpxd consuming all memory -- vCenter slooooooow and cannot connect

As reported in http://communities.vmware.com/message/2020222 we are still experiencing the issue with vpxd consuming huge amounts of RAM and causing severe vCenter slowness.  This has gotten much worse recently.  We are hoping to renew the conversation on this, as it is becoming intolerable, and our efforts to resolve it have so far failed.

Configuration: VMware vCenter Server 5.0 Update 1, version 5.0.0 build 623373.  Running on Windows Server 2008 R2 Standard SP1, in a virtual machine with 4 vCPUs and 8 GB of memory; the SQL database is on the same machine.

When the issue occurs, the main vCenter executable, vpxd.exe, uses up all available memory, causing severe vCenter slowness: vSphere Client cannot connect, tasks fail due to timeouts, and remote desktop sessions to the machine are extremely slow and almost completely unusable.  After stopping/starting the vpxd.exe service, or restarting the Windows VM, the issue initially clears but then returns later. We are now seeing this more than once per day, sometimes 3-4 times in a single day.

In more detail:

- Windows Task Manager reports 99% of memory consumed.  Typically, this is around 6.5 GB of memory consumed by vpxd.exe (a rough script for logging this growth over time is below, after this list).

- Before the issue occurs, vpxd typically consumes around 330 MB of memory (reasonable)

- Other large consumers are sqlservr.exe around 1.0 GB, java.exe around 820 MB + 380 MB (there are two processes), tomcat6.exe 690 MB.  We think these are probably normal, and they do not grow out of control like vpxd does.

- During the issue, CPU also becomes very high, near 100%, as seen through the vSphere Client performance charts

- The VM is running current VMware Tools (8.6.5, build 652272)
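
To quantify the growth between restarts, something like the following could log per-process memory to a CSV. This is a rough sketch only: it assumes Python and the psutil package are installed on the vCenter Windows VM, and the watched process names and 60-second interval are just what we happen to care about here.

# log_vcenter_memory.py -- rough sketch; assumes Python + psutil on the vCenter VM.
import csv
import time
from datetime import datetime

import psutil

WATCHED = {"vpxd.exe", "sqlservr.exe", "tomcat6.exe", "java.exe"}

with open("vcenter_memory_log.csv", "a", newline="") as f:
    writer = csv.writer(f)
    while True:
        stamp = datetime.now().isoformat()
        for proc in psutil.process_iter(["name", "memory_info"]):
            name = proc.info["name"]
            if name and name.lower() in WATCHED:
                # resident set size in MB for each watched process
                rss_mb = proc.info["memory_info"].rss / (1024 * 1024)
                writer.writerow([stamp, proc.pid, name, round(rss_mb, 1)])
        f.flush()
        time.sleep(60)  # sample once a minute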

Things we have tried:

- Upgraded to vCenter Server 5.0u1

- Rebalanced vCPUs (as suggested in the other thread).  Initially 4 vCPUs on 1 socket, then changed to 2 vCPUs on 1 socket, now 4 vCPUs split across 2 sockets.

- Moved to VM version 8

- Set the vCenter service to automatic (delayed) start (command sketch below, after this list)

- General cleanup -- removed a bunch of services we are not using (Orchestrator, Update Manager...)
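
For the delayed-start item above, the same change can be made from the command line with "sc config <service> start= delayed-auto"; a small sketch that applies it to the usual vCenter services is below. The service names ("vpxd", "vctomcat", "vimQueryService") are my guess for 5.0 -- confirm yours in services.msc or with "sc query state= all" first.

# set_delayed_start.py -- sketch only; service names are assumptions for vCenter 5.0.
import subprocess

SERVICES = ["vpxd", "vctomcat", "vimQueryService"]

for svc in SERVICES:
    # sc.exe requires the space after "start=", so pass it as two arguments
    subprocess.check_call(["sc", "config", svc, "start=", "delayed-auto"])
    print(svc, "set to Automatic (Delayed Start)")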

Any further suggestions, known issues, FIXES, whatever would be helpful! 

-- PeterB

54 Replies
Britz
Contributor

same issue!!!

jizaymes
Contributor

Also having the same issue. In our deployment, the vCenter server is on a standalone physical box, so the VMware Tools suggestions don't apply. It's using 6 GB of RAM where it shouldn't really be.

Britz
Contributor

We fixed it by installing Java from java.com

We installed it to ship logs to VMware through their website, and while doing so it fixed it. Let me know if this was a fluke or an actual fix.

Britz
Contributor

Never mind; rebooted after a few weeks and the problem is back!!!

Britz
Contributor

By chance, does anyone have the vSphere Web Client installed???

Disabled the service and it seemed to stabilize.

illvilja
Hot Shot

Hi!

vCenter 5.0 Update 1a was released today, which resolves an issue with tomcat6.exe allocating a lot of memory.

vCenter Server Web services might consume all the memory assigned to it
vCenter Server Webservices (tomcat6.exe) might consume all the memory assigned it. Increasing the Tomcat maximum memory pool does not resolve the issue.
This affects any vCenter web services related functionality.
This issue is resolved in this release.

http://www.vmware.com/support/vsphere5/doc/vsp_vc50_u1a_rel_notes.html

I'm not sure if this has anything to do with vpxd.exe, but it's worth a shot.

--- Martin

PeterBlatherwic
Enthusiast

Based on the responses, it appears multiple others are also having the same issue, or at least the same symptoms. 

We worked through our issues with VMware support, and they also recommended applying 5.0u1a when available.  We will apply it and see.  Read on.

Things we did:

- Increased memory reservation to 16 GB / Unlimited  (YES 16 GB .. hideously large !!)

- Set CPU reservation to 8 GHz / Unlimited (high, but not as insane as the memory one)

- Disabled web services (Tomcat).  This was re-enabled later in the process, after the memory / CPU reservations were changed.

- Increased the Tomcat memory pool (had no effect)

- Rebalanced vCPUs, as above (no effect on the runaway memory issue)

- Cleaned up the DB -- the SQL indexes were *very* fragmented (marginal effect on performance, no effect on runaway memory consumption; a rough reindexing sketch is below, after this list)

- Cleaned up some services we are not actually using daily (Orchestrator, Update Manager)
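
For reference, one way to find and rebuild badly fragmented indexes in the vCenter database is sketched below. It assumes the default VIM_VCDB database name, a local SQL Server instance with Windows authentication, and the pyodbc package, and it should only be run in a maintenance window with the vCenter (vpxd) service stopped.

# rebuild_vcdb_indexes.py -- sketch only; database/connection details are assumptions.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=localhost;DATABASE=VIM_VCDB;Trusted_Connection=yes",
    autocommit=True,
)
cur = conn.cursor()

# Find indexes that are more than 30% fragmented.
cur.execute("""
    SELECT OBJECT_NAME(ips.object_id), i.name, ips.avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
    JOIN sys.indexes i
      ON i.object_id = ips.object_id AND i.index_id = ips.index_id
    WHERE ips.avg_fragmentation_in_percent > 30 AND i.name IS NOT NULL
""")
for table_name, index_name, frag in cur.fetchall():
    print("%s.%s is %.1f%% fragmented, rebuilding" % (table_name, index_name, frag))
    cur.execute("ALTER INDEX [%s] ON [%s] REBUILD" % (index_name, table_name))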

Looking at the fix in the 5.0u1a release notes, I am personally not very convinced this is really the same thing.  For us, it was vpxd that was consuming all memory, as seen through Task Manager.  We still see memory spikes occasionally, up to about 70+% of total RAM with the new reservation.  That would have exceeded our previous reservation -- it seems likely (to me) that our vCenter would have locked up at that point. 

But we will apply the update and see.  Then, reduce our memory reservation again, and see. 

-- PeterB

FGShepherdP10
Enthusiast

We've been having a similar issue recently: painfully long (30+ minutes) startups/shutdowns of the vCenter server and high "cached" memory counts.  (In one case, vpxd.exe was using <1 GB of RAM, but the server was using 20 GB, and when we killed the vpxd process, the RAM usage dropped to almost 0.)  I was thinking that it was some sort of memory leak (and it may still be), since that had been a problem in the past (not this again, VMware).

In an attempt to troubleshoot the issue, I set all of the vCenter-related services to "Automatic - Delayed Start" and rebooted the server.  (In true break-fix style, I also doubled the RAM (16 --> 32 GB), added an 8 GB memory reservation, and reserved 6000 MHz of CPU, per the post above... all in one reboot... So, yes, this will make it difficult to determine which fix actually helped.  You may want to try each on its own to narrow down the best change.)

At any rate, the server is performing extremely well at the moment...better than it has in months.  We're using a LOT of RAM, but our environment is decently large, so that's not surprising.  If things change, I'll re-post.

FGShepherdP10
Enthusiast

Update: Nope.  That wasn't the fix.  Over the last 24 hours, at least twice, vpxd used up all available RAM (32 GB, see the post above), and then Windows paged all of the RAM to disk.  RAM usage slowly falls (vCenter is unavailable during this time, naturally), while the Cached Physical Memory count on Windows grows at the same rate.  When it's done, vCenter restarts automatically.

Thinking of bringing the RAM back down in the meantime, while troubleshooting, since it will at least shorten the time for this process (down from 30+ minutes to...?).

Side note: I've also disabled vCenter Operations, a small-footprint vKernel deployment, and EMC's ProSphere that were all connected to our environment (lots of old/idle sessions), but that apparently wasn't ultimately the root of the problem.

Britz
Contributor

We seemed to get it to stabilize. Reboot the OS with the vCenter service disabled. Wait about half an hour, then enable and start only that service and see if it stays stable. If it does, you can start the Inventory Service and the Web services service (a rough script form of this is at the end of this post).

Worth a try till they get Update 2 released. From looking at different forums, it looks to be an issue when you're hosting vCenter in a virtual machine and your disk latency spikes, which means VMware most likely won't come out and say exactly what the problem is, as it is against their business model. We have had a case open for over 2 months.
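
In rough script form, the staged start would look something like the sketch below. The half-hour/ten-minute waits and the service names ("vpxd", "vimQueryService", "vctomcat") are assumptions for 5.0 -- check services.msc for your install before relying on it.

# staged_vcenter_start.py -- sketch of the staged start described above.
# Assumes the vCenter services are still stopped/disabled after the reboot.
import subprocess
import time

def start_and_settle(service, minutes):
    # re-enable the service (delayed auto-start), start it, then let it settle
    subprocess.check_call(["sc", "config", service, "start=", "delayed-auto"])
    subprocess.check_call(["net", "start", service])
    print("started", service, "- waiting", minutes, "minutes to see if it stays stable")
    time.sleep(minutes * 60)

start_and_settle("vpxd", 30)             # vCenter Server itself first
start_and_settle("vimQueryService", 10)  # then the Inventory Service
start_and_settle("vctomcat", 10)         # then Web services (Tomcat)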

FGShepherdP10
Enthusiast

Update: Lowered RAM to 16 GB (to decrease flush time), lowered vCPUs to 4, and rebooted.  For about 18 hours, we've been hovering steadily around 11 GB of RAM.  Secondary services/appliances are still powered off.

PeterBlatherwic
Enthusiast

Hi FGShepherdP10,

Ouch, sounds pretty painful, as our issues have been also.  16 GB memory reservation .... Yeeesh!

Are you running on vCenter 5.0 update 1a, just plain update 1, or something else? 

As above, we intend to move to 5.0u1a, but have not yet been able to do so for operational reasons. We currently have a mostly stable vCenter, with only one serious memory event since the changes described above (... of course, just before a scheduled executive demo ... ;^)  I am personally a bit dubious that 5.0u1a will fix the issue, but would like to hear from anyone who has applied it, and can verify (or not) that it does address the issue. 

-- PeterB

PeterBlatherwic
Enthusiast

Update:

We have now applied vCenter 5.0 update 1a, the recommended "fix" from VMware.  We are still having issues.

Since applying the update last week, we have had multiple incidents of vCenter going so sloooooooow that we cannot connect through vSphere Client, operations failing on timeouts, etc. -- we had to restart the VM.  The specific symptoms seem to have changed a bit, though -- we are seeing high CPU % and relatively lower memory % (higher than it should be, but not a complete runaway).  Alas...

Has anyone else out there applied 5.0u1a?  Did it fix your issue, or not?  Any work-arounds?

Another thought we forgot to mention before: our vCenter VM was created by converting a formerly physical machine to a VM.  The physical machine did see this issue, but very rarely.  Now, on the VM, we see it a lot.  But the high incidence rate did not correspond with importing the physical machine to the VM -- it started occurring several months later.  Are others also running a vCenter that moved from physical to virtual, or having the same issue on a VM vCenter that was never physical?

We have re-opened the support ticket with VMware on this.

-- PeterB

SangNg
Contributor

I want to bring this topic back into discussion. I'm having the same issue here. The vpxd and Java processes consistently max out CPU and memory. I'm using vCenter 5.0, and it sounds like 5.0u1a will not fix the issue at all. Has anyone been able to determine what is really causing this? Many thanks

Britz
Contributor

Check to see if any of your VMs' CD/DVD-ROM drives are connected to a device that doesn't exist.  I have noticed that when live-migrating VMs from one host to another, if a VM was using the previous host's physical CD/DVD-ROM, it can cause a lot of problems.  I had several machines in this state, and since changing them back to Client Device, we have not had this issue again.

PeterBlatherwic
Enthusiast

Hi Britz,

> ... see if any of your VM's CD/DVD rom are connected to a device that doesn't exist ...

Interesting suggestion, and something to watch for.  However, checking by hand is very impractical: we have hundreds of VMs deployed, and we are not even that big a datacenter (we are an R&D team).  A script could do the walking, though; see the sketch below.
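
A rough pyVmomi sketch along these lines should list any VMs whose CD/DVD drive is backed by a host (physical) device and currently connected. The hostname and credentials are placeholders, and it assumes Python with the pyVmomi package on a machine that can reach vCenter.

# find_host_cdrom_vms.py -- sketch only; connection details are placeholders.
import atexit
import ssl

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

si = SmartConnect(host="vcenter.example.com",
                  user="administrator",
                  pwd="********",
                  sslContext=ssl._create_unverified_context())
atexit.register(Disconnect, si)

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder, [vim.VirtualMachine], True)

# CD/DVD backings that point at a host device rather than a client device or ISO
host_backings = (vim.vm.device.VirtualCdrom.AtapiBackingInfo,
                 vim.vm.device.VirtualCdrom.PassthroughBackingInfo)

for vm in view.view:
    if vm.config is None:  # skip inaccessible/orphaned VMs
        continue
    for dev in vm.config.hardware.device:
        if isinstance(dev, vim.vm.device.VirtualCdrom) and isinstance(dev.backing, host_backings):
            if dev.connectable and dev.connectable.connected:
                print(vm.name, dev.deviceInfo.label, "->", dev.backing.deviceName)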

Why would the CD/DVD on some VM(s) cause vCenter/vpxd to eat all memory and choke?? 

-- PeterB

FGShepherdP10
Enthusiast

Update 2: For over a week now, vCenter has been stable.  We're running 7-12 GB of RAM on the vpxd service itself, and roughly 4 GB of RAM each on Tomcat6 and Java.  The 16 GB allocation of RAM to the entire server is mostly used, but response times inside the client are good and the services have all stayed up with only intermittent, superficial errors in the Windows event logs.

At this point, I'm willing to point to the auxiliary services that were plugged into vCenter, though I can't say with certainty which one was the real culprit.  Here's what we've turned off:

  • VKernel
  • vCenter Operations Manager
  • vCenter Chargeback Manager (watch this one-- it will CRUSH a datastore)
  • EMC ProSphere

We are not planning on turning any of these services back on yet, but will post results when we do.

BTW, we're still on 5.0.0 (Build 455964), so I'm not looking at this from an upgrade standpoint.

Hopefully, this helps someone...Good luck to those of you still having this problem and thanks to those who have contributed to finding a fix.

SangNg
Contributor

I updated my vCenter today to 5.0 U1a. After the update, the vpxd process crept up and eventually consumed all memory (8 GB). I rebooted it, but that didn't help at all (it gradually consumed all memory until vCenter became really slow). I then stopped all VMware-related services (Update Manager, Orchestrator, etc.) and restarted the vpxd (vCenter) service manually. It came back up gradually consuming memory again (up to 3 GB), but suddenly vpxd memory dropped and stayed around 600 MB. Then I proceeded to start the other VMware services. I also configured Tomcat to start with 265 MB and max at 512 MB.

I know this doesn't really show the real culprit behind vpxd consuming all memory, but right now my vCenter is consuming only 6 GB (I gave it 8 GB, with the vpxd process taking up only 600 MB). I only have Update Manager and Orchestrator running beside vCenter. Hope this helps others who are having the same problem, because I scratched my head over this for the last 2 weeks. Good luck!!!!

jdoz_bw
Contributor

I have just installed vCenter 5.0 Update 1b (released last week).  I am seeing java.exe processes use a ton of memory on my VC server.  It's not pegging out, but it is certainly using a large amount of resources.  Is there a fix for this yet, or a root cause, or is it just by design?
