VMware Cloud Community
pfsit
Contributor

Terrible VM Performance / CPU Usage Mismatch / Server 2008+Exchange 2007

I'm running a Windows Server 2008 64-bit VM with the Exchange 2007 SP1 RU7 Mailbox role on ESX 3.5u4. It has 4 vCPUs and 8 GB RAM assigned. Performance is terrible.

CPU from the guest perspective is pegged at 100%, with processes such as taskmgr, services, store, system, svchost, msftefd, and spoolsv each taking a small 5%-15% chunk of CPU time. There are no specific offenders. From the ESX perspective, however, CPU never goes above 50%, even while the guest is pegged and becoming unresponsive, even to pings.

This is a well-established 4-server ESX farm with ~80 other VMs working just fine. None of the ESX servers is over 30% CPU or 70% memory capacity, and they have 8 physical cores.

Any ideas on why the guest CPU would be depleted before the ESX host would see it that way? Any suggestions?

Thanks...

I am seeing a good number of these in the system log on the offending 2008 server:

"The system time has changed to 5/28/2009 1:09:28PM from 5/28/2009 1:05:51PM"

Source: Kernel-General

EventID: 1
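
For reference, the size of the clock jump in that event can be worked out from the two timestamps. This is only a minimal sketch in Python: the function name and format constant are made up for illustration, and the format string assumes the event text looks exactly like the line quoted above.

```python
from datetime import datetime

# Assumed timestamp layout, taken from the Kernel-General event text quoted above.
EVENT_TIME_FORMAT = "%m/%d/%Y %I:%M:%S%p"


def clock_jump_seconds(new_time: str, old_time: str) -> float:
    """Return how far the guest clock moved, in seconds (positive = forward)."""
    new = datetime.strptime(new_time, EVENT_TIME_FORMAT)
    old = datetime.strptime(old_time, EVENT_TIME_FORMAT)
    return (new - old).total_seconds()


# Values copied from the event log entry in this post.
jump = clock_jump_seconds("5/28/2009 1:09:28PM", "5/28/2009 1:05:51PM")
print(f"Guest clock jumped forward by {jump:.0f} seconds")
```

A forward correction of several minutes would be consistent with the guest clock falling behind while the VM is not being scheduled, but that is an inference, not something the event itself states.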

14 Replies
MattG
Expert

With regard to the time change warnings, it sounds like you might have VMware Tools time sync (which syncs to the ESX host) and Windows NTP time sync turned on at the same time.

If so, you may want to turn off the VMware Tools time sync.

-MattG

If you find this information useful, please award points for "correct" or "helpful".

pfsit
Contributor

VMware Tools time sync is not turned on...

As part of the apparent performance problems, there are times when it becomes completely unresponsive for 60-120 seconds. It seems as though it is getting completely hung for periods of time.

MattG
Expert

Have you looked into your disk performance? I would review the underlying disk LUN (Local or SAN?) to make sure that you are not getting delays.

You may want to try esxtop to see how your physical disk subsystem is performing. Here is a good link on esxtop and disks:

http://communities.vmware.com/thread/203910

-MattG

If you find this information useful, please award points for "correct" or "helpful".

pfsit
Contributor

I'll look closer at my disks, but how would that translate into 100% CPU utilization at the guest? Also the things that are slow aren't disk bound - pinging for example (which varies from 10 to 1600 ms to unresponsive, instead of <1ms like all other VMs).

I'll monitor it, but I'm trying to understand how it could be manifested in these symptoms...

Thanks.

VMmatty
Virtuoso

How many physical CPUs/cores does the ESX host have? And how many other virtual machines are running on that same host? Exchange is a multithreaded application, so if you give it four vCPUs it will try to use all of them. If you don't have enough physical CPU cores in your ESX host, you could see performance problems.

Matt | http://www.thelowercasew.com | @mattliebowitz
pfsit
Contributor

8 physical cores. 4 vCPUs. No other VMs on host.

VMmatty
Virtuoso

Sorry, I missed in your first post where you said how many cores you had. That probably explains why you have 50% utilization on the ESX host but 100% in the guest - it is using 100% of 4 cores.
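
The arithmetic behind that guess can be sketched as follows. This is a rough illustration only: the function name is made up, and it assumes each busy vCPU occupies one physical core with no hypervisor overhead and no other VMs on the host.

```python
def host_cpu_percent(busy_vcpus: int, guest_cpu_percent: float, host_cores: int) -> float:
    """Approximate host-wide CPU % caused by one guest.

    Assumes one physical core per busy vCPU and ignores hypervisor overhead.
    """
    return busy_vcpus * guest_cpu_percent / host_cores


# 4 vCPUs pegged at 100% on an 8-core host, as described in this thread.
print(host_cpu_percent(busy_vcpus=4, guest_cpu_percent=100.0, host_cores=8))
```

Under those assumptions a fully busy 4-vCPU guest accounts for half of an 8-core host, which is where the 50% figure would come from.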

A few more questions.. Is there anything else on that server other than the Mailbox role of Exchange 2007? Any antivirus or anything other than Exchange? And are there actual users on that server yet? Finally, was this server a new build or was it a P2V conversion?

I've seen many environments running Exchange 2007 in a virtual machine without the kind of CPU utilization you are seeing. I take it you haven't configured any limits or reservations, and the VM isn't in a Resource Pool where these are set? That might explain some of the performance problems.

Matt | http://www.thelowercasew.com | @mattliebowitz
pfsit
Contributor

The "ESX perspective" I speak of is its view of the guest, not the overall server usage. The ESX server sees the guest as using <50% (usually ~35%) of its CPU even though the guest thinks it is using 100%. Perfmon will show CPU at 100% at the same time ESX shows the guest's CPU at 35%, for minutes at a time...

There are only ~30 mailboxes on it. It is mailbox only role. No AV. New server build, no P2V.

mtse2545
Contributor

I have had a similar experience. Windows startup took 15 minutes and CPU usage peaked at 100% on my dedicated Exchange 2007 Mailbox role server (running on a NetApp iSCSI LUN, with 16 GB RAM). The issue was "solved" by turning the vCPU count down from 4 to 2; startup then took 2 minutes and everything has worked fine since.

I'm not sure whether this applies to your situation, but there is no harm in trying it.

pfsit
Contributor

Sorry for the false alarm, this was self-inflicted. Leftover from a template, there was a 2GB maximum on the memory resource allocation. I overlooked this the first few glances... Very unexpected in how it manifested itself! I'd like to understand the connection, why CPU would max out so easily when it was over its memory maximum. I'm guessing that's how ESX makes the guest wait while it swaps behind the scenes...
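
For anyone auditing their own VMs for the same leftover setting, the check boils down to comparing each VM's memory limit with its configured memory. The sketch below works on hand-entered values; the class, function, and VM names are hypothetical and it does not call any VMware API. The 8192 MB and 2048 MB figures are the ones from this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VmMemorySettings:
    """Hypothetical record of one VM's memory settings, entered by hand."""
    name: str
    configured_mb: int
    limit_mb: Optional[int]  # None stands for "Unlimited"


def vms_with_restrictive_limit(vms: List[VmMemorySettings]) -> List[VmMemorySettings]:
    """Return VMs whose memory limit is lower than their configured memory."""
    return [
        vm for vm in vms
        if vm.limit_mb is not None and vm.limit_mb < vm.configured_mb
    ]


inventory = [
    # Settings described in this thread: 8 GB configured, 2 GB limit from the template.
    VmMemorySettings("exchange-mbx", configured_mb=8192, limit_mb=2048),
    # Made-up second VM with no limit, for contrast.
    VmMemorySettings("file-server", configured_mb=4096, limit_mb=None),
]

for vm in vms_with_restrictive_limit(inventory):
    shortfall = vm.configured_mb - vm.limit_mb
    print(f"{vm.name}: limit {vm.limit_mb} MB is {shortfall} MB below configured memory")
```

Any VM this flags has less physical memory available to it than the guest OS believes it has, so the difference has to be reclaimed by the host through ballooning or swapping.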

mvarre
Contributor

I wish I had seen this post sooner, as I had the same problem with a 2003 server setup. I used a "bad" template, and the CPUs of all subsequent guests were inexplicably pegged. I replaced the bad template with a freshly created one and everything worked great.

carlose71
Contributor

Hi, I have exactly the same problem. I don't understand how you solved it. What was wrong in the template?

Can you give more information about the solution?

thanks

Carlos

Martin_Forster
Contributor

Hello

I had the same issue.

The guest performance counters told me that the CPU was completely saturated.

The ESX host performance counters told me that the CPU was idle...

I then found out that the ESX host was swapping.

There is a best practice that you reserve at least 50% of the configured memory for a guest. Now I know why :)

Regards

Martin Forster

NTShad0w
Enthusiast

Martin, and everyone using VI3/vS4 :)

I know this is an old post, but I have known about this problem for about 2 years.

This kind of vCPU saturation often happens when resource pools are in use. A resource pool can remember a machine's RAM setting from its last start, so when we clone the machine (from a template or from another guest) and then raise its configured memory, the memory is silently capped at the previously used amount. The cap is applied only as a memory limit, which is not visible on the machine itself; you have to check the value explicitly, and nothing informs you about it :(( The capped memory drives the vCPUs to 100%, and the guest becomes unresponsive, sometimes even to pings, with VERY POOR performance.

The easiest solution is to change the memory limit (Limit - MB) for every guest to the maximum memory of the ESX server (20000 MB in my example). Once it is set to a static value, the resource pool never changes it again, and that solves the performance problems of the VM guests.

In my example you can see that some machines have a memory limit set (I NEVER SET IT MYSELF!), some have 20000 MB (a few of which I set a year ago), and some have Unlimited (the default value, although resource pools can change it automatically).

So my recommendation is to set the Limit - MB memory parameter to a large value, such as my 20000 MB, for all production VM guests.

regards

Dawid Fusek

IT Security Consultant &

Virtual Infrastructure Designer

COMP SA
