VMware

This Question is Not Answered

21 Replies. Last post: Nov 19, 2009 6:03 AM by Wadebum

2 vCPUs Hardware Interupts posted: Oct 15, 2009 2:25 PM

Wadebum (Novice, 17 posts since May 14, 2008)
We have decided to move our Oracle Forms servers to VMs. We built a Windows 2003 SP2 VM with 2 vCPUs and 4GB of RAM on a new ESX 3.5 host, using its own LUN on an EVA8100. Our DBAs started compiling forms on the VM and noticed that it took around 4 times longer than it does on physical servers. They wanted me to add more CPUs, so I added 2, bringing the server up to 4 total. I watched the performance as they reran the forms compile and noticed that the process doing the compiling was only using about 25% of the CPU (my thinking is that the process does not take advantage of multiple cores). I put the server back down to 2 vCPUs, and when the compile was rerun it used about 50% CPU (adding to my suspicion that it does not take advantage of multiple cores).

Since it was a new Oracle install, we thought there might be something different between the physical server and the VM, so I ran a test with a script that gathers all the email addresses in our Active Directory and writes them to a file. I added a function to the script that puts a time stamp at the top and bottom of the file so I could compare the physical and virtual servers. The physical server finished the script in around 6.5 min, but the virtual server took around 16-17 min.
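
For anyone who wants to repeat this kind of comparison, here is a minimal Python sketch of the time-stamping idea described above. It is not the script used in this thread: `run_timed` and `fake_directory_dump` are hypothetical names, and the directory query is replaced by a stand-in that writes made-up addresses.

```python
import os
import tempfile
import time
from datetime import datetime


def run_timed(workload, out_path):
    """Run workload(out_file), stamping the output file at top and bottom.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    with open(out_path, "w", encoding="utf-8") as out:
        out.write(f"START {datetime.now().isoformat()}\n")
        workload(out)
        out.write(f"END {datetime.now().isoformat()}\n")
    return time.perf_counter() - start


def fake_directory_dump(out, count=1000):
    """Hypothetical stand-in for the AD query: writes made-up addresses."""
    for i in range(count):
        out.write(f"user{i}@example.com\n")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "addresses.txt")
    elapsed = run_timed(fake_directory_dump, path)
    print(f"wrote {path} in {elapsed:.3f} s")
```

Running the same wrapper several times on each machine and comparing the averages gives a fairer picture than a single run.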


Since we were about to upgrade to ESX 4, we decided not to spend much more time figuring this out. Our upgrade to ESX 4 is now complete, and I have upgraded the VM's VMware Tools and virtual hardware to the latest versions. It still takes around 16-17 min to run the AD script.


I then built a brand new VM with 1 vCPU and 1GB of RAM on the same LUN as the original server, installed Windows 2003 SP2 on it, and installed the VMware Tools. After that I ran the AD script 5 times, and it finished in just under 5 minutes on average. I then installed all available Windows Updates and ran the AD script again with no significant increase in run time.


Next I shut down the VM and added a vCPU. After Windows found the new hardware, I ran the AD script 5 more times, with an average of around 17 min. I watched the server with Process Explorer, and Hardware Interrupts ran at around 20% the entire time the script was running. I used KernView to see what was going on in the Windows kernel, and the top 10 items were processor, ntkrnlpa, hal, tcpip, win32k, e1000325, ndis, afd, fltmgr, and ntfs.


Process Explorer on the physical server shows Hardware Interrupts at around 2% at the highest.


I have gone so far as to make this the only VM running on an HP BL465 G5 with 2 quad-core AMD 2.7GHz processors and 32GB of RAM, and to set CPU and memory reservations matching what the VM has, but it does not perform any better. Does anyone have any idea how to get better performance out of a VM with multiple vCPUs? Or why the Hardware Interrupts are so high?

Re: 2 vCPUs Hardware Interupts

2. Oct 21, 2009 5:31 PM in response to: Wadebum
kichaonline (Enthusiast, 22 posts since Feb 22, 2005)
Hi,

I'm a performance engineer at VMware. Scott Drummonds forwarded this community thread to me.

I have found a lot of things on the internet and VMware's web site about IRQ problems between USB and NICs but I don't see that here
You don't have to worry about interrupt sharing in ESX 4.0, since almost all devices (USB included) are now owned by the VMkernel and interrupts are therefore never shared with the service console. I also looked at the /proc/vmware/interrupts screenshot and it looks perfectly fine to me.

However, your esxtop screenshot piqued my interest. VM WBVMTest has 36% USED but its %RUN is 123%. Since your screenshot does not expand the VM group, I can't say for sure why your VM is accumulating more %RUN time than %USED. I have usually seen this happen with wrong CPU affinity settings, but that should also raise %Ready time, and I don't see that in the screenshot. I also see that the %PCPU and %UTIL values do not match for your system; this usually happens if a power management feature is enabled in the BIOS and CPU clock frequency scaling is in effect. Has this host run any FT-enabled VM since its last boot? I want this information to make sense of some of the counters.

It would also help if you could capture esxtop in batch mode and upload the CSV file for the case where you run the workload with 2 vCPUs.
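
Once a batch capture exists (taken with something along the lines of `esxtop -b -d 5 -n 120 > capture.csv`; check the flags against your ESX version), a short Python sketch like the one below can average the group-level CPU counters for one VM. The perfmon-style column naming shown in the comment is an assumption about the file layout, the function names are hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io

CPU_COUNTERS = ("% Used", "% Run", "% Ready")


def vm_cpu_columns(header, vm_name):
    """Map counter name -> column index for one VM's group-level CPU counters."""
    # Assumed column naming (verify against your own capture):
    #   \\host\Group Cpu(1234:VMName)\% Used
    columns = {}
    for index, name in enumerate(header):
        if "Group Cpu(" in name and f":{vm_name})" in name:
            counter = name.rsplit("\\", 1)[-1]
            if counter in CPU_COUNTERS:
                columns[counter] = index
    return columns


def average_cpu_counters(csv_text, vm_name):
    """Average each matching counter over all samples in the capture."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if len(rows) < 2:
        raise ValueError("need a header row and at least one sample")
    header, samples = rows[0], rows[1:]
    columns = vm_cpu_columns(header, vm_name)
    return {
        counter: sum(float(row[i]) for row in samples) / len(samples)
        for counter, i in columns.items()
    }


def make_sample_csv():
    """Build a tiny made-up capture in the assumed format."""
    prefix = r"\\esx01\Group Cpu(1234:WBVMTest)" + "\\"
    header = ["(PDH-CSV 4.0) (UTC)(0)"] + [prefix + c for c in CPU_COUNTERS]
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    writer.writerow(header)
    writer.writerow(["10/22/2009 10:00:00", "36.0", "123.0", "1.0"])
    writer.writerow(["10/22/2009 10:00:05", "38.0", "121.0", "3.0"])
    return buffer.getvalue()


if __name__ == "__main__":
    print(average_cpu_counters(make_sample_csv(), "WBVMTest"))
```

For a real capture, read the file contents and pass them in place of `make_sample_csv()`.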

Re: 2 vCPUs Hardware Interupts

4. Oct 22, 2009 10:28 AM in response to: Wadebum
jpdicicco (Enthusiast, 49 posts since Mar 24, 2007)

If you plan to take advantage of ESX 4's power management in the future, then you will need to set the power management to "OS Control Mode." I haven't tested the other features in ESX 4 yet, but I have them enabled under the assumption that in "OS Control Mode" only ESX can initiate their use.

JP

Re: 2 vCPUs Hardware Interupts

5. Oct 22, 2009 4:46 PM in response to: Wadebum
kichaonline (Enthusiast, 22 posts since Feb 22, 2005)
I think you may have found the problem already.
I'm glad that the issue is pretty straightforward (though personally I wished it was a little bit more challenging :-) )

As for FT we do not have it setup in our environment.
Thanks for confirming this. I had to ask because in ESX 4.0 the way we charge CPU cycles changes when you run FT VM(s) on the host.

Let me attempt to explain what you are seeing here.
Your workload (a custom script that pulls information from the AD server) seems to consume 80% of CPU on average. The "Dynamic Power Savings Mode" on HP systems controls the processor P-states based on PCPU utilization. By default this feature puts the processor into a low frequency state and steps up the clock frequency only if PCPU utilization rises above 60%. When you use a 2 or 4 vCPU VM, you spread the load across multiple PCPUs such that the average utilization of any one PCPU remains below 60%, so the processor always runs in the lower frequency mode, hence the slow performance. On AMD systems the lowest processor frequency can be as low as 50% of the rated processor frequency (i.e. a 3GHz processor will run at 1.5GHz). Ideally this should result in only a 2x performance drop, but I guess the "Low Power Halt State" is possibly also impacting performance (see next paragraph).
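
The arithmetic behind this can be sketched in a few lines of Python. This is a toy two-state model, not how the HP firmware actually decides: it assumes the load spreads evenly over the vCPUs, uses the 60% step-up point and the 50% lowest frequency quoted above, and plugs in the 2.7GHz rated clock from the original post. The function names are hypothetical.

```python
STEP_UP_THRESHOLD_PCT = 60.0   # step-up point quoted in the post
LOWEST_FREQ_RATIO = 0.5        # "as low as 50%" on AMD, per the post


def per_pcpu_utilization(total_load_pct, vcpus):
    """Assume the guest spreads a single-threaded load evenly over its vCPUs."""
    return total_load_pct / vcpus


def modeled_frequency_ghz(rated_ghz, total_load_pct, vcpus):
    """Two-state toy model: full speed above the threshold, lowest P-state below."""
    if per_pcpu_utilization(total_load_pct, vcpus) > STEP_UP_THRESHOLD_PCT:
        return rated_ghz
    return rated_ghz * LOWEST_FREQ_RATIO


if __name__ == "__main__":
    for vcpus in (1, 2, 4):
        util = per_pcpu_utilization(80.0, vcpus)
        freq = modeled_frequency_ghz(2.7, 80.0, vcpus)
        print(f"{vcpus} vCPU: {util:.0f}% per PCPU -> {freq:.2f} GHz")
```

With an 80% load, only the 1 vCPU case crosses the 60% threshold in this model; the 2 and 4 vCPU cases sit at 40% and 20% per PCPU and stay at the lowest frequency.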

"Low Power Halt State", also known as C1E halt, puts the processor into a deeper sleep when idle (for instance, the processor cache can be flushed to save power). This means the processor pays a penalty (a few wasted CPU cycles and cache misses) when it wakes up. This is usually fine if your system is mostly idle, but if your processor frequently goes in and out of the idle state (typical of bursty or I/O-bound workloads), it can affect performance quite a lot. Since the load spreads across the vCPUs when using a vSMP VM, the chances of the processor going in and out of the idle state also increase. So I suspect that this also contributes to the performance loss, especially since Windows ping-pongs a single-threaded application across multiple vCPUs to distribute the load evenly (even though this has undesirable effects both natively and in a VM).

In ESX 4.0 we now measure CPU utilization in two different ways: one with respect to the P-MAX frequency (i.e. the rated clock frequency) and the other with respect to the current clock frequency. %USED is based on the rated clock frequency; %UTIL is based on the varying clock frequency. So if %UTIL is 100, you cannot get any more juice out of the processor, but if the processor is running at half its rated max frequency, then %UTIL would be 100 while %USED would be only 50. In your esxtop screenshot I spotted that %USED and %UTIL did not match, which is why I suspected power management. Also, a %RUN of 100 means that the VM was scheduled 100% of the time during the last refresh interval; if %RUN is 100 and %USED is 50, the VM used the CPU all the time but burnt only 50% of the processor's cycles relative to its rated max.
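
As a rough illustration of that relationship, the Python sketch below scales %UTIL by the ratio of current to rated clock to get %USED, and reads the %USED to %RUN ratio as an implied clock ratio. Both formulas are simplifications inferred from the description above (they ignore any other accounting ESX may do), and the function names are hypothetical.

```python
def used_pct(util_pct, current_ghz, rated_ghz):
    """Approximate %USED from %UTIL, assuming it scales with the clock ratio."""
    return util_pct * current_ghz / rated_ghz


def implied_frequency_ratio(used, run):
    """Rough clock ratio implied by a VM's %USED and %RUN (assumption only)."""
    if run <= 0:
        raise ValueError("%RUN must be positive")
    return used / run


if __name__ == "__main__":
    # The post's example: fully busy at half the rated clock.
    print(used_pct(100.0, 1.5, 3.0))
    # %RUN 100 with %USED 50 implies roughly half the rated clock.
    print(implied_frequency_ratio(50.0, 100.0))
```

A ratio well below 1.0 on a busy VM is a hint to check the BIOS power management settings, not proof on its own.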

I'm guessing your Oracle Forms app has a similar CPU utilization pattern, so disabling power management should fix your problem. In general, for benchmarking we recommend disabling all BIOS power management features. For production it's a tradeoff, so we leave it to the customer's choice. Also, as you may already be aware, single-threaded apps are better run in a single-vCPU VM. Ping-ponging single-threaded apps across multiple CPUs has a performance impact both at the processor micro-architectural level and in the virtualization layer.

ESX 4.0 has a power management feature, but it is disabled by default. If you want to use it, set the BIOS option to "OS Controlled Mode", enable power management in the VMkernel (Advanced Settings), and reboot the host.

Hope this helps.

Re: 2 vCPUs Hardware Interupts

7. Oct 23, 2009 1:33 PM in response to: Wadebum
kichaonline (Enthusiast, 22 posts since Feb 22, 2005)
Can you give me details on the storage infrastructure that the native system was using? I'm presuming the storage infrastructure might have changed after the P2V. Also, could you point me to the LUNs that are being used by this VM so that I don't have to comb through all the data in esxtop?

Re: 2 vCPUs Hardware Interupts

10. Oct 29, 2009 11:36 AM in response to: Wadebum
kichaonline (Enthusiast, 22 posts since Feb 22, 2005)
Hi,

I had a cursory look at the esxtop data that you provided. There are many LUNs with the same ID prefix naa.600508b400069fab0001200001de0000 (you missed some bytes at the end), so I looked at all of them. There is not much I/O going to these LUNs (only a few hundred IOPS), but whatever I/O is happening has high latency. Some of it is in the range of 40 ms, which is excessive and will definitely have a big performance impact. I don't know how disk-intensive the Oracle Forms workload is, but you may want to monitor the disk latency in esxtop and see if "DAVG" is beyond 15 or 20 ms; if so, you need to look at your storage subsystem for performance problems. If DAVG is consistently less than 10 ms and you still see a performance problem, please run "vm-support -s" while the workload is running and upload the resulting dump file.
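
The rule of thumb above can be written down as a small Python sketch. The thresholds come straight from this post (15 ms as the lower end of "beyond 15 or 20 ms", and 10 ms as the "consistently less than" mark); using the peak sample as the trigger is an assumption of this sketch, and `classify_davg` is a hypothetical name.

```python
def classify_davg(samples_ms, high_ms=15.0, ok_ms=10.0):
    """Suggest a next step from a list of DAVG samples in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one DAVG sample")
    if max(samples_ms) > high_ms:
        # Any sample beyond the high mark: look at the storage subsystem.
        return "investigate storage"
    if all(sample < ok_ms for sample in samples_ms):
        # Consistently low latency: storage is unlikely to be the cause.
        return "collect vm-support"
    return "keep monitoring"


if __name__ == "__main__":
    print(classify_davg([40.0, 12.0, 8.0]))
    print(classify_davg([4.0, 6.0, 9.0]))
```

Samples that sit between the two marks are left as "keep monitoring", since the post does not say what to do in that range.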

Re: 2 vCPUs Hardware Interupts

12. Nov 4, 2009 9:39 PM in response to: Wadebum
V. (Novice, 8 posts since Sep 24, 2008)

Wade, are the vmdk(s) aligned?

www.vmware.com/pdf/esx3_partition_align.pdf

Re: 2 vCPUs Hardware Interupts

14. Nov 5, 2009 3:05 PM in response to: Wadebum
kichaonline (Enthusiast, 22 posts since Feb 22, 2005)
Hi

I haven't had a full look at the snapshot data yet, but I see there are two VMs registered on the host and both of them reside on the SAN LUN. Could you confirm that you uploaded vm-support data for the right host, and if so, please give me a pointer to your VM.
