Hi Guys,
We're undertaking some performance benchmarking as part of a proposed upgrade from 4.1U3 to 5.5U1. Running identical VMs (Red Hat 5.5, 2x vCPU, 4096 MB RAM, up-to-date Tools and vHW) on identical hosts (Dell M610, 8x 2.9 GHz CPU / 32 GB RAM), we see some notable differences when comparing esxtop output for VM CPU statistics.
For example - at idle, ESXi 5.5 is running nearly 500 worlds, whereas 4.1 is running 190.
At the VM level, when they are running our test application, we see the following significant differences:
4.1 (per VM):
NWLD = 5
%WAIT = 480
%RDY = 0.05
TIMER/s (Interrupts) = 3003
EMIN = ~7300
5.5 (per VM):
NWLD = 8
%WAIT = 780
%RDY = 3.0
TIMER/s (Interrupts) = 1001
EMIN = ~6200
The knock-on effect is that we are losing UDP packets, but that's another SR entirely!
Thanks for your time,
Dan
> At the VM level, when they are running our test application, we see the following significant differences:
What is the application performance within the VM, disregarding all the other esxtop counters? Is there a measurable impact within the guest, and what is used to measure the performance? Since you already mention an SR, have you uploaded performance logs, and could you PM me the SR number so I can have a look at those?
Hi Frank,
The measurable impact is that CPU usage in the 5.5 VMs is notably higher, and the application (an iperf UDP test) drops packets on 5.5 but not on 4.1.
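For reference, the test is along these lines (the address and bandwidth below are illustrative rather than our exact parameters):

# on the receiving VM
iperf -s -u -i 1

# on the sending VM: 60-second UDP stream at an example rate of 500 Mbit/s
iperf -c 192.168.10.20 -u -b 500M -t 60 -i 1

In UDP mode the iperf server reports lost/total datagrams at the end of the run, which is where the packet loss shows up.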
I'll PM the SR number and my email address - by all means take a look if you get a chance.
Cheers,
Dan
Have you set the host's power management settings in the BIOS to high performance or OS control? (And in the latter case, maybe to high performance in the ESXi configuration as well.) There have been some significant changes since ESXi 5.0:
http://www.vmware.com/files/pdf/techpaper/VMware-vSphere-Performance-Management-and-Performance.pdf
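If you'd rather check from the ESXi shell than the vSphere Client, the active CPU power policy is exposed as an advanced setting; as a quick sketch (the option path is /Power/CpuPolicy on 5.x as far as I know):

esxcli system settings advanced list -o /Power/CpuPolicy

Changing the policy itself is usually done in the vSphere Client under the host's Configuration > Power Management section, assuming the BIOS is set to OS control.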
> For example - at idle, ESXi 5.5 is running nearly 500 worlds, whereas 4.1 is running 190.
This doesn't really say anything with regard to performance, and there are a lot more components and sub-processes running in more recent ESXi versions, so it's nothing to worry about per se.
In the case of NWLD it's similar to the above. I'm not sure whether this was already the case in 4.1, but since at least 5.1 the host runs a separate world instance not only per vCPU, but also per virtual disk and for the SVGA/GPU device.
%WAIT is useless to compare, because it includes idle time. Look at %VMWAIT instead.
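To compare the two hosts over a longer interval rather than eyeballing the live view, an esxtop batch capture is handy (the delay and iteration count below are just examples); the resulting CSV can be opened in Windows perfmon or a spreadsheet:

esxtop -b -d 5 -n 120 > /tmp/esxtop-capture.csv

That gives you 10 minutes of samples at 5-second intervals, including the per-world CPU ready/wait counters.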
EMIN is not a performance indicator either. It just says how many MHz the VM would get if the host were running at 100% CPU load. It can fluctuate a lot because it's calculated dynamically, depending on how many VMs/vCPUs are running at a given time and whether there are resource shares/guarantees in place:
EMIN: The Effective Min in MHz for the Resource Pool/World. The amount of CPU resources guaranteed to the world if all the worlds on the system start contending for CPU resources. The ESX VMkernel dynamically calculates the EMIN value for all worlds based on the resource settings (Reservations, Limits and Shares) of all the resource pools and VMs on a system. This statistic is for VMware internal use only.
The %RDY value does seem significantly higher, but anything below 5% per vCPU can generally be considered OK. It can also fluctuate depending on how many vCPUs are running on the host. Did you compare it with the same number of VMs/vCPUs and the same workload running on both hosts?
Since you mentioned testing with iperf, are your VMs using an e1000 or vmxnet3 vNIC? Do you see dropped packets in the esxtop network view (%DRPTX, %DRPRX)?
You can also check some counters related to frame drops/errors on the physical NIC with esxcli network nic stats get -n vmnicX
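For example, from the ESXi shell (vmnic0 below is just a placeholder for whichever uplink the VM's port group actually uses):

# find the uplinks first
esxcli network nic list

# then look at the receive/transmit dropped and error counters for that uplink
esxcli network nic stats get -n vmnic0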
How is the throughput with iperf if you're testing TCP traffic?
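For example, something like this on the same pair of VMs (duration and stream count here are just examples):

# server side
iperf -s

# client side: 60-second TCP test with 4 parallel streams
iperf -c <server-ip> -t 60 -i 1 -P 4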
Also make sure the host's NIC drivers and firmware are up to date.
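You can read the loaded driver and firmware versions straight from the host (vmnic0 again just a placeholder) and compare them against the HCL / your vendor's recommended versions:

esxcli network nic get -n vmnic0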
