Hello, I need your advice.
I have a number of hosts in a cluster which complain about too much CPU contention. However, CPU utilization on all hosts is still below 20%. Kindly assist if there is any known reason why we have high CPU contention.
There are 3 CPU metrics that you need to know about:
CPU usage = The amount of CPU that the VM is using (getting from the host)
CPU demand = The amount of CPU the VM is demanding from the host
CPU contention = The difference between the above two (demand minus usage).
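To make the relationship between the three metrics concrete, here is a minimal Python sketch that follows the definition given above. The function names and the MHz figures are made up for illustration, and expressing contention as a percentage of demand is my own assumption, not necessarily how vCenter or any monitoring tool reports it.

```python
def cpu_contention(demand_mhz: float, usage_mhz: float) -> float:
    """Contention as described above: what the VM asked for minus what it got.

    Clamped at zero so a VM that got everything it demanded shows no contention.
    """
    return max(demand_mhz - usage_mhz, 0.0)


def contention_percent(demand_mhz: float, usage_mhz: float) -> float:
    """Contention relative to demand (an illustrative choice of denominator)."""
    if demand_mhz <= 0:
        return 0.0
    return 100.0 * cpu_contention(demand_mhz, usage_mhz) / demand_mhz


# Hypothetical VM: demands 2000 MHz but is only scheduled for 1500 MHz.
print(cpu_contention(2000, 1500))      # 500.0
print(contention_percent(2000, 1500))  # 25.0
```

Note that neither function looks at how busy the host is overall, which is why a host at 20% utilization can still show contention.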
CPU contention has nothing really to do with the CPU usage of the cluster. In my experience it points to 3 things.
The first two are due to the VMs' vCPUs not being able to be scheduled on the physical cores because there are too many of them.
The third one is because physical cores are powered off (and the cache too, depending on what it is set to) when not in use, and the lag when a VM needs a powered-off core can generate contention.
If your hosts' power setting is already on high performance, then you will need to reduce the number of vCPUs in the VMs or move VMs off the cluster. I would only worry if you are experiencing performance problems.
The main reason I see contention getting out of control is that someone complains a VM is not performing, and the first port of call is always "we need to double the CPU". The VM is blindly upgraded, which generates more contention and makes the VM run slower. This is where right-sizing is important. Only give a VM more vCPUs if it needs them.
Not sure why this is the case, and what exactly the impact of CPU contention on virtual machines can be.
You say the host only has 2 VMs on it, and has 80 logical CPUs available? How many vCPUs have you assigned to those VMs, what version of vSphere are you running, and is hyper-threading enabled correctly on the host?
Contention, as previously mentioned, has nothing to do with usage, but everything to do with access to resource.
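As a rough way to answer the "how many vCPUs versus how many logical CPUs" question, a small Python sketch like the one below can help. The 80 logical CPUs come from this thread; the vCPU counts of 16 and 32 and the function names are hypothetical, and there is no official threshold implied here, it only shows the arithmetic.

```python
from typing import List, Sequence


def vcpu_to_lcpu_ratio(vcpus_per_vm: Sequence[int], logical_cpus: int) -> float:
    """Total vCPUs assigned on a host divided by its logical CPU count."""
    if logical_cpus <= 0:
        raise ValueError("logical_cpus must be positive")
    return sum(vcpus_per_vm) / logical_cpus


def vms_wider_than_host(vcpus_per_vm: Sequence[int], logical_cpus: int) -> List[int]:
    """vCPU counts of VMs that ask for more vCPUs than the host has logical CPUs."""
    return [v for v in vcpus_per_vm if v > logical_cpus]


# Host with 80 logical CPUs and two VMs (vCPU counts are invented for the example).
vms = [16, 32]
print(vcpu_to_lcpu_ratio(vms, 80))   # 0.6
print(vms_wider_than_host(vms, 80))  # []
```

A ratio well below 1 with no oversized VMs suggests scheduling pressure is unlikely to be the cause, so other factors are worth checking.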
OK, then that should be absolutely fine.
What storage are they sat on? Is it vSAN by any chance?
If so, SSH to the host and run esxtop. Do you see lots of python processes? If so, you can check a little further with ps -Tcjstv | grep python, which will show you what has spawned each process. I've seen a few hosts with high CPU caused by a vSAN script spawning multiple processes. You can then kill them with kill <process id taken from esxtop>. Compare the output of the ps command against a host without the issue if you want an idea of what should be visible.
The above also applies if it is something else. Have a look at esxtop and see which processes have the highest %USED values. It might lead you to something else utilising the CPUs enough to cause contention for the two VMs.
Thanks all, I got it resolved. It was the power setting. It was set to performance per watt, while in vCenter it was set to high performance. We applied the setting and that cleared the issue. CPU contention dropped to zero after that.