VMware Cloud Community
xlycqr
Contributor

Very high _raw_spin_lock cost as compared to bare-metal machine

I am profiling a video processing system. The machine has 40 CPUs across 2 sockets (20 CPUs each). Only ONE guest OS, Linux (kernel version 3.10), runs under ESXi 6.0. Profiling shows that 11.7% of processing time (user+sys) is spent in the kernel function _raw_spin_lock. On a bare-metal machine with exactly the same hardware platform, it takes only 0.33% of processing time. I know about the 'lock holder preemption' problem when multiple guest OSes run on one host. But in this case there is only one virtual machine, so there is no CPU scheduling contention between VMs. How can _raw_spin_lock take that long? The workload does involve a lot of network traffic for video data input and output. Can interrupts cause VM exits and VM entries, and so result in 'lock holder preemption'?
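
For scale, 11.7% versus 0.33% is roughly a 35-fold increase. One way to start on the interrupt question from inside the guest is to look at how interrupts are spread across CPUs. The following is a minimal Python sketch, not a definitive tool: it parses text in the format of /proc/interrupts and sums the counts per CPU, optionally only for rows whose description contains a given substring. The function names, the sample data, and the `eth0` filter are all hypothetical; substitute your own interface name.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text.

    Returns (n_cpus, rows), where rows maps an IRQ label to a tuple of
    (list of per-CPU counts, description string).
    """
    lines = text.strip("\n").splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    rows = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        # Only the first n_cpus fields can be counters; stop at the first
        # non-numeric field (summary rows such as ERR have fewer columns).
        for field in fields[:n_cpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        rows[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return n_cpus, rows


def per_cpu_totals(text, name_filter=""):
    """Sum interrupt counts per CPU for rows whose description contains name_filter."""
    n_cpus, rows = parse_interrupts(text)
    totals = [0] * n_cpus
    for counts, description in rows.values():
        if len(counts) == n_cpus and name_filter in description:
            totals = [t + c for t, c in zip(totals, counts)]
    return totals


# Made-up sample in the usual /proc/interrupts layout (2 CPUs).
SAMPLE = """\
           CPU0       CPU1
  0:         45          0   IO-APIC-edge      timer
 56:       1000        200   PCI-MSI-edge      eth0-rxtx-0
 57:         10       3000   PCI-MSI-edge      eth0-rxtx-1
ERR:          0
"""
```

Reading /proc/interrupts twice, a few seconds apart, and subtracting the two results of `per_cpu_totals(...)` gives a per-CPU interrupt rate. This only shows the interrupt load as the guest sees it; it does not by itself show VM exits, which would have to be observed on the ESXi side.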


All inputs are appreciated! Thank you!

5 Replies
thakala
Hot Shot

Executing a privileged CPU instruction requires a VM exit; this covers most hardware access, such as interrupts and memory allocation. You can reduce the overhead of hardware access by using paravirtual devices, such as VMXNET3 for the guest NIC and PVSCSI for the guest SCSI adapter. If you can live without some hypervisor features, such as snapshots, vMotion, and Storage vMotion, you can lower the network overhead even further with DirectPath I/O, which exposes a physical NIC directly to the guest VM.

Tomi http://v-reality.info
xlycqr
Contributor

Thank you very much! That sounds great! Could you explain a little more about how paravirtual devices such as VMXNET3 achieve better performance? Do they require no VM exits at all?

I have one more question, though: when a privileged CPU instruction requires a VM exit, does only the current vCPU exit, or do all vCPUs of the same VM exit? If only the vCPU executing the privileged instruction exits, we should see a performance improvement from less interference with other threads (and thus less lock holder preemption). Is that correct?

xlycqr
Contributor

I just found that the network adapter is already set to VMXNET3.
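
The adapter type in the VM settings can also be cross-checked from inside the Linux guest by looking at which kernel driver is bound to the interface. Below is a minimal Python sketch that assumes the usual sysfs layout, where /sys/class/net/<iface>/device/driver is a symlink to the driver directory. The function name and the `sysfs_root` parameter are hypothetical and exist only to make the sketch easy to try against a test directory.

```python
import os


def nic_driver(iface, sysfs_root="/sys"):
    """Return the name of the kernel driver bound to iface, or None.

    None is returned when the interface does not exist or has no backing
    device (for example, the loopback interface).
    """
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    try:
        target = os.readlink(link)  # e.g. ../../../bus/pci/drivers/vmxnet3
    except OSError:
        return None
    return os.path.basename(target)
```

On a guest that really uses the paravirtual NIC, `nic_driver("eth0")` would be expected to return `vmxnet3`; the interface name `eth0` is an assumption and may differ on your system.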

xlycqr
Contributor

The SCSI controller type is LSI Logic SAS. Should it be VMware Paravirtual?
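
To see at a glance which virtual device types a VM is configured with, the `virtualDev` entries in its .vmx file can be listed. The Python sketch below is a rough illustration that assumes the common `key = "value"` line format of .vmx files. The function name and the sample text are hypothetical, and the values `lsisas1068` (LSI Logic SAS) and `vmxnet3` in the sample are the identifiers as far as I know them; check them against your own VM's configuration.

```python
import re

# One .vmx setting per line, in the form: key = "value"
_SETTING = re.compile(r'^\s*([^#=\s]+)\s*=\s*"(.*)"\s*$')


def virtual_devices(vmx_text):
    """Map each device (e.g. 'scsi0') to its virtualDev value from .vmx text."""
    devices = {}
    for line in vmx_text.splitlines():
        match = _SETTING.match(line)
        if not match:
            continue
        key, value = match.group(1), match.group(2)
        if key.lower().endswith(".virtualdev"):
            devices[key.split(".")[0]] = value
    return devices


# Made-up excerpt for illustration only.
SAMPLE_VMX = '''
scsi0.present = "TRUE"
scsi0.virtualDev = "lsisas1068"
ethernet0.virtualDev = "vmxnet3"
'''
```

With the sample above, `virtual_devices(SAMPLE_VMX)` reports `scsi0` as `lsisas1068`; a controller switched to the paravirtual type would be expected to show `pvscsi` instead. Before changing the controller for a boot disk, it is worth confirming that the guest has the PVSCSI driver available at boot.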

xlycqr
Contributor

It sounds like I should use huge pages instead of 4 KB pages to minimize TLB shootdowns, and thus avoid VM exits for better performance. Is that correct? Thank you!
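
Before changing anything, it may help to check what the guest is doing with huge pages today. The Python sketch below is only a starting point and does not answer whether huge pages reduce VM exits. It parses the text of /sys/kernel/mm/transparent_hugepage/enabled (where the active mode is shown in square brackets) and picks the huge page counters out of /proc/meminfo. The function names are hypothetical.

```python
import re


def active_thp_mode(enabled_text):
    """Return the bracketed mode from text like 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", enabled_text)
    return match.group(1) if match else None


def hugepage_summary(meminfo_text):
    """Extract huge page related counters from /proc/meminfo-style text.

    Values are returned as integers exactly as printed (page counts for
    HugePages_* fields, kB for AnonHugePages and Hugepagesize).
    """
    wanted = ("AnonHugePages", "HugePages_Total", "HugePages_Free", "Hugepagesize")
    summary = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if key in wanted and fields:
            summary[key] = int(fields[0])
    return summary
```

Typical use would be `active_thp_mode(open("/sys/kernel/mm/transparent_hugepage/enabled").read())` and `hugepage_summary(open("/proc/meminfo").read())`. A nonzero `AnonHugePages` suggests transparent huge pages are already backing some of the workload's memory.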
