VMware Cloud Community
hyperPOSH
Contributor

Bind Real-Time CPU process of multi-core vCPU Linux OS to specific PCPU on Host

We have been working to get a real-time, latency-sensitive communications application to run without errors in a virtual environment under the ESXi 6.5.0 hypervisor. Unfortunately, to date we have had limited success. The communications system consists of 2 main parts:

1) A Linux (RHEL6.8) comms/audio process we wrote that runs on a dedicated CPU.  This is our RT CPU; it runs at a 1,000 Hz frame rate and also talks to the Ethernet interface (#2 below).

2) An Ethernet interface (Gigabit) to one or more audio distribution devices on a closed LAN that is very time-sensitive (i.e. a SYNC’d clock network).
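
For scale, the 1,000 Hz frame rate in item 1 works out to a 1 ms frame period. Below is a minimal Python 3 sketch of how worst-case frame lateness could be measured inside the guest. The name `measure_lateness` and the frame count are hypothetical, chosen only for illustration, and this is not the actual comms process.

```python
import time

FRAME_HZ = 1000                          # frame rate quoted in item 1
PERIOD_NS = 1_000_000_000 // FRAME_HZ    # 1 ms per frame, in nanoseconds


def measure_lateness(frames=200):
    """Run `frames` periodic wakeups and return the worst lateness in ns."""
    next_deadline = time.monotonic_ns() + PERIOD_NS
    worst = 0
    for _ in range(frames):
        # Sleep until the next deadline; skip the sleep if already late.
        remaining = next_deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        # How far past the deadline did we actually wake up?
        late = time.monotonic_ns() - next_deadline
        worst = max(worst, late)
        next_deadline += PERIOD_NS
    return worst
```

Comparing the value returned on bare hardware with the value returned inside the VM would give a rough measure of how much scheduling jitter the guest sees.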

Our customer base has historically used dedicated hardware for each instance of the above.  That hardware is a quad core or better system that runs RHEL6.8 natively plus our software.  We do not have any audio issues in this configuration.

Our Host for Virtualization

We have a Supermicro SYS-7038A-I with an Intel Xeon CPU E5-2660 v3 at 2.6 GHz (10 cores), 32 GB RAM and 2 Gigabit NICs.  The NIC for item #2 above is set up in passthrough mode for the best results, based in part on recommendations in your white paper on latency-sensitive applications.

VM Instance Definition

Note: For everything that follows we are running only a single VM instance of our application, so there is no interference from ‘other’ VMs on the same host.  Also, for all of the results below, Latency Sensitivity is set to HIGH and full CPU resources are reserved.

VM Working Configuration

We are able to get our comms application to work in a VM instance, but only when Affinity is either ‘undefined’ or set to all CPUs, which is ‘0-19’ in this case because hyper-threading is enabled.  When this is working you will see a chart like the one in:

ESXi_MONITOR_PCPU_CONSTANT.png

If you look at this chart you will see that PCPU#7 is the CPU running our real-time core process (RT CPU) and PCPU#10 is running our non-real-time core (NRT CPU) process.  The important part is that once I start our software (power on the server), the RT CPU stays on the same PCPU FOREVER and never moves.

VM Non-Working Configuration

This is the same as the working configuration, except that instead of ‘undefined’ or ‘0-19’ we set the affinity to ‘0-17’.  As it is still the only VM running, it still has full access to those 18 CPUs (as opposed to 20), yet the RT CPU moves around all the time, as can be seen in the chart:

ESXi_MONITOR_PCPU_MOVING#.png

It is this movement of our RT CPU process from PCPU to PCPU that ‘may’ cause audio break-up.  I say ‘may’ because a transition does not guarantee that our system will break up; however, to date it has been the only cause of break-up.  Hence if I can lock our RT CPU to a PCPU, I can get this all working!  I have tried other affinity settings such as ‘0-3’ or ‘10, 12, 14, 15’ with the same basic results.

So I guess that gets me to my more specific question:

If I have a VM instance defined with, say, 2 vCPUs, and one of those CPUs will be running a time-sensitive process, is there a mechanism to lock that RT CPU to a PCPU for the duration of that VM instance's uptime (i.e. until power-down)?

Thanks

Paul

2 Replies
AishR
VMware Employee

Assign a Virtual Machine to a Specific Processor in the vSphere Web Client. For reference: https://docs.vmware.com/en/VMware-vSphere/5.5/com.vmware.vsphere.resmgmt.doc/GUID-F40F901D-C1A7-43E2...

hyperPOSH
Contributor

The problem is that you cannot bind an individual vCPU to an individual PCPU in a multicore VM instance.  In a 2-core platform, for example, I can reserve PCPU 8 and 9.  However, if I have a process running on 1 core, that process will move as the PCPU moves.  For example:

process 1 - cpu0

process 2 - cpu1

If this is on bare-metal hardware it never moves.  Now with affinity set in a VM you get:

process 1 - cpu0 = PCPU8

process 2 - cpu1 = PCPU9

and then

process 1 - cpu0 = PCPU9

process 2 - cpu1 = PCPU8

and so on...........
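
The guest-side half of the mapping above (process 1 on cpu0, process 2 on cpu1) can be sketched with the Linux scheduler affinity call. This is a minimal sketch, assuming Python 3 on a Linux guest; `pin_to_guest_cpu` is a hypothetical helper name. It only fixes a process to a guest CPU. It has no say over which PCPU the hypervisor uses to run that vCPU, which is the part that moves.

```python
import os


def pin_to_guest_cpu(cpu, pid=0):
    """Restrict process `pid` (0 = the calling process) to one guest CPU.

    Returns the resulting affinity set so the caller can verify it.
    """
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)
```

Calling `pin_to_guest_cpu(0)` from process 1 and `pin_to_guest_cpu(1)` from process 2 would give the "process 1 - cpu0, process 2 - cpu1" layout inside the guest.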
