Maybe you can isolate the VMs that are triggering the EVC error, enable EVC on those VMs manually, and then try again.
Check /var/log/vmware/vpxd/vpxd.log at the moment you try to enable EVC and get the error message; it might show which VMs are involved.
This is not tested, but I am trying to work out whether a single VM might be causing this issue.
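A rough sketch of that log check, in case it helps. It only keeps the lines that mention a keyword ("EVC" by default), so you can read through them right after reproducing the error. The function name `evc_lines` is made up for illustration, and nothing here assumes a particular vpxd.log line format; it is plain substring matching.

```python
import re
from typing import Iterable, List


def evc_lines(lines: Iterable[str], keyword: str = "EVC") -> List[str]:
    """Return the lines that mention `keyword`, case-insensitively.

    `lines` can be an open file object or any iterable of strings.
    No log format is assumed; this is a plain text search.
    """
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return [line.rstrip("\n") for line in lines if pattern.search(line)]


def evc_lines_from_file(path: str, keyword: str = "EVC") -> List[str]:
    """Convenience wrapper: read a log file and filter it.

    errors="replace" avoids a crash if the log contains odd bytes.
    """
    with open(path, errors="replace") as handle:
        return evc_lines(handle, keyword)
```

For example, calling `evc_lines_from_file("/var/log/vmware/vpxd/vpxd.log")` on the vCSA just after the failed attempt returns the matching lines; whether they actually name the offending VMs is something I have not verified.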
I ran into the same issue this week with a Skylake CPU cluster (all hosts with the same 61xx CPUs, and the latest vCSA + ESXi 6.7 patches installed). I couldn't figure out the root cause, and due to time pressure we decided to evacuate some hosts, move them to a new EVC-enabled cluster, and do a rolling VM power cycle to cold migrate the VMs to the new cluster. In my case it was just a 2-node cluster with only about 20 VMs, but I can feel your pain.
IMO there's something wrong with EVC, i.e. the EVC CPU features not matching the physical CPU features. Maybe it's worth raising a support call with VMware.
It turns out I have encountered the same issue in multiple VMware environments, and clearly there is a regression compared to vSphere 6.5.
In 6.5, if a cluster was composed of hosts with the same CPU generation (say intel-skylake), it was indeed possible to enable the matching EVC mode with all VMs powered on.
Of course, the best practice has always been, and still is, to enable EVC mode when you initially form the cluster. However, it used to be possible to enable it afterwards, with VMs powered on, just before adding new hosts with newer-generation CPUs to the cluster.
This is definitely not possible anymore in ESXi 6.7, and it seems related to the new per-VM EVC mode, which is totally useless by the way (my main issue with it is that you MUST shut down your VMs to enable it; why can't it be applied automatically at the next VM reboot, like a VM hardware compatibility upgrade?)
6.7 also does really weird things with the per-VM EVC mode. Although the documentation specifies that this mode requires VM hardware version 14 at minimum, almost all my VMs, which run older hardware versions, have some sort of EVC mode enabled. You can see it this way:
- In vCenter, select the vSphere cluster, then go to the "VMs" tab
- Click on a table column header, and in "Show/Hide Columns", enable the columns "EVC Mode" and "Compatibility"
Why the heck do I have some VMs with a HW version lower than 14 that display an EVC mode??? Why is it different on some VMs? Why do some VMs have no EVC mode displayed?? This is a total mystery, and I cannot find any information anywhere about this weird behaviour... I guess it is related to the host the VM was initially booted on, but all those VMs have been running for several years in this cluster, and have been rebooted since.
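To get the list of odd VMs without eyeballing the table, here is a small sketch. It assumes you have copied the VM name plus the "Compatibility" and "EVC Mode" column values into rows yourself. The `VmRow` type and both function names are hypothetical, and the assumption that the compatibility text contains something like "VM version 13" or "vmx-13" is mine, so adjust the pattern to whatever your client shows.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class VmRow(NamedTuple):
    name: str
    compatibility: str  # text of the "Compatibility" column
    evc_mode: str       # text of the "EVC Mode" column, empty if none shown


# Assumed formats: "... (VM version 13)" or "vmx-13".
_VERSION = re.compile(r"(?:VM version\s*|vmx-)(\d+)", re.IGNORECASE)


def hw_version(compatibility: str) -> Optional[int]:
    """Extract the hardware version number, or None if not recognised."""
    match = _VERSION.search(compatibility)
    return int(match.group(1)) if match else None


def unexpected_evc(rows: Iterable[VmRow], min_version: int = 14) -> List[str]:
    """Names of VMs below `min_version` that still display an EVC mode."""
    flagged = []
    for row in rows:
        version = hw_version(row.compatibility)
        # Rows with an unrecognised version are skipped, not flagged.
        if version is not None and version < min_version and row.evc_mode.strip():
            flagged.append(row.name)
    return flagged
```

The default of 14 comes from the documented minimum hardware version for per-VM EVC mentioned above; the script only reports the mismatch and does not explain it.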
Wish I could one day understand what is happening here...