vCSA 6.7 U2, ESXi 6.7 U2, 4 nodes.
All-flash vSAN, a single disk group on each host.
Today we added one additional flash disk to the disk group to extend capacity. After several hours of rebalancing everything looked good, and vSAN Health showed all green.
Then we upgraded vCSA from 6.7 U2 to 6.7 U3, which succeeded.
Then we upgraded ESXi from 6.7 U2 to 6.7 U3. As we don't have a DRS license, we did the upgrade manually:
1. host1: we manually moved all VMs to another host and put the host into maintenance mode with "Ensure data accessibility", then patched it with Update Manager.
Patching succeeded. host1 came back online, exited maintenance mode, and rebalanced.
2. Then host2, same as host1: succeeded.
3. host3: when we move the VMs to other hosts, the migration fails as shown in the picture, always at 72%.
Are there any clues for this situation? Thanks.
Please check whether the VM's vmware.log contains further details about the error.
It seems to be caused by a CPU feature mismatch between hosts.
Yes, I observed that, but I still have no idea where the problem is.
Is your cluster using identical CPUs on each node?
If so, are some of your nodes on a different vSphere version than others?
Your VM does use vpmc.enable=true, so the KB might describe the reason for your problem.
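If you want to check other VMs for the same setting, here is a minimal Python sketch that reads the text of a .vmx file and reports whether vpmc.enable is set to TRUE. It assumes the usual `key = "value"` layout of .vmx files; the helper names and the sample excerpt are made up for illustration.

```python
def parse_vmx(text: str) -> dict:
    """Parse 'key = "value"' lines from .vmx content into a dict (keys lowercased)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not a key/value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings


def vpmc_enabled(text: str) -> bool:
    """Return True when vpmc.enable is set to TRUE (case-insensitive)."""
    return parse_vmx(text).get("vpmc.enable", "FALSE").upper() == "TRUE"


# Hypothetical excerpt, not taken from the VM in this thread.
sample = 'displayName = "example-vm"\nvpmc.enable = "TRUE"\n'
print(vpmc_enabled(sample))
```

Copy the .vmx file off the datastore first and pass its contents in; the sketch never touches a live host.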
name: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz codename: Skylake EP/EN/EX
name: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz codename: Cascade Lake
host1, host2 6.7.0 build-17700523
host3, host4 6.7.0 build-16075168
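To make the version split easier to see, a small Python sketch that groups the hosts by build number. The build numbers are the ones listed above; the function name is just for illustration.

```python
from collections import defaultdict

# Build numbers as reported above.
host_builds = {
    "host1": "17700523",
    "host2": "17700523",
    "host3": "16075168",
    "host4": "16075168",
}


def group_by_build(builds: dict) -> dict:
    """Map each build number to the sorted list of hosts running it."""
    groups = defaultdict(list)
    for host, build in sorted(builds.items()):
        groups[build].append(host)
    return dict(groups)


groups = group_by_build(host_builds)
for build, hosts in groups.items():
    print(f"build-{build}: {', '.join(hosts)}")

# More than one group means the cluster is running mixed builds.
print("mixed builds:", len(groups) > 1)
```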
The KB mentions the CPUs below. Are those exactly the affected CPUs my cluster is using? I don't know how to match mine against them.
 Intel® Xeon® Processor E3 v5 and v6 Family (codename Skylake, Kaby Lake)
Intel® Xeon® D (code name Skylake-D)
Intel® Xeon® Scalable Processor and 6th, 7th, and 8th Generation Intel® Core™ i7 and i5 (code name Skylake, Kaby Lake, Coffee Lake and Whiskey Lake)
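One rough way to match them is by code name. The Python sketch below does a naive substring comparison between the code names reported for the two CPU models above and the code names quoted from the KB list. It only compares names and is an assumption-laden shortcut, not an official check; the function name is made up.

```python
# Code names quoted in the KB list above ("Skylake-D" is covered by the "Skylake" substring).
KB_CODENAMES = ["Skylake", "Kaby Lake", "Coffee Lake", "Whiskey Lake"]


def kb_match(codename: str) -> list:
    """Return the KB code names that appear in the given CPU code name string."""
    lowered = codename.lower()
    return [name for name in KB_CODENAMES if name.lower() in lowered]


# CPU models and code names as reported in this thread.
cpus = {
    "Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz": "Skylake EP/EN/EX",
    "Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz": "Cascade Lake",
}

for model, codename in cpus.items():
    hits = kb_match(codename)
    status = "listed in KB" if hits else "not listed in KB"
    print(f"{model} ({codename}): {status}")
```

By this comparison the Gold 6130 (Skylake) falls under the KB list, while the Gold 6230 (Cascade Lake) does not appear in it.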
If I understand you correctly, the current situation looks like this.
I might be wrong, but I assume you don't have EVC enabled on your cluster.
If I'm right, you are definitely affected by the KB article you mentioned, as the VMs running on Host 3 can't be migrated to any other host.
Hosts 1 and 2 are now on ESXi 6.7 P05, which would prevent migration, and Host 4 is equipped with Cascade Lake CPUs, which already include the microcode update.
So you should stop those VMs and run a cold migration.