VMware Cloud Community
FS123
Contributor

Failed to add host to EVC cluster after upgrade to 6.5

We are in the process of upgrading a vSphere 5.5 installation to 6.5U1.

vSphere 6.5U1 is our target, since Veeam doesn't yet support anything higher.

The vcenter upgrade to the appliance has been completed (vcsa 6.5.0 13000 Build 8024368).

Our hosts were on ESXi 5.5 build 6480267 (which should be fully patched via Update Manager).

We are trying to upgrade these hosts to 6.5. All 3 hosts are identical.

However, after having upgraded the first host, it wouldn't return to the cluster.

I've dragged it out of the cluster and patched it further; it is now running ESXi 6.5.0 build 8294253.

But it is still unable to join the EVC cluster (of which it used to be part, so it should be compatible).

The current EVC mode is Intel Nehalem.

When trying to add the host to the cluster we get the following error:

The host's CPU hardware should support the cluster's current Enhanced vMotion Compatibility mode, but some of the necessary CPU features are missing from the host. Check the host's BIOS configuration to ensure that no necessary features are disabled (such as XD, VT, AES, or PCLMULQDQ for Intel, or NX for AMD). For more information, see KB articles 1003212 and 1034926.

Please note: nothing has changed on these hosts, only ESXi has been remotely upgraded. No BIOS setting has been modified.

I'm starting to suspect a problem related to the Spectre patches, but have no idea how to figure this out/fix this.

Any ideas on how I can get this situation resolved?

FirstServed N.V. / S.A. Professional internet hosting & development http://www.firstserved.net
11 Replies
FS123
Contributor

I've found that if I create a second cluster with just the upgraded host and the same EVC mode applied, I can vMotion VMs between both clusters.

So they seem to be compatible...

At the moment I'm using this to spread the load by moving some less important VMs to the one node cluster.

I could move everything over to the new cluster and upgrade that way, but I'd really like to know what went wrong and how to solve it the right way, as I'll be doing the same type of migration again in the near future...

Bill_Oyler
Hot Shot

I believe this may be related to the new Spectre/Meltdown remediation handling introduced with ESXi 6.5 U1g and later (CPU microcode fixes and vCenter EVC checks).  You'll likely need all of the ESXi hosts in the cluster updated with the Spectre/Meltdown CPU microcode fixes before "new" hosts, including upgraded 6.5 hosts, are admitted to the EVC cluster.  Your work-around of creating a new EVC cluster with the newly-patched host, and gradually moving hosts over to it as they get upgraded, should work fine.  vMotion will work across the clusters since they'll be at the same EVC mode.  Ever since the EVC changes that came with the Meltdown/Spectre patches in January-March 2018, it seems harder and harder to add hosts to EVC clusters.  :)

Bill Oyler Systems Engineer
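Bill's point about microcode can be sanity-checked from the ESXi shell before trying to re-join a host. A minimal sketch (the grep pattern is an assumption; OEM images may name the microcode VIB slightly differently):

```shell
# On each ESXi host (SSH), compare the build and the microcode VIB
# across the cluster; a mismatch here is a likely cause of EVC refusals.
vmware -vl                                  # ESXi version and build number
esxcli software vib list | grep -i micro    # cpu-microcode VIB and its version
esxcli hardware cpu list | head -20         # CPU family/model/stepping
```

If the upgraded host shows a newer cpu-microcode VIB than the hosts already in the cluster (or vice versa), that asymmetry is exactly what the post-Spectre EVC checks trip over.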
diegodco31
Leadership

What is the hardware model of the ESXi servers?

Diego Oliveira
LinkedIn: http://www.linkedin.com/in/dcodiego
stanval
Contributor

Yesterday I had the same issue: we upgraded an HP DL560 G8 with VUM from ESXi 5.5 to the HP ESXi 6.5 Update 1 image (build 7388607, from Feb 2018).

After the update, the host couldn't be joined to the cluster it was in before.

We also had a new host (HP DL560 G10) installed from that same ISO image with a clean config, which couldn't be joined either.

In VUM, I uploaded the ESXi 6.5 Update 1g (build 7967591) patches from March 2018, which I downloaded from the myVMware website.

I upgraded both hosts and after that successful install, I was able to join both hosts to the cluster.

exploreeverythi
Contributor

Just curious - did you apply the 7967591 patch manually, after upgrading to 6.5 u1?

I am in a somewhat similar situation - I upgraded the host from 6.0u3 to 6.5u1 and the host won't connect to the cluster/vCenter post upgrade. The host remains in 'disconnected' state, so I can't apply the 7967591 patch via VUM either.

stanval
Contributor

You can drag the host out of the cluster, into the datacenter, then you can connect the host again.

Then you can update your host via VUM and after that you can drag the host back into the cluster without an error. (it will keep its settings)

It is indeed a two-step process: first the host upgrade, then the patch update, preferably with the host out of the cluster.
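For reference, the same upgrade-then-patch sequence can be done from the ESXi shell with esxcli instead of VUM. This is a hedged sketch: the depot URL is VMware's public online depot, and the profile name ESXi-6.5.0-20180304001-standard (assumed here to correspond to 6.5 U1g, build 7967591) should be confirmed against the profile listing first.

```shell
# Allow the host to reach the online depot over HTTPS.
esxcli network firewall ruleset set -e true -r httpClient

# List the available 6.5 image profiles and confirm the exact name.
esxcli software sources profile list \
  -d https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml \
  | grep 6.5.0-2018

# Apply the patch profile (profile name is an assumption -- verify above),
# then reboot the host to activate it.
esxcli software profile update \
  -d https://hostupdate.vmware.com/software/VUM/PRODUCTION/main/vmw-depot-index.xml \
  -p ESXi-6.5.0-20180304001-standard
reboot
```

`profile update` (rather than `profile install`) keeps third-party VIBs such as OEM drivers in place, which matters on HP custom images.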

JAPainter
Contributor

I know this is a few months old but just in case someone else runs into this...  I had this same issue while upgrading from 6.0 to 6.5 on my Nutanix cluster.  It really screwed up the one-click upgrade process in Nutanix.  For me, the cluster's Settings > VMware EVC was set to Westmere even though I was running Broadwell CPUs.  (Don't ask me, I didn't install it originally, haha).  Changing the cluster's EVC mode allowed me to move the 6.5 host back into the cluster successfully.

tsemeniuk
Contributor

My host in question was on VMware ESXi, 6.5.0, 7273056

The hosts in the cluster were on VMware ESXi, 6.5.0, 7967591

Looks like this one host missed our update cycle. I patched it to the version the cluster was on and I was able to re-join it.
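A quick way to spot a host that missed the update cycle is to compare build numbers across all hosts before re-joining them to the EVC cluster. A small sketch (the host names are placeholders for your own inventory):

```shell
# Collect the ESXi version/build from each host over SSH;
# every host in an EVC cluster should ideally report the same build.
for h in esx01 esx02 esx03; do
  echo "== $h =="
  ssh root@"$h" 'vmware -vl'
done
```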

JvodQl0D
Enthusiast

I had the same issue. I upgraded 1 of 3 hosts, and it would not re-connect.
I patched our vCenter appliance (+ reboot) and the host reconnected auto-magically.

(Otherwise I was about to go through the process of moving hosts into and out of a temporary cluster.)

EDIT: it's not the update that fixes it, it's the REBOOT.  My next host did the same, but a vCenter reboot made it reconnect.
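If rebooting the whole appliance is undesirable, restarting all vCenter services from the VCSA appliance shell may have the same effect. A hedged sketch using the standard service-control tool:

```shell
# On the vCenter Server Appliance (SSH as root):
# stop and restart all vCenter services instead of a full OS reboot.
service-control --stop --all
service-control --start --all

# If that doesn't help, fall back to rebooting the appliance:
# shutdown -r now
```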

ndizz
Contributor

This is the answer. I was updating a bunch of hosts via VUM, and for some reason two of them wouldn't join back to the cluster until I pulled them out of the cluster, reconnected them, and was finally able to remediate.

Makovar
Contributor

I had opened a VMware case on a similar issue (the same st...d EVC problems after an upgrade from 6.5 to 6.7 U2).

The following solution was recommended:

Disable EVC, upgrade all ESXi hosts, then re-enable EVC.

If you

        - cannot migrate some VMs off an ESXi host after disabling EVC, or

        - cannot re-enable EVC on the whole cluster,

then downtime for all VMs on the affected host (or the whole cluster) is needed.  :)
