VMware Cloud Community
BillClarkCNB
Enthusiast

EVC with an existing cluster

I have a 4-node cluster (ESXi 6.5); all 4 servers have the same specs and CPUs (Xeon E5-2690 v4). We bought a new server that uses the Xeon Silver 4214R CPU. I've been researching this and am still confused about which EVC mode I need and how best to proceed. According to the VMware Compatibility page, both processor types support the Haswell mode. When I step through the EVC wizard, I see "The host cannot be admitted to the cluster's current EVC mode.  Powered-on or suspended virtual machines on the host may be using CPU features hidden by that mode." This error/warning lists the 4 original hosts; the new one isn't listed. How is that possible, since the hosts can operate in Haswell mode and all the virtual machines were created on these hosts? Or am I reading that wrong, and it is basically just telling me I need to power off the virtual machines before I enable EVC mode? Better yet, why would vSphere squawk at me when both CPU sets support Haswell? Shouldn't vMotion be happy to begin with?

a_p_
Leadership

From what I found, the E5-2690 v4 is a "Broadwell Generation" processor, i.e. powered-on VMs run with Broadwell features.
Since the Silver 4200 series also supports Broadwell, this is the mode you should be able to configure online (assuming that the new host has no powered-on VMs).

André
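The reasoning above can be sketched as a small check: a mode can be enabled online only if every host supports it and it hides no feature that a powered-on VM already uses. This is a simplified model, not vCenter's actual validation logic. The function name, the three-entry ordering, and the `"newer"` placeholder for the new host's native baseline are all assumptions made for illustration.

```python
# Ordered oldest to newest. "newer" is a placeholder for whatever baseline
# the new host supports natively; it is not a real EVC mode name.
EVC_ORDER = ["haswell", "broadwell", "newer"]


def online_evc_candidates(host_max_modes, running_vm_modes):
    """Return modes every host supports that hide nothing from running VMs."""
    rank = {mode: i for i, mode in enumerate(EVC_ORDER)}
    # Ceiling: the oldest "best mode" among all hosts in the cluster.
    ceiling = min(rank[m] for m in host_max_modes)
    # Floor: the newest feature level any running VM was powered on with.
    floor = max((rank[m] for m in running_vm_modes), default=0)
    return [m for m in EVC_ORDER if floor <= rank[m] <= ceiling]


hosts = ["broadwell"] * 4 + ["newer"]  # four E5-2690 v4 hosts plus the new one
running = ["broadwell"]                # VMs powered on without EVC on the old hosts
print(online_evc_candidates(hosts, running))
```

Under these assumptions only Broadwell qualifies while the VMs are running. With no powered-on VMs, Haswell would also pass, which matches the compatibility listing the original question refers to.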

BillClarkCNB
Enthusiast

I just double-checked against the VMware Compatibility Matrix and it shows "Haswell" with a checkmark as being compatible. But a search on Intel's site for that processor shows "Product formerly Broadwell". What gives? For grins I tried using "Broadwell" mode in the EVC wizard and it gave me the same warning/error. And yes, the new host has no powered-on VMs and is actually in maintenance mode right now, but it is joined to the cluster. Should I drop it out of the cluster and then run the EVC wizard?

BillClarkCNB
Enthusiast

Ok, this is odd. I went back into the EVC wizard and selected "Broadwell", and now it says "validation succeeded". Since I have the new host in maintenance mode, and ALL current VMs were created in the current environment, I should be able to click [OK] without having to power off any VMs, right? Then do I just take the new host out of maintenance mode, or should I remove it from the cluster and then add it back in?

a_p_
Leadership

It's not the new host that's causing this issue. My guess is that this is somehow related to Intel's Spectre and Meltdown issues. I ran into such a situation about a year ago, and ended up creating a new cluster with the required EVC mode configured.

In your case you could add the new host to the new EVC-enabled cluster, power off the VMs on one "old" host, and power them on again in the new cluster. Once the "old" host is evacuated, move it to the new cluster, and proceed with the next host. This allows you to migrate the VMs host by host without the need to power off all VMs at the same time.
Make sure that you set DRS on the "old" cluster to manual, or partially automated, so that DRS will not migrate VMs to the host that you are currently evacuating.

André
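The host-by-host procedure described above can be written out as an ordered checklist. This is only a planning sketch that prints steps; it makes no vSphere API calls, and the function name, host names, and cluster name are hypothetical.

```python
def rolling_migration_plan(old_hosts, new_host, new_cluster):
    """Build the ordered steps for moving hosts into a new EVC-enabled cluster."""
    steps = [
        f"Create cluster '{new_cluster}' with the required EVC mode enabled",
        f"Add {new_host} to '{new_cluster}'",
        "Set DRS on the old cluster to manual or partially automated",
    ]
    # Evacuate one old host at a time so only its VMs are down at any moment.
    for host in old_hosts:
        steps += [
            f"Power off the VMs on {host}",
            f"Power those VMs on again in '{new_cluster}'",
            f"Move the evacuated {host} into '{new_cluster}'",
        ]
    return steps


# Hypothetical host and cluster names, for illustration only.
plan = rolling_migration_plan(["esx1", "esx2", "esx3", "esx4"], "esx5", "evc-cluster")
for number, step in enumerate(plan, start=1):
    print(f"{number}. {step}")
```

Changing the DRS setting before the first evacuation keeps DRS from moving VMs back onto a host that is being emptied.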

a_p_
Leadership

Just read your latest reply after I posted mine.

Glad to hear that you can enable Broadwell. Once set, there's nothing more you need to do.

André

BillClarkCNB
Enthusiast

You touched on something interesting in suggesting that part of the problem could be the whole Spectre/Meltdown issue. We did have problems with that on our hosts, where the patch killed hyperthreading and created a huge resource mess. That aside, I think some of the oddness is coming from the HTML5 vSphere client we are using. This is just the latest interface issue I've seen since being forced to use the HTML5 client. We are on 6.5 U3, and I'm planning on upgrading to 7 as soon as I can, hoping the client is much better in 7.

BillClarkCNB
Enthusiast

Update: Fired up the EVC wizard, set the mode to "Broadwell", clicked OK, and it went smoothly. I then took the 5th host, the one with the newer processor, out of maintenance mode and it joined with the correct EVC mode without a hitch. Practiced several vMotion tasks back and forth between the original hosts and the 5th one, and it all looks good at this point. Thanks all!

a_p_
Leadership

@BillClarkCNB 

Thanks for the feedback.

Just a quick note about DRS. CPU features are presented to a VM at power-on, i.e. you can usually vMotion VMs between hosts with different CPU generations (without EVC enabled), as long as the VM was initially powered on on the host with the smallest CPU feature set. If you want to do a test in your cluster (although I'm sure it will work, because EVC is now enabled), power on a VM on the new host, and then try to vMotion it to one of the old hosts.

André
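The power-on rule in this note can be modeled with plain sets: a running VM keeps the features it saw at power-on and can only move to a host that offers all of them. This is a toy model rather than how ESXi actually compares CPUID bits; the function names and the feature labels `f1` to `f3` are hypothetical.

```python
def visible_features(host_features, evc_baseline=None):
    """Features a VM sees at power-on: the host's set, masked by EVC if enabled."""
    if evc_baseline is None:
        return frozenset(host_features)
    return frozenset(host_features) & frozenset(evc_baseline)


def can_vmotion(vm_features, dest_host_features):
    """A running VM can move only if the destination offers everything it was given."""
    return set(vm_features) <= set(dest_host_features)


OLD_HOST = {"f1", "f2"}        # hypothetical feature names
NEW_HOST = {"f1", "f2", "f3"}  # newer CPU, modeled as a superset of the old one

vm_on_old = visible_features(OLD_HOST)
vm_on_new = visible_features(NEW_HOST)
vm_on_new_evc = visible_features(NEW_HOST, evc_baseline=OLD_HOST)

print(can_vmotion(vm_on_old, NEW_HOST))      # powered on old host, move to new
print(can_vmotion(vm_on_new, OLD_HOST))      # powered on new host, no EVC
print(can_vmotion(vm_on_new_evc, OLD_HOST))  # powered on new host, EVC baseline applied
```

In this model the second case is the one that fails, which is why the suggested test powers a VM on the new host and then moves it to an old one.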
