VMware Cloud Community
alwalters
Contributor

Compatibility issues with attempting to vMotion - only some VMs

I'm attempting to use vMotion to move 3 VMs from one host to another.  Of the three, I was able to move one successfully, but the other two are giving a compatibility error: "CPUID details: incompatibility at level 0x1 register 'ecx'."  Everything I'm finding about how to resolve this deals with settings at the host level.  However, given that the one VM moved with no issues, I'm wondering: is there anything that could be set at the VM level that could affect this?

sk84
Expert

Do all 3 hosts in the cluster have the same CPUs? It looks like your cluster consists of hosts with different CPU generations or models (for example, Intel Sandy Bridge, Kaby Lake, Skylake, etc.). When you power on a virtual machine, the CPU feature set of the host is applied to it; if that host's feature set has more features (because it is a newer CPU generation), the VM cannot be moved to a host with fewer CPU features.
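
The compatibility check can be pictured as a set-containment test: a powered-on VM carries the CPU feature set it was given at boot, and the destination host must offer at least those features. A minimal sketch in Python (the feature names and host sets below are illustrative, not real CPUID inventories):

```python
# Hypothetical sketch of the vMotion CPU compatibility check described above.
# Feature names and host sets are illustrative, not real CPUID inventories.

def can_vmotion(vm_feature_set, dest_host_features):
    """A powered-on VM carries the feature set of the host it booted on;
    the destination must offer at least those features (subset test)."""
    return vm_feature_set <= dest_host_features

old_host = {"sse3", "ssse3", "sse4.1", "sse4.2", "aes", "avx"}  # e.g. an older generation
new_host = old_host | {"f16c", "rdrand"}                        # newer CPU adds features

vm_booted_on_old = old_host   # VM inherits the smaller set at power-on
vm_booted_on_new = new_host   # VM inherits the larger set at power-on

print(can_vmotion(vm_booted_on_old, new_host))  # True  - destination has everything needed
print(can_vmotion(vm_booted_on_new, old_host))  # False - destination lacks f16c/rdrand
```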

---
Regards, Sebastian
VCP6.5-DCV // VCP7-CMA // vSAN 2017 Specialist
Please mark this answer as 'helpful' or 'correct' if you think your question has been answered correctly.
alwalters
Contributor

There are two hosts, and I'm trying to move all VMs from host1 to host2.  host1 and host2 do have different CPUs, but they had different CPUs when I just moved VM-B.  Yet VM-A and VM-C (same source host1, same destination host2) won't move.  The host I'm attempting to move to (host2) is the newer one.

sk84
Expert

Okay. That's what I thought.

It depends on where your VMs were powered on. If VM-A was powered on on the host with the newer CPU, the VM inherits all CPU features of the newer CPU and can be moved to and from the host with the older CPU: it meets all CPU requirements of the newer CPU and also of the older CPU. Usually the CPU feature scope only expands, so the VM has more CPU features than the old host requires.

In contrast, a VM started on the host with the old CPU has fewer CPU features than the host with the newer CPU requires. This means that it cannot be moved to the newer host when it is powered-on.

And I think that's what happened. VM-B was initially powered-on on the new host and VM-A and VM-C were powered-on on the old host.

What you can do now is power off VM-B and enable EVC mode on the cluster. Set it to the CPU generation of the older CPU. After that you can power on VM-B again, and all VMs should be able to migrate between the hosts without problems.
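
Conceptually, EVC masks every host in the cluster down to a common baseline, roughly the intersection of the members' feature sets, which is why the baseline has to match the oldest generation and why VMs already using newer features must be powered off first. A toy illustration (the feature names are made up for the example):

```python
# Toy model of an EVC baseline: the cluster advertises only the features
# common to all member hosts, so VMs powered on under EVC can migrate freely.

sandy_bridge = {"sse3", "ssse3", "sse4.1", "sse4.2", "aes", "avx"}
ivy_bridge = sandy_bridge | {"f16c", "rdrand"}   # newer CPU is a superset

evc_baseline = sandy_bridge & ivy_bridge          # intersection = oldest generation

def migratable_everywhere(vm_features):
    """A VM powered on under EVC sees only the baseline, so it fits any host."""
    return vm_features <= evc_baseline

print(evc_baseline == sandy_bridge)          # True: baseline equals the older CPU
print(migratable_everywhere(evc_baseline))   # True
```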

alwalters
Contributor

All three VMs were powered on on the old host, not the new.  They were all running on the old, and I'm just trying to vMotion all three.  There is no difference in where they were powered on at all...this is why I'm puzzled.

I'm confused as to why I have to shut down VM-B - it was the one that moved successfully and is running without issue on the new host.  I need to move A and C still.  How can I power them on on the new host if I can't move them?  I had read about the EVC setting, and I guess I'll investigate that more...but it seems to be at a host/cluster level, and so it didn't (and still doesn't) explain to me why B moved fine.  I was hoping B had the key so I could just change something at a VM level for A and C.  😞

sk84
Expert

I'm confused as to why I have to shut down VM-B - it was the one that moved successfully and is running without issue on the new host.

Because EVC mode will downgrade your CPU feature set to the oldest CPU generation in your cluster. And this will only work if no VM uses the newer CPU features.

How can I power them on on the new host if I can't move them?

Just try it out. The VM inherits the CPU features from the host when it is powered on, and it keeps this feature set until it is powered off again. It seems that VM-A and VM-C currently have lower CPU capabilities than the host with the newer CPU requires; that's why they can't be moved. You could also power off VM-A and VM-C and power them on again on the new host. That would solve your problem temporarily, but if you turn them off and on again on the older host in the future, you would have the same problem again.

ThompsG
Virtuoso

Hi there,

With the VM powered down, there are no CPU instructions flying around when you move it, so it will move happily and be ready to power on. Once powered on, the instruction set of the host CPU will be available to the guest. This is how you move guests from AMD to Intel and vice versa.

Now you can modify the VM to mask certain features (Change CPU Identification Mask Settings in the vSphere Web Client) but I wouldn’t contemplate this before using EVC. EVC is a more elegant solution and doesn’t involve rocket science... well it probably did in the VMware R&D teams but not for us.
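
For what it's worth, the register named in the original error (CPUID leaf 0x1, ECX) is where Intel reports feature flags such as SSE4, AES, and AVX. As a hypothetical aid (a sketch, not a VMware tool), a few of those bit positions from the Intel SDM can be used to decode which flags differ between two ECX values:

```python
# Decode which CPUID leaf 0x1 ECX feature flags a VM expects but the
# destination host lacks. Bit positions are from the Intel SDM; the
# helper itself is an illustrative sketch, not a VMware utility.

ECX_BITS = {0: "SSE3", 1: "PCLMULQDQ", 9: "SSSE3", 12: "FMA", 19: "SSE4.1",
            20: "SSE4.2", 22: "MOVBE", 23: "POPCNT", 25: "AES",
            26: "XSAVE", 28: "AVX", 29: "F16C", 30: "RDRAND"}

def ecx_diff(vm_ecx, host_ecx):
    """Return the named features set in the VM's ECX but clear in the host's."""
    missing = vm_ecx & ~host_ecx
    return [name for bit, name in ECX_BITS.items() if missing & (1 << bit)]

# Example: the VM was given AVX + F16C, but the target host only offers AVX.
print(ecx_diff((1 << 28) | (1 << 29), 1 << 28))  # ['F16C']
```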

There are other things to consider as well when enabling EVC like making sure VT and No Execute are enabled on the host.

Hope this helps.

alwalters
Contributor

So I think what I'm understanding is that I can't move these while they're on (despite one working for an unknown reason)?  One of the two is our file server, which is very critical and also large, so it will take a fair amount of downtime to move it cold.  This is what I was hoping to avoid, but it sounds like there isn't an alternative.

I'm trying to get rid of the old host, so I'm not worried about anything being able to work on it again - I just need them off of there.  And was very much hoping to not have to shut them down to do it.

Or - just re-reading...are you saying I'd have to shut down VM-B because I'd need to shut down *anything* running on the newer instruction set to do the EVC settings?  Also not an easy option, since there are a number of other VMs running on that host (and the third host in the cluster, which has not been involved in any of this moving stuff).

To confirm then, it looks like the options are either 1. shut down the two I need to move and move them cold, or 2. shut down every other machine in the entire cluster to change the EVC settings.  Is that correct?

sk84
Expert

To confirm then, it looks like the options are either 1. shut down the two I need to move and move them cold, or 2. shut down every other machine in the entire cluster to change the EVC settings.  Is that correct?

Correct.

alwalters
Contributor

Unfortunate.  Okay, thank you for the clarification.

Still wish I understood why the one VM moved while still live (never having been powered on on the new host), but...guess some mysteries are just unsolvable.  🙂

ThompsG
Virtuoso

Also - when you are "moving" the VMs cold, it should take no time at all as long as you are not moving across storage. Just changing the ESXi host will be ~1-5 seconds of registration time, as that is essentially what you are doing, i.e. unregistering and registering again under the guise of vMotion.

As to the other - it's as mentioned. The processor instruction set is different enough between the hosts that whatever is configured on the guest would not survive a vMotion.

From what you have said, all the VMs were initially powered up on the same host, so in theory they should have the same instruction set applied. In practice there are other things that can determine this, one being the VM hardware version. For example: in our environment I have some VMs (running HWv7) that run as "Westmere", whereas HWv8 machines are "Sandybridge". The EVC cluster is currently configured as Sandybridge. Is it possible you have different VM HW versions? Are the VMware Tools all the same version?
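
On the hardware-version theory: as a rough reference (from memory of VMware's compatibility tables, so worth double-checking against the official documentation), each vSphere release introduced a new maximum VM hardware version, and a quick check can flag a mismatched set of VMs. The VM names below are illustrative:

```python
# Rough mapping of VM hardware ("vmx") versions to the vSphere release that
# introduced them - listed from memory, so verify against VMware's docs.

HW_TO_ESXI = {
    7: "ESX/ESXi 4.x",
    8: "ESXi 5.0",
    9: "ESXi 5.1",
    10: "ESXi 5.5",
    11: "ESXi 6.0",
    13: "ESXi 6.5",
}

def mixed_hw_versions(vms):
    """Return True if the VMs do not all share one hardware version -
    worth checking before digging into CPU compatibility errors."""
    return len({hw for _, hw in vms}) > 1

vms = [("VM-A", 8), ("VM-B", 10), ("VM-C", 8)]  # hypothetical example
print(mixed_hw_versions(vms))  # True: one VM is vmx-10, the others vmx-8
```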

alwalters
Contributor

We use direct attached storage, though, so these will all be a storage migration as well.  Or - could I do the compute resource migration cold, and then move the storage via vMotion after?

Hmm...I wouldn't have thought they'd be different HW versions, since they were created on the same host in the same timeframe - however it wasn't *identical* timing, so I suppose there could have been a change at some point in the middle.  Will check (and the tools as well).  Thanks!

ThompsG
Virtuoso
Accepted Solution

No, you are right - without shared storage, it's going to take a while for a cold migration. You cannot just migrate the compute alone. 😞

I agree - I would have thought the HW versions were the same, but it's worth the check and would explain the issue. If it does come down to this, then you could upgrade the HW version and migration "should" work. Test first if that is the case.

alwalters
Contributor

Oh funny - the one that moved *is* a different HW version!  (10 instead of 8.)  Must've been created just on either side of the 5.0 to 5.5 upgrade a while back.  I've set one of the others to upgrade on reboot; I just need to wait until after hours to do the reboot and will see what happens.  🙂  Thanks!  I knew they were all created pretty close to each other, so it didn't occur to me to check that.

alwalters
Contributor

Woo - that worked!!!  (I snuck a reboot of the less critical server in over lunch.)  On the first one, anyway: I was able to update the HW version, it passed compatibility checks, and now the vMotion is running just fine!  Hopefully the same is true of the file server, since it's the trickier one to keep down for any length of time.  Will update once I've been able to do that one...its reboot needs to be scheduled further in advance than I got away with on the other.  🙂

Thank you!!

sk84
Expert

I'm glad there was an easier way to fix this problem for you. But I'm surprised that EVC was not necessary and that it came down to the VM hardware version.

Which CPU models do the hosts have (please state the exact model name)?

And do the hosts have different ESXi versions installed?

ThompsG
Virtuoso

Awesome and glad it was as simple as that! Have a great day!

alwalters
Contributor

I was moving them from a server with a Xeon E5-2430 (Sandy Bridge) to one with a Xeon E5-2650 v2 (Ivy Bridge).  Identical ESXi versions installed - 6.5.0, build 8294253.
