VMware Cloud Community
jonkirsch
Contributor

vMotion failure handling

Hi all,

I need some help understanding the possible outcomes of a vMotion attempt in the face of a network failure.

Suppose that we're migrating a VM from Box A to Box B.  We've reached the point where we've paused the VM on Box A and Box B sends a message indicating that it has all of the required state.

From what I've found online, Box A will send a commitment message to B and release its state.  If this commitment message arrives at B, B will activate the VM and the migration will have completed successfully.

Here is my question: What happens if the commitment message from A to B is lost due to a network failure?  Will either of the VMs resume?

In general, what property is guaranteed during the migration in the face of failure (e.g., exactly one VM will resume?  At most one VM will resume?)?

Thanks,

Jon

3 Replies
a_p_
Leadership

I can't tell you what exactly happens in the last stage of vMotion and what if the last notification gets lost. However, you will always have only one VM running because of the disk locking. Unless the vMotioned VM on the target host can access the virtual disk, it won't be able to work.
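As a rough illustration of the disk-locking argument: the following is a toy model only, not how VMFS or vMotion is actually implemented, and the `DiskLock` class and `can_run` helper are invented for this sketch. With an exclusive lock on the virtual disk, a second host's attempt to start the VM simply fails.

```python
class DiskLock:
    """Toy stand-in for an exclusive lock on a VM's virtual disk (hypothetical)."""

    def __init__(self):
        self.owner = None  # name of the host currently holding the lock

    def acquire(self, host):
        # Succeeds only if the lock is free or already held by this host.
        if self.owner in (None, host):
            self.owner = host
            return True
        return False

    def release(self, host):
        # Only the current owner can free the lock.
        if self.owner == host:
            self.owner = None


def can_run(lock, host):
    """In this model, a host may run the VM only while it holds the disk lock."""
    return lock.acquire(host)


lock = DiskLock()
# Host A grabs the lock first, so host B's attempt is refused.
running = [h for h in ("A", "B") if can_run(lock, h)]
print(running)
```

In this model `running` never contains more than one host, which is the "at most one VM" property from the question.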

André

sflanders
Commander

André is correct. Due to disk locking, only one VM can be active at a time. Because no hardware failure occurred, the VM will remain active. Unless all messages are successfully transferred, the VM will remain on host A: the pause used for the final vMotion step will time out and the VM will resume on the old host. Assuming a production environment, a single network issue should not result in an outage, thanks to redundancy, so hopefully this specific use case will never be experienced :)
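The behaviour described above could be sketched roughly like this. It is a simplified model under stated assumptions, not VMware's actual protocol: the `Host` class and `switchover` function are hypothetical, and the model assumes the source hands over the disk lock only when the commit is known to have been delivered, which a real network cannot tell it for free.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    holds_disk_lock: bool = False
    vm_running: bool = False


def switchover(source, target, commit_delivered):
    """Model the last step: the VM is paused on the source, which holds the lock."""
    source.vm_running = False  # VM is paused on the source
    if commit_delivered:
        # The lock moves to the target; only then can the target start the VM.
        source.holds_disk_lock = False
        target.holds_disk_lock = True
        target.vm_running = True
    else:
        # Commit lost: the pause times out and the source resumes the VM.
        # The target never obtained the lock, so its copy is discarded.
        source.vm_running = source.holds_disk_lock
    return [h.name for h in (source, target) if h.vm_running]


# Lost commit: the VM resumes on the old host.
lost = switchover(Host("A", True, True), Host("B"), commit_delivered=False)
# Delivered commit: the VM runs on the new host.
ok = switchover(Host("A", True, True), Host("B"), commit_delivered=True)
print(lost, ok)
```

Either way, exactly one host ends up running the VM in this model, and never both.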

Hope this helps! If you find this information useful, please award points for "correct" or "helpful".
vickey0rana
Enthusiast

If the commitment message fails during the final step, the vMotion fails for that VM, and it keeps running on the old host, since the old host still has all of the VM's memory bitmaps. Any changes made on the target host are discarded in this case.

But as you can see in the two posts above, this generally does not happen in a virtual environment, because the VMkernel vMotion port groups are teamed across multiple NICs, so a production environment does not match this hypothetical scenario. Still, I must say it's a good thought.

Cheerzzzz!!!!

If you found this or any other answer helpful, please consider awarding points (use the Correct or Helpful buttons).

BR, Ravinder S Rana