MattGoddard
Enthusiast

One VM fails to vMotion, other VMs vMotion just fine

I need to put a host into maintenance mode in my ESXi 6.7 U3 cluster. All the VMs migrated off of it just fine...except one, which consistently hangs at 21% and then fails.

The error stack in the task says:

  • vMotion migration [169412107:6390396849079088667] failed writing stream completion: Timeout.
  • Migration to host <10.25.6.18> failed with error Timeout (195887137).
  • vMotion migration [169412107:6390396849079088667] (16-77102179162616) failed to receive 4940/4940 bytes from the remote host <10.25.6.11>: Timeout.
  • Failed waiting for data. Error 195887137. Timeout.

And the reason in the relevant event is:

  • The virtual machine did not migrate. This condition can occur if vMotion IPs are not configured, the source and destination hosts are not accessible, and so on. Action: Check the reason in the event message to find the cause of the failure. Ensure that the vMotion IPs are configured on source and destination hosts, the hosts are accessible, and so on.

However, this is a red herring because, as I said, all the other VMs were able to vMotion without issue.

Looking at the hardware, the only thing unusual I see is that it uses IDE disks, not virtual SCSI, but that shouldn't prevent vMotion, should it?

Any ideas on how I can troubleshoot this? Unfortunately, I'm finding nothing useful online for that error stack, because all the advice covers cases where vMotion as a whole doesn't work on a host, which isn't relevant here.

3 Replies
daphnissov
Immortal

It's not necessarily a red herring. A vMotion operation moves the contents of all memory pages to another eligible host, and if that VM is "dirtying" memory pages at a rate faster than they can be drained, the operation may fail. Is this VM fairly memory intensive?
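To make the dirty-page point concrete, here is a toy model of iterative pre-copy. It is only a sketch of the general idea, not how ESXi actually implements vMotion: the function name, the 64 MB switch-over threshold, and the pass limit are all made up for illustration.

```python
def precopy_passes(mem_mb, dirty_mb_s, link_mb_s, stop_mb=64, max_passes=20):
    """Toy pre-copy model (hypothetical numbers, not ESXi internals).

    Each pass copies whatever is still outstanding; while that copy runs,
    the guest dirties more memory, which becomes the next pass's workload.
    Returns the number of passes needed to get under stop_mb, or None if
    the outstanding amount never gets that small.
    """
    remaining = float(mem_mb)
    for n in range(1, max_passes + 1):
        seconds = remaining / link_mb_s   # time to send this pass
        remaining = dirty_mb_s * seconds  # memory dirtied in the meantime
        if remaining <= stop_mb:
            return n
    return None


# A 16 GB VM on a ~1000 MB/s link, dirtying 50 MB/s: converges quickly.
print(precopy_passes(16384, 50, 1000))
# Same VM dirtying 1200 MB/s, faster than the link can drain: never converges.
print(precopy_passes(16384, 1200, 1000))
```

The only thing that matters in this model is the ratio of dirty rate to link rate: below 1 the outstanding amount shrinks every pass, above 1 it grows.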

MattGoddard
Enthusiast

> Is this VM fairly memory intensive?

Hardly at all:

[root@ ~]# free -g
              total        used        free      shared  buff/cache   available
Mem:             15           1          12           0           1          13
Swap:             7           0           7

There are 8 processes using around 1% of memory each. Everything else is using less than that.

daphnissov
Immortal

That looks fairly innocent. You may want to inspect the VM's log file to see what gets written after the vMotion fails. Also check the vmkernel.log file on the source host.
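If it helps, here is a rough sketch for pulling the relevant lines out of a copy of either log. It assumes the migration ID from the task error (6390396849079088667 in your stack) also appears in the log lines, which I'd expect but haven't verified for your build. The function name and the keyword list are my own choices, not anything VMware ships.

```python
def migration_lines(lines, migration_id, keywords=("vmotion", "migrate")):
    """Return log lines that mention the migration ID or a vMotion keyword.

    lines:        any iterable of log lines (e.g. an open file)
    migration_id: the numeric ID from the task error, as a string
    keywords:     case-insensitive substrings to match (assumed, adjust freely)
    """
    hits = []
    for line in lines:
        lower = line.lower()
        if migration_id in line or any(k in lower for k in keywords):
            hits.append(line.rstrip("\n"))
    return hits


# Example with made-up log lines (not real ESXi output):
sample = [
    "cpu3: unrelated storage message\n",
    "cpu5: Migrate: 6390396849079088667 S: failed waiting for data\n",
    "cpu1: VMotionRecv: timeout on socket\n",
]
for hit in migration_lines(sample, "6390396849079088667"):
    print(hit)
```

To run it against real files, pass an open file handle instead of the sample list, for example a copy of vmkernel.log from the source host and the vmware.log from the VM's directory on the datastore. Comparing the timestamps of the hits on both sides should show which host gave up first.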
