Hello,
I recently had a VMotion get stuck in progress for over an hour when high CPU usage on a host caused the ESX host to become unresponsive to VC. I tried restarting the VC services, rebooting VC, and removing and re-connecting the host in VC, with no luck. After all of that, the VMotion still showed as in progress, and the VM itself began to have resource issues.
I had to migrate all of the VMs off the host, aggressively kill the one running VM that was stuck, and then reboot the server to finally orphan the migration and bring the server back up.
VMware told me that the issue was with resource group reservations and that they really shouldn't be set; the reservations were keeping CPU resources from being allocated efficiently on the ESX host.
Has anyone else seen a migration get stuck and not been able to kill it? Or, if you did have a stuck migration and were able to kill it, could you help me out and let me know what you did to stop it? I restarted the VC service, stopped and started the vmware-vpxa service, and disconnected the host from the VC server, all with no luck. The only way I could get the process to orphan was to reboot.
BTW, while the migration was stuck I was able to power off the VM but could no longer power it back on.
Thanks,
The inability to power the VM back on is usually due to one of two things:
1) There's an active lock on one of the VM's files, particularly the .vswp swap file.
2) There's an active process on either the origin or destination ESX server (the two servers involved in the VMotion exchange) that's holding the .vmx and/or virtual machine open.
You can usually run:
vmware-cmd /full/path/to/vmx/file.vmx getstate
vmware-cmd /full/path/to/vmx/file.vmx stop hard
to figure out what's happening: getstate reports the VM's power state, and stop hard forces a power-off.
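The two commands above could be combined into a small helper so the hard stop only runs when the VM is not already off. This is only a rough sketch: the function name force_stop_if_running and the VMWARE_CMD override are made up for illustration, and it assumes getstate prints a line of the form "getstate() = on".

```shell
#!/bin/sh
# Hypothetical helper, not part of any VMware tooling.
# VMWARE_CMD can be overridden (e.g. with a stub) for dry runs.
VMWARE_CMD="${VMWARE_CMD:-vmware-cmd}"

force_stop_if_running() {
    vmx="$1"
    # Assumption: getstate output looks like "getstate() = on";
    # keep only the part after the "=".
    state=$("$VMWARE_CMD" "$vmx" getstate | sed 's/.*= *//')
    echo "state: $state"
    # Only force a power-off when the VM is not already reported as off.
    if [ "$state" != "off" ]; then
        "$VMWARE_CMD" "$vmx" stop hard
    fi
}

# Usage on the ESX service console:
# force_stop_if_running /full/path/to/vmx/file.vmx
```

If the state never changes after the hard stop, that points back at a lock or a leftover process on one of the two hosts.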
Paul
I figured there was a lock on something, which is why I couldn't power it on, and that's fine. But I couldn't kill the migration in progress even after stopping the VM, restarting the VC service and server, restarting the mgmt-vmware service on the ESX servers, restarting the vmware-vpxa service, removing the server from VC ...
I only removed one of the servers; maybe I should have tried removing them both?
Does anyone know what process handles the VMotion in progress?
Hello,
A vMotion that gets stuck in the middle could be related to a SCSI Reservation Conflict. You will want to review your /var/log/vmkernel and /var/log/vmkwarning files for these types of issues. These conflicts can set locks on remote datastores that then need to be cleared up.
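As a rough sketch of that log review, a small grep wrapper could be run against both files. The function name is made up for illustration, and the search string "reservation conflict" is an assumption about how the messages are worded; adjust it to whatever your logs actually show.

```shell
#!/bin/sh
# Hypothetical helper: list lines that mention reservation conflicts
# in the log files passed as arguments.
find_reservation_conflicts() {
    # -i: ignore case, -H: prefix each match with its file name.
    # The pattern is an assumption about the log wording.
    grep -iH "reservation conflict" "$@" 2>/dev/null
}

# Usage on the ESX service console:
# find_reservation_conflicts /var/log/vmkernel /var/log/vmkwarning
```

Matching lines with timestamps close to the start of the stuck migration would be the ones to look at first.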
Best regards,
Edward