We are trying to patch our environment at the cluster level (using VUM).
We have 4 hosts in cluster.
Hosts: ESXi 6.0 U1
Example host names:
CMP1
CMP2
CMP3
CMP4
We installed the patches on the first host successfully.
During vMotion off the second host, and now the fourth host (we have not tested CMP3 yet), the tasks get stuck at 13%.
Any advice?
Can you check the following logs after attempting a vMotion?
On the source ESXi host:
/var/log/hostd.log
/var/log/vmkernel.log
And the vmware.log file for one of the virtual machines stuck at 13%.
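If the logs are large, a small script can help narrow them down. The sketch below is only an illustration: it assumes the log files have been copied off the host to a machine with Python, and the keyword list is an assumed set of search terms, not an official one.

```python
import re
import sys

# Assumed search terms; adjust them to whatever shows up in your own logs.
KEYWORDS = ("vmotion", "migrate", "migration")


def vmotion_lines(log_text, keywords=KEYWORDS):
    """Return the log lines that mention any keyword (case-insensitive)."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [line for line in log_text.splitlines() if pattern.search(line)]


if __name__ == "__main__":
    # Usage (file names are examples): python filter_logs.py hostd.log vmkernel.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for line in vmotion_lines(handle.read()):
                print(f"{path}: {line}")
```

Lines around the time the task reached 13% are the ones worth reading first.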
Please check the vMotion network between your ESXi hosts. At 13-14%, vMotion checks the network connection (the VMkernel port enabled for vMotion); if that is not configured properly, it waits for some time and then fails.
You can check the IP configuration and the physical network connection, such as the cabling.
Normally, getting stuck at 13% or 14% is due to a vMotion network connectivity problem. Check that vmkping succeeds across the vMotion network.
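For reference, here is a rough sketch of how the vmkping test could be put together. It only builds the command line to run in the ESXi shell on the source host; the interface name vmk1 and the address 192.0.2.12 are placeholders, not values from this thread. The payload size is the MTU minus 28 bytes of IP and ICMP headers, and -d sets "don't fragment" so an MTU mismatch shows up as a failure.

```python
def vmkping_command(interface, target_ip, mtu=1500, count=3):
    """Build a vmkping command that tests a VMkernel interface at full MTU."""
    payload = mtu - 28  # 20-byte IPv4 header + 8-byte ICMP header
    return [
        "vmkping",
        "-I", interface,      # VMkernel interface to send from
        "-d",                 # do not fragment
        "-s", str(payload),   # ICMP payload size in bytes
        "-c", str(count),     # number of pings
        target_ip,
    ]


# Placeholder values: replace with your own vMotion vmk and peer address.
print(" ".join(vmkping_command("vmk1", "192.0.2.12", mtu=9000)))
```

If the ping works with a small payload but fails at the full size, the MTU along the vMotion path is the likely suspect.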
I am having the exact same issue. If I manually vMotion between hosts it works fine, but if "System" is the initiator it will sit at 13%. Sometimes the task will clear or fail after 24 hours. I have two vCenter 6.0u2 environments and this is happening in both. I also have a ticket open with VMware; however, I have no solution from them yet.
Were you able to figure out what was wrong?
I think I found the solution to at least my problem. We have NSX deployed in both of the environments I'm having issues in. We have two Service Deployments (Networking and Security -> Installation -> Service Deployments). They are Guest Introspection and Trend Micro Deep Security. If either of those deployments shows anything but "Succeeded" or "Up" under "Installation Status" or "Service Status", a System-based vMotion freezes at 13%. Once the status is corrected, the vMotions finish as usual. I'm not sure exactly what is going on, but maybe someone on VMware's NSX team would have the answer.
Hopefully this helps you.
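To make the check above concrete, here is a minimal sketch of the rule in Python. It does not talk to NSX at all: the dictionaries are hand-entered stand-ins for what the Service Deployments page shows, and the field names and the "Failed"/"Unknown" values are made up for illustration. Only "Succeeded" and "Up" come from the post above.

```python
# Healthy values as described in the post; field names are hypothetical.
HEALTHY = {"installation_status": "Succeeded", "service_status": "Up"}


def unhealthy_deployments(deployments):
    """Return the names of deployments whose status is not the healthy value."""
    bad = []
    for dep in deployments:
        if any(dep.get(field) != expected for field, expected in HEALTHY.items()):
            bad.append(dep["name"])
    return bad


# Hand-entered example data, not real output from any NSX tool.
example = [
    {"name": "Guest Introspection",
     "installation_status": "Succeeded", "service_status": "Up"},
    {"name": "Trend Micro Deep Security",
     "installation_status": "Failed", "service_status": "Unknown"},
]
print(unhealthy_deployments(example))
```

Any deployment this flags would be one to look at before retrying the System-initiated vMotion.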
This seemed to help at the time. However, NSX and Deep Security are running correctly and I am still having the issue. No update from VMware yet.
Please let me know if you get any update from VMware on this issue.
This was an NSX issue. I had to re-deploy the guest introspection appliances. Once I did that then the issue of getting stuck at 13% stopped.
Please try restarting hostd and the vCenter service.
Please let me know how it goes!
Hi Joe - just wanted to say a MASSIVE thank you for this; saved my bacon on a change tonight!
We are using Guest Introspection and McAfee MOVE and were in exactly the same boat with guests stuck at 13% migrating.
When I went into the section you pointed out (Networking and Security -> Installation -> Service Deployments) and clicked the 'Resolve' icon next to each entry, it sorted everything out and the vMotions completed successfully.
Many thanks again!
We had the exact same problem: vCenter 6 U2 (P03) with NSX Guest Introspection and Trend Micro.
The problem started after updating vCNS to NSX 6.2.4; VMware support didn't know what to do.
After updating NSX from 6.2.4 to 6.3.1 and updating the Guest Introspection agents, the problem was resolved!
BTW, it also resolved some VUM problems we had, but we are not sure whether that is related.
Removing Guest Introspection (6.2.4) from the cluster immediately resolved our issue as well. We removed Guest Introspection while hosts had vMotions stuck at "Invoking callbacks", and as soon as Guest Introspection was removed the vMotions completed successfully.
We are upgrading to 6.3.1 to see if it resolves the issue with Guest Introspection.
Hi,
I came across exactly the same case as you. Have you figured out the root cause?
This helped me out! I had a problem with Trend DSVA; GI was fine. I could not resolve it, so I deleted Trend from the cluster to fix it, and once it was removed, the vMotions and power-ons worked fine. I'll fix Trend now via a redeploy and bam! Bob's your uncle.
NSX issue for us as well; as soon as we removed Guest Introspection, all stuck jobs immediately completed.
Same issue, thanks Google for bringing me here.
I removed Trend and GI, then it worked. However, I put GI back and immediately had VMs stuck at 13% again. So Trend isn't the culprit for me.
I'll have to remove GI again and contact NSX support.