VMware Cloud Community
chris_delaney
Enthusiast

VMotion Issues - Timeout at 10%

Dear All,

I'm having trouble with a host that I've just upgraded from ESX 3.5 to ESX 4.1.  Under ESX 3.5, VMotion worked fine; now that the host is running ESX 4.1, I'm getting timeout errors and VMs are not moving anywhere.

I've tried the various ideas in the VMware Knowledge Base articles but nothing is giving any clues.  vmware.log just says that there's a timeout error and the vmkernel log says much the same.

Interestingly, I also have another host with identical hardware, which I had to rebuild from scratch rather than upgrade because Update Manager was returning errors during remediation.  This fully rebuilt host appears to be fine.

Does anyone have any ideas about what I can check to get a little more information from the vmkernel to try and work out what's going on?

Many thanks.


Chris

7 Replies
Troy_Clavell
Immortal

I don't know if you've seen the article below, but it may help:

http://kb.vmware.com/kb/1003734

chris_delaney
Enthusiast

Hello Troy,

I've been through that article sadly and it hasn't made any difference.  It's really frustrating as there doesn't appear to be anything wrong configuration-wise...  I'm really banging my head against a brick wall at the moment 😕

Thanks for the suggestion though.

Cheers.

Chris

weinstein5
Immortal

A failure at 10% is typically a network issue. I would examine the network configuration of the VMkernel ports you are using for VMotion and make sure you can ping between them.
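
As a rough illustration of that check, here is a minimal Python sketch that summarises saved ping output. It assumes you have run something like `vmkping -c 5 <destination VMkernel IP>` on the host and copied the output to a machine with Python 3. The function name `parse_ping_summary` is hypothetical, and the summary-line formats it looks for are an assumption based on common ping implementations, so adjust the patterns if your output differs.

```python
import re


def parse_ping_summary(output):
    """Extract packet loss (%) and average round-trip time (ms) from ping-style output.

    Returns None for any value that cannot be found in the text.
    """
    # Assumed summary format: "... 0% packet loss"
    loss = re.search(r"(\d+(?:\.\d+)?)% packet loss", output)
    # Assumed summary format: "... min/avg/max = 0.2/0.3/0.5 ms"
    # (a fourth value, if present, is ignored)
    rtt = re.search(r"=\s*([\d.]+)/([\d.]+)/([\d.]+)", output)
    return {
        "loss_pct": float(loss.group(1)) if loss else None,
        "avg_ms": float(rtt.group(2)) if rtt else None,
    }
```

Calling `parse_ping_summary(open("vmkping.txt").read())` on the saved output (the file name is only an example) gives the loss percentage and average latency. A clean result only confirms ICMP reachability between the VMkernel ports; it does not by itself prove that the VMotion traffic is getting through.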

chris_delaney
Enthusiast

Hi Both,

Yes - I've tried all that.  Both normal ping and vmkping work fine between the hosts.  What I have noticed though is that the ping response times do not change when I kick off a VMotion - normally (between hosts that are working correctly) I see some lag whilst the move is taking place.  It's almost as if the troublesome host isn't even attempting to send anything to the destination one...

Cheers.

Chris

shishir08
Hot Shot

Follow this link. It lists the possible causes of a VMotion failure at 10%:

http://www.vmwarewolf.com/vmotion-fails-at-10-percent/

chris_delaney
Enthusiast

Yes - that's one of the first things I went through to try and troubleshoot this issue.

I've logged a call with VMware as I've hit a bit of a brick wall to be honest.

chris_delaney
Enthusiast

VMware support came back with some really good suggestions, including one that I really should have thought of myself: connect the two rogue hosts together with a crossover network cable.

That worked fine, which proved that there was a network issue - I've subsequently changed the network config slightly to use a different VLAN for VMotion and everything seems to be fine now.

Very strange issue really as the rogue hosts were configured identically to the other hosts which worked fine - must have been the physical switches having a funny five minutes.
