VMware Cloud Community
ecleppe
Contributor

vMotion not working on ESXi 5 hosts

Hello all,

I don't know what I'm doing wrong, but I can't get vMotion to work on my ESXi 5 hosts.  Note that I'm using HP blade servers with HP Flex-10 modules, where all server slots have the same storage & network configuration.

I have tried installing a host from scratch and upgrading one using Update Manager.  I also noticed that the upgrade removed the VMkernel port designated for vMotion and replaced it with a standard vSwitch with a vMotion port group.

The weird thing is that I also can't vmkping the vMotion IP addresses of the ESXi 5 hosts from my other ESX 4 hosts in the same hardware enclosure.
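For reference, this is roughly how I've been testing from the ESXi shell (check which vmknic carries vMotion on your own hosts; the names here are just examples):

# list the VMkernel NICs to confirm the vMotion interface and its IP
esxcfg-vmknic -l

# ping the other host's vMotion address through the VMkernel stack
vmkping 10.67.10.200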


This is the error.

The vMotion migrations failed because the ESX hosts were not able to connect over the vMotion network.  Check the vMotion network settings and physical network configuration.
vMotion migration [172165841:1315998436308855] failed to create a connection with remote host <10.67.10.200>: The ESX hosts failed to connect over the VMotion network
Migration [172165841:1315998436308855] failed to connect to remote host <10.67.10.200> from host <10.67.10.209>: Timeout
The vMotion failed because the destination host did not receive data from the source host on the vMotion network.  Please check your vMotion network settings and physical network configuration and ensure they are correct.

I'm 100% sure the physical network connection is OK; otherwise my ESX 4 hosts wouldn't work either.

I have attached two documents with screenshots of the vSwitch setup of my ESX 4 and ESXi 5 hosts.

Any suggestions?

Thanks

Erik

PduPreez
VMware Employee

Hi

The fact that you cannot ping the other hosts is troublesome.

All I can think of is to delete the entire vSwitch on the ESXi 5 host and create a new VMkernel port with the vMotion checkbox selected.
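If you'd rather do it from the shell than the vSphere Client, something along these lines should rebuild it (vSwitch1, vmnic2 and the IP are examples only; substitute your own values):

# create a fresh vSwitch with a VMotion port group (names and IP are examples)
esxcfg-vswitch -a vSwitch1
esxcfg-vswitch -L vmnic2 vSwitch1
esxcfg-vswitch -A VMotion vSwitch1

# create the VMkernel NIC on that port group
esxcfg-vmknic -a -i 10.67.10.209 -n 255.255.255.0 VMotion

# tick the vMotion box on the new vmknic (assuming it came up as vmk1)
vim-cmd hostsvc/vmotion/vnic_set vmk1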

Something might have gone wrong in the upgrade.

My guess is your management network checks out fine?

regards

Pieter

Please award points if you find this helpful or correct :)

ecleppe
Contributor

Hi,

Thanks for answering, but I actually did two installs, one upgrade and one fresh install, and both have the same issue.

I suspect there might be an issue with HP's Virtual Connect and ESXi 5.

Erik

roconnor
Enthusiast

ecleppe

I see you are using a separate vSwitch for vMotion with a vMotion VMkernel port group and a distinct IP. I did the same (as we have been doing with our ESX 3.5 and 4.1 hosts for years).

This is a screenshot of what I normally do: two port groups on vSwitch0, a Service Console (called Management Network here) and vMotion, each with its own IP.

[screenshot 1827189.png: vSwitch0 with Management Network and vMotion port groups]

In ESXi 5, vMotion hangs and fails at 9%...

I checked and played about with the vSwitch NIC teaming; ESXi 5 is happy for you to have only one NIC for host management, so try removing the extra vmnic for now.

I removed the vMotion port group and enabled vMotion on the Management Network port group (this is a new feature, I think). Works a treat, and makes sense, as traffic only ever passes through the NICs added to the vSwitch.
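For anyone who wants the shell equivalent, I believe it is just this (assuming the old port group is called VMotion on vSwitch0 and the management vmknic is vmk0):

# remove the dedicated vMotion vmknic and its port group
esxcfg-vmknic -d VMotion
esxcfg-vswitch -D VMotion vSwitch0

# enable vMotion on the existing management vmknic instead
vim-cmd hostsvc/vmotion/vnic_set vmk0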

Let me know if it helps,

I'll be looking at how to use a separate uplink for vMotion, and will add it here.

rgds

shawnmininger
Contributor

I am running into this issue as well.  I have tried every combination of VMkernel ports with no luck.

ecleppe
Contributor

Thanks for the nice reply, but sadly this isn't an option in my scenario.

My vMotion NICs don't have an uplink to my core switches; network connectivity stays within the blade enclosure.

So basically vMotion traffic travels over the Virtual Connect modules but doesn't leave the enclosure.

This was a valid design in vsphere 4 & 4.1.

roconnor
Enthusiast

Shawn

If you are running your lab inside another ESX host, then don't forget to set the vSwitches on the physical host to accept promiscuous mode.

This fixed my issue with using separate vMotion port groups.
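If the outer host is ESXi 5, you can flip that from the shell too (vSwitch0 is assumed here; on classic ESX 4.x you'd use the vSwitch security tab in the client):

# allow promiscuous mode on the outer host's vSwitch so the nested ESXi VMs see the traffic
esxcli network vswitch standard policy security set --vswitch-name=vSwitch0 --allow-promiscuous=true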

ecleppe

Can you configure anything on the blade interconnects? It would seem that is where your issue is. I just tried the separate vSwitch, VMkernel port group, and IP, and it now works for me, so it looks like the problem is in the networking within the blade enclosure.

shawnmininger
Contributor

I am running the ESXi hosts as nested VMs on an ESXi 5 host.  I set the vSwitch to promiscuous mode when I first created it; all of the port groups show promiscuous mode 'Accept'.

Very frustrating. 

roconnor
Enthusiast

Hi Shawn

My physical host is ESX 4.1; maybe your problem is the ESXi 5 host.

Have a look at this

http://communities.vmware.com/message/1819906

ecleppe
Contributor

I noticed a difference between my ESX 4 and ESXi 5 hosts.

On ESX 4, I was able to leave the default gateway blank for the vMotion VMkernel switch.

In the new version it gets filled in automatically, and even if I remove it, it always comes back.


That might be the issue, as I can't have a default gateway in there.  But how do I get rid of it?
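This is how I've been checking it from the shell:

# show the VMkernel routing table, including the default gateway
esxcfg-route -l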


ESXi 5:

[screenshot 1827189_1.png: ESXi 5 VMkernel settings with the default gateway filled in]

ESX 4:

ecleppe
Contributor

Problem solved.

I have put in a non-existing IP range for use with the vMotion kernel switch.  That way the ESX host won't use the known gateway it has from the management interface.

Now it routes to 0.0.0.0, as it should!

jaysonhc
Contributor

Hi, I am kind of a newbie with networking on vSphere.

Could you tell me how to put in a non-existing IP range for use with the vMotion kernel switch?

thanks,

J

hbaalen
Contributor

Just enter an IP address that is not in use in your vMotion backbone.

I use 10.10.0.1-10.10.0.100 for my vMotion backbone and entered 10.10.0.254 as the gateway (as you can see in the attached file).

After I did just that, it worked perfectly.
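If you'd rather set it from the shell than the client, I believe this is the equivalent (the address is just my example; pick any free one in your vMotion range):

# set the VMkernel default gateway to an unused address on the vMotion subnet
esxcfg-route 10.10.0.254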

Hans

john_qvortrup
Contributor

I had a similar problem. Mine was solved when I changed the ports on the physical switch to trunk ports carrying all VLANs (VLAN 1 on Cisco).

I had configured the physical ports to tag only the vMotion VLAN.

The funny thing was that vmkping to the IP was possible from the console on the hosts.
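On the Cisco side the change was roughly this (the interface name is just an example; check which ports face your hosts):

interface GigabitEthernet0/1
 ! trunk the port and carry all VLANs instead of tagging only the vMotion VLAN
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport trunk allowed vlan all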

computerguy7
Enthusiast

@hbaalen, is 10.10.0.254 a valid gateway? What is the management (not vMotion) IP address of that host? You can only have one gateway per host, NOT one gateway per vmknic, right?

I am having vMotion troubles on ESXi 5.0

http://communities.vmware.com/thread/339242

waddells
Contributor

I had the same issue going from some Cisco UCS clusters to a Dell R810 cluster.  vMotion was working fine within each cluster, but I could not vMotion across clusters.  I fixed it by adding, in the Port Properties section of each vMotion vmk, a VLAN ID that corresponded to the IP range set up by my LAN engineer. A command-line equivalent is sketched below.
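From the shell the same change looks like this (VLAN 20 and the names are examples only):

# tag the vMotion port group with the VLAN ID that matches its IP range
esxcfg-vswitch -v 20 -p VMotion vSwitch1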

Hope this helps someone.

irfans
Contributor

I faced the same issue and later found that both of our ESX hosts had the same vMotion IP; the conflicting vMotion IPs made it fail with the same error you are facing.

After correcting the IP, everything got sorted.
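A quick way to spot this is to list the vmknic addresses on each host and compare them:

# run on every host and compare -- each vMotion vmknic needs a unique IP
esxcli network ip interface ipv4 get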
