VMware Cloud Community
pingnithin
Enthusiast

vMotion not working

Hi,

I'm trying to set up vMotion in our new vCenter.

My initial setup and findings (testing with a two-host setup)

I created a vMotion port group (new vSwitch and a separate physical NIC) in the same network as management (192.168.1.xx/24) and gave the vMotion port group a VLAN ID. That didn't work. After some googling, I found that if vMotion and management are in the same subnet, vMotion traffic uses the first vmnic (i.e., vmnic0, the management network), and for that reason communication fails if the vMotion port group has a VLAN ID. So I removed the VLAN ID, and it worked. But that left the new vmnic1 unused.

The current setup

Since I want my vMotion traffic on a separate vmnic (vmnic1), I created a new network (10.128.88.0/25) and assigned the IPs 10.128.88.10 (host1) and 10.128.88.20 (host2) to the vMotion port groups on the two hosts. I also set VLAN ID 25 on both vMotion port groups.

But the command vmkping -I vmk1 10.128.88.20 failed on host 1, and vmkping -I vmk1 10.128.88.10 failed on host 2. When I removed the VLAN ID, the ping worked as expected.

Can someone help me understand the reason for this behaviour? Is my vMotion traffic still going through the management network (vmnic0)?

Or is there a best practice that I failed to follow?

Regards,

Nithin

Nithin Radhakrishnan www.systemadminguide.in

14 Replies
DavoudTeimouri
Virtuoso

Hi,

When you create a management network, there is a checkbox that, if checked, sends vMotion traffic in and out through that network.

So if you checked it on the first management network, vMotion is using that network. Otherwise, vMotion traffic cannot go in or out through it.

I think you need to check your physical switch configuration, especially the VLAN configuration.

Also, please share more information about your vSwitch configuration, with some screenshots.

-------------------------------------------------------------------------------------
Davoud Teimouri - https://www.teimouri.net - Twitter: @davoud_teimouri Facebook: https://www.facebook.com/teimouri.net/
pingnithin
Enthusiast

Hi Davoud,

vMotion traffic is not enabled on the management network NIC. The vSwitch config is attached.

The physical switch is configured with VLAN 71.

vmotion2.JPG

vmotion3.JPG

DavoudTeimouri
Virtuoso

Hi,

It's fine. What is the gateway of your "vMotion" VMkernel port?

Try to ping the vMotion port's gateway from the ESXi shell. If vmk1 can ping its gateway while the VLAN is configured on it, your physical switch configuration is correct.

Because you are using the range 10.128.88.1 - 10.128.88.126 for your IP addressing and both VMkernel ports use the same gateway, there is no problem with the IP configuration.

If the ping test is not successful, then you have to check your VLAN configuration.

Read this article for configuring vMotion on ESXi and best practices: VMware KB: Creating a VMkernel port and enabling vMotion on an ESXi/ESX host
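
As a quick check of the address range quoted in this reply, here is a minimal sketch using Python's standard ipaddress module. It only does the subnet arithmetic for the 10.128.88.0/25 network and the two vMotion addresses from the original post; it does not talk to any host or switch.

```python
import ipaddress

# The vMotion network from the original post.
vmotion_net = ipaddress.ip_network("10.128.88.0/25")

# hosts() excludes the network and broadcast addresses.
hosts = list(vmotion_net.hosts())
first_host, last_host = hosts[0], hosts[-1]

print("netmask:", vmotion_net.netmask)
print("usable range:", first_host, "-", last_host)

# Both vMotion VMkernel addresses should fall inside that network.
for addr in ("10.128.88.10", "10.128.88.20"):
    print(addr, "in subnet:", ipaddress.ip_address(addr) in vmotion_net)
```

Both addresses fall inside the usable range, so the two VMkernel ports are neighbours on the same subnet as far as IP addressing goes.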

pingnithin
Enthusiast

Hi,

We don't have a separate gateway for vMotion. vMotion currently uses the gateway of the management network (192.168.1.3). As far as I understand, all the VMkernel ports share a single gateway: regardless of the IP range, the gateway is that of the management network. If I try to change the gateway for the vMotion network, it changes the gateway for all the other VMkernel ports as well.

If I try to vmkping 192.168.1.3 from the vmk1 port, it replies that the network is unreachable.

But is the gateway of much relevance in this context, since the vMotion network doesn't need external communication?

Do I need to configure a static route?

Thanks

a_p_
Leadership

The default gateway should not matter since the vMotion IP addresses are in the same subnet.

Your setup should basically work, provided the physical switch ports are configured properly. If you want to set the VLAN ID on the port groups, you need to configure the switch ports as trunk/tagged ports (802.1Q). If the switch ports are configured as access/untagged ports, then do not set the VLAN ID on the port groups.
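
The trunk/access rule above can be illustrated with a small, deliberately simplified Python model. Everything in it (the SwitchPort class, the vlan_seen_by_switch function, native VLAN 1) is hypothetical and chosen for illustration; real switches differ in details, for example in how an access port treats tagged frames. A port group VLAN ID of 0 is used here to mean "send untagged frames".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SwitchPort:
    mode: str                                  # "access" or "trunk"
    access_vlan: int = 1                       # used only in access mode
    allowed_vlans: frozenset = frozenset()     # used only in trunk mode
    native_vlan: int = 1                       # VLAN for untagged frames on a trunk


def vlan_seen_by_switch(portgroup_vlan: int, port: SwitchPort) -> Optional[int]:
    """Return the VLAN the switch places the host's frames in, or None if dropped.

    portgroup_vlan == 0 means the port group sends untagged frames.
    """
    tagged = portgroup_vlan != 0
    if port.mode == "access":
        # Assumption: an access port carries untagged frames only.
        return None if tagged else port.access_vlan
    if port.mode == "trunk":
        if tagged:
            # Tagged frames pass only if the trunk allows that VLAN.
            return portgroup_vlan if portgroup_vlan in port.allowed_vlans else None
        # Untagged frames on a trunk land in the native VLAN.
        return port.native_vlan
    raise ValueError(f"unknown port mode: {port.mode}")


access71 = SwitchPort(mode="access", access_vlan=71)
trunk71 = SwitchPort(mode="trunk", allowed_vlans=frozenset({71}))

for name, port in (("access", access71), ("trunk", trunk71)):
    for pg_vlan in (0, 71):
        print(name, "port, port group VLAN", pg_vlan, "->",
              vlan_seen_by_switch(pg_vlan, port))
```

In this model the host's traffic ends up in VLAN 71 in exactly two cases: an untagged port group on an access port, or a port group tagged with VLAN 71 on a trunk port that allows VLAN 71.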

André

pingnithin
Enthusiast

When I checked with the network admin, the port was configured as an access port. The port is now configured as a trunk port.

But that didn't make any difference to the network communication.

schepp
Leadership

In your first post you mentioned VLAN ID 25, your screenshot says 71.

What VLAN ID did your network admin configure on both ports?

As it's trunked now, you have to set the VLAN ID on the vMotion VMkernel port group.

PS: Did you uncheck the vMotion checkbox on the management interface?

Tim

pingnithin
Enthusiast

It was a typo. The actual VLAN is 71. The network admin configured the VLAN as 71.

The vMotion checkbox is not checked for the management networks.

schepp
Leadership

Well, if you

- configured VLAN 71 on both vMotion vmks,

- the network admin trunked it on both switch ports,

- your subnet mask settings are correct (255.255.255.128)

- and of course your network admin isn't joking with you 😉

you should be able to vmkping the vmks.
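
The IP-related items in this checklist can be checked mechanically. The following is a minimal sketch, assuming you have copied each host's vMotion IP address and subnet mask by hand; the function name check_vmotion_pair is hypothetical, and the VLAN and switch-port items still have to be verified on the hosts and the switch.

```python
import ipaddress


def check_vmotion_pair(a_ip, a_mask, b_ip, b_mask):
    """Return a list of problems found; an empty list means the pair looks consistent."""
    problems = []
    # ip_interface accepts "address/netmask" in dotted-quad form.
    a = ipaddress.ip_interface(f"{a_ip}/{a_mask}")
    b = ipaddress.ip_interface(f"{b_ip}/{b_mask}")
    if a.network.netmask != b.network.netmask:
        problems.append("subnet masks differ")
    if a.ip == b.ip:
        problems.append("duplicate IP address")
    if b.ip not in a.network or a.ip not in b.network:
        problems.append("addresses are not in the same subnet")
    return problems


# Values from the thread: both hosts use 255.255.255.128 (a /25).
print(check_vmotion_pair("10.128.88.10", "255.255.255.128",
                         "10.128.88.20", "255.255.255.128"))
```

An empty list only says the addressing is consistent; it says nothing about whether the physical switch is passing the VLAN.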

King_Robert
Hot Shot

Cause

There are several possible reasons for this failure:

  1. If there are multiple vmkernel ports in the same network, the ESX/ESXi host may not use the vmkernel port checked off for vMotion when using the vmkping command. The host uses the vmkernel port associated with that IP subnet in its routing/forwarding table. If there is a physical switch configuration problem, vmkping may show connectivity as working correctly, but the actual vmkernel port associated with vMotion may not have access to that IP network on the physical network.

  2. The incorrect vmkernel interface may be selected for vMotion. The ESX/ESXi host uses only the selected interface for vMotion. If a vmkernel interface is in the incorrect IP subnet, or if the physical network is not configured correctly, the vMotion vmkernel interface may not be able to communicate with the destination host.

  3. If using multi-NIC vMotion, the vmkernel interfaces may not all be in the same IP subnet. Multi-NIC vMotion in vSphere 5.x does not work correctly unless all vmkernel ports are in the same IP subnet and all checked off for vMotion in the vSphere Client.

Resolution

To troubleshoot this issue, verify these conditions:

  1. Ensure all physical switch ports to be used for vMotion are configured for the correct VLAN, or have access to the correct IP subnet for vMotion on the physical network.
  2. Avoid the use of multiple vmkernel ports in the same subnet. The only exceptions are iSCSI Multi-Pathing and Multiple-NIC vMotion in vSphere 5.x.
  3. Ensure vMotion is not enabled on multiple vmkernel port groups. For example, do not enable vMotion on both the Management port group and the vMotion port group.
  4. Ensure that the subnet mask is consistent across all hosts and ensure that there are no IP address conflicts in the vMotion network.
  5. Adhere to vMotion requirements and best practices.
  6. Check the ESX/ESXi host's routing table to determine which vmkernel interface is used for each IP subnet. If multiple vmkernel ports exist in the same IP subnet, the one listed in the table is used when using the vmkping command.
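
The routing-table point in the last item can be sketched as a simple lookup. This is a toy model in Python, not how ESXi implements it: the table is hypothetical and built from the addresses in this thread, with one entry per subnet, and the name vmk0 for the management interface is an assumption (the thread only names vmk1 and vmnic0).

```python
import ipaddress

# Hypothetical routing table modelled on the thread: one entry per subnet.
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "vmk0"),   # management (vmk0 is assumed)
    (ipaddress.ip_network("10.128.88.0/25"), "vmk1"),   # vMotion
]
DEFAULT_ROUTE = ("192.168.1.3", "vmk0")  # management gateway mentioned in the thread


def outgoing_interface(destination: str):
    """Return (vmk, next_hop) for a destination using longest-prefix match."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, vmk) for net, vmk in ROUTES if dest in net]
    if matches:
        # The most specific matching subnet wins; traffic is delivered directly.
        _, vmk = max(matches, key=lambda m: m[0].prefixlen)
        return vmk, "on-link"
    # No directly connected subnet matches: fall back to the default gateway.
    gateway, vmk = DEFAULT_ROUTE
    return vmk, gateway


for dest in ("10.128.88.20", "192.168.1.50", "172.16.0.1"):
    print(dest, "->", outgoing_interface(dest))
```

Because each subnet maps to a single interface in this model, putting two vmkernel ports in the same subnet means only the listed one is ever chosen, which is the situation the list above warns about.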
JPM300
Commander

If you get into the console on that host, can you do a vmkping to the IP address of the other host's vMotion interface? Another way to troubleshoot this is to ask your network admin to provision one more port on the switch in the same VLAN 71, then put an IP address in the same range as your vMotion network, with VLAN 71, on a laptop NIC, plug the laptop into the new port, and see if you can ping both vMotion addresses. If you can't, it's a switching problem.

Can we get a screenshot of the other host as well?

Also:

"But the command vmkping -I vmk1 10.128.88.20 failed on host 1, and vmkping -I vmk1 10.128.88.10 failed on host 2. When I removed the VLAN ID, the ping worked as expected."

This made sense before: if the port was in access mode, you didn't need a VLAN ID on your port group, because access mode was placing anything plugged into those ports in VLAN 71. If you try the same thing now, you shouldn't be able to vmkping each other with the VLAN ID removed, now that the switch ports are in trunk mode and passing VLAN 71.

a_p_
Leadership

What type (vendor/model) of switch do you use? Please ask your network admin to provide the configuration for the two physical switch ports used for vMotion. By the way, did you double-check that the network cables are plugged into the correct ports?

Assuming it's a Cisco switch, provide the output of show run int Gig X/Y. Also ask the network admin to verify that VLAN 71 exists on the switch.


André

pingnithin
Enthusiast

Tried the communication on both Switch ports by connecting two laptops. It didn't work !!!

The network admin is working on it. Will update this post after the switch setup.

JPM300
Commander

Sounds good.  Let us know how it works out.  I figure once they fix the networking issue you shouldn't have any problems as the rest of the setup looks fine.
