4 Replies Latest reply on Mar 27, 2019 12:43 PM by karanbehl

    vMotion not working, cannot vmkping vMotion ports

    dlogan Lurker

      I am having a problem with the vMotion interface. For some reason I cannot see the vMotion vSwitch on the other machine (it doesn't matter which machine in the cluster I am on), so when I try to migrate a VM I get the following error message:

       

      "The vMotion migrations failed because the ESX hosts were not able to connect over the vMotion network. Check the vMotion network settings and physical network configuration. vMotion migration [-1978010937:1299041522605750] failed to create a connection with remote host <10.0.0.234>: The ESX hosts failed to connect over the VMotion network Migration [-1978010937:1299041522605750] failed to connect to remote host <10.0.0.234>: Timeout"

       

      The KB article states this could be because of network misconfiguration and gives some suggestions, which I've followed, but I cannot see what I've done wrong.

       

      I have vSwitch3 set with a PortGroup of vMotion (this is identical on both machines), using vmnic7. I can use vmkping to ping the local interface, so I know that the ESX server is listening on that IP/port, but I can't vmkping the remote server (the symptom is the same on both servers). I can ping all points along the route to the other machine, e.g. both gateways are available from the source host and from the destination host, but once I move to vmkping, it shuts up shop on the other server.

       

      I have one Service Console vSwitch which is shared with the VMs, one for iSCSI traffic, a separate Service Console and a vMotion switch. The servers are available via both Service Consoles and storage is visible in the vCenter server, so the iSCSI vmkernel ports are working fine.

       

      [root@uts-arcs-esx41-svr02 ~]# vmware -v
      VMware ESX 4.1.0 build-260247
      [root@uts-arcs-esx41-svr02 ~]# vmware -l
      VMware ESX 4.1.0 GA

       

      I have two networks set up, 10.0.0.192/27 and 10.0.0.224/27, configured according to the documentation that I could find.
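For reference, the address ranges of the two /27 networks can be worked out with Python's standard `ipaddress` module. This is only an illustrative sketch, not something run on the ESX hosts, and the helper name `describe` is made up for the example.

```python
import ipaddress

# The two /27 networks described in the post.
networks = [
    ipaddress.ip_network("10.0.0.192/27"),
    ipaddress.ip_network("10.0.0.224/27"),
]

def describe(net):
    """Return (netmask, first usable host, last usable host, broadcast) as strings."""
    hosts = list(net.hosts())
    return (
        str(net.netmask),
        str(hosts[0]),
        str(hosts[-1]),
        str(net.broadcast_address),
    )

for net in networks:
    print(net, describe(net))
```

The computed netmask and broadcast addresses can be compared against the values reported by `esxcfg-vswif -l` and `esxcfg-vmknic -l`.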

       

      [root@uts-arcs-esx41-svr02 ~]# esxcfg-vswif -l
      Name     Port Group/DVPort        IP Family  IP Address   Netmask          Broadcast    Enabled  TYPE
      vswif0   Service Console 192/27   IPv4       10.0.0.219   255.255.255.224  10.0.0.223   true     STATIC
      vswif1   Service Console 224/27   IPv4       10.0.0.228   255.255.255.224  10.0.0.255   true     STATIC

       

      [root@uts-arcs-esx41-svr02 ~]# esxcfg-vswitch -l
      Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
      vSwitch0         128         7           128               1500    vmnic0,vmnic2,vmnic3

       

        PortGroup Name          VLAN ID  Used Ports  Uplinks
        VM Network              0        2           vmnic0,vmnic2,vmnic3
        Service Console 192/27  0        1           vmnic0,vmnic2,vmnic3

       

      Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
      vSwitch1         128         5           128               1500    vmnic1,vmnic5

       

        PortGroup Name        VLAN ID  Used Ports  Uplinks
        iSCSI2                0        1           vmnic5
        iSCSI1                0        1           vmnic1

       

      Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
      vSwitch2         128         3           128               1500    vmnic4

       

        PortGroup Name          VLAN ID  Used Ports  Uplinks
        Service Console 224/27  0        1           vmnic4

       

      Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
      vSwitch3         128         3           128               1500    vmnic7

       

        PortGroup Name        VLAN ID  Used Ports  Uplinks
        vMotion               0        1           vmnic7

       

      [root@uts-arcs-esx41-svr02 ~]# esxcfg-nics -l
      Name    PCI           Driver      Link Speed     Duplex MAC Address       MTU    Description
      vmnic0  0000:01:00.00 bnx2        Up   1000Mbps  Full   b8:ac:6f:9a:55:3f 1500   Broadcom Corporation PowerEdge R710 BCM5709 Gigabit Ethernet
      vmnic1  0000:01:00.01 bnx2        Up   1000Mbps  Full   b8:ac:6f:9a:55:41 1500   Broadcom Corporation PowerEdge R710 BCM5709 Gigabit Ethernet
      vmnic2  0000:02:00.00 bnx2        Up   1000Mbps  Full   b8:ac:6f:9a:55:43 1500   Broadcom Corporation PowerEdge R710 BCM5709 Gigabit Ethernet
      vmnic3  0000:02:00.01 bnx2        Up   1000Mbps  Full   b8:ac:6f:9a:55:45 1500   Broadcom Corporation PowerEdge R710 BCM5709 Gigabit Ethernet
      vmnic4  0000:07:00.00 bnx2        Up   1000Mbps  Full   00:10:18:98:7b:e8 1500   Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
      vmnic5  0000:07:00.01 bnx2        Up   1000Mbps  Full   00:10:18:98:7b:ea 1500   Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
      vmnic6  0000:08:00.00 bnx2        Up   1000Mbps  Full   00:10:18:98:7b:ec 1500   Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T
      vmnic7  0000:08:00.01 bnx2        Up   1000Mbps  Full   00:10:18:98:7b:ee 1500   Broadcom Corporation Broadcom NetXtreme II BCM5709 1000Base-T

       

      [root@uts-arcs-esx41-svr02 ~]# esxcfg-vmknic -l
      Interface  Port Group/DVPort   IP Family  IP Address      Netmask          Broadcast        MAC Address        MTU   TSO MSS  Enabled  Type
      vmk0       iSCSI1              IPv4       192.168.242.50  255.255.255.0    192.168.242.255  00:50:56:79:44:ff  1500  65535    true     STATIC
      vmk1       iSCSI2              IPv4       192.168.242.51  255.255.255.0    192.168.242.255  00:50:56:7f:93:d9  1500  65535    true     STATIC
      vmk2       vMotion             IPv4       10.0.0.199      255.255.255.224  10.0.0.223       00:50:56:76:1a:b4  1500  65535    true     STATIC

       

      [root@uts-arcs-esx41-svr02 ~]# esxcfg-route -l
      VMkernel Routes:
      Network          Netmask          Gateway          Interface
      10.0.0.192       255.255.255.224  Local Subnet     vmk2
      192.168.242.0    255.255.255.0    Local Subnet     vmk0
      default          0.0.0.0          10.0.0.193       vmk2

       

      [root@uts-arcs-esx41-svr02 ~]# ping 10.0.0.193
      PING 10.0.0.193 (10.0.0.193) 56(84) bytes of data.
      64 bytes from 10.0.0.193: icmp_seq=1 ttl=64 time=0.686 ms
      64 bytes from 10.0.0.193: icmp_seq=2 ttl=64 time=0.586 ms

       

      --- 10.0.0.193 ping statistics ---
      2 packets transmitted, 2 received, 0% packet loss, time 999ms
      rtt min/avg/max/mdev = 0.586/0.636/0.686/0.050 ms

       

      [root@uts-arcs-esx41-svr02 ~]# ping 10.0.0.225
      PING 10.0.0.225 (10.0.0.225) 56(84) bytes of data.
      64 bytes from 10.0.0.225: icmp_seq=1 ttl=64 time=5.39 ms
      64 bytes from 10.0.0.225: icmp_seq=2 ttl=64 time=0.739 ms
      64 bytes from 10.0.0.225: icmp_seq=3 ttl=64 time=0.604 ms

       

      --- 10.0.0.225 ping statistics ---
      3 packets transmitted, 3 received, 0% packet loss, time 2004ms
      rtt min/avg/max/mdev = 0.604/2.247/5.398/2.228 ms

       

      [root@uts-arcs-esx41-svr02 ~]# vmkping 10.0.0.199
      PING 10.0.0.199 (10.0.0.199): 56 data bytes
      64 bytes from 10.0.0.199: icmp_seq=0 ttl=64 time=0.056 ms
      64 bytes from 10.0.0.199: icmp_seq=1 ttl=64 time=0.029 ms
      64 bytes from 10.0.0.199: icmp_seq=2 ttl=64 time=0.039 ms

       

      --- 10.0.0.199 ping statistics ---
      3 packets transmitted, 3 packets received, 0% packet loss
      round-trip min/avg/max = 0.029/0.041/0.056 ms

       

      [root@uts-arcs-esx41-svr02 ~]# vmkping 10.0.0.234
      PING 10.0.0.234 (10.0.0.234): 56 data bytes

       

      --- 10.0.0.234 ping statistics ---
      3 packets transmitted, 0 packets received, 100% packet loss

       

      Any help or pointers would be most appreciated.

       

      Thanks

      David

        • 1. Re: vMotion not working, cannot vmkping vMotion ports
          ThompsG Master

          Hi David,

           

          Wowsa that's a complicated setup!

           

          Looking through your configuration I have come up with the following information:

           

          uts-arcs-esx41-svr02

          vSwitch 0 - Service Console 1 (10.0.0.219/27) - vlan 0

          vSwitch 1 - will ignore

          vSwitch 2 - Service Console 2 (10.0.0.228/27) - vlan 0

          vSwitch 3 - vmotion (10.0.0.199) - vlan 1

           

          Now if I read the rest of the post correctly you have a second ESX server with, I'm assuming, a vmotion address of 10.0.0.234 - am I correct so far?

           

          One thing you might want to look at is that your vMotion network is on the same subnet as your Service Console 1. This probably means you have the same gateway specified, and therefore traffic is probably being routed over the Service Console 1 network, since the vMotion address you are trying to connect to is in a different subnet. You also appear to have different VLANs, i.e. none and VLAN 1.
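As a rough illustration of the subnet point, here is a small Python sketch of how the VMkernel route table from the original post would treat the two vMotion addresses, assuming ordinary longest-prefix-match routing. The `ROUTES` list and the `lookup` helper are hypothetical names for this example only; the entries are copied from the `esxcfg-route -l` output.

```python
import ipaddress

# VMkernel routes as listed by `esxcfg-route -l` in the original post.
# Each entry is (destination network, gateway or None for a local subnet, interface).
ROUTES = [
    (ipaddress.ip_network("10.0.0.192/27"), None, "vmk2"),
    (ipaddress.ip_network("192.168.242.0/24"), None, "vmk0"),
    (ipaddress.ip_network("0.0.0.0/0"), "10.0.0.193", "vmk2"),
]

def lookup(destination):
    """Pick the most specific (longest-prefix) route that contains the destination."""
    addr = ipaddress.ip_address(destination)
    matches = [route for route in ROUTES if addr in route[0]]
    return max(matches, key=lambda route: route[0].prefixlen)

for dest in ("10.0.0.199", "10.0.0.234"):
    net, gateway, iface = lookup(dest)
    via = gateway if gateway else "local subnet"
    print(f"{dest}: route {net} via {via} on {iface}")
```

Under that assumption, 10.0.0.234 is not on vmk2's local subnet, so the vmkping has to go through the default gateway 10.0.0.193. Whether that gateway actually forwards between the two /27 networks depends on the physical network configuration, which the sketch cannot show.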

           

          Any reason you cannot make the vmotion network a different network ID – say 192.168.1.0?

           

          Kind regards.

          • 2. Re: vMotion not working, cannot vmkping vMotion ports
            dlogan Lurker

            Hi Glen,

             

            Many thanks for the suggestions. I've put the vMotion VMK address in the same subnet as Service Console 1 on both machines. Yes, it is fairly complex, but I need maximum redundancy due to data access requirements. It means it is fun getting it sorted out, but I'm a bit stumped on this one.

             

            Yes, the second server has the vMotion port address of 10.0.0.234. This is vmkpingable (new word) from the second server (10.0.0.229).

             

            Both servers have a Service Console in each of the subnets allowing a complete switch/routing failure and still ensuring access to the Service Consoles.

             

            I'll check the VLANs. I decided to leave those out as yet another layer of complexity, and perhaps I've mucked that bit up. What I have is enough for me to look at for the moment. Maybe when I've more experience with VMware I might use those.

             

            Thanks and regards

            David

            • 3. Re: vMotion not working, cannot vmkping vMotion ports
              dlogan Lurker

              Hi Glen,

               

              The VLAN IDs are all 0 on all portgroups and all vSwitches. I think the tabs make it a bit more difficult to read in the post.

               

              Thanks

              David

              • 4. Re: vMotion not working, cannot vmkping vMotion ports
                karanbehl Lurker

                Dear David,

                 

                As per the configuration in your post, the VMkernel NIC below has been created for vMotion in the same subnet as the port group "Service Console 192/27":

                vmk2       vMotion             IPv4      10.0.0.199

                However, no such VMkernel NIC has been created for vMotion on the "Service Console 224/27" network.

                 

                Hence, you are unable to reach the IP below with vmkping, as no VMkernel NIC has been created for it.

                 

                vmkping 10.0.0.234 (vmkping is specifically for VMkernel adapters).

                 

                However, you will still be able to ping the other IPs in the subnet 10.0.0.224/27, for example:

                 

                ping 10.0.0.225
                PING 10.0.0.225 (10.0.0.225) 56(84) bytes of data.
                64 bytes from 10.0.0.225: icmp_seq=1 ttl=64 time=5.39 ms
                64 bytes from 10.0.0.225: icmp_seq=2 ttl=64 time=0.739 ms
                64 bytes from 10.0.0.225: icmp_seq=3 ttl=64 time=0.604 ms

                 

                Please let me know if you have any queries or need clarification, or if I have misunderstood the network. Thanks.

                 

                Regards,

                Karan Behl