I'm a total networking noob so please excuse me in advance.
I'm configuring a virtual environment with three ESXi 6.0 hosts, each with two physical NICs connected to a single Catalyst 3560 switch. I have four VLANs trunked on a port channel, and the two NIC ports on the switch are assigned to that channel. Half the time I cannot get the host to ping when both NICs are enabled, and I cannot get more than one VM per host to ping out. Here is the switch config for one host:
interface Port-channel4
description Test Host 3 Port Channel
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 2,5,102,104
switchport mode trunk
spanning-tree portfast trunk
!
interface GigabitEthernet0/5
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 2,5,102,104
switchport mode trunk
channel-group 4 mode on
spanning-tree portfast trunk
!
interface GigabitEthernet0/6
switchport trunk encapsulation dot1q
switchport trunk allowed vlan 2,5,102,104
switchport mode trunk
channel-group 4 mode on
spanning-tree portfast trunk
Everything I have researched tells me this is configured correctly. Is there further configuration needed on the switch or within ESXi 6.0 to enable LACP on this port channel? Any help would be much appreciated!
Hi
As the others have said, it sounds like your teaming config is wrong; refer to http://pubs.vmware.com/vsphere-60/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-60-network...
Page 23 for standard switches
Page 76 for distributed switches
Page 93 for policy config
Regards
Ken
Did you set the teaming policy on the vSwitch/port group to "IP-Hash"?
André
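A sketch of how to check and set that policy from the ESXi shell, assuming the vSwitch is named vSwitch0 (adjust to your environment):

```shell
# Show the current load-balancing / failover policy (vSwitch0 is an assumption)
esxcli network vswitch standard policy failover get -v vSwitch0

# A static EtherChannel ("channel-group ... mode on" on the Cisco side)
# requires IP-hash load balancing on the vSwitch side:
esxcli network vswitch standard policy failover set -v vSwitch0 -l iphash
```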
Hi,
How did you configure NIC teaming on the vSwitch/dvSwitch?
I cannot even get the host to ping half the time. When it does, and I set the IP-hash load-balancing policy on the vSwitch, only one VM will ping out.
There is obviously something misconfigured on the physical switch. I have 3 hosts in the cluster and currently one is not pinging at all but when I switch the cables on the physical switch, it pings.
Thank you for the document. I will read over it but this appears to be something on the physical switch. The physical switch is on a 192.168.1.x network (VLAN 2) while the host is on the 104 network (VLAN 104).
Can you post the output of the following for the connected ports:
show interface status
show mac address-table int ****** (both interfaces)
CDP output from the vswitch for both adapters
management config of the host
#sh int status
Port Name Status Vlan Duplex Speed Type
Gi0/1 connected trunk a-full a-1000 10/100/1000BaseTX
Gi0/2 connected trunk a-full a-1000 10/100/1000BaseTX
Gi0/3 connected trunk a-full a-1000 10/100/1000BaseTX
Gi0/4 connected trunk a-full a-1000 10/100/1000BaseTX
Gi0/5 connected trunk a-full a-1000 10/100/1000BaseTX
Gi0/6 connected trunk a-full a-1000 10/100/1000BaseTX
Gi0/7 connected 1 a-full a-100 10/100/1000BaseTX
Gi0/8 connected 1 a-full a-100 10/100/1000BaseTX
Gi0/9 connected 1 a-full a-100 10/100/1000BaseTX
Gi0/10 connected 1 a-full a-100 10/100/1000BaseTX
Gi0/11 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/12 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/13 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/14 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/15 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/16 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/17 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/18 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/19 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/20 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/21 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/22 notconnect 1 auto auto 10/100/1000BaseTX
Gi0/23 connected trunk a-full a-1000 10/100/1000BaseTX
Gi0/24 connected trunk a-full a-1000 10/100/1000BaseTX
Gi0/25 notconnect 1 auto auto Not Present
Gi0/26 notconnect 1 auto auto Not Present
Gi0/27 notconnect 1 auto auto Not Present
Gi0/28 notconnect 1 auto auto Not Present
Po1 <--- 2GB Link to C connected trunk a-full a-1000
Po2 Test Host 1 Port C connected trunk a-full a-1000
Po3 Test Host 2 Port C connected trunk a-full a-1000
Po4 Test Host 3 Port C connected trunk a-full a-1000
show mac address-table interface GigabitEthernet0/5
Mac Address Table
-------------------------------------------
Vlan Mac Address Type Ports
---- ----------- -------- -----
WUN-TEST-SWITCH#show mac address-table interface GigabitEthernet0/6
Mac Address Table
-------------------------------------------
Vlan Mac Address Type Ports
---- ----------- -------- -----
CDP output on the host comes up with nothing
sh cdp neighbors on the physical switch pulls up:
TEST-SWITCH#show cdp neighbors
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone,
D - Remote, C - CVTA, M - Two-port Mac Relay
Device ID Local Intrfce Holdtme Capability Platform Port ID
3750X-CORESW01.
Gig 0/24 168 S I WS-C3750X Gig 1/0/20
3750X-CORESW01.
Gig 0/23 168 S I WS-C3750X Gig 1/0/19
TEST-HOST03 Gig 0/6 123 S VMware ES vmnic0
TEST-HOST03 Gig 0/5 123 S VMware ES vmnic1
I don't know the PowerCLI command to get the management config of the host; if you let me know how to obtain it, I will gladly add it to this post. I also don't know why only one of the three hosts is showing up as a CDP neighbor on the switch when all three hosts are configured identically.
Is the device you're pinging outside of this switch? I'm asking because it looks like you have another trunk going somewhere else.
Have you tried pinging from host to host? If you can't ping between hosts, can you shut one of the ports (e.g. Gi0/6) and try again?
To get the IP info you can run:
get-vmhost | Get-VMHostNetworkAdapter | select VMhost, Name, IP, SubnetMask, Mac, PortGroupName, vMotionEnabled, mtu, FullDuplex
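As a quick host-to-host test you can also use vmkping from the ESXi shell (192.168.104.101 here is an assumed management IP of one of the other hosts; substitute a real one):

```shell
# Ping another host's vmkernel IP through the vmkernel stack
vmkping 192.168.104.101

# Force a specific vmkernel interface if the host has several
vmkping -I vmk0 192.168.104.101
```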
The trunk you see is the uplink to our core switch. When the host is working, it can ping everything: all VLAN gateways, all VMs, inside or outside the test environment, etc. But when I swap the cables, I lose all connectivity on the host. The host is currently reachable with physical NIC0 connected to port 6 and NIC1 connected to port 5; swapping those loses the connection. Shutting down one of the ports doesn't seem to make a difference. Here is the output you requested:
VMHost : test-host03
Name : vmnic0
IP :
SubnetMask :
Mac : 00:21:9b:a3:c5:86
PortGroupName :
vMotionEnabled :
mtu :
FullDuplex : True
VMHost : test-host03
Name : vmnic1
IP :
SubnetMask :
Mac : 00:21:9b:a3:c5:88
PortGroupName :
vMotionEnabled :
mtu :
FullDuplex : True
VMHost : test-host03
Name : vmnic2
IP :
SubnetMask :
Mac : 00:21:9b:a3:c5:8a
PortGroupName :
vMotionEnabled :
mtu :
FullDuplex : False
VMHost : test-host03
Name : vmnic3
IP :
SubnetMask :
Mac : 00:21:9b:a3:c5:8c
PortGroupName :
vMotionEnabled :
mtu :
FullDuplex : False
VMHost : test-host03
Name : vmk0
IP : 192.168.104.103
SubnetMask : 255.255.255.0
Mac : 00:21:9b:a3:c5:86
PortGroupName : Management Network
VMotionEnabled : False
Mtu : 1500
FullDuplex :
I took spanning tree off the core switch's port-channel uplink and verified that IP hash was selected on all vSwitches, and I now have working networking on multiple VMs per host. That issue seems to be resolved, but I don't understand why it matters which physical switch port NIC0 is connected to. This happens on all hosts, by the way.
I decided to reproduce your issue in my dev environment and ran into exactly the same thing. After spending a bit of time wondering what was going on, it was facepalm time: this is ESXi 6, not 5.5. NIC teaming is different; if you change the dvSwitch to use LAG groups, your problem will be fixed and your NIC teaming should start working. If you're not sure how to make the changes, look at pages 67-76 in http://pubs.vmware.com/vsphere-60/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-60-network...
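For reference, the switch side would also need to change in that case: `channel-group 4 mode on` builds a static EtherChannel with no LACP negotiation, while a dvSwitch LAG expects LACP. A sketch of the LACP variant, assuming the same interfaces and port-channel number:

```
interface GigabitEthernet0/5
 channel-group 4 mode active   ! LACP, for a dvSwitch LAG
!
interface GigabitEthernet0/6
 channel-group 4 mode active   ! "mode on" = static channel, no LACP
```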
Good luck
I do not have any distributed switches configured in this environment. Is that necessary?
Oops, not sure why I had it in my head that you were using dvSwitches. It works fine with standard switches as long as your config is correct, and your switch config and vSwitch config look fine. If you can't ping within the same physical switch (i.e. to another host), then it points to something within the ESXi node. Pity we can't do a WebEx; I'd like to have a proper look. Another way to test would be to create a vmkernel port on the same range as a VM and try pinging it.
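A sketch of that vmkernel test from the ESXi shell; the port group name, vSwitch name, VLAN ID, and IP addresses below are all assumptions, so match them to the VM's actual network:

```shell
# Create a test port group on the existing vSwitch and tag it with the VM's VLAN
esxcli network vswitch standard portgroup add --portgroup-name=TestPG --vswitch-name=vSwitch0
esxcli network vswitch standard portgroup set --portgroup-name=TestPG --vlan-id=102

# Add a vmkernel interface on that port group with an address in the VM's range
esxcli network ip interface add --interface-name=vmk1 --portgroup-name=TestPG
esxcli network ip interface ipv4 set --interface-name=vmk1 --ipv4=192.168.102.50 --netmask=255.255.255.0 --type=static

# Then try pinging the VM from that interface
vmkping -I vmk1 192.168.102.10
```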
If you created the trunk/EtherChannel after the port groups were created, double-check that the port groups updated to IP hash and that both NICs are in use.
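One way to verify that from the ESXi shell; the vSwitch and port group names are assumptions:

```shell
# vSwitch-level teaming policy
esxcli network vswitch standard policy failover get -v vSwitch0

# Port-group-level override, if any ("VM Network" is an assumed name)
esxcli network vswitch standard portgroup policy failover get -p "VM Network"
```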
All of that tested fine. The physical switch was configured well before any hosts. Failover seems to be working within vCenter, meaning that if I shut down a port on the switch, the VMs and hosts are still reachable. Manually swapping the cables is the only remaining issue, and I don't plan on doing that very often, if ever. I'll go ahead and mark this discussion as closed, with IP hash as the resolution. Thank you all for your help!