VMware Cloud Community
vinny95
Contributor

NIC teaming strange behaviour

Hi,

I have an ESXi 5.5 host and wanted to test NIC teaming for the management vmk address.

I set up a distributed vSwitch with 2 physical NICs (vmnic0 & vmnic1), each connected to a different physical switch.

A port group with a VLAN ID, failback set to No, and load balancing set to route based on physical NIC load.

A management vmk on this port group.

I shut down one physical switch port (the one connected to vmnic1):

ESXi sees the uplink go down and reports lost redundancy; we lose one packet when pinging the ESXi host.

2015-03-18T09:44:54.003Z [79980B70 info 'Vimsvc.ha-eventmgr'] Event 2002 : Lost uplink redundancy on DVPorts: "1738/08 09 23 50 9b 30 e6 89-3c 8e 65 73 a1 83 f1 7b", "1738/08 09 23 50 9b 30 e6 89-3c 8e 65 73 a1 83 f1 7b". Physical NIC vmnic1 is down.

2015-03-18T09:51:40.842Z cpu6:33515)<6>igb: vmnic1 NIC Link is Down

So my traffic now goes through vmnic0.

> I re-enable the uplink of vmnic1 on the physical switch: I expect nothing to happen, because failback is set to No and my traffic is already on vmnic0.

But I lose the connection to the ESXi host for a few seconds and lose 7 to 10 packets when pinging its management address.

In the logs I see:

2015-03-18T09:47:09.003Z [799C1B70 info 'Vimsvc.ha-eventmgr'] Event 2004 : Uplink redundancy restored on DVPorts: "1738/08 09 23 50 9b 30 e6 89-3c 8e 65 73 a1 83 f1 7b", "1738/08 09 23 50 9b 30 e6 89-3c 8e 65 73 a

2015-03-18T09:47:11.965Z [799C1B70 info 'Vimsvc.ha-eventmgr'] Event 2005 : The dvPort 1738 link was down in the vSphere Distributed Switch  in ha-datacenter

2015-03-18T09:47:11.966Z [799C1B70 info 'Vimsvc.ha-eventmgr'] Event 2006 : The dvPort 1738 was unblocked in the vSphere Distributed Switch  in ha-datacenter.

2015-03-18T09:47:11.968Z [799C1B70 info 'Vimsvc.ha-eventmgr'] Event 2007 : The dvPort 1738 link was up in the vSphere Distributed Switch  in ha-datacenter

(in vmkernel.log)

2015-03-18T09:52:11.753Z cpu2:33533)<6>igb: vmnic1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX

2015-03-18T09:52:12.954Z cpu0:32884)NetPort: 1632: disabled port 0x3000004

2015-03-18T09:52:12.954Z cpu0:32884)NetPort: 2905: resuming traffic on DV port 1738

2015-03-18T09:52:12.955Z cpu0:32884)Uplink: 6529: enabled port 0x3000004 with mac 0c:c4:7a:48:fe:ab

Is this behaviour "normal"?

Why does re-enabling one physical NIC in the team cause a loss of the management network for 10 seconds?

vinny

4 Replies
jrmunday
Commander

Hi Vinny,

How are your physical switches configured?

Do you have PortFast enabled on these ports to ensure that Spanning Tree is not causing issues?

Cheers,

Jon

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
vinny95
Contributor

Hi Jon,

My physical switches (Cisco) are configured like this.

(the NIC I disabled and then re-enabled)

> vmnic1 :

interface GigabitEthernet1/x

description xx

switchport trunk encapsulation dot1q

switchport trunk allowed vlan xx

switchport mode trunk

logging event trunk-status

The other NIC, vmnic0:

interface GigabitEthernet1/x

description xxx

switchport trunk encapsulation dot1q

switchport trunk allowed vlan xx

switchport mode trunk

logging event link-status

logging event trunk-status

load-interval 30

storm-control broadcast level 33.00

no cdp enable

spanning-tree portfast

It looks like the network team did not apply the same configuration on each switch. Could that be an issue?

vinny

jrmunday
Commander

Hi Vinny,

I would certainly start by getting the ports set up the same, including enabling PortFast (not currently configured on the vmnic1 port).

Not related to your issue, but I would also get CDP enabled, so that you can see this port information in the virtual switch.
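
As a rough sketch of what a matching configuration for the vmnic1 port could look like, based on the vmnic0 port settings posted above: the interface name and VLAN list are the placeholders from that post, and the `trunk` keyword on the PortFast line is an assumption about the platform (on many Cisco IOS releases plain `spanning-tree portfast` only takes effect on access ports, and some newer releases spell the command `spanning-tree portfast edge trunk`). Check the command reference for your switch model before applying it.

```
! Sketch only: interface name and VLAN list are placeholders from the post above.
interface GigabitEthernet1/x
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan xx
 switchport mode trunk
 logging event link-status
 logging event trunk-status
 ! Assumption: the "trunk" keyword is needed for PortFast to apply on a trunk port.
 spanning-tree portfast trunk
 ! Enables CDP on this interface so the ESXi host can show neighbour information.
 cdp enable
```

With PortFast active, the port should skip the Spanning Tree listening and learning states when the link comes back up, so it starts forwarding right away.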

Cheers,

Jon

vExpert 2014 - 2022 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
vinny95
Contributor

I configured both ports with the same configuration and took a close look at where the ESXi management MAC address is learned.

The MAC address is on switch A.

I shut down the port on switch A.

> The MAC address moves to switch B, no packets are lost, everything is OK.

I run no shutdown on the port on switch A.

The MAC address is seen on both switch A and switch B, packets are lost (between 5 and 10), and the ESXi host is not reachable.

Then the MAC address is seen only on switch A, and the ESXi host is reachable again.

I triple-checked the port group and failback is set to No.

So why does the MAC address move back to the previous switch?

vinny
