VMware Cloud Community
beholder242
Contributor

Stand-Alone vSphere host loses management when adding a 4th NIC to a 3-NIC vSwitch

OK, guys, here's a weird one for you, but I'm sure someone out there has the answer!

I have an HP DL360 G7 server running the HP-customized vSphere 6 base build (5050593).  The machine has 4 NIC ports built-in and also has a 4-port add-on NIC.  Right now, I only have 2 VMs running through a vSwitch with 3 NICs (vmnic0-2) to a Cisco 2960X-48TD-L.  The ports on the Cisco switch are configured for Etherchannel:

interface GigabitEthernet1/0/13
 description HP VM Network
 switchport mode access
 spanning-tree portfast edge
 channel-group 1 mode on
!
interface GigabitEthernet1/0/14
 description HP VM Network
 switchport mode access
 spanning-tree portfast edge
 channel-group 1 mode on
!
interface GigabitEthernet1/0/15
 description HP VM Network
 switchport mode access
 spanning-tree portfast edge
 channel-group 1 mode on
!
interface GigabitEthernet1/0/16
 description HP VM Network
 switchport mode access
 shutdown
 spanning-tree portfast edge
 channel-group 1 mode on

As you can see, G1/0/16 is shut down.  If I plug a cable into G1/0/16 (or any other port in the same channel-group) and connect the other end to vmnic3 or vmnic4 (and presumably 5-7 as well), I lose connectivity to the management IP.  VMs may or may not be affected.  When I discovered this, I was unable to reach either my vSphere host or a Windows guest server.  Once I either disconnect the additional vmnic or shut the port, connectivity is instantly restored.

Is there some sort of limitation on the size of vSwitches in the free edition, or the number of NICs that can be used?  I also tried creating a new standard vSwitch, but the issue persists.  As soon as I make a fourth network connection, the management IP goes dead.
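For context on why adding an uplink can break things: with "Route based on IP hash", the vSwitch maps each source/destination IP pair to one uplink, and the physical switch's EtherChannel must bundle exactly the same set of links. Here's a simplified model of that idea (this is NOT the exact hash ESXi uses; it's only to illustrate why both ends must agree on how many links are in the bundle):

```python
# Simplified illustration of IP-hash-style uplink selection.
# NOTE: not the actual ESXi algorithm -- it only demonstrates that
# each side independently maps a flow to one link, so the set of
# links must be treated as a single bundle on BOTH ends.
import ipaddress

def pick_uplink(src_ip: str, dst_ip: str, num_uplinks: int) -> int:
    """Map a (src, dst) flow to an uplink index by XOR-ing the two
    addresses and taking the result modulo the uplink count."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % num_uplinks

# The same flow always lands on the same uplink:
a = pick_uplink("192.168.1.10", "192.168.1.50", 3)
assert a == pick_uplink("192.168.1.10", "192.168.1.50", 3)

# But changing the uplink count remaps existing flows, so if the
# switch side bundles a different set of ports (or a member is
# shut), traffic hashed onto the mismatched link is black-holed.
print(pick_uplink("192.168.1.10", "192.168.1.50", 3),
      pick_uplink("192.168.1.10", "192.168.1.50", 4))
```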

5 Replies
Finikiez
Champion

Hi!

What is configured on the vSwitch in the NIC Teaming policy section?

The most frequent cause of the symptoms you describe is a VLAN misconfiguration on the physical switch ports.

beholder242
Contributor

NIC Teaming is set as follows:

Load balancing: Route based on IP hash

Network failover detection: Link status only

Notify Switches:  Yes

Failback:  No

Can't be a VLAN issue because I'm only using the default VLAN.

Finikiez
Champion

Ah, I missed the switch port configuration.

I would try recreating the EtherChannel configuration on the switch from scratch.

bspagna89
Hot Shot

Hi,

When you add the fourth NIC to the vSwitch, have you checked your switch or ESXi logs (once connectivity is restored)?

Try to reproduce it, and check the following :

On your switch, run "show etherchannel summary"

As an example, you should see something like this :

1      Po1(SU)          -        Te1/0/3(P)  Te2/0/3(P)  -- P means it's bundled in the etherchan. I'm curious to know if everything is bundled at that point.

You can also check the interface itself - "sh int po1" - and the physical interface - "sh int gi1/0/16" (or whichever the last one is). I agree with the previous reply: you should try re-creating the EtherChannel. Before re-creating it, try using standalone access ports for all 4 ports and see if you lose management. This would simply be a test, not the end result. If it works, we can move on to figuring out why the EtherChannel causes host management to disconnect. Can you also try shutting down gi1/0/13, enabling 14, 15, and 16, and seeing if it stays up?

I was not able to find any documentation regarding vSphere Free and NIC limitations.

You will also want to check ESXi Logs and see if something is causing the issue on the host side. I'd start with the vmkernel.log and also worth checking vobd.log.

New blog - https://virtualizeme.org/
beholder242
Contributor

Been busy with some other projects, and just now got back to this.

Fortunately, I had a chance to shut down my vSphere host and VMs, bring the host back up to work on this further, and I found my mistake.

Just to document my steps:

1.  I reduced the vSwitch to a single NIC, vmnic0.

2.  Removed the etherchannel configuration from the switch ports  (i.e. "no channel-group 1 mode on")

3.  Verified connectivity.  All good.

4.  Wired up the other 7 NIC ports on the host into the switch.

5.  Added the other 7 NICs (vmnic1-7) into the vSwitch.

6.  Verified connectivity.  All still good.  All 8 links up.

7.  Added the etherchannel onto the 8 switch ports connected to the host.

8.  Connectivity dropped.  Reverted the etherchannel, connectivity restored.

9.  Did a little more digging and found KB1022751.  It appears that my mistake was not changing the NIC teaming/load-balancing mode on my Management Network to "Route based on IP hash" to match the vSwitch it's associated with.

10.  After changing the setting on my Management Network and reinstating the etherchannel, the system dropped 2 packets, but came back pinging once again.

Problem appears solved!
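For anyone else who hits this: the key detail is that a port group (such as Management Network) can carry its own teaming-policy override and silently diverge from the vSwitch it sits on. A minimal sketch of that inheritance behavior (the names and structure here are illustrative, not the real ESXi API):

```python
# Illustrative model of vSwitch / port-group teaming-policy
# inheritance. Not the real ESXi API -- it only shows how a stale
# port-group-level override (Management Network still on the old
# policy) conflicts with a vSwitch that was switched to IP hash.

def effective_policy(vswitch_policy: str, overrides: dict, portgroup: str) -> str:
    """A port group uses its own override if set, else inherits
    the policy from its parent vSwitch."""
    return overrides.get(portgroup, vswitch_policy)

vswitch_policy = "iphash"                     # set to match the EtherChannel
overrides = {"Management Network": "portid"}  # stale per-port-group override

# VM Network inherits the vSwitch policy and works over the bundle:
assert effective_policy(vswitch_policy, overrides, "VM Network") == "iphash"

# Management traffic is still balanced per originating port ID,
# which conflicts with the static EtherChannel -> management drops:
assert effective_policy(vswitch_policy, overrides, "Management Network") == "portid"

# The fix per KB1022751: make the port group match the vSwitch.
overrides.pop("Management Network")
assert effective_policy(vswitch_policy, overrides, "Management Network") == "iphash"
```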
