VMware Cloud Community
RichardD45
Contributor

VMs not finding correct network path when a physical switch fails

Hi everyone,

I'm hoping I'm in the right place for help with this question.

Currently we are having problems with some VMs losing network connectivity when one of our physical switches fails.

We have a blade environment (C7000 with 3 VMware blades) with 2 Virtual Connect modules, and each of these Virtual Connects is connected to an S5700 Huawei switch. The other day one of our Huawei switches powered off due to a power failure (the other switch is on a UPS). The system should have continued to work, but some VMs lost connection. Once the Huawei switch came back online, the VMs were reachable again as if nothing had happened.

It does not seem to be host related, as some VMs on the same host continued to work OK. So, from my point of view, the problem seems to be with the VM not seeing a new network path. I did notice that 'Notify Switches' was set to NO, and my inclination is to set that to YES. Am I right in thinking this, or is something else causing the VMs to become lost from the network?

Any help would be much appreciated.

Many thanks

Richard

3 Replies
bspagna89
Hot Shot

Hi,

Do you have any LACP or load balancing configured on your switch/blade uplink ports? If not, we can look into configuring this. Essentially, the affected VMs lost connectivity because their traffic path was VMs -> port group -> VMNICx -> powered-off switch. If load balancing had been in place and working correctly, the failure would have been detected and a new path discovered.

New blog - https://virtualizeme.org/
RichardD45
Contributor

Hi,

I have checked the configuration of the Huawei switches, and the ports that carry our VLAN Ethernet traffic to the VCs are set as 'Trunk'. The Virtual Connects are set to trunk mode 'AUTO'. I can see no mention of LACP being set.

The two Huawei switches are in a stack, but the ports on each switch which transport Ethernet traffic to the Virtual Connects are not linked.

Thanks for your prompt reply. I apologise if my answers are a bit vague, as I'm at the limit of my knowledge on a system which was configured by an external company, so I'm learning as I go along.

Thanks

Richard

a_p_
Leadership

Welcome to the Community,

the issue in this case is that the ESXi host isn't aware of the switch failure. It only sees the internal switch ports, i.e. the ones of the Virtual Connect module.

I did configure C7000 blade enclosures in the past, but with Cisco switches. These switches allow you to configure Link State groups. With Link State groups, all downlinks (the ports to the blades) will go down if the uplink(s) (ports to the external switch) in the group fail. I'm not sure whether the Virtual Connect switches also offer such a feature.
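
To illustrate the Link State group idea, here is a rough toy model in Python. It is only a sketch of the behaviour described above, not Cisco or Virtual Connect configuration; the class, method and port names are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class InterconnectModule:
    """Hypothetical model of a blade interconnect with uplinks and downlinks."""
    uplinks: Dict[str, bool]     # port name -> link is up (towards external switch)
    downlinks: Dict[str, bool]   # port name -> link is up (towards the blades)
    link_state_tracking: bool = True

    def set_uplink(self, port: str, is_up: bool) -> None:
        """Change an uplink's state and propagate it if tracking is enabled."""
        self.uplinks[port] = is_up
        if self.link_state_tracking:
            # Downlinks follow the group: up only while at least one uplink is up.
            any_uplink_up = any(self.uplinks.values())
            for name in self.downlinks:
                self.downlinks[name] = any_uplink_up

    def host_sees_link_down(self, downlink: str) -> bool:
        """The host only sees its own downlink, never the external switch."""
        return not self.downlinks[downlink]


# Without tracking: the external switch dies, but the blade's port stays up,
# so the host keeps sending traffic into a dead path.
untracked = InterconnectModule({"X1": True}, {"d1": True}, link_state_tracking=False)
untracked.set_uplink("X1", False)
print(untracked.host_sees_link_down("d1"))

# With tracking: the downlink is taken down too, so the host can fail over.
tracked = InterconnectModule({"X1": True}, {"d1": True}, link_state_tracking=True)
tracked.set_uplink("X1", False)
print(tracked.host_sees_link_down("d1"))
```

In the first case the host has no reason to fail over, because from its point of view the link never went down; in the second case the downlink failure gives the host's link-status detection something to react to.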

André
