Hi all,
I have a host cluster with about 20 ESXi hosts. HA and DRS are enabled.
If I put one host into maintenance mode and then reboot it, I get a network redundancy loss alert on three of my other hosts. The hosts are up and running with no failures and the VMs work fine, but I wonder why this happens. Any suggestions?
This alert usually pops up if you have just one uplink on your management network. To avoid it, you can configure the management network with two network cards and team them, either with both active or with one active and the other on standby.
If you already have teaming configured and are still getting the error, you can refer to this KB article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=100470...
Hi,
Thanks for the reply.
All hosts are connected with 2 NICs to the management port group on a vDSwitch. The teaming and failover setting is active/active.
Are the servers heartbeating each other, and is this the cause? I thought that the datastores are used for cluster heartbeating?
I only want to make sure that there is no misconfiguration in my environment that will get me into trouble in the future. But it sounds like this is normal behavior.
Hi,
Do the physical switches report any loss of link? I don't think this is by design, because I don't see this issue in any of my environments. Are both NICs added as active?
All hosts are added to the port group with two NICs, and yes, both are placed under active uplinks in the management port group. I don't understand why rebooting one host affects the others.
Do the physical switches report loss of link? Is it always the same machine reporting this?
I have to simulate this again while the networking guys keep an eye on it. I think it's always the same thing that happens: I reboot vm033, and vm031, vm035 and vm036 get the alert. On reboot of other hosts nothing happens.
I will try to reproduce and let you know.
Thanks, that would help. You could also monitor the vmkernel.log (tail -f /var/log/vmkernel.log via SSH) once you do this.
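To cut the log stream down to link-related lines, a filter along these lines might help. It is only a sketch: the function name link_events is made up, and the pattern assumes that link messages mention a vmnicN name together with the word "link", which depends on the NIC driver.

```shell
# Hypothetical helper: keep only log lines that mention both a vmnic and "link".
# The exact wording of link messages varies by NIC driver, so this pattern is a guess.
link_events() {
    grep -i -E 'vmnic[0-9]+.*link|link.*vmnic[0-9]+'
}
```

On the host it would be used as: tail -f /var/log/vmkernel.log | link_events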
Did you see the KB Article posted?
Hi,
Digging deeper, I found an interesting fact. It's not the management NIC that creates the alert, it's the iSCSI one!
iSCSI is also configured with 2 physical NICs on a vDS, but with 2 port groups, iscsi_A and iscsi_B. Each of the two uplinks has its own IP on the same subnet. On iscsi_A, dvUplink1 is active and dvUplink2 is unused; on iscsi_B it is the other way around.
So you have multipathing configured. I'm still curious to see if the networking guys see a physical link go down.
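The host side can be checked as well: esxcli network nic list prints one row per physical NIC, including its link state. Below is a small sketch that prints only the NICs whose link is down. It assumes the output is an aligned table with a header column titled "Link Status" and that awk is available in the host shell; down_nics is a made-up name.

```shell
# Hypothetical helper: read "esxcli network nic list" output on stdin and
# print the names of NICs whose Link Status column says Down.
down_nics() {
    awk '
        NR == 1 { pos = index($0, "Link Status"); next }  # column offset taken from the header
        pos && substr($0, pos, 4) == "Down" { print $1 }  # first field is the NIC name
    '
}
```

Usage would be: esxcli network nic list | down_nics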
OK, I now have the feedback from networking: the links went down at the moment I rebooted the host.
The links on OTHER hosts went down when you rebooted a different host?
That's what is confusing me.
I have vm020, vm021, vm022 ... up to vm041.
If I reboot vm035, I get the alert on vm029, vm031 and vm033.
Mysterious.
Check this together with the network team. That might be your issue:
OK, thank you very much. I will check with the networking division and mark your answer as the correct one if it works.