VMware Cloud Community
parkarnoor
Enthusiast

NIC Team fails in ESXi 5.5


Hi All,

I have a 3-node cluster running ESXi 5.5.

My network setup is described below. Each server has 4 NICs, and all 4 physical switch ports are trunks.

I created two virtual switches: vSwitch0 with 2 NICs (management and vMotion) and vSwitch1 with 2 NICs (both NICs are used for VMs in their respective production VLANs).

For the management network I chose vmnic0 as active and vmnic1 as standby; for the vMotion network, vmnic1 is active and vmnic0 is standby.

For vSwitch1, I added vmnic2 and vmnic3 as active/active.

Summary, including sample IPs:

vSwitch0 - management network (vMotion traffic unchecked) - e.g. 10.34.45.x (say VLAN 45) - vmnic0 active/vmnic1 standby - physical switch port is a trunk - management network allowed VLANs (all 4095).

               - vMotion network (management traffic unchecked) - private VLAN, e.g. 192.168.12.x (say VLAN 12) - vmnic1 active/vmnic0 standby - physical switch port is a trunk - vMotion network allowed VLANs (all 4095)

vSwitch1 is working fine (multiple port groups in separate VLANs). vmnic2 and vmnic3 are active/active and their ports are trunks, so no problems there.

But:

The above setup works fine as long as each NIC stays with its own network (i.e. vmnic0 for the management network and vmnic1 for vMotion). Now I am doing failover testing.

To test failover, I am not pulling the cable from vmnic0. Instead, on one of the hosts, I move vmnic0 to the unused adapters and move vmnic1 to the active adapters for the management network under vSwitch0. Ping to 10.34.45.x then breaks, so the failover fails: I am unable to ping over vmnic1. ESXi then automatically reverts the change, i.e. it makes vmnic0 active and vmnic1 standby again, and everything starts working again, with an error in vCenter.

Is there anything wrong in the configuration above, especially at the IP level? I had vmnic0 and vmnic1 set up as trunks at the physical switch level so that if vmnic0 fails, vmnic1 can reach the other VLAN, i.e. the management network VLAN. But it is not working.

Note: the screenshots below are taken from another thread, but I have the same setup.

[Screenshot: Capture1.JPG]

Management port group with vmnic0 = active, vmnic1 = standby

[Screenshot: Capture2.JPG]

vMotion port group with vmnic0 = standby, vmnic1 = active

[Screenshot: Capture3.JPG]

Project VMs with their respective VLANs; both vmnic2 and vmnic3 are active.

[Screenshot: Capture4.JPG]

Thanks

Noor parkar

UAE


a_p_
Leadership

First of all, please clarify "All 4 ports are trunk." What kind of trunk, and how many trunks? On Cisco, a trunk port is basically a tagged (802.1Q) port, whereas for other vendors a trunk means channeling ports (e.g. LACP).

André

a_p_
Leadership

... in case of channeling/link aggregation, please read VMware KB: Sample configuration of EtherChannel / Link Aggregation Control Protocol (LACP) with ESXi...

One of the statements there is: "Do not configure standby or unused uplinks with IP HASH load balancing."
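The rule quoted from the KB can be written down as a small check. This is only a sketch in Python, not something taken from the KB: the `TeamingPolicy` class, the `ip_hash_violations` helper and the policy labels ("iphash", "explicit") are names assumed here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TeamingPolicy:
    # Hypothetical model of a teaming policy; not a VMware API object.
    load_balancing: str
    active: list = field(default_factory=list)
    standby: list = field(default_factory=list)
    unused: list = field(default_factory=list)


def ip_hash_violations(policy):
    """Return the uplinks that break the quoted rule (empty list if none)."""
    if policy.load_balancing != "iphash":
        return []  # the rule only concerns IP hash load balancing
    return sorted(policy.standby + policy.unused)


# Explicit failover order with a standby uplink: the rule does not apply.
mgmt = TeamingPolicy("explicit", active=["vmnic0"], standby=["vmnic1"])
# IP hash with a standby uplink: vmnic1 is reported.
channel = TeamingPolicy("iphash", active=["vmnic0"], standby=["vmnic1"])

print(ip_hash_violations(mgmt))
print(ip_hash_violations(channel))
```

With plain 802.1Q trunks and an explicit failover order, as described in the first post, the check reports nothing.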


André

parkarnoor
Enthusiast

Hi,

That's right, it is a Cisco trunk port, i.e. a tagged (802.1Q) port.

As I said, each server has 4 NICs. All uplinks go to a Cisco switch where the ports are trunks: 4 NICs, 4 trunks.

So do you think my network IP configuration is fine?

Noor Parkar
a_p_
Leadership

In this case it might be the self-healing, which kicks in when you switch the vmnics at the same time. What you may try is to set both vmnics active in a first step, and then set the previously active vmnic to standby in a second step. Anyway, I'd suggest you test the configuration either by physically pulling a network cable or by simply disabling the physical switch port.
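To make the two-step change and the cable-pull test easier to follow, here is a rough Python model of an explicit failover order. It is a simplification under stated assumptions: `effective_uplink` is a hypothetical helper, link state is just a dictionary, and the self-healing rollback itself is not modelled.

```python
def effective_uplink(active, standby, link_up):
    """Pick the uplink an explicit failover order would use.

    Active uplinks are tried in order, then standby ones. Unused
    uplinks are simply left out of both lists.
    """
    for nic in list(active) + list(standby):
        if link_up.get(nic, False):
            return nic
    return None  # no uplink with link available


links = {"vmnic0": True, "vmnic1": True}

# Two-step change for the management port group.
print(effective_uplink(["vmnic0"], ["vmnic1"], links))    # original order
print(effective_uplink(["vmnic0", "vmnic1"], [], links))  # step 1: both active
print(effective_uplink(["vmnic1"], ["vmnic0"], links))    # step 2: vmnic0 to standby

# Cable pulled (or switch port disabled) on vmnic0, original order.
links["vmnic0"] = False
print(effective_uplink(["vmnic0"], ["vmnic1"], links))
```

In this model the management traffic only moves to vmnic1 in step 2 or when vmnic0 loses link, which is the point where the physical port behind vmnic1 has to carry the management VLAN.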

One more thing to check is that the physical switch ports have spanning-tree portfast trunk configured.

André

vmsatkum
Contributor

Hi everyone, I was facing the same issue too.

I found the problem was on the redundant physical switches: first, the inter-switch link between the physical switches was wrongly patched; second, the physical switches are from Huawei and the ports were not trunked; third, the physical switches were configured as active/standby.

vSwitch0

     NIC0 and NIC1 active/active

     PG01 Vmgmt           NIC0     Active (NIC0) / Standby (NIC1)  Teaming values (load balancing / failover detection / notify switches / failback): Use explicit failover order / Link status only / Yes / No

     PG02 vMotion         NIC1     Active (NIC1) / Standby (NIC0)  Teaming values (load balancing / failover detection / notify switches / failback): Use explicit failover order / Link status only / Yes / No

parkarnoor
Enthusiast

Hi,

I checked with the network administrator and he gave me the important information below.

The port for vmnic0 is trunked with layer 3 routing on the physical switch, and vmnic0 is in the server VLAN network.

vmnic1 is in a private range, i.e. 192.168.10.x, for vMotion. It is configured on the switch as layer 2.

So maybe, once vmnic0 is down, vmnic1 cannot reach the other VLAN.

According to the network administrator, vmnic1 (after failover) cannot communicate with the server VLAN network. Since vmnic1 is in a private IP range, he configured it as layer 2.

Another thing we can try is to remove vmnic1 from the private IP range, assign another VLAN to it instead, and then configure that as a normal VLAN with layer 3 routing enabled.

If there is anything you can understand from the above, let us know.

Thanks

Noor Parkar
a_p_
Leadership

The physical switch ports should both be configured as "Trunk" (802.1Q) ports, and allow the required VLANs for Management and vMotion. With such a physical configuration, and the appropriate VLAN-ID being set on the respective port-groups, Management as well as vMotion will be able to communicate on both physical ports.
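As a rough illustration of this requirement, the following Python sketch checks that every uplink in a port group's failover order sits on a trunk that allows that port group's VLAN. The VLAN IDs 45 and 12 are the sample values from the first post. The per-port allowed lists and the `missing_vlans` helper are assumptions made for the example, not the actual switch configuration.

```python
# Assumed (mis)configuration: each trunk only allows "its own" VLAN.
trunk_allowed = {
    "vmnic0": {45},  # management VLAN only
    "vmnic1": {12},  # vMotion VLAN only
}

# Port groups on vSwitch0: VLAN ID plus every uplink in the failover order.
port_groups = {
    "Management": {"vlan": 45, "uplinks": ["vmnic0", "vmnic1"]},
    "vMotion": {"vlan": 12, "uplinks": ["vmnic1", "vmnic0"]},
}


def missing_vlans(port_groups, trunk_allowed):
    """List (port group, uplink, VLAN) combinations the trunks do not carry."""
    problems = []
    for name, pg in port_groups.items():
        for nic in pg["uplinks"]:
            if pg["vlan"] not in trunk_allowed.get(nic, set()):
                problems.append((name, nic, pg["vlan"]))
    return problems


print(missing_vlans(port_groups, trunk_allowed))

# With both VLANs allowed on both trunks, nothing is reported.
fixed = {nic: {12, 45} for nic in trunk_allowed}
print(missing_vlans(port_groups, fixed))
```

Under the assumed misconfiguration, the check flags Management on vmnic1 and vMotion on vmnic0, i.e. exactly the paths that are only used after a failover.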

André

parkarnoor
Enthusiast

Yes, that's right.

This is my original configuration, but the network administrator did not configure it properly in the beginning, and I assumed it was correct.

I will be working on this.

Thanks once again.

Noor Parkar
parkarnoor
Enthusiast

Hi,

The problem is resolved now. NIC failover is working fine.

One last query about the NIC setup. For the vMotion (below) and management networks, I selected the failover order and other settings as shown below. Is this proper? These are VMkernel networks under vSwitch0.

[Screenshot: vmotion NIC.JPG]

And for vSwitch1, where a separate port group is created for each VLAN, I did not choose a failover order at all, since I am using NIC 3 and NIC 4 as active/active. Let me know if that is correct.

[Screenshot: different VLAN per port group.JPG]

Let me know if there is anything.

Noor Parkar
a_p_
Leadership

Sounds good to me.

André
