Can anyone tell me the best way to configure a vDS with LAGs? We seem to have some disagreement on how it should work and how it needs to be configured.
Currently we have 2 hosts, each with 4 NICs dedicated to the vDS for VM traffic. These NICs are in port channels on a Cisco switch stack. In the vDS we have assigned the 4 NICs from each host to the LAG, so the one LAG contains 8 NICs in total. From what I can see, this works and VM traffic seems fine. We did get a NIC flapping error on the switch, but we think this is due to having an incorrect load balancing option selected, as it should be "Route based on IP hash".
That aside, are these LAGs configured correctly? If not, how should they be configured? The other way we initially tried was to create a LAG for each ESXi host and assign only that host's 4 NICs to it. But as you can only have one active LAG in a port group, the other LAG ends up in the "Unused" section, which stops VMs on the other host from communicating because they can't use it. There is no way to make both active, so to me this approach will never work.
The config we have on the switch end is seemingly correct. We have 2 Cisco switches in the stack: ports 1 and 2 on both switches are dedicated to port channel 1 for all 4 NICs out of ESX1, ports 3 and 4 are dedicated to port channel 2 for all 4 NICs out of ESX2, and so on.
So with regard to the LAGs, should we be creating one single LAG and assigning all the NICs from each host to it, or individual LAGs per host with only that host's NICs assigned? If it's the latter, I must have missed something in the configuration, as I can't get that way to work, but the "all in" way I can.
Any ideas? What's the way we should be doing this?
Setting up LAGs is actually a pretty straightforward task.
First you create a LAG with the desired settings and the number of ports per ESXi host, then you assign the physical NICs from each host to the LAG uplink ports. The last step is to set the LAG as active on the dvSwitch and move all the other dvSwitch uplink ports to unused. The load balancing setting on the dvSwitch itself doesn't need to be set to anything special (I usually leave it at the default "Route based on originating Port ID"), because the dvSwitch itself sees only a single uplink - the LAG - and the aggregation is done at the LAG level.
So what's important is to match the load balancing method on the LAG with the physical switch, and to decide whether the LAG's LACP mode should be active or passive.
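To give a feel for what a hash-based load balancing method does, here is a rough Python sketch. It is a simplified model only (XOR of source and destination IP, modulo the number of LAG ports), not the exact algorithm that ESXi or the Cisco switch runs, and the function name and IP addresses are made up for illustration.

```python
import ipaddress

def pick_lag_port(src_ip: str, dst_ip: str, num_ports: int) -> int:
    """Simplified IP-hash model: XOR both addresses, then modulo the port count.

    Assumption: real implementations may hash other fields (MACs, TCP/UDP ports, VLAN).
    """
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    return (src ^ dst) % num_ports

# Hypothetical VM-to-server flows over a LAG with 4 ports per host
flows = [
    ("10.0.0.11", "10.0.0.50"),
    ("10.0.0.12", "10.0.0.50"),
    ("10.0.0.11", "10.0.0.51"),
]
for src, dst in flows:
    print(f"{src} -> {dst}: LAG port index {pick_lag_port(src, dst, 4)}")
```

The point of the sketch is that a given flow always maps to the same member port, while different flows spread across the members. The host and the switch each hash their own outbound traffic, so both ends have to treat the 4 links as one bundle.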
Yes, it was pretty easy to do, I agree.
It has been done as you said: create the LAG, define the number of ports per host and assign them to the LAG uplinks, then set it as active. So basically we have one LAG with 8 NICs assigned, 4 from each host, and each ESX uplink is assigned to a LAG port. Essentially, all ports from both ESX hosts are in one LAG.
What we don't do is create separate LAGs per host and assign them to the vDS, as the port groups can only use a single LAG as active. One of the LAGs would therefore be in the unused section, and a VM on that host would not be able to communicate. That's correct, isn't it?
Yes, that should be fine.
That is a single LAG configured with 4 uplink ports. The NICs from each participating host are assigned to those 4 uplink ports, and the LAG is the only "Active" member on the dvSwitch.
It's basically only the LAG's load balancing mode that needs to match the physical switch settings.
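If it helps to picture the layout, here is a small Python sketch of it as plain data with a sanity check. All names (LAG port names, vmnic numbers, Po1/Po2) are hypothetical examples, and the one-port-channel-per-host part is taken from the switch setup described in the original post.

```python
# Hypothetical names throughout: LAG ports, vmnics and port channels are examples only.
LAG_PORTS = ["lag1-0", "lag1-1", "lag1-2", "lag1-3"]  # one LAG, 4 uplink ports

# Each host maps its own 4 NICs onto the same 4 LAG uplink ports.
host_assignments = {
    "ESX1": {"lag1-0": "vmnic2", "lag1-1": "vmnic3", "lag1-2": "vmnic4", "lag1-3": "vmnic5"},
    "ESX2": {"lag1-0": "vmnic2", "lag1-1": "vmnic3", "lag1-2": "vmnic4", "lag1-3": "vmnic5"},
}

# Physical switch side: a separate port channel per host.
port_channels = {"ESX1": "Po1", "ESX2": "Po2"}

def check_layout(lag_ports, assignments, channels):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    for host, mapping in assignments.items():
        missing = [port for port in lag_ports if port not in mapping]
        if missing:
            problems.append(f"{host}: no NIC assigned to {', '.join(missing)}")
        nics = list(mapping.values())
        if len(set(nics)) != len(nics):
            problems.append(f"{host}: same NIC assigned to more than one LAG port")
        if host not in channels:
            problems.append(f"{host}: no port channel on the physical switch")
    if len(set(channels.values())) != len(channels):
        problems.append("two hosts share one port channel")
    return problems

print(check_layout(LAG_PORTS, host_assignments, port_channels) or "layout looks consistent")
```

In other words, the LAG on the dvSwitch acts as a per-host template of 4 ports, while on the switch each host still gets its own port channel.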
As mentioned earlier, load balancing on the dvSwitch itself may remain on the default setting.
You also want to check whether the hosts' NIC firmware and drivers are up to date, to rule out issues on that side.