VMware Cloud Community
TheVMinator
Expert

Choosing a virtual switch load balancing policy with a Cisco UCS or virtual I/O device

How does having a virtual I/O device affect my choice of a load balancing policy? For example, I have 2 vmnics for NFS. Normally I use load-based teaming to distribute load across two physical adapters. But now those physical adapters are virtual adapters: they map to specific physical ports based on settings in the Cisco UCS, Xsigo, or other virtual I/O device. Should I still use load-based teaming in ESXi, or let the virtual I/O device handle the load distribution?

Reply
0 Kudos
7 Replies
chriswahl
Virtuoso

For example, I have 2 vmnics for NFS. Normally I use load-based teaming to distribute load across two physical adapters.

I'm assuming you refer to an NFS implementation with multiple subnets? A single NFS session will only ever use a single vmkernel port and uplink.

That said, I typically go with "route based on physical NIC load" in Enterprise Plus scenarios, or stick with "route based on originating virtual port ID" at other license levels. I've also discussed the benefits of fabric pinning for vMotion (or other predominantly layer 2 traffic) here:

http://wahlnetwork.com/2013/01/15/why-you-should-pin-vmotion-port-groups-in-converged-environments/
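For anyone scripting this: the UI policy names above map to vSphere API identifiers roughly as follows. This is just a quick reference sketch in Python; verify the strings against the API documentation for your vSphere version before relying on them.

# Friendly vSphere UI names -> teaming policy identifiers used by the vSphere API
# (for example, in a distributed port group's uplink teaming policy).
TEAMING_POLICY_IDS = {
    "Route based on physical NIC load (LBT)": "loadbalance_loadbased",
    "Route based on originating virtual port": "loadbalance_srcid",
    "Route based on IP hash": "loadbalance_ip",
    "Route based on source MAC hash": "loadbalance_srcmac",
    "Use explicit failover order": "failover_explicit",
}

if __name__ == "__main__":
    for ui_name, api_id in TEAMING_POLICY_IDS.items():
        print(f"{ui_name:45} -> {api_id}")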

VCDX #104 (DCV, NV) ஃ WahlNetwork.com ஃ @ChrisWahl ஃ Author, Networking for VMware Administrators
TheVMinator
Expert

OK thanks.

- A single NFS session uses only one uplink, but when there are multiple NFS sessions, wouldn't load-based teaming allow the next session to choose the path with the most available bandwidth?

- Thanks for the great tip on pinning vMotion traffic.

- Regarding the place of the virtual I/O device: in this case I'm thinking more specifically about Xsigo than UCS, but here is where I'm not clear:

No matter which policy I choose (load-based teaming, route based on originating virtual port ID, route based on IP hash, etc.), VMware only uses that algorithm to select the virtual NIC that will carry the traffic out of the host. But the decision of which physical port the traffic leaves on still depends on which physical port that virtual NIC is mapped to within the Xsigo director. So all the configuration and tuning I do in VMware doesn't give the hypervisor visibility into how much traffic ends up crammed onto one physical Xsigo port.

For example, in a worst-case design I could have 10 vNICs mapped to one physical Xsigo Ethernet port, so that link gets saturated up to the physical switch, while one virtual NIC is mapped to a single Xsigo port with a 10Gb link all to itself that sits underutilized. I wouldn't do that, but you see the point: a bad design on the virtual I/O device can defeat a great virtual networking design in VMware. I'm trying to understand the interplay between the two configurations, and how to make sure the goals behind my original VMware virtual networking design are still being met once the virtual I/O device is introduced.
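To make that worst case concrete, here is a rough sketch with made-up mappings and traffic numbers (hypothetical, not from any real Xsigo configuration). The hypervisor only ever sees the per-vNIC load; the per-physical-port totals are only visible to the I/O director.

# Hypothetical Xsigo vNIC -> physical port mapping (the bad design described above).
vnic_to_phys = {
    "vnic0": "xsigo-port1", "vnic1": "xsigo-port1", "vnic2": "xsigo-port1",
    "vnic3": "xsigo-port1", "vnic4": "xsigo-port1",
    "vnic5": "xsigo-port2",  # one vNIC gets a 10Gb port all to itself
}

# Made-up per-vNIC load as seen by the hypervisor (Gbps).
vnic_load_gbps = {
    "vnic0": 2.5, "vnic1": 2.0, "vnic2": 2.5, "vnic3": 1.5, "vnic4": 2.0,
    "vnic5": 1.0,
}

# The per-physical-port view that only the I/O director has.
phys_load = {}
for vnic, port in vnic_to_phys.items():
    phys_load[port] = phys_load.get(port, 0.0) + vnic_load_gbps[vnic]

for port, gbps in sorted(phys_load.items()):
    print(f"{port}: {gbps:.1f} Gbps offered on a 10Gb link")
# xsigo-port1: 10.5 Gbps offered on a 10Gb link  <- saturated toward the switch
# xsigo-port2: 1.0 Gbps offered on a 10Gb link   <- mostly idle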

Reply
0 Kudos
chriswahl
Virtuoso

- A single NFS session uses only one uplink, but when there are multiple NFS sessions, wouldn't load-based teaming allow the next session to choose the path with the most available bandwidth?

LBT will balance vmkernel ports, not sessions. If you're using one subnet, they will all choose one vmkernel port.

http://wahlnetwork.com/2012/04/23/nfs-on-vsphere-technical-deep-dive-on-same-subnet-storage-traffic/

So all the configuration and tuning I do in VMware doesn't give the hypervisor visibility into how much traffic ends up crammed onto one physical Xsigo port.

Indeed. :)

VCDX #104 (DCV, NV) ஃ WahlNetwork.com ஃ @ChrisWahl ஃ Author, Networking for VMware Administrators
TheVMinator
Expert

We are using one NFS vmkernel port group with 2 dvUplinks, as in the example below:

http://blogs.vmware.com/performance/2010/12/vmware-load-based-teaming-lbt-performance.html

In that example, LBT seems to work just fine, with no mention of needing 2 subnets as far as I can tell.

"LBT will balance vmkernel ports, not sessions. If you're using one subnet, they will all choose one vmkernel port."

So with LBT, NFS traffic in our case all goes from the virtual machines through one vmkernel port group, and then a dvUplink is chosen before the traffic leaves for the physical switch. If dvUplink1 is saturated to 75%, for example, traffic can start to go through dvUplink2. It's using the same vmkernel port in either case. Even with one subnet, I'm still getting LBT to work, because dvUplink2 still gets used. Correct?

My vmkernel port is assigned the IP 192.168.1.10. My NFS target has IP 192.168.1.20. If the port group has 2 dvUplinks, it should still be able to use LBT to balance load between the 2 dvUplinks, correct? Are you saying this won't happen if there is only one subnet involved?

Reply
0 Kudos
chriswahl
Virtuoso

So with LBT, NFS traffic in our case all goes from the virtual machines through one vmkernel port group, and then a dvUplink is chosen before the traffic leaves for the physical switch. If dvUplink1 is saturated to 75%, for example, traffic can start to go through dvUplink2. It's using the same vmkernel port in either case. Even with one subnet, I'm still getting LBT to work, because dvUplink2 still gets used. Correct?

Sadly no. LBT would have to migrate the entire vmkernel port to the other uplink (you can see which uplink it is using with ESXTOP). If you have only one vmkernel port on a single subnet, it will never consume more than one associated uplink.

My vmkernel port is assigned the IP 192.168.1.10. My NFS target has IP 192.168.1.20. If the port group has 2 dvUplinks, it should still be able to use LBT to balance load between the 2 dvUplinks, correct? Are you saying this won't happen if there is only one subnet involved?

LBT cannot load balance a single object (in this case the vmkernel port) over multiple uplinks. This simply isn't possible.

I suggest multiple subnets based on how vSphere handles NFS traffic. NFS traffic will always find the first vmkernel port on the subnet that will reach the destination and use that one vmkernel port. Even if you had 10 vmkernel ports (192.168.1.10, .11, .12, .13, and so on) it would still only use one.

Here's an example from my lab. Even though there are a ton of vmk ports available, NFS only chooses vmk7 which is bound to uplink vmnic6. LBT is useless to help NFS in this situation.

Source: http://wahlnetwork.com/2012/04/23/nfs-on-vsphere-technical-deep-dive-on-same-subnet-storage-traffic/

http://wahlnetwork.com/wp-content/uploads/2012/04/nfs-lab-test1-esxtop.png
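If it helps, here is the same selection behavior in rough Python pseudocode. The IPs and port names are made up, and real ESXi consults its own vmknic and routing tables rather than anything this simple, but the outcome is the point: every NFS session to that target lands on one vmkernel port, and therefore on one uplink.

import ipaddress

# Hypothetical vmkernel ports, all on the same subnet as the NFS target.
vmk_ports = [
    ("vmk4", "192.168.1.10/24"),
    ("vmk5", "192.168.1.11/24"),
    ("vmk6", "192.168.1.12/24"),
    ("vmk7", "192.168.1.13/24"),
]
nfs_target = ipaddress.ip_address("192.168.1.20")

def pick_vmk(target, ports):
    """Roughly: take the first vmkernel port whose subnet reaches the target."""
    for name, cidr in ports:
        if target in ipaddress.ip_interface(cidr).network:
            return name
    return None  # in real life this falls back to the routing table

# Which port counts as "first" depends on how ESXi orders them internally, but
# whichever it is, every NFS session to 192.168.1.20 uses that one port.
print(pick_vmk(nfs_target, vmk_ports))  # -> vmk4 in this toy ordering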

VCDX #104 (DCV, NV) ஃ WahlNetwork.com ஃ @ChrisWahl ஃ Author, Networking for VMware Administrators
Reply
0 Kudos
TheVMinator
Expert

Regarding this exchange, I'm still not able to reconcile what you are saying with what I found on VMware's website. For example:

So with LBT, NFS traffic in our case all goes from the virtual machines through one vmkernel port group, and then a dvUplink is chosen before the traffic leaves for the physical switch. If dvUplink1 is saturated to 75%, for example, traffic can start to go through dvUplink2. It's using the same vmkernel port in either case. Even with one subnet, I'm still getting LBT to work, because dvUplink2 still gets used. Correct?
Sadly no. LBT would have to migrate the entire vmkernel port to the other uplink (you can see which uplink it is using with ESXTOP). If you have only one vmkernel port on a single subnet, it will never consume more than one associated uplink.

In this article:

http://blogs.vmware.com/performance/2010/12/vmware-load-based-teaming-lbt-performance.html

The author tests a single vDS port group with 2 dvUplinks. When one dvUplink hits the saturation threshold of 75% (and not before), he says the other uplink comes into use. Here is the specific quote I'm referring to:

We then reconfigured the vDS with two dvUplinks and a single DV Port Group to which all the vNICs of the VMs were mapped. The DV Port Group was configured with the LBT teaming policy. We used the default settings of LBT, which are primarily the wakeup period (30 seconds) and link saturation threshold (75%). Our goal was to evaluate the efficacy of the LBT policy in terms of load balancing and the added CPU cost, if any, when the same benchmark load of 30,000 SPECweb2005 support sessions was applied.

Before the start of the test, we noted that the traffic from all the VMs propagated through the first dvUplink. Note that the initial affiliation of the vNICs to the dvUplinks is made based on the hash of the virtual switch port IDs. To find the current affiliations of the vNICs to the dvUplinks, run the esxtop command and find the port-to-uplink mappings in the network screen. You can also use the “net-lbt” tool to find affiliations as well as to modify LBT settings.

Phase 1: Because all the virtual switch port IDs of the four VMs were hashed to the same dvUplink, only one of the dvUplinks was active. During this phase of the benchmark ramp-up, the total network traffic was below 7.5Gbps. Because the usage on the active dvUplink was lower than the saturation threshold, the second dvUplink remained unused.

Phase 2: The benchmark workload continued to ramp up and when the total network traffic exceeded 7.5Gbps (above the saturation threshold of 75% of link speed), LBT kicked in and dynamically remapped the port-to-uplink mapping of one of the vNIC ports from the saturated dvUplink1 to the unused dvUplink2. This resulted in dvUplink2 becoming active.  The usage on both the dvUplinks remained below the saturation threshold.

Phase 3: As the benchmark workload further ramped up and the total network traffic exceeded 10Gbps (7.5Gbps on dvUplink1 and 2.5Gbps on dvUplink2), LBT kicked in yet again, and dynamically changed port-to-uplink mapping of one of the three active vNIC ports currently mapped to the saturated dvUplink.

Phase 4: As the benchmark reached a steady state with the total network traffic exceeding little over 13Gbps, both the dvUplinks witnessed the same usage.


It seems that only one vmkernel port is using more than one dvUplink, once the saturation threshold has been reached and the wakeup period has passed. Is he wrong? Am I misunderstanding something?

Reply
0 Kudos
chriswahl
Virtuoso

I think I see where the confusion is stemming from.

So, LBT definitely looks at the dvUplink to determine saturation, and then finds any port groups set to use LBT in order to migrate VM vNICs and vmkernel ports around. I think we're solid on this.

The caveat is that the vmkernel port pushing NFS traffic is considered one object. If it pushes the dvUplink past 75% saturation, the only option LBT has is to move that vmkernel port to another dvUplink in the team. LBT cannot split the vmkernel port across two dvUplinks (such as 50% on dvUplink 1 and 25% on dvUplink 2). No single object will use two dvUplinks unless there is an EtherChannel or LACP involved (with IP hash).

So if you had something else causing >75% saturation, such as VM traffic, on a dvUplink that was also being used by a vmkernel port handling NFS, LBT might move the vmkernel port or it might move the VM vNIC depending on what would fix the saturation problem.

If the 2 dvUplinks were just doing NFS, then LBT would not migrate the vmkernel port, as it wouldn't fix anything (75% saturation on dvUplink 1 would simply become 75% saturation on dvUplink 2).

LBT is much more effective with VM vNICs because there are typically a lot of them. LBT can move a few VM vNICs off a saturated dvUplink easily. But remember: it is still moving whole vNICs from one dvUplink to another; the traffic of any single VM vNIC is never "split" between uplinks.
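If it helps to see that in (very) rough pseudocode, here is a toy Python model of the behavior. The names, numbers, and the move heuristic are made up for illustration; real LBT also factors in the 30-second wakeup period and both transmit and receive load, and its internal placement logic isn't published.

LINK_GBPS = 10.0
THRESHOLD = 0.75  # LBT's default saturation threshold

# Each object (a VM vNIC or a vmkernel port) lives on exactly one uplink.
# LBT may move a whole object; it can never split one object's traffic.
placement = {"vmk-nfs": "dvUplink1", "vm1-vnic": "dvUplink1", "vm2-vnic": "dvUplink1"}
load_gbps = {"vmk-nfs": 5.0, "vm1-vnic": 2.0, "vm2-vnic": 1.5}
uplinks = ["dvUplink1", "dvUplink2"]

def uplink_load(uplink):
    return sum(load_gbps[o] for o, u in placement.items() if u == uplink)

def lbt_pass():
    """One simplified balancing pass: move a whole object off a saturated uplink,
    but only if the move actually lands it somewhere under the threshold."""
    for uplink in uplinks:
        if uplink_load(uplink) / LINK_GBPS <= THRESHOLD:
            continue
        for obj in sorted((o for o, u in placement.items() if u == uplink),
                          key=lambda o: load_gbps[o]):
            for other in uplinks:
                if other != uplink and uplink_load(other) + load_gbps[obj] <= THRESHOLD * LINK_GBPS:
                    placement[obj] = other
                    return f"moved {obj} to {other}"
    return "no move"

# Mixed traffic on a saturated uplink: LBT has something it can usefully move.
print(lbt_pass())                            # moved vm2-vnic to dvUplink2
print({u: uplink_load(u) for u in uplinks})  # both uplinks now under 7.5 Gbps

# NFS-only case: one vmkernel port pushing 8 Gbps and nothing else to move.
placement = {"vmk-nfs": "dvUplink1"}
load_gbps = {"vmk-nfs": 8.0}
print(lbt_pass())  # "no move" -- shifting the whole port would just move the saturation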

VCDX #104 (DCV, NV) ஃ WahlNetwork.com ஃ @ChrisWahl ஃ Author, Networking for VMware Administrators
Reply
0 Kudos