VMware Cloud Community
mbx65555412
Contributor

Connect VMware vSAN dual 10G NIC nodes to Cisco Nexus 5548UP using vPC

Hello!

I have a 4-node (Intel S2600TPR) hybrid hyperconverged cluster connected to two Cisco Nexus 5548UP switches with L3 modules.

Each node has 2x1 Gbps NICs (Intel I350 onboard, currently not used) and 2x10 Gbps NICs (Intel 82599EB SFP+, connected to the Nexus switches).

On the Nexus switches, a vPC is configured for each vSAN host.

On the vSAN side, a Distributed Switch is used with an LACP LAG across the two uplinks.

The question is:

What are the configuration best practices for the Nexus switches and the Distributed Switch? Especially: LACP or not, MTU, and the load-balancing algorithm on both ends. Are there any special issues with traffic polarization and failover?

Many thanks for any help!

4 Replies
soebul
Contributor

Hi,

I have almost the same configuration, the difference being an L2 setup for the vSAN network, and I am looking for best practices or a working example from someone who has done this before with Cisco Nexus 5548UP switches (or similar).

Attached is a gist of the network topology we are using.

My questions concern the vSAN network (green in the diagram) specifically:

1. We are using Nexus switches and will be configuring vPC (virtual port channel) for the vSAN network. Is this supported for VMware vSAN hosts? What is the recommended option in this case?

2. Which load balancing or NIC teaming option is recommended for the ESXi hosts?

3. Our plan is also to use vSphere HA and to configure two isolation IP addresses (Nexus-3 and Nexus-4 will have these configured) so that during a host failure, VMs on that host are powered off and restarted on the remaining hosts. Is this the correct setup?

One thing we have observed so far with vPC configured on Nexus-3 and Nexus-4 (static EtherChannel) and the "Route based on IP hash" load-balancing method configured on the ESXi hosts: the connectivity does not remain stable, and we frequently see a vSAN cluster partition error even though the isolation IP addresses are reachable from each host. Any assistance in pointing out what we are missing would be really appreciated.

Thanks.

DCChrisMc
Enthusiast

Hi there.

It's quite common to split the 10 Gb NICs so that one is dedicated to vSAN and the other carries everything else.

Then set these as active/standby (failover) per port group, combined with Network I/O Control (NIOC).
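As a sketch of that layout (the port-group names below are just placeholders, and which uplink is active for which traffic class is your choice), the vDS teaming policies would look something like:

```
vDS with two 10 GbE uplinks: Uplink1, Uplink2

Portgroup "vSAN":              Active = Uplink1, Standby = Uplink2
Portgroup "Mgmt/vMotion/VM":   Active = Uplink2, Standby = Uplink1

NIOC: give vSAN higher shares, so it keeps priority if a failure
puts both traffic classes on the same remaining uplink.
```

With this design each uplink is a plain access/trunk port on the switch side, so no port channel or vPC is required for it to work.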

Best Practices for vSAN Networking

MTU is a difficult one. If you can do jumbo frames, do it; there are benefits. There was an excellent presentation by Cormac Hogan and Paudie O'Riordan going over this very topic:

https://videos.vmworld.com/global/2018/videoplayer/26709

However, if it's brownfield and you are worried about getting it done properly, it may not be worth the risk/heartache.
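If you do go jumbo, it has to be end to end (vDS, vmkernel ports, and the switch side), and it's worth validating the path from each host before trusting it. A quick check from an ESXi host shell; `vmk2` and the target IP are placeholders for your vSAN vmkernel interface and a peer host's vSAN IP:

```shell
# Send an 8972-byte payload (9000 minus 28 bytes of IP/ICMP headers)
# with the don't-fragment bit set, out the vSAN vmkernel interface.
# If this fails while a default-size ping works, some hop is not at 9000.
vmkping -I vmk2 -d -s 8972 <vSAN-IP-of-another-host>

# Confirm the vmkernel interface MTU actually took effect:
esxcfg-vmknic -l
```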

For load balancing it won't matter as much if you do the design above. But John Nicholson goes through the options better than I can:

Designing vSAN Networks - Using Multiple Interfaces? - Virtual Blocks

I would say don't mix your 10 Gbps and 1 Gbps links in the same vDS. Strange things happen.

DCChrisMc
Enthusiast

We are using Nexus switches and will be configuring vPC (virtual port channel) for the vSAN network. Is this supported for VMware vSAN hosts? What is the recommended option in this case?

Yes, it's quite common:

https://www.cisco.com/c/dam/en/us/products/collateral/switches/nexus-7000-series-switches/C07-572831...
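A minimal NX-OS sketch of what the host-facing side typically looks like; the port-channel/vPC numbers and VLANs below are made up, so check them against the Cisco design guide above:

```
! On BOTH Nexus 5548UP peers (vPC peer-link and keepalive already up).
! Note: on the 5500 platform, jumbo MTU is enabled via a system
! network-qos policy, not per interface.
interface port-channel11
  description vSAN-host-1
  switchport mode trunk
  switchport trunk allowed vlan 10,20,30   ! mgmt/vSAN/VM VLANs - placeholders
  spanning-tree port type edge trunk
  vpc 11

interface Ethernet1/1
  description vSAN-host-1 vmnic
  channel-group 11 mode active             ! LACP; "mode on" would be a static channel
```

The `mode active` / static distinction matters: if the switch side runs a static EtherChannel but the vDS side is an LACP LAG (or vice versa), you get exactly the kind of flapping connectivity you describe.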

One thing we have observed so far with vPC configured on Nexus-3 and Nexus-4 (static EtherChannel) and the "Route based on IP hash" load-balancing method configured on the ESXi hosts: the connectivity does not remain stable, and we frequently see a vSAN cluster partition error even though the isolation IP addresses are reachable from each host. Any assistance in pointing out what we are missing would be really appreciated.

What version of ESXi/vSAN are you using?

Getting multicast to work across different switches can be a pain. If your entire cluster is on vSAN 6.6 or above it should be using unicast, so it will be something else that's wrong.
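A quick way to confirm unicast mode from an ESXi host shell (exact output fields vary a bit by build):

```shell
# Lists the unicast agents (peer hosts) this node talks to;
# a populated list means the cluster is running in unicast mode (vSAN 6.6+).
esxcli vsan cluster unicastagent list

# General cluster membership view - useful when chasing partition errors,
# since it shows which hosts this node currently sees as members.
esxcli vsan cluster get
```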

If you are using an older vSAN, have you tried troubleshooting multicast when you have issues?

Virtual SAN Troubleshooting: Multicast - VMware vSphere Blog

3. Our plan is also to use vSphere HA and to configure two isolation IP addresses (Nexus-3 and Nexus-4 will have these configured) so that during a host failure, VMs on that host are powered off and restarted on the remaining hosts. Is this the correct setup?

VMware HA works a bit differently with vSAN:

http://www.yellow-bricks.com/2017/11/08/vsphere-ha-heartbeat-datastores-isolation-address-vsan/

Duncan has some clear guidelines for you to follow.
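In practice that usually means pointing HA's isolation check at the vSAN network instead of the default management gateway. These are real vSphere HA advanced options; the addresses are placeholders for pingable IPs on your vSAN subnet (e.g. SVIs on Nexus-3 and Nexus-4):

```
# vSphere HA cluster advanced options (set under Cluster > Configure > vSphere HA)
das.usedefaultisolationaddress = false
das.isolationaddress0 = <vSAN-subnet IP on Nexus-3>
das.isolationaddress1 = <vSAN-subnet IP on Nexus-4>
```

The point Duncan makes is that isolation should be judged on the network vSAN actually uses, because that is the network HA datastore heartbeating cannot fall back on.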

soebul
Contributor

Thanks DCChrisMc for all the valuable references provided, I will definitely go through each of these.

Thanks for pointing out the jumbo frame support. We can definitely enable jumbo frames for the vSAN network, but we need to stay consistent with the rest of the network configuration. So I will discuss with my peers whether the performance benefits outweigh the management/administration complexity involved.

By the way, we are using ESXi/vSAN version 6.7.0 U1.

Thanks.
