I am having a difficult time understanding how to set up fully redundant networking from an ESXi host to multiple NFS storage devices. Please see the attached image for this question.
Goal: Be able to lose any single Ethernet link OR any single switch without interruption to the NFS datastores.
- I want to leave the switches unstacked (so I can't use LACP across them)
- The NASes support ALB and static XOR
- Scenario A is ideal, where the switches are totally independent and not connected
- Using the default Route Based on Originating Virtual Port policy, traffic will only flow out of vmnic1 OR vmnic2, but not both at the same time (since I only have one VMkernel port). This means traffic to all NASes will use only vmnic1 or only vmnic2 at any given time.
- Both switches are Cisco 10G
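To illustrate the uplink pinning described in the list above, here is a minimal Python sketch. It is a simplified model and not VMware's actual algorithm: the modulo mapping, the port ID of 0, and the names `pick_uplink` and `VMK_PORT_ID` are assumptions made up for illustration. It only shows that a single VMkernel port ends up on one uplink, and that the choice changes only when that uplink's own link goes down.

```python
# Simplified model of "Route Based on Originating Virtual Port".
# Assumption: the real ESXi mapping is not necessarily a plain modulo;
# this only illustrates that one virtual port is pinned to one uplink.

UPLINKS = ["vmnic1", "vmnic2"]
VMK_PORT_ID = 0  # hypothetical virtual port ID of the single VMkernel port


def pick_uplink(port_id, uplinks, link_up):
    """Return the uplink a virtual port is pinned to, or None if all are down.

    Only uplinks whose own physical link is up are candidates, which mirrors
    link-status failure detection on the host side.
    """
    candidates = [u for u in uplinks if link_up[u]]
    if not candidates:
        return None
    return candidates[port_id % len(candidates)]


def uplink_per_nas(nas_names, link_up):
    """Every NAS is reached through the same VMkernel port, so the same uplink."""
    chosen = pick_uplink(VMK_PORT_ID, UPLINKS, link_up)
    return {nas: chosen for nas in nas_names}


if __name__ == "__main__":
    all_up = {"vmnic1": True, "vmnic2": True}
    print(uplink_per_nas(["NAS1", "NAS2"], all_up))
```

With both links up, every NAS maps to the same uplink, so the second 10G link carries no NFS traffic until the first one fails.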
Not being able to use both vmnic1 and vmnic2 at the same time seems to be a big issue for this scenario. The second hurdle is that VMware can't "see" when one of the two links on a NAS becomes disconnected. It can only detect the state of its own links to the switches.
In scenario A, if all traffic is using vmnic1 and link 3 dies (but switch1 and link 1 are fine), then VMware will lose connection to NAS1 without failover, even though there is a physical path from NAS1 back to the host.
In scenario B, the NAS vendor has recommended against linking the switches together (link 7) due to unknown switch behavior in the event of a link failure. But I'm not seeing a way to avoid this...
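To make the failure case in the two scenarios concrete, here is a small Python sketch that models the topology as a graph and checks whether a NAS is still reachable from the uplink the host is actually using. The wiring is partly assumed: only link 1 (vmnic1 to switch1), link 3 (NAS1 to switch1) and link 7 (switch to switch) are described in the text, so links 2, 4, 5 and 6 are guesses at what the image shows. The model also assumes the host fails over only when its own link is down, that only switches forward frames, and that the NAS answers on whichever of its ports is still up.

```python
from collections import deque

# Assumed wiring. Links 1, 3 and 7 follow the text; the others are guesses.
LINKS = {
    "link1": ("vmnic1", "switch1"),
    "link2": ("vmnic2", "switch2"),   # assumed
    "link3": ("NAS1", "switch1"),
    "link4": ("NAS1", "switch2"),     # assumed
    "link5": ("NAS2", "switch1"),     # assumed
    "link6": ("NAS2", "switch2"),     # assumed
    "link7": ("switch1", "switch2"),  # inter-switch link, scenario B only
}
HOST_UPLINKS = {"vmnic1": "link1", "vmnic2": "link2"}


def active_uplink(failed):
    """Host uses the first uplink whose own link is up (link-status detection)."""
    for nic, link in HOST_UPLINKS.items():
        if link not in failed:
            return nic
    return None


def nas_reachable(nas, failed, with_link7):
    """True if `nas` can be reached from the host's active uplink."""
    failed = set(failed)
    if not with_link7:
        failed.add("link7")  # scenario A: the switches are not connected
    src = active_uplink(failed)
    if src is None:
        return False
    adjacency = {}
    for name, (a, b) in LINKS.items():
        if name in failed:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == nas:
            return True
        # Only the source NIC and the switches pass traffic onward.
        if node != src and not node.startswith("switch"):
            continue
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


if __name__ == "__main__":
    print("A, link 3 down:", nas_reachable("NAS1", {"link3"}, with_link7=False))
    print("B, link 3 down:", nas_reachable("NAS1", {"link3"}, with_link7=True))
```

Under these assumptions, the scenario A call returns False and the scenario B call returns True: with link 3 down and the host still pinned to vmnic1, NAS1 is only reachable if traffic can cross from switch1 to switch2.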
- How would scenario A be possible where I could lose links on the NASes and still have redundant connections? Or is link 7 really required?
- I don't believe adding a second VMK to the vSwitch and assigning each VMK to each vmnic would be beneficial in any way (because I believe VMware just picks one of the two VMKs only)
- I don't think I can use Beacon Probing because that will not detect NAS link disconnects (and really needs a different physical topology). Right? I don't think you can specify a list of targets to probe/test.
- Would a different load balancing algorithm be better than the default? Ideally I would like to team the NICs without any switch configuration so that I could use 2x the bandwidth, but I don't see any option that allows that or offers a clear advantage.
- Would NFS multi-pathing (included in 4.1) be a solution here?
Any insight into how to achieve the goal would be GREATLY appreciated!
Attachment: failover-scenarios.png (75.8 K)