We have two Dell MD3660i iSCSI SANs.
We have a dedicated iSCSI storage network (physically isolated 10GbE switches).
We have 10 ESXi hosts in a single cluster, each with one dual-port 10GbE NIC dedicated to storage networking.
The Dell SANs each have two controllers, and each controller has one dual-port 10GbE NIC. Dell insists that to utilise both 10GbE ports on each controller efficiently, each port on each controller has to be on its own subnet. For this purpose I can probably use two subnets per SAN, distributed between the controllers.
So we end up with 4 subnets and 4 VLANs on the storage network.
I immediately have a problem, as there should be a 1:1 ratio of vmk to physical NIC and subnet on ESXi, and I only have two physical 10GbE ports in each host.
Is it safe to break the 1:1 ratio rule? Are there any workarounds (apart from adding more cards)?
Assuming I have set this all up somehow, breaking the 1:1 ratio rule, and I go to add the vmks to the iSCSI software adapter: how does the iSCSI multipathing know which SAN is which when a link goes down?
There's a vDS design where you don't bind the iSCSI vmks to a physical NIC: create the 4 iSCSI vmks, make both pNICs active for all of them, and let NIOC control them. I haven't really played around with this configuration, so I couldn't tell you how well it works.
VMware's multipathing drivers handle detecting that different paths/addresses hit the same LUN. This is a pretty good step-by-step on how I would do it (from Synology, but applicable):
Edit: just realized the problem — you'll need two VMkernel ports per adapter, since you have 4 subnets and 2 ports. You can't do this with multipathing in VMware, because you can't add an adapter that has more than one VMkernel port to the network configuration tab of the iSCSI adapter settings, as shown in the guide above.
You don't have to use multipathing, but then you'll only be able to point at one controller port on one controller of the SAN.
The use of multiple subnets makes little sense if the subnets all go through the same switching hardware and hit the same targets. I would put everything on the same subnet, or use two subnets with port A on both controllers in subnet 1 and port B on both controllers in subnet 2. Sticking with 4 subnets means 4 NICs in each host.
Thanks for the reply.
Yes, the issue is that Dell insists it is a requirement to have a different subnet on each port of each controller, and I am not sure how wise it is to use the same subnet/VLAN for two separate SANs. I think it may be best practice to keep the iSCSI traffic for the two separate SANs on separate subnets/VLANs.
You are going to need 2 more 10GbE ports per host in that case. It's the only way to retain multipathing, load balancing and fault tolerance.
There is a way to get it to work, but you'll lose fault tolerance and load balancing. Make a vSwitch with both 10GbE uplinks connected as trunk ports, then create as many VMkernel ports as you need on that switch, with different IPs tagged with different VLANs. Don't configure anything in the iSCSI adapter settings except the targets; VMware automatically finds the iSCSI targets via the connected VMkernels. Since you have 2 links to the vSwitch you'll be able to withstand a port or switch loss, but you won't fail over if a SAN controller goes down, unless it has a virtual IP you can connect to.
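A rough sketch of that layout in esxcli. All names, VLAN IDs and addresses below (vSwitch-iSCSI, vmnic2/vmnic3, vmk2, vmhba33, 192.168.x.x) are made up for illustration, and these commands only run on an ESXi host, so treat this as a sketch rather than a recipe:

```shell
# vSwitch with both 10GbE uplinks (the physical switch ports are trunked).
esxcli network vswitch standard add --vswitch-name=vSwitch-iSCSI
esxcli network vswitch standard uplink add --uplink-name=vmnic2 --vswitch-name=vSwitch-iSCSI
esxcli network vswitch standard uplink add --uplink-name=vmnic3 --vswitch-name=vSwitch-iSCSI

# One VLAN-tagged portgroup and VMkernel port per subnet.
esxcli network vswitch standard portgroup add --portgroup-name=iSCSI-A --vswitch-name=vSwitch-iSCSI
esxcli network vswitch standard portgroup set --portgroup-name=iSCSI-A --vlan-id=101
esxcli network ip interface add --interface-name=vmk2 --portgroup-name=iSCSI-A
esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=192.168.101.11 --netmask=255.255.255.0 --type=static
# ...repeat for iSCSI-B/C/D (vmk3-vmk5, VLANs 102-104)...

# Only add the targets; no port binding on the software iSCSI adapter.
esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=192.168.101.50:3260
```

The key difference from a port-binding setup is the last step: the vmks are never bound to the adapter, so path selection falls back to routing-table lookups rather than MPIO.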
I did it this way with LeftHands back in the 3.5/4.0 days, before multipathing was really an option. It works, but it's no longer recommended.
Yes, I am going to contact Dell to see if I can use just two subnets/VLANs: one for SAN1 and one for SAN2. That way I can keep the 1:1 vmk-to-pNIC ratio, declare the vmks in the iSCSI software adapter, and effectively have proper MPIO as you suggest. The disadvantage may be that, according to Dell (although this is not yet clear in my mind or in the documentation), if you do not assign a separate subnet to each port then multiple ports will not be utilised for IO on each controller. As the ports are 10GbE this may not be an issue anyway; perhaps working MPIO is more important than a fat 2x10GbE pipe.
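If Dell agrees to the two-subnet layout, the 1:1 binding could look something like this in esxcli. Every name here (iSCSI-1/iSCSI-2, vmnic2/vmnic3, vmk2/vmk3, vmhba33) is hypothetical, and it assumes the two portgroups already exist with one VMkernel port each:

```shell
# Override NIC teaming so each iSCSI portgroup has exactly ONE active uplink;
# this satisfies the 1:1 vmk-to-pNIC requirement for port binding.
esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-1 --active-uplinks=vmnic2
esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-2 --active-uplinks=vmnic3

# Bind both VMkernel ports to the software iSCSI adapter to enable MPIO.
esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2
esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk3
```

With both vmks bound, the Path Selection Policy (e.g. Round Robin) decides how IO is spread across the paths, which is where the MPIO load balancing and failover actually come from.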
Just a thought I had: do the subnets have to be isolated? If you enabled routing, you could have different subnets that can still all talk to each other. If your 10GbE switches are layer 3, it shouldn't add any noticeable overhead.
I think routing will not work. According to Dell:
"Port binding requires that all target ports of the storage array must reside on the same broadcast domain as the VMkernel ports because routing is not supported with port binding. See VMware KB #2017084"
That KB article says:
Port binding requires that all target ports of the storage array must reside on the same broadcast domain as the vmkernel ports because routing is not supported with port binding.
This basically means VMware does not support what Dell is telling you to do: with port binding, VMware says all the storage needs to be on the same subnet. They even give an example in that KB where using different subnets is the problem.
At one of my engagements they were using a Dell MD3200, which had a similar issue. What the network team recommended was 1 VLAN and one /24 subnet on the switches/routers; for the ESXi hosts and the SAN, the /24 was broken down into /26s.
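To make the /26 split concrete: a /24 carves into four /26 blocks of 64 addresses each. Using a made-up 10.10.10.0/24, the blocks start at .0, .64, .128 and .192. From an ESXi host you can sanity-check that each VMkernel port reaches the SAN port in its own /26 (vmk numbers and target addresses below are illustrative):

```shell
# Force the ping out of a specific VMkernel interface to test each /26.
vmkping -I vmk2 10.10.10.20    # vmk2 lives in 10.10.10.0/26
vmkping -I vmk3 10.10.10.84    # vmk3 lives in 10.10.10.64/26
```

Because the blocks are still inside one broadcast VLAN in this design, this mainly verifies addressing and netmasks rather than routing.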