VMware Cloud Community
EhsanRajabi
Contributor

Multiple ESXi and Multiple SAN Storage

Hello everyone

I've searched a lot of PDFs and forums for an answer to a really important question, but unfortunately found nothing.

Everyone knows we can meet HA requirements at the host level using shared storage together with the HA mechanism and vMotion. I mean we can add extra hosts and migrate VMs manually or automatically between them. We only need shared storage (SAN, NAS).

Main question:

How can we tolerate a SAN failure? We definitely need another SAN storage system. But could you tell me how to configure the SANs and ESXi hosts to work together? I mean, when a host is working with a VMFS volume on a SAN and that SAN fails, how does the other SAN continue serving the host without any delay or data loss?

Is any special configuration needed on the ESXi host (other than multipathing), or must all the configuration be implemented on the SAN storage?

It would be great if you could send me a configuration guide for both the SAN and the ESXi host.

15 Replies
christianZ
Champion

Hello,

some quick information here.

In production you should use redundant storage, meaning at least 2 controllers/nodes and RAID-based arrays, where both controllers have shared access.

When one controller fails, the other takes over LUN/volume management, so your hosts don't lose access to your LUNs.

When a disk fails, the RAID controller (e.g. RAID level 5 or 6) can rebuild the data of the failed disk onto a replacement or hot spare.

You of course need a redundant SAN connection, i.e. 2 Fibre Channel or Ethernet (for iSCSI or NFS) switches. There are some SAS storage systems with redundant direct connections, so you don't need any switches there.

And if you have 2 or more hosts with 2 or more HBAs each, then you are basically safe (on the storage side).

The next level of storage protection would be storage mirroring across 2 sites. There are some vendors on the market with such a solution.

Reg

Chris

hussainbte
Expert

There is no such failover mechanism at the SAN level where, if one array fails, the other takes over the VMs.

1) The storage array itself has multiple storage processors for redundancy.

2) Multiple paths are configured through the FC/iSCSI switches to make sure there is no single point of failure.
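
To make the "no single point of failure" idea in points 1 and 2 concrete, here is a minimal Python sketch. It is not tied to any vendor tool; the component names (hba0, switchA, spA, ...) and the cabling are hypothetical and only serve as an illustration. It enumerates HBA -> switch -> storage processor paths and reports any component whose failure alone would cut the host off.

```python
from itertools import product

# Hypothetical topology: every name here is illustrative, not from a real array.
HBAS = ["hba0", "hba1"]
SWITCHES = ["switchA", "switchB"]
CONTROLLERS = ["spA", "spB"]

# Cabling assumption: hba0 -> switchA, hba1 -> switchB, and each switch
# is connected to both storage processors.
LINKS = {
    ("hba0", "switchA"), ("hba1", "switchB"),
    ("switchA", "spA"), ("switchA", "spB"),
    ("switchB", "spA"), ("switchB", "spB"),
}

def paths(links, failed=frozenset()):
    """Return every HBA -> switch -> controller path that avoids failed parts."""
    return [
        (h, s, c)
        for h, s, c in product(HBAS, SWITCHES, CONTROLLERS)
        if (h, s) in links and (s, c) in links
        and not {h, s, c} & failed
    ]

def single_points_of_failure(links):
    """List components whose loss alone leaves the host with no path."""
    components = HBAS + SWITCHES + CONTROLLERS
    return [c for c in components if not paths(links, frozenset({c}))]
```

With the cabling above, `single_points_of_failure(LINKS)` returns an empty list. Note that the sketch only models components in front of and inside one array; the array as a whole is not modeled, and that is exactly the failure this thread is asking about.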

If you found my answers useful, please consider marking them as Correct or Helpful. Regards, Hussain https://virtualcubes.wordpress.com/
EhsanRajabi
Contributor

Thanks for replying.

As you said, nowadays all storage systems have 2 SPs (storage processors), and the arrays work with both of them simultaneously. I mean, if one SP fails, the other one handles the load. We only need more than one path so the host can reach the second SP.

But imagine the whole storage system fails! Then it makes no difference whether it's SP1 or SP2. What do you do? Here we need another SAN storage system. Can you explain the situation and the configuration approach to me?

EhsanRajabi
Contributor

Thank you for replying.

I agree with you for a less complex design with 2 hosts and 1 SAN storage system.

But what should we do when the whole SAN storage system goes down?

Then we need 2 hosts and 2 SAN storage systems. Could you help me design an architecture for this situation?

a_p_
Leadership

There are storage solutions - e.g. HPE's Lefthand, or EMC's VPLEX to name some of them - which provide such capabilities. In case of a single storage node failure, the failover occurs more or less transparently to the ESXi hosts, and there's no downtime for the VMs. Once the failed storage node comes back to life again, the data is synchronized from the surviving nodes.
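
As a rough illustration of the behaviour described above (transparent failover, then resynchronization from the surviving node), here is a toy Python model of a two-node synchronous mirror. It is only a sketch under simplifying assumptions; the class names `Node` and `MirroredVolume` are made up for this example and do not reflect how LeftHand, VPLEX or any other product is implemented.

```python
class Node:
    """One storage node holding a block map (block number -> data)."""
    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.online = True

class MirroredVolume:
    """Toy synchronous mirror across two nodes (illustrative only)."""
    def __init__(self, a, b):
        self.nodes = [a, b]
        self.dirty = set()  # blocks written while a node was offline

    def write(self, block, data):
        online = [n for n in self.nodes if n.online]
        if not online:
            raise IOError("no storage node available")
        for n in online:
            n.blocks[block] = data
        if len(online) < len(self.nodes):
            self.dirty.add(block)  # remember what the failed node missed
        return True  # acknowledged only after all online nodes hold the data

    def read(self, block):
        # Serve the read from the first node that is still online.
        for n in self.nodes:
            if n.online:
                return n.blocks[block]
        raise IOError("no storage node available")

    def resync(self, returning):
        """Copy blocks missed during the outage from the survivor."""
        survivor = next(n for n in self.nodes if n is not returning and n.online)
        for block in self.dirty:
            returning.blocks[block] = survivor.blocks[block]
        self.dirty.clear()
        returning.online = True
```

From the host's point of view, reads and writes keep succeeding while one node is offline; the returning node is only marked online again once it has caught up on the blocks it missed.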

André

Linjo
Leadership

That is one of the reasons why hyper-converged solutions are so hot right now: you basically eliminate the need for, and the complexity of, a SAN altogether, and all the different failure scenarios are much easier to deal with. You also reduce your opex in most cases.

Disclaimer: I work for a hyper-converged vendor.

// Linjo

Best regards, Linjo Please follow me on twitter: @viewgeek If you find this information useful, please award points for "correct" or "helpful".
Teddy092
Enthusiast

Hello EhsanRajabi

It is very unlikely to see a whole storage array go down. An SP could fail, or an SPS, a drive and so on... but the whole storage system? That is very rare, especially nowadays.
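
To put "very rare" into numbers, here is a small back-of-the-envelope Python sketch. The availability figure used in the usage note is purely hypothetical and not taken from any vendor datasheet, and the mirrored case assumes the two arrays fail independently, which is optimistic if they share a room, power feed or firmware bug.

```python
MINUTES_PER_YEAR = 365 * 24 * 60

def downtime_minutes(availability):
    """Expected downtime per year for a given availability (0.0 to 1.0)."""
    return (1 - availability) * MINUTES_PER_YEAR

def mirrored_availability(a, b):
    """Availability of two mirrored arrays, assuming independent failures."""
    return 1 - (1 - a) * (1 - b)
```

For example, a hypothetical single array at 99.99% availability gives `downtime_minutes(0.9999)` of roughly 52.6 minutes per year, while `mirrored_availability(0.9999, 0.9999)` comes out at about 0.99999999. Whether that gain justifies a second array for 2 hosts is a cost question.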

I've never seen or designed an architecture with 2 storage arrays in a production environment just for 2 hosts. Do you have a DR site, and if so, a SAN array and hosts at it?

Otherwise, as a.p. suggested, there are virtualized storage systems that can help, but they cost a lot! Or you could go for an HCI (hyper-converged infrastructure) solution; these do a great job and are a better option from my point of view.

If you found this or any other answer useful please consider the use of the Helpful or Correct buttons. Many thanks.
kabanossi
Enthusiast

Usually there is no "storage failover" or "vMotion" going on.

When you have multiple SAN boxes, or you are using a virtual SAN, the storage is presented to the ESXi host(s) as iSCSI targets.

The only thing that needs to be done on the ESXi side is to discover the targets and add them as datastores.

The "storage failover" happens on the SAN/virtual SAN side, and nothing needs to be configured on ESXi for it.

Magneet
Hot Shot

This is why stretched or metro clusters exist: if one side dies, the other takes over. See http://www.virtualizationsoftware.com/vmware-metro-storage-cluster-part-1/ for some explanation. You can of course also do this within one DC.

EhsanRajabi
Contributor

Hello

Thanks for replying

Could you explain more, please?

EhsanRajabi
Contributor

Hello

Thanks for replying

No, I'm not going to design a DR site right now, but we can think about it in the future. This is just for the local site. We'll have about 15-20 VMs, some of which will be mission critical, so 2 hosts will share the load and work together in an HA cluster. If one host fails, the second one takes over (no risk). But storage is also important. One storage system may fail, and so on.

How would you design this virtualization scenario with an appropriate HA level?

EhsanRajabi
Contributor

Hello dear kabanossi

Thanks for replying

Yes, when we have multiple SAN systems, we can present LUNs as datastores, but imagine the storage that hosts LUN X fails. Then all VMs on it will be lost, because multiple SANs work separately and are not in sync.

Is there any solution to sync datastores? If so, your solution works.

Magneet
Hot Shot

With that number of VMs I would start thinking about hyperconverged. For ROBO locations we use a 3-node 2U Nutanix block, and that can handle the loss of 1 node. Hyperconverged means you have nodes with their own built-in storage. Not sure if Nutanix does 2-node clusters, but SimpliVity does, and both have built-in backup capabilities. vSAN also needs 3 nodes (or 2 nodes plus a witness) but would work perfectly for you, and depending on your license it might already be included!
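
The reason node counts like 3 (or 2 plus a witness) keep coming up is majority voting. Below is a generic Python sketch of a strict-majority quorum rule; it illustrates the general idea only and is not the actual algorithm used by Nutanix, SimpliVity or vSAN. The function names are made up for this example.

```python
def has_quorum(votes_alive: int, votes_total: int) -> bool:
    """A cluster keeps running only while a strict majority of votes is alive."""
    return votes_alive > votes_total // 2

def tolerated_failures(votes_total: int) -> int:
    """How many voting members can be lost while a majority still remains."""
    return (votes_total - 1) // 2
```

With 3 votes, losing 1 still leaves a majority (`has_quorum(2, 3)` is true). With only 2 votes, losing 1 does not (`has_quorum(1, 2)` is false), which is why 2-node designs typically add a third vote in the form of a witness.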

kabanossi
Enthusiast

There are many virtual SAN solutions that keep your data synchronized all the time.

For example, StarWind Virtual SAN.

If one of your hosts (storage boxes) goes offline, the VMs will just continue to run, because they still have access to the storage through the synchronous partner host (storage box).

If you are now thinking about the hyper-converged solutions out there, StarWind also has one that can be built with only 2 nodes, and no switching is required for it.

Teddy092
Enthusiast

Hi EhsanRajabi,

If you have multiple storage arrays in a SAN, then the best option for you would be a virtualized storage system.

There is vSAN from VMware, but also DataCore SANsymphony-V, which is a very good product too.

Check this out: DataCore SANsymphony-V
