VMware Cloud Community
eoinkim
Contributor

Infrastructure design question

Hi all,

I'd like to get some advice on infrastructure design. My company is thinking of moving from Xen to an ESXi environment, and I am currently drawing up an infrastructure design. The following diagram shows the physical connections between two data centres.

layer1.png

My idea is:

  1. VMs should be able to be live migrated if required, including their internet connectivity. For example, a VM running on ESXi1 in DC A has a public IP address. If it moves to DC B, I shouldn't need to change its IP address at all, and internet connectivity should still work.
  2. A separate network for VM migration and VM storage (Red - 10G). All other traffic, e.g. private network traffic between VMs and cluster traffic between ESXi hosts, uses a different network (Green - not 10G).
  3. VMs can be distributed across the ESXi hosts in both data centres, but their disks are all on one storage system (Storage 1). Each VM's backup is stored on the storage in the other data centre (Storage 2), or the whole storage system itself is replicated.

I was thinking of those three things as the main factors so far. My questions are:

  1. Assuming that all routing is sorted, will the above design work? I've tested it with another product (Proxmox) internally (not in a DC) and confirmed that the concept works. So I guess it would in VMware as well?
  2. If this is going to work, should I make a single cluster across the data centres? If not, any recommendations? My previous test with Proxmox used one cluster.
  3. I am a bit worried about #2 from my idea. Will the green segment handle all the other traffic without being saturated? I am thinking of using 1G.
  4. Regarding #3 in my idea, if I am going to use ZFS for storage, what would be better for backup? Individual VM backups on the other storage (I guess it doesn't need to be ZFS), or ZFS replication? Of course, minimal downtime when a disaster happens is better, but it is not a must. Also, I guess iSCSI on top of ZFS is recommended if ZFS is used?
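
As a rough way to reason about the 1G versus 10G question, here is a minimal Python sketch that estimates the best-case time to push a given amount of data over a link. The 16 GiB figure and the `efficiency` parameter are hypothetical values chosen for illustration only; real throughput depends on protocol overhead and on whatever else shares the link.

```python
def transfer_seconds(size_gib: float, link_gbit_s: float, efficiency: float = 1.0) -> float:
    """Best-case time to move size_gib of data over a link.

    efficiency is an assumed fraction of line rate that is actually
    usable (1.0 means the full nominal bandwidth, which is optimistic).
    """
    bits = size_gib * 1024**3 * 8
    return bits / (link_gbit_s * 1e9 * efficiency)


if __name__ == "__main__":
    size_gib = 16  # hypothetical amount of data, e.g. the memory of one VM
    for link in (1, 10):
        print(f"{link:>2} Gbit/s: {transfer_seconds(size_gib, link):.1f} s")
```

With these assumed numbers the transfer takes roughly 137 s at 1 Gbit/s and roughly 14 s at 10 Gbit/s, and only if nothing else is using the link at the same time.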

Apologies for poor English. Hope I explained okay. If I could get any advice, I'd really appreciate it. Thanks a lot.

Eoin

11 Replies
scott28tt
VMware Employee

Moderator: Thread moved to the broader vSphere area since you have questions about networking, backups, replication, migration.

You should also consider vCenter Server in your deployment.


-------------------------------------------------------------------------------------------------------------------------------------------------------------

Although I am a VMware employee I contribute to VMware Communities voluntarily (ie. not in any official capacity)
VMware Training & Certification blog
eoinkim
Contributor

Could anyone help me out here? Thanks.

Eoin

scott28tt
VMware Employee

You are asking a consultancy kind of question on a user community forum; you might be better off engaging with a local VMware partner.

ZibiM
Enthusiast

Hello

A few things to consider:

1. Whatever we tell you here won't even amount to a draft. You need someone who can sit with you, go over all the considerations and guide you (and your environment) through this. Yes, he or she will charge for that, but he or she can also be held responsible.

2. ZFS is not a supported storage solution for VMware. I suppose it works, but it cannot be recommended, especially not for a metro cluster design.

3. Your environment with 6 ESXi servers looks quite OK for vSAN. That in turn usually requires brand new servers, because rebuilding your old ones is usually much more expensive.

4. Interlink between the sites: this is hard to advise on. There are quite a few options for stretching or connecting networks over distance, and some of them have limits other than throughput. Usually a pair of 10G links is enough.

5. Internet network that switches between the sites: this is something that you HAVE to discuss with your networking department. It depends on several factors, it requires firewalls/routers working in sync between the sites, etc.

Generally, you need to think first about what you want to achieve.

Will it be a metro cluster environment, that is, a solution that supports active-active?

Or do you need a DR environment, with one primary site and the second acting as standby?

eoinkim
Contributor

Hi @

depping
Leadership

Good morning,

  1. If routing is properly configured and VLANs are available across locations, then "stretching" your environment should work.
  2. However, when you want to create a "stretched cluster" you also need "active/active", aka "clustered", storage, which you don't appear to have judging by your diagram. So a single cluster wouldn't be recommended, as a single cluster would typically mean ALL storage is available to ALL hosts in the cluster.
  3. What is the distance between these datacenters? What is the latency? As I understand it, you want to cross-connect and use a SAN as primary storage and ZFS as backup? A couple of problems with that:
    1. See points 1 and 2: normally you would stretch the storage in this situation, or replicate the data and create a DR plan.
    2. ZFS is not supported by VMware; please check the VMware Compatibility Guide.

You could consider using software-based replication from VMware to replicate between locations (vSphere Replication), or any other solution out there. I feel it is best that you work with a local consultancy company to figure out what would work best in your scenario, as there are many variables.

eoinkim
Contributor

Hi depping​,

Thanks for that. Routing will be configured properly, so that's not a problem. The storage will be visible to all hosts across the data centres. We will lease dedicated lines between the DCs. The data centres are physically located in Sydney and Melbourne, roughly 1000 km apart.

Hmm... I thought an iSCSI target could be used with ESXi. I'll check the compatibility guide again, thanks. If that is not possible, what would be a suitable option? Should I just go with NFS?

Cheers.

Eoin

depping
Leadership

iSCSI and NFS are supported, but it also depends on the storage system you use. The storage system has to be supported as well. This is all documented via our VMware Compatibility Guide which you can find here: VMware Compatibility Guide - System Search

On that list you can find servers, storage systems etc.

When the sites are 1000 km apart, I don't see HOW you could run VMs in one location that access storage in the other. The latency will easily be 10 ms RTT, and that is assuming an optimal situation. And that latency applies to ALL I/O, which I normally don't see as being acceptable.
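
The 10 ms figure can be sanity-checked with a quick back-of-the-envelope calculation. This is only a sketch: it assumes a straight 1000 km fibre run and a typical refractive index of about 1.47 for optical fibre, and it ignores switching, routing and any detours in the cable path, so real RTTs will be higher.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBRE_REFRACTIVE_INDEX = 1.47      # assumed typical value for optical fibre


def min_fibre_rtt_ms(distance_km: float,
                     refractive_index: float = FIBRE_REFRACTIVE_INDEX) -> float:
    """Theoretical lower bound on round-trip time over a fibre of this length."""
    speed_in_fibre = SPEED_OF_LIGHT_KM_S / refractive_index  # km/s
    return 2 * distance_km / speed_in_fibre * 1000           # there and back, in ms


if __name__ == "__main__":
    print(f"Minimum RTT over 1000 km: {min_fibre_rtt_ms(1000):.1f} ms")
```

Under these assumptions the physical floor is a little under 10 ms, before any equipment latency is added.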

Like I said, what you are trying to do is not a "walk in the park", this sounds like a complex setup. Talk to an expert locally to go over all requirements / constraints etc.

eoinkim
Contributor

Thanks for that.

Yes, I checked the RTT between the data centres and it is around 11 to 12 ms in a good state.

Out of curiosity, what would be an acceptable latency for this model?

Thanks again.

Eoin

ZibiM
Enthusiast

You need to understand that in a metro cluster scenario any write I/O needs to be confirmed by the second site before it is considered written.

If you have an RTT of 11 to 12 ms, that means you are adding local storage latency + RTT + remote storage latency before you can consider the I/O committed.

This is usually perceived as way too slow for any production application.

People start to complain at around 5 ms, above 10 ms it is widely noticeable, and above 15 ms it is unacceptable.

Maybe think about Disaster Recovery using Site Recovery Manager and asynchronous replication.
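
To make the arithmetic above concrete, here is a small Python sketch that adds up the three components of a synchronous write and rates the result against the rough thresholds mentioned in this post. The 1 ms local and remote storage latencies are assumed placeholder values, not measurements, and the function names are made up for this example.

```python
def sync_write_latency_ms(local_ms: float, rtt_ms: float, remote_ms: float) -> float:
    """Latency of one synchronously mirrored write: local + RTT + remote."""
    return local_ms + rtt_ms + remote_ms


def rate(latency_ms: float) -> str:
    """Rate a latency against the rough thresholds quoted in this thread."""
    if latency_ms > 15:
        return "unacceptable"
    if latency_ms > 10:
        return "widely noticeable"
    if latency_ms >= 5:
        return "users start to complain"
    return "generally fine"


if __name__ == "__main__":
    # 11.5 ms is the midpoint of the 11 to 12 ms RTT reported in the thread;
    # the 1 ms storage latencies on each side are assumptions.
    total = sync_write_latency_ms(local_ms=1.0, rtt_ms=11.5, remote_ms=1.0)
    print(f"{total:.1f} ms per write: {rate(total)}")
```

Even with optimistic storage latencies on both sides, the inter-site RTT alone puts every write above the 10 ms mark.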

larstr
Champion

Eoin,

Stretching layer 2 and storage over such distances is fairly complicated and could cause big problems. It would be better, as suggested above, to use routed networking, and in an emergency you could use SRM to fail over networks and VMs.

Lars
