nzsteve
Enthusiast

SRM datastore, design and sizing

Hi,

I'm hoping someone can give me an opinion on a proposed implementation I'm working on. It's a new build, and my first venture into SRM, so I want to make sure I get the datastore layouts right the first time.

We have about 100 VMs in total that we will be looking to replicate and protect with SRM: around 70 at the main site and 30 at the secondary site. Each site will have a cluster of 11 ESX hosts, and we will be licensing all hosts at both ends for bidirectional SRM (I know we could probably split site B into a smaller cluster to save on some licensing costs, but we're not too concerned about that for now). There will be other VMs in the environment that won't be protected; these will all run on other clusters/hosts and be stored on their own datastores, separate from the SRM-protected hosts.

In terms of datastore layout, in the past I've always stuck with 300-500GB LUNs with 10-15 VMs on each. I'm considering reducing to smaller groups, maybe 5 VMs on average, to allow us greater flexibility in the replication schedules, and in the protection groups and recovery plans we can draw up to recover subsets of servers.

If we go down to 5 VMs per LUN we're looking at around 20 VMFS volumes max on the cluster. We also have around 40 existing SAN volumes that will be connected to approx 20 VMs as RDMs (some VMs will have 3 or 4 RDMs), so we're looking at a total of ~60 LUNs required to be mounted on each host.
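To make the arithmetic above easy to rerun with different group sizes, here is a rough sketch. The VM, RDM, host and HBA port counts are the ones quoted above; `array_ports_per_hba_port` is purely an assumption for illustration, since the real number of paths depends on how the EVA ports are zoned.

```python
import math

def lun_plan(protected_vms, vms_per_lun, rdm_luns, hosts,
             hba_ports_per_host, array_ports_per_hba_port):
    """Estimate LUN and path counts for one cluster."""
    # One VMFS volume per group of VMs, rounded up.
    vmfs_luns = math.ceil(protected_vms / vms_per_lun)
    total_luns = vmfs_luns + rdm_luns
    # Paths per LUN = host ports x array ports each host port can reach.
    paths_per_lun = hba_ports_per_host * array_ports_per_hba_port
    return {
        "vmfs_luns": vmfs_luns,
        "total_luns_per_host": total_luns,
        "paths_per_host": total_luns * paths_per_lun,
        # Each LUN has to be presented to every host in the cluster.
        "lun_presentations": total_luns * hosts,
    }

plan = lun_plan(
    protected_vms=100,
    vms_per_lun=5,
    rdm_luns=40,
    hosts=11,
    hba_ports_per_host=2,
    array_ports_per_hba_port=2,  # assumption, not from our actual zoning
)
print(plan)
```

Changing `vms_per_lun` back to 10 or 15 shows how much of the LUN count comes from the VMFS side versus the fixed 40 RDMs.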

My initial concern was that over 11 hosts that's one heap of SAN zoning and path management (the SANs are both EVA6000, active/active; we have 2 ports in each host and are planning on performing some manual path management to ensure the highest-load RDMs are split). The storage admin is happy to take on the extra overhead of more total LUNs on the basis that it gives us more control of replication schedules and what we fail over.

Is there anything I'm missing from the above that we should consider in the decision? We did look at 2 smaller clusters of 5/6 hosts, but decided one large one would remove the need to balance workload across clusters, keeping the overall layout a bit simpler.

Appreciate any comments!

Cheers,

Steve

Accepted Solutions
depping
Leadership

No, this is spot on. The smaller the LUNs and the fewer VMs on each, the more choices you end up with. In other words, I prefer to do it this way as well.

Duncan

VMware Communities User Moderator | VCP | VCDX

If you find this information useful, please award points for "correct" or "helpful".

4 Replies
JeffDrury
Hot Shot

You will also want to place your VMs on datastores based on their priority during a recovery. If you have high, medium, and low recovery priorities, make sure all of your high priority VMs are grouped on the same set of datastores. If one of the low priority VMs has data on a datastore where high priority data also exists, then that low priority data will be grouped in with the high priority VMs during a recovery. This can cause issues when bringing up VMs at the recovery site, as you may have to bring up a low priority VM before a medium priority VM simply because of datastore placement. SRM automatically groups datastores together based on which datastores the VMs are using. This is great in that it won't allow a VM to fail over without its underlying data, but it does make the placement of your data more about recovery priority than anything else. If you haven't thought about datastore placement before using SRM, then now is a good time, and you had better brush up on Storage vMotion.
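As a rough way to sanity check a placement plan before moving anything, the grouping described above can be approximated with a small script. This is only a sketch of the idea, not how SRM itself computes datastore groups; the VM names, datastore names and priorities are made up, and every VM is assumed to use at least one datastore.

```python
from collections import defaultdict

def datastore_groups(vm_datastores):
    """Group datastores that are linked by at least one VM spanning them."""
    parent = {}

    def find(ds):
        parent.setdefault(ds, ds)
        while parent[ds] != ds:
            parent[ds] = parent[parent[ds]]  # path compression
            ds = parent[ds]
        return ds

    for datastores in vm_datastores.values():
        first = find(datastores[0])
        for other in datastores[1:]:
            root = find(other)
            if root != first:
                parent[root] = first

    groups = defaultdict(set)
    for ds in list(parent):
        groups[find(ds)].add(ds)
    return sorted(sorted(g) for g in groups.values())

def mixed_priority_groups(vm_datastores, vm_priority):
    """Return the datastore groups that hold VMs of more than one priority."""
    mixed = []
    for group in datastore_groups(vm_datastores):
        members = set(group)
        priorities = {vm_priority[vm]
                      for vm, dss in vm_datastores.items()
                      if members & set(dss)}
        if len(priorities) > 1:
            mixed.append((group, sorted(priorities)))
    return mixed

# Hypothetical placement: test01 is low priority but has a disk on ds-high-01.
vm_datastores = {
    "sql01": ["ds-high-01", "ds-high-02"],
    "web01": ["ds-high-02"],
    "test01": ["ds-low-01", "ds-high-01"],
    "file01": ["ds-low-02"],
}
vm_priority = {"sql01": "high", "web01": "high", "test01": "low", "file01": "low"}

for group, priorities in mixed_priority_groups(vm_datastores, vm_priority):
    print(group, priorities)
```

Here the single stray disk of `test01` pulls `ds-low-01` into the same group as both high priority datastores, which is exactly the kind of placement that Storage vMotion would need to clean up.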

Also be mindful of the storage space required at both sites in order to test your recovery plans. Some storage technologies (not sure about the EVA) require as much as 2x the original storage space at the recovery site to perform a test recovery. This requirement depends on the individual storage technology, not on SRM, but it can be frustrating when attempting to test your recovery plan. During an actual failure you will not need 2x the storage capacity, but the goal with SRM is to test and hopefully never have to press the big red button.
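For a quick worst-case number to take to the storage admin, something like the following works. The LUN count and size are illustrative only, and the 2.0 factor is the upper bound mentioned above, not a confirmed figure for the EVA.

```python
def recovery_site_capacity_gb(replicated_gb, test_overhead_factor):
    """Capacity needed at the recovery site while a test recovery is running."""
    return replicated_gb * test_overhead_factor

# Hypothetical sizing: 20 replicated LUNs of 400 GB each.
replicated_gb = 20 * 400

# 1.0 = replicas only (real failover); 2.0 = assumed worst case for a test.
failover_only = recovery_site_capacity_gb(replicated_gb, 1.0)
worst_case_test = recovery_site_capacity_gb(replicated_gb, 2.0)
print(failover_only, worst_case_test)
```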

nzsteve
Enthusiast

That's good to know, thanks for the input.

Also, thanks for the work you put into your blog; I've picked up heaps of useful tips and guidelines from it. :D

Cheers,

Steve

nzsteve
Enthusiast

Thanks for the info. The VM-to-datastore mapping is what we're working on at the moment, and I think the priority considerations are another reason for smaller LUNs.

I wasn't aware of the implications of needing up to double the storage, but it does make sense now that you've pointed it out. I'll do some digging into the EVA to confirm.

Thanks,

Steve
