VMware Cloud Community
GJAY01
Contributor

Storage Design with VAAI

Hey everyone, I have a storage design question.

Back when we didn't have a SAN with VAAI, SCSI reservations were a major limiting factor in all of our design decisions.  We built smaller clusters, presented fewer LUNs, shared each LUN among the fewest hosts possible, and ran fewer VMs per LUN.

Now, however, we have a new SAN with VAAI capabilities.  We currently have 7 clusters, each with its own set of LUNs/datastores.

Is there any reason (since SCSI reservations are no longer an issue) why I shouldn't present all LUNs to all hosts in all clusters? I know there is a 255-LUN limit per host, but I'm not concerned about hitting it.  Will this impact performance in any way? Is there service console memory overhead?  Without SCSI reservations, our main limiting factor (as I see it) is LUN queue depth, which is per host.  So if I increase the number of hosts that can see a single datastore, I should also be able to increase the number of VMs on that datastore, as long as the VM load is spread across multiple hosts.  Is this correct?
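
To sanity-check where we stand today, here's a minimal pyVmomi sketch of how I list which datastores each host currently sees; the vCenter address and credentials are placeholders:

```python
# Minimal pyVmomi sketch: list the datastores each host can currently see.
# The vCenter address and credentials below are placeholders for this example.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab only; validate certs in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="********",
                  sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        names = sorted(ds.name for ds in host.datastore)
        print(f"{host.name}: {len(names)} datastores -> {', '.join(names)}")
    view.DestroyView()
finally:
    Disconnect(si)
```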

Also, most of our clusters are built the way they are because of software licensing restrictions, so we often have to move a VM between clusters.  I used to cold migrate, but now I use a Templates/Transition datastore that is presented to all hosts regardless of cluster.  That avoids downtime, but it still takes several steps to move a VM (two Storage vMotions and a vMotion).  It would be very convenient if all hosts could see all datastores, but I don't want to break any design best practices.
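
For context, the three-step move looks roughly like this as a pyVmomi sketch; it assumes the VM, datastore, host, and cluster managed objects have already been looked up (e.g. with a container view like the one above):

```python
# Sketch of the transition-datastore move: two Storage vMotions and a vMotion.
# All managed objects (vm, datastores, host, cluster) are supplied by the caller.
import time
from pyVmomi import vim

def wait_for(task):
    """Poll a vCenter task until it completes; raise on failure."""
    while task.info.state not in (vim.TaskInfo.State.success,
                                  vim.TaskInfo.State.error):
        time.sleep(2)
    if task.info.state == vim.TaskInfo.State.error:
        raise task.info.error

def move_vm_between_clusters(vm, transition_ds, dest_host, dest_cluster, dest_ds):
    # 1) Storage vMotion onto the shared Templates/Transition datastore
    wait_for(vm.RelocateVM_Task(vim.vm.RelocateSpec(datastore=transition_ds)))
    # 2) vMotion (compute only) to a host in the destination cluster
    wait_for(vm.RelocateVM_Task(vim.vm.RelocateSpec(
        host=dest_host, pool=dest_cluster.resourcePool)))
    # 3) Storage vMotion onto the destination cluster's own datastore
    wait_for(vm.RelocateVM_Task(vim.vm.RelocateSpec(datastore=dest_ds)))
```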

Thanks, and appreciate any input!

3 Replies
Slingsh0t
Enthusiast

Everyone's circumstances are different, and if you find that a best-practice recommendation doesn't suit you, then an alternative design choice may be the most appropriate solution.

I'm not sure SCSI reservations are entirely a non-issue, but their impact has been greatly reduced by hardware-assisted locking.  AFAIK there are no additional hypervisor overheads in having additional datastores.  Performance should not be impacted in any way, except that operations like rescans or storage refreshes may take longer.  The HA master election process in vSphere 5 also takes the datastores connected to each host into account, which may add a small overhead (almost irrelevant though)...

*edited as I missed some valuable bits in the original post rendering my response useless*

vGuy
Expert

There is an overhead involved with this type of configuration.  From the vSphere Storage guide:

"By default, the host performs a periodic path evaluation every 5 minutes causing any unclaimed paths to be claimed by the appropriate MPP."

Since a path evaluation occurs every 5 minutes, there would be additional overhead for the extra LUNs.  It will also increase your host boot time, and automatic rescans are triggered when you add a new LUN or datastore.

You do have the option to tweak this using advanced settings, such as disabling the host rescan filter and limiting the number of LUNs to be scanned with the Disk.MaxLUN parameter (per host).  But in my opinion the operational overhead and the potential for manual errors outweigh the effort of maintaining your current transfer/transient volume.
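
If someone does go the Disk.MaxLUN route, here's a minimal pyVmomi sketch for reading and capping it on a single host; the value 128 is purely illustrative, and `host` is assumed to be a vim.HostSystem you've already retrieved:

```python
# Hedged sketch: query and cap Disk.MaxLUN on one host via its advanced options.
# `host` is assumed to be a vim.HostSystem already retrieved from vCenter.
from pyVmomi import vim

opt_mgr = host.configManager.advancedOption
current = opt_mgr.QueryOptions("Disk.MaxLUN")[0].value
print(f"{host.name}: Disk.MaxLUN = {current}")

# Reuse the type returned by the query so the new value matches the option's
# declared type; 128 is just an illustrative cap, not a recommendation.
new_value = type(current)(128)
opt_mgr.UpdateOptions(changedValue=[
    vim.option.OptionValue(key="Disk.MaxLUN", value=new_value)])
```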

GJAY01
Contributor

Thanks for the information, guys; it's been really helpful.  I think we will stick with datastores isolated to their clusters, and use the transfer volume if/when a VM needs to move between clusters.
