VMware Cloud Community
lonecedar2
Contributor

EMC AX150i and number of XP VMs per LUN

Hello

I have been troubleshooting an issue with a 3-host ESX 3.0.2 cluster with DRS and HA enabled. The cluster uses Dell PE 6950 servers, each with 64 GB of memory and four dual-core AMD CPUs.

There are approximately 89 VMs on this cluster with 70 of those VMs as XP workstations. This is a Virtual Desktop solution using Wyse terminals.

All 70 of the XP VMs have their C drives on a single 1.5 TB LUN on an EMC AX150i SAN configured for iSCSI. (I believe this is the issue, but my management wants proof before they will purchase additional storage.) Every once in a while one of these VMs slows down drastically, and at that time the other VMs on this LUN are affected too. When we reboot a VM in that state, it takes over 10 minutes to reach the login screen and it is very sluggish. VMs on other LUNs don't have this issue, so everything points to the LUN.

When I troubleshoot using esxtop and "cat /proc/vmware/scsi/vmhbaN/N:N", I see that the "active" commands are high (between 4 and 11) on one or two of the ESX hosts. I never see any "Queued" commands. The active count stays high for as long as the VMs are sluggish. This always occurs when we power up all the VMs, and it sometimes occurs during normal operation.

My questions are:

1. Should I be looking for "Queued" commands on this LUN, or do they not always appear when a LUN is struggling?

2. Are these "commands" also SCSI reservations? I can't seem to find any documentation that defines this.

3. Is there a maximum number of VMs per LUN that we should not exceed? I have read anywhere from 8-16 for NFS and iSCSI up to 40, depending.

4. Has anyone else had similar issues with VDI on storage similar to the AX150i?

Thanks for your input

2 Replies
lonecedar2
Contributor

Maybe I have asked too many questions in one post.

If anyone could answer the following question it would be very helpful.

Has anyone else had similar issues with VDI on storage similar to the AX150i?

Thanks

-Pat

kastlr
Expert

Hi,

You should choose a different design with more LUNs and fewer VMs per LUN.

I'm not familiar with VDI, but right now you have only one LUN, which needs to handle the complete IO load.

With more LUNs you could

- spread the IO flow over several switches

- spread the IO load over both SPs

- spread the IO load over all HBAs

- spread the IO load caused by ESX host swapping

All of this should increase IO bandwidth, as long as the storage array can handle the IO load properly.

I assume there's a spec sheet available for the AX150i that contains information about the IO load it can handle.
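
Once you have figures from the spec sheet and from your own measurements, a rough sizing check could look like the sketch below. The function name and the numbers in the example (10 IOPS per VM, 150 IOPS per LUN) are placeholders I made up for illustration, not AX150i values, so replace them with what you actually measure.

```python
import math


def min_luns_for_load(vm_count, iops_per_vm, iops_budget_per_lun):
    """Return the smallest number of LUNs that keeps every LUN within its IOPS budget.

    All three inputs are assumptions supplied by the caller:
    - vm_count: number of VMs to place
    - iops_per_vm: average IOPS one VM generates (measure this, e.g. with esxtop)
    - iops_budget_per_lun: IOPS the disks behind one LUN can sustain
    """
    if vm_count < 0 or iops_per_vm <= 0 or iops_budget_per_lun <= 0:
        raise ValueError("vm_count must be >= 0 and IOPS values must be > 0")

    # How many VMs fit on one LUN without exceeding its budget.
    vms_per_lun = math.floor(iops_budget_per_lun / iops_per_vm)
    if vms_per_lun < 1:
        raise ValueError("a single VM already exceeds the per-LUN budget")

    return math.ceil(vm_count / vms_per_lun)


# Placeholder numbers only: 70 VMs, 10 IOPS each, 150 IOPS per LUN.
luns_needed = min_luns_for_load(70, 10, 150)
print(luns_needed)
```

This only looks at steady-state averages. Events like powering on all VMs at once produce much higher peaks, so treat the result as a lower bound.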

Whenever a condition occurs that disturbs the IO flow to your current LUN, ALL ESX servers and ALL VMs are affected.

By separating the VMs across more LUNs, the effects should be limited.
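
To make that concrete, here is a minimal sketch that spreads the 70 XP VMs round-robin over several LUNs and shows how many VMs share a LUN in each layout. The VM and LUN names are invented for the example, and the choice of five LUNs is only an assumption, not a recommendation for the AX150i.

```python
def spread_vms(vm_names, lun_names):
    """Assign VMs to LUNs round-robin and return {lun_name: [vm_name, ...]}."""
    if not lun_names:
        raise ValueError("at least one LUN is required")

    placement = {lun: [] for lun in lun_names}
    for index, vm in enumerate(vm_names):
        # Cycle through the LUNs so the VM count per LUN differs by at most one.
        target = lun_names[index % len(lun_names)]
        placement[target].append(vm)
    return placement


# Hypothetical names: 70 XP desktops and five LUNs.
vms = [f"xp-{n:02d}" for n in range(1, 71)]
luns = ["lun-a", "lun-b", "lun-c", "lun-d", "lun-e"]

placement = spread_vms(vms, luns)
for lun, members in placement.items():
    # Share of all VMs that would be hit if this one LUN were disturbed.
    share = 100 * len(members) / len(vms)
    print(f"{lun}: {len(members)} VMs ({share:.0f}% of desktops)")
```

With one LUN every disturbance hits all 70 desktops; with five LUNs in this layout a disturbance on a single LUN hits 14 of them.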

But I would bet that your AX150i is already fully equipped and such a change is no longer possible (as usual).


Hope this helps a bit.
Greetings from Germany. (CEST)