zenomorph
Contributor

VMware storage/adapter performance issue

Hi, I'd like some advice. We've encountered an issue with our newly set up ESXi 5.0 Enterprise Plus cluster of 4 hosts connected to an EMC SAN via FC. We're running some disk tests and getting varying performance results on a VM.

We use SQL to test the disk I/O by doing a DB dump and evaluating the disk write throughput (e.g. 160MB/sec). The VM sits on a single datastore with its C:\ and D:\ drives. We vMotion the VM to each ESXi host in turn, run the SQL DB dump, and look at the disk performance.

What we're finding is that 2 of the 4 hosts show very poor disk performance figures, e.g. 45MB/sec vs. 160MB/sec on the others, which is only about 1/4 of the throughput. The SAN is not very busy and the hosts are not running any other VMs (this is a newly set up cluster with 1 VM only), so the performance should be the same on all ESXi hosts. We have run this test several times and confirmed that on 2 of the 4 hosts the disk I/O performance is really poor. The problem is that we cannot identify the cause.
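
To make the comparison concrete, here is a minimal Python sketch that flags hosts falling well below the best result. The 160 and 45 MB/sec figures are the ones quoted above; the host names and the 50% threshold are assumptions made up for illustration.

```python
# Throughput measured per host, in MB/sec. The 160 and 45 figures come from
# the post; the host names are hypothetical placeholders.
measured = {"esx01": 160, "esx02": 160, "esx03": 45, "esx04": 45}


def slow_hosts(throughput, threshold=0.5):
    """Return hosts whose throughput is below `threshold` times the best host.

    The value stored for each slow host is its fraction of the best result.
    """
    best = max(throughput.values())
    return {
        host: round(mb_s / best, 2)
        for host, mb_s in throughput.items()
        if mb_s / best < threshold
    }


print(slow_hosts(measured))  # {'esx03': 0.28, 'esx04': 0.28}
```

45/160 is roughly 0.28, which matches the "about 1/4" estimate.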

We've already looked at the SAN switch config for all hosts and at the host registration on the SAN, and there are no errors. We've also checked the HBA path policy, which is "Fixed (VMware)", and the HBA adapter queue depths.

All hosts were set up using the same ESXi install disk and run exactly the same hardware and component firmware versions. We have not configured any of the advanced settings and left most things at default, so all host configs should be the same. Does anyone have suggestions on how we can troubleshoot or identify the cause? We've looked at the performance stats of the VM in vCenter, and the disk access is indeed slower on 2 of the 4 hosts.

Many thanks.

jrmunday
Commander

Have you been through the BIOS settings on each host to ensure that the hosts are all configured the same?

Things that come to mind are disabling C1E and node interleaving and setting the power profile to maximum performance.

vExpert 2014 - 2018 | VCP6-DCV | http://www.jonmunday.net | @JonMunday77
Gortee
Hot Shot

Fun one. 

Troubleshooting steps that come to mind:

1. You know it's not the virtual machine, because it performs differently on different hosts. I assume none of your tests were run during the actual vMotion (yes, that would change performance), and that if you vMotion it back to the original host, performance returns to normal. Given those assumptions, here are my test cases:

2. Something has to be different in the remaining components: server, FC switches, SAN setup. I am going to assume the SAN and the Fibre Channel switches are configured exactly the same for every host. So that leaves VMware configuration and server hardware.

3. You mentioned you are using Fixed (VMware). Is this the best setting for your array? Is it an ALUA array? Is it possible that you are using the passive path and it's fixed? This type of performance is possible when I/O goes down a passive (non-optimized) path on ALUA. You will want to make sure you are following the vendor's recommended pathing configuration, and you might want to look at EMC's pathing solution, PowerPath.

4. I agree with jrmunday: check the BIOS settings and look for anything different.

5. Make a host profile of the good host and compare to all others to check for config drift.

6. If the firmware, hardware and config are 100% the same (my vote is on the fixed path policy as the cause), then reload ESXi. (I doubt this is the issue, but it's worth a try.)
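
The config drift check in step 5 boils down to diffing a known-good host against each of the others. A minimal Python sketch of that idea follows. It is not the host profile feature itself, and the setting names and values are hypothetical placeholders loosely based on the BIOS and pathing items mentioned in this thread.

```python
def config_drift(reference, other):
    """Return {setting: (reference_value, other_value)} for every setting
    that differs, including settings present on only one host."""
    keys = sorted(set(reference) | set(other))
    return {
        key: (reference.get(key), other.get(key))
        for key in keys
        if reference.get(key) != other.get(key)
    }


# Hypothetical settings collected by hand from a good host and a slow host.
good_host = {
    "power_profile": "max_performance",
    "c1e": "disabled",
    "node_interleaving": "disabled",
    "path_policy": "fixed",
}
slow_host = dict(good_host, c1e="enabled")

print(config_drift(good_host, slow_host))  # {'c1e': ('disabled', 'enabled')}
```

An empty result means the two hosts agree on every setting that was recorded, which says nothing about settings that were never collected.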

Let me know if any of this works.

J

Joseph Griffiths http://blog.jgriffiths.org @Gortees VCDX-DCV #143
zenomorph
Contributor

Thanks guys for the suggestions. We are indeed using the Fixed path policy with ALUA for the EMC SAN. You mentioned that we "may be using the passive path": how can we identify that and, if needed, switch back to the preferred path?


Also, after a detailed review of the storage and VMFS paths: we have vmhba2 and vmhba3, and our pathing policy on all hosts is Fixed (VMware) with VMW_SATP_ALUA_CX (for EMC VNX). What I noticed was that on the 2 normal (faster) hosts, all VMFS paths used vmhba2 and the preferred path matched the "Active (I/O)" path. On the slower ESXi hosts, however, the paths seemed a bit strange: some showed as connected to vmhba3, and the "Active (I/O)" path differed from the "Preferred" path when I went into the "Manage Paths" properties. I also noticed that for certain datastores the runtime name showed "vmhba2:C0:T0:L14", but in Manage Paths the Active (I/O) path was actually vmhba3:C0:T3:L14, with the "*" preferred marker as well.

Does this indicate that the pathing for the datastores and HBAs is messed up and needs to be reconfigured? If so, what's the easiest and best way to do this: remove the datastores and add them again so the paths get rebuilt?
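
The check described above, comparing the preferred path with the path actually carrying I/O for each datastore, can be sketched in Python as follows. The two runtime names for LUN 14 are the ones quoted in this post and are paired up here purely for illustration; the datastore names and the second record are made up, and the records are assumed to be typed in by hand from the Manage Paths dialog.

```python
from typing import NamedTuple


class LunPaths(NamedTuple):
    datastore: str   # hypothetical datastore label
    preferred: str   # runtime name of the preferred path
    active_io: str   # runtime name of the path currently carrying I/O


def hba_of(runtime_name):
    """Extract the adapter name, e.g. 'vmhba3' from 'vmhba3:C0:T3:L14'."""
    return runtime_name.split(":")[0]


def off_preferred(luns):
    """List (datastore, preferred HBA, active HBA) for every LUN whose
    active I/O path is not its preferred path."""
    return [
        (lun.datastore, hba_of(lun.preferred), hba_of(lun.active_io))
        for lun in luns
        if lun.preferred != lun.active_io
    ]


luns = [
    LunPaths("ds_lun14", "vmhba2:C0:T0:L14", "vmhba3:C0:T3:L14"),
    LunPaths("ds_lun15", "vmhba2:C0:T0:L15", "vmhba2:C0:T0:L15"),
]

print(off_preferred(luns))  # [('ds_lun14', 'vmhba2', 'vmhba3')]
```

Running the same check per host would show whether only the two slow hosts have LUNs running off their preferred path.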

Many thanks

Gortee
Hot Shot

Evening,

This is an odd one. Out of the box, the drivers should handle this. They should choose the preferred path via a SCSI sense code.

"Is this actually indicating that the pathing for the datastores and hba's seem a bit messed up and they need to be reconfigured properly? If so whats the easiest and best way to do this, remove the datastores and add them again so it builds the paths again?"


Well, the issue is that for some reason ESXi is choosing a non-preferred path on a different HBA. This normally means the preferred path is not available. I would check your zoning on the poorly performing nodes. If the zoning is 100% the same, then I would work with EMC support for the VNX to figure out whether it's a driver issue. Honestly, it's acting oddly: the LUN should have the same preferred path from every host on an ALUA array. I bet it's the zoning.
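
One rough way to look for a zoning gap is to compare which array target ports each HBA can see on a good host against a slow host. The Python sketch below does that with set differences. The port labels and the visibility data are entirely hypothetical; on a real system they would have to be collected from the hosts or the switches.

```python
def missing_targets(good_host, suspect_host):
    """Return {hba: [targets]} for array ports the good host sees on an HBA
    but the suspect host does not see on the same HBA."""
    missing = {}
    for hba, targets in good_host.items():
        gap = targets - suspect_host.get(hba, set())
        if gap:
            missing[hba] = sorted(gap)
    return missing


# Hypothetical target visibility per HBA on a fast host and a slow host.
good = {
    "vmhba2": {"SPA-port0", "SPB-port0"},
    "vmhba3": {"SPA-port1", "SPB-port1"},
}
slow = {
    "vmhba2": {"SPB-port0"},
    "vmhba3": {"SPA-port1", "SPB-port1"},
}

print(missing_targets(good, slow))  # {'vmhba2': ['SPA-port0']}
```

A non-empty result would point at the HBA and target port pair to check in the zoning config. An empty one would make zoning a less likely cause.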


Some items to consider and make sure you have done:

- All of the servers' initiators should be in the same storage group on the VNX

- It should be set to fixed path for best performance

A document to review and make sure you are following: (I am sure you are but just in case)

http://www.emc.com/collateral/hardware/technical-documentation/h8229-vnx-vmware-tb.pdf

BTW, I would give lots more help, but I have never used a VNX. I have used CLARiiONs, and based on the documentation they are very similar. I also have lots of experience with other storage vendors.

Joseph Griffiths http://blog.jgriffiths.org @Gortees VCDX-DCV #143