VMware Cloud Community
snowdog_2112
Enthusiast

ESXi 5.1, IBM DS3200 SAS - high disk latency with 2 hosts running

Two IBM x3650 hosts running ESXi 5.1, each with two SAS HBA connections to an IBM DS3200 dual-controller SAN.

I have 2 arrays, both set to multi-host access for VMware.

With a single host powered on and running all VMs, the disk performance seems normal.

As soon as I power on the second host, the latency on host-1 soars (as high as 500 ms!), even though there are *no VMs* on host-2. If I power down host-2, latency on host-1 returns to acceptable levels. Again, host-2 should cause *no* disk activity, since it has no VMs and boots from internal storage.

I've vMotioned all VMs to host-2 and the symptoms are the same: disk latency is fine with all VMs on host-2 and host-1 powered off. If I power host-1 back on, the VMs running on host-2 grind to a near-halt due to latency.

The only oddity I've noticed is that the path for one of the LUNs differs from what I think it should be.

Host-1

hba1 - runtime name: hba1:c0:t0:L1

hba2 - runtime name: hba1:c0:t0:L1   <-- note, this shows the same runtime name as hba1

manage path: hba1:c0:t0:L1

Host-2

hba1 - runtime name: hba1:c0:t0:L1

hba2 - runtime name: hba1:c0:t0:L1  <-- note, same

manage path: hba2:c0:t0:L1  <-- notice how this says hba2, not hba1 (hba1 is "standby")
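The comparison above can be made mechanical. The following is a minimal Python sketch, assuming the runtime names follow the `adapter:cN:tN:LN` form exactly as written in this post; the function names (`parse_runtime_name`, `duplicate_runtime_names`, `active_adapters`, `hosts_disagree`) are hypothetical helpers, not part of any VMware or IBM tool. It only flags the two oddities noted here (both HBAs reporting the same runtime name, and the two hosts managing the LUN through different adapters); it does not establish that either one causes the latency.

```python
from __future__ import annotations

import re
from collections import Counter
from typing import NamedTuple


class RuntimeName(NamedTuple):
    adapter: str
    channel: int
    target: int
    lun: int


# Assumed format, taken from the post: adapter:cN:tN:LN (case-insensitive).
_PATTERN = re.compile(
    r"^(?P<adapter>\w+):c(?P<channel>\d+):t(?P<target>\d+):L(?P<lun>\d+)$",
    re.IGNORECASE,
)


def parse_runtime_name(name: str) -> RuntimeName:
    """Split a runtime name such as 'hba1:c0:t0:L1' into its parts."""
    m = _PATTERN.match(name.strip())
    if m is None:
        raise ValueError(f"unrecognised runtime name: {name!r}")
    return RuntimeName(m["adapter"], int(m["channel"]), int(m["target"]), int(m["lun"]))


def duplicate_runtime_names(paths: dict[str, str]) -> list[str]:
    """Return runtime names that more than one HBA on a host reports."""
    counts = Counter(paths.values())
    return sorted(name for name, n in counts.items() if n > 1)


def active_adapters(managed_paths: dict[str, str]) -> dict[str, str]:
    """Map each host to the adapter named in its managed path."""
    return {host: parse_runtime_name(p).adapter for host, p in managed_paths.items()}


def hosts_disagree(managed_paths: dict[str, str]) -> bool:
    """True if the hosts manage the LUN through different adapters."""
    return len(set(active_adapters(managed_paths).values())) > 1


# Values copied from the post: HBA -> runtime name it reports.
host1_paths = {"hba1": "hba1:c0:t0:L1", "hba2": "hba1:c0:t0:L1"}
host2_paths = {"hba1": "hba1:c0:t0:L1", "hba2": "hba1:c0:t0:L1"}

# Values copied from the post: host -> managed path.
managed_paths = {"Host-1": "hba1:c0:t0:L1", "Host-2": "hba2:c0:t0:L1"}

print(duplicate_runtime_names(host1_paths))  # ['hba1:c0:t0:L1']
print(active_adapters(managed_paths))        # {'Host-1': 'hba1', 'Host-2': 'hba2'}
print(hosts_disagree(managed_paths))         # True
```

Run against the values in this post, the sketch reports the duplicated runtime name on both hosts and a mismatch in the managed adapter between Host-1 and Host-2.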

One additional note:

Before the second ESXi 5.1 host was added, there was a Windows bare-metal host connected to the SAN, with the ESXi host accessing one array and the Windows host accessing the other. In that configuration, disk latency was also normal. The Windows host has been removed and disconnected. I also removed its host group in the Storage Manager.

Stumped...

22 Replies
hfharder
Contributor

Was this ever resolved? I'm experiencing the exact same issue.

Hyperlink201110
Contributor

Hi,

Sorry to ask now (it's a very old thread), but did you find a solution for this?

I'm using an old DS3200 to store a data protection appliance, and I'm seeing the same disk latency when this storage is shared with another host.

snowdog_2112
Enthusiast

The issue ended up taking over 6 months to resolve.

I don't know if this was the official fix, but the last of my notes show that we completely removed and recreated the arrays and the LUNs on the SAN, using a 256 KB segment size when creating the arrays.

This was a destructive process, but we had enough capacity to move all VMs to one array, destroy and recreate the other array, then move everything to the new array and destroy and recreate the second array.

The SAN ran stable for another 2 years and was replaced with a V3700 with SAS disks.  The DS3200 is still in service at the DR site.

Hopefully this helps. I spent hundreds of hours on this issue.
