Solved: Re: Datastore / Disk latency problems with HP ProL...

Honzze · ‎10-16-2014

Hello,

We are currently investigating massive datastore and disk latency problems (mainly READ) on some of our HP ProLiant DL380 G7 servers with the P410i (BBWC) controller.

It all started after our regular half-year update cycle for ESXi 5.1 U2 to:

- ESXi510-201407001 (no problems)

- hp-esxi5.0uX-bundle-2.1.1-2 (no problems)

- hp-HPUtil-esxi5.0-bundle-2.1-15 (no problems)

- hpsa-5.0.0.74 (driver) & SPP2014.09.0 (problems started)

It is not a permanent increase in read latency; instead, peaks occur that trigger alerts in Veeam. When this happens, datastore read latency rises from about 5 ms to 100 ms, but after a few seconds everything is back to normal. The affected datastores are RAID5 (though we've also seen some alerts on RAID1) and have read and write cache enabled, and the most affected machines are those with the LOWEST usage.

We have some WinServer2012R2 RDP hosts for a low number of sessions (fewer than 5), and there is no noticeable delay when working on these.

Maybe one of you has experienced similar behavior after applying the latest updates to HP ProLiant servers. We don't have a clue what to try next. An SPP rollback isn't possible; rolling back the driver might be, but we've got to find the correct predecessor, since one of the older 2014 HPSA drivers leads to a PSOD on the host.

Best regards,

HZ

JPM300 · ‎10-16-2014

Hey,

Is it possible that the new driver shifted your read/write % on the RAID controller, or maybe changed the write cache settings?

Boot up one of the hosts with HP's offline Array Configuration Utility and check the RAID controller settings. You can also change the read/write ratio here if your workload is more read-heavy. Back when we used to deploy a lot of HP servers for disk-to-disk-to-tape systems, I would change the RAID controller settings to 80% write / 20% read, as the backup server was doing 80% writing anyhow.

Hope this has helped

Honzze · ‎10-16-2014

Thanks, JPM300,

Good point, but I've already checked the cache settings, and they are still 50/50 for read/write in our case, because read latency is the problem here, not write latency. We've also checked the BBWC module, and it's working fine.

Best regards,

HZ

maaca · ‎01-08-2015

Hi,

Most probably you have already solved the issue... I did some investigating and was in contact with HP. This is an issue with the hpsa-5.0.0.74 driver. It crashes from time to time when used with the P410i. In the logs, you can find "WARNING: LinScsi: SCSILinuxAbortCommands:1843: Failed, Driver hpsa, for vmhba0". HP knows about it and is working on a new driver. As a workaround, you can use hpsa-5.5.0.60 (at least this works well for us).

BR

maaca
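
For anyone who wants to check whether their hosts show the same warning, here is a minimal Python sketch that counts it per adapter in a copy of the host's kernel log. The pattern is built only from the line quoted above; the function name `count_hpsa_aborts` is made up for illustration, and it is an assumption that the number after `SCSILinuxAbortCommands` and the adapter name can differ between hosts, so both are matched loosely.

```python
import re
from collections import Counter

# Built from the warning quoted in the post above. Assumption: the number
# after "SCSILinuxAbortCommands:" and the vmhba index may vary per host.
ABORT_RE = re.compile(
    r"WARNING: LinScsi: SCSILinuxAbortCommands:\d+: "
    r"Failed, Driver hpsa, for (vmhba\d+)"
)


def count_hpsa_aborts(lines):
    """Count hpsa abort warnings per adapter in an iterable of log lines."""
    hits = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits
```

You can pass it an open file object for a log you copied off the host, for example `count_hpsa_aborts(open("vmkernel.log", errors="replace"))`. A non-empty result only tells you the warning is present, not that it is the cause of your latency peaks.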

Honzze · ‎01-09-2015

Hello,

Many, many thanks, maaca! No, I did not get an answer from our HP partner. Because of your post I can finally do something about all those monitoring alerts. I'll install hpsa-5.5.0.60 in our testing lab and roll it out as soon as I'm sure it works!

Thanks again and best regards!

VirtualCop · ‎10-16-2015

This issue still exists in 106 (scsi-hpsa-5.5.0.106-1OEM.550.0.0.1331820.x86_64.vib).

HP is working on version 110, but it seems to be unstable, and the HP-customized 5.5 U3a will be released with 106 again.

The only workaround we found is to unconfigure the spare drive from the RAID5 volumes.

maaca · ‎10-16-2015

You are right. Now I see it also. Somehow, it didn't appear in our lab, but I can see it in production.

So the only stable versions are .60 and .84, right?

VirtualCop · ‎10-16-2015

Hi Maaca,

>>Somehow, it didn't appear in our lab

Is a spare disk configured for your RAID LUNs in your lab environment?

If not, could you please try to reproduce it?

>> So the only stable versions are .60 and .84, right?

I didn't downgrade the scsi-hpsa driver because of this statement from HP: if the events "Lost access to volume" and "Successfully restored access to volume" appear periodically (10-30 minute cycle), they can be ignored.

A call with HP is open. HP promised me they would release the driver in the coming months.

I'm waiting.

JK
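
To check whether the events on a host actually follow the 10-30 minute cycle HP describes, something like the following Python sketch could help. It assumes you have already pulled the events out of your logs as `(timestamp, message)` pairs in chronological order; that input format and the names `outage_cycles` and `is_periodic` are hypothetical, and only the two message strings come from the post above.

```python
from datetime import datetime, timedelta

LOST = "Lost access to volume"
RESTORED = "Successfully restored access to volume"


def outage_cycles(events):
    """Pair each 'lost' event with the next 'restored' event.

    events: iterable of (datetime, message), oldest first.
    Returns a list of (lost_at, outage_duration, gap_since_previous_lost);
    the gap is None for the first cycle.
    """
    cycles = []
    lost_at = None
    prev_lost = None
    for ts, msg in events:
        if LOST in msg:
            lost_at = ts
        elif RESTORED in msg and lost_at is not None:
            gap = lost_at - prev_lost if prev_lost is not None else None
            cycles.append((lost_at, ts - lost_at, gap))
            prev_lost = lost_at
            lost_at = None
    return cycles


def is_periodic(cycles, low=timedelta(minutes=10), high=timedelta(minutes=30)):
    """True if every gap between consecutive 'lost' events is within [low, high]."""
    gaps = [gap for _, _, gap in cycles if gap is not None]
    return bool(gaps) and all(low <= gap <= high for gap in gaps)
```

The 10 and 30 minute bounds are taken from the HP statement quoted above; whether events inside that window are really safe to ignore is HP's claim, not something this check can confirm.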

maaca · ‎10-16-2015

Hi JK,

Yes, this seems to be correct. We don't have spare disks configured in our lab.

maaca

Daniel76 · ‎10-30-2015

Hi,

We had the same problem with 106 after updating the HP driver and VMware 5.5 to U3. I did a manual downgrade from 106 to 60, which worked for me, but write performance is still bad (the cache setting is 25% read and 75% write).

We still have a hot spare HD defined on the RAID 5; maybe I'll try unconfiguring it and see if anything improves. The new version 110 is still not available...

VirtualCop · ‎11-27-2015

Hello guys,

ESXi 5.5 U3a ISO (HP customized: VMware-ESXi-5.5.0-Update3-3116895-HP-550.9.4.26-Nov2015.iso) is available for download:

https://my.vmware.com/web/vmware/details?downloadGroup=HP-ESXI-5.5.0U3A-GA&productId=353

It looks like the HPSA module was released as version 114:

scsi-hpsa 5.5.0.114-1OEM.550.0.0.1331820 Hewlett-Packard VMwareCertified

Could you please post your results if any of you are going to test this release/driver?

Thx!

Cop
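
Since this thread refers to driver builds by their last number (60, 84, 106, 114), here is a small Python sketch for pulling a comparable version out of strings like the two quoted in this thread. The helper name `hpsa_version` is made up, and the regex only covers the `scsi-hpsa` naming seen in those two strings; other formats return `None`.

```python
import re

# Matches "scsi-hpsa 5.5.0.114-1OEM..." and "scsi-hpsa-5.5.0.106-1OEM...".
# Assumption: the driver version is the four dotted numbers before the first "-".
VERSION_RE = re.compile(r"scsi-hpsa[-\s]+(\d+)\.(\d+)\.(\d+)\.(\d+)-")


def hpsa_version(text):
    """Return the hpsa driver version as a tuple of ints, or None if not found."""
    match = VERSION_RE.search(text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())
```

Tuples compare element by element, so `hpsa_version(a) < hpsa_version(b)` tells you which of two builds is older. It says nothing about which builds are stable; for that, the reports in this thread are all there is.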

maaca · ‎11-27-2015

Hello,

We have been running .114 for a few days on a few hosts, and it seems to be stable.

maaca

digitalnomad · ‎12-02-2015

Apparently problems still exist with the updated 114 HPSA driver. I opened a new thread:

Datastore / Disk latency problems with HP ProLiant G7 - HP Smart Array P410i controller " WARNING: L...

Datastore / Disk latency problems with HP ProLiant DL380 G7 - HP Smart Array P410i controller after SPP 2014.09.0 / hpsa-5.0.0.74 update