I have four identical HP ML350 Gen 9 servers.
Each with:
16 GB memory
RAID 10 consisting of 2 TB drives with two hot spares.
When using ESXi 6.5 I have had RAID failures on all three servers.
All three servers are running the latest firmware and all updates I can locate from HP.
The ESXi install is the HP distribution.
Typical example:
Yesterday I was creating two Windows Server 2012 VMs (stored on the RAID 10) and installing the OS from an ISO also stored on the RAID 10.
Each VM was thick provisioned with a 160 GB or 200 GB primary drive.
Provisioning completed without issue.
I then started both VMs to install the operating systems; each install reached 40-48% and then stopped.
There were no obvious errors, and the VMs were completely unresponsive to shutdown requests.
When I looked at the storage, the datastore was gone.
I checked iLO to see the health of my storage, and it reported a failed RAID.
Storage Info

Controller
Controller Status | OK |
Serial Number | N/A |
Model | Dynamic Smart Array B140i Controller |
Firmware Version | |
Controller Type | HPE Smart Array |

Drive enclosures
Status | OK |
Drive Bays | 4 |

Status | OK |
Drive Bays | 4 |

Logical drive
Status | Failed |
Capacity | 3725 GiB |
Fault Tolerance | RAID 1/RAID 1+0 |
Logical Drive Type | Data LUN |
Encryption Status | Not Encrypted |

Physical drives
Status | OK |
Serial Number | Y67HK2LXF1BA |
Model | MB2000GDUNV |
Media Type | HDD |
Capacity | 2000 GB |
Location | Port 1I Box 3 Bay 1 |
Firmware Version | HPG4 |
Drive Configuration | Configured |
Encryption Status | Not Encrypted |

Status | Failed |
Serial Number | Y652K472F1BA |
Model | MB2000GDUNV |
Media Type | HDD |
Capacity | 2000 GB |
Location | Port 1I Box 3 Bay 2 |
Firmware Version | HPG4 |
Drive Configuration | Configured |
Encryption Status | Not Encrypted |

Status | OK |
Serial Number | N4G6ZSTY |
Model | MB2000GFEMH |
Media Type | HDD |
Capacity | 2000 GB |
Location | Port 1I Box 3 Bay 3 |
Firmware Version | HPG2 |
Drive Configuration | Configured |
Encryption Status | Not Encrypted |

Status | Failed |
Serial Number | N4G70M0Y |
Model | MB2000GFEMH |
Media Type | HDD |
Capacity | 2000 GB |
Location | Port 1I Box 3 Bay 4 |
Firmware Version | HPG2 |
Drive Configuration | Configured |
Encryption Status | Not Encrypted |
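For anyone comparing dumps like this across several hosts, here is a rough Python sketch that pulls the failed entries out of pasted iLO storage text. It assumes the "Key | Value |" layout shown above and that every logical or physical drive entry begins with a "Status" line; the names `ILO_DUMP`, `parse_records`, and `failed_records` are made up for illustration and are not part of any HPE or VMware tool. The sample text is a shortened copy of the values from this post.

```python
# Shortened copy of the iLO storage text from this post (logical drive + 4 drives).
ILO_DUMP = """\
Status | Failed |
Capacity | 3725 GiB |
Fault Tolerance | RAID 1/RAID 1+0 |
Status | OK |
Serial Number | Y67HK2LXF1BA |
Location | Port 1I Box 3 Bay 1 |
Status | Failed |
Serial Number | Y652K472F1BA |
Location | Port 1I Box 3 Bay 2 |
Status | OK |
Serial Number | N4G6ZSTY |
Location | Port 1I Box 3 Bay 3 |
Status | Failed |
Serial Number | N4G70M0Y |
Location | Port 1I Box 3 Bay 4 |
"""


def parse_records(text):
    """Split 'Key | Value |' lines into dicts; a 'Status' line starts a new record."""
    records = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) < 2 or not parts[0]:
            continue  # skip blank lines and lines without a key/value pair
        key, value = parts[0], parts[1]
        if key == "Status" or not records:
            records.append({})
        records[-1][key] = value
    return records


def failed_records(records):
    """Return only the records whose Status field is 'Failed'."""
    return [r for r in records if r.get("Status") == "Failed"]


if __name__ == "__main__":
    for rec in failed_records(parse_records(ILO_DUMP)):
        # Physical drives carry a Location; the logical drive does not.
        label = rec.get("Location", "logical drive")
        print(label, rec.get("Serial Number", ""))
```

Run against the text above, this flags the logical drive plus the drives in Bay 2 and Bay 4. It only reads what iLO reported at the time of the dump, so it says nothing about why the drives show as healthy again after a reboot.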
Interestingly, when I reboot the hardware, it shows those failed drives as being just fine. This is also the same failure I see on my other servers.
I have also tested the drives individually, and they are good.
Thoughts?
I have created a VMware support bundle from the server if anybody wants to view it.
I am having this exact same issue on vSphere 6.5 with an HPE DL380 Gen9 with twelve 2 TB drives in a RAID 6 on a P440ar controller. Every couple of weeks the storage disappears. In my case iLO also shows the storage as being healthy, and a reboot brings back the volume/datastore in vSphere.
Unfortunately, that is not the case for me. I have to recreate the array, which destroys everything.
It is odd, because this server had been rock stable before moving to 6.5.
The saga continues.
Yesterday HP sent out two new drives.
Today I built out the RAID and created a few VMs. About two hours later the RAID failed. This time it was not the new drives, it was the existing drives. Arg!
Hi,
I have the identical problem with an HPE ML110 and a B140i.
Four WD Red Pro 4 TB drives are configured as one RAID 10 LUN.
ESXi 6.5 is installed from the HP ISO on a microSD card. Everything works fine until I stress the storage with a backup; then the LUN fails with two mirrored disks marked as failed (out of the total of four, as in your case). I reboot the host, go to the Smart Array page, and all the drives are OK. I re-enable the LUN and it works fine... until the next crash. It seems to be a software bug, but a huge one!
I tried updating ESXi with all the latest patches using Update Manager, but with no luck: same problem.
Did you find a solution?
Thanks a lot.
Best Regards.