Hi, I am having an issue with disk failures in vSAN even though the Dell server says the disk is fine. I reboot and the issue goes away for a while, then comes back when I try to create VMs. It appears that only the SSD flash drives are affected.
All servers and disks are new and reporting healthy. Fresh install of vSAN 6.
All servers and drives are on the HCL. The only issue is that VMware lists the wrong firmware for the SanDisk Optimus Ascend SSD. The version they list does not exist, from what I can tell from SanDisk's website.
3 Dell R630 Servers New
Each server has 1 SanDisk Optimus SSD and 2 Seagate 10k HDDs, all new
vSAN network consists of each server with 10GbE Intel X540 NICs
Switches are Netgear XS712T
Any insight would be helpful. Thanks
I had the same issue with another component. The firmware listed by VMware was not available on the manufacturer's website. I had to contact the manufacturer to obtain the proper firmware. Not sure if it would be Dell or SanDisk in your case, assuming the SanDisk was purchased OEM from Dell. Thank you, Zach.
Hi! Did you find out anything? I have the same hardware and the same problem.
The only difference is the flash drive causing the trouble: it's Intel, not SanDisk.
Dell suggested upgrading all firmware to the newest version, but the problem still keeps coming back.
When a disk exhibits very high latency for a long time, vSAN marks the device as degraded. This shows up as a disk failure.
The failed disk status is held in memory and is not persistent across reboots.
When the server is rebooted, the disk shows up as healthy again. If the device starts showing very high latencies again, you will see this issue again. The best solution for your problem would be to get the disk/firmware/driver combination right.
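To make the behavior described above concrete, here is a minimal Python sketch of a latency monitor that marks a disk degraded after sustained high latency and forgets that state on reboot. This is a toy model only, not vSAN's actual algorithm: the `DiskMonitor` class, the threshold of 50 ms, and the window of 5 samples are all hypothetical values chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical values for illustration only; not vSAN's real thresholds.
LATENCY_THRESHOLD_MS = 50.0
SUSTAINED_SAMPLES = 5


@dataclass
class DiskMonitor:
    """Toy model of a disk whose 'degraded' flag lives only in memory."""
    name: str
    degraded: bool = False
    recent: deque = field(
        default_factory=lambda: deque(maxlen=SUSTAINED_SAMPLES)
    )

    def record(self, latency_ms: float) -> None:
        """Store one latency sample; flag the disk if every sample in a
        full window exceeds the threshold."""
        self.recent.append(latency_ms)
        window_full = len(self.recent) == SUSTAINED_SAMPLES
        if window_full and all(s > LATENCY_THRESHOLD_MS for s in self.recent):
            self.degraded = True

    def reboot(self) -> None:
        """Nothing is persisted, so a reboot clears the flag and history."""
        self.recent.clear()
        self.degraded = False


ssd = DiskMonitor("ssd0")
for _ in range(SUSTAINED_SAMPLES):
    ssd.record(120.0)      # sustained high latency
print(ssd.degraded)        # flag is set
ssd.reboot()
print(ssd.degraded)        # flag is cleared until latency climbs again
```

The point of the sketch is that a reboot only resets the flag. If the underlying disk, firmware, or driver still produces high latency under load (for example when creating VMs), the flag will be set again.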
Hi,
Facing the same issue on an HP setup here: all LDs in RAID 0 showed as failed. I deleted all the LDs and re-created them, and the error seemed to be gone, but once the ESXi host was put back into production the issue popped up again. Using 5.5. Any suggestion/solution would be great.
Update: Got update from HP, seems like controller needs to be disk replaced even though it is showing healthy in ILO.
Regards,
Pranav
Sorry to piggyback on your configuration, but I have pretty much the same setup with 3 hosts, the same 10G card, and a Netgear XS708E switch (the smaller one), and I get bad results on the multicast test of the health check plugin.
Have you ever tried it? What are your results?
Many thanks for your help
Manuel
On the 31st of July this year we were contacted by Dell informing us of urgent firmware updates for some SSDs and some RAID controllers. I guess the main concern was the SSDs, because of some "hang" cases. So please use the latest Nautilus and update the drives. Attention: you HAVE TO use Nautilus in UEFI mode. If you installed ESXi 6 via BIOS, make sure to switch to UEFI before booting Nautilus, and make even more sure to switch back to BIOS before booting ESXi again. For the drive firmware you can't use LC at this time, even with the R730 series. So always have the latest Nautilus on hand after doing the usual SUU stuff.
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=TTRG8
Best,
Joerg
We had the same issue. After upgrading the Dell firmware and patching ESXi, the problem seems to be solved.
For details please see: