We have an LSI MegaRAID 9240-8i running in VMDirectPath I/O pass-through mode on an ESXi 5.5 host, with CentOS 7 as the guest VM. All ran well for 3 months; then on Friday at 10:06 the error log filled with messages, starting with this:
fs1 kernel: sd 0:2:0:0: [sdb] megasas: RESET cmd=88 retries=0
megasas: [ 0]waiting for 6 commands to complete
megasas: [ 5]waiting for 6 commands to complete
.
megasas: [175]waiting for 6 commands to complete
megasas: moving cmd[0]:ffff8802b0631a20:0:ffff8802adb4c1c0 the defer queue as internal
megasas: moving cmd[1]:ffff8802b0631a20:0:ffff8802adb4c1c0 the defer queue as internal
megasas: moving cmd[2]:ffff8802b0631a20:0:ffff8802adb4c1c0 the defer queue as internal
megasas: moving cmd[3]:ffff8802b0631a20:0:ffff8802adb4c1c0 the defer queue as internal
megasas: moving cmd[4]:ffff8802b0631a20:0:ffff8802adb4c1c0 the defer queue as internal
megasas: moving cmd[5]:ffff8802b0631a20:0:ffff8802adb4c1c0 the defer queue as internal
megasas: Waiting for FW to come to ready state
megasas: FW now in ready state
Megasas has been mostly quiet since Friday, but the server has been offline. WebBIOS says the 4 drives are Online, so I'm not sure where to hunt next.
Could I have a bad RAID controller, or is there a more likely explanation for my problem?
Other VMs on the host are functioning as expected; they don't interface with the RAID array.
It's working now.
Turns out the 9240 was checking the array for consistency. I stopped the check and all is well. Looking for a new RAID controller.
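For anyone who hits the same thing, here is a rough sketch of a script that summarizes megasas reset activity in a kernel log, so you can see how often the controller stalls and whether the firmware recovers. The regular expressions are written against the exact line formats quoted above and nothing else; the function name `summarize_megasas` and the field names in the result are made up for illustration, and other driver versions may word these messages differently.

```python
import re

# Patterns are assumptions based only on the log lines quoted in this post.
# e.g. "sd 0:2:0:0: [sdb] megasas: RESET cmd=88 retries=0"
RESET_RE = re.compile(
    r"\[(?P<dev>sd[a-z]+)\] megasas: RESET cmd=(?P<cmd>[0-9a-fA-F]+) retries=(?P<retries>\d+)"
)
# e.g. "megasas: [175]waiting for 6 commands to complete"
WAIT_RE = re.compile(
    r"megasas: \[\s*(?P<counter>\d+)\]waiting for (?P<pending>\d+) commands to complete"
)


def summarize_megasas(lines):
    """Summarize megasas reset activity found in an iterable of log lines.

    Returns a dict with hypothetical field names:
      resets           - list of (device, cmd) tuples, one per RESET line
      max_wait_counter - largest bracketed counter seen on a "waiting" line
      max_pending      - largest number of outstanding commands reported
      fw_ready         - True if "FW now in ready state" appeared
    """
    resets = []
    max_wait_counter = 0
    max_pending = 0
    fw_ready = False

    for line in lines:
        reset = RESET_RE.search(line)
        if reset:
            resets.append((reset.group("dev"), reset.group("cmd")))
            continue

        wait = WAIT_RE.search(line)
        if wait:
            max_wait_counter = max(max_wait_counter, int(wait.group("counter")))
            max_pending = max(max_pending, int(wait.group("pending")))
            continue

        if "megasas: FW now in ready state" in line:
            fw_ready = True

    return {
        "resets": resets,
        "max_wait_counter": max_wait_counter,
        "max_pending": max_pending,
        "fw_ready": fw_ready,
    }
```

To use it, pass in the lines of whatever file your system writes kernel messages to, for example `summarize_megasas(open("/var/log/messages"))`; that path is an assumption about a default CentOS 7 setup. A high wait counter with a steady number of pending commands, followed by the firmware returning to ready, matches what I saw while the consistency check was running.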