2 Replies Latest reply on Jun 7, 2013 7:52 AM by abcam

    LSI SAS error?

    brandontuch Lurker

      Hello all. I've been running an evaluation version of ESXi [5.1] at my company, as we're trying to move towards a more professional virtualization solution. However, I have an R515 that doesn't seem to be able to work with VMware. I can add it as a host, but whenever I try to make a new VM on it, or move a VM to it, it just sits there for over an hour, then says Disconnected. I did some poking around on the forums here and found out how to turn on the shell and SSH, then went in to look at the dmesg output. I found the following (apologies for the wall of text):

       

      2013-03-13T08:28:39.108Z cpu3:13162)WARNING: LinScsi: SCSILinuxAbortCommand:1949:The driver failed to call done from itsabort handler and yet it returned SUCCESS
      2013-03-13T08:28:39.108Z cpu3:13162)WARNING: LinScsi: SCSILinuxAbortCommands:1816:Failed, Driver LSI Logic SAS based MegaRAID driver, for vmhba2
      2013-03-13T08:28:39.305Z cpu3:13162)megasas: ABORT sn 7723822 cmd=0x2a retries=0 tmo=0
      2013-03-13T08:28:39.305Z cpu3:13162)<5>0 :: megasas: RESET -7723822 cmd=2a retries=0
      2013-03-13T08:28:39.305Z cpu3:13162)<3>megasas: cannot recover from previous reset failures
      2013-03-13T08:28:39.305Z cpu3:13162)WARNING: LinScsi: SCSILinuxAbortCommand:1949:The driver failed to call done from itsabort handler and yet it returned SUCCESS
      2013-03-13T08:28:39.305Z cpu3:13162)WARNING: LinScsi: SCSILinuxAbortCommands:1816:Failed, Driver LSI Logic SAS based MegaRAID driver, for vmhba2
      2013-03-13T08:28:39.630Z cpu3:13162)megasas: ABORT sn 7723825 cmd=0x2a retries=0 tmo=0
      2013-03-13T08:28:39.630Z cpu3:13162)<5>0 :: megasas: RESET -7723825 cmd=2a retries=0
      2013-03-13T08:28:39.630Z cpu3:13162)<3>megasas: cannot recover from previous reset failures

       

       

      There are pages and pages of that. It's a good server; before I put ESXi on it, the uptime was 16 months. I would like to use this server for VMs but don't know if this is something that can be fixed or not. Is this just incompatibility, or is the SAS card failing because of ESXi? It's a Dell H700, so I imagine that's a fairly popular (and supported) card.
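
      When the log runs to pages like this, it can help to tally the messages rather than read them one by one. The following is only a rough Python sketch: the function name `summarize`, the result keys, and the regular expressions are illustrative and derived solely from the log lines quoted above, not from any VMware or LSI tool. The one outside fact it relies on is that `cmd=0x2a` is the SCSI WRITE(10) opcode.

```python
import re
from collections import Counter

# Patterns are assumptions modeled on the quoted log lines only.
ABORT_RE = re.compile(r"megasas: ABORT sn (\d+) cmd=(0x[0-9a-fA-F]+)")
FAILED_RE = re.compile(r"SCSILinuxAbortCommands:\d+:Failed, Driver .*, for (vmhba\d+)")
UNRECOVERABLE = "cannot recover from previous reset failures"

# A few lines copied from the excerpt above, used as sample input.
SAMPLE = [
    "2013-03-13T08:28:39.108Z cpu3:13162)WARNING: LinScsi: SCSILinuxAbortCommands:1816:Failed, Driver LSI Logic SAS based MegaRAID driver, for vmhba2",
    "2013-03-13T08:28:39.305Z cpu3:13162)megasas: ABORT sn 7723822 cmd=0x2a retries=0 tmo=0",
    "2013-03-13T08:28:39.305Z cpu3:13162)<5>0 :: megasas: RESET -7723822 cmd=2a retries=0",
    "2013-03-13T08:28:39.305Z cpu3:13162)<3>megasas: cannot recover from previous reset failures",
    "2013-03-13T08:28:39.630Z cpu3:13162)megasas: ABORT sn 7723825 cmd=0x2a retries=0 tmo=0",
]


def summarize(lines):
    """Count aborted commands per SCSI opcode, failed aborts per adapter,
    and 'cannot recover' messages in an iterable of log lines."""
    aborts = Counter()
    adapters = Counter()
    unrecoverable = 0
    for line in lines:
        abort = ABORT_RE.search(line)
        if abort:
            aborts[abort.group(2).lower()] += 1
        failed = FAILED_RE.search(line)
        if failed:
            adapters[failed.group(1)] += 1
        if UNRECOVERABLE in line:
            unrecoverable += 1
    return {
        "aborts_by_opcode": dict(aborts),
        "failed_aborts_by_adapter": dict(adapters),
        "unrecoverable_resets": unrecoverable,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

      If every aborted command is a write and every failure points at the same adapter (vmhba2 here), that would be consistent with the controller itself being the common factor, though a tally like this cannot say why.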

        • 1. Re: LSI SAS error?
          anicdjw Lurker

          We are seeing this same error on an M610. Were you able to get this resolved?

          • 2. Re: LSI SAS error?
            abcam Lurker

            Hello,

            We just had the exact same problem on a Dell PowerEdge R610 equipped with a Dell PERC H700 after an upgrade from ESXi 4.1 to ESXi 5.1u1. The problem was caused by firmware version 12.10.3-0001 of the RAID controller, which reports false alarms about multibit ECC errors. The issue was corrected by upgrading the RAID controller firmware to the latest version, 12.10.4-0001 (A10). You can download the firmware upgrade here: Driver Details | Dell US. As you can see in the release notes, this version "Fixes an issue where seemingly random multibit ECC errors are seen".
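
            If you have several hosts to check, the comparison against the fixed firmware version can be scripted. This is a minimal Python sketch under two assumptions: that you have already read the installed firmware version off the controller by some other means, and that versions of the form `12.10.4-0001` can be ordered by comparing their numeric fields left to right. The helper names `parse_fw` and `needs_update` are made up for this example.

```python
import re

# Version that contains the multibit ECC fix, per the release notes quoted above.
FIXED_VERSION = "12.10.4-0001"


def parse_fw(version):
    """Split a firmware string such as '12.10.3-0001' into a tuple of ints.

    Assumes the 'A.B.C-D' layout seen in this thread; raises ValueError otherwise.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized firmware version: {version!r}")
    return tuple(int(part) for part in match.groups())


def needs_update(installed, fixed=FIXED_VERSION):
    """Return True if the installed firmware is older than the fixed release."""
    return parse_fw(installed) < parse_fw(fixed)


if __name__ == "__main__":
    for fw in ("12.10.3-0001", "12.10.4-0001"):
        print(fw, "needs update" if needs_update(fw) else "is at or past the fix")
```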