42 Replies Latest reply on Jul 2, 2014 12:13 PM by laura_g

    New ESXi 5.5 Install threw PSOD, Raid controller driver?

    Dirtrunner Lurker

      Can someone take a look at the PSOD I got on a new install of 5.5?

       

      Installed this on Friday night and this Monday morning it was sitting at a purple screen. Ran fine all weekend as far as I can tell.

       

      It's a DL380p Gen8 with both P420i and P420 RAID controllers,

      using the HP ESXi 5.5.0 ISO, build 1331820.

       

      I think it's complaining about the RAID controller, but I can't say for sure.

       

      Looking at the vmkernel log, I'm seeing this line over and over:

      2014-03-09T13:08:07.636Z cpu2:286677)<4>hpsa 0000:02:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562

       

       

      2014-03-09T14:30:45.964Z cpu11:303182)<4>hpsa 0000:0a:00.0: cp 0x410a2b700000 has status 0x2 Sense: 0x5, ASC: 0x20, ASCQ: 0x0, Returning result: 0x2
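The second log line carries a SCSI status and sense data that can be decoded by hand. A minimal sketch in Python follows, assuming the message layout shown above; the function name `decode_hpsa_sense` and the lookup tables are illustrative and cover only a few standard SCSI codes, not everything the driver can log.

```python
import re

# Partial lookup tables of standard SCSI codes. Anything not listed here is
# reported as "unknown" rather than guessed.
SCSI_STATUS = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x08: "BUSY"}
SENSE_KEY = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}
ASC_ASCQ = {
    (0x20, 0x00): "INVALID COMMAND OPERATION CODE",
    (0x24, 0x00): "INVALID FIELD IN CDB",
}

# Assumed layout, taken from the hpsa line quoted in this post.
SENSE_RE = re.compile(
    r"hpsa (?P<pci>[0-9a-f:.]+): cp \S+ has status (?P<status>0x[0-9a-f]+) "
    r"Sense: (?P<key>0x[0-9a-f]+), ASC: (?P<asc>0x[0-9a-f]+), "
    r"ASCQ: (?P<ascq>0x[0-9a-f]+)"
)


def decode_hpsa_sense(line):
    """Return a dict describing the sense data, or None if the line does not match."""
    match = SENSE_RE.search(line)
    if match is None:
        return None
    status = int(match.group("status"), 16)
    key = int(match.group("key"), 16)
    asc = int(match.group("asc"), 16)
    ascq = int(match.group("ascq"), 16)
    return {
        "pci": match.group("pci"),
        "status": SCSI_STATUS.get(status, "unknown"),
        "sense_key": SENSE_KEY.get(key, "unknown"),
        "additional": ASC_ASCQ.get((asc, ascq), "unknown"),
    }
```

For the line above this yields CHECK CONDITION / ILLEGAL REQUEST / INVALID COMMAND OPERATION CODE, which in SCSI terms means the device rejected a command it does not support. That is a different kind of message from the repeated "out of memory" line.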

       

      I attached vmkwarning.log, vmkernel.log, and a screenshot of the error.

       

      Thank you guys!

        • 1. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
          grasshopper Virtuoso
          vExpert

          Hi,

          Indeed it appears that you have discovered a potential memory leak that is starving the vmkernel or otherwise causing mayhem.  Please ssh into the host and capture a support bundle asap while it's still fresh.  Do this by typing vm-support.  The logs will be saved in /var/tmp.

           

          You should also grab the vmkernel zdump by performing an esxcfg-dumppart -L against the zdump filename.  This creates a file that is very useful for support in debugging the diagnostic screen.  Remember that the zdump you are targeting is probably in /var/core, but the file it outputs will be saved in your current directory.  So I like to cd to the /var/tmp directory before running this.  That way all my logs are in one place.

           

          Optional:  This is contained in the vm-support bundle, but may be handy for quick reference.  Run an 'esxcli software vib list > /var/tmp/my-vibs.txt' so you have a list of all software installed on the host.

           

          Gather all that using WinSCP (in SCP mode, not SFTP) and share with HP and VMware Support (also include the screenshot for good measure).  HP will be interested to know which SPP you are running (i.e. is firmware the latest, etc.).  Also check the iLO logs to see if there were any disk failures or raid rebuilds that were aggravating the issue.

           

          Let us know if you need anything.

          • 2. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
            Jon Munday Master
            vExpert

            Agreed with grasshopper.

             

            Check what version of the scsi-hpsa VIB you have installed, and update this if required (including hardware firmware).

             

            Here is an example from my nested lab running on my laptop:

             

            ~ # esxcli software vib list | grep -i scsi-hpsa
            scsi-hpsa                      5.5.0-44vmw.550.0.0.1331820           VMware  VMwareCertified   2013-12-09

            ~ # esxcli software vib get -n scsi-hpsa
            VMware_bootbank_scsi-hpsa_5.5.0-44vmw.550.0.0.1331820
               Name: scsi-hpsa
               Version: 5.5.0-44vmw.550.0.0.1331820
               Type: bootbank
               Vendor: VMware
               Acceptance Level: VMwareCertified
               Summary: hpsa: scsi driver for VMware ESX
               Description: HP Smart Array SCSI Driver
               ReferenceURLs:
               Creation Date: 2013-09-19
               Depends: vmkapi_2_2_0_0, com.vmware.driverAPI-9.2.2.0
               Conflicts:
               Replaces:
               Provides:
               Maintenance Mode Required: True
               Hardware Platforms Required:
               Live Install Allowed: False
               Live Remove Allowed: False
               Stateless Ready: True
               Overlay: False
               Tags: driver, module
               Payloads: scsi-hps

            ~ #
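When comparing several hosts, the same check can be scripted against saved `esxcli software vib list` output. A minimal sketch in Python, assuming the five whitespace-separated columns shown above (name, version, vendor, acceptance level, install date) with no spaces inside a field; `find_vib` is a hypothetical helper, not part of any VMware tooling.

```python
def find_vib(listing, name):
    """Find one VIB row in the text output of 'esxcli software vib list'.

    Assumes whitespace-separated columns in the order:
    name, version, vendor, acceptance level, install date.
    Returns a dict for the first exact name match, or None.
    """
    for row in listing.splitlines():
        fields = row.split()
        if len(fields) >= 5 and fields[0] == name:
            return {
                "name": fields[0],
                "version": fields[1],
                "vendor": fields[2],
                "acceptance": fields[3],
                "installed": fields[4],
            }
    return None
```

With the listing saved to a file such as the `my-vibs.txt` mentioned earlier in the thread, `find_vib(open("my-vibs.txt").read(), "scsi-hpsa")` would return the version and vendor in one step, which makes it easy to spot whether a host carries the VMware inbox driver or the HP OEM one.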

             

            Looks like there were previous issues with this VIB, so I wouldn't be too surprised if not all of them were resolved:

            VMware KB: VMware ESXi 5.0, Patch ESXi500-201310204-UG: Updates VMware ESXi 5.0 scsi-hpsa vib

             

            Cheers,

            Jon

            • 3. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
              Dirtrunner Lurker

              Thank you both. Your help is immensely appreciated!

               

              I will open a ticket with support, just as grasshopper recommends.

               

              As for jrmunday's post, I used the latest HP ISO, which installed the scsi-hpsa 5.5.0.58-1OEM.550.0.0.1331820 VIB.

               

              I also checked the firmware on the RAID controller. It is at 3.04 and the current version seems to be 5.22. The backplane expander also has firmware updates that can be applied.

               

              I'm going to schedule a maintenance window for this server, flash the latest firmware, and report back on whether the logs still show that out of memory message or anything else funky. Hopefully this will help someone else out one day.

              • 4. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                Jon Munday Master
                vExpert

                Hopefully the latest firmware and drivers will help resolve this. I had an issue recently where HP-branded QLogic 2560 HBAs had old firmware but the hosts (HP DL380p Gen8) had new drivers. In that case the FC ports would constantly flap up and down and crash the hosts with a PSOD. As soon as I flashed the firmware and updated the drivers to the latest supported version, all the issues disappeared.

                • 5. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                  YVesli Lurker

                  I have the same problem as well. Did the firmware update help resolve the issue?

                  • 6. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                    Dirtrunner Lurker

                    Yes, updating the firmware solved the issue, along with installing the latest drivers for the controllers.

                    • 7. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                      javier_dp Novice

                      I hit this yesterday as well. Uptime was 18 days.

                      Server is a BL460c Gen8, ESXi build 1623387.

                       

                      Hewlett-Packard_bootbank_scsi-hpsa_5.5.0.58-1OEM.550.0.0.1331820
                         Name: scsi-hpsa
                         Version: 5.5.0.58-1OEM.550.0.0.1331820
                         Type: bootbank
                         Vendor: Hewlett-Packard
                         Acceptance Level: VMwareCertified
                         Summary: hpsa: scsi driver for VMware ESX
                         Description: HP Smart Array SCSI Driver
                         ReferenceURLs:
                         Creation Date: 2013-12-16
                         Depends: vmkapi_2_2_0_0, com.vmware.driverAPI-9.2.2.0
                         Conflicts:
                         Replaces:
                         Provides:
                         Maintenance Mode Required: True
                         Hardware Platforms Required:
                         Live Install Allowed: False
                         Live Remove Allowed: False
                         Stateless Ready: True
                         Overlay: False
                         Tags: driver, module
                         Payloads: scsi-hps

                      • 8. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                        jonsaville Lurker

                        We have seen this twice since upgrading to 5.5.0 in February and 5.5.0u1 in March.

                         

                        Server is an HP DL160 G6, latest BIOS. Storage is a P410 with four SATA drives in two mirrors, plus iSCSI for near-line storage. Minimally loaded.

                         

                        We upgraded to 5.5.0u1 in response to the first incident, but have just seen our second. P410 firmware was out of date (v5), patched today to 6.40.

                         

                        scsi-hpsa is 5.5.0.58-1OEM.550.0.0.1331820 and seems to be the culprit:

                         

                        2014-04-12T09:26:42.353Z cpu0:6329569)<4>hpsa 0000:07:00.0: cmd_special_alloc returned NULL!

                        2014-04-12T09:26:42.358Z cpu0:6329569)<4>hpsa 0000:07:00.0: out of memory at vmkdrivers/src_9/drivers/hpsa/hpsa.c:3562

                        2014-04-12T09:27:12.354Z cpu0:6329642)WARNING: LinDMA: dma_alloc_coherent:726: Out of memory

                         

                        Our PSOD is slightly different (Failed to ack TLB invalidate). Attached.
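To see how long a host has been logging the hpsa "out of memory" message before it falls over, the lines can be tallied per controller. A minimal sketch in Python, assuming the vmkernel.log line layout quoted in this thread (timestamp first, then `hpsa <PCI address>: out of memory at ...`); `summarize_hpsa_oom` is a hypothetical name used for illustration.

```python
import re

# Matches the hpsa out-of-memory message and captures the PCI address,
# e.g. "hpsa 0000:07:00.0: out of memory at vmkdrivers/...".
OOM_RE = re.compile(
    r"hpsa (?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]): out of memory at "
)


def summarize_hpsa_oom(lines):
    """Map each PCI address to (count, first timestamp, last timestamp)."""
    summary = {}
    for line in lines:
        match = OOM_RE.search(line)
        if match is None:
            continue
        # Assumption: the timestamp is the first whitespace-separated token.
        timestamp = line.strip().split(None, 1)[0]
        pci = match.group("pci")
        count, first, _ = summary.get(pci, (0, timestamp, timestamp))
        summary[pci] = (count + 1, first, timestamp)
    return summary
```

Run against a copy of the log on a workstation, for example `summarize_hpsa_oom(open("vmkernel.log"))`. A count that keeps growing on one controller between reboots would be consistent with the leak described earlier in the thread, though it does not prove the cause.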

                         

                        This machine (and another identical) were running with ESXi 5.0u1 for 18 months with zero problems, so this is a little frustrating.

                        PSoD.JPG

                        • 9. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                          MillardJK Enthusiast
                          vExpert

                          Ding ding ding!

                           

                          Add another install with the issue. Similar to Javier, we're on BL460c Gen8 w/5.5.0. The most annoying part: these are diskless blades, booting from SD card and using FC SAN for all storage. There isn't a newer version from HP, so I'm just going to remove the VIB and hope that does the trick.

                          • 10. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                            rabittom Novice

                            Hi all,

                             

                            I updated a BL460 G7 a couple of weeks ago and it went smoothly. After two weeks of uptime I was faced with a PSOD.

                            VMware support analyzed it and told me to update the HP firmware (which was outdated at that time).

                            I ran SPP 2014.02 with the latest hotfixes and patched to ESXi build 1746018.

                            After one week of uptime the same thing happened again: a PSOD with the same indicators.

                             

                            Last weekend I had the same experience on another Bladecenter with the same blade types.

                            Two hosts crashed with a PSOD at more or less the same time.

                            I opened a case with VMware support and they told me the following:

                             

                            [Snip]

                            This is a known issue to VMware.
                            It is a problem with the hpsa driver.

                            We are advising anyone that has this PSOD to open a HP Support Request and
                            reference HP case 4648045806.

                            HP are working on an updated driver to resolve this issue.

                            [Snip]

                             

                            I've contacted HP Brazil (where the affected Bladecenter is located). They told me that HP is aware of this issue and has been working on it since April 2014.

                            I've contacted HP US (where I have some connections): no answer yet.

                            I've contacted HP Austria (where our HQ is located): no answer yet.

                             

                            I've stopped the upgrade to 5.5 for the remaining 50+ hosts until HP finds it worthwhile to inform its customers.

                            BTW: I do have a c7000 with BL460c Gen8 blades (all on build 1746018), with no problems so far.

                             

                            CU

                             

                            • 11. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                              klimenta Novice

                              I had the same issue yesterday. The screenshot from the original post is pretty much the same as what I got on a BL460c Gen8 blade with a P220i Smart Array controller.

                               

                              Call VMware; they have an updated driver that is only available through them at this point.

                              scsi-hpsa-5.5.0.58-2OEM.550.0.0.1198611.x86_64.vib is the file that needs to be uploaded and installed.

                              The original driver is the same version but with a "-1OEM" suffix.
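Because the two builds differ only in the release tag, a script that audits hosts needs to compare more than the leading driver version. A minimal sketch in Python, assuming the version strings follow the pattern seen in this thread, `<driver version>-<release number><tag>.<target build>`; that pattern is inferred from the examples here and is not an official format description, and `parse_vib_version` and `is_newer` are hypothetical helpers.

```python
import re

# Assumed pattern, e.g. "5.5.0.58-2OEM.550.0.0.1198611":
#   driver version "5.5.0.58", release number 2, tag "OEM",
#   target build "550.0.0.1198611".
VERSION_RE = re.compile(
    r"^(?P<driver>\d+(?:\.\d+)*)-(?P<release>\d+)(?P<tag>[A-Za-z]+)\.(?P<target>[\d.]+)$"
)


def parse_vib_version(version):
    """Split a VIB version string into its assumed components."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognized VIB version: %r" % version)
    return {
        "driver": tuple(int(part) for part in match.group("driver").split(".")),
        "release": int(match.group("release")),
        "tag": match.group("tag"),
        "target": match.group("target"),
    }


def is_newer(candidate, installed):
    """True if candidate has a higher (driver version, release number) pair.

    The target build suffix is deliberately ignored, since the updated
    driver in this thread carries a lower build number than the original.
    """
    a = parse_vib_version(candidate)
    b = parse_vib_version(installed)
    return (a["driver"], a["release"]) > (b["driver"], b["release"])
```

For example, `is_newer("5.5.0.58-2OEM.550.0.0.1198611", "5.5.0.58-1OEM.550.0.0.1331820")` evaluates to true under these assumptions, even though the trailing build number went down.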

                              • 12. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                                jfbordenjr Lurker

                                Did scsi-hpsa-5.5.0.58-2OEM.550.0.0.1198611.x86_64.vib help you?  I have installed it and am testing now.

                                 

                                Thanks,

                                 

                                John

                                • 13. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                                  klimenta Novice

                                  So far so good with "2OEM". Nine days without a problem.

                                  • 14. Re: New ESXi 5.5 Install threw PSOD, Raid controller driver?
                                    rabittom Novice

                                    I've talked to VMware support and they confirmed that this is a known bug. HP released that driver version, but it was a beta, and by the time I called HP no longer allowed it to be deployed.

                                    The guy told me that HP released internal information saying that, "if everything goes fine", they will have an official driver ready by this weekend. So we have to wait...

                                     

                                    This is the official statement from VMware support:

                                    ************************************************************************

                                    The reoccurring PSOD that your ESXi hosts are receiving is a known issue.

                                    VMware and HP are working closely on finding a resolution for this problem.

                                    We have identified that the problem is with the hpsa driver.

                                    We know that this PSOD is caused by an out of memory condition but we don't know what triggers the issue.

                                    HP are currently working around the clock to release an updated driver that should resolve this issue.

                                    All things going well HP are hoping to have the driver this weekend, please note this is subject to change.

                                     

                                    We currently only have internal documentation on this issue.

                                     

                                    However, VMware and HP are currently working on a public facing document that we can provide to customers who hit this issue.

                                    ************************************************

                                    I've been working with HP for quite a long time, so I assume the driver will not be ready this weekend. Any chance I could get the offline bundle from you?

                                     

                                    thanks

                                    tom
