7 Replies Latest reply on Aug 17, 2009 5:14 AM by JohnBons

    Cannot find a path to device vmhba

    JohnBons Enthusiast

      Good morning!

      We have a problem with one of the hosts in our three-host cluster.

      The VMs were still online, but the host was not (according to vCenter). I couldn't log in to the console.

      After a reboot it was working again.

      But now I want to know where to start troubleshooting this host.

      Message was edited by: JohnBons Removed some details

        • 1. Re: Cannot find a path to device vmhba
          AndreTheGiant Guru
          User Moderators

          Do you have an FC SAN?

          Are there any errors on the storage and/or on the SAN switches?

           

          Andre

          • 2. Re: Cannot find a path to device vmhba
            JohnBons Enthusiast

             

            We're using NetApp iSCSI.

            I'm checking with our storage department whether there were any problems last weekend.

            I'm also asking our network department.

            • 3. Re: Cannot find a path to device vmhba
              AndreTheGiant Guru
              User Moderators

              But the error message was clear: you lost connectivity with your LUN.

               

              So check the network connection and make really sure that it is OK.

              Also run a failover test with multipathing (temporarily disable a path in the datastore's multipath configuration) to be sure that failover is working fine.
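
As a first pass at the network check, a minimal Python sketch along these lines could probe whether each iSCSI portal accepts TCP connections. It assumes the targets listen on the default iSCSI port 3260; the function names are illustrative only, and it tests plain TCP reachability from the machine it runs on, not the ESX software initiator or multipath failover itself.

```python
import socket

ISCSI_DEFAULT_PORT = 3260  # registered default port for iSCSI targets


def portal_reachable(host, port=ISCSI_DEFAULT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def check_portals(portals, timeout=3.0):
    """Map each (host, port) portal tuple to its reachability as a bool."""
    return {
        (host, port): portal_reachable(host, port, timeout)
        for host, port in portals
    }
```

Calling `check_portals([("192.0.2.10", 3260), ("192.0.2.11", 3260)])` (placeholder addresses) once per storage path, and again while one path is deliberately disabled, gives a rough picture of which portals stay reachable.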

               

              Andre

              • 4. Re: Cannot find a path to device vmhba
                JohnBons Enthusiast

                But the weird part is that the vCenter Server was reporting that the host was unavailable. And the console was reporting that a LUN was unavailable.

                Is that normal behavior for the vCenter Server?

                The network department didn't see any weird hiccups last weekend.

                • 5. Re: Cannot find a path to device vmhba
                  AndreTheGiant Guru
                  User Moderators

                  "And the console was reporting that a LUN was unavailable."

                  This IS the big problem, and it could be caused by a network problem.

                  "But the weird part is that the vCenter Server was reporting that the host was unavailable."

                  This could be related to a network problem, or to ESX being too slow to acknowledge vCenter (during a path rescan).

                   

                  Andre

                  • 6. Re: Cannot find a path to device vmhba
                    JohnBons Enthusiast

                    double post....

                    • 7. Re: Cannot find a path to device vmhba
                      JohnBons Enthusiast

                      I talked to VMware about this, and they gave me a couple of things to check.

                         1. Try to serialize operations on the shared LUNs; if possible, limit the number of operations on different hosts that require a SCSI reservation at the same time.

                         2. Increase the number of LUNs and try to limit the number of ESX hosts accessing the same LUN.

                         3. Avoid using snapshots as this causes a lot of SCSI reservations.

                         4. Do not schedule backups (VCB or console based) in parallel from the same LUN.

                         5. Try to limit the number of virtual machines per LUN.

                         6. What targets are being used to access LUNs?

                         7. Check if you have the latest HBA firmware across all ESX hosts.

                         8. Is the ESX running the latest BIOS (avoid conflict with HBA drivers)?

                         9. Contact your SAN vendor for information on SP timeout values and performance settings and storage array firmware.

                        10. Turn off third-party agents (storage agents) and RPMs not certified for ESX.

                        11. MSCS RDMs (the active node holds a permanent reservation).

                        12. Ensure correct Host Mode setting on the SAN array.

                        13. LUNs removed from the system without rescanning can appear as locked.

                        14. When SPs fail to release the reservation, either the request did not come through (hardware, firmware, pathing problems) or 3rd party apps running on the service console did not send the release. Busy virtual machine operations are still holding the lock.

                       

                      Note: Use of SATA disks is not recommended in high-I/O configurations, or when the above changes do not resolve the problem while SATA disks are in use.
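
The advice about serializing operations and not running backups in parallel on the same LUN can be sketched as a simple stagger plan. This is only a planning aid under stated assumptions: the job names, LUN labels and durations are hypothetical inputs, times are minutes from the start of the backup window, and nothing here calls a VMware or VCB API.

```python
from collections import defaultdict


def stagger_jobs(jobs):
    """Assign start/end times so jobs on the same LUN never overlap.

    `jobs` is a list of (name, lun, duration_minutes) tuples, processed in
    the order given. Jobs on different LUNs may run in parallel.
    Returns {name: (start_minute, end_minute)}.
    """
    next_free = defaultdict(int)  # LUN -> minute at which it is free again
    schedule = {}
    for name, lun, duration in jobs:
        start = next_free[lun]
        end = start + duration
        schedule[name] = (start, end)
        next_free[lun] = end  # the next job on this LUN waits until here
    return schedule
```

For example, two 30- and 20-minute jobs on the same LUN are planned back to back, while a job on a different LUN starts at minute 0 alongside the first one.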