15 Replies Latest reply on Apr 22, 2020 9:02 AM by nachogonzalez

    ESX Host Freezes Randomly

    AndrewAdvnetsol Novice

      I have an ESX 6.5.0 Update 2 (Build 8294253) standalone host that randomly locks up.  I know it is locked up because my running VM isn't accessible, I cannot connect to the host using the web GUI, and when I go to the console I cannot log in.  I cannot even get a response from the keyboard by pressing the Num Lock key.  My only option is to power off the physical server and then turn it on again.

       

      I am new to dealing with VMware logs.  Is there anything I need to turn on or set up to better log what is happening?  What should I be looking for, and in which log file should I be looking?

       

      I generated a support bundle from the last 2 times the server locked up so I can post logs if anyone would like to look at them.

       

      I appreciate any help anyone has to offer.

       

      Thank you.

        • 1. Re: ESX Host Freezes Randomly
          daphnissov Guru
          Community Warriors, vExpert

          What is the hardware on which you are running this ESXi host?

          • 2. Re: ESX Host Freezes Randomly
            AndrewAdvnetsol Novice

            It is a brand X box that was lying around unused.  It has a Xeon E3-1220 V2 CPU and 16GB of RAM.  I did put in another NIC because I thought the onboard NIC might be an issue.  Beyond that I don't know exactly what is in it.

            • 3. Re: ESX Host Freezes Randomly
              daphnissov Guru
              Community Warriors, vExpert

              So totally unsupported hardware to begin with then.

              • 4. Re: ESX Host Freezes Randomly
                AndrewAdvnetsol Novice

                Sorry it took me so long to get back to you.  Yes it is unsupported hardware, though I have 4 or 5 other servers running on unsupported hardware and have no issues.

                • 5. Re: ESX Host Freezes Randomly
                  nachogonzalez Enthusiast

                  Hi AndrewAdvnetsol
                  Do you have remote console access to the server to check if there is a PSOD or to check the logs?
                  Do you have a Syslog?

                  Are you using external storage (SAN, NAS, iSCSI, etc)?

                  The vmkernel.log might give a good hint here.

                   

                  But based on what you said about the unsupported hardware, my guess is that it is causing the issues.

                   

                  Looking forward to hearing from you

                   

                  Regards

                  • 6. Re: ESX Host Freezes Randomly
                    AndrewAdvnetsol Novice

                    Yes I have console access.  When it locks up there is no PSOD.  It is just sitting at the console screen, but I don't get any response out of the keyboard so I am unable to login.

                    I do have logs, but I do not know what I am looking for in the logs.

                    There is no external storage.

                     

                    Attached are the logs.

                    • 7. Re: ESX Host Freezes Randomly
                      Tayfun DEGER Expert
                      vExpert

                      This kind of ESXi host lockup is usually caused by the storage connection. When an APD (All Paths Down) or PDL (Permanent Device Loss) condition occurs, the ESXi host can lock up, and you cannot log in at the console or through the web GUI. Have you checked the Tasks and Events section? Do you see warnings such as "Lost access to volume" there?

                      Also, what is the hardware brand and model? Did you install from a custom ESXi ISO?

                      --
                      Blog: https://www.tayfundeger.com
                      Twitter: https://www.twitter.com/tayfundeger

                      vBlogger, vExpert, Cisco Champions

                      Please mark this reply as "Helpful" if it helped, or as "Correct Answer" if it solved your problem.
                      • 8. Re: ESX Host Freezes Randomly
                        Tayfun DEGER Expert
                        vExpert

                        You are having a problem accessing the disks. This may be a driver or firmware problem, or a defective part. What is the hardware's brand and model? Did you install with a custom ESXi ISO?

                        • 9. Re: ESX Host Freezes Randomly
                          nachogonzalez Enthusiast

                          Hey bud, can you please tell me roughly when the error occurred within that log bundle?

                           

                          warm regards

                          • 10. Re: ESX Host Freezes Randomly
                            AndrewAdvnetsol Novice

                            I don't know the exact time but I think somewhere right around 10:20 pm on March 21st based on failed backups.
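A rough wall-clock time like this is enough to narrow the log down. The timestamps in vmkernel.log end in `Z`, meaning they are in UTC, so a local time has to be converted before comparing. Below is a minimal Python sketch of that filtering step; the function names, the 30-minute default window, and the UTC offset are all illustrative assumptions (the thread does not say which timezone the server is in).

```python
from datetime import datetime, timedelta, timezone


def parse_ts(line):
    """Return the UTC timestamp at the start of a vmkernel.log line, or None."""
    stamp = line.split(" ", 1)[0]
    try:
        parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None  # continuation lines, blank lines, truncated stamps
    return parsed.replace(tzinfo=timezone.utc)


def lines_near(lines, local_time, utc_offset_hours, minutes=30):
    """Keep lines stamped within +/- `minutes` of a local wall-clock time.

    `utc_offset_hours` is the server's offset from UTC (an assumption the
    caller must supply; it is not recorded in the log lines themselves).
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    center = local_time.replace(tzinfo=local_tz).astimezone(timezone.utc)
    window = timedelta(minutes=minutes)
    return [
        line
        for line in lines
        if (ts := parse_ts(line)) is not None and abs(ts - center) <= window
    ]
```

For example, `lines_near(open("vmkernel.log"), datetime(2020, 3, 21, 22, 20), utc_offset_hours=-5)` would keep entries around 03:20Z on March 22, assuming (hypothetically) that the server's clock is five hours behind UTC.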

                            • 11. Re: ESX Host Freezes Randomly
                              AndrewAdvnetsol Novice

                              How did you determine I was having trouble accessing the disks?  I don't know what brand it is offhand.  All I know offhand is that it has a Xeon E3-1220 V2 CPU on the Sandy Bridge platform.  It has 16GB of RAM.  I did put in another NIC because I thought the onboard NIC might be an issue.  Beyond that I don't know exactly what is in it.

                              • 12. Re: ESX Host Freezes Randomly
                                AndrewAdvnetsol Novice

                                Sorry I forgot to tell you that it is not a custom ISO for ESXi.  It is the standard one downloaded from the VMware website.

                                • 13. Re: ESX Host Freezes Randomly
                                  nachogonzalez Enthusiast

                                  I see two interesting things:

                                   

                                   

                                   

                                  1. The vmkernel.log you provided is filled with this entry:

                                   

                                  2020-03-19T15:14:18.686Z cpu3:69553)MemSched: 14635: Admission failure in path: hostd/python.69553/uw.69553

                                  2020-03-19T15:14:18.686Z cpu3:69553)MemSched: 14642: uw.69553 (22449) extraMin/extraFromParent: 64/64, hostd (705) childEmin/eMinLimit: 82684/82688


                                  Please check
                                  VMware Knowledge Base

                                   

                                  2. There are lots of SCSI errors

                                   

                                   

                                  2020-03-21T14:35:06.033Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x4395008c9b80) 0x1a, CmdSN 0x75cc from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T14:59:09.876Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x4395008ecf00) 0x1a, CmdSN 0x762c from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T15:19:10.114Z cpu2:66064)ScsiDeviceIO: 2954: Cmd(0x4395009f8b00) 0x1a, CmdSN 0x768b from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T15:35:38.121Z cpu3:66064)ScsiDeviceIO: 2954: Cmd(0x439500935580) 0x1a, CmdSN 0x76cb from world 0 to dev "naa.600605b006eb6730220acb13229f15cf" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T15:41:08.184Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x439500954300) 0x1a, CmdSN 0x76ea from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T16:05:38.200Z cpu3:66064)ScsiDeviceIO: 2954: Cmd(0x439500976c00) 0x1a, CmdSN 0x31e from world 67393 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T16:35:08.873Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395009d4d00) 0x1a, CmdSN 0x77b4 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T16:59:11.345Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395009d1c00) 0x1a, CmdSN 0x7814 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T17:19:11.541Z cpu2:66064)ScsiDeviceIO: 2954: Cmd(0x43950098c300) 0x1a, CmdSN 0x7873 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T17:35:38.481Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x43950098f780) 0x1a, CmdSN 0x78ae from world 0 to dev "naa.600605b006eb6730220acb13229f15cf" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T17:41:09.790Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x439500991000) 0x1a, CmdSN 0x78d2 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T18:05:38.562Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x439500906180) 0x1a, CmdSN 0x33e from world 67393 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T18:35:11.591Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395009cd980) 0x1a, CmdSN 0x799c from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T18:59:12.665Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x439500927580) 0x1a, CmdSN 0x79fc from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T19:05:38.724Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x43950094f600) 0x4d, CmdSN 0x349 from world 67393 to dev "naa.600605b006eb6730220acb13229f15cf" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.

                                  2020-03-21T19:19:12.895Z cpu2:66064)ScsiDeviceIO: 2954: Cmd(0x439500948d00) 0x1a, CmdSN 0x7a5b from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T19:41:11.327Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x43950094d680) 0x1a, CmdSN 0x7aba from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T20:05:38.908Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x439500993680) 0x1a, CmdSN 0x35e from world 67393 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T20:35:14.279Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x43950097e300) 0x1a, CmdSN 0x7b84 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T20:35:38.984Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x4395008d7800) 0x1a, CmdSN 0x7b8f from world 0 to dev "naa.600605b006eb6730220acb13229f15cf" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T20:59:14.628Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395009d8500) 0x1a, CmdSN 0x7be4 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T21:19:14.991Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395008cde00) 0x1a, CmdSN 0x7c43 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T21:41:12.823Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395008e8c80) 0x1a, CmdSN 0x7ca2 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T22:05:39.279Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395009b6300) 0x1a, CmdSN 0x37e from world 67393 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T22:35:17.392Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x43950099f700) 0x1a, CmdSN 0x7d6c from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T22:35:39.355Z cpu3:66064)ScsiDeviceIO: 2954: Cmd(0x43950091f080) 0x1a, CmdSN 0x7d72 from world 0 to dev "naa.600605b006eb6730220acb13229f15cf" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T22:59:16.899Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x439500919900) 0x1a, CmdSN 0x7dcc from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T23:19:17.279Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395008dc500) 0x1a, CmdSN 0x7e2b from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-21T23:41:14.644Z cpu2:66064)ScsiDeviceIO: 2954: Cmd(0x43950098d800) 0x1a, CmdSN 0x7e8a from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-22T00:05:39.623Z cpu3:66064)ScsiDeviceIO: 2954: Cmd(0x4395008f5400) 0x4d, CmdSN 0x399 from world 67393 to dev "naa.600605b006eb6730220acb13229f15cf" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.

                                  2020-03-22T00:05:39.633Z cpu3:66064)ScsiDeviceIO: 2954: Cmd(0x439500933d00) 0x1a, CmdSN 0x39e from world 67393 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-22T00:35:19.837Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395009b1600) 0x1a, CmdSN 0x7f54 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-22T00:59:18.861Z cpu2:66064)ScsiDeviceIO: 2954: Cmd(0x4395009b7100) 0x1a, CmdSN 0x7fb4 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-22T01:19:19.224Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x4395009c4300) 0x1a, CmdSN 0x8013 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-22T01:35:39.898Z cpu1:66064)ScsiDeviceIO: 2954: Cmd(0x4395009ece00) 0x1a, CmdSN 0x8053 from world 0 to dev "naa.600605b006eb6730220acb13229f15cf" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-22T01:41:16.556Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x4395009f1e80) 0x1a, CmdSN 0x8072 from world 0 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.

                                  2020-03-22T02:05:39.995Z cpu0:66064)ScsiDeviceIO: 2954: Cmd(0x4395008b9500) 0x1a, CmdSN 0x3be from world 67393 to dev "naa.600605b006eb6730220acb13229ebeae" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 

                                   

                                   

                                   

                                  This might indicate a driver issue. Please update to the latest driver and report back.
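To make the ScsiDeviceIO lines easier to read, here is a minimal Python sketch that pulls out the command opcode, device status, and sense data from each line and tallies them per device. The lookup tables hold the standard SCSI meanings for only the codes seen in the excerpt; the function names and the regular expression are illustrative assumptions, not a VMware tool.

```python
import re
from collections import Counter

# Standard SCSI meanings, limited to the codes that appear in the excerpt.
OPCODES = {0x1A: "MODE SENSE(6)", 0x4D: "LOG SENSE"}
DEVICE_STATUS = {0x0: "GOOD", 0x2: "CHECK CONDITION"}
SENSE_KEYS = {0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {
    (0x20, 0x00): "INVALID COMMAND OPERATION CODE",
    (0x24, 0x00): "INVALID FIELD IN CDB",
}

# Matches: Cmd(<addr>) <opcode>, ... to dev "<id>" failed H:.. D:.. P:..
#          Valid sense data: <sense key> <ASC> <ASCQ>
LINE_RE = re.compile(
    r'Cmd\(0x[0-9a-f]+\) (0x[0-9a-f]+),.*?to dev "([^"]+)" failed '
    r"H:0x[0-9a-f]+ D:(0x[0-9a-f]+) P:0x[0-9a-f]+ "
    r"Valid sense data: (0x[0-9a-f]+) (0x[0-9a-f]+) (0x[0-9a-f]+)"
)


def decode(line):
    """Decode one ScsiDeviceIO failure line; return None if it does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    device = match.group(2)
    opcode, status, key, asc, ascq = (
        int(match.group(i), 16) for i in (1, 3, 4, 5, 6)
    )
    return {
        "device": device,
        "command": OPCODES.get(opcode, hex(opcode)),
        "status": DEVICE_STATUS.get(status, hex(status)),
        "sense_key": SENSE_KEYS.get(key, hex(key)),
        "detail": ASC_ASCQ.get((asc, ascq), f"{asc:#x}/{ascq:#x}"),
    }


def summarize(lines):
    """Count failures per (device, command, detail) across many log lines."""
    counts = Counter()
    for line in lines:
        decoded = decode(line)
        if decoded is not None:
            counts[(decoded["device"], decoded["command"], decoded["detail"])] += 1
    return counts
```

Read this way, the entries above are MODE SENSE(6) and LOG SENSE requests that the two local devices answered with CHECK CONDITION and an ILLEGAL REQUEST sense key. Whether that is related to the freezes is not something the log excerpt alone establishes.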

                                   

                                  Warm regards

                                  • 14. Re: ESX Host Freezes Randomly
                                    AndrewAdvnetsol Novice

                                    Thank you very much for finding that.  I have never updated drivers on VMware before.  Is this done on the host through the web GUI by going to Manage -> Packages -> Install Updates? It looks like it is my MegaRAID driver that needs to be updated.  The 2 devices listed are my Local LSI Disks, which list a model of MR9271-8i.  When I look at Storage -> Adapters I see one that is using the driver lsi_mr3.  I assume that is the driver I want to update.

                                     

                                    Correct me if I am wrong on any of this.

                                     

                                    Thank you again for all your help.
