6 Replies Latest reply on Jan 5, 2018 12:27 PM by cypherx

    stateless error: lost connectivity backing the boot filesystem

    cypherx Hot Shot

      Hello,

       

      One of our servers is reporting this error: "Lost connectivity to the device mpx.vmhba:32:C0:T0:L0 backing the boot filesystem /vmfs/devices/disks/mpx.vmhba:32:C0:T0:L0. As a result, host configuration changes will not be saved to persistent storage."

       

      All VMs have been running without an issue. My concern is whether the host will boot if I reboot it.


      I went into the Dell R620 iDRAC, and under Removable Flash Media it shows that IDSDM SD1 and IDSDM SD2 are both Good and the redundancy status is Full. How is it possible that the cards are good but VMware is not seeing them?

       

      Do you think it's just a glitch?

        • 1. Re: stateless error: lost connectivity backing the boot filesystem
          MBreidenbach0 Hot Shot

          I've seen that happen a lot with HPE blade servers. There, an iLO firmware upgrade usually helps, since the SD card is somehow managed via iLO (the HPE equivalent of iDRAC).

           

          I don't have experience with that problem on Dell servers.

           

          So far I have run into these situations:

          SD card comes back after reboot

          SD card comes back after power off / power on

          SD card comes back after iLO reset / firmware update

          SD card has to be removed + reinserted (happened just a week ago)

          SD card needs to be replaced + ESXi needs to be reinstalled

           

          When the SD card comes back it contains old config data, so depending on what happened in the meantime, some configuration may have to be updated manually.

           

          Good luck !

          • 2. Re: stateless error: lost connectivity backing the boot filesystem
            dineshgoundar Enthusiast

            I have seen this error on HP rack mount servers as well. Updating the iLO/iDRAC firmware may resolve the issue. If you have redundancy in your cluster (which you should), evacuate all VMs from this host and reboot it. If it doesn't come up, perform a cold boot, or open the covers, remove the SD card controller and plug it back in. This is what HP support told us. Updating the firmware fixed one of our servers, but we had to re-seat the controller on another. Open a ticket with Dell. Good luck.
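
The evacuate-and-reboot step can be sketched in PowerCLI roughly as follows. This is only an illustration under assumptions: it presumes an existing Connect-VIServer session to vCenter, working vMotion between the two hosts, and the host names esx01.example.local and esx02.example.local are hypothetical placeholders.

```powershell
# Sketch only: assumes Connect-VIServer has already been run against vCenter.
# Both host names below are hypothetical placeholders.
$vmhost = Get-VMHost -Name 'esx01.example.local'   # host showing the boot device error
$target = Get-VMHost -Name 'esx02.example.local'   # host that will receive the VMs

# Live-migrate every powered-on VM off the affected host.
Get-VM -Location $vmhost |
    Where-Object PowerState -eq 'PoweredOn' |
    Move-VM -Destination $target

# Enter maintenance mode, then reboot without the confirmation prompt.
Set-VMHost -VMHost $vmhost -State Maintenance
Restart-VMHost -VMHost $vmhost -Confirm:$false
```

In a cluster with DRS in fully automated mode, entering maintenance mode may move the VMs on its own, in which case the explicit Move-VM step would not be needed.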

            • 3. Re: stateless error: lost connectivity backing the boot filesystem
              cypherx Hot Shot

              Thanks. I will have to have VMware support fix our vCenter Server first so I can vMotion machines off.

               

              We tried to update vCenter Server from 6.0 build 3634793 to 6.0 build 5326177 and we got

              "Installation of component VCSServiceManager failed with error code '1603'. Check the logs for more details." It rolled back and I had to restart the services manually; however, the VMware Profile-Driven Storage service will not start, and without that service vMotion will not work. I'm kind of stuck between a rock and a hard place.

               

              Also whenever they tell me they are going to call, nobody calls.

              • 4. Re: stateless error: lost connectivity backing the boot filesystem
                Expert
                vExpert

                Shut down all VMs and try the following.

                 

                1- Put the ESXi host into maintenance mode

                2- Download VMware vSphere PowerCLI

                3- Run it as administrator

                4- Connect to the ESXi host (Connect-VIServer -Server 0.0.0.0 -User user -Password pass), where 0.0.0.0 = IP of the ESXi host, user = login for ESXi and pass = password for this login.

                5- Back up the current running configuration (Get-VMHost 0.0.0.0 | Get-VMHostFirmware -BackupConfiguration -DestinationPath 'D:\temp\'), where 0.0.0.0 = IP of the ESXi host and 'D:\temp\' = destination folder for the backup (be sure this folder exists on the PC you are executing it from)

                6- Shut down your ESXi host
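
Steps 4 and 5 above can be combined into one short PowerCLI script. This is a sketch, not a tested procedure: 0.0.0.0 and D:\temp\ are the same placeholders used in the steps, the variable names are made up for illustration, and it assumes the session is connected directly to the ESXi host (not to vCenter), so that Get-VMHost returns only that host.

```powershell
# Sketch only: placeholders taken from the steps above, variable names are illustrative.
$esxiHost  = '0.0.0.0'     # IP of the ESXi host
$backupDir = 'D:\temp\'    # must already exist on the PC running PowerCLI

if (-not (Test-Path -LiteralPath $backupDir)) {
    throw "Backup folder not found: $backupDir"
}

# Prompt for the ESXi login instead of typing the password on the command line.
$cred = Get-Credential -Message "Login for $esxiHost"
$conn = Connect-VIServer -Server $esxiHost -Credential $cred

# Connected directly to the host, so this returns that single host.
$bundle = Get-VMHost -Server $conn |
    Get-VMHostFirmware -BackupConfiguration -DestinationPath $backupDir

# Show which host was backed up and where the bundle file was written.
$bundle | Format-List

Disconnect-VIServer -Server $conn -Confirm:$false
```

Using Get-Credential here is a design choice to keep the password out of the shell history; the -User and -Password form from step 4 works as well.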

                 

                If you end up doing a clean install, you can simply restore the backed-up configuration afterwards. To restore the backup, use the following steps:

                 

                1- Boot your ESXi host (make sure ESXi is installed)

                2- Connect to the ESXi host (Connect-VIServer -Server 0.0.0.0 -User user -Password pass), where 0.0.0.0 = IP of the ESXi host, user = login for ESXi and pass = password for this login.

                3- Restore the backed-up configuration (Get-VMHost 0.0.0.0 | Set-VMHostFirmware -Restore -Force -SourcePath 'D:\temp\configBundle-0.0.0.0.tgz'), where 0.0.0.0 = IP of the ESXi host and 'D:\temp\configBundle-0.0.0.0.tgz' = the backed-up configuration file (be sure this file exists on the PC you are executing it from)

                4- Wait a few seconds; if the host doesn't reboot automatically, reboot it manually

                5- The configuration should now be loaded again, and the error about the missing boot drive should be gone.
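
The restore steps above can be sketched as a single PowerCLI script. Treat it as an illustration under assumptions: 0.0.0.0 and the bundle path are the placeholders from the steps, the variable names are invented here, the session is assumed to be connected directly to the freshly installed ESXi host, and the maintenance mode check is included on the assumption that the restore requires it.

```powershell
# Sketch only: placeholders taken from the steps above, variable names are illustrative.
$esxiHost   = '0.0.0.0'                              # IP of the ESXi host
$bundlePath = 'D:\temp\configBundle-0.0.0.0.tgz'     # backup bundle created earlier

if (-not (Test-Path -LiteralPath $bundlePath)) {
    throw "Backup bundle not found: $bundlePath"
}

$cred   = Get-Credential -Message "Login for $esxiHost"
$conn   = Connect-VIServer -Server $esxiHost -Credential $cred
$vmhost = Get-VMHost -Server $conn   # direct connection, so this is the single host

# Put the host into maintenance mode first if it is not there already.
if ($vmhost.ConnectionState -ne 'Maintenance') {
    $vmhost = Set-VMHost -VMHost $vmhost -State Maintenance
}

# Upload and apply the saved configuration; the host is expected to reboot afterwards.
Set-VMHostFirmware -VMHost $vmhost -Restore -Force `
    -SourcePath $bundlePath -HostCredential $cred
```

The bundle is generally expected to be restored onto the same ESXi build it was taken from, so it is worth noting the build number before reinstalling.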

                 

                 

                I hope this helps someone else who has the same problem.

                • 5. Re: stateless error: lost connectivity backing the boot filesystem
                  cypherx Hot Shot

                  That's awesome, ranchuab. Thank you for those clear and concise steps to back up and restore the configuration!

                   

                  I was able to fix the vCenter service and get the latest 6.0 update installed, so I can vMotion again. Now I can plan to evacuate this host, back it up just in case and then reboot. If it's faulted I can try the clean install or swap the two internal Dell SD cards (they are in a redundant configuration). Maybe just the first slot is bad.

                  • 6. Re: stateless error: lost connectivity backing the boot filesystem
                    cypherx Hot Shot

                    Just an update.

                     

                    I backed up the config and made sure I had screenshots of particular network settings and the license key, so I would know which MAC address to use to get the host on the network in order to restore a config. I had the VMware ISO freshly burned and verified to disc. I migrated the VMs, put the host in maintenance mode and rebooted it. Guess what? It came back fine. It did think some VMs that I had migrated away were still on it, so I canceled the original reconnect operation, removed the host from vCenter and then re-added it. I just had to bind my dvSwitch uplinks, which was easy. Then I was able to update it to the latest patch level and reboot again, and everything is working without an issue.

                     

                    Not sure what caused it, but it was benign.