21 Replies Latest reply on Nov 15, 2017 10:13 AM by davidcrowder

    Consolidation failure

    davidcrowder Novice

      We have been trialing Dell Rapid Recovery for ESXi backups, and we occasionally experience consolidation failures.  Any advice on how to track down the cause, so that we can fix it?

       

      This server is running ESXi 5.5 build 2068190.

       

      Here is some info from hostd.log:

      2016-07-21T09:00:16.847Z [52E80B70 info 'Vimsvc.TaskManager' opID=hostd-854b user=root] Task Created : haTask-4-vim.VirtualMachine.consolidateDisks-344057184

      2016-07-21T09:00:16.848Z [51080B70 info 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx' opID=hostd-854b user=root] State Transition (VM_STATE_ON -> VM_STATE_CONSOLIDATE_ALL_DISKS)

      ...

      (Lots of verbose messages that do not appear to have anything to do with consolidation)

      ...

      2016-07-21T09:00:18.740Z [4F5C1B70 verbose 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx'] Consolidate Disks translated error to vim.fault.FileLocked

      2016-07-21T09:00:18.740Z [4F5C1B70 info 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx'] Consolidate Disks failed: vim.fault.FileLocked

      2016-07-21T09:00:18.740Z [4F5C1B70 verbose 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx'] Consolidate Disks message: An error occurred while consolidating disks: Failed to lock the file.

      -->

      2016-07-21T09:00:18.740Z [4F181B70 info 'Vimsvc.ha-eventmgr'] Event 7495 : Virtual machine domain1 disks consolidation failed on vsphere1 in cluster vsphere1 in ha-datacenter.

      2016-07-21T09:00:18.742Z [4F181B70 verbose 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx'] Time to gather Snapshot information ( read from disk,  build tree): 1 msecs. needConsolidate is true.

      2016-07-21T09:00:18.742Z [4F181B70 verbose 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx'] Snapshot property update: Configure will be invalidated for:

      2016-07-21T09:00:18.758Z [4F181B70 verbose 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx'] Time to gather config: 15 (msecs)
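
The failure lines in an excerpt like the one above can be pulled out mechanically, which may help when correlating failures with backup windows over several days of hostd.log. What follows is a minimal Python sketch, assuming the log lines keep the format shown in this post. The function name `find_consolidation_failures` and the regular expression are illustrative only and are not part of any VMware tool.

```python
import re

# Matches lines such as:
#   <timestamp> [<thread> info 'Vmsvc.vm:<vmx path>'] Consolidate Disks failed: <fault>
# The pattern is an assumption based only on the excerpt quoted in this post.
FAILURE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T[\d:.]+Z) "
    r"\[\w+ \w+ 'Vmsvc\.vm:(?P<vmx>[^']+)'[^\]]*\] "
    r"Consolidate Disks failed: (?P<fault>\S+)"
)


def find_consolidation_failures(lines):
    """Return (timestamp, vmx_path, fault) for every failed consolidation."""
    failures = []
    for line in lines:
        match = FAILURE.search(line)
        if match:
            failures.append(
                (match.group("ts"), match.group("vmx"), match.group("fault"))
            )
    return failures


# Three lines copied from the hostd.log excerpt above.
SAMPLE = [
    "2016-07-21T09:00:18.740Z [4F5C1B70 verbose 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx'] Consolidate Disks translated error to vim.fault.FileLocked",
    "2016-07-21T09:00:18.740Z [4F5C1B70 info 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx'] Consolidate Disks failed: vim.fault.FileLocked",
    "2016-07-21T09:00:18.758Z [4F181B70 verbose 'Vmsvc.vm:/vmfs/volumes/548c3c98-2367a4b4-9aa2-0025908c25f8/domain1/domain1.vmx'] Time to gather config: 15 (msecs)",
]

failures = find_consolidation_failures(SAMPLE)
for ts, vmx, fault in failures:
    print(ts, vmx, fault)
```

To scan a real log, the same function can be given an open file object instead of `SAMPLE`, since it only iterates over lines.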

       

      This is the third time it has done this, each time with a different guest / vmdk.  I'm at a loss on how to proceed.  The only fix that has worked on the prior occasions was to reboot the ESXi host and do a manual consolidation.

       

      The vSphere Client and command-line tools all give similar errors when attempting to consolidate without a reboot, stating that they are unable to access the file because it is locked.  Even restarting the hostd daemon is not sufficient to allow consolidation to proceed -- nothing but a full host reboot works.

       

      Any ideas how to proceed?

       

      Thanks in advance

        • 2. Re: Consolidation failure
          davidcrowder Novice

          firestartah:

           

          Thank you for the post.

           

          I have looked into that article.  Unfortunately, it largely does not apply.  We are not a large datacenter; our ESXi servers are stand-alone.  So the first three-quarters of that article, which focuses on determining which vSphere server has the vmdk locked, does not apply.

           

          After determining which server it is, the advice basically boils down to "restart the host".  I already know to do that...

           

          My goal is to discover why this is happening and how to fix it so that I can trust Rapid Recovery & ESXi to always successfully consolidate after backups.

           

          Thanks

          • 3. Re: Consolidation failure
            Graham Expert
            User Moderators, vExpert

            Try restarting all management agents instead of rebooting.

             

            To restart all management agents on the host, run the command:
            services.sh restart

            Restarting the Management agents on an ESXi (1003490) | VMware KB

             

            Typical troubleshooting steps I try when this happens:

             

            • Try to vMotion the VM to another host
            • If the backup software uses hot-add (vRanger etc.), look at the settings of the VM doing the backups and see whether the VM with the error has one of its disks attached to the backup VM. Detach it if required and try to consolidate again.
            • Try to create a new snapshot and then delete all snapshots
            • You can try to restart your backup server in case this has somehow locked the VMDKs
            • 4. Re: Consolidation failure
              davidcrowder Novice

              grba:

              Thank you for the reply.  services.sh restart is a better method than a full host reboot.

               

              Unfortunately, this is for a small shop.  The license level is Essentials.  There are not enough servers to have the spare capacity to do vMotion, even if the license level supported it;  these guests are stuck where they are.

               

              I have tried creating other snapshots, and then using Delete All.  It fails without restarting the host (or all the management services, at least).

               

              I have verified that it is not the backup system locking the files.  It's something in ESXi, itself... although I haven't a clue how to track that one down.

               

               

              So, while knowing I don't have to restart the host every time is a positive thing, it still leaves us in the situation where simply using our backup software can leave us in a state where our guests crash as they run out of disk space.  Not good.

              The only real, permanent solution is to find out why ESXi is failing to clean up snapshots when told -- why they're in a locked state -- and fix that, so that we can move forward.

               

              I'm considering updating the host to the latest build of 5.5... but I hate running host updates on otherwise perfectly functional systems without clearly knowing it's the necessary fix.

               

              I appreciate your help.

               

              Thank you

              • 5. Re: Consolidation failure
                VMBoy79 Novice

                Hi David,

                 

                Please also consider the size of the disk when using VADP for backups. If a disk is larger than 1 TB and the backup runs over the LAN, consolidation sometimes fails because of a timeout.

                 

                To clear the locked files, we restart the management agents on the host. Please let us know if upgrading the build fixes the issue.

                • 6. Re: Consolidation failure
                  Graham Expert
                  User Moderators, vExpert

                  Did you see this option:

                   

                  • If the backup software uses hot-add (vRanger etc.), look at the settings of the VM doing the backups and see whether the VM with the error has one of its disks attached to the backup VM. Detach it if required and try to consolidate again.

                  Hot-add can cause the vmdk to get locked and you will not be able to consolidate if another VM has the disk attached to it. Although it does not explain why a reboot resolved the problem.


                  Try the above and let us know how you get on.

                  • 7. Re: Consolidation failure
                    davidcrowder Novice

                    VMBoy79:

                     

                    Thank you for your reply.

                    "Please also consider the size of the disk when using VADP for backups. If a disk is larger than 1 TB and the backup runs over the LAN, consolidation sometimes fails because of a timeout."

                     

                    Dell Rapid Recovery is a VADP solution, utilizing CBT.  Some of the vmdks are larger than 1 TB, while others are significantly smaller.  Consolidation fails, seemingly at random, on both.  Size does not appear to be the issue.

                     

                    Decreasing the number of simultaneous backups being run on a single ESXi host seems to lower the likelihood of a consolidation failure.  However, it has not eliminated the problem outright; even running just a single backup during off-peak hours can sometimes result in a "stuck" snapshot.

                    • 8. Re: Consolidation failure
                      davidcrowder Novice

                      grba:

                       

                      Thank you for your reply.

                      "Hot-add can cause the vmdk to get locked and you will not be able to consolidate if another VM has the disk attached to it. Although it does not explain why a reboot resolved the problem."

                       

                      Dell Rapid Recovery is a VADP backup solution.  It does not use hot-add.

                      • 9. Re: Consolidation failure
                        davidcrowder Novice

                        I plan on using the ISO to upgrade to the latest version of 5.5 this weekend.

                         

                        Before doing so, I'd like to ask:  Has anyone had any trouble with this?  Especially going from an early version of 5.5 all the way to Update 3?

                         

                        Thanks

                        • 10. Re: Consolidation failure
                          VMBoy79 Novice

                          David,

                          We have ESXi 5.5 Update 3 in our environment, and we still see disk consolidation issues sometimes. However, the lock-file issue occurs only about once a week. We fix it by restarting the management agents.

                          • 11. Re: Consolidation failure
                            davidcrowder Novice

                            VMBoy79,

                             

                            That is unfortunate.  Because of the amount of data some of these VMs write, and because most of them use thick provisioning, we could easily find ourselves running out of space on our datastores;  this is one bug we cannot leave unfixed.
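
Until a root cause is found, one way to notice a stuck snapshot before a datastore fills up is to watch how large the snapshot delta files in each VM directory have grown. The Python sketch below assumes that delta files have names ending in `-delta.vmdk` (this should be verified on the datastore in question, since other snapshot formats use different suffixes) and that the script runs somewhere that can read the VM directory. The function name `delta_bytes` and the demo file names are hypothetical.

```python
import os
import tempfile


def delta_bytes(vm_dir, suffix="-delta.vmdk"):
    """Sum the sizes of files directly inside vm_dir whose names end in suffix.

    The default suffix is an assumption about snapshot delta file naming.
    """
    total = 0
    for entry in os.scandir(vm_dir):
        if entry.is_file() and entry.name.endswith(suffix):
            total += entry.stat().st_size
    return total


# Demo against a throwaway directory with made-up file names and sizes.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "domain1-flat.vmdk"), "wb") as f:
        f.write(b"\0" * 10)      # base disk, must not be counted
    with open(os.path.join(demo_dir, "domain1-000001-delta.vmdk"), "wb") as f:
        f.write(b"\0" * 4096)    # snapshot delta, must be counted
    demo_total = delta_bytes(demo_dir)

print(demo_total)
```

Run on a schedule, a check like this could raise an alert when the total for a VM keeps growing after the backup window has ended, which is the symptom of a consolidation that did not complete.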

                             

                            If anyone has any ideas for a permanent fix, something where I'm not reacting to the problem, but a solution that will actually stop this consolidation issue from happening, I would appreciate it very very much.

                             

                            Thanks

                            • 12. Re: Consolidation failure
                              VMBoy79 Novice

                              We have since changed our backup strategy to work around this issue. We shortlisted the VMs that have a hard disk larger than 1 TB; for those VMs, we run the VADP backup for the OS drive only and a file-level backup for all the other drives.
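
The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the first disk of each VM is taken to be the OS drive, the 1 TB threshold is treated as 1 TiB, and the inventory (VM `web1`, all disk labels and sizes) is made up for the example. `plan_backups` is a hypothetical helper, not a feature of any backup product.

```python
TIB = 1024 ** 4
GIB = 1024 ** 3


def plan_backups(vms, threshold=TIB):
    """Split each VM's disks into image-level (VADP) and file-level backups.

    vms maps a VM name to a list of (disk_label, size_in_bytes) tuples,
    with the OS disk assumed to come first.
    """
    plan = {}
    for vm_name, disks in vms.items():
        labels = [label for label, _ in disks]
        if any(size > threshold for _, size in disks):
            # Shortlisted VM: image-level backup for the OS disk only.
            plan[vm_name] = {"image_level": labels[:1], "file_level": labels[1:]}
        else:
            # No oversized disk: keep image-level backup for every disk.
            plan[vm_name] = {"image_level": labels, "file_level": []}
    return plan


# Hypothetical inventory for demonstration purposes.
inventory = {
    "domain1": [("os", 80 * GIB), ("data", 2 * TIB)],
    "web1": [("os", 60 * GIB), ("logs", 200 * GIB)],
}

plan = plan_backups(inventory)
for vm_name, split in plan.items():
    print(vm_name, split)
```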

                               

                              This drastically reduced the number of consolidation issues.

                              • 13. Re: Consolidation failure
                                PhoenixStores Lurker

                                I had this exact same problem today and came across this post.  However, I was able to resolve the issue by performing a storage vMotion.  If you have more than one datastore with the space to hold the VM, you can storage vMotion the VM to a different datastore, which consolidates its disks as part of the move.  You can then storage vMotion the VM back to its original location.

                                • 14. Re: Consolidation failure
                                  CHTIOUI Lurker

                                  We had the same problem. We use NetVault as our backup tool, installed on a physical server. All attempts to consolidate the disks of a VM failed until we restarted the backup server, after which the consolidation completed successfully.
