13 Replies Latest reply on Mar 16, 2009 7:38 AM by diederikm

    Storage vmotion fails, stuck with delta files

    Argyle Hot Shot

      I'm migrating VMs from an HP EVA 5000 SAN to an HP EVA 8000 SAN with storage vmotion using the Remote CLI tool. It works for the most part, but sometimes it fails midway and I'm stuck with:

       

      • vmdk files created on new LUNs (sometimes just one disk, sometimes all disks)

      • swap and .vmx files moved to new LUNs

      • delta files left on old LUNs

      • VM running but using delta files on old LUNs and swap on new LUNs

       

      I could use some help completing or restarting the process. I was thinking about creating a new snapshot and committing it to get rid of the storage vmotion delta files, but the option "Take snapshot" is grayed out in Virtual Center for the VMs that fail this way.

       

      This time the error showed up after about 10 minutes. The sys disk is 15 GB and the data disk is 20 GB, so no large files are involved. I had the same problem with a VM with a 100 GB data disk; it failed after 20-30 minutes. The server running Remote CLI is on the service console (SC) network.

       

      Writing down some system data and log info below:

       

      Command and error message:

      -


      C:\Program Files\VMware\VMware VI Remote CLI\bin>svmotion.pl

      --server=vc.mydomain.com --username=some_name --password=some_pass --datacenter="My Data Center"

      --vm="[Sys-Disk-306] SERVER01/SERVER01.vmx: Sys-Disk-327"

      --disks="[Sys-Disk-306] SERVER01/SERVER01.vmdk:Sys-Disk-327, SERVER01/SERVER01.vmdk:Data-Disk-328"

      --verbose

       

      Attempting to connect to service url.

      Connected to server.

      Resolving the input arguments.

      Performing Storage VMotion.

      Received an error from the server: An error occurred while communicating with the remote host.

       

      C:\Program Files\VMware\VMware VI Remote CLI\bin>

      -


      Interesting info in the VM's vmware.log:

       

      Jul 17 15:04:40.904: vcpu-0| HBACommon: First write on scsi0:0.fileName='/vmfs/volumes/47500341-572986f8-bf6e-001cc478a0d4/SERVER01//DMotion-scsi0:00_SERVER01.vmdk'

      Jul 17 15:04:40.918: vcpu-0| DISKLIB-CHAIN : UpdateContentID: old = 0xc4692175, new = 0x3b52f9e2

      Jul 17 15:12:40.901: vmx| vmdbPipe_Streams Couldn't read: OVL_STATUS_EOF

      Jul 17 15:12:40.955: vmx| SOCKET 1 client closed connection

      -


      Currently the old LUNs look like this:

      ll /vmfs/volumes/Sys-Disk-306/SERVER01/

      -rw-------    1 root     root     67141632 Jul 17 15:57 DMotion-scsi0:00_SERVER01-delta.vmdk

      -rw-------    1 root     root          326 Jul 17 15:04 DMotion-scsi0:00_SERVER01.vmdk

      -rw-------    1 root     root     16106127360 Jul 17 15:04 SERVER01-flat.vmdk

      -rw-------    1 root     root          342 Jun 14 13:15 SERVER01.vmdk

       

      ll /vmfs/volumes/Data-Disk-308/SERVER01/

      -rw-------    1 root     root     16820224 Jul 17 15:28 DMotion-scsi0:01_SERVER01-delta.vmdk

      -rw-------    1 root     root          326 Jul 17 15:28 DMotion-scsi0:01_SERVER01.vmdk

      -rw-------    1 root     root     21474836480 Jul 17 14:28 SERVER01-flat.vmdk

      -rw-------    1 root     root          342 Jun 14 13:21 SERVER01.vmdk

       

      Currently the new LUNs look like this:

      ll /vmfs/volumes/Sys-Disk-327/SERVER01/

      -rw-------    1 root     root     805306368 Jul 17 15:04 SERVER01-a0151293.vswp

      -rw-------    1 root     root     16106127360 Jul 17 15:12 SERVER01-flat.vmdk

      -rw-------    1 root     root         8684 Jul 17 15:04 SERVER01.nvram

      -rw-------    1 root     root          403 Jul 17 15:08 SERVER01.vmdk

      -rw-------    1 root     root            0 Jul 17 15:03 SERVER01.vmsd

      -rwxr-xr-x    1 root     root         2789 Jul 17 15:17 SERVER01.vmx

      -rw-------    1 root     root          265 Jul 17 15:17 SERVER01.vmxf

      -rw-r--r--    1 root     root        22802 Jul 17 15:03 vmware-31.log

      -rw-r--r--    1 root     root        28840 Jul 17 15:03 vmware-32.log

      -rw-r--r--    1 root     root        31444 Jul 17 15:03 vmware-33.log

      -rw-r--r--    1 root     root       112306 Jul 17 15:03 vmware-34.log

      -rw-r--r--    1 root     root        30702 Jul 17 15:03 vmware-35.log

      -rw-r--r--    1 root     root      1479819 Jul 17 15:03 vmware-36.log

      -rw-r--r--    1 root     root        48791 Jul 17 15:28 vmware.log

       

      ll /vmfs/volumes/Data-Disk-328/SERVER01/

      -rw-------    1 root     root            0 Jul 17 15:08 SERVER01.vmdk

      -


      The .vmx file shows that the VM is using the vmdk and delta files on the old LUNs and the swap file on the new LUN.
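To spot leftover storage vmotion files quickly, the file names from listings like the ones above can be filtered with a small script. This is only a minimal sketch: the `DMotion-scsiX:YY_<vm>[-delta].vmdk` naming pattern is inferred from the listings in this post, not from VMware documentation, and the function name is made up for illustration.

```python
import re

# Naming pattern inferred from the listings in this thread, e.g.
# "DMotion-scsi0:00_SERVER01-delta.vmdk" (assumption, not documented behavior).
DMOTION_RE = re.compile(r"^DMotion-(scsi\d+:\d+)_(.+?)(-delta)?\.vmdk$")


def find_dmotion_files(filenames):
    """Return (device, vm_name, kind) for every DMotion leftover in filenames."""
    found = []
    for name in filenames:
        match = DMOTION_RE.match(name)
        if match:
            device, vm_name, delta = match.groups()
            # A "-delta" suffix marks the data file; otherwise it is the small descriptor.
            found.append((device, vm_name, "delta" if delta else "descriptor"))
    return found


if __name__ == "__main__":
    # File names copied from the Sys-Disk-306 listing above.
    listing = [
        "DMotion-scsi0:00_SERVER01-delta.vmdk",
        "DMotion-scsi0:00_SERVER01.vmdk",
        "SERVER01-flat.vmdk",
        "SERVER01.vmdk",
    ]
    for entry in find_dmotion_files(listing):
        print(entry)
```

Feeding it the names from each datastore folder shows at a glance which LUNs still hold DMotion files.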

       

      Has anyone experienced the same thing, and is there a safe way to complete or roll back the process?

        • 1. Re: Storage vmotion fails, stuck with delta files
          dmaster Expert
          vExpert, VMware Employees

          Hi Argyle,

           

          I found the following link for you:

           

          http://forums.virtualizationadmin.com/SVMotion_Plugin/m_21/tm.htm

           

          See the requirements for using SVMotion.

          • 2. Re: Storage vmotion fails, stuck with delta files
            JWVMCS Novice

             

            Hi Argyle,

             

             

            I've had the same issue with one of my failures. The config/swap files move, but SVMotion fails to move the vmdks. As I said, I only have one failure that has the DMotion-scsi* files on the source datastore (the other failures don't have the extra vmdks). To recover, I intend to power off the VMs, cold migrate the disk files and reconfigure the VMX. I can't do this until I get some downtime, so if anyone knows of a way to recover in a live state (I think not), please let us know.

             

             

            JW

             

             

            • 3. Re: Storage vmotion fails, stuck with delta files
              Argyle Hot Shot

               

               

               

              Thanks for the link. We fulfill all the requirements, though. The main problem now is how to complete or roll back these storage vmotions that got stuck halfway with delta files.

               

               

              -Virtual machines with snapshots cannot be migrated using Storage VMotion.

              There are no snapshots on the machines.

               

               

              -Virtual machine disks must be in persistent mode or be raw device maps.

              The disks are in persistent mode here.

               

               

              -The host on which the virtual machine is running must have sufficient resources to

              support two instances of the virtual machine running concurrently for a brief time.

              There are enough resources.

               

               

              -The host on which the virtual machine is running must have a VMotion license,

              and be correctly configured for VMotion.

              It has a license and is configured for VMotion.

               

               

              -The host on which the virtual machine is running must have access to both the

              source and target datastores.

              It has access to all datastores

               

               

              -VMware Infrastructure 3 supports a maximum of four simultaneous VMotion or

              Storage VMotion accesses to a single datastore.

              We only do one storage vmotion at a time.

               

               

               

               

              • 4. Re: Storage vmotion fails, stuck with delta files
                dmaster Expert
                VMware Employees, vExpert

                Maybe a cold migration is an option for you? Migrate the machines back to their original datastore and commit or remove the snapshot file.

                 

                P.S. If you think answers on the forum are useful or correct, please award them with points.

                • 5. Re: Storage vmotion fails, stuck with delta files
                  BigHug Hot Shot

                  I suggest you call Support. It happened to me once. Support will route the case to the storage group, and they are pretty good. Basically, they find the right disk chain and use vmkfstools to put the delta back into the new vmdk. It's not difficult, but I would not do it myself.

                  • 6. Re: Storage vmotion fails, stuck with delta files
                    Argyle Hot Shot

                     

                    dmaster: I was looking into that, but the VM is in a midway state somehow. A lot of options are greyed out, so I can't migrate it back. Also, ESX thinks that no snapshots exist; the vmsd file is blank. It looks like the dmotion delta files are a bit different from normal snapshot deltas for some reason.

                     

                     

                    BigHug: Yes, a case has been opened. I was hoping someone on the forum had run into the same thing and had a nice solution.

                     

                     

                    A side note: the main cause seems to be resource starvation of the service console. The problem occurs on one specific ESX host that had the console CPU pegged due to a badly behaving VM. This impacts the storage vmotion process, so it terminates.

                     

                     

                     

                     

                     

                    • 7. Re: Storage vmotion fails, stuck with delta files
                      Argyle Hot Shot

                      After a lot of testing I found a solution that:

                      - commits the delta data to the original vmdk files

                      - resets DMotion state

                      - keeps the VM online

                       

                      =======================

                      DISCLAIMER:

                      I take no responsibility for the result in your specific environment. The following worked for me.

                      =======================

                       

                      Description:

                      -


                      Create a snapshot of the VM and then remove the snapshots. This will commit the dmotion deltas too. Note that you can't remove/commit the dmotion delta files directly, since they don't count as a normal snapshot: running vmware-cmd with the hassnapshot parameter doesn't return a value of "1" for a VM in this state.

                       

                      After that, edit the .vmx settings to remove the entries for the DMotionParent parameters, or the migration will restart at the next reboot. You will have no options like "Edit settings" etc. in Virtual Center unless you do this.

                       

                      Having data left in the DMotionParent parameters will result in only having one option available in Virtual Center GUI at next reboot called "Complete migration".

                       

                      You do not want to use this option, though, since it completes the migration on the source LUNs, ignoring your previous destination LUNs. You need double the space on the source LUN if you still want to perform it. In my case I had a 100 GB VM disk on a 150 GB LUN, so it would fail.
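The space argument above is simple arithmetic and can be checked before touching "Complete migration". The following is a rough sketch under stated assumptions: it only models "a second full copy of the disk must fit on the same LUN" and ignores VMFS overhead and any other files on the datastore unless passed in explicitly. The function and parameter names are made up for illustration.

```python
GB = 1024 ** 3


def offline_completion_fits(disk_bytes, lun_capacity_bytes, other_used_bytes=0):
    """Return True if a second full copy of the disk fits next to the original.

    Simplified model (assumption): the original disk plus one complete copy
    must fit on the LUN at the same time; filesystem overhead is ignored.
    """
    needed = 2 * disk_bytes + other_used_bytes
    return needed <= lun_capacity_bytes


if __name__ == "__main__":
    # The case from this post: 100 GB disk on a 150 GB LUN.
    print(offline_completion_fits(100 * GB, 150 * GB))
```

With the numbers from this post, 2 x 100 GB exceeds the 150 GB LUN, which matches the failure described.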

                       

                      Once you remove the value in the DMotionParent parameters via vmware-cmd in ESX, Virtual Center will display all normal options again. Note that editing the .vmx file directly will not trigger a reload of the .vmx config.
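To see which disks still carry a DMotionParent value, the text of the .vmx file can be scanned read-only. A minimal sketch, assuming the usual `key = "value"` layout of .vmx lines; the sample content and path in it are invented for illustration, and the script only reports keys, it does not edit the file.

```python
import re

# Matches lines such as: scsi0:0.DMotionParent = "some value"
# (assumes the common key = "value" layout of .vmx files).
DMOTION_PARENT_RE = re.compile(
    r'^\s*(\S+\.DMotionParent)\s*=\s*"(.*)"\s*$', re.IGNORECASE
)


def dmotion_parent_keys(vmx_text):
    """Return the DMotionParent keys in vmx_text that still hold a non-empty value."""
    keys = []
    for line in vmx_text.splitlines():
        match = DMOTION_PARENT_RE.match(line)
        if match and match.group(2):
            keys.append(match.group(1))
    return keys


if __name__ == "__main__":
    # Hypothetical .vmx excerpt; the values are placeholders, not real output.
    sample = (
        'scsi0:0.fileName = "example.vmdk"\n'
        'scsi0:0.DMotionParent = "/vmfs/volumes/example/parent.vmdk"\n'
        'scsi0:1.DMotionParent = ""\n'
    )
    print(dmotion_parent_keys(sample))
```

Any key it reports is a candidate for the vmware-cmd setconfig call rather than a manual edit, for the reload reason given above.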

                       

                      Step by step:

                      -


                      - You have a VM with two disks on LUN1 and LUN2 that you want to migrate to LUN3 and LUN4

                       

                      - Storage vmotion fails midway for reason X

                       

                      - We have a running VM with .vmx and swap on LUN3 and vmdk and dmotion delta files on LUN1 and LUN2. VM is running on the delta files.

                       

                      - Log in to the ESX host that runs the VM to create a snapshot of the VM (this is not available via the Virtual Center GUI in this state). Make sure there is room on LUN3, which holds the vmx files.

                       

                      - Find the UUID path to your VM with:

                      vmware-cmd -l

                       

                      - Create a snapshot (of all disks) with:

                      vmware-cmd /vmfs/volumes/487...d4/MYSERVER/MYSERVER.vmx createsnapshot snapname snapdescription 1 1

                       

                      You get files like this on LUN3:

                      DMotion-scsi0:00_MYSERVER-000001.vmdk

                      DMotion-scsi0:00_MYSERVER-000001-delta.vmdk

                      DMotion-scsi0:01_MYSERVER-000001.vmdk

                      DMotion-scsi0:01_MYSERVER-000001-delta.vmdk

                       

                      - Remove (Commit) the snapshots:

                      vmware-cmd /vmfs/volumes/487...d4/MYSERVER/MYSERVER.vmx removesnapshots

                       

                      The above commits all delta files, including the dmotion files.

                       

                      - We have a server running on the original disks with all data intact.

                       

                      Virtual Center still thinks the VM is in dmotion state, so you can't edit settings, perform vmotion or do anything else via Virtual Center.

                       

                      To fix this, we need to clear the DMotionParent parameters in the .vmx file with the following commands from ESX:

                      vmware-cmd /vmfs/volumes/487...d4/MYSERVER/MYSERVER.vmx setconfig scsi0:0.DMotionParent ""

                      vmware-cmd /vmfs/volumes/487...d4/MYSERVER/MYSERVER.vmx setconfig scsi0:1.DMotionParent ""

                       

                      If we do not do this, the only option after the VM is shut down will be "Complete Migration" in Virtual Center. If you select this option it will rerun storage vmotion (offline), but it will use the source LUNs as the destination. Not good if we don't have space on those LUNs.

                       

                      - We still have vmx and swap on LUN3 and vmdk files on LUN1 and LUN2.

                       

                      - Perform a new storage migration to move back the vmx files.

                       

                      Example of just moving vmx file

                      -


                      C:\Program Files\VMware\VMware VI Remote CLI\bin>svmotion.pl

                      --server=vc.mydomain.com --username=some_name --password=some_pass

                      --datacenter="My Data Center"

                      --vm="[Sys-Disk-002] MYSERVER/MYSERVER.vmx:Sys-Disk-001"

                      --disks="[Sys-Disk-001] MYSERVER/MYSERVER.vmdk:Sys-Disk-001, MYSERVER/MYSERVER.vmdk:Data-Disk-001"

                      --verbose

                       

                      Attempting to connect to service url.

                      Connected to server.

                      Resolving the input arguments.

                      Performing Storage VMotion.

                      Storage VMotion completed successfully.

                       

                      Disconnecting.

                      -


                       

                      - Clean up the previous destination LUN3 and LUN4 by removing any vmdk files and folders that were created.

                       

                      - Done. We are back where we started with no downtime.

                       

                      - We can now try storage vmotion again.
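The ESX-side part of the steps above can be written down as an ordered command list before anything is run. This is only a sketch: it builds the argument lists and prints them, it does not execute anything, and the helper name and example path are made up. The vmware-cmd operations and the "1 1" snapshot arguments are copied from the commands in this post; the DMotionParent keys must come from your own .vmx file.

```python
def recovery_commands(vmx_path, dmotion_keys,
                      snap_name="snapname", snap_desc="snapdescription"):
    """Build the vmware-cmd calls in the order used in this post.

    1. create a snapshot, 2. remove (commit) all snapshots,
    3. clear each DMotionParent key so Virtual Center shows its normal options.
    Nothing is executed here; the caller decides what to run.
    """
    commands = [
        ["vmware-cmd", vmx_path, "createsnapshot", snap_name, snap_desc, "1", "1"],
        ["vmware-cmd", vmx_path, "removesnapshots"],
    ]
    for key in dmotion_keys:
        # An empty string clears the value, as in the setconfig lines above.
        commands.append(["vmware-cmd", vmx_path, "setconfig", key, ""])
    return commands


if __name__ == "__main__":
    # Placeholder path; take the real one from "vmware-cmd -l" on the host.
    plan = recovery_commands(
        "/vmfs/volumes/EXAMPLE/MYSERVER/MYSERVER.vmx",
        ["scsi0:0.DMotionParent", "scsi0:1.DMotionParent"],
    )
    for argv in plan:
        print(argv)
```

Reviewing the printed list first makes it easier to confirm that the snapshot is committed before the DMotionParent values are cleared.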

                      • 8. Re: Storage vmotion fails, stuck with delta files
                        JWVMCS Novice

                        Great detail, Magnus! Most useful!

                        • 9. Re: Storage vmotion fails, stuck with delta files
                          hennish Hot Shot
                          vExpert

                          Worked for me too. Thanks a lot! I was kinda nervous about that hung svmotion.

                          • 10. Re: Storage vmotion fails, stuck with delta files
                            ebowser Novice

                             

                            Hey Argyle - great post and excellent info, thanks much.

                             

                             

                            I just had this happen to a VM this morning.  I got the delta files merged back in, and now it's running as split - vmx & vmswap on one datastore, vmdk on the other.  I have modified the DMotionParent line in the vmx, and reloaded the config but all options in VI Client are still greyed out.  I even tried restarting management services on the ESX host but still no luck.

                             

                             

                            Any other ideas?

                             

                             

                            Thanks again!

                             

                             

                            • 11. Re: Storage vmotion fails, stuck with delta files
                              Argyle Hot Shot

                               

                              Did you modify the DMotionParent info via the command line tool vmware-cmd? If you do it manually with an editor, it won't trigger a reload of the config in Virtual Center.

                               

                               

                              Example:

                              vmware-cmd /vmfs/volumes/487...d4/MYSERVER/MYSERVER.vmx setconfig scsi0:0.DMotionParent ""

                              vmware-cmd /vmfs/volumes/487...d4/MYSERVER/MYSERVER.vmx setconfig scsi0:1.DMotionParent ""

                               

                               

                               

                               

                               

                              • 12. Re: Storage vmotion fails, stuck with delta files
                                ebowser Novice

                                 

                                Hiya Argyle.

                                 

                                 

                                 

                                 

                                 

                                We're running ESXi, not ESX, so I don't have the vmware-cmd command.  I did run "vim-cmd vmsvc/reload <vmid>" after making the edits, to no avail.  I then actually moved my "/etc/vmware/hostd/vmInventory.xml", restarted management, replaced the file and restarted management, still to no avail.

                                 

                                 

                                 

                                 

                                 

                                Thanks,

                                 

                                 

                                Eric

                                 

                                 

                                • 13. Re: Storage vmotion fails, stuck with delta files
                                  diederikm Novice

                                   

                                  Thanks very much for your detailed explanation. It helped us a lot!