VMware Cloud Community
kenner
Contributor

Weird Storage VMotion error

What does this mean? It seems to have left debris around in lots of places, including /vmfs/devices/deltadisks.

Performing Storage VMotion.

0% |----| 100%

##########################################################################################

Received an error from the server: A general system error occurred: DMotion: Failed to unstun VM after disk reparent. You will have to manually perform the relocation.

53 Replies
jstoltz_w
Contributor

I get the same error that kenner had. If it happens again, I'll try manually completing the migration from VirtualCenter by right-clicking on the machine.

Thanks

kitcolbert
VMware Employee

OK, a few other engineers and I are looking into why this is happening, but it would really help us to have a vm-support and a vc-support bundle. Could anyone who's hitting this grab both the next time it happens, and then either open an SR and give me the SR number or contact me about sending the *-support bundles directly to me?

Thanks a lot!

Kit

kenner
Contributor

I can reproduce it at will, so I just did. I ran vm-support and it made an 18MB file. I'm not sure what "vc-support" refers to. I also don't see a "Complete migration" or similar option in the VIC when I right-click the VM in question. Let me know what to run for the vc-support and where to put the vm-support file.

kitcolbert
VMware Employee

Thanks, kenner. "vc-support" refers to the VC server log bundle. If you have access to the VC server host, then under start->programs->vmware there should be a link to get the VC server log bundle. If you click that, it will grab all the settings, logs, and everything else. I'll PM you with instructions on where to upload it.

Thanks!

Kit

ROMCH
Contributor

Hi,

I saw exactly the same behavior described in this thread, and I also ended up with a powered-off VM. I checked the whole VC, but there was no option that allowed me to "Complete the Migration" or anything like that.

thanks,

ROM

VCP4 & VCP3 & CCNA

goheels
Contributor

I'm hitting this same error with the 3.5 GA Release. Let me know if you need any additional support materials or if I can assist with debugging or testing.

Regards, Greg

admin
Immortal

I am trying to reproduce the same problem. I have a Linux VM with I/O running inside it. I ran 10-20 svmotions but could not hit the issue. Can somebody tell me exactly what I need to do to reproduce the problem?

ROMCH
Contributor

I ran about 50 svmotions with small and large vmdks before I came across the issue. I have opened an SR with VMware support.

The issue likely occurs when the VM or the ESX host is under high CPU load!

I'll try to keep the thread up to date.

VCP4 & VCP3 & CCNA

kenner
Contributor

My experience is that it follows the VM. svmotion will always fail on one particular VM and never fail with this problem on any other VMs.

ROMCH
Contributor

Here is the result of the VMware SR I opened a few days ago:

Snapshot Operations Submitted Directly to an ESX Server Host During Storage VMotion Corrupt Virtual ...

VCP4 & VCP3 & CCNA

jaygriffin
Enthusiast

This seems to be a different issue from the one we have been discussing in this thread. We are talking about svmotions not completing cleanly.

ROMCH
Contributor

When backup software like VCB / EsxRanger / esXpress runs, it adds and removes snapshots on the VMs involved.

I ran an esXpress test backup on a VM while "Storage VMotion" was running on the same VM. This produced a "snap on snap" situation. When esXpress has finished, only the Dmotion... snaps are still there. When "Storage VMotion" is almost finished, the long error message comes up:

VMware ESX Server unrecoverable error: (vmx) DMotion: Failed to unstun VM after disk reparent. A log file is available in "vmware.log". A core file is available in ...../vmware-vmx-zdump.000". Please request support and include the contents of the log file and core file. To collect files to submit to VMware support, run "vm-support". We will respond on the basis of your support entitlement

VMware told me this is a bug: ESX does not block the creation of additional snapshots while svmotion is running. The same issue could possibly also affect a VCB backup procedure.

VCP4 & VCP3 & CCNA

kenner
Contributor

The way I read that KB entry, this will only happen if the snapshot is taken directly on the host, but not if it's taken via VC. esXpress, as I understand it, will take the snapshot via the host if it's not in a snap-on-snap, but will use VC if it is. I think VCB will always use VC so it can't happen with it.

But I'm not sure that the snapshot issue is the cause of this message. What I was told is that this message, or at least my occurrence of it, is caused by having a disk that was originally thin-provisioned. I think there are multiple different svmotion issues here.

jaygriffin
Enthusiast

I agree, kenner. Snapshots were not involved in my issue.

stu
Contributor

Agreed that there do seem to be many different scenarios occurring when this error comes up.

In my situation - A brand new Windows 2003 VM, no users hitting it yet, normal (not thin) disks, no backups running of any kind. Host not under load.

99% complete, then the error.

ian4563
Enthusiast

I agree. Mine did not involve snapshots either, so that KB doesn't apply. So far 3 out of 9 (33%!) svmotions have given me this error, which makes it an experimental feature in our environment. I would STRONGLY recommend that people not use it on critical VMs: if you get this error you will have downtime and probably data loss!

kitcolbert
VMware Employee

Hi all,

I apologize for the delay, but I'd like to give you all an update on this issue. As many of you have noticed, the error that you're presented with may actually be caused by a number of different things; there are at least two different root causes of it.

As mentioned, there is a KB covering the case where a snapshot is taken during Storage VMotion using a client or application connected directly to the ESX host (issue #1).

However, more of you are most likely hitting a second issue, which is caused by trying to Storage VMotion a VM that has thin-provisioned disks or that was deployed from a template VM with thin-provisioned disks. For everyone who is hitting this issue and has ruled out issue #1, can you please verify the following:

1. ssh to the ESX host

2. cd to the VM's directory

3. For each VM disk, "cat" the .vmdk file (NOT the -flat.vmdk file!). The .vmdk file is just a simple text file and it should only be a few hundred bytes in size or so. So if your VM's disk is named "foo.vmdk", then you'd simply type "cat foo.vmdk".

4. In the output, you'll see a lot of lines that start with "ddb.". If you see a line like the following:

ddb.thinProvisioned = "1"

then Storage VMotion'ing your VM will fail in the ways described here. Please do not attempt to Storage VMotion these VMs at this time!

Please report back whether or not the VMs that fail during Storage VMotion have the ddb.thinProvisioned flag set.
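
For anyone with a lot of disks to check, steps 3 and 4 above could be scripted. What follows is only a rough sketch, not an official VMware tool. It assumes Python 3 is available on a machine that can read the descriptor files (for example, copies pulled off the host), and the names is_thin_provisioned and find_thin_descriptors are made up for illustration. It looks only for the exact line shown above and skips the -flat.vmdk files.

```python
import re
import sys

# Matches a descriptor line such as: ddb.thinProvisioned = "1"
THIN_FLAG = re.compile(r'^\s*ddb\.thinProvisioned\s*=\s*"1"\s*$', re.MULTILINE)


def is_thin_provisioned(descriptor_text):
    """Return True if the descriptor text contains ddb.thinProvisioned = "1"."""
    return THIN_FLAG.search(descriptor_text) is not None


def find_thin_descriptors(paths):
    """Return the subset of paths whose descriptor has the thin flag set.

    Hypothetical helper: assumes each path is a small text .vmdk descriptor.
    """
    flagged = []
    for path in paths:
        # Skip the large binary extent files; only the text descriptor matters.
        if path.endswith("-flat.vmdk"):
            continue
        with open(path, "r", errors="replace") as handle:
            # Descriptors are only a few hundred bytes, so cap the read.
            text = handle.read(64 * 1024)
        if is_thin_provisioned(text):
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    # Descriptor paths are taken from the command line.
    for flagged_path in find_thin_descriptors(sys.argv[1:]):
        print(flagged_path + ": ddb.thinProvisioned is set")
```

If this were saved as check_thin.py (a hypothetical file name), it could be run as "python3 check_thin.py foo.vmdk", and any disk it prints should not be Storage VMotion'ed for now.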

We are still working on a workaround and patch. Thanks a lot for your help and patience.

Thanks,

Kit

stu
Contributor

Well Kit,

That nailed my situation. My VMs were deployed from a template, and sure enough, the 'C' drives on my Windows servers show ddb.thinProvisioned = "1".

I guess that means that none of my servers can be moved using svmotion until you guys come out with a patch.

I will be anxiously awaiting that.

Thanks !

goheels
Contributor

Ditto. All my VMs were deployed from a template and ALL have ddb.thinProvisioned = "1".

Let me know if I can assist with anything.

Greg

jaygriffin
Enthusiast

I had 2-3 fail and don't recall which ones they were now. But one I know failed does have thin provisioning.
