VMware Cloud Community
vinhdat82
Contributor

Duplicated VM when trying to cancel Storage vMotion

Hi all,

I'm using vSphere 5.1.

When doing a Storage vMotion of a powered-on VM between two local disks, I decided to cancel the task (at around 33%).

I waited 6 hours, but the task didn't cancel.

So I decided to restart vCenter Services.

Then a duplicated VM appeared in "Discovered VM..."

I couldn't rename it, remove it from inventory, or delete it from disk; vCenter says it's not in an allowed state.

The source VM is still powered on.

Any help is appreciated.

RHCE, VCI
0 Kudos
7 Replies
a_p_
Leadership

When you cancel a Storage vMotion, how long it takes to revert to the "old" state depends on the current progress. vCenter's responsibility is only to initiate and monitor the Storage vMotion job. By restarting the vCenter Server services, you didn't actually cancel the job, which runs on the ESXi host — you only broke the management connection to it. I guess you now have a running or stuck job on the ESXi host. What you can do is check which files exist in the source and destination folders, to estimate what may need to be re-assembled if you now kill the job on the host.
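A quick way to do that check is from an SSH session on the host: list the VM's folder on both datastores and see whether any tasks are still running on the host itself. The datastore and VM names below are placeholders, not taken from this thread.

```shell
# List the VM's files on the source and the destination datastore
# ("datastore1", "datastore2", and "MyVM" are placeholder names).
ls -lh /vmfs/volumes/datastore1/MyVM/
ls -lh /vmfs/volumes/datastore2/MyVM/

# Show tasks the host itself is tracking, independent of vCenter.
vim-cmd vimsvc/task_list
```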

André

0 Kudos
vinhdat82
Contributor

It doesn't make sense that it takes more than 6 hours to cancel a Storage vMotion that was at 33%.

So what should I do?

Thanks so much for any help.

RHCE, VCI
0 Kudos
peetz
Leadership

Some things to try:

-Log on to the host (on which the Storage vMotion was running) directly using the vSphere client. Can you unregister the VM then?

-Restart the management services on this host, and try again.
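A sketch of how the management agents can be restarted from an SSH session on the host — restarting only hostd and vpxa is gentler than restarting everything, since `services.sh restart` briefly disconnects the host from vCenter:

```shell
# Restart the two main management agents on the ESXi host.
/etc/init.d/hostd restart
/etc/init.d/vpxa restart

# Or restart all management agents at once (more disruptive):
# services.sh restart
```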

- Andreas

Twitter: @VFrontDe, @ESXiPatches | https://esxi-patches.v-front.de | https://vibsdepot.v-front.de
vinhdat82
Contributor

When connecting with the vSphere Client directly to the host, all the options for that VM are disabled.

Restarting the management network is risky; I'd need to stay beside the host.

RHCE, VCI
0 Kudos
vinhdat82
Contributor

I'm able to delete all files except the .vswp and .lck files.

RHCE, VCI
0 Kudos
vinhdat82
Contributor

After I restarted hostd with /etc/init.d/hostd restart, I could remove the VM from the inventory.

That's enough.

I'm still unable to delete it from disk. The .vswp and .lck files are still locked.

But never mind, I will wait for the next host reboot.
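For reference, a sketch of how to find out which host still holds the lock on those files, using vmkfstools from an SSH session (the datastore path and file name are placeholders). The lock owner's MAC address appears in the output, which tells you which host to reboot or restart hostd on.

```shell
# Dump VMFS metadata for the locked swap file; the "owner" field shows
# the MAC address of the host holding the lock.
# (The datastore path and file name below are placeholders.)
vmkfstools -D /vmfs/volumes/datastore1/MyVM/MyVM.vswp
```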

RHCE, VCI
0 Kudos
Tommygee
Contributor

Hello,

I had a similar issue before. We had SRM in the mix, and because you cannot Storage vMotion a VM with replication enabled, we ended up with a split-brain scenario after trying to move it to a new datastore. This sounds similar in nature to what you are seeing, if I understand correctly.

You need to look at the location of the .vmx file, because it is likely spanned (in two different places) across the two datastores. We found a zombie VM (similar to yours) on the same host, showing no CPU or memory utilization. The only fix was to reboot the host, remove the VMs from inventory, and then either move the .vmx file back to the correct location (the same datastore directory) or point the .vmx file on the destination datastore back to the source .vmdk directory. Then add that .vmx back into the inventory.
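A sketch of the unregister/re-register part of that procedure from the host's shell, assuming `vim-cmd` on the ESXi host (the Vmid and paths below are placeholders you'd substitute from your own output):

```shell
# List all VMs registered on this host, with their Vmid.
vim-cmd vmsvc/getallvms

# Unregister the zombie VM by its Vmid (42 is a placeholder).
vim-cmd vmsvc/unregister 42

# After fixing/moving the .vmx so it points at the correct .vmdk
# directory, register it again (the path is a placeholder).
vim-cmd solo/registervm /vmfs/volumes/datastore1/MyVM/MyVM.vmx
```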

I hope that makes sense and is helpful!

Cheers,

Tom

0 Kudos