VMware Cloud Community
mrweetman
Contributor

cold migration corrupted NTFS on VM

Hi,

A few days ago I did a cold migration from one cluster to another within the same datacenter. This caused NTFS to behave badly on about 3/4 of the migrated VMs. Three machines never booted (cannot find ntoskrnl.exe); several others either crashed on the next reboot or had to be restored from backup due to other problems (services not starting, and so on).

What did I do:

  • Selected 6-7 powered-on VMs in the VC client. The VMs were possibly on several hosts and on different LUNs.

  • Gave the "guest power off" command

  • Migrated all VMs to the same LUN in a different cluster

  • Powered the VMs back up

I did this twice, for a total of 13 VMs.

My environment consists of:

Virtual Center server on version 4.01

ESX hosts on 3.5 update 4

LUNs on an EMC Symmetrix SAN

My questions are:

Has anyone else had similar experiences?

Is vSphere VC and ESX3.5 a bad combination?

Is it not recommended practice to move several VMs to the same LUN in one go?

Any suggestions as to why this happens? My storage team has suggested SCSI reservations due to simultaneous writes.

Regards

Marius Aulie

5 Replies
AmaroITSS
Contributor

>> Has anyone else had similar experiences?

I know customers that had a similar issue.

>> Is vSphere VC and ESX3.5 a bad combination?

Not that I'm aware of.

>> Is it not recommended practice to move several VMs to the same LUN in one go?

It really depends on the type of storage you are using. I have never experienced OS corruption due to the migration of multiple VMs.

If I migrate multiple VMs, I distribute the load across different LUNs, or I migrate two at a time to avoid performance issues for the existing VMs on that LUN.
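As a rough illustration of that approach, the following Python sketch plans a migration in small batches and spreads the VMs round-robin over several target LUNs. It only builds a plan and does not talk to VirtualCenter or ESX; the function name `plan_batches`, the VM names, and the LUN names are all hypothetical placeholders.

```python
from itertools import cycle


def plan_batches(vms, datastores, batch_size=2):
    """Return a list of batches; each batch is a list of (vm, datastore) pairs.

    VMs are assigned to target datastores round-robin, then grouped so that
    at most `batch_size` migrations would run at the same time.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if not datastores:
        raise ValueError("at least one target datastore is required")
    targets = cycle(datastores)
    assignments = [(vm, next(targets)) for vm in vms]
    return [
        assignments[i:i + batch_size]
        for i in range(0, len(assignments), batch_size)
    ]


if __name__ == "__main__":
    # Hypothetical inventory: seven VMs, two target LUNs.
    vms = [f"vm{n:02d}" for n in range(1, 8)]
    for number, batch in enumerate(plan_batches(vms, ["LUN_A", "LUN_B"]), start=1):
        print(f"batch {number}: {batch}")
```

Each batch would be migrated and verified before the next one starts, so a problem with the target datastore shows up after two VMs rather than after a dozen.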

>> Any suggestions to why this happens?

The only concern is that you "powered off" the virtual machines, which is the same as unplugging the power cable from a physical server and can cause OS corruption.

Recommendations:

Send a "Shut Down Guest" command if you cannot VMotion the VMs.

If you need to migrate more VMs, clone the VM(s) or back up the VMDKs prior to the migration until you are sure there are no issues with that datastore.

Backing up the VMDKs is a lot faster and safer than backups from third-party applications.
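If you do copy the VMDKs first, one way to gain some confidence in the copy is to compare checksums of the original and the backup while the VM is powered off. The sketch below uses only the Python standard library and assumes both files are readable from wherever the script runs; the function names and the example paths are hypothetical, and it says nothing about the health of the datastore itself.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, backup):
    """True if both files have identical content according to SHA-256."""
    return sha256_of(original) == sha256_of(backup)


# Example call (hypothetical paths, VM powered off):
# copies_match("/vmfs/volumes/old_lun/vm01/vm01-flat.vmdk",
#              "/backup/vm01/vm01-flat.vmdk")
```

Reading in chunks keeps memory use flat even for large disk files; a mismatch means the copy should not be trusted as a fallback.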

Hope this helps.

mrweetman
Contributor

Thank you for your reply. It is interesting to hear that this has happened to others (but it is not a good thing that it happened).

Just to clear up some misunderstandings:

I did use the "power off guest" command (which shuts down the guest operating system before powering off)

I cannot use VMotion because hosts are in two separate clusters and do not share SAN resources (and due to LUN ID reasons it is not possible to give a host storage from both clusters)

We had just configured snapshots on the SAN, so I had a backup that was only a few days old (which was good enough).

Edit: I too usually move one or two VMs at a time and spread the load across different LUNs. But this was maintenance-window work, so performance loss for other users was not an issue, and since I had many VMs to move I wanted to be efficient and move a whole bunch at once (yeah, saved me a whole lot of time :s )

-Marius

jpdicicco
Hot Shot

I cannot use VMotion because hosts are in two separate clusters and do not share SAN resources (and due to LUN ID reasons it is not possible to give a host storage from both clusters)

This comment surprises me. I have several vdisks/volumes on an EVA that are attached to multiple clusters. In each cluster, they are assigned different LUNs. This has not been a problem, and shouldn't be. In fact, you can have different hosts within a cluster have the same volume assigned to different LUNs. This is not a best practice as it makes management thorny at best, but it is possible.

I'm curious, what setup do you have that you cannot share the volumes across clusters?

Happy virtualizing!

JP

Please consider awarding points for correct and/or helpful answers

mrweetman
Contributor

We got the new SAN a few months ago and have only just started using it. I am unsure of the details behind why one host cannot be a member of two storage groups, as this could be done on the old SAN. The reason I got from my storage guy is that the LUN IDs are identical in the different storage groups, and therefore one host cannot be a member of more than one group at the same time. I haven't gotten as far as investigating the details, so that's all I know.

-M-

mrweetman
Contributor

As AmaroITSS said, he has heard of (experienced) similar cases, and I am sure others have too. As the week has gone by, all of the VMs I moved have crashed. Could a stressed VMFS disk (LUN on the SAN) corrupt a virtual machine's hard disk when it is written (moved) to the datastore? I find myself wanting a reason. Does anyone have any idea where I can find more information or where to start troubleshooting this?

-Marius
