jtloser
Contributor

VM Crashes shortly after a VMotion

I have an ESX farm with 4 ProLiant DL360 G5 dual quad-core servers (all identical) with 32 GB of RAM. The datastore is on an HP EVA, and I currently have 30 VMs running. I have just noticed that when I VMotion a machine from one host to another, everything seems fine at first: I can log in (or, if I was already logged in, I stay logged in). However, about 2 minutes later, the machine that I moved crashes and reboots. The only indication of the problem is a message in the VM's Windows event log saying that the last system shutdown was unexpected. If I am logged in to the console, it just goes black and reboots. It is essentially as if I had reset the VM.

Has anyone else seen this type of behavior?

I am running ESX 3.5 Update 3 (build 123630)

Virtual Center is 2.5 build 119598

Thanks


Accepted Solutions
AdamSnow
Enthusiast

This is a bug that was introduced in ESX 3.5 Update 3. It only happens if you have Virtual Machine Monitoring (VMM) enabled. What happens is that the heartbeat gets screwed up during a VMotion, so after 2 minutes Virtual Machine Monitoring thinks the VM is down, and HA kicks in and starts it on another server. We experienced the same issue right after "upgrading" to Update 3. After turning off VMM it never happened again.
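To make the timing concrete, here is a toy model of a heartbeat watchdog. It is only a sketch of the general idea, not VMware's actual implementation: the names `HeartbeatMonitor` and `first_reset_time` are made up for illustration, the 120-second timeout is simply taken from the "about 2 minutes" seen in this thread, and the 10-second heartbeat interval is an assumption.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Toy heartbeat watchdog (hypothetical; not VMware's real code)."""
    timeout_s: float = 120.0   # assumed: matches the ~2 minutes reported in this thread
    last_beat_s: float = 0.0   # time of the most recent heartbeat received

    def beat(self, now_s: float) -> None:
        # Record that a heartbeat arrived at time now_s.
        self.last_beat_s = now_s

    def should_reset(self, now_s: float) -> bool:
        # The VM is treated as failed once no heartbeat has been seen
        # for longer than the timeout.
        return now_s - self.last_beat_s > self.timeout_s


def first_reset_time(heartbeats_stop_at_s, beat_every_s=10.0,
                     run_for_s=600.0, timeout_s=120.0):
    """Return the first time the watchdog would reset the VM, or None.

    heartbeats_stop_at_s models the moment heartbeats stop reaching the
    monitor (for example, at the VMotion); None means they never stop.
    """
    monitor = HeartbeatMonitor(timeout_s=timeout_s)
    now = 0.0
    while now <= run_for_s:
        still_reporting = heartbeats_stop_at_s is None or now < heartbeats_stop_at_s
        if still_reporting:
            monitor.beat(now)
        elif monitor.should_reset(now):
            return now
        now += beat_every_s
    return None


if __name__ == "__main__":
    print("healthy VM reset at:", first_reset_time(None))
    print("heartbeats lost at 60 s, reset at:", first_reset_time(60.0))
```

In this model the guest itself never crashes. The watchdog just stops seeing heartbeats, waits out the timeout, and then resets a perfectly healthy VM, which would match the "as if I had reset the VM" symptom described above.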

I was hoping the new patches released today would fix the issue, but according to the changelogs this bug is not addressed. Amazing. I would have thought after the Update 2 timebomb fiasco that quality control and testing would be important, and that if something did slip through, a fix would come relatively soon thereafter.

6 Replies
weinstein5
Immortal

Welcome to the Forums - Is this happening to just 1 VM or all 30? What applications is the VM running?

If you find this or any other answer useful please consider awarding points by marking the answer correct or helpful
ChrisDearden
Expert

Does it depend on the level of aggressiveness you have Virtual Machine Monitoring set to?

If this post has been useful , please consider awarding points. @chrisdearden http://jfvi.co.uk http://vsoup.net
jtloser
Contributor

Turning off VMM seems to have done the trick!

Thanks, Adam!

jtloser
Contributor

Sorry

jtloser
Contributor

I thought my reply wasn't working.
