VMware Cloud Community
jtloser
Contributor

VM Crashes shortly after a VMotion

I have an ESX farm with 4 ProLiant DL360 G5 dual quad-core servers (all identical) with 32 GB of RAM. The datastore is on an HP EVA and I currently have 30 VMs running. I have just noticed that when I VMotion a machine from one host to another, everything seems fine at first. I can log in (or, if I was already logged in, I am still logged in). However, after about 2 minutes, the machine that I moved will crash and reboot. The only indication of the problem is a message in the Windows event log of the VM indicating that the last system shutdown was unexpected. If I am logged in to the console, it just goes black and reboots. It is essentially as if I had reset the VM.

Has anyone else seen this type of behavior?

I am running ESX 3.5 Update 3 (build 123630)

Virtual Center is 2.5 build 119598

Thanks


Accepted Solutions
AdamSnow
Enthusiast

This is a bug that was introduced in ESX 3.5 Update 3. It only happens if you have Virtual Machine Monitoring (VMM) enabled. The heartbeat gets screwed up during a VMotion, so after 2 minutes Virtual Machine Monitoring thinks the VM is down, and HA kicks in and starts it on another server. We experienced the same issue right after "upgrading" to Update 3. After turning off VMM it never happened again.
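
To make the timing concrete, here is a toy Python model of a heartbeat watchdog. It is only a sketch of the general idea, not VMware's actual code: the class and function names are made up for illustration, and the 120-second failure interval is just the "about 2 minutes" observed in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatWatchdog:
    """Toy model of a heartbeat monitor (hypothetical, not VMware's implementation)."""
    failure_interval_s: float = 120.0  # assumed from the ~2 minutes reported in the thread
    last_heartbeat_s: float = 0.0

    def record_heartbeat(self, now_s: float) -> None:
        # Remember when the guest last reported in.
        self.last_heartbeat_s = now_s

    def should_reset(self, now_s: float) -> bool:
        # The VM is treated as failed once no heartbeat has arrived
        # for a full failure interval.
        return now_s - self.last_heartbeat_s >= self.failure_interval_s


def simulate(heartbeats_stop_at_s: float, end_s: float, step_s: float = 10.0) -> Optional[float]:
    """Return the time at which the watchdog would reset the VM, or None if it never does."""
    watchdog = HeartbeatWatchdog()
    t = 0.0
    while t <= end_s:
        if t < heartbeats_stop_at_s:
            # Heartbeats arrive normally until the (simulated) migration.
            watchdog.record_heartbeat(t)
        elif watchdog.should_reset(t):
            return t
        t += step_s
    return None


if __name__ == "__main__":
    # Heartbeats are lost from t=60 s onward, e.g. because of a migration.
    print(simulate(heartbeats_stop_at_s=60.0, end_s=600.0))
```

In this model the guest is perfectly healthy the whole time; the reset is triggered purely because the monitor stops seeing heartbeats, which matches the symptom of a VM that works after the VMotion and then reboots about two minutes later.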

I was hoping the new patches released today would fix the issue, but according to the changelogs this bug is not addressed. Amazing. I would have thought after the Update 2 timebomb fiasco that quality control and testing would be important, and that if something did slip through, a fix would come relatively soon thereafter.

weinstein5
Immortal

Welcome to the forums. Is this happening to just 1 VM or to all 30? What applications is the VM running?

ChrisDearden
Expert

Does it depend on the level of aggressiveness you have Virtual Machine Monitoring set to?

jtloser
Contributor

Turning off VMM seems to have done the trick!

Thanks, Adam!

jtloser
Contributor

Sorry

jtloser
Contributor

I thought my reply wasn't working.
