Jeroenix
Contributor

Same issue: a cluster of BL460c blades on iSCSI to a 3PAR 7200. I backed up 213 VMs using Veeam 8, and one of the VMs crashed and failed over while Veeam was removing the snapshot. It happened on one of the very few VMs that had upgraded VMware Tools (I hadn't got round to updating the Tools on all VMs, so only 10 of them are running the upgraded version).

Are there more people like KBadmin and me who see more frequent crashes on VMs with updated VMware Tools than on VMs with old VMware Tools?

JumpMaster
Contributor

I have 23 ESXi hosts upgraded to Update 3 and am seeing this issue with Veeam. This started happening shortly after upgrading to Veeam 8, so I had tickets open with Veeam and VMware. After two days of sending log after log after log (you know the drill), they told us about this issue. All that time we were continuing our upgrade to Update 3.

It didn't sound like they were close to a fix.

fr8rt8rt
Contributor

What did they tell you about this issue? I want to upgrade to U3!

JumpMaster
Contributor

DON'T!

MikeStone226
Contributor

I started reverting my hosts back to U2 (VMware KB: Reverting to a previous version of ESXi), since I haven't received *ANYTHING* on my Support Request (which was shocking; usually support is great). I rolled back a host for testing and found that the guests are now only visible through the vSphere Web Client and not the vSphere Client. That might be specific to me, but who knows; just a heads-up.

KBadmin
Contributor

@MikeStone226

I had the same problem after reverting my hosts. I think you have to connect every VM in the vSphere Client manually too.

Hope VMware fixes this bug soon!

igonzalez82
Contributor

Same problem here.

I have some ESXi hosts on version 5.5 Update 2, and the issue doesn't happen there.

I'm also using Veeam and multivendor 10Gb SANs (NetApp + SolidFire).

Does anyone have more information from VMware?

Do you recommend downgrading? Disabling CBT can cause problems if your SAN infrastructure is a bit busy.

airfrog7
Enthusiast

I have a case raised with VMware about this issue. The tech support guy I spoke to acknowledged this is a pretty major bug. Something is causing the Windows guest (I don't know if it affects Linux) to crash on snapshot removal. They don't have a workaround other than downgrading, which we are currently doing. This issue has apparently been escalated to the highest level as it is affecting an awful lot of customers. They don't have an ETA for when a fix will be available.

The error you see in the vmware.log file for the VM will look something like this:

2015-09-29T17:24:42Z[+8.996]| vcpu-0| I120: SymBacktrace[1] 000003fffbf1af60 rip=00000000162957c5 in function (null) in object /bin/vmx loaded at 000000001611f000
2015-09-29T17:24:42Z[+8.996]| vcpu-0| I120: SymBacktrace[2] 000003fffbf1b460 rip=0000000016377766 in function (null) in object /bin/vmx loaded at 000000001611f000
2015-09-29T17:24:42Z[+8.996]| vcpu-0| I120: SymBacktrace[3] 000003fffbf1b640 rip=00000000003d500f
2015-09-29T17:24:42Z[+8.996]| vcpu-0| I120: Msg_Post: Error
2015-09-29T17:24:42Z[+8.996]| vcpu-0| I120: [msg.log.error.unrecoverable] VMware ESX unrecoverable error: (vcpu-0)
2015-09-29T17:24:42Z[+8.996]| vcpu-0| I120+ Unexpected signal: 11.

In our case it is affecting maybe 1% of our VMs every evening, but different VMs each time.
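
If you want to check whether a given VM hit this, a small script can search its vmware.log for the two signature lines quoted above. This is only a rough sketch: the function names are made up for illustration, and it assumes the crash always logs either `[msg.log.error.unrecoverable]` or `Unexpected signal: 11` as in the excerpt, which may not hold for every occurrence.

```python
import re
from typing import Iterable, List

# Signature strings taken from the log excerpt in this post.
# Assumption: every occurrence of the bug logs at least one of them.
SIGNAL_RE = re.compile(r"Unexpected signal: (\d+)")
UNRECOVERABLE = "[msg.log.error.unrecoverable]"


def find_vmx_crashes(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match the crash signature from this thread."""
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        match = SIGNAL_RE.search(line)
        # Keep the "unrecoverable error" line and any "signal 11" line.
        if UNRECOVERABLE in line or (match and match.group(1) == "11"):
            hits.append(line)
    return hits


def scan_log(path: str) -> List[str]:
    """Scan one vmware.log file; undecodable bytes are replaced, not fatal."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return find_vmx_crashes(handle)
```

Calling `scan_log()` on a copy of a VM's vmware.log returns the matching lines, or an empty list if the signature is absent. An empty result only means these strings were not found, not that the VM is unaffected.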

KBadmin
Contributor

I think so too; it's affecting Windows guests.

We had these problems only on 64-bit systems running Server 2008 and Server 2012 R2. We have 2 VMs (Windows Server 2003, 32-bit) that we need for old applications, and these systems never crashed!

Our Linux servers never crashed either, because the Linux machines use different VMware Tools.

We saw the same effect: different machines crashed after deleting snapshots.

After rolling back our hosts to Update 2 we haven't had any problems.

Sergey_Petrushi
Contributor

Not only Windows guests. We have the same issue with Debian-based VMs.

vAMenezes
Enthusiast

Is everyone here using Veeam? Or is this happening with different backup tools? Anybody here using Commvault? I have upgraded to U3, but these are new hosts and I don't have anything in production running on them yet, so if I'm going to downgrade I need to do it now.

gregsn
Enthusiast

I've been able to reproduce the problem by manually creating and deleting snapshots so I don't think it's related to any particular backup software.

drc0106
Contributor

I have been working on a similar issue with VMware since Update 2 came out. It was related to snapshots with quiescing turned on. Check out the forum thread "Windows 7 and Windows2008R2 VM BSOD ntfs.sys" and KB2115997.

We have not installed Update 3 yet to fix the issue, so I have emailed the engineer I have been working with to see if he has any insight into it. You may want to try turning off quiesced snapshots and see if that works. If so, reinstall VMware Tools without the VSS Writer. The issue we are having is related to the fact that Update 2 changed the VSS Writer, which was causing our Windows servers to BSOD when snapshots were created. They were supposed to roll back to the old VSS Writer in Update 3 to fix this issue. That rollback may have caused some issues. However, I am only speculating here, as we have not updated to Update 3 yet, and we have only experienced the issue when snapshots are created, not when they are deleted.

Jeroenix
Contributor

Because in my case very few VMs go down during backup (only one, to be precise), I decided to let the Veeam backups run. Last night, guess what: the very same VM went down. Out of 213 VMs. I checked out this one VM but can't find anything out of the ordinary. I also opened a ticket with VMware; maybe if they collect all our logs, they can find a pattern.

igonzalez82
Contributor

Hi,

News from the support team:

"Last night engineering found the root cause. They will need to produce an express patch.

We have asked how long are we expecting to wait for this, I will follow up once a response has been received."

Fantomas01
Contributor

Would this bug be causing issues with Server 2012 VMs not booting up? We haven't deleted any snapshots. I patched our hosts yesterday as we were having an issue with a couple of our 2012 VMs getting stuck on the splash screen and not progressing.

The KB that I read about that issue said it was fixed in U3.

Thanks

GMZSE
Contributor

Still shocked there is no mention of this major issue when you are at the download page. Does anyone know an ETA for the fix?

mfedermanv
VMware Employee

Does this issue still exist for you, and what version of NetBackup?

suprnova13
Contributor

I have this same issue.  Spoke with support, here is the KB article: VMware KB: Snapshot consolidation causes virtual machines running on VMware ESXi 5.5 Update 3 hosts ...

He did say it should be fixed in express patch 8, but that won't be released for another two to three weeks.

admin
Immortal

The following KB article has been updated with additional workarounds, and the symptoms have been cleaned up:

http://kb.vmware.com/kb/2133118

Snapshot consolidation causes virtual machines running on VMware ESXi 5.5 Update 3 hosts to fail with the error: Unexpected signal: 11 (2133118)

Hope this helps!
