riki78
Contributor

Problem with quiesced snapshot on Server 2008 R2 with ESXi 4.1 U1

Hello

Since we updated from ESXi 4.1 to Update 1, we have had problems taking quiesced snapshots. Before we installed Update 1, quiesced snapshots worked without problems.

When we take a quiesced snapshot, Windows 2008 R2 mounts new volumes and removes them after the snapshot is taken.

In the event log I can see the following errors and information entries:

Multiple entries of these two events:

Log Name:      System
Source:        Virtual Disk Service
Date:          11.04.2011 22:40:26
Event ID:      3
Task Category: None
Level:         Information
Keywords:      Classic
User:          N/A

Description:
Service started.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Virtual Disk Service" />

and

Log Name:      System
Source:        Service Control Manager
Date:          11.04.2011 22:40:26
Event ID:      7036
Task Category: None
Level:         Information
Keywords:      Classic
User:          N/A
Computer:      TESTTEMPLATE.gef.be.ch
Description:
The Virtual Disk service entered the running state.

and

Log Name:      System
Source:        Ntfs
Date:          11.04.2011 22:20:57
Event ID:      57
Task Category: (2)
Level:         Warning
Keywords:      Classic
User:          N/A
Description:
The system failed to flush data to the transaction log. Corruption may occur.

and

Log Name:      System
Source:        Ntfs
Date:          11.04.2011 22:20:57
Event ID:      137
Task Category: (2)
Level:         Error
Keywords:      Classic
User:          N/A
Description:
The default transaction resource manager on volume \\?\Volume{d533451f-6478-11e0-94e8-005056ae007a} encountered a non-retryable error and could not start.  The data contains the error code.

and

Log Name:      Application
Source:        VSS
Date:          11.04.2011 18:05:48
Event ID:      12289
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      G3110SRV014APL.gef.be.ch
Description:
Volume Shadow Copy Service error: Unexpected error DeviceIoControl(\\?\fdc#generic_floppy_drive#6&2bc13940&0&0#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b} - 0000000000000418,0x00560000,0000000000000000,0,000000000027CDB0,4096,[0]).  hr = 0x80070001, Incorrect function.
.

Operation:
   Exposing Recovered Volumes
   Locating shadow-copy LUNs
   PostSnapshot Event
   Executing Asynchronous Operation

Context:
   Device: \\?\fdc#generic_floppy_drive#6&2bc13940&0&0#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b}
   Examining Detected Volume: Existing - \\?\fdc#generic_floppy_drive#6&2bc13940&0&0#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b}
   Execution Context: Provider
   Provider Name: VMware Snapshot Provider
   Provider Version: 1.0.0
   Provider ID: {564d7761-7265-2056-5353-2050726f7669}
   Current State: DoSnapshotSet
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="VSS" />
    <EventID Qualifiers="0">12289</EventID>
    <Level>2</Level>
    <Task>0</Task>
    <Keywords>0x80000000000000</Keywords>
    <TimeCreated SystemTime="2011-04-11T16:05:48.000000000Z" />
    <EventRecordID>6430</EventRecordID>
    <Channel>Application</Channel>
    <Security />
  </System>
  <EventData>
    <Data>DeviceIoControl(\\?\fdc#generic_floppy_drive#6&amp;2bc13940&amp;0&amp;0#{53f5630d-b6bf-11d0-94f2-00a0c91efb8b} - 0000000000000418,0x00560000,0000000000000000,0,000000000027CDB0,4096,[0])</Data>

When we remove the snapshot, the VM freezes for about 30 seconds (no ping, no RDP).

Does anyone have the same issue?

I have tried installing a new, clean Server 2008 R2, but it has the same problem.

Best Regards
Simon

59 Replies
Krille666
Contributor

Hey,

We have the same problem. This is the answer from VMware support (two emails):

1.

Thank you for your Support Request.

Please see KB 2006849 (http://kb.vmware.com/kb/2006849) for a description of your issue.

Let me know via replying to this email if you have any more questions!

Best Regards,

Then I asked about a "Fix by" date.

2.

Yes, VMware and Microsoft are currently investigating a way to suppress these false positives. If you subscribe to the RSS feed of the KB article I sent you, you will be automatically notified of any updates. Unfortunately as of yet we don't have a 'Fix By' date.

So, I think we have to wait.

vmbud
Contributor

Just out of curiosity:

Are any of the Win 2008 R2 VMs with the problem using VMDKs as dynamic disks?

I have not experienced any additional issues since the 100 MB system partition was removed.

We have a few SQL 2008 and other 2008 R2 servers, each with multiple VMDKs (some RDMs), all initialized as basic disks.

One thing I do recall is that mounted restore points for browsing and data recovery were read-only, and in some scenarios we could not continue browsing the contents. Disabling User Account Control fixed this issue.

- I’m using FC storage only (no iSCSI, no NFS)

- Windows guests with basic disks (no dynamic disks)

Other Win 2008 R2 observations:

- for every quiesced snapshot command there are 2 snapshot files (xxxx-00000001.vmdk and xxxx-00000002.vmdk)

- VDR mounts and backs up xxxx-00000002.vmdk and not original vmdk

Guillir
Enthusiast

Same errors with vSphere 5 build 474610.

Rajko
Contributor

Hello,

I have the same errors with vSphere 5 build 504890.

Rajko
Contributor

Same problem in vSphere 5 build 515841

bstephens
Contributor

Have any of you checked whether you have any unreleased, unnecessary shadow copies within the VM?  We were having this very issue with multiple Windows 2008 R2 and R2 SP1 virtual machines, but not all of them.  As we began looking for differences between the VMs, we checked which shadow copies, if any, were showing up within each VM by using the Microsoft DiskShadow command.

We simply opened a command prompt within both a broken VM and a working VM, ran diskshadow.exe, and issued the "list shadows all" command. On the broken VM we had an unreleased shadow copy from quite a while back; on the working VM there were no such shadow copies.  So we decided to delete the unreleased Microsoft shadow copy, as we had no use for it, by using the "delete shadows all" command of diskshadow.exe.  We used that command because we did not need to single out any particular shadow copy for deletion.

We believe this unreleased Microsoft shadow copy may have been generated during a failed backup.  The broken Windows 2008 R2 SP1 VM could not properly generate quiesced ESX snapshots at the application level; once we deleted the shadow copy and attempted another application-level quiesced ESX snapshot, it worked fine.  Directly before the deletion, that same ESX snapshot attempt had failed.
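For anyone who wants to repeat this check on several guests, here is a rough Python sketch of the same two DiskShadow commands run in script mode. It is only an illustration under assumptions: the function names and the temporary script file name are made up for this example, it needs an elevated prompt inside the Windows guest, and "delete shadows all" removes every shadow copy on the machine, so list first and delete only if nothing is needed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_diskshadow_script(delete: bool = False) -> str:
    """Return DiskShadow script text: list all shadow copies, optionally delete them."""
    commands = ["list shadows all"]
    if delete:
        # Destructive: removes every shadow copy on the system.
        commands.append("delete shadows all")
    return "\n".join(commands) + "\n"


def run_diskshadow(delete: bool = False) -> str:
    """Run diskshadow.exe in script mode (/s) and return its console output."""
    if sys.platform != "win32":
        raise RuntimeError("diskshadow.exe is only available on Windows")
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical script file name, chosen for this sketch only.
        script = Path(tmp) / "shadows.dsh"
        script.write_text(build_diskshadow_script(delete), encoding="ascii")
        result = subprocess.run(
            ["diskshadow.exe", "/s", str(script)],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout


if __name__ == "__main__" and sys.platform == "win32":
    # List only; pass delete=True once you are sure no shadow copy is needed.
    print(run_diskshadow(delete=False))
```

Keeping the script text in its own function makes it easy to review exactly what will be sent to DiskShadow before anything is deleted.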

We have confirmed that these steps have fixed all of our Windows 2008 R2 and R2 SP1 virtual machines that were having issues generating quiesced ESX snapshots.

Hopefully this may be the root of some of the issues listed here by people as well and these steps may help some of you resolve them.

Good Luck.

Current Environment -

vCenter Server 5.0.0-455964

Hosts running mix of ESXi 5.0.0-469512 Enterprise and Enterprise Plus

Virtual Machine Version: 8

Krille666
Contributor

Hi bstephens,

I just tried your solution.

But on our machine, where the failure occurs, we have no shadow copies.

mitchellm3
Enthusiast

I don't know how everyone is faring with this problem, but I thought I'd throw out what our company found to get around this issue.

Our issues started when we went to vSphere 5.0.  We had used 4.1, but most of our environment at the time was 4.0.  With the new VMs on 5.0 we couldn't figure out why this issue was happening.  As everyone knows, the VM snapshot would fail to quiesce, the backup would fail, and most likely the VM itself would get so jacked up because of it that we would have to power cycle the VM.  When the VM came back up, we'd have to consolidate the snapshot due to the residue of the failed snapshot.

We found the workaround of changing disk.enableuuid to false in order to get good backups again.  I did place a call with VMware at the time, and they said it was an issue with Microsoft and that I'd have to open a call with them to find out which VSS writer was bad.  They did, however, point to an article about how to run a snapshot and exclude certain VSS writers, but I didn't find that article to be the greatest.  Anyway, the solution going forward was to change disk.enableuuid to false during our patching cycles until we came up with a better solution.
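To make the setting concrete, here is a small Python sketch of the text change in a VM's .vmx file. It is an assumption-laden illustration, not a VMware tool: the function name is invented, it works on the file contents as a string, and it assumes the usual `key = "value"` line format. The VM should be powered off before its .vmx is edited, and the sketch says nothing about whether the setting is appropriate for a given VM.

```python
import re

# Matches an existing disk.EnableUUID line in .vmx text, in any letter case,
# e.g.: disk.EnableUUID = "TRUE"
_ENABLE_UUID = re.compile(
    r'^[ \t]*disk\.enableuuid[ \t]*=[ \t]*"[^"]*"[ \t]*$',
    re.IGNORECASE | re.MULTILINE,
)


def set_enable_uuid(vmx_text: str, enabled: bool) -> str:
    """Return vmx_text with disk.EnableUUID set to TRUE or FALSE.

    An existing line is rewritten in place; otherwise a new line is appended.
    """
    value = "TRUE" if enabled else "FALSE"
    line = f'disk.EnableUUID = "{value}"'
    if _ENABLE_UUID.search(vmx_text):
        return _ENABLE_UUID.sub(line, vmx_text)
    if vmx_text and not vmx_text.endswith("\n"):
        vmx_text += "\n"
    return vmx_text + line + "\n"
```

The same function flips the value back to TRUE later, which is handy if many VMs were changed during a patch cycle.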

Well, after some tinkering I finally found the issue that was causing the backups to fail, and it was dumb luck too.  We just so happened to be preparing to upgrade our primary backup servers from Arcserve 15 SP1 to Arcserve 16 SP1 in order to use VMFS5 and also to get around the "restore" issue with ESXi 5.  Anyway, since all the agents would need to be upgraded at the same time as the backup server, we have been uninstalling the open file agent.  The reason for this is that we can push the 16 SP1 client upgrade to all the VMs, but the open file agent always requires a reboot.  The open file agent also isn't really being used on these servers because we do the image backup with file-level restore, so there is no need for the open file agent.  Also, we have been configuring the open file agent to use VSS instead of using its own open file agent, kinda like VMware using VSS instead of their sync driver.  Anyway, when I went to test a backup with the disk.enableuuid option set to true, to my surprise it worked.  I then installed the open file agent from Arcserve and it bombed.  Uninstalled it and it worked again.

So, the good news is that we now know that uninstalling the open file agent fixes our problem.  I'm glad that I didn't go the route of following that VMware article and trying to exclude each VSS writer, because Arcserve didn't install any writers and I would never have gotten the backup to work with the disk.enableuuid option set to true.  The bad news is that 2/3 of our 2008 R2 servers now have disk.enableuuid set to false and we need to change them back.

Doh!

sgunelius
Hot Shot

We're also seeing this on our 2008 R2/Exchange 2010 CAS/HT VMs.  The VMs are on ESXi 4.1 U1 and are Thick provisioned.  We've been seeing errors for a while, but now the issue seems to upset CAS and it won't support access to Exchange for clients until the guest is rebooted.  We are running ARCserve Backup r16 and backing up the VMs in Raw mode.  I'm opening up a case with VMware through HP and another with CA technical support.  I'm hoping somebody has a solution because periodically rebooting the server isn't a great workaround.

Yps
Enthusiast

I still get this error on 5.1 with the latest Tools on Win2008 R2.

Is there a recommendation from Microsoft on this?

Is it a VMware or Microsoft problem?

KB from VMware: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=103129...

Thanks, Magnus

softeislutscher
Contributor

Hi guys,

I'm running into the same issue with some ESXi hosts...

But as this problem has existed for a few years, shouldn't there be a solution by now?

The funny thing is, it doesn't seem to occur with all VMware builds...

thanks

Alex

smockey
Contributor

Hello,

Sorry to dig up this discussion, but is there a solution to this problem?

The VMware KB doesn't offer any solution: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=200684...

I'm on ESX 5.0 with an up-to-date W2K8 R2, and some VMs are not being backed up due to this problem (event ID 12289).

If anybody could help me, I would appreciate it.

Regards

Damien

theklaas
Contributor

Hello,

As far as I know, there is no solution for this "problem".

But there are multiple issues associated with this problem. VMware and Microsoft each point to the other as the one to blame.

If the event log shows something about a floppy disk, just remove the floppy drive and controller.
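As a quick way to check whether the floppy device is what VSS is complaining about, here is a small Python sketch that looks for the floppy device path in an event description. The function name is made up for this example, and the pattern is based only on the device path that appears in the event 12289 text in the first post; other guests may report a different path.

```python
import re
from typing import List

# Device path prefix as it appears in the VSS event 12289 description:
# \\?\fdc#generic_floppy_drive#...
FLOPPY_DEVICE = re.compile(r"\\\\\?\\fdc#generic_floppy_drive#[^\s,)]*", re.IGNORECASE)


def floppy_devices_in_message(message: str) -> List[str]:
    """Return the distinct floppy device paths mentioned in an event description."""
    return sorted({match.group(0) for match in FLOPPY_DEVICE.finditer(message)})


if __name__ == "__main__":
    sample = (
        r"Volume Shadow Copy Service error: Unexpected error "
        r"DeviceIoControl(\\?\fdc#generic_floppy_drive#6&2bc13940&0&0#"
        r"{53f5630d-b6bf-11d0-94f2-00a0c91efb8b} - 0000000000000418,0x00560000)"
    )
    for device in floppy_devices_in_message(sample):
        print("Floppy device referenced:", device)
```

An empty result means the error text refers to some other device, in which case removing the floppy drive is unlikely to change anything.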

Keep Windows and VMware up to date.

And when the VM runs an application like Exchange or SQL, you can ignore this error. But make a backup with a proper backup solution.

For the complete story, see mulio's post on page 2.

Source: http://technet.microsoft.com/de-de/library/ee923636(WS.10).aspx

If the shadow copy is successfully created, the Volume Shadow Copy Service returns the location information for the shadow copy to the requester. In some cases, the shadow copy can be temporarily made available as a read-write volume so that VSS and one or more applications can alter the contents of the shadow copy before the shadow copy is finished. After VSS and the applications make their alterations, the shadow copy is made read-only. This phase is called Auto-recovery, and it is used to undo any file-system or application transactions on the shadow copy volume that were not completed before the shadow copy was created.

greetings,

peter

Fusi
Contributor

Is a solution now available?

Thorian93
Contributor

Just to keep this thread active: same question as Fusi.

We are running ESX 5.5, but as far as my research took me, this problem occurs on all versions above 4.1 U1.

The following KB article contains information about the problem itself but offers no satisfying solution: Creating a quiesced snapshot of a Windows virtual machine generates Event IDs 50, 57, 137, 140, 157,...

Does anyone have news on this?

"A person who never made a mistake never tried anything new." - Albert Einstein

VMWareESX1
Contributor

We too are affected by this issue.  It is not very promising to see how far back this issue dates: April 2011.  We have over 20 servers, and there is not a single one that doesn't produce these error messages while quiescing the virtual machine (2008 R2, 2012, 2012 R2).  Please do not suggest disabling VSS writers or uninstalling the VMware Tools specific VSS writers, unless of course you don't care about your backups having any integrity.  Also, please be aware that Disk.EnableUUID = False is said to leave the VM in an inconsistent state as well, so it should never be called a workaround.  I would like to see a solution eventually, although at this rate it looks like this issue will never get resolved.

brykan
Contributor

Brand new setup with an EMC Unity 300 hybrid SAN, 3 Dell R630 hosts running VMware 6.0, and mostly Server 2012 R2.

All my 2012 R2 based virtual machines display Ntfs errors (event IDs 50, 57, 137, 140, 157, or 12289) directly after backups launch (after the VMware Snapshot Provider service enters the running state). I called VMware tech support and they want me to open a case with Microsoft.  I have an open/archived VMware case, SR 17471003405, and am now trying to figure out what to do next. Of course I have found the related VMware article KB 2006849, which basically acknowledges this problem going back to version 4.1 but doesn't propose any solutions. I have also found many dead-end threads, both inside and outside this forum, talking about this issue with no resolution. How can this be? I am so put off by the fact that VMware wants to point fingers and have me spend $500 to open a case with Microsoft that I just want to get rid of VMware altogether at this point.
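To see at a glance which of those event IDs show up around a backup, here is a rough Python sketch that scans a plain-text event export in the "Log Name: / Source: / Event ID:" layout used in the first post. The function names are invented for this example, and the parser only understands that text layout, not .evtx files or the Event Xml section.

```python
import re
from typing import Dict, List

# Event IDs listed above as appearing during quiesced snapshots.
SNAPSHOT_EVENT_IDS = {50, 57, 137, 140, 157, 12289}

# Header fields of interest in the text export of an event.
_FIELD = re.compile(r"^(Log Name|Source|Date|Event ID|Level):[ \t]*(.*)$", re.MULTILINE)


def parse_events(text: str) -> List[Dict[str, str]]:
    """Split an exported event dump into one dict per event.

    A new event starts at each "Log Name:" line.
    """
    events: List[Dict[str, str]] = []
    current = None
    for match in _FIELD.finditer(text):
        key, value = match.group(1), match.group(2).strip()
        if key == "Log Name":
            current = {}
            events.append(current)
        if current is not None:
            current[key] = value
    return events


def snapshot_related(events: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only the events whose ID is in SNAPSHOT_EVENT_IDS."""
    return [
        event
        for event in events
        if event.get("Event ID", "").isdigit()
        and int(event["Event ID"]) in SNAPSHOT_EVENT_IDS
    ]
```

Comparing the "Date" field of the matching events with the backup start time is one way to confirm that they only appear while a snapshot is being taken.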

The only article I have found that actually proposes a fix is here: https://www.vnotions.com/2015/10/20/fix-vmware-quiesced-snapshots-failing-unexpected-error-deviceioc...

But wow. Mine is not a huge environment, so I could do this. Has anyone out there actually worked with VMware to call Microsoft and successfully resolved this some other way?

 
