VMware Cloud Community
irvingpop2
Enthusiast

Resolution: VDR "Trouble writing to destination" errors with CIFS target on Windows 2008 R2

VDR users,

We recently switched our CIFS backup destinations from a Windows 2003 server to Windows 2008 R2 (SP1, 64-bit). Since then we have had nothing but problems: nearly all backups failed or were corrupted, and integrity checks always failed. I'm posting my resolution here so that others may find it and avoid the same issue.

The error shown in the VDR console was this:

Trouble writing to destination, error -102 ( I/O error)
And on the appliance itself, there were repeated kernel messages like this:
CIFS VFS: Send error in Flush = -11
CIFS VFS: Send error in Flush = -9
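
To see how often the appliance hits these errors, something like the following could tally them from the kernel log. This is only a rough sketch: the function names are made up for illustration, and the errno names are the standard Linux meanings of 9 and 11, not anything taken from VDR documentation.

```python
import re
from collections import Counter

# Standard Linux errno names for the two codes seen in this thread.
ERRNO_NAMES = {9: "EBADF", 11: "EAGAIN"}

# Matches kernel lines such as "CIFS VFS: Send error in Flush = -11".
FLUSH_ERROR = re.compile(r"CIFS VFS: Send error in Flush = -(\d+)")


def count_flush_errors(lines):
    """Count CIFS flush send errors per errno value (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = FLUSH_ERROR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def summarize(counts):
    """Return one readable line per errno, sorted by errno number."""
    return [
        f"errno {code} ({ERRNO_NAMES.get(code, 'unknown')}): {n}"
        for code, n in sorted(counts.items())
    ]
```

You would feed it the lines of `dmesg` output or of whatever log file your appliance writes kernel messages to (the exact location is an assumption you would need to check).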
The root cause, we discovered, was a setting found in this document:

Performance Tuning Guidelines for Windows Server ... - Microsoft

Namely on page 58:

  

TreatHostAsStableStorage

HKLM\System\CurrentControlSet\Services\LanmanServer\Parameters\(REG_DWORD)

The default is 0. This parameter disables the processing of write flush commands from clients. If the value of this entry is 1, the server performance and client latency for power-protected servers can improve. Workloads that resemble the NetBench file server benchmark benefit from this behavior.

Change this setting to 1. As the description above says, the server then stops processing write flush commands from clients, so the Flush commands from the VDR appliance no longer fail. Since making this change (and rebooting), our VDR backups have been running smoothly again.
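
If you would rather import the change than click through regedit, a small script can generate a .reg file for it. This is a minimal sketch, assuming the key path quoted above; the function name and the output file name are hypothetical, and you should review the generated file before importing it on the file server.

```python
# Key path as quoted from the tuning guide, written out in .reg file form.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters"


def build_reg_file(value=1):
    """Build .reg file text that sets TreatHostAsStableStorage (hypothetical helper)."""
    if value not in (0, 1):
        raise ValueError("TreatHostAsStableStorage must be 0 or 1")
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        # REG_DWORD values are written as eight hex digits.
        f'"TreatHostAsStableStorage"=dword:{value:08x}',
        "",
    ])


if __name__ == "__main__":
    # The file name is an arbitrary choice for this example.
    with open("treat_host_as_stable_storage.reg", "w", newline="\r\n") as handle:
        handle.write(build_reg_file(1))
```

Double-clicking the resulting file on the Windows 2008 R2 server (or importing it from regedit) applies the value. As noted above, we also rebooted before the change took effect for us.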

This should be noted in a KB article somewhere, so others won't have to lose any hair over it. Like one of these:


3 Replies
rmalayter
Contributor

Great find... we've been experiencing this too, intermittently, against a cleanly installed Win2008R2 target, but only when we have several backup jobs going at once. We will give it a shot; it makes sense.

melevy
Contributor

I, too, found that the error would only occur when multiple jobs were running at once. We dropped the default of 8 simultaneous jobs to 4, and several stubborn VMs that would consistently get that error completed successfully.

Mark Levy

Enterprise Technology Associates

irvingpop2
Enthusiast

I have, like many others in this forum, completely given up on VDR. Sure, you can get it working OK for a little while, but at some point it will explode in your face.

When it does, you'll realize that you have no way to repair your corrupted destination or manually extract any of the data from it. All of your backup data will be lost.

After some digging you'll also realize that the VDR appliance (CentOS 5.5) ships with old and buggy CIFS VFS client drivers and that this whole thing was an accident waiting to happen.

It could be prevented so easily by:

  1. Allowing native NFS destinations (which will always be more stable and performant on Linux than CIFS)
  2. Using a newer kernel and CIFS client (like CentOS 6.2 or Ubuntu 12.04)

Better yet, allow the VDR appliance to run on a standalone ESX host outside of the VMware cluster, or on physical hardware. Then it could actually scale and perform like a proper backup solution.
