VMware Cloud Community
Eric911
Contributor

Exchange 2010 cluster loses quorum during nightly Avamar VMDK backups

Hi,

I've been struggling with this issue for a while now. It happened when we used Backup Exec and it continues to happen with Avamar. I am currently performing nightly VMDK backups with Avamar, and it appears that the stun phase of the snapshot on the Exchange VMs causes the cluster to lose quorum. The cluster recovers on its own, but each night my mail databases end up mounted on a different server.

We are using Exchange 2010 SP3 with Microsoft Failover Clustering (configured for Node Majority) on Windows Server 2008 R2 VMs. I have 3 database servers and 2 CAS servers; the CAS servers use Microsoft NLB.
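To make the quorum arithmetic explicit, here is a minimal sketch of how I understand Node Majority voting, assuming only the 3 database servers (the DAG members) hold votes and the CAS servers do not. The function names are just for illustration.

```python
def votes_needed(voters: int) -> int:
    """Smallest number of votes that is a strict majority of `voters`."""
    return voters // 2 + 1


def failures_tolerated(voters: int) -> int:
    """How many voters can drop out before the cluster loses quorum."""
    return voters - votes_needed(voters)


# Assumption: the 3 database servers are the only voting nodes.
dag_members = 3
print("votes needed for quorum:", votes_needed(dag_members))
print("node failures tolerated:", failures_tolerated(dag_members))
```

So with 3 voters the cluster should survive one node dropping out, but lose quorum if two nodes are seen as unreachable at the same time.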

ESXi 5.5 with vCenter across 6 hosts. Shared storage on a NetApp FAS3220 SAN (VMFS), with 2 shelves of SATA disks.

I've checked the registry on each Exchange server, and the disk timeout is set to 60 seconds (has anyone successfully increased this beyond 60 with no adverse effects?). Below are the Failover Cluster properties for the DAG:

T  Cluster Name                             Value
-- ------- -------------------------------- ---------
DR DAG1    FixQuorum                        0 (0x0)
DR DAG1    IgnorePersistentStateOnStartup   0 (0x0)
SR DAG1    SharedVolumesRoot                C:\Cluste
D  DAG1    AddEvictDelay                    60 (0x3c)
D  DAG1    BackupInProgress                 0 (0x0)
D  DAG1    ClusSvcHangTimeout               60 (0x3c)
D  DAG1    ClusSvcRegroupOpeningTimeout     5 (0x5)
D  DAG1    ClusSvcRegroupPruningTimeout     5 (0x5)
D  DAG1    ClusSvcRegroupStageTimeout       7 (0x7)
D  DAG1    ClusSvcRegroupTickInMilliseconds 300 (0x
D  DAG1    ClusterGroupWaitDelay            30 (0x1e)
D  DAG1    ClusterLogLevel                  3 (0x3)
D  DAG1    ClusterLogSize                   100 (0x64
D  DAG1    CrossSubnetDelay                 4000 (0xf
D  DAG1    CrossSubnetThreshold             10 (0xa)
D  DAG1    DefaultNetworkRole               2 (0x2)
S  DAG1    Description
D  DAG1    EnableSharedVolumes              0 (0x0)
D  DAG1    HangRecoveryAction               3 (0x3)
D  DAG1    LogResourceControls              0 (0x0)
D  DAG1    PlumbAllCrossSubnetRoutes        0 (0x0)
D  DAG1    QuorumArbitrationTimeMax         20 (0x14)
D  DAG1    RequestReplyTimeout              60 (0x3c)
D  DAG1    RootMemoryReserved               429496729
D  DAG1    SameSubnetDelay                  2000 (0x7
D  DAG1    SameSubnetThreshold              10 (0xa)
B  DAG1    Security Descriptor              01 00 04
s)
D  DAG1    SecurityLevel                    1 (0x1)
M  DAG1    SharedVolumeCompatibleFilters
M  DAG1    SharedVolumeIncompatibleFilters
D  DAG1    ShutdownTimeoutInMinutes         20 (0x14)
D  DAG1    WitnessDatabaseWriteTimeout      300 (0x12
D  DAG1    WitnessRestartInterval           15 (0xf)
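For reference, this is a rough sketch of how I read the heartbeat settings in that output. It assumes a node is declared down once it misses SameSubnetThreshold (or CrossSubnetThreshold) consecutive heartbeats, with one heartbeat sent every SameSubnetDelay (or CrossSubnetDelay) milliseconds. The stun duration in the example is a made-up number, not something I have measured.

```python
def heartbeat_window_seconds(delay_ms: int, threshold: int) -> float:
    """Approximate time a node can stay silent before it is declared down."""
    return delay_ms * threshold / 1000.0


def stun_exceeds_window(stun_seconds: float, window_seconds: float) -> bool:
    """True if a VM pause of `stun_seconds` would outlast the heartbeat window."""
    return stun_seconds > window_seconds


# Values taken from the cluster properties listed above.
same_subnet = heartbeat_window_seconds(delay_ms=2000, threshold=10)
cross_subnet = heartbeat_window_seconds(delay_ms=4000, threshold=10)
print("same-subnet window (s):", same_subnet)
print("cross-subnet window (s):", cross_subnet)

# Hypothetical stun duration, for illustration only.
hypothetical_stun = 25.0
print("would exceed same-subnet window:",
      stun_exceeds_window(hypothetical_stun, same_subnet))
```

If that reading is right, a node would have to be unresponsive for roughly 20 seconds on the same subnet before the other nodes drop it.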

Any help would be greatly appreciated!!
