Hi,
I've been struggling with this issue for a little while now. It happened when we had Backup Exec, and it continues to happen with Avamar. I currently perform nightly VMDK backups using Avamar, and it appears that the stun phase of the snapshot of the Exchange VMs causes the cluster to lose quorum. The cluster recovers on its own, but each night my mail databases end up mounted on a different server.
We are using Exchange 2010 SP3 with Microsoft Failover Clustering (configured for Node Majority) on Windows Server 2008 R2 VMs. I have 3 database servers and 2 CAS servers; the CAS servers use Microsoft NLB.
ESXi 5.5 with vCenter across 6 hosts. Shared storage is a NetApp FAS3220 SAN with VMFS datastores on 2 shelves of SATA disks.
I've checked the registry on each of the Exchange servers, and the disk timeout is set to 60 seconds (has anyone successfully increased this beyond 60 with no adverse effects?). Below are the Failover Cluster properties for the DAG:
T  Cluster              Name                           Value
-- -------------------- ------------------------------ ---------
DR DAG1                 FixQuorum                      0 (0x0)
DR DAG1                 IgnorePersistentStateOnStartup 0 (0x0)
SR DAG1                 SharedVolumesRoot              C:\Cluste
D  DAG1                 AddEvictDelay                  60 (0x3c)
D  DAG1                 BackupInProgress               0 (0x0)
D  DAG1                 ClusSvcHangTimeout             60 (0x3c)
D  DAG1                 ClusSvcRegroupOpeningTimeout   5 (0x5)
D  DAG1                 ClusSvcRegroupPruningTimeout   5 (0x5)
D  DAG1                 ClusSvcRegroupStageTimeout     7 (0x7)
D  DAG1                 ClusSvcRegroupTickInMilliseconds 300 (0x
D  DAG1                 ClusterGroupWaitDelay          30 (0x1e)
D  DAG1                 ClusterLogLevel                3 (0x3)
D  DAG1                 ClusterLogSize                 100 (0x64
D  DAG1                 CrossSubnetDelay               4000 (0xf
D  DAG1                 CrossSubnetThreshold           10 (0xa)
D  DAG1                 DefaultNetworkRole             2 (0x2)
S  DAG1                 Description
D  DAG1                 EnableSharedVolumes            0 (0x0)
D  DAG1                 HangRecoveryAction             3 (0x3)
D  DAG1                 LogResourceControls            0 (0x0)
D  DAG1                 PlumbAllCrossSubnetRoutes      0 (0x0)
D  DAG1                 QuorumArbitrationTimeMax       20 (0x14)
D  DAG1                 RequestReplyTimeout            60 (0x3c)
D  DAG1                 RootMemoryReserved             429496729
D  DAG1                 SameSubnetDelay                2000 (0x7
D  DAG1                 SameSubnetThreshold            10 (0xa)
B  DAG1                 Security Descriptor            01 00 04
                                                       s)
D  DAG1                 SecurityLevel                  1 (0x1)
M  DAG1                 SharedVolumeCompatibleFilters
M  DAG1                 SharedVolumeIncompatibleFilters
D  DAG1                 ShutdownTimeoutInMinutes       20 (0x14)
D  DAG1                 WitnessDatabaseWriteTimeout    300 (0x12
D  DAG1                 WitnessRestartInterval         15 (0xf)
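
For reference, this is how I read the heartbeat settings in that listing. It is only a rough sketch in Python: it assumes a node is considered down once it misses Threshold consecutive heartbeats sent every Delay milliseconds, so the tolerance is roughly Delay x Threshold. The function names and the example stun durations are made up for illustration; I have not measured the actual stun times.

```python
# Heartbeat values copied from the cluster property listing above.
# Delays are in milliseconds; thresholds are counts of missed heartbeats.
HEARTBEAT_SETTINGS = {
    "SameSubnet": {"delay_ms": 2000, "threshold": 10},
    "CrossSubnet": {"delay_ms": 4000, "threshold": 10},
}


def heartbeat_tolerance_seconds(delay_ms: int, threshold: int) -> float:
    """Approximate time a node can stay silent before it is considered down.

    Assumption: tolerance = heartbeat interval x number of missed heartbeats.
    """
    return delay_ms * threshold / 1000.0


def stun_fits_within_tolerance(stun_seconds: float, delay_ms: int, threshold: int) -> bool:
    """Return True if a stun of the given length is shorter than the tolerance."""
    return stun_seconds < heartbeat_tolerance_seconds(delay_ms, threshold)


if __name__ == "__main__":
    # Hypothetical stun durations in seconds, for illustration only.
    example_stuns = [5.0, 25.0]
    for name, cfg in HEARTBEAT_SETTINGS.items():
        tolerance = heartbeat_tolerance_seconds(cfg["delay_ms"], cfg["threshold"])
        print(f"{name}: tolerance is about {tolerance:.0f} s")
        for stun in example_stuns:
            ok = stun_fits_within_tolerance(stun, cfg["delay_ms"], cfg["threshold"])
            print(f"  stun of {stun:.0f} s fits within tolerance: {ok}")
```

If that reading is right, the nodes on the same subnet should already tolerate roughly 20 seconds of silence, which makes me wonder whether the stun is really lasting longer than that or whether something else is triggering the failover.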
Any help would be greatly appreciated!!