Hi,
We created a test DRS/HA cluster in vCenter 6.0 with two hosts and one shared datastore. When a host enters maintenance mode, all VMs are migrated to the other node. But when we force power off a host, the VMs do not start on the second node; they remain in the "Powered On" state with status "Unknown".
Our DRS/HA settings are:
DRS -> On, Fully Automated
HA -> ON
Host Monitoring -> On
Protect against Storage Connectivity Loss -> Off
Virtual Machine Monitoring -> Disabled
Host Isolation -> Power off and restart VMs
VM restart priority -> High
VM monitoring connectivity -> High
Admission Control -> Do not reserve failover capacity
Datastore for Heartbeating -> Use datastores from the specified list and complement automatically (so we have there one shared Datastore)
Why is the VM not starting on the other node? What are we doing wrong?
Which procedure are you using to power off the node?
We are using "Force Power Off" from the server's iLO.
I see some entries in fdm.log on the second node:
error fdm[FFB0BB70] [Originator@6876 sub=Election opID=SWI-60b7acd9] [ClusterElection::SendAll] [60 times] sendto 10.1.1.11 failed: Host is down
verbose fdm[FFCD1B70] [Originator@6876 sub=Policy] [LocalIsolationPolicy::ProcessDatastore] Issuing lock check for datastore /vmfs/volumes/594a2455-34d25690-c5ec-a0d3c102aa60
verbose fdm[FFA89790] [Originator@6876 sub=Cluster opID=SWI-a048760] [ClusterDatastore::DoCheckIfLocked] Checking if datastore /vmfs/volumes/594a2455-34d25690-c5ec-a0d3c102aa60 is locked
verbose fdm[FFA89790] [Originator@6876 sub=Cluster opID=SWI-a048760] [ClusterDatastore::DoCheckIfLocked] Checking if datastore /vmfs/volumes/594a2455-34d25690-c5ec-a0d3c102aa60 lock state is 3
verbose fdm[FFA89790] [Originator@6876 sub=Policy opID=SWI-a048760] [LocalIsolationPolicy::ProcessDatastoreLockState] check of /vmfs/volumes/594a2455-34d25690-c5ec-a0d3c102aa60 returned 3 (scheduled=true)
verbose fdm[FFCD1B70] [Originator@6876 sub=Policy] [LocalIsolationPolicy::ProcessDatastore] Issuing lock check for datastore /vmfs/volumes/594a2455-34d25690-c5ec-a0d3c102aa60
verbose fdm[FFA89790] [Originator@6876 sub=Cluster opID=SWI-2080867b] [ClusterDatastore::DoCheckIfLocked] Checking if datastore /vmfs/volumes/594a2455-34d25690-c5ec-a0d3c102aa60 is locked
verbose fdm[FFA89790] [Originator@6876 sub=Cluster opID=SWI-2080867b] [ClusterDatastore::DoCheckIfLocked] Checking if datastore /vmfs/volumes/594a2455-34d25690-c5ec-a0d3c102aa60 lock state is 3
verbose fdm[FFA89790] [Originator@6876 sub=Policy opID=SWI-2080867b] [LocalIsolationPolicy::ProcessDatastoreLockState] check of /vmfs/volumes/594a2455-34d25690-c5ec-a0d3c102aa60 returned 3 (scheduled=true)
The datastore 594a2455-34d25690-c5ec-a0d3c102aa60 is the shared datastore.
Does this mean there is some lock on the VM files and the second node cannot get the correct status?
Lock state 3? Strange.
Mode 3 is used by MSCS or FT.
The mode should be 1 (VM powerOn) or 0 (no lock).
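For anyone digging through a similar fdm.log later, here is a small Python sketch (purely illustrative, assuming the log format shown above) that pulls out the lock-check result lines and maps the state codes to the meanings discussed in this thread:

```python
import re

# State meanings per the discussion above: 0 = no lock, 1 = VM power-on lock,
# 3 = multi-writer lock (MSCS/FT). Anything else is reported as unknown.
LOCK_STATES = {0: "no lock", 1: "VM powerOn lock", 3: "MSCS/FT (multi-writer)"}

# Matches lines like:
# ... [LocalIsolationPolicy::ProcessDatastoreLockState] check of
#     /vmfs/volumes/<uuid> returned 3 (scheduled=true)
PATTERN = re.compile(r"check of (/vmfs/volumes/\S+) returned (\d+)")

def lock_checks(log_lines):
    """Yield (datastore, state, meaning) for each lock-check result line."""
    for line in log_lines:
        m = PATTERN.search(line)
        if m:
            state = int(m.group(2))
            yield m.group(1), state, LOCK_STATES.get(state, "unknown")

sample = [
    "verbose fdm[FFA89790] [LocalIsolationPolicy::ProcessDatastoreLockState] "
    "check of /vmfs/volumes/594a2455-34d25690-c5ec-a0d3c102aa60 "
    "returned 3 (scheduled=true)",
]
for ds, state, meaning in lock_checks(sample):
    print(ds, state, meaning)
```

Running it against the lines above flags the shared datastore with state 3, which is what pointed us at the locking question in the first place.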
I found the problem. Ping was blocked on the default gateway, which HA uses as its isolation address. Since all the hosts are in the same subnet, I didn't consider that warning important.
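To summarize the fix: the HA isolation address must answer ping, and if the default gateway can't, you can point HA at another address through the cluster advanced options. The option names below (das.isolationaddress0..9, das.usedefaultisolationaddress) are the real vSphere HA advanced options; the function itself is just an illustrative sketch of how the effective address list is built:

```python
def isolation_addresses(advanced_options, default_gateway):
    """Return the addresses FDM pings to decide whether a host is isolated.

    das.isolationaddress0..9 add extra pingable addresses; setting
    das.usedefaultisolationaddress to "false" drops the default gateway
    from the list (useful when the gateway blocks ICMP, as in our case).
    """
    addrs = [advanced_options[k] for k in sorted(advanced_options)
             if k.startswith("das.isolationaddress")]
    if advanced_options.get("das.usedefaultisolationaddress", "true").lower() != "false":
        addrs.insert(0, default_gateway)
    return addrs

# Example: gateway blocks ICMP, so substitute a pingable address
# (10.1.1.200 here is a made-up address for illustration).
opts = {"das.isolationaddress0": "10.1.1.200",
        "das.usedefaultisolationaddress": "false"}
print(isolation_addresses(opts, "10.1.1.1"))
```

With no advanced options set, the list is just the default gateway, which is exactly why the blocked ping broke HA in our cluster.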
Anyway, talking an issue through helps to solve it! Thanks!