I have a question.
We have two datacenters, connected to each other by two cables.
Last week we had a failure: both cables went down at once.
As a result, in datacenter 1 we saw 2 hosts OK and 2 hosts disconnected, and the same on the other side.
We also ended up with a lot of disconnected VMs on both sides.
We have synchronized storage between the two datacenters, so the VMs on each side could keep running.
The problem was that I had to shut down all machines on side 2, remove the disconnected hosts on side 1,
re-register the VMs, and so on.
How can I prevent this? Is there any way to say:
datacenter 1 and its hosts are the "master"; if datacenter 2 fails (hosts, network, etc.), shut down all machines in datacenter 2
and start them in datacenter 1?
I think you probably need to explain your environment a little more.
From what you have written, it seems to me that you have some sort of metro cluster between DC A and DC B, meaning a single vCenter cluster that contains hosts from both sites?
Before trying to offer advice, I would like to know a little more about the infrastructure and what is being attempted.
What we have is an ESXi cluster with (for example) 4 hosts:
2 in DC A and 2 in DC B.
Storage (datastores) is clustered via DataCore (synchronous).
What happened: DC B was down (cables), and all VMs and hosts there were disconnected.
We had to manually shut them down in DC B, deregister them, then register and start them in DC A.
This is what we want to prevent: having to do these things manually.
But I don't know how to achieve that.
Okay, sorry, a few more questions to help get the full picture, so I don't make any assumptions.
No, we do not have this in place. I don't know if DataCore has such a feature.
Host Isolation is not active at the moment,
because I don't know what would happen (regarding my first question).
So the first thing I would confirm is whether DataCore supports the concept of a witness, i.e. how the storage determines which side is unavailable. Without this you could end up in a split-brain scenario, with both sides hosting the same VMs.
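To make the witness idea concrete, here is a minimal sketch of how a third-site tie-breaker can decide which side keeps serving storage when the inter-site link is lost. This is a generic illustration only, not DataCore's actual logic; the names `Site`, `surviving_sites`, and the `preferred` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Site:
    name: str
    sees_witness: bool  # can this site still reach the third-location witness?


def surviving_sites(a: Site, b: Site, link_up: bool, preferred: str) -> Set[str]:
    """Return the names of the sites allowed to keep serving I/O.

    Hypothetical tie-breaker rules (an assumption, not a vendor algorithm):
    - link up: both sites serve, the mirror is intact
    - link down: only a site that still reaches the witness may serve
    - link down and both reach the witness: the preferred site wins
    - link down and nobody reaches the witness: nobody serves
    """
    if preferred not in (a.name, b.name):
        raise ValueError("preferred must name one of the two sites")
    if link_up:
        return {a.name, b.name}
    candidates = [s.name for s in (a, b) if s.sees_witness]
    if len(candidates) == 2:
        return {preferred}
    return set(candidates)


# Both inter-site cables down, both sites still reach the witness.
print(surviving_sites(Site("DC A", True), Site("DC B", True),
                      link_up=False, preferred="DC A"))
```

The point of the sketch is that at most one side is ever allowed to continue once the link is gone, which is what rules out split brain. Without any witness, the last rule means both sides would have to stop, or a human has to decide.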
Once you have determined a way for the storage to stay consistent between both sites and to work out which site is down, you can start to look at implementing Host Isolation. Here are a couple of good blog articles that explain the concept:
So using something that both sides of your ESXi cluster can see will allow them to work out which side is surviving. From there, the other side will power down its VMs, which are then restarted on hosts that are not isolated. One problem I can see, however, is that replication will also be severed between the two storage arrays, so what is the arrays' response to this situation?
This also needs to be determined so you know where your data is and what point in time the VMs are at.
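As a rough illustration of how a host can work out its own situation, the sketch below models the decision in simplified form: a host that has lost the HA master checks whether it can still see other HA agents or reach an isolation address before it declares itself isolated. This is a simplified model, not the real vSphere HA agent; the function and parameter names are hypothetical.

```python
from enum import Enum


class HostState(Enum):
    CONNECTED = "connected"      # still talking to the HA master
    PARTITIONED = "partitioned"  # cut off from the master, but not alone
    ISOLATED = "isolated"        # cannot reach anything on the network


def classify_host(sees_master: bool,
                  sees_other_agents: bool,
                  pings_isolation_address: bool) -> HostState:
    """Simplified classification of a host after a network failure."""
    if sees_master:
        return HostState.CONNECTED
    if sees_other_agents or pings_isolation_address:
        # The host is cut off from the master but the network is not
        # completely gone, so it is treated as partitioned, not isolated.
        return HostState.PARTITIONED
    return HostState.ISOLATED


def should_apply_isolation_response(state: HostState) -> bool:
    """Only an isolated host triggers its configured isolation response."""
    return state is HostState.ISOLATED


# A host in DC B that lost the master but still sees its neighbour in DC B.
state = classify_host(sees_master=False, sees_other_agents=True,
                      pings_isolation_address=False)
print(state, should_apply_isolation_response(state))
```

Note what this implies for a two-hosts-per-site layout: if the two hosts on one side can still reach each other, they would look partitioned rather than isolated in this model, so the isolation response alone may not fire. That is one more reason to check how the storage side reacts.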
You should start by reading this KB:
It explains the DataCore stretched cluster technology. It also points to the correct documentation for installing and configuring the vSphere environment, and to a separate document that explains how to configure HA for stretched clusters.
That document describes VMCP and recommends configuring it for both APD and PDL (I'm guessing the scenario you describe leads to an APD). More details on how this works can be found here:
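For anyone following along, the difference between the two conditions can be sketched roughly as below: with PDL the array reports that the device is permanently gone, so a reaction can be immediate, whereas with APD the host only knows that all paths are down and waits before acting. The function name is hypothetical, and the 140 s APD timeout and 180 s VMCP delay are the commonly cited defaults, used here as assumptions that you should verify for your vSphere version.

```python
def classify_storage_loss(permanent_loss_reported: bool,
                          seconds_all_paths_down: float,
                          apd_timeout_s: float = 140.0,  # assumed default
                          vmcp_delay_s: float = 180.0    # assumed default
                          ) -> str:
    """Rough model of how a datastore outage is classified over time."""
    if permanent_loss_reported:
        # The array explicitly reported permanent loss: no reason to wait.
        return "PDL"
    if seconds_all_paths_down <= 0:
        return "OK"
    if seconds_all_paths_down < apd_timeout_s:
        # Paths are gone, but they might come back; keep retrying.
        return "APD"
    if seconds_all_paths_down < apd_timeout_s + vmcp_delay_s:
        # APD timeout reached; VMCP waits a further delay before acting.
        return "APD_TIMEOUT"
    return "APD_VMCP_REACTS"


# Both inter-site cables cut, no information from the array, 5 minutes in.
print(classify_storage_loss(False, 300.0))
```

With these assumed values, VMCP would only start reacting to an APD a little over five minutes after the paths disappear, so a short cable outage would not immediately restart VMs.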