Just wanted your points of view on the matter.
I have lately been looking into NetApp MetroCluster and found it a pretty interesting solution:
http://www.netapp.com/us/library/technical-reports/tr-3548.html
My idea was that you could create a stretched storage server that would survive even if one of the two buildings on the campus collapsed. You are led to think this way when you read the "pyramid" on page 4: MetroCluster can be used for "Datacenter / Site disasters".
However, it turns out that a complete site disaster (i.e. head + shelves) does not trigger an automatic switchover to the other head (and its mirrored shelves). You can infer this from this VMware KB (http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1001783) as well as from section 2.11 of the NetApp doc above, which lists the advantages of MetroCluster vs. standard synchronous replication:
1) Low aggregate level RAID mirroring (less performance impact)
2) Automatic switchover to remote copy upon failure
3) Site failover with a single command
4) Simpler to manage than multiple replication relationships
5) No extensive scripting required to make data available after failover
I am ok with them... however #3 looks a bit simplistic. While you can fail over your storage with a single command... restarting all your VMs on the surviving site can be as problematic as restarting them in a synchronous replication scenario (where at least you have SRM to optimize the whole thing). See the VMware KB above.
I was told that this "limitation" is due to the classic split-brain problem of clustering solutions (i.e. the surviving site cannot really know whether the other building has collapsed or the two sites have just lost communication with each other). I can understand this is not trivial.
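To make the split-brain point concrete, here is a toy sketch in Python. It has nothing to do with how NetApp actually implements things: the function names and the idea of a third-site "witness" are my own assumptions, used only to show why the surviving head cannot safely decide on its own.

```python
from typing import Optional


def hears_partner(partner_alive: bool, link_up: bool) -> bool:
    """What the local head observes: True only if heartbeats still arrive."""
    return partner_alive and link_up


def safe_to_take_over(local_hears_partner: bool,
                      witness_hears_partner: Optional[bool]) -> bool:
    """Decide whether an automatic takeover is safe.

    witness_hears_partner is None when there is no third site (hypothetical
    witness) to ask, which is the plain two-site case.
    """
    if local_hears_partner:
        return False  # partner is fine, nothing to do
    if witness_hears_partner is None:
        return False  # ambiguous: could be a dead site or a cut link
    return not witness_hears_partner  # take over only if the witness agrees


# A collapsed building and a cut inter-site link look identical locally.
building_collapsed = hears_partner(partner_alive=False, link_up=True)
link_cut = hears_partner(partner_alive=True, link_up=False)
print(building_collapsed == link_cut)

# With only two sites, the safe answer is to wait for an operator.
print(safe_to_take_over(building_collapsed, None))
```

With only two sites the function always refuses, which is basically the manual "single command" failover; only a third opinion would let it go automatic without risking both heads serving the same data.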
Don't get me wrong... I think it's a wonderful solution... but this "little thing" left a bad taste in my mouth...
Comments?
Massimo.