I've been working with the beta for some time. I have about 247 A4 pages of material I've written - I'm hoping to release a book on SRM in the next couple of months. I still have some outstanding questions which I think a VMware employee, or a member of the SRM team, is best placed to answer.
1. Templates
Currently templates are not included in SRM, which makes sense as they are not business critical. But I've been thinking there would be nothing stopping anyone converting a template into a VM, storing that on the recovery site, triggering failover, and then converting it back into a template at the recovery site. I think there may be parts of a DR recovery plan which require the creation of a new virtual machine, and templates are the best way to do that. For example, suppose I have a substantial VDI deployment - it would be costly to replicate/snapshot the desktops when I could simply redeploy them as part of a pool. To do that I need my templates.
2. Stretched VLANs
Does the SRM team have any thoughts or recommendations on stretched VLANs? Changing IPs is a royal PITA, even with a guest customization for each and every VM! I'm thinking the best route out of this altogether is a stretched VLAN configuration - it would also simplify inventory mappings, as the VM would not move from one network to another.
3. Placeholder/Shadow VMs
Does it really matter where you put your placeholders? Local storage seems tempting, but if an ESX host goes down its placeholders wouldn't be available. Would the SRM team recommend relatively small LUNs on shared storage that every ESX host can see? Are there any positions or recommendations on this?
4. VM Response Time
When you create a recovery plan you can control start-up times. This is a one-size-fits-all value, with no way to customize it on a per-VM basis. After all, services on some VMs will take longer to start than on others. Does the SRM team plan to change this? In the meantime, are there any workarounds?
5. Repair Arrays Button
What exactly is the button for - in what ways/scenarios might my array get "damaged" that needs repairing?
6. Powering off VMs before failback
I notice that some instructions from the storage vendors require you to power off VMs before starting the failback process. This is to ensure they are quiesced. Strictly speaking, is this necessary? We do test runs with VMs powered on. Is it a limitation that varies from storage vendor to storage vendor?
Well, that's about it for my questions. I've almost completed the 1st draft of my book on SRM - it needs to go through some levels of proof-reading both from a readability stand-point but also from a technical stand-point. I've got some key guys waiting in the wings who have volunteered to do that for me. Hopefully, I should have "Authors Limited Edition" ready for VMworld - with the book being available for order on lulu.com soon after. Not decided on a price yet - cross that bridge when I get to it!
PS: sorry if these questions are RTFM - I've not had time to review the official docs that got released with the GA. That's my next step...
I'll take a stab at these:
1) Templates are supported in SRM GA.
2) Agreed, per-VM IP configuration can be quite burdensome, so having a flat address space between protected site and recovery site would definitely be a win.
3) The disadvantage of putting placeholder VMs on local storage is that it restricts SRM to powering on the VM on that specific host. With shared storage, SRM will use DRS recommendations to power on a VM on the most appropriate (lightest loaded) host. That said, if you know exactly which host you want the VM powered on, then using local storage is fine.
4) Being able to override timeouts per VM is a good idea. A workaround is to provide a post-power-on callout script that waits for the essential services on the VM to start. This has the advantage of verifying that the specific services (rather than just the VM) are up.
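As a rough illustration of the workaround in 4), here is a minimal sketch of what such a wait script could look like. It is not an SRM API or an official sample: it assumes the essential service listens on a TCP port, that Python is available on the machine running the callout, and the names `wait_for_port` and `main`, the timeout values and the exit codes are all made up for this example.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=5.0):
    """Poll host:port until a TCP connection succeeds.

    Returns True as soon as the service accepts a connection,
    or False if `timeout` seconds pass without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            pass  # refused, unreachable or timed out: retry below
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        time.sleep(min(interval, remaining))


def main(argv):
    """Hypothetical entry point: argv is [host, port].

    Returns 0 if the service came up, 1 on timeout, 2 on bad arguments.
    """
    if len(argv) != 2:
        return 2
    host, port = argv[0], int(argv[1])
    return 0 if wait_for_port(host, port) else 1
```

To use it as a script you would end the file with `import sys; sys.exit(main(sys.argv[1:]))` and invoke it with the recovered VM's address and the service port (for example 1433 for a SQL Server listener). How a non-zero exit code is treated by the recovery plan is something to confirm against the SRM documentation.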
5) The repair arrays button allows you to re-configure the recovery-side arrays from the recovery side if the primary site is down, as with an actual disaster. In other words, the normal flow is to configure both arrays from the protected side, but if the protected side is a smoking crater you may still want to fiddle with your recovery side array config.
6) Powering-off of VMs during the failback workflow is about maximizing the chances of getting a consistent copy back at Site A. If you are not reversing the direction of replication or you don't care about data written at Site B or you are doing synchronous replication back to Site A then you may not need to power off the VMs at Site B.
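The conditions in 6) can be written down as a small decision helper. This is only a sketch of the reasoning above, not vendor guidance; the function and parameter names are invented for illustration, and your storage vendor's instructions still take precedence.

```python
def should_power_off_before_failback(reversing_replication,
                                     site_b_writes_matter,
                                     synchronous_to_site_a):
    """Return True when powering off the Site B VMs is advisable.

    Per the reply above, powering off may be skipped if ANY of these hold:
      - replication is not being reversed,
      - data written at Site B does not matter,
      - replication back to Site A is synchronous.
    So it is advisable only when none of them hold.
    """
    return (reversing_replication
            and site_b_writes_matter
            and not synchronous_to_site_a)
```

For example, an asynchronous reverse replication where Site B data must be kept gives `True`, while the same setup with synchronous replication gives `False`.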
Just want to add a bit more to 3). If you use local storage as the shadow VM datastore, keep the following point in mind: whenever the ESX host with the local shadow VM datastore is down for maintenance, test recovery will fail. Using shared storage to host the shadow VM config files eliminates this problem altogether. My recommendation is the following:
1) have a dedicated LUN for shadow VMs (it can be very small, a few gigs should be more than enough), and give it a meaningful name (easy for the SRM admin to identify during protection group creation)
2) present this LUN to all ESX hosts in the recovery site
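A quick way to sanity-check step 2 is to compare the datastore list seen by each recovery-site host. The sketch below assumes you have already collected those lists by whatever means you prefer; the host names, datastore names and the function `datastores_visible_to_all` are hypothetical.

```python
def datastores_visible_to_all(host_datastores):
    """Given {host name: iterable of datastore names}, return the set of
    datastores that every host can see (empty set if there are no hosts)."""
    views = [set(names) for names in host_datastores.values()]
    if not views:
        return set()
    return set.intersection(*views)


# Hypothetical inventory of a three-host recovery cluster.
inventory = {
    "esx1": ["local-esx1", "srm-placeholders", "prod-lun1"],
    "esx2": ["local-esx2", "srm-placeholders", "prod-lun1"],
    "esx3": ["local-esx3", "srm-placeholders"],
}
shared = datastores_visible_to_all(inventory)
print("srm-placeholders" in shared)  # True: every host sees the placeholder LUN
```

Any local datastore drops out of the result automatically, which is the same reason it makes a poor home for shadow VMs.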
Hope this helps!
Thanks for the responses - I did actually get a response to this from Lee Dilworth in EMEA... but it was a private email - so I couldn't follow through on this thread...
But more or less Lee said much the same thing...
Thanks for the responses