vSphere 4.1 RDM machines failover/back error with SRM 4.1
Here is the situation:
ESX 4.1 hosts a mix of VMs: some with only VMFS vDisks, and others with VMFS disks for the OS and application files plus vRDM disks for data.
Half of the RDM machines have their vmx and vmdk files on one datastore with a 1 MB block size (call it Datastore-A), and the other half are on another datastore with the same block size (Datastore-B). This split is due to a segregation requirement.
The setup is a recent upgrade from VI 3.5. One of the early problems I ran into was snapshotting the RDM VMs. Following a VMware fix, I created a new datastore with an 8 MB block size and pointed the working directory of those RDM VMs at it, and the snapshots then worked fine. But when I ran the recovery plan in the newly installed SRM 4.1, all of the vDisk-only VMs failed over and back successfully, as did all the RDM machines on Datastore-A, while the Datastore-B RDM VMs failed. Looking at the SRM array configuration, I noticed that the new datastore I created was associated only with Datastore-A, not with B, even though all three datastores were replicating successfully. I then reviewed the working-directory setting of each RDM machine and, to my surprise, only one VM had it set correctly; the rest had their working directory pointing at the placeholder datastore created at the DR site.
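To spot stale working directories like this at a glance, one option is to loop over every .vmx and print its workingDir entry. This is only a sketch; the /vmfs/volumes paths below are hypothetical and need to be adjusted to your datastore names:

```shell
#!/bin/sh
# Audit workingDir across RDM VMs: print each .vmx path together with its
# workingDir line (or a note if the override is absent), so entries that
# point at the DR placeholder datastore stand out.
# Datastore paths are hypothetical examples.
for vmx in /vmfs/volumes/Datastore-A/*/*.vmx /vmfs/volumes/Datastore-B/*/*.vmx; do
  [ -f "$vmx" ] || continue
  entry=$(grep -i '^workingDir' "$vmx" || echo 'workingDir not set')
  echo "$vmx: $entry"
done
```

Run from the ESX service console (or any host with the datastores mounted), this gives a quick per-VM report you can compare against the datastore each VM is supposed to use.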
I suspected the snapshot fix, so I removed the line from my RDM vmx files. The next failover produced an even more alarming error: it could not find any RDM devices to mount. I almost lost my production data trying to recover and fail back to the original state.
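For clarity, the line in question is the standard working-directory override in the .vmx, which redirects snapshot redo logs to another datastore. It looks roughly like this (the datastore and VM names below are hypothetical):

```
workingDir = "/vmfs/volumes/WorkDS-8MB/MyRDMVM"
```

When SRM re-registers a VM at the DR site, a hard-coded path like this can end up pointing at a datastore that only exists (or is only mapped) on one side, which matches the placeholder-datastore symptom described above.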
I now suspect that SRM failover/failback, somewhat like VM snapshots, estimates the total and maximum file sizes before proceeding with the operation. If that is the case, would it work to create a fourth datastore with an 8 MB block size for the failing VMs, or better yet, to Storage vMotion all my RDM machines to two new datastores with an 8 MB block size?