Do you really want RAC on top of VMware though? Both are techniques for partitioning hardware and so I don't really see the benefit. Yes, RAC runs nicely on ESXi lab servers but I'm not sure I'd want to have the two technologies together in production.
If it were physical RAC, that would certainly be feasible, though there have been some ASM bugs affecting these cross-site systems. Don't forget you'll need a small server at a 3rd site for the voting disk to prevent split-brain scenarios.
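To illustrate why the 3rd site matters, here is a minimal sketch of majority voting. It is only a toy model, not Oracle Clusterware code: the site names and the `has_quorum` helper are made up for illustration, and the only assumption is that a node must reach more than half of the voting disks to stay up.

```python
# Toy model of voting-disk quorum; names are hypothetical, not Oracle APIs.

def has_quorum(reachable, all_disks):
    """Return True if a strict majority of the voting disks is reachable."""
    reachable_known = set(reachable) & set(all_disks)
    return len(reachable_known) > len(all_disks) // 2


# Two sites only: one voting disk per computer room.
two_site = {"site_a", "site_b"}
# Three sites: the extra disk lives on the small 3rd-site server.
three_site = {"site_a", "site_b", "site_c"}

# Room A is lost, so the surviving node can no longer see site_a's disk.
survivor_sees_two_site = {"site_b"}
survivor_sees_three_site = {"site_b", "site_c"}

print(has_quorum(survivor_sees_two_site, two_site))
print(has_quorum(survivor_sees_three_site, three_site))
```

With only two voting disks, losing either room leaves the survivor with exactly half the votes, which is not a majority; the 3rd-site disk is what breaks the tie.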
You also need to consider whether a RAC cluster is justified (i.e. do you need horizontal scalability and as little unplanned downtime as possible?). Otherwise you might be better off with VMware and Data Guard (or 3rd party standby management) or even using vMotion etc.
I clearly understand your points. This is the solution our Oracle DBAs are used to, so I just presented my case this way.
Note I know nothing about Oracle.
To my understanding RAC / CRS is quite similar to Windows Clustering, and I agree with you that it may not make sense when used under vSphere.
I am more concerned about having the database portion replicated in "realtime". That is why I described my scenario with an ASM Disk Group containing disks from 2 different sites. Those two sites (computer rooms) will be connected by FC and fast ethernet connectivity as if they were in the same room. Thinking about this concept, I was wondering whether the data is effectively written to all the disks of this group, and if my room #1 (SAN #1 and ESXi #1) dies, will the database be in a crash-consistent state when turning on the VM in room #2 with the disks from SAN #2?
You mentioned "some ASM bugs affecting these cross site systems". Do you know where I can find this information?
Again, my main concern is having this data replicated to the other site with no loss of information. With Data Guard, won't it take a substantial amount of time to turn the standby into an active database and effectively replace my production site?
Thanks again for any advice.
Sorry for not replying sooner - must have missed the email.
I don't have the exact number for the ASM issue, but there should be some notes for it in My Oracle Support - I've just searched MOS for ASM Extended Cluster and nothing obvious has jumped out, but I know from some user group sessions I attended that there were issues (maybe fixed now), so it would be worth spending an hour or two digging. It was related to split-brain issues if you lose the interconnect, I think.
Switching over to a standby managed by Data Guard can be pretty quick (say, a minute or so). Of course you will lose all your connected sessions (as compared to a RAC cluster, where you might lose, say, half of the sessions) but, if you don't need RAC for scalability reasons, Data Guard is a far less complex solution (and well understood by most DBAs). Switching the primary back to the other site these days is also much easier using flashback and then rolling forward. Plus, with Data Guard you have the benefit of a standby database you can check, take backups from, etc.
These are all just my opinions of course; others may disagree (or suggest a VMware way of doing things like vMotion etc) ;-)