Update: we moved the shared virtual disks from VMFS to RDM with compatibility mode = physical, and the problem is solved; the RAC cluster now works properly.
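For anyone who wants to do the same, here is roughly what we did (device path and file names are just examples from our environment, adjust to yours): create a pass-through RDM mapping file with vmkfstools, then attach it to a dedicated shared SCSI controller in the .vmx of both nodes.

# create a physical compatibility (pass-through) RDM mapping file
vmkfstools -z /vmfs/devices/disks/vmhba1:0:1:0 /vmfs/volumes/datastore1/rac1/shared_rdm.vmdk

# relevant .vmx entries, identical on both nodes
scsi1.present = "true"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "physical"
scsi1:0.present = "true"
scsi1:0.fileName = "shared_rdm.vmdk"
scsi1:0.mode = "independent-persistent"

The shared disks sit on their own controller (scsi1), so the OS disk on scsi0 stays unshared.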
Funny thing: when we first set this up we did it with RDMs, and then a few months back, while preparing for go-live, I moved these test servers to VMDKs. We never had this reboot issue with the RDMs, and now we have it with the VMDK setup.
My question is, how has it been running with the RDMs? Any tips or pointers?
This is the first time I have found anyone doing what we are: a 2-node physical RHEL 5.3 cluster with GFS, running RAC 11g on top.
We mirror that in ESX using the manual fence agent (fence_manual); a rough sketch of our cluster.conf is below. Have you tried VM fencing?
Just curious how things are working for you.
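In case it is useful, the fencing section of our cluster.conf looks roughly like this (cluster and node names are placeholders):

<?xml version="1.0"?>
<cluster name="raccluster" config_version="1">
  <clusternodes>
    <clusternode name="racnode1" nodeid="1">
      <fence>
        <method name="1">
          <device name="human" nodename="racnode1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="racnode2" nodeid="2">
      <fence>
        <method name="1">
          <device name="human" nodename="racnode2"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_manual" name="human"/>
  </fencedevices>
</cluster>

With fence_manual, a fence event blocks until someone confirms the node is really down by running fence_ack_manual -n <nodename> on a surviving node.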
I have exactly the same problem on VMware ESX 3.5 U4 with Oracle RAC 10.2.0.1 running on SLES 10 SP3. I use a second LSI Logic adapter (1:0) with bus sharing set to "virtual" on both nodes. All nodes can access the shared raw partitions, but once CRS is running, both systems keep resetting. Here are the last lines of /var/log/messages from both cluster nodes:
Nov 25 11:39:42 rac1 kernel: sd 1:0:1:0: reservation conflict
Nov 25 11:39:42 rac1 kernel: sd 1:0:1:0: SCSI error: return code = 0x00000018
Nov 25 11:39:42 rac1 kernel: end_request: I/O error, dev sdc, sector 49
Nov 25 11:40:16 rac1 logger: Oracle CSSD failure. Rebooting for cluster integrity.
sdc is the Oracle RAC voting disk. Any help would be appreciated.
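For reference, the shared controller entries in my .vmx look roughly like this (paths shortened, disk names are from my setup):

scsi1.present = "true"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "virtual"
scsi1:0.present = "true"
scsi1:0.fileName = "/vmfs/volumes/shared/votedisk.vmdk"
scsi1:0.mode = "independent-persistent"

The reservation conflict on sd 1:0:1:0 looks like the other node is holding a SCSI reservation on the voting disk through that shared controller, and CSSD then reboots the node when it cannot access it.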