3 Replies Latest reply on Nov 25, 2009 3:17 AM by phimic

    Oracle 11g RAC on ESX 3.5 "cluster in a box": reservation conflict, reboot

    Stefano Giuliano Novice

      We created two Red Hat Enterprise Linux 5 virtual machines on the same ESX 3.5 host and installed Oracle 11g RAC.

      The boot disk of each machine is on the ESX host's local storage, while the shared disks are on a VMFS datastore on the SAN (EMC CX3-10).

      The shared disks are on a dedicated SCSI controller with bus sharing set to "virtual".
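      For reference, a "virtual" shared-bus setup like this typically looks roughly as follows in each VM's .vmx file. This is a hedged sketch, not the poster's actual configuration; the controller number, datastore path, and disk name are illustrative:

```ini
# Illustrative .vmx excerpt for a cluster-in-a-box shared disk (ESX 3.x era).
scsi1.present    = "TRUE"
scsi1.virtualDev = "lsilogic"     ; dedicated controller for the shared disks
scsi1.sharedBus  = "virtual"      ; bus sharing between VMs on the same host
scsi1:0.present  = "TRUE"
scsi1:0.fileName = "/vmfs/volumes/san_datastore/rac_shared/vote.vmdk"
disk.locking     = "false"        ; commonly cited for shared cluster disks
```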

      Installation was OK and the RAC seems to work, but sometimes a machine reboots unexpectedly.

       

      From the /var/log/messages I see:

      Aug 25 06:38:33 ttsc-rac2 kernel: sd 1:0:1:0: reservation conflict

      Aug 25 06:38:33 ttsc-rac2 kernel: sd 1:0:1:0: SCSI error: return code = 0x00000018

      Aug 25 06:38:33 ttsc-rac2 kernel: end_request: I/O error, dev sdd, sector 81

      Aug 25 06:38:33 ttsc-rac2 logger: Oracle clsomon failed with fatal status 12.

      Aug 25 06:38:33 ttsc-rac2 logger: Oracle CSSD failure 134.

      Aug 25 06:38:33 ttsc-rac2 logger: Oracle CRS failure.  Rebooting for cluster integrity.

       

      /dev/sdd is the voting disk.
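      As a sanity check on the log above: the Linux SCSI midlayer packs the return code as driver, host, message, and status bytes, and 0x18 in the low byte is the SCSI status RESERVATION CONFLICT, i.e. another initiator (here, the other RAC node) holds a reservation on the LUN. A small helper of my own (not part of any Oracle or VMware tooling) to decode it:

```python
# Decode a Linux SCSI midlayer return code such as the 0x00000018
# reported in /var/log/messages above.
def decode_scsi_result(result: int) -> dict:
    return {
        "driver_byte": (result >> 24) & 0xFF,
        "host_byte":   (result >> 16) & 0xFF,
        "msg_byte":    (result >> 8) & 0xFF,
        "status_byte": result & 0xFF,  # 0x18 = RESERVATION CONFLICT
    }

print(decode_scsi_result(0x00000018))
```

      Here all upper bytes are zero, so the I/O failed purely on SCSI status: the device itself rejected the command because of a reservation, not because of a transport or driver error.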

       

      In the Oracle log (/u01/app/crs/11.1.0/crs/log/ttsc-rac1/alertttsc-rac1.log) I see a lot of lines saying that the "voting file is offline", but they do not correlate in time with the error above. After many such errors the voting disk comes back online.

       

      Any idea?

       

      Thanks,

      Stefano

        • 1. Re: Oracle 11g RAC on ESX 3.5 "cluster in a box": reservation conflict, reboot
          Stefano Giuliano Novice

          Update: we moved the shared virtual disks from VMFS to RDMs with compatibility mode set to physical, and the problem is solved; the RAC cluster now works properly.
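          For anyone following the same route: on ESX 3.x a physical compatibility RDM can be created with vmkfstools. The device path and datastore layout below are illustrative only, not the poster's actual values:

```sh
# -z creates a pass-through (physical compatibility) RDM pointer file;
# -r would create a virtual compatibility RDM instead.
vmkfstools -z /vmfs/devices/disks/vmhba1:0:2:0 \
    /vmfs/volumes/shared_datastore/rac/vote_rdm.vmdk
```

          With physical compatibility, SCSI commands (including reservations) pass through to the array instead of being handled by the VMkernel, which is presumably why the spurious reservation conflicts disappeared.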

           

          S>

          • 2. Re: Oracle 11g RAC on ESX 3.5 "cluster in a box": reservation conflict, reboot
            bhoros Novice

             

            Hey,

             

             

            Funny thing: when we first set this up we did it with RDMs, and then a few months back, when preparing for go-live, I moved these test servers to VMDKs. We never had this reboot issue with the RDMs, and now we have it with the VMDK setup.

             

             

            My question is: how has it been running with the RDMs? Any tips or pointers?

             

             

            This is the first time I have found anyone doing what we are doing:

             

             

            A 2-node physical RHEL 5.3 cluster with GFS, running RAC 11g on top.

             

             

            We mirror that setup in ESX using manual node fencing. Have you tried VM fencing?

             

             

            Just curious how things are working for you.

             

             

            • 3. Re: Oracle 11g RAC on ESX 3.5 "cluster in a box": reservation conflict, reboot
              phimic Lurker

              Hello Community,

               

              I have exactly the same problem on VMware ESX 3.5 U4 with Oracle RAC 10.2.0.1 running on SLES 10 SP3. I use a second LSI Logic adapter (1:0) with bus sharing set to "virtual" on both nodes. All nodes can access the shared raw partitions, but once CRS is running, both systems reset. Here are the last lines of

              /var/log/messages from both cluster nodes:

               

              Nov 25 11:39:42 rac1 kernel: sd 1:0:1:0: reservation conflict

              Nov 25 11:39:42 rac1 kernel: sd 1:0:1:0: SCSI error: return code = 0x00000018

              Nov 25 11:39:42 rac1 kernel: end_request: I/O error, dev sdc, sector 49

              Nov 25 11:40:16 rac1 logger: Oracle CSSD failure.  Rebooting for cluster integrity.

               

              sdc is the Oracle RAC voting disk. Any help would be appreciated.