VMware Cloud Community
kmzimm
Enthusiast

vCD Connecting to RAC?

Hey everyone,

I've got a vCD installation running in the lab that I'm trying to connect to Oracle RAC to provide HA failover capabilities.

I've followed the directions in http://kb.vmware.com/kb/1025995 and ended up with this client-side configuration (names changed to protect the stupid):

------

database.jdbcUrl = jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(LOAD_BALANCE=ON)(ADDRESS=(PROTOCOL=TCP)(HOST=node1-vip)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=node2-vip)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=VCLOUD)))
database.dataSourceName = oracle
database.oracle.isFastConnectionFailoverEnabled = true
database.oracle.onsConfig = nodes=node1:6250,node2:6250

------

ONS is confirmed to be running on node1 and node2, and I can connect to both nodes on the ONS ports from the vCD cell.
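
For anyone repeating the reachability check from the vCD cell, here is a minimal sketch of how it could be scripted. The function names `parse_ons_nodes` and `check_ons_nodes` are made up for illustration, and the check assumes `bash` and the coreutils `timeout` command are available on the cell. It only shows that the TCP port answers, not that ONS itself is healthy.

```shell
# Split an onsConfig value such as "nodes=node1:6250,node2:6250"
# into one "host port" pair per line. Runs in a subshell so the
# IFS and globbing changes do not leak into the caller.
parse_ons_nodes() (
    value="${1#nodes=}"   # drop the leading "nodes=" key
    IFS=','
    set -f                # no filename expansion while splitting
    for entry in $value; do
        printf '%s %s\n' "${entry%%:*}" "${entry##*:}"
    done
)

# Attempt a TCP connect to each ONS endpoint with a 3 second timeout.
# Relies on bash's /dev/tcp pseudo-device (assumption: bash is installed).
check_ons_nodes() {
    parse_ons_nodes "$1" | while read -r host port; do
        if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            echo "$host:$port reachable"
        else
            echo "$host:$port unreachable"
        fi
    done
}
```

It could be called with the same value used in the properties file, e.g. `check_ons_nodes "nodes=node1:6250,node2:6250"`.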

From what I'm seeing, though, the vCD cell only connects to a single node's ONS instance, which is whichever node I put first in the onsConfig nodes= list. If I switch them, it connects to the other instance.

The result seems to be that I can shut down the DB application on either node just fine, and the connections gracefully fail over to the surviving node. I can also do a full reboot of the node that does NOT hold the ONS connection without issue.

The problem comes into play when I reboot the node with the ONS connection. Since ONS goes away, the system doesn't know to fail over, and vCD goes boom.

Stopping ONS on the node vCD connected to makes vCD do a single retry to restore the connection (I see the SYN_SENT); then, from what I can tell, it never reestablishes a connection to ONS on either node.
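
A rough way to watch this from the cell is to filter the socket table for the ONS port. This is only a sketch: `ons_conn_states` is a hypothetical helper name, and it assumes Linux `netstat -tan` output, where column 5 is the remote address and column 6 is the TCP state.

```shell
# Print "remote_address state" for every TCP socket whose remote port
# matches the given ONS port. Reads netstat -tan style output on stdin.
ons_conn_states() {
    awk -v port="$1" '$5 ~ (":" port "$") { print $5, $6 }'
}

# Example: netstat -tan | ons_conn_states 6250
```

Going by the behavior described above, I'd expect a single ESTABLISHED row while things are healthy, a brief SYN_SENT after ONS is stopped, and then no rows at all.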

Does anyone know how to correct this behavior and get vCD connected to both nodes?

7 Replies
admin
Immortal

That's a very tough question. I've read up on RAC but don't have an environment to reproduce the problem. From all the JDBC and RAC guides I just reviewed, it seems like you've got all the right settings, but none of the Oracle publications describe how the connection pool being used handles its ONS connections (i.e. whether it connects to all listed nodes or just one, whether it fails over when necessary, etc.).

There is an additional property that we left out of the KB that really doesn't seem relevant but is mentioned repeatedly in the Oracle docs, so you might try editing /etc/profile.d/vcloud.sh and adding this line:

export VCLOUD_JAVA_OPTS="-Doracle.ons.oraclehome=$ORACLE_HOME"

Where $ORACLE_HOME is the environment variable if you've got it set, or the path to a local Oracle/ONS installation. Again, not sure why it would be needed when it's using the remote ONS configuration method, but all of the Oracle docs still mention it [1].

One last silly question... ONS is definitely running on node2, right? `onsctl ping` returns a positive result?

[1] See the "Remote Configuration" section of this page: http://download.oracle.com/docs/cd/E18283_01/java.112/e12265/rac.htm#CHDHCGGG

admin
Immortal

Btw, I'm not entirely surprised that it doesn't connect to all ONS nodes; the docs make it clear that all ONS nodes share config data as well as all events, so if you've got a connection to one node in the network, you're getting everything there is to see.

kmzimm
Enthusiast

Kyle,

Thanks for the response!

Both nodes are definitely up and running. If I switch the node order in the nodes= list in the etc/global.properties, it connects to the other node instead.

One thing I can add after seeing your comments is that no oracle specific installation was done on either of my vCD nodes, since it wasn't in the documentation or the KB that I saw. Is this necessary for this to work? I don't see an ons.jar anywhere in the cloud-director hierarchy.

I've discussed the possibility that it only needs to connect to one node for ONS to function with some people on my side, and that might be the case. The problem then would be why the application side (vCD) doesn't switch to the other node for ONS if its ONS connection dies. Once it stops talking to ONS, things go south if one of the nodes goes offline (many Java errors in the vCD logs about not being able to roll back, heartbeat failures, transaction failures), and the application ceases to function until it gets restarted.

Thanks again!

admin
Immortal

As I mentioned, I really don't think that parameter (and thus, an Oracle installation of some kind) is needed on the cell but all of the Oracle documentation states that it's needed specifically for the remote configuration case. That seems backwards to me. I would expect it to be necessary for the local ONS daemon case, but it seemed worth a shot.

ons.jar is named ons.wrapper-11.2.0.1.0.jar in our case.

I agree with you on the problem. If you've seen the Oracle JDBC User's Guide or the RAC Config & Deployment Guide, we're using the method documented there in vCD: Oracle's Universal Connection Pool (UCP) with the remote ONS configuration option. It's unclear to me why it's not working as we both expect.

If you're able to open a service request I would recommend doing so and I'll continue to see if I can find an answer as well.

kmzimm
Enthusiast

One item to add to this. We’re noticing this error coming out of the environment during startup. This gets logged into the cloud-director/etc directory:

----

!SESSION 2011-05-02 17:19:26.774 -----------------------------------------------
eclipse.buildId=unknown
java.version=1.6.0_19
java.vendor=Sun Microsystems Inc.
BootLoader constants: OS=linux, ARCH=x86_64, WS=gtk, NL=en_US
Command-line arguments:  -configuration /opt/vmware/cloud-director/etc

!ENTRY org.eclipse.osgi 4 0 2011-05-02 17:19:27.166
!MESSAGE Bundle org/apache/servicemix/bundles/org.apache.servicemix.bundles.fastinfoset/1.2.2_1/org.apache.servicemix.bundles.fastinfoset-1.2.2_1.jar@start not found.

!ENTRY org.eclipse.osgi 4 0 2011-05-02 17:19:27.191
!MESSAGE Bundle com/vmware/vcloud/external/commons-dbcp.wrapper/1.2.2/commons-dbcp.wrapper-1.2.2.jar@start not found.

----

That last missing .jar is a wrapper for commons-dbcp, the Apache Commons DBCP connection pool (often used together with the Spring framework). I do not see it present in either released version of vCD.
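
To double-check whether a given bundle jar is really absent, something like the following could be run on the cell. It is a sketch only: `find_jars` is a hypothetical helper, and the install root is assumed to be /opt/vmware/cloud-director, going by the -configuration path in the log above.

```shell
# List regular files under a directory whose name matches a glob pattern.
# Errors (e.g. unreadable subdirectories) are silenced.
find_jars() {
    find "$1" -type f -name "$2" 2>/dev/null
}

# Examples (install root assumed from the startup log):
# find_jars /opt/vmware/cloud-director 'commons-dbcp*.jar'
# find_jars /opt/vmware/cloud-director 'ons*.jar'
```

No output for a pattern means no matching jar exists under that directory.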

kmzimm
Enthusiast

Confirmed as a bug, in that it simply doesn't work. To be fixed in a future version.

admin
Immortal

Do you have a filed SR for this that confirmed it as a bug?
