VMware Cloud Community
steelweb
Contributor

Solaris Cluster 3.3 on VMware ESX 4.1

Hi

Setting up Solaris Cluster on two ESX boxes with shared storage has been a very difficult process.

I had lots of failures; in the end I thought the pain was over, but...

The cluster software installed successfully, and the second node rebooted OK and then rebooted itself again. Now, every time the Solaris login screen comes up, the node responds to my pings for about five seconds and then goes down. If you try to type anything on the console, the node reboots itself.

There is no error log or any other clue.

If I boot them in non-cluster mode, they work just fine!

I have set up a shared disk between them on a separate SCSI adapter and set the SCSI bus sharing to PHYSICAL to make it shared across the boxes. See 1.png.
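For reference, the relevant entries in each VM's .vmx file look roughly like this (the controller type, SCSI IDs, and datastore path are illustrative, not my exact values):

```
# Sketch of the shared-disk .vmx configuration (paths and IDs are examples)
scsi1.present = "TRUE"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "physical"      # physical bus sharing: disk shared across hosts
scsi1:1.present = "TRUE"
scsi1:1.fileName = "/vmfs/volumes/shared-ds/quorum/quorum.vmdk"
scsi1:1.deviceType = "disk"
```

The key line is scsi1.sharedBus = "physical", which is what the "SCSI Bus Sharing: Physical" setting in the client writes out.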

Network communication seems to be OK. I couldn't make it work with three Ethernet adapters, so I'm using two instead of three.

I used the custom cluster installation instead of the typical one.

I wonder what I am missing.

Has anyone out there set up Solaris Cluster on VMware ESX 4.1 across boxes using a SAN?

I am aware that this configuration is not supported, but some geeks out there have made it happen!

Please let me know if you know anything about this reboot issue in cluster mode.

3 Replies
DSTAVERT
Immortal

I would search the forums for anything related to Solaris clusters. It isn't a supported configuration. If you were to use iSCSI from within the guest OS, you might have better luck.
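In-guest iSCSI on Solaris 10 would look roughly like this, using the native initiator on each node (the target address below is a placeholder, not a real one):

```
# Sketch: attach the Solaris 10 guests directly to an iSCSI target
iscsiadm modify discovery --sendtargets enable
iscsiadm add discovery-address 192.168.1.50:3260   # example target IP:port
devfsadm -i iscsi       # create device nodes for the discovered LUNs
format                  # the shared LUN should now be visible on both nodes
```

That takes ESX's VMFS locking out of the picture entirely, since the guests talk to the storage themselves.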

-- David -- VMware Communities Moderator
Gleed
VMware Employee

Hi,

I'm curious why you are running Sun Cluster inside your VMs. What are you not getting from VMware HA?

You need to use RDMs for the shared disk (quorum disk). VMFS disks get locked when accessed by a VM, which prevents concurrent access. It sounds like when the cluster boots, the first node accesses and locks the quorum disk, which locks the second node out and causes it to reboot.

Switch your quorum disks to RDMs.
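Once the LUN is visible to the ESX host, a physical-compatibility RDM mapping file can be created from the service console roughly like this (the device identifier and paths are placeholders):

```
# Sketch: create a physical-compatibility RDM pointer for the quorum LUN
vmkfstools -z /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxx \
    /vmfs/volumes/shared-ds/quorum/quorum-rdm.vmdk
# -z = physical compatibility (commands pass through to the LUN)
# -r would create a virtual-compatibility RDM instead
```

You then attach that mapping file to both VMs on a shared (physical bus sharing) SCSI controller.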

Again, I'm curious why you have chosen an unsupported Sun Cluster solution, with all its extra complexity and overhead, when you have VMware HA at your disposal.

Regards,

-Kyle

steelweb
Contributor

@Gleed

Hi, thanks for the reply.

The reason I cannot use VMware HA is simply that HA does not provide application-level clustering.

HA: restarts the VM on another ESX host, which means downtime.

FT: provides good availability, but only with 1 vCPU, which is not enough for me.

So I need to cluster two Sun Solaris 10 machines at the OS level.

Regarding the disk lock issue: I removed the lock by creating another quorum disk with SCSI ID 1:1 and changing the SCSI bus sharing from None to Physical, which makes the disk available to multiple VMs. Both machines are able to read and write to that disk.

About RDM: that's what I was trying to set up. Do I need Fibre Channel to do this?

I see that the RDM option is not enabled when creating a disk on the SAN. I have an iSCSI connection to the SAN and cannot enable RDM :-(

thanks!
