I'm a relative newcomer to ESX and have run into a problem with a cluster-in-a-box setup I've inherited. Apologies for any etiquette breaches; this is my first post.
We have two clusters, both fairly simple two-node, active/passive MSCS clusters on the same host with all storage local. One cluster is configured correctly, with Node B pointing to the quorum and other shared disks in Node A's /volumes/ folder, as per the doco. This cluster is operating normally.
On the other cluster I noticed that the content of the shared disks changes when I fail it over. Investigating the setup, I found that Node B is not pointing to the shared virtual disks in Node A's /volumes/; rather, it is configured to point to these disks in its own /volumes/. The MSCS service is still able to fail the groups over, but it doesn't seem to be aware that the shared disks aren't actually shared. If we ignore the obvious questions ("how did it get like this?" and "how does it actually fail over?"), the immediate issue is how to rectify the cluster so that it's configured to operate correctly.
I implemented a plan to back up both nodes, shut them both down, and reconfigure Node B in VC so that all of the shared disks point to the disks in Node A's /volumes/. After that, Node A started OK, but when I tried to power on Node B, VC gave me the error "the file is locked" and the node failed to start.
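One way to confirm which disks each node actually points to, before and after a change like this, is to compare the disk entries in the two nodes' .vmx files. The following is only a minimal Python sketch under stated assumptions: the sample .vmx text, the datastore paths, the choice of controller scsi1 for the shared disks, and the helper names `parse_vmx`, `disk_paths` and `mismatched_disks` are all hypothetical and not taken from the actual cluster.

```python
import re

def parse_vmx(text):
    """Parse 'key = "value"' lines of .vmx text into a dict with lowercased keys."""
    settings = {}
    for line in text.splitlines():
        match = re.match(r'\s*([^#=\s]+)\s*=\s*"(.*)"\s*$', line)
        if match:
            settings[match.group(1).lower()] = match.group(2)
    return settings

def disk_paths(settings, controller="scsi1"):
    """Return {device: fileName} for every disk attached to the given controller."""
    prefix = controller + ":"
    suffix = ".filename"
    return {
        key[: -len(suffix)]: value
        for key, value in settings.items()
        if key.startswith(prefix) and key.endswith(suffix)
    }

def mismatched_disks(node_a, node_b, controller="scsi1"):
    """List devices whose backing file differs between the two nodes."""
    a = disk_paths(node_a, controller)
    b = disk_paths(node_b, controller)
    return sorted(dev for dev in set(a) | set(b) if a.get(dev) != b.get(dev))

# Hypothetical sample data: Node B still points one disk at its own folder.
NODE_A_VMX = '''
scsi1:0.fileName = "/vmfs/volumes/local1/NodeA/quorum.vmdk"
scsi1:1.fileName = "/vmfs/volumes/local1/NodeA/data.vmdk"
'''
NODE_B_VMX = '''
scsi1:0.fileName = "/vmfs/volumes/local1/NodeA/quorum.vmdk"
scsi1:1.fileName = "/vmfs/volumes/local1/NodeB/data.vmdk"
'''

mismatches = mismatched_disks(parse_vmx(NODE_A_VMX), parse_vmx(NODE_B_VMX))
print(mismatches)
```

An empty list would mean both nodes reference the same backing files for every disk on that controller. It says nothing about file locking, which is a separate matter.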
I've considered introducing a new Node C into the cluster as a workaround but I'd prefer to resolve the actual problem if I can. Can anyone provide some feedback on an approach to get Node B and my cluster as a whole configured correctly? Thanks for your assistance.
CABF.
cheers
Charlie
Are you running ESX 2 or ESX 3? If you are running ESX 2, that document will not apply, as there have been some changes to the setup for MSCS. See http://www.vmware.com/pdf/esx25_admin.pdf, section 10 on clustering.
The host is 3.02. Apologies for posting to the wrong forum.
Thanks for the feedback, I'll check the admin guide for v3 for similar considerations.
If you are getting a file locked error, then most likely you are not using a SCSI controller set to virtual bus sharing. On each node you will need a second SCSI controller (scsi1) to which you attach your shared disks. To get this second controller, set your shared disks to use a scsi1:x address in the ID section of the disk. Once you do that, a second SCSI controller will be created. Set that controller to virtual bus sharing. Do the same on the second node, and you should be able to boot up and have access to the disks from both nodes.
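The settings described above end up as entries in each node's .vmx file, so they can be checked from the file text. What follows is a minimal Python sketch, not a VMware tool. It assumes the bus sharing mode is stored as `scsi1.sharedBus = "virtual"`; the sample file contents, the datastore path, and the helper names `parse_vmx` and `bus_sharing_problems` are hypothetical.

```python
import re

def parse_vmx(text):
    """Parse 'key = "value"' lines of .vmx text into a dict with lowercased keys."""
    settings = {}
    for line in text.splitlines():
        match = re.match(r'\s*([^#=\s]+)\s*=\s*"(.*)"\s*$', line)
        if match:
            settings[match.group(1).lower()] = match.group(2)
    return settings

def bus_sharing_problems(settings, controller="scsi1"):
    """Return a list of reasons the controller is not ready for shared disks."""
    problems = []
    if settings.get(controller + ".present", "").lower() != "true":
        problems.append(controller + " is not present")
    # Assumption: virtual bus sharing appears as sharedBus = "virtual".
    if settings.get(controller + ".sharedbus", "").lower() != "virtual":
        problems.append(controller + ".sharedBus is not set to virtual")
    has_disk = any(
        key.startswith(controller + ":") and key.endswith(".filename")
        for key in settings
    )
    if not has_disk:
        problems.append("no disks attached to " + controller)
    return problems

# Hypothetical sample: second controller present with bus sharing enabled.
GOOD_VMX = '''
scsi1.present = "true"
scsi1.sharedBus = "virtual"
scsi1.virtualDev = "lsilogic"
scsi1:0.present = "true"
scsi1:0.fileName = "/vmfs/volumes/local1/NodeA/quorum.vmdk"
'''

# Hypothetical sample: same disk layout, but bus sharing was never set.
BAD_VMX = '''
scsi1.present = "true"
scsi1.virtualDev = "lsilogic"
scsi1:0.present = "true"
scsi1:0.fileName = "/vmfs/volumes/local1/NodeA/quorum.vmdk"
'''

good_problems = bus_sharing_problems(parse_vmx(GOOD_VMX))
bad_problems = bus_sharing_problems(parse_vmx(BAD_VMX))
print(good_problems, bad_problems)
```

Running the same check against both nodes' files mirrors the advice to configure the second controller identically on each node.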
-KjB
Moved to more appropriate forum.
Ken Cline
Technical Director, Virtualization
VMware Communities User Moderator
Thanks KjB,
Implemented the fix based on your feedback and the cluster is now fixed. Thanks!
Glad all is well.
-KjB