VMware Cloud Community
Kake
Contributor

Red Hat Clustering Fencing fails

Hello,

I'm trying to demo (and learn) a Red Hat Cluster on ESXi 3.5 Update 2. I've set up a shared fencing device with the info for my ESXi server. When I try fencing a node from Luci, it only tells me that fencing failed. The messages log has the following: agent "fence_vmware" reports: Unable to connect/login to fencing device. I have checked and rechecked the password and the connection properties.

Any ideas why this is happening? Is it possible to use ESXi as a fencing device?

9 Replies
PaulSvirin
Expert

Here is a bug description; does it describe your problem?

---

iSCSI SAN software

http://www.starwindsoftware.com

Kake
Contributor

I'm not sure it's a bug yet. The version I'm using is later than the one in the bug report.

It might be my syntax.

fence_vmware -v -x -a 1x.1x.2x.12x -n /vmfs/volumes/4b56c0c3-0fbf32f0-a0d7-0013d4123d11/Revan/Revan.vmx -o status -l root -p xxx -L root -P xxx

root@1x.1x.2x.12x's password:xxx

Unable to connect/login to fencing device

# xxx

-bash: xxx: command not found

#

Edit:

This is the syntax from the manual's example:

fence_vmware -a 192.168.1.1 -l test -p test -L root -P root -o status -n /vmfs/volumes/48bfcbd1-4624461c-8250-0015c5f3ef0f/Rhel/Rhel.vmx

The question is: are the VMware ESX management console user and password different from the login name and password?

Yosias
Enthusiast

VMware ESX management console user and password are different from login name and password? Which login name and password are you comparing it to?

Kake
Contributor

Hello,

I found the answer to the mysterious logins from Fencing wiki: http://sources.redhat.com/cluster/wiki/VMware_FencingConfig

+As you can see, guest1 connects to the VMware management console (with hostname/login/password (-a/-l/-p) for ssh), and there vmware-cmd is run (with hostname/login/password (-A/-L/-P) for VMware).

So why do we have 2 sets of parameters? Because:

•On the guest machine, you don't need to install anything. No vmware-cmd. You just connect to the other machine, which has this command.

•On dom0, ssh is not allowed for user root. So we can't use the root login/password.

•Mostly, the owner of the Virtual Machine is root, but again, ssh is not allowed for that user.

The recommended way to use this agent is:

•Install ESX server and set the root password (for example root)

•Create a normal (non-root) user on VMware dom0 (in this example, I will assume user test with password test)

•Create the virtual machine (there is a funny thing here: you must use Windows, because the web console is not able to create a new virtual machine)

•Install your cluster node

+
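To make the two credential sets concrete, here is a minimal Python sketch that assembles the same command line as the manual's example above. The `FenceTarget` class and `build_fence_command` function are hypothetical names made up for illustration; only the flags that already appear in this thread (-a, -l, -p, -L, -P, -o, -n) and the status/on/off actions are used, and nothing is executed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FenceTarget:
    """Hypothetical container for the two sets of credentials."""
    ssh_host: str         # -a: host the agent connects to over ssh
    ssh_user: str         # -l: ssh login (a normal, non-root user)
    ssh_password: str     # -p: ssh password
    vmware_user: str      # -L: VMware login passed on to vmware-cmd
    vmware_password: str  # -P: VMware password passed on to vmware-cmd
    vmx_path: str         # -n: path to the virtual machine's .vmx file


def build_fence_command(target: FenceTarget, action: str = "status") -> List[str]:
    """Return the fence_vmware argument list; does not run anything."""
    # Only the actions discussed in this thread are accepted here.
    if action not in ("status", "on", "off"):
        raise ValueError(f"unsupported action: {action!r}")
    return [
        "fence_vmware",
        "-a", target.ssh_host,
        "-l", target.ssh_user,
        "-p", target.ssh_password,
        "-L", target.vmware_user,
        "-P", target.vmware_password,
        "-o", action,
        "-n", target.vmx_path,
    ]
```

Keeping the ssh pair and the VMware pair as separate fields makes it harder to pass root as the ssh user by accident, which is the mix-up the wiki text warns about.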

Now, that was written for ESX, not ESXi specifically, but it should apply. However, it's not working.

I made a new account for logging into ESXi (I have enabled ssh on ESXi). Here it gets weird: no matter what I do, I can't get the new user to log in to ESXi over ssh. I can't find any config files for ssh on ESXi. I can't even find sshd on ESXi.

The login, however, should not matter, since root can log in over ssh anyway.

Here is what the messages log on ESXi has to say:

Feb 6 07:28:35 dropbear[4952626]: Child connection from 10.xx.xxx.xxx:38504

Feb 6 07:28:35 dropbear[4952626]: password auth succeeded for 'root' from 10.xx.xxx.xxx:38504

Feb 6 07:28:35 vmkernel: 17:00:51:22.165 cpu0:4952626)WARNING: UserObj: 555: Failed to crossdup fd 6, fs: def5 oid: 1a000000030000009 type CHAR: Busy

Feb 6 07:28:40 dropbear[4952626]: exit after auth (root): Exited normally

And with the -x flag on:

Feb 6 07:30:53 dropbear[4953053]: Child connection from 10.xx.xxx.xxx:33370

Feb 6 07:30:53 dropbear[4953053]: password auth succeeded for 'root' from 10.xx.xxx.xxx:33370

Feb 6 07:30:53 vmkernel: 17:00:53:40.445 cpu0:4953053)WARNING: UserObj: 555: Failed to crossdup fd 6, fs: def5 oid: 1a000000030000009 type CHAR: Busy

Feb 6 07:30:58 dropbear[4953053]: exit after auth (root): Exited normally

Feb 6 07:31:03 vmkernel: 17:00:53:50.098 cpu0:2143)WARNING: UserSocketInet: 588: waiters list not empty!

Feb 6 07:31:03 Hostd: Activation : Invoke done on

Feb 6 07:31:03 Hostd: Throw vmodl.fault.RequestCanceled

Feb 6 07:31:03 Hostd: Result:

Feb 6 07:31:03 Hostd: (vmodl.fault.RequestCanceled) { dynamicType = , msg = "" }

Feb 6 07:31:03 Hostd:

Feb 6 07:31:03 Hostd: Failed to send response to the client: Broken pipe

Message was edited by: Kake

Yosias
Enthusiast

OK, one thing at a time.

1. How did you enable ssh on your ESXi server?

2. All this stuff you are talking about seems like vmserver. This forum is for VMware ESXi. I would suggest you put your questions on the vmserver forum for better answers. But yes, with the vmserver web console you need to set up an administrator user on the Windows OS side that will let you log in to the web console, and I believe you have to log in as that user to be able to access your server.

Kake
Contributor

2) I assure you it's ESXi 3.5 build 110271

1) I enabled SSH by going to the (unsupported) console and modifying inetd.conf. I am able to log in as root over ssh, but not as any other user I have created.

Yosias
Enthusiast

OK. Now, when you go to the permissions tab of your host using virtual client, do you see the users that you have added?

Kake
Contributor

Yes, the new accounts are there with administrator role.

I checked the messages log and found the problem.

Feb 8 19:04:59 LSIESG: LSIESG:INTERNAL :: StorelibManager::createDefaultSelfCheckSettings - failed to get TopLevelSystem

Feb 8 19:04:59 sfcbd: INTERNAL StorelibManager::createDefaultSelfCheckSettings - failed to get TopLevelSystem

Feb 8 19:05:16 dropbear[4789772]: Child connection from xxx.xxx.xxx.xxx:xxxx

Feb 8 19:05:20 dropbear[4789772]: password auth succeeded for 'ars' from xxx.xxx.xxx.xxx:xxxx

Feb 8 19:05:21 vmkernel: 19:12:28:07.610 cpu0:4789772)WARNING: UserObj: 555: Failed to crossdup fd 6, fs: def5 oid: 1a000000030000009 type CHAR: Busy

Feb 8 19:05:21 dropbear[5677583]: exit after auth (ars): error changing directory

Feb 8 19:05:21 dropbear[4789772]: exit after auth (ars): Exited normally

The passwd file simply had the wrong home path for the new user.

Okay, now we have a totally new error message to figure out.
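For anyone hitting the same "error changing directory" exit, a small check like the following can spot passwd entries whose home directory does not exist. This is only a sketch: it assumes the standard seven-field passwd layout, `find_bad_home_dirs` is a made-up helper name, and the home paths in the usage example are hypothetical (the real ones from my box are not shown in this thread).

```python
import os
from typing import Callable, List, Tuple


def find_bad_home_dirs(
    passwd_text: str,
    dir_exists: Callable[[str], bool] = os.path.isdir,
) -> List[Tuple[str, str]]:
    """Return (user, home) pairs whose home directory is missing.

    Expects passwd-style lines: name:passwd:uid:gid:gecos:home:shell
    """
    bad = []
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        fields = line.split(":")
        if len(fields) < 7:
            continue  # not a complete passwd entry
        user, home = fields[0], fields[5]
        if not dir_exists(home):
            bad.append((user, home))
    return bad
```

For example, with a fake `dir_exists` that only knows `/`, the text `"root:x:0:0:root:/:/bin/ash\nars:x:500:500::/home/ars:/bin/ash"` yields the `ars` entry as the one to fix.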

~ $ PS1='[PEXPECT]\$ '

$ /usr/bin/vmware-cmd -H localhost -U root -P xxxx '/vmfs/volumes/4b56c0c3-0fbf32f0-a0d7-0013d4123d11/Revan/Revan.vmx' -v getstate

-ash: /usr/bin/vmware-cmd: not found

It seems ESXi doesn't have vmware-cmd. It is starting to look like fencing with ESXi 3.5 is somewhat difficult, at least with the scripts provided by RH clustering. But now is not the time to give up.

Wait a minute!

I changed the command to vim-cmd:

~ $ PS1='[PEXPECT]\$ '

$ /usr/bin/vim-cmd -H localhost -U root -P xxxx '/vmfs/volumes/4b56c0c3-0fbf32f0-a0d7-0013d4123d11/Revan/Revan.vmx' -v getstate

Invalid command '/vmfs/volumes/4b56c0c3-0fbf32f0-a0d7-0013d4123d11/Revan/Revan.vmx'.

$ Status: OFF

Okay, so it still gives status OFF, which of course is wrong.

I need to change the script to issue the following command:

/bin # vim-cmd -H localhost -U root -P xxxx vmsvc/power.getstate /vmfs/volumes/datastore1/Revan/Revan.vmx

Retrieved runtime info

Powered on
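The misleading "Status: OFF" earlier came from a command that had actually failed with "Invalid command". A stricter way to read the vim-cmd output could look like the sketch below. It assumes the output contains a line reading "Powered on" as shown above, and that the powered-off case prints "Powered off" (an assumption, since that output does not appear in this thread). `parse_power_state` is a hypothetical helper, not part of fence_vmware.

```python
def parse_power_state(output: str) -> str:
    """Map vim-cmd vmsvc/power.getstate output to 'on' or 'off'.

    Raises ValueError for anything unrecognized instead of guessing,
    so a failed command is never reported as a powered-off VM.
    """
    for line in output.splitlines():
        text = line.strip().lower()
        if text == "powered on":
            return "on"
        if text == "powered off":  # assumed wording for the off state
            return "off"
    raise ValueError(f"could not determine power state from: {output!r}")
```

Raising on unknown output matters for fencing: treating an error as "off" could let the cluster believe a node was fenced when it is still running.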

Message was edited by: Kake

Kake
Contributor

I was able to modify the fence_vmware Python script to replace vmware-cmd with vim-cmd.

From the command line it now works fine, though sometimes you have to retry a few times due to timeouts.
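The manual retries could be wrapped in something like this. It is a rough sketch using only the Python standard library; `run_with_retries`, the attempt count, the timeout, and the delay are all made-up values for illustration and are not what fence_vmware does internally.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retries(
    cmd: Sequence[str],
    attempts: int = 3,
    timeout: float = 10.0,
    delay: float = 1.0,
    runner: Callable = subprocess.run,
):
    """Run cmd, retrying on a timeout or a non-zero exit status."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # check=True turns a non-zero exit into CalledProcessError.
            return runner(list(cmd), capture_output=True, text=True,
                          timeout=timeout, check=True)
        except (subprocess.TimeoutExpired, subprocess.CalledProcessError) as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # brief pause before the next try
    raise RuntimeError(f"command failed after {attempts} attempts") from last_error
```

The `runner` parameter exists only so the function can be exercised with a fake in place of `subprocess.run`. Retrying "off" is harmless, but a fencing agent should still report failure when every attempt times out.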

Basically status/on/off works fine.

I'll have to test how luci works now.

Steps I took:

1) Enable SSH on ESXi

2) Add user

3) Replace vmware-cmd with vim-cmd (the command syntax is different, but you only need to change how the -o option is parsed into the command)

4) Configure Fencing Agent

vim-cmd vmsvc/power.on needs the VM ID, so use that instead of the path to the .vmx file.
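One way to get from a .vmx file to its VM ID is to parse the listing from `vim-cmd vmsvc/getallvms`. The sketch below assumes each VM row starts with the numeric ID and contains the file as `[datastore] path/to/file.vmx` with no spaces in the path. That layout, the helper names, and the ID used in the usage note are assumptions, so check them against your own host's output before relying on this.

```python
import re
from typing import Dict, List

# Row assumed to look like: "<vmid>  <name>  [<datastore>] <path>.vmx  ..."
_VM_LINE = re.compile(r"^(\d+)\s+.*?\[([^\]]+)\]\s+(\S+\.vmx)\b")

# fence action -> vim-cmd subcommand
_SUBCOMMANDS = {
    "status": "vmsvc/power.getstate",
    "on": "vmsvc/power.on",
    "off": "vmsvc/power.off",
}


def parse_getallvms(output: str) -> Dict[str, int]:
    """Map '[datastore] relative/path.vmx' to the numeric VM ID."""
    vms = {}
    for line in output.splitlines():
        match = _VM_LINE.match(line.strip())
        if match:  # the header row and blank lines do not match
            vmid, datastore, rel_path = match.groups()
            vms[f"[{datastore}] {rel_path}"] = int(vmid)
    return vms


def power_command(vmid: int, action: str) -> List[str]:
    """Build the vim-cmd argument list for status/on/off."""
    if action not in _SUBCOMMANDS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["vim-cmd", _SUBCOMMANDS[action], str(vmid)]
```

With a row such as `16  Revan  [datastore1] Revan/Revan.vmx ...` (16 is a made-up ID), `parse_getallvms` maps `"[datastore1] Revan/Revan.vmx"` to 16, and `power_command(16, "on")` gives the argument list to run over ssh.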

This is by no means perfect and I wouldn't run any production clusters on this.
