kolev
Contributor

MGE shutdown module on ESX 3.5

Hi,

I have a VI 2.5 environment with 3 hosts running ESX 3.5. I tried to install the MGE shutdown module on the hosts, but without success.

In the manual I found this...

download the MGE shutdown module, nsm_linux_cli_3_xx_xx.run, from the MGE site,

copy it to the /tmp directory on the ESX host (with WinSCP, for example),

log in to the host (with PuTTY) and add execute permission to the file: chmod 755 nsm_linux_cli_3_xx_xx.run

After that, install the package with the command ./nsm_linux_cli_3_xx_xx.run -install -silent

But after running this, I got the following message from the hosts:

PHP Warning: Unknown(): Unable to load dynamic library './php_domxml.so' - libgcrypt.so.11: cannot open shared object file: No such file or directory in Unknown on line 0

What do I have to do to install this MGE module?

Thanks in advance!

Nikolay Kolev

33 Replies
selak
Contributor

Good morning,

I installed the nsm_mge agent on a vMA (which is a guest on an ESXi 3.5 U3 host) and followed the guide exactly, but it doesn't work.

I'm testing with the Test entry of the agent menu (Actions > Test, then selecting the power failure event, "perte secteur" in French).

My vMA shuts down, but the 3 other VMs (which have no NSM agent, as per the guide) and the ESXi server (192.168.0.5) do not.

- VMware tools are installed.

- The firewall is fully open.

The System shutdown parameters are: "../bin/tools/shutdown.sh 1000 -shutdown-esxi 192.168.0.5"

Thank you for reading.

pauska
Contributor

DavideDG, thanks so much for the wonderful responses here.

After getting rid of the domxml errors, I still can't get communication with my Web/SNMP card working. The Network Shutdown Module only reports that there is a network communication problem, and I have absolutely no idea how to debug it. Any tips?

The ESX firewall is shut off for debugging, but it didn't help.

-Erik

Message was edited by: pauska, added comment about ESX firewall.

DavideDG
Contributor

Hi all,

I would check a couple other things:

- On the NSM agent web page, set the communication mode to TCP ONLY (see attached image)

- Check the firmware level of your UPS (btw, which model is it?) and apply any applicable upgrades. 1 or 2 years ago this was a mandatory step (firmware below a certain level could not communicate with newer NSM agents).

Other troubleshooting steps that come to my mind are:

1. Try this on a non-production system:

- test that the UPS can really communicate with NSM agents: ensure communications with a Windows machine first.

- try with different cables and switches

- avoid any routing between UPS and NSM agent: try to stay on same broadcast domain and possibly on same network segment.

- try a fresh test ESX 3.5 Update 4 server (you can install it on commodity hardware or even in VMware Workstation if you know how; search this forum :D).

2. Try vSphere (ESX 4.0): I installed a few recently and they worked flawlessly.

- you will need a different version of the NSM agent (the 64-bit one... you'll find it on the same EATON webpage - always the cli/console version!).

- you won't need to fix any library errors (it worked like a charm :D)

- you will still need to open firewall ports.

Once again, I hope you eventually work it out :D

Once you do, remember to actually test the shutdown procedures before going into production (you'll have to configure the ESX server to suspend or guest-shutdown the VMs along with the host, or tweak the shutdown.sh script).
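For the basic "can this machine reach the UPS card at all" check mentioned above, a minimal Python sketch might look like the following. It only tests whether a TCP connection can be opened; it says nothing about whether the NSM protocol itself works. The function name `can_connect` is made up for illustration, and it assumes a Python 3 interpreter on whichever machine you test from (the ESX 3.5 service console may not have one, so a workstation on the same subnet is the safer place to run it).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts
        return False
```

Usage would be something like `can_connect("192.168.0.10", 80)`, where both the address and the port are placeholders: use your card's IP and the ports listed in the NSM and card documentation. A `False` result from one machine and `True` from the Windows box on the same subnet would point at the host (firewall, routing) rather than the card.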

I am leaving for the holidays tomorrow, so I won't be able to read or answer soon (I'll be back in September).

Bye

Davide DG - VCP 3.5

========================================================================= Davide DG - VCP 3.5/4.0
DavideDG
Contributor

Hi selak,

I have never tried this on ESXi hosts, but I read the docs and spent a couple of minutes reading the scripts...

I would try to troubleshoot it by manually calling the script from the vMA machine, i.e. log into the vMA, create a copy of the shutdown.sh script and invoke it manually.

Of course you will have to modify this script to insert "debug lines" (one hint: comment out the actual shutdown of the vMA itself to save time during tests!).

This is to find out which part of the script is actually failing.

This will require some Bash scripting knowledge, though.
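As an alternative to hand-written debug lines, Bash can print every command as it executes it (`bash -x`). Here is a small Python sketch that runs a copy of the script that way and saves the trace to a log file. `trace_script` is a name invented for this example, and it assumes Python 3.9+ is available where you run it, which may not be the case on the vMA itself; there, the plain shell equivalent is `bash -x ./your_copy.sh <args> > trace.log 2>&1`.

```python
import subprocess
from pathlib import Path


def trace_script(script: Path, args: list[str], log: Path) -> int:
    """Run `script` under `bash -x`, save stdout plus the trace to `log`,
    and return the script's exit code."""
    result = subprocess.run(
        ["bash", "-x", str(script), *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # bash writes the -x trace to stderr
        text=True,
    )
    log.write_text(result.stdout)
    return result.returncode
```

With the parameters from selak's post, the call would be along the lines of `trace_script(Path("shutdown_copy.sh"), ["1000", "-shutdown-esxi", "192.168.0.5"], Path("/tmp/shutdown_trace.log"))`, where `shutdown_copy.sh` and the log path are placeholder names. Comment out the real shutdown commands in the copy first, as suggested above; the last lines of the trace then show where the script stops.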

HTH.

Davide DG - VCP 3.5

pauska
Contributor

Hi again,

The UPS is a Powerware 9125 with a ConnectUPS Web/SNMP Card. It's running the latest available firmware. All servers and the UPS are on the same subnet, with no routing in between them (a private 192.168.x.x/24 subnet).

I have a Windows Server 2003 machine running on the same subnet; it can register with the UPS card fine and initiate shutdown procedures.

The NSM is set to TCP only. Trying different cables and switches would be a bit redundant, as I have lots of other traffic going over the same network (and the same ports on the ESX servers), which is working just fine.

I guess I'll just have to wait for VMware to put my SAN on the HCL, so I can upgrade to 4.0. A clean install of ESX usually fixes everything.

Thanks again for the help,

Erik.

selak1
Contributor

Hello David,

Hope you are still enjoying your holidays... drinking mojitos in a swimming pool... watching bikini babes... hum... Soon for me!

I was debugging the script and found another problem, which is really annoying me:

SOAP Fault:

----


Fault string: fault.RestrictedVersion.summary

Fault detail: RestrictedVersionFault

Operation cannot be performed.

They talk about this here.

So thanks to VMware's policies, the "shutdown.sh" script can't work anymore...

DavideDG
Contributor

http://communities.vmware.com/thread/203414

Ahh... I see :(

That's odd... did you try with the vimsh suggested in that thread?

Good luck!!

begoua
Contributor

Hi,

Eaton has put a new release of NSM (3.22) on its website that works fine for shutting down an ESXi host.

There is also a small change in the documentation.

Good luck

DavideDG
Contributor

Long time no update on this, but I recently had both the time and the need to make it work.

Actually, even version 3.22 does not work on the FREE-licensed edition of ESXi, basically because of the same error (RestrictedVersionFault).

I found many sites stating that this (issuing remote commands to free ESXi) either:

  • cannot be done

  • or that can be done using an "emulated" VIClient (via a SOAP request)

Actually, the SOAP method I found (vGhetto repository), seemed to work at first, but crashed my VMs later.

It indeed initiated a shutdown, and AutoStartManager was configured to shutdown/suspend VMs... but it only gave ~2 minutes for all VMs... and after that, it crashed them (powered off the physical server).

After digging a bit, I think I found the reason: /sbin/shutdown.sh (which performs the shutdown of autostart VMs) does NOT wait for these VMs to be powered off. It merely initiates their guest shutdown and then continues with the host shutdown.

This might be resolved by replacing /sbin/shutdown.sh with a custom one, but I really dislike fiddling with system files.

Personally, I resolved it this way:

  • enable the unsupported SSH server in ESXi

  • use PuTTY's plink command, from a Windows VM, to call via SSH a custom script that I uploaded to ESXi

  • this script does the following:

    • calls vicmd hostsvc/autostartmanager/autostop (which initiates VMs shutdown)

    • checks every 30 seconds whether the VMs are powered off (using vicmd vmsvc/getallvms and vmsvc/power.getstate)

    • if VMs are still Powered on after a threshold, it powers them off (vmsvc/power.off)

    • lastly, calls /sbin/shutdown and /sbin/poweroff to halt the physical ESXi.
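To make the logic of that script concrete, here is a rough Python sketch of the "trigger autostop, poll, force off after a threshold" loop. It is not the script from this post (that one runs inside ESXi's busybox shell, where this kind of Python is not available); it only models the same steps. All function names are invented for illustration, the command runner is injectable so the loop can be tried without a host, and the parsing assumes that VM rows in the `vmsvc/getallvms` output start with a numeric id and that `vmsvc/power.getstate` prints "Powered on" for running VMs. The final host halt is deliberately left out.

```python
import re
import subprocess
import time
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run_local(cmd: List[str]) -> str:
    """Default runner: execute the command locally and return its stdout."""
    return subprocess.run(cmd, stdout=subprocess.PIPE, text=True).stdout


def list_vm_ids(run: Runner, vimcmd: str = "vim-cmd") -> List[str]:
    """Collect VM ids: rows of getallvms that begin with a number.
    Assumption: multi-line annotations starting with digits would confuse this."""
    out = run([vimcmd, "vmsvc/getallvms"])
    return re.findall(r"^[ \t]*(\d+)[ \t]", out, flags=re.MULTILINE)


def is_powered_on(run: Runner, vmid: str, vimcmd: str = "vim-cmd") -> bool:
    """Assumption: power.getstate prints 'Powered on' for a running VM."""
    return "Powered on" in run([vimcmd, "vmsvc/power.getstate", vmid])


def stop_vms_and_wait(
    run: Runner = run_local,
    vimcmd: str = "vim-cmd",
    poll_s: float = 30.0,
    max_wait_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start autostop, poll until all VMs are off or max_wait_s has passed,
    then power off whatever is still running. Returns the ids forced off."""
    run([vimcmd, "hostsvc/autostartmanager/autostop"])
    waited = 0.0
    while True:
        running = [v for v in list_vm_ids(run, vimcmd) if is_powered_on(run, v, vimcmd)]
        if not running or waited >= max_wait_s:
            break
        sleep(poll_s)
        waited += poll_s
    for vmid in running:
        run([vimcmd, "vmsvc/power.off", vmid])  # hard power-off of stragglers
    return running
```

Passing `vimcmd="vicmd"` matches the command name used in the list above. The 600-second threshold is an arbitrary placeholder; pick it from how long your slowest guest really takes to shut down, and only halt the host after this function has returned.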

I don't really like the idea of using the unsupported SSH, nor the fact that I am executing scripts INSIDE the busybox of the vmkernel, but I could not come up with a better solution. Unless... I might consider modifying /sbin/shutdown.sh or its definition in /etc/inittab, but I think that's even more intrusive.

Any thoughts ?

--

Davide DG - VCP 3.5/4.0

begoua
Contributor

Have you tried the new Perl script that uses the VMware API?

DavideDG
Contributor

Hi,

Do you mean shutdownHostViaSOAPAPICall.pl? Yes. I did not try the original set of Windows batch files (that need ncat.exe) by Simon Seagrave, though.

But I think that the problem resides in the way the /sbin/shutdown.sh (which is called upon system shutdown via /etc/inittab I think) works.

That is, as far as I have understood, the shutdown process goes like this:

  1. Shutdown is triggered by either DCUI, SOAP, GUI, API, etc.

  2. /etc/inittab defines a shutdown script, which is /sbin/shutdown.sh

  3. this script at a certain point calls "/sbin/vmware-autostart.sh stop"

  4. the vmware-autostart.sh script calls "vim-cmd hostsvc/autostartmanager/autostop"

  5. then calls "/sbin/services.sh stop" and then exits, effectively continuing the system shutdown/poweroff.

This means that, even though the autostartmanager is correctly invoked (and VMs actually start to either suspend or shut down their guests), nowhere in the process is there a wait for VMs to be fully shut off (especially slower ones... consider that I tested with SBS2008, which takes ages).

I posted my problem also on William Lam's vGhetto page for the script, maybe I'm just missing something in the setup!

=========================================================================

Davide DG - VCP 3.5/4.0

begoua
Contributor

No, it's a simpler Perl script that is in the NSM 3.22 package (called ShutdownESXi.pl).

DavideDG
Contributor

Ah... yes yes, I have seen and tested it, but it does not (and cannot) work with FREE-licensed versions of ESXi, because free versions only have restricted, read-only API support (since 3.5 Update 4).

=========================================================================

Davide DG - VCP 3.5/4.0

begoua
Contributor

Oh... I didn't know that; I have a licensed vSphere 4 Enterprise Edition...

Sorry..., thanks for the information!
