VMware Cloud Community
0j3r3my0
Contributor

APC PowerChute on VMware ESXi 4

Hi!

We have a problem with ESXi 4 while testing our APC SUA-2200 UPS. We have only one ESXi server. The vMA does not shut the ESXi host down when it receives the shutdown signal from the UPS, so when the battery dies the machine just powers off without a graceful shutdown of the VMs. The UPS has a network management card installed. We deployed a vMA to our ESXi host and installed APC's PowerChute Network Shutdown (PCNS) software on it. The current status is as follows:

1. The network management card in the UPS is working correctly (we can access its web UI from any computer).

2. The vMA has a working PowerChute installation. We opened all the required ports with iptables, so we can reach PowerChute's web UI.

3. We configured the ESXi host through vSphere to start up and shut down the VMs automatically with the system. If we reboot the ESXi host from vSphere, the system shuts the virtual machines down correctly.

4. PowerChute communicates with the UPS network card correctly, and the vMA receives the shutdown signal correctly. But the vMA only shuts itself down and leaves the ESXi host on. That is our problem.

How can we force the vMA to shut down our ESXi host?

Have you experienced a problem like this before with the PowerChute, or do you have any idea what causes this error?

Thank you for your help!

Regards,

Robert

59 Replies
DSTAVERT
Immortal

It would be worth exploring the beta Mobile Web interface appliance since those are probably already identified and more.

-- David -- VMware Communities Moderator

DSTAVERT
Immortal

I am guessing that William will be up late tonight.

-- David -- VMware Communities Moderator

lamw
Community Manager

Actually this one is pretty easy; the majority of the work was already done by James Pearce and by Simon Seagrave's blog post. I'll post something after dinner.

=========================================================================

William Lam

VMware vExpert 2009

VMware ESX/ESXi scripts and resources at:

Twitter: @lamw

VMware Code Central - Scripts/Sample code for Developers and Administrators

VMware Developer Community

If you find this information useful, please award points for "correct" or "helpful".

lamw
Community Manager

Great post and great work James.

Here is my version of the same script. It uses Perl and some SOAP libraries that are part of the vSphere SDK for Perl, which is bundled with the vCLI. The script doesn't require as many files as your .bat script(s), and it works on Linux, Windows, or the vMA.

fixitchris
Hot Shot

Isn't there a better way to determine whether the host is shutting down than to rely on the HTTP 200 response?

lamw
Community Manager

Sure, you can query the APIs to see what tasks are running, though a 200 OK means at least that your request was accepted. If you have no VMs running, you won't get much back other than the 200.

fixitchris
Hot Shot

I know we are not really using the VI SDK here but what about this:

VMware® End User License Agreement
3. Restrictions: You agree that you will not ... (5) use the Software to (a) create, design or develop software or service to circumvent, enable, modify or provide access, permissions or rights which would violate the technical restrictions of VMware Products, any additional licensing terms provided by VMware via product documentation, email notification and/or policy change on VMware website, and/or the terms of the End User License Agreements of VMware products;

?

lamw
Community Manager

*shrugs*

J1mbo
Virtuoso

Hmm looks ominous - doesn't say anything about what is essentially a replay attack though, so my version should be fine :smileygrin:

Please award points to any useful answer.

fixitchris
Hot Shot

Here's a version w/o using VIM assemblies: http://poshcode.org/1525

J1mbo
Virtuoso

Excellent, so to sum up, we now have three different options for integrating free ESXi with a UPS:

- Perl, by lamw

- PowerShell, by fixitchris

- my clunky-by-comparison Windows .bat version

OP - take your pick :smileygrin:

fixitchris
Hot Shot

The issue of shutting down the guests still remains:

I would recommend using apcupsd (apcupsd.com) to monitor the UPS. When the battery is running low, run a PowerCLI script that determines all VMs that are powered on, retains their MoRef objects, creates the web service proxy, issues StopVM for each VM MoRef, and then issues the ShutdownHost command.
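The ordering of those steps can be sketched in a few lines. This is Python rather than PowerCLI, and `HostClient`, `FakeClient` and their method names are hypothetical stand-ins for whatever API wrapper is actually used; none of them are real VMware calls. It only illustrates that the host shutdown is issued after every VM stop request.

```python
from typing import List, Protocol


class HostClient(Protocol):
    """Hypothetical wrapper around the host's web service; not a real VMware API."""

    def powered_on_vms(self) -> List[str]: ...
    def stop_vm(self, vm_ref: str) -> None: ...
    def shutdown_host(self) -> None: ...


def shutdown_sequence(client: HostClient) -> List[str]:
    """Stop every powered-on VM, then shut the host down. Returns the VM refs stopped."""
    vm_refs = list(client.powered_on_vms())  # retain the references up front
    for ref in vm_refs:
        client.stop_vm(ref)
    client.shutdown_host()  # only issued after every stop request was sent
    return vm_refs


class FakeClient:
    """In-memory stand-in that records the calls, for a dry run."""

    def __init__(self, vms: List[str]) -> None:
        self.vms = vms
        self.calls: List[str] = []

    def powered_on_vms(self) -> List[str]:
        return self.vms

    def stop_vm(self, vm_ref: str) -> None:
        self.calls.append(f"stop:{vm_ref}")

    def shutdown_host(self) -> None:
        self.calls.append("shutdown_host")
```

Running `shutdown_sequence(FakeClient(["vm-1", "vm-2"]))` records the two stop calls followed by the host shutdown, which makes the ordering easy to check before pointing anything at a real host.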

lamw
Community Manager

Per James's post, he's relying on the auto startup/shutdown feature of the ESX(i) host, and that will take care of the power down, so long as you know your VMs and set up a sufficient delay for each VM to power down based on the priority set.

Yes, you can use the APIs to figure out which VMs are powered on and set up SOAP requests to power them down, but I think for the most basic setup of a few hosts you can just set up auto power down and let ESX(i) deal with it, so long as you have VMware Tools installed; otherwise the VMs will be hard powered off.

fixitchris
Hot Shot

Right. Also make sure your host BIOS AC Recovery is set to PowerOn instead of LastState. You want the host to come up when AC comes back.

J1mbo
Virtuoso

Just to add that shutting down each VM isn't needed - they can simply be suspended, and this can be done regardless of VMware Tools status.

Also setting PowerOn state may not work reliably - once the UPS issues the shutdown command and the server(s) power off, potentially that UPS could be left running (for example) a switch and a router. Because of the greatly reduced load it could survive running them, depending on its spec of course, for quite a while. Anyway if utility power is restored before the battery is completely drained, the server would see no outage so wouldn't resume.

What is really needed, and by far the simplest solution IMO, is for the APC management card to include:

- A list of ESX(i) servers to shut down, with username and password fields specifiable against each, which it would achieve using the simple SOAP command set illustrated, and

- A tick-box option against each to wake up on resume, via wake-on-LAN
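For reference, the wake-on-LAN half of that wish list is small enough to run from any machine that is still powered. A minimal Python sketch, assuming the server's NIC and BIOS have wake-on-LAN enabled; the MAC address, broadcast address and UDP port 9 below are placeholders and conventions, not values from this thread:

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN payload: 6 x 0xFF, then the MAC 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (port 9 is a common convention)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(build_magic_packet(mac), (broadcast, port))
```

Something like `send_magic_packet("00:11:22:33:44:55")` would be called once the UPS reports that utility power is back; whether the packet reaches the server depends on the switch in between still being powered.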

Anyone got access to the source code for these boards!?

Cheers

fixitchris
Hot Shot

As far as I know, once a UPS issues a SHUTDOWN command it gives you a grace period before the UPS turns itself off. It turns itself off so as not to drain the batteries, since the batteries can't survive too many deep discharges. (http://apcupsd.com/manual/manual.html#full-power-down-test)

The issue now becomes what happens if ESXi shuts down and AC comes back within the grace period.

Your solution to rewrite the firmware is excellent. I tried to contact APC at one point to get some more info about my UPS but the info was considered TOP SECRET. I actually wanted to know what my grace period was, I think, w/o draining the UPS to find out.

I think a laptop running APCUPSD needs to support the ESXi host power operations. This is a list of the supported events:

http://apcupsd.com/manual/manual.html#customizing-event-handling

PowerShell apccontrol skeleton: http://communities.vmware.com/thread/205306?tstart=0

Re grace period: we can kill the UPS after the script by issuing a 'killpower' command (http://apcupsd.com/manual/manual.html#shutdown-sequence). So after shutting down ESXi, we test for AC power; if there is no AC power, we kill the UPS. When AC comes back, the UPS will power on and so will ESXi. (Reading the apcupsd docs further, the grace period might actually be an issue after the 'killpower' command.)
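The "test for AC power, then kill the UPS" decision could look roughly like this in Python. It is only a sketch: it assumes the `KEY : value` text layout printed by `apcaccess status` and that the STATUS field contains ONBATT while the UPS is on battery. The function names are made up for illustration, and actually issuing killpower is left to the apcupsd documentation linked above.

```python
from typing import Dict


def parse_apcaccess(text: str) -> Dict[str, str]:
    """Split 'KEY : value' lines into a dict; lines without a colon are ignored."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def should_kill_power(status_text: str) -> bool:
    """True only when the UPS explicitly reports that it is on battery."""
    status = parse_apcaccess(status_text).get("STATUS", "")
    return "ONBATT" in status.split()
```

An empty or unreadable status deliberately returns False, on the assumption that cutting UPS output is the riskier mistake when the state is unknown.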

Innuendo
Contributor

Just tried to set it up yesterday (unsuccessfully, which is why I read this thread), and here is what I am 100% sure of:

On vSphere (so ESXi 4), the VMStartPolicy is tied to a given VM: whatever host the VM is on, the policy follows it, except for one property, the boot order. You just have to set these on a per-VM basis (put one VM in auto start and then click Edit for this VM, or just use the command line / PowerCLI).

If you move a boot-ordered VM to another host, it will keep its start and stop parameters (since they are VM parameters) but won't keep its boot order, since that is a host parameter.

The main problem when using such scripts is to be sure every VM was suspended in time. I use suspension since it is an emergency measure: you surely don't have enough time to wait for every guest OS to shut down, so you have to be quick. In my case, I only have 5 minutes to shut down 5 hosts and 130 VMs (Nehalem rules).

So I have a script that asynchronously suspends every VM on each host (one thread per host), but I have to be sure the VM that is leading the shutdown process WON'T receive a shutdown trap, or it will just be suspended and, when power is back on, it will continue the suspended job, e.g. power itself off or, worse, suspend other VMs.
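The one-thread-per-host pattern, with the controlling VM skipped, can be sketched as follows. This is Python and purely illustrative: `suspend_all` and the `suspend` callback are hypothetical names, the callback stands in for whatever call actually suspends a VM, and the sketch only keeps the script from suspending its own VM; it does not address the shutdown trap itself.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def suspend_all(
    vms_by_host: Dict[str, List[str]],
    controller_vm: str,
    suspend: Callable[[str, str], None],
) -> List[str]:
    """Suspend every VM except controller_vm, using one worker thread per host."""

    def suspend_host(host: str) -> List[str]:
        done = []
        for vm in vms_by_host[host]:
            if vm == controller_vm:
                continue  # never suspend the VM that is running this script
            suspend(host, vm)
            done.append(vm)
        return done

    hosts = list(vms_by_host)
    if not hosts:
        return []
    with ThreadPoolExecutor(max_workers=len(hosts)) as pool:
        results = pool.map(suspend_host, hosts)  # results come back in host order
        return [vm for per_host in results for vm in per_host]
```

Within a host the VMs are still suspended one after another here; only the hosts run in parallel, which matches the description above.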

That is where I'm stuck at the moment. I will try to use your recommendations.

Great work here btw.

PS: pardon my poor English, it is not my mother tongue.

J1mbo
Virtuoso

With an APC UPS, the amount of battery time remaining at the shutdown signal (low battery duration) can be configured to over 24 minutes on the 750RM I used to test this. I would suggest increasing the time well beyond five minutes.

Also it is worth thinking about the return battery capacity, i.e. how much the batteries must be charged before the UPS will re-enable output.

If the UPS is fully depleted and AC is restored, then provided the BIOS on the attached server is configured to always-on, the server will resume (and start the VMs, eventually). However, if AC is interrupted again, which in my experience of power failures is quite likely, there may not be enough capacity to complete the boot and resume and then commence an orderly shutdown again. Hence specifying a high return battery capacity, say 45%, and a return delay of a minute as a further filter may be sensible.

All VMs need to be set to shut down or suspend in the policy, as you have found. It is worth noting that the time between each can be reduced from the default of 60 seconds and that, provided VMware Tools are installed, the routine progresses to the next VM as soon as a shutdown or suspend is completed. Running these concurrently could well take longer, though, since it would generate massive disk contention for what is basically sequential write IO.
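A rough worst-case budget shows why the per-VM delay matters. The Python sketch below assumes the 130 VMs mentioned earlier are spread evenly over the 5 hosts (26 per host, an assumption) and that every VM uses its full delay; the function names are made up for illustration.

```python
def worst_case_seconds(vm_count: int, delay_per_vm: int) -> int:
    """Upper bound when every VM uses its full delay before the next one starts."""
    return vm_count * delay_per_vm


def fits_in_runtime(vm_count: int, delay_per_vm: int, runtime_seconds: int) -> bool:
    """True if the sequential worst case fits in the battery time available."""
    return worst_case_seconds(vm_count, delay_per_vm) <= runtime_seconds


# 26 VMs per host at the 60-second delay: 1560 s (26 minutes) in the worst case,
# far more than a 5-minute (300 s) window.
per_host = worst_case_seconds(26, 60)
```

At a 10-second delay the same 26 VMs come to 260 seconds, which would fit in five minutes, though the real figure depends on how quickly each guest actually suspends or shuts down.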

One advantage of the Windows PowerChute software is that it will shut down the (virtual) machine it's running on after running whatever scripts are set; in this way the sleep-resume issue you mention is presumably side-stepped.

Hope that helps.

lamw
Community Manager

Agreed, suspend probably makes more sense.

The other issue I think some folks are dealing with is whether this is a 100% free solution from VMware's side of things. If you're using the free version of ESXi ... yeah, there's some hacking going on. If you're licensed, then that's a whole different game, and your solution of sending an async request to all VMs except for the one kicking it off isn't out of the question.

Heck, you could just use this script, which probably supports this use case: , you'd just need to do some minor tweaks and do an async operation instead of a sync one.
