VMware Cloud Community
assanemd
Enthusiast

Set up ESXi host and VM shutdown using APC PCNS

Hello,

I have a vSphere 5.5 HA cluster with three hosts and many VMs (Windows, Linux). The cluster is powered by an APC UPS with a Network Management Card, so the card can be used with APC PowerChute Network Shutdown (PCNS) to shut down the hosts and VMs properly when certain UPS events occur (power cut, low battery, ...). Reading the documentation, I've seen that there are two ways to achieve this:

1. Download and install vMA, install PCNS 3.1 on it, and configure automatic VM shutdown together with host shutdown when an undesirable event occurs on the UPS. I'm hesitant about this solution because I have read somewhere that automatic VM shutdown is not recommended and/or is disabled in an HA cluster. Furthermore, in this case the VMs are shut down automatically by VMware Tools, but in the past I have faced many BSODs and kernel panics when these VMs were reset by VMware Tools while the VM Monitoring feature was enabled in the cluster.

2. Download and install vMA, install PCNS 3.1 on it, and configure only the host shutdown. Then install and configure PCNS on each VM, so that when an undesirable UPS event occurs the VMs are shut down by the PCNS agent inside them and the hosts are shut down by PCNS on the vMA (of course with a delay, to allow the VMs to shut down properly first).

Has anyone implemented this solution?

What would you suggest?

Cyber201
Contributor

Hi assanemd, have you resolved this problem?

I'm in the same situation...

3 ESXi 5.5 hosts and 1 APC UPS, with vCenter Server 5.5 and PCNS 3.1 inside the cluster. 6 VMs are up and running in the cluster.

Any idea?

Thanks a lot

Bye

dgrehan
Enthusiast

You can download and install the vMA and then install PCNS 3.1 on it or use the PCNS 3.1 Virtual Appliance.

Assuming you have a single UPS powering the 3 ESXi hosts, choose the Managed by vCenter option in the Setup Wizard and the Single UPS Configuration option.

On the Virtual Machine settings page in the Setup Wizard you can enable the VM shutdown/startup options and configure a delay. Since your hosts are part of an HA cluster, you don't set Automatic VM shutdown/startup in the vSphere Client to shut down the VMs with the host. PCNS will shut down the VMs on each host before shutting down the hosts themselves. VMware Tools must be installed on each VM so that PCNS can issue a graceful guest OS shutdown command; otherwise the VMs are powered off.

assanemd
Enthusiast

Hello dgrehan,

Thank you for your answer.

I don't want to use VMware Tools to shut down the VMs, because these are P2V-converted VMs and in the past I faced many BSODs and kernel panics when these VMs were reset by VMware HA using VMware Tools.

dgrehan
Enthusiast

Hi,

If you do not wish to use VMware Tools to issue the OS shutdown command on the VMs, then your only options are to install PCNS directly on each VM to be shut down, or to use a shutdown script triggered by PCNS that performs a remote OS shutdown command on each of the VMs. The script approach would require storing OS login credentials for the VMs.

JarryG
Expert

"...that would require storing OS login credentials for the VMs to do the shut down..."

Not necessary. For this I created one more common account on all "slave" VMs, added "sudo /sbin/shutdown -h now" at the end of its .bash_profile file, and of course allowed it to run shutdown (via visudo). The "master" VM then only connects to all slave VMs (using a small bash script, an ssh client and keyfiles), and right after login the shutdown starts automatically. No root credentials and no VMware Tools are required on the slave VMs...
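
A minimal Python sketch of the "master connects to each slave VM over ssh with a keyfile" idea, for anyone who prefers it to a bash loop. It is a variation, not the exact setup described above: the shutdown command is passed on the ssh command line instead of being triggered from .bash_profile, and it assumes the account may run shutdown through sudo without a password. The host names, account name and key path are hypothetical placeholders.

```python
import subprocess

# Hypothetical placeholders: replace with your own VM names, account and key.
SLAVE_VMS = ["vm-linux-01", "vm-linux-02"]
SHUTDOWN_USER = "upsshutdown"
KEY_FILE = "/home/upsshutdown/.ssh/id_rsa"
REMOTE_COMMAND = "sudo /sbin/shutdown -h now"


def build_ssh_command(host, user=SHUTDOWN_USER, key_file=KEY_FILE,
                      remote_command=REMOTE_COMMAND, connect_timeout=10):
    """Build the ssh argument list for a key-based, non-interactive login."""
    return [
        "ssh",
        "-i", key_file,
        "-o", "BatchMode=yes",  # fail instead of prompting for a password
        "-o", f"ConnectTimeout={connect_timeout}",
        f"{user}@{host}",
        remote_command,
    ]


def shutdown_all(hosts, runner=subprocess.run):
    """Run the shutdown command on every host and return {host: return code}.

    A host that cannot be reached or that times out is recorded as None,
    so one unreachable VM does not stop the loop.
    """
    results = {}
    for host in hosts:
        try:
            completed = runner(build_ssh_command(host), timeout=60, check=False)
            results[host] = completed.returncode
        except (subprocess.TimeoutExpired, OSError):
            results[host] = None
    return results
```

Calling `shutdown_all(SLAVE_VMS)` from the script that PCNS triggers would then attempt each VM in turn. A non-zero return code is only reported here, since ssh can exit abnormally when the remote side goes down mid-session.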

dgrehan
Enthusiast

I was thinking more of the Windows VMs (I think there is a mix of Linux and Windows VMs in the configuration) and of using the "net rpc" command to do the shutdown from the vMA.

But now that you mention it, would it be possible to install Cygwin and an SSH daemon on the Windows VMs and connect using an ssh client and keyfiles as you suggest?

assanemd
Enthusiast

I have installed the PCNS software in all VMs. It's working very well: all the VMs were shut down.

I also use the vMA appliance to shut down the hosts (I don't use the VM shutdown feature), but I see the following behaviour.

I have 3 ESXi hosts. The vMA appliance is installed on the third one (ESXi 3), and HA is enabled in the cluster. During testing I noticed that all the VMs were shut down, as were ESXi 1 and ESXi 3 (where the appliance is hosted), but the second host (ESXi 2) was not shut down. What could explain this?

Another question: is it possible to tell the ESXi hosts to shut down only after all VMs have been shut down?

dgrehan
Enthusiast

Hi,

You have 3 ESXi hosts in an HA cluster. PCNS is installed on the vMA running on one of the ESXi hosts (ESXi 3). Is vCenter Server running on ESXi 2? Are you using an Active Directory user account or a local account in PCNS?

Does the vCenter Server account configured in PCNS exist as a local user on each of the 3 ESXi hosts, and does it have administrator permissions? Please refer to FA228172 in the FAQ.

assanemd
Enthusiast

I have a physical vCenter Server. I use a local account in PCNS.

The vCenter account configured in PCNS is the default vCenter SSO administrator, administrator@vsphere.local. It should have permission to shut down the ESXi hosts, because it successfully shut down the other hosts, and all the hosts have the same configuration and belong to the same HA cluster.

dgrehan
Enthusiast

The default administrator@vsphere.local account does not exist as a local user on the ESXi hosts and has no admin permissions on them, i.e. that account cannot be used to connect directly to an ESXi host if vCenter Server is not accessible.

If for some reason PCNS cannot connect to the physical vCenter Server during the shutdown, it will attempt to perform the host shutdown by connecting directly to the ESXi host using the vCenter account configured in PCNS. This will fail because administrator@vsphere.local cannot log in to the ESXi host directly.

Could you provide a copy of the PCNS event log and /opt/APC/PowerChute/group1/error.log?

assanemd
Enthusiast

I will get these logs tomorrow. But is it possible that the HA configuration affects the host shutdown? We noticed that all the VMs that were on ESXi 1 were migrated to ESXi 2. Might this be why ESXi 2 did not shut down?

dgrehan
Enthusiast

Is DRS enabled and set to fully automated? If so, then yes, this could be the cause. PCNS issues a maintenance mode command at the start of the shutdown sequence, and DRS will start moving VMs to other hosts if it is set to fully automated. Because the VMs are still running, this prevents the host from entering maintenance mode. The default timeout for maintenance mode is 0, i.e. it will wait indefinitely, so PCNS does not shut down the host.

To avoid this you can add a key "Maintenance_Mode_Duration = 120" to the [HostSettings] section in /opt/APC/PowerChute/group1/pcnsconfig.ini (stop the service to edit the file and then restart it). This forces the maintenance mode command to time out after 120 seconds, or whatever value you need to set.

Or you could just change the automation level for DRS to Partially automated - this will prevent DRS from automatically moving the VMs to other hosts in the cluster.
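
For the Maintenance_Mode_Duration edit, here is a rough Python sketch of adding the key under [HostSettings] without touching the rest of the file. It works on plain text rather than an INI library so that other lines keep their exact formatting. Only the section name and the key come from this post. The other section and key in the sample are made up for illustration, and a real pcnsconfig.ini will look different. As noted above, stop the PCNS service before changing the real file.

```python
def set_ini_key(text, section, key, value):
    """Return INI text with `key = value` set inside `[section]`.

    An existing key in that section is overwritten; otherwise the new line
    is inserted directly below the section header. Other lines are unchanged.
    """
    lines = text.splitlines()
    header = f"[{section}]"
    start = next((i for i, line in enumerate(lines) if line.strip() == header), None)
    if start is None:
        raise KeyError(f"section {header} not found")
    # The section ends at the next "[...]" header or at the end of the file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].strip().startswith("[")),
        len(lines),
    )
    new_line = f"{key} = {value}"
    for i in range(start + 1, end):
        if lines[i].split("=", 1)[0].strip() == key:
            lines[i] = new_line
            break
    else:
        lines.insert(start + 1, new_line)
    return "\n".join(lines) + "\n"


# Made-up sample content; only [HostSettings] is taken from the post.
SAMPLE = """\
[OtherSection]
example_key = 1

[HostSettings]
another_example_key = abc
"""

updated = set_ini_key(SAMPLE, "HostSettings", "Maintenance_Mode_Duration", 120)
print(updated)
```

Running the function a second time with a different value replaces the existing line instead of adding a duplicate.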

assanemd
Enthusiast

Hello dgrehan,

I have attached the logs here: error.log, pcnsconfig.ini and EventLog.txt. We ran the test on 05/22/2014 at around 11:05.

I noticed that host2 didn't exit maintenance mode.

dgrehan
Enthusiast

Hi,

There is nothing in the logs to indicate that the Host 2 shutdown failed:

05/21/2014  11:05:55  UPS has switched to battery power.  .3.5.1.5.4.1
05/21/2014  11:20:55  UPS critical event: On Battery occurred on Hosts: host1, host2, host3.  .3.4.9.9
05/21/2014  11:20:55  Enter maintenance mode: host1.  .3.4.9.9
05/21/2014  11:20:55  Enter maintenance mode: host2.  .3.4.9.9
05/21/2014  11:20:56  Enter maintenance mode: host3.  .3.4.9.9
05/21/2014  11:20:56  Exit maintenance mode: host3.  .3.4.9.9
05/21/2014  11:20:56  Shutting down Host host1.  .3.4.9.9
05/21/2014  11:23:10  Shutting down Host host2.  .3.4.9.9
05/21/2014  11:23:58  Shutting down Host host3.  .3.4.9.9

So you have a 15-minute delay for the On Battery shutdown action. You mentioned that PCNS is also installed on each of the VMs. At what point do the VMs start shutting down? They should all be powered off before the ESXi hosts are commanded to shut down.

Could you replace the /opt/APC/PowerChute/group1/log4j.xml file with the one I've attached here and run the shutdown again? This will create debug output in error.log and a file called VMwareDebug.log.

I will try to re-create the issue in my setup. Just to confirm: Host 2 remains powered on and there are VMs powered on on Host 2? In the Tasks view for Host 2, is the maintenance mode task still in progress?
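
The timings in the event log excerpt from this post can be checked with a short Python sketch. It is only an illustration: the sample lines are copied from the excerpt (without the trailing codes), the date format is assumed to be MM/DD/YYYY, and the pattern tolerates the date, time and message running together as well as `<b>` markup around host names.

```python
import re
from datetime import datetime

# Sample lines in the shape of the event log excerpt from this thread.
SAMPLE_LOG = """\
05/21/201411:05:55UPS has switched to battery power.
05/21/201411:20:55UPS critical event: <b>On Battery</b> occurred on Hosts: <b>host1, host2, host3</b>.
05/21/201411:20:55Enter maintenance mode: <b>host2</b>.
05/21/201411:23:10Shutting down Host <b>host2</b>.
"""

LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4})\s*(\d{2}:\d{2}:\d{2})\s*(.*)$")
TAG_RE = re.compile(r"</?b>")


def parse_events(text):
    """Return (timestamp, message) tuples; lines that do not match are skipped."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        date_part, time_part, message = match.groups()
        stamp = datetime.strptime(f"{date_part} {time_part}", "%m/%d/%Y %H:%M:%S")
        events.append((stamp, TAG_RE.sub("", message).strip()))
    return events


def seconds_between(events, first_text, second_text):
    """Seconds from the first event containing first_text to the first
    event at or after it that contains second_text."""
    start = next(stamp for stamp, msg in events if first_text in msg)
    end = next(stamp for stamp, msg in events if second_text in msg and stamp >= start)
    return int((end - start).total_seconds())


events = parse_events(SAMPLE_LOG)
on_battery_delay = seconds_between(events, "switched to battery", "UPS critical event")
host2_wait = seconds_between(events, "Enter maintenance mode: host2", "Shutting down Host host2")
print(on_battery_delay, host2_wait)
```

On these sample lines the first value corresponds to the 15-minute On Battery delay, and the second is the gap between the maintenance mode request and the shutdown command for host2.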
assanemd
Enthusiast

Hello dgrehan,

Thanks for your support on this point. Indeed, I checked and there were 2 VMs on this host that didn't have the PowerChute agent installed; that's why the host wasn't shut down.

dgrehan
Enthusiast

No problem, glad to help.
