I wrote a script in Python to monitor my free ESXi servers. It is my very first Python script... I'm used to writing Perl, so please be indulgent. The script was written for Nagios-oriented monitoring, but you can easily adapt it to another monitoring tool by redefining the exit codes.
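For anyone adapting it to another tool, the exit-code convention is the part to change. Here is a minimal sketch, assuming the standard Nagios plugin codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The names `EXIT_CODES`, `SEVERITY`, `worst_status` and `report` are made up for illustration and are not taken from the script itself.

```python
import sys

# Standard Nagios plugin exit codes; remap the values for another tool.
EXIT_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}

# Severity order used to pick the overall result. Ranking UNKNOWN just
# above OK is a design choice made for this sketch.
SEVERITY = ["OK", "UNKNOWN", "WARNING", "CRITICAL"]


def worst_status(statuses):
    """Return the most severe status name from a list of status names."""
    if not statuses:
        return "UNKNOWN"
    return max(statuses, key=SEVERITY.index)


def report(statuses, message=""):
    """Print one status line and exit with the matching code."""
    status = worst_status(statuses)
    print("%s %s" % (status, message))
    sys.exit(EXIT_CODES[status])
```

Calling `report(["OK", "CRITICAL"], "Power Supply 1")` would print a CRITICAL line and exit with code 2; for a different tool, only the `EXIT_CODES` values should need to change.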
I love the script; I use it from Windows with WhatsUp. I wrote a WhatsUp wrapper to use it.
http://www.stephenjc.com/2009/01/whatsup-vmware-esxi-monitor-these.html
Your script works great from the command line, but I ran into problems when trying to define the check command for Nagios. Would you mind sharing how you got it to work?
Edit: I keep getting a (null) return.
Thanks
Awesome work! Works great from a linux box. I did receive this error from a windows box:
C:\Python26\lib\site-packages\pywbem\cim_types.py:164: DeprecationWarning: object.__init__() takes no parameters
int.__init__(self, arg, base)
However the command appears to have completed successfully regardless.
Works in Zenoss too:
http://linuxtrek1.blogspot.com/2009/02/zenoss-monitor-free-esxi-version.html
It works great on Red Hat EL5 with the rpm above, or even on the ancient EL4 if you install the older pywbem-0.5 for python-2.3.
I also made a small improvement (full script available at http://staff.washington.edu/joshuadf/esxi/ ) to catch problems with EnumerateInstances. This catches AuthError for wrong password and should also work for the CIM_Memory problem described at http://communities.vmware.com/message/1069795 and http://communities.vmware.com/thread/163730
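The idea, roughly, is to wrap each EnumerateInstances call so that a failure becomes a clean status line rather than a traceback. This is only a sketch of the pattern, not the actual change in the linked script: `safe_enumerate` and `FakeConnection` are hypothetical names, and the stand-in connection replaces a real pywbem connection so the example runs without an ESXi host.

```python
def safe_enumerate(conn, classname):
    """Call conn.EnumerateInstances(classname), trapping any failure.

    Returns (instances, error): error is None on success, otherwise a
    short message suitable for a monitoring status line.
    """
    try:
        return conn.EnumerateInstances(classname), None
    except Exception as exc:  # a real script may catch narrower types such as AuthError
        return [], "%s: %s" % (type(exc).__name__, exc)


class FakeConnection(object):
    """Hypothetical stand-in for a pywbem connection, for demonstration only."""

    def EnumerateInstances(self, classname):
        if classname == "CIM_Memory":
            raise RuntimeError("simulated provider failure")
        return ["instance of %s" % classname]


conn = FakeConnection()
for name in ("CIM_Processor", "CIM_Memory"):
    instances, error = safe_enumerate(conn, name)
    if error:
        print("UNKNOWN : %s (%s)" % (name, error))
    else:
        print("%d instance(s) of %s" % (len(instances), name))
```

With a real connection, the error branch is where the plugin would exit with the UNKNOWN code so that the monitoring tool gets a readable message.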
fix it!
Download pywbem 0.7 (pywbem-0.7.0.tar.gz), open its cim_types.py, copy the "# CIM integer types" section, and use it to replace the same section in your 0.6 "C:\Python26\Lib\site-packages\pywbem\cim_types.py".
The warning no longer appears.
=========================================================================
Max
I'm having the same result via Nagios, (null), but from the command line I get OK, or in verbose mode see all of the checks.
Here's how I have the command defined in commands.cfg:
define command{
command_name check_esx_wbem
command_line $USER1$/check_esx_wbem.py https://$HOSTADDRESS$:5989 $ARG2$ $ARG3$
}
And the check as defined for one of my ESXi servers
I modified the script by adding the try/except block and now it works through Nagios. Strange, since I didn't change any of the Nagios configuration.
Thanks to Joshua:
http://staff.washington.edu/joshuadf/esxi/check_esx_wbem.py
Wow, I just discovered this and all I can say is thank you.
A tip for users of distros without python-wbem (like Ubuntu):
Get it from http://sourceforge.net/project/showfiles.php?group_id=133883
and install it with `python setup.py install`
I just saw this script. I have a few newbie questions:
1. Where on my ESXi host do I store this script?
2. Can I set up a cron job so that it runs the script at a certain time and then sends an email?
Thanks
After ESXi has been updated with this package from HP, hp-esxi4.0uX-bundle-1.1.zip (google the file name if you want to find it), extra classes have to be added to the script to also check the new features (storage): http://www.intellipool.se/forum/lofiversion/index.php/t1548.html
After this update, ESXi is aware of HP storage adapters and disks.
One of our servers now shows a warning in vSphere Client regarding storage (maybe a faulty battery or something), but shows OK using this script.
Any ideas anyone?
Hi,
I'm running the current HP VMware ESXi 4.0.0 build-208167 and tried to force a storage error by pulling one disk of a mirror, and also pulled the plug from one power supply.
Unfortunately only the power supply is reported as a CRITICAL error, not the pulled disk. The disk failure is noticed, but not flagged as CRITICAL:
(output excerpt of check_esx_wbem.py verbose)
20091222 15:16:19 Check classe VMware_StorageExtent
20091222 15:16:20 Element Name = Disk 1 on HPSA1 : Port 1I Box 1 Bay 1 : 419GB : Data Disk
20091222 15:16:20 Element Name = Disk 2 on HPSA1 : Port 1I Box 1 Bay 2 : 419GB : Data Disk
20091222 15:16:20 Element Name = Disk 3 on HPSA1 : Port 1I Box 1 Bay 3 : 0GB : Data Disk : Disk Error
20091222 15:16:20 Element Name = Disk 4 on HPSA1 : Port 1I Box 1 Bay 4 : 931GB : Data Disk
20091222 15:16:20 Element Name = Disk 5 on HPSA1 : Port 2I Box 1 Bay 5 : 931GB : Data Disk
20091222 15:16:20 Element Name = Disk 6 on HPSA1 : Port 2I Box 1 Bay 6 : 931GB : Data Disk
20091222 15:16:20 Check classe VMware_Controller
20091222 15:16:20 Element Name = HP Smart Array P410i Controller : HPSA1
20091222 15:16:20 Check classe VMware_StorageVolume
20091222 15:16:20 Element Name = Logical Volume 1 on HPSA1 : RAID 1 : 419GB : Disk 1,2
20091222 15:16:20 Element Name = Logical Volume 2 on HPSA1 : RAID 1 : 931GB : Disk 3,4 : Interim Recovery
20091222 15:16:20 Element Name = Logical Volume 3 on HPSA1 : RAID 1 : 931GB : Disk 5,6
CRITICAL : Power Supply 1 Power Supply 1: Failure detected
CRITICAL : Power Supply 1
Does somebody already have a solution for this?
Cheers,
-Matthias
That's because the HP agents report disk failures in the label instead of in the class status... I own Dell servers, which makes me lazy about modifying the script... Maybe someone who owns HP servers can help modify it.
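One possible workaround, sketched under assumptions: scan each ElementName for failure text, since that is where the HP provider puts it. The keywords come only from the verbose excerpt posted above ("Disk Error", "Interim Recovery"); other HP labels surely exist and would need adding, and the severity given to each keyword is a guess. `label_status` is a hypothetical helper, not part of the script.

```python
# Keywords seen in the verbose output above. The severity assigned to each
# is an assumption made for this sketch.
CRITICAL_WORDS = ("Disk Error",)
WARNING_WORDS = ("Interim Recovery",)


def label_status(element_name):
    """Guess a status from the text of an ElementName label."""
    if any(word in element_name for word in CRITICAL_WORDS):
        return "CRITICAL"
    if any(word in element_name for word in WARNING_WORDS):
        return "WARNING"
    return "OK"


labels = [
    "Disk 2 on HPSA1 : Port 1I Box 1 Bay 2 : 419GB : Data Disk",
    "Disk 3 on HPSA1 : Port 1I Box 1 Bay 3 : 0GB : Data Disk : Disk Error",
    "Logical Volume 2 on HPSA1 : RAID 1 : 931GB : Disk 3,4 : Interim Recovery",
]
for label in labels:
    print("%s : %s" % (label_status(label), label))
```

Matching on free text is fragile, so this would best be applied only to the storage classes (VMware_StorageExtent, VMware_StorageVolume) and combined with the existing status check rather than replacing it.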
Works great, thanks! I just set it up on Fedora using the python-pywbem package linked here: https://bugzilla.redhat.com/245688 (should be in Fedora soon). My next step will be building python-pywbem on RHEL5 and trying it there.