VMware

Monitoring ESXi with Python script

VERSION 5 Published

Created on: Aug 20, 2008 7:53 AM by couak - Last Modified:  Jan 18, 2009 11:58 AM by couak

I wrote a Python script to monitor my free ESXi servers. It is my very first Python script (I'm used to writing Perl), so please be indulgent. The script was written for Nagios-oriented monitoring, but you can easily adapt it to another monitoring tool by redefining the exit codes.
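
For readers adapting the script to another tool, here is a minimal sketch of the exit-code convention involved. The four Nagios plugin codes are the standard ones; the `UP_DOWN_CODES` mapping and the `translate_exit_code` helper are hypothetical names used only for illustration and are not part of the original script.

```python
# Standard Nagios plugin exit codes.
EXIT_OK = 0
EXIT_WARNING = 1
EXIT_CRITICAL = 2
EXIT_UNKNOWN = 3

# Hypothetical mapping for a tool that only understands 0 (up) and 1 (down).
UP_DOWN_CODES = {
    EXIT_OK: 0,
    EXIT_WARNING: 0,
    EXIT_CRITICAL: 1,
    EXIT_UNKNOWN: 1,
}


def translate_exit_code(nagios_code, mapping=UP_DOWN_CODES):
    """Translate a Nagios exit code; unmapped codes are treated as 'down' (1)."""
    return mapping.get(nagios_code, 1)
```

A wrapper for the other tool could then end with something like `sys.exit(translate_exit_code(status))`, where `status` is whatever the check computed.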

Jan 10, 2009 12:34 PM joshuadf  says:

Works great, thanks! I just set it up on Fedora using the python-pywbem package linked here: https://bugzilla.redhat.com/245688 (should be in Fedora soon). My next step will be building python-pywbem on RHEL5 and trying it there.

Jan 18, 2009 11:53 AM stephen_c01  says:

I love the script; I use it from Windows with WhatsUp. I wrote a WhatsUp wrapper to use it.

http://www.stephenjc.com/2009/01/whatsup-vmware-esxi-monitor-these.html

Jan 21, 2009 2:59 PM themali  says:

Your script works great from the command line, but I ran into problems when trying to define the check command for Nagios. Would you mind sharing how you got it to work?

Edit: I keep getting a (null) return :(

Thanks

Feb 2, 2009 12:15 PM ipman  says:

Awesome work! Works great from a Linux box. I did receive this error from a Windows box:
C:\Python26\lib\site-packages\pywbem\cim_types.py:164: DeprecationWarning: object.__init__() takes no parameters
int.__init__(self, arg, base)

However the command appears to have completed successfully regardless.

Feb 18, 2009 12:00 AM moot  says:

Works in Zenoss too:
http://linuxtrek1.blogspot.com/2009/02/zenoss-monitor-free-esxi-version.html

Feb 19, 2009 4:50 PM joshuadf  says: in response to: stephen_c01

It works great on Red Hat EL5 with the rpm above, or even on the ancient EL4 if you install the older pywbem-0.5 for python-2.3.

I also made a small improvement (full script available at http://staff.washington.edu/joshuadf/esxi/ ) to catch problems with EnumerateInstances. This catches AuthError for wrong password and should also work for the CIM_Memory problem described at http://communities.vmware.com/message/1069795 and http://communities.vmware.com/thread/163730

+ try:
      instance_list = wbemclient.EnumerateInstances(classe)
+ except pywbem.cim_operations.CIMError,args:
+     verboseoutput("Unknown CIM Error: %s" % args, verbose)
+ except pywbem.cim_http.AuthError,arg:
+     verboseoutput("GLobal exit set to CRITICAL", verbose)
+     GlobalStatus = ExitCritical
+     ExitMsg += "CRITICAL : AuthError: %s\n" % arg
+ else:

By the way, at least on my Dell PowerEdge 2950s, you can also query these classes:
'OMC_Fan',
'OMC_PowerSupply',
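
The error handling in the patch above can be sketched as a small standalone function. This is only an illustration of the pattern, written in Python 3 syntax (`except ... as`) rather than the older comma form used in the patch. `StubCIMError`, `StubAuthError` and `StubClient` are hypothetical stand-ins so the sketch runs without pywbem installed; in the real script they would be `pywbem.cim_operations.CIMError`, `pywbem.cim_http.AuthError` and the script's own `wbemclient`.

```python
EXIT_CRITICAL = 2  # standard Nagios code for CRITICAL


class StubCIMError(Exception):
    """Stand-in for pywbem.cim_operations.CIMError (assumption for this sketch)."""


class StubAuthError(Exception):
    """Stand-in for pywbem.cim_http.AuthError (assumption for this sketch)."""


class StubClient:
    """Hypothetical client that fails in a way chosen per class name."""

    def __init__(self, failures):
        self.failures = failures  # maps class name -> exception to raise

    def EnumerateInstances(self, class_name):
        if class_name in self.failures:
            raise self.failures[class_name]
        return ["%s instance" % class_name]


def enumerate_safely(client, class_name, log=print):
    """Return (instances, problem); problem is None or (exit_code, message)."""
    try:
        instances = client.EnumerateInstances(class_name)
    except StubCIMError as exc:
        # As in the patch: log the error and move on without changing status.
        log("Unknown CIM Error: %s" % exc)
        return [], None
    except StubAuthError as exc:
        # As in the patch: a wrong password makes the whole check CRITICAL.
        return [], (EXIT_CRITICAL, "CRITICAL : AuthError: %s\n" % exc)
    return instances, None
```

Returning the problem instead of mutating globals is a choice made for this sketch only; the patch itself updates `GlobalStatus` and `ExitMsg` directly.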

Jun 17, 2009 3:04 AM maxzam  says: in response to: ipman

Fixed it!
Download pywbem 0.7 (pywbem-0.7.0.tar.gz), open the file cim_types.py, copy the "# CIM integer types" section, and use it to replace the same section in your 0.6 copy at "C:\Python26\Lib\site-packages\pywbem\cim_types.py".
The error no longer appears.

=========================================================================

Max

Jun 26, 2009 2:33 PM mhanby  says: in response to: themali

I'm having the same result via Nagios, (null), but from the command line I get OK, or in verbose mode see all of the checks.

Here's how I have the command defined in commands.cfg:

define command{
command_name check_esx_wbem
command_line $USER1$/check_esx_wbem.py https://$HOSTADDRESS:5989 $ARG2$ $ARG3$
}

And the check as defined for one of my ESXi servers

# username and password masked
define service{
use linux-critical-server-service
host_name esx01
service_description ESXi Hardware Monitor
check_command check_esx_wbem!readonlyuser!somepassword
}
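
As a rough aid for checking definitions like the two above, here is a simplified model of how Nagios maps the `!`-separated parts of `check_command` onto `$ARG1$`, `$ARG2$`, and so on. It is a sketch, not Nagios's actual implementation: the function name `expand_arg_macros` is hypothetical, and it ignores escaped `\!` and every macro other than `$ARGn$`.

```python
import re


def expand_arg_macros(check_command, command_line):
    """Return (command_name, command_line with $ARGn$ macros substituted)."""
    # The first "!"-separated field is the command name, the rest are arguments.
    name, *args = check_command.split("!")

    def replace(match):
        index = int(match.group(1)) - 1
        # An $ARGn$ with no matching argument expands to an empty string.
        return args[index] if 0 <= index < len(args) else ""

    return name, re.sub(r"\$ARG(\d+)\$", replace, command_line)
```

Under this model, a `check_command` with two arguments fills `$ARG1$` and `$ARG2$` and leaves `$ARG3$` empty, which is worth comparing against the macros used in `command_line`.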

Jun 26, 2009 2:50 PM mhanby  says: in response to: mhanby

I modified the script by adding the try/except block and now it works through Nagios, which is strange because I didn't change any of the Nagios configuration.

Thanks to Joshua:
http://staff.washington.edu/joshuadf/esxi/check_esx_wbem.py

Jul 1, 2009 7:16 AM voro  says:

Wow, I just discovered this and all I can say is thank you.

A tip for users of distros without python-wbem (like Ubuntu):
Get it from http://sourceforge.net/project/showfiles.php?group_id=133883
and install it with `python setup.py install`

Jul 26, 2009 12:50 PM cookieme  says: in response to: voro

I just saw this script. I have a few newbie questions:

1. Where on my ESXi host do I store this script?

2. Can I setup a cron job so that it runs the script at a certain time and then send an email?

Thanks

Jul 30, 2009 4:41 AM larvel  says:

After ESXi has been updated with this package from HP: hp-esxi4.0uX-bundle-1.1.zip (google the file name if you want to find it), extra classes have to be added to the script to also check the new features (storage): http://www.intellipool.se/forum/lofiversion/index.php/t1548.html

After this update ESXi is aware of HP storage adapters and disks.
One of our servers now shows a warning in vSphere Client regarding storage (maybe a faulty battery or something), but shows OK using this script.

Any ideas anyone?

Dec 22, 2009 7:24 AM mflacke  says:

Hi,

I'm running the current HP VMware ESXi 4.0.0 build-208167, and tried to force a storage error by pulling one disk of a mirror, and by pulling the plug from one power supply.

Unfortunately only the power plug is shown as a CRITICAL error, but not the pulled disk. It is noticed, but not flagged as CRITICAL:
(output excerpt of check_esx_wbem.py verbose)
20091222 15:16:19 Check classe VMware_StorageExtent
20091222 15:16:20 Element Name = Disk 1 on HPSA1 : Port 1I Box 1 Bay 1 : 419GB : Data Disk
20091222 15:16:20 Element Name = Disk 2 on HPSA1 : Port 1I Box 1 Bay 2 : 419GB : Data Disk
20091222 15:16:20 Element Name = Disk 3 on HPSA1 : Port 1I Box 1 Bay 3 : 0GB : Data Disk : Disk Error
20091222 15:16:20 Element Name = Disk 4 on HPSA1 : Port 1I Box 1 Bay 4 : 931GB : Data Disk
20091222 15:16:20 Element Name = Disk 5 on HPSA1 : Port 2I Box 1 Bay 5 : 931GB : Data Disk
20091222 15:16:20 Element Name = Disk 6 on HPSA1 : Port 2I Box 1 Bay 6 : 931GB : Data Disk
20091222 15:16:20 Check classe VMware_Controller
20091222 15:16:20 Element Name = HP Smart Array P410i Controller : HPSA1
20091222 15:16:20 Check classe VMware_StorageVolume
20091222 15:16:20 Element Name = Logical Volume 1 on HPSA1 : RAID 1 : 419GB : Disk 1,2
20091222 15:16:20 Element Name = Logical Volume 2 on HPSA1 : RAID 1 : 931GB : Disk 3,4 : Interim Recovery
20091222 15:16:20 Element Name = Logical Volume 3 on HPSA1 : RAID 1 : 931GB : Disk 5,6
CRITICAL : Power Supply 1 Power Supply 1: Failure detected
CRITICAL : Power Supply 1

Does somebody already have a solution for this?

Cheers,
-Matthias

Dec 22, 2009 7:44 AM couak  says: in response to: mflacke

That's because the HP agents report disk failure in the label instead of in the class status... Currently I only own Dell servers, which makes me lazy about modifying the script... Maybe someone who owns HP servers can help modify it.
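
One possible direction, sketched under assumptions: scan each element's label for the failure text seen in the output excerpt above. The substrings "Disk Error" and "Interim Recovery" are taken from that excerpt; the severities assigned to them and the names `SUSPECT_LABELS` and `status_from_label` are hypothetical, and other HP agent versions may use different wording.

```python
EXIT_OK, EXIT_WARNING, EXIT_CRITICAL = 0, 1, 2  # standard Nagios codes

# Substrings observed in the HP labels above; the severities are assumptions.
SUSPECT_LABELS = {
    "Disk Error": EXIT_CRITICAL,
    "Interim Recovery": EXIT_WARNING,
}


def status_from_label(element_name):
    """Return the worst status implied by known substrings in the label."""
    worst = EXIT_OK
    for text, status in SUSPECT_LABELS.items():
        if text in element_name:
            # OK < WARNING < CRITICAL, so max() picks the more severe one.
            worst = max(worst, status)
    return worst
```

In the script this check would run next to the existing class status test, using the same element name string that the verbose output prints.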

Dec 22, 2009 7:49 AM larvel  says: in response to: couak

What confuses me is that I have 4 other ESXi servers running on HP hardware. The warning only shows on this one server.
