VMware Cloud Community
PezJunkie
Enthusiast

ESX 4 and IPMI / Dell OpenManage Server Administrator

Does anybody else use the Dell OpenManage Server Administrator software on their ESX hosts?

I am doing a clean/new install of ESX 4 on a Dell PowerEdge R900 server. I have installed OMSA without errors, but when I try to start the services I get a "Starting ipmi driver: " message. The web interface is up and running, but it doesn't report any hardware info.
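
(For reference, I'm starting the services with the stock OMSA control script; the exact script name is assumed from a default install:)

# srvadmin-services.sh start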

It acts like IPMI isn't installed at all, but an rpm query shows:

vmware-esx-drivers-ipmi-ipmi-msghandler-400.39.1vmw-1.0.4.164009
vmware-esx-drivers-ipmi-ipmi-si-drv-400.39.1vmw-1.0.4.164009
vmware-esx-drivers-ipmi-ipmi-devintf-400.39.1vmw-1.0.4.164009

srvadmin-ipmi-5.5.0-364.DUP (Shows up after running the OMSA install)
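
(That query is just a case-insensitive grep across the installed package list, along the lines of:)

# rpm -qa | grep -i ipmi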

I tried installing OpenIPMI via the srvadmin-openipmi.sh script and got:

Status: OpenIPMI driver module is not loaded in the kernel.
Recommended action: If OpenIPMI modules are available on the system,
execute 'modprobe' command to add modules to the kernel.

I looked in "/lib/modules/2.6.18-128.ESX/kernel/drivers/" for an IPMI driver to modprobe, but can't find any.
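
(The search amounted to something like the following, and nothing IPMI-related turns up under the console kernel's module tree:)

# find /lib/modules/2.6.18-128.ESX/ -iname '*ipmi*'        (returns nothing here)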

Anybody have any ideas on where to go from here?


Accepted Solutions
Schorschi
Expert

We confirmed with Dell a couple of months ago that 6.0.3 is required for ESX 4 Classic; it is due out within 30 days of the GA release of ESX 4 Classic. So I would wait for 6.0.3, unless it is already available. Dell said they expected it to be out soon after the GA release.

45 Replies
AndreTheGiant
Immortal

Have you tried OMSA 5.6?

Andre

**if you found this or any other answer useful please consider allocating points for helpful or correct answers

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
PezJunkie
Enthusiast

Where do I go to find 5.6? I only see 5.5 when I go to Support and search for the R900.

AndreTheGiant
Immortal

http://ftp.dell.com/sysman/OM_6.0.1_ManNode_A00.tar.gz

On ESX 3.5 the 6.0.0 version doesn't work.

I haven't tried it yet on ESX 4.

Andre

**if you found this or any other answer useful please consider allocating points for helpful or correct answers

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
PezJunkie
Enthusiast

6.0.1 actually works even less well than 5.5.0.

I get "Unsupported Operating System. Cannot proceed with install."

AndreTheGiant
Immortal

OK, the same error as on ESX 3.5.

I think we have to wait for an update.

Anyway, you have the Health Status page to monitor physical sensors.

Andre

**if you found this or any other answer useful please consider allocating points for helpful or correct answers

Andrew | http://about.me/amauro | http://vinfrastructure.it/ | @Andrea_Mauro
larryl0099
Contributor

I believe that 6.0.1 was the first release of OMSA for the new Intel CPU servers; 6.0.3 is available as well, I think, but I'm sure that both only work on the new servers. I seem to remember seeing an ESX OMSA installation download as well, but I can't find it at the moment.

In fact, from the compatibility option on the Dell support site for the 6.0.1 download:

Systems

PowerEdge M610

PowerEdge M710

PowerEdge R610

PowerEdge R710

PowerEdge T610

Operating systems

Red Hat Enterprise Linux 4

Novell SuSE Linux ES 9 SP4 x86_64

Novell SuSE Linux ES 9

Novell SuSE Linux ES 10

Red Hat Enterprise Linux 4.7

Red Hat Enterprise Linux 5.2

Red Hat Enterprise Linux 5

Novell SuSE Linux ES 10 SP2 x86_64

EDIT

Found the information on 6.0.3; it seems to be the release of 6.0 for ESX 3.5, no mention of 4 though ;(

http://support.dell.com/support/edocs/software/svradmin/6.0.3/ug/html/index.htm

Schorschi
Expert

We confirmed with Dell a couple of months ago that 6.0.3 is required for ESX 4 Classic; it is due out within 30 days of the GA release of ESX 4 Classic. So I would wait for 6.0.3, unless it is already available. Dell said they expected it to be out soon after the GA release.

PezJunkie
Enthusiast

AndreTheGiant wrote:

OK, the same error as on ESX 3.5.

I think we have to wait for an update.

Anyway, you have the Health Status page to monitor physical sensors.

Even the health status page in the vSphere client is lacking for my ESX 4 server when compared to the same model server running ESX 3.5 u4.

The ESX 4 server only shows the 2 CPUs and 5 Software Components.

The ESX 3.5 server has Memory, Storage, Temperature, Fans, etc... all of which are missing for the ESX 4 server.

jasoncllsystems
Enthusiast

I have tried versions 5.5 and 6.0.1 and both do NOT work.

Released April 2008: http://www.dell.com/downloads/global/solutions/installing_dell_openmanage_on_esx.pdf

Regards,

MALAYSIA VMware Communities http://www.malaysiavm.com

CLL SYSTEMS http://www.cllsystems.com

      • If you found this or any other answer useful please consider allocating points for helpful or correct answers ***

http://www.malaysiavm.com
filbo
Enthusiast

PezJunkie wrote:

Even the health status page in the vSphere client is lacking for my ESX 4 server when compared to the same model server running ESX 3.5 u4.

The ESX 4 server only shows the 2 CPUs and 5 Software Components.

The ESX 3.5 server has Memory, Storage, Temperature, Fans, etc... all of which are missing for the ESX 4 server.

This is what you would see if the IPMI drivers were loaded on your ESX 3.5 server and not loaded on ESX 4.0.

(Instructions given here for ESX/ESXi 3.5/4.0, for other readers who won't have the same releases as you.)

ESX 3.5 Classic uses vmnix Linux kernel IPMI drivers. Run lsmod | grep ipmi to check that they are loaded. I am certain you will see 3 drivers (ipmi_msghandler, ipmi_devintf, ipmi_si_drv).

ESXi 3.5, ESX 4.0 Classic and ESXi 4.0 all use a vmkernel port of the IPMI drivers. Run vmkload_mod -l | grep ipmi to check that they are loaded. I suspect that they won't show up on your 4.0 system.
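
Collected in one place, the two checks are just:

ESX 3.5 Classic:
# lsmod | grep ipmi

ESXi 3.5 / ESX 4.0 Classic / ESXi 4.0:
# vmkload_mod -l | grep ipmi

A host with working IPMI lists all three IPMI modules either way.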

You can also look for logged messages in /var/log/messages (except on ESX 4.0 Classic, where they would be in /var/log/vmkernel). The driver startup messages look similar to this reduced /var/log/messages excerpt:

ipmi message handler version 39

ipmi device interface version 39

IPMI System Interface driver version 39, KCS version 39, SMIC version 39, BT version 39

ipmi_si: Found SMBIOS-specified state machine at I/O address 0xca2

IPMI kcs interface initialized

The version numbers will be "39.1" on ESX/ESXi 4.0. On all of these releases except ESX 3.5 Classic there will also be a bunch of vmkernel module loading commentary, stuff like:

Loading module ipmi_msghandler ...

<ipmi_msghandler> symbols tagged as <GPL>

module heap : Initial heap size : 102400, max heap size: 4194304

module heap ipmi_msghandler: creation succeeded. id = 0x4100ba000000

Initialization for ipmi_msghandler succeeded with module ID 48.

ipmi_msghandler loaded successfully.

These aren't important if the loads succeed -- I just mention them so you aren't surprised.

Also remember that the logs are rotated over time -- if your host has been up for a while you will have to search messages.1 etc., possibly uncompressing them; and if it's been up a long time they might be completely gone.
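
A quick sweep of the current and rotated logs might look like this (a sketch; it assumes the older rotations are gzip-compressed, which may or may not be the case on a given host):

# grep -i ipmi /var/log/messages /var/log/messages.* /var/log/vmkernel /var/log/vmkernel.* 2>/dev/null
# zgrep -i ipmi /var/log/messages.*.gz /var/log/vmkernel.*.gz 2>/dev/null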

Finally, if IPMI is not loaded and you think it should be (as in your case), you can manually load it with:

o ESX 3.5 Classic:

service ipmi start

o ESXi 3.5:

vmkload_mod -k ipmi_msghandler
vmkload_mod ipmi_si_drv
vmkload_mod ipmi_devintf

o ESX 4.0 Classic and ESXi 4.0:

esxcfg-init -I

Watch for related messages on the text console current screen, text console screen 12 (dynamic vmkernel log), and in /var/log/messages or /var/log/vmkernel.

The possibilities on your 4.0 system are:

o The machine just doesn't have an IPMI BMC. No matter what ESX release you load, IPMI driver load will fail.

o ESX 4.0 doesn't recognize that it has an IPMI BMC. Check BIOS setup to see if it's disabled. Post the driver load failure messages here. Run dmidecode from the ESX console OS or from a random Linux 2.6 distro live CD: is IPMI mentioned? (A short filtering sketch follows this list.) My SuperMicro box with IPMI, currently running ESX 3.5 Classic, shows:

Handle 0x0033
        DMI type 38, 18 bytes.
        IPMI Device Information
                Interface Type: KCS (Keyboard Control Style)
                Specification Version: 2.0
                I2C Slave Address: 0x10
                NV Storage Device: Not Present
                Base Address: 0x0000000000000CA2 (I/O)
                Register Spacing: Successive Byte Boundaries

o ESX 4.0 does recognize the BMC, but something goes wrong during loading. Proceed as for non-recognition.
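
As a shortcut for the dmidecode check above, something like this narrows the output to the IPMI record (the -t option assumes a reasonably recent dmidecode build; the grep form should work on older ones):

# dmidecode -t 38                          (DMI type 38 = IPMI Device Information)
# dmidecode | grep -i -A 8 'IPMI Device'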

>Bela<

PezJunkie
Enthusiast

I have two identical servers (Dell R900s). One has ESX 3.5 u4 installed and the other has ESX 4. The BIOS settings are the same on both boxes. I'm OK with waiting on a new version of the Dell OpenManage software, but the fact that the Health Status in the vSphere Client is so lacking for the ESX 4 server does concern me that maybe the new IPMI driver isn't playing nice with my BMC. (EDIT - never mind, this appears to be caused by the Dell OpenManage uninstall script)

During boot-up, I can see a line that says IPMI is loading/starting successfully.

I did find this in /var/log/messages:

May 26 12:08:42 esx13 sfcb[7536]: RawIpmiProvider::initialize: No IPMI Interface. Will not be polling. Error Message: File /dev/ipmi0 not found
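
A quick look from the console for the character device that sfcb is complaining about confirms it isn't there:

# ls -l /dev/ipmi0          (no such file or directory on this host)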

Here's IPMI loading on bootup in /var/log/vmkernel:

May 21 09:20:17 esx13 vmkernel: 0:00:00:40.445 cpu0:4096)VMNIX: Logger: 475: sysboot: ipmi ...
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.550 cpu1:4110)Loading module ipmi_msghandler ...
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.551 cpu1:4110)Elf: 2320: <ipmi_msghandler> symbols tagged as <GPL>
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.567 cpu1:4110)module heap ipmi_msghandler: creation succeeded. id = 0x4100b9c00000
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.567 cpu1:4110)<6>ipmi message handler version 39.1
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.567 cpu1:4110)Mod: 2892: Initialization for ipmi_msghandler succeeded with module ID 48.
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.567 cpu1:4110)ipmi_msghandler loaded successfully.
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.607 cpu1:4110)Loading module ipmi_si_drv ...
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.607 cpu1:4110)Elf: 2320: <ipmi_si_drv> symbols tagged as <GPL>
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.623 cpu2:4110)module heap ipmi_si_drv: creation succeeded. id = 0x4100bd800000
May 21 09:20:17 esx13 vmkernel: 0:00:00:40.623 cpu2:4110)<6>ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xca8, slave address 0x20, irq 0
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.038 cpu2:4110)<6>ipmi: Found new BMC (man_id: 0x 0002a2, prod_id: 0x 0100, dev_id: 0x20)
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.038 cpu2:4110)PCI: driver ipmi_si is looking for devices
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.038 cpu2:4110)PCI: driver ipmi_si claimed 0 device
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.038 cpu2:4110)Mod: 2892: Initialization for ipmi_si_drv succeeded with module ID 49.
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.038 cpu2:4110)ipmi_si_drv loaded successfully.
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.060 cpu2:4110)Loading module ipmi_devintf ...
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.060 cpu2:4110)Elf: 2320: <ipmi_devintf> symbols tagged as <GPL>
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.076 cpu2:4110)module heap ipmi_devintf: creation succeeded. id = 0x4100bdc00000
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.076 cpu2:4110)<6>ipmi device interface
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.076 cpu2:4110)Mod: 2892: Initialization for ipmi_devintf succeeded with module ID 50.
May 21 09:20:17 esx13 vmkernel: 0:00:00:41.076 cpu2:4110)ipmi_devintf loaded successfully.

esxcfg-init -I returns:

vmkload_mod: Can not load module ipmi_msghandler: module is already loaded
Error running operation: Exec of command '/usr/sbin/vmkload_mod -e ipmi_msghandler ' succeeded, but returned with non-zero status: 1

vmkload_mod returns:

ipmi_msghandler 0x418002201000 0x9000 0x417fc2fdf2e0 0x1000 48 Yes
ipmi_si_drv 0x41800220a000 0x9000 0x417fc2fe0b00 0x1000 49 Yes
ipmi_devintf 0x418002213000 0x3000 0x417fc2fe1b40 0x1000 50 Yes

dmidecode returns:

Handle 0x2600, DMI type 38, 18 bytes.
IPMI Device Information
Interface Type: KCS (Keyboard Control Style)
Specification Version: 2.0
I2C Slave Address: 0x10
NV Storage Device: Not Present
Base Address: 0x0000000000000CA8 (I/O)
Register Spacing: 32-bit Boundaries

PezJunkie
Enthusiast

After testing with a different server, it appears that the problem with the Health Status page only presents itself after installing and then uninstalling the Dell OpenManage software.

I guess I'm just waiting on Dell at this point.

Thanks to everybody for your help!

filbo
Enthusiast

PezJunkie wrote:

After testing with a different server, it appears that the problem with the Health Status page only presents itself after installing and then uninstalling the Dell OpenManage software.

Did that cause a permanent problem (persists on the ESX 4 system even after reboot), or is it just that after removing OM you must reboot to clear the system's head?

Dell isn't yet shipping a release of OM targeted at ESX Classic 4.0. The version for ESX 3.5 should be expected to work only if Dell claims that it does.

It sounds like the OM removal script deleted /dev/ipmi0. If that's all it did to the system, you should be able to repair it by running:

# grep ipmi /etc/init.d/vmware          (preview the commands that will be executed)

1. grep ipmi /etc/init.d/vmware | sh

2. service sfcbd-watchdog restart

>Bela<

PezJunkie
Enthusiast

You're correct... the OpenManage uninstall script is deleting /dev/ipmi0. It remains broken even after a reboot.

These servers aren't in production yet, so I went ahead and reinstalled ESX to ensure that everything is clean & back the way that it should be.

filbo
Enthusiast

PezJunkie wrote:

You're correct... the OpenManage uninstall script is deleting /dev/ipmi0. It remains broken even after a reboot.

Hmmm, that's not good.

These servers aren't in production yet, so I went ahead and reinstalled ESX to ensure that everything is clean & back the way that it should be.

Ok, so you're fine but this problem remains lurking for the next person. If you want to do a public service, install OM again on one of those hosts, remove it, and let's see if we can reconstruct without having to do a full reinstall...

>Bela<

Schorschi
Expert

I believe OMSA 6.1 will officially support ESX 4.0. Due out first week of July, I believe.

PezJunkie
Enthusiast

filbo wrote:

Ok, so you're fine but this problem remains lurking for the next person. If you want to do a public service, install OM again on one of those hosts, remove it, and let's see if we can reconstruct without having to do a full reinstall...

After recreating /dev/ipmi0 per your instructions, it looks like the Health Status page in the vSphere Client is displaying everything properly again.

filbo
Enthusiast

PezJunkie wrote:

After recreating /dev/ipmi0 per your instructions, it looks like the Health Status page in the vSphere Client is displaying everything properly again.

Simply by creating the device node and restarting sfcbd?:

# grep ipmi /etc/init.d/vmware | sh
# service sfcbd-watchdog restart

That's peculiar. A reboot would have run through the full /etc/init.d/vmware script, which should also have recreated /dev/ipmi0. I don't see how to reconcile those facts.

So we have a workaround: if you mistakenly install (and then remove) Dell OM targeted at ESX Classic 3.5 onto your ESX 4.0 Classic host, recreate /dev/ipmi0 by running grep ipmi /etc/init.d/vmware | sh, then either reboot or run service sfcbd-watchdog restart. But it's not a fully understood workaround because a simple reboot should have done all the same things.
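
If that grep ever turns up nothing useful, the node can in principle be recreated by hand the way the generic OpenIPMI docs describe -- purely a sketch, and it assumes the console's /proc/devices actually exposes an ipmidev entry on ESX 4.0 Classic:

# grep ipmidev /proc/devices            (note the character major number listed)
# mknod -m 0600 /dev/ipmi0 c <major> 0
# service sfcbd-watchdog restart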

One detail that may have something to do with it: there is a long latency between starting the CIM broker and being able to see full health information in VI Client. Several different stages of the pipeline have to be filled, and some of them have cycles as long as 15 minutes. If all the cycles are in perfectly the right phase, you could see fresh health info within a couple of minutes; worst case, more like half an hour. So if your test of the proposition "IPMI is fixed" was "Do I have health info in VI Client?", you could easily have been tricked. I would recommend a test involving ipmitool -- see if ipmitool sdr produces reasonable output, for instance. (Install ipmitool RPM from RHEL5 -- or RHEL3 for ESX 3.5 Classic.) If that works then you can assume VI Client health info will eventually start working -- check it in 30 minutes.
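
Concretely, the local test could be as simple as this (assuming the ipmitool RPM mentioned above is installed and is talking to the local /dev/ipmi0 interface):

# ipmitool sdr              (dumps Sensor Data Repository entries: temperatures, fans, voltages, ...)
# ipmitool sensor           (similar listing, with thresholds)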

>Bela<

PezJunkie
Enthusiast

>I would recommend a test involving ipmitool -- see if ipmitool sdr produces reasonable output, for instance. (Install ipmitool RPM from RHEL5 -- or RHEL3 for ESX 3.5 Classic.) If that works then you can assume VI Client health info will eventually start working -- check it in 30 minutes.

I compared the output of ipmitool sdr on this box to the output on a different ESX 4 server that has never had OpenManage installed, and they are exactly the same.
