VMware Cloud Community
breakaway9000
Enthusiast

ESXi + IBM x3500 7977: RAID Monitoring

I'm using an IBM x3500 (machine type 7977) for ESXi. It has an IBM ServeRAID 8k controller in it, with 8 x 300GB 10,000RPM SAS drives.

Currently, the server runs Windows Server 2003. It has IBM's Java-based ServeRAID management tool, which notifies me when a disk is faulty, when the RAID controller battery is starting to fail, etc. It also has a feature where I can right-click the faulty drive on screen and click "Identify Drive", and it flashes a light on the chosen drive so I can see which physical drive it is.

If I switch the server to ESXi, will these features still be available? They're pretty critical. If yes, I guess I'll need some sort of driver/application combo from IBM: the driver installed on the server, and the management application on a networked PC that connects to the ESXi server. I'm reasonably well versed in working in the Linux shell, so I'm not afraid to get my hands dirty and recompile some drivers etc. if need be.

Is my understanding of this correct?

If yes to all the above, which one of these drivers do I need?

Thanks in advance - Any Insight Appreciated

1 Solution

Accepted Solutions
snowdog_2112
Enthusiast

In the x3500 M3, the IP for the IMM is set in the POST BIOS (i.e., press F1 for Setup at the POST screen).  I just gave it an IP on my subnet, and then pointed a browser to that IP.

For example - the one I am using for this testing is sitting on 192.168.1.52.

Point a browser to http://192.168.1.52 and you get the logon screen shown in the PDF I attached.

Configuring the email alerts is under Network Protocols in the menu on the left.  Just set an SMTP server (make sure your SMTP server allows relay from the IP address of the IMM).  Then configure the Alerts to add an alert for your email address and the level of alerting you want.
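Before relying on the IMM's alerts, it can help to confirm that the SMTP server accepts mail at all from that subnet. The following is a minimal Python sketch meant to be run from a PC on the same network (it is not something the IMM itself runs). The host name `smtp.example.local` and both mail addresses are placeholders, and `build_test_alert` / `send_test_alert` are hypothetical helper names. Note that a successful send only proves the server accepts mail from the PC you run it on; relay permission for the IMM's own IP still has to be granted on the SMTP server.

```python
import smtplib
from email.message import EmailMessage


def build_test_alert(sender, recipient, imm_ip):
    """Compose a plain-text test message that mentions the IMM being set up."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Test alert path for IMM at {imm_ip}"
    msg.set_content("If you can read this, the SMTP server accepted the message.")
    return msg


def send_test_alert(msg, smtp_host, port=25, timeout=10):
    """Hand the message to the SMTP server; raises smtplib.SMTPException on refusal."""
    with smtplib.SMTP(smtp_host, port, timeout=timeout) as server:
        server.send_message(msg)


if __name__ == "__main__":
    # All addresses below are placeholders for illustration only.
    message = build_test_alert(
        "imm-test@example.local", "admin@example.local", "192.168.1.52"
    )
    print(message["Subject"])
    # Uncomment once smtp_host points at a real server:
    # send_test_alert(message, "smtp.example.local")
```

If the server refuses relay, `send_test_alert` raises an exception, which is usually quicker to diagnose than waiting for an IMM alert that never arrives.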

Again, this is on the newer 3500's with the IMM.  I have a couple of x3500's out in the field with the RSA-II modules, I believe the process is pretty much the same for those - though I can't recall if the settings for the IP address on the RSA is in the F1 - Setup menu at the IBM POST or not...

35 Replies
DSTAVERT
Immortal

There is an IBM-specific version of ESXi with hardware monitoring; however, there is no notification on hardware failure. Failures are displayed in the VI client, but there is no mechanism within the client for forwarding a notice. Hardware monitoring depends on the "newness" of the hardware: newer hardware has better support for the different components (NICs, fans, temperature probes, etc.). I don't know what IBM may have other than Tivoli that can provide notification. You can use Veeam Monitor, a free tool that can send email notifications on hardware and software issues.






Forum Upgrade Notice - We will be upgrading VMware Communities systems between 10-12 December 2010. During this time, the system will be placed in READ-ONLY mode.

-- David -- VMware Communities Moderator
breakaway9000
Enthusiast

Interesting, Veeam looks like a nice piece of software. However, it doesn't look like it will integrate with the RAID card (i.e. query array / battery status, and email me in the event of a failure like IBM's ServeRAID manager does on Windows). Am I right in assuming this?

Also is there no way at all of setting up e-mail notification? Even something hacked together such as a cron job that runs every few minutes making sure everything is okay is not an option?
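The kind of periodic check asked about here could look roughly like the Python sketch below: a function that a scheduler (cron or similar) calls every few minutes, which remembers the last status in a small file and only notifies when something changes. Everything in it is an assumption for illustration: `check_once` is a hypothetical name, the status strings "Optimal" and "Degraded" are made-up labels, and `read_status` stands in for whatever is actually able to query the controller, which is the unsolved part of this thread.

```python
from pathlib import Path


def check_once(read_status, notify, state_file, healthy="Optimal"):
    """Run one check and notify only when the status changed since the last run.

    read_status: callable returning the current status as a string (hypothetical).
    notify:      callable taking one message string, e.g. an email sender.
    state_file:  path of a file used to remember the status between runs.
    """
    state_path = Path(state_file)
    previous = state_path.read_text().strip() if state_path.exists() else None
    current = read_status()

    changed = previous is not None and current != previous
    first_run_bad = previous is None and current != healthy
    if changed or first_run_bad:
        notify(f"RAID status changed: {previous} -> {current}")

    # Remember the status so the next scheduled run can compare against it.
    state_path.write_text(current)
    return current


if __name__ == "__main__":
    import os
    import tempfile

    state = os.path.join(tempfile.mkdtemp(), "raid_state.txt")
    for fake_status in ("Optimal", "Degraded", "Degraded"):
        check_once(lambda s=fake_status: s, print, state)
```

Storing the last status in a file matters because each scheduled run is a separate process; without it, a degraded array would trigger a message on every run instead of once per change.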

The RAID array monitoring is the only thing that I've got left to iron out, then I can put this box into production.

DSTAVERT
Immortal

ESX(i) uses something similar to SNMP. You will see references to CIM (Common Information Model) in the forums. ESXi uses these CIM modules to communicate with the hardware. You won't be able to install anything directly on the ESXi host unless it has been designed specifically for ESXi, so no Java client. I haven't used the Veeam Monitor tool for a while, so I can't say whether it will work for your situation. Another option would be to purchase, at minimum, the VMware Essentials package ($495 or so), which includes vCenter Server. vCenter is capable of sending out notifications based on many triggers, including hardware conditions.






snowdog_2112
Enthusiast

Is there a definite answer to this?  I am in the same position - several IBM 3500/3650's with ServeRAID controllers and RAID arrays for my datastores.  I see a lot of "it should do x" or "you might be able to do y" to monitor the physical hardware.

I'm looking for a hard-and-fast solution to make sure that the RAID disks under my VM's are stable, and need to know via alerting if something at the hardware level fails.

Thanks!

breakaway9000
Enthusiast

I am installing ESXi in two days from now. I will let you know how well it works/doesn't work. Should be okay though.

In the meantime I suggest you look into Veeam monitor. Apparently it's free.

breakaway9000
Enthusiast

Okay, I just got done installing ESXi 4.1 on my IBM x3500 MT 7977. I installed the latest ESXi 4.1 version (the one customised for IBM systems), and I can confirm that there are no software warnings that a disk has failed.

Even inside the vSphere client, it sees the two 'enclosures' on the system (each takes 4), but it doesn't see each physical disk like the Windows RAID monitoring software used to. I currently have 8 x 279GB 10K RPM SAS drives in RAID10 (striped, then mirrored). I walked over to the server and pulled out disk0 to simulate a failure. I got a warning on the side of the server (LED indicating DASD/RAID failure turned on). The orange LED indicating a warning condition on the front of the server also turned on.

However ESXi was none the wiser, neither was the vSphere client.

If ESXi (i.e., the OS) cannot see my 8 physical disks, then how is it going to know when a drive fails?

I'll try to update the firmware of the ServeRAID 8k RAID controller and see if anything improves.

Edit: Apparently, your disks are supposed to show up like so:

health.PNG

But mine doesn't show up like that (note my vSphere client is missing "Storage")

healthfail.PNG

I'm downloading the latest firmware update for the IBM ServeRAID 8k card right now. Perhaps IBM updated it to add compatibility for ESX? The version I'm on right now is very old, from 2007 or earlier. There was a release in mid 2010. I will try that and update the thread.

breakaway9000
Enthusiast

Okay, I updated all the firmware to the latest versions and re-installed ESXi 4.1 IBM custom.

Still nothing. No disks showing up under "Storage" configuration.


Although ESXi sees the RAID controller, it sees the entire array as one large disk:

Untitled.png

It does see the two 'enclosures' though.

So that's it? There's no way to get it to work?

snowdog_2112
Enthusiast

breakaway9000,

Have you found a solution to this?

I've installed vCenter on a VM on my ESXi 4.1 host on my x3500 and configured the alarms.

I don't, however, want to simply pull a disk since it's in production.

Thanks!

breakaway9000
Enthusiast

Nope, I've gotten no further.

I started this thread http://communities.vmware.com/message/1666282 where someone suggested I install IBM Director. I did some research on IBM Director, and it turns out it interfaces with the IPMI (BMC - baseboard management controller) of the IBM x3500. This is the 3rd network port on the back, just near the dual gigabit ports. Supposedly, IBM Director can connect to this IPMI interface to read the status of the server and report the status of various hardware components such as the power supplies, disks, fans, etc.
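For anyone who wants to see what a BMC reports over IPMI without going through IBM Director, here is a rough Python sketch. It assumes the open-source `ipmitool` utility (not mentioned elsewhere in this thread) and its `sdr` listing, which prints one sensor per line as `name | reading | status`. The sample text and sensor names below are invented for illustration, and whether the x3500's BMC exposes any disk-related sensor at all is not something this sketch can answer. `parse_sdr` and `failing_sensors` are hypothetical helper names.

```python
# Invented sample in the "name | reading | status" layout of `ipmitool sdr`.
SAMPLE_SDR = """\
Ambient Temp     | 24 degrees C      | ok
Fan 1 Tach       | 3600 RPM          | ok
Planar 3.3V      | 3.35 Volts        | ok
DASD             | 0x01              | cr
"""

# ipmitool status codes treated as a problem here:
# nc = non-critical, cr = critical, nr = non-recoverable.
BAD_STATES = {"nc", "cr", "nr"}


def parse_sdr(text):
    """Split each 'name | reading | status' line into a 3-tuple; skip other lines."""
    readings = []
    for line in text.splitlines():
        parts = [part.strip() for part in line.split("|")]
        if len(parts) != 3:
            continue
        readings.append(tuple(parts))
    return readings


def failing_sensors(readings):
    """Return the names of sensors whose status is in BAD_STATES."""
    return [name for name, _reading, status in readings if status in BAD_STATES]


if __name__ == "__main__":
    print(failing_sensors(parse_sdr(SAMPLE_SDR)))
```

In practice the text would come from running `ipmitool` against the BMC's IP rather than from a hard-coded sample; the parsing step stays the same.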

However, IBM Director is not very intuitive to use - I can NOT for the life of me figure out where it's supposed to report these statistics. I've gotten it to recognise the BMC interface on my x3500 7977 ESXi 4.1 host, but I can't get any further. I'm not even sure if IBM Director indeed does what we want it to. I can't seem to find anyone who knows anything about it to be able to help either, and the documentation is... poor to say the least.

Seriously, remote hardware monitoring functionality is CRITICAL - I can't figure out why there is so little information on it. Any search I run for "x3500 + raid monitoring ESXi" ultimately leads back to this very thread we are reading right now.

And on top of everything, why does VMware state that ESXi is "fully compatible" with the ServeRAID 8k controller when in fact you can't monitor its hardware status from within vSphere?

snowdog_2112
Enthusiast

Is it just you and me that seem to think there is something missing in this picture?!?

Regarding Director - is that a free utility?  I have always thought it was a pricey option, and haven't looked into it.

Then again, it seems like adding another player to the mix and another potential point of failure.

I have a new x3650 with a RAID 10 that I will be setting up this week. I will try to duplicate your experiment, in which you pulled a disk from the running host, to see if vCenter behaves any differently.

If you do happen to find a workable solution, please shoot an update to this thread - I will do the same!


Thanks again.

breakaway9000
Enthusiast

IBM Director is free, give it a shot. Between the two of us perhaps we can figure something out. What RAID controller do you have in your x3650? (I have an x3500 w/ a ServeRAID 8K running 8 x 15K RPM 300GB SAS disks - all genuine IBM hardware.)

snowdog_2112
Enthusiast

I currently  have an x3500 m3 with the M1015 controller and SATA disks.

I am setting up an x3500 m3 with the M5014 controller and SAS disks tomorrow.

As with your setup - this is all IBM hardware. Do you run vCenter in your environment or is it all standalone ESXi?

Do you run Director in a VM on the gear you're monitoring?  Thanks for your time.

breakaway9000
Enthusiast

My vmware 'environment' simply consists of one x3500 7977 running ESXi 4.1 Hypervisor IBM custom.

IBM Director runs on a separate whitebox system which is running Windows Server 2008 and SQL Server 2005 (oh yeah, make sure you install SQL Server 2005 before installing IBM Director. If you don't, no error messages will be given during or after installation, you will simply get a generic error message when attempting to log in).

Also, I am now 99% positive IBM Director doesn't do what we need it to. After many hours of tinkering, I 'added' my x3500 to the "Inventory" in IBM Director, then ran a "View and Collect Inventory" operation on said system, and when the collect inventory operation completed, I was presented with this:

ibmd.png

Note how there is nothing pertaining to 'disk drives' anywhere in the menu; Also, I went through each menu item and there is nothing pertaining to disk drives under any heading.

I guess that's it then - it's just not possible? Unless someone can prove me wrong?

Edit: Since making this post I've been searching for "LSI Monitoring ESXi", "Adaptec Monitoring ESXi" etc. to see if there are any other cards that actually work as intended with ESXi. All I found were people moaning about a lack of monitoring support, from mid 2008 till now. Apparently there are cards on the official VMware compatibility / 'supported' list that work fine and all, but you can't monitor them worth a damn.

It appears that VMware and the hardware vendors of these cards are unwilling to work hand in hand to produce a product that works, so at the moment the only solution is to buy a card that you KNOW works well (i.e. someone else has it running).


I'm very disappointed with my first foray into VMware products.

snowdog_2112
Enthusiast

I'm having different results with this M5015 controller.

I built a RAID10 array, installed ESXi using the IBM-specific build from vmware.com, and pointed viClient to the management IP.

I initially had a "battery learning" warning on the Storage group.

Then...I pulled one of the disks.

The Health Status immediately went Alert.  I can send you the screenshots I've got, if you want to see them.

I'm going to build a VM on this box with vCenter and set up an Alarm with SNMP and email to see if it will raise alarms monitoring itself like that.

I'll keep you posted.  Let me know if you want my screenshots.  Thanks.

breakaway9000
Enthusiast

Yeah can you please post up the screenshots of the "Configuration" tab in vSphere (i.e. where you see the 'warning').

Also the M1015 RAID controller, does that sit in a PCI-e expansion slot or does it go into the daughterboard slot on the motherboard in the server?

I ask because I know the ServeRAID 8K is a daughterboard-type setup that sits in a proprietary IBM connector slot in the bottom right-hand corner of the motherboard.

snowdog_2112
Enthusiast

Here are three snippets with the status at fail - after I pulled the drive.  I don't have any VMs on the host, so I had to reboot and connect into the LSI BIOS to force it to rebuild when I reconnected the disk.  It's rebuilt now and all green in the health center.  The battery "learning" mode is normal according to the doc I've found (7 days to complete auto-learning).

I also found an LSI doc that describes how to install the MegaRaid monitoring app inside a VM.

http://www.lsi.com/DistributionSystem/User/AssetMgr.aspx?asset=52770

(page 235 - 237).  I'm trying this method right now to see if I can monitor/control the RAID controller from a VM directly on this host.

Installing the MSM on a remote box did not find the controller in my x3500, but I don't know which IP I should use - the IMM or the VMware management IP.  Neither seemed to work for me.

snowdog_2112
Enthusiast

Update:  I've made countless attempts to get anything other than vCenter to monitor or alert, with no success.

I can't get the MSM in Windows to see the ESXi host.

Apparently, it is possible to install the MegaRAID Storage Manager on either a VM or remote box, and point it at the management IP of the ESXi host.  I have yet to get this to work.

I've also installed an LSI driver to the ESXi host using the viupdatehost in Remote CLI (rCLI has to be on a remote box because the host has to be in maintenance mode).

There is also supposedly a MegaCLI utility for ESXi which is supposed to allow CLI commands to rebuild an array.  I've searched high and low for that, and have only found a MegaCLI for linux, which contains a file MegaCli (no file extension) and no instructions on where to put it or how to use it.

In my test case, a rebuild did not auto-start because I put the same disk in that I had pulled, so it had RAID data on it.

I hope you're having better luck than me!

Barring any breakthrough today, I may be forced to rely on the vCenter alarm for notification of a failure, and then schedule a reboot if an auto-rebuild does not start on its own.

Please let me know if you find any solution!  THANKS!!!

snowdog_2112
Enthusiast

I have some progress - though I still can't get any sort of alert/alarm to trigger when I pull a disk.

I was able to install the MegaCLI binary (free download) to the ESXi host, and I can get the disk, array, and adapter status, as well as replace disks and start a rebuild (assuming an auto-rebuild doesn't start on its own).
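For anyone wanting to turn that MegaCLI status output into something scriptable, here is a minimal Python sketch that parses the physical-drive listing (the output of `MegaCli -PDList -aALL`) and picks out drives that are not online. The sample text is a trimmed, illustrative imitation of that output, not a capture from my host, and the exact fields can differ between MegaCLI versions. `parse_pdlist` and `unhealthy` are hypothetical helper names, and treating "Online" and "Hotspare" as the only healthy states is an assumption.

```python
# Trimmed, illustrative imitation of `MegaCli -PDList -aALL` output.
SAMPLE_PDLIST = """\
Adapter #0

Enclosure Device ID: 252
Slot Number: 0
Firmware state: Online, Spun Up

Enclosure Device ID: 252
Slot Number: 1
Firmware state: Failed

Enclosure Device ID: 252
Slot Number: 2
Firmware state: Rebuild
"""


def parse_pdlist(text):
    """Map (enclosure id, slot number) to the drive's firmware state string."""
    drives = {}
    enclosure = None
    slot = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        key, value = key.strip(), value.strip()
        if key == "Enclosure Device ID":
            enclosure = value
        elif key == "Slot Number":
            slot = int(value)
        elif key == "Firmware state" and slot is not None:
            drives[(enclosure, slot)] = value
            slot = None  # wait for the next drive's slot line
    return drives


def unhealthy(drives, ok_prefixes=("Online", "Hotspare")):
    """Return only the drives whose state does not start with a healthy prefix."""
    return {
        where: state
        for where, state in drives.items()
        if not state.startswith(ok_prefixes)
    }


if __name__ == "__main__":
    for (enclosure, slot), state in unhealthy(parse_pdlist(SAMPLE_PDLIST)).items():
        print(f"enclosure {enclosure} slot {slot}: {state}")
```

Keying on enclosure and slot together avoids collisions on hosts with two enclosures that reuse the same slot numbers.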

I can also see the Health status change from Normal to Alert, but it doesn't trigger the alarm in vCenter (and hence, no SNMP trap or email is sent).

I installed Veeam's Monitor (free download) and pointed it at the ESXi host.  I get an alarm when I pull a power cord, but not when I pull a disk.  I assume this is related to the fact that there is a default alarm definition in vCenter for Power, but not for Storage.  The Veeam alarm is keyed on a change in overall host hardware health, and the system health goes to alert when I pull a disk, so I'm still confused on that one.
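To spell out why this is confusing, here is a small Python sketch of the two alarm models described above. It is only an illustration of the reasoning, not how vCenter or Veeam Monitor are actually implemented; the `Health` levels, the group names, and both functions are hypothetical.

```python
from enum import IntEnum


class Health(IntEnum):
    """Hypothetical severity levels, ordered so that max() picks the worst."""
    NORMAL = 0
    WARNING = 1
    ALERT = 2


def overall_health(components):
    """Overall host health modelled as the worst health of any component group."""
    return max(components.values(), default=Health.NORMAL)


def per_group_alarms(components, defined_groups):
    """Groups that would raise an alarm if alarms exist only for defined_groups."""
    return sorted(
        group
        for group in defined_groups
        if components.get(group, Health.NORMAL) != Health.NORMAL
    )


if __name__ == "__main__":
    after_pull = {"Power": Health.NORMAL, "Storage": Health.ALERT}
    # Alarm defined per group, with no Storage definition: nothing fires.
    print(per_group_alarms(after_pull, {"Power"}))
    # Alarm keyed on overall health: the pulled disk should be visible.
    print(overall_health(after_pull).name)
```

Under the per-group model, a missing Storage definition explains the silence; under the overall-health model, a pulled disk should have raised an alarm, which is the part that still does not add up.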

Let me know if you've made any headway.

breakaway9000
Enthusiast

Nope, nothing so far. I thought that perhaps it'd be possible to somehow integrate the Adaptec ServeRAID agent into the installation media for VMware ESXi by modifying OEM.tgz, but from extensive research and what other people say, it's possible only with intimate knowledge of kernels etc., which I simply don't have.

I have also heard that Adaptec is releasing their CIM provider this quarter, which will enable ESXi to see status of storage components such as the battery, individual disks etc for the ServeRAID 8K (and possibly a range of other Adaptec cards) storage adapter.

So at this point I'm officially giving up - it appears that there's no way to make it work in its current state until the appropriate drivers have been released by vmware/adaptec.


I'm just slightly blue in the face on account of the fact that all the research I did before implementing ESXi 4.1 pretty much stated that the ServeRAID 8k is a 'supported controller'. At one point, I even found a VMware whitepaper that stated in black and white that the ServeRAID 8k is a 'supported controller'. In my book, supported controller = all features work, not just about 50% functionality. Unfortunately this was one of those things that you don't realise until you're done implementing.

Was setting up Veeam monitor easy?
