VMware Cloud Community
VinJordan
Contributor

RAID-Controller E-Mail notification in ESXi 5.5 host

Hello everyone,

I'm German, so sorry for any grammatical mistakes 😉

We have 6 physical servers, and I want to virtualize 4 of them.

I have a lot of experience with ESXi and have already deployed some simpler solutions in the past.

But now I have the following problem:

-> On a physical server, you install the RAID controller's management software on the operating system and configure an e-mail alert, which is triggered if a disk in an array, or the array itself, fails.

But how can I do this on an ESXi host?

I won't notice when a disk in the host fails if all the server can do is beep. Nobody will hear that.

So I tested vCenter 5.5 with an ESXi 5.5 host. I can set up a lot of alarms, but nothing regarding a broken RAID or failed disks.

PCI passthrough won't work either, because the RAID controller would then be reserved for that one VM only.

I'm looking for a working solution that won't make my wallet explode.

I hope you can help me.

Thanks in advance!

Best regards,

9 Replies
JarryG
Expert

I *think* you should be able to get alarm-notifications even from vCenter server. I do not use it, but in vSphere Client (native) I can see health-status of my raid-controller (arrays, individual disks, temperature, battery, etc). I do not know what HW you have, but as an example: I have LSI hw-raid, and I have two more possibilities (I am aware of) to monitor its health:

1. MegaRAID Storage manager:

I can install it on any VM (windows/linux etc), or any other workstation on the same lan-segment (or VPN) and have it monitoring all raid-controllers (it is client-server app).

2. StorCLI/MegaCLI:

Command-line utilities for storage management. They can be installed on VMware ESXi (or in any VM). With a little scripting it is possible to set up a cron task that periodically checks the RAID controller's health and sends an email when something suspicious is detected.
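As a rough illustration of the "little scripting" mentioned in point 2, here is a minimal Python sketch. It assumes the text of a StorCLI status report has already been captured (for example by a scheduled job) and is passed in as a string. The function names, the mail addresses, and the list of state abbreviations are illustrative assumptions, not something taken from LSI's documentation for a specific version, so compare them with the real output of your controller first.

```python
import smtplib
from email.message import EmailMessage

# Assumption: abbreviations that mark an array or disk as not healthy in the
# report. Check them against the output of your own StorCLI version.
BAD_STATES = {"Dgrd", "Pdgd", "OfLn", "Offln", "Failed", "UBad", "Msng"}


def find_unhealthy(report):
    """Return the report lines that contain one of the BAD_STATES tokens."""
    return [
        line.strip()
        for line in report.splitlines()
        if BAD_STATES.intersection(line.split())
    ]


def build_alert(host, bad_lines, sender, recipient):
    """Build a plain-text warning mail listing the unhealthy lines."""
    msg = EmailMessage()
    msg["Subject"] = f"RAID warning on {host}: {len(bad_lines)} item(s) not optimal"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(bad_lines))
    return msg


def send_alert(msg, smtp_host, smtp_port=25):
    """Hand the mail to an SMTP relay (hypothetical host, no authentication)."""
    with smtplib.SMTP(smtp_host, smtp_port, timeout=30) as smtp:
        smtp.send_message(msg)
```

A scheduled job would call find_unhealthy() on the captured report and only call build_alert() and send_alert() when the returned list is not empty, so no mail is sent while everything is optimal.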

I suggest you check your hardware RAID manufacturer's website to see what options they offer.

_____________________________________________ If you found my answer useful please do *not* mark it as "correct" or "helpful". It is hard to pretend being noob with all those points! 😉
VinJordan
Contributor

Hi there,

I plan to buy an LSI MegaRAID SAS 9271-8i. I checked its compatibility with VMware; it should work.

Regarding point 1:
-> This solution sounds like the best and easiest. But how can the software recognize the RAID controller *through* the OS?

-> Wouldn't such management only be possible if the controller were directly connected to the network? (RAID controllers with an RJ45 jack are really rare, and I'm not sure they are really reliable. Look at this: Areca Technology Corporation.) I hope you can teach me 🙂
-> An example: I install the storage software on my workstation. Should the software then be able to show me the controller that sits in the ESXi host in my server room?

According to my research so far, I have to install the CIM driver (or VIBs) on the host so that the ESXi host can read the status of the RAID, disks, etc. and show it to me in the vSphere Client.

Furthermore, I need the VMware vCenter Essentials Kit to configure alarms and have them sent to me.

But I'm not sure whether I will have access to the RAID components (such as array, health, etc.) when I want to create an alarm.

JarryG
Expert

As I said, MegaRAID Storage Manager (MSM) works on the "client-server" principle: you install the LSI driver on ESXi, and it comes with a small CIM (Common Information Model) server running on the ESXi host. MSM then connects to this CIM server running on the ESXi host (first you have to open ports for it). You just enter the IP addresses of the hosts you want to monitor, or you can use the discovery protocol, which finds all responding hosts. You then pick the host you want to see, enter login/password, and you can see very detailed info. If something goes wrong, you will get a pop-up notification.

The controller can be in the local host (127.0.0.1, the same machine that runs MSM) or in any other. IIRC, the only restriction is that it must be on the same switch (network segment), but even this can be worked around with a VPN. I *think* even the vSphere Client uses this approach (I doubt lm_sensors has direct access to this hardware)...

Unfortunately I have no experience with Areca hardware, so I do not know how it works. What I described is valid for LSI controllers, both original and rebranded (IBM, Intel, Fujitsu, etc.; they all use basically the same hardware). The 9271-8i seems to be a good choice, but do not forget to buy a power backup (either a battery or, even better, a super-capacitor).

VinJordan
Contributor

That sounds like the solution I was looking for.
When the controller (including battery ;-)) arrives, I will test it and keep you informed!

Thanks a lot!

VinJordan
Contributor

So,

the RAID controller arrived and I set up the system.

I downloaded the ESXi 5.5 U1 image and added the CIM provider and the driver for the RAID controller with ESXi-Customizer -> VMware Front Experience: ESXi-Customizer.

I got the driver from the manufacturer's homepage -> http://www.lsi.com/products/raid-controllers/pages/megaraid-sas-9271-8i.aspx#tab/tab4

- I installed the hypervisor and configured the general settings.

- After that, I installed a Windows Server 2008 test system and, on it, the MegaRAID Storage Manager 14.02.01.03.

- The firewall has to be disabled, or else the required ports have to be opened. In my case I just kept the firewall disabled and added the hostname of the ESXi host to the hosts file of the server system.

Otherwise the Storage Manager can't find the ESXi server.

- In the manager I have to click on "Configure Host" and choose the radio button "Display all the ESXi-CIMON servers in the network of local server".

Just like in this German tutorial: Here

- After that, the Storage Manager shows me the login dialog.
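The hosts-file step in the list above only helps if the machine running the Storage Manager really resolves the ESXi hostname afterwards. A minimal Python sketch for checking that from the same machine could look like this; the function name and the hostname in the usage note are placeholders, not values from this thread.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address the local resolver reports, or None on failure."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # Covers socket.gaierror, which is raised when the name is unknown.
        return None
```

Calling resolve("esxi01") with your own hostname should return the address you entered in the hosts file. A result of None means the entry is not being picked up.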

Now to the problem:

An error message with the following content appears when I try to log in (the login credentials are 100% correct!):

error message.png

(Login failed: unable to connect to CIMON server!)

I checked the CIM service in the vSphere Client under "Configuration" and "Securityprofile".

The CIM server is stopped. I started it and changed the option so that the service starts together with the host.

I tried to connect again:

2nd error.png

When I click "OK", the manager closes completely.

I tried to log in again:

I tried to restart the "CIM-Server" service in the vSphere Client:

3rd error message.png

The vSphere Client is in German.

It says: the request timed out.

I'm a bit confused.

How can the Storage Manager find the ESXi host when the CIM server was not started in the first place?

And why does it say, after the CIM server service was started, that I reached the maximum number of login attempts, when the server was not even available one login attempt earlier?

Hope somebody can help me. 😕

venkyVM
Enthusiast

For the CIM server startup problem, could you log into the ESXi host, run "/etc/init.d/sfcbd-watchdog restart", and then see if your problem goes away? Ideally the CIM server is started by default.

"I checked the CIM-Service in the vSphere Client under "Configuration" and "Securityprofile".

The CIM-Server ist stopped." => This is a bug in the UI. The CIM server is started by default on ESXi.

"Display all the ESXi-CIMON servers in the network of local server" ==> Can you manually enter the login details and try connecting to the server?


Lastly, just check whether the CIM server behaves well using a tool like wbemcli.
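If wbemcli is not at hand, a much cruder first check is whether the CIM port on the host accepts TCP connections at all. The following Python sketch is meant to run on the monitoring machine, not on ESXi. It assumes the CIM server listens on port 5989 (the usual secure CIM port; verify it for your setup), and the function name is illustrative. An open port only shows that something is listening; it does not prove that the LSI provider answers queries the way a wbemcli request would.

```python
import socket

# Assumption: the host's CIM server listens on the usual secure CIM port.
CIM_HTTPS_PORT = 5989


def port_open(host, port=CIM_HTTPS_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name could not be resolved.
        return False
```

A False result for the ESXi host's address points to a firewall, network, or service problem rather than to wrong login credentials.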

VinJordan
Contributor

OK, thanks for that info.

The command "/etc/init.d/sfcbd-watchdog restart" doesn't help.

I checked with the "status" parameter... it says everything is OK, but it doesn't seem to be.

To rule out any mistakes made with the ESXi-Customizer, I re-installed the host and added the SMIS provider and the driver for the 9271-8i manually.

For that, I entered maintenance mode and enabled SSH under "Securityprofile" in the vSphere Client.

I downloaded the latest files from the LSI homepage and transferred them via WinSCP to the /tmp/ directory of the ESXi host.

The SMIS provider installed fine.

The installation of the driver for the controller failed with an error: "Could not find a trusted signer".

The solution was to add "--no-sig-check" at the end of the command.

Same error!

Now I will try it with an older SMIS provider and without updating the driver for the controller.

It wouldn't be the first time that the current version of some driver or software causes problems like this.

Can you help me use "wbemcli"?

Is it a VIB which has to be installed on the ESXi host?

Can I use it in the console?

ESXi doesn't know that command, not even in maintenance mode.

Thanks in advance!

Edit: By the way, the controller firmware is current: 23.28.0-0010

VinJordan
Contributor

OK... even this attempt didn't work.

Now I want to check the behavior with wbemcli, like you said.
But my problem is still:

~ # wbemcli

-sh: wbemcli: not found

~ #

I need some help using this tool. I hope I can figure out the cause of the problem.
Thanks in advance!

VinJordan
Contributor

Hello again,

it works! I can connect via MSM to the ESXi host and can check its status and/or configure it.

I changed two things:
1st: The hardware, or rather the mainboard. This one (just for testing) is a Gigabyte GA-B85M-HD3G.

2nd: The software on the machine that connects to the CIM service on the ESXi host.
I additionally installed the "Server Component" after the installation of the "Client Installation".
So in future I will just install the full package, and this will never be a problem again.

I think the missing software component on the server was the reason.

So, now I have another problem. But I will start a new thread for that, because the problems are independent.
