Contributor

ESX 4.1 and LSI Megaraid Storage Manager

Hello everyone!

I am testing a move from VMware Server to vCenter for my organization and I am running into a problem.

In my test environment, I have a Supermicro H8DME-2 motherboard with an LSI 9261-8i SAS RAID controller. I currently have 2x750GB SATA drives in a RAID 1 array configured on the LSI controller.

Now, the health status in vSphere reports everything correctly, including a degraded state when I pull one of the hard drives.

My problem is my ability to rebuild the array if I replace a drive. I do not see that functionality inside vSphere (and I haven't progressed to vCenter yet, though I plan to do so). I was hoping to get LSI MSM installed, but I am having an error.

I installed MSM 8.00-05 from LSI's website and used the VMware install script. I opened the ports needed in ESX's firewall. I have tried the MSM client both on a guest on the ESX host and on my own local machine, and in both cases I am able to connect and log in to the server.

Everything loads and I have just enough time for a couple of clicks (about 20 seconds) before I lose connection to the server. After that, I cannot reconnect unless I go to the server console and issue:

/etc/init.d/vivaldiframeworkd restart

Then I can connect again, but only for another 20 seconds.

Because of this, I cannot manage the controller (for example, rebuild a degraded array or configure a new one) without rebooting the host, which is not an acceptable solution.

If I have an ssh connection open to the server when the framework crashes, I do get some terminal output. It is:

# *** glibc detected *** ../jre/bin/java: double free or corruption (!prev): 0x081d2d40 ***
======= Backtrace: =========
/lib/libc.so.6[0x8cd121]
/lib/libc.so.6(cfree+0x90)[0x8d0bf0]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x623f009]
/usr/local/MegaRAID Storage Manager/Framework/libstorelibjni.so(_ZN7JNIEnv_24ReleaseByteArrayElementsEP11_jbyteArrayPai+0x1f)[0xed4f26e7]
/usr/local/MegaRAID Storage Manager/Framework/libstorelibjni.so(Java_plugins_StorelibPlugin_processNativeCommand+0x1cb)[0xed4f1d59]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x621b25d]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x630f998]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x621ab70]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x621abfd]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x628b265]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x63a03dd]
/usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so[0x6310ac9]
/lib/libpthread.so.0[0x9b549b]
/lib/libc.so.6(clone+0x5e)[0x93533e]
======= Memory map: ========
00846000-00860000 r-xp 00000000 08:15 196046 /lib/ld-2.5.so
00860000-00861000 r-xp 00019000 08:15 196046 /lib/ld-2.5.so
00861000-00862000 rwxp 0001a000 08:15 196046 /lib/ld-2.5.so
00864000-009a2000 r-xp 00000000 08:15 196047 /lib/libc-2.5.so
009a2000-009a4000 r-xp 0013e000 08:15 196047 /lib/libc-2.5.so
009a4000-009a5000 rwxp 00140000 08:15 196047 /lib/libc-2.5.so
009a5000-009a8000 rwxp 009a5000 00:00 0
009aa000-009ac000 r-xp 00000000 08:15 196048 /lib/libdl-2.5.so
009ac000-009ad000 r-xp 00001000 08:15 196048 /lib/libdl-2.5.so
009ad000-009ae000 rwxp 00002000 08:15 196048 /lib/libdl-2.5.so
009b0000-009c3000 r-xp 00000000 08:15 196051 /lib/libpthread-2.5.so
009c3000-009c4000 r-xp 00012000 08:15 196051 /lib/libpthread-2.5.so
009c4000-009c5000 rwxp 00013000 08:15 196051 /lib/libpthread-2.5.so
009c5000-009c7000 rwxp 009c5000 00:00 0
009dd000-00a02000 r-xp 00000000 08:15 194744 /lib/libm-2.5.so
00a02000-00a03000 r-xp 00024000 08:15 194744 /lib/libm-2.5.so
00a03000-00a04000 rwxp 00025000 08:15 194744 /lib/libm-2.5.so
00a06000-00a0f000 r-xp 00000000 08:15 194796 /lib/libcrypt-2.5.so
00a0f000-00a10000 r-xp 00008000 08:15 194796 /lib/libcrypt-2.5.so
00a10000-00a11000 rwxp 00009000 08:15 194796 /lib/libcrypt-2.5.so
00a11000-00a38000 rwxp 00a11000 00:00 0
00b19000-00bf4000 r-xp 00000000 08:15 830095 /usr/lib/vmware/lib/libstdc++.so.6
00bf4000-00bf8000 r-xp 000da000 08:15 830095 /usr/lib/vmware/lib/libstdc++.so.6
00bf8000-00bf9000 rwxp 000de000 08:15 830095 /usr/lib/vmware/lib/libstdc++.so.6
00bf9000-00bff000 rwxp 00bf9000 00:00 0
00c55000-00c68000 r-xp 00000000 08:15 196064 /lib/libnsl-2.5.so
00c68000-00c69000 r-xp 00012000 08:15 196064 /lib/libnsl-2.5.so
00c69000-00c6a000 rwxp 00013000 08:15 196064 /lib/libnsl-2.5.so
00c6a000-00c6c000 rwxp 00c6a000 00:00 0
00cab000-00cb2000 r-xp 00000000 08:15 196057 /lib/librt-2.5.so
00cb2000-00cb3000 r-xp 00006000 08:15 196057 /lib/librt-2.5.so
00cb3000-00cb4000 rwxp 00007000 08:15 196057 /lib/librt-2.5.so
06000000-0642a000 r-xp 00000000 08:15 941214 /usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so
0642a000-06444000 rwxp 0042a000 08:15 941214 /usr/local/MegaRAID Storage Manager/jre/lib/i386/client/libjvm.so
06444000-06864000 rwxp 06444000 00:00 0
08048000-08052000 r-xp 00000000 08:15 941045 /usr/local/MegaRAID Storage Manager/jre/bin/java
08052000-08053000 rwxp 00009000 08:15 941045 /usr/local/MegaRAID Storage Manager/jre/bin/java
080ef000-08350000 rwxp 080ef000 00:00 0
ebc83000-ebc84000 ---p ebc83000 00:00 0
ebc84000-ec684000 rwxp ebc84000 00:00 0
ec684000-ec685000 ---p ec684000 00:00 0
ec685000-ed085000 rwxp ec685000 00:00 0
ed085000-ed094000 r-xp 00000000 08:15 196062 /lib/libresolv-2.5.so
ed094000-ed095000 r-xp 0000e000 08:15 196062 /lib/libresolv-2.5.so
ed095000-ed096000 rwxp 0000f000 08:15 196062 /lib/libresolv-2.5.so
ed096000-ed098000 rwxp ed096000 00:00 0
ed098000-ed09c000 r-xp 00000000 08:15 194726 /lib/libnss_dns-2.5.so
ed09c000-ed09d000 r-xp 00003000 08:15 194726
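
For anyone reading a dump like this, a rough way to see which libraries the crashing thread went through is to count the backtrace frames per shared object. The sketch below is only an illustration: the function name backtrace_modules is made up, and it assumes the glibc backtrace format shown above (a path, then an optional symbol in parentheses, then an address in brackets).

```shell
#!/bin/sh
# Count backtrace frames per shared object in a glibc crash dump read from
# stdin. Only lines that start with a path and continue with "(" or "[" are
# treated as frames, so memory map lines (which start with a hex range) are
# skipped. Paths containing spaces are kept whole.
backtrace_modules() {
    sed -n 's/^\(\/[^[(]*\)[[(].*$/\1/p' | sort | uniq -c | sort -rn
}
```

Save the terminal output to a file (crash.txt is just an example name) and run "backtrace_modules < crash.txt". In the dump above, the frames just below the free call belong to libjvm.so and libstorelibjni.so, so the double free happens while the StoreLib JNI library releases a byte array.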

A Google search suggested trying

export MALLOC_CHECK_=0

but that did not resolve the issue.

Can someone point me to some solution? I can't imagine that this is a new issue, so what does everyone else do when they need to rebuild an array?

Thanks!

76 Replies
Contributor

After I opened the ticket w/LSI Support, they said the issue would have to be moved up to development. That was months ago and I haven't heard anything since. Haven't bothered with it to tell the truth.

New server builds to deal with and migrations to do, which bring their own issues. Gotta keep movin the ball down field.

The alert thing wasn't a show stopper so I moved on. I think I just set up SNMP for the cards and that forwards to one of my syslog servers. That way I'll know if something is amiss.

Good Luck

Contributor

Greetings to ALL!

I finally beat these freaking alerts!

And it was as easy as falling off a log :smileyshocked:


1. Stop mrmonitor: /etc/init.d/mrmonitor stop

2. Manually edit the beginning of "/usr/local/MegaRAID\ Storage\ Manager/MegaMonitor/config-current.xml": enter your real email addresses and the server's IP, leave user, pass and auth-type alone (for an anonymous connection to the server), and add "<do-email/>" to the severity levels you want alerts for (the default is FATAL level only :))

3. Start mrmonitor: /etc/init.d/mrmonitor start


AND THAT'S ALL!
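
The three steps above can be wrapped in a small helper so the monitor is always restarted after the edit. This is only a sketch: the function name and the backup copy are my additions, the two paths default to the ones quoted in the steps, and what you actually type into the XML is still up to you.

```shell
#!/bin/sh
# Stop / edit / start cycle for the MSM monitor configuration.
# Both paths default to the ones quoted in this thread and can be overridden.
MRMONITOR_INIT="${MRMONITOR_INIT:-/etc/init.d/mrmonitor}"
MSM_CONFIG="${MSM_CONFIG:-/usr/local/MegaRAID Storage Manager/MegaMonitor/config-current.xml}"

edit_monitor_config() {
    # Step 1: stop the monitor before touching its configuration.
    "$MRMONITOR_INIT" stop
    # Illustrative extra: keep a copy of the file in case the edit goes wrong.
    cp -p "$MSM_CONFIG" "$MSM_CONFIG.bak" || return 1
    # Step 2: edit by hand (vi is used when EDITOR is not set).
    "${EDITOR:-vi}" "$MSM_CONFIG"
    # Step 3: start the monitor again.
    "$MRMONITOR_INIT" start
}
```

Run "edit_monitor_config" as root on the service console. If the alerts stop working after the edit, the .bak copy can be put back and the monitor restarted.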

Contributor

I have ESX 4.1.0 build 433742, with all the latest updates.

[root@esx init.d]# ./mrmonitor status
Monitor is not running

[root@esx log]# cat mrmonitord.debug
/usr/local/bin/mrmonitord: /usr/lib/vmware/lib/libgcc_s.so.1: version `GCC_4.2.0' not found (required by /usr/lib/libstdc++.so.6)

[root@esx-01 lib]# rpm -qa | grep "libgcc"
libgcc-4.1.2-46.el5_4.2
libgcc-4.1.2-46.el5_4.2

MSM - 9.00-01_Linux_Megaraid_Storadge_Manager.zip

Before the updates, mrmonitor started fine.

Smiley Sad
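
The error says the libgcc_s.so.1 under /usr/lib/vmware/lib does not carry the GCC_4.2.0 version tag that libstdc++ asks for. A quick, rough way to see which copy of the library mentions that tag is a plain string search. This is a sketch, not a real ELF inspection, and the function names are made up for illustration.

```shell
#!/bin/sh
# Succeeds when the file given as $1 contains the version tag given as $2.
# This is a raw byte search, so treat the answer as a hint only.
has_version_tag() {
    grep -q -s -F -- "$2" "$1"
}

# Compare the two libgcc copies that come up in this thread.
check_libgcc() {
    for lib in /lib/libgcc_s.so.1 /usr/lib/vmware/lib/libgcc_s.so.1; do
        if has_version_tag "$lib" GCC_4.2.0; then
            echo "$lib: mentions GCC_4.2.0"
        else
            echo "$lib: GCC_4.2.0 not found"
        fi
    done
}
```

Running "check_libgcc" on the service console prints one line per library. If only the copy under /lib mentions the tag, that would match the error message in mrmonitord.debug.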

Contributor

Hi!

Just uninstall the old stuff and install the new MSM for Linux (if LSI has one for your controller). I have an LSI 8408E and the new MSM dated Aug 11, 2011.

Also do this step from the readme (it was mentioned in earlier posts):

11.For VMware 4.1, it is necessary to create a softlink as mentioned below before installing
    MegaRAID Storage Manager(MSM). Run the below command to create the necessary soft link
    required for MegaRAID Storage Manager(MSM) to work.
    "sudo ln -sf /lib/libgcc_s.so.1 /usr/lib/vmware/lib/libgcc_s.so.1"

I installed MSM this way: unpacked all the archives and copied the "disk" folder to the host, then ran "chmod 777 *.*" and "./vmware_install.sh", and then rebooted the host. During the install I selected the Storelib from the MSM package.
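
If you want to be able to undo the softlink from the readme, a slightly more careful version could look like the sketch below. The backup to a .orig file and the function name are my additions, not part of the LSI readme; the default paths are the ones from the readme command.

```shell
#!/bin/sh
# Point the VMware copy of libgcc_s.so.1 at the system copy, as the readme
# command does, but keep the original file as <name>.orig the first time.
link_libgcc() {
    src="${1:-/lib/libgcc_s.so.1}"
    dst="${2:-/usr/lib/vmware/lib/libgcc_s.so.1}"
    [ -e "$src" ] || { echo "missing $src" >&2; return 1; }
    # Back up a regular file only once; an existing symlink is just replaced.
    if [ -f "$dst" ] && [ ! -L "$dst" ] && [ ! -e "$dst.orig" ]; then
        cp -p "$dst" "$dst.orig" || return 1
    fi
    ln -sf "$src" "$dst"
}
```

Call "link_libgcc" as root before running ./vmware_install.sh. To roll back, copy the .orig file over the link.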

Contributor

For my MegaRAID SAS9260-8i there is only MSM version 9.00-01, from 2010.

I did create the softlink; that was in the manual.

Contributor

What is this, then? Smiley Wink (the sorting on LSI's site is freaking awful)

http://www.lsi.com/support/products/Pages/MegaRAID%20SAS%209260-8i.aspx

SAS MegaRAID Storage Manager - Linux - Version 11.06.00-05
Miscellaneous
MegaRAID Storage Manager - Linux
OS: Linux
Version: 11.06.00-05
Readme Link: readme
Aug 11, 2011, 54.56M

Contributor

Big thanks, I'll try it Smiley Happy

Contributor

I've found the ones that work with ESXi 4.1 and ESXi 5.0.

For ESXi 4.1, you'll need MSM version 2.91-05 (Windows version tested and found working, no other version works) and LSI CIM providers 00.14.V0.02 for ESXi 4.1 (later CIM may work, or may need update as for ESXi 5.0, or may not work at all, not tested).

For ESXi 5.0, you'll need MSM version 6.50-11 (Windows version tested and found working, no other version works) and LSI CIM providers 00.24.V0.03 for ESXi 5.0.

Don't forget to properly set up the host names of the ESXi machine and the management machine in DNS, so that both resolve forward and reverse to their respective IPs.

LSI support still treats VMware as an unsupported OS, so if you're into VMware, consider using adapters from some other vendor, or use these old versions of the software that happen to be compatible.
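
A quick way to check the forward and reverse resolution from a Linux box is sketched below. It assumes getent is available (on a Windows management machine, nslookup would be the rough equivalent), and the function names are made up. Note that getent also reads /etc/hosts, so a match there counts too.

```shell
#!/bin/sh
# Print "ADDRESS NAME..." lines for a host name or an IP address.
lookup() { getent hosts "$1"; }

# Succeeds when NAME resolves to an address and that address resolves back
# to NAME (case-insensitive; only the first address is checked).
check_fwd_rev() {
    name="$1"
    ip=$(lookup "$name" | awk 'NR==1 {print $1}')
    [ -n "$ip" ] || { echo "$name: no forward record"; return 1; }
    back=$(lookup "$ip" | awk 'NR==1 {for (i = 2; i <= NF; i++) print tolower($i)}')
    want=$(printf '%s' "$name" | tr 'A-Z' 'a-z')
    if printf '%s\n' "$back" | grep -q -x -F -- "$want"; then
        echo "$name <-> $ip: ok"
    else
        echo "$name -> $ip, but $ip does not resolve back to $name"
        return 1
    fi
}
```

Run it once for the ESXi host and once for the management machine, using the fully qualified names, for example "check_fwd_rev esxi01.example.com" (a placeholder name).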

Still don't know if there are some missing functions, but all the basic array management works.

Contributor

So, I have loaded the latest CIM, the drivers are included in the latest 5.x build of ESXi, but how does the MSM talk to the ESXi5 with no MRM?

I watched this video; it is very poorly made and she jumps all over the place.

According to her, you load the driver, load the CIM and then start Storage Manager, and the server magically appears.

Something is missing.

http://www.youtube.com/watch?v=wJVwiSuyFNc

Contributor

Nothing is "missing", per se. The driver in ESXi is supposed to broadcast a packet that MSM picks up and shows in its list of servers it can connect to and manage.

Unfortunately it doesn't always seem to work, ESPECIALLY with the drivers included with ESXi… loading the offline-bundle.zip from the LSI ESXi driver CD seems to help for some.

I had it working in 4.0, but lost the ability in 4.1 and have not tried 5.0.

http://kb.lsi.com/KnowledgebaseArticle16438.aspx

Contributor

Can someone provide a link to the ESXi 5 drivers? I found the CIM providers and the MSM but no drivers except the VMware-provided ones.

Thank you,

Contributor

For ESXi 5 with Update 1, you don't need new drivers for the storage manager to work remotely; you only need to install the LSI SMI-S Provider (00.24.V0.03). The included drivers are just fine.

In theory, you should now be able to discover the server from any PC running MSM on the same network.

But in practice, most of the time it doesn't work and you cannot discover the server. When this happens, a server restart usually helps. If you get lucky and discover the server, you can then connect remotely to the card, but be warned, it works very slowly. :smileyangry:

Don't know why the LSI provider for ESXi 5 is so unstable and unusable. :smileyplain:

Contributor

I'm posting to concur that for ESXi 5 Update 1 you need the LSI SMI-S Provider, as 'caraboy' stated, but the stability issue seems to be that the CIM Server on the VMware host needs to be running. If it's not, you'll never find your host in MegaRAID Storage Manager.

And, at least on my two 5 Update 1 servers, the CIM Server doesn't start automatically after restarting the host (even though it's set to); I have to go in manually under Configuration->Security Profile->Services->Properties and start it.

Give it a few minutes to broadcast its availability and then run discovery in MSM again and it should now show up.

Contributor

Hi kjcorbin, well, in my case I can only see the server right after a restart. After 2-3 days, I think the server stops sending broadcasts, and the only way to rediscover it is to reboot (restarting the CIM server does not work in my case, I still cannot see the server).

Anywayz, even if we discover the server, as I stated before, MSM is barely usable; it is too slow to set up virtual drives, etc. :smileyplain:

Contributor

Yes. Something is amiss. This "something" is the SLP server in ESXi, which does not respond to multicast queries from MSM/RWC2 (but responds to unicast perfectly).

A clean solution is here, a multicast-to-unicast proxy for SLP (the protocol at the root of the problem):

http://alex-at.ru/it/lsi-vmware-esxi-1 (use Google Translate if you don't understand Russian)

Tool download link: http://alex-at.ru/media/blogs/alex/Code/slp_helper.zip

Short instructions:

1. I assume you have ESXi, MSM/RWC, LSI driver under ESXi and LSI CIM/SMIS providers for ESXi all installed and ready to go.

2. Unzip the tool mentioned above (it contains a PHP 5.4 runtime, which is why the ZIP is so large, ~3 MB).

3. Edit the slp_helper.php file with a text editor and look for "$unicast_ips = array('192.168.1.1');". Put the IPs of the ESXi servers you want to manage here (in single quotes, comma-separated if there are several).

4. Run runme.cmd. The tool will start and should display no errors (only the startup information). After that it will begin logging SLP requests to the screen. One copy of the tool per LAN is enough; it will proxy all multicast SLP requests to the ESXi servers specified.

5. Run MSM/RWC2. Discover hosts.

6. Enjoy.

Contributor

I now have the same problem. My server only supports ESXi 4.1, I have an Avago LSI 9240-4i in it, and I want to be able to manage the array from another machine on the network.

I can't find a solution. Please help.

Contributor

Install the CIM provider for the LSI 9240.
