VMware Cloud Community
warrenwalker
Enthusiast

Health Status in ESX 3.5

Does anyone know if it's possible to obtain disk status info from the health status tab in Update 2? Does the health status tab work with any system on the HCL? Lastly, does anyone have any documentation relating to this feature?

Cheers,

Warren

20 Replies
Texiwill
Leadership

Hello,

In general, that tab talks to the CIM server of an ESXi installation, not an ESX v3.5 installation. However, while there is a lot of debate about why this was not made available for ESX, I do believe VMware is looking into it. No promises, however.


Best regards,

Edward L. Haletky

VMware Communities User Moderator

====

Author of the book 'VMWare ESX Server in the Enterprise: Planning and Securing Virtualization Servers', Copyright 2008 Pearson Education.

CIO Virtualization Blog: http://www.cio.com/blog/index/topic/168354

As well as the Virtualization Wiki at http://www.astroarch.com/wiki/index.php/Virtualization

--
Edward L. Haletky
vExpert XIV: 2009-2023,
VMTN Community Moderator
vSphere Upgrade Saga: https://www.astroarch.com/blogs
GitHub Repo: https://github.com/Texiwill

RickPollock
Enthusiast

Is there a way to disable this feature?

lamw
Community Manager

Do you mean disabling CIM or the Health Status GUI display? You can do the former but not the latter. Run "/sbin/service pegasus stop" and also "/sbin/chkconfig --level 2345 pegasus off"; the second command ensures it will not start up on reboot in those runlevels. This is a nice feature, but if you're using something like HP SIM, all this information is already being gathered at a finer-grained level, especially with HP's ICE pack. You get SIM as part of the package if you get iLO 2 or whatnot.
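For anyone who wants the two commands above in one place, here is a minimal sketch as a small script. The `DRY_RUN` variable and the `run` and `disable_pegasus` helper names are made up for this example (they are not part of ESX); the script defaults to printing the commands instead of running them.

```shell
#!/bin/sh
# Sketch: stop the Pegasus CIM server and keep it from starting at boot.
# DRY_RUN is a hypothetical variable for this example, not an ESX setting.
# It defaults to 1, so the commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

disable_pegasus() {
    # Stop the running CIM server now.
    run /sbin/service pegasus stop
    # Turn it off in runlevels 2-5 so it stays off after a reboot.
    run /sbin/chkconfig --level 2345 pegasus off
}

disable_pegasus
```

To apply it for real, run it as root on the service console with `DRY_RUN=0`. Undoing it should just be the reverse: `/sbin/chkconfig --level 2345 pegasus on` followed by `/sbin/service pegasus start`.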

dock
Contributor

I don't want to hijack your thread, but...

I just updated 2 servers, and only one of them has the Health Status. I tried a reboot, but no go.

The one with the Health Status, however, disables HA after a minute.

dr_k

lamw
Community Manager

One of the known fixes for this issue after an upgrade is to disconnect your host from the cluster and reconnect it; usually the health status will then show up. Also make sure that your pegasus service is in fact running, as that is the CIM application that polls the hardware for the information.
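A rough way to check that the service is up from the service console is sketched below. It assumes the Pegasus CIM server process is named `cimserver` (the usual OpenPegasus binary name, but I have not verified it on every ESX 3.5 build), and `has_cimserver` is just a helper name invented for this example.

```shell
#!/bin/sh
# Sketch: report whether the Pegasus CIM server process appears to be running.
# Assumption: the process is named "cimserver" (the usual OpenPegasus binary).

# Succeeds if the process listing passed as $1 mentions cimserver.
has_cimserver() {
    printf '%s\n' "$1" | grep -q 'cimserver'
}

# Take one snapshot of the process table and test it.
if has_cimserver "$(ps ax 2>/dev/null)"; then
    echo "pegasus: cimserver process found"
else
    echo "pegasus: cimserver not found; try /sbin/service pegasus start"
fi
```

A match only means the process exists. It does not prove that the providers loaded cleanly, so the Health Status tab can still be incomplete.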

dock
Contributor

Worked. Tnx!

stuten
Enthusiast

I'm currently running the latest VC and ESX as of this post. We're running full ESX 3.5, not ESXi. I see the health status tab but only get Processors, Memory and Storage. Fans, power supplies, etc. aren't there. From searching around, it appears that this is a difference between ESX 3.5 and 3.5i. I'd certainly rather not have to install OpenManage on each server in order to see the health of fans and power supplies. It seems silly that VMware thinks that ESXi customers want to see hardware health and ESX customers don't. The code is obviously there... it should be added to ESX.

I always find it silly when the free version of a product has more features/functionality in some area than its paid counterpart.

Just my 2 cents

owjeff
Enthusiast

The really strange thing is that I have 1 server where the full health status (including fans, power, etc.) is available, but 2 where only the Processors, Memory, and Disks are available. I am running ESX 3.5 on all three, not ESXi. The only hardware difference is that the one showing everything is a Dell M600, and the other two are Dell 1955s. Also, I installed the M600 from the 3.5 Update 2 ISO recently, and the other two were built over a year ago with 3.5 RTM and then upgraded - all three are showing build 120512 now.

warrenwalker
Enthusiast

Check the SIM software revision - we had the same problem - one was an older version and once upgraded everything was viewable.

stuten
Enthusiast

Could you elaborate on upgrading the SIM software (do you mean CIM)? Either way, is it part of OpenManage, or a firmware update, etc.?

Thanks for the info

ncentech
Enthusiast

Has anybody dealt with the Sun x4450 regarding the Health Status? I only see a few counters like CPU, and I cannot drill down. Do you know if I need to install something on the hardware or update any of the firmware on it? I have other Dells and they are showing everything, with drill-downs and all. I guess that is because OpenManage is installed on those boxes. Either way, I would like to hear your feedback. Thanks

warrenwalker
Enthusiast

I mean the Insight Manager agent. It's a software install, from memory.

warrenwalker
Enthusiast

We use Sun x4600s and we were just having this very conversation. I haven't managed an install of U2 yet. I'd like to see what stats I get back, but I can bet it won't be much...

ncentech
Enthusiast

Sounds good. Please let me know; I'm curious to see what you come back with. I'll see what else I find in the meantime.

owjeff
Enthusiast

We're using Dells, so the SIM agent doesn't apply in our case. Regardless, I never installed any agent on the 1 box showing all the info - just installed from the ISO and added it to the cluster. The other two boxes may have OpenManage installed (the Dell equivalent of SIM). I'll look at installing the latest OM on all 3 and see if it gets me anywhere.

stuten
Enthusiast

Today I upgraded one of our servers to the latest OpenManage listed as supported on ESX (5.4.0) and nothing changed: I still see only CPU, Memory and Storage. Tomorrow I'm going to upgrade all the firmware and such to see if that has any effect.

stuten
Enthusiast

Ok, Here's what I've learned so far...

It appears my issue was that OpenManage and Pegasus were fighting. If I stop Pegasus and then restart it, I get an error message. The log has an error to the effect that the event is something or other to another provider. I uninstalled OpenManage, but that didn't seem to help the Pegasus error, so I decided to wipe the machine and install 3.5 U2 from scratch. When the server started, Pegasus was happy, and now I see proc, memory, storage, cable/interconnect, fan, power, chassis, watchdog, voltage, battery, temperature and software components.

Next I'm going to install the newest supported OpenManage and see what happens. Most of my hosts have been around for a while and have had ESX updated and older OpenManage clients on them. I'm hoping that something was left behind that caused the root issue. I'll update with my findings.

stuten
Enthusiast

Update:

After wiping and reinstalling, with everything showing up, I installed OpenManage 5.4.0 A1 and everything still works, after rebooting as well. This is just a guess, but the procedure that we use to set up a new host has been around since 2.5, when I installed our first ESX server. Now before you beat me up, I've updated it along the way, but one piece that was outdated was installing OpenManage. In older versions you had to install an openipmi piece prior to installing the OM agent. I think even our newer hosts have had this installed on them, and the IPMI piece from Dell was conflicting with the IPMI piece now provided in ESX. It has never caused a problem with OM, but it appears it was causing a problem for Pegasus. On the new clean machine, with OM on it, the Pegasus service starts in under a second.

This type of thing is exactly why I used to just wipe and install new versions from the CD instead of ever trying to upgrade them. The only problem is that this was feasible when I only had a few hosts; now, with 18 hosts, it would be a very time-consuming task (and I can't imagine what it's like for you guys with many more hosts than me). vMotion makes it possible, but large numbers make it impractical.

Soooo, if you're running a piece of hardware that is certified to run ESXi, I would certainly say that you should be able to get the Health Status information in VC. If you aren't getting it, I'd look at things like management agents. If you're using them, then it is possible that you have a conflict. If you're using Dell, I can say that OM 5.4.0 A1 seems not to cause problems. Also, if you're familiar with Dell's SUU (it creates a DVD that you can use to update every firmware, BIOS, etc. in the server), that is also a way to update the IPMI on the hardware. The machine I did all my testing on was quite new (a 2950 III that is only 6 or so months old). I have not yet tried out some of my older Dells.

I'm no expert, but if you can't start the Pegasus service without it throwing errors, then I'd say you aren't going to get everything to show up in Health Status. That said, I'm not claiming that if it does run, everything will "just work".
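If you want to test for that condition without eyeballing the console, something like the sketch below might help. `restart_and_check` is a helper name invented for this example; it runs whatever command you pass it and flags a non-zero exit or any output containing the word "error". Matching on "error" is my own assumption about what a bad start looks like, not documented Pegasus behavior.

```shell
#!/bin/sh
# Sketch: run a service command and flag a failed or noisy start.
# restart_and_check is a hypothetical helper, not part of ESX or Pegasus.

restart_and_check() {
    # Capture stdout and stderr together, and remember the exit status.
    if output="$("$@" 2>&1)"; then
        status=0
    else
        status=$?
    fi
    # Treat a non-zero exit or the word "error" in the output as a failure.
    if [ "$status" -ne 0 ] || printf '%s\n' "$output" | grep -qi 'error'; then
        echo "FAILED (exit $status)"
        printf '%s\n' "$output"
        return 1
    fi
    echo "OK"
}

# Intended use on the service console, as root:
#   /sbin/service pegasus stop
#   restart_and_check /sbin/service pegasus start
```

An "OK" here only says the start command was quiet. As noted above, that does not guarantee every sensor shows up in Health Status.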

I'll stop rambling now.

owjeff
Enthusiast

Awesome work. There's no way I want to rebuild my ESX hosts from scratch, so I'll mess with SUU and see if I can update the IPMI on my Dell 1955s.
