I have a call outstanding with cm support. Indications so far, both from that call and from other posts, are that it might be an issue with the HP agents. The suggested fix is to download the latest ISO image from the HP website and apply it. Make sure you back up your config first, then use Update Manager to apply the latest patches. I have done this on my test cluster, but since the failure is random I don't know if it's fixed. I will apply it to production tomorrow, along with updating the BIOS to the latest revision.
Is anyone seeing the following entries in /var/log/messages and/or syslog output?
UserThread: ###: Peer table full for sfcbd
World: vm #####: ####: WorldInit failed: trying to cleanup.
World: vm #####: ###: init fn user failed with: Out of resources!
We received an indication from VMware that these errors, and quite possibly the instability, are due to an issue with CIM. VMware provided steps to disable CIM, and so far the errors have not returned. We'll continue to monitor stability on the BL460c G1s.
We have seen the same messages. Did you load the HP version of ESXi?
UserThread: 406: Peer table full for sfcbd
Apr 26 02:13:36 vmkernel: 0:00:01:18.420 cpu14:1682)WARNING:World: vm 1779: 911: init fn user failed with: Out of resources!
Apr 26 02:13:36 vmkernel: 0:00:01:18.420 cpu14:1682)WARNING: World: vm 1779: 1776: WorldInit failed: trying to cleanup.
Can you forward the instructions for disabling CIM?
Yes, we installed the latest version of ESX 3i U4 from HP (http://h20392.www2.hp.com/portal/swdepot/displayProductInfo.do?productNumber=HPVM05) on SAS drives and applied the 10-Apr and 29-Apr patches via VMware Update Manager.
We used the following steps to disable CIM:
1.) On each host, under the configuration tab, select Advanced Settings, select Misc, and set Misc.CIMEnabled to 0.
2.) Put the host into maintenance mode
3.) Via unsupported mode (ALT-F1, type "unsupported", enter the root password), run:
a.) /etc/init.d/sfcbd-watchdog stop
b.) /etc/init.d/wsmand stop
c.) /etc/init.d/slpd stop
d.) Edit /etc/vmware/hostd/config.xml with vi
e.) Set the tag at path "plugins" -> "cimsvc" -> "enabled" to false
4.) Reboot the host via vCenter
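For what it's worth, step 3e can also be scripted rather than done by hand in vi. Below is a minimal Python sketch; the function name is mine, and it assumes the config.xml contains a plugins/cimsvc/enabled element under the root as described above, so verify against your own file (and keep the backup mentioned earlier) before using it:

```python
# Sketch: set the cimsvc plugin's "enabled" flag to false in hostd's config.xml.
# ASSUMPTION: the file has <plugins><cimsvc><enabled>...</enabled></cimsvc></plugins>
# directly under the root element; check your own config.xml first.
import xml.etree.ElementTree as ET

def disable_cimsvc(path):
    """Flip plugins/cimsvc/enabled to 'false' in the given hostd config.xml."""
    tree = ET.parse(path)
    enabled = tree.getroot().find("./plugins/cimsvc/enabled")
    if enabled is None:
        raise RuntimeError("plugins/cimsvc/enabled not found in %s" % path)
    enabled.text = "false"
    tree.write(path)
```

Run it against a copy of config.xml first and diff the result before touching the live file.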
If possible, I'd also open a case with VMware to better track and resolve this issue.
same issue here:
I was running ESX 3.5 until I discovered ESX 3i and decided to migrate.
The servers we are using are HP BL460c G1 blades. I installed ESX 3i onto quickly ordered HP USB flash drives, using the VMware installable version and extracting the image, because I preferred to run without the HP agents and wanted to get rid of them.
I'm running ESX 3i U4 with the last two patches: ESXe350-200904201-O-SG and ESXe350-200904401-O-SG.
All of a sudden, one of the three servers appeared unreachable in VirtualCenter. I could ping it but not log in via the console (F2); the backdoor (ALT-F1) still worked. Some guests responded to ping, but most did not. The only way to resolve this was to reset the server.
I opened a support call with VMware. The engineer looked at the available log files (I saved them before resetting the server) but was not able to find anything.
Today, two days later, the same thing happened again!
Since I don't use the HP agents, this problem is not related to the agents!
I am, however, using HP USB flash drives. Maybe these USB drives have an issue? What's the story behind the replaced HP USB flash drives?
In the meantime I'm using remote syslog so that at least the log files are available after a restart!
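A side benefit of shipping logs off-box is that you can watch for the warning signatures discussed earlier before a host goes dark. A minimal sketch (the substrings are copied from the log excerpts in this thread; the function name is mine) that flags the relevant vmkernel warnings in a batch of syslog lines:

```python
# Sketch: scan collected syslog lines for the vmkernel warnings from this thread.
# The signature substrings below are taken verbatim from the posted log excerpts.
SIGNATURES = (
    "Peer table full for sfcbd",
    "init fn user failed with: Out of resources!",
    "WorldInit failed: trying to cleanup.",
)

def suspicious_lines(lines):
    """Return the syslog lines matching any of the known warning signatures."""
    return [ln for ln in lines if any(sig in ln for sig in SIGNATURES)]
```

Point it at the files your remote syslog server collects and alert if it returns anything.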
I'm thinking of going back to ESX 3.5 and dumping the idea of using ESX 3i Embedded...
Thanks for the quick reply!
No, my keys are black plastic!
I checked my log files in the meantime and found the described messages:
2009-05-07 00:10:53 User.Error 10.90.4.152 LSIESG: LSIESG:INTERNAL :: StorelibManager::createDefaultSelfCheckSettings - failed to get TopLevelSystem
2009-05-07 00:10:53 User.Error 10.90.4.152 sfcbd: INTERNAL StorelibManager::createDefaultSelfCheckSettings - failed to get TopLevelSystem
2009-05-07 00:10:53 Local6.Warning 10.90.4.152 vmkernel: 0:03:15:30.941 cpu6:1713)WARNING: UserThread: 406: Peer table full for sfcbd
2009-05-07 00:10:53 Local6.Warning 10.90.4.152 vmkernel: 0:03:15:30.941 cpu6:1713)WARNING: World: vm 49111: 911: init fn user failed with: Out of resources!
2009-05-07 00:10:53 Local6.Warning 10.90.4.152 vmkernel: 0:03:15:30.941 cpu6:1713)WARNING: World: vm 49111: 1776: WorldInit failed: trying to cleanup.