Solved: clean install, hardware monitoring service on this host is not responding

stoute · 08-26-2013

Hello,

I have three servers running ESXi 5.1.0 build 1157734, updated from build 799733 and managed by vSphere.

One of my DL380 G6 servers shows the error: hardware monitoring service on this host is not responding

What I have done to try to resolve the issue:

/etc/init.d/sfcbd-watchdog restart
Go back to the Hardware Status tab on vCenter Server and click the Update link. It may take up to 5 minutes to refresh
Stopped the service and restarted it
Reset the sensors
Updated the data
Checked the firewall settings; they are all the same

The issue is still the same.

Does anyone have an idea how to fix this?

ScreamingSilenc · 08-26-2013

Can you remove the host from vCenter, re-add it, and check whether the issue gets resolved?

OR

Check whether "CIM Server" is enabled and running: go to "Configuration", select "Security Profile" under Software, and click "Properties" next to "Services".

Select "CIM Server" and click the "Options" button; to enable it, select the "Start and stop with host" policy and click the Start button.

Thanks

Please consider marking this answer "correct" or "helpful" if you found it useful.

stoute · 08-26-2013

I disconnected and reconnected the host, restarted the CIM server, and reset the sensors; still not working.

I removed the host from the cluster and re-added it, then I got the message:

Hardware monitoring service not responding, the host is not connected

I restarted the CIM service and reset the sensors; still not working. I got the message "no new data", even after an update.

The next step was to restart the management agents.

After this step the server would not reconnect to the cluster, so I rebooted the server.

After this the server could reconnect. When I went back to the hardware status I got the message: hardware monitoring service not responding, the host is not connected.

When I checked the CIM Server service, it was running. Then I reset the sensors.

After this I got the message: Hardware monitoring service on this host is not responding or not available

Then I restarted the service and got:

A general system error occurred: Invalid fault

followed by: Call "HostServiceSystem.Restart" for object "serviceSystem-15403" on vCenter Server "servername-vsphere." failed.

Then I stopped the service, started it again, and reset the sensors, but the service is still not responding.

ScreamingSilenc · 08-26-2013

Can you please answer the questions below?

One of my DL380 G6 servers shows the error: hardware monitoring service on this host is not responding

1) Are the other two servers also HP DL380 G6? If so, is sensor information populating there?

2) Was Hardware Status populating with sensor information on the previous build (799733), before the upgrade?

3) Did you upgrade ESXi using the HP custom ISO image?

Thanks

stoute · 08-26-2013

1) I have two G6 servers. The other G6 does not have this issue.

2) The hardware status was available after the upgrade, but it is not now (days later).

3) I used the images from the VMware site, so no, not the HP image (install + upgrade).

ScreamingSilenc · 08-27-2013

That is strange behaviour. Could you please provide the syslog from all the hosts?

Do the sensors work if you connect directly to the host using the VI Client or the vCenter Web Interface?

Thanks

stoute · 08-27-2013

If I log in directly to the host with the vSphere Client, I get this error when I reset the sensors:

Call "HostHealthStatusSystem.ResetSystemHealthInfo" for object "healthStatusSystem" on ESXi "192.168.5.102" failed.

And if I restart the CIM service I get this error:

Call "HostServiceSystem.Restart" for object "serviceSystem" on ESXi "192.168.5.102" failed.

I also noticed that this task runs every minute:

Recompute virtual disk digest Completed vpxuser

When I try to configure the log file location, I get an error on the host with the issue:

Update option values A general system error occurred: Internal error

If I do the same on the other hosts I get no error.

When I do this on the host directly I get:

Call "OptionManager.UpdateValues" for object "ha-adv-options" on ESXi failed.

stoute · 08-27-2013

I used SSH to get the logs from this host

and the other log files

stoute · 08-27-2013

In the log I noticed:

2013-08-26T13:09:32Z Unknown: out of memory [274552]

I don't know if this will help, but I raised the host's memory in the system resource reservation from 0 MB to 892 MB.

When I restart the service I see this:

<soapenv:Envelope xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"

xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"

xmlns:xsd="http://www.w3.org/2001/XMLSchema"

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<soapenv:Header>

<taskKey xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:vim25" versionId="dev2" xsi:type="xs

</soapenv:Header>

<soapenv:Body>

<HostImageConfigGetAcceptance xmlns="urn:vim25"><_this type="HostImageConfigManager">ha-image-config-manager</_this></HostImageConfigGetAcceptance>

</soapenv:Body>

</soapenv:Envelope>^@

2013-08-27T11:14:41Z ImageConfigManager: [2013-08-27 11:14:41,606 vmware.runcommand INFO] runcommand called with: args = '['/sbin/esxcfg-advcfg', '-U', 'host-

2013-08-27T11:14:41Z ImageConfigManager: [2013-08-27 11:14:41,646 root DEBUG] <?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope xmlns:soapenc="http:

<soapenv:Body><HostImageConfigGetAcceptanceResponse xmlns='urn:vim25'><returnval>partner</returnval></HostImageConfigGetAcceptanceResponse></soapenv:Body></so

2013-08-27T11:14:54Z sfcbd-watchdog: Terminating watchdog process with PID 59406

2013-08-27T11:14:54Z sfcbd-watchdog: stopping sfcbd pid

2013-08-27T11:14:54Z sfcbd: Sending TERM signal to sfcbd

2013-08-27T11:14:54Z EHCMD: ***ERR***:StopService:440:Stopping EHCM service

2013-08-27T11:14:54Z EHCMD: ***ERR***:DellEqlProviderCleanup:125:CIM Provider is unloading

2013-08-27T11:14:54Z EHCMD: ***ERR***:DellEqlProviderMethodCleanup:390:SFCB requested to unload our provider

2013-08-27T11:14:54Z EHCMD: ***ERR***:DellEqlProviderMethodCleanup:395:CIM Provider is unloading

2013-08-27T11:14:54Z EHCMD: ***ERR***:Poll:242:Error polling driver: 0x4

2013-08-27T11:14:54Z sfcbd: Sending TERM signal to sfcbd

2013-08-27T11:14:54Z sfcbd-watchdog: Sleeping for 20 seconds

2013-08-27T11:15:01Z crond[8695]: crond: USER root pid 62719 cmd /sbin/hostd-probe

2013-08-27T11:15:02Z syslog[62720]: starting hostd probing.

2013-08-27T11:15:14Z sfcbd-watchdog: Providers have terminated, lets kill the sfcbd.

2013-08-27T11:15:14Z sfcbd: Stopping sfcbd

2013-08-27T11:15:14Z sfcb-CIMXML-Processor[62751]: [62751] Semop failed (semid:655362 sem_num:4 sem_op:-1 sem_flg:4096 nsops:1) rc -1, error 22:Invalid argume

2013-08-27T11:15:14Z sfcbd-watchdog: Sleeping for 20 seconds

2013-08-27T11:15:17Z syslog[62720]: hostd probing is done.

Could this have something to do with the Dell multipathing module I have installed (dell-eql-mem-esx5-1.1.2.292203)?
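To read a restart sequence like the one in the log above, it can help to turn the timestamps into durations. The following is only a rough sketch in Python, meant to be run on a workstation against a copy of the syslog rather than on the host; the helper names `parse` and `seconds_between` are made up for this example, and the sample lines are copied from the excerpt.

```python
from datetime import datetime

# A few lines copied from the syslog excerpt in this post.
LOG = """\
2013-08-27T11:14:54Z sfcbd-watchdog: Terminating watchdog process with PID 59406
2013-08-27T11:14:54Z sfcbd: Sending TERM signal to sfcbd
2013-08-27T11:14:54Z EHCMD: ***ERR***:DellEqlProviderCleanup:125:CIM Provider is unloading
2013-08-27T11:14:54Z sfcbd-watchdog: Sleeping for 20 seconds
2013-08-27T11:15:14Z sfcbd-watchdog: Providers have terminated, lets kill the sfcbd.
2013-08-27T11:15:14Z sfcbd: Stopping sfcbd
"""


def parse(line):
    """Split one syslog line into (timestamp, tag, message)."""
    stamp, rest = line.split(" ", 1)
    tag, _, message = rest.partition(": ")
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ"), tag, message


def seconds_between(log, start_text, end_text):
    """Seconds from the first line containing start_text to the next
    line containing end_text, or None if either is missing."""
    start = None
    for line in log.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between entries
        when, _tag, message = parse(line)
        if start is None:
            if start_text in message:
                start = when
        elif end_text in message:
            return (when - start).total_seconds()
    return None


print(seconds_between(LOG, "Terminating watchdog", "Stopping sfcbd"))
```

For these lines the stop takes 20 seconds, which matches the watchdog's "Sleeping for 20 seconds" entry, so the stop itself seems to follow the normal watchdog timing.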

ScreamingSilenc · 08-27-2013

You're right, there might be two issues now: first, your system RAM disk is full, and second, your Dell multipathing module is causing sfcbd to be killed.

From the syslog:

2013-08-26T13:52:33Z sfcb-vmware_base[11686]: *** dlopen error: libhd.so.18: failed to map segment from shared object: No space left on device

2013-08-26T13:52:33Z sfcb-vmware_base[11686]: *** Failed to load libomc_smbios_provider.so for SMBIOSChassisProvider ((null)

1) Can you check whether your RAM disk is full?

# esxcli system visorfs ramdisk list

2) To root-cause this issue, can you remove your Dell multipathing module, reboot the host, and then check:

a) whether the ramdisk is freed up

b) whether the CIM Server is running and Hardware Status is populating.

To list all installed software:

esxcli software vib list

To remove the "provider-vib":

esxcli software vib remove -n <provider-vib-name>

Thanks

stoute · 08-27-2013

1)

~ # esxcli system visorfs ramdisk list
Ramdisk Name  System   Reserved     Maximum      Used  Peak Used  Free  Reserved Free  Maximum Inodes  Allocated Inodes  Used Inodes  Mount Point
------------  ------  ---------  ----------  --------  ---------  ----  -------------  --------------  ----------------  -----------  ---------------------------
root            true  32768 KiB   32768 KiB   460 KiB    476 KiB  98 %           98 %            8192              4096         2516  /
etc             true  28672 KiB   28672 KiB   280 KiB    316 KiB  99 %           99 %            4096              1024          445  /etc
tmp            false   2048 KiB  196608 KiB     4 KiB    344 KiB  99 %           99 %            8192               256            3  /tmp
hostdstats     false      0 KiB  454656 KiB  7532 KiB   7532 KiB  98 %            0 %            8192                32            4  /var/lib/vmware/hostd/stats
~ #

2) I'll have to check this later today.
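The percentages in this listing are easy to misread, because the column is headed Free and not Used. As a hedged illustration only (Python on a workstation, with the four rows above pasted in as sample data; the function name `ramdisk_usage` and the regular expression are assumptions made for this sketch, not part of any VMware tool), the used share can be recomputed from the Used and Maximum columns:

```python
import re

# Rows copied from the `esxcli system visorfs ramdisk list` output above.
SAMPLE = """\
root true 32768 KiB 32768 KiB 460 KiB 476 KiB 98 % 98 % 8192 4096 2516 /
etc true 28672 KiB 28672 KiB 280 KiB 316 KiB 99 % 99 % 4096 1024 445 /etc
tmp false 2048 KiB 196608 KiB 4 KiB 344 KiB 99 % 99 % 8192 256 3 /tmp
hostdstats false 0 KiB 454656 KiB 7532 KiB 7532 KiB 98 % 0 % 8192 32 4 /var/lib/vmware/hostd/stats
"""

# One data row: name, system flag, four KiB sizes, two percentages,
# three inode counts, mount point. Extra spacing between columns is fine.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<system>true|false)\s+"
    r"(?P<reserved>\d+) KiB\s+(?P<maximum>\d+) KiB\s+"
    r"(?P<used>\d+) KiB\s+(?P<peak>\d+) KiB\s+"
    r"(?P<free_pct>\d+) %\s+(?P<reserved_free_pct>\d+) %\s+"
    r"(?P<max_inodes>\d+)\s+(?P<alloc_inodes>\d+)\s+(?P<used_inodes>\d+)\s+"
    r"(?P<mount>\S+)$"
)


def ramdisk_usage(text):
    """Return {ramdisk name: (percent of maximum space used,
    percent of maximum inodes used)}; non-matching lines are skipped."""
    usage = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # header, separator, prompt or blank line
        space_pct = 100.0 * int(m["used"]) / int(m["maximum"])
        inode_pct = 100.0 * int(m["used_inodes"]) / int(m["max_inodes"])
        usage[m["name"]] = (space_pct, inode_pct)
    return usage


for name, (space_pct, inode_pct) in ramdisk_usage(SAMPLE).items():
    print(f"{name}: {space_pct:.1f}% space used, {inode_pct:.1f}% inodes used")
```

Read this way, root is using 460 KiB of 32768 KiB, so the 98 % in the listing is free space, not used space.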

stoute · 08-27-2013

I just removed the Dell multipathing module.

After resetting the sensors I can see the hardware status on the ESXi host when logging in directly.

I checked the free space.

Then I checked from the cluster to see if it is the same there.

I had to reset the sensors again because I got the message "no new data available".

After the reset I got the same error.

Then I tried to see if I could configure the log location for the logs.

This works now.

It seems that I'm getting some kind of timeout error.

If I restart the service from the cluster I get a timeout error again.

If I restart the service directly on the host, no problem.

I then removed the host from the cluster and added it again to see if that works, but it does not help.

<soapenv:Envelope xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"

xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"

xmlns:xsd="http://www.w3.org/2001/XMLSchema"

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<soapenv:Header>

<taskKey xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:vim25" versionId="dev2" xsi:type="xsd:s

</soapenv:Header>

<soapenv:Body>

<HostImageConfigGetProfile xmlns="urn:vim25"><_this type="HostImageConfigManager">ha-image-config-manager</_this></HostImageConfigGetProfile>

</soapenv:Body>

</soapenv:Envelope>^@

2013-08-27T14:07:45Z ImageConfigManager: [2013-08-27 14:07:45,723 vmware.runcommand INFO] runcommand called with: args = '['/sbin/esxcfg-advcfg', '-U', 'host-

2013-08-27T14:07:45Z ImageConfigManager: [2013-08-27 14:07:45,761 imageprofile INFO] Adding VIB VMware_locker_tools-light_5.1.0-1.16.1157734 to ImageProfile (

2013-08-27T14:07:45Z ImageConfigManager: [2013-08-27 14:07:45,762 root DEBUG] <?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope xmlns:soapenc="http:

<soapenv:Body><HostImageConfigGetProfileResponse xmlns='urn:vim25'><returnval><name>(Updated) ESXi-5.1.0-799733-standard</name><vendor>esx-dgl02</vendor></ret

2013-08-27T14:09:16Z watchdog-vpxa: [13388] Begin '/usr/lib/vmware/vpxa/bin/vpxa ++min=0,swapscope=system,group=vpxa -D /etc/vmware/vpxa', min-uptime = 60, ma

2013-08-27T14:09:16Z watchdog-vpxa: Executing '/usr/lib/vmware/vpxa/bin/vpxa ++min=0,swapscope=system,group=host/vim/vmvisorswap/vpxa -D /etc/vmware/vpxa'

2013-08-27T14:09:47Z ImageConfigManager: [2013-08-27 14:09:47,662 vmware.runcommand INFO] runcommand called with: args = '['/sbin/bootOption', '-rp']', outfil

2013-08-27T14:09:47Z ImageConfigManager: [2013-08-27 14:09:47,668 vmware.runcommand INFO] runcommand called with: args = '['/sbin/bootOption', '-ro']', outfil

2013-08-27T14:09:47Z ImageConfigManager: [2013-08-27 14:09:47,693 root DEBUG] <?xml version="1.0" encoding="UTF-8"?>

<soapenv:Envelope xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"

xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"

xmlns:xsd="http://www.w3.org/2001/XMLSchema"

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<soapenv:Header>

<taskKey xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:vim25" versionId="dev2" xsi:type="xsd:s

</soapenv:Header>

<soapenv:Body>

<HostImageConfigGetAcceptance xmlns="urn:vim25"><_this type="HostImageConfigManager">ha-image-config-manager</_this></HostImageConfigGetAcceptance>

</soapenv:Body>

</soapenv:Envelope>^@

2013-08-27T14:09:47Z ImageConfigManager: [2013-08-27 14:09:47,694 vmware.runcommand INFO] runcommand called with: args = '['/sbin/esxcfg-advcfg', '-U', 'host-

2013-08-27T14:09:47Z ImageConfigManager: [2013-08-27 14:09:47,732 root DEBUG] <?xml version="1.0" encoding="UTF-8"?><soapenv:Envelope xmlns:soapenc="http:

<soapenv:Body><HostImageConfigGetAcceptanceResponse xmlns='urn:vim25'><returnval>partner</returnval></HostImageConfigGetAcceptanceResponse></soapenv:Body></so

2013-08-27T14:10:00Z sfcbd-watchdog: Terminating watchdog process with PID 12950

2013-08-27T14:10:00Z sfcbd-watchdog: stopping sfcbd pid

2013-08-27T14:10:00Z sfcbd: Sending TERM signal to sfcbd

ScreamingSilenc · 08-27-2013

This problem is still caused by the ramdisk being full, since it's 98% full. Can you check "/" (root) and try to delete any unwanted files or logs?

KB: VMware KB: Freeing ESXi inodes

VMware KB: RAM disk is full

stoute · 08-28-2013

I don't think I understand what you mean. If I run

vdf -h

Ramdisk     Size  Used  Available  Use%  Mounted on
root         32M  808K        31M    2%  --
etc          28M  280K        27M    0%  --
tmp         192M    4K       191M    0%  --
hostdstats  444M    6M       437M    1%  --

all disks have space.

I emptied all the log files.

Compared with another host that works, they have about the same free space.

After restarting the service and resetting the sensors, I now see processor and memory showing up in the Hardware Status tab.

ScreamingSilenc · 08-28-2013

Sorry, my bad, I didn't check the earlier screenshot properly.

After uninstalling the Dell module, did you restart the host or the hostd service?

Thanks

stoute · 08-28-2013

Yes, I just rebooted the host, but it is still not working.

When I hit the restart service button I get this in the syslog:

2013-08-28T11:24:04Z sfcb-CIMXML-Processor[11100]: [11100] Semop failed (semid:262146 sem_num:4 sem_op:-1 sem_flg:4096 nsops:1) rc -1, error 22:Invalid argument. Exiting...

2013-08-28T11:24:04Z sfcb-CIMXML-Processor[11107]: [11107] Semop failed (semid:262146 sem_num:4 sem_op:-1 sem_flg:4096 nsops:1) rc -1, error 22:Invalid argument. Exiting...

2013-08-28T11:24:04Z sfcbd-watchdog: Sleeping for 20 seconds

Stopping and then starting the service works:

2013-08-28T11:24:24Z sfcbd-watchdog: Providers have terminated, lets kill the sfcbd.

2013-08-28T11:24:24Z sfcbd: Stopping sfcbd

2013-08-28T11:24:24Z sfcbd-watchdog: Watchdog active: interval 60 seconds, pid 11137

2013-08-28T11:24:24Z sfcbd-watchdog: starting sfcbd

2013-08-28T11:24:25Z sfcbd: Starting sfcbd

2013-08-28T11:24:25Z sfcb-sfcb[11270]: --- Log syslog level: 3

2013-08-28T11:24:30Z cimslp: --- Using /etc/sfcb/sfcb.cfg

2013-08-28T11:24:37Z cimslp: Found 17 profiles in namespace root/interop

2013-08-28T11:25:01Z crond[8677]: crond: USER root pid 11307 cmd /sbin/hostd-probe

2013-08-28T11:25:01Z syslog[11308]: starting hostd probing.

2013-08-28T11:25:16Z syslog[11308]: hostd probing is done.

2013-08-28T11:25:22Z sfcbd-watchdog: Terminating watchdog process with PID 11137

2013-08-28T11:25:22Z sfcbd-watchdog: stopping sfcbd pid

2013-08-28T11:25:22Z sfcbd: Sending TERM signal to sfcbd

2013-08-28T11:25:22Z sfcbd-watchdog: Providers have terminated, lets kill the sfcbd.

2013-08-28T11:25:22Z sfcbd: Stopping sfcbd

2013-08-28T11:25:34Z sfcbd-watchdog: Watchdog active: interval 60 seconds, pid 11384

2013-08-28T11:25:34Z sfcbd-watchdog: starting sfcbd

2013-08-28T11:25:34Z sfcbd: Starting sfcbd

2013-08-28T11:25:35Z sfcb-sfcb[11518]: --- Log syslog level: 3

2013-08-28T11:25:40Z cimslp: --- Using /etc/sfcb/sfcb.cfg

2013-08-28T11:25:47Z cimslp: Found 17 profiles in namespace root/interop

2013-08-28T11:26:17Z sfcb-ProviderManager[11518]: cimRPmodule -- Failed to set container group 6208 parent to 2635. Admission check failed for memory resource : 195887233

2013-08-28T11:26:17Z sfcb-ProviderManager[11518]: Cannot move provider 11567 to group vmware_raw

2013-08-28T11:26:17Z sfcb-ProviderManager[11518]: cimRPmodule -- Failed to set container group 6210 parent to 2635. Admission check failed for memory resource : 195887233

2013-08-28T11:26:17Z sfcb-ProviderManager[11518]: Cannot move provider 11568 to group vmware_raw

2013-08-28T11:26:17Z sfcb-ProviderManager[11518]: cimRPmodule -- Failed to set container group 6212 parent to 2635. Admission check failed for memory resource : 195887233

2013-08-28T11:26:17Z sfcb-ProviderManager[11518]: Cannot move provider 11569 to group vmware_raw

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: rcvMsg receiving from 46 11560-12 Cannot allocate memory

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 13468 payLoadSize 4 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 336 payLoadSize 6420 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 55 payLoadSize 132 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 5 payLoadSize 0 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 0 payLoadSize 1 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 11 payLoadSize 5888 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: ### 11560 ??? 5632-256

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 25 payLoadSize 0 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 0 payLoadSize 0 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: spGetMsg receiving from 46 11560-14 Bad address

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: Error getting partial message, freeing allocated data

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: rcvMsg receiving from 46 11560-14 Bad address

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 8336 payLoadSize 65537 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 4 payLoadSize 2100 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 2164 payLoadSize 0 chunkSize

2013-08-28T11:26:17Z sfcb-vmware_base[11560]: ### 11560 ??? 5632-256

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: Exceeded number of tries waiting for control message to be consumed. Control message discarded.

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 97 payLoadSize 0 chunkSize

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 0 payLoadSize 0 chunkSize

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: spGetMsg receiving from 46 11560-14 Bad address

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: Error getting partial message, freeing allocated data

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: rcvMsg receiving from 46 11560-14 Bad address

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 5632 payLoadSize 196611 chunkSize

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 4 payLoadSize 2452 chunkSize

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 2516 payLoadSize 0 chunkSize

2013-08-28T11:26:38Z sfcb-vmware_base[11560]: ### 11560 ??? 5632-256

2013-08-28T11:27:19Z sfcb-vmware_base[11560]: ERROR READING FROM SOCKET
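With this many sfcb messages, a quick tally by error type makes it easier to see what dominates. The sketch below is one possible way to do that in Python on a copy of the syslog; the category labels and the `classify` helper are invented for this example and are not VMware terminology, and the sample lines are copied from the logs in this thread.

```python
import re
from collections import Counter

# Hypothetical labels mapped to message fragments that appear in this thread.
SIGNATURES = {
    "semaphore": re.compile(r"Semop failed"),
    "memory_admission": re.compile(r"Admission check failed for memory resource"),
    "alloc_failure": re.compile(r"Cannot allocate memory|out of memory"),
    "no_space": re.compile(r"No space left on device"),
    "bogus_request": re.compile(r"drop bogus request"),
    "socket": re.compile(r"ERROR READING FROM SOCKET"),
}

# Lines copied from the syslog excerpts posted above.
SAMPLE = [
    "2013-08-28T11:24:04Z sfcb-CIMXML-Processor[11100]: [11100] Semop failed (semid:262146 sem_num:4 sem_op:-1 sem_flg:4096 nsops:1) rc -1, error 22:Invalid argument. Exiting...",
    "2013-08-28T11:26:17Z sfcb-ProviderManager[11518]: cimRPmodule -- Failed to set container group 6208 parent to 2635. Admission check failed for memory resource : 195887233",
    "2013-08-28T11:26:17Z sfcb-ProviderManager[11518]: Cannot move provider 11567 to group vmware_raw",
    "2013-08-28T11:26:17Z sfcb-vmware_base[11560]: rcvMsg receiving from 46 11560-12 Cannot allocate memory",
    "2013-08-28T11:26:17Z sfcb-vmware_base[11560]: --- spRcvMsg drop bogus request chunking 13468 payLoadSize 4 chunkSize",
    "2013-08-28T11:27:19Z sfcb-vmware_base[11560]: ERROR READING FROM SOCKET",
]


def classify(lines):
    """Count lines per signature; a line is counted under the first
    signature it matches, and unmatched lines are ignored."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
                break
    return counts


for label, count in classify(SAMPLE).most_common():
    print(label, count)
```

Run over the whole file instead of this sample (for example `classify(open("syslog.log"))`, where the file name is a placeholder), it shows whether the memory-related messages or the semaphore errors appear first and most often.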

ScreamingSilenc · 08-28-2013

I don't really know what's causing this problem. I would suggest opening a case with the VMware Support team; this might be a bug.

Thanks,

stoute · 08-28-2013

Well, I just started a new install of ESXi to see if this resolves my issue.

Thanks for the help and input.

stoute · 08-28-2013

After I reinstalled the host the hardware status was good (I used a new build ISO, # 1065491).

After this I configured my host, then patched it to # 1157734.

Then I checked whether the hardware status was OK. All sensors were Normal, but the HP ProLiant DL380 G6 entry was Unknown; after a minute it was Normal.

So not a real issue. I then installed the Dell multipathing module; after this the sensors were still good. Before adding the host to the cluster I rebooted the server again.

Still no problem.

Before adding the host back into the cluster I removed the old host, and then added the new host to the cluster.

Now the Hardware Status tab is still OK.

The issue seems to be resolved for now.

ScreamingSilenc · 08-28-2013

Good to know. If you have the logs, you can still go ahead and report this issue to VMware; it could potentially be a bug.

Thanks
