We are using Nagios to monitor ESXi.
Whenever we get the alert "CHECK_NRPE: Socket timeout after 20 seconds.",
we restart or stop/start sfcbd-watchdog.
But this happens a lot.
Most of the time the restart fails, and only a stop followed by a start does the trick and makes the Nagios alert go away.
So I tailed syslog.log, and whenever the alert appears I see the errors below.
Then I have to stop and start the service again, and the alert clears again.
2014-09-04T22:59:36Z sfcb-ProviderManager[1098360]: SendMsg sending to 1 1098360-9 Bad file descriptor
2014-09-04T22:59:36Z sfcbd[1098566]: Error opening socket pair for getProviderContext: Too many open files
2014-09-04T22:59:36Z sfcbd[1098566]: Failed to set recv timeout (30) for socket -1. Errno = 9
2014-09-04T22:59:36Z sfcbd[1098566]: Failed to set timeout for local socket (e.g. provider)
2014-09-04T22:59:36Z sfcbd[1098566]: spGetMsg receiving from -1 1098566-9 Bad file descriptor
2014-09-04T22:59:36Z sfcbd[1098566]: rcvMsg receiving from -1 1098566-9 Bad file descriptor
2014-09-04T22:59:36Z sfcbd[1098566]: Error getting provider context from provider manager: 9 (1098566)
2014-09-04T22:59:36Z sfcb-ProviderManager[1098360]: SendMsg sending to 1 1098360-9 Bad file descriptor
2014-09-04T22:59:36Z sfcbd[1098566]: Failed to set send timeout (30) for socket 11. Errno = 9
2014-09-04T22:59:36Z sfcbd[1098566]: SendMsg sending to 11 1098566-9 Bad file descriptor
2014-09-04T22:59:36Z sfcbd[1098566]: spSendMsg sending to 11 1098566-9 Bad file descriptor
2014-09-04T22:59:43Z sfcbd[1098566]: Error opening socket pair for getProviderContext: Too many open files
2014-09-04T22:59:43Z sfcbd[1098566]: Failed to set recv timeout (30) for socket -1. Errno = 9
2014-09-04T22:59:43Z sfcbd[1098566]: Failed to set timeout for local socket (e.g. provider)
2014-09-04T22:59:43Z sfcbd[1098566]: spGetMsg receiving from -1 1098566-9 Bad file descriptor
2014-09-04T22:59:43Z sfcbd[1098566]: rcvMsg receiving from -1 1098566-9 Bad file descriptor
2014-09-04T22:59:43Z sfcb-ProviderManager[1098360]: SendMsg sendin
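To see which error dominates over a longer stretch of the log, the messages can be tallied per process. This is only a minimal sketch, assuming every line has the same layout as the excerpt above (timestamp, then process[pid], then the message). The function name `tally_sfcb_errors` is made up for illustration, and the script is meant to be run against a copy of the log on another machine.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt above:
#   2014-09-04T22:59:36Z sfcbd[1098566]: <message>
LINE_RE = re.compile(r"^(\S+)\s+(sfcb[\w-]*)\[(\d+)\]:\s+(.*)$")


def tally_sfcb_errors(lines):
    """Count sfcb log messages per (process name, failure reason)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not an sfcb line
        _timestamp, process, _pid, message = match.groups()
        if "Too many open files" in message:
            reason = "Too many open files"
        elif "Bad file descriptor" in message or "Errno = 9" in message:
            # errno 9 is EBADF, i.e. "Bad file descriptor"
            reason = "Bad file descriptor"
        else:
            reason = "other"
        counts[(process, reason)] += 1
    return counts
```

It could be called as `tally_sfcb_errors(open("syslog.log"))`, with the path pointing at wherever the log was copied to. If "Too many open files" shows up first in each burst, the "Bad file descriptor" lines are probably just follow-on failures from the same exhausted descriptor limit.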
I was planning to reinstall ESXi using the HP customized image, since one of the forum threads I found (linked below) shows the same error and says it disappeared after switching to the customized image.
Does the latest HP customized image provide its own CIM bundle?
https://communities.vmware.com/message/2130426
I have an HP ProLiant DL385 G2.