VMware Cloud Community
Rynardts
Enthusiast

ESX 3.5 vpxa agent keeps on shutting down

Hi All,

I've always been game for a good challenge, but this one is just doing my head in now.

Yesterday, one of our ESX hosts (let's call it ESX01) dropped off vCenter (Host not responding).

Normally, running service mgmt-vmware restart and service vmware-vpxa restart fixes the problem. This time it didn't. After restarting the management and VC agents, the host reconnected to the VC for only about a minute and then dropped off again.

So, like any ESX admin, I jumped onto the host using SSH and started reading the log files.

I first checked /var/log/vmware/hostd.log. The last entries in the log show the following (note the lines that say "Broken pipe"):

Task Completed : haTask-pool3-vim.ResourcePool.updateChildResourceConfiguration-122
Task Created : haTask-ha-root-pool-vim.ResourcePool.updateConfig-124
Task Completed : haTask-ha-root-pool-vim.ResourcePool.updateConfig-124
Task Created : haTask-ha-root-pool-vim.ResourcePool.updateConfig-125
Task Completed : haTask-ha-root-pool-vim.ResourcePool.updateConfig-125
VMLicense_GetInfo: feature not found: 'ESX_STARTER_BACKUP' v2005.05
Event 31 : User vpxuser@127.0.0.1 logged in
Event 32 : User vpxuser logged out
Event 33 : User vpxuser@127.0.0.1 logged in
Event 34 : User vpxuser logged out
Event 35 : User vpxuser@127.0.0.1 logged in
Failed to send response to the client: Broken pipe
Event 36 : User vpxuser@127.0.0.1 logged in
Event 37 : User vpxuser logged out
Event 38 : User vpxuser@127.0.0.1 logged in
Event 39 : User vpxuser logged out
Event 40 : User vpxuser@127.0.0.1 logged in
Failed to send response to the client: Broken pipe
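
For anyone comparing their own hostd.log against the excerpt above, here is a minimal Python sketch that pulls out each "Broken pipe" line together with the line just before it. The function name and the sample list are hypothetical and only for illustration. On a real host you would feed it the lines of /var/log/vmware/hostd.log instead.

```python
from typing import Iterable, List, Tuple


def broken_pipe_context(
    lines: Iterable[str], needle: str = "Broken pipe"
) -> List[Tuple[int, str, str]]:
    """Return (line_number, previous_line, matching_line) for each line containing needle."""
    hits = []
    previous = ""
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if needle in text:
            hits.append((number, previous, text))
        previous = text
    return hits


# Sample lines copied from the hostd.log excerpt in this post.
sample = [
    "Event 34 : User vpxuser logged out",
    "Event 35 : User vpxuser@127.0.0.1 logged in",
    "Failed to send response to the client: Broken pipe",
    "Event 36 : User vpxuser@127.0.0.1 logged in",
]

for number, before, hit in broken_pipe_context(sample):
    print(f"line {number}: {hit!r} (preceded by {before!r})")
```

In the excerpt, both failures come straight after a vpxuser login with no matching logout, which is the pattern this helper makes easy to spot.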

I then looked at /var/log/vmware/vpx/vpxa.log and it shows:

Found diagnostic manager vim.DiagnosticManager:ha-diagnosticmgr+
Retrieved disk manager
Retrieved nfc service
Found AuthorizationManager vim.AuthorizationManager:ha-authmgr
Creating new singleton standalone impl
Starting constructor for VpxaHalResourcePoolHostTypeImpl...
Finished constructor for VpxaHalResourcePoolHostTypeImpl...
Received callback in WaitForUpdatesDone
-> eip 0x8ed4ffc
-> eip 0x8ed50b9
-> eip 0x8ed81a3
-> eip 0x8edb4b1
-> eip 0x8edb667
-> eip 0x8fa1942
-> eip 0x8f9ccf1
-> eip 0x8f9c842
-> eip 0x907f6cb
-> eip 0x8948627
-> eip 0x8946ce6
-> eip 0x54f79a
-> eip 0x8946781
Received unexpected error from property collector: at line number 14, not well-formed (invalid token)
eip 0x909fc52
eip 0x9045304
eip 0x9081775
eip 0x8ed4ffc
eip 0x8ed50b9
eip 0x8ed81a3
eip 0x8edb4b1
eip 0x8edb667
eip 0x8fa1942
eip 0x8f9ccf1
eip 0x8f9c842
eip 0x907f6cb
eip 0x8948627
eip 0x8946ce6
eip 0x54f79a
eip 0x8946781
Can't connect to hostd/serverd. Shutting down...
Shutting down now
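
The wording of the property collector error in the vpxa.log excerpt above, "not well-formed (invalid token)", is the same error string the expat XML parser uses when it hits a character or piece of markup that is not valid XML. Whether vpxa actually uses expat is an assumption on my part and nothing in the logs confirms it. As a minimal sketch, this Python snippet reproduces the same message with a made-up document that has a control character on line 14. The document and the function name are hypothetical.

```python
import xml.parsers.expat
from typing import Optional, Tuple


def describe_xml_error(document: bytes) -> Optional[Tuple[str, int]]:
    """Parse document with expat; return (error text, line number), or None if it parses."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError as error:
        return xml.parsers.expat.ErrorString(error.code), error.lineno
    return None


# Hypothetical document: line 1 opens <a>, lines 2-13 are empty <b/> elements,
# and line 14 contains a raw 0x01 control character, which XML 1.0 does not allow.
bad_document = b"<a>\n" + b"<b/>\n" * 12 + b"<c>\x01</c>\n</a>"

print(describe_xml_error(bad_document))
```

If the assumption holds, it would suggest that something hostd hands back to vpxa contains a character that is not legal in XML, but that is only a guess from the message text.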

By restarting the agents every 3 minutes, I then managed to get onto the host just long enough to migrate all VMs off the host. I then rebooted the host and the problems went away. However, I left it a couple of minutes and bang! Host ESX05 in the same cluster started doing exactly the same! I followed the same procedure with ESX05 and got that fixed but now ESX10 is doing it!

Has anyone come across this or a similar issue? I've checked the VC agent licenses and they're fine. I've also checked the ESX hosts' local disk space, and there's heaps available!

One thing I have to mention though. You can use the VI Client to connect directly to the ESX host, so I know hostd is running fine. You just cannot use the VC to manage the server as vpxa keeps on shutting down.

Any help would be much appreciated.

Regards

Rynardt Spies

VCP / vExpert

Rynardt Spies VCP | VCAP-DCA#50 | VCAP-DCD#129 www.virtualvcp.com
3 Replies
Lightbulb
Virtuoso

Have you tried restarting vpxd on the VC server?

Also, is there anything interesting in C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\Logs ?

Rynardts
Enthusiast

Hi,

Yes, that was one of the first things I did on Monday morning to try and see if it would resolve the issue. I also had to restart vpxd yesterday because of another issue. The logs on the VC look fine with no unexpected entries.

Last night I placed ESX10 in maintenance mode to ensure that VCB would be able to mount all VMs for backups. This morning I saw that ESX10's vpxa agent was still running fine. I placed the server back in production and it's been running fine without problems for about an hour now.

The problem seems to have sorted itself out for now. I'll monitor it throughout the day and week and report back with the results.

Thanks

Rynardt Spies

www.virtualvcp.com

VCP / vExpert

arthurdent78
Contributor

Did you ever manage to get to the bottom of this?

I am seeing the exact same thing in my environment. I am a little suspicious of the NAS we attached, as it all worked fine before that.

Any help would be appreciated.

P.S.

I have obviously tried everything that was suggested in this conversation. I noticed that when I log on to the host after restarting the service, I can see "update child resource configuration" running constantly in the task bar. It is really running the host CPU high doing this and I can't seem to find a way of stopping it. I was wondering if you were seeing the same thing?
