VMware Cloud Community
WiEmWaerUser
Enthusiast

Connection to management agent not working anymore - how to find the reason(s)?

After some days, we can no longer connect to the ESXi 4.1 host on our new HP ML350 G6 from a vSphere Client (on Win XP). The guests are not affected and run fine. If we restart the management agent manually, the connection works again for the next couple of days.

How can we find out why this happens? We only know that it doesn't seem to make a difference whether one or more virtual machines are running as guests on this HP server.

Can you please tell us where we can look for the 'typical' or even known reasons for this behaviour?

Which logfiles should we check?

Which entries?

Which settings can be wrong on a freshly installed ESXi 4.1, or should be checked?

2 Replies
GreatWhiteTec
VMware Employee

I would start with the basics. The first thing I would look at is whether you have the right agent. You mentioned that you recently upgraded to 4.1, so I am hoping/assuming that you upgraded vCenter, Update Manager, and then the host (yes, in that order).

I would also check whether the heartbeat between vCenter and the host is consistent. As with MSCS, having a separate network for heartbeat/management is the best way to do it, but we all know that sometimes it is just not feasible.

You can also check the vCenter server and make sure it has enough resources (RAM, CPU, etc.).

This should keep you busy for a while...

How to check for agent <http://kb.vmware.com/selfservice/microsites/search.do?cmd=displayKC&docType=kc&externalId=1003714&sliceId=1&docTypeID=DT_KB_1_1&dialogID=116984414&stateId=0 0 116982967>

WiEmWaerUser
Enthusiast

Thanks for your advice!

Yes, we use the matching agent.

We are not sure what exactly you mean by "consistent heartbeat". How can we check this?

A separate network is not possible in our situation.

The resources seem to be sufficient; we can't find a bottleneck for RAM, CPU, disk, etc.

But we suspect that one of our self-made checks could be the reason:

Every minute, we download a MOB web page and parse the values we need from it. If we do this only every 30 minutes, the agent crashes seem to be (nearly) gone.

Could it be that downloading a MOB page too frequently crashes the agent?

We can't see a relation between the MOB and the agent. Is there one?
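
For illustration, a check like the one described above could be throttled roughly as follows. This is only a sketch under assumptions: the `fetch` callable (whatever downloads the MOB page), the class name `ThrottledCheck`, and the two-cell table-row pattern are hypothetical and not part of any VMware API; the real MOB markup may look different. The idea is simply that the page is downloaded at most once per interval (30 minutes here, as in the post) and cached values are returned in between.

```python
import re
import time
from typing import Callable, Dict, Optional, Set

# Hypothetical pattern: assumes each value sits in a row like
# "<td>name</td><td>value</td>". The real MOB page may be laid out differently.
ROW = re.compile(r"<td[^>]*>([^<]+)</td>\s*<td[^>]*>([^<]*)</td>", re.IGNORECASE)


def parse_values(html: str, wanted: Set[str]) -> Dict[str, str]:
    """Return only the wanted name/value pairs found in the page."""
    found = {}
    for name, value in ROW.findall(html):
        if name.strip() in wanted:
            found[name.strip()] = value.strip()
    return found


class ThrottledCheck:
    """Downloads the page at most once per min_interval_s and caches the result."""

    def __init__(
        self,
        fetch: Callable[[], str],          # hypothetical: returns the page HTML
        wanted: Set[str],
        min_interval_s: float = 30 * 60,   # 30 minutes, as tried in the post
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._wanted = wanted
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_fetch: Optional[float] = None
        self._cached: Dict[str, str] = {}

    def read(self) -> Dict[str, str]:
        now = self._clock()
        due = (
            self._last_fetch is None
            or now - self._last_fetch >= self._min_interval_s
        )
        if due:
            # Only here does a request actually reach the host.
            self._cached = parse_values(self._fetch(), self._wanted)
            self._last_fetch = now
        return self._cached
```

With this shape, the monitoring script can still call `read()` every minute, while the host only sees one request per interval. Whether that load is really what crashes the agent is exactly the open question here.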
