VMware Cloud Community
TylerDurden77
Enthusiast

Strange problems with one of my ESXi 4.1 U1 servers

Hi admins,

I have a lot of problems with one of my HP DL385 G7 servers.

This host randomly disconnects itself from my vCenter (vCenter 4.1 U1).

When this happens, I try to restart the management agents from the DCUI or with /sbin/services.sh restart.

But nothing happens!

I then try /sbin/reboot, but still nothing happens...

I have updated all the firmware on this server according to HP's recommendations.

I have reinstalled the server 3 times with different media just to be sure...

The only way to solve the problem is to power-cycle the server by pushing the power button, and when I do this I of course have to shut down my VMs :(

Any input from you before I raise an SR?

Regards

Tyler

10 Replies
idle-jam
Immortal

There are many possibilities, but since you are able to create an SR, I would advise doing so, because the root cause can be identified quickly and precisely from the logs. I'm leaning towards a hardware-related issue.

krishna_v78
Enthusiast

Hi Tyler,

First, is your management IP pingable while the host is unresponsive?

1. Check the vpxd service status on the ESX host.

2. Check for errors in /var/log/messages.

3. In vCenter, look for any errors in Tasks and Events and also in the logs.
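
The log check in step 2 could be sketched as a small shell helper. This is only a sketch: the function name scan_log_errors and the keyword list are assumptions made for illustration, not anything defined by VMware, and the keywords will probably need tuning. It prints the most recent lines of a log file that mention common failure words.

```shell
# Hypothetical helper (not a VMware tool): show the last COUNT lines of a
# log file that mention common failure keywords.
# Usage: scan_log_errors FILE [COUNT]
scan_log_errors() {
    log_file="$1"
    count="${2:-20}"   # default to the 20 most recent matches

    # -i: ignore case, -E: extended regex. The keyword list is an assumption.
    grep -i -E 'error|fail|timed out|timeout' "$log_file" | tail -n "$count"
}
```

From Tech Support Mode or SSH this might be called as scan_log_errors /var/log/messages 50, ideally while the host is in the disconnected state.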

Regards,

Balu.

TylerDurden77
Enthusiast

Hi,

Thanks for your replies.

As idle-jam says, it feels like a hardware problem. I agree, but on this particular server I have already replaced the system board because of this problem, and that didn't help. (I also have 3 more servers with the same hardware spec that don't have this strange problem.)

Maybe I have to replace the RAM and CPUs as well... (All HP tests say that everything is top notch.)

HP ProLiant DL385 G7

AMD Opteron 6174

144 GB RAM

@krishna

The server is pingable all the time.

I can access it through SSH.

How can I check the vpxd service on the server when this problem occurs? (I can't use the VI Client, and the server is disconnected in vCenter.)

I can't see anything special in the logs (but I'm not a very good log interpreter, I think ;)).

In vCenter I just see that the host is disconnected...

Regards

Tyler

TylerDurden77
Enthusiast

Update:

When I restart the management agents from the DCUI, it seems to hang on "starting USB arbitrator"...

As before, a power cycle solves the problem...

krishna_v78
Enthusiast

Hi,

You can check the vpxd service using local or remote Tech Support Mode: service vpxd status

Also check the power options in the BIOS; it looks like the server is not able to return from sleep mode.

Regards,

Balu.

TylerDurden77
Enthusiast

Hi Balu,

See below.

Any ideas?

~ # chkconfig --list
DCUI           on
TSM-SSH        off
TSM            off
usbarbitrator  on
lbtd           on
storageRM      on
sensord        on
vprobed        on
vobd           on
wsman          on
slpd           on
sfcbd-watchdog on
sfcbd          off
ntpd           on
hostd          on
netlogond      off
lwiod          off
lsassd         off
iked           off
vmware-vpxa    on
~ # service vpxd status
-ash: service: not found

krishna_v78
Enthusiast

Hi,

Can you confirm whether the vCenter agent (vmware-vpxa) is in a running state when the host's status changes to disconnected in vCenter Server?

Try disabling the power-saving settings in the BIOS. This will help us isolate the problem.
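
One way to answer the "is vmware-vpxa running" question from the ESXi shell might look like the sketch below. A few caveats: the agent on the host is vpxa (listed as vmware-vpxa in the chkconfig output earlier in this thread), while vpxd is the service that runs on the vCenter Server itself, and the "-ash: service: not found" output earlier in this thread shows that the service command is not available in this ESXi shell. The function names service_enabled and process_running are hypothetical helpers, and the assumption that the agent appears under the name vpxa in the ps listing should be verified on the host.

```shell
# Hypothetical helpers (not VMware tools). Both read a listing on stdin.

# service_enabled NAME: succeeds if "chkconfig --list" style input
# contains a line "NAME on".
service_enabled() {
    while read -r name state; do
        if [ "$name" = "$1" ] && [ "$state" = "on" ]; then
            return 0
        fi
    done
    return 1
}

# process_running NAME: succeeds if any line of a process listing mentions
# NAME. Lines containing "grep" are dropped to avoid matching ourselves.
process_running() {
    grep -v 'grep' | grep -q -- "$1"
}
```

On the host these could be used as chkconfig --list | service_enabled vmware-vpxa && echo enabled and ps | process_running vpxa && echo running. "Enabled" only means the service is configured to start, so the process check is the one that matters during a disconnect.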

Regards,

Balu.

TylerDurden77
Enthusiast

Hi,

I found someone else who has the same problem as me.

This describes my problem exactly:

--------------------------------------------------------------------------

http://www.experts-exchange.com/Software/VMWare/Q_26366554.html

My new ESXi 4.1 server is having major connectivity problems. It will run fine for a day or so, then suddenly disconnect itself from my vCenter server. All attempts to reconnect fail. Restarting the management agents works, but does nothing to actually bring the connectivity back up.

I can ping the machine. I have changed the IP to a different one, and I still couldn't connect. I have tried removing it from vCenter but I cannot even add it back in. When I try to add it back in I get the following error:

Call "Datacenter.QueryConnectionInfo" for object "Test ESXi" on vCenter Server "Test" failed.

I have tried to connect via DNS name, IP, and FQDN DNS name. I can ping the DNS name and it works fine. All firewalls between the machines are down. Like I said in the beginning, it will work for about a day or so, then disconnect. It will refuse to connect until rebooted. So the system works, but something on the server is crashing. The tests of the management network also work fine.

So really I have 2 parts to this question. What is failing and how to fix it? How to recover without rebooting my host? I imagine I could restart a service or something similar, but I am unfamiliar with the ESXi commands :(.

If you need further logs please just ask. Also, I am running vCenter 4.1 U1, on Server 2008 R2. So there should be no incompatibility between them. Also, I cannot connect to the host via HTTP or HTTPS. It just times out. SSH works though.

-----------------------------------------------------------------------------------------------

Any advice??

BR

Tyler

TylerDurden77
Enthusiast

Hi,

This morning my ESX server was disconnected from vCenter again.

50% of the VMs on this host were unresponsive.

At the console I pressed Alt+F12:

See attached file.

What does this mean? Disk failure?

BR

Tyler

CedricAnto
VMware Employee

It looks like you are having storage issues; these will cause the host to disconnect.

Also note that what you see in the DCUI does not necessarily mean that it is hung at that particular process. In my experience it is mostly hostd (the management agent) that can hang indefinitely if there are storage-related issues.

If the issue still persists, post the vmkernel log here.
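
Before posting the whole log, a rough per-keyword count can hint at which storage symptom dominates. The sketch below is an assumption-heavy illustration: storage_summary is a hypothetical helper, the keyword list is a guess at terms that tend to show up in storage-related kernel messages rather than an official list, and as far as I know ESXi 4.x writes VMkernel messages to /var/log/messages, which should be confirmed on the host.

```shell
# Hypothetical helper (not a VMware tool): count the lines in a log file
# that mention each storage-related keyword, ignoring case.
# Usage: storage_summary FILE
storage_summary() {
    log_file="$1"

    # The keyword list is an assumption; extend it as needed.
    for keyword in scsi nmp abort reservation "timed out"; do
        # grep -c prints 0 and exits non-zero when nothing matches,
        # so "|| true" keeps the helper usable under "set -e".
        hits=$(grep -c -i -- "$keyword" "$log_file" || true)
        printf '%s: %s\n' "$keyword" "$hits"
    done
}
```

For example, storage_summary /var/log/messages prints one "keyword: count" line per keyword. High counts only suggest where to look; the raw log lines are still what should be posted.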

Cedric http://in.linkedin.com/in/cedricrajendran/ http://virtualknightz.com/