VMware Cloud Community
CosmosConsultin
Contributor

ESX4 host disconnects from vCenter within a few seconds

Hi all,

We set up two new servers with Xeon 5500 CPUs and identical hardware. Each card (10GbE, quad-port LAN, Adaptec 5805) is in the same PCIe slot on both servers. Both machines have ESX 4 installed with the same configuration.

One of the servers is running fine, but the other keeps disconnecting from the vCenter along with all its VMs; the VMs themselves keep running, of course. The VMs are shown as "(disconnected)" and the server as "(not responding)".

I have been searching for several days now but have only found problems which I don't have. No firewall is running on the vCenter, and no firewall is running on the ESX host (we have disabled it for testing). There is also no hardware firewall between the ESX host and the vCenter.

And I really don't understand why one of the ESX hosts runs without problems while the other disconnects from the vCenter all the time. The server hasn't lost a single ping over several days, but it has disconnected many times.

The vpxa.log shows the following entries:

// begin

LicenseManagerChange Event fired

Increment master gen. no to (44174): LicenseManager:VpxaInvtHostLicenseManagerListener::LicenseChanged

Received callback in WaitForUpdatesDone

Applying updates from 316361 to 316362 (at 316361)

HostChanged Event Fired, properties changed []

Received callback in WaitForUpdatesDone

Applying updates from 316362 to 316363 (at 316362)

320: GuestInfo changed 'guest.disk'

VmGuestDiskChange Event for vm(19) 320

Guest DiskInfo Changed

CheckEnvBrowserChanges

CheckEnvBrowserChanges (took 22 ms)

Unexpected return result. Expect 1 sample, receive 2

Host CounterId 262165 has no value

Host CounterId 262165 has no value

Host CounterId 262165 has no value

Host CounterId 262168 has no value

Host CounterId 262168 has no value

Monitoring AAM health: vpxdDasStateOnLastInvocation(uninitialized) currentVpxdDasState(uninitialized) forceRunOfListNodes(0) isDasEnabled(0) skipOperation(1)

Received callback in WaitForUpdatesDone

Applying updates from 316363 to 316364 (at 316363)

208: GuestInfo changed 'guest.disk'

VmGuestDiskChange Event for vm(13) 208

Guest DiskInfo Changed

Received callback in WaitForUpdatesDone

Applying updates from 316364 to 316365 (at 316364)

752: GuestInfo changed 'guest.disk'

VmGuestDiskChange Event for vm(34) 752

Guest DiskInfo Changed

User agent is 'VMware-client/4.0.0'

HTTP Response: Client: NeedsContentLength: false UnderstandsChunking: true CanKeepAlive: true (PresetContentLength -1)

Invoking on session

-- BEGIN task-internal-18019 -- -- vpxapi.VpxaService.querySummaryStatistics -- 52afd4b7-d8da-9b65-8fad-bff731efbb2f

Could not translate vpxd counter 119, metric omitted.

Could not translate vpxd counter 120, metric omitted.

Could not translate vpxd counter 121, metric omitted.

...

Could not translate vpxd counter 214, metric omitted.

Could not translate vpxd counter 215, metric omitted.

Invoke done: vpxapi.VpxaService.querySummaryStatistics session: 52afd4b7-d8da-9b65-8fad-bff731efbb2f

HTTP Response: Complete (processed 32479 bytes)

-- FINISH task-internal-18019 -- -- vpxapi.VpxaService.querySummaryStatistics -- 52afd4b7-d8da-9b65-8fad-bff731efbb2f

Unexpected return result. Expect 1 sample, receive 2

Host CounterId 262165 has no value

Host CounterId 262165 has no value

Host CounterId 262165 has no value

Host CounterId 262168 has no value

Host CounterId 262168 has no value

Received callback in WaitForUpdatesDone

Applying updates from 316365 to 316366 (at 316365)

HostChanged Event Fired, properties changed []

Monitoring AAM health: vpxdDasStateOnLastInvocation(uninitialized) currentVpxdDasState(uninitialized) forceRunOfListNodes(0) isDasEnabled(0) skipOperation(1)

Received callback in WaitForUpdatesDone

Applying updates from 316366 to 316367 (at 316366)

208: GuestInfo changed 'guest.disk'

VmGuestDiskChange Event for vm(13) 208

Guest DiskInfo Changed

Received callback in WaitForUpdatesDone

Applying updates from 316367 to 316368 (at 316367)

144: GuestInfo changed 'guest.disk'

VmGuestDiskChange Event for vm(9) 144

Guest DiskInfo Changed

Unexpected return result. Expect 1 sample, receive 2

Host CounterId 262165 has no value

Host CounterId 262165 has no value

...

// end

I don't know what the log is trying to tell me with:

LicenseManagerChange Event fired

HostChanged Event Fired, properties changed []

If you've got an idea, just let me know. These VMs are not test machines, but I can test nearly anything except rebooting the ESX host, because I can't use VMotion to move the VMs 😕

Thanks,

Philipp

5 Replies
prakashraj
Expert

Hi,

Check the Knowledge Base article below:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=101164...

Prakash

CosmosConsultin
Contributor

Hi Prakash,

Thanks for the fast answer. The .cfg file points to the correct vCenter, and the workaround from the KB does not solve the problem.

Philipp

Edit: all servers, including the vCenter, were freshly installed and not upgraded from a previous version.

admin
Immortal

When the host goes not-responding, does it stay like that indefinitely (until you manually reconnect), or does it reconnect by itself after a minute or so? If the latter, it might be that vpxa or hostd is crashing for some reason. Check whether any of the vpxa or hostd logs end with a backtrace or some error.
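A minimal sketch of the "does the log end with a backtrace or error" check, for anyone who wants to script it. The function name and the marker strings are assumptions for illustration; the actual wording of a crash in the vpxa or hostd logs may differ, so adjust the markers to what you see.

```python
from typing import Iterable, List

# Hypothetical markers, not taken from VMware documentation;
# adjust them to whatever your vpxa/hostd logs actually print on a crash.
CRASH_MARKERS = ("backtrace", "panic", "core dump", "error")


def suspicious_tail(lines: Iterable[str], window: int = 50,
                    markers: Iterable[str] = CRASH_MARKERS) -> List[str]:
    """Return those of the last `window` lines that contain any marker.

    Matching is case-insensitive; trailing newlines are stripped.
    """
    tail = list(lines)[-window:]
    lowered = tuple(m.lower() for m in markers)
    return [ln.rstrip("\n") for ln in tail
            if any(m in ln.lower() for m in lowered)]


# Usage (the file name is whatever log copy you saved):
#   with open("vpxa.log", errors="replace") as fh:
#       for hit in suspicious_tail(fh):
#           print(hit)
```

An empty result only means none of the assumed markers appear near the end of the file; it does not prove the agent did not crash.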

CosmosConsultin
Contributor

Hi eziskind, thanks for your answer.

Yes, the host stays not-responding until I manually reconnect it. Here is what I did: I saved the vpxa and hostd logs after reconnecting the ESX host. Then the host went not-responding again and I copied the logs a second time. There was no change in the hostd log. The vpxa log shows some new entries. I have attached the log changes.
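The before/after comparison described above can be sketched roughly as follows. This is only an illustration using Python's standard `difflib`; the function name `added_lines` is made up, and it assumes both snapshots fit in memory as lists of lines.

```python
import difflib
from typing import List, Sequence


def added_lines(before: Sequence[str], after: Sequence[str]) -> List[str]:
    """Return the lines present in `after` that were not in `before`, in order."""
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    added: List[str] = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" covers appended lines, "replace" covers changed ones.
        if tag in ("insert", "replace"):
            added.extend(after[j1:j2])
    return added


# Usage (file names are placeholders for the two saved copies):
#   with open("vpxa_before.log") as a, open("vpxa_after.log") as b:
#       for line in added_lines(a.readlines(), b.readlines()):
#           print(line, end="")
```

Because the log is mostly appended to, the result is normally just the entries written between the reconnect and the next disconnect.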

CosmosConsultin
Contributor

Sometimes it can be that easy.

Restarting the services "mgmt-vmware" and "vmware-vpxa" from the service console (with "service mgmt-vmware restart" and "service vmware-vpxa restart") solved the problem.

We found a warning with the tag "VpxaHalStats" and searched for it. That was it :)

Thanks to all.
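For anyone landing here with the same symptom, a small sketch of how one might look for that tag in a saved log copy. The helper name `find_tag` is hypothetical, and the exact format of the warning line is not shown in this thread, so the search is a plain substring match.

```python
from typing import Iterable, List, Tuple


def find_tag(lines: Iterable[str], tag: str = "VpxaHalStats") -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing `tag` (1-based numbering)."""
    return [(n, ln.rstrip("\n"))
            for n, ln in enumerate(lines, start=1)
            if tag in ln]


# Usage (the file name is whatever log copy you saved):
#   with open("vpxa.log", errors="replace") as fh:
#       for number, text in find_tag(fh):
#           print(number, text)
```

If the tag shows up, restarting the two services as described above is the fix that worked in this case.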
