VMware Cloud Community
tla-scc
Contributor
Contributor

Host disconnect from VC

I have a host that intermittently generate an alert that it has lost connection to VC even though it apparently has not lost connection. VC never issued a connection restored alert either. Here's what I see in the vpxd.log on the VC server at the time of the event.

-- FINISH task-internal-79926 -- host-1603 -- VpxdInvtHostSyncHostLRO.Synchronize

VpxdClientAdapter::InvokeOnSoap: (123.domain.com, vpxapi.VpxaService.setConfig) took 10109 ms

Got vmacore exception: No connection could be made because the target machine actively refused it.

Backtrace:

backtrace[00] eip 0x01219096 ?GenerateCoreDump@System@Vmacore@@YAXXZ

backtrace[01] eip 0x01166bba ?CreateBacktrace@SystemFactoryImpl@System@Vmacore@@UAEXAAV?$Ref@VBacktrace@System@Vmacore@@@3@@Z

backtrace[02] eip 0x011401de ??0Throwable@Vmacore@@QAE@ABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z

backtrace[03] eip 0x01228126 ?ListDirectory@File@System@Vmacore@@SAHABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@PAPAPAD@Z

backtrace[04] eip 0x0122a1d1 ?ListDirectory@File@System@Vmacore@@SAHABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@PAPAPAD@Z

backtrace[05] eip 0x0122b178 ?ListDirectory@File@System@Vmacore@@SAHABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@PAPAPAD@Z

backtrace[06] eip 0x0117968a ?CreateHTTPDocRequestHandler@Http@Vmacore@@YAXABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@ABV?$map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@U?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@U?$pair@$$CBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@@std@@@2@@4@AAV?$Ref@VRequestHandler@Http@Vmacore@@@2@@Z

backtrace[07] eip 0x0117bab5 ?CreateHttpConnectionPool@Http@Vmacore@@YAXHAAV?$Ref@VHttpConnectionPool@Http@Vmacore@@@2@@Z

backtrace[08] eip 0x0117bdbe ?CreateHttpConnectionPool@Http@Vmacore@@YAXHAAV?$Ref@VHttpConnectionPool@Http@Vmacore@@@2@@Z

backtrace[09] eip 0x0117bef9 ?CreateHttpConnectionPool@Http@Vmacore@@YAXHAAV?$Ref@VHttpConnectionPool@Http@Vmacore@@@2@@Z

backtrace[10] eip 0x0117fb5f ?CreateHttpRequest@Http@Vmacore@@YAXW4HttpMethod@12@ABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AAV?$Ref@VClientRequest@Http@Vmacore@@@2@@Z

backtrace[11] eip 0x0117d15d ?CreateHttpConnectionPool@Http@Vmacore@@YAXHAAV?$Ref@VHttpConnectionPool@Http@Vmacore@@@2@@Z

backtrace[12] eip 0x0117db19 ?CreateHttpUserAgent@Http@Vmacore@@YAXPAVConnectionSpec@12@PAVCookieStore@12@PAVHttpConnectionPool@12@AAV?$Ref@VUserAgent@Http@Vmacore@@@2@@Z

backtrace[13] eip 0x0117f56d ?CreateHttpRequest@Http@Vmacore@@YAXW4HttpMethod@12@ABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AAV?$Ref@VClientRequest@Http@Vmacore@@@2@@Z

backtrace[14] eip 0x0117f8ea ?CreateHttpRequest@Http@Vmacore@@YAXW4HttpMethod@12@ABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AAV?$Ref@VClientRequest@Http@Vmacore@@@2@@Z

backtrace[15] eip 0x01196a79 ?CreateReverseHttpProxy@Http@Vmacore@@YAXPAVLogger@Service@2@PAVServerSocket@System@2@PAVSSLContext@Ssl@2@HAAV?$Ref@VProxySvc@Http@Vmacore@@@2@@Z

backtrace[16] eip 0x0117ad57 ?CreateHTTPDocRequestHandler@Http@Vmacore@@YAXABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@ABV?$map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@U?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@U?$pair@$$CBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@@std@@@2@@4@AAV?$Ref@VRequestHandler@Http@Vmacore@@@2@@Z

backtrace[17] eip 0x01196a79 ?CreateReverseHttpProxy@Http@Vmacore@@YAXPAVLogger@Service@2@PAVServerSocket@System@2@PAVSSLContext@Ssl@2@HAAV?$Ref@VProxySvc@Http@Vmacore@@@2@@Z

backtrace[18] eip 0x01178555 ?CreateHTTPDocRequestHandler@Http@Vmacore@@YAXABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@ABV?$map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@U?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@U?$pair@$$CBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@@std@@@2@@4@AAV?$Ref@VRequestHandler@Http@Vmacore@@@2@@Z

backtrace[19] eip 0x01252868 ?GetMonth@DateTime@System@Vmacore@@QBEHXZ

backtrace[20] eip 0x011c26a3 ?CreateSSLContext@Ssl@Vmacore@@YAXPAVKeyStore@Crypto@2@W4SupportedVersion@12@AAV?$Ref@VSSLContext@Ssl@Vmacore@@@2@@Z

backtrace[21] eip 0x011c561b ?CreateSSLContext@Ssl@Vmacore@@YAXPAVKeyStore@Crypto@2@W4SupportedVersion@12@AAV?$Ref@VSSLContext@Ssl@Vmacore@@@2@@Z

backtrace[22] eip 0x01196a79 ?CreateReverseHttpProxy@Http@Vmacore@@YAXPAVLogger@Service@2@PAVServerSocket@System@2@PAVSSLContext@Ssl@2@HAAV?$Ref@VProxySvc@Http@Vmacore@@@2@@Z

backtrace[23] eip 0x01178555 ?CreateHTTPDocRequestHandler@Http@Vmacore@@YAXABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@ABV?$map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@U?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@U?$pair@$$CBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@@std@@@2@@4@AAV?$Ref@VRequestHandler@Http@Vmacore@@@2@@Z

backtrace[24] eip 0x01252868 ?GetMonth@DateTime@System@Vmacore@@QBEHXZ

backtrace[25] eip 0x01229bc2 ?ListDirectory@File@System@Vmacore@@SAHABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@PAPAPAD@Z

backtrace[26] eip 0x01196a79 ?CreateReverseHttpProxy@Http@Vmacore@@YAXPAVLogger@Service@2@PAVServerSocket@System@2@PAVSSLContext@Ssl@2@HAAV?$Ref@VProxySvc@Http@Vmacore@@@2@@Z

backtrace[27] eip 0x01178555 ?CreateHTTPDocRequestHandler@Http@Vmacore@@YAXABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@ABV?$map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@U?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@U?$pair@$$CBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@@std@@@2@@4@AAV?$Ref@VRequestHandler@Http@Vmacore@@@2@@Z

backtrace[28] eip 0x012201a4 ?IsEnlisted@System@Vmacore@@YA_NXZ

backtrace[29] eip 0x012214a7 ?IsEnlisted@System@Vmacore@@YA_NXZ

backtrace[30] eip 0x0121cdae ?GetThreadId@System@Vmacore@@YA_JXZ

backtrace[31] eip 0x7c349565 _endthreadex

backtrace[32] eip 0x77e64829 GetModuleHandleA

Enter DAS_PROFILE UpdateDasStatus

-- BEGIN task-internal-79927 -- host-4294 -- VpxdInvtHostSyncHostLRO.Synchronize

Exit DAS_PROFILE UpdateDasStatus (109 ms)

**WARNING** Lock datacenter-2 mode EXCLUSIVE held for 1281 ms

**WARNING** Lock domain-c14 mode EXCLUSIVEALL held for 1281 ms

**WARNING** Lock domain-c14 mode EXCLUSIVEALL held for 1281 ms

Can anyone give me an idea what is going on here and the best way to resolve?

TIA

0 Kudos
4 Replies
admin
Immortal
Immortal

Does this only happen to one ESX server?

Does the problem occur randomly or at regular predictable time intervals?

Have you restarted vpxa and hostd/mgmt-vmware on the ESX server?

service mgmt-vmware restart

service vmware-vpxa restart

First disable "Virtual Machine Startup/Shutdown" via the "Configuration" tab before running the above command or your virtual machines may reboot.

Are there any errors in the following log files on the ESX server round the same time that vpxd reports the error?

/var/log/vmare/vpx/vpxa.log

/var/log/vmware/hostd.log

/var/log/messages

/var/log/secure

On the ESX server in question:

Are any of your partitions full or near-full? df -h

Are you swapping heavily? free -m

Can you paste the contents of the file /etc/hosts

Can you paste the output of the commands hostname and hostname -i

Message was edited by: appk

0 Kudos
tla-scc
Contributor
Contributor

Yes i've restart mgmt-vmware & vmware-vpxa and have plenty of free space

Here's hostd.log at that time:

Resource pool instantiated

Resource pool instantiated

Resource pool instantiated

Resource pool instantiated

Resource pool instantiated

Resource pool instantiated

Resource pool instantiated

Resource pool instantiated

Refresh interval: 60 seconds

Plugin initialized

Trying internalsvc

Trying nfcsvc

Breakpoints disabled

System libcrypto.so.0.9.7 library is older than our library (90701F < 90709F)

HOSTINFO: Seeing AMD CPU, numCoresPerCPU 2 numThreadsPerCore 1.

HOSTINFO: hyperthreading disabled, setting number of threads per core to 1.

HOSTINFO: This machine has 4 physical CPUS, 8 total cores, and 8 logical CPUs.

File Services initialized fmgr()

Trying partitionsvc

Initializing partitionsvc plugin

Trying proxysvc

Proxy config filepath is /etc/vmware/hostd/proxy.xml

Plugin initialized

Trying snmpsvc

Init: Plugin initialized

Trying solo

Vmacore::InitSSL: doVersionCheck = false, handshakeTimeoutUs = 120000000

Initialized SSL context with version all

Host name: 123.domain.com

Compute resource instantiated

Environments file: /etc/vmware/hostd/environments.xml

Descriptor loaded: 0

Options loaded

Descriptor loaded: 1

Options loaded

HAL05LoadHALLibraries: Could not dlopen libhal.so.1.

HAL04LoadHALLibraries: Could not dlopen libhal.so.0.

Environments file: /etc/vmware/hostd/environments.xml

Descriptor loaded: 0

Options loaded

Descriptor loaded: 1

Options loaded

HAL05LoadHALLibraries: Could not dlopen libhal.so.1.

HAL04LoadHALLibraries: Could not dlopen libhal.so.0.

local swap policy: false

Plugin initialized

Trying statssvc

Initializing statssvc plugin

Collection interval is 20 sec

Trying supportsvc

Plugin initialized

Trying vcsvc

Trying vdisksvc

Plugin initialized

Trying vimsvc

Initializing vimsvc plugin

Posix user manager initialized

Task retention time: 10

Cleanup time: 1

Maximum number of events: 1000

Locale subsystem initialized from /usr/lib/vmware/hostd/locale/ with default locale en.

PAM symbols loaded.

Loading authorization data from /etc/vmware/hostd/authorization.xml

VMLicense_InitJob: putenv "HOME=/root"

Acquire: Attempting to acquire license for feature: esxFull, need qty=4 licenses

Trying vmsvc

Using fifo directory /var/run/vmware/hostd-fifo for VMDB instance.

Foundry Init: setting up global state (1 threads)

vmxLoaderPath="/usr/lib/vmware/bin/vmkload_app"

vmxFilePath="/usr/lib/vmware/bin/vmware-vmx"

vmxFilePathDebug="/usr/lib/vmware/bin-debug/vmware-vmx"

vmxFilePathStats="/usr/lib/vmware/bin-stats/vmware-vmx"

HOSTINFO: Seeing AMD CPU, numCoresPerCPU 2 numThreadsPerCore 1.

HOSTINFO: hyperthreading disabled, setting number of threads per core to 1.

HOSTINFO: This machine has 4 physical CPUS, 8 total cores, and 8 logical CPUs.

Foundry Init: version 2, hostType 1, hostName (null), hostPort 0, options 0x64

VMServices Plugin initialized

Block List Service Plugin started

Plugin started

About:(vim.AboutInfo) {

dynamicType = ,

osType = "vmnix-x86",

productLineId = "esx",

apiType = "HostAgent",

apiVersion = "2.5.0",

}

var/log/messages

May 15 00:13:26 123 modprobe: modprobe: Can't locate module char-major-14

May 15 00:13:26 123 modprobe: modprobe: Can't locate module char-major-14

May 15 00:13:27 123 vmware-authd(pam_unix)[4121]: authentication failure; logname= uid=0 euid=0 tty= ruser= rhost= user=root

May 15 00:13:31 123 modprobe: modprobe: Can't locate module char-major-14

var/log/secure

May 15 00:13:32 123 vmware-authd[4238]: Local connection for vmx-vmdb established.

May 15 00:13:33 123 vmware-authd[4243]: Local connection for vmx-vmdb established.

May 15 00:13:34 123 vmware-authd[4246]: Local connection for vmx-vmdb established.

May 15 00:13:35 123 vmware-authd[4249]: Local connection for vmx-vmdb established.

May 15 00:13:35 123 vmware-authd[4252]: Local connection for vmx-vmdb established.

May 15 00:13:36 123 vmware-authd[4255]: Local connection for vmx-vmdb established.

May 15 00:13:37 123 vmware-authd[4258]: Local connection for vmx-vmdb established.

May 15 00:13:37 123 vmware-authd[4261]: Local connection for vmx-vmdb established.

May 15 00:13:38 123 vmware-authd[4265]: Local connection for vmx-vmdb established.

May 15 00:13:39 123 vmware-authd[4268]: Local connection for vmx-vmdb established.

May 15 00:13:39 123 vmware-authd[4284]: Local connection for vmx-vmdb established.

May 15 00:13:41 123 vmware-authd[4303]: Local connection for vmx-vmdb established.

May 15 00:13:41 123 vmware-authd[4306]: Local connection for vmx-vmdb established.

May 15 00:13:42 123 vmware-authd[4309]: Local connection for vmx-vmdb established.

May 15 00:13:44 123 vmware-authd[4313]: Local connection for vmx-vmdb established.

May 15 00:13:44 123 vmware-authd[4316]: Local connection for vmx-vmdb established.

May 15 00:13:45 123 vmware-authd[4319]: Local connection for vmx-vmdb established.

May 15 00:13:45 123 vmware-authd[4322]: Local connection for vmx-vmdb established.

May 15 00:13:46 123 vmware-authd[4325]: Local connection for vmx-vmdb established.

May 15 00:13:47 123 vmware-authd[4328]: Local connection for vmx-vmdb established.

May 15 00:13:48 123 vmware-authd[4332]: Local connection for vmx-vmdb established.

May 15 00:13:49 123 vmware-authd[4335]: Local connection for vmx-vmdb established.

May 15 00:13:49 123 vmware-authd[4338]: Local connection for vmx-vmdb established.

May 15 00:13:50 123 vmware-authd[4341]: Local connection for vmx-vmdb established.

May 15 00:13:50 123 vmware-authd[4344]: Local connection for vmx-vmdb established.

May 15 00:13:51 123 vmware-authd[4347]: Local connection for vmx-vmdb established.

May 15 00:13:52 123 vmware-authd[4350]: Local connection for vmx-vmdb established.

May 15 00:13:53 123 vmware-authd[4353]: Local connection for vmx-vmdb established.

May 15 00:13:53 123 vmware-authd[4357]: Local connection for vmx-vmdb established.

May 15 00:13:54 123 vmware-authd[4360]: Local connection for vmx-vmdb established.

May 15 00:13:55 123 vmware-authd[4363]: Local connection for vmx-vmdb established.

May 15 00:13:56 123 vmware-authd[4368]: Local connection for vmx-vmdb established.

May 15 00:13:57 123 vmware-authd[4371]: Local connection for vmx-vmdb established.

May 15 00:13:58 123 vmware-authd[4377]: Local connection for vmx-vmdb established.

May 15 00:13:58 123 vmware-authd[4381]: Local connection for vmx-vmdb established.

May 15 00:13:59 123 vmware-authd[4384]: Local connection for vmx-vmdb established.

May 15 00:14:00 123 vmware-authd[4387]: Local connection for vmx-vmdb established.

May 15 00:14:00 123 vmware-authd[4390]: Local connection for vmx-vmdb established.

May 15 00:14:01 123 vmware-authd[4393]: Local connection for vmx-vmdb established.

May 15 00:14:02 123 vmware-authd[4396]: Local connection for vmx-vmdb established.

Swap activity:

total used free shared buffers cached

Mem: 262 255 7 0 23 102

-/+ buffers/cache: 129 133

Swap: 1023 162 861

0 Kudos
admin
Immortal
Immortal

Those hostd logs seem to show that hostd was in the process of restarting at that time.

What do the vpxa logs show at that time? Check the vpxa logs in the directory /var/log/vmware/vpx/

Look back through the hostd logs: ls -ltr /var/log/vmware/hostd*.log

Do they show that its restarting alot?

0 Kudos
admin
Immortal
Immortal

That output from free -m shows that the service console is swapping really heavily - 162MB's.

This can result in instability of the management of the ESX server and may explain this problem.

You should increase the Service Console RAM from 262MB's to at least 600MB's.

A reboot is needed for the change to be applied.

You can change the RAM through the VI Client by clicking the host, then the "Configuration" tab, then the "Memory" option.

0 Kudos