VMware Cloud Community
dobrinsi
Contributor

Hosts randomly show a Not Responding state in vCenter Server 6.0 U3

Hello, this problem has been occurring since 13.09.2018.

Hosts randomly show as Not Responding in vCenter Server for 2-10 seconds and then return to a normal state. The virtual machines are grayed out during that time but keep running.

Veeam backups do not work properly after the hosts stop responding through vCenter.

I have tried all the tips here, but none of them solved my problem.

-------------------------

ESXi is 6.0.0.

vCenter Server is 6.0.0, running on Windows Server 2008 R2.

HA and DRS are enabled.

--------------------------

In vpxd.log I see the following:

..........

2018-10-01T11:38:23.385+03:00 info vpxd[27956] [Originator@6876 sub=vpxLro opID=task-internal-843877-7da5a68d] [VpxLRO] -- BEGIN task-internal-843877 --  -- ScheduledTaskLRO --

2018-10-01T11:38:23.385+03:00 error vpxd[27956] [Originator@6876 sub=Snmp opID=task-internal-843877-7da5a68d] [VpxdSNMP::Init] ColdStartTrap sending failed.

2018-10-01T11:38:23.388+03:00 error vpxd[27956] [Originator@6876 sub=Snmp opID=task-internal-843877-7da5a68d] [VpxdSNMP::Init] ColdStartTrap sending failed.

2018-10-01T11:38:23.389+03:00 info vpxd[27956] [Originator@6876 sub=vpxLro opID=task-internal-843877-7da5a68d] [VpxLRO] -- FINISH task-internal-843877

2018-10-01T11:38:23.415+03:00 info vpxd[07328] [Originator@6876 sub=HostUpgrader opID=HeartbeatStartHandler-5b7b9dc0] [VpxdHostUpgrader] Preinstalled bundle found: not installing

2018-10-01T11:38:23.415+03:00 info vpxd[07328] [Originator@6876 sub=InvtHostCnx opID=HeartbeatStartHandler-5b7b9dc0] [VpxdIntHost] Missed 3 heartbeats for host esx14.mydomain.com

2018-10-01T11:38:23.415+03:00 info vpxd[12524] [Originator@6876 sub=vpxLro opID=HB-SpecSync-host-36308@9465-1b0c72ef] [VpxLRO] -- BEGIN task-internal-843879 -- host-36308 -- SpecSyncLRO.Synchronize --

2018-10-01T11:38:24.651+03:00 info vpxd[07328] [Originator@6876 sub=HostUpgrader opID=HeartbeatStartHandler-5b7b9dc0] [VpxdHostUpgrader] Preinstalled bundle found: not installing

2018-10-01T11:38:24.651+03:00 info vpxd[07328] [Originator@6876 sub=InvtHostCnx opID=HeartbeatStartHandler-5b7b9dc0] [VpxdIntHost] Missed 3 heartbeats for host esx05.mydomain.com

2018-10-01T11:38:24.651+03:00 info vpxd[13528] [Originator@6876 sub=vpxLro opID=HB-SpecSync-host-22@74098-5c0c3c15] [VpxLRO] -- BEGIN task-internal-843883 -- host-22 -- SpecSyncLRO.Synchronize --

2018-10-01T11:38:27.890+03:00 info vpxd[04836] [Originator@6876 sub=vpxLro opID=6a3dabdc] [VpxLRO] -- BEGIN task-internal-843887 -- ServiceInstance -- vim.ServiceInstance.GetContent -- 524cc963-9d76-a082-fdd3-4834c0013d11(52ddeafc-b7b4-b847-f02b-1672676fb31b)

2018-10-01T11:38:27.891+03:00 info vpxd[04836] [Originator@6876 sub=vpxLro opID=6a3dabdc] [VpxLRO] -- FINISH task-internal-843887

2018-10-01T11:38:28.240+03:00 info vpxd[07240] [Originator@6876 sub=vpxLro opID=opId-8fe2e91c-264b-427e-8601-10531dc0cba8-ab-9e] [VpxLRO] -- BEGIN task-internal-843888 -- ServiceInstance -- vim.ServiceInstance.GetServerClock -- 52a01871-8b92-aec6-0313-7eaa738aa24b(529c2f36-3ac9-6456-e306-41442bd24b2a)......

......
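To see which hosts miss heartbeats and how often, the "Missed N heartbeats for host" lines in the excerpt above could be tallied with a short script. The following is only a minimal sketch: it assumes every relevant line follows the same layout as the ones pasted here, and the function names `missed_heartbeats` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt above:
# <timestamp> <level> vpxd[<thread>] [...] [VpxdIntHost] Missed <n> heartbeats for host <name>
HEARTBEAT_RE = re.compile(
    r"^(?P<ts>\S+)\s+\w+\s+vpxd\[\d+\].*"
    r"Missed (?P<count>\d+) heartbeats for host (?P<host>\S+)"
)

def missed_heartbeats(lines):
    """Return a list of (timestamp, host, missed_count) for matching log lines."""
    events = []
    for line in lines:
        m = HEARTBEAT_RE.match(line.strip())
        if m:
            events.append((m.group("ts"), m.group("host"), int(m.group("count"))))
    return events

def summarize(events):
    """Count missed-heartbeat events per host (hypothetical helper)."""
    return Counter(host for _ts, host, _count in events)
```

Passing an open file handle for vpxd.log to `missed_heartbeats` and the result to `summarize` gives a per-host count. If a few hosts dominate, the problem is more likely on their side; if all hosts appear about equally, the vCenter machine or the network path to it is the more likely suspect.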

5 Replies
vijayrana968
Virtuoso

On the Windows-based vCenter machine, have you installed the July 2018 patches? Can you also check whether there are any TCP/IP warnings in Event Viewer on the vCenter Windows machine?

dobrinsi
Contributor

For July 2018, the update history shows this:

Windows Malicious Software Removal Tool x64 - July 2018 (KB890830)

Installation date: 12.7.2018, 03:01

Installation status: Successful

Update type: Important

After the download, this tool runs one time to check your computer for infection by specific, prevalent malicious software (including Blaster, Sasser, and Mydoom) and helps remove any infection that is found. If an infection is found, the tool will display a status report the next time that you start your computer. A new version of the tool will be offered every month. If you want to manually run the tool on your computer, you can download a copy from the Microsoft Download Center, or you can run an online version from microsoft.com. This tool is not a replacement for an antivirus product. To help protect your computer, you should use an antivirus product.

More information:

http://support.microsoft.com/kb/890830

Help and Support:

http://support.microsoft.com

--------------------------------

I found a similar TCP/IP warning event, but it is an old one:

Log Name:      System

Source:        Tcpip

Date:          12.8.2018 23:05:19

Event ID:      4227

Task Category: None

Level:         Warning

Keywords:      Classic

User:          N/A

Computer:      vcenter.mydomain.com

Description:

TCP/IP failed to establish an outgoing connection because the selected local endpoint was recently used to connect to the same remote endpoint. This error typically occurs when outgoing connections are opened and closed at a high rate, causing all available local ports to be used and forcing TCP/IP to reuse a local port for an outgoing connection. To minimize the risk of data corruption, the TCP/IP standard requires a minimum time period to elapse between successive connections from a given local endpoint to a given remote endpoint.

Event Xml:

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">

  <System>

    <Provider Name="Tcpip" />

    <EventID Qualifiers="32768">4227</EventID>

    <Level>3</Level>

    <Task>0</Task>

    <Keywords>0x80000000000000</Keywords>

    <TimeCreated SystemTime="2018-08-12T20:05:19.078331700Z" />

    <EventRecordID>89849</EventRecordID>

    <Channel>System</Channel>

    <Computer>vcenter.mydomain.com</Computer>

    <Security />

  </System>

  <EventData>

    <Data>

    </Data>

    <Binary>00000000010000000000000083100080000000000000000000000000000000000000000000000000</Binary>

  </EventData>

</Event>
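Event 4227 points at local ports being reused quickly, so one way to check whether the vCenter machine is currently close to that condition is to count TCP connections by state. The sketch below is an assumption-laden illustration, not a confirmed diagnosis for this thread: it assumes the text comes from the Windows command `netstat -ano -p tcp` with the usual column order (Proto, Local Address, Foreign Address, State, PID), and `tcp_state_counts` is a hypothetical helper name.

```python
from collections import Counter

def tcp_state_counts(netstat_text):
    """Count TCP connections by state from netstat-style text.

    Returns (states, time_wait_by_remote), both Counter objects.
    Assumed columns: Proto, Local Address, Foreign Address, State[, PID].
    """
    states = Counter()
    time_wait_by_remote = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a TCP row.
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        remote, state = fields[2], fields[3]
        states[state] += 1
        if state == "TIME_WAIT":
            time_wait_by_remote[remote] += 1
    return states, time_wait_by_remote
```

A very large TIME_WAIT count, especially one concentrated on a few remote endpoints, would be consistent with the port reuse described in the event text. A small count suggests the old 4227 warning is unrelated to the current disconnects.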

MikeStoica
Expert

Which version of Windows Server do you have? I mean, with or without a service pack?

Check this VMware Knowledge Base

dobrinsi
Contributor

Windows Server 2008 R2, 64-bit, without SP1.

I noticed that when the problem occurs, an error is also logged in the Windows event log:

Log Name:      Microsoft-Windows-CAPI2/Operational

Source:        Microsoft-Windows-CAPI2

Date:          2.10.2018 14:16:09

Event ID:      11

Task Category: Build Chain

Level:         Error

Keywords:      Path Discovery,Path Validation

User:          SYSTEM

Computer:      vcenter.mydomain.com

Description:

For more details for this event, please refer to the "Details" section

Event Xml:

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">

  <System>

    <Provider Name="Microsoft-Windows-CAPI2" Guid="{5bbca4a8-b209-48dc-a8c7-b23d3e5216fb}" />

    <EventID>11</EventID>

    <Version>0</Version>

    <Level>2</Level>

    <Task>11</Task>

    <Opcode>2</Opcode>

    <Keywords>0x4000000000000003</Keywords>

    <TimeCreated SystemTime="2018-10-02T11:16:09.593150200Z" />

    <EventRecordID>6922</EventRecordID>

    <Correlation />

    <Execution ProcessID="6772" ThreadID="16228" />

    <Channel>Microsoft-Windows-CAPI2/Operational</Channel>

    <Computer>vcenter.mydomain.com</Computer>

    <Security UserID="S-1-5-18" />

  </System>

  <UserData>

    <CertGetCertificateChain>

      <Certificate fileRef="FB1871881C12AE75F2CC08AF49AC60E1EEAD7A31.cer" subjectName="vcenter.mydomain.com" />

      <AdditionalStore>

        <Certificate fileRef="1FA2BCB97A5EB2A0C6590B7191079D07475F8B79.cer" subjectName="CA, CN=vcenter, dc=vsphere,dc=local" />

      </AdditionalStore>

      <ExtendedKeyUsage />

      <Flags value="0" />

      <ChainEngineInfo context="user" />

      <AdditionalInfo>

        <NetworkConnectivityStatus value="1" _SENSAPI_NETWORK_ALIVE_LAN="true" />

      </AdditionalInfo>

      <CertificateChain chainRef="{75D1C484-6AC5-413D-876E-02A4909DD170}">

        <TrustStatus>

          <ErrorStatus value="20" CERT_TRUST_IS_UNTRUSTED_ROOT="true" />

          <InfoStatus value="100" CERT_TRUST_HAS_PREFERRED_ISSUER="true" />

        </TrustStatus>

        <ChainElement>

          <Certificate fileRef="FB1871881C12AE75F2CC08AF49AC60E1EEAD7A31.cer" subjectName="vcenter.mydomain.com" />

          <SignatureAlgorithm oid="1.2.840.113549.1.1.11" hashName="SHA256" publicKeyName="RSA" />

          <PublicKeyAlgorithm oid="1.2.840.113549.1.1.1" publicKeyName="RSA" publicKeyLength="2048" />

          <TrustStatus>

            <ErrorStatus value="0" />

            <InfoStatus value="104" CERT_TRUST_HAS_NAME_MATCH_ISSUER="true" CERT_TRUST_HAS_PREFERRED_ISSUER="true" />

          </TrustStatus>

          <ApplicationUsage any="true" />

          <IssuanceUsage />

        </ChainElement>

        <ChainElement>

          <Certificate fileRef="1FA2BCB97A5EB2A0C6590B7191079D07475F8B79.cer" subjectName="CA, CN=vcenter, dc=vsphere,dc=local" />

          <SignatureAlgorithm oid="1.2.840.113549.1.1.11" hashName="SHA256" publicKeyName="RSA" />

          <PublicKeyAlgorithm oid="1.2.840.113549.1.1.1" publicKeyName="RSA" publicKeyLength="2048" />

          <TrustStatus>

            <ErrorStatus value="20" CERT_TRUST_IS_UNTRUSTED_ROOT="true" />

            <InfoStatus value="10C" CERT_TRUST_HAS_NAME_MATCH_ISSUER="true" CERT_TRUST_IS_SELF_SIGNED="true" CERT_TRUST_HAS_PREFERRED_ISSUER="true" />

          </TrustStatus>

          <ApplicationUsage any="true" />

          <IssuanceUsage any="true" />

        </ChainElement>

      </CertificateChain>

      <EventAuxInfo ProcessName="vpxd.exe" />

      <CorrelationAuxInfo TaskId="{EC71EDCC-5A38-474C-94AD-C183B283B254}" SeqNumber="31" />

      <Result value="800B0109">A certificate chain processed, but terminated in a root certificate which is not trusted by the trust provider.</Result>

    </CertGetCertificateChain>

  </UserData>

</Event>
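The event XML above is long, but the useful parts are the final Result and the trust errors on each chain element. The sketch below pulls those out with Python's standard `xml.etree.ElementTree`. It assumes the event is available as an XML string shaped like the one pasted here, and the helper names `local_name` and `summarize_chain` are invented for this example.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]

def summarize_chain(event_xml):
    """Return (result_code, result_text, chain) for a CAPI2 event XML string.

    chain is a list of (subjectName, error_flags), one per ChainElement;
    error_flags holds the CERT_TRUST_* attribute names set to "true".
    """
    root = ET.fromstring(event_xml)
    result_code, result_text, chain = None, None, []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "Result":
            result_code = elem.get("value")
            result_text = (elem.text or "").strip()
        elif name == "ChainElement":
            subject, flags = None, []
            for child in elem:
                child_name = local_name(child.tag)
                if child_name == "Certificate":
                    subject = child.get("subjectName")
                elif child_name == "TrustStatus":
                    for status in child:
                        if local_name(status.tag) == "ErrorStatus":
                            flags = [
                                key for key, value in status.attrib.items()
                                if key.startswith("CERT_TRUST_") and value == "true"
                            ]
            chain.append((subject, flags))
    return result_code, result_text, chain
```

For the event shown here, this would report result 800B0109, no error flags on the vcenter.mydomain.com certificate, and CERT_TRUST_IS_UNTRUSTED_ROOT on the self-signed CA. That means Windows does not trust the vCenter CA root. Whether this is related to the host disconnects is not established in this thread.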

Dave_the_Wave
Hot Shot

What's the CPU usage on that 2008 R2 box?

If it's maxed out, you will surely get the problems described.

This may be helpful to you:

https://communities.vmware.com/message/2736529#2736529
