VMware Horizon Community
dborgill
Enthusiast

URGENT - Frequent Horizon Agent Disconnects / Desktop Unavailable

We are troubleshooting a very frustrating issue: frequent disconnects, at random times, affecting hundreds of our 1,500 users. When a session disconnects, the user gets the "This desktop currently has no desktop resources available." error. When we check the machine, the Horizon Agent service is stopped. If we manually start the Agent service, the desktop becomes available again, but the service tends to crash or stop again within 24 hours. If we reboot the VM instead, we don't see another crash for several days to a week. We can find no correlation at all explaining why some VDIs crash and others don't. Below are the details of the environment, along with the errors from the Windows Event Viewer.

Long story short, we engaged VMware support, and the only thing they found was errors related to WMI. They then pointed us to Microsoft support. Microsoft support confirmed that on the machines with the Agent issue, the WMI service is in a semi-functioning state. We rebuilt the WMI repository on several machines but still cannot find a root cause, and after weeks we cannot determine whether this is a WMI/OS issue or some sort of VMware Agent issue. Any help, similar experience, or fix would be greatly appreciated.

Windows OS on All Desktops: Windows 10 1709 16299.579 (patched up to Sept 2018)

Pools: All Persistent desktops

VDI Specs: 3-4 vCPUs / 8-16 GB RAM (Disconnects happening on all pools in all regions, with all specs)

Antivirus: Cylance

Domain: One Global AD 2012 Domain

Connection Server Version: 7.5.0

Desktop Horizon Agent Version: 7.5.0

Protocol: PCoIP

Endpoints: HP Zero Clients (Disconnects happening on all types of endpoints)

Error showing up on only a few machines at crash time (on most disconnects, there are NO related errors in the Windows Event Logs):

Log Name:      System

Source:        Service Control Manager

Date:          11/8/2018 4:40:40 PM

Event ID:      7031

Task Category: None

Level:         Error

Keywords:      Classic

User:          N/A

Computer:   

Description:

The VMware Horizon View Agent service terminated unexpectedly.  It has done this 1 time(s).  The following corrective action will be taken in 60000 milliseconds: Restart the service.

Event Xml:

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Service Control Manager" Guid="{555908d1-a6d7-4695-8e1e-26931d2012f4}" EventSourceName="Service Control Manager" />
    <EventID Qualifiers="49152">7031</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8080000000000000</Keywords>
    <TimeCreated SystemTime="2018-11-08T21:40:40.729919700Z" />
    <EventRecordID>4656</EventRecordID>
    <Correlation />
    <Execution ProcessID="664" ThreadID="12948" />
    <Channel>System</Channel>
    <Computer></Computer>
    <Security />
  </System>
  <EventData>
    <Data Name="param1">VMware Horizon View Agent</Data>
    <Data Name="param2">1</Data>
    <Data Name="param3">60000</Data>
    <Data Name="param4">1</Data>
    <Data Name="param5">Restart the service</Data>
    <Binary>570053004E004D000000</Binary>
  </EventData>
</Event>
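
For anyone collecting these 7031 events from many desktops to look for a pattern, here is a minimal Python sketch that pulls the useful fields out of an Event XML like the one above. The function name `summarize_scm_event` and the dictionary keys are hypothetical, and the mapping of `param2` to the crash count and `param3` to the restart delay is inferred from the Description text, not from documentation. Decoding `<Binary>` as UTF-16LE is likewise an assumption that happens to work for this payload: here it yields `WSNM`, which appears to be the service's short name.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace taken from the xmlns attribute of the <Event> element.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the event from this thread (only the fields used below).
SAMPLE_EVENT = """
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <EventID Qualifiers="49152">7031</EventID>
    <TimeCreated SystemTime="2018-11-08T21:40:40.729919700Z" />
  </System>
  <EventData>
    <Data Name="param1">VMware Horizon View Agent</Data>
    <Data Name="param2">1</Data>
    <Data Name="param3">60000</Data>
    <Data Name="param4">1</Data>
    <Data Name="param5">Restart the service</Data>
    <Binary>570053004E004D000000</Binary>
  </EventData>
</Event>
"""


def summarize_scm_event(xml_text):
    """Extract the crash details from a Service Control Manager event (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    system = root.find("e:System", NS)
    event_id = int(system.find("e:EventID", NS).text)

    # SystemTime carries 9 fractional digits; keep whole seconds only.
    raw_time = system.find("e:TimeCreated", NS).attrib["SystemTime"]
    whole_seconds = raw_time.rstrip("Z").partition(".")[0]
    created_utc = datetime.strptime(whole_seconds, "%Y-%m-%dT%H:%M:%S").replace(
        tzinfo=timezone.utc
    )

    event_data = root.find("e:EventData", NS)
    params = {d.attrib["Name"]: d.text for d in event_data.findall("e:Data", NS)}

    # Assumption: the binary payload is a NUL-terminated UTF-16LE string.
    binary = event_data.find("e:Binary", NS)
    short_name = None
    if binary is not None and binary.text:
        short_name = bytes.fromhex(binary.text).decode("utf-16-le").rstrip("\x00")

    return {
        "event_id": event_id,
        "created_utc": created_utc,
        "service": params.get("param1"),
        "crash_count": int(params["param2"]),       # inferred from the Description
        "restart_delay_ms": int(params["param3"]),  # inferred from the Description
        "action": params.get("param5"),
        "service_short_name": short_name,
    }


summary = summarize_scm_event(SAMPLE_EVENT)
```

Running this over exported events from every affected desktop and sorting by `created_utc` would make it easier to see whether crashes cluster around a time of day or a particular uptime.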

2 Replies
techguy129
Expert

If a reboot keeps the desktop stable for several days, that leads me to believe there may be a memory leak. Are you running anything like App Volumes or ThinApp? How about NVIDIA GRID? Have you tried reinstalling the agents in the correct order? I would also suggest trying a newer Horizon Agent.
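
One way to test the memory-leak theory is to sample a process's memory use over time and see whether it trends upward. Below is a minimal Python sketch, assuming you already have `(seconds, bytes)` samples from whatever tool you prefer (Performance Monitor exports, for example). The function names `growth_rate` and `hours_until_limit` are hypothetical, and the least-squares fit is just one simple way to estimate the trend.

```python
def growth_rate(samples):
    """Least-squares slope of memory use in bytes per second.

    samples: list of (elapsed_seconds, bytes_in_use) tuples.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    numerator = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("samples must span more than one point in time")
    return numerator / denominator


def hours_until_limit(samples, limit_bytes):
    """Rough hours until limit_bytes is reached, or None if memory is not growing."""
    slope = growth_rate(samples)
    if slope <= 0:
        return None
    _, last_bytes = samples[-1]
    return (limit_bytes - last_bytes) / slope / 3600


# Illustrative numbers only, not measurements from any real desktop.
example = [(0, 100), (3600, 200), (7200, 300)]
eta_hours = hours_until_limit(example, 500)
```

A steadily positive slope that resets after a reboot would support the leak theory; a flat or noisy one would point elsewhere.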

I had a similar issue, and it was related to the virtual machine running out of memory. The root cause for us was that we were running NVIDIA GRID with under 1 GB of video memory. In our case we could see an error in the vmx.log, but I don't think this applies to your issue.

BenFB
Virtuoso

What version of VMware Tools/Horizon Agent are you running? We had a similar issue, and we believe it was due to resource exhaustion caused by either VMware Tools 10.1.10 or 10.2.0. Upgrading VMware Tools to 10.2.5 solved it for us, and we also noticed a decrease in memory consumption.
